Detecting Financial Statement Fraud Using Machine-Learning Methods

Xin CHEN, Yang WANG, Yifei ZHANG

Research output: Book Chapters | Papers in Conference Proceedings > Book Chapter > Research > peer-review


Financial statement fraud raises substantial concerns for regulators worldwide, and regulators face severe challenges in detecting and addressing the increased incidence of this type of fraud. This chapter compares three popular machine-learning approaches based on 35 firm-level financial and linguistic features derived from annual reports. Using hand-collected financial statement fraud data in China, we aim to compare different machine-learning models and select the most accurate fraud detection model to improve fraud detection ability. In particular, we aim to assess the predictive performance of the least absolute shrinkage and selection operator (LASSO); random forest and bagging; and support vector machine (SVM) models, and compare the results with the logistic regression method. The findings suggest that the LASSO method outperforms the other two methods. This chapter contributes to the literature by selecting both financial and linguistic fraud predictors and by employing different machine-learning algorithms in the under-researched area of detecting fraud in financial statements.
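The kind of model comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the chapter's actual pipeline: the data are synthetic (generated with 35 features to mirror the feature count mentioned above), the hyperparameters are arbitrary, LASSO is approximated by an L1-penalized logistic regression, and cross-validated AUC is an assumed evaluation metric that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in for the hand-collected data.
# 35 features, with roughly 10% of firms labelled as "fraud" (class 1).
X, y = make_classification(
    n_samples=600, n_features=35, n_informative=8,
    weights=[0.9, 0.1], random_state=0,
)

# ASSUMPTION: illustrative hyperparameters, not those used in the chapter.
models = {
    "logistic (baseline)": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    # L1 penalty shrinks some coefficients to exactly zero (LASSO-style selection).
    "LASSO (L1 logistic)": make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="l1", solver="liblinear", C=0.1)),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "bagging": BaggingClassifier(n_estimators=50, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

# Stratified folds keep the fraud/non-fraud ratio similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {
    name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    for name, model in models.items()
}

# Print models from highest to lowest mean cross-validated AUC.
for name, auc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name:22s} mean AUC = {auc:.3f}")
```

On synthetic data the ranking carries no meaning about real fraud detection; the sketch only shows how the candidate models can be scored on the same folds so that their results are comparable.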
Original language: English
Title of host publication: FinTech Research and Applications : Challenges and Opportunities
Editors: Daisy CHOU, Conall O'SULLIVAN, Vassilios G PAPAVASSILIOU
Publisher: World Scientific Publishers
ISBN (Electronic): 9781800612730
ISBN (Print): 9781800612716
Publication status: Published - 1 Mar 2023

Publication series

Name: Transformations in Banking, Finance and Regulation
Publisher: World Scientific
ISSN (Print): 2752-5821
ISSN (Electronic): 2752-583X


  • China
  • financial statement fraud
  • fraud detection
  • fraud factor
  • machine learning

