Abstract: Forensic accountants and fraud examiners use a range of techniques to uncover fraudulent journal entries and illegal activities. As data professionals, most of us will never unravel a Bernie Madoff scheme, but we can apply these same techniques in our own environments to learn more about our data. This video uses a combination of SQL Server and the Python programming language to apply these fraud detection techniques and help you gain a better understanding of your data.
You will learn a variety of techniques for understanding your data well enough to draw interesting inferences, starting with basic analytical techniques, including regression analysis. From there, you will learn how to use cohort analysis to find outliers between groups, taking a data-driven approach to forensic investigation. Finally, you will review numeric techniques for assessing data set validity, including rules governing the distributions of the first and last digits in data sets.
- Set up the environment, including all materials for learners to participate in certain labs. Introduce the problem space and explore the data set. Learners will know the problem we are trying to solve and have an understanding of the kinds of data at our disposal.
- Perform summary, growth, and gap analyses. Learners will understand how to perform aggregate analyses of static data, as well as flow analyses of time-series data. Learners will additionally understand the importance of gap analysis for forensic accounting and the risks of using identity columns for accounting-critical sequences.
- Perform a regression analysis. Learners will understand how to perform regression using techniques such as Ordinary Least Squares and Ridge regression. Learners will also understand the concept of collinearity and how it can harm our understanding of a regression result. Learners will additionally understand how to build ensemble regression models using tools like gradient boosting and random forests.
- Perform cohort and time series analyses. Learners will understand how to slice data and track changes across relevant features, including time.
- Perform numeric analysis of fact data. Learners will understand the relevance and importance of first-digit and last-digit analysis of data sets for fraud detection purposes.
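As a rough illustration of two of the checks named in the outline above, the sketch below shows a gap analysis over a numeric ID sequence and a first-digit comparison. It assumes that the first-digit rule in question is Benford's law, which the outline does not state explicitly. The function names (`find_gaps`, `first_digit`, `benford_comparison`) and the sample values are hypothetical and are not taken from the course materials.

```python
import math
from collections import Counter


def find_gaps(ids):
    """Return (start, end) ranges of integers missing from a sequence of IDs."""
    ordered = sorted(set(ids))
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > 1:
            gaps.append((prev + 1, curr - 1))
    return gaps


def first_digit(value):
    """Return the leading nonzero digit of a number, or None for zero."""
    value = abs(value)
    if value == 0:
        return None
    # Scientific notation puts the leading significant digit first.
    return int(f"{value:.15e}"[0])


def benford_comparison(values):
    """Pair each digit 1-9 with its observed share and its Benford share.

    Assumption: Benford's law, P(d) = log10(1 + 1/d), is the expected
    first-digit distribution for the data being examined.
    """
    digits = [d for d in map(first_digit, values) if d is not None]
    counts = Counter(digits)
    total = len(digits)
    rows = []
    for d in range(1, 10):
        expected = math.log10(1 + 1 / d)
        observed = counts[d] / total if total else 0.0
        rows.append((d, observed, expected))
    return rows


if __name__ == "__main__":
    # Hypothetical invoice numbers with two missing ranges.
    print(find_gaps([1, 2, 3, 5, 6, 9]))

    # Hypothetical transaction amounts; a real check needs far more rows.
    for digit, observed, expected in benford_comparison([10, 11, 25, 300]):
        print(digit, round(observed, 3), round(expected, 3))
```

A large difference between the observed and expected shares does not prove fraud. It flags a data set for closer review, and the comparison is only meaningful for data that would be expected to follow this distribution in the first place, such as amounts spanning several orders of magnitude.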
Note: This recorded class is available in the format of a video course. Content is presented in modular videos.