Most accounting data is not fraudulent, so we do not want to analyze legitimate accounting data visually. That's not the purpose of visualization in our case. Instead, visualization is for inspecting an account once the AI has detected unusual activities. In other words, we only need visualizations of red-flagged accounts. Once the AI has seen unusual patterns, the goal is to have a standard visualization that shows the unusual activities quickly and without too much cognitive effort.
We want to visualize potential accounting fraud because we humans have a hard time seeing patterns in tabular data (e.g., Excel or SQL). In the table below, can you see anything suspicious? The answer follows soon.
The human brain isn't built to interpret raw data; we need clear patterns and visual cues to help us quickly make sense of complex information. Data visualization puts our prefrontal and visual cortex to work, combining the power of cognition (slow and conscious) and perception (instantaneous).
The Traditional Histogram
A histogram is a chart that plots the distribution of a numeric variable’s values as a series of bars. Each bar typically covers a range of numeric values called a bin; a bar’s height indicates the frequency of data points with a value within the corresponding bin.
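To make the binning concrete, here is a minimal sketch of how a traditional histogram counts values. The function name `histogram_counts`, the bin width, and the sample amounts are illustrative assumptions, not data from this article.

```python
from collections import Counter


def histogram_counts(amounts, bin_width):
    """Count how many amounts fall into each equal-width bin.

    Each bin is identified by its lower edge, e.g. 2000 for [2000, 3000).
    """
    counts = Counter()
    for amount in amounts:
        bin_start = int(amount // bin_width) * bin_width
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


# Illustrative amounts only (hypothetical, not the article's dataset).
amounts = [1200, 1850, 2300, 2750, 2990, 7400]
print(histogram_counts(amounts, bin_width=1000))
# {1000: 2, 2000: 3, 7000: 1}
```

Each key is the lower edge of a bin and each value is the height of the corresponding bar.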
We've visualized the data from before as a traditional histogram and compared it to a "modified histogram" designed for visualizing accounting fraud. Yes, it's the very same data!
The modified histogram visualizes the data slightly differently. While both histograms count the frequency of accounting transactions, the modified histogram only considers the first two digits. The exact calculation can be seen in the following image:
Many transactions in the dataset start with the digits 45 but have different amounts (e.g., 450000, 450060, 450090…). The traditional histogram obscures them, while the modified histogram aggregates them and makes them visible.
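The aggregation by first two digits can be sketched as follows. This is only one possible way to do it, not necessarily the exact calculation used for the chart; the helper names `first_two_digits` and `first_two_digit_counts` are hypothetical, and treating a single-digit amount such as 5 as "50" is an assumption.

```python
from collections import Counter
from decimal import Decimal


def first_two_digits(amount):
    """Return the leading two significant digits (10-99), or None for zero."""
    # Decimal(str(...)) avoids floating-point surprises when reading digits.
    digits = abs(Decimal(str(amount))).as_tuple().digits
    if not any(digits):
        return None  # zero has no leading digits
    padded = digits + (0,)  # assumption: a lone digit d is read as d0
    return padded[0] * 10 + padded[1]


def first_two_digit_counts(amounts):
    """Count transactions per leading digit pair, skipping zero amounts."""
    pairs = (first_two_digits(a) for a in amounts)
    return Counter(p for p in pairs if p is not None)


# The three amounts from the text plus two illustrative ones.
amounts = [450000, 450060, 450090, 1200, 98.5]
print(first_two_digit_counts(amounts))
# Counter({45: 3, 12: 1, 98: 1})
```

The three different amounts starting with 45 land in the same bar, which is what makes the cluster visible.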
For this modified type of histogram, all credits go to Prof. Dr. Mark Nigrini. Technically, the modified histogram visualizes the first-two-digits test of Benford’s Law, that is, how often each leading digit pair from 10 to 99 occurs.
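For reference, Benford's Law gives the expected share of each leading digit pair d as log10(1 + 1/d). The sketch below computes these expected shares so they could be compared with observed counts; the function name is a hypothetical choice, and how large a deviation should count as suspicious is left open here.

```python
import math


def benford_first_two_digit_probability(d):
    """Expected share of amounts whose first two digits are d (10-99)."""
    if not 10 <= d <= 99:
        raise ValueError("d must be between 10 and 99")
    return math.log10(1 + 1 / d)


# Expected shares for all 90 leading digit pairs; together they sum to 1.
expected = {d: benford_first_two_digit_probability(d) for d in range(10, 100)}

for d in (10, 45, 99):
    print(d, round(expected[d], 4))
```

The pair 10 is expected in roughly 4.1% of amounts and the shares fall steadily towards 99, so a bar that towers over its neighbours stands out immediately.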
While the AI is scanning all accounting transactions, visualization is unimportant. If the AI classifies a customer as low risk, a visualization would most likely not reveal anything except confirm that the accounting transactions are legitimate. I'm saying "most likely" because there's one exception to the rule. As an auditor, you don't want to peruse countless visualizations of legitimate data.
However, the visualization of potential accounting fraud becomes vital once the AI has classified a dataset as high risk. In many cases, the correct visualization type can immediately point the auditor to the problems in the dataset.