We’ve spent an almost unhealthy amount of time researching Benford's Law and applying it in code. Coding requires you to be meticulous. If you’re not, the program doesn't run correctly or doesn't run at all. When you're coding, spotting your mistakes and those of others becomes easy.

This post isn't about bragging. However, anyone who spends several years researching a topic should become an expert in that domain.

We're standing on the shoulders of titans. We couldn't have developed AuditedAI without the extensive work done by others in the past. Thus, this is our potential contribution to push research further. Who knows what the next generations will be able to solve?

Mistake #1: Aggregating Data

There's a staggering difference between researching Benford's Law and applying it to the real world. We've never seen this mistake in research, but when you're writing the code that runs the analysis, it can easily sneak in. We've even seen PhDs in statistics commit this mistake.

A simplified example: we have three companies, A, B, and C. As an auditor, analyzing all the accounting data for companies A, B, and C together leads to a high probability of a wrong conclusion. The correct way is to explore each company's accounting data separately. Namely, we investigate all the accounting transactions for company A, followed by company B, and finally company C. Or in any order you want, but do not combine all accounting transactions for all companies. That’s a grave mistake.

In data analytics lingo, analyzing the data for each company is called a “group by.” In the case of Benford’s Law, if we don’t group by company, we’re committing a fatal error.
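To make the "group by" idea concrete, here is a minimal sketch in plain Python. The function names and the toy ledger are hypothetical and only serve as illustration; the point is that first-digit shares are computed per company, never on the pooled transactions.

```python
from collections import Counter, defaultdict


def first_digit(amount):
    """Return the first significant digit (1-9) of a non-zero amount."""
    # The first non-zero digit character in the number's text form
    # is its first significant digit (works for 1250.0, 0.095, 1e-07).
    for ch in repr(abs(amount)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("amount has no significant digit")


def first_digit_shares_by_company(transactions):
    """Group (company, amount) pairs by company and return, for each
    company, the observed share of every first digit 1-9."""
    counts = defaultdict(Counter)
    for company, amount in transactions:
        if amount != 0:  # zero has no first significant digit
            counts[company][first_digit(amount)] += 1
    shares = {}
    for company, digit_counts in counts.items():
        total = sum(digit_counts.values())
        shares[company] = {d: digit_counts[d] / total for d in range(1, 10)}
    return shares


# Hypothetical toy ledger; a real analysis needs far more transactions.
ledger = [
    ("A", 1250.00), ("A", 187.40), ("A", 23.99),
    ("B", 940.10), ("B", 9120.00), ("B", 0.095),
    ("C", 310.00), ("C", 3.75),
]

for company, observed in first_digit_shares_by_company(ledger).items():
    print(company, {d: round(p, 2) for d, p in observed.items() if p > 0})
```

Each company's distribution would then be compared with Benford's Law on its own, so one company's unusual pattern is not diluted by the others.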

Mistake #2: Defining Non-Conformity

All data deviates from the expected Benford's Law distribution, even legitimate data. The challenge is deciding how much deviation is normal and at what point the data should be called non-conforming. What’s the wrong approach? Essentially, any approach that is not based on science is wrong. For example, adding an arbitrary range of +/- 10% for determining non-conformity is wrong. Relying on the Chi-square test without care is also wrong: its result is very sensitive to the sample size, so on a large dataset it tends to flag even trivial deviations as significant. Keep in mind that the expected distribution of Benford’s Law is far from normal; it’s highly skewed to the right.

So, what’s the correct approach? Essentially, any method that is based on science. Science means we can create a hypothesis, test it, and make the result repeatable. For example, Prof. Dr. Mark Nigrini analyzed many natural datasets (legitimate data) and set a statistical threshold based on MAD (Mean Absolute Deviation). An alternative approach is simulation. Through large-scale simulations, we can test a hypothesis and make the result repeatable.
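As a rough illustration of the MAD idea, the sketch below computes the mean absolute deviation between observed first-digit proportions and the Benford expectation P(d) = log10(1 + 1/d). The function names and the sample counts are hypothetical, and the cutoff is deliberately left as a parameter: it should come from published thresholds or from your own simulation, not from this example.

```python
import math

# Expected first-digit proportions under Benford's Law: P(d) = log10(1 + 1/d)
BENFORD_FIRST_DIGIT = [math.log10(1 + 1 / d) for d in range(1, 10)]


def mean_absolute_deviation(digit_counts):
    """MAD between observed first-digit proportions and Benford's Law.

    digit_counts: list of 9 counts, for first digits 1 through 9.
    """
    if len(digit_counts) != 9:
        raise ValueError("expected exactly 9 counts (digits 1-9)")
    total = sum(digit_counts)
    if total == 0:
        raise ValueError("no observations")
    observed = [count / total for count in digit_counts]
    deviations = [abs(o - e) for o, e in zip(observed, BENFORD_FIRST_DIGIT)]
    return sum(deviations) / len(deviations)


def is_non_conforming(digit_counts, threshold):
    """Flag a dataset whose MAD exceeds a threshold chosen in advance.

    The threshold is an input on purpose (assumption of this sketch):
    take it from the literature or from a simulation.
    """
    return mean_absolute_deviation(digit_counts) > threshold


# Hypothetical counts for 1,000 transactions, first digits 1..9.
close_to_benford = [301, 176, 125, 97, 79, 67, 58, 51, 46]
evenly_spread = [111, 111, 111, 111, 111, 111, 111, 111, 112]

print(mean_absolute_deviation(close_to_benford))
print(mean_absolute_deviation(evenly_spread))
```

The first dataset yields a MAD close to zero, while the evenly spread one yields a much larger value, which is the kind of gap a well-founded threshold is meant to separate.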

Mistake #3: Ignoring the Sample Size

Ignoring the sample size is a common mistake, even among experts. For example, I've seen experts (i.e., PhDs in statistics) applying Benford's Law to potential election fraud with only 200 samples. That's not enough samples! We need a relatively high number of samples before we can use Benford's Law. While the Central Limit Theorem (CLT) still holds for Benford's Law, the sample size it requires is substantially larger than for normally distributed datasets (where the CLT often manifests with only 30 samples). The sample size was the focus of my doctoral thesis, and based on my research, ignoring the sample size often creates too many false positives (i.e., false alarms).
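The effect of sample size on false alarms can be explored with a small simulation. The sketch below is not the method from the thesis mentioned above; it simply draws first digits that follow Benford's Law exactly, measures their mean absolute deviation from the expected proportions, and counts how often a fixed cutoff is exceeded. The cutoff of 0.015, the sample sizes, and the number of trials are illustrative assumptions.

```python
import math
import random
from collections import Counter

DIGITS = list(range(1, 10))
# Expected first-digit proportions under Benford's Law
BENFORD = [math.log10(1 + 1 / d) for d in DIGITS]


def mad_of_sample(digits):
    """Mean absolute deviation of a sample's first-digit shares from Benford."""
    counts = Counter(digits)
    n = len(digits)
    return sum(abs(counts[d] / n - p) for d, p in zip(DIGITS, BENFORD)) / len(DIGITS)


def false_alarm_rate(sample_size, cutoff, trials, rng):
    """Share of perfectly Benford-conforming samples whose MAD exceeds cutoff."""
    alarms = 0
    for _ in range(trials):
        # Every sample is legitimate by construction: digits are drawn
        # with exactly the Benford probabilities.
        sample = rng.choices(DIGITS, weights=BENFORD, k=sample_size)
        if mad_of_sample(sample) > cutoff:
            alarms += 1
    return alarms / trials


rng = random.Random(42)  # fixed seed so the run is repeatable
CUTOFF = 0.015           # illustrative assumption, not a recommendation

for n in (200, 1000, 5000):
    rate = false_alarm_rate(n, CUTOFF, trials=300, rng=rng)
    print(f"n={n:>5}: false alarm rate = {rate:.1%}")
```

Because every simulated dataset is legitimate, any flag is a false alarm. With small samples the random noise alone can push the deviation past a fixed cutoff, which is why the cutoff and the sample size have to be considered together.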

Calculating the right sample size is the topic of another post. However, depending on the type of Benford's Law test (e.g., the 1-9 version needs fewer samples than the 10-99 version), the required sample size can reach thousands of samples at a high XX% confidence level.

Conclusion:

If you're using Benford's Law to detect fraud, you must be careful not to commit any of these mistakes. Any single one of them can make your model useless.