Number Of Records In The Data Set example essay topic
For the most part, Newcomb just considered it a curiosity and left it at that (Caldwell 2004). In the 1920s, Frank Benford, a physicist at the GE Research Laboratories, thought it more than a curiosity; he conducted extensive testing of naturally occurring data and computed the expected frequencies of the digits. Table 1 lists these expected frequencies for the first four digit positions. Benford also determined that the data could not be constrained to a restricted range of numbers, such as market values of stock, nor could it be a set of assigned numbers, such as street addresses or Social Security numbers (Nigrini 1999). The underlying theory behind why this happens can be illustrated using investments as an example.
If you start with an investment of $100 and assume a 5% annual return, it would be the 15th year before the value of the investment reached $200 and the first digit changed to 2. It would take only an additional 8 years to change the first digit to 3, an additional 6 years to change it to 4, and so on. Once the value of the investment grew to $1,000, the time needed to change the first digit (going from $1,000 to $2,000) would revert to the same pace as it took to go from $100 to $200. Unconstrained naturally occurring numbers follow this pattern with remarkable predictability (Ettredge and Srivastava 1998). In 1961, Roger Pinkham tested and proved that Benford's Law is scale invariant and therefore applies to any unit of measure and any type of currency. In the 1990s, Dr. Mark Nigrini discovered a powerful auditing tool using Benford's Law.
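The investment arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `years_at_each_first_digit` and its parameters are hypothetical, the balance is assumed to compound once a year, and the comparison column uses the usual statement of Benford's Law for the first digit, log10(1 + 1/d), which is an addition here and is not spelled out in the essay.

```python
import math

def years_at_each_first_digit(start=100.0, rate=0.05, stop=1000.0):
    """Count the year-end balances that begin with each digit 1-9
    while the balance grows from `start` up to (not including) `stop`.
    Assumes `start` is at least 1 and interest compounds once a year."""
    counts = {d: 0 for d in range(1, 10)}
    value = start
    while value < stop:
        first_digit = int(str(int(value))[0])  # leading digit of the whole-dollar amount
        counts[first_digit] += 1
        value *= 1 + rate                      # one more year of growth
    return counts

counts = years_at_each_first_digit()
total = sum(counts.values())
for d in range(1, 10):
    observed = counts[d] / total
    benford = math.log10(1 + 1 / d)            # assumed standard first-digit formula
    print(f"{d}: {counts[d]:2d} years  share {observed:.3f}  Benford {benford:.3f}")
```

With the default $100 start and 5% return, the balance shows a first digit of 1 for 15 year-ends, then 2 for 8 and 3 for 6, which matches the figures in the text. The shares for all nine digits come out close to the Benford column, though a single decade of growth is too short a run to match it exactly.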
Nigrini determined that most people assume the first digits of numbers are distributed equally among the digits, and that people who make up numbers tend to use numbers starting with digits in the middle of the range (5, 6, 7). Therefore, comparing the distribution of digits in a set of numbers with the frequencies expected under Benford's Law can indicate that the numbers may have been manipulated, and this manipulation could represent fraud. Dr. Nigrini has continued to focus on ways to use Benford's Law as a fraud detection tool (Walthoe et al. 1999).
When I first learned of this predictability of numbers and how it could be used to detect fraud, my interest was piqued. I understand the basic logic of how and why this works, but I am most interested in how to apply Benford's Law in the work I do. With the power of computers and easy access to enormous amounts of data, it appears fairly simple to quantify how a set of data compares to the expected results using digital analysis based on Benford's Law. Once you compare the results, what does it all mean? The first criterion is to have sufficient data. Dr. Nigrini suggests that a minimum of 1,000 records is needed to expect good conformity to Benford's Law.
A data set with 3,000 or more records should provide excellent conformity to Benford's Law. Data sets below 300 records are not practical to test using Benford's Law, and data sets between 300 and 1,000 records should be expected to show higher deviations from it. Second, we must choose the tool we will use to perform the analysis. Microsoft Excel will perform the analysis, but you will be limited to data sets with fewer than 65,536 records. Microsoft Access seems to be a preferred choice, as it is not limited in the number of records in the data set and can easily read various files using its ODBC capabilities. Access also has strong grouping and data filtering features that are useful for digital analysis.
There are also many specialized data analysis software products, which are usually more expensive than Microsoft Access (Nigrini 2002). Once you have performed the analysis, how do you interpret the results? You are looking for an abnormal occurrence of a set of digits. The most appropriate test is of the first two digits. Once you find an abnormal occurrence, you group and filter the data to isolate the abnormal transactions and evaluate those items to determine what caused the abnormality. In one example, a data set of 89,000 check payments showed a high occurrence of the first two digits 50 and 80.
Upon further review, it was determined that there was a high incidence of travel advances of $500 and $800. While this abnormality did not represent fraud, it did lead to a management recommendation that credit cards be issued to key employees to reduce the need for excessive travel advance transactions. Likewise, you can test the last two digits. In one example, a company that typically would not have sales ending in zero cents tested its data based on the last two digits. The results showed an abnormally high incidence of sales ending in zero cents, fifty cents and seventy-five cents.
After filtering out these transactions and reviewing the sales orders, it became apparent that the sales were made to a company that did not exist and that the goods were located in the garage of the salesperson (Lanza 1999). The auditor should consider that the analysis may test normal for the entire population of the data set but show an abnormal distribution when the data is subdivided into logical subsets such as office locations. Since fraud is probably limited to a particular office, an analysis of these subsets should also be part of the tests (Albrecht 2002). Tax authorities have determined that Benford's Law will help them detect fraud, because taxpayers who make up numbers when completing their tax returns fail to consider that made-up numbers do not follow the distribution patterns of Benford's Law. Therefore, more and more tax compliance software includes digital analysis of the numbers on tax returns to identify potentially fraudulent returns. The returns must still be reviewed or audited to determine whether there was fraud.
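The first-two-digit test and the subset analysis described above might be sketched as follows. Everything in this block is illustrative. The function names are hypothetical, the data is made up with a seeded random generator, and the z-statistic with a 1.96 cutoff is one common way to decide what counts as "abnormal", not a method the essay prescribes. The expected share log10(1 + 1/d) for a two-digit prefix d is the standard form of Benford's Law.

```python
import math
import random
from collections import Counter, defaultdict

def first_two_digits(amount):
    """Return the first two significant digits (10-99) of a nonzero amount."""
    mantissa = f"{abs(amount):.10e}"        # e.g. 500.0 -> '5.0000000000e+02'
    return int(mantissa[0] + mantissa[2])

def expected_share(d):
    """Benford's expected share of records whose first two digits are d."""
    return math.log10(1 + 1 / d)

def flag_abnormal(amounts, z_cutoff=1.96):
    """Return {digits: (observed, expected, z)} for over-represented prefixes."""
    amounts = [a for a in amounts if a != 0]  # zero has no leading digit
    n = len(amounts)
    if n == 0:
        return {}
    counts = Counter(first_two_digits(a) for a in amounts)
    flagged = {}
    for d in range(10, 100):
        observed, expected = counts[d] / n, expected_share(d)
        # Assumed test statistic: plain z-score of the observed proportion
        z = (observed - expected) / math.sqrt(expected * (1 - expected) / n)
        if z > z_cutoff:
            flagged[d] = (observed, expected, z)
    return flagged

def flag_by_subset(records, z_cutoff=1.96):
    """records: iterable of (subset_name, amount); run the test per subset."""
    groups = defaultdict(list)
    for name, amount in records:
        groups[name].append(amount)
    return {name: flag_abnormal(vals, z_cutoff) for name, vals in groups.items()}

# Made-up data: log-uniform amounts roughly follow Benford's Law;
# office_b also gets repeated $500 and $800 payments.
rng = random.Random(0)
records = [("office_a", round(10 ** rng.uniform(1, 4), 2)) for _ in range(3000)]
records += [("office_b", round(10 ** rng.uniform(1, 4), 2)) for _ in range(3000)]
records += [("office_b", 500.00)] * 150 + [("office_b", 800.00)] * 150

results = flag_by_subset(records)
for office, flagged in results.items():
    worst = sorted(flagged.items(), key=lambda kv: -kv[1][2])[:3]
    for d, (obs, exp, z) in worst:
        print(f"{office}: digits {d} observed {obs:.4f} expected {exp:.4f} z={z:.1f}")
```

In this made-up data the prefixes 50 and 80 stand out for `office_b`, much like the travel advances in the check payment example. With 90 prefixes tested at a 1.96 cutoff, a few spurious flags are to be expected even in clean data, so each flagged prefix is a starting point for filtering and review, not a finding in itself.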
Benford's Law is of limited use in detecting fraud when only a small number of transactions are fraudulent. Likewise, a data set that does not conform to Benford's Law does not necessarily contain fraud, as the excessive travel advances mentioned earlier illustrate. If the data set "being tested has a large number of transactions, it will take a smaller proportion of inconsistent numbers to trigger a significant difference from expected...".
For example, it would take 75 (0.75%) abnormal transactions to signal a deviation in a data set of 10,000 records, while it would take 23 (2.3%) abnormal transactions in a data set of 1,000 records. Since the analysis is relatively easy and quick, it makes sense to test the entire data set rather than a sample (Durtschi et al. 2004). Benford's Law is also limited in the type of fraud it will detect.
It flags fraud only where the person perpetrating it has added or removed transactions on a basis that does not conform to Benford's Law (Durtschi et al. 2004). Benford's Law lends itself to meeting the requirement under SAS 99 to use analytical tests in the planning stages of the audit. The auditor must also realize that Benford's Law has limitations and cannot be relied upon as the sole method used to detect fraud.
Benford's Law is best considered a tool for identifying accounts that reflect abnormal distributions; further investigation and analysis should then be performed (Durtschi et al. 2004).
Bibliography
A taxpayer compliance application of Benford's Law. The Journal of the American Taxation Association 18 (Spring): 72-91.
Albrecht, W. 2002. Root out financial deception: detect and eliminate fraud or suffer the consequences. Journal of Accountancy (April).
Caldwell, C. 2004. Benford's law. The Prime Glossary.
Durtschi, C., and Hillison, W., and Pacini, C. 2004. The Effective Use of Benford's Law to Assist in Detecting Fraud in Accounting Data. Journal of Forensic Accounting (Vol. V): 17-34.
Ettredge, M., and Srivastava, R. 1998. Using Digital Analysis to Enhance Data Integrity.
Lanza, R. 1999. Digital Analysis: Real World Examples. IT AUDIT (Vol. 2, July).
Nigrini, M. 1999. I've Got Your Number. Journal of Accountancy (May).
Nigrini, M. 2002. Using Microsoft Access for Data Analysis and Interrogation: 17.
Walthoe, J., and Hunt, R., and Pearson, M. 1999.