Benford’s Law is used to assist in the auditing of specific transactions or balances. The law describes how often each leading digit is expected to appear in naturally occurring numeric data, so we can identify potentially fraudulent transactions by comparing the actual pattern of leading digits to the pattern the law predicts.
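For reference, the expected first-digit proportions come from the formula P(d) = log10(1 + 1/d). A minimal sketch in base R; the variable names are my own and are not part of benford.analysis:

```r
# Expected first-digit proportions under Benford's Law: P(d) = log10(1 + 1/d)
digits <- 1:9
expected <- log10(1 + 1 / digits)
names(expected) <- digits

# The nine proportions sum to 1; digit 1 leads about 30.1% of the time
round(expected, 3)
```

Any population whose observed first-digit proportions sit far from these values is a candidate for a closer look.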

library(benford.analysis)
setwd("/Users/robertvargas/Documents/Projects/R/Benford's Law Data")
PBCdata<- read.csv("AP Data.csv")

Let’s double-check that the sum of the invoices in our CSV file ties out. I don’t need every column to conduct the analysis, so I keep only the columns I need. Vendor names will be excluded.

colnames(PBCdata)
##  [1] "Vendor.No"        "Vendor.Name"      "Invoice.No"      
##  [4] "Invoice.Date"     "Invoice.Due.Date" "Terms.Code"      
##  [7] "Invoice.Amt"      "Invoice.balance"  "X"               
## [10] "X.1"
PBCdata<- PBCdata[,c(1,3,4,7)]
str(PBCdata$Invoice.Amt)
##  Factor w/ 973 levels "","0.75","1,000.00",..: 585 585 585 502 149 147 970 957 957 957 ...
## our data at this point is a factor, we need to convert it to numeric values
PBCdata$Invoice.Amt<- as.character(PBCdata$Invoice.Amt)
PBCdata$Invoice.Amt<- gsub(",","",PBCdata$Invoice.Amt)
PBCdata$Invoice.Amt<-as.numeric(PBCdata$Invoice.Amt)
sum(PBCdata$Invoice.Amt)
## [1] NA
## it appears we have NA values, let's find those and flush them out
which(is.na(PBCdata$Invoice.Amt))
## [1] 1232
PBCdata<- PBCdata[-c(1232),]
sum(PBCdata$Invoice.Amt)
## [1] 14439214
## this sum ties to the total payables balance at year-end
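The cleanup above drops row 1232 by its index. As a hedged alternative, the same steps can be written so they do not depend on a hard-coded row number. The sketch below uses a small made-up vector in place of the real Invoice.Amt column:

```r
# Toy stand-in for the Invoice.Amt column (values are made up for illustration)
raw_amt <- c("1,000.00", "0.75", "", "67,705.68")

# Strip thousands separators, then convert; the blank string becomes NA
amt <- as.numeric(gsub(",", "", raw_amt))

# Keep only usable amounts instead of dropping a fixed row index
clean_amt <- amt[!is.na(amt)]
sum(clean_amt)
```

On the real data frame the equivalent filter would be applied to the rows, for example PBCdata[!is.na(PBCdata$Invoice.Amt), ].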



The data doesn’t exactly conform to Benford’s Law (see below). The shape of the plotted frequencies isn’t too unusual, but there are clear discrepancies at the leading two-digit pairs 52 and 11. Judgement comes into play when creating these analytics: repeat transactions are common, and they can throw off the frequencies.

bfd<- benford(PBCdata$Invoice.Amt, number.of.digits = 2)
plot(bfd)
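To make the two-digit comparison concrete, here is a rough sketch of how the leading pair can be pulled out of each amount and set against the expected proportion log10(1 + 1/d) for d from 10 to 99. The helper first_two and the sample amounts are hypothetical, and this is not necessarily how benford.analysis computes its digits internally:

```r
# Expected proportion for each leading two-digit pair, 10 through 99
pairs <- 10:99
expected2 <- log10(1 + 1 / pairs)
names(expected2) <- pairs

# Hypothetical helper: first two significant digits of non-zero amounts.
# The small tolerance guards against floating-point results such as 74.999999.
first_two <- function(x) {
  x <- abs(x[!is.na(x) & x != 0])
  floor(x / 10^(floor(log10(x)) - 1) + 1e-9)
}

# Made-up amounts for illustration
amounts <- c(67705.68, 66232.32, 10.70, 5.51, 1000)
lead <- first_two(amounts)

# Observed share of each pair, in the same order as expected2
observed2 <- as.numeric(table(factor(lead, levels = pairs))) / length(lead)
names(observed2) <- pairs
```

With a real population, the pairs where observed2 is furthest from expected2 are the ones that stand out in the plot.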

Let’s see if there are any repeat transactions. Without much knowledge of the client’s operations, I would assume there is a reason a transaction amount would occur three or more times. It might make sense to analyze these as well, but I will remove them from our sample and re-run the analysis.

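library(plyr)
## count how many times each invoice amount appears
frequencies<-as.data.frame(count(PBCdata,"Invoice.Amt"))
## amounts that appear three or more times
repamounts<- subset(frequencies, freq >=3)
extransactions<- repamounts$Invoice.Amt
'%notin%'<- Negate('%in%')
## population without the repeated amounts, for the Benford re-run
ALTdata<- PBCdata[PBCdata$Invoice.Amt %notin% extransactions,]
## the repeated transactions themselves, kept for judgmental selections
REPtransactions<- PBCdata[PBCdata$Invoice.Amt %in% extransactions,]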
bfd2<- benford(ALTdata$Invoice.Amt, number.of.digits = 2)
plot(bfd2)
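The split into repeated and non-repeated amounts can also be done in base R without plyr. This is only a sketch on a made-up data frame; the column names mirror the ones above, but the values and the names toy, repeated, and remaining are my own:

```r
# Made-up invoices; the amount 250 appears three times
toy <- data.frame(Invoice.No  = 1:7,
                  Invoice.Amt = c(250, 250, 250, 99.5, 1200, 99.5, 42))

# For every row, how many times its amount occurs in the whole column
n_same <- ave(toy$Invoice.Amt, toy$Invoice.Amt, FUN = length)

repeated  <- toy[n_same >= 3, ]  # amounts occurring three or more times
remaining <- toy[n_same < 3, ]   # population for the Benford re-run
```

Here repeated holds the three invoices for 250 and remaining holds the other four, which matches the three-or-more cutoff used above.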

Sample Selections

We can go ahead and select a sample of transactions that fall outside the frequencies predicted by Benford’s Law. Let’s go with 25 selections: 15 from our Benford’s Law population and 10 from the population of repeated transactions that we created earlier. Selections from the excluded transactions will be entirely judgmental, given that the frequency of these transactions could be related to certain operational activities.

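## get suspects from the Benford population; note that how.many sets the number of
## digit groups examined, not the number of rows returned, so 257 candidates come back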
getSuspects(bfd2,ALTdata,how.many = 15)
##      Vendor.No  Invoice.No Invoice.Date Invoice.Amt
##   1:      3196      874765      2/28/18    67705.68
##   2:      3196     1116676     10/19/18    66232.32
##   3:      3196     1125263     10/29/18    66232.32
##   4:      3196     1146425     11/20/18    63937.68
##   5:      3196     1118369     10/22/18    63912.04
##  ---                                               
## 253:      3578      295899      12/7/18       10.70
## 254:       589 2.47619E+11     12/15/18       10.44
## 255:      1586       89767     12/20/18        6.76
## 256:      3578      296526     12/13/18        6.72
## 257:      1907    IM394453      12/7/18        5.51
## based on judgement, I've selected these invoices to inquire about.
REPtransactions[c(1,4,8,13,16,22,28,45,50,56),]
##     Vendor.No Invoice.No Invoice.Date Invoice.Amt
## 1         812    5317043     10/19/18   413235.60
## 8        3196    1142802     11/15/18    91394.68
## 37       3196    1139841     11/13/18    63945.72
## 52       3196    1154325     11/30/18    61625.44
## 92       3196    1074334      9/10/18    50004.24
## 112      3196    1162731      12/7/18    47568.48
## 123      3196    1118291     10/22/18    45439.72
## 199      3196    1027276      7/26/18    25163.96
## 207      3196    1043540      8/13/18    21548.56
## 213      3196    1045323      8/13/18    20813.64
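Since getSuspects returned 257 candidate invoices rather than 15, the list still has to be narrowed. One option, sketched here on made-up data, is a reproducible random draw; the suspects data frame and the seed are placeholders, and a judgmental pick would be just as valid:

```r
set.seed(2018)  # arbitrary seed so the selection can be reproduced

# Made-up stand-in for the 257-row table returned by getSuspects()
suspects <- data.frame(Invoice.No  = 1:257,
                       Invoice.Amt = round(runif(257, 5, 70000), 2))

# Draw 15 distinct rows without replacement
picks <- suspects[sample(nrow(suspects), 15), ]
nrow(picks)
```

Recording the seed in the workpapers lets a reviewer regenerate the same 15 selections.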

Conclusion

Using R, I was able to conduct an analysis rooted in science. I think it’s important to note that we still had to use professional judgement when analyzing the overall activity in the detail. Clients have different operations, but Benford’s Law helps establish a standard benchmark for auditing important financial statement accounts.