Introduction -

Our objective is to use these data to assess the likelihood of fraud. Is it likely that a random sample of 253 items selected from the population of 3,005 items would yield a mean GPF of at least 50.8%? Or is it likely that two independent random samples of sizes 134 and 119 would yield mean GPFs of at least 50.6% and 51.0%, respectively? To answer these questions, we perform statistical hypothesis tests that check the validity of the alleged fraud.

Background -

Let us first describe the problem that we want to analyze:

A wholesale furniture retailer stores in-stock items at a large warehouse located in Tampa, Florida. Early in the year, a fire destroyed the warehouse and all the furniture in it. After determining the fire was an accident, the retailer sought to recover costs by submitting a claim to its insurance company. As is typical in a fire insurance policy of this type, the furniture retailer must provide the insurance company with an estimate of “lost” profit for the destroyed items. Retailers calculate profit margin in percentage form using the Gross Profit Factor (GPF). By definition, the GPF for a single sold item is the ratio of the profit to the item's selling price measured as a percentage, that is:

\[\text{Item GPF} = \frac{\text{Profit}}{\text{Sales price}} \times 100\%\]
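As a quick illustration of the formula, the sketch below computes the GPF for one hypothetical item. The helper name `item_gpf` and the dollar figures are assumptions made for illustration, not values from the case.

```r
# Hypothetical helper: GPF (in percent) from profit and selling price
item_gpf <- function(profit, sales_price) {
  stopifnot(all(sales_price > 0))   # a GPF is undefined for a zero price
  profit / sales_price * 100
}

# An item sold for $400 that earned a $200 profit has a GPF of 50%
item_gpf(200, 400)
```

The function is vectorized, so it can be applied to whole columns of profits and selling prices at once.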

Of interest to both the retailer and the insurance company is the average GPF for all of the items in the warehouse. Because these furniture pieces were all destroyed, their eventual selling prices and profit values are obviously unknown. Consequently, the average GPF for all the warehouse items is unknown. One way to estimate the mean GPF of the destroyed items is to use the mean GPF of similar, recently sold items. The retailer sold 3,005 furniture items in the year prior to the fire and kept paper invoices on all sales. Rather than calculate the mean GPF for all 3,005 items (the data were not computerized), the retailer sampled a total of 253 of the invoices and computed the mean GPF for these items. The 253 items were obtained by first selecting a sample of 134 items and then augmenting this sample with a second sample of 119 items. The mean GPFs for the two subsamples were calculated to be 50.6% and 51.0%, respectively, yielding an overall average GPF of 50.8%. This average GPF can be applied to the costs of the furniture items destroyed in the fire to obtain an estimate of the “lost” profit. According to experienced claims adjusters at the insurance company, the GPF for sale items of the type destroyed in the fire rarely exceeds 48%. Consequently, the estimate of 50.8% appeared to be unusually high. (A 1% increase in GPF for items of this type equates to, approximately, an additional $16,000 in profit.) When the insurance company questioned the retailer on this issue, the retailer responded, “Our estimate was based on selecting two independent, random samples from the population of 3,005 invoices. Because the samples were selected randomly and the total sample size is large, the mean GPF estimate of 50.8% is valid…” A dispute arose between the furniture retailer and the insurance company, and a lawsuit was filed. In one portion of the suit, the insurance company accused the retailer of fraudulently representing their sampling methodology.
Rather than selecting the samples randomly, the retailer was accused of selecting an unusual number of “high profit” items from the population in order to increase the average GPF of the overall sample. To support their claim of fraud, the insurance company hired a CPA firm to independently assess the retailer’s Gross Profit Factor. Through the discovery process, the CPA firm legally obtained the paper invoices for the entire population of 3,005 items sold and input the information into a computer.

Methods -

\(\textbf{Z-test -}\)

Here we will use the Z-test for the hypothesis testing. The assumptions for the Z-test are as follows,

  1. The first assumption made regarding Z-tests concerns the scale of measurement: the data collected follow a continuous or ordinal scale.
  2. The second assumption is that of a simple random sample: the data are collected from a representative, randomly selected portion of the total population.
  3. The third assumption is that the data, when plotted, follow a normal, bell-shaped distribution curve.
  4. The final assumption is that the population standard deviation is known. When two samples are compared, homogeneity of variance is also assumed: homogeneous, or equal, variance exists when the standard deviations of the samples are approximately equal.

The formula for computing the Z-value is:

\(\begin{aligned}&\text{Z-value} = \frac{ \bar{x} - \mu_0 }{ \sigma / \sqrt{n} } \sim N(0,1)\ \text{under}\ H_0 \\& \text{where:}\\&\bar{x} = \text{mean of the sample}\\&\mu_0 = \text{hypothesized population mean}\\& \sigma = \text{standard deviation of the population}\\& n = \text{sample size} \end{aligned}\)
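The Z-value can also be computed by hand, dividing by the standard error \(\sigma/\sqrt{n}\). The sketch below is a minimal illustration that plugs in the summary values reported in this analysis for the 253 sampled items; the variable names are assumptions chosen for readability.

```r
# Summary values taken from this analysis
x_bar <- 49.19778   # mean GPF of the 253 sampled items
mu_0  <- 50.8       # hypothesized mean under H0
sigma <- 13.82595   # population standard deviation of GPF
n     <- 253        # sample size

std_error <- sigma / sqrt(n)                    # standard error of the mean
z_value   <- (x_bar - mu_0) / std_error         # Z-value under H0
p_upper   <- pnorm(z_value, lower.tail = FALSE) # P(Z >= z) for H1: mean > mu_0

round(c(z = z_value, p = p_upper), 4)
```

These figures should agree with the `z.test` output shown in the hypothesis testing section (z = -1.8433, p-value = 0.9674).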

\(\textbf{Central Limit Theorem (CLT) -}\)

In probability theory, the central limit theorem (CLT) establishes that, in many situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution (informally a bell curve) even if the original variables themselves are not normally distributed. The theorem is a key concept in probability theory because it implies that probabilistic and statistical methods that work for normal distributions can be applicable to many problems involving other types of distributions.

If \({ X_{1},X_{2},...,X_{n}}\) is a random sample of size \({ n}\) drawn from a population with overall mean \({ \mu }\) and finite variance \({ \sigma ^{2}}\), and if \({ {\bar {X}}_{n}}\) is the sample mean, then the limiting distribution of \({ Z_{n}={\sqrt {n}}{\left({\frac {{\bar {X}}_{n}-\mu }{\sigma }}\right)}}\) as \({ n\to \infty }\) is the standard normal distribution.
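The theorem can be illustrated with a small simulation. The sketch below is an assumption-laden example, not part of the case data: it draws repeated samples of size 253 from an exponential distribution (strongly skewed, with mean 1 and standard deviation 1) and standardizes each sample mean.

```r
set.seed(1)
n     <- 253    # sample size, matching the pooled sample in this case
n_rep <- 5000   # number of simulated samples (arbitrary choice)

# Exponential(rate = 1) is skewed, with mean 1 and standard deviation 1
sample_means <- replicate(n_rep, mean(rexp(n, rate = 1)))

# Standardize: sqrt(n) * (sample mean - mu) / sigma
z <- sqrt(n) * (sample_means - 1) / 1

c(mean = mean(z), sd = sd(z))   # should be close to 0 and 1

# Histogram of standardized means with the N(0, 1) density overlaid
hist(z, prob = TRUE, main = "Standardized sample means")
curve(dnorm(x), add = TRUE)
```

Although the individual observations are far from normal, the standardized sample means should cluster around a bell curve.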

Data -

Let us first read the data as follows,

FIRE = read.csv("C:/Users/Lenovo/OneDrive/Desktop/FIRE.csv", 
                header = TRUE)
head(FIRE)

Now we draw a random sample of size 253 from the 3,005 furniture items as follows,

set.seed(1997)
rand = sample(nrow(FIRE), 253)
df_main = FIRE[rand,]
head(df_main)
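Later sections treat rows 1 to 134 and rows 135 to 253 of `df_main` as the two subsamples. The sketch below makes that split explicit. It runs on a simulated stand-in for the invoice data: `FIRE_demo` and its values are hypothetical, and only the column names `Profit` and `Sales` follow the real data.

```r
set.seed(1997)

# Hypothetical stand-in for the 3,005 invoices
FIRE_demo <- data.frame(Profit = runif(3005, min = 50, max = 500))
FIRE_demo$Sales <- FIRE_demo$Profit / runif(3005, min = 0.3, max = 0.7)

# Draw 253 distinct row indices, then split them into the two subsamples
idx      <- sample(nrow(FIRE_demo), 253)   # sampling without replacement
sample_1 <- FIRE_demo[idx[1:134], ]        # first subsample, 134 items
sample_2 <- FIRE_demo[idx[135:253], ]      # second subsample, 119 items

c(n1 = nrow(sample_1), n2 = nrow(sample_2))
```

Because `sample()` draws without replacement by default, no invoice appears in both subsamples.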

Now we calculate the GPF using the following formula,

\[\text{Item GPF} = \frac{\text{Profit}}{\text{Sales price}} \times 100\%\]

The first rows of the data with GPF are as follows,

gpf = (df_main$Profit/df_main$Sales)*100

df_main = data.frame(cbind(df_main,gpf))
head(df_main)

Population Characteristics -

Let us plot the histogram of GPF as follows,

GPF = (FIRE$Profit/FIRE$Sales)*100
hist(GPF,prob = TRUE, main = "Histogram of GPF", xlim = c(0,100))
x = seq(0,100, by = 1)
y = dnorm(x, mean = mean(GPF), sd = sd(GPF))  # normal curve with the population GPF mean and sd
lines(x,y)

So, the histogram is roughly bell-shaped, and the GPF may be normally distributed.

Now, let us perform Shapiro-Wilk test for testing the normality of GPF as follows,

shapiro.test(GPF)
## 
##  Shapiro-Wilk normality test
## 
## data:  GPF
## W = 0.74484, p-value < 2.2e-16

Here, observe that the p-value of the test is < 0.05, so we reject the null hypothesis of normality at level 0.05. So, the population GPF does not follow a normal distribution.
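With several thousand observations, the Shapiro-Wilk test rejects normality even for modest departures, so it can help to look at a normal Q-Q plot as well. The sketch below uses simulated, right-skewed percentages as a stand-in. `gpf_demo` is hypothetical and does not reproduce the real GPF values.

```r
set.seed(2)

# Hypothetical stand-in for the GPF values: right-skewed percentages in (0, 100)
gpf_demo <- 100 * rbeta(3005, shape1 = 2, shape2 = 5)

# Normal Q-Q plot: points bending away from the line indicate non-normality
qqnorm(gpf_demo, main = "Normal Q-Q plot (simulated GPF)")
qqline(gpf_demo)

# Shapiro-Wilk test on the same data (shapiro.test accepts 3 to 5000 values)
sw <- shapiro.test(gpf_demo)
sw$p.value
```

The same two calls, `qqnorm()` and `qqline()`, can be applied to the real `GPF` vector to see where it departs from the normal line.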

Now we compute the population mean and standard deviation of GPF,

pop_gpf_mean = mean(GPF) # Population mean
pop_gpf_mean
## [1] 48.89571
pop_gpf_sd = sd(GPF)  # Population standard deviation
pop_gpf_sd
## [1] 13.82595

Hence the population GPF mean is \(48.89571\) and standard deviation is \(13.82595\).

Sample Characteristics -

Now, let us visualize the distribution of GPF in the whole sample, i.e. the 253 sampled items, as follows,

hist(df_main$gpf)

So, the histogram looks roughly normal. By the CLT, it is the sampling distribution of the sample mean that approaches a normal distribution as the sample size increases.

Now, let us test for normality as follows,

shapiro.test(df_main$gpf)
## 
##  Shapiro-Wilk normality test
## 
## data:  df_main$gpf
## W = 0.91323, p-value = 5.956e-11

As the p-value < 0.05, we reject the null hypothesis, so the sample GPF values do not follow a normal distribution.

Similarly, for 134 items, the histogram is as follows,

hist(df_main$gpf[1:134])

So, the histogram looks roughly normal. By the CLT, it is the sampling distribution of the sample mean that approaches a normal distribution as the sample size increases.

Now, let us test for normality as follows,

shapiro.test(df_main$gpf[1:134])
## 
##  Shapiro-Wilk normality test
## 
## data:  df_main$gpf[1:134]
## W = 0.94047, p-value = 1.703e-05

As the p-value < 0.05, we reject the null hypothesis, so the GPF values of the 134 items do not follow a normal distribution.

Similarly, for the remaining 119 items, the histogram is as follows,

hist(df_main$gpf[135:253])

So, the histogram looks roughly normal. By the CLT, it is the sampling distribution of the sample mean that approaches a normal distribution as the sample size increases.

Now, let us test for normality as follows,

shapiro.test(df_main$gpf[135:253])
## 
##  Shapiro-Wilk normality test
## 
## data:  df_main$gpf[135:253]
## W = 0.88133, p-value = 2.698e-08

As the p-value < 0.05, we reject the null hypothesis, so the GPF values of the 119 items do not follow a normal distribution.

Hypothesis testing -

Now let us first compute the average GPF for the 253 sample observations as follows,

# overall mean GPF
mean(df_main$gpf)
## [1] 49.19778

So, the mean GPF is \(49.19778\).

Now let us compute the average GPF for the 134 sample observations as follows,

# mean GPF for the first 134 items
mean(df_main$gpf[1:134])
## [1] 49.76665

So, the mean GPF is \(49.76665\).

Now let us compute the average GPF for the 119 sample observations as follows,

# mean GPF for the remaining 119 items
mean(df_main$gpf[135:253])
## [1] 48.55721

So, the mean GPF is \(48.55721\).

Now we want to test whether the mean GPF for the \(253\) items is at least 50.8% as follows,

library(BSDA)
## Loading required package: lattice
## 
## Attaching package: 'BSDA'
## The following object is masked from 'package:datasets':
## 
##     Orange
# Z test for mean GPF of at least 50.8%? 
z.test(df_main$gpf, alternative = "greater",mu = 50.8, sigma.x = pop_gpf_sd)
## 
##  One-sample z-Test
## 
## data:  df_main$gpf
## z = -1.8433, p-value = 0.9674
## alternative hypothesis: true mean is greater than 50.8
## 95 percent confidence interval:
##  47.76802       NA
## sample estimates:
## mean of x 
##  49.19778

Here observe that for \(H_1: true\ mean\ is\ greater\ than\ 50.8\), the p-value is \(0.9674 > 0.05\), so we fail to reject the null hypothesis at level \(0.05\). So, there is no evidence that the mean GPF is more than 50.8%.

Now we want to test whether the mean GPF for the \(134\) items is at least 50.6% as follows,

# Z test for mean GPF of at least 50.6%? 
z.test(df_main$gpf[1:134], alternative = "greater",mu = 50.6, sigma.x = pop_gpf_sd)
## 
##  One-sample z-Test
## 
## data:  df_main$gpf[1:134]
## z = -0.69773, p-value = 0.7573
## alternative hypothesis: true mean is greater than 50.6
## 95 percent confidence interval:
##  47.80207       NA
## sample estimates:
## mean of x 
##  49.76665

Here observe that for \(H_1: true\ mean\ is\ greater\ than\ 50.6\), the p-value is \(0.7573 > 0.05\), so we fail to reject the null hypothesis at level \(0.05\). So, there is no evidence that the mean GPF is more than 50.6%.

Now we want to test whether the mean GPF for the \(119\) items is at least 51.0% as follows,

# Z test for mean GPF of at least 51.0%? 
z.test(df_main$gpf[135:253], alternative = "greater",mu = 51, sigma.x = pop_gpf_sd)
## 
##  One-sample z-Test
## 
## data:  df_main$gpf[135:253]
## z = -1.9274, p-value = 0.973
## alternative hypothesis: true mean is greater than 51
## 95 percent confidence interval:
##  46.47248       NA
## sample estimates:
## mean of x 
##  48.55721

Here observe that for \(H_1: true\ mean\ is\ greater\ than\ 51\), the p-value is \(0.973 > 0.05\), so we fail to reject the null hypothesis at level \(0.05\). So, there is no evidence that the mean GPF is more than 51.0%.
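A complementary way to address the question posed in the Introduction is to ask how likely a truly random sample would be to reach the retailer's reported means, given the population mean and standard deviation computed earlier. This is a minimal sketch, not the method used in the tests above. It assumes the sample mean is approximately normal (by the CLT) and ignores any finite population correction. The data frame `claimed` is a hypothetical container for the retailer's figures.

```r
pop_mean <- 48.89571   # population mean GPF from this analysis
pop_sd   <- 13.82595   # population standard deviation of GPF from this analysis

# Sample sizes and mean GPFs reported by the retailer
claimed <- data.frame(
  n    = c(253, 134, 119),
  mean = c(50.8, 50.6, 51.0)
)

# Z-value of each reported mean relative to the population mean
claimed$z <- (claimed$mean - pop_mean) / (pop_sd / sqrt(claimed$n))

# P(sample mean >= reported mean) under random sampling
claimed$p_upper <- pnorm(claimed$z, lower.tail = FALSE)

claimed
```

A small upper-tail probability in `p_upper` means a random sample of that size would rarely produce a mean as high as the one reported.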

Conclusions -

Here the conclusions/findings from the work are,

  1. For the random sample of 253 items, there is no evidence that the mean GPF is more than 50.8%.
  2. For the 134 items, there is no evidence that the mean GPF is more than 50.6%.
  3. For the 119 items, there is no evidence that the mean GPF is more than 51.0%.

So, the results support the insurance company's suspicion of fraud: a genuinely random sample gives mean GPFs close to the population mean of about 48.9%, below the values the retailer reported. Therefore, it is very unlikely that the retailer selected the samples randomly, as it claims.

Also, we can use a resampling technique to guard against sample bias that could be affecting our decisions. Repeatedly removing several observations from the sample, testing the hypothesis on each reduced sample, and aggregating the results over all such cases leads to a final decision that is less sensitive to the particular sample drawn.
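The resampling idea can be sketched as follows. Enumerating every possible reduced sample is not feasible, so this illustration draws random subsets instead. The data `gpf_demo` are simulated, and the number of observations dropped (`d`) and the number of replicates (`n_rep`) are arbitrary assumptions.

```r
set.seed(3)

# Hypothetical stand-in for the 253 sampled GPF values
gpf_demo <- rnorm(253, mean = 49, sd = 14)

sigma <- 13.82595   # population standard deviation from this analysis
mu_0  <- 50.8       # hypothesized mean under H0
d     <- 25         # observations dropped per replicate (assumption)
n_rep <- 2000       # number of random subsets (assumption)

# For each replicate: drop d observations at random, then rerun the one-sided Z-test
p_values <- replicate(n_rep, {
  kept <- sample(gpf_demo, length(gpf_demo) - d)
  z    <- (mean(kept) - mu_0) / (sigma / sqrt(length(kept)))
  pnorm(z, lower.tail = FALSE)   # p-value for H1: mean > mu_0
})

# Aggregate the results across all reduced samples
c(median_p = median(p_values), share_rejected = mean(p_values < 0.05))
```

If the decision is stable, the share of reduced samples that reject the null hypothesis stays close to 0 or close to 1.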

References -

  1. E. L. Lehmann : Testing Statistical Hypotheses
  2. C. R. Rao : Linear Statistical Inference and its Applications
  3. S. Zacks : The Theory of Statistical Inference
  4. J. Maindonald and J. Braun : Data Analysis and Graphics Using R, Cambridge University Press, Cambridge, 2nd edition, 2007
  5. E. L. Lehmann : Large Sample Theory