Exercise 13: hypothesis testing (Pg. 255)
The Excel file Sales Data provides data on a sample of customers. An industry trade publication stated that the average profit per customer for this industry was at least $4,500. Using a test of hypothesis, do the data support this claim or not?
data <- read.csv("./data/sales_data.csv")
data
summary(data)
##     Customer     Percent.Gross.Profit  Gross.Sales      Gross.Profit
##  Min.   : 1.00   Min.   :0.0300       Min.   :   170   Min.   :   40.6
##  1st Qu.:15.75   1st Qu.:0.1400       1st Qu.:  2646   1st Qu.:  435.1
##  Median :30.50   Median :0.2000       Median :  6760   Median : 1662.4
##  Mean   :30.50   Mean   :0.2119       Mean   : 25016   Mean   : 4239.2
##  3rd Qu.:45.25   3rd Qu.:0.2450       3rd Qu.: 32171   3rd Qu.: 5690.4
##  Max.   :60.00   Max.   :0.6000       Max.   :179101   Max.   :25379.3
##  Industry.Code   Competitive.Rating
##  Min.   :1.000   Min.   :1
##  1st Qu.:3.000   1st Qu.:2
##  Median :5.000   Median :3
##  Mean   :4.483   Mean   :3
##  3rd Qu.:6.000   3rd Qu.:4
##  Max.   :7.000   Max.   :5
H0: Mean(Profit) >= 4500
H1: Mean(Profit) < 4500
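Written out, the hypotheses and the statistic for a lower-tailed test of a mean can be sketched as follows. The symbols $\bar{x}$ (sample mean of Gross.Profit), $s$ (sample standard deviation of Gross.Profit), $n$ (sample size), $\mu_0 = 4500$ and $z_\alpha$ (upper-$\alpha$ quantile of the standard normal) are notation introduced here for illustration, and the normal approximation is an assumption that is usually reasonable for a sample of this size.

```latex
\[
H_0:\ \mu \ge \mu_0 = 4500
\qquad\text{versus}\qquad
H_1:\ \mu < \mu_0 = 4500
\]
\[
z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}},
\qquad
\text{reject } H_0 \text{ if } z < -z_\alpha
\quad (\text{for example } -z_{0.05} \approx -1.645).
\]
```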
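m <- 4500                       # hypothesized mean profit under H0
p <- mean(data$Gross.Profit)    # sample mean profit (4239.2 in the summary above)
n <- nrow(data)                 # sample size
s2 <- var(data$Gross.Profit)    # sample variance of profit, the variable being tested
z <- (p-m)/sqrt(s2/n)           # test statistic: distance from 4500 in standard errors
z
Here p is the sample mean (not a sample proportion), and z measures how many standard errors it lies from the hypothesized value of 4500. Since 4239.2 is below 4500, z is negative. The standard error must be computed from Gross.Profit: computing it from var(data$Gross.Sales) instead gives z = -0.0556 (a lower-tail p-value of about 0.48), which reflects the much larger spread of gross sales and is not the statistic for this test. Because this is a lower-tailed test, we reject H0 if the test statistic is smaller than the critical value (-1.645 at the 5% significance level), or equivalently if the p-value pnorm(z) is smaller than the chosen significance level. If H0 is rejected, the data do not support the claim that the average profit per customer is at least $4,500. Otherwise, the data provide no evidence against the claim.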
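The decision rule for a lower-tailed test can be sketched in R as below. This is an illustration under stated assumptions, not the exercise's result: the `profit` vector is hypothetical (it is not the Sales Data sample), `alpha` is an assumed 5% significance level, and the t distribution is swapped in for the normal because the standard deviation is estimated from the sample. For a sample of about 60 customers the two give very similar answers.

```r
# Hypothetical profit figures, for illustration only (not the Sales Data sample)
profit <- c(1200, 5300, 800, 4100, 9700, 2500, 600, 3900, 15200, 1700)
mu0 <- 4500      # hypothesized mean under H0
alpha <- 0.05    # assumed significance level

n <- length(profit)
xbar <- mean(profit)                # sample mean
se <- sd(profit) / sqrt(n)          # estimated standard error of the mean
t_stat <- (xbar - mu0) / se         # test statistic

p_value <- pt(t_stat, df = n - 1)   # lower-tail p-value
t_crit <- qt(alpha, df = n - 1)     # lower-tail critical value (negative)
reject <- t_stat < t_crit           # TRUE means H0 is rejected at level alpha

# Cross-check against the built-in one-sample t-test
check <- t.test(profit, mu = mu0, alternative = "less")

cat("t =", t_stat, " p-value =", p_value, " reject H0:", reject, "\n")
```

Comparing the statistic with the critical value and comparing the p-value with `alpha` always lead to the same decision. On the real data, `t.test(data$Gross.Profit, mu = 4500, alternative = "less")` would report the statistic and p-value directly.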