library(tidyverse)
v <- read_csv("velo.csv")
Lightly comment your code and use pipes for readability.
Comment briefly on each of the questions, as directed. Only the final question requires a lengthier response.
Plot the distribution of spent by
checkout_system. Below you will use a t-test to compare
these distributions statistically. However, a t-test assumes normally
distributed data. Is that assumption valid in this case? Why or why
not?
Note:
You could compare the two distributions using histograms but a density plot works better. (A boxplot is also an option.)
Make sure to include a plot title.
library(ggplot2)
# Density plot of spent, one curve per checkout system
ggplot(v, aes(x = spent, col = checkout_system)) +
  geom_density() +
  theme_minimal() +
  labs(title = "Distribution of spent by checkout system")
Answer: The assumption is approximately valid. According to the density plot, spent is roughly bell-shaped for both checkout systems, with a slightly longer right tail. Because the departure from normality is mild and both groups are large, a t-test is an appropriate method for comparing the distributions statistically.
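Whether a mild right skew matters for a t-test can be explored with a small simulation. The sketch below does not use the velo data: it assumes, purely for illustration, that a gamma distribution with arbitrary `shape` and `scale` values stands in for spent, and the helper names (`draw_sample`, `skewness`) are hypothetical. It compares the skewness of one raw sample with the skewness of the means of many repeated samples, which is the quantity a t-test actually relies on.

```r
set.seed(1)

# Assumption: a right-skewed gamma distribution stands in for `spent`.
# The shape and scale values are illustrative, not estimated from velo.csv.
n_per_group <- 1600
draw_sample <- function(n) rgamma(n, shape = 3, scale = 750)

# Sample skewness: positive values indicate a longer right tail
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Skewness of a single raw sample
raw <- draw_sample(n_per_group)
raw_skew <- skewness(raw)

# Skewness of the sampling distribution of the mean (2000 repeated samples)
sample_means <- replicate(2000, mean(draw_sample(n_per_group)))
means_skew <- skewness(sample_means)

round(c(raw = raw_skew, means = means_skew), 2)
```

Under these assumptions the raw sample is clearly skewed while the sample means are close to symmetric, which is why a t-test is usually considered reasonable for large groups even when the data are not exactly normal.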
Create a summary table of spent by
checkout_system with the following statistics:
Your table should have 2 rows and 8 columns.
# Summary statistics of spent for each checkout system, with a 95% CI for the mean
v %>%
  group_by(checkout_system) %>%
  summarize(n = n(),
            mean = mean(spent),
            sd = sd(spent),
            median = median(spent),
            se = (sd / sqrt(n)),
            lowerCI = (mean - 1.96 * se) %>% round(2),
            upperCI = (mean + 1.96 * se) %>% round(2))
## # A tibble: 2 × 8
##   checkout_system     n  mean    sd median    se lowerCI upperCI
##   <chr>           <int> <dbl> <dbl>  <dbl> <dbl>   <dbl>   <dbl>
## 1 new              1828 2280. 1316.  2100.  30.8   2220.   2340.
## 2 old              1655 2217. 1277.  2091.  31.4   2156.   2279.
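The confidence interval columns can be checked by hand from the printed summary statistics. The sketch below uses the `new` row only. The numbers are copied from the output in this document (the standard deviation is the rounded value from the table, and the unrounded mean comes from the t-test output), and the variable names are illustrative. It also shows the t-based critical value as an alternative to 1.96; with samples this large the two are nearly the same.

```r
# Values copied from the printed output for the `new` checkout system.
# Assumption: sd is the rounded value shown in the table.
n <- 1828
mean_spent <- 2279.890
sd_spent <- 1316

# Standard error of the mean
se <- sd_spent / sqrt(n)

# 95% CI using the normal critical value, as in the summary table
ci_normal <- mean_spent + c(-1, 1) * 1.96 * se

# 95% CI using the t critical value with n - 1 degrees of freedom
t_crit <- qt(0.975, df = n - 1)
ci_t <- mean_spent + c(-1, 1) * t_crit * se

round(rbind(normal = ci_normal, t = ci_t), 1)
```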
Is average spending significantly higher in the treatment group? (The
treatment group consists of the customers using the new checkout
system.) Answer this question using a 2 sample, 2-tailed t-test with
alpha set at .05. (Note that these are the default settings for the
t.test() function when vectors are supplied for the x and y
arguments.)
# Welch 2-sample, 2-tailed t-test of spent: old (x) vs. new (y) checkout system
t.test(x = filter(v, checkout_system == 'old')$spent,
       y = filter(v, checkout_system == 'new')$spent)
##
## Welch Two Sample t-test
##
## data: filter(v, checkout_system == "old")$spent and filter(v, checkout_system == "new")$spent
## t = -1.4272, df = 3464.4, p-value = 0.1536
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -148.93475 23.45215
## sample estimates:
## mean of x mean of y
## 2217.148 2279.890
Answer: The p-value for the t-test is 0.1536, which is greater than the alpha of 0.05, so we fail to reject the null hypothesis. Average spending in the treatment group is not significantly different from that in the control group, which is consistent with the density plot, where the two distributions of spent by checkout system appear to differ very little. In other words, there is no statistically significant evidence that the new checkout system increased customers' average spending.
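To see where the numbers in the Welch output come from, the test statistic can be rebuilt from the group summaries alone. This is a sketch that uses the counts, means, and (rounded) standard deviations printed in this document, so the results should match the `t.test()` output only approximately. The variable names are illustrative.

```r
# Summary statistics copied from the printed output (sds are rounded).
n_old <- 1655; mean_old <- 2217.148; sd_old <- 1277
n_new <- 1828; mean_new <- 2279.890; sd_new <- 1316

# Squared standard error of each group mean
v_old <- sd_old^2 / n_old
v_new <- sd_new^2 / n_new

# Welch t statistic: difference in means over its standard error
t_stat <- (mean_old - mean_new) / sqrt(v_old + v_new)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v_old + v_new)^2 /
  (v_old^2 / (n_old - 1) + v_new^2 / (n_new - 1))

# 2-tailed p-value
p_value <- 2 * pt(-abs(t_stat), df = df_welch)

round(c(t = t_stat, df = df_welch, p = p_value), 4)
```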
Create another summary table of spent by
checkout_system and device. Include these same
statistics:
# Same summary statistics, now for each checkout system and device combination
v %>%
  group_by(checkout_system, device) %>%
  summarize(n = n(),
            mean = mean(spent),
            sd = sd(spent),
            median = median(spent),
            se = (sd / sqrt(n)),
            lowerCI = (mean - 1.96 * se) %>% round(2),
            upperCI = (mean + 1.96 * se) %>% round(2))
## # A tibble: 4 × 9
## # Groups:   checkout_system [2]
##   checkout_system device       n  mean    sd median    se lowerCI upperCI
##   <chr>           <chr>    <int> <dbl> <dbl>  <dbl> <dbl>   <dbl>   <dbl>
## 1 new             computer   829 2228. 1303.  2058.  45.2   2139.   2317.
## 2 new             mobile     999 2323. 1326.  2145.  42.0   2241.   2405.
## 3 old             computer   857 2256. 1274.  2147.  43.5   2171.   2342.
## 4 old             mobile     798 2175. 1279.  2027.  45.3   2086.   2264.
The table should have 4 rows and 9 columns.
Based on this information (as well as Sarah's observation, noted in the case description, that the glitch in the checkout system seemed more prevalent for mobile users), an additional statistical comparison of new and old among just mobile users seems warranted. Make that comparison using a 2 sample, 2-tailed t-test with alpha set at .05. Report your results.
Note that a t-test can only compare two groups. Therefore, you will need to subset the data before making the comparison.
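# Subset to mobile users, split by checkout system
x <- v %>%
  filter(checkout_system == 'old', device == 'mobile')
y <- v %>%
  filter(checkout_system == 'new', device == 'mobile')
# Welch 2-sample, 2-tailed t-test of spent: old (x) vs. new (y)
t.test(x = x$spent,
       y = y$spent)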
##
## Welch Two Sample t-test
##
## data: x$spent and y$spent
## t = -2.399, df = 1733.1, p-value = 0.01655
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -269.13848 -27.01302
## sample estimates:
## mean of x mean of y
## 2174.920 2322.996
Answer: The p-value is 0.01655, which is smaller than the alpha level of 0.05, so we reject the null hypothesis. Among mobile users, average spending differs significantly between the treatment and control groups: it is significantly higher for those using the new checkout system (about 2323 versus 2175). This is consistent with the glitch in the old checkout system mainly affecting mobile users.
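The same comparison can also be written with the formula interface of `t.test()`, which avoids creating two separate objects. The sketch below runs on simulated data rather than velo.csv: the data frame `sim`, its gamma-distributed spent values, and the group sizes are all hypothetical stand-ins, and base R `subset()` is used so the block has no package dependencies. One thing to watch is that the formula interface orders the groups alphabetically (`new` before `old`), so the sign of the t statistic flips relative to the `x = old`, `y = new` call, while the p-value is unchanged.

```r
set.seed(42)

# Hypothetical stand-in for the velo data (values are simulated, not real)
sim <- data.frame(
  checkout_system = rep(c("old", "new"), each = 500),
  device = "mobile",
  spent = c(rgamma(500, shape = 3, scale = 725),
            rgamma(500, shape = 3, scale = 775))
)

# Keep mobile users only, then compare the two checkout systems
mobile <- subset(sim, device == "mobile")

# Formula interface: groups are ordered alphabetically (new, then old)
by_formula <- t.test(spent ~ checkout_system, data = mobile)

# Vector interface, ordered old (x) then new (y)
by_vectors <- t.test(x = mobile$spent[mobile$checkout_system == "old"],
                     y = mobile$spent[mobile$checkout_system == "new"])

c(formula_t = unname(by_formula$statistic),
  vectors_t = unname(by_vectors$statistic))
```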
What course of action should Sarah recommend to the management at velo.com? Please incorporate your analytic results from above in fashioning an answer.
Answer: Sarah should recommend that the management at velo.com roll out the new checkout system and retire the old one. The p-value in a t-test is the probability of observing a difference in sample means at least as extreme as the one actually observed, assuming that the two groups truly do not differ. In the overall comparison the difference in average spending was not statistically significant (p = 0.1536), but in the mobile-only t-test in Q4 the p-value was 0.01655, which is smaller than the alpha level of 0.05. A difference this large would therefore be unlikely if average spending by mobile users were the same under the new and old checkout systems; the 95% confidence interval indicates that average spending among mobile users is between about 27 and 269 lower under the old system. In other words, the glitch in the old checkout system appears to affect mobile users in particular, costing velo.com revenue. Therefore, to capture the revenue from mobile customers, velo.com should implement the new checkout system.
In looking at the summary tables you created above you might wonder about differences not just in spending but also in the number of customers. After all, the case description indicated that customers may have been prevented from completing purchases using the old checkout system. Here are the counts:
table(v$checkout_system)
##
## new old
## 1828 1655
Obviously there are some notable differences in the number of customers. Are these differences statistically significant?
We could answer this question using simulation. For example, the binomial distribution could be used to represent the null distribution, the number of expected buyers under the null hypothesis of no difference between the checkout systems (that is, no difference in buying probability). The observed proportion of buyers under the new system is 1828 / (1828 + 1655) = .525. How often would this proportion occur under the null?
# We will use the rbinom() function to do this simulation. n refers to the number of simulations,
# size refers to the number of trials, and prob is the probability of getting a 1 under the null.
# Example:
rbinom(n = 1, size = 1, prob = .5)
## [1] 1
rbinom(10, 1, .5)
## [1] 0 1 1 1 0 0 1 0 1 0
rbinom(10, 10, .5)
## [1] 5 8 3 6 7 4 3 5 4 5
# Here is the simulation. Note that we divide by the total number of trials to obtain the proportion of 1s.
set.seed(123)
sims <- rbinom(n = 100000, size = nrow(v), prob = .5) / nrow(v)
hist(sims)
The observed proportion would not happen very often under the null. Let's calculate a formal p-value.
(sims >= (1828 / (1828 + 1655))) %>% mean
## [1] 0.00179
We would double this for a 2-sided test, of course, but the result is still easily statistically significant at the conventional threshold of p < .05.
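The simulated p-value can be cross-checked against the exact binomial distribution, which removes the Monte Carlo noise. A minimal sketch, using only the two counts from the table above and base R functions (`pbinom()` and `binom.test()`); the variable names are illustrative:

```r
n_new <- 1828
n_total <- 1828 + 1655

# Exact upper-tail probability P(X >= 1828) under Binomial(3483, 0.5)
p_upper <- pbinom(n_new - 1, size = n_total, prob = 0.5, lower.tail = FALSE)

# Exact 2-sided binomial test of the same null hypothesis
exact <- binom.test(n_new, n_total, p = 0.5)

c(one_sided = p_upper, two_sided = exact$p.value)
```

Because the null probability is 0.5, the binomial distribution is symmetric, so the exact 2-sided p-value is simply twice the upper-tail probability.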
The Chi-squared test is the statistical test typically used in this situation to do a formal hypothesis test of the counts in a 1 x 2 or 2 x 2 (or larger) contingency table. Here is a Khan Academy video on it:
And here is the Wikipedia article:
https://en.wikipedia.org/wiki/Chi-squared_test.
Here is the R function:
?chisq.test
Note that this R function takes a table as its argument:
chisq.test(table(v$checkout_system))
##
## Chi-squared test for given probabilities
##
## data: table(v$checkout_system)
## X-squared = 8.5929, df = 1, p-value = 0.003375
Notice that the p-value (0.003375) is almost identical to the doubled, 2-sided p-value from the simulation (2 * 0.00179 = 0.00358)!
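The Chi-squared statistic reported above can be reproduced by hand, which makes the logic of the test explicit: under the null, each checkout system is expected to receive half of the 3483 customers, and the statistic sums the squared deviations from that expectation, scaled by the expected count. A sketch using only the two observed counts (variable names are illustrative):

```r
observed <- c(new = 1828, old = 1655)

# Expected counts under the null of equal probabilities
expected <- rep(sum(observed) / 2, 2)

# Chi-squared statistic: sum of (observed - expected)^2 / expected
x_squared <- sum((observed - expected)^2 / expected)

# p-value from the Chi-squared distribution with 1 degree of freedom
p_value <- pchisq(x_squared, df = 1, lower.tail = FALSE)

# Built-in test on the same counts, for comparison
builtin <- chisq.test(observed)

c(x_squared = x_squared, p_value = p_value)
```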