library(tidyverse)
## ── Attaching packages ────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.0.3 ✓ dplyr 1.0.2
## ✓ tidyr 1.1.1 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.5.0
## ── Conflicts ───────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
Questions
Read the following article on applied statistical practice:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004961
Read the following eBay digital experimentation blog post:
https://tech.ebayinc.com/research/measuring-success-with-experimentation/
You want to evaluate the efficiency of employees and consider two potential approaches for improvement. Employees were randomly divided into three separate groups: Treatment A, Treatment B, and a Control condition. Those in the control group were “business as usual” (i.e., no change). Those in Treatment A participated in a seminar informing the employees of how their work explicitly impacts the company and consumers in need, highlighting how beneficial each employee is, and encouraging communication between all of the employees, including upper management. Those in Treatment B were told they would receive a small bonus if they were able to increase overall efficiency by 10%. You would like to evaluate if there are any differences (i.e., does it matter/is there an effect?). The results are given below; a higher score indicates a higher level of efficiency. The scores are as follows:
Treatment A: 150, 170, 148, and 146
Treatment B: 157, 129, 141, and 141
Control: 119, 142, 123, and 136
```r
A <- c(150,170,148,146)
B <- c(157,129,141,141)
control <- c(119,142,123,136)
grand_mean <- ((mean(A)*4)+(mean(B)*4)+mean(control)*4)/12
```
Carefully input these values into your program of choice and perform an ANOVA.
What is the sum of squares between treatments?
SSbetween <- (4*(mean(A)-grand_mean)^2)+ (4*(mean(B)-grand_mean)^2)+
(4*(mean(control)-grand_mean)^2)
SSbetween
## [1] 1104.667
Compute the mean square between treatments.
MSbetween <- SSbetween/(3-1)
MSbetween
## [1] 552.3333
Compute the sum of squares due to error.
SSwithin <-sd(A)^2 * 3 + sd(B)^2 * 3 + sd(control)^2 * 3
SSwithin
## [1] 1117
Compute the mean square due to error.
MSwithin <- SSwithin/(12-3)  # within df = N - k = 12 observations - 3 groups
MSwithin
## [1] 124.1111
Fill in the values for the ANOVA table for this problem by replacing 1–10, which serve as placeholders in the ANOVA source table.
| Source | SS | DF | MS | F | p-value |
|---|---|---|---|---|---|
| Between | 1 | 2 | 3 | 4 | 5 |
| Within | 6 | 7 | 8 | | |
| Total | 9 | 10 | | | |
Answer:
| Source | SS | DF | MS | F | p-value |
|---|---|---|---|---|---|
| Between | 1104.6666667 | 2 | 552.3333333 | 4.4503133 | 0.0453 |
| Within | 1117 | 9 | 124.1111111 | | |
| Total | 2221.6666667 | 11 | | | |
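As a cross-check on the hand calculation, the same one-way ANOVA can be run with `aov()` on the twelve scores. This is a minimal sketch; the data frame and column names (`scores`, `efficiency`, `group`) are illustrative choices and not part of the assignment.

```r
# Scores from the problem statement
A <- c(150, 170, 148, 146)
B <- c(157, 129, 141, 141)
control <- c(119, 142, 123, 136)

# Long-format data frame (names are illustrative)
scores <- data.frame(
  efficiency = c(A, B, control),
  group = factor(rep(c("A", "B", "control"), each = 4))
)

fit <- aov(efficiency ~ group, data = scores)
tab <- summary(fit)[[1]]  # the ANOVA table as a data frame

# Row 1 is the between-groups line, row 2 is the residual (within) line
ss_between <- tab[["Sum Sq"]][1]
ss_within  <- tab[["Sum Sq"]][2]
f_value    <- tab[["F value"]][1]
p_value    <- tab[["Pr(>F)"]][1]

tab
```

The degrees of freedom reported by `aov()` (2 between, 9 within) are the ones that belong in the table.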
With a Type I error rate of \(\alpha\)=.05, are there any differences among the groups with regards to their mean? What is the managerial take-away?
Answer:
Because the p-value is less than .05, we reject the null hypothesis and infer that the population group means are not all equal. The managerial take-away is that the approach to improvement matters: different approaches yield different average levels of efficiency.
As a separate but related question, formally evaluate whether the population variance of Treatment A is equal to the population variance of the Control group.
Answer:
An F-test comparing the variances of Treatment A and the Control group gives a p-value of 0.9629, which is greater than .05. We therefore fail to reject the null hypothesis of equal variances; there is no evidence that the two population variances differ.
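One way to carry out this comparison in R is `var.test()`, which performs a two-sided F-test on the ratio of two sample variances under the assumption of normally distributed populations. The sketch below uses the four scores per group from the problem statement; the object name `vt` is illustrative.

```r
A <- c(150, 170, 148, 146)
control <- c(119, 142, 123, 136)

# Two-sided F-test of H0: var(A) / var(control) = 1
vt <- var.test(A, control)

vt$statistic  # ratio of the two sample variances
vt$parameter  # numerator and denominator degrees of freedom
vt$p.value    # two-sided p-value
```

With only four observations per group this test has very little power, so a large p-value should not be read as proof that the variances are equal.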
```r
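# Lower-tail probability for the ANOVA F statistic; the p-value is 1 minus this value.
# Note: the within (error) df for 12 observations in 3 groups is 9, not 10;
# with df2 = 9 the upper-tail p-value is about .045.
pf(4.45, df1=2, df2=10, lower.tail= TRUE)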
```
```
## [1] 0.9585341
```
The question above only addressed four observations per group. Now, address the same question with the full set of data. The data file is available here as a CSV file, here as an SPSS file, or in the following directory: http://bit.ly/MSBA_Data. Alternatively, the data can be read into R using the following code:
read.csv("https://www.dropbox.com/s/uoc47ijkig3aqx2/Efficiency_Treatments.csv?dl=1")
In words, what is the null hypothesis?
Answer:
H0: µ1 = µ2 = µ3
All of the population mean efficiencies are the same, regardless of treatment.
In words, what is the alternative hypothesis?
Answer:
H1: not all of µ1, µ2, µ3 are equal
At least one population mean efficiency differs from the others.
Assuming homogeneity of variance across the groups, what is the best estimate of each group's population variance?
treatment <- read.csv("https://www.dropbox.com/s/uoc47ijkig3aqx2/Efficiency_Treatments.csv?dl=1")
treatment <- treatment %>%
mutate(Treatment = as.factor(Treatment))
model_treatment <- aov(Efficiency ~ Treatment, data = treatment)
summary(model_treatment)
## Df Sum Sq Mean Sq F value Pr(>F)
## Treatment 2 14630 7315 83.69 <2e-16 ***
## Residuals 147 12849 87
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The best estimate is the mean square error (MS within) = 87
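The reason the mean square error is the best estimate is that, under homogeneity of variance, it is the pooled (degrees-of-freedom weighted) average of the group sample variances. The sketch below illustrates the identity with the four-observation data from the first question rather than the full data file; the object names are illustrative.

```r
A <- c(150, 170, 148, 146)
B <- c(157, 129, 141, 141)
control <- c(119, 142, 123, 136)
groups <- list(A = A, B = B, control = control)

n  <- sapply(groups, length)  # group sizes
s2 <- sapply(groups, var)     # group sample variances

# Pooled variance: each group's variance weighted by its degrees of freedom
pooled_var <- sum((n - 1) * s2) / sum(n - 1)

# The same quantity taken from the ANOVA fit: residual SS / residual df
y <- unlist(groups)
g <- factor(rep(names(groups), times = n))
fit <- aov(y ~ g)
mse <- deviance(fit) / df.residual(fit)

c(pooled_var = pooled_var, mse = mse)
```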
Given your analysis, in words, what can you conclude about the treatments (using a Type I error rate of .05)?
Answer:
The p-value (< 2e-16) is less than .05, so we reject the null hypothesis: the treatment groups do not all have the same population mean efficiency.
A bank is studying the average time that it takes tellers to serve a customer. Customers line up and are served by the next available teller. Tellers that are slower will serve fewer customers, on average, than those that are faster. The ANOVA table below was developed based on the transactions between tellers and customers.
| Source | SS | DF | MS | F | p-value |
|---|---|---|---|---|---|
| Between | 3419.20 | 7 | | | |
| Within | 61259.80 | | | | |
| Total | | 153 | | | |
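Every blank in a table like this follows from two identities: the sums of squares add up (SS total = SS between + SS within) and so do the degrees of freedom (df total = df between + df within). A minimal sketch, using a hypothetical helper named `complete_anova()` that is not part of the assignment:

```r
# Fill in a one-way ANOVA table from the two sums of squares,
# the between-groups df, and the total df.
complete_anova <- function(ss_between, ss_within, df_between, df_total) {
  df_within  <- df_total - df_between
  ms_between <- ss_between / df_between
  ms_within  <- ss_within / df_within
  f_value    <- ms_between / ms_within
  list(
    groups     = df_between + 1,  # k = df between + 1
    n_total    = df_total + 1,    # N = df total + 1
    ss_total   = ss_between + ss_within,
    df_within  = df_within,
    ms_between = ms_between,
    ms_within  = ms_within,
    f_value    = f_value,
    p_value    = pf(f_value, df_between, df_within, lower.tail = FALSE)
  )
}

teller <- complete_anova(ss_between = 3419.20, ss_within = 61259.80,
                         df_between = 7, df_total = 153)
str(teller)
```

Note that the denominator degrees of freedom for the F distribution are the within df (153 - 7 = 146), not the total df.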
How many groups (i.e., Tellers) took part in this study?
Answer:
8 tellers (7+1)
How many customers were in this study (i.e., what is the total sample size)?
Answer:
154 customers (153+1)
What is the value of the \(SS_{Total}\)?
61259.80 + 3419.20
## [1] 64679
What is the value of the \(MS_{Between}\)?
MSbetween <- 3419.20/7
MSbetween
## [1] 488.4571
What is the value of the \(MS_{Within}\)?
MSwithin <- 61259.80/(153-7)
MSwithin
## [1] 419.5877
What is the \(F\)-value?
MSbetween/MSwithin
## [1] 1.164136
What is the \(p\)-value?
1-pf(1.164136,7,153)  # note: the within df is 153 - 7 = 146; using df2 = 146 changes the p-value only slightly
## [1] 0.3266453
In words, what is the null hypothesis?
Answer:
The null hypothesis is that all eight tellers have the same population mean service time; that is, no teller is systematically faster or slower than the others.
In words, what is an alternative hypothesis?
Answer:
The alternative hypothesis is that at least one teller's population mean service time differs from the others; that is, some tellers are systematically faster or slower.
In words, what are the overall conclusions that can be drawn from the study?
Answer:
With p ≈ .33, which is greater than .05, we fail to reject the null hypothesis. The data do not provide evidence that the tellers differ in their mean service time.
You are a consultant at a pharmaceutical company that wishes to determine if the temperature used when producing a specific drug will have an impact on the efficacy, that is, the drug’s effectiveness at treating disease. The thinking is that, due to a different chemical reaction at the different temperatures, the efficacy could be impacted, but it is unknown. To study the effect of temperature, five batches were produced at each of three temperature levels. The Temperature_on_Effectiveness data file is available here as a CSV file and here as an SPSS file. Alternatively, the data can be read into R using the following code:
read.csv("https://www.dropbox.com/s/b1wm79zipfizc1m/Temperature_on_Effectiveness.csv?dl=1")
drugs <- read.csv("https://www.dropbox.com/s/b1wm79zipfizc1m/Temperature_on_Effectiveness.csv?dl=1")
drugs <- drugs %>%
mutate(Label = as.factor(Label))
drugs_model <- aov(Effectiveness ~ Label, data = drugs) %>%
summary()
What is the value of the sum of squares between treatments?
70
## [1] 70
What is the value of the mean square between treatments?
35
## [1] 35
What is the value of the mean square within?
19.67
## [1] 19.67
In words, what is the mean square within?
Answer:
The mean square within is the pooled estimate of the variance of observations within the same group: the within-group sum of squares divided by its degrees of freedom. It measures how much batches produced at the same temperature vary around their own group mean.
Using a .05 level of significance, explain whether Temperature has an effect on the mean yield of the process.
.21
## [1] 0.21
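Because the p-value (.21) is greater than .05, we fail to reject the null hypothesis; there is no evidence that temperature affects mean effectiveness.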
After some recent bad press regarding the field of marketing, a public relations firm working on behalf of marketing associations gathered public opinion on the perception of corporate ethical values among individuals specializing in marketing. A higher score indicates the perception of higher ethical values. The data file is available here as a CSV file, here as an SPSS file, or in the following directory: http://bit.ly/MSBA_Data. Alternatively, the data can be read into R using the following code:
read.csv("https://www.dropbox.com/s/y6ur8rg1spb5hhf/Marketing_Ethics.csv?dl=1")
marketing <- read.csv("https://www.dropbox.com/s/y6ur8rg1spb5hhf/Marketing_Ethics.csv?dl=1")
marketing <- marketing %>%
mutate(Labels = as.factor(Labels))
marketing_model <- aov(Rating ~ Labels, data = marketing)
summary(marketing_model)
## Df Sum Sq Mean Sq F value Pr(>F)
## Labels 2 7.0 3.5 7 0.00712 **
## Residuals 15 7.5 0.5
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
What are the relevant descriptive statistics of the data (be sure to appropriately provide what is of interest in this multiple group context)?
marketing %>%
group_by(Labels) %>%
summarise(mean = mean(Rating, na.rm=TRUE),
median = median(Rating, na.rm=TRUE),
sd = sd(Rating, na.rm=TRUE),
Pct.1 = quantile(Rating, probs=c(.01), na.rm=TRUE),
Pct.5 = quantile(Rating, probs=c(.05), na.rm=TRUE),
Pct.10 = quantile(Rating, probs=c(.1), na.rm=TRUE),
Pct.90 = quantile(Rating, probs=c(.90), na.rm=TRUE),
Pct.95 = quantile(Rating, probs=c(.95), na.rm=TRUE),
Pct.99 = quantile(Rating, probs=c(.99), na.rm=TRUE),
IQR = IQR(Rating, na.rm=TRUE)) -> Descriptives_by_Labels
## `summarise()` ungrouping output (override with `.groups` argument)
Descriptives_by_Labels
What is/are the relevant visual(s) in this context?
qplot(Labels, Rating, data = marketing,
geom= "boxplot", fill = Labels)
Are there any group differences in terms of public perceptions among these three branches of marketing?
'p-value = 0.00712'
## [1] "p-value = 0.00712"
TukeyHSD(marketing_model)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = Rating ~ Labels, data = marketing)
##
## $Labels
## diff lwr upr p adj
## Managers-Advertisers -1.0 -2.060413 0.06041278 0.0659476
## Researchers-Advertisers -1.5 -2.560413 -0.43958722 0.0060038
## Researchers-Managers -0.5 -1.560413 0.56041278 0.4574280
Suppose an analyst performs seven 95% confidence intervals that are independent of one another. What is the probability that all seven of the confidence intervals correctly bracket their respective population parameter?
Answer:
.95*.95*.95*.95*.95*.95*.95 = 0.6983373
After performing an ANOVA on the pharmaceutical effectiveness data, in which temperature was evaluated as a factor in the efficacy of the drug, the results led to further questions. Perform the appropriate follow-up tests to address the more targeted questions. Recall that the data set used for the ANOVA you performed is available here as a CSV file, here as an SPSS file, or by downloading the data Temperature_on_Effectiveness from the following directory: http://bit.ly/MSBA_Data.
temperature <- read_csv('https://www.dropbox.com/s/b1wm79zipfizc1m/Temperature_on_Effectiveness.csv?dl=1')
## Parsed with column specification:
## cols(
## Temperature = col_double(),
## Label = col_character(),
## Effectiveness = col_double()
## )
temperature <- temperature %>%
mutate(Label = as.factor(Label))
Use the most appropriate (i.e., statistically optimal) method for evaluating the differences between the (i) Medium and Low group (i.e., \(\mu_{Medium} - \mu_{Low}\)), (ii) High and Low group (i.e., \(\mu_{High} - \mu_{Low}\)), and (iii) High and Medium group \(\mu_{High} - \mu_{Medium}\), such that the set of tests have a 5% Type I error rate. Write the conclusions in a sentence, include the confidence interval, \(p\)-value, and summary of the findings.
TukeyHSD(aov(Effectiveness~Label,data = temperature))
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = Effectiveness ~ Label, data = temperature)
##
## $Label
## diff lwr upr p adj
## Low-High 5 -2.482712 12.482712 0.2166815
## Medium-High 1 -6.482712 8.482712 0.9327141
## Medium-Low -4 -11.482712 3.482712 0.3591042
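When reading a `TukeyHSD()` table, a comparison is significant at the family-wise level exactly when its simultaneous interval excludes zero, and a row can be reversed (for example from Low-High to High-Low) by negating the difference and swapping the negated bounds. The sketch below shows both steps on simulated toy data; the data, seed, and object names are hypothetical and are not the assignment's data set.

```r
set.seed(1)

# Hypothetical toy data: three groups of five observations each
toy <- data.frame(
  y = c(rnorm(5, 50, 4), rnorm(5, 54, 4), rnorm(5, 49, 4)),
  g = factor(rep(c("High", "Low", "Medium"), each = 5))
)

# TukeyHSD() returns one matrix per factor; convert it to a data frame
tk <- as.data.frame(TukeyHSD(aov(y ~ g, data = toy))$g)

# A comparison is significant when the simultaneous CI excludes zero
tk$excludes_zero <- tk$lwr > 0 | tk$upr < 0

# Reverse the direction of one comparison (Low-High becomes High-Low)
low_high <- tk["Low-High", ]
high_low <- c(diff = -low_high$diff, lwr = -low_high$upr, upr = -low_high$lwr)

tk
high_low
```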
What is the conclusion regarding the effect of Medium as compared to Low?
Answer:
Not statistically significant: the Medium group's mean is 4 units lower than the Low group's (95% simultaneous CI [-11.48, 3.48], adjusted p = .359), and the interval includes zero.
What is the conclusion regarding the effect of High as compared to Low?
Answer:
Not statistically significant: the High group's mean is 5 units lower than the Low group's (95% simultaneous CI [-12.48, 2.48], adjusted p = .217), and the interval includes zero.
What is the conclusion regarding the effect of High as compared to Medium?
Answer:
Not statistically significant: the High group's mean is 1 unit lower than the Medium group's (95% simultaneous CI [-8.48, 6.48], adjusted p = .933), and the interval includes zero.
DS Sporting Goods offers a reward program that emails members showing the products on sale the upcoming week and offers a printable coupon. To assess the effectiveness of the program, 4,000 members were randomly selected to take part. In the first week of October, (a) 1,000 of the Rewards Members were sent an email with a flyer but no printable coupon, (b) 1,000 Rewards Members were sent an email with a flyer and a 10% off coupon for a single item, (c) 1,000 Rewards Members were sent an email with a flyer and a 20% off coupon for a single item, and (d) 1,000 Rewards Members were sent an email with a flyer and a 25% off coupon for a single item. Non-Members were also tracked as a control group. Note that because the data consist only of those who actually shopped, not all of the 1,000 members selected per group are represented in the data.
The data file is available here as a CSV file, here as an SPSS file, or by downloading the data DSSportingGoods in the following directory: http://bit.ly/MSBA_Data. Use SPSS and a .05 Type I error rate.
sports <- read_csv("https://www.dropbox.com/s/30dm1wy5havw8ua/DSSportingGoods.csv?dl=1")
## Parsed with column specification:
## cols(
## Group = col_double(),
## Sale = col_double()
## )
sports <- sports %>%
mutate(Group = replace(Group, Group == 0, 'control'),
Group = replace(Group, Group == 1, 'no_coupon'),
Group = replace(Group, Group == 2, '10%_coupon'),
Group = replace(Group, Group == 3, '20%_coupon'),
Group = replace(Group, Group == 4, '25%_coupon'),
Group = as.factor(Group))
What are the relevant descriptive statistics of the data (be sure to appropriately provide what is of interest in this multiple group environment)?
sports %>%
group_by(Group) %>%
summarise(n = n(),
mean = mean(Sale, na.rm=TRUE),
median = median(Sale, na.rm=TRUE),
sd = sd(Sale, na.rm=TRUE),
Pct.1 = quantile(Sale, probs=c(.01), na.rm=TRUE),
Pct.5 = quantile(Sale, probs=c(.05), na.rm=TRUE),
Pct.10 = quantile(Sale, probs=c(.1), na.rm=TRUE),
Pct.90 = quantile(Sale, probs=c(.90), na.rm=TRUE),
Pct.95 = quantile(Sale, probs=c(.95), na.rm=TRUE),
Pct.99 = quantile(Sale, probs=c(.99), na.rm=TRUE),
IQR = IQR(Sale, na.rm=TRUE)) -> Descriptives_by_Group
## `summarise()` ungrouping output (override with `.groups` argument)
Descriptives_by_Group
Are there any differences in the mean per-transaction sale price among any of the groups?
Descriptives_by_Group[,1:2]
summary(aov(Sale~Group, data = sports))
## Df Sum Sq Mean Sq F value Pr(>F)
## Group 4 71307 17827 20.83 3.03e-16 ***
## Residuals 705 603371 856
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
For the four contrasts requested in the parts that follow, maintain the overall (i.e., family-wise) Type I error rate at .05; the confidence intervals should therefore be simultaneous 95% confidence intervals.
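`TukeyHSD()` covers pairwise differences only, so the complex contrasts in this set (the mean of two groups against another group) need a different tool. One standard option that keeps the family-wise error rate at .05 for any contrast is Scheffé's method; it is offered here as an assumption, not necessarily the method the assignment intends. The helper `scheffe_contrast()` and the simulated data below are hypothetical.

```r
# Scheffe simultaneous CI and p-value for one contrast in a one-way layout.
# `weights` is a named vector of contrast coefficients (one per level, summing to 0).
scheffe_contrast <- function(y, g, weights, conf = 0.95) {
  g <- factor(g)
  stopifnot(setequal(names(weights), levels(g)), abs(sum(weights)) < 1e-8)
  w     <- weights[levels(g)]          # align weights with factor levels
  n     <- tapply(y, g, length)        # group sizes
  means <- tapply(y, g, mean)          # group means
  fit   <- aov(y ~ g)
  df1   <- nlevels(g) - 1
  df2   <- df.residual(fit)
  mse   <- deviance(fit) / df2         # pooled within-group variance
  est   <- sum(w * means)
  se    <- sqrt(mse * sum(w^2 / n))
  crit  <- sqrt(df1 * qf(conf, df1, df2))  # Scheffe critical value
  p_adj <- pf((est / se)^2 / df1, df1, df2, lower.tail = FALSE)
  c(estimate = est, lwr = est - crit * se, upr = est + crit * se, p_adj = p_adj)
}

# Hypothetical toy data with the same group labels as the assignment
set.seed(42)
lv  <- c("10%_coupon", "20%_coupon", "25%_coupon", "control", "no_coupon")
toy <- data.frame(Group = factor(rep(lv, each = 20)),
                  Sale  = rnorm(100, mean = rep(c(60, 75, 70, 50, 60), each = 20), sd = 25))

# Mean of the 20% and 25% groups minus the 10% group
w3 <- c("10%_coupon" = -1, "20%_coupon" = 0.5, "25%_coupon" = 0.5,
        "control" = 0, "no_coupon" = 0)
res <- scheffe_contrast(toy$Sale, toy$Group, w3)
res
```

For purely pairwise comparisons, Scheffé intervals are wider than Tukey intervals, so the two methods should not be mixed within one family without deciding in advance which one defines the 5% error rate.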
Compare the 25% coupon group to the 20% coupon group (inference 1 of 4 in the set).
model <- aov(Sale ~ Group, data = sports )
TukeyHSD(model)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = Sale ~ Group, data = sports)
##
## $Group
## diff lwr upr p adj
## 20%_coupon-10%_coupon 15.1177380 3.172462 27.063014 0.0051533
## 25%_coupon-10%_coupon 9.2642084 -1.879070 20.407487 0.1546685
## control-10%_coupon -10.6682260 -20.334775 -1.001677 0.0220766
## no_coupon-10%_coupon -0.5701893 -15.214822 14.074443 0.9999709
## 25%_coupon-20%_coupon -5.8535297 -16.635248 4.928189 0.5727389
## control-20%_coupon -25.7859640 -35.033394 -16.538534 0.0000000
## no_coupon-20%_coupon -15.6879273 -30.059359 -1.316495 0.0243851
## control-25%_coupon -19.9324343 -28.117664 -11.747204 0.0000000
## no_coupon-25%_coupon -9.8343976 -23.546476 3.877680 0.2861259
## no_coupon-control 10.0980367 -2.443518 22.639592 0.1800860
What is the mean difference?
Answer:
-5.85353
What is the 95% (simultaneous) confidence interval?
Answer:
[-16.635248, 4.928189]
What is the (corrected) \(p\)-value?
Answer:
0.5727389
Provide a single sentence summarizing this analysis.
Answer:
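We fail to reject the null hypothesis: the mean sale for the 25% coupon group does not differ significantly from that of the 20% coupon group (difference = -5.85, 95% simultaneous CI [-16.64, 4.93], adjusted p = .573).
Compare the Non-Rewards Member to the 10% coupon group (inference 2 of 4 in the set).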
What is the mean difference?
Answer:
-10.6682260
What is the 95% (simultaneous) confidence interval?
Answer:
[-20.334775, -1.001677]
What is the (corrected) \(p\)-value?
Answer:
0.0220766
Provide a single sentence summarizing this analysis.
Answer:
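We reject the null hypothesis: the mean sale for Non-Rewards Members (the control group) is significantly lower than that of the 10% coupon group (difference = -10.67, 95% simultaneous CI [-20.33, -1.00], adjusted p = .022).
Compare the large discount groups (i.e., the mean of the 20% and 25% groups) to the 10% coupon group (inference 3 of 4 in the set).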
What is the mean difference?
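# Mean of the 20% and 25% group means minus the 10% group mean,
# computed by averaging the two Tukey differences against the 10% group.
mean(c(15.1177380, 9.2642084))
## [1] 12.19097
What is the 95% (simultaneous) confidence interval?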
# The answer is....
What is the (corrected) \(p\)-value?
# The answer is....
Provide a single sentence summarizing this analysis.
# The answer is....
Compare the large discount groups (i.e., the mean of the 20% and 25% groups) to the no/small coupon groups (i.e., the mean of no coupon and the 10% coupon group) (inference 4 of 4 in the set).
What is the mean difference?
# The answer is....
What is the 95% (simultaneous) confidence interval?
# The answer is....
What is the (corrected) \(p\)-value?
# The answer is....
Provide a single sentence summarizing this analysis.
# The answer is....
Below are the descriptive statistics for three groups with equal sample sizes in each of the groups here:
| Source | SS | df | MS | F |
|---|---|---|---|---|
| Between | 34.656 | 2 | 17.328 | .213 |
| Within | 2201.011 | 27 | 81.519 | |
| Total | 2235.667 | 29 | | |
Assume that all ANOVA assumptions are satisfied. Please answer the questions that follow based on the above descriptive statistics and the ANOVA table.
Can we conclude, statistically, that there are mean differences among the groups?
Answer:
MS Between: 17.328
MS Within: 81.519
If the population means differ, MS Between is expected to exceed MS Within, so F is expected to be larger than 1. Here 17.328 < 81.519, so F = .213 is well below 1.
No. The F test for the ANOVA is not statistically significant: the sample means vary no more than would be expected by chance if the population means were equal.
With a p-value of .810, which is larger than .05, we fail to reject the null hypothesis of equal population means.
What is the best, that is, statistically optimal, 95% confidence interval for the population mean of the Phase 2 Locations (i.e., the 1st group)? Hint: it is not \(CI_{.95}\)=[37.064, 51.956]; see the “Application of What You Learned: ANOVA” for a demonstration.
# The answer is....
Why is the above confidence interval you formed the “best?” Explain.
# The answer is....
Suppose that the ANOVA \(p\)-value had been .001, for example, rather than .810 (as it is above). Would your confidence interval in part b and your explanation in part c have been any different? Explain.
Answer:
We are looking for something more like this:
No, the p-value in this situation has no effect on the confidence interval. The confidence interval was calculated using MS_within as the best estimate of the variance, which gives a more statistically optimal standard error. It also has more degrees of freedom (from all groups) and therefore a smaller critical value (i.e., a narrower interval and more power). The p-value from the ANOVA (i.e., the .81) has nothing to do with our estimate of the pooled variance (MS_within). The p-value is instead related to how much the means differ from one another (as compared to MS_within); had the p-value been < .05, we would have concluded that the means differ, but the interval would be the same.
At the following location, here, is an R script file that includes a Monte Carlo simulation study. A Monte Carlo simulation such as this helps to understand the properties of statistical procedures when we know the truth and use the method in a non-standard way, or when mathematical properties are not known, so as to evaluate the properties of the methods under a variety of situations (that we specify) in a large number of repeated trials. The idea here is to learn the behavior of some outcome of interest under the specified conditions. Here, we examine the Type I error rate from two scenarios in which the null hypothesis of a four-group design is true (i.e., there is no effect): a preplanned test and a test in which the largest and smallest sample means are compared (i.e., when we use the data to decide which question to ask). See this page and accompanying video for more information about Monte Carlo simulation studies: https://www.investopedia.com/terms/m/montecarlosimulation.asp.
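The two scenarios described above can be sketched in a few lines of R. This is not the referenced script: the seed, number of replications, group size, standard normal populations, and the use of a pooled-variance two-sample t-test are all assumptions made for illustration.

```r
set.seed(2024)
n_reps      <- 2000  # number of simulated studies (assumed)
n_per_group <- 10    # observations per group (assumed)
k           <- 4     # four groups, all with the same true mean

reject_planned <- logical(n_reps)
reject_posthoc <- logical(n_reps)

for (i in seq_len(n_reps)) {
  # Each column is one group; the null hypothesis is true by construction
  y <- matrix(rnorm(n_per_group * k), nrow = n_per_group)

  # Preplanned comparison: Group 1 versus Group 4, chosen before seeing data
  reject_planned[i] <- t.test(y[, 1], y[, 4], var.equal = TRUE)$p.value < 0.05

  # Post-hoc comparison: largest versus smallest sample mean
  m <- colMeans(y)
  reject_posthoc[i] <- t.test(y[, which.max(m)], y[, which.min(m)],
                              var.equal = TRUE)$p.value < 0.05
}

# Empirical Type I error rates
rate_planned <- mean(reject_planned)
rate_posthoc <- mean(reject_posthoc)
c(planned = rate_planned, posthoc = rate_posthoc)
```

Because the null hypothesis is true in every replication, each rejection is a Type I error, so the two proportions estimate the error rate of each strategy directly.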
What is the empirical (from the simulation) Type I error rate for the preplanned study, in which the population means of Group 1 and Group 4 are compared?
# The answer is....
What is the empirical (from the simulation) Type I error rate for the post-hoc (after the fact) comparison, in which the largest and smallest sample means are compared?
# The answer is....
If there is a difference between the values in the two preceding parts, why? If there is not a difference, why not?
# The answer is....
How can the findings of Part be mitigated?
# The answer is....