Question

  1. A client has come to you. Their in-house data scientist has gone crazy and fled to study the social habits of monkeys in the Amazon. Unfortunately, the client had just run an important study to choose their new ad campaign, and the data scientist left before analyzing the results. All they have is a piece of paper with a table on it (see below) and a glimmer of hope. Properly analyze the data, showing your code. Then summarize the results.
library(moments)

Mydata <- read.csv(file="C:\\Users\\Calmth of Life\\Dropbox\\Harrisburg Semesters\\ANLY 510\\Ad Campaign Data.csv")
Mydata
##         Time Audience Day Ad Rating
## 1  Afternoon        3   1  1      9
## 2    Morning        5   2  2      9
## 3    Morning        4   2  1     10
## 4  Afternoon        1   1  2      5
## 5    Evening        3   1  2      2
## 6    Evening        2   1  1      8
## 7    Morning        2   1  2      9
## 8    Morning        6   2  3      9
## 9  Afternoon        5   2  3      8
## 10 Afternoon        2   1  3      8
## 11 Afternoon        4   2  2      4
## 12   Evening        4   2  3      8
## 13   Evening        6   2  2      2
## 14   Evening        1   1  3      7
## 15   Morning        1   1  1     10
## 16 Afternoon        6   2  1     10
## 17   Morning        3   1  3      8
## 18   Evening        5   2  1      6
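The CSV path above is specific to one machine, so the analysis cannot be rerun from it elsewhere. As a minimal sketch, the same 18 rows can be rebuilt inline from the printed table; the values are copied from the output above, and nothing else is assumed about the original file.

```r
# Rebuild the printed table inline (values copied row by row from the output above).
Mydata <- data.frame(
  Time = c("Afternoon", "Morning", "Morning", "Afternoon", "Evening", "Evening",
           "Morning", "Morning", "Afternoon", "Afternoon", "Afternoon", "Evening",
           "Evening", "Evening", "Morning", "Afternoon", "Morning", "Evening"),
  Audience = c(3, 5, 4, 1, 3, 2, 2, 6, 5, 2, 4, 4, 6, 1, 1, 6, 3, 5),
  Day      = c(1, 2, 2, 1, 1, 1, 1, 2, 2, 1, 2, 2, 2, 1, 1, 2, 1, 2),
  Ad       = c(1, 2, 1, 2, 2, 1, 2, 3, 3, 3, 2, 3, 2, 3, 1, 1, 3, 1),
  Rating   = c(9, 9, 10, 5, 2, 8, 9, 9, 8, 8, 4, 8, 2, 7, 10, 10, 8, 6)
)

# Quick structural checks: 18 rows, and each ad appears six times.
nrow(Mydata)
table(Mydata$Ad)
```

Audiences 1 to 3 appear only on Day 1 and audiences 4 to 6 only on Day 2, which is why Audience is treated as nested within Day later on.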
# Check whether 'Rating' is normally distributed
plot(density(Mydata$Rating))

# The density plot suggests 'Rating' is negatively skewed; test the skewness formally
agostino.test(Mydata$Rating)
## 
##  D'Agostino skewness test
## 
## data:  Mydata$Rating
## skew = -1.0248, z = -2.0462, p-value = 0.04074
## alternative hypothesis: data have a skewness
# The skew is significant (p = 0.041). Reflect and log-transform 'Rating', then re-test the skewness
agostino.test(log(max(Mydata$Rating+1)-Mydata$Rating))
## 
##  D'Agostino skewness test
## 
## data:  log(max(Mydata$Rating + 1) - Mydata$Rating)
## skew = 0.0052267, z = 0.0113140, p-value = 0.991
## alternative hypothesis: data have a skewness
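# Skewness is no longer significant (p = 0.991), so store the transformed variable.
# Note: reflecting reverses direction, so a larger Rating2 means a lower Rating.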
Mydata$Rating2 <- log(11-Mydata$Rating)
plot(density(Mydata$Rating2))
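The reflected log transform used here changes the direction of the scale, which matters for every later interpretation. The sketch below illustrates that on the ratings copied from the table; the names `rating`, `rating2`, and `back` are illustrative and not part of the original analysis.

```r
# Ratings copied from the printed table.
rating <- c(9, 9, 10, 5, 2, 8, 9, 9, 8, 8, 4, 8, 2, 7, 10, 10, 8, 6)

# Reflect around (max + 1), then take the log; max(rating) + 1 is 11 here,
# so this matches log(11 - Rating) in the main analysis.
rating2 <- log(max(rating) + 1 - rating)

# Reflection reverses the ordering: high ratings map to low transformed values,
# so the rank correlation between the two scales is negative.
cor(rating, rating2, method = "spearman")

# Back-transform to the original scale to confirm nothing is lost.
back <- (max(rating) + 1) - exp(rating2)
all.equal(back, rating)
```

Any difference reported on the transformed scale therefore has to be sign-flipped before it is read as "better" or "worse" on the original rating scale.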

# Next step: check homogeneity of variances across ads.
# Convert the design variables to factors first
Mydata$Audience <- factor(Mydata$Audience)
Mydata$Day <- factor(Mydata$Day)
Mydata$Ad <- factor(Mydata$Ad)
bartlett.test(Mydata$Rating2,Mydata$Ad)
## 
##  Bartlett test of homogeneity of variances
## 
## data:  Mydata$Rating2 and Mydata$Ad
## Bartlett's K-squared = 5.6376, df = 2, p-value = 0.05968
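Bartlett's test is known to be sensitive to departures from normality, and the p-value above is close to 0.05. As a hedged cross-check, not part of the original analysis, the rank-based Fligner-Killeen test in base R (`fligner.test`) can be run on the same grouping. The vectors below are copied from the printed table, and the names `rating2`, `ad`, and `fk` are illustrative.

```r
# Ratings and ad labels copied from the printed table.
rating <- c(9, 9, 10, 5, 2, 8, 9, 9, 8, 8, 4, 8, 2, 7, 10, 10, 8, 6)
ad     <- factor(c(1, 2, 1, 2, 2, 1, 2, 3, 3, 3, 2, 3, 2, 3, 1, 1, 3, 1))

# Same reflected log transform as in the main analysis.
rating2 <- log(11 - rating)

# Bartlett's test, as in the main analysis.
bartlett.test(rating2, ad)

# Fligner-Killeen test: a rank-based check of equal variances across groups.
fk <- fligner.test(rating2, ad)
fk
```

If both tests fail to reject equal variances, the ANOVA assumption looks reasonable; if they disagree, the borderline Bartlett result deserves more caution.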
# Bartlett's test is not significant at the 0.05 level (p = 0.0597), so homogeneity of variances is not rejected.
# Day 1 and Day 2 are treated as replicates, with Audience nested within Day.
# In the ANOVA table below, Time and Ad are significant; Day and Audience within Day are not.
model <- aov(Rating2~Day+Day/Audience+Time+Ad, data=Mydata)
summary(model)
##              Df Sum Sq Mean Sq F value  Pr(>F)   
## Day           1  0.029  0.0289   0.167 0.69374   
## Time          2  3.430  1.7150   9.893 0.00687 **
## Ad            2  3.118  1.5592   8.994 0.00898 **
## Day:Audience  4  0.262  0.0655   0.378 0.81847   
## Residuals     8  1.387  0.1734                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
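The ANOVA table flags Time as well as Ad, so the same post hoc comparison can be run for Time. This is a minimal sketch that refits the model on data rebuilt from the printed table; the object name `time_hsd` is illustrative, and no output is shown because it was not part of the original run.

```r
# Data copied from the printed table.
Mydata <- data.frame(
  Time = c("Afternoon", "Morning", "Morning", "Afternoon", "Evening", "Evening",
           "Morning", "Morning", "Afternoon", "Afternoon", "Afternoon", "Evening",
           "Evening", "Evening", "Morning", "Afternoon", "Morning", "Evening"),
  Audience = c(3, 5, 4, 1, 3, 2, 2, 6, 5, 2, 4, 4, 6, 1, 1, 6, 3, 5),
  Day      = c(1, 2, 2, 1, 1, 1, 1, 2, 2, 1, 2, 2, 2, 1, 1, 2, 1, 2),
  Ad       = c(1, 2, 1, 2, 2, 1, 2, 3, 3, 3, 2, 3, 2, 3, 1, 1, 3, 1),
  Rating   = c(9, 9, 10, 5, 2, 8, 9, 9, 8, 8, 4, 8, 2, 7, 10, 10, 8, 6)
)

# Same transform and factor conversion as in the main analysis.
Mydata$Rating2 <- log(11 - Mydata$Rating)
for (v in c("Time", "Audience", "Day", "Ad")) Mydata[[v]] <- factor(Mydata[[v]])

# Same model as in the main analysis.
model <- aov(Rating2 ~ Day + Day/Audience + Time + Ad, data = Mydata)

# Pairwise comparisons between the three time slots, on the Rating2 scale.
time_hsd <- TukeyHSD(model, "Time")
time_hsd
```

As with Ad, the differences are on the reflected scale, so a positive `diff` means the first-named time slot received lower ratings than the second.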
# Follow up the significant Ad effect with Tukey's HSD test.
TukeyHSD(model,"Ad")
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Rating2 ~ Day + Day/Audience + Time + Ad, data = Mydata)
## 
## $Ad
##           diff        lwr       upr     p adj
## 2-1  1.0195360  0.3326321 1.7064398 0.0070720
## 3-1  0.5121156 -0.1747883 1.1990194 0.1447803
## 3-2 -0.5074204 -1.1943242 0.1794834 0.1488850
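# The table gives each pairwise difference on the Rating2 scale, with its lower (lwr) and upper (upr) bounds.
# Only Ad 2 vs Ad 1 differs significantly (p adj = 0.007).
# Because Rating2 = log(11 - Rating) reverses direction, the positive difference means
# Ad 2 was rated lower than Ad 1, so Ad 1 is better than Ad 2.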
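The direction of the Tukey differences can be checked against the raw ratings. A minimal sketch, with values copied from the printed table; the names `rating`, `ad`, `raw_means`, and `tr_means` are illustrative.

```r
# Ratings and ad labels copied from the printed table.
rating <- c(9, 9, 10, 5, 2, 8, 9, 9, 8, 8, 4, 8, 2, 7, 10, 10, 8, 6)
ad     <- factor(c(1, 2, 1, 2, 2, 1, 2, 3, 3, 3, 2, 3, 2, 3, 1, 1, 3, 1))

# Mean rating per ad on the original scale (higher is better).
raw_means <- tapply(rating, ad, mean)
raw_means

# Mean per ad on the reflected log scale used in the ANOVA (higher is worse).
tr_means <- tapply(log(11 - rating), ad, mean)
tr_means

# Difference Ad 2 minus Ad 1 on the transformed scale; this matches the
# "2-1" diff reported by TukeyHSD because the design is balanced.
tr_means["2"] - tr_means["1"]
```

Ad 2 has the largest mean on the transformed scale and the smallest mean on the original scale, which confirms that a positive Tukey difference here indicates a lower rating.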

Summary

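The original ratings were negatively skewed, so we applied a reflected log transform, log(11 - Rating), which removed the skew (D'Agostino test, p = 0.991). In the ANOVA on the transformed ratings, both Time (p = 0.007) and Ad (p = 0.009) were significant; Day and Audience within Day were not. Tukey's HSD test found one significant pairwise difference between ads: Ad 2 versus Ad 1 (p adj = 0.007). Because the transform reverses direction, a higher transformed value means a lower rating, so Ad 1 was rated significantly higher than Ad 2. Ad 3 did not differ significantly from either Ad 1 or Ad 2. Ad 1 is therefore the strongest choice and Ad 2 the weakest, with Ad 3 in between but not clearly separated from either.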