1. Preparation and data cleaning

library(readxl)
library(moments)
WeeklyLab3Data <- read_excel("~/Downloads/WeeklyLab3Data.xlsx")
mydata <- WeeklyLab3Data
str(mydata)
## Classes 'tbl_df', 'tbl' and 'data.frame':    18 obs. of  5 variables:
##  $ Time    : chr  "Afternoon" "Afternoon" "Evening" "Evening" ...
##  $ Audience: num  3 1 3 2 2 2 1 1 3 5 ...
##  $ Day     : num  1 1 1 1 1 1 1 1 1 2 ...
##  $ Ad      : num  1 2 2 1 2 3 3 1 3 2 ...
##  $ Rating  : num  9 5 2 8 9 8 7 10 8 9 ...
mydata$Day=replace(mydata$Day,mydata$Day==1,"Day 1")
mydata$Day=replace(mydata$Day,mydata$Day==2,"Day 2")
mydata$Day=factor(mydata$Day,levels=unique(mydata$Day))
levels(mydata$Day)
## [1] "Day 1" "Day 2"
mydata$Ad=replace(mydata$Ad,mydata$Ad==1, "Ad 1")
mydata$Ad=replace(mydata$Ad,mydata$Ad==2, "Ad 2")
mydata$Ad=replace(mydata$Ad,mydata$Ad==3, "Ad 3")
mydata$Ad=factor(mydata$Ad,levels=unique(mydata$Ad))
levels(mydata$Ad)
## [1] "Ad 1" "Ad 2" "Ad 3"
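
The recoding above can also be done in one step per column. The following is a minimal sketch on a made-up vector (`codes` is hypothetical, not the lab data), assuming the numeric codes are known in advance:

```r
# Hypothetical numeric codes standing in for a column such as Ad
codes <- c(1, 2, 2, 1, 3, 3)

# factor() maps each numeric level to its label in a single call
ad <- factor(codes, levels = c(1, 2, 3), labels = c("Ad 1", "Ad 2", "Ad 3"))

levels(ad)
table(ad)
```

Unlike `levels=unique(...)`, this fixes the level order explicitly rather than by order of first appearance in the data.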

2. Plotting and Analysis - Rating

library(moments)
barplot(mydata$Rating)

plot(density(mydata$Rating))

agostino.test(mydata$Rating)
## 
##  D'Agostino skewness test
## 
## data:  mydata$Rating
## skew = -1.0248, z = -2.0462, p-value = 0.04074
## alternative hypothesis: data have a skewness
agostino.test(log(max(mydata$Rating+1)-mydata$Rating))
## 
##  D'Agostino skewness test
## 
## data:  log(max(mydata$Rating + 1) - mydata$Rating)
## skew = 0.0052267, z = 0.0113140, p-value = 0.991
## alternative hypothesis: data have a skewness
mydata$Rating2 <- log(11-mydata$Rating)
plot(density(mydata$Rating2))

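The dataset contains 18 observations of five variables: Time, Audience, Day, Ad, and Rating, collected on Day 1 and Day 2.

Rating is the easiest variable to plot, and it is negatively skewed (skew = -1.0248), with a highest value of 10 and a lowest of 2. The D'Agostino test rejects symmetry for the raw ratings (p-value = 0.04074). Because the skew is negative, I reflected the variable before taking the log, using log(11 - Rating). For the transformed variable the test gives skew = 0.0052 and a p-value of 0.991, so there is no longer any evidence of skewness. Note that the reflection reverses the scale: a larger transformed value corresponds to a lower rating.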
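
The reflect-and-log step can be illustrated on made-up data. This is only a sketch: `ratings` is a hypothetical vector, and `skew()` is a small helper written here in base R so the example does not depend on the moments package.

```r
# Hypothetical left-skewed ratings on a 1-10 scale (not the lab data)
ratings <- c(2, 5, 7, 8, 8, 9, 9, 9, 10, 10)

# Moment-based skewness: third central moment over the 1.5 power of the second
skew <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Reflect around max + 1 so every value is positive, then take the log
reflected <- log(max(ratings) + 1 - ratings)

skew(ratings)    # negative for this vector
skew(reflected)  # compare with the value above

# The transform is strictly decreasing, so the rank order is fully reversed
cor(ratings, reflected, method = "spearman")
```

The rank correlation of -1 is the reason any comparison made on the transformed scale has to be flipped when it is read back in terms of the original ratings.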

3. ANOVA analysis - Ad

anova(lm(Rating ~ Ad, data = mydata))

The F-value is 5.0635 with a p-value of 0.02088, so the mean Rating differs significantly across the ads at the 5% level. This test does not say which ads differ, so next I test the ads against each other.
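
As a rough sketch of how the F-value and p-value can be pulled out of a one-way ANOVA table, here is an example on made-up data (`toy` is hypothetical and its numbers are unrelated to the lab results):

```r
# Hypothetical ratings for three ads, four observations each
toy <- data.frame(
  Ad = factor(rep(c("Ad 1", "Ad 2", "Ad 3"), each = 4)),
  Rating = c(9, 8, 10, 9, 5, 4, 6, 5, 7, 8, 6, 7)
)

# One-way ANOVA table; the first row is the Ad term, the second the residuals
tab <- anova(lm(Rating ~ Ad, data = toy))

f_value <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]
c(F = f_value, p = p_value)
```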

model <- aov(Rating2~Day+Day/Audience+Time+Ad, data=mydata)
summary(model)
##              Df Sum Sq Mean Sq F value  Pr(>F)   
## Day           1  0.029  0.0289   0.188 0.67359   
## Time          2  3.430  1.7150  11.169 0.00283 **
## Ad            2  3.118  1.5592  10.154 0.00391 **
## Day:Audience  2  0.114  0.0568   0.370 0.69994   
## Residuals    10  1.535  0.1535                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
TukeyHSD(model,"Ad")
## Warning in replications(paste("~", xx), data = mf): non-factors ignored:
## Day, Audience
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Rating2 ~ Day + Day/Audience + Time + Ad, data = mydata)
## 
## $Ad
##                 diff        lwr       upr     p adj
## Ad 2-Ad 1  1.0195360  0.3993559 1.6397160 0.0029463
## Ad 3-Ad 1  0.5121156 -0.1080645 1.1322956 0.1080397
## Ad 3-Ad 2 -0.5074204 -1.1276005 0.1127596 0.1116348

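From the Tukey test, only the Ad 2 - Ad 1 difference is significant: diff = 1.0195, with a 95% interval from 0.3994 to 1.6397 that excludes zero (adjusted p = 0.0029). The intervals for Ad 3 - Ad 1 (-0.1081 to 1.1323, adjusted p = 0.1080) and Ad 3 - Ad 2 (-1.1276 to 0.1128, adjusted p = 0.1116) both include zero, so those differences are not significant. The point estimates order the ads as Ad 2 > Ad 3 > Ad 1 on the Rating2 scale. Because Rating2 = log(11 - Rating) decreases as Rating increases, a larger Rating2 means a lower rating. In terms of the original ratings, Ad 1 is therefore rated significantly higher than Ad 2, with Ad 3 in between and not significantly different from either.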
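
One way to read a Tukey table is to flag the comparisons whose interval excludes zero. The sketch below does this on made-up data (`toy` is hypothetical, and the one-factor model is a simplification of the model used above):

```r
# Hypothetical ratings for three ads, six observations each
toy <- data.frame(
  Ad = factor(rep(c("Ad 1", "Ad 2", "Ad 3"), each = 6)),
  Rating = c(9, 8, 10, 9, 8, 9,  5, 4, 6, 5, 3, 5,  7, 8, 6, 7, 9, 7)
)

# Same reflected log transform as in the report (maximum rating is 10)
toy$Rating2 <- log(11 - toy$Rating)

fit <- aov(Rating2 ~ Ad, data = toy)

# TukeyHSD returns a matrix per factor with columns diff, lwr, upr, p adj
tk <- as.data.frame(TukeyHSD(fit, "Ad")$Ad)

# A difference is significant at the 5% family-wise level
# exactly when its 95% interval does not contain zero
tk$excludes_zero <- tk$lwr > 0 | tk$upr < 0
tk
```

A positive `diff` here means the first ad in the pair has the larger Rating2, which corresponds to the lower original rating.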