1. Preparation and data cleaning
library(readxl)
library(moments)
WeeklyLab3Data <- read_excel("~/Downloads/WeeklyLab3Data.xlsx")
mydata <- WeeklyLab3Data
str(mydata)
## Classes 'tbl_df', 'tbl' and 'data.frame': 18 obs. of 5 variables:
## $ Time : chr "Afternoon" "Afternoon" "Evening" "Evening" ...
## $ Audience: num 3 1 3 2 2 2 1 1 3 5 ...
## $ Day : num 1 1 1 1 1 1 1 1 1 2 ...
## $ Ad : num 1 2 2 1 2 3 3 1 3 2 ...
## $ Rating : num 9 5 2 8 9 8 7 10 8 9 ...
# Recode the numeric day codes as a labelled factor in one step
mydata$Day <- factor(mydata$Day, levels = c(1, 2), labels = c("Day 1", "Day 2"))
levels(mydata$Day)
## [1] "Day 1" "Day 2"
# Recode the numeric ad codes as a labelled factor in one step
mydata$Ad <- factor(mydata$Ad, levels = c(1, 2, 3), labels = c("Ad 1", "Ad 2", "Ad 3"))
levels(mydata$Ad)
## [1] "Ad 1" "Ad 2" "Ad 3"
2. Plotting and Analysis - Rating
library(moments)
barplot(mydata$Rating)

plot(density(mydata$Rating))

agostino.test(mydata$Rating)
##
## D'Agostino skewness test
##
## data: mydata$Rating
## skew = -1.0248, z = -2.0462, p-value = 0.04074
## alternative hypothesis: data have a skewness
agostino.test(log(max(mydata$Rating+1)-mydata$Rating))
##
## D'Agostino skewness test
##
## data: log(max(mydata$Rating + 1) - mydata$Rating)
## skew = 0.0052267, z = 0.0113140, p-value = 0.991
## alternative hypothesis: data have a skewness
mydata$Rating2 <- log(11-mydata$Rating)
plot(density(mydata$Rating2))

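The data set contains the variables Time, Audience, Day, Ad, and Rating, recorded over Day 1 and Day 2.
Rating is the easiest variable to plot, and its distribution is negatively skewed (skew = -1.0248), with a maximum of 10 and a minimum of 2. To reduce the skew, I applied the reflected log transform log(11 - Rating) and stored it as Rating2. After the transform, the D'Agostino test gives skew = 0.0052 and a p-value of 0.991, compared with the original p-value of 0.04074, so there is no longer evidence of skewness. Note that the reflection reverses the scale: a higher Rating2 corresponds to a lower Rating.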
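As a rough illustration of what the reflected log transform does, the sketch below uses only the first ten ratings printed by str(mydata) above, not the full 18 observations, so its numbers will not match the agostino.test output. The helper skew() is my own illustrative function (a plain moment-based skewness), not part of the moments package.

```r
# First ten ratings as printed by str(mydata); the other eight are not shown here.
ratings <- c(9, 5, 2, 8, 9, 8, 7, 10, 8, 9)

# Illustrative helper: moment-based sample skewness, m3 / m2^(3/2).
skew <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Reflect around (max + 1) so every value is positive, then take the log.
reflected <- log(max(ratings) + 1 - ratings)

skew_raw       <- skew(ratings)          # negative: long tail towards low ratings
skew_reflected <- skew(reflected)        # closer to zero after the transform
direction      <- cor(ratings, reflected) # negative: the scale is reversed

print(c(raw = skew_raw, reflected = skew_reflected, correlation = direction))
```

The negative correlation is the point to keep in mind later: any effect estimated on Rating2 has the opposite sign to the same effect on Rating.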
3. ANOVA analysis - Ad
anova(lm(Rating ~ Ad, data = mydata))
The F-value is 5.0635 and the p-value is 0.02088, so Ad has a statistically significant effect on Rating at the 5% level. This test does not say which ads differ from each other, so I next compare the ads pairwise.
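To show where the F-value in a one-way ANOVA comes from, here is a minimal sketch that computes it by hand and compares it with anova(). It uses only the first ten Ad and Rating values printed by str(mydata), so its F-value will differ from the 5.0635 reported for the full data; the data frame name demo is my own.

```r
# First ten rows of Ad and Rating as printed by str(mydata) above.
demo <- data.frame(
  Ad     = factor(c(1, 2, 2, 1, 2, 3, 3, 1, 3, 2),
                  labels = c("Ad 1", "Ad 2", "Ad 3")),
  Rating = c(9, 5, 2, 8, 9, 8, 7, 10, 8, 9)
)

# Built-in one-way ANOVA.
fit_table <- anova(lm(Rating ~ Ad, data = demo))

# The same F statistic by hand: between-group vs within-group variation.
grand_mean  <- mean(demo$Rating)
group_means <- tapply(demo$Rating, demo$Ad, mean)
group_sizes <- tapply(demo$Rating, demo$Ad, length)

ss_between <- sum(group_sizes * (group_means - grand_mean)^2)
ss_within  <- sum((demo$Rating - ave(demo$Rating, demo$Ad))^2)

df_between <- nlevels(demo$Ad) - 1
df_within  <- nrow(demo) - nlevels(demo$Ad)

f_manual <- (ss_between / df_between) / (ss_within / df_within)
p_manual <- pf(f_manual, df_between, df_within, lower.tail = FALSE)

print(fit_table)
print(c(F = f_manual, p = p_manual))
```

A large F means the ad means differ by more than the scatter within each ad would explain, which is why a significant F still needs a follow-up test to find the specific pairs.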
model <- aov(Rating2~Day+Day/Audience+Time+Ad, data=mydata)
summary(model)
##              Df Sum Sq Mean Sq F value  Pr(>F)
## Day           1  0.029  0.0289   0.188 0.67359
## Time          2  3.430  1.7150  11.169 0.00283 **
## Ad            2  3.118  1.5592  10.154 0.00391 **
## Day:Audience  2  0.114  0.0568   0.370 0.69994
## Residuals    10  1.535  0.1535
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
TukeyHSD(model,"Ad")
## Warning in replications(paste("~", xx), data = mf): non-factors ignored:
## Day, Audience
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = Rating2 ~ Day + Day/Audience + Time + Ad, data = mydata)
##
## $Ad
##                 diff        lwr       upr     p adj
## Ad 2-Ad 1  1.0195360  0.3993559 1.6397160 0.0029463
## Ad 3-Ad 1  0.5121156 -0.1080645 1.1322956 0.1080397
## Ad 3-Ad 2 -0.5074204 -1.1276005 0.1127596 0.1116348
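From the Tukey test, only the Ad 2-Ad 1 difference is significant: its interval (0.3994 to 1.6397) excludes zero, with an adjusted p-value of 0.0029. The intervals for Ad 3-Ad 1 (-0.1081 to 1.1323) and Ad 3-Ad 2 (-1.1276 to 0.1128) both contain zero, so those pairs cannot be distinguished. Because the model uses Rating2 = log(11 - Rating), a positive difference means a lower original rating. The point estimates therefore order the ads as Ad 1 > Ad 3 > Ad 2 in rating, but only Ad 1 > Ad 2 is supported statistically.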
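The reading of the Tukey table can be sketched in code. The numbers below are copied from the TukeyHSD(model, "Ad") output above; the column names significant and rated_lower are my own, and the sketch assumes the response is Rating2 = log(11 - Rating), so that a positive difference means the first ad in the pair was rated lower on the original scale.

```r
# Values copied from the TukeyHSD(model, "Ad") output above.
tukey <- data.frame(
  pair  = c("Ad 2-Ad 1", "Ad 3-Ad 1", "Ad 3-Ad 2"),
  diff  = c(1.0195360, 0.5121156, -0.5074204),
  lwr   = c(0.3993559, -0.1080645, -1.1276005),
  upr   = c(1.6397160, 1.1322956, 0.1127596),
  p_adj = c(0.0029463, 0.1080397, 0.1116348)
)

# A pair differs at the 5% family-wise level when its interval excludes zero.
tukey$significant <- tukey$lwr > 0 | tukey$upr < 0

# Rating2 falls as Rating rises, so a positive diff on the Rating2 scale
# means the FIRST ad in the pair has the lower original rating.
tukey$rated_lower <- ifelse(tukey$diff > 0,
                            sub("-.*", "", tukey$pair),   # first ad in the pair
                            sub(".*-", "", tukey$pair))   # second ad in the pair

print(tukey[, c("pair", "diff", "p_adj", "significant", "rated_lower")])
```

Checking the interval against zero, rather than looking at lwr alone, is what separates the one supported difference (Ad 1 vs Ad 2) from the two pairs that only differ in their point estimates.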