510-Lab 1

A client has come to you. Their in-house data scientist has gone crazy and fled to study the social habits of monkeys in the Amazon. Unfortunately, the company had just run an important study to choose its new ad campaign, and the data scientist left before analyzing the results. All they have is a data file and a glimmer of hope. Properly analyze the data, showing your code. Then summarize the results.

library(readxl)
WeeklyLab1Data <- read_excel("C:/Users/yanru/Desktop/R File/WeeklyLab1Data.xlsx")
View(WeeklyLab1Data)

shapiro.test(WeeklyLab1Data$Rating)

## 
##  Shapiro-Wilk normality test
## 
## data:  WeeklyLab1Data$Rating
## W = 0.90503, p-value = 0.07036

library(moments)
agostino.test(WeeklyLab1Data$Rating)

## 
##  D'Agostino skewness test
## 
## data:  WeeklyLab1Data$Rating
## skew = -0.0068299, z = -0.0147850, p-value = 0.9882
## alternative hypothesis: data have a skewness

a=density(WeeklyLab1Data$Rating)
plot(a)

bartlett.test(WeeklyLab1Data$Rating, WeeklyLab1Data$Ad)

## 
##  Bartlett test of homogeneity of variances
## 
## data:  WeeklyLab1Data$Rating and WeeklyLab1Data$Ad
## Bartlett's K-squared = 4.1404, df = 2, p-value = 0.1262

model <- aov(Rating~Day+Day/Audience+Ad, data=WeeklyLab1Data)
summary(model)

##              Df Sum Sq Mean Sq F value Pr(>F)  
## Day           1   0.89    0.89   0.128 0.7263  
## Ad            1  40.33   40.33   5.787 0.0305 *
## Day:Audience  1   1.20    1.20   0.172 0.6845  
## Residuals    14  97.58    6.97                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

WeeklyLab1Data$Ad=factor(WeeklyLab1Data$Ad)
model2=aov(WeeklyLab1Data$Rating~WeeklyLab1Data$Ad)
TukeyHSD(model2)

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = WeeklyLab1Data$Rating ~ WeeklyLab1Data$Ad)
## 
## $`WeeklyLab1Data$Ad`
##          diff        lwr       upr     p adj
## 2-1 -6.333333 -8.0062631 -4.660404 0.0000002
## 3-1 -3.666667 -5.3395964 -1.993737 0.0001192
## 3-2  2.666667  0.9937369  4.339596 0.0023622

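The Shapiro-Wilk test (W = 0.905, p = 0.070), the D'Agostino skewness test (p = 0.988), and the density plot all indicate that the ratings are approximately normally distributed. The p-value for the Bartlett test (0.126) is greater than 0.05, so we do not reject the hypothesis of equal variances across the three ads.

I then ran an ANOVA. The p-value for Ad (0.0305) is smaller than 0.05, so we reject the null hypothesis: Ad has a significant effect on ratings. Note that Ad entered this model with 1 degree of freedom, which suggests it was still numeric at that point (it is converted to a factor only before the Tukey test), so this F test is for a linear trend in the ad number rather than a comparison of three group means. The p-values for Day (0.726) and for Audience nested within Day (0.685) are larger than 0.05, so we fail to reject the null hypothesis that the means are equal; neither Day nor Audience shows a significant effect on ratings. In the TukeyHSD output, all three pairwise differences are significant: Ad 1 is rated higher than Ad 2 (by 6.33) and Ad 3 (by 3.67), and Ad 3 is rated higher than Ad 2 (by 2.67). So Ad 1 performs best, followed by Ad 3, then Ad 2.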
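Two details of the output above can be checked with a short sketch. First, the ANOVA table gives Ad 1 Df although there are three ads, which is what happens when a grouping variable is stored as a number rather than a factor. Second, each `diff` in the TukeyHSD table is the mean of the second-named group minus the mean of the first-named group, so a negative `2-1` means Ad 2 is rated lower than Ad 1. The data frame `toy` and its values are made up for illustration and are not the lab data.

```r
# Made-up ratings for three ads (NOT the lab data), 6 per ad.
toy <- data.frame(
  Ad     = rep(1:3, each = 6),
  Rating = c(9, 8, 9, 10, 8, 9,   # ad 1
             3, 2, 3, 2, 4, 3,    # ad 2
             6, 5, 5, 6, 4, 6)    # ad 3
)

# Ad stored as a number: aov() fits a single slope, so Ad gets 1 Df.
fit_numeric <- aov(Rating ~ Ad, data = toy)
df_numeric  <- summary(fit_numeric)[[1]][["Df"]][1]

# Ad stored as a factor: one mean per ad, so Ad gets 3 - 1 = 2 Df.
toy$Ad     <- factor(toy$Ad)
fit_factor <- aov(Rating ~ Ad, data = toy)
df_factor  <- summary(fit_factor)[[1]][["Df"]][1]

# Group means next to the Tukey table: row "2-1" is mean(ad 2) - mean(ad 1).
group_means <- tapply(toy$Rating, toy$Ad, mean)
tukey       <- TukeyHSD(fit_factor)$Ad

print(c(numeric = df_numeric, factor = df_factor))
print(group_means)
print(tukey)
```

On the real data, the same pattern would mean calling `factor()` on `Ad` before fitting the first model, which would change the Ad row of that ANOVA table.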

Rui Yan

3.30.2019