summary(Lab1)
## Audience Day Ad Rating
## Min. :1.0 Min. :1.0 Min. :1 Min. : 2.000
## 1st Qu.:2.0 1st Qu.:1.0 1st Qu.:1 1st Qu.: 4.000
## Median :3.5 Median :1.5 Median :2 Median : 6.000
## Mean :3.5 Mean :1.5 Mean :2 Mean : 6.333
## 3rd Qu.:5.0 3rd Qu.:2.0 3rd Qu.:3 3rd Qu.: 9.000
## Max. :6.0 Max. :2.0 Max. :3 Max. :10.000
str(Lab1)
## 'data.frame': 18 obs. of 4 variables:
## $ Audience: int 3 1 3 2 2 2 1 1 3 5 ...
## $ Day : int 1 1 1 1 1 1 1 1 1 2 ...
## $ Ad : int 1 2 2 1 2 3 3 1 3 2 ...
## $ Rating : int 9 3 2 9 4 4 6 10 8 5 ...
Lab1$Day <- as.factor(Lab1$Day)
Lab1$Ad <- as.factor(Lab1$Ad)
# Stacked bars: each bar's height is the sum of the ratings for that ad
ggplot(Lab1, aes(x = Ad, y = Rating)) +
  geom_bar(stat = "identity") +
  theme_classic() +
  ggtitle("Rating vs. Ad") +
  xlab("Ad") +
  ylab("Rating")
# Histogram of ratings; refer to columns by name inside aes(), not Lab1$...
ggplot(Lab1, aes(x = Rating)) +
  geom_histogram(binwidth = 0.5) +
  theme_classic() +
  xlab("Rating")
# Overlaid density of ratings, one curve per Day
ggplot(Lab1, aes(x = Rating, fill = Day)) +
  geom_density(alpha = 0.3) +
  xlab("Rating") +
  ggtitle("Rating by Day")
plot(density(Lab1$Rating))
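# Grouping by Rating puts identical values in each group; ratings that occur
# only once (3, 7 and 8) cannot form a density, hence the warnings below
ggplot(Lab1, aes(x = Rating)) +
  stat_density(aes(group = Rating, color = Rating), position = "identity", geom = "line")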
## Warning: Groups with fewer than two data points have been dropped.
## Warning: Groups with fewer than two data points have been dropped.
## Warning: Groups with fewer than two data points have been dropped.
## Warning: Removed 3 rows containing missing values (geom_path).
A quick summary of the dataset, together with a bar chart, a histogram, and density plots, gives a first look at the ratings. Day and Ad are converted to factors with 2 and 3 levels, respectively.
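To illustrate why the factor conversion matters, the sketch below compares how an integer-coded and a factor-coded predictor enter a model matrix. It uses only the first ten Ad codes printed by `str(Lab1)`; the names `ad_int` and `ad_fac` are made up for this example.

```r
# First ten Ad codes as printed by str(Lab1); the remaining rows are elided there.
ad_int <- c(1L, 2L, 2L, 1L, 2L, 3L, 3L, 1L, 3L, 2L)
ad_fac <- as.factor(ad_int)

# An integer predictor enters a model as a single numeric slope column...
cols_int <- colnames(model.matrix(~ ad_int))
# ...whereas a 3-level factor becomes two treatment-contrast dummy columns.
cols_fac <- colnames(model.matrix(~ ad_fac))

print(cols_int)
print(cols_fac)
```

Left as an integer, Ad would be fitted as a linear trend across the codes 1, 2, 3 rather than as three unordered treatments.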
agostino.test(Lab1$Rating)
##
## D'Agostino skewness test
##
## data: Lab1$Rating
## skew = -0.0068299, z = -0.0147850, p-value = 0.9882
## alternative hypothesis: data have a skewness
anscombe.test(Lab1$Rating)
##
## Anscombe-Glynn kurtosis test
##
## data: Lab1$Rating
## kurt = 1.6127, z = -2.1684, p-value = 0.03013
## alternative hypothesis: kurtosis is not equal to 3
skewness(Lab1$Rating)
## [1] -0.006829878
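The sample skewness of Rating is only very slightly negative (-0.0068), and the D'Agostino test (p = 0.9882) gives no evidence of skew. The Anscombe-Glynn test does find the kurtosis (1.61) significantly below 3 (p = 0.030), so the distribution is flatter than a normal one.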
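As a cross-check on the reported statistics, the following base-R sketch computes moment-based skewness and kurtosis by hand. The ratings are recovered as square roots of the printed `Rating2` values, and `central_moment` is a helper defined only for this example; the divisor-n moment definition is an assumption that happens to reproduce the printed skew and kurt values.

```r
# Ratings recovered from the printed Rating2 values (Rating2 = Rating^2).
rating <- sqrt(c(81, 9, 4, 81, 16, 16, 36, 100, 64,
                 25, 100, 25, 49, 16, 36, 4, 100, 100))

# k-th central moment with divisor n (not n - 1).
central_moment <- function(x, k) mean((x - mean(x))^k)

# Skewness = m3 / m2^(3/2); kurtosis = m4 / m2^2 (equals 3 for a normal distribution).
skew <- central_moment(rating, 3) / central_moment(rating, 2)^1.5
kurt <- central_moment(rating, 4) / central_moment(rating, 2)^2

print(c(skewness = skew, kurtosis = kurt))
```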
Lab1$Rating2 <- as.integer(Lab1$Rating^2)
Lab1$Rating2
## [1] 81 9 4 81 16 16 36 100 64 25 100 25 49 16 36 4 100
## [18] 100
Because the sample skewness is negative, the ratings are squared, a power transform commonly used to reduce negative skew.
agostino.test(Lab1$Rating2)
##
## D'Agostino skewness test
##
## data: Lab1$Rating2
## skew = 0.35169, z = 0.75291, p-value = 0.4515
## alternative hypothesis: data have a skewness
skewness(Lab1$Rating2)
## [1] 0.3516884
plot(density(Lab1$Rating2))
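The squared ratings now have a positive sample skewness (0.35), although the D'Agostino test (p = 0.4515) does not find it significant.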
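The effect of the transform can be seen side by side in a short base-R sketch. `skewness_n` is a hypothetical helper using the same divisor-n moment definition as above, and the ratings are again recovered from the printed `Rating2` values.

```r
# Ratings recovered from the printed Rating2 values.
rating <- sqrt(c(81, 9, 4, 81, 16, 16, 36, 100, 64,
                 25, 100, 25, 49, 16, 36, 4, 100, 100))

# Moment-based sample skewness (divisor n).
skewness_n <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Skewness of the raw ratings and of the squared ratings.
powers <- c(raw = 1, squared = 2)
sk <- sapply(powers, function(p) skewness_n(rating^p))
print(round(sk, 5))
```

Squaring stretches the upper end of the scale more than the lower end, which is why the sign of the skewness flips from slightly negative to positive.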
Lab1$Audience <- as.factor(Lab1$Audience)
str(Lab1)
## 'data.frame': 18 obs. of 5 variables:
## $ Audience: Factor w/ 6 levels "1","2","3","4",..: 3 1 3 2 2 2 1 1 3 5 ...
## $ Day : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 2 ...
## $ Ad : Factor w/ 3 levels "1","2","3": 1 2 2 1 2 3 3 1 3 2 ...
## $ Rating : int 9 3 2 9 4 4 6 10 8 5 ...
## $ Rating2 : int 81 9 4 81 16 16 36 100 64 25 ...
bartlett.test(Lab1$Rating2~Lab1$Day)
##
## Bartlett test of homogeneity of variances
##
## data: Lab1$Rating2 by Lab1$Day
## Bartlett's K-squared = 0.032222, df = 1, p-value = 0.8575
bartlett.test(Lab1$Rating2~Lab1$Audience)
##
## Bartlett test of homogeneity of variances
##
## data: Lab1$Rating2 by Lab1$Audience
## Bartlett's K-squared = 0.23195, df = 5, p-value = 0.9987
bartlett.test(Lab1$Rating2~Lab1$Ad)
##
## Bartlett test of homogeneity of variances
##
## data: Lab1$Rating2 by Lab1$Ad
## Bartlett's K-squared = 2.8134, df = 2, p-value = 0.245
All three p-values are greater than 0.05, so there is no evidence against homogeneity of variance across Day, Audience, or Ad, and the ANOVA can proceed.
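For readers who want to see what the test measures, here is a sketch of Bartlett's K-squared computed by hand for the Day grouping and compared with `bartlett.test()`. It assumes rows 1 to 9 belong to Day 1 and rows 10 to 18 to Day 2; `str()` only shows the first ten rows, so that layout is an assumption, not something the output states.

```r
# Squared ratings as printed above.
rating2 <- c(81, 9, 4, 81, 16, 16, 36, 100, 64,
             25, 100, 25, 49, 16, 36, 4, 100, 100)
# ASSUMPTION: rows 1-9 are Day 1 and rows 10-18 are Day 2.
day <- factor(rep(1:2, each = 9))

n_i  <- tapply(rating2, day, length)   # group sizes
s2_i <- tapply(rating2, day, var)      # group variances
k    <- nlevels(day)
N    <- sum(n_i)

# Pooled variance: group variances weighted by their degrees of freedom.
s2_p <- sum((n_i - 1) * s2_i) / (N - k)

# K-squared: log-variance discrepancy divided by a small-sample correction.
num  <- (N - k) * log(s2_p) - sum((n_i - 1) * log(s2_i))
corr <- 1 + (sum(1 / (n_i - 1)) - 1 / (N - k)) / (3 * (k - 1))
k2   <- num / corr
p    <- pchisq(k2, df = k - 1, lower.tail = FALSE)

bt <- bartlett.test(rating2, day)      # built-in test for comparison
print(c(by_hand = k2, built_in = unname(bt$statistic), p_value = p))
```

A small K-squared means the group variances are close to the pooled variance, which is what a large p-value reflects here.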
model <- aov(Rating2 ~ Day + Day/Audience + Ad, data = Lab1)
model
## Call:
## aov(formula = Rating2 ~ Day + Day/Audience + Ad, data = Lab1)
##
## Terms:
## Day Ad Day:Audience Residuals
## Sum of Squares 128.000 20785.778 597.111 1550.889
## Deg. of Freedom 1 2 4 10
##
## Residual standard error: 12.45347
## 6 out of 14 effects not estimable
## Estimated effects may be unbalanced
summary(model)
## Df Sum Sq Mean Sq F value Pr(>F)
## Day 1 128 128 0.825 0.385
## Ad 2 20786 10393 67.012 1.61e-06 ***
## Day:Audience 4 597 149 0.963 0.469
## Residuals 10 1551 155
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
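The model includes Day, Audience nested within Day, and Ad. Only Ad is significant (F = 67.01, p = 1.61e-06); Day (p = 0.385) and Audience within Day (p = 0.469) are not.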
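The nesting syntax can be inspected without any data. The sketch below expands the model formula into its terms and works out the degrees of freedom for Audience within Day, assuming 2 days with 3 audiences each (consistent with the 6 Audience levels, but the exact layout is an assumption).

```r
# Expand the formula: Day/Audience is shorthand for Day + Day:Audience.
f <- Rating2 ~ Day + Day/Audience + Ad
term_labels <- attr(terms(f), "term.labels")
print(term_labels)

# Degrees of freedom for Audience within Day under the ASSUMED layout:
# 2 days, 3 audiences per day (6 audience levels in total).
days <- 2
audiences_per_day <- 3
df_nested <- days * (audiences_per_day - 1)
print(df_nested)
```

Because each audience appears on only one day, most Day-by-Audience combinations are empty, which is one plausible reading of the "effects not estimable" note in the `aov` printout.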
TukeyHSD(model,"Ad")
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = Rating2 ~ Day + Day/Audience + Ad, data = Lab1)
##
## $Ad
## diff lwr upr p adj
## 2-1 -81.33333 -101.043283 -61.62338 0.0000014
## 3-1 -56.00000 -75.709949 -36.29005 0.0000402
## 3-2 25.33333 5.623384 45.04328 0.0138864
TukeyHSD compares the three ads pairwise and reports each difference in mean squared rating with its lower and upper 95% bounds. All three pairs differ significantly (adjusted p < 0.05), and Ad 1 has the highest mean.
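The width of the Tukey intervals can be reconstructed from the ANOVA table. This sketch assumes a balanced design with 6 ratings per ad (18 observations over 3 ads), which the output does not state directly; the variable names are illustrative.

```r
# Values read off the ANOVA and Tukey output above.
ms_resid <- 1550.889 / 10   # residual mean square (Sum Sq / Df)
df_resid <- 10
n_ads    <- 3
# ASSUMPTION: balanced design, 18 observations / 3 ads = 6 ratings per ad.
n_per_ad <- 6

# Tukey HSD half-width for a difference between two group means.
q_crit     <- qtukey(0.95, nmeans = n_ads, df = df_resid)
half_width <- q_crit * sqrt(ms_resid / n_per_ad)

# Rebuild the interval for the Ad 2 minus Ad 1 difference.
diff_2_1 <- -81.33333
ci <- c(lwr = diff_2_1 - half_width, upr = diff_2_1 + half_width)
print(round(c(half_width = half_width, ci), 3))
```

An interval that excludes zero corresponds to an adjusted p-value below 0.05, which holds for all three comparisons in the table.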
In summary, Ad is the only significant variable. According to the TukeyHSD results, Ad 1 is rated best, followed by Ad 3, with Ad 2 last.
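The ranking follows directly from the Tukey differences, as this small sketch shows. It takes the three `diff` values from the output above and expresses each ad relative to Ad 1; the names `relative` and `ranking` are illustrative.

```r
# Tukey differences in mean squared rating, taken from the output above.
d21 <- -81.33333   # Ad 2 minus Ad 1
d31 <- -56.00000   # Ad 3 minus Ad 1
d32 <-  25.33333   # Ad 3 minus Ad 2

# Express each ad relative to Ad 1 and sort from best to worst.
relative <- c("Ad 1" = 0, "Ad 2" = d21, "Ad 3" = d31)
ranking  <- names(sort(relative, decreasing = TRUE))
print(ranking)

# Consistency check: (Ad 3 - Ad 1) - (Ad 2 - Ad 1) should equal Ad 3 - Ad 2.
print(all.equal(d31 - d21, d32, tolerance = 1e-6))
```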