Advertising Experiments at Restaurant Grades

Sameer Mathur

One-way ANOVA

---

READING AND DESCRIBING DATA

Reading Data

# reading data into R
resGrades.df <- read.csv(paste("AdvertisingDataV2.csv"))
# attaching data columns of the dataframe
attach(resGrades.df)
# dimension of the dataframe
dim(resGrades.df)
[1] 30000     6

Descriptive statistics

# descriptive statistics
library(psych)
describe(resGrades.df)[, c(1:5, 8:9)]
                vars     n     mean      sd  median min   max
adType*            1 30000     2.00    0.82     2.0   1     3
pageViews          2 30000   468.06  168.16   391.0 145   929
phoneCalls         3 30000    37.71    7.97    37.0  17    77
reservations       4 30000    36.55    7.99    36.0  15    79
businessID         5 30000 15000.50 8660.40 15000.5   1 30000
restaurantType*    6 30000     1.60    0.49     2.0   1     2

Number of observations in Ad Type

# number of observations in ad type
table(adType)
adType
Curr Ads  New Ads   No Ads 
   10000    10000    10000 

Descriptive Statistics by Ad Type

# descriptive statistics of reservations by ad type
library(data.table)
dt <- data.table(resGrades.df)
dt[, list(count = .N,
        mean = round(mean(reservations), 3), 
        sd = round(mean(reservations), 3),
        median = round(median(reservations), 3),
        min = min(reservations),
        max = max(reservations)), 
   by = list(adType)]
     adType count   mean     sd median min max
1:   No Ads 10000 33.960 33.960     33  15  58
2: Curr Ads 10000 34.021 34.021     33  18  59
3:  New Ads 10000 41.681 41.681     40  18  79

Boxplot of Reservations by Ad Type

# box plot of ad type
boxplot(reservations ~ adType, data = resGrades.df,
        main = "Boxplot of Ad Type",
        xlab = "Ad Type", ylab = "Reservations",
        digits = 2)

plot of chunk unnamed-chunk-6

Mean Plot of Reservations by Ad Type

# mean plot by ad type
library(gplots)
plotmeans(reservations ~ adType, data = resGrades.df,
          xlab = "Ad Type", ylab = "Number of Reservations",
          digits = 2, col = "black", ccol = "blue", barwidth = 2,
          legends = TRUE, mean.labels = TRUE, frame = TRUE)

plot of chunk unnamed-chunk-8

One-way ANOVA

# one-way ANOVA 
oneWayfit <- aov(reservations ~ adType, data = resGrades.df)
# summary of the ANOVA model
summary(oneWayfit)
               Df  Sum Sq Mean Sq F value Pr(>F)    
adType          2  394228  197114    3885 <2e-16 ***
Residuals   29997 1522018      51                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Interpretation of One-way ANOVA Output

The p-value < 0.05 for adType, we can conclude that there are significant differences in number of reservations between three different ad types.

Tukey HSD Pairwise Comparison Test

# Tukey comparison test
TukeyHSD(oneWayfit)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = reservations ~ adType, data = resGrades.df)

$adType
                    diff        lwr        upr     p adj
New Ads-Curr Ads  7.6593  7.4232043  7.8953957 0.0000000
No Ads-Curr Ads  -0.0608 -0.2968957  0.1752957 0.8181708
No Ads-New Ads   -7.7201 -7.9561957 -7.4840043 0.0000000

Tukey Multiple Pairwise Comparisons Plot

# Tukey parir-wise comparisons plot
plot(TukeyHSD(oneWayfit))

plot of chunk unnamed-chunk-12

ANOVA Assumptions

Normality of Residuals

# residual versus fitted plot
plot(oneWayfit, 2)

plot of chunk unnamed-chunk-14

Statistical Test for Normality

# extract the residuals
aovResiduals <- residuals(oneWayfit)
# Anderson-Darling normality test
library(nortest)
ad.test(aovResiduals)

    Anderson-Darling normality test

data:  aovResiduals
A = 305.49, p-value < 2.2e-16

Statistical Test Inference

Null hypothesis (\( H_0 \)): The data is normally distributed.

We fail to reject the null hypothesis. So, we cannot assume the normality of the data.

Homogeneity of Variance

# residual versus fitted plot
plot(oneWayfit, 1)

plot of chunk unnamed-chunk-17

Statistical Test

# test homogeneity of variance
library(car)
leveneTest(reservations ~ adType, data = resGrades.df)
Levene's Test for Homogeneity of Variance (center = median)
         Df F value    Pr(>F)    
group     2  134.29 < 2.2e-16 ***
      29997                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Statistical Test Inference

Null hypothesis (\( H_0 \)): The variance of number of reservations are homogenous between groups.

We fail to reject the null hypothesis. So, we cannot assume the homogeneity of variance between the groups.

Rectify Assumptions

Box-Cox Transformation using MASS Package

# Box-cox transformation using MASS package
library(MASS)
BCTrans <- boxcox(oneWayfit, lambda = seq(-2, 2))

plot of chunk unnamed-chunk-19

Value of Lambda

# Box-Cox transformation value of lambda
BCTrans$x[which(BCTrans$y == max(BCTrans$y))]
[1] -0.1414141

Box-Cox Transformation using caret Package

# Box-Cox transformation using caret package
library(caret)
ResTrans <- BoxCoxTrans(resGrades.df$reservations)
ResTrans
Box-Cox Transformation

30000 data points used to estimate Lambda

Input data summary:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  15.00   31.00   36.00   36.55   41.00   79.00 

Largest/Smallest: 5.27 
Sample Skewness: 0.777 

Estimated Lambda: -0.2 
With fudge factor, Lambda = 0 will be used for transformations

Inference After Box-Cox Transformation

The Box-cox transformation using MASS package suggests \( \lambda = -0.1414141 \) and Box-Cox transformation using caret package suggests \( \lambda = -0.2 \).

However, both \( \lambda \) have negative fraction values. It is quit tough to transform our dependent variable also interpretation become too complicated.

Hence, we use simple approach that \( \lambda = 0 \). In this case, our dependent variable will be a log transformation.

Log-Transformation of Variable Reservations

# change Box Office Collection to Log Box Office Collection
resGrades.df$logReservations <- log(resGrades.df$reservations)
# attaching data columns
attach(resGrades.df)
# first few rows of the dataframe
head(resGrades.df)
  adType pageViews phoneCalls reservations businessID restaurantType
1 No Ads       643         44           39          1          chain
2 No Ads       621         41           44          2          chain
3 No Ads       581         40           38          3          chain
4 No Ads       592         35           31          4          chain
5 No Ads       648         45           46          5          chain
6 No Ads       519         37           41          6          chain
  logReservations
1        3.663562
2        3.784190
3        3.637586
4        3.433987
5        3.828641
6        3.713572

One-way ANOVA After Box-Cox Transformation

# one-way ANOVA (after transformations)
oneWayTransfit <- aov(logReservations ~ adType, data = resGrades.df)
# summary of the ANOVA model
summary(oneWayTransfit)
               Df Sum Sq Mean Sq F value Pr(>F)    
adType          2  278.1  139.05    3845 <2e-16 ***
Residuals   29997 1084.8    0.04                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

COMPARING ANOVA ASSUMPTIONS BEFORE-AFTER LOG TRANSFORMATION

NORMALITY ASSUMTION: Comparing Normality of Residuals

Before Log-transformation

# normal Q-Q plot before log-transformation
plot(oneWayfit, 2)

plot of chunk unnamed-chunk-25

After Log-transformation

# normal Q-Q plot after log-transformation
plot(oneWayTransfit, 2)

plot of chunk unnamed-chunk-26

NORMALITY ASSUMTION TEST: Comapring Statistical Tests for Normality

Before Log-transformation

# Anderson-Darling normality test before log-transformation
library(nortest)
ad.test(reservations)

    Anderson-Darling normality test

data:  reservations
A = 222.08, p-value < 2.2e-16

After Log-transformation

# Anderson-Darling normality test after log-transformation
library(nortest)
ad.test(logReservations)

    Anderson-Darling normality test

data:  logReservations
A = 36.301, p-value < 2.2e-16

HOMOGENEITY OF VARIANCES: Comparing Plots for the Homogeneity of Variance

Before Log-transformations

# residual versus fitted plot before log-transformation
plot(oneWayfit, 1)

plot of chunk unnamed-chunk-29

After Log-transformations

# residual versus fitted plot after log-transformation
plot(oneWayTransfit, 1)

plot of chunk unnamed-chunk-30

HOMOGENEITY OF VARIANCES: Comparing Statistical Test for the Homogeneity of Variances

Before Log-transformation

# test homogeneity of variance after log-transformation
library(car)
leveneTest(reservations ~ adType, data = resGrades.df)
Levene's Test for Homogeneity of Variance (center = median)
         Df F value    Pr(>F)    
group     2  134.29 < 2.2e-16 ***
      29997                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

After Log-transformation

# test homogeneity of variance after log-transformation
library(car)
leveneTest(logReservations ~ adType, data = resGrades.df)
Levene's Test for Homogeneity of Variance (center = median)
         Df F value    Pr(>F)    
group     2  24.465 2.419e-11 ***
      29997                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Inference After Log-Transformation

After log transformation of the dependent variable number of reservations, we find homogeneity problem i.e. we didn't have equal variances between the groups.

Same for normality assumption, we fail to accept the null hypothesis. That concludes, we cannot assume the normality of the data after the transformation.

Hence, we will use non-parametric Kruskal-Wallis rank sum test in the case of non-normality in the data.

Kruskal-Wallis Rank Sum Test

A non-parametric alternative to ANOVA is Kruskal-Wallis rank sum test, which can be used when ANOVA assumptions are not met.

Kruskal-Wallis Rank Sum Test

# Kruskal-Wallis rank sum test
kruskal.test(logReservations ~ adType, data = resGrades.df)

    Kruskal-Wallis rank sum test

data:  logReservations by adType
Kruskal-Wallis chi-squared = 5856.4, df = 2, p-value < 2.2e-16

Pairwise-Comparison Test

pairwise.t.test(logReservations, adType, data = resGrades.df, p.adjust.method = "BH", pool.sd = FALSE)

    Pairwise comparisons using t tests with non-pooled SD 

data:  logReservations and adType 

        Curr Ads New Ads
New Ads <2e-16   -      
No Ads  0.32     <2e-16 

P value adjustment method: BH 

If Variances are Heterogeneous?

The classical one-way ANOVA test requires an assumption of equal variances for all groups. In our example, the homogeneity of variance assumption turned out to be fail. Hence, we use Welch one-way test.

Welch One-way Test

# anova test when variances are not same
oneway.test(logReservations ~ adType, data = resGrades.df)

    One-way analysis of means (not assuming equal variances)

data:  logReservations and adType
F = 3892.4, num df = 2, denom df = 19993, p-value < 2.2e-16