Q2 Midterm

Null Hypothesis (H₀): Mean product purchases are the same across the three ad groups (Ads = 0, 1, 2).

$$H_0: \mu_0 = \mu_1 = \mu_2$$

Alternative Hypothesis (H₁): At least one ad group has a mean number of purchases that differs from another group's.

$$H_1: \mu_i \neq \mu_j \text{ for some } i \neq j$$

# Install readr only if it is missing, then load the libraries
if (!requireNamespace("readr", quietly = TRUE)) install.packages("readr")

## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.4'
## (as 'lib' is unspecified)

library(ggplot2)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(readr)

# Load your data (adjust the path as needed)
data2 <- read_csv("ab_testing1.csv")

## Rows: 29 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (2): Ads, Purchase
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# Filter to include only Ads 0, 1, and 2 (excluding any unexpected group like 3)
data2 <- data2 %>% filter(Ads %in% c(0, 1, 2))

# Convert Ads to factor
data2$Ads <- factor(data2$Ads)

# Run ANOVA (linear regression with categorical predictor)
model <- aov(Purchase ~ Ads, data = data2)
summary(model)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## Ads          2  20122   10061   9.656 0.000731 ***
## Residuals   26  27090    1042                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
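
The F value in the table above can be checked by hand from the sums of squares and degrees of freedom it reports. This is only a worked check of the printed numbers, not an additional analysis:

$$
F = \frac{MS_{\text{Ads}}}{MS_{\text{Residuals}}} = \frac{20122 / 2}{27090 / 26} \approx \frac{10061}{1042} \approx 9.66
$$

The residual degrees of freedom, 26 = 29 − 3, match the 29 rows read from the file, which indicates that the filter on Ads removed no rows.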

ggplot(data2, aes(x = Ads, y = Purchase)) +
  geom_jitter(width = 0.1, alpha = 0.6, color = "darkgray") +  # individual observations
  stat_summary(fun = mean, geom = "point", size = 4, color = "red") +  # group means
  stat_summary(fun.data = mean_cl_normal, geom = "errorbar", width = 0.2, color = "red") +  # 95% CI error bars (mean_cl_normal requires the Hmisc package)
  stat_summary(fun = mean, geom = "line", aes(group = 1), color = "red", linetype = "dashed") +  # dashed line connecting the group means
  labs(title = "Regression Model: Ad Campaign Effect on Purchases",
       x = "Ad Group (0 = Control, 1 = Ad1, 2 = Ad2)",
       y = "Purchase Count") +
  theme_minimal()

## Warning: Computation failed in `stat_summary()`.
## Caused by error in `fun.data()`:
## ! The package "Hmisc" is required.
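
The warning above means the error bars were not drawn, because `mean_cl_normal` depends on the Hmisc package. Installing Hmisc is one fix. Another is to compute the interval directly, as in the minimal sketch below. The helper name `mean_ci` and the `toy` data frame are assumptions made for illustration; the toy values are not taken from `ab_testing1.csv`.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): mean with a t-based confidence interval.
# Returns the y / ymin / ymax columns that stat_summary(fun.data = ...) expects.
mean_ci <- function(x, conf = 0.95) {
  n <- length(x)
  m <- mean(x)
  half_width <- qt(1 - (1 - conf) / 2, df = n - 1) * sd(x) / sqrt(n)
  data.frame(y = m, ymin = m - half_width, ymax = m + half_width)
}

# Toy data for illustration only; NOT the values in ab_testing1.csv
toy <- data.frame(
  Ads = factor(rep(c(0, 1, 2), each = 5)),
  Purchase = c(90, 110, 100, 95, 105,
               150, 160, 140, 155, 145,
               120, 130, 110, 125, 115)
)

# Same layers as the report's plot, with mean_ci replacing mean_cl_normal
p <- ggplot(toy, aes(x = Ads, y = Purchase)) +
  stat_summary(fun = mean, geom = "point", size = 4, color = "red") +
  stat_summary(fun.data = mean_ci, geom = "errorbar", width = 0.2, color = "red")
```

With the real data, the same idea would be to pass `fun.data = mean_ci` in the original `stat_summary()` call; `print(p)` renders the toy plot.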

# Interpretation

The ANOVA shows a statistically significant difference in mean purchases across the three ad groups, F(2, 26) = 9.656, p = 0.000731. Because the p-value is below 0.05, we reject H₀ and conclude that at least one ad group's mean number of purchases differs from the others. The ANOVA by itself does not identify which group differs; that requires comparing the group means. If Ad 1 shows higher average purchases than the other groups, it would be the most effective.
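
A significant F test says only that some group differs, so deciding which ad is ahead needs the group means and a pairwise comparison. The sketch below shows one common way to do that with Tukey's HSD from base R. It runs on simulated toy data (an assumption for illustration, not the values in `ab_testing1.csv`), so its output says nothing about the actual campaigns.

```r
# Toy data for illustration only; NOT the values in ab_testing1.csv
set.seed(1)
toy <- data.frame(
  Ads = factor(rep(c(0, 1, 2), each = 10)),
  Purchase = c(rnorm(10, mean = 100, sd = 30),
               rnorm(10, mean = 150, sd = 30),
               rnorm(10, mean = 120, sd = 30))
)

# Group means show the direction of any difference
group_means <- aggregate(Purchase ~ Ads, data = toy, FUN = mean)
print(group_means)

# Tukey's HSD compares every pair of groups and adjusts for multiple comparisons
toy_model <- aov(Purchase ~ Ads, data = toy)
tukey <- TukeyHSD(toy_model, "Ads")
print(tukey)
```

On the report's own fitted object, the equivalent calls would be `aggregate(Purchase ~ Ads, data = data2, FUN = mean)` and `TukeyHSD(model, "Ads")`.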

# Managerial Recommendations

Focus on Ad 1: If Ad 1's mean purchases are significantly higher than the other groups', prioritize it in future campaigns to maximize purchases.

Reevaluate Ad 2: If Ad 2 performs only moderately, revise its content or test variations to improve its effectiveness.

Limit Control Group: If the control group (Ads = 0) has the lowest mean purchases, that supports continuing to run the experimental ads rather than going without advertising. Keep only a small control group in future tests as a baseline.

2025-04-02