Hypotheses

  1. Hypothesis 1: Exposure to either advertisement (Ads = 1 or Ads = 2) leads to more product purchases than no exposure (control group, Ads = 0).
  2. Hypothesis 2: Version 2 of the advertisement (Ads = 2) is more effective in driving purchases than Version 1 (Ads = 1).

Load Data

library(readxl)
library(ggplot2)
library(dplyr)
library(broom)

# Load dataset
data <- read_excel("ab_testing1.xlsx")

# Ensure column names are clean
names(data) <- trimws(names(data))

# Convert to data frame
data <- as.data.frame(data)

# View structure and summary
str(data)
## 'data.frame':    29 obs. of  2 variables:
##  $ Ads     : num  1 0 2 0 1 1 2 2 2 0 ...
##  $ Purchase: num  152 21 77 65 183 87 121 104 116 82 ...
summary(data)
##       Ads           Purchase     
##  Min.   :0.000   Min.   : 14.00  
##  1st Qu.:0.000   1st Qu.: 51.00  
##  Median :1.000   Median : 77.00  
##  Mean   :1.069   Mean   : 76.07  
##  3rd Qu.:2.000   3rd Qu.:104.00  
##  Max.   :2.000   Max.   :183.00

Exploratory Data Analysis

# Check column names
colnames(data)
## [1] "Ads"      "Purchase"
# Boxplot of purchases by ad group
ggplot(data, aes(x = as.factor(Ads), y = Purchase)) +
  geom_boxplot(fill = "lightblue") +
  labs(
    title = "Effect of Ads on Purchases",
    x = "Ad Group (0 = Control, 1 = Version 1, 2 = Version 2)",
    y = "Number of Purchases"
  )
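
The boxplot gives a visual comparison only. A per-group table of sample sizes, means, and standard deviations puts numbers on it. The following is a minimal base R sketch: `summarise_by_ad` is a hypothetical helper (not part of the original analysis), and `toy` is made-up data used only so the block runs on its own.

```r
# Hypothetical helper: per-group sample size, mean, and standard deviation.
# Expects a data frame with columns `Ads` and `Purchase`.
summarise_by_ad <- function(df) {
  groups <- split(df$Purchase, df$Ads)
  data.frame(
    Ads  = names(groups),
    n    = vapply(groups, length, integer(1)),
    mean = vapply(groups, mean, numeric(1)),
    sd   = vapply(groups, sd, numeric(1)),
    row.names = NULL
  )
}

# Made-up toy data for illustration only (NOT the values in ab_testing1.xlsx).
toy <- data.frame(
  Ads      = c(0, 0, 0, 1, 1, 1, 2, 2, 2),
  Purchase = c(40, 50, 60, 100, 120, 140, 70, 80, 90)
)

group_stats <- summarise_by_ad(toy)
print(group_stats)
```

On the report's data, `summarise_by_ad(data)` would produce the corresponding table and would also show how many observations fall in each ad group.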

Regression Model: Predicting Purchase Based on Ads Exposure

# Convert Ads to a factor
data$Ads <- as.factor(data$Ads)

# Run regression model
model <- lm(Purchase ~ Ads, data = data)
model_summary <- summary(model)
model_summary
## 
## Call:
## lm(formula = Purchase ~ Ads, data = data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -59.75 -22.75  -3.75  30.25  64.29 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    49.00      10.21   4.800 5.69e-05 ***
## Ads1           69.71      15.91   4.383 0.000171 ***
## Ads2           24.75      13.82   1.791 0.084982 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 32.28 on 26 degrees of freedom
## Multiple R-squared:  0.4262, Adjusted R-squared:  0.3821 
## F-statistic: 9.656 on 2 and 26 DF,  p-value: 0.0007308
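
The coefficients above compare each ad version with the control group, so they do not test Hypothesis 2 (Version 2 versus Version 1) directly. One way to get that comparison is to refit the same model with Ads = 1 as the reference level. The `Ads2` row then estimates the Version 2 minus Version 1 difference, which for the model above equals 24.75 − 69.71 = −44.96. The sketch below assumes this approach. `compare_versions` is a hypothetical helper and `toy` is made-up data, so the printed numbers are not the report's results.

```r
# Hypothetical helper: refit Purchase ~ Ads with a chosen reference level
# and return the coefficient table (estimate, SE, t value, p-value).
compare_versions <- function(df, ref = "1") {
  df$Ads <- relevel(as.factor(df$Ads), ref = ref)
  fit <- lm(Purchase ~ Ads, data = df)
  coef(summary(fit))
}

# Made-up toy data for illustration only (NOT the values in ab_testing1.xlsx).
toy <- data.frame(
  Ads      = c(0, 0, 0, 1, 1, 1, 2, 2, 2),
  Purchase = c(40, 50, 60, 100, 120, 140, 70, 80, 90)
)

# With Ads = 1 as the baseline, the "Ads2" row is Version 2 minus Version 1.
v1_ref <- compare_versions(toy, ref = "1")
print(v1_ref)
```

Calling `compare_versions(data)` on the report's data would give the standard error and p-value for the Version 2 versus Version 1 difference. The reported p-value is two-sided, while Hypothesis 2 is directional.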

Interpretation & Managerial Recommendations

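  • Advertising overall: the overall F-test is significant (F = 9.656 on 2 and 26 DF, p = 0.0007), so mean purchases differ across the three groups. The model explains about 43% of the variance in purchases (R-squared = 0.4262).
  • Version 1 (Ads = 1): purchases average 69.71 higher than the control mean of 49.00, a statistically significant increase (p = 0.0002).
  • Version 2 (Ads = 2): purchases average 24.75 higher than control, but the difference is only marginally significant (p = 0.085) and does not meet the 5% threshold.
  • The control group has the lowest mean purchases (49.00, versus 118.71 for Version 1 and 73.75 for Version 2). Hypothesis 1 is therefore supported for Version 1, while the evidence for Version 2 is weak.
  • Hypothesis 2 is not supported: the Version 1 coefficient (69.71) is larger than the Version 2 coefficient (24.75), so Version 1 appears to be the more effective ad. Both coefficients are comparisons with the control group; the model summary does not directly test Version 1 against Version 2.
  • Managers should prioritize Version 1 when allocating the advertising budget.
  • With only 29 observations, these estimates are imprecise. Further testing with larger samples or different ad formats would help validate the results.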
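
To make the larger-sample suggestion concrete, a rough power calculation can indicate how many observations per group a follow-up test might need. The sketch below uses `power.t.test` from base R's `stats` package, with the Ads = 2 coefficient (24.75) as the effect size and the residual standard error (32.28) as the standard deviation. The two-sided 5% significance level, the 80% power target, and the simplification to a two-group comparison with equal group sizes are assumptions, not choices made in the original analysis.

```r
# Inputs taken from the regression output:
#   delta = Ads2 coefficient (Version 2 vs. control)
#   sd    = residual standard error
# Assumptions (not from the original analysis): two-sided test,
# 5% significance level, 80% target power, equal group sizes.
power_plan <- power.t.test(
  delta = 24.75,
  sd = 32.28,
  sig.level = 0.05,
  power = 0.80,
  type = "two.sample",
  alternative = "two.sided"
)

# Round up to whole observations per group.
n_per_group <- ceiling(power_plan$n)
print(n_per_group)
```

The result is only a planning figure: it depends on the true effect being close to the estimate from this small sample.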

Conclusion