Hypotheses
- Hypothesis 1: Exposure to either advertisement (Ads = 1 or Ads = 2)
leads to more product purchases than in the control group (Ads = 0).
- Hypothesis 2: Version 2 of the advertisement (Ads = 2) is more
effective in driving purchases than Version 1 (Ads = 1).
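Stated in terms of group means, and assuming the dummy-coded regression fitted later in this report (control group as the baseline), the two hypotheses can be written as the alternatives below. The symbols $\mu_k$ (mean purchases in group k) and $\beta_k$ (regression coefficients) are notation introduced here for illustration, not part of the original analysis.

```latex
\begin{aligned}
\text{Model: } & \text{Purchase}_i = \beta_0
   + \beta_1\,\mathbb{1}[\text{Ads}_i = 1]
   + \beta_2\,\mathbb{1}[\text{Ads}_i = 2] + \varepsilon_i \\
H_1: & \ \beta_1 = \mu_1 - \mu_0 > 0 \ \text{ or } \ \beta_2 = \mu_2 - \mu_0 > 0 \\
H_2: & \ \beta_2 - \beta_1 = \mu_2 - \mu_1 > 0
\end{aligned}
```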
Load Data
library(readxl)
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(broom)
# Load dataset
data <- read_excel("ab_testing1.xlsx")
# Ensure column names are clean
names(data) <- trimws(names(data))
# Convert to data frame
data <- as.data.frame(data)
# View structure and summary
str(data)
## 'data.frame': 29 obs. of 2 variables:
## $ Ads : num 1 0 2 0 1 1 2 2 2 0 ...
## $ Purchase: num 152 21 77 65 183 87 121 104 116 82 ...
summary(data)
## Ads Purchase
## Min. :0.000 Min. : 14.00
## 1st Qu.:0.000 1st Qu.: 51.00
## Median :1.000 Median : 77.00
## Mean :1.069 Mean : 76.07
## 3rd Qu.:2.000 3rd Qu.:104.00
## Max. :2.000 Max. :183.00
Exploratory Data Analysis
# Check column names
colnames(data)
## [1] "Ads" "Purchase"
# Boxplot of Ads vs Purchases
ggplot(data, aes(x = as.factor(Ads), y = Purchase)) +
  geom_boxplot(fill = "lightblue") +
  labs(title = "Effect of Ads on Purchases",
       x = "Ad Version", y = "Number of Purchases")
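A per-group table of sample size, mean and standard deviation is a useful companion to the boxplot. The sketch below uses only the first ten rows printed by str(data) above, so the numbers it produces are illustrative and will differ from the full 29-row dataset. The names ab_head and group_stats are introduced here and are not part of the original analysis.

```r
# First ten rows as printed by str(data); the full file has 29 rows.
ab_head <- data.frame(
  Ads      = c(1, 0, 2, 0, 1, 1, 2, 2, 2, 0),
  Purchase = c(152, 21, 77, 65, 183, 87, 121, 104, 116, 82)
)

# Group sizes, means and standard deviations, ordered by Ads (0, 1, 2).
n_by_group    <- table(ab_head$Ads)
mean_by_group <- tapply(ab_head$Purchase, ab_head$Ads, mean)
sd_by_group   <- tapply(ab_head$Purchase, ab_head$Ads, sd)

group_stats <- data.frame(
  Ads  = names(n_by_group),
  n    = as.vector(n_by_group),
  mean = as.vector(mean_by_group),
  sd   = as.vector(sd_by_group)
)
group_stats
```

Running the same code on the full data frame (before Ads is converted to a factor, or after; both work) shows how many observations sit behind each box in the plot.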

Regression Model: Predicting Purchase Based on Ads Exposure
# Convert Ads to a factor
data$Ads <- as.factor(data$Ads)
# Run regression model
model <- lm(Purchase ~ Ads, data = data)
model_summary <- summary(model)
model_summary
##
## Call:
## lm(formula = Purchase ~ Ads, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -59.75 -22.75 -3.75 30.25 64.29
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 49.00 10.21 4.800 5.69e-05 ***
## Ads1 69.71 15.91 4.383 0.000171 ***
## Ads2 24.75 13.82 1.791 0.084982 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 32.28 on 26 degrees of freedom
## Multiple R-squared: 0.4262, Adjusted R-squared: 0.3821
## F-statistic: 9.656 on 2 and 26 DF, p-value: 0.0007308
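The summary above compares each ad version with the control group, but it does not test Version 2 against Version 1 directly, which is what Hypothesis 2 requires. One way to obtain that contrast is to refit the model with Version 1 as the reference level. The sketch below does this on the first ten rows printed by str(data) only, so its estimates are illustrative and are not the results for the full dataset. The names ab_head, Ads_ref1 and fit_v1 are introduced here for the example.

```r
# First ten rows as printed by str(data); the full file has 29 rows.
ab_head <- data.frame(
  Ads      = factor(c(1, 0, 2, 0, 1, 1, 2, 2, 2, 0)),
  Purchase = c(152, 21, 77, 65, 183, 87, 121, 104, 116, 82)
)

# Original parameterisation: control (Ads = 0) is the baseline.
fit <- lm(Purchase ~ Ads, data = ab_head)

# Same model with Version 1 as the baseline, so that the coefficient
# "Ads_ref12" is the Version 2 minus Version 1 difference in mean purchases.
ab_head$Ads_ref1 <- relevel(ab_head$Ads, ref = "1")
fit_v1 <- lm(Purchase ~ Ads_ref1, data = ab_head)

v2_vs_v1 <- unname(coef(fit_v1)["Ads_ref12"])
v2_vs_v1

# Estimate, standard error, t value and p-value for each contrast,
# plus 95% confidence intervals.
summary(fit_v1)$coefficients
confint(fit_v1)
```

The releveled fit has the same residuals and R-squared as the original model; only the baseline against which the coefficients are measured changes. A negative Ads_ref12 estimate means Version 2 sold less than Version 1 in the data used.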
Interpretation & Managerial Recommendations
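- The overall model is statistically significant (F = 9.656 on 2 and 26
DF, p = 0.0007) and explains about 43% of the variation in purchases
(R-squared = 0.4262), so ad exposure is related to purchases.
- The control group (Ads = 0) has the lowest mean, 49.00 purchases (the
intercept).
- Version 1 (Ads = 1) raises mean purchases by an estimated 69.71, to
about 118.71, and the effect is significant (p = 0.00017).
- Version 2 (Ads = 2) raises mean purchases by an estimated 24.75, to
about 73.75, but the effect is significant only at the 10% level
(p = 0.085), not at the conventional 5% level.
- Hypothesis 1 is therefore supported for Version 1 and only weakly for
Version 2.
- Hypothesis 2 is not supported: the Version 1 estimate exceeds the
Version 2 estimate by 44.96 purchases. The coefficient table compares
each version with the control only, so a direct Version 1 versus
Version 2 contrast is needed to judge whether that gap is significant.
- Managers should prioritize Version 1 when allocating the advertising
budget, since it is the only version with a clearly significant lift.
- With only 29 observations the estimates are imprecise (standard errors
of roughly 14 to 16 purchases), so further testing with larger samples
or different ad formats is needed to validate these results.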
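To make "larger samples" concrete, a rough power calculation can suggest how many observations per group a follow-up test would need. The sketch below uses base R's power.t.test with the Version 2 estimate (24.75) and the residual standard error (32.28) from the regression output. The 80% power target, the 5% two-sided significance level and the two-sample t-test framing are assumptions made here, not choices stated in the original analysis.

```r
# Inputs taken from the regression output in this report.
delta_v2 <- 24.75  # estimated Version 2 vs control difference in purchases
resid_sd <- 32.28  # residual standard error, used as the common SD

# Assumed design: two-sample t-test, 5% two-sided level, 80% power.
plan <- power.t.test(
  delta       = delta_v2,
  sd          = resid_sd,
  sig.level   = 0.05,
  power       = 0.80,
  type        = "two.sample",
  alternative = "two.sided"
)

# Round up to whole observations per group.
n_per_group <- ceiling(plan$n)
n_per_group
```

If the true Version 2 effect is smaller than the current estimate, the required sample grows quickly, so it is worth rerunning the calculation with a more conservative delta.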
Conclusion
- In this sample, Version 1 of the advertisement is associated with the
largest and only clearly significant increase in purchases. The
evidence for Version 2 is weaker.
- Given these results, the retailer should scale the Version 1 campaign
first and confirm the Version 2 effect with more data before investing
in it.
- Future research can explore variations in ad placement, frequency,
and audience targeting to further optimize performance.