Hypotheses
- Simple Regression Hypothesis: We hypothesize that
spend positively impacts
revenue—higher ad spend should correlate with higher
revenue.
- Multiple Regression Hypothesis: We hypothesize that
both spend and display campaign
presence influence revenue, with spend having
a stronger effect. Running a display campaign should contribute to
higher revenue.
Load Data
library(readxl)
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(broom)
# Load dataset
data <- read_excel("Display_data.xlsx")
# Ensure column names are clean
names(data) <- trimws(names(data))
# Convert to data frame
data <- as.data.frame(data)
# View structure and summary
str(data)
## 'data.frame': 29 obs. of 8 variables:
## $ spend : num 22.6 37.3 55.6 45.4 50.2 ...
## $ clicks : num 165 228 291 247 290 172 68 112 306 300 ...
## $ impressions : num 8672 11875 14631 11709 14768 ...
## $ display : num 0 0 0 0 0 0 0 0 0 0 ...
## $ transactions: num 2 2 3 2 3 2 1 1 3 3 ...
## $ revenue : num 58.9 44.9 141.6 209.8 197.7 ...
## $ ctr : num 1.9 1.92 1.99 2.11 1.96 1.98 2.33 1.89 2.07 2.02 ...
## $ con_rate : num 1.21 0.88 1.03 0.81 1.03 1.16 1.47 0.89 0.98 1 ...
summary(data)
## spend clicks impressions display
## Min. : 1.12 Min. : 48.0 Min. : 1862 Min. :0.0000
## 1st Qu.:28.73 1st Qu.:172.0 1st Qu.: 6048 1st Qu.:0.0000
## Median :39.68 Median :241.0 Median : 9934 Median :0.0000
## Mean :44.22 Mean :257.1 Mean :11858 Mean :0.3103
## 3rd Qu.:55.57 3rd Qu.:303.0 3rd Qu.:14789 3rd Qu.:1.0000
## Max. :91.28 Max. :593.0 Max. :29324 Max. :1.0000
## transactions revenue ctr con_rate
## Min. :1.000 Min. : 16.16 Min. :1.890 Min. :0.810
## 1st Qu.:2.000 1st Qu.:117.32 1st Qu.:1.970 1st Qu.:0.990
## Median :3.000 Median :235.16 Median :2.020 Median :1.130
## Mean :2.966 Mean :223.50 Mean :2.306 Mean :1.227
## 3rd Qu.:4.000 3rd Qu.:298.92 3rd Qu.:2.790 3rd Qu.:1.470
## Max. :6.000 Max. :522.00 Max. :3.290 Max. :2.080
Exploratory Data Analysis
# Check column names
colnames(data)
## [1] "spend" "clicks" "impressions" "display" "transactions"
## [6] "revenue" "ctr" "con_rate"
# Scatterplot of Spend vs Revenue
ggplot(data, aes(x = as.numeric(spend), y = as.numeric(revenue))) +
geom_point() +
geom_smooth(method = "lm", se = FALSE, color = "blue") +
labs(title = "Spend vs Revenue", x = "Spend ($)", y = "Revenue ($)")
## `geom_smooth()` using formula = 'y ~ x'

Simple Regression Model: Predicting Revenue Based on Spend
simple_model <- lm(as.numeric(revenue) ~ as.numeric(spend), data = data)
simple_summary <- summary(simple_model)
simple_summary
##
## Call:
## lm(formula = as.numeric(revenue) ~ as.numeric(spend), data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -145.210 -54.647 1.117 67.780 149.476
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.9397 37.9668 0.288 0.775
## as.numeric(spend) 4.8066 0.7775 6.182 1.31e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 86.71 on 27 degrees of freedom
## Multiple R-squared: 0.586, Adjusted R-squared: 0.5707
## F-statistic: 38.22 on 1 and 27 DF, p-value: 1.311e-06
Interpretation & Managerial Recommendations
- The coefficient for Spend should be positive,
indicating that increasing spending leads to higher revenue.
- If the p-value is significant (<0.05), this confirms a strong
relationship between Spend and
Revenue.
- Managers should allocate more budget toward spending but assess the
diminishing returns to ensure cost efficiency.
Multiple Regression Model: Predicting Revenue Based on Spend and
Display Campaign
multi_model <- lm(as.numeric(revenue) ~ as.numeric(spend) + as.factor(display), data = data)
multi_summary <- summary(multi_model)
multi_summary
##
## Call:
## lm(formula = as.numeric(revenue) ~ as.numeric(spend) + as.factor(display),
## data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -176.730 -35.020 8.661 56.440 129.231
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -50.8612 40.3336 -1.261 0.21850
## as.numeric(spend) 5.5473 0.7415 7.482 6.07e-08 ***
## as.factor(display)1 93.5856 33.1910 2.820 0.00908 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 77.33 on 26 degrees of freedom
## Multiple R-squared: 0.6829, Adjusted R-squared: 0.6586
## F-statistic: 28 on 2 and 26 DF, p-value: 3.271e-07
Interpretation & Managerial Recommendations
- If Display has a significant p-value, it indicates
that running a display campaign positively affects revenue.
- If the adjusted R² improves compared to the simple model, adding
Display helps explain revenue variability.
- Managers should run display campaigns strategically, ensuring they
enhance rather than merely increase ad spend.
Conclusion
- Spend is a key predictor of revenue, confirming our
hypothesis.
- Display campaigns may enhance revenue but should be
used efficiently based on ROI analysis.
- Further analysis could include interaction effects or time-based
trends to refine marketing decisions.