Introduction

This document analyzes two datasets:

1. Display_data.csv – Analysis of revenue in relation to spend and display campaigns.
2. ab_testing1.csv – Analysis of the effect of different advertising campaigns on product purchase.

For each dataset, I will describe the hypotheses, run regression analyses, and provide managerial recommendations.


Q1: Analysis on Display_data.csv

Data Loading and Summary

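# Load the display campaign data (assumes the file is in the working directory)
display_data <- read.csv("Display_data.csv")

# Fit a simple linear regression model: revenue ~ spend
simple_model <- lm(revenue ~ spend, data = display_data)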

# Display the summary of the model
summary(simple_model)
## 
## Call:
## lm(formula = revenue ~ spend, data = display_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -145.210  -54.647    1.117   67.780  149.476 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  10.9397    37.9668   0.288    0.775    
## spend         4.8066     0.7775   6.182 1.31e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 86.71 on 27 degrees of freedom
## Multiple R-squared:  0.586,  Adjusted R-squared:  0.5707 
## F-statistic: 38.22 on 1 and 27 DF,  p-value: 1.311e-06
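
The coefficient table above implies a fitted line of roughly revenue = 10.94 + 4.81 × spend. As a minimal sketch, the snippet below plugs the reported estimates into that equation. The helper name `predict_revenue` and the spend value of 50 are illustrative assumptions, not part of the original analysis.

```r
# Coefficients copied from the summary(simple_model) output above
intercept   <- 10.9397
slope_spend <- 4.8066

# Hypothetical helper: predicted revenue at a given spend level
predict_revenue <- function(spend) {
  intercept + slope_spend * spend
}

# Each extra unit of spend adds about 4.81 units of predicted revenue
predict_revenue(50)  # 10.9397 + 4.8066 * 50 = 251.2697
```

With `simple_model` in memory, `predict(simple_model, newdata = data.frame(spend = 50))` should give the same point estimate up to rounding of the printed coefficients.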

# Fit a multiple regression model: revenue ~ spend + display
multiple_model <- lm(revenue ~ spend + display, data = display_data)

# Display the summary of the multiple regression model
summary(multiple_model)
## 
## Call:
## lm(formula = revenue ~ spend + display, data = display_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -176.730  -35.020    8.661   56.440  129.231 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -50.8612    40.3336  -1.261  0.21850    
## spend         5.5473     0.7415   7.482 6.07e-08 ***
## display      93.5856    33.1910   2.820  0.00908 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 77.33 on 26 degrees of freedom
## Multiple R-squared:  0.6829, Adjusted R-squared:  0.6586 
## F-statistic:    28 on 2 and 26 DF,  p-value: 3.271e-07
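
One way to judge whether adding display improves on the spend-only model is a partial F-test on the two nested fits. The sketch below rebuilds that test from the residual standard errors and degrees of freedom printed in the two summaries, so it is subject to rounding in those printed values. All variable names are illustrative.

```r
# Values copied from the two model summaries above
rse_simple <- 86.71; df_simple <- 27   # revenue ~ spend
rse_multi  <- 77.33; df_multi  <- 26   # revenue ~ spend + display

# Residual sum of squares = RSE^2 * residual degrees of freedom
rss_simple <- rse_simple^2 * df_simple
rss_multi  <- rse_multi^2  * df_multi

# Partial F-statistic for the one added predictor (display)
df_added <- df_simple - df_multi
f_stat   <- ((rss_simple - rss_multi) / df_added) / (rss_multi / df_multi)
p_value  <- pf(f_stat, df_added, df_multi, lower.tail = FALSE)

f_stat   # close to the squared t value for display (2.820^2)
p_value  # below 0.05, in line with the display row of the coefficient table
```

With both fitted models in memory, `anova(simple_model, multiple_model)` performs the same comparison directly.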

Plot of the relationship between spend and revenue

plot(display_data$spend, display_data$revenue,
     main = "Revenue vs Spend",
     xlab = "Spend",
     ylab = "Revenue",
     pch = 19, col = "darkolivegreen3")
abline(simple_model, col = "hotpink")

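
Q2: Analysis on ab_testing1.csv

# Load the A/B test data (assumes the file is in the working directory)
ab_test <- read.csv("ab_testing1.csv")

# Convert 'Ads' to a factor since it represents three groups: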
# 0 = control, 1 = version 1 of ads, 2 = version 2 of ads.
ab_test$Ads <- as.factor(ab_test$Ads)

# Fit the regression model using Ads as the predictor for purchase.
# The control group (Ads = 0) is the reference level.
ab_model <- lm(Purchase ~ Ads, data = ab_test)

# Display the summary of the regression model
summary(ab_model)
## 
## Call:
## lm(formula = Purchase ~ Ads, data = ab_test)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -59.75 -22.75  -3.75  30.25  64.29 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    49.00      10.21   4.800 5.69e-05 ***
## Ads1           69.71      15.91   4.383 0.000171 ***
## Ads2           24.75      13.82   1.791 0.084982 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 32.28 on 26 degrees of freedom
## Multiple R-squared:  0.4262, Adjusted R-squared:  0.3821 
## F-statistic: 9.656 on 2 and 26 DF,  p-value: 0.0007308
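
Because Ads is a factor with the control group as the reference level, the intercept is the control group's mean purchase and each Ads coefficient is that group's difference from control. The sketch below recovers the three group means from the printed estimates. The variable names are illustrative.

```r
# Coefficients copied from the summary(ab_model) output above
coef_intercept <- 49.00   # mean of the control group (Ads = 0)
coef_ads1      <- 69.71   # ad version 1 minus control
coef_ads2      <- 24.75   # ad version 2 minus control

# Group means implied by the dummy-coded regression
group_means <- c(
  control = coef_intercept,
  ads1    = coef_intercept + coef_ads1,
  ads2    = coef_intercept + coef_ads2
)
group_means
```

Only the version 1 difference is significant at the 5% level (p = 0.000171). The version 2 difference is not (p = 0.085).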

# Conduct an ANOVA to test for overall group differences
anova(ab_model)
## Analysis of Variance Table
## 
## Response: Purchase
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## Ads        2  20122 10061.1  9.6564 0.0007308 ***
## Residuals 26  27090  1041.9                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
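
The ANOVA table and the regression summary describe the same fit, which can be checked by hand. As a rough sketch, the snippet below recomputes the F-statistic, its p-value, and R-squared from the printed sums of squares. The results match only up to rounding of those printed values.

```r
# Sums of squares and degrees of freedom copied from the ANOVA table above
ss_ads <- 20122; df_ads <- 2
ss_res <- 27090; df_res <- 26

# F = mean square between groups / mean square within groups
f_stat <- (ss_ads / df_ads) / (ss_res / df_res)

# R-squared = share of total variation in Purchase explained by Ads
r_squared <- ss_ads / (ss_ads + ss_res)

# Upper-tail p-value of the F distribution
p_value <- pf(f_stat, df_ads, df_res, lower.tail = FALSE)

c(F = f_stat, R2 = r_squared, p = p_value)
```

The overall test only says that at least one group mean differs. A possible follow-up, assuming `ab_test` is loaded with Ads as a factor, is `TukeyHSD(aov(Purchase ~ Ads, data = ab_test))` for pairwise comparisons between the three groups.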