The equation
sales = b0 + b1*youtube + b2*facebook
is known as an additive model. It assumes that the effect of one predictor on sales is independent of the other predictors.
This assumption may not hold in practice. For example, spending money on Facebook ads might increase the effectiveness of YouTube advertising.
In statistics, this is called an interaction effect.
In that case, the equation becomes
sales = b0 + b1*youtube + b2*facebook + b3*(youtube*facebook)
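As a preview of the R formula syntax used below, the two equations correspond to these model formulas (a sketch; f_additive and f_interaction are just illustrative names, and the actual fitting on real data follows):
# Additive model: main effects only
f_additive <- sales ~ youtube + facebook
# Interaction model: main effects plus the youtube x facebook product term
f_interaction <- sales ~ youtube + facebook + youtube:facebook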
Let's load the packages
library(tidyverse)
## ── Attaching packages ──────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.0 ✓ purrr 0.3.3
## ✓ tibble 3.0.0 ✓ dplyr 0.8.5
## ✓ tidyr 1.0.2 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.5.0
## ── Conflicts ─────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(caret)
## Loading required package: lattice
##
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
##
## lift
We’ll use the marketing dataset from the datarium package.
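If the datarium package is not installed yet, it can be installed from CRAN first:
# Install datarium (which provides the marketing dataset) if it is missing
if (!requireNamespace("datarium", quietly = TRUE)) install.packages("datarium")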
Load the data
data("marketing",package="datarium")
Inspect the data
sample_n(marketing,3)
## youtube facebook newspaper sales
## 1 207.00 21.72 36.84 17.28
## 2 263.76 40.20 54.12 23.52
## 3 91.56 33.00 19.20 14.40
Split the data into training and test sets
set.seed(123)
training.samples <- createDataPartition(marketing$sales,p=0.8,list=FALSE)
train.data <- marketing[training.samples,]
test.data <- marketing[-training.samples,]
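As a quick sanity check, roughly 80% of the rows should end up in the training set:
# Row counts of the two splits
nrow(train.data)
nrow(test.data)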
Build the model
model <- lm(sales~youtube+facebook,data=train.data)
Summarize the model
summary(model)
##
## Call:
## lm(formula = sales ~ youtube + facebook, data = train.data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.8127 -1.0073 0.3236 1.4643 3.3454
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.658580 0.393609 9.295 <2e-16 ***
## youtube 0.044650 0.001548 28.846 <2e-16 ***
## facebook 0.190165 0.009006 21.114 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.038 on 159 degrees of freedom
## Multiple R-squared: 0.8954, Adjusted R-squared: 0.894
## F-statistic: 680.2 on 2 and 159 DF, p-value: < 2.2e-16
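Besides the printed summary, the individual estimates and their 95% confidence intervals can be extracted directly:
# Coefficient estimates b0, b1, b2
coef(model)
# 95% confidence intervals for the coefficients
confint(model)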
Make predictions
predictions <- predict(model,test.data)
Model performance: RMSE
RMSE(predictions,test.data$sales)
## [1] 1.949125
R-squared
R2(predictions,test.data$sales)
## [1] 0.9069336
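For reference, caret's RMSE() and R2() helpers can be reproduced by hand; by default R2() measures the squared correlation between predictions and observed values (a sketch):
# Root mean squared error of the predictions on the test set
sqrt(mean((predictions - test.data$sales)^2))
# Squared correlation between predictions and observed sales
cor(predictions, test.data$sales)^2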
In R, you include an interaction between two variables with the : operator, or use the * operator as shorthand for the main effects plus their interaction.
Build the model
model2 <- lm(sales~youtube+facebook+youtube:facebook,data=train.data)
or, equivalently, using the * shorthand
model2 <- lm(sales~youtube*facebook,data=train.data)
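To check that the two forms really fit the same model, you can compare their coefficients (m_colon and m_star are throwaway names for this check):
m_colon <- lm(sales ~ youtube + facebook + youtube:facebook, data = train.data)
m_star <- lm(sales ~ youtube * facebook, data = train.data)
# The * shorthand expands to the same three terms, so this should be TRUE
all.equal(coef(m_colon), coef(m_star))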
Summarize the model
summary(model2)
##
## Call:
## lm(formula = sales ~ youtube * facebook, data = train.data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.6325 -0.5051 0.2666 0.7425 1.8109
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.179e+00 3.306e-01 24.741 < 2e-16 ***
## youtube 1.859e-02 1.660e-03 11.196 < 2e-16 ***
## facebook 2.781e-02 1.016e-02 2.739 0.00687 **
## youtube:facebook 9.145e-04 4.952e-05 18.467 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.15 on 158 degrees of freedom
## Multiple R-squared: 0.9669, Adjusted R-squared: 0.9662
## F-statistic: 1537 on 3 and 158 DF, p-value: < 2.2e-16
Make predictions and compute the RMSE on the test set
predictions <- model2 %>% predict(test.data)
RMSE(predictions,test.data$sales)
## [1] 1.055644
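The test-set R-squared can be computed in the same way as before:
R2(predictions, test.data$sales)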
It can be seen that all the coefficients, including the interaction term coefficient, are statistically significant, suggesting that there is an interaction relationship between the two predictor variables (youtube and facebook advertising).
The fitted model looks like
sales = 8.18 + 0.019*youtube + 0.028*facebook + 0.0009*youtube*facebook
We can interpret this as follows: an increase in youtube advertising of 1000 dollars is associated with an increase in sales of (b1 + b3*facebook)*1000 = 19 + 0.9*facebook units, while an increase in facebook advertising of 1000 dollars is associated with an increase in sales of (b2 + b3*youtube)*1000 = 28 + 0.9*youtube units.
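The same calculation can be reproduced from the fitted coefficients; for example, the estimated effect of an extra 1000 dollars of youtube advertising at a hypothetical facebook budget of 50 (a sketch):
b <- coef(model2)
# (b1 + b3*facebook)*1000, evaluated at facebook = 50
(b["youtube"] + b["youtube:facebook"] * 50) * 1000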
Note that sometimes the interaction term is significant while the main effects are not. The hierarchical principle states that if we include an interaction in a model, we should also include the main effects, even if the p-values associated with their coefficients are not significant.
The RMSE of the interaction model is 1.06, compared to 1.95 for the additive model, and its R-squared is also higher.
These results suggest that the model with the interaction term is better than the model that contains only main effects.
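Because the additive model is nested inside the interaction model and both were fitted on train.data, the two can also be compared with a formal F-test (a sketch):
# F-test of the additive model against the interaction model
anova(model, model2)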