Instagram   : https://www.instagram.com/marvis.zerex/
RPubs          : https://rpubs.com/invokerarts/
LinkedIn     : https://www.linkedin.com/in/jeffry-wijaya-087a191b5/
Major          : Business Statistics
Address      : ARA Center, Matana University Tower, Jl. CBD Barat Kav, RT.1, Curug Sangereng, Kelapa Dua, Tangerang, Banten 15810.



1 Case Studies

Every organization should consider how each dollar spent on marketing affects its sales. A fiscally prudent organization uses its relatively scarce resources wisely, so it needs to ask, “Is the money I’m spending worth the return on sales?” It can dig deeper by asking, “For every dollar spent on marketing, how much are we getting in return on sales?” A simple linear regression model can answer these questions. As always, we will use a fabricated example to examine a store’s marketing efforts and their impact on sales. This will also serve as a more comprehensive primer on simple linear regression, the model that most econometrics students encounter first.

library(DT)  # provides datatable()
Data <- readRDS("marketing.rds")
datatable(Data)

2 The Objective

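# Fit one simple regression per medium (data = Data replaces attach())
lm.youtube   <- lm(sales ~ youtube,   data = Data)
lm.facebook  <- lm(sales ~ facebook,  data = Data)
lm.newspaper <- lm(sales ~ newspaper, data = Data)
# Scatter plots with fitted regression lines, side by side
par(mfrow = c(1, 3))
plot(sales ~ youtube, data = Data, cex.lab = 2, cex.axis = 1.2)
abline(lm.youtube, col = "red", lty = 1, lwd = 2)
plot(sales ~ facebook, data = Data, cex.lab = 2, cex.axis = 1.2)
abline(lm.facebook, col = "green", lty = 1, lwd = 2)
title("Data", cex.main = 2, font.main = 4, col.main = "black")  # title sits over the middle panel
plot(sales ~ newspaper, data = Data, cex.lab = 2, cex.axis = 1.2)
abline(lm.newspaper, col = "blue", lty = 1, lwd = 2)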

Some questions we would like to answer are:

2.1 Is there a relationship between advertising budget and sales?

ad.lm <- lm(sales~., data=Data)
summary(ad.lm)
## 
## Call:
## lm(formula = sales ~ ., data = Data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -10.5932  -1.0690   0.2902   1.4272   3.3951 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.526667   0.374290   9.422   <2e-16 ***
## youtube      0.045765   0.001395  32.809   <2e-16 ***
## facebook     0.188530   0.008611  21.893   <2e-16 ***
## newspaper   -0.001037   0.005871  -0.177     0.86    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.023 on 196 degrees of freedom
## Multiple R-squared:  0.8972, Adjusted R-squared:  0.8956 
## F-statistic: 570.3 on 3 and 196 DF,  p-value: < 2.2e-16

Hypothesis test: \(H_0:\beta_{youtube}=\beta_{facebook}=\beta_{newspaper}=0\). The F-statistic of 570.3 has a p-value below 2.2e-16, so we reject \(H_0\): there is a relationship between advertising and sales.
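
The overall F-statistic can be recovered from \(R^2\) alone. The sketch below plugs in the \(R^2\) value reported in section 2.2; the sample size n = 200 is an inference from the 196 residual degrees of freedom plus four estimated coefficients, not a value read from the data file.

```r
# Values copied from the regression output in this document
r2 <- 0.8972106     # multiple R-squared (section 2.2)
p  <- 3             # number of predictors
n  <- 196 + p + 1   # residual df + estimated coefficients (inferred)

# Overall F-statistic for H0: all slope coefficients are zero
f_stat <- (r2 / p) / ((1 - r2) / (n - p - 1))
f_stat

# Upper-tail p-value from the F distribution with (p, n - p - 1) df
p_value <- pf(f_stat, df1 = p, df2 = n - p - 1, lower.tail = FALSE)
p_value
```

The result should agree with the F-statistic printed by summary(ad.lm) up to rounding.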

2.2 How strong is the relationship?

There are two measures of accuracy for a linear model: the RSE and \(R^2\), where the RSE is the residual standard error, computed from the residuals \(y-\hat{y}\).

rse <- summary(ad.lm)$sigma
mean(Data$sales) 
## [1] 16.827
rse/mean(Data$sales)
## [1] 0.1202004
rsq=summary(ad.lm)$r.sq
rsq
## [1] 0.8972106

The predictors explain almost 90% of the variance in sales (\(R^2=0.8972106\)). The RSE is 2.023, which is about 12% of the mean sales of 16.827.
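
To make the two measures concrete, here is a minimal sketch that computes the RSE and \(R^2\) by hand and compares them with what summary() reports. The data frame sim is simulated purely for illustration (an assumption, since marketing.rds is not bundled here), so its numbers will differ from the ones above.

```r
# Simulated stand-in for the marketing data (hypothetical, not the real file)
set.seed(1)
n <- 200
sim <- data.frame(
  youtube   = runif(n, 0, 350),
  facebook  = runif(n, 0, 60),
  newspaper = runif(n, 0, 130)
)
sim$sales <- 3.5 + 0.045 * sim$youtube + 0.19 * sim$facebook + rnorm(n, sd = 2)
fit <- lm(sales ~ ., data = sim)

rss <- sum(residuals(fit)^2)                  # residual sum of squares
tss <- sum((sim$sales - mean(sim$sales))^2)   # total sum of squares
p   <- length(coef(fit)) - 1                  # number of predictors

rse_manual <- sqrt(rss / (n - p - 1))         # residual standard error
rsq_manual <- 1 - rss / tss                   # share of variance explained

# Both comparisons should return TRUE
all.equal(rse_manual, summary(fit)$sigma)
all.equal(rsq_manual, summary(fit)$r.squared)
```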

2.3 Which media contribute to sales?

coef <- summary(ad.lm)$coefficients
coef
##                 Estimate  Std. Error    t value     Pr(>|t|)
## (Intercept)  3.526667243 0.374289884  9.4222884 1.267295e-17
## youtube      0.045764645 0.001394897 32.8086244 1.509960e-81
## facebook     0.188530017 0.008611234 21.8934961 1.505339e-54
## newspaper   -0.001037493 0.005871010 -0.1767146 8.599151e-01

In the multiple linear regression shown, the p-values for youtube and facebook are low, but the p-value for newspaper is not. This suggests that only youtube and facebook are related to sales once the other media are taken into account.

2.4 How accurately can we estimate the effect of each medium on sales?

lolim=coef[,1] - 1.96*coef[,2]
uplim=coef[,1] + 1.96*coef[,2]
cbind(lolim,uplim)
##                   lolim      uplim
## (Intercept)  2.79305907 4.26027542
## youtube      0.04303065 0.04849864
## facebook     0.17165200 0.20540804
## newspaper   -0.01254467 0.01046969

The standard error of \(\hat{\beta}_j\) can be used to construct a confidence interval for \(\beta_j\). For our data, the approximate 95% confidence intervals are (0.04, 0.05) for youtube, (0.17, 0.21) for facebook, and (-0.01, 0.01) for newspaper. The intervals for youtube and facebook do not contain 0, which indicates that both media are related to sales. The interval for newspaper does contain 0, so this variable is not statistically significant in the presence of youtube and facebook.
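
The multiplier 1.96 is the normal approximation. As a sketch of the exact version, the code below builds the intervals with the t quantile for the residual degrees of freedom and checks them against confint(). The data frame sim is simulated for illustration only (an assumption), so the intervals differ from those above.

```r
# Simulated stand-in for the marketing data (hypothetical, not the real file)
set.seed(1)
n <- 200
sim <- data.frame(
  youtube   = runif(n, 0, 350),
  facebook  = runif(n, 0, 60),
  newspaper = runif(n, 0, 130)
)
sim$sales <- 3.5 + 0.045 * sim$youtube + 0.19 * sim$facebook + rnorm(n, sd = 2)
fit <- lm(sales ~ ., data = sim)

est   <- coef(summary(fit))                    # estimates and standard errors
tcrit <- qt(0.975, df = df.residual(fit))      # exact 95% multiplier, close to 1.96

manual <- cbind(
  lolim = est[, "Estimate"] - tcrit * est[, "Std. Error"],
  uplim = est[, "Estimate"] + tcrit * est[, "Std. Error"]
)
manual

# confint() uses the same t-based construction; this should return TRUE
all.equal(unname(manual), unname(confint(fit, level = 0.95)))
```

With 196 residual degrees of freedom the t multiplier is only slightly larger than 1.96, so the two approaches give nearly identical intervals.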

\(VIF(\hat\beta_{j})=1/(1-R^2_{X_j \mid X_{-j}})\), where \(R^2_{X_j \mid X_{-j}}\) is the \(R^2\) from regressing \(X_j\) on all the other predictors. A VIF above 5 or 10 indicates collinearity.

library(car)  # vif() is assumed to come from the car package
vif(ad.lm)
##   youtube  facebook newspaper 
##  1.004611  1.144952  1.145187

The VIF scores are 1.00, 1.14, and 1.15 for youtube, facebook, and newspaper, so there is no evidence of collinearity.
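
The VIF definition can also be applied directly in base R. The following is a minimal sketch, not the implementation behind vif(): the helper vif_manual and the simulated predictors are hypothetical names and data introduced for illustration.

```r
# Simulated predictors (hypothetical); newspaper is mildly tied to facebook
set.seed(1)
n <- 200
X <- data.frame(
  youtube  = runif(n, 0, 350),
  facebook = runif(n, 0, 60)
)
X$newspaper <- 20 + 0.5 * X$facebook + rnorm(n, sd = 25)

# VIF for one predictor: regress it on the remaining predictors,
# then apply 1 / (1 - R^2)
vif_manual <- function(name, predictors) {
  others <- setdiff(names(predictors), name)
  aux    <- lm(reformulate(others, response = name), data = predictors)
  1 / (1 - summary(aux)$r.squared)
}

vifs <- sapply(names(X), vif_manual, predictors = X)
vifs
```

A value of 1 means the predictor is uncorrelated with the others; values grow as the auxiliary \(R^2\) approaches 1.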

2.5 How accurately can we predict future sales?

predict(ad.lm, newdata=data.frame(youtube=176.451,facebook=27.9168,newspaper=36.6648),
        interval="confidence")
##      fit      lwr      upr
## 1 16.827 16.54494 17.10906
predict(ad.lm, newdata=data.frame(youtube=176.451,facebook=27.9168,newspaper=36.6648),
        interval="prediction")
##      fit      lwr      upr
## 1 16.827 12.82816 20.82584

The prediction interval is always wider than the confidence interval because it also accounts for the uncertainty in \(\epsilon\), the irreducible error.
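
The difference in width can be traced to one extra variance term. As a sketch on simulated data (an assumption; sim is not the marketing data), the code below rebuilds both half-widths from the standard error of the fitted mean and the residual standard error, then checks them against predict().

```r
# Simulated stand-in for the marketing data (hypothetical, not the real file)
set.seed(1)
n <- 200
sim <- data.frame(
  youtube   = runif(n, 0, 350),
  facebook  = runif(n, 0, 60),
  newspaper = runif(n, 0, 130)
)
sim$sales <- 3.5 + 0.045 * sim$youtube + 0.19 * sim$facebook + rnorm(n, sd = 2)
fit <- lm(sales ~ ., data = sim)

# Predict at the sample means of the three budgets
new <- data.frame(
  youtube   = mean(sim$youtube),
  facebook  = mean(sim$facebook),
  newspaper = mean(sim$newspaper)
)
pr    <- predict(fit, newdata = new, se.fit = TRUE)
tcrit <- qt(0.975, df = pr$df)

ci_half <- tcrit * as.numeric(pr$se.fit)                           # uncertainty in the mean only
pi_half <- tcrit * sqrt(as.numeric(pr$se.fit)^2 + pr$residual.scale^2)  # plus the error term

ci <- predict(fit, newdata = new, interval = "confidence")
pi <- predict(fit, newdata = new, interval = "prediction")

# Both comparisons should return TRUE
all.equal(ci_half, as.numeric(ci[1, "upr"] - ci[1, "fit"]))
all.equal(pi_half, as.numeric(pi[1, "upr"] - pi[1, "fit"]))
```

Because the residual variance is added under the square root, pi_half can never be smaller than ci_half.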

2.6 Is there synergy (interaction) among the advertising media?

ad.lm2 <- lm(sales~.^2, data=Data)
summary(ad.lm2)
## 
## Call:
## lm(formula = sales ~ .^2, data = Data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.1087 -0.4745  0.2248  0.7172  1.8320 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         7.752e+00  3.811e-01  20.342   <2e-16 ***
## youtube             2.033e-02  1.609e-03  12.633   <2e-16 ***
## facebook            2.293e-02  1.141e-02   2.009   0.0460 *  
## newspaper           1.703e-02  1.007e-02   1.691   0.0924 .  
## youtube:facebook    9.494e-04  4.764e-05  19.930   <2e-16 ***
## youtube:newspaper  -6.643e-05  2.983e-05  -2.227   0.0271 *  
## facebook:newspaper -9.133e-05  1.969e-04  -0.464   0.6433    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.126 on 193 degrees of freedom
## Multiple R-squared:  0.9686, Adjusted R-squared:  0.9677 
## F-statistic: 993.3 on 6 and 193 DF,  p-value: < 2.2e-16
summary(ad.lm2)$r.sq;summary(ad.lm)$r.sq 
## [1] 0.9686311
## [1] 0.8972106

With interaction terms in the model, \(R^2\) rises from about 90% to almost 97%. The youtube:facebook term is highly significant, which points to synergy between these two media.
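
With interactions in the model, the effect of one more unit of youtube spending is no longer a single number: it depends on the facebook and newspaper budgets. The sketch below evaluates that marginal effect using the coefficients printed above; the helper youtube_effect is a hypothetical name, and the budgets are the ones used for prediction in section 2.5.

```r
# Coefficients copied from the summary(ad.lm2) output above
b_youtube           <- 2.033e-02
b_youtube_facebook  <- 9.494e-04
b_youtube_newspaper <- -6.643e-05

# Marginal effect of youtube on sales at given levels of the other media:
# d(sales)/d(youtube) = b_youtube + b_yt:fb * facebook + b_yt:np * newspaper
youtube_effect <- function(facebook, newspaper) {
  b_youtube + b_youtube_facebook * facebook + b_youtube_newspaper * newspaper
}

# At the budgets used in section 2.5
youtube_effect(facebook = 27.9168, newspaper = 36.6648)

# Same newspaper budget, but no facebook spending
youtube_effect(facebook = 0, newspaper = 36.6648)
```

The first value is roughly twice the second, which is the synergy in concrete terms: youtube advertising is associated with larger sales gains when it runs alongside facebook advertising.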

ad.lm3 <- lm(sales~.+I(youtube^2), data=Data)
summary(ad.lm3)
## 
## Call:
## lm(formula = sales ~ . + I(youtube^2), data = Data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.8300 -1.0442 -0.0581  1.1475  4.2725 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   1.524e+00  4.494e-01   3.392  0.00084 ***
## youtube       7.847e-02  5.001e-03  15.690  < 2e-16 ***
## facebook      1.926e-01  7.794e-03  24.706  < 2e-16 ***
## newspaper     8.906e-04  5.306e-03   0.168  0.86688    
## I(youtube^2) -9.478e-05  1.403e-05  -6.757 1.59e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.825 on 195 degrees of freedom
## Multiple R-squared:  0.9167, Adjusted R-squared:  0.915 
## F-statistic: 536.6 on 4 and 195 DF,  p-value: < 2.2e-16
anova(ad.lm,ad.lm3)
## Analysis of Variance Table
## 
## Model 1: sales ~ youtube + facebook + newspaper
## Model 2: sales ~ youtube + facebook + newspaper + I(youtube^2)
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1    196 801.83                                  
## 2    195 649.71  1    152.12 45.656 1.587e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
par(mfrow=c(2,2))
plot(ad.lm3)

ad.lm4 <- lm(sales~.+poly(youtube,3), data=Data)
summary(ad.lm4)
## 
## Call:
## lm(formula = sales ~ . + poly(youtube, 3), data = Data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.4387 -1.0011 -0.0783  0.9244  4.4774 
## 
## Coefficients: (1 not defined because of singularities)
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        3.304e+00  3.188e-01  10.362  < 2e-16 ***
## youtube            4.568e-02  1.184e-03  38.574  < 2e-16 ***
## facebook           1.961e-01  7.365e-03  26.629  < 2e-16 ***
## newspaper         -3.371e-04  4.997e-03  -0.067    0.946    
## poly(youtube, 3)1         NA         NA      NA       NA    
## poly(youtube, 3)2 -1.247e+01  1.729e+00  -7.212 1.20e-11 ***
## poly(youtube, 3)3  8.854e+00  1.725e+00   5.133 6.91e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.717 on 194 degrees of freedom
## Multiple R-squared:  0.9267, Adjusted R-squared:  0.9248 
## F-statistic: 490.3 on 5 and 194 DF,  p-value: < 2.2e-16
anova(ad.lm,ad.lm4)
## Analysis of Variance Table
## 
## Model 1: sales ~ youtube + facebook + newspaper
## Model 2: sales ~ youtube + facebook + newspaper + poly(youtube, 3)
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1    196 801.83                                  
## 2    194 572.03  2    229.79 38.967 5.942e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(ad.lm3,ad.lm4)
## Analysis of Variance Table
## 
## Model 1: sales ~ youtube + facebook + newspaper + I(youtube^2)
## Model 2: sales ~ youtube + facebook + newspaper + poly(youtube, 3)
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1    195 649.71                                  
## 2    194 572.03  1    77.676 26.343 6.915e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
par(mfrow=c(2,2))
plot(ad.lm4)
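
The NA row for poly(youtube, 3)1 in the output above appears because the formula already contains youtube through the dot, and the first polynomial column is an exact linear function of it. One possible way to avoid the redundant term, shown here as a sketch on simulated data (an assumption; sim is not the marketing data), is to drop youtube from the dot expansion. The fitted values are unchanged, so the anova comparisons are unaffected.

```r
# Simulated stand-in for the marketing data (hypothetical, not the real file)
set.seed(1)
n <- 200
sim <- data.frame(
  youtube   = runif(n, 0, 350),
  facebook  = runif(n, 0, 60),
  newspaper = runif(n, 0, 130)
)
sim$sales <- 3.5 + 0.08 * sim$youtube - 1e-4 * sim$youtube^2 +
  0.19 * sim$facebook + rnorm(n, sd = 2)

# Same specification as ad.lm4: youtube enters twice, once via "." and once via poly()
fit_dup <- lm(sales ~ . + poly(youtube, 3), data = sim)

# Alternative: remove youtube from the dot so that only poly() carries it
fit_ok <- lm(sales ~ . - youtube + poly(youtube, 3), data = sim)

anyNA(coef(fit_dup))   # the duplicated linear term cannot be estimated
anyNA(coef(fit_ok))    # every coefficient is estimated

# Both models span the same column space, so the fits agree
all.equal(fitted(fit_dup), fitted(fit_ok))
```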

ad.lm5 <- lm(sales~.+poly(youtube,3)+poly(facebook,3), data=Data)
plot(ad.lm5)

anova(ad.lm4,ad.lm5)
## Analysis of Variance Table
## 
## Model 1: sales ~ youtube + facebook + newspaper + poly(youtube, 3)
## Model 2: sales ~ youtube + facebook + newspaper + poly(youtube, 3) + poly(facebook, 
##     3)
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1    194 572.03                           
## 2    192 567.72  2    4.3097 0.7288 0.4838