Agradeep
16 October 2017
Summary of data
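# stringsAsFactors = TRUE keeps country as a factor (the default before R 4.0.0),
# so summary() reports per-country counts as shown below
Data <- read.csv("StoreData.csv", stringsAsFactors = TRUE)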
summary(Data)
    storeNum          Year          Week          p1sales
 Min.   :101.0   Min.   :1.0   Min.   : 1.00   Min.   : 73
 1st Qu.:105.8   1st Qu.:1.0   1st Qu.:13.75   1st Qu.:113
 Median :110.5   Median :1.5   Median :26.50   Median :129
 Mean   :110.5   Mean   :1.5   Mean   :26.50   Mean   :133
 3rd Qu.:115.2   3rd Qu.:2.0   3rd Qu.:39.25   3rd Qu.:150
 Max.   :120.0   Max.   :2.0   Max.   :52.00   Max.   :263
    p2sales         p1price          p2price         p1prom
 Min.   : 51.0   Min.   :2.190   Min.   :2.29   Min.   :0.0
 1st Qu.: 84.0   1st Qu.:2.290   1st Qu.:2.49   1st Qu.:0.0
 Median : 96.0   Median :2.490   Median :2.59   Median :0.0
 Mean   :100.2   Mean   :2.544   Mean   :2.70   Mean   :0.1
 3rd Qu.:113.0   3rd Qu.:2.790   3rd Qu.:2.99   3rd Qu.:0.0
 Max.   :225.0   Max.   :2.990   Max.   :3.19   Max.   :1.0
     p2prom        country
 Min.   :0.0000    AU:104
 1st Qu.:0.0000    BR:208
 Median :0.0000    CN:208
 Mean   :0.1385    DE:520
 3rd Qu.:0.0000    GB:312
 Max.   :1.0000    JP:416
                   US:312
Correlation between Coke sales and Coke's promotion
cor.test(Data$p1sales, Data$p1prom)
Pearson's product-moment correlation
data: Data$p1sales and Data$p1prom
t = 21.168, df = 2078, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.3851676 0.4559018
sample estimates:
cor
0.421175
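The t statistic and confidence interval reported by cor.test() can be reproduced by hand from the correlation and the degrees of freedom. The sketch below assumes the rounded values printed above; the variable names are illustrative and are not part of the original analysis.

```r
# Values copied from the cor.test() output above
r  <- 0.421175   # sample correlation between p1sales and p1prom
df <- 2078       # degrees of freedom, n - 2
n  <- df + 2     # number of observations

# t statistic for the null hypothesis that the true correlation is 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# 95% confidence interval via the Fisher z-transformation
z    <- atanh(r)
se_z <- 1 / sqrt(n - 3)
ci   <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)

round(t_stat, 3)
round(ci, 4)
```

Up to rounding, these agree with the reported t = 21.168 and the interval 0.385 to 0.456.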
Correlation between Coke sales and Pepsi's promotion
cor.test(Data$p1sales, Data$p2prom)
Pearson's product-moment correlation
data: Data$p1sales and Data$p2prom
t = -0.60848, df = 2078, p-value = 0.5429
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.05629431 0.02964957
sample estimates:
cor
-0.01334702
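Correlation matrix of the sales and prices of Coke and Pepsi versus the promotions of Coke and Pepsi
# Sales and price columns for both products
a <- Data[, c("p1sales", "p2sales", "p1price", "p2price")]
# Promotion columns for both products
b <- Data[, c("p1prom", "p2prom")]
# Correlation of every column in a with every column in b
z <- cor(a, b)
round(z, 2)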
        p1prom p2prom
p1sales   0.42  -0.01
p2sales  -0.01   0.56
p1price  -0.01   0.02
p2price   0.00  -0.01
Corrgram
library(corrgram)
# Columns 4 to 9: p1sales, p2sales, p1price, p2price, p1prom, p2prom
corrgram(Data[, 4:9], order = FALSE, lower.panel = panel.conf,
         upper.panel = panel.pie, text.panel = panel.txt,
         main = "Corrgram - StoreData")
Test the null hypothesis that the sales of Pepsi are uncorrelated with Pepsi's promotions
cor.test(Data$p2sales,Data$p2prom)
Pearson's product-moment correlation
data: Data$p2sales and Data$p2prom
t = 30.804, df = 2078, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.5296696 0.5887155
sample estimates:
cor
0.559903
Test the null hypothesis that the sales of Pepsi are uncorrelated with Coke's promotions
cor.test(Data$p2sales,Data$p1prom)
Pearson's product-moment correlation
data: Data$p2sales and Data$p1prom
t = -0.6361, df = 2078, p-value = 0.5248
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.05689831 0.02904415
sample estimates:
cor
-0.01395285
Run a simple linear regression of the sales of Coke on the price of Coke
fit <- lm(p1sales ~ p1price, data = Data)
summary(fit)
Call:
lm(formula = p1sales ~ p1price, data = Data)
Residuals:
    Min      1Q  Median      3Q     Max
-52.724 -17.454  -2.819  14.463 111.276
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  267.138      4.523   59.06   <2e-16 ***
p1price      -52.700      1.766  -29.84   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 23.74 on 2078 degrees of freedom
Multiple R-squared: 0.3, Adjusted R-squared: 0.2997
F-statistic: 890.6 on 1 and 2078 DF, p-value: < 2.2e-16
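The quantities in this summary are linked: the t value is the estimate divided by its standard error, the F-statistic of a one-predictor model is the square of that t value, and R-squared follows from F and the residual degrees of freedom. The sketch below checks these links using only the rounded numbers printed above, so small discrepancies are expected; the variable names are illustrative.

```r
# Values copied from the summary(fit) output above
est    <- -52.700   # slope estimate for p1price
se     <- 1.766     # standard error of the slope
df     <- 2078      # residual degrees of freedom
f_stat <- 890.6     # reported F-statistic

# t value of the slope
t_value <- est / se

# 95% confidence interval for the slope
ci <- est + c(-1, 1) * qt(0.975, df) * se

# In a one-predictor model, F equals t^2 and R^2 = F / (F + df)
f_from_t  <- t_value^2
r_squared <- f_stat / (f_stat + df)

round(c(t_value = t_value, f_from_t = f_from_t, r_squared = r_squared), 3)
round(ci, 2)
```

The interval lies entirely below zero, consistent with the very small p-value reported for p1price.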
Run another simple linear regression of the sales of Pepsi on the price of Pepsi
fit1 <- lm(p2sales ~ p2price, data = Data)
summary(fit1)
Call:
lm(formula = p2sales ~ p2price, data = Data)
Residuals:
    Min      1Q  Median      3Q     Max
-45.657 -15.657  -3.077  11.400 110.184
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  196.788      3.877   50.76   <2e-16 ***
p2price      -35.796      1.425  -25.11   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21.4 on 2078 degrees of freedom
Multiple R-squared: 0.2328, Adjusted R-squared: 0.2324
F-statistic: 630.6 on 1 and 2078 DF, p-value: < 2.2e-16
Compare the two simple linear regressions. The sales of which product are more responsive to a change in its price?
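The slope for Coke sales on the price of Coke is -52.7: a one-unit increase in the price of Coke is associated with a decrease of about 52.7 units in Coke sales. The corresponding slope for Pepsi is -35.8, so a one-unit increase in the price of Pepsi is associated with a decrease of about 35.8 units in Pepsi sales. In absolute terms, Coke sales are therefore more responsive to a change in own price than Pepsi sales.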
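Raw slopes depend on the units and typical levels of price and sales, so a rough unit-free check is the price elasticity evaluated at the sample means, computed as slope × mean price / mean sales. The sketch below is an illustrative addition rather than part of the original analysis; it uses the rounded slopes from the two regressions and the rounded means from the data summary, so the results are approximate.

```r
# Slopes from the two simple regressions above
slope_coke  <- -52.700
slope_pepsi <- -35.796

# Sample means from summary(Data)
mean_price_coke  <- 2.544
mean_price_pepsi <- 2.70
mean_sales_coke  <- 133
mean_sales_pepsi <- 100.2

# Elasticity at the means: % change in sales per 1% change in own price
elasticity_coke  <- slope_coke  * mean_price_coke  / mean_sales_coke
elasticity_pepsi <- slope_pepsi * mean_price_pepsi / mean_sales_pepsi

round(c(coke = elasticity_coke, pepsi = elasticity_pepsi), 3)
```

On this measure both products come out close to unit elastic (about -1.01 for Coke and -0.96 for Pepsi), with Coke still slightly more responsive, although the gap is much smaller than the raw slopes suggest.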