Correlation Exercise

Tanushree
16 October 2017

Q1. What are the mean, standard deviation and variance of the sales of Coke?

  • Read the data from the StoreData.csv file
  • P1 is Coke, so p1sales holds the sales of Coke
data <- read.csv("StoreData.csv",
                 header = TRUE,
                 sep = ",")

attach(data)
library(psych)
describe(p1sales)
   vars    n   mean    sd median trimmed   mad min max range skew kurtosis
X1    1 2080 133.05 28.37    129  131.08 26.69  73 263   190 0.74     0.66
     se
X1 0.62
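
The describe() output reports the mean (133.05) and the standard deviation (28.37) but not the variance that Q1 asks for. The variance is the square of the standard deviation, so it should be roughly 28.37^2, about 805. The sketch below checks that relationship on a made-up vector (toy_sales is hypothetical and not taken from StoreData.csv); on the real data the call would presumably be var(p1sales).

```r
# Hypothetical toy vector standing in for p1sales (not the real data)
toy_sales <- c(120, 135, 128, 150, 142)

sales_mean <- mean(toy_sales)  # arithmetic mean
sales_sd   <- sd(toy_sales)    # sample standard deviation (n - 1 denominator)
sales_var  <- var(toy_sales)   # sample variance (n - 1 denominator)

# The variance equals the squared standard deviation
stopifnot(isTRUE(all.equal(sales_var, sales_sd^2)))

# Variance implied by the rounded sd in the describe() output
implied_var <- 28.37^2
print(implied_var)
```

Because the sd shown by describe() is rounded to two decimals, the implied variance is only approximate.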

Q2. What is the correlation of the sales of Coke with the promotions of Coke?

cor(p1sales, p1prom)
[1] 0.421175

Q3. What is the correlation of the sales of Coke with the promotions of Pepsi?

cor(p1sales, p2prom)
[1] -0.01334702

Q4. Create a correlation matrix of the sales and prices of Coke and Pepsi versus the promotions of Coke and Pepsi. (Hint: This should be a 4*2 matrix)

x <- data[4:7]  # columns 4-7: p1sales, p2sales, p1price, p2price
y <- data[8:9]  # columns 8-9: p1prom, p2prom
cor(x, y)
              p1prom      p2prom
p1sales  0.421174952 -0.01334702
p2sales -0.013952850  0.55990301
p1price -0.014731296  0.02426913
p2price -0.001363308 -0.01201736
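
Selecting columns by position works only as long as the column order of the file stays the same. A minimal sketch of the same 4 x 2 cross-correlation using column names, run on random toy data (the toy data frame and its values are hypothetical; only the column names are taken from the output above):

```r
set.seed(1)  # reproducible toy data

# Hypothetical data frame; values are random, names mirror the output above
toy <- data.frame(
  p1sales = rnorm(50), p2sales = rnorm(50),
  p1price = rnorm(50), p2price = rnorm(50),
  p1prom  = rnorm(50), p2prom  = rnorm(50)
)

x_cols <- c("p1sales", "p2sales", "p1price", "p2price")  # rows of the matrix
y_cols <- c("p1prom", "p2prom")                          # columns of the matrix

# cor(x, y) correlates every column of x with every column of y
cor_xy <- cor(toy[x_cols], toy[y_cols])
print(dim(cor_xy))
```

Each cell of cor_xy is the pairwise correlation of one row variable with one column variable, so cor_xy["p1sales", "p1prom"] matches cor(toy$p1sales, toy$p1prom).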

Q5. Draw a corrgram illustrating the previous question

x <- data[4:7]  # columns 4-7: p1sales, p2sales, p1price, p2price
y <- data[8:9]  # columns 8-9: p1prom, p2prom

library(corrgram)
corrgram(cor(x,y), order=FALSE, lower.panel=panel.conf,
         upper.panel=panel.pie, text.panel=panel.txt)

[Figure: corrgram produced by the code above]

Q6. Test the null hypothesis that the sales of Pepsi are uncorrelated with Pepsi’s promotions

cor.test(p2sales, p2prom)

    Pearson's product-moment correlation

data:  p2sales and p2prom
t = 30.804, df = 2078, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.5296696 0.5887155
sample estimates:
     cor 
0.559903 

Since the p-value (< 2.2e-16) is less than 0.05, we reject the null hypothesis that the sales of Pepsi are uncorrelated with Pepsi's promotions.
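
The t statistic in the Pearson test output follows directly from the sample correlation r and the degrees of freedom: t = r * sqrt(df) / sqrt(1 - r^2). A short sketch that plugs in the values printed above (r = 0.559903, df = 2078) to reproduce it; the variable names are illustrative:

```r
r  <- 0.559903  # sample correlation from the cor.test output
df <- 2078      # degrees of freedom (n - 2)

# t statistic for testing H0: true correlation = 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

print(round(t_stat, 3))
print(p_value < 2.2e-16)
```

The result agrees with the reported t = 30.804 up to the rounding of r.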

Q7. Test the null hypothesis that the sales of Pepsi are uncorrelated with Coke's promotions

cor.test(p2sales, p1prom)

    Pearson's product-moment correlation

data:  p2sales and p1prom
t = -0.6361, df = 2078, p-value = 0.5248
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.05689831  0.02904415
sample estimates:
        cor 
-0.01395285 

Since the p-value (0.5248) is greater than 0.05, we fail to reject the null hypothesis that the sales of Pepsi are uncorrelated with Coke's promotions.
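
The 95 percent confidence interval tells the same story: it contains 0. For a Pearson correlation, cor.test bases the interval on Fisher's z transformation, which can be sketched by hand from the printed values (r = -0.01395285, n = df + 2 = 2080); the variable names are illustrative:

```r
r <- -0.01395285  # sample correlation from the cor.test output
n <- 2080         # number of observations (df + 2)

z    <- atanh(r)          # Fisher z transform of r
se   <- 1 / sqrt(n - 3)   # approximate standard error of z
crit <- qnorm(0.975)      # two-sided 95 percent critical value

# Back-transform the interval endpoints to the correlation scale
ci <- tanh(z + c(-1, 1) * crit * se)
print(ci)

# TRUE when the interval contains 0
contains_zero <- ci[1] < 0 && ci[2] > 0
print(contains_zero)
```

The endpoints should match the reported interval (-0.05689831, 0.02904415) to several decimal places.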

Q8. Run a simple linear regression of the sales of Coke on the price of Coke

lm(p1sales ~ p1price, data = data)

Call:
lm(formula = p1sales ~ p1price, data = data)

Coefficients:
(Intercept)      p1price  
      267.1        -52.7  
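
In a simple linear regression the slope is cov(x, y) / var(x), and the intercept is mean(y) minus the slope times mean(x). The sketch below confirms this against lm() on simulated data (toy_price and toy_sales are hypothetical and unrelated to StoreData.csv):

```r
set.seed(42)  # reproducible toy data

# Hypothetical prices and sales with a known negative relationship
toy_price <- runif(100, min = 2, max = 3)
toy_sales <- 250 - 50 * toy_price + rnorm(100, sd = 10)

fit <- lm(toy_sales ~ toy_price)

slope_lm     <- unname(coef(fit)["toy_price"])
intercept_lm <- unname(coef(fit)["(Intercept)"])

# Closed-form least-squares estimates
slope_manual     <- cov(toy_price, toy_sales) / var(toy_price)
intercept_manual <- mean(toy_sales) - slope_manual * mean(toy_price)

print(c(slope_lm, slope_manual))
print(c(intercept_lm, intercept_manual))
```

Calling summary(fit) instead of printing the fit would additionally show standard errors and R-squared.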

Q9. Run another simple linear regression of the sales of Pepsi on the price of Pepsi

lm(p2sales ~ p2price, data = data)

Call:
lm(formula = p2sales ~ p2price, data = data)

Coefficients:
(Intercept)      p2price  
      196.8        -35.8  

Q10. Compare the two simple linear regressions. The sales of which product are more responsive to a change in its price?

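The coefficient on the price of Coke is -52.7 and the coefficient on the price of Pepsi is -35.8. Since |-52.7| > |-35.8|, a one-unit increase in price is associated with a larger drop in sales for Coke (52.7 units) than for Pepsi (35.8 units). Thus the sales of Coke are more responsive to a change in its own price than the sales of Pepsi.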
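
Responsiveness is judged by the magnitude of the slope, not its sign. A small sketch using the two fitted coefficients from Q8 and Q9; the price change of 0.10 is a hypothetical value chosen only for illustration:

```r
coef_coke  <- -52.7  # slope of p1sales on p1price (Q8)
coef_pepsi <- -35.8  # slope of p2sales on p2price (Q9)

price_change <- 0.10  # hypothetical price increase, in the data's price units

# Predicted change in sales for that price increase
change_coke  <- coef_coke * price_change
change_pepsi <- coef_pepsi * price_change

# The product with the larger absolute slope is the more price-responsive one
more_responsive <- if (abs(coef_coke) > abs(coef_pepsi)) "Coke" else "Pepsi"

print(c(coke = change_coke, pepsi = change_pepsi))
print(more_responsive)
```

This compares absolute changes in units sold. A comparison of percentage changes (elasticities) would also need the price and sales levels of each product.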