Analysis Purpose

(1) Which type of advertisement (natural vs family healthe caring) is more effective?

(2) What’s the price elasticity (the reaction of sales quantity of the grape juice to its own price change)?

(3) What’s the cross-price elasticity (the reaction of sales quantity of the grape juice to price change of other products)?

(4) What’s the unit price of the grape juice that can maximum the profit?

Load Data

setwd('~/Desktop/R/Marketing')
grape_sale.df <- read.csv('grapeJuice.csv', colClasses = c(ad_type = "factor")) 
summary(grape_sale.df)
##   sales_grape     price_grape     ad_type  price_apple    price_cookies   
##  Min.   :131.0   Min.   : 8.200   0:15    Min.   :7.300   Min.   : 8.790  
##  1st Qu.:182.5   1st Qu.: 9.585   1:15    1st Qu.:7.438   1st Qu.: 9.190  
##  Median :204.5   Median : 9.855           Median :7.580   Median : 9.515  
##  Mean   :216.7   Mean   : 9.738           Mean   :7.659   Mean   : 9.622  
##  3rd Qu.:244.2   3rd Qu.:10.268           3rd Qu.:7.805   3rd Qu.:10.140  
##  Max.   :335.0   Max.   :10.490           Max.   :8.290   Max.   :10.580

From the summary table, we could see that the mean sales of grape juice is 216.7 while the mean price of grape juice is $9.7

Check the outliers and distribution of sales data

par(mfrow=c(1,2))
boxplot(grape_sale.df$sales_grape,main="Boxplot for sales data", xlab="sales_grape")
hist(grape_sale.df$sales_grape,main='histogram plot for sales data',xlab='sales_grape',prob=T)
lines(density(grape_sale.df$sales_grape),lty='dashed',lwd=2.5)

From the boxplot, we don’t find outliers; For the histogram plot, the distribution for sales of grape juice is roughly normal, so we don’t need to transform the data

Ad Effectiveness Analysis

Determine the Ad effectiveness is simple, just compare the mean sales of two Ad and conducted t-tests

Sales_Ad_Nature <- grape_sale.df[grape_sale.df$ad_type==0,]
Sales_Ad_Family <- grape_sale.df[grape_sale.df$ad_type==1,]
mean(Sales_Ad_Nature$sales_grape)
## [1] 186.6667
mean(Sales_Ad_Family$sales_grape)
## [1] 246.6667
shapiro.test(Sales_Ad_Nature$sales_grape)
## 
##  Shapiro-Wilk normality test
## 
## data:  Sales_Ad_Nature$sales_grape
## W = 0.94255, p-value = 0.4155
shapiro.test(Sales_Ad_Family$sales_grape)
## 
##  Shapiro-Wilk normality test
## 
## data:  Sales_Ad_Family$sales_grape
## W = 0.89743, p-value = 0.08695
t.test(Sales_Ad_Nature$sales_grape,Sales_Ad_Family$sales_grape)
## 
##  Welch Two Sample t-test
## 
## data:  Sales_Ad_Nature$sales_grape and Sales_Ad_Family$sales_grape
## t = -3.7515, df = 25.257, p-value = 0.0009233
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -92.92234 -27.07766
## sample estimates:
## mean of x mean of y 
##  186.6667  246.6667

Conclusion: the mean sale with Family Care Ad is higher than Nature Ad. Both shapiro test for Nature Ad sales and Family Care Ad sales show that the sales data are distrubuted normaly. We could use t-test and the t-test shows that the difference is significant (P<0.05)

Price Elasticity Analysis

Use multiple linear regression to determine the Sales Drivers

library(s20x)
## Warning: package 's20x' was built under R version 3.2.5
pairs(grape_sale.df, pch=20)

pairs20x(grape_sale.df)

From the above figure, we could see that the correlation coefficients between grape juice sales and its price, AD’s type, apple juice price, and cookies price are 0.85, 0.58, 0.37, and 0.37 respectively. They are likely to be the sales drivers, so we will add of the them as independent variabels.

sales_grape.lm <- lm(sales_grape~price_grape+ad_type+price_apple+price_cookies, grape_sale.df)
summary(sales_grape.lm)
## 
## Call:
## lm(formula = sales_grape ~ price_grape + ad_type + price_apple + 
##     price_cookies, data = grape_sale.df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -36.290 -10.488   0.884  10.483  29.471 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    774.813    145.349   5.331 1.59e-05 ***
## price_grape    -51.239      5.321  -9.630 6.83e-10 ***
## ad_type1        29.742      7.249   4.103 0.000380 ***
## price_apple     22.089     12.512   1.765 0.089710 .  
## price_cookies  -25.277      6.296  -4.015 0.000477 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 18.2 on 25 degrees of freedom
## Multiple R-squared:  0.8974, Adjusted R-squared:  0.881 
## F-statistic: 54.67 on 4 and 25 DF,  p-value: 5.318e-12

We could conclude that grape juice price, ad_type, and cookies price all significantly affect the sales of grape juice.

For price elasticity (percent change in quantity divided by the percent change in price), PE = (ΔQ/Q) / (ΔP/P) = (ΔQ/ΔP) * (P/Q) = -51.24 * 0.045 = -2.3; it suggests that 1% decrease in price will increase the sales by 2.3%

For cross-price elasticity (percent change in quantity divided by the change in the price of some other product), CPEapple = (ΔQ/ΔPapple) * (Papple/Q) = 22.1 * ( 7.659 / 216.7) = 0.78, its suggests that 1% decrease in apple juice price will decrease the sales of grape juice by 0.78%; so the applie juice are substitutes of grape juice; CPEcookies = (ΔQ/ΔPcookies) * (Pcookies/Q) = -25.28 * ( 9.622 / 216.7) = – 1.12; it suggests that 1% decrease in cookies price will increase the grape juice sales by 1.12%, and cookies are compliments for grape juice.

Optimal Pricing

First, simplify the question by setting ad_type=1, price_apple as 7.66 (mean value), price_cookies=9.74 (mean value), then sale_grape=772.64 - 51.24price, assume that the cost puer unit of grape juice is $5, then the Profit (Y) Y= (price_grape-cost)(772.64-51.24price)

f <- function(x) {(-51.2)*x^2+1028.84*x-3863.2}
curve(f, from=0, to =20, ylab="Profit")

optimize(f, lower=0, upper=20, maximum=TRUE)
## $maximum
## [1] 10.04727
## 
## $objective
## [1] 1305.314

We may conclude that the price of $10.04 will product the maximum profit

Reference http://www.dataapple.net/?p=19