2023-04-29

OBJECTIVES

Primary Objective:

  • Calculate the price-elasticity of alcoholic beverages.

Secondary Objectives:

  • Compare price-elasticity between spirits and beer.

  • Compare price-elasticity between quantiles based on price.

DATA SOURCE

  • Data mining of two data sets: Monthly sales and Retail price for the state of Karnataka.

  • Time Period: March 2011 to March 2022.

  • Source: Karntaka State Beverage Corporation Limited (KSBCL).
    ! ##

DATA CLEANING

  • The data is in PDF files - requiring extraction of tables.

  • The item names are mentioned in a different way every month.

  • Further fragmentation due to inclusion of packaging information in the item name. See example:

#Show all Teacher's Highland Cream Scotch Whisky
unique(temp$item[str_detect(temp$item,
regex("Teacher'?s Highland Cream Scotch Whisky",ignore_case = T))])
## [1] "Teachers Highland Cream Scotch Whisky 375 Ml (0134)"       
## [2] "Teacher's Highland Cream Scotch Whisky 180ML(0134)"        
## [3] "Teachers Highland Cream Scotch Whisky 60 Ml (0134)"        
## [4] "Teachers Highland Cream Scotch Whisky 750 Ml (0134)"       
## [5] "Teachers Highland Cream Scotch Whisky 1000 Ml (0134)"      
## [6] "Teacher's Highland Cream Scotch Whisky 1000MLx12Btls(0134)"

DATA CLEANING-2

Steps:
- Extract beverage name.

  • Extract packaging volume, month-year and place of sale. 

  • Collapse items which are same but spelt differently.
    Tools Used:

  • ‘R’ platform with stringR, pdfTools packages.

  • Deep learning model - Sentence Transformers based on BERT, to calculate similarity between names and manually collapse them.

DATA CLEANING-3

Output:

#Number of initial items: 
nrow(temp)
## [1] 3742095
#Number of items with cleaning but no collapsing:  
length(unique(temp$itemName))
## [1] 14471
#Number of items with cleaning and collapsing:
length(unique(temp$itemNameNew))
## [1] 2137

DATA ENRICHMENT

  • Four team members coded alcoholic beverages in to ‘beer’,‘wine’ and ‘spirit’ category.
  • Essential step as they differ in alcohol concentration.
  • Due to almost negligible contribution of wine to alcohol sales this category has not been included in the analyses.

STATISTICAL METHODS

Overall approach: Linear Mixed Effects Models to explain sales of a particular beverage as a function of prevalent price, controlling for confounding variables.

  • All models have ‘time’ and ‘monthly seasonality’ as predictors (confounding variables).

  • All inference is based on bootstrapped confidence intervals.

  • All models are log-log i.e. sales and price both are on log scale.

  • Random effects are chosen based on the question we wish to answer:

STATISTICAL METHODS-2

What is the overall price-elasticity?

  • Random Intercept for beverages.

Is there variation between districts?

  • Random slopes of price for district.

Is there a difference between beer and spirits?

  • Random slopes of price for beverage type.

Is there a difference based on how costly is a standard drink?

  • Random slopes of price for each quantile (based on average price).

RESULTS

  • Exploratory Analysis.

  • Overall Price Elasticity Estimate.

  • Comparison between Districts.

  • Comparison between Beer and Spirits.

  • Comparison between Quintiles.

How have alcohol sales changed in the last decade?

Majority of ‘Ethanol’ is consumed in the most concentrated form = Spirits

What is the overall price sale relationship ?

Beer

Beer

Price-Sales (contd..)

Spirits

Spirits

The Rise of “small volume” packaging

Implications for price-elasticity.
The base price is ‘seemingly low’ and increase does not have sufficient dis-incentivising effect.

Overall Price Elasticity

What is the overall Price Elasticity of Alcoholic Beverages

## totalMlsSoldLog ~ (1 | itemNameNew) + district + lockdownEffect + 
##     month + linerTime + effectivePriceLog
## # Indices of model performance
## 
## AIC       |      AICc |       BIC | R2 (cond.) | R2 (marg.) |   ICC |  RMSE | Sigma
## -----------------------------------------------------------------------------------
## 2.407e+06 | 2.407e+06 | 2.408e+06 |      0.759 |      0.361 | 0.623 | 1.428 | 1.431

The model is a decent fit explaining 76% of variability in sales

Overall PE (contd.)

## # Fixed Effects
## 
## Parameter         | Coefficient |       SE |         95% CI | t(674438) |      p
## --------------------------------------------------------------------------------
## (Intercept)       |        9.67 |     0.04 | [ 9.58,  9.76] |    220.89 | < .001
## lockdownEffect    |       -0.76 |     0.05 | [-0.87, -0.66] |    -14.57 | < .001
## linerTime         |   -2.91e-03 | 6.59e-05 | [ 0.00,  0.00] |    -44.22 | < .001
## effectivePriceLog |       -1.61 |     0.03 | [-1.66, -1.55] |    -57.23 | < .001

Interpretation: For a 1% increase in price, sales decrease by 1.6% after controlling for seasonality, time effect, lockdown and allowing itemwise variation in expected sales as well as variation between districts (coefs not shown).
## Comparison between Districts

Comparison between Beer and Spirits

  • To be done

Comparison between quantiles