How is one to decide……

Bourbon has gained in popularity in recent years, arguably due in part to its history and the way it is made (aged for years in virgin white oak barrels). The aging process, which causes much of the product to evaporate over time, makes certain popular brands very difficult to find and buy. The history dates to barrels of corn whiskey being sent down the Mississippi River from Kentucky to New Orleans, aging and sweetening in transit. The sellers were paid in cash, some of which they would spend on the fastest horse they could find to carry them and their profits safely back to Kentucky. As a result, Kentucky is known for fast horses and bourbon. Today many bourbon consumers are looking for the “next big thing”: a bourbon with great taste at a reasonable cost.

The periodical Whisky Advocate has posted over 20 years of their reviews on their website. This data includes over 5000 reviews by 11 reviewers of many types of whiskey. These reviews include a score, a written review, and other data on the whiskeys. More information on the review and rating process can be found here.

This document provides an analysis of the review data for all reviews in the Bourbon category on the website. The areas investigated include the connection of variables to rating, the reliability of individual reviewers, keywords that are common in positive reviews, and a list of “hidden gem” bourbons that have high ratings at bargain costs. This includes statistical analysis to determine whether there is correlation between the variables (Rating, Cost, and Alcohol by volume, or ABV). Finally, I will do some predictive analysis to predict the rating of a bourbon if the price and ABV are known.


Details of the Data

The data in this document comes from the ratings and reviews section of the Whisky Advocate website. The reviews span 2000 to 2021, and the data set contains 8 variables (10 with Category and Style, but this data set was limited to Bourbon only), described below. Each row corresponds to a review of a specific bourbon by a specific reviewer. I used the Rating variable to create a 9th, categorical variable based on the website’s rating scale.


Summary of the dataset

Variables in Dataset

Variable         Type       Explanation
Name             Character  Full name of Bourbon
ABV              Numeric    The Alcohol by volume (ABV) of the Bourbon
Rating           Numeric    The Rating (on a 100 point scale) assigned to the Bourbon
Price            Numeric    The price per bottle in US Dollars (USD)
Text             Character  The text of the written review on the Bourbon
Reviewer         Character  The name of the reviewer of the Bourbon
Review Season    Character  The season (Winter, Spring, Summer, Fall) the Bourbon was reviewed
Review Year      Numeric    The year the Bourbon was reviewed
Rating Category  Factor     The descriptive category based on rating from the website


Rating Process of Whisky Advocate

The following are the categories and descriptions that Whisky Advocate gives its rated whiskies.

Rating in Points  Category Name    Category Description
95-100 points     Classic          a great whisky
90-94 points      Outstanding      a whisky of superior character and style
85-89 points      Very good        a whisky with special qualities
80-84 points      Good             a solid, well-made whisky
75-79 points      Not recommended  -
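
The Rating Category variable can be derived from the numeric rating with a few cut points. The sketch below is one way to do it in base R, not necessarily the code used for this report. The helper name `rating_category` is hypothetical, and the lowest two bands are an assumption: the summary statistics later in this document list both a “Mediocre” and a “Not recommended” level, so the sketch assumes 75-79 maps to “Mediocre” and anything lower to “Not recommended”.

```r
# Hypothetical helper: map numeric ratings onto rating categories.
# ASSUMPTION: 75-79 is "Mediocre" and anything below 75 is "Not recommended";
# the split between these two lowest levels is not stated in the table above.
rating_category <- function(rating) {
  cut(
    rating,
    breaks = c(-Inf, 74, 79, 84, 89, 94, 100),
    labels = c("Not recommended", "Mediocre", "Good",
               "Very good", "Outstanding", "Classic"),
    right = TRUE  # intervals are (lower, upper], so 90 falls in "Outstanding"
  )
}

# Made-up ratings, one per band, to show the mapping.
example_ratings <- c(60, 78, 84, 88, 90, 97)
print(data.frame(Rating = example_ratings,
                 Category = rating_category(example_ratings)))
```

Because ratings are whole numbers, closing each interval on the right at 74, 79, 84, 89, and 94 reproduces the point ranges in the table.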

Looking at the distribution by category, you can see that it appears roughly normal but skewed toward higher ratings. As a consumer, that tells me that many “Very good” ratings could be below the mean (below average). As I start to look for “hidden gems”, I will focus on bourbons with “Classic” or “Outstanding” ratings.

Top 10 rated bourbons

This table shows the top 10 rated bourbons. For the TLDR crowd (or those not seeking bargains), here are the bourbons that received the highest ratings. But, for those of us looking for the best deal, will we find bourbons that are just as good for a reasonable price?


Summary Data Table

The table below summarizes some of the important characteristics of the data. In this table, you can find the names and other attributes found in the ratings. For brevity, a single sample of Text from a review is included below.

Use the tabs below to navigate through various details of the data.

Example Review Text: “A marriage of 13 and 18 year old bourbons. A mature yet very elegant whiskey, with a silky texture and so easy to embrace with a splash of water. Balanced notes of honeyed vanilla, soft caramel, a basket of complex orchard fruit, blackberry, papaya, and a dusting of cocoa and nutmeg; smooth finish. Sophisticated, stylish, with well-defined flavors. A classic!”

Summary Statistics

For the data analysts reading this, below is a table of summary statistics for the variables.

A couple of interesting notes from this table. The highest price is $1500, which is clearly an outlier and skews the mean price higher. Also of note: the lowest rating is 60 and the highest is 97, so even though the scale nominally runs to 100 points, the ratings actually span only 37 points. There are some other interesting items which I will visualize later. Based on the summary statistics, I am setting an initial goal of a price ceiling in the low 70s (USD, slightly less than the mean) and a rating greater than 90 (the 75th percentile is 92, and about 45% of reviews score 90 or above).

Summary Statistics
Variable N Mean Std. Dev. Min Pctl. 25 Pctl. 75 Max
ABV 1035 0.501 0.072 0.4 0.45 0.54 0.89
Rating 1035 88.337 4.585 60 86 92 97
Price 1035 76.553 83.458 12 40 90 1500
Review Season 1035
… Fall 284 27.4%
… Spring 234 22.6%
… Summer 238 23%
… Winter 279 27%
Review Year 1035 2016.002 4.481 2000 2014 2019 2021
Rating Category 1035
… Not recommended 8 0.8%
… Mediocre 34 3.3%
… Good 151 14.6%
… Very good 375 36.2%
… Outstanding 420 40.6%
… Classic 47 4.5%
Price Category 1035
… Under 40 212 20.5%
… 41-80 498 48.1%
… 81-120 175 16.9%
… 121-160 82 7.9%
… 161-200 15 1.4%
… 201-240 23 2.2%
… 241-280 13 1.3%
… Over 280 17 1.6%
Mediocre 1035
… No 1001 96.7%
… Yes 34 3.3%
Good 1035
… No 884 85.4%
… Yes 151 14.6%
Very good 1035
… No 660 63.8%
… Yes 375 36.2%
Outstanding 1035
… No 615 59.4%
… Yes 420 40.6%
Classic 1035
… No 988 95.5%
… Yes 47 4.5%


Ratings by Year

Whisky Advocate has over 20 years of ratings, from 2000 to 2021. Based on the previous summary statistics table, I wanted to look at the distribution of reviews over the years. They seem to have reviewed fewer than 50 bourbons a year until 2013, which coincided with what Fortune Magazine called “The billion-dollar bourbon boom” in a 2014 article here. We will have to take this into account during our analysis.


Impact of Reviewers

The following visualizations focus on the reviewers and whether there is any indication of calibration between them. Use the tabs to learn more.

Boxplot of Review Ratings by Reviewers

The boxplot of ratings by reviewer raises several questions. We would expect some variance for each reviewer (assuming they have a large number of reviews), especially given the tight range of review scores (60-97) in the data. From the plot, it appears some of the reviewers may have only a small number of reviews. Finally, the reviews took place over 21 years. When did each reviewer conduct their reviews?


Reviewers with more than 50 reviews

I wanted to pick reviewers with more than 50 reviews, in the hope that they would have a large enough sample. Only 6 of 11 reviewers had more than 50 reviews.
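
A filter like this can be sketched in base R by counting reviews per reviewer and keeping only the frequent ones. The data frame, the reviewer names, and the ratings below are made-up stand-ins, not the actual review data; only the threshold of 50 comes from the text.

```r
# Hypothetical stand-in for the review data: three reviewers with 60, 55, and 5 reviews.
reviews <- data.frame(
  Reviewer = rep(c("Reviewer A", "Reviewer B", "Reviewer C"), times = c(60, 55, 5)),
  Rating   = rep(c(88, 91, 85), times = c(60, 55, 5))
)

# Count reviews per reviewer, then keep the names with more than 50 reviews.
counts   <- table(reviews$Reviewer)
frequent <- names(counts)[as.vector(counts) > 50]

# Keep only rows written by those reviewers.
reviews_frequent <- reviews[reviews$Reviewer %in% frequent, ]

print(counts)
print(frequent)
print(nrow(reviews_frequent))
```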

Distribution of Reviews by Reviewer

I used a facet wrap to look at the distribution of ratings by reviewer. It is important to note that only two reviewers have more than 150 reviews (John Hansell, who has almost 300, and Susannah Skiver Barton). Both of them skew toward higher ratings (and Susannah has VERY few negative reviews). We have already seen that the overall distribution of reviews skews toward higher ratings, and we can see that this is pretty universal across the reviewers. David Fleming, Fred Minnick, and Jeffery Lindenmuth have the most balanced distributions (though still skewed toward higher ratings).




Reviews by Year

Based on the data it appears that John Hansell did nearly all the reviews of Bourbon until 2015 when a team of other reviewers took over.






Sentiment Analysis of Text Reviews

The following visualizations focus on the words used by the reviewers in the text column of the data. The purpose is to look for words commonly used in higher-rated reviews. This can tell us what the reviewers consider qualities of highly rated bourbons. I used the Bing lexicon. In my research I did some analysis with the NRC lexicon, but since the two variables I was evaluating (Reviewer and Rating category) had very different numbers, I was unable to do a side-by-side comparison of these variables.
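
The idea behind lexicon-based sentiment can be sketched in a few lines of base R: split a review into words and look each word up in a word list. This is only an illustration. The six-word `lexicon` below is a hand-made stand-in, NOT the real Bing lexicon, and the review text is an abridged version of the example review quoted earlier.

```r
# ASSUMPTION: a tiny hand-made lexicon standing in for the Bing word list.
lexicon <- data.frame(
  word      = c("elegant", "smooth", "classic", "sweet", "dark", "complex"),
  sentiment = c("positive", "positive", "positive", "positive", "negative", "negative")
)

# Abridged from the example review text shown earlier in this document.
review_text <- paste(
  "A mature yet very elegant whiskey, with a silky texture.",
  "A basket of complex orchard fruit; smooth finish. A classic!"
)

# Tokenize: lower-case, replace anything that is not a letter or space, split on spaces.
clean <- gsub("[^a-z ]", " ", tolower(review_text))
words <- strsplit(clean, " +")[[1]]
words <- words[words != ""]

# Keep only the words that appear in the lexicon, with their sentiment label.
matched <- merge(data.frame(word = words), lexicon, by = "word")
n_positive <- sum(matched$sentiment == "positive")
n_negative <- sum(matched$sentiment == "negative")

print(matched)
print(c(positive = n_positive, negative = n_negative))
```

Note that “complex” is counted as negative here even though the review uses it as praise, which is the same lexicon limitation discussed for the “Classic” category.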

Positive and Negative Sentiment by Category (bing lexicon)

Many of the words are shared across the three categories. Of note are “sweet”, “sweetness”, and “dark”.


Focus on the Classic category (95-100)

Since we are trying to get the best bourbon for our money, I wanted to look specifically at the words used to describe bourbons in the “Classic” rating category. It is interesting to see that some of the words that the Bing lexicon classifies as negative (dark, complex) can actually be seen as positive traits in a bourbon.




Investigation of relationship of Rating to Price

The purpose of this exercise is to get the highest quality bourbon for the price. In order to do this we compare these two variables in different ways (see tabs below).


Statistical Correlation of Ratings to Price

To see whether there is a statistical correlation between price and rating for bourbon, I ran a Pearson correlation test. The p-value of the test is 5.712e-16, which is less than the significance level alpha = 0.05, so we can conclude that Price and Rating are significantly correlated, with a correlation coefficient of .25. The correlation is statistically significant but weak. We can also see that the plot is generally straight, which also indicates correlation.

## 
##  Pearson's product-moment correlation
## 
## data:  whiskeya$Price and whiskeya$Rating
## t = 8.2274, df = 1033, p-value = 5.712e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1899236 0.3043257
## sample estimates:
##      cor 
## 0.247989
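
As a sanity check, the test statistic can be recomputed from the correlation coefficient and the sample size alone. The sketch below uses only the numbers printed in the output above; the variable names are my own. It also squares the correlation to show how little of the variation in rating is associated with price.

```r
r <- 0.247989  # sample correlation between Price and Rating (from the output above)
n <- 1035      # number of reviews, so df = n - 2 = 1033

# t statistic and two-sided p-value for a Pearson correlation.
t_stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# Share of variance in Rating associated with Price in a simple linear fit.
r_squared <- r^2

print(round(c(t = t_stat, r_squared = r_squared), 4))
print(p_value)
```

A tiny p-value here reflects the large sample; the squared correlation of roughly 0.06 is the better guide to how much price tells us about rating.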


Rating vs. Price Boxplot Visualization

Continuing to compare Rating to Price, I pulled out the top 17 bourbons, those with prices over $250. In addition to skewing the data, they cost more than I am ever likely to pay for a bourbon. Mean price generally increases with higher ratings. Most of the outliers are on the high side. There is more variance in the “Outstanding” rating group, so it could be promising for finding 90+ rated bourbons below $75.


Rating vs. Prices

Plotting the data a bit differently, we can see that at lower prices ratings seem to rise with price and then flatten out.

As we look at the key price points between 50 and 100 USD (my typical price point), we see that there is a bit of variability. There seems to be a sweet spot between 65 and 75 USD, with review scores starting to go up again from 90-100 USD. This further supports my previous goal of finding high quality bourbons under 75 USD.

Linear regression of Price

I ran a linear regression of Rating on Price. There is a slight positive slope of about .014 rating points per dollar, with an intercept of 87.29.

Rating = 87.29 + .014*(Price)

## 
## Call:
## lm(formula = Rating ~ Price, data = whiskeya)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -27.7030  -2.4414   0.4796   3.1812   8.5477 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 87.294332   0.187468 465.650  < 2e-16 ***
## Price        0.013623   0.001656   8.227 5.71e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.444 on 1033 degrees of freedom
## Multiple R-squared:  0.0615, Adjusted R-squared:  0.06059 
## F-statistic: 67.69 on 1 and 1033 DF,  p-value: 5.712e-16
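
To get a feel for how flat this line is, the fitted coefficients can be plugged into a small function. This is a sketch using the estimates printed above; the function name `predict_rating_from_price` and the example prices are my own choices.

```r
# Coefficients copied from the regression output above.
intercept <- 87.294332
slope     <- 0.013623

# Hypothetical helper: predicted rating for a given bottle price in USD.
predict_rating_from_price <- function(price) intercept + slope * price

prices <- c(25, 50, 75, 100, 250)
print(data.frame(Price = prices,
                 Predicted = round(predict_rating_from_price(prices), 2)))

# Extra rating points the model associates with paying $100 instead of $50.
gain_50_to_100 <- predict_rating_from_price(100) - predict_rating_from_price(50)
print(gain_50_to_100)
```

Doubling the price from $50 to $100 moves the predicted rating by well under one point, which fits the low R-squared in the output.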



Investigation of relationship of Rating to ABV

As the bourbon ages, the proof levels can change, depending on where the barrel is placed in the rickhouse (the old, wooden, non-temperature controlled barns where bourbon is aged). Generally, barrels placed on lower floors will lose proof, while barrels placed higher will gain proof. I was interested to see if proof had an effect on rating.


Statistical Correlation of Ratings to ABV

To see whether there is a statistical correlation between ABV and rating for bourbon, I ran a Pearson correlation test. The p-value of the test is less than 2.2e-16, which is below the significance level alpha = 0.05. We can conclude that ABV and Rating are significantly correlated, with a correlation coefficient of .45. We can also see that the plot is generally straight, which also indicates correlation.

## 
##  Pearson's product-moment correlation
## 
## data:  whiskeya$ABV and whiskeya$Rating
## t = 16.292, df = 1033, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.4022717 0.4993038
## sample estimates:
##       cor 
## 0.4521244


Rating vs. ABV Boxplot Visualization

Based on the plot below it appears that rating increases with higher ABV.


Rating vs. ABV

Based on the plots below it appears that rating increases with higher ABV specifically starting about 85 ABV.

Below is a better view of the same data (filtering for reviews greater than 80)

Below is a better view of the same data (filtering for reviews greater than 80). Notice the “hump” between ratings of 93 and 96.

Let us focus on ratings greater than 90 (we only care about highly rated bourbons). The ABV hovers between just over .5 and around .575. Not that this is a hard and fast rule, as ABV and enjoyment leave quite a bit to personal taste (not unlike ABV and International Bitterness Units (IBU) for beer).

Linear regression of ABV

I ran a linear regression of Rating on ABV. The slope is 28.7, with an intercept of 73.96. This seems more promising than price, but it is important to remember that ABV is a fraction less than one, so a one percentage point increase in ABV (0.01) corresponds to only about 0.29 rating points.

Rating = 73.96 + 28.7*ABV

## 
## Call:
## lm(formula = Rating ~ ABV, data = whiskeya)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -26.8672  -2.1540   0.6746   2.6986   9.1328 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  73.9588     0.8917   82.94   <2e-16 ***
## ABV          28.6854     1.7607   16.29   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.091 on 1033 degrees of freedom
## Multiple R-squared:  0.2044, Adjusted R-squared:  0.2036 
## F-statistic: 265.4 on 1 and 1033 DF,  p-value: < 2.2e-16
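
Because ABV is recorded as a fraction, the slope is easier to read per percentage point, or across the interquartile range of ABV from the summary statistics (0.45 to 0.54). The arithmetic below uses only the fitted slope above and is meant as an illustration of scale, not as an additional result.

```latex
\[
\frac{28.6854}{100} \approx 0.29 \ \text{rating points per percentage point of ABV}
\]
\[
\widehat{\text{Rating}}(0.54) - \widehat{\text{Rating}}(0.45)
  = 28.6854 \times (0.54 - 0.45) \approx 2.58
\]
```

Moving from the 25th to the 75th percentile of ABV is therefore associated with about two and a half rating points.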



Investigation of relationship of ABV to Price

Because both ABV and Price are correlated with rating, I want to understand the relationship between these two variables. Do higher proof bourbons cost more?

Statistical Correlation of Price to ABV

To see whether there is a statistical correlation between ABV and Price for bourbon, I ran a Pearson correlation test. The p-value of the test is 7.471e-14, which is less than the significance level alpha = 0.05. We can conclude that ABV and Price are significantly correlated, with a correlation coefficient of .23 (t = 7.58). We can also see that the plot is generally straight, which also indicates correlation.

## 
##  Pearson's product-moment correlation
## 
## data:  whiskeya$ABV and whiskeya$Price
## t = 7.5836, df = 1033, p-value = 7.471e-14
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1711059 0.2865722
## sample estimates:
##       cor 
## 0.2296469

Price vs. ABV Boxplot Visualization

Based on the plot below it appears that price increases with higher ABV.


Price vs. ABV

TBD


Price vs. ABV

Filter price lower than $250


Predictive Analysis

Basic Correlation Chart

This chart compares all of the variables in one chart and shows there is a relationship between the variables, the strongest being between ABV and Rating.


Multivariate ABV and Price

When we run a multiple regression of Rating on both ABV and Price, we get the results below.

## 
## Call:
## lm(formula = Rating ~ ABV + Price, data = whiskeya)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -26.592  -2.058   0.638   2.728   8.971 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 74.430217   0.884031   84.19  < 2e-16 ***
## ABV         26.468012   1.784836   14.83  < 2e-16 ***
## Price        0.008360   0.001545    5.41 7.84e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.036 on 1032 degrees of freedom
## Multiple R-squared:  0.2264, Adjusted R-squared:  0.2249 
## F-statistic:   151 on 2 and 1032 DF,  p-value: < 2.2e-16


This gives us an intercept of 74.4, and coefficients of .008 for Price and 26.47 for ABV.

##                Estimate  Std. Error   t value     Pr(>|t|)
## (Intercept) 74.43021653 0.884031479 84.194079 0.000000e+00
## Price        0.00836008 0.001545367  5.409771 7.843222e-08
## ABV         26.46801220 1.784836153 14.829379 3.031250e-45


All-in regression model

We can now build a simple all-in regression model and examine the output. When we run this as a multivariable model, it gives similar results to the previous ones conducted on individual variables:

## 
## Regression Results
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                               Rating           
## -----------------------------------------------
## Price                    0.008*** (0.002)      
## ABV                      26.468*** (1.785)     
## Constant                 74.430*** (0.884)     
## -----------------------------------------------
## Observations                   1,035           
## R2                             0.226           
## Adjusted R2                    0.225           
## Residual Std. Error      4.036 (df = 1032)     
## F Statistic          150.973*** (df = 2; 1032) 
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01


Identify the best model

Using the earlier model with 2 explanatory variables, we can run the regression subsets command to identify the best models:

## Subset selection object
## Call: regsubsets.formula(Rating ~ ., whiskeya_explore, nvmax = 13)
## 2 Variables  (and intercept)
##       Forced in Forced out
## Price     FALSE      FALSE
## ABV       FALSE      FALSE
## 1 subsets of each size up to 2
## Selection Algorithm: exhaustive
##          Price ABV
## 1  ( 1 ) " "   "*"
## 2  ( 1 ) "*"   "*"
## [1] 0.2036463 0.2248563
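
The two numbers on the last line are the adjusted R-squared values for the one-variable and two-variable models. As a check, they can be approximately reproduced from the plain R-squared values reported in the earlier regression outputs. The helper `adjusted_r2` below is my own; it applies the standard adjustment for a model with p predictors and n observations.

```r
# Hypothetical helper: adjusted R-squared from R-squared, sample size n, and p predictors.
adjusted_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

n <- 1035  # observations in the bourbon data

# R-squared values copied (rounded to 4 decimals) from the regression outputs above.
adj_abv_only  <- adjusted_r2(0.2044, n, p = 1)  # Rating ~ ABV
adj_abv_price <- adjusted_r2(0.2264, n, p = 2)  # Rating ~ ABV + Price

print(round(c(abv_only = adj_abv_only, abv_and_price = adj_abv_price), 4))
```

Adding Price raises adjusted R-squared only modestly, from about 0.20 to about 0.22, so ABV carries most of the explanatory weight.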

R Squared

We can examine these features in a little more detail with a few plots. We can see by the scree plot, the ideal number of variables appears to be between 1 and 2, with 2 appearing as the primary bend.

Adjusted R Squared

Next, let's explore the number of variables by adjusted R Squared.


Evaluate the ideal number of variables

A similar story is told by a chart of adjusted R-squared. The ideal number of variables appears to be 2. We can evaluate it with the following:

## 
## Q2 Part D Regression Results
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                               Rating           
## -----------------------------------------------
## ABV                      26.468*** (1.785)     
## Price                    0.008*** (0.002)      
## Constant                 74.430*** (0.884)     
## -----------------------------------------------
## Observations                   1,035           
## R2                             0.226           
## Adjusted R2                    0.225           
## Residual Std. Error      4.036 (df = 1032)     
## F Statistic          150.973*** (df = 2; 1032) 
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01

Rating Prediction Model

We can use this data to predict the rating of a bourbon based on Price and ABV. Running a prediction based on the mean of ABV and Price we get a result of 88.337 which matches the Mean for Rating from our summary statistics.

##       1 
## 88.3372

An example of how to use this: if we wanted a maximum price of $50 and a rating of 90 or more, we would expect to need an ABV of at least about .57.

Similarly, if we knew we preferred a bourbon with a .50 ABV, we would expect to pay about $280 for a predicted rating of 90. This tells me that ABV is clearly driving a bourbon’s rating.
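
These what-if questions amount to rearranging the fitted equation. The sketch below uses the coefficients from the two-variable model above; the function names are hypothetical, and the target of exactly 90 is the break-even point, so any ABV or price above the returned value pushes the prediction past 90.

```r
# Coefficients copied from the Rating ~ ABV + Price output above.
b0      <- 74.43021653
b_price <- 0.00836008
b_abv   <- 26.46801220

# Predicted rating for a given ABV (as a fraction) and price (USD).
predict_rating <- function(abv, price) b0 + b_abv * abv + b_price * price

# Rearrangements of the same equation (hypothetical helper names).
abv_needed   <- function(target, price) (target - b0 - b_price * price) / b_abv
price_needed <- function(target, abv)   (target - b0 - b_abv * abv) / b_price

# Prediction at the rounded means from the summary table (ABV 0.501, Price 76.553).
# This lands near the mean Rating; the small gap comes from using rounded means.
at_means <- predict_rating(0.501, 76.553)

abv_for_90_at_50    <- abv_needed(90, price = 50)   # break-even ABV at $50
price_for_90_at_050 <- price_needed(90, abv = 0.50) # break-even price at .50 ABV

print(round(c(at_means = at_means,
              abv_for_90_at_50 = abv_for_90_at_50,
              price_for_90_at_050 = price_for_90_at_050), 3))
```

The price needed at .50 ABV is several times the mean bottle price, while the ABV needed at $50 is only modestly above the mean ABV, which is the contrast the text draws.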



How to get the best value in Bourbon

Top 98 by Cost per rating

Keeping it simple. This looks at bourbons reviewed as “Outstanding” or “Classic” that cost less than $60. This is for the practical consumer.
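
A list like this can be sketched as a filter followed by a sort. The bourbon names, ratings, and prices below are made up for illustration, and I am assuming that “cost per rating” means price divided by rating points; the report does not spell out the formula.

```r
# Hypothetical example data, not taken from the actual reviews.
bourbons <- data.frame(
  Name   = c("Bourbon A", "Bourbon B", "Bourbon C", "Bourbon D"),
  Rating = c(92, 95, 88, 91),
  Price  = c(35, 55, 25, 120)
)

# ASSUMPTION: cost per rating = dollars paid per rating point.
bourbons$CostPerRating <- bourbons$Price / bourbons$Rating

# Keep "Outstanding" or "Classic" bourbons (rating of 90 or more) under $60,
# then sort so the cheapest rating points come first.
value_picks <- bourbons[bourbons$Rating >= 90 & bourbons$Price < 60, ]
value_picks <- value_picks[order(value_picks$CostPerRating), ]

print(value_picks)
```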


Classics 2000-2015

Chart of the top 48 by cost per rating from reviewer John Hansell only. He was the main reviewer of bourbon for 15 years (2000-2015), and while many of these bourbons are no longer available, they could point to new versions of the same brand. Because the mean of Hansell’s ratings was just below 90, we limit this list to bourbons in the “Outstanding” and “Classic” categories, so they will all be in the top half of this reviewer’s ratings.


What’s new?

Chart of the top 77 by cost per rating, 2018 to present. Focusing on bourbons that a consumer is more likely to be able to purchase on the market today, below is a list of bourbons that cost less than $70 and have a rating over 90.


Best of the best

Bourbons from 2018 to present that cost less than $70 (our target price) and have a rating of 95 or more (the highest category, “Classic”). There are only 5, but they provide top-tier ratings at a reasonable price. It is interesting to note that 4 of 5 have a high ABV, which is consistent with our analysis of ABV.