Niko Nolte 2023-05-05
Every year, millions and millions of beers are consumed, and I find this topic very interesting because beer has been around for a very long time. I have had the opportunity to try beer all over the world, and it is something that I will always hold dear to my heart. There are so many different kinds of beer and so many different tastes. I want to understand which factors contribute to how well a certain beer is rated, as well as how different styles and attributes relate to a beer's rating.
Before I start analyzing the data, I would first like to introduce you to my data dictionary, which explains what each column means so there is no confusion later! This is a big data set from Kaggle with a lot of different variables that I will be analyzing in later parts.
A unique id that identifies a specific brewery.
The name of the brewery that goes with its unique ID.
The time at which the review was made, in seconds.
The overall score out of 5 that a specific beer achieved.
The aroma score out of 5 which has to do with smell.
The score of the beer's appearance out of 5.
The reviewer's profile name.
The style of the beer. Beer styles are differentiated by color, flavor, strength, ingredients, production method, recipe, history, or origin (from Google).
The sense of taste of the beer without the aroma on a score out of 5.
How the beer tastes leaving your mouth on a score out of 5.
The name of the beer.
The alcohol content of the beer (how strong the beer is).
The unique ID that identifies a specific beer.
This graph is an introduction to the data set. As we can see, as the taste score of a beer goes up, so does its overall review score. This makes logical sense, and it is interesting to see how strongly taste influences the overall review.
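One way to put a number on the relationship between taste and the overall score is a Pearson correlation coefficient. The block below is a minimal sketch, not the code behind the graph: the function name `pearson` and the two short lists of scores are made up for illustration and do not come from the data set.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up scores for illustration only, not taken from the data set.
taste = [2.0, 3.0, 3.5, 4.0, 4.5]
overall = [2.5, 3.0, 3.5, 4.0, 4.5]
r = pearson(taste, overall)
print(f"correlation between taste and overall: {r:.3f}")
```

A value close to 1 would match what the graph suggests: beers that score well on taste also tend to score well overall.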
This bar chart is meant to show the most common beer ABV. Within this data set there are over 100,000 observations. I removed all of the NAs, which still leaves us with thousands of data points. As shown, the most common beer ABV seems to be between 7 and 8. This is highly interesting. From a preliminary look at the data, most of it appears to come from around the world, with only a very limited number of beers from the United States. It seems that ABVs around the world are much higher on average than in the United States. Unfortunately, there is no location data in this data set, which could have given us more insight into where the majority of these breweries are located.
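The idea behind the bar chart, dropping missing values and then counting beers per one-point ABV range, can be sketched as follows. This is an illustration under assumptions, not the original code: the helper name `abv_bins` and the sample values are hypothetical, and missing values are assumed to appear as either `None` or `NaN`.

```python
from collections import Counter
from math import floor, isnan


def abv_bins(abvs):
    """Count beers per 1-point ABV bin, skipping missing values."""
    counts = Counter()
    for abv in abvs:
        # Skip missing values (the equivalent of removing the NAs).
        if abv is None or isnan(abv):
            continue
        low = floor(abv)
        counts[(low, low + 1)] += 1
    return counts


# Made-up ABV values for illustration only.
sample = [5.0, 7.2, 7.9, None, float("nan"), 4.5, 7.5]
bins = abv_bins(sample)
most_common_bin, n = bins.most_common(1)[0]
print(f"most common ABV range: {most_common_bin} with {n} beers")
```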
This graph shows some beer styles and their respective ABVs along with the overall review score. As shown, the higher the ABV, the higher the review. This could mean that the reviewers like higher-ABV beers more. It could also be related to the beer style. Preferences for which beers are drunk, based on their names, could be swaying the data. For example, a “Light Lager” is not likely to have a high ABV. It seems that there is a correlation between the ABV and the overall review score given to the beer style.
I wanted to better understand the time frame in which these reviews were collected, to see if there are any time-related variables that could play into the high preference for ABV shown above or the wide variety of breweries located in different countries. This bar chart shows that the majority of the reviews occurred in 2011. This is interesting because, if we take a look at the other charts within this document, we can see that a higher ABV is preferred.
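Counting reviews per year from a time column stored in seconds could look like the sketch below. It assumes the seconds are counted from the Unix epoch (1970-01-01 UTC), which the data dictionary does not state explicitly; the function name `reviews_per_year` and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def reviews_per_year(timestamps):
    """Count reviews per calendar year (UTC) from timestamps in seconds."""
    # Assumption: timestamps are seconds since the Unix epoch.
    years = (datetime.fromtimestamp(ts, tz=timezone.utc).year for ts in timestamps)
    return Counter(years)


# Made-up timestamps for illustration only.
sample = [1293840000, 1300000000, 1325376000]
counts = reviews_per_year(sample)
for year, n in sorted(counts.items()):
    print(year, n)
```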
I was curious whether Aroma was rated the same as other variables, such as appearance. It seems like Aroma is a big factor when rating beers, which is very interesting. Craft beer tasting has become more like wine tasting, where attributes like Aroma matter a lot.
Now I will be getting into the sentiment analysis of the first page of reviews on a website called Beer Connoisseur. I was unable to scrape this data programmatically, so instead I manually copied all of the links and then joined the reviews together to get a column of reviews. I looked purely at the words and what kind of sentiment they carry. I did not include the reviewer or the beer name.
These are the 10 most used words in the 16 beer reviews that I obtained. As seen, “beer” is at the top with a count of about 63. But I am more interested in words such as “style” or “character”. I have never heard anyone describe a beer as having “style”, so I found that interesting.
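A word count like this can be sketched in a few lines. The block below is an illustration, not the original analysis: the function name `top_words`, the small stop-word set, and the two sample sentences are all hypothetical, and a real analysis would use a much fuller stop-word list.

```python
import re
from collections import Counter

# Small hypothetical stop-word set; a real list would be much longer.
STOP_WORDS = {"a", "an", "the", "and", "of", "with", "has", "is", "it", "this", "to"}


def top_words(reviews, n=10):
    """Return the n most frequent non-stop-words across all reviews."""
    words = []
    for text in reviews:
        # Lowercase, then keep runs of letters and apostrophes as words.
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOP_WORDS:
                words.append(word)
    return Counter(words).most_common(n)


# Made-up review text for illustration only.
sample_reviews = [
    "This beer has style and character.",
    "The beer is true to style.",
]
print(top_words(sample_reviews, n=2))
```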
With this graph, I wanted to see the emotional aspects of the 16 reviews that I collected. Much to my surprise, there seem to be a lot of “positive” words within the reviews. I am unsure how someone can trust a beer, but nevertheless it is very cool to analyze and see the emotions drawn out by something as simple as a beer.
With this bing sentiment, I was curious whether the 16 reviews were more positive or more negative. To my surprise, the majority of the reviews are positive. This could have to do with the site where these reviews are submitted. The website “Beer Connoisseur” is a premium beer site, which could explain all of the positivity in these reviews. Usually, people tend to leave polarizing reviews when they have either a really good time or a bad time. A prime example of this is if you were to go on Yelp and look at the reviews of your favorite restaurant.
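A lexicon-based count of positive and negative words works roughly as sketched below: each word is looked up in a word list that labels it positive or negative, and the labels are tallied. This is only an illustration of the technique. The tiny `TOY_LEXICON` is a made-up stand-in and not the actual Bing word list, and the function name and sample sentences are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stand-in lexicon for illustration; NOT the real Bing word list.
TOY_LEXICON = {
    "good": "positive",
    "great": "positive",
    "pleasant": "positive",
    "bad": "negative",
    "bland": "negative",
    "harsh": "negative",
}


def sentiment_counts(reviews, lexicon=TOY_LEXICON):
    """Tally positive and negative words found in the lexicon."""
    counts = Counter()
    for text in reviews:
        for word in re.findall(r"[a-z']+", text.lower()):
            label = lexicon.get(word)
            # Words missing from the lexicon are ignored.
            if label is not None:
                counts[label] += 1
    return counts


# Made-up review text for illustration only.
sample_reviews = [
    "A great beer with a pleasant finish.",
    "Good aroma but a bland body.",
]
counts = sentiment_counts(sample_reviews)
print(counts["positive"], counts["negative"])
```

Because words outside the lexicon are skipped, the result depends heavily on which word list is used.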
To conclude this in-depth analysis of beer data from all over the world, we have examined the different variables that contribute to how a beer gets rated, as well as the sentiment towards beer in general. The analysis shows that beer is a constantly growing industry, with a lot of craft breweries throughout the world that have an influence on societal norms and preferences. I can personally say that I prefer craft beer over any name-brand beer. I hope you enjoyed this analysis, and have a beer!