Brewery Market Study

You’re trying to start a craft brewery chain and it’s important to know what the market for such items would be. We analyze your data on a large number of beers and breweries. Here are some results of some analysis to help build the market analysis for the business plan.

1. How many breweries are present in each state?

breweries <- read.csv("https://raw.githubusercontent.com/tikisen/6306-case-study-1/master/Breweries.csv", quote = "", row.names = NULL, stringsAsFactors = FALSE, header = TRUE)
beers <- read.csv("https://raw.githubusercontent.com/tikisen/6306-case-study-1/master/Beers.csv", quote = "", row.names = NULL, stringsAsFactors = FALSE, header = TRUE)
str(breweries)
## 'data.frame':    559 obs. of  4 variables:
##  $ Brew_ID: chr  "1" "2" "3" "4" ...
##  $ Name   : chr  "NorthGate Brewing " "Against the Grain Brewery" "Jack's Abby Craft Lagers" "Mike Hess Brewing Company" ...
##  $ City   : chr  "Minneapolis" "Louisville" "Framingham" "San Diego" ...
##  $ State  : chr  " MN" " KY" " MA" " CA" ...
names(breweries)[2]<-paste("Brewery_Name")
str(beers)
## 'data.frame':    2412 obs. of  7 variables:
##  $ Name      : chr  "Pub Beer" "Devil's Cup" "Rise of the Phoenix" "Sinister" ...
##  $ Beer_ID   : chr  "1436" "2265" "2264" "2263" ...
##  $ ABV       : num  0.05 0.066 0.071 0.09 0.075 0.077 0.045 0.065 0.055 0.086 ...
##  $ IBU       : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ Brewery_id: int  409 178 178 178 178 178 178 178 178 178 ...
##  $ Style     : chr  "American Pale Lager" "American Pale Ale (APA)" "American IPA" "American Double / Imperial IPA" ...
##  $ Ounces    : chr  "12" "12" "12" "12" ...
names(beers)[1]<-paste("Beer_Name")
require(package)
## Loading required package: package
## Warning in library(package, lib.loc = lib.loc, character.only = TRUE,
## logical.return = TRUE, : there is no package called 'package'
summary(breweries)
##    Brew_ID          Brewery_Name           City          
##  Length:559         Length:559         Length:559        
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##     State          
##  Length:559        
##  Class :character  
##  Mode  :character
install.packages('plyr',repos = "http://cran.us.r-project.org")
## 
## The downloaded binary packages are in
##  /var/folders/3b/4yr_49g17gn2h9_h6n5zyl6w0000gr/T//Rtmph0nsps/downloaded_packages
library("plyr")
count(breweries, "State")
##              State freq
## 1                     1
## 2               AK    7
## 3               AL    3
## 4               AR    2
## 5               AZ   11
## 6               CA   39
## 7               CO   47
## 8               CT    8
## 9               DC    1
## 10              DE    2
## 11              FL   15
## 12              GA    7
## 13              HI    4
## 14              IA    5
## 15              ID    5
## 16              IL   18
## 17              IN   22
## 18              KS    3
## 19              KY    4
## 20              LA    5
## 21              MA   23
## 22              MD    7
## 23              ME    9
## 24              MI   32
## 25              MN   12
## 26              MO    9
## 27              MS    2
## 28              MT    9
## 29              NC   19
## 30              ND    1
## 31              NE    5
## 32              NH    2
## 33              NJ    3
## 34              NM    4
## 35              NV    2
## 36              NY   16
## 37              OH   15
## 38              OK    6
## 39              OR   29
## 40              PA   25
## 41              RI    5
## 42              SC    4
## 43              SD    1
## 44              TN    3
## 45              TX   28
## 46              UT    4
## 47              VA   16
## 48              VT   10
## 49              WA   23
## 50              WI   20
## 51              WV    1
## 52              WY    4
## 53 North Woodstock    1

2. Merge beer data with the breweries data. Print the first 6

observations and the last six observations to check the merged file.

#change brewery ID name to match in both files
names(breweries)[1]<-paste("Brewery_id")
beer_merge <- merge(beers,breweries,by=c("Brewery_id"))
head(beer_merge)
##   Brewery_id     Beer_Name Beer_ID   ABV IBU
## 1          1  Get Together    2692 0.045  50
## 2          1 Maggie's Leap    2691 0.049  26
## 3          1    Wall's End    2690 0.048  19
## 4          1       Pumpion    2689 0.060  38
## 5          1    Stronghold    2688 0.060  25
## 6          1   Parapet ESB    2687 0.056  47
##                                 Style Ounces       Brewery_Name
## 1                        American IPA     16 NorthGate Brewing 
## 2                  Milk / Sweet Stout     16 NorthGate Brewing 
## 3                   English Brown Ale     16 NorthGate Brewing 
## 4                         Pumpkin Ale     16 NorthGate Brewing 
## 5                     American Porter     16 NorthGate Brewing 
## 6 Extra Special / Strong Bitter (ESB)     16 NorthGate Brewing 
##          City State
## 1 Minneapolis    MN
## 2 Minneapolis    MN
## 3 Minneapolis    MN
## 4 Minneapolis    MN
## 5 Minneapolis    MN
## 6 Minneapolis    MN
tail(beer_merge)
##      Brewery_id                 Beer_Name Beer_ID   ABV IBU
## 2403        556             Pilsner Ukiah      98 0.055  NA
## 2404        557  Heinnieweisse Weissebier      52 0.049  NA
## 2405        557           Snapperhead IPA      51 0.068  NA
## 2406        557         Moo Thunder Stout      50 0.049  NA
## 2407        557         Porkslap Pale Ale      49 0.043  NA
## 2408        558 Urban Wilderness Pale Ale      30 0.049  NA
##                        Style Ounces                  Brewery_Name
## 2403         German Pilsener     12         Ukiah Brewing Company
## 2404              Hefeweizen     12       Butternuts Beer and Ale
## 2405            American IPA     12       Butternuts Beer and Ale
## 2406      Milk / Sweet Stout     12       Butternuts Beer and Ale
## 2407 American Pale Ale (APA)     12       Butternuts Beer and Ale
## 2408        English Pale Ale     12 Sleeping Lady Brewing Company
##               City State
## 2403         Ukiah    CA
## 2404 Garrattsville    NY
## 2405 Garrattsville    NY
## 2406 Garrattsville    NY
## 2407 Garrattsville    NY
## 2408     Anchorage    AK

3. Report the number of NA’s in each column.

sapply(beer_merge, function(x) sum(is.na(x)))
##   Brewery_id    Beer_Name      Beer_ID          ABV          IBU 
##            0            0            0           62         1003 
##        Style       Ounces Brewery_Name         City        State 
##            0            0            0            0            0

4. Compute the median alcohol content and international bitterness unit for each state. Plot a bar chart to compare.

median(beer_merge$ABV, na.rm = TRUE)
## [1] 0.056
median(beer_merge$IBU, na.rm = TRUE)
## [1] 35
median_ABV <- aggregate(ABV~State, beer_merge, FUN = median)
barplot(median_ABV$ABV, 
        space = .25, 
        xlab = "States", 
        ylab = "Alcohol by Volume", 
        main = "Median ABV by State", 
        names.arg = median_ABV$State, 
        las=2)

median_IBU <- aggregate(IBU~State, beer_merge, FUN = median)
barplot(median_IBU$IBU, 
        space = .25, 
        xlab = "States", 
        ylab = "Bitterness", 
        main = "Median IBU by State", 
        names.arg = median_IBU$State, 
        las=2)

###5. This code will look in the merged file to determine which state has the maximum alcoholic (ABV) beer and which state has the most bitter (IBU) beer.

beer_merge$State[which.max(beer_merge$ABV)]
## [1] " CO"
beer_merge$State[which.max(beer_merge$IBU)]
## [1] " OR"

The state with the beer with the highest alcohol by volume is Colorado. The state with the beer with the most bitter beer is Oregon.

6. Here is some code to produce some summary statistics for the ABV of the beers.

summary(beer_merge$ABV)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
## 0.00100 0.05000 0.05600 0.05978 0.06700 0.12800      62

The lowest ABV is 0.1%, the highest is 12.8%. The average is 5.98% and the middle value is 5.6%. The alcohol content is unreported for 62 beers.

7. Is there an apparent relationship between the bitterness of the

beer and its alcoholic content? Draw a scatter plot.

plot(x=beer_merge$IBU,
     y= beer_merge$ABV,
     xlab = "Bitterness",
     ylab = "Alcohol by Volume",
     main = "Bitterness vs ABV")

There doesn’t seem to be a strong relationship between bitterness and alcoholic content. You can see a range of values for alcohol content all along the range of values for bitterness.