Description

Objective

Implement a machine learning algorithm such that, given a beer style (IPA, Lager, etc.) as input, the algorithm outputs a new recipe for that beer style.

Relevant Data

To facilitate this, existing beer brewing recipes have been scraped from brewing websites and compiled into a dataset containing the (i) style, (ii) characteristics, (iii) ingredients, and (iv) instructions for brewing each beer. The dataset provided here is a sample of the full collection. Two important assumptions about the dataset are:

  1. The dataset contains enough recipes to train a reliable model.

  2. The features (variables) in the dataset are sufficient for a brewer to replicate each beer.

Questions

  1. Give an overview of your model’s architecture. Justify why this specific architecture is recommended.

  2. Provide a description of the training dataset that you will use to train your model. This may include additional features and/or a different structure to the sample dataset that was provided. Justify your approach.

  3. Briefly discuss the training algorithm that would be used to train your model.

Answer: I would compare the performance of several algorithms, such as logistic regression, support vector machines, and random forests, using the scikit-learn package in Python. I am currently only familiar with single-layer neural networks, which is why I do not consider a deep learning algorithm.

  4. Explain the expected consequences for your model if the training data is imbalanced. That is, the distribution of classes in the training data deviates significantly (statistically and practically) from a uniform distribution.

Answer: For underrepresented styles, the model is unlikely to learn the patterns that characterize the style. Hence, a beer recipe generated by the algorithm for such a style won’t necessarily taste like a beer of that style.

  5. In case the consequences in question (4) are deemed to be detrimental to your model’s performance, explain how you would attempt to mitigate the effects.

  6. Discuss the performance metrics that would be used to evaluate your model. Also explain the effects of the class distribution problem in question (4) on each of the performance metrics.

Answer: The two averaging schemes that I am familiar with are micro- and macro-averaging of per-class metrics such as precision, recall, and F1. For imbalances in the available data for each beer style, macro-averaging is a good fit since it puts each class on an equal footing, so performance on rare styles is not masked by performance on common ones.
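
As a toy illustration of the difference (the numbers here are invented for the example): with one large and one rare class, the micro average is dominated by the large class, while the macro average exposes the failure on the rare one.

recall_a <- 97 / 98          # class A: 98 test recipes, 97 correct
recall_b <- 0 / 2            # class B: 2 test recipes, 0 correct
mean(c(recall_a, recall_b))  # macro-averaged recall: ~0.49
(97 + 0) / (98 + 2)          # micro-averaged recall: 0.97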

Further Questions

A. If the approach above did not involve deep learning:

  1. Explain why a deep learning approach was not considered, or deemed unsuitable for the task.

  2. Briefly discuss the differences (in terms of predictive modelling) between an image classification task and a non-image classification task (i.e. numeric and/or categorical data).

Answer: Numeric data, such as weight, forms an ordered set; categorical data, such as the colors red, blue, and green, cannot be ordered. Several options exist in Python to deal with categorical (nominal) variables. One method is to create dummy variables and assign binary values to them: instead of having red, blue, or green under the variable color, we create dummy variables color_red, color_blue, and color_green and assign each a value of 1 or 0.
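
Although that answer mentions Python, the analysis in this document is in R, where the same idea can be sketched with model.matrix on a toy data frame (the variable names here are illustrative):

toy <- data.frame(color = factor(c("red", "blue", "green", "red")))
dummies <- model.matrix(~ color - 1, data = toy)  # one 0/1 column per level
colnames(dummies)  # "colorblue" "colorgreen" "colorred"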

  3. Explain the advantages of deep learning over other approaches for image classification.

Answer: This is something that I was not aware of and will explore in the future.

  4. Explain how you would apply a deep learning approach to the classification of non-image data.

Beer Brewing Dataset

Let us first load the data and look at all the column names.

# Load csv format data from file
file_location <- 'C:\\Users\\Windows\\Dropbox\\AllStuff\\Beer_Brewing_Problem\\Data\\beerrecipe.csv'
beer_data <- read.csv(file_location)

# Standardize column names to lower case, then list them
names(beer_data) <- tolower(names(beer_data))
names(beer_data)
##  [1] "name"                    "orig_gravity"           
##  [3] "final_gravity"           "abv"                    
##  [5] "ibu"                     "srm"                    
##  [7] "style"                   "fermentable1"           
##  [9] "fermentable2"            "fermentable3"           
## [11] "fermentable4"            "f_amount1"              
## [13] "f_amount2"               "f_amount3"              
## [15] "f_amount4"               "firstworthops1"         
## [17] "firstworthops2"          "fwh_amount1"            
## [19] "fwh_amount2"             "boilhops1"              
## [21] "boilhops2"               "boilhops3"              
## [23] "boilhops4"               "bh_amount1"             
## [25] "bh_amount2"              "bh_amount3"             
## [27] "bh_amount4"              "boil_time1"             
## [29] "boil_time2"              "boil_time3"             
## [31] "boil_time4"              "dryhop1"                
## [33] "dryhop2"                 "dryhop_time1"           
## [35] "dryhop_time2"            "mashtype"               
## [37] "mashamount"              "mashtime"               
## [39] "mashtemp"                "yeast"                  
## [41] "yeastattenuation"        "yeasttemp"              
## [43] "fermentationtemperature"

The beer names are irrelevant and can be eliminated. Next, let us extract the various styles of beer; these will serve as the discrete class labels for each recipe. We can also see that there are missing values that need to be imputed: in the American Lager data below, for example, one fermentation temperature is missing.

unique(beer_data$style)
##  [1] American IPA                Sweet Stout                
##  [3] Special/Best/Premium Bitter Belgian Pale Ale           
##  [5] American Lager              Irish Red Ale              
##  [7] English IPA                 Blonde Ale                 
##  [9] American Barleywine         Belgian Tripel             
## 10 Levels: American Barleywine American IPA ... Sweet Stout
beer_data[beer_data$style == "American Lager", ]
##             name orig_gravity final_gravity  abv   ibu  srm          style
## 6 American Lager        1.046         1.012 4.54 24.13 2.77 American Lager
## 7      Aus Lager        1.042         1.007 4.68 25.43 2.98 American Lager
##         fermentable1     fermentable2 fermentable3 fermentable4 f_amount1
## 6 American - Pilsner      Flaked Rice                                3500
## 7 American - Pilsner American - Wheat                                3800
##   f_amount2 f_amount3 f_amount4 firstworthops1 firstworthops2 fwh_amount1
## 6      1000        NA        NA                                        NA
## 7       400        NA        NA                                        NA
##   fwh_amount2         boilhops1 boilhops2 boilhops3 boilhops4 bh_amount1
## 6          NA            Galena                                       15
## 7          NA Pride of Ringwood                                       20
##   bh_amount2 bh_amount3 bh_amount4 boil_time1 boil_time2 boil_time3
## 6         NA         NA         NA         60         NA         NA
## 7         NA         NA         NA         NA         NA         NA
##   boil_time4 dryhop1 dryhop2 dryhop_time1 dryhop_time2 mashtype mashamount
## 6         NA                                           Infusion        11l
## 7         NA                                                           28l
##   mashtime mashtemp
## 6       60       68
## 7       90       64
##                                                     yeast yeastattenuation
## 6                 DCL Yeast S-189 - SafLager German Lager           75.00%
## 7 Fermentis / Safale - Saflager - German Lager Yeast S-23           82.00%
##   yeasttemp fermentationtemperature
## 6     19-22                      NA
## 7    8.9-22                      15
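
As a minimal sketch of the imputation flagged above, assuming a simple per-style mean fill for numeric columns (more careful strategies are of course possible):

# Fill missing fermentation temperatures with the mean for that style
library(dplyr)
beer_data <- beer_data %>%
  group_by(style) %>%
  mutate(fermentationtemperature = ifelse(
    is.na(fermentationtemperature),
    mean(fermentationtemperature, na.rm = TRUE),
    fermentationtemperature)) %>%
  ungroup()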

Modeling aspect of the problem

Step 1: Re-shape the dataset.

beer_data[, c("fermentable1", "f_amount1")]
##                         fermentable1 f_amount1
## 1              American - Pale 2-Row       512
## 2              American - Pale 2-Row      4990
## 3              Canadian - Pale 2-Row      2950
## 4  United Kingdom - Maris Otter Pale      2400
## 5                German - Wheat Malt      3630
## 6                 American - Pilsner      3500
## 7                 American - Pilsner      3800
## 8                   German - Pilsner      3000
## 9  United Kingdom - Maris Otter Pale      4990
## 10             American - Pale 2-Row      4130
## 11 United Kingdom - Maris Otter Pale      6580
## 12  American - Caramel / Crystal 60L       100
## 13                 Belgian - Pilsner      6000

As mentioned earlier, I would change the format of the dataset by creating dummy variables fermentable1_AmericanPale2Row, fermentable1_CanadianPale2Row, etc., with the amount used stored directly under each; the amount would be 0 if the ingredient was not used. This would make training the model more convenient, although there may be other ways of getting around the fact that fermentable1 and f_amount1 are currently decoupled.
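
A sketch of this reshaping with tidyr, assuming the fermentable columns follow the naming shown above (the same pattern would apply to the hop columns):

library(dplyr)
library(tidyr)

fermentables_wide <- beer_data %>%
  mutate(id = row_number()) %>%
  # stack the four fermentable/amount pairs into long format
  pivot_longer(cols = matches("^(fermentable|f_amount)[1-4]$"),
               names_to = c(".value", "slot"),
               names_pattern = "(fermentable|f_amount)([1-4])") %>%
  filter(!is.na(fermentable), fermentable != "") %>%
  # one amount-valued column per distinct ingredient, 0 when unused
  # (the numeric characteristics would be joined back on id afterwards)
  pivot_wider(id_cols = c(id, style),
              names_from = fermentable, values_from = f_amount,
              values_fn = sum, values_fill = 0,
              names_prefix = "fermentable_")
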
Step 2: Train various predictive models: I would test random forests, support vector machines, and logistic regression.
Step 3: Evaluate each model using macro-/micro-averaged metrics (a combined sketch of Steps 2 and 3 follows below).
Step 4: Decide on a model based on performance.
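
A hypothetical sketch of Steps 2 and 3, assuming beer_wide is the reshaped, fully numeric data frame from Step 1 with syntactic column names and style as a factor class label; the macro-averaged F1 below weights every style equally:

library(randomForest)

set.seed(42)
train_idx <- sample(nrow(beer_wide), floor(0.8 * nrow(beer_wide)))
train <- beer_wide[train_idx, ]
test  <- beer_wide[-train_idx, ]

# One candidate model; SVMs and logistic regression would be fit
# and compared the same way
rf_model <- randomForest(style ~ ., data = train, ntree = 500)
pred <- predict(rf_model, newdata = test)

# Macro-averaged F1: per-class F1, then an unweighted mean over styles
macro_f1 <- function(pred, truth) {
  cm <- table(truth, pred)
  f1 <- sapply(rownames(cm), function(cls) {
    tp        <- cm[cls, cls]
    precision <- tp / max(sum(cm[, cls]), 1)
    recall    <- tp / max(sum(cm[cls, ]), 1)
    if (precision + recall == 0) 0
    else 2 * precision * recall / (precision + recall)
  })
  mean(f1)
}
macro_f1(pred, test$style)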

Coming up with a new beer recipe for a given style

Step 1: Use L1 regularization in conjunction with logistic regression to find out which features are essential to that beer style. Then find the set of variables outside the essential features that have been used at one time or another in the same style of beer.
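
A sketch of this feature-selection step with glmnet, again assuming the numeric beer_wide frame from earlier; features with non-zero lasso coefficients for a style are treated as essential to it:

library(glmnet)

x <- as.matrix(beer_wide[, setdiff(names(beer_wide), "style")])
y <- beer_wide$style

# alpha = 1 gives the lasso (L1) penalty; lambda chosen by cross-validation
cv_fit <- cv.glmnet(x, y, family = "multinomial", alpha = 1)

# Non-zero coefficients for one style, e.g. American IPA
coefs <- coef(cv_fit, s = "lambda.min")[["American IPA"]]
essential <- setdiff(rownames(coefs)[as.matrix(coefs) != 0], "(Intercept)")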

Step 2: For essential numerical values, generate a random number from a normal distribution centered on the mean of the training dataset, with the corresponding standard deviation.

Step 3: For the essential categorical variables, include the ingredient and determine its amount from a normal distribution with mean and standard deviation estimated from the training data.

Step 4: For each non-essential but previously used ingredient, use a binary random variable to determine whether to include it, followed by a normally distributed random amount about the mean with the corresponding standard deviation. Once this phase is complete, we will have a suggested beer recipe for that style.

Step 5: Feed the generated recipe back into the trained classifier and check that it is predicted to belong to the intended style.
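
A hypothetical end-to-end sketch of Steps 2-5 for one target style. All names here are illustrative: style_data is the subset of training recipes for the target style, essential and optional come from Step 1 and are assumed to cover all of the classifier's features, and rf_model and target_style are from the modeling section above.

set.seed(123)

# Draw an amount from a normal fit to the observed amounts, truncated at zero
sample_amount <- function(x) {
  max(rnorm(1, mean = mean(x, na.rm = TRUE), sd = sd(x, na.rm = TRUE)), 0)
}

# Steps 2-3: essential features always get a sampled value
new_recipe <- sapply(essential, function(f) sample_amount(style_data[[f]]))

# Step 4: optional ingredients enter at their observed inclusion rate
for (f in optional) {
  used <- !is.na(style_data[[f]]) & style_data[[f]] > 0
  if (rbinom(1, 1, prob = mean(used)) == 1)
    new_recipe[f] <- sample_amount(style_data[[f]][used])
  else
    new_recipe[f] <- 0
}

# Step 5: check that the classifier assigns the intended style
candidate <- as.data.frame(t(new_recipe))
predict(rf_model, newdata = candidate) == target_style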

Lastly

Make the beer and taste it.