Inspired by this GitHub repo by dantaki, which was sent to me by my boyfriend, who worries that my obsession with baking and everything concerning the bake off is getting out of hand, I decided to redo his analysis. But since there’s not much fun in redoing exactly the same thing, I decided to go with the Great Australian Bake Off instead of the Great British Bake Off.
On the pro side, this will probably produce some different results, which could be interesting to compare afterwards, and there’s also a new season starting in two weeks’ time, so there’s not long to wait until the algorithm can be tested in a “real life setting”. On the con side, the Australian counterpart of the show has only run for 4 seasons so far. So, setting one season aside for testing, we’ll have just the data from 3 seasons to find our algorithm.
Same as dantaki, I’ll get my data from the Wikipedia page of the Great Australian Bake Off. So far there have been 4 seasons. Season 1 consisted of 10 bakers and 8 episodes, while seasons 2, 3 and 4 featured 12 bakers and 10 episodes in total. Since the show has evolved somewhat over time (different judges and hosts, for instance) we’ll take season 3 as the test set. The main reason for this is that I believe season 4 should be used to tune the model, since it is the “closest” to the upcoming season 5. I don’t want to use season 1 or 2 for testing either, since especially season 1 has been a bit different from the other ones.
As mentioned, I’ll get the data from the Wikipedia page. I’ll then use the caret package to try to find a working classification algorithm that predicts the ranking of a contestant based on two things: the baker’s results in the most recent episode and their running averages over all episodes so far.
So in the end, after each show, we can feed the algorithm the latest scores and hopefully get a decent prediction as to who is going to win this year’s bake off and, additionally, who will make it to the finale.
We’ll train the model on the data of seasons 1, 2 and 4 and use season 3 as validation set. Once we’ve found the right model, we’ll retrain it using all 4 seasons in the hope of getting an even better prediction model for the upcoming season 5.
So, after my third batch of macarons finally succeeded today, let’s start with the data stuff, maybe THAT will work out quicker. Let’s load the data and take a look at how the tables present themselves.
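A minimal sketch of that first step, assuming tidyverse and a single CSV of the hand-typed Wikipedia tables (the file name and layout are my assumptions; the actual files live in the repo linked at the end):

```r
library(tidyverse)

# Read in the hand-typed episode data scraped from the Wikipedia tables
# (file name is an assumption; see the GitHub repo for the real layout)
bakeoff <- read_csv("data/gabo_seasons.csv")

# Peek at one baker's rows
bakeoff %>% filter(baker == "Nancy") %>% head()
```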
| baker | episode | rank_in_technical | mean_rank_in_technical | star | mean_star | top | mean_top | flop | mean_flop | final_rank | final_group | X |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Nancy | 3 | 1 | 1.000000 | 0 | 0.0000000 | 0 | 0.0000000 | 0 | 0.0000000 | 1 | 1 | winner |
| Nancy | 4 | 7 | 4.000000 | 0 | 0.0000000 | 1 | 0.5000000 | 0 | 0.0000000 | 1 | 1 | winner |
| Nancy | 5 | 4 | 4.000000 | 0 | 0.0000000 | 0 | 0.3333333 | 0 | 0.0000000 | 1 | 1 | winner |
| Nancy | 6 | 3 | 3.750000 | 0 | 0.0000000 | 0 | 0.2500000 | 1 | 0.2500000 | 1 | 1 | winner |
| Nancy | 7 | 3 | 3.600000 | 1 | 0.2000000 | 0 | 0.2000000 | 0 | 0.2000000 | 1 | 1 | winner |
| Nancy | 8 | 2 | 3.333333 | 0 | 0.1666667 | 0 | 0.1666667 | 1 | 0.3333333 | 1 | 1 | winner |
Basically, we have one observation from each baker for each show. Since I typed in the data myself, I won’t bother too much with cleaning it. Let’s look at the variables we’ve got here.
| key | value |
|---|---|
| baker | name of the baker |
| episode | episode we’re looking at |
| rank_in_technical | rank the baker got in the technical bake |
| mean_rank_in_technical | average rank in the technical over all episodes so far (including the current one) |
| star | indicates whether the baker was star baker in the current episode |
| mean_star | share of episodes so far in which the baker was star baker |
| top | indicates whether the baker made one of the top bakes in the current episode |
| mean_top | share of episodes so far in which the baker made one of the top bakes |
| flop | indicates whether the baker made one of the least favourite bakes in the current episode |
| mean_flop | share of episodes so far in which the baker made one of the least favourite bakes |
| final_rank | the final rank the baker achieved |
| final_group | indicates which group the baker falls into; this will be our class label |
| X | the name of the group/class |
Since we want to predict based on the most recent episode, we’ll have to create datasets of episodes rather than sets of seasons. Remember that the first season had two fewer bakers than the other ones and thus also two fewer episodes? We’ll solve this by shifting the episode numbers in this season so that its first episode becomes episode 3. Let’s look at the head of episode 3, as produced by a sketch along these lines (the `season` column is an assumption):
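```r
# Shift season 1 by two episodes so its finale lines up with the
# 10-episode seasons, then split the data into one set per episode
bakeoff <- bakeoff %>%
  mutate(episode = if_else(season == 1, episode + 2, episode))

episode_sets <- split(bakeoff, bakeoff$episode)
head(episode_sets[["3"]])
```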
| baker | episode | rank_in_technical | mean_rank_in_technical | star | mean_star | top | mean_top | flop | mean_flop | final_rank | final_group | X |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Nancy | 3 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | winner |
| Jonathan | 3 | 6 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 2 | runner_up |
| Maria | 3 | 2 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 2 | 2 | runner_up |
| Monique | 3 | 4 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 3 | top_tier |
| Brendan | 3 | 3 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 3 | top_tier |
| Julie | 3 | 7 | 7 | 0 | 0 | 1 | 1 | 0 | 0 | 6 | 4 | bottom_tier |
Remember that we didn’t include the episodes from season 3 in our datasets; we’ll use them to test our model in the end. But since we don’t have that many data points, we’ll have to use some kind of cross-validation to tune our model before we use it on the test set. We’re going to use repeated cross-validation, set up as shown below.
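In caret this is just a `trainControl` object; the fold and repeat counts here are my assumptions:

```r
library(caret)

# Repeated 10-fold cross-validation; classProbs = TRUE because we
# will want class probabilities later on
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 5,
                     classProbs = TRUE)
```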
We fit a random forest model to our data from episode 1 and get the following confusion matrix.
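A hedged sketch of the fit, assuming the episode-1 training rows (seasons 1, 2 and 4 — though season 1 has no episodes 1 and 2 after the renumbering) sit in a hypothetical data frame `train_ep1`:

```r
set.seed(42)  # for reproducibility

# The response needs to be a factor with the group names as levels
train_ep1 <- train_ep1 %>% mutate(final_group = factor(X))

# Random forest on the per-episode results and their running means
rf_ep1 <- train(final_group ~ rank_in_technical + mean_rank_in_technical +
                  star + mean_star + top + mean_top + flop + mean_flop,
                data = train_ep1, method = "rf", trControl = ctrl)

# Confusion matrix for the training bakers
confusionMatrix(predict(rf_ep1, train_ep1), train_ep1$final_group)
```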
## Reference
## Prediction bottom_tier runner_up top_tier winner
## bottom_tier 12 1 1 2
## runner_up 0 2 0 0
## top_tier 0 1 5 0
## winner 0 0 0 0
So obviously the model is a bit conservative when it comes to predicting a winner. But in my opinion this makes sense, since we’re looking at episode 1 right now. After the first episode our data will be very homogeneous: all the variables corresponding to a running mean are identical to the actual data from this week, and everyone, including the bakers in the bake off, can have a good or bad day once in a while. Plus, don’t forget that we have one third less data for episodes 1 and 2, since they’re not present in our renumbering of season 1. Overall, our model has an accuracy of 79%. It’s obviously not perfect, but it is decent enough considering the points mentioned above, and it is way better than just randomly predicting classes.
But hold on, we still want to predict a winner, right? Of course we do. Even if the model doesn’t predict one right away, we can look at the class probabilities for each of the bakers and check who has the highest probability of winning. And, spoiler alert, it turns out the model is quite good, even after just one episode.
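Under the same assumed names as above, the sorted probability table below might come from something like:

```r
# Class probabilities per baker, sorted by the probability of winning
win_probs <- predict(rf_ep1, train_ep1, type = "prob") %>%
  mutate(baker = train_ep1$baker, real_outcome = train_ep1$X) %>%
  arrange(desc(winner))
```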
| baker | bottom_tier | runner_up | top_tier | winner | real_outcome |
|---|---|---|---|---|---|
| Claudia | 0.472 | 0.102 | 0.104 | 0.322 | winner |
| Sian | 0.620 | 0.028 | 0.086 | 0.266 | winner |
| Jessica | 0.620 | 0.028 | 0.086 | 0.266 | bottom_tier |
| Brendan | 0.836 | 0.044 | 0.028 | 0.092 | bottom_tier |
| Dave | 0.316 | 0.388 | 0.228 | 0.068 | runner_up |
| Peter | 0.838 | 0.014 | 0.086 | 0.062 | bottom_tier |
| Alex | 0.838 | 0.014 | 0.086 | 0.062 | bottom_tier |
| Chris | 0.338 | 0.244 | 0.374 | 0.044 | top_tier |
| Nathan | 0.676 | 0.012 | 0.274 | 0.038 | top_tier |
| Mariana | 0.808 | 0.012 | 0.144 | 0.036 | bottom_tier |
| James | 0.342 | 0.274 | 0.352 | 0.032 | top_tier |
| Barb | 0.342 | 0.274 | 0.352 | 0.032 | runner_up |
| Emma | 0.584 | 0.148 | 0.248 | 0.020 | bottom_tier |
| Marcus | 0.796 | 0.152 | 0.034 | 0.018 | bottom_tier |
| Suzy | 0.376 | 0.564 | 0.044 | 0.016 | runner_up |
| Ben | 0.620 | 0.090 | 0.276 | 0.014 | bottom_tier |
| Raeesa | 0.336 | 0.076 | 0.580 | 0.008 | top_tier |
| Robert | 0.198 | 0.042 | 0.752 | 0.008 | top_tier |
| Meg | 0.762 | 0.098 | 0.136 | 0.004 | bottom_tier |
| Max | 0.912 | 0.002 | 0.082 | 0.004 | bottom_tier |
| Jasmin | 0.546 | 0.412 | 0.040 | 0.002 | runner_up |
| Angela | 0.094 | 0.018 | 0.886 | 0.002 | top_tier |
| Janice | 0.644 | 0.320 | 0.034 | 0.002 | bottom_tier |
| Michelle | 0.688 | 0.254 | 0.058 | 0.000 | bottom_tier |
Not too bad, right? But what if we just want to predict which bakers are going to make it to the final? For that we can simply sum the probabilities for the classes “winner” and “runner_up”. Let’s create a new column with that variable.
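A one-liner under the same assumptions as before (`win_probs` being the hypothetical probability table from the previous sketch):

```r
# Chance of reaching the finale = P(winner) + P(runner_up)
win_probs <- win_probs %>%
  mutate(gets_to_final = winner + runner_up) %>%
  arrange(desc(gets_to_final))
```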
| baker | bottom_tier | runner_up | top_tier | winner | real_outcome | gets_to_final |
|---|---|---|---|---|---|---|
| Suzy | 0.376 | 0.564 | 0.044 | 0.016 | runner_up | 0.580 |
| Dave | 0.316 | 0.388 | 0.228 | 0.068 | runner_up | 0.456 |
| Claudia | 0.472 | 0.102 | 0.104 | 0.322 | winner | 0.424 |
| Jasmin | 0.546 | 0.412 | 0.040 | 0.002 | runner_up | 0.414 |
| Janice | 0.644 | 0.320 | 0.034 | 0.002 | bottom_tier | 0.322 |
| James | 0.342 | 0.274 | 0.352 | 0.032 | top_tier | 0.306 |
| Barb | 0.342 | 0.274 | 0.352 | 0.032 | runner_up | 0.306 |
| Sian | 0.620 | 0.028 | 0.086 | 0.266 | winner | 0.294 |
| Jessica | 0.620 | 0.028 | 0.086 | 0.266 | bottom_tier | 0.294 |
| Chris | 0.338 | 0.244 | 0.374 | 0.044 | top_tier | 0.288 |
| Michelle | 0.688 | 0.254 | 0.058 | 0.000 | bottom_tier | 0.254 |
| Marcus | 0.796 | 0.152 | 0.034 | 0.018 | bottom_tier | 0.170 |
| Emma | 0.584 | 0.148 | 0.248 | 0.020 | bottom_tier | 0.168 |
| Brendan | 0.836 | 0.044 | 0.028 | 0.092 | bottom_tier | 0.136 |
| Ben | 0.620 | 0.090 | 0.276 | 0.014 | bottom_tier | 0.104 |
| Meg | 0.762 | 0.098 | 0.136 | 0.004 | bottom_tier | 0.102 |
| Raeesa | 0.336 | 0.076 | 0.580 | 0.008 | top_tier | 0.084 |
| Peter | 0.838 | 0.014 | 0.086 | 0.062 | bottom_tier | 0.076 |
| Alex | 0.838 | 0.014 | 0.086 | 0.062 | bottom_tier | 0.076 |
| Nathan | 0.676 | 0.012 | 0.274 | 0.038 | top_tier | 0.050 |
| Robert | 0.198 | 0.042 | 0.752 | 0.008 | top_tier | 0.050 |
| Mariana | 0.808 | 0.012 | 0.144 | 0.036 | bottom_tier | 0.048 |
| Angela | 0.094 | 0.018 | 0.886 | 0.002 | top_tier | 0.020 |
| Max | 0.912 | 0.002 | 0.082 | 0.004 | bottom_tier | 0.006 |
Tadaa! The model predicts 4 of the 6 finalists correctly after just the first episode.
So let’s look at how our model performs on the test data from season 3.
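Assuming the season 3 episode-1 rows live in a hypothetical `test_ep1` data frame, the check might look like this:

```r
# Evaluate the episode-1 model on the held-out season 3 bakers
test_ep1 <- test_ep1 %>% mutate(final_group = factor(X))
confusionMatrix(predict(rf_ep1, test_ep1), test_ep1$final_group)
```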
## Reference
## Prediction bottom_tier runner_up top_tier winner
## bottom_tier 4 1 3 1
## runner_up 0 0 0 0
## top_tier 2 1 0 0
## winner 0 0 0 0
Accuracy is down to 33%. That’s unfortunate. But still, let’s do the same exercise as before and predict the winner by looking at who has the highest probability of winning.
| baker | bottom_tier | runner_up | top_tier | winner | real_outcome |
|---|---|---|---|---|---|
| Olivia | 0.472 | 0.102 | 0.104 | 0.322 | winner |
| Diana | 0.620 | 0.028 | 0.086 | 0.266 | bottom_tier |
| Monica | 0.136 | 0.228 | 0.596 | 0.040 | runner_up |
| Antonio | 0.676 | 0.012 | 0.274 | 0.038 | runner_up |
| Bojan | 0.808 | 0.012 | 0.144 | 0.036 | bottom_tier |
| Janette | 0.342 | 0.274 | 0.352 | 0.032 | bottom_tier |
| Fiona | 0.792 | 0.170 | 0.010 | 0.028 | top_tier |
| Liesel | 0.584 | 0.148 | 0.248 | 0.020 | top_tier |
| James | 0.552 | 0.152 | 0.280 | 0.016 | top_tier |
| Jeremy | 0.336 | 0.076 | 0.580 | 0.008 | bottom_tier |
| Cheryl | 0.762 | 0.098 | 0.136 | 0.004 | bottom_tier |
| Noel | 0.688 | 0.254 | 0.058 | 0.000 | bottom_tier |
That looks way better. Even if the model doesn’t get the final classifications exactly right, it predicts that Olivia, the real winner of this season, has the highest probability of winning.
So let’s also look at the bakers with the highest probability of making it to the finale.

| baker | bottom_tier | runner_up | top_tier | winner | real_outcome | gets_to_final |
|---|---|---|---|---|---|---|
| Olivia | 0.472 | 0.102 | 0.104 | 0.322 | winner | 0.424 |
| Janette | 0.342 | 0.274 | 0.352 | 0.032 | bottom_tier | 0.306 |
| Diana | 0.620 | 0.028 | 0.086 | 0.266 | bottom_tier | 0.294 |
| Monica | 0.136 | 0.228 | 0.596 | 0.040 | runner_up | 0.268 |
| Noel | 0.688 | 0.254 | 0.058 | 0.000 | bottom_tier | 0.254 |
| Fiona | 0.792 | 0.170 | 0.010 | 0.028 | top_tier | 0.198 |
| Liesel | 0.584 | 0.148 | 0.248 | 0.020 | top_tier | 0.168 |
| James | 0.552 | 0.152 | 0.280 | 0.016 | top_tier | 0.168 |
| Cheryl | 0.762 | 0.098 | 0.136 | 0.004 | bottom_tier | 0.102 |
| Jeremy | 0.336 | 0.076 | 0.580 | 0.008 | bottom_tier | 0.084 |
| Antonio | 0.676 | 0.012 | 0.274 | 0.038 | runner_up | 0.050 |
| Bojan | 0.808 | 0.012 | 0.144 | 0.036 | bottom_tier | 0.048 |
That could obviously be better: it only predicts 1 of the 3 finalists correctly. But we’ll take a look at how the predictions change over the course of the season.
As mentioned before, we’ll now retrain the model including the third season as well. This will (hopefully) give us an even better model for the upcoming fifth season.
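Under the same assumed names as before, that retraining is just the combined data plus another call to `train`:

```r
# Pool all four seasons and retrain for the season 5 predictions
all_ep1 <- bind_rows(train_ep1, test_ep1)

set.seed(42)
rf_full <- train(final_group ~ rank_in_technical + mean_rank_in_technical +
                   star + mean_star + top + mean_top + flop + mean_flop,
                 data = all_ep1, method = "rf", trControl = ctrl)
confusionMatrix(predict(rf_full, all_ep1), all_ep1$final_group)
```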
## Reference
## Prediction bottom_tier runner_up top_tier winner
## bottom_tier 18 1 3 1
## runner_up 0 5 1 0
## top_tier 0 0 5 0
## winner 0 0 0 2
As hoped, accuracy has gone up: we are now at 83%. Let’s take a quick look at the predicted winners and finalists before we let go of episode 1 and turn to the next ones.
| baker | bottom_tier | runner_up | top_tier | winner | real_outcome |
|---|---|---|---|---|---|
| Claudia | 0.046 | 0.064 | 0.004 | 0.886 | winner |
| Olivia | 0.046 | 0.064 | 0.004 | 0.886 | winner |
| Sian | 0.698 | 0.000 | 0.000 | 0.302 | winner |
| Jessica | 0.698 | 0.000 | 0.000 | 0.302 | bottom_tier |
| Diana | 0.698 | 0.000 | 0.000 | 0.302 | bottom_tier |
| Brendan | 0.762 | 0.016 | 0.000 | 0.222 | bottom_tier |
| Dave | 0.022 | 0.812 | 0.122 | 0.044 | runner_up |
| Chris | 0.094 | 0.184 | 0.702 | 0.020 | top_tier |
| Monica | 0.006 | 0.810 | 0.164 | 0.020 | runner_up |
| Suzy | 0.056 | 0.860 | 0.068 | 0.016 | runner_up |
| Peter | 0.940 | 0.016 | 0.036 | 0.008 | bottom_tier |
| Alex | 0.940 | 0.016 | 0.036 | 0.008 | bottom_tier |
| Fiona | 0.332 | 0.018 | 0.644 | 0.006 | top_tier |
| Jasmin | 0.250 | 0.634 | 0.116 | 0.000 | runner_up |
| Angela | 0.152 | 0.020 | 0.828 | 0.000 | top_tier |
| James | 0.384 | 0.314 | 0.302 | 0.000 | top_tier |
| Nathan | 0.154 | 0.434 | 0.412 | 0.000 | top_tier |
| Ben | 0.850 | 0.000 | 0.150 | 0.000 | bottom_tier |
| Janice | 0.730 | 0.220 | 0.050 | 0.000 | bottom_tier |
| Meg | 0.906 | 0.018 | 0.076 | 0.000 | bottom_tier |
| Mariana | 0.908 | 0.054 | 0.038 | 0.000 | bottom_tier |
| Barb | 0.384 | 0.314 | 0.302 | 0.000 | runner_up |
| Raeesa | 0.570 | 0.000 | 0.430 | 0.000 | top_tier |
| Robert | 0.036 | 0.060 | 0.904 | 0.000 | top_tier |
| Marcus | 0.794 | 0.028 | 0.178 | 0.000 | bottom_tier |
| Michelle | 0.938 | 0.056 | 0.006 | 0.000 | bottom_tier |
| Emma | 0.512 | 0.004 | 0.484 | 0.000 | bottom_tier |
| Max | 0.794 | 0.094 | 0.112 | 0.000 | bottom_tier |
| Antonio | 0.154 | 0.434 | 0.412 | 0.000 | runner_up |
| Liesel | 0.512 | 0.004 | 0.484 | 0.000 | top_tier |
| James | 0.158 | 0.084 | 0.758 | 0.000 | top_tier |
| Noel | 0.938 | 0.056 | 0.006 | 0.000 | bottom_tier |
| Jeremy | 0.570 | 0.000 | 0.430 | 0.000 | bottom_tier |
| Bojan | 0.908 | 0.054 | 0.038 | 0.000 | bottom_tier |
| Cheryl | 0.906 | 0.018 | 0.076 | 0.000 | bottom_tier |
| Janette | 0.384 | 0.314 | 0.302 | 0.000 | bottom_tier |
And the top of the same table again, now sorted by the probability of reaching the finale:

| baker | bottom_tier | runner_up | top_tier | winner | real_outcome | gets_to_final |
|---|---|---|---|---|---|---|
| Claudia | 0.046 | 0.064 | 0.004 | 0.886 | winner | 0.950 |
| Olivia | 0.046 | 0.064 | 0.004 | 0.886 | winner | 0.950 |
| Suzy | 0.056 | 0.860 | 0.068 | 0.016 | runner_up | 0.876 |
| Dave | 0.022 | 0.812 | 0.122 | 0.044 | runner_up | 0.856 |
| Monica | 0.006 | 0.810 | 0.164 | 0.020 | runner_up | 0.830 |
| Jasmin | 0.250 | 0.634 | 0.116 | 0.000 | runner_up | 0.634 |
| Nathan | 0.154 | 0.434 | 0.412 | 0.000 | top_tier | 0.434 |
| Antonio | 0.154 | 0.434 | 0.412 | 0.000 | runner_up | 0.434 |
| James | 0.384 | 0.314 | 0.302 | 0.000 | top_tier | 0.314 |
Predicting 7 of the 9 finalists correctly is not too bad in my opinion. We’ll check how this number improves over the episodes once we have models for all of them.
This is the end of part one, since otherwise this post would be waaay too long. But stay tuned for the upcoming posts about the next episodes and, of course, the validation of the models in real life once this year’s season is running. Part 2 can be found here.
All the raw data and the code can be found in this GitHub repo.