Great British Bake Off
By Libby Mohr
November 17, 2021
As someone who doesn’t care much for cake or reality television, I was admittedly skeptical when friends raved about the Great British Bake Off. When I watched it, though, I found it hard not to be taken by the quirky British humor, the camaraderie among bakers, and the shared struggle to make some complicated baked good in some ridiculously short time period. Needless to say, I was delighted when I stumbled upon the bakeoff package, which enables easy access to some of the show’s data from R.
The first question I sought to answer was:
What are the most commonly used flavorings in the signature and showstopper bakes?
To answer this question, I determined which words appeared most commonly in the titles of signature and showstopper bakes from all 10 seasons of the show. I then excluded words that didn’t pertain to flavor: common words like “a” and “on”, and words that described the bake itself like “cake” and “pie”. Finally, I divided by the total number of signature and showstopper bakes in all seasons (1,391 in total!) to get the percentage of bakes that contained a given word in their title. Here’s what I found:
Unsurprisingly, the most popular ingredient was CHOCOLATE.
Mmmmmm…..
It’s worth noting that the text-mining method I used is imperfect. For example, a “black forest” cake contains both cherry and chocolate, so if the title of the bake contained “black forest” rather than the words “chocolate” and “cherry”, then this bake would not count towards the total count of bakes flavored with chocolate.
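The counting approach described above, and the limitation it creates, can be sketched roughly as follows. This is a minimal Python illustration rather than the R code behind the post: the titles, the exclusion list, and the `flavor_shares` helper are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical bake titles (the real analysis covered 1,391 bakes).
titles = [
    "Chocolate Orange Cake",
    "Black Forest Gateau",
    "Lemon and Raspberry Pie",
    "Chocolate and Cherry Tart",
]

# Illustrative exclusion list: common words and words naming the bake itself.
# This is an assumption, not the list actually used in the post.
NON_FLAVOR_WORDS = {"a", "on", "and", "cake", "pie", "tart", "gateau"}


def flavor_shares(titles, excluded):
    """Return, for each remaining word, the share of titles containing it."""
    counts = Counter()
    for title in titles:
        # Use a set so each title counts at most once per word.
        words = set(re.findall(r"[a-z]+", title.lower()))
        counts.update(words - excluded)
    return {word: n / len(titles) for word, n in counts.items()}


shares = flavor_shares(titles, NON_FLAVOR_WORDS)
for word, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{word}: {share:.0%}")
```

In this toy data the "Black Forest Gateau" contributes to "black" and "forest" but not to "chocolate" or "cherry", which is exactly the kind of undercount described above.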
Nonetheless, the results made me curious to know:
Does using each of the popular flavorings hurt or help the bakers?
We can think about this question in two ways. I’ll use chocolate as an example to simplify things, but the same questions could be asked for each ingredient:
- Is the risk of getting eliminated lower if the baker uses chocolate in either their signature or showstopper bake?
- Is the chance of getting star baker higher if the baker uses chocolate in either of their bakes?
To visualize the answers to question #1, I calculated a “risk ratio”. In short, I first calculated the risk of being eliminated when using chocolate as the total number of times that a baker used chocolate and was eliminated divided by the total number of times a baker used chocolate in an episode, regardless of elimination status. I then divided this risk by the risk of getting eliminated when not using chocolate, calculated in a similar fashion. Values greater than 1 indicate that a baker is more likely to be eliminated if they use chocolate, whereas values less than 1 indicate that they are less likely to be eliminated if they used chocolate.
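As a rough sketch of the risk ratio calculation just described, here is a small Python example. The counts are invented for illustration, and the record layout and function names are assumptions rather than anything taken from the bakeoff package.

```python
def elimination_risk(records, used_chocolate):
    """Share of baker-episodes in one group that ended in elimination."""
    group = [r for r in records if r["chocolate"] == used_chocolate]
    return sum(r["eliminated"] for r in group) / len(group)


def risk_ratio(records):
    """Risk of elimination with chocolate divided by the risk without it."""
    return elimination_risk(records, True) / elimination_risk(records, False)


# Hypothetical data: one record per baker per episode.
records = (
    [{"chocolate": True, "eliminated": True}] * 2
    + [{"chocolate": True, "eliminated": False}] * 18
    + [{"chocolate": False, "eliminated": True}] * 10
    + [{"chocolate": False, "eliminated": False}] * 70
)

# Risk with chocolate is 2/20, risk without is 10/80.
print(round(risk_ratio(records), 3))
```

With these made-up numbers the ratio comes out below 1, which would mean bakers using chocolate were eliminated less often than those who did not.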
Similarly, to address question #2, I calculated a “reward ratio”. In this case, values greater than 1 indicate that the chance of getting star baker is higher if a baker uses chocolate, whereas values less than 1 indicate that the chance of getting star baker is lower. Here’s what I found for all the ingredients.
We can interpret these results by placing flavors into one of four categories:
Relative risk of elimination | Relative chance of star baker | Takeaway | Flavors |
---|---|---|---|
Lower | Higher | Use if you want to win | None! |
Higher | Higher | Use if you’re willing to take a risk to get star baker | Orange, apple, caramel, walnut |
Lower | Lower | Use if you want to play it safe | Chocolate, lemon, raspberry, almond, pistachio, cherry |
Higher | Lower | Don’t use if you want to win | Ginger, fruit, cheese |
Before we go too far interpreting these findings though, it’s worth asking:
How likely is it that these results occurred by chance?
In other words, how certain can we be that we would find the same thing if we ran another 10 seasons of the show? To visualize the answer to this question without actually running 10 more seasons of the Great British Bake Off, I “re-sampled” the data by randomly choosing 704 rows to create a simulated dataset the same size as the original one. This dataset ends up being different from the original one because rows are drawn with replacement: every time a row is randomly selected, it goes back into the mix of rows that can be chosen, so some rows appear more than once and others not at all. I performed this re-sampling procedure several times in order to produce several simulated datasets. Finally, I calculated the risk and reward ratios for each simulated dataset. Here’s an animation that loops through results for each simulated dataset:
In case that makes your head spin, here’s a static plot showing 100 simulated outcomes as points, along with bars representing the ratios calculated from the original dataset.
These plots demonstrate that there’s a fair bit of uncertainty surrounding our findings. Still, there is relatively strong visual evidence that fruit, orange, and walnut are likely to increase a baker’s chance of being eliminated relative to not using those flavors, whereas almond, pistachio, and lemon appear to decrease a baker’s chance of being eliminated. On the star baker side of things, lemon and cherry seem to decrease a baker’s chance of winning star baker, whereas caramel may increase a baker’s likelihood of winning star baker (though the evidence is rather weak). When it comes to the most popular ingredient (chocolate), there’s not strong evidence that using it either hurts or helps bakers.
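The re-sampling procedure described above is commonly called a bootstrap, and it can be sketched in a few lines of Python. The 704 rows below are invented counts chosen only to match the dataset size mentioned earlier, and the function names are hypothetical; this is not the code that produced the plots.

```python
import math
import random


def risk_ratio(rows):
    """Risk of elimination with the flavor divided by the risk without it.

    Each row is a (used_flavor, eliminated) pair. Returns NaN when a
    resample makes the ratio undefined (an empty group or a zero risk
    in the denominator).
    """
    with_flavor = [elim for used, elim in rows if used]
    without_flavor = [elim for used, elim in rows if not used]
    if not with_flavor or not without_flavor or sum(without_flavor) == 0:
        return math.nan
    risk_with = sum(with_flavor) / len(with_flavor)
    risk_without = sum(without_flavor) / len(without_flavor)
    return risk_with / risk_without


def bootstrap_ratios(rows, n_resamples=100, seed=0):
    """Recompute the ratio on datasets drawn with replacement from rows."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_resamples):
        # choices() samples with replacement, so rows can repeat.
        sample = rng.choices(rows, k=len(rows))
        ratios.append(risk_ratio(sample))
    return ratios


# Hypothetical data: 704 baker-episode rows of (used_flavor, eliminated).
rows = (
    [(True, True)] * 10
    + [(True, False)] * 90
    + [(False, True)] * 80
    + [(False, False)] * 524
)

ratios = sorted(r for r in bootstrap_ratios(rows) if not math.isnan(r))
print("observed ratio:", round(risk_ratio(rows), 2))
print("range across resamples:", round(ratios[0], 2), "to", round(ratios[-1], 2))
```

The wider the spread of the resampled ratios around 1, the less confident we can be that a flavor truly helps or hurts.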
In the end, there’s still lots more to explore. How does a baker’s performance in the technical challenge affect all of this? Is the risk of using certain flavors different when Mary judges compared to when Prue judges? For now, I’ll have to leave those questions for another day.