The Problem

I’m a big fan of Chipotle. They do as good a job as any restaurant in balancing cost, flavor, quantity, speed of service and healthy ingredients (minus the occassional salmonella outbreak).

There are articles floating around the internet on how to maximize the weight of your burrito. The table below shows the weight of a control burrito compared to the maximum burrito. Note that they’re comparing the maximum to the control, rather than a comparison of averages under a more standard ANOVA.

In order to get a bigger burrito, one recommendation in the article is to order 50/50 on all ingredients. So, for example, rather than order white rice OR brown rice, ask for half white and half brown.

Presumably the burrito-ista divides each portion in half. And yet the maximum for the 50/50 split of beans and rice are larger on a per ingredient basis. This is a bit of a paradox. So how are we to resolve it? How would we model this statistically?

Modeling The Assembly Line

Let’s start with a simple model of the burrito assembly line. We’ll focus just on a simple burrito composed of only beans and rice.

We can assume that each server is homogenous and serves a normally distributed amount of beans and rice with a small variance. The numbers are made up, but we can assume that they are instructed to serve 3 oz of each ingredient with a standard deviation of .5.

set.seed(1)
#original
beans_weight <- rnorm(30, 3, .5) #.5 is sd, not var
rice_weight <- rnorm(30, 3, .5)
burrito_weight = beans_weight + rice_weight
hist(burrito_weight, breaks = 10)

Recall that the sum of two normal variables Z is distributed \(Z \sim N(\mu_{x} + \mu_{y}, \sigma^2_{x} + \sigma^2_{x})\). It follows that our mean burrito should weight 6 oz with a variance of .5.

Let’s construct a 95% CI for the mean:

mean(burrito_weight) - 2 * sd(burrito_weight)/sqrt(30); mean(burrito_weight) + 2 * sd(burrito_weight)/sqrt(30)
## [1] 5.879722
## [1] 6.335511

And a 95% CI for the variance:

(30-1) * var(burrito_weight) / qchisq(.975,29); (30-1) * var(burrito_weight) / qchisq(.025,29)
## [1] 0.2470581
## [1] 0.7039329

Both check out.

Modeling a 50/50 Ingredient Split

When instructed to do a 50/50 split, the average amount of each serving is divided by 2. So the mean serving of each portion is 1.5 oz. We’ll also assume that the standard deviation is also still 0.5 oz.

So how are we able to explain a higher maximum weight while maintaining consistency with the intent of the server (ie. the model parameters)? To do this, we will adjust the underlying distribution of the variance.

The standard deviation or scale of the distribution will have its own exponential distribution with an inverse scale parameter of 1/.5 or 2. Adopting a long-tailed distribution for the variance (a common practice in stochastic modeling) can explain why the maximum burrito weight is much more without changing the underlying parameters.

#combo
beans_combo <- rnorm(30, 1.5, rexp(30, 1/.5)) + rnorm(30, 1.5, rexp(30,1/.5))  
rice_combo  <- rnorm(30, 1.5, rexp(30, 1/.5)) + rnorm(30, 1.5, rexp(30,1/.5))  
burrito_combo = beans_combo + rice_combo
hist(burrito_combo, breaks = 10)

We can see that there is a wider overall spread of burrito weights. As the sample statsitics below indicate, the underlying averages have not changed.

mean(burrito_weight);mean(burrito_combo)
## [1] 6.107616
## [1] 6.156224
sd(burrito_weight); sd(burrito_combo)
## [1] 0.6241149
## [1] 1.651982