We want to estimate how many employees will choose each kind of car based on its features, so we fit a multinomial conditional logit model:

mlogit
## 
## Call:
## mclogit(formula = cbind(choice_indicator, id) ~ const + size + 
##     price, data = data)
## 
##       Estimate Std. Error z value Pr(>|z|)    
## const -4.33763    0.70993  -6.110 9.97e-10 ***
## size   0.69190    0.07359   9.402  < 2e-16 ***
## price -2.27414    0.13447 -16.912  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Null Deviance:     2282 
## Residual Deviance: 1274 
## Number of Fisher Scoring iterations:  6 
## Number of observations:  823
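library(magrittr)  # provides the %>% pipe

# Distinct car sizes, in the order they first appear in the data;
# the indices used below rely on that order
sizes <- data$size %>% unique()
# Point estimates: one row per term, column 1 is "Estimate"
cons_coef <- mlogit$coefficients[1,1]
size_coef <- mlogit$coefficients[2,1]
price_coef <- mlogit$coefficients[3,1]

# exp(utility) of each car: the small car costs 1, the big and medium cars cost 2
beta_opt_small <- exp(cons_coef + size_coef*sizes[[3]] + price_coef * 1)
beta_opt_big <- exp(cons_coef + size_coef*sizes[[2]] + price_coef * 2)
beta_opt_medium <- exp(cons_coef + size_coef*sizes[[1]] + price_coef * 2)

# The leading 1 is exp(0): the option of choosing none of the cars
denom <- 1 + beta_opt_small + beta_opt_medium + beta_opt_big

# Expected counts out of 1000 employees
num_of_small <- round(beta_opt_small/denom * 1000)
num_of_medium <- round(beta_opt_medium/denom * 1000)
num_of_big <- round(beta_opt_big/denom * 1000)
num_of_none <- 1000 - num_of_small - num_of_medium - num_of_big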

So out of 1000 employees, 364 are expected to choose the smallest car, 178 the medium car, and 198 the biggest car, which leaves 260 who choose none of them.
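
For reference, the calculation above can be written as the usual conditional logit choice probabilities. This is a sketch in notation that is not in the original: $V_j$ is assumed to denote the estimated utility of car $j$, $P_j$ its predicted share, and the sum runs over the three cars (small, medium, big). The 1 in the denominator corresponds to the option of choosing none, whose utility is normalized to zero.

```latex
V_j = \beta_{\text{const}} + \beta_{\text{size}}\,\text{size}_j + \beta_{\text{price}}\,\text{price}_j,
\qquad
P_j = \frac{\exp(V_j)}{1 + \sum_{k} \exp(V_k)},
\qquad
P_{\text{none}} = \frac{1}{1 + \sum_{k} \exp(V_k)}
```

The expected number of employees choosing car $j$ is then $1000 \cdot P_j$, rounded.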

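But the logit model imposes a restrictive substitution pattern. If we remove the medium car (which costs the same as the biggest one), we would want the model to predict that most of its customers switch to the big car, since that is the closest substitute. The logit model doesn’t allow that: it redistributes them over all remaining options in proportion to their current shares (the independence of irrelevant alternatives property). Let’s have a look: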

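# Drop the medium car: its term leaves the denominator, the numerators stay the same
denom2 <- 1 + beta_opt_small + beta_opt_big

# Expected counts out of 1000 employees without the medium car
num_of_small2 <- round(beta_opt_small/denom2 * 1000)
num_of_big2 <- round(beta_opt_big/denom2 * 1000)
num_of_none2 <- 1000 - num_of_small2 - num_of_big2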

Now we can see that the small car gains 78 employees while the big car gains only 43, even though the big car is the closer substitute for the medium one. That is not a desirable feature of the model.
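
The proportional reallocation can be checked directly with a small self-contained sketch. The coefficients are copied from the printed summary; the car sizes are hypothetical placeholders (the real values live in `data` and are not shown here), and `logit_shares` is an illustrative helper, not part of `mclogit`.

```r
# Coefficients copied from the printed model summary.
coefs <- c(const = -4.33763, size = 0.69190, price = -2.27414)

# Hypothetical car attributes: the sizes are made-up placeholders.
cars <- data.frame(
  name  = c("small", "medium", "big"),
  size  = c(10, 12, 12.5),
  price = c(1, 2, 2),
  stringsAsFactors = FALSE
)

# Predicted shares for a set of cars plus a "none" option with utility 0.
logit_shares <- function(cars, coefs) {
  expv <- exp(coefs[["const"]] +
              coefs[["size"]] * cars$size +
              coefs[["price"]] * cars$price)
  denom <- 1 + sum(expv)
  c(setNames(expv / denom, cars$name), none = 1 / denom)
}

full    <- logit_shares(cars, coefs)
reduced <- logit_shares(cars[cars$name != "medium", ], coefs)

# The ratio between two surviving options does not depend on the medium car.
full[["small"]] / full[["big"]]
reduced[["small"]] / reduced[["big"]]

# Every surviving share is scaled by the same factor, 1 / (1 - share of medium).
reduced / full[names(reduced)]
1 / (1 - full[["medium"]])
```

Because all surviving shares grow by the same factor, whichever option was already most popular picks up the most former medium-car customers, regardless of how similar it is to the removed car.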