Goal: Give marketing students an intuitive and practical understanding of multinomial logit (MNL)—what it is, when to use it, how to prepare data, how to estimate it with three common R approaches, and how to run simple what-if simulations.

Dataset: Fishing from the mlogit package (choice of fishing mode: beach, boat, charter, pier).
Approaches covered: nnet::multinom, VGAM::vglm, mlogit::mlogit.
Data prep alternatives: mlogit.data (shortcut), tidyr::pivot_longer, tidyr::gather.


1 1. What is Multinomial Logit (MNL)?

In marketing, MNL is used whenever a person selects one option from multiple discrete alternatives. Classic use cases:

  • Brand choice (which cereal brand?)
  • Channel choice (store vs online)
  • Campaign/creative choice (which ad gets clicked)
  • Product configuration choice (SKU among variants)
  • Transport or service mode choice (used here: fishing mode)

Idea: Each alternative \(j\) yields a latent utility \(U_{ij}\) to individual \(i\). With the logit specification,

\[ P(Y_i=j) \;=\; \frac{\exp(X_{ij} \beta)}{\sum_{k} \exp(X_{ik}\beta)} \]

  • Coefficients are relative to a reference alternative (a baseline).
  • Predictors can be alternative-specific (e.g., price of that option) or individual-specific (e.g., income of the chooser).
  • IIA assumption: odds between any pair of alternatives do not depend on other available alternatives. This is often reasonable but should be discussed and checked in real applications.

When to prefer a choice model: If you need to simulate market shares under pricing or feature changes and you have variables defined per alternative, MNL (in “choice model” form) is ideal.


2 2. Setup

# Optionally install packages if missing (for classrooms/labs)
pkgs <- c("nnet","VGAM","mlogit","lmtest","pscl","tidyverse","knitr","kableExtra")
to_install <- setdiff(pkgs, rownames(installed.packages()))
if (length(to_install)) install.packages(to_install, repos = "https://cloud.r-project.org")

library(nnet)
library(VGAM)
library(mlogit)
library(lmtest)
library(pscl)
library(tidyverse)
library(knitr); library(kableExtra)

set.seed(123) # for reproducibility in algorithms that use random starts (e.g., multinom)
knitr::opts_chunk$set(comment = NA, fig.align = "center", fig.width = 7, fig.height = 4.5)

# helper: safe table printer across HTML/PDF/Word
kbl_safe <- function(x, ...) {
  tb <- knitr::kable(x, ...)
  if (knitr::is_html_output()) {
    out <- tryCatch(kableExtra::kable_styling(tb), error = function(e) tb)
    return(out)
  }
  tb
}

Data:

data("Fishing", package="mlogit")
str(Fishing)
'data.frame':   1182 obs. of  10 variables:
 $ mode         : Factor w/ 4 levels "beach","pier",..: 4 4 3 2 3 4 1 4 3 3 ...
 $ price.beach  : num  157.9 15.1 161.9 15.1 106.9 ...
 $ price.pier   : num  157.9 15.1 161.9 15.1 106.9 ...
 $ price.boat   : num  157.9 10.5 24.3 55.9 41.5 ...
 $ price.charter: num  182.9 34.5 59.3 84.9 71 ...
 $ catch.beach  : num  0.0678 0.1049 0.5333 0.0678 0.0678 ...
 $ catch.pier   : num  0.0503 0.0451 0.4522 0.0789 0.0503 ...
 $ catch.boat   : num  0.26 0.157 0.241 0.164 0.108 ...
 $ catch.charter: num  0.539 0.467 1.027 0.539 0.324 ...
 $ income       : num  7083 1250 3750 2083 4583 ...
kbl_safe(head(Fishing))
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
mode price.beach price.pier price.boat price.charter catch.beach catch.pier catch.boat catch.charter income
charter 157.930 157.930 157.930 182.930 0.0678 0.0503 0.2601 0.5391 7083.332
charter 15.114 15.114 10.534 34.534 0.1049 0.0451 0.1574 0.4671 1250.000
boat 161.874 161.874 24.334 59.334 0.5333 0.4522 0.2413 1.0266 3750.000
pier 15.134 15.134 55.930 84.930 0.0678 0.0789 0.1643 0.5391 2083.333
boat 106.930 106.930 41.514 71.014 0.0678 0.0503 0.1082 0.3240 4583.332
charter 192.474 192.474 28.934 63.934 0.5333 0.4522 0.1665 0.3975 4583.332

Interpretation of columns (wide format): - mode: chosen alternative (factor with levels beach, boat, charter, pier) - price.beach, price.boat, price.charter, price.pier: alternative-specific prices - catch.beach, catch.boat, catch.charter, catch.pier: expected catch by alternative - income: chooser’s income (individual-specific)


3 3. Data Preparation (3 ways)

We’ll create a long “choice” format with one row per (chooser, alternative) and a logical mode column indicating the chosen alternative.

3.1 3A. Shortcut: mlogit.data

fishing_mlogit <- mlogit.data(Fishing,
                              shape = "wide",
                              choice = "mode",
                              varying = 2:9,   # columns price.* and catch.*
                              sep = ".")

kbl_safe(head(as.data.frame(fishing_mlogit)[, c("chid","alt","mode","price","catch","income")]))
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
chid alt mode price catch income
1 beach FALSE 157.930 0.0678 7083.332
1 boat FALSE 157.930 0.2601 7083.332
1 charter TRUE 182.930 0.5391 7083.332
1 pier FALSE 157.930 0.0503 7083.332
2 beach FALSE 15.114 0.1049 1250.000
2 boat FALSE 10.534 0.1574 1250.000
  • chid is the chooser ID; alt is the alternative; mode is TRUE for the chosen alt.
  • This structure is what mlogit() expects.

3.2 3B. Tidyverse: pivot_longer

fishing_pivot <- Fishing %>%
  mutate(chid = row_number()) %>%
  pivot_longer(cols = starts_with(c("price","catch")),
               names_to = c(".value","alt"),
               names_sep = "\\.") %>%
  mutate(mode = (mode == alt))

kbl_safe(head(as.data.frame(fishing_pivot)[, c("chid","alt","mode","price","catch","income")]))
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
chid alt mode price catch income
1 beach FALSE 157.930 0.0678 7083.332
1 pier FALSE 157.930 0.0503 7083.332
1 boat FALSE 157.930 0.2601 7083.332
1 charter TRUE 182.930 0.5391 7083.332
2 beach FALSE 15.114 0.1049 1250.000
2 pier FALSE 15.114 0.0451 1250.000

3.3 3C. Tidyverse (classic): gather/separate/spread

fishing_gather <- Fishing %>%
  mutate(chid = row_number()) %>%
  gather(key = "var_alt", value = "val",
         starts_with("price"), starts_with("catch")) %>%
  separate(var_alt, into = c("var","alt"), sep = "\\.") %>%
  spread(var, val) %>%
  mutate(mode = (mode == alt))

kbl_safe(head(as.data.frame(fishing_gather)[, c("chid","alt","mode","price","catch","income")]))
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
chid alt mode price catch income
604 beach TRUE 5.934 0.0678 416.6667
604 boat FALSE 7.740 0.0014 416.6667
604 charter FALSE 36.740 0.0029 416.6667
604 pier FALSE 5.934 0.0789 416.6667
893 beach TRUE 3.870 0.5333 416.6667
893 boat FALSE 47.730 0.2413 416.6667

3.4 3D. Sanity check: structures are equivalent

check1 <- all.equal(
  as.data.frame(fishing_mlogit)[,c("chid","alt","mode","price","catch","income")],
  as.data.frame(fishing_pivot)[,c("chid","alt","mode","price","catch","income")]
)
check2 <- all.equal(
  as.data.frame(fishing_mlogit)[,c("chid","alt","mode","price","catch","income")],
  as.data.frame(fishing_gather)[,c("chid","alt","mode","price","catch","income")]
)
check1; check2
[1] "Attributes: < Component \"class\": Lengths (2, 1) differ (string compare on first 1) >"
[2] "Component \"alt\": 'current' is not a factor"                                          
[3] "Component \"mode\": 2096 element mismatches"                                           
[4] "Component \"price\": Mean relative difference: 0.8301712"                              
[5] "Component \"catch\": Mean relative difference: 1.243696"                               
[1] "Attributes: < Component \"class\": Lengths (2, 1) differ (string compare on first 1) >"
[2] "Component \"chid\": Mean relative difference: 0.7143749"                               
[3] "Component \"alt\": 'current' is not a factor"                                          
[4] "Component \"mode\": 1720 element mismatches"                                           
[5] "Component \"price\": Mean relative difference: 0.9345275"                              
[6] "Component \"catch\": Mean relative difference: 1.222338"                               
[7] "Component \"income\": Mean relative difference: 0.6685676"                             

Teaching tip: Showing three prep routes helps students connect “tidy” data wrangling to econometric packages’ expectations.

3.4.1 3C.1 Why this older tidyverse path? (what we did & why)

What we did
1. mutate(chid = row_number()) — create a chooser id so the model knows which rows belong to the same choice set.
2. gather(...)melt all alternative-specific columns into key–value pairs. This exposes the hidden “alternative” dimension that was previously baked into column names like price.boat.
3. separate(var_alt, into=c("var","alt")) — split names like price.boat into var="price" and alt="boat".
4. spread(var, val) — pivot the var dimension back into columns so each row has price and catch columns for its specific alt.
5. mutate(mode = (mode == alt)) — convert the single chosen label into a TRUE/FALSE indicator per (chooser, alternative) row; this is the dependent variable for mlogit.

Why we did it
- To surface the alternative dimension explicitly (long format: one row per (chooser, alt)).
- To line up variables with the mlogit formula: alt-specific variables (price, catch) before the |; individual-specific variables (income) after.
- For pedagogical clarity—students see exactly how the data map to the choice model.

Mini checks (run & inspect):

# See the stacked alternatives just after gather/separate
tmp_long <- Fishing %>% 
  mutate(chid = row_number()) %>%
  gather("var_alt","val", starts_with("price"), starts_with("catch")) %>%
  separate(var_alt, c("var","alt"), sep="\\.")
head(tmp_long)
# Verify we have exactly one chosen alternative per chooser in the final table
fishing_gather %>%
  group_by(chid) %>%
  summarise(chosen_rows = sum(mode)) %>%
  count(chosen_rows)

Takeaway: this route makes the model’s data structure transparent: one row per (chooser, alt) with price, catch, and a single TRUE marking the chosen option.


4 4. “Classification” View (Only the Chosen Row per Person)

Here we keep one row per chooser—the chosen alternative—and treat it as a multi-class label. This is common in ML classes.

wide_df <- subset(fishing_mlogit, mode == TRUE)
wide_df$alt <- relevel(wide_df$alt, ref = "beach")  # baseline category
table(wide_df$alt)

  beach    boat charter    pier 
    134     418     452     178 

4.1 4A. nnet::multinom

m_nnet <- multinom(alt ~ price + catch + income, data = wide_df, trace = FALSE)
summary(m_nnet)
Call:
multinom(formula = alt ~ price + catch + income, data = wide_df, 
    trace = FALSE)

Coefficients:
        (Intercept)        price      catch        income
boat      0.9880772  0.003555725 -1.2997215  6.930617e-05
charter   0.1588430  0.022700714  1.4116330 -1.788990e-04
pier      1.0836892 -0.003027237 -0.9494361 -1.286185e-04

Std. Errors:
         (Intercept)       price        catch       income
boat    1.767921e-05 0.003155494 2.992347e-06 3.161360e-05
charter 1.364347e-05 0.003160311 9.419144e-06 3.691491e-05
pier    2.304065e-05 0.003965765 5.115583e-06 4.049548e-05

Residual Deviance: 2531.988 
AIC: 2555.988 

Wald-style p-values (for quick classroom discussion; LR tests are preferable):

coefs <- summary(m_nnet)$coefficients
ses   <- summary(m_nnet)$standard.errors
pvals <- 2*(1 - pnorm(abs(coefs/ses)))
kbl_safe(round(pvals, 3))
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
(Intercept) price catch income
boat 0 0.260 0 0.028
charter 0 0.000 0 0.000
pier 0 0.445 0 0.001

Likelihood-ratio (LR) test vs. null:

m0_nnet <- multinom(alt ~ 1, data = wide_df, trace = FALSE)
anova(m0_nnet, m_nnet, test = "Chisq")

In-sample classification snapshot (for intuition, not model selection):

pred_nnet_class <- predict(m_nnet, type = "class")
tab_nnet <- table(True = wide_df$alt, Pred = pred_nnet_class)
acc_nnet <- mean(pred_nnet_class == wide_df$alt)
tab_nnet; acc_nnet
         Pred
True      beach boat charter pier
  beach       0   96      38    0
  boat        0  314     102    2
  charter     0  138     314    0
  pier        0  129      46    3
[1] 0.5338409

4.2 4B. VGAM::vglm (multinomial regression)

m_vgam <- vglm(alt ~ price + catch + income,
               family = multinomial(refLevel = "beach"),
               data   = wide_df)
summary(m_vgam)

Call:
vglm(formula = alt ~ price + catch + income, family = multinomial(refLevel = "beach"), 
    data = wide_df)

Coefficients: 
                Estimate Std. Error z value Pr(>|z|)    
(Intercept):1  9.880e-01  2.231e-01   4.429 9.49e-06 ***
(Intercept):2  1.589e-01  2.367e-01   0.671 0.502172    
(Intercept):3  1.084e+00  2.584e-01   4.194 2.75e-05 ***
price:1        3.557e-03  3.343e-03   1.064 0.287253    
price:2        2.270e-02  3.333e-03   6.811 9.71e-12 ***
price:3       -3.026e-03  4.180e-03  -0.724 0.469130    
catch:1       -1.300e+00  3.406e-01  -3.816 0.000136 ***
catch:2        1.412e+00  2.760e-01   5.115 3.14e-07 ***
catch:3       -9.493e-01  4.023e-01  -2.360 0.018291 *  
income:1       6.931e-05  4.239e-05   1.635 0.102074    
income:2      -1.789e-04  4.891e-05  -3.658 0.000254 ***
income:3      -1.286e-04  5.514e-05  -2.332 0.019683 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Names of linear predictors: log(mu[,2]/mu[,1]), log(mu[,3]/mu[,1]), 
log(mu[,4]/mu[,1])

Residual deviance: 2531.988 on 3534 degrees of freedom

Log-likelihood: -1265.994 on 3534 degrees of freedom

Number of Fisher scoring iterations: 5 

No Hauck-Donner effect found in any of the estimates


Reference group is level  1  of the response

LR test vs. null:

m0_vgam <- vglm(alt ~ 1, family = multinomial(refLevel = "beach"), data = wide_df)
VGAM::lrtest(m0_vgam, m_vgam)
Likelihood ratio test

Model 1: alt ~ 1
Model 2: alt ~ price + catch + income
   #Df  LogLik Df  Chisq Pr(>Chisq)    
1 3543 -1497.7                         
2 3534 -1266.0 -9 463.46  < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Notes on interpretation: Coefficients are log-odds for each non-baseline class vs beach. A negative price coefficient means that (holding other things fixed) higher price for that option reduces the odds of choosing it relative to beach.


5 5. “Choice Model” View (One Row per (chooser, alternative))

This is the econometric formulation students will use for market share simulation and policy experiments.

m_mlogit <- mlogit(mode ~ price + catch | income,
                   data     = fishing_mlogit,
                   reflevel = "beach")
summary(m_mlogit)

Call:
mlogit(formula = mode ~ price + catch | income, data = fishing_mlogit, 
    reflevel = "beach", method = "nr")

Frequencies of alternatives:choice
  beach    boat charter    pier 
0.11337 0.35364 0.38240 0.15059 

nr method
7 iterations, 0h:0m:0s 
g'(-H)^-1g = 1.37E-05 
successive function values within tolerance limits 

Coefficients :
                       Estimate  Std. Error  z-value  Pr(>|z|)    
(Intercept):boat     5.2728e-01  2.2279e-01   2.3667 0.0179485 *  
(Intercept):charter  1.6944e+00  2.2405e-01   7.5624 3.952e-14 ***
(Intercept):pier     7.7796e-01  2.2049e-01   3.5283 0.0004183 ***
price               -2.5117e-02  1.7317e-03 -14.5042 < 2.2e-16 ***
catch                3.5778e-01  1.0977e-01   3.2593 0.0011170 ** 
income:boat          8.9440e-05  5.0067e-05   1.7864 0.0740345 .  
income:charter      -3.3292e-05  5.0341e-05  -0.6613 0.5084031    
income:pier         -1.2758e-04  5.0640e-05  -2.5193 0.0117582 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log-Likelihood: -1215.1
McFadden R^2:  0.18868 
Likelihood ratio test : chisq = 565.17 (p.value = < 2.22e-16)
  • Left side mode ~ ... indicates the choice indicator.
  • Alternative-specific variables (price, catch) go before the |.
  • Individual-specific variables (income) go after the | (because they don’t vary across alternatives within a chooser).
  • reflevel="beach" makes beach the baseline utility.

LR test vs. null:

m0_mlogit <- mlogit(mode ~ 1 | 1, data = fishing_mlogit, reflevel = "beach")
lmtest::lrtest(m0_mlogit, m_mlogit)

Reading the output: For each non-baseline alternative, you’ll see a coefficient on price and catch (and possibly interactions with income if you specify them). Signs and significance match intuition: price usually negative; catch positive. Magnitudes are in utility units; the effect on probability depends on the full choice set (through the softmax denominator).

IIA caveat: MNL assumes IIA (relative odds unaffected by presence of other options). In real marketing settings (e.g., close substitutes), consider nested or mixed logit. Here we proceed with MNL for clarity.

6 5.1 How to Interpret the Coefficients (all three approaches)

Across nnet::multinom, VGAM::vglm, and mlogit::mlogit, coefficients are in log-odds units relative to a baseline alternative (our baseline is beach). A quick translation:

  • Log-odds change: \(\Delta \log\text{-odds} = \beta \cdot \Delta x\)
  • Odds ratio (OR): \(\text{OR} = e^{\beta}\) (for a one-unit increase)
  • % change in odds: \(100 \times (e^{\beta} - 1)\%\)

6.1 5.1.1 multinom / vglm (classification view)

  • Each non-baseline row (e.g., boat) shows how predictors tilt log-odds of boat vs beach.
  • Example: if price under boat is \(-0.05\), then increasing boat price by 1 multiplies the odds of boat vs beach by \(\exp(-0.05)\approx 0.95\), a 5% decrease (holding other variables fixed).
  • If catch under charter is \(+0.20\), then \(\exp(0.20)\approx 1.22\): about a 22% increase in odds of charter vs beach per unit of catch.
  • Intercepts act like alternative-specific constants (ASCs) relative to the baseline.
# Quick odds-ratio views
# (interpret within a single fitted model)
# exp(coef(m_nnet))
# exp(coef(m_vgam))

Prefer LR tests vs null to judge overall signal; use Wald p-values for per-coefficient checks.

6.2 5.1.2 mlogit (choice model view)

  • With mode ~ price + catch | income, alt-specific variables (price, catch) affect the utility difference of each non-baseline alt vs the baseline.
  • Example: a price coefficient of \(-0.05\) for boat means a one-unit increase reduces log-odds of boat vs beach by 0.05; the odds multiply by \(\exp(-0.05)\).
  • Individual-specific variables (income) appear as alternative-specific effects (one per non-baseline alt). A positive income under charter shifts odds toward charter vs beach.
# Odds ratios from mlogit summary (by alternative vs baseline)
# exp(summary(m_mlogit)$coefficients)

Probability changes depend on the whole choice set (softmax). Use marginal effects or elasticities below for managerial statements.

6.3 5.1.3 Managerial translation patterns

  • “Sign makes sense? price ↓ odds; catch ↑ odds.”
  • “Scale to realistic \(\Delta x\): OR for \(\beta\cdot \Delta x\) is \(\exp(\beta \Delta x)\).”
  • “State the effect in odds; then show probability movement with marginal effects.”
  • “Be explicit about units (rescale price if needed, e.g., per $10).”

6.4 5.1.4 Checklist while reading output

  • Baseline stated (here: beach).
  • Signs match intuition (price \(<0\), catch \(>0\)).
  • Magnitude via odds ratios \(e^{\beta}\).
  • Significance: SEs, p-values; LR test vs null.
  • Substitution: who gains when one option loses? (see AMEs/elasticities).
  • IIA caveat and when to consider nested/mixed logit.
# Optional: explicit odds ratio table for the mlogit fit (if desired)
or_tab <- exp(summary(m_mlogit)$coefficients)
kbl_safe(round(or_tab, 3))
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
x
(Intercept):boat 1.694
(Intercept):charter 5.443
(Intercept):pier 2.177
price 0.975
catch 1.430
income:boat 1.000
income:charter 1.000
income:pier 1.000

7 6. Prediction & What-If Simulation

We’ll simulate the effect of raising the boat price by 20 on predicted market shares.

# Baseline market shares
p0 <- predict(m_mlogit, newdata = fishing_mlogit)        # matrix: rows=obs, cols=alts
share0 <- colMeans(p0)

# Counterfactual: increase boat price by +20
whatif <- fishing_mlogit
whatif$price[whatif$alt == "boat"] <- whatif$price[whatif$alt == "boat"] + 20
p1 <- predict(m_mlogit, newdata = whatif)
share1 <- colMeans(p1)

# Compare
shares <- rbind(Baseline = share0, BoatPriceUp = share1)
kbl_safe(round(shares, 3))
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
beach boat charter pier
Baseline 0.113 0.354 0.382 0.151
BoatPriceUp 0.123 0.259 0.454 0.164

Bar plot of market share change:

barplot(t(shares), beside = TRUE, legend = TRUE,
        ylab = "Market Share",
        main = "What-if: Boat Price +20")
What-if: Increase Boat Price by 20 (Predicted Market Shares)

What-if: Increase Boat Price by 20 (Predicted Market Shares)

Teaching angle: Emphasize that these are ceteris paribus predictions under the model’s assumptions. Students can try other scenarios (e.g., lowering pier price, increasing catch for charter, or targeting by income).


8 7. Extensions & Exercises

  1. Targeting/heterogeneity: Interact income with price or catch to see if price sensitivity varies by income.

    m_mlogit_ix <- mlogit(mode ~ price + catch + price:income + catch:income | income,
                          data = fishing_mlogit, reflevel = "beach")
    summary(m_mlogit_ix)
  2. Alternative-specific constants (ASCs): Include ASCs to capture average appeal of each option (beyond observed attributes). In mlogit, add 0/1 dummies per alternative (leaving one out) or use ~ 1 on the right as needed.

  3. Model comparison: Compare AIC/BIC across specifications; discuss out-of-sample validation by splitting choosers.

  4. IIA discussion: Brainstorm situations where IIA might fail and what nested/mixed logit would change conceptually.

  5. Elasticities: As a challenge, derive own/cross price elasticities via small perturbations to price and observing predicted shares (finite differences).


9 8. Summary (Talking Points)

  • MNL models probability of choosing one among many.
  • Choice model formulation (with one row per (chooser, alt)) supports scenario analysis and market share prediction.
  • Classification view is handy pedagogically but less suited for counterfactuals.
  • Key assumptions: correct utility specification and IIA.
  • Always check data structure (alt-specific vs individual-specific variables) before deciding the formula.

10 Appendix: Interpretation Tools (Average Marginal Effects & Elasticities)

Sometimes you want probability movements and elasticities rather than odds. These helpers compute average marginal effects (AMEs) and own/cross-price elasticities by finite differences.

# Average Marginal Effects (AME): nudge a variable and average Δprobabilities
ame <- function(model, data, var, delta = 1, target_alt = NULL) {
  p0 <- predict(model, newdata = data)
  d1 <- data
  if (!is.null(target_alt)) {
    # alt-specific variable: nudge only that alternative's rows
    d1[[var]][d1$alt == target_alt] <- d1[[var]][d1$alt == target_alt] + delta
  } else {
    # individual-specific variable: nudge all rows equally
    d1[[var]] <- d1[[var]] + delta
  }
  p1 <- predict(model, newdata = d1)
  colMeans(p1 - p0)   # average Δprobability for each alternative
}

# Example AMEs:
ame_price_boat <- ame(m_mlogit, fishing_mlogit, var = "price", delta = 1, target_alt = "boat")
ame_income_all <- ame(m_mlogit, fishing_mlogit, var = "income", delta = 1, target_alt = NULL)
kbl_safe(round(ame_price_boat, 4))
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
x
beach 0.0006
boat -0.0050
charter 0.0037
pier 0.0007
kbl_safe(round(ame_income_all, 4))
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
x
beach 0
boat 0
charter 0
pier 0
# Elasticities: percent change in shares over percent change in (alt-specific) price
elasticities <- function(model, data, target_alt, pct = 0.01) {
  base_p <- predict(model, newdata = data)
  base_s <- colMeans(base_p)

  d_up <- data
  idx  <- d_up$alt == target_alt
  d_up$price[idx] <- d_up$price[idx] * (1 + pct)

  s_up <- colMeans(predict(model, newdata = d_up))
  ((s_up - base_s) / base_s) / pct
}

# Example: own & cross elasticities when BOAT price +1%
elas_boat <- elasticities(m_mlogit, fishing_mlogit, target_alt = "boat", pct = 0.01)
kbl_safe(round(elas_boat, 3))
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
Warning in attr(x, "align"): 'xfun::attr()' is deprecated.
Use 'xfun::attr2()' instead.
See help("Deprecated")
x
beach 0.292
boat -0.609
charter 0.376
pier 0.255

Reading AMEs: own-effect (target alt’s probability change) and cross-effects (where probability goes).
Reading elasticities: own-price elasticity (boat entry) should be negative; positive cross-price elasticities indicate substitutes.


11 9. Reproducibility

sessionInfo()
R version 4.4.2 (2024-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Australia/Melbourne
tzcode source: internal

attached base packages:
[1] splines   stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] kableExtra_1.4.0 knitr_1.49       lubridate_1.9.4  forcats_1.0.0   
 [5] stringr_1.5.1    dplyr_1.1.4      purrr_1.0.4      readr_2.1.5     
 [9] tidyr_1.3.1      tibble_3.2.1     ggplot2_3.5.2    tidyverse_2.0.0 
[13] pscl_1.5.9       lmtest_0.9-40    zoo_1.8-13       mlogit_1.1-3    
[17] dfidx_0.1-0      VGAM_1.1-13      nnet_7.3-20     

loaded via a namespace (and not attached):
 [1] sass_0.4.9        generics_0.1.3    xml2_1.3.8        stringi_1.8.4    
 [5] lattice_0.22-6    hms_1.1.3         digest_0.6.37     magrittr_2.0.3   
 [9] timechange_0.3.0  evaluate_1.0.3    grid_4.4.2        fastmap_1.2.0    
[13] jsonlite_1.9.0    Formula_1.2-5     viridisLite_0.4.2 scales_1.3.0     
[17] textshaping_1.0.0 jquerylib_0.1.4   Rdpack_2.6.2      cli_3.6.4        
[21] rlang_1.1.5       rbibutils_2.3     munsell_0.5.1     withr_3.0.2      
[25] cachem_1.1.0      yaml_2.3.10       tools_4.4.2       tzdb_0.5.0       
[29] colorspace_2.1-1  vctrs_0.6.5       R6_2.6.1          lifecycle_1.0.4  
[33] MASS_7.3-64       pkgconfig_2.0.3   pillar_1.10.1     bslib_0.9.0      
[37] gtable_0.3.6      glue_1.8.0        systemfonts_1.2.3 statmod_1.5.0    
[41] xfun_0.52         tidyselect_1.2.1  rstudioapi_0.17.1 htmltools_0.5.8.1
[45] svglite_2.2.1     rmarkdown_2.29    compiler_4.4.2