This tutorial draws heavily from:
Introduction to SAS. UCLA: Statistical Consulting Group. from http://www.ats.ucla.edu/stat/sas/notes2/ (accessed November 24, 2007).
While the logit is just one possible transformation, it has the desirable property of being relatively well understood (it is the log of the odds).
For the purposes of this tutorial, the following terms need to be understood:
p = a probability, read “the probability of”
\(\frac{p}{1-p}\) = the formula that converts a probability into odds, read “the odds of”
p(Y) = the probability of Y occurring
The general logistic regression equation is:
\(\begin{aligned}logit(p) = log(\frac{p}{1-p}) = \beta_0+\beta_1X_1+\beta_2X_2+\dots+\beta_kX_k\end{aligned}\)
And thus the probability of outcome “Y” is found by solving for ‘p’, resulting in:
\(\begin{aligned}p(Y) = \frac{e^{\beta_0+\beta_1X_1+\beta_2X_2+\dots+\beta_kX_k}}{1 + e^{\beta_0+\beta_1X_1+\beta_2X_2+\dots+\beta_kX_k}}\end{aligned}\)
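As a minimal sketch of the two transformations above, the following Python snippet converts a probability to log odds and back. Python is used here purely for illustration, and the function names `logit` and `inv_logit` are assumptions of this sketch, not part of the original analysis.

```python
import math

def logit(p):
    """Log of the odds for a probability p strictly between 0 and 1."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Probability implied by a log-odds value x: e^x / (1 + e^x)."""
    return math.exp(x) / (1 + math.exp(x))

# Round trip: a probability of 0.25 means odds of 1 to 3.
log_odds = logit(0.25)
print(math.exp(log_odds))   # the odds
print(inv_logit(log_odds))  # back to the probability
```

A log-odds value of zero corresponds to even odds, so `inv_logit(0)` returns 0.5.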
With no predictors in the model, the general equation reduces to the intercept alone: \(\begin{aligned}logit(p) =\beta_0 \end{aligned}\)
| | Dependent variable: Being in Honors |
|---|---|
| Constant | -1.13*** (-1.45, -0.80) |
| Observations | 200 |
| Log Likelihood | -111.00 |
| Akaike Inf. Crit. | 225.00 |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
So, the intercept, -1.125, is the log of the odds (logit) of being in honors. To get the odds of being in honors, we exponentiate the logit:
\(\begin{aligned}\frac{p}{1-p} = e^{-1.125} = 0.325\end{aligned}\)
An odds of 0.325 translates to odds of “1 to 3.082”, which means that for every 1 person in honors, there are ~3 persons who are not in honors. To do this translation, you divide 1 by the odds.
What is the estimated population probability of being in honors?
\(\begin{aligned}p(honors) = \frac{e^{-1.125}}{1+e^{-1.125}} = 24.5\%\end{aligned}\)
Note that, examining the equation above, converting odds to probabilities amounts to computing \(\frac{odds}{1+odds}\). If you have already computed the odds, you do not need to redo the exponentiation to derive the probability. In our case, the probability = \(\frac{0.325}{1.325}\).
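The intercept-only calculation can be sketched in a few lines of Python. This is only an illustration using the rounded intercept of -1.125 quoted in the text, so results may differ from the text in the last digit; the variable names are hypothetical.

```python
import math

intercept = -1.125  # log odds of being in honors, as quoted in the text

odds = math.exp(intercept)       # exponentiate the logit to get the odds
prob = odds / (1 + odds)         # convert the odds to a probability
not_in_honors_per_one = 1 / odds # the "1 to x" form of the odds

print(f"odds = {odds:.3f}")
print(f"probability = {prob:.1%}")
print(f"1 to {not_in_honors_per_one:.2f}")
```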
What happens if we predict the likelihood of being in honors and use gender as a predictor? In our data, male is coded as zero, and female as 1.
\(\begin{aligned}logit(p) =\beta_0 + \beta_1*female \end{aligned}\)
| Being in Honors | Log-Odds | CI | p |
|---|---|---|---|
| (Intercept) | -1.47 | -2.03 – -0.97 | <.001 |
| gender (Female) | 0.59 | -0.07 – 1.28 | .083 |
| Observations | 200 | | |
So, the intercept, -1.471, is the log of the odds (logit) of being in honors for males, and \(e^{-1.471}\) = 0.23 is the odds of being in honors for males.
Then what is the estimated population probability of being in honors for males?
\(\begin{aligned}p(honors~|~male) = \frac{e^{-1.471}}{1+e^{-1.471}} = \frac{0.23}{1.23} = 18.68\%\end{aligned}\)
Now, the coefficient for gender, 0.593, is the log of the odds ratio of being in honors for females, compared to males (see discussion below for details on interpreting the odds ratios).
To calculate the population odds for females we must fill in the entire equation:
\(\begin{aligned}\frac{p}{1-p} = e^{-1.471~+~0.593} = e^{-0.878} = 0.416\end{aligned}\)
What is the estimated population probability of being in honors for females?
\(\begin{aligned}p(honors~|~female) = \frac{e^{-1.471~+~0.593}}{1+e^{-1.471~+~0.593}} = \frac{0.416}{1.416} = 29.36\%\end{aligned}\)
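The male and female calculations follow the same pattern, so they can be sketched with one small helper. The helper name `odds_and_prob` is hypothetical, and the coefficients are the rounded values quoted in the text.

```python
import math

b0 = -1.471  # intercept: log odds for males (female = 0)
b1 = 0.593   # coefficient for female: log odds ratio, females vs. males

def odds_and_prob(female):
    """Return (odds, probability) of being in honors for female = 0 or 1."""
    odds = math.exp(b0 + b1 * female)
    return odds, odds / (1 + odds)

male_odds, male_prob = odds_and_prob(0)
female_odds, female_prob = odds_and_prob(1)

print(f"males:   odds = {male_odds:.3f}, probability = {male_prob:.2%}")
print(f"females: odds = {female_odds:.3f}, probability = {female_prob:.2%}")
```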
Interpreting the individual coefficients in units of log odds is not intuitive.
Common practice in logistic regression is to convert our coefficients into odds ratios.
Doing this is simple: it just involves exponentiating the coefficients. From our example:
Remember:
\(\begin{aligned}logit(p) = log(\frac{p}{1-p}) = \beta_0+\beta_1X_1\end{aligned}\)
So: we have two situations of interest, the baseline odds for males, and the odds ratio for females compared to males.
To get these values, we exponentiate the coefficients.
The odds of being in honors for males is:
\(\begin{aligned}e^{\beta_0} = e^{-1.47} = 0.23\end{aligned}\)
Whereas the odds for females is:
\(\begin{aligned}e^{\beta_0 + \beta_1} = e^{-1.47~+~0.59} = 0.42\end{aligned}\)
The odds ratio for females to males is:
\(\begin{aligned}\frac{e^{\beta_0+\beta_1}}{e^{\beta_0}} = \frac{0.416}{0.230} = 1.81\end{aligned}\)
Which can be calculated directly by exponentiating the coefficient for females:
\(\begin{aligned}e^{\beta_1} = e^{0.593} = 1.81\end{aligned}\)
The exponentiated coefficients of predictors in logistic regression are odds ratios (OR): each gives the multiplicative change in the odds for a one-unit increase in that predictor.
Note, the OR is often interpreted as a percent change in odds: 1.81 = 81% greater odds for females than for males.
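A short Python sketch can confirm that the ratio of the two odds and the exponentiated coefficient are the same quantity. It uses the rounded coefficients from the gender model; the variable names are illustrative only.

```python
import math

b0, b1 = -1.471, 0.593  # intercept and female coefficient (log odds)

odds_male = math.exp(b0)
odds_female = math.exp(b0 + b1)

ratio_of_odds = odds_female / odds_male  # algebraically equal to exp(b1)
odds_ratio = math.exp(b1)
percent_change = (odds_ratio - 1) * 100  # percent difference in odds

print(f"ratio of odds  = {ratio_of_odds:.2f}")
print(f"exp(b1)        = {odds_ratio:.2f}")
print(f"percent change = {percent_change:.0f}%")
```

The intercept cancels in the ratio, which is why the odds ratio does not depend on the baseline odds.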
We’ll predict being in honors based on math scores.
\(\begin{aligned}logit(p) =\beta_0 + \beta_1*math \end{aligned}\)
| Being in Honors | Log-Odds | CI | p |
|---|---|---|---|
| (Intercept) | -9.79 | -12.94 – -7.09 | <.001 |
| math | 0.16 | 0.11 – 0.21 | <.001 |
| Observations | 200 | | |
So, the intercept, -9.794, is the log of the odds (logit) of being in honors when math scores = zero. The odds of being in honors when math = 0 is the exponentiated coefficient:
\(\begin{aligned}e^{-9.794} \approx 0.00006\end{aligned}\)
The probability of being in honors when math = 0 is:
\(\begin{aligned}p(honors~|~math=0) = \frac{e^{-9.794}}{1+e^{-9.794}} = \frac{0.00006}{1.00006} \approx 0.006\%\end{aligned}\)
The coefficient on math represents the change in log odds for each 1-point increase in math scores.
So, as math scores go up 1 point, the log odds of being in honors goes up by 0.156 points.
The probability of being in honors when math = mean(math) is:
\(\begin{aligned}p(honors~|~math=52.65) = \frac{e^{-9.794~+~0.156*52.65}}{1+e^{-9.794~+~0.156*52.65}} = 17.32\%\end{aligned}\)
The odds ratio for a 1 point increase in math scores is:
\(\begin{aligned}e^{\beta_1} = e^{0.156} = 1.17 \end{aligned}\)
This is a 17% increase in the odds for every 1-point increase in math scores.
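For a continuous predictor, the same arithmetic can be wrapped in a function of the score. The sketch below uses the rounded coefficients from the text, so its probabilities may differ slightly from values computed with the unrounded model. The function name `prob_honors` and the 10-point comparison are illustrative assumptions.

```python
import math

b0, b1 = -9.794, 0.156  # intercept and math coefficient (log odds)

def prob_honors(math_score):
    """Predicted probability of being in honors at a given math score."""
    log_odds = b0 + b1 * math_score
    return 1 / (1 + math.exp(-log_odds))

odds_ratio_1pt = math.exp(b1)        # odds multiplier per 1-point increase
odds_ratio_10pt = math.exp(b1 * 10)  # odds multiplier per 10-point increase

print(f"OR per 1 point   = {odds_ratio_1pt:.2f}")
print(f"OR per 10 points = {odds_ratio_10pt:.2f}")
print(f"p(honors | math = 0) = {prob_honors(0):.6f}")
```

Because the model is linear on the log-odds scale, the odds ratio for a 10-point increase is the 1-point odds ratio raised to the 10th power, not 10 times the 1-point odds ratio.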
Let’s look at predicting being in honors based on math scores, reading scores, and gender.
\(\begin{aligned}logit(p) =\beta_0 + \beta_1math + \beta_2gender+\beta_3read\end{aligned}\)
| Being in Honors | Log-Odds | CI | p |
|---|---|---|---|
| (Intercept) | -11.77 | -15.40 – -8.66 | <.001 |
| math | 0.12 | 0.06 – 0.19 | <.001 |
| female | 0.98 | 0.17 – 1.84 | .020 |
| read | 0.06 | 0.01 – 0.11 | .026 |
| Observations | 200 | | |
This fitted model says that, holding math and reading at a fixed value, the odds of getting into an honors class for females (female = 1) over the odds of getting into an honors class for males (female = 0) is:
\(\begin{aligned}e^{0.98} = 2.66\end{aligned}\)
In terms of percent change, we can say that the odds for females are 166% higher than the odds for males.
Similarly, the coefficient for math (0.12) says that, holding gender and reading scores at a fixed value, a one point change in math scores results in a change of odds:
\(\begin{aligned}e^{0.12} = 1.13\end{aligned}\)
In terms of percent change, we can say that the change in odds for every 1 point increase in math scores is: 13%.
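The percent-change reading of each coefficient can be sketched as a small loop. The dictionary below holds the rounded coefficients from the table above; the names `coefs` and `pct_change` are hypothetical.

```python
import math

# Rounded log-odds coefficients from the three-predictor model
coefs = {"math": 0.12, "female": 0.98, "read": 0.06}

# Percent change in odds per one-unit increase, holding the others fixed
pct_change = {name: (math.exp(b) - 1) * 100 for name, b in coefs.items()}

for name, pct in pct_change.items():
    print(f"{name}: OR = {math.exp(coefs[name]):.2f}, change in odds = {pct:.0f}%")
```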
Let’s see if gender modifies the influence of math scores on the likelihood of being in honors.
Note, when we model interactions, we center our continuous predictor variables. Thus, the mean of math will now be zero.
We’ll start with the 2 predictors, gender and math, with no interaction.
\(\begin{aligned}logit(p) =\beta_0 + \beta_1gender + \beta_2math\end{aligned}\)
| Being in Honors | Log-Odds | CI | p |
|---|---|---|---|
| (Intercept) | -10.81 | -14.25 – -7.87 | <.001 |
| gender (Female) | 0.97 | 0.17 – 1.81 | .020 |
| math | 0.16 | 0.12 – 0.22 | <.001 |
| Observations | 200 | | |
For males (gender = 0), the gender term \(\beta_1(0)\) drops out. If we also look at the likelihood of being in honors when math scores are at their average level (math = 0 after centering), \(\beta_2(0)\) drops out as well.
Thus, the odds of being in honors for males when both gender and math scores are in the model, and math is set at the mean score, is:
\(\begin{aligned}e^{\beta_0} = e^{-10.81} \approx 0.00002\end{aligned}\)
And, the odds of being in honors for females when both gender and math scores are in the model, and math is set at the mean score, is:
\(\begin{aligned}e^{\beta_0+\beta_1(1)} = e^{-10.81+ 0.97} = e^{-9.84} \approx 0.00005\end{aligned}\)
Now, let’s look at the model with the interaction term included.
\(\begin{aligned}logit(p) =\beta_0 + \beta_1gender + \beta_2math+\beta_3(math*gender)\end{aligned}\)
| Being in Honors | Log-Odds | CI | p |
|---|---|---|---|
| (Intercept) | -8.75 | -13.44 – -4.97 | <.001 |
| genderFemale | -2.90 | -9.11 – 3.24 | .349 |
| math | 0.13 | 0.06 – 0.21 | <.001 |
| genderFemale:math | 0.07 | -0.04 – 0.18 | .210 |
| Observations | 200 | | |
It no longer makes sense to talk about holding math or gender constant, since our interest is in the interaction of these two variables and our third term is dependent on both of them.
Although the interaction term, genderFemale:math, is not statistically significant, we will proceed with interpreting the interaction for purposes of illustrating the process.
Looking at the output for the non-interaction model and the interaction model, we see that the coefficients for gender and math both decrease when the interaction is included.
Since male is coded zero, the interaction term drops out, (math * gender) = (math * 0); similarly, the gender term drops out (gender = 0), and we are left with the intercept and the math coefficient. And since we centered math, at the average math score the math term also drops out, leaving just the intercept to predict honors for males when math = mean(math) = 0 and the interaction is present:
\(\begin{aligned}logit(p) = log(\frac{p}{1-p}) = \beta_0 + \beta_1(0) + \beta_2(math)+\beta_3(math*0)\end{aligned}\)
\(\begin{aligned} log(\frac{p}{1-p}) = \beta_0 + \beta_2(math)\end{aligned}\).
\(\begin{aligned} log(\frac{p}{1-p}) = -8.75~+~0.13*math\end{aligned}\)
Thus, to calculate the odds of a male being in honors when math is at the mean level (0 due to centering), we exponentiate both sides of the equation and plug in zero for math:
\(\begin{aligned}\frac{p}{1-p} = e^{-8.75+(0.13*0)} = e^{-8.75} \approx 0.0002\end{aligned}\)
Which equates to a probability of a male with an average math score being in honors of:
\(\begin{aligned}p(honors~|~male=True, math=avg(math)) = \frac{e^{-8.75}}{1+e^{-8.75}} = 0.02\%\end{aligned}\)
When gender = female = 1, the equation is:
\(\begin{aligned}logit(p) =\beta_0 + \beta_1(1) + \beta_2(math)+\beta_3(math*1)\end{aligned}\)
\(\begin{aligned} ~~~~~~~~~~~ = \beta_0 + \beta_1 + \beta_2math+\beta_3math\end{aligned}\)
\(\begin{aligned} ~~~~~~~~~~~ = (\beta_0 + \beta_1) + (\beta_2+\beta_3)*math\end{aligned}\)
\(\begin{aligned}logit(p) = (-8.75~+~-2.9) ~+~(0.13~+~0.07)*(math)\end{aligned}\)
\(\begin{aligned}\frac{p}{1-p} = e^{(-8.75~+~-2.9) ~+~(0.13~+~0.07)*(math)}\end{aligned}\)
For the odds of a female being in honors when math is at the mean level, we again set math to zero and we get:
\(\begin{aligned}\frac{p}{1-p} = e^{(-11.65) ~+~(0)} = e^{-11.65} \approx 0.00001\end{aligned}\)
Which equates to a probability of a female with an average math score being in honors of:
\(\begin{aligned}p(honors~|~female=True, math=avg(math)) = \frac{e^{(-11.65)}}{1+e^{(-11.65)}} \approx 0.001\%\end{aligned}\)
To summarize, when there is no interaction modeled, the odds of being in honors for males and females at the mean math score are:
\(\begin{aligned}odds(honors~|~gender=male, math=mean(math)) = e^{\beta_0} = e^{-10.81} \approx 0.00002\end{aligned}\)
\(\begin{aligned}odds(honors~|~gender=female, math=mean(math)) = e^{\beta_0+\beta_1} = e^{-10.81~+~0.97} = e^{-9.84} \approx 0.00005\end{aligned}\)
And when there is an interaction in the model:
\(\begin{aligned}odds(honors~|~gender=male, math=mean(math)) = e^{\beta_0} = e^{-8.75} \approx 0.0002\end{aligned}\)
\(\begin{aligned}odds(honors~|~gender=female, math=mean(math)) = e^{(\beta_0+\beta_1)+(\beta_2+\beta_3)*(0)} = e^{-8.75~+~-2.9} = e^{-11.65} \approx 0.00001\end{aligned}\)
We see that when math scores are at their average value, including the interaction raises the estimated odds of being in honors for males and lowers them for females. In other words, the effect of math on the likelihood of being in honors is different for males and females.
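The algebra for the interaction model can be sketched as follows: each gender gets its own intercept and its own math slope on the log-odds scale. This is only an illustration using the rounded coefficients from the interaction table; the function names `log_odds` and `group_line` are hypothetical.

```python
import math

# Rounded coefficients from the gender-by-math interaction model
b0, b_gender, b_math, b_inter = -8.75, -2.90, 0.13, 0.07

def log_odds(female, math_score):
    """Log odds of honors; female is 0 or 1, math_score as entered in the model."""
    return (b0 + b_gender * female + b_math * math_score
            + b_inter * female * math_score)

def group_line(female):
    """Intercept and math slope of the log-odds line for one gender."""
    return b0 + b_gender * female, b_math + b_inter * female

male_line = group_line(0)    # (b0, b_math)
female_line = group_line(1)  # (b0 + b_gender, b_math + b_inter)

odds_male_at_0 = math.exp(log_odds(0, 0))
odds_female_at_0 = math.exp(log_odds(1, 0))

print("male intercept, slope:  ", male_line)
print("female intercept, slope:", female_line)
print(f"odds at math = 0: male {odds_male_at_0:.6f}, female {odds_female_at_0:.6f}")
```

Writing the model as one line per group makes it clear that the interaction coefficient is the difference between the female and male math slopes.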
We’ll start with the 2 predictors, reading and math scores, with no interaction.
\(\begin{aligned}logit(p) =\beta_0 + \beta_1math + \beta_2read\end{aligned}\)
| Being in Honors | Log-Odds | CI | p |
|---|---|---|---|
| (Intercept) | -10.83 | -14.25 – -7.90 | <.001 |
| read | 0.06 | 0.01 – 0.11 | .026 |
| math | 0.12 | 0.06 – 0.18 | <.001 |
| Observations | 200 | | |
The model with the interaction term:
\(\begin{aligned}logit(p) =\beta_0 + \beta_1read + \beta_2math+\beta_3(read*math)\end{aligned}\)
| Being in Honors | Log-Odds | CI | p |
|---|---|---|---|
| (Intercept) | -14.63 | -21.19 – -9.71 | <.001 |
| read | 0.09 | 0.03 – 0.16 | .005 |
| math | 0.16 | 0.09 – 0.24 | <.001 |
| readMathIntr | -0.01 | -0.01 – -0.00 | .042 |
| Observations | 200 | | |
Figure: Influence of math level on reading’s association with being in honors (simple slopes)
From the graph, we can see that as math scores go up, the slope of reading on the likelihood of being in honors goes down, as expected from the negative interaction coefficient (-0.01).
The specific slopes for the fixed math scores are:
| math | slopes | OR |
|---|---|---|
| -18.74 | 0.21 | 1.24 |
| -9.37 | 0.15 | 1.16 |
| 0.00 | 0.09 | 1.09 |
| 9.37 | 0.03 | 1.03 |
| 18.74 | -0.04 | 0.97 |
To calculate the slopes for fixed values of one of the two variables in an interaction, you must do a little algebraic simplification:
\(\begin{aligned}logit(p) =\beta_0 + \beta_1read + \beta_2math+\beta_3(read*math)\end{aligned}\)
In this case, if we hold math score constant, then the equation simplifies to:
\(\begin{aligned}logit(p) =\beta_0 + \beta_1(read) + \beta_2(fixedMathScore)+\beta_3(fixedMathScore*read)\end{aligned}\)
\(\begin{aligned}~~~~~~~ =\beta_0 + \beta_2(fixedMathScore) + (\beta_1 +(\beta_3*fixedMathScore))*read\end{aligned}\)
Let’s fix the math score at 1 standard deviation above the centered mean of zero, = 9.37:
\(\begin{aligned}logit(p) =\beta_0 + \beta_2(9.37) + (\beta_1+(\beta_3*9.37))*read\end{aligned}\)
Pulling values from our model:
\(\begin{aligned}\beta_0 = -1.62, ~~\beta_1 = 0.09,~~\beta_2 = 0.16,~~\beta_3 = -0.01\end{aligned}\)
\(\begin{aligned}logit(p) =-1.62 + (0.16*9.37) + (0.09+(-0.01*9.37))*read\end{aligned}\)
The slope of read is the coefficient on read in this equation, \(\beta_1+(\beta_3*9.37)\). With the rounded values shown here, \(0.09+(-0.01*9.37)\) is approximately 0; the slope of 0.03 in the table above reflects the unrounded coefficients.
We’ll model the same 2 continuous predictors, reading and math scores, and test the interaction with gender.
\(\begin{aligned}logit(p) =\beta_0 + \beta_1read + \beta_2math + \beta_3gender + \beta_4(read*math) + \beta_5(gender*read) + \beta_6(gender*math) +\beta_7(gender*read*math)\end{aligned}\)
| Being in Honors | Log-Odds | CI | p |
|---|---|---|---|
| (Intercept) | -2.46 | -4.12 – -1.43 | <.001 |
| readC | 0.21 | -0.04 – 0.54 | .148 |
| mathC | 0.28 | -0.02 – 0.68 | .112 |
| gender (Female) | 1.13 | -0.13 – 2.90 | .126 |
| readMathIntr | -0.02 | -0.06 – 0.00 | .086 |
| genderReadIntr | -0.07 | -0.24 – 0.08 | .387 |
| genderMathIntr | -0.06 | -0.28 – 0.11 | .518 |
| genderReadMathIntr | 0.01 | -0.00 – 0.03 | .163 |
| Observations | 200 | | |
To break this down, we first fix gender = male = 0, and get the simplified functions:
\(\begin{aligned}logit(p) =\beta_0 + \beta_1read + \beta_2math + \beta_3(0) + \beta_4(read*math) + \beta_5(0*read) + \beta_6(0*math) +\beta_7(0*read*math)\end{aligned}\)
\(\begin{aligned}logit(p) =\beta_0 + \beta_1read + \beta_2math + \beta_4(read*math)\end{aligned}\)
Now we have the same formula as above where we had the two-way continuous variable interaction.
| math | slopes | OR | gender |
|---|---|---|---|
| -18.74 | 0.66 | 1.93 | Male |
| -9.37 | 0.43 | 1.54 | Male |
| 0.00 | 0.21 | 1.23 | Male |
| 9.37 | -0.02 | 0.98 | Male |
| 18.74 | -0.24 | 0.78 | Male |
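The male slopes in this table can be reproduced, at least approximately, from the simple-slope formula \(\beta_1 + \beta_4 * math\). The sketch below assumes the three-decimal coefficients that appear in the worked female equation later in this section (0.206 for read, -0.024 for the read-by-math interaction); the variable names are hypothetical.

```python
import math

b_read = 0.206        # read coefficient (three-decimal value from the text)
b_read_math = -0.024  # read-by-math interaction coefficient

math_values = [-18.74, -9.37, 0.00, 9.37, 18.74]  # centered math scores

# Simple slope of read for males at each fixed math score
male_slopes = [b_read + b_read_math * m for m in math_values]
male_ors = [math.exp(s) for s in male_slopes]

for m, s, o in zip(math_values, male_slopes, male_ors):
    print(f"math = {m:6.2f}  slope = {s:5.2f}  OR = {o:.2f}")
```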
Now fix gender = female = 1, and simplify the equations:
\(\begin{aligned}logit(p) =\beta_0 + \beta_1read + \beta_2math + \beta_3(1) + \beta_4(read*math) + \beta_5(1*read) + \beta_6(1*math) +\beta_7(1*read*math)\end{aligned}\)
\(\begin{aligned}logit(p) =\beta_0 + \beta_1read + \beta_2math + \beta_3 + \beta_4(read*math) + \beta_5read + \beta_6math + \beta_7(read*math)\end{aligned}\)
\(\begin{aligned}~~~~~~ =\beta_0 + \beta_3 + \beta_1read + \beta_2math + \beta_4(read*math) + \beta_5read + \beta_6math + \beta_7(read*math)\end{aligned}\)
Let’s fix math at 9.368 and rearrange the equation:
\(\begin{aligned}~~~~~~ =\beta_0 + \beta_3 + \beta_2(9.368) + \beta_6(9.368) +\beta_1read + \beta_4(read*9.368) + \beta_5read + \beta_7(read*9.368)\end{aligned}\)
\(\begin{aligned}~~~~~~ =\beta_0 + \beta_3 + \beta_2(9.368) + \beta_6(9.368) + (\beta_1 + \beta_4(9.368) + \beta_5 + \beta_7(9.368))read\end{aligned}\)
\(\begin{aligned}~~~~~~ =-2.458 + 1.134 +0.277(9.368) + -0.062(9.368) + (0.206 + -0.024(9.368) + -0.069 + 0.011(9.368))read\end{aligned}\)
\(\begin{aligned}~~~~~~ =0.686 + (0.02)read\end{aligned}\)
Thus the slope of read for females when math is fixed at 9.368 is 0.02.
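The rearrangement above can be sketched as a function that returns the female intercept and read slope at any fixed math score. It uses the three-decimal coefficients written in the worked equation, so the results may differ from the text by small rounding amounts; the function name `female_line` is hypothetical.

```python
# Coefficients as written in the worked equation above (log-odds scale)
b0, b_read, b_math, b_gender = -2.458, 0.206, 0.277, 1.134
b_read_math, b_gender_read, b_gender_math, b_three_way = -0.024, -0.069, -0.062, 0.011

def female_line(math_score):
    """Intercept and read slope of the female log-odds line at a fixed math score."""
    intercept = b0 + b_gender + (b_math + b_gender_math) * math_score
    slope = (b_read + b_gender_read) + (b_read_math + b_three_way) * math_score
    return intercept, slope

intercept, slope = female_line(9.368)
print(f"logit(p) = {intercept:.3f} + ({slope:.3f}) * read")
```

Setting the gender-specific terms to zero in the same function would give the male line, which is the two-way continuous interaction form shown earlier.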
This is consistent with the slopes in the table below:
| math | male_slopes | male_OR | female_slopes | female_OR |
|---|---|---|---|---|
| -18.74 | 0.66 | 1.93 | 0.37 | 1.45 |
| -9.37 | 0.43 | 1.54 | 0.26 | 1.29 |
| 0.00 | 0.21 | 1.23 | 0.14 | 1.15 |
| 9.37 | -0.02 | 0.98 | 0.02 | 1.02 |
| 18.74 | -0.24 | 0.78 | -0.10 | 0.91 |
Finally, what we really want to see is a close comparison of the gender slopes by math scores:
This tutorial was created in RStudio using the following packages:
@knitr
@stargazer
@dplyr
@ggplot2
@stats