This tutorial draws heavily from:
Introduction to SAS. UCLA: Statistical Consulting Group. from http://www.ats.ucla.edu/stat/sas/notes2/ (accessed November 24, 2007).

Why Logistic Regression?

When the outcome of interest is binary, we are trying to predict the probability that the outcome equals 1.
Probabilities run from 0 to 1. This “restricted range” is problematic for models that rely on distributions running from negative infinity to positive infinity.
Thus, transforming the outcome from a probability ranging from 0 to 1 into the “log odds” (also called the logit) allows us to model a quantity that ranges from negative to positive infinity.

While the logit is just one possible transformation, it has the desirable property of being relatively well understood (it is the log of the odds).

For the purposes of this tutorial, the following terms need to be understood:

p = a probability, read “the probability of”

\(\frac{p}{1-p}\) = the formula that converts a probability into odds, read “the odds of”

p(Y) = the probability of Y occurring

The general logistic regression equation is:

\(\begin{aligned}logit(p) = log(\frac{p}{1-p}) = \beta_0+\beta_1X_1+\beta_2X_2+\dots+\beta_kX_k\end{aligned}\)

And thus the probability of outcome “Y” is found by solving for ‘p’ resulting in:

\(\begin{aligned}p(Y) = \frac{e^{(\beta_0+\beta_1X_1+\beta_2X_2+\dots+\beta_kX_k)}}{1 + e^{(\beta_0+\beta_1X_1+\beta_2X_2+\dots+\beta_kX_k)}}\end{aligned}\)
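The two transformations above can be sketched in a few lines of Python. This is only an illustration: the function names `logit` and `inv_logit` are our own and do not come from the tutorial's original analysis.

```python
import math

def logit(p):
    """Log odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Probability corresponding to a log-odds value x (any real number)."""
    return math.exp(x) / (1 + math.exp(x))

# Round trip: probability -> log odds -> probability
for p in (0.1, 0.5, 0.9):
    x = logit(p)
    print(p, round(x, 3), round(inv_logit(x), 3))
```

A probability of 0.5 maps to a log odds of 0, and probabilities below or above 0.5 map to negative or positive log odds, respectively.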

Example 1 - Intercept Only

Let’s consider a simple example: a logistic regression with no predictors (just an intercept).
In this case, we are predicting the likelihood of being in an honors class.

The general equation is: \(\begin{aligned}logit(p) =\beta_0 \end{aligned}\)


	Dependent variable: Being in Honors
Constant	-1.13*** (-1.45, -0.80)
Observations	200
Log Likelihood	-111.00
Akaike Inf. Crit.	225.00
Note:	*p<0.1; **p<0.05; ***p<0.01

So, the intercept, -1.125, is the log of the odds (logit) of being in honors. To get the odds of being in honors, we exponentiate the logit:

\(\begin{aligned}\frac{p}{1-p} = e^{-1.125} = 0.325\end{aligned}\)

Odds of 0.325 translate to odds of “1 to 3.082”, which means that for every 1 person in honors, there are about 3 persons who are not in honors. To do this translation, divide 1 by the odds.

What is the estimated population probability of being in honors?

\(\begin{aligned}p(honors) = \frac{e^{-1.125}}{1+e^{-1.125}} = 24.5\%\end{aligned}\)

Note, examining the equation above, you can see that converting odds to probabilities amounts to computing \(\frac{odds}{1+odds}\). If you have already computed the odds, you do not need to redo the exponentiation to derive the probability. In our case, the probability = \(\frac{0.325}{1.325}\).
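As a minimal sketch of the arithmetic in this example, the following Python snippet goes from the intercept to odds, to the “1 to x” form, and to a probability. The variable names are illustrative; only the intercept value comes from the model above.

```python
import math

b0 = -1.125  # intercept (log odds) from the intercept-only model above

odds = math.exp(b0)       # odds of being in honors
one_to_x = 1 / odds       # the "x" in "1 to x"
prob = odds / (1 + odds)  # probability, computed from the odds directly

print(round(odds, 3), round(one_to_x, 2), round(prob, 3))
```

Small differences in the third decimal from the figures in the text can arise because the intercept is itself a rounded value.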

Example 2 - A Single Categorical (dichotomous) Predictor

What happens if we predict the likelihood of being in honors and use gender as a predictor? In our data, male is coded as zero, and female as 1.
\(\begin{aligned}logit(p) =\beta_0 + \beta_1*female \end{aligned}\)

	Being in Honors
	Log-Odds	CI	p
(Intercept)	-1.47	-2.03 – -0.97	<.001
gender (Female)	0.59	-0.07 – 1.28	.083
Observations	200

So, the intercept, -1.471, is the log of the odds (logit) of being in honors for males, and \(e^{-1.471}\) = 0.23 is the odds of being in honors for males.

Then what is the estimated population probability of being in honors for males?

\(\begin{aligned}p(honors~|~male) = \frac{e^{-1.471}}{1+e^{-1.471}} = \frac{0.23}{1.23} = 18.68\%\end{aligned}\)

Now, the coefficient for gender, 0.593, is the log of the odds ratio of being in honors for females, compared to males (see discussion below for details on interpreting the odds ratios).

To calculate the population odds for females we must fill in the entire equation:

\(\begin{aligned}\frac{p}{1-p} = e^{-1.471~+~0.593} = e^{-0.878} = 0.416\end{aligned}\)

What is the estimated population probability of being in honors for females?

\(\begin{aligned}p(honors~|~female) = \frac{e^{-1.471~+~0.593}}{1+e^{-1.471~+~0.593}} = \frac{0.416}{1.416} = 29.36\%\end{aligned}\)
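The two predicted probabilities can be checked with a short sketch. The helper name `prob_from_logit` is hypothetical; the coefficients are the ones reported in the text.

```python
import math

def prob_from_logit(x):
    """Convert a log-odds value into a probability."""
    odds = math.exp(x)
    return odds / (1 + odds)

b0, b1 = -1.471, 0.593  # intercept and gender (female) coefficient

p_male = prob_from_logit(b0 + b1 * 0)    # female = 0
p_female = prob_from_logit(b0 + b1 * 1)  # female = 1

print(round(100 * p_male, 2), round(100 * p_female, 2))
```

Because gender is coded 0/1, the male prediction uses the intercept alone, while the female prediction adds the gender coefficient before converting to a probability.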

Interpreting the individual coefficients in units of log odds is not intuitive.

Common practice in logistic regression is to convert our coefficients into odds ratios.

Doing this is simple: it just involves exponentiating the coefficients. From our example:

Remember:

\(\begin{aligned}logit(p) = log(\frac{p}{1-p}) = \beta_0+\beta_1X_1\end{aligned}\)

So we have two quantities of interest: the baseline odds for males, and the odds ratio for females compared to males.

To get these values, we exponentiate the coefficients.

The odds of being in honors for males is:

\(\begin{aligned}e^{\beta_0} = e^{-1.47} = 0.23\end{aligned}\)

The odds for females is:

\(\begin{aligned}e^{\beta_0 + \beta_1} = e^{-1.47~+~0.59} = 0.42\end{aligned}\)

And the odds ratio for females to males is:

\(\begin{aligned}\frac{e^{\beta_0+\beta_1}}{e^{\beta_0}} = \frac{0.416}{0.230} = 1.81\end{aligned}\)

Which can be calculated directly by exponentiating the coefficient for females:

\(\begin{aligned}e^{\beta_1} = e^{0.593} = 1.81\end{aligned}\)

The exponentiated coefficients of predictors in logistic regression are odds ratios (OR): they compare the odds in one state to the odds in another.

Note, an OR of 1.81 is often interpreted as 81% greater odds for females than for males.
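To illustrate why exponentiating the gender coefficient gives the odds ratio, here is a small sketch that computes the ratio both ways. The variable names are our own, and the three-decimal coefficients are taken from the text.

```python
import math

b0, b1 = -1.471, 0.593  # intercept and gender (female) coefficient

odds_male = math.exp(b0)
odds_female = math.exp(b0 + b1)

or_from_odds = odds_female / odds_male  # ratio of the two odds
or_direct = math.exp(b1)                # exponentiated coefficient
pct_greater = (or_direct - 1) * 100     # percent difference in odds

print(round(or_from_odds, 2), round(or_direct, 2), round(pct_greater))
```

The two routes agree because \(e^{\beta_0+\beta_1} / e^{\beta_0} = e^{\beta_1}\); the intercept cancels.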

Example 3 - A Single Continuous Predictor

We’ll predict being in honors based on math scores

\(\begin{aligned}logit(p) =\beta_0 + \beta_1*math \end{aligned}\)

	Being in Honors
	Log-Odds	CI	p
(Intercept)	-9.79	-12.94 – -7.09	<.001
math	0.16	0.11 – 0.21	<.001
Observations	200

So, the intercept, -9.794, is the log of the odds (logit) of being in honors when math scores = zero. The odds of being in honors when math = 0 is the exponentiated coefficient:

\(\begin{aligned}e^{-9.794} =0.0001\end{aligned}\)

The probability of being in honors when math = 0 is:

\(\begin{aligned}p(honors~|~math=0) = \frac{e^{-9.794}}{1+e^{-9.794}} = \frac{0.0001}{1.0001} \approx 0.01\%\end{aligned}\)

The coefficient on math represents the change in log odds for each 1-point increase in math scores.

So, as math scores go up 1 point, the log odds of being in honors goes up by 0.156.

The probability of being in honors when math = mean(math) is:

\(\begin{aligned}p(honors~|~math=52.65) = \frac{e^{-9.794~+~0.156*52.65}}{1+e^{-9.794~+~0.156*52.65}} = 17.32\%\end{aligned}\)

The odds ratio for a 1 point increase in math scores is:

\(\begin{aligned}e^{\beta_1} = e^{0.156} = 1.17 \end{aligned}\)

This corresponds to a 17% increase in odds for every 1 point increase in math scores.
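The sketch below, under the assumption that the rounded coefficients shown above are used, computes predicted probabilities at a few math scores and the odds ratio for increases of more than one point. The function names and the scores 40 and 65 are illustrative choices, not part of the original analysis. Because the coefficients are rounded, the probability at the mean may differ somewhat from the figure quoted in the text.

```python
import math

# Coefficients as reported (rounded) in the single-predictor model above.
b0, b1 = -9.794, 0.156

def prob_honors(score):
    """Predicted probability of honors at a given math score."""
    odds = math.exp(b0 + b1 * score)
    return odds / (1 + odds)

def odds_ratio(points):
    """Odds ratio for a `points`-point increase in math score."""
    return math.exp(b1 * points)

for score in (40, 52.65, 65):  # 40 and 65 are arbitrary illustration values
    print(score, round(prob_honors(score), 3))

print(round(odds_ratio(1), 2), round(odds_ratio(10), 2))
```

Note that the odds ratio compounds multiplicatively: a 10-point increase multiplies the odds by \(e^{10\beta_1}\), which equals the 1-point odds ratio raised to the 10th power, not 10 times the 1-point odds ratio.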

Example 4 - Multiple Predictors

Let’s look at predicting being in honors based on math scores, reading scores, and gender.

\(\begin{aligned}logit(p) =\beta_0 + \beta_1math + \beta_2gender+\beta_3read\end{aligned}\)

	Being in Honors
	Log-Odds	CI	p
(Intercept)	-11.77	-15.40 – -8.66	<.001
math	0.12	0.06 – 0.19	<.001
female	0.98	0.17 – 1.84	.020
read	0.06	0.01 – 0.11	.026
Observations	200

This fitted model says that, holding math and reading at a fixed value, the odds of getting into an honors class for females (female = 1) over the odds of getting into an honors class for males (female = 0) is:

\(\begin{aligned}e^{0.98} = 2.66\end{aligned}\)

In terms of percent change, we can say that the odds for females are 166% higher than the odds for males.

Similarly, the coefficient for math (0.12) says that, holding gender and reading scores at a fixed value, a one point increase in math scores multiplies the odds by:

\(\begin{aligned}e^{0.12} = 1.13\end{aligned}\)

In terms of percent change, we can say that the odds increase by 13% for every 1 point increase in math scores.
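A short sketch can make the “holding the other predictors fixed” interpretation concrete. The coefficients are the rounded values from the table above; the function name `odds` and the particular math and reading scores are hypothetical and chosen only for illustration.

```python
import math

# Rounded coefficients from the multiple-predictor model above.
b0 = -11.77
b_math, b_female, b_read = 0.12, 0.98, 0.06

def odds(math_score, female, read_score):
    """Odds of honors for a given combination of predictor values."""
    return math.exp(b0 + b_math * math_score + b_female * female + b_read * read_score)

or_female = math.exp(b_female)
or_math = math.exp(b_math)
pct_female = (or_female - 1) * 100
pct_math = (or_math - 1) * 100

# The female/male odds ratio is the same at any fixed math and reading scores.
ratio_a = odds(50, 1, 50) / odds(50, 0, 50)
ratio_b = odds(65, 1, 40) / odds(65, 0, 40)

print(round(or_female, 2), round(pct_female), round(or_math, 2), round(pct_math))
print(round(ratio_a, 2), round(ratio_b, 2))
```

Because this model has no interaction terms, the female-to-male odds ratio does not depend on which math and reading values are held fixed.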

Example 5 - An Interaction between a categorical and a continuous predictor

Let’s see if gender modifies the influence of math scores on the likelihood of being in honors.
Note, when we model interactions, we center our continuous predictor variables. Thus, the mean of math will now be zero.

We’ll start with the 2 predictors, gender and math, with no interaction.

\(\begin{aligned}logit(p) =\beta_0 + \beta_1gender + \beta_2math\end{aligned}\)


	Being in Honors
	Log-Odds	CI	p
(Intercept)	-10.81	-14.25 – -7.87	<.001
gender (Female)	0.97	0.17 – 1.81	.020
math	0.16	0.12 – 0.22	<.001
Observations	200

For males, gender = 0, so \(\beta_1(0)\) drops out. And, if we want to look at the likelihood of being in honors when math scores are at their average level (math = 0), \(\beta_2(0)\) also drops out.

Thus, the odds of being in honors for males when both gender and math scores are in the model, and math is set at the mean score, is:

\(\begin{aligned}e^{\beta_0} = e^{-10.81} \approx 0.00002\end{aligned}\)

And, the odds of being in honors for females when both gender and math scores are in the model, and math is set at the mean score, is:

\(\begin{aligned}e^{\beta_0+\beta_1(1)} = e^{-10.81+ 0.97} = e^{-9.84} \approx 0.00005\end{aligned}\)

Now, let’s look at the model with the interaction term included

\(\begin{aligned}logit(p) =\beta_0 + \beta_1gender + \beta_2math+\beta_3(math*gender)\end{aligned}\)

	Being in Honors
	Log-Odds	CI	p
(Intercept)	-8.75	-13.44 – -4.97	<.001
genderFemale	-2.90	-9.11 – 3.24	.349
math	0.13	0.06 – 0.21	<.001
genderFemale:math	0.07	-0.04 – 0.18	.210
Observations	200

It no longer makes sense to talk about holding math or gender constant, since our interest is in the interaction of these two variables and our third term is dependent on both of them.

Although the interaction term, genderFemale:math, is not statistically significant, we will proceed with interpreting the interaction for purposes of illustrating the process.

Looking at the output for the non-interaction model and the interaction model, we see that the coefficients for gender and math both decrease when the interaction is included.

Since male is coded zero, the interaction term cancels out, (math * gender) = (math * 0); similarly, the gender term cancels out (gender = 0), and we are left with the intercept and the math coefficient. But, since we centered math, if we look at the average math score, then the math term also drops out, and we have just the intercept predicting honors for males when math = mean(math) = 0 and the interaction is present:

\(\begin{aligned}logit(p) = log(\frac{p}{1-p}) = \beta_0 + \beta_1(0) + \beta_2(math)+\beta_3(math*0)\end{aligned}\)

\(\begin{aligned} log(\frac{p}{1-p}) = \beta_0 + \beta_2(math)\end{aligned}\)

\(\begin{aligned} log(\frac{p}{1-p}) = -8.75~+~0.13*math\end{aligned}\)

Thus, to calculate the odds of a male being in honors when math is at the mean level (0 due to centering), we exponentiate both sides of the equation and plug in zero for math:

\(\begin{aligned}\frac{p}{1-p} = e^{-8.75+(0.13*0)} = e^{-8.75} \approx 0.00016\end{aligned}\)

Which equates to a probability of a male with an average math score being in honors of:

\(\begin{aligned}p(honors~|~male=True, math=avg(math)) = \frac{e^{-8.75}}{1+e^{-8.75}} = 0.02\%\end{aligned}\)

When gender = female = 1, the equation is:

\(\begin{aligned}logit(p) =\beta_0 + \beta_1(1) + \beta_2(math)+\beta_3(math*1)\end{aligned}\)

\(\begin{aligned} ~~~~~~~~~~~ = \beta_0 + \beta_1 + \beta_2math+\beta_3math\end{aligned}\)

\(\begin{aligned} ~~~~~~~~~~~ = (\beta_0 + \beta_1) + (\beta_2+\beta_3)*math\end{aligned}\)

\(\begin{aligned}logit(p) = (-8.75~+~-2.9) ~+~(0.13~+~0.07)*(math)\end{aligned}\)

\(\begin{aligned}\frac{p}{1-p} = e^{(-8.75~+~-2.9) ~+~(0.13~+~0.07)*(math)}\end{aligned}\)

For the odds of a female being in honors when math is at the mean level, we again set math to zero and we get:

\(\begin{aligned}\frac{p}{1-p} = e^{(-11.65) ~+~(0)} \approx 0.00001\end{aligned}\)

Which equates to a probability of a female with an average math score being in honors of:

\(\begin{aligned}p(honors~|~female=True, math=avg(math)) = \frac{e^{(-11.65)}}{1+e^{(-11.65)}} \approx 0.001\%\end{aligned}\)

To summarize, when there is no interaction modeled, the odds of being in honors for males and females at the mean math score are:

\(\begin{aligned}odds(honors~|~gender=male, math=mean(math)) = e^{\beta_0} = e^{-10.81} \approx 0.00002\end{aligned}\)

\(\begin{aligned}odds(honors~|~gender=female, math=mean(math)) = e^{\beta_0+\beta_1} = e^{-10.81~+~0.97} = e^{-9.84} \approx 0.00005\end{aligned}\)

And when there is an interaction in the model:

\(\begin{aligned}odds(honors~|~gender=male, math=mean(math)) = e^{\beta_0} = e^{-8.75} \approx 0.00016\end{aligned}\)

\(\begin{aligned}odds(honors~|~gender=female, math=mean(math)) = e^{(\beta_0+\beta_1)+(\beta_2+\beta_3)*(0)} = e^{-8.75~+~-2.9} = e^{-11.65} \approx 0.00001\end{aligned}\)

We see that when math scores are at their average value, including the interaction raises the estimated odds of being in honors for males and lowers them for females; thus the effect of math on the likelihood of being in honors is modeled differently for males and females.
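As a rough sketch of how the interaction model separates into one line per gender, the snippet below encodes the interaction-model coefficients reported above and recovers the gender-specific intercepts and math slopes. The function name `log_odds` and the argument name `math_c` (centered math) are our own.

```python
import math

# Rounded coefficients from the interaction model above.
b0, b_gender, b_math, b_int = -8.75, -2.90, 0.13, 0.07

def log_odds(female, math_c):
    """Log odds of honors; math_c is the centered math score."""
    return b0 + b_gender * female + b_math * math_c + b_int * female * math_c

# Slope of math within each gender: change in log odds per 1-point increase.
slope_male = log_odds(0, 1) - log_odds(0, 0)    # beta_2
slope_female = log_odds(1, 1) - log_odds(1, 0)  # beta_2 + beta_3

# Odds at math_c = 0 for each gender.
odds_male = math.exp(log_odds(0, 0))
odds_female = math.exp(log_odds(1, 0))

print(round(slope_male, 2), round(slope_female, 2))
print(odds_male, odds_female)
```

The female line has intercept \(\beta_0+\beta_1\) and slope \(\beta_2+\beta_3\), so with an interaction the gender difference in log odds changes with the math score instead of being a single constant.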

Example 6 - An Interaction between two continuous predictors

We’ll start with the 2 predictors, reading and math scores, with no interaction.

\(\begin{aligned}logit(p) =\beta_0 + \beta_1math + \beta_2read\end{aligned}\)


	Being in Honors
	Log-Odds	CI	p
(Intercept)	-10.83	-14.25 – -7.90	<.001
read	0.06	0.01 – 0.11	.026
math	0.12	0.06 – 0.18	<.001
Observations	200

The model with the interaction term:

\(\begin{aligned}logit(p) =\beta_0 + \beta_1read + \beta_2math+\beta_3(read*math)\end{aligned}\)

	Being in Honors
	Log-Odds	CI	p
(Intercept)	-14.63	-21.19 – -9.71	<.001
read	0.09	0.03 – 0.16	.005
math	0.16	0.09 – 0.24	<.001
readMathIntr	-0.01	-0.01 – -0.00	.042
Observations	200

[Figure: Influence of math level on reading’s association with being in honors (simple slopes)]

From the graph, we can see that as math scores go up, the impact (slope) of reading on the likelihood of being in honors goes down, as expected from the negative interaction coefficient (-0.01).

The specific slopes for the fixed math scores are:

math	slopes	OR
-18.74	0.21	1.24
-9.37	0.15	1.16
0.00	0.09	1.09
9.37	0.03	1.03
18.74	-0.04	0.97

To calculate the slopes for fixed values of one of the two variables in an interaction, you must do a little algebraic simplification:

\(\begin{aligned}logit(p) =\beta_0 + \beta_1read + \beta_2math+\beta_3(read*math)\end{aligned}\)

In this case, if we hold math score constant, then the equation simplifies to:

\(\begin{aligned}logit(p) =\beta_0 + \beta_1(read) + \beta_2(fixedMathScore)+\beta_3(fixedMathScore*read)\end{aligned}\)

\(\begin{aligned}~~~~~~~ =\beta_0 + \beta_2(fixedMathScore) + (\beta_1 +(\beta_3*fixedMathScore))*read\end{aligned}\)

Let’s fix the math score at 1 standard deviation above the centered mean of zero, = 9.37:

\(\begin{aligned}logit(p) =\beta_0 + \beta_2(9.37) + (\beta_1+(\beta_3*9.37))*read\end{aligned}\)

Pulling values from our model:

\(\begin{aligned}\beta_0 = -1.62, ~~\beta_1 = 0.09,~~\beta_2 = 0.16,~~\beta_3 = -0.01\end{aligned}\)

\(\begin{aligned}logit(p) =-1.62 + (0.16*9.37) + (0.09+(-0.01*9.37))*read\end{aligned}\)

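The slope of read (which can be verified in the table above) is the coefficient on read, \(\beta_1+(\beta_3*9.37) = 0.03\) when computed from the unrounded coefficients; the two-decimal values shown above are too coarse to reproduce it exactly.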
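The simple-slope calculation can be sketched as a one-line function. One loud assumption: the interaction coefficient is shown in the table only to two decimals (-0.01), so the value `b_int = -0.0064` below is a hypothetical stand-in back-solved from the slope table (for example, (0.03 - 0.09) / 9.37); it is not a number reported by the original analysis.

```python
b_read = 0.09     # coefficient on read (rounded, from the interaction model)
b_int = -0.0064   # ASSUMPTION: back-solved from the slope table, not reported
sd_math = 9.37    # one standard deviation of centered math, per the text

def read_slope(fixed_math):
    """Slope of read (in log odds) when math is held at fixed_math."""
    return b_read + b_int * fixed_math

# Slopes at -2, -1, 0, +1, +2 standard deviations of math.
math_levels = [k * sd_math for k in (-2, -1, 0, 1, 2)]
slopes = [round(read_slope(m), 2) for m in math_levels]

print(list(zip([round(m, 2) for m in math_levels], slopes)))
```

With this assumed coefficient the sketch tracks the slope table closely for most rows, though the last decimal can differ because both inputs are approximations.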

Example 7 - A three-way Interaction between two continuous predictors and a categorical variable

We’ll model the same 2 continuous predictors, reading and math scores, and test their interaction with gender.

\(\begin{aligned}logit(p) =\beta_0 + \beta_1read + \beta_2math + \beta_3gender + \beta_4(read*math) + \beta_5(gender*read) + \beta_6(gender*math) +\beta_7(gender*read*math)\end{aligned}\)

	Being in Honors
	Log-Odds	CI	p
(Intercept)	-2.46	-4.12 – -1.43	<.001
readC	0.21	-0.04 – 0.54	.148
mathC	0.28	-0.02 – 0.68	.112
gender (Female)	1.13	-0.13 – 2.90	.126
readMathIntr	-0.02	-0.06 – 0.00	.086
genderReadIntr	-0.07	-0.24 – 0.08	.387
genderMathIntr	-0.06	-0.28 – 0.11	.518
genderReadMathIntr	0.01	-0.00 – 0.03	.163
Observations	200

To break this down, we first fix gender = male = 0, and get the simplified functions:

\(\begin{aligned}logit(p) =\beta_0 + \beta_1read + \beta_2math + \beta_3(0) + \beta_4(read*math) + \beta_5(0*read) + \beta_6(0*math) +\beta_7(0*read*math)\end{aligned}\)

\(\begin{aligned}logit(p) =\beta_0 + \beta_1read + \beta_2math + \beta_4(read*math)\end{aligned}\)

Now we have the same formula as above where we had the two-way continuous variable interaction.

math	slopes	OR	gender
-18.74	0.66	1.93	Male
-9.37	0.43	1.54	Male
0.00	0.21	1.23	Male
9.37	-0.02	0.98	Male
18.74	-0.24	0.78	Male

Now fix gender = female = 1, and simplify the equations:

\(\begin{aligned}logit(p) =\beta_0 + \beta_1read + \beta_2math + \beta_3(1) + \beta_4(read*math) + \beta_5(1*read) + \beta_6(1*math) +\beta_7(1*read*math)\end{aligned}\)

\(\begin{aligned}logit(p) =\beta_0 + \beta_1read + \beta_2math + \beta_3 + \beta_4(read*math) + \beta_5read + \beta_6math + \beta_7(read*math)\end{aligned}\)

\(\begin{aligned}~~~~~~ =\beta_0 + \beta_3 + \beta_1read + \beta_2math + \beta_4(read*math) + \beta_5read + \beta_6math + \beta_7(read*math)\end{aligned}\)

Let’s fix math at 9.368 and rearrange the equation:

\(\begin{aligned}~~~~~~ =\beta_0 + \beta_3 + \beta_2(9.368) + \beta_6(9.368) +\beta_1read + \beta_4(read*9.368) + \beta_5read + \beta_7(read*9.368)\end{aligned}\)

\(\begin{aligned}~~~~~~ =\beta_0 + \beta_3 + \beta_2(9.368) + \beta_6(9.368) + (\beta_1 + \beta_4(9.368) + \beta_5 + \beta_7(9.368))read\end{aligned}\)

\(\begin{aligned}~~~~~~ =-2.458 + 1.134 +0.277(9.368) + -0.062(9.368) + (0.206 + -0.024(9.368) + -0.069 + 0.011(9.368))read\end{aligned}\)

\(\begin{aligned}~~~~~~ =0.686 + (0.02)read\end{aligned}\)

Thus the slope of read for females when math is fixed at 9.368 is 0.02.

This is consistent with the values below:

math	male_slopes	male_OR	female_slopes	female_OR
-18.74	0.66	1.93	0.37	1.45
-9.37	0.43	1.54	0.26	1.29
0.00	0.21	1.23	0.14	1.15
9.37	-0.02	0.98	0.02	1.02
18.74	-0.24	0.78	-0.10	0.91
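The gender-specific slopes of read can be sketched directly from the three-decimal coefficients used in the worked equation above. The dictionary keys and function names are illustrative, and because the inputs are rounded, the output may differ from the table in the last decimal at some math levels.

```python
# Three-decimal coefficients as used in the worked equation above.
b0 = -2.458
b = {
    "read": 0.206, "math": 0.277, "gender": 1.134,
    "read_math": -0.024, "gender_read": -0.069,
    "gender_math": -0.062, "gender_read_math": 0.011,
}

def read_slope(female, fixed_math):
    """Slope of read (log odds) for a gender (0 = male, 1 = female) at fixed math."""
    return (b["read"] + b["read_math"] * fixed_math
            + female * (b["gender_read"] + b["gender_read_math"] * fixed_math))

def intercept(female, fixed_math):
    """Everything in the equation that does not multiply read."""
    return (b0 + b["math"] * fixed_math
            + female * (b["gender"] + b["gender_math"] * fixed_math))

for m in (-9.368, 0.0, 9.368):
    print(m, round(read_slope(0, m), 2), round(read_slope(1, m), 2))
```

Setting `female` to 0 removes every term involving gender, which reproduces the two-way reduction shown for males; setting it to 1 adds \(\beta_5 + \beta_7 \cdot math\) to the slope and \(\beta_3 + \beta_6 \cdot math\) to the intercept.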

Finally, what we really want to see is a close comparison of the gender slopes by math scores:

This tutorial was created in RStudio using the following packages:
knitr
stargazer
dplyr
ggplot2
stats

CSP-708-logistic-regression

Steven Vannoy

November 7, 2016
