The purpose of this document is to explore the interpretation of coefficients in logistic regression models.
For this example, we would like to demonstrate how to interpret the coefficients of a logistic regression model. We will fit the model with R's standard glm() function. For this demonstration, we want a final model of the form:
\[ g(p) = \ln\left(\frac{p}{1-p}\right) = \hat\beta_0 + \hat\beta_1x_1 + \hat\beta_2x_2 \]
The above is a standard logistic regression model with two predictors: on the log-odds scale it is a plane in three dimensions.
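The link \(g()\) and its inverse are available directly in base R as qlogis() and plogis(). A quick sketch, using the example probability of 0.625 that shows up later in this writeup:

p <- 0.625
g <- qlogis(p)   # log-odds: log(0.625 / 0.375), about 0.511
plogis(g)        # back to the probability: 0.625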
We will simulate data from the model above with the following code:
library(plotly)
## Loading required package: ggplot2
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
# Empty data frame: predictor values b1 and b2, binary response z,
# and ptype to mark training points (0) vs. fitted points (1) later.
df <- data.frame(b1=as.double(), b2=as.double(), z=as.integer(), ptype=as.integer())
for (b1 in seq(-10, 10, by=1)) {
  for (b2 in seq(-10, 10, by=1)) {
    # Inverse logit of -(b1 + b2 + noise), rounded to a 0/1 response.
    z <- as.integer(round(1 / (1 + (exp(b1 + rnorm(1)) * exp(b2)))))
    df <- rbind(df, c(b1, b2, z, 0))
  }
}
colnames(df) <- c('b1', 'b2', 'z', 'ptype')
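As an aside, the expression inside round() is just the inverse logit: 1 / (1 + exp(b1 + noise) * exp(b2)) equals plogis(-(b1 + b2 + noise)). A sketch of the same response written with base R's plogis() (equivalent up to the random draw):

b1 <- 2; b2 <- -1                # example predictor values
noise <- rnorm(1)
# Latent log-odds are -(b1 + b2 + noise), so both true slopes are
# negative -- consistent with the fitted coefficients below.
z_alt <- as.integer(round(plogis(-(b1 + b2 + noise))))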
summary(df)
## b1 b2 z ptype
## Min. :-10 Min. :-10 Min. :0.0000 Min. :0
## 1st Qu.: -5 1st Qu.: -5 1st Qu.:0.0000 1st Qu.:0
## Median : 0 Median : 0 Median :0.0000 Median :0
## Mean : 0 Mean : 0 Mean :0.4921 Mean :0
## 3rd Qu.: 5 3rd Qu.: 5 3rd Qu.:1.0000 3rd Qu.:0
## Max. : 10 Max. : 10 Max. :1.0000 Max. :0
head(df)
p <- plot_ly(df, x = ~b1, y=~b2, z=~z)
p
## No trace type specified:
## Based on info supplied, a 'scatter3d' trace seems appropriate.
## Read more about this trace type -> https://plot.ly/r/reference/#scatter3d
## No scatter3d mode specifed:
## Setting the mode to markers
## Read more about this attribute -> https://plot.ly/r/reference/#scatter-mode
Ok. So good enough. Now, let’s see if we can fit a logistic regression model to this thing…
mdl <- glm(z~b1+b2, data=df, family=binomial)
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
summary(mdl)
##
## Call:
## glm(formula = z ~ b1 + b2, family = binomial, data = df)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.91386 -0.00131 0.00000 0.00179 1.42716
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.3855 0.3391 -1.137 0.256
## b1 -2.2569 0.4265 -5.292 1.21e-07 ***
## b2 -2.2774 0.4299 -5.298 1.17e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 611.24 on 440 degrees of freedom
## Residual deviance: 59.09 on 438 degrees of freedom
## AIC: 65.09
##
## Number of Fisher Scoring iterations: 10
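Before stepping through the arithmetic by hand, note that exponentiating the coefficients gives the same information on the odds scale. A minimal sketch:

# Each one-unit increase in a predictor multiplies the odds of z = 1
# by exp(coefficient): about 0.1047 for b1 and 0.1025 for b2 here.
exp(coef(mdl))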
b1_coef <- mdl$coefficients[2]      # slope for b1 (avoid masking base::c)
x1_log_odds <- b1_coef              # the coefficient IS the log-odds change
x1_odds <- exp(x1_log_odds)         # multiplicative odds factor per unit of b1
pcnt <- x1_odds / (1 + x1_odds)     # that factor expressed as a probability
old_p <- 0.625                      # example starting probability
old_odds <- old_p / (1 - old_p)     # corresponding odds
new_odds <- x1_odds * old_odds      # odds after a one-unit increase in b1
new_p <- new_odds / (1 + new_odds)  # probability after the change
cat("\nCoefficient: ", b1_coef, "\n")
##
## Coefficient: -2.256881
cat("\nLog odds: ", x1_log_odds, "\n")
##
## Log odds: -2.256881
cat("Odds: ", x1_odds, "\n")
## Odds: 0.1046765
cat("New Odds: ", new_p, '\n')
## New Odds: 0.1485454
cat("Pcnt change: ", pcnt, "\n")
## Pcnt change: 0.09475759
Ok. Based on the above, we have a model fit to our data. Note the warning that fitted probabilities of 0 or 1 occurred: the simulated data are nearly separable, which makes the coefficient estimates quite steep. If we want to mess with the coefficients, let's try a few things.
For a 1 unit positive change in \(x_1\), the log odds change by -2.2568808 (this is just the coefficient), holding all other predictors constant. Equivalently, the odds are multiplied by a factor of \(e^{-2.2568808} = 0.1046765\). To express an odds value on the probability scale, we convert it with:
\[ \frac{odds}{1+odds} \]

As an example, let's say that our model outputs a probability of 0.625 for a given input. The odds are therefore \(0.625/0.375 \approx 1.667\). Applying a 1 unit change to the predictor associated with \(\beta_1\) (namely \(x_1\)), we would expect to see the following:
| Coefficient | Log-odds change | Odds factor | Old odds | New odds | Old probability | New probability |
|---|---|---|---|---|---|---|
| -2.2568808 | -2.2568808 | 0.1046765 | 1.6666667 | 0.1744608 | 0.625 | 0.1485454 |
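Here is the arithmetic behind the table, spelled out in R (odds_to_prob is just an illustrative helper, not part of the model code above):

odds_to_prob <- function(odds) odds / (1 + odds)   # probability from odds

old_p    <- 0.625                        # given probability
old_odds <- old_p / (1 - old_p)          # 1.6666667
odds_fac <- exp(coef(mdl)["b1"])         # 0.1046765
new_odds <- odds_fac * old_odds          # 0.1744608
new_p    <- odds_to_prob(new_odds)       # 0.1485454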
So, what does this mean?
Let’s superimpose our model predictions with our model training data…
# Prediction grid, offset by 0.5 from the training grid so the two
# point clouds don't overlap in the plot; ptype = 1 marks fitted values.
df2 <- data.frame(b1=as.double(), b2=as.double(), z=as.double(), ptype=as.integer())
for (b1 in seq(-10.5, 10.5, by=1)) {
  for (b2 in seq(-10.5, 10.5, by=1)) {
    dat <- data.frame("b1"=b1, "b2"=b2)
    z <- predict(mdl, newdata=dat, type="response")   # fitted probability
    df2 <- rbind(df2, c(b1, b2, z, 1))
  }
}
colnames(df2) <- c('b1', 'b2', 'z', 'ptype')
head(df2)
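As an aside, growing a data frame with rbind() inside nested loops works but is slow in R. A vectorized sketch with expand.grid() builds the same prediction grid (up to row order) in one shot:

# Same grid and predictions as df2, without the loops.
grid <- expand.grid(b1 = seq(-10.5, 10.5, by = 1),
                    b2 = seq(-10.5, 10.5, by = 1))
grid$z     <- predict(mdl, newdata = grid, type = "response")
grid$ptype <- 1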
df3 <- rbind(df,df2)
p <- plot_ly(df3, x = ~b1, y=~b2, z=~z, color=~ptype, size=.5, colors = c('#BF382A', '#0C4B8E'))
p
## No trace type specified:
## Based on info supplied, a 'scatter3d' trace seems appropriate.
## Read more about this trace type -> https://plot.ly/r/reference/#scatter3d
## No scatter3d mode specifed:
## Setting the mode to markers
## Read more about this attribute -> https://plot.ly/r/reference/#scatter-mode
In the plot, we can see blue and red dots. The red dots are the original training data. The blue dots are the fitted model. Notice that as we move in the positive direction along the b1 axis, the predicted probability of a positive outcome drops toward zero. The calculations we ran through above show exactly that: if we move 1 unit in the positive b1 direction, the odds of being positive change by a factor of 0.1046765. So if our old odds were 1.6666667, then our new odds of being positive will be 0.1744608.
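We can verify that factor straight from the fitted model: take two predictions one unit apart in b1, convert each to odds, and their ratio recovers \(e^{\hat\beta_1}\). A sketch (the two evaluation points are arbitrary):

# Predicted probabilities at two points one unit apart in b1.
p0 <- predict(mdl, newdata = data.frame(b1 = 0, b2 = 0), type = "response")
p1 <- predict(mdl, newdata = data.frame(b1 = 1, b2 = 0), type = "response")

# Ratio of the odds equals exp(coef) for b1: about 0.1047.
(p1 / (1 - p1)) / (p0 / (1 - p0))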
In this writeup, I showed how the coefficients of a logistic regression model can be interpreted as changes in the odds, in the context of a 3D graph.
Thanks!
Miles.