Anna
5 February 2019
Recall the logistic regression assumptions: a binary outcome, independent observations, a linear relationship between continuous predictors and the log-odds of the outcome, no severe multicollinearity, and no overly influential observations.
mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
head(mydata)
## admit gre gpa rank
## 1 0 380 3.61 3
## 2 1 660 3.67 3
## 3 1 800 4.00 1
## 4 1 640 3.19 4
## 5 0 520 2.93 4
## 6 1 760 3.00 2
library(psych)
describe(mydata)
## vars n mean sd median trimmed mad min max range skew
## admit 1 400 0.32 0.47 0.0 0.27 0.00 0.00 1 1.00 0.78
## gre 2 400 587.70 115.52 580.0 589.06 118.61 220.00 800 580.00 -0.14
## gpa 3 400 3.39 0.38 3.4 3.40 0.40 2.26 4 1.74 -0.21
## rank 4 400 2.48 0.94 2.0 2.48 1.48 1.00 4 3.00 0.10
## kurtosis se
## admit -1.39 0.02
## gre -0.36 5.78
## gpa -0.60 0.02
## rank -0.91 0.05
The ‘rank’ variable is read as quantitative; convert it to a factor and describe the data again:
class(mydata$rank)## [1] "integer"
table(mydata$rank)
##
## 1 2 3 4
## 61 151 121 67
mydata$rank <- factor(mydata$rank)
class(mydata$rank)## [1] "factor"
describe(mydata)
## vars n mean sd median trimmed mad min max range skew
## admit 1 400 0.32 0.47 0.0 0.27 0.00 0.00 1 1.00 0.78
## gre 2 400 587.70 115.52 580.0 589.06 118.61 220.00 800 580.00 -0.14
## gpa 3 400 3.39 0.38 3.4 3.40 0.40 2.26 4 1.74 -0.21
## rank* 4 400 2.48 0.94 2.0 2.48 1.48 1.00 4 3.00 0.10
## kurtosis se
## admit -1.39 0.02
## gre -0.36 5.78
## gpa -0.60 0.02
## rank* -0.91 0.05
describeBy(mydata, mydata$rank)
##
## Descriptive statistics by group
## group: 1
## vars n mean sd median trimmed mad min max range skew
## admit 1 61 0.54 0.50 1.00 0.55 0.00 0.00 1 1.00 -0.16
## gre 2 61 611.80 120.24 600.00 613.88 118.61 340.00 800 460.00 -0.06
## gpa 3 61 3.45 0.39 3.53 3.47 0.46 2.42 4 1.58 -0.33
## rank* 4 61 1.00 0.00 1.00 1.00 0.00 1.00 1 0.00 NaN
## kurtosis se
## admit -2.01 0.06
## gre -0.81 15.40
## gpa -0.55 0.05
## rank* NaN 0.00
## --------------------------------------------------------
## group: 2
## vars n mean sd median trimmed mad min max range skew
## admit 1 151 0.36 0.48 0.00 0.32 0.00 0.00 1 1.00 0.59
## gre 2 151 596.03 107.01 600.00 596.69 118.61 300.00 800 500.00 -0.15
## gpa 3 151 3.36 0.38 3.38 3.37 0.39 2.42 4 1.58 -0.15
## rank* 4 151 2.00 0.00 2.00 2.00 0.00 2.00 2 0.00 NaN
## kurtosis se
## admit -1.66 0.04
## gre -0.31 8.71
## gpa -0.65 0.03
## rank* NaN 0.00
## --------------------------------------------------------
## group: 3
## vars n mean sd median trimmed mad min max range skew
## admit 1 121 0.23 0.42 0.00 0.16 0.00 0.00 1 1.00 1.26
## gre 2 121 574.88 121.15 580.00 578.35 118.61 220.00 800 580.00 -0.28
## gpa 3 121 3.43 0.39 3.43 3.45 0.44 2.56 4 1.44 -0.24
## rank* 4 121 3.00 0.00 3.00 3.00 0.00 3.00 3 0.00 NaN
## kurtosis se
## admit -0.42 0.04
## gre -0.27 11.01
## gpa -0.89 0.04
## rank* NaN 0.00
## --------------------------------------------------------
## group: 4
## vars n mean sd median trimmed mad min max range skew
## admit 1 67 0.18 0.39 0.00 0.11 0.00 0.00 1 1.00 1.64
## gre 2 67 570.15 116.22 560.00 569.09 118.61 300.00 800 500.00 0.13
## gpa 3 67 3.32 0.36 3.33 3.32 0.36 2.26 4 1.74 -0.28
## rank* 4 67 4.00 0.00 4.00 4.00 0.00 4.00 4 0.00 NaN
## kurtosis se
## admit 0.69 0.05
## gre -0.68 14.20
## gpa -0.07 0.04
## rank* NaN 0.00
The share of admit = 1 is 0.32: without any other information, the predicted probability that admit = 0 is 0.68.
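This baseline can be checked directly; a minimal sketch:
prop.table(table(mydata$admit)) # 0.68 and 0.32: the probabilities of admit = 0 and admit = 1 with no predictors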
xtabs(~ admit + rank, data = mydata)
## rank
## admit 1 2 3 4
## 0 28 97 93 55
## 1 33 54 28 12
mylogit <- glm(admit ~ gre + gpa + rank, data = mydata, family = "binomial")
summary(mylogit)
##
## Call:
## glm(formula = admit ~ gre + gpa + rank, family = "binomial",
## data = mydata)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.6268 -0.8662 -0.6388 1.1490 2.0790
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.989979 1.139951 -3.500 0.000465 ***
## gre 0.002264 0.001094 2.070 0.038465 *
## gpa 0.804038 0.331819 2.423 0.015388 *
## rank2 -0.675443 0.316490 -2.134 0.032829 *
## rank3 -1.340204 0.345306 -3.881 0.000104 ***
## rank4 -1.551464 0.417832 -3.713 0.000205 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 499.98 on 399 degrees of freedom
## Residual deviance: 458.52 on 394 degrees of freedom
## AIC: 470.52
##
## Number of Fisher Scoring iterations: 4
anova(mylogit, test="Chisq")## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: admit
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 399 499.98
## gre 1 13.9204 398 486.06 0.0001907 ***
## gpa 1 5.7122 397 480.34 0.0168478 *
## rank 3 21.8265 394 458.52 7.088e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This model output shows that all predictors are significant, so we turn to the estimates.
Positive estimates indicate that, as the predictor changes from 0 to 1 (dummy) or increases by one unit (continuous), the chances of admission increase; a negative estimate indicates a decrease (on the log-odds scale). To interpret the coefficients more intuitively, we compute the odds or, better, the predicted probabilities (see the next slide).
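As a quick illustration, exponentiating a coefficient turns a log-odds estimate into an odds ratio (the full table with confidence intervals is computed below):
exp(coef(mylogit)) # odds ratios for all terms
exp(coef(mylogit)["gpa"]) # exp(0.804) is roughly 2.23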
*Some extra things:*
0. Change the reference level of a categorical predictor:
mydatarank2 <- within(mydata, rank <- relevel(rank, ref = 2))
mylogitrank2 <- update(mylogit, data = mydatarank2)
summary(mylogitrank2)
##
## Call:
## glm(formula = admit ~ gre + gpa + rank, family = "binomial",
## data = mydatarank2)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.6268 -0.8662 -0.6388 1.1490 2.0790
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.665422 1.109370 -4.205 2.61e-05 ***
## gre 0.002264 0.001094 2.070 0.0385 *
## gpa 0.804038 0.331819 2.423 0.0154 *
## rank1 0.675443 0.316490 2.134 0.0328 *
## rank3 -0.664761 0.283319 -2.346 0.0190 *
## rank4 -0.876021 0.366735 -2.389 0.0169 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 499.98 on 399 degrees of freedom
## Residual deviance: 458.52 on 394 degrees of freedom
## AIC: 470.52
##
## Number of Fisher Scoring iterations: 4
1. Check the fit of the ‘rank’ variable:
#install.packages("aod")
library(aod)
wald.test(b = coef(mylogit), Sigma = vcov(mylogit), Terms = 4:6)
## Wald test:
## ----------
##
## Chi-squared test:
## X2 = 20.9, df = 3, P(> X2) = 0.00011
anova(mylogit, test="Chisq")## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: admit
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 399 499.98
## gre 1 13.9204 398 486.06 0.0001907 ***
## gpa 1 5.7122 397 480.34 0.0168478 *
## rank 3 21.8265 394 458.52 7.088e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
2. Check the significance of the difference between ranks 2 and 3:
l <- cbind(0, 0, 0, 1, -1, 0) # contrast between rank2 and rank3
wald.test(b = coef(mylogit), Sigma = vcov(mylogit), L = l)
## Wald test:
## ----------
##
## Chi-squared test:
## X2 = 5.5, df = 1, P(> X2) = 0.019
exp(cbind(OR = coef(mylogit), confint(mylogit)))
## OR 2.5 % 97.5 %
## (Intercept) 0.0185001 0.001889165 0.1665354
## gre 1.0022670 1.000137602 1.0044457
## gpa 2.2345448 1.173858216 4.3238349
## rank2 0.5089310 0.272289674 0.9448343
## rank3 0.2617923 0.131641717 0.5115181
## rank4 0.2119375 0.090715546 0.4706961
When gpa increases by one unit, the odds of admission change by a factor of 2.23, i.e. they are multiplied by 2.23, an increase of 123 per cent [95% CI = 1.17; 4.32].
When students graduate from a rank-2 university, their odds of admission change by a factor of 0.51, i.e. they are roughly halved, a decrease of 49 per cent compared to graduates of a rank-1 university [95% CI = 0.27; 0.94].
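The percentage interpretation is simply (OR - 1) * 100; a minimal check:
(exp(coef(mylogit)["gpa"]) - 1) * 100 # about +123% in the odds per unit of gpa
(exp(coef(mylogit)["rank2"]) - 1) * 100 # about -49% in the odds for rank 2 vs rank 1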
newdata1 <- with(mydata, data.frame(gre = mean(gre),
gpa = mean(gpa),
rank = factor(1:4)))
newdata1
## gre gpa rank
## 1 587.7 3.3899 1
## 2 587.7 3.3899 2
## 3 587.7 3.3899 3
## 4 587.7 3.3899 4
newdata1$rankP <- predict(mylogit, newdata = newdata1, type = "response")
newdata1 # predicted probabilities for different ranks at the mean levels of gre and gpa
## gre gpa rank rankP
## 1 587.7 3.3899 1 0.5166016
## 2 587.7 3.3899 2 0.3522846
## 3 587.7 3.3899 3 0.2186120
## 4 587.7 3.3899 4 0.1846684
newdata2 <- with(mydata, data.frame(gre = rep(seq(from = 200, to = 800, length.out = 100), 4),
gpa = mean(gpa), rank = factor(rep(1:4, each = 100))))
newdata3 <- cbind(newdata2, predict(mylogit, newdata = newdata2, type = "link", se = T))
newdata3 <- within(newdata3, {
PredictedProb <- plogis(fit)
LL <- plogis(fit - (1.96 * se.fit))
UL <- plogis(fit + (1.96 * se.fit))
})
head(newdata3)
## gre gpa rank fit se.fit residual.scale UL
## 1 200.0000 3.3899 1 -0.8114870 0.5147714 1 0.5492064
## 2 206.0606 3.3899 1 -0.7977632 0.5090986 1 0.5498513
## 3 212.1212 3.3899 1 -0.7840394 0.5034491 1 0.5505074
## 4 218.1818 3.3899 1 -0.7703156 0.4978239 1 0.5511750
## 5 224.2424 3.3899 1 -0.7565919 0.4922237 1 0.5518545
## 6 230.3030 3.3899 1 -0.7428681 0.4866494 1 0.5525464
## LL PredictedProb
## 1 0.1393812 0.3075737
## 2 0.1423880 0.3105042
## 3 0.1454429 0.3134499
## 4 0.1485460 0.3164108
## 5 0.1516973 0.3193867
## 6 0.1548966 0.3223773
library(ggplot2)
ggplot(newdata3, aes(x = gre, y = PredictedProb)) +
geom_ribbon(aes(ymin = LL, ymax = UL, fill = rank), alpha = 0.2) +
geom_line(aes(colour = rank), size = 0.75) +
ylim(0, 1)
For gre <= 380, there is no difference between the graduates of rank1-rank4 universities. At higher gre scores, rank1 university graduates have a higher probability of being admitted than rank3-rank4 university graduates. Rank2 university graduates are not significantly different from rank1, rank3, or rank4 university graduates.
newdata4 <- with(mydata, data.frame(gre = mean(gre),
gpa = rep(seq(from = 2.2, to = 4.0, length.out = 18), 4),
rank = factor(rep(1:4, each = 18))))
newdata5 <- cbind(newdata4, predict(mylogit, newdata = newdata4, type = "link", se = T))
newdata5 <- within(newdata5, {
PredictedProb <- plogis(fit)
LL <- plogis(fit - (1.96 * se.fit))
UL <- plogis(fit + (1.96 * se.fit))
})
ggplot(newdata5, aes(x = gpa, y = PredictedProb)) +
geom_ribbon(aes(ymin = LL, ymax = UL, fill = rank), alpha = 0.2) +
geom_line(aes(colour = rank), size = 0.75) +
ylim(0, 1)
A similar story: if gpa is <= 2.8, there is no difference across university ranks. At higher gpa levels, rank1 university graduates have higher probabilities than rank3-rank4 university graduates.
In a linear regression, the R-squared and the F-test evaluate the overall model quality: the R-squared shows the explained share of the outcome’s variance, while the F-test says whether the model is better than no model (i.e. just the mean).
(But see: https://data.library.virginia.edu/is-r-squared-useless/)
In a binary logistic model, overall goodness-of-fit measures are based on the log-likelihood.
See https://www.r-bloggers.com/evaluating-logistic-regression-models/
The log-likelihood (LL) measures the discrepancy between the model and the data: the closer it is to zero, the better.
Based on the LL are the deviance (-2LL), the likelihood-ratio test, AIC, BIC, and the pseudo-R-squared measures.
We can compare them across nested models:
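For example, the LL-based quantities for mylogit can be extracted directly (a short sketch):
logLik(mylogit) # log-likelihood of the fitted model
-2 * as.numeric(logLik(mylogit)) # the residual deviance, 458.52
AIC(mylogit) # penalized -2LL
BIC(mylogit) # penalized -2LL with a stronger penalty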
library(lmtest)
mymodel2 <- glm(admit ~ gre + rank, data = mydata, family = "binomial")
mymodel3 <- glm(admit ~ rank, data = mydata, family = "binomial")
lrtest(mymodel3, mymodel2, mylogit) # you want LogLik closer to 0
## Likelihood ratio test
##
## Model 1: admit ~ rank
## Model 2: admit ~ gre + rank
## Model 3: admit ~ gre + gpa + rank
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 4 -237.48
## 2 5 -232.27 1 10.4349 0.001237 **
## 3 6 -229.26 1 6.0143 0.014190 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(mymodel3, mymodel2, mylogit, test ="Chisq") # This is JUST THE SAME (look at p-values), but it is -2LL.## Analysis of Deviance Table
##
## Model 1: admit ~ rank
## Model 2: admit ~ gre + rank
## Model 3: admit ~ gre + gpa + rank
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 396 474.97
## 2 395 464.53 1 10.4349 0.001237 **
## 3 394 458.52 1 6.0143 0.014190 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
These are two equivalent functions. Both outputs say that the 2nd model is significantly better than the 1st one, and that the 3rd model is significantly better than the 2nd one. We choose the best model (mylogit) and interpret it.
#install.packages("pscl")
library(pscl)
pR2(mylogit)
## llh llhNull G2 McFadden r2ML
## -229.25874624 -249.98825878 41.45902508 0.08292194 0.09845702
## r2CU
## 0.13799580
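McFadden's pseudo-R-squared above is just 1 - LL(model)/LL(null); a quick check, where the null model is assumed to be the intercept-only fit (null_fit is an illustrative name):
null_fit <- update(mylogit, . ~ 1) # intercept-only model
1 - as.numeric(logLik(mylogit)) / as.numeric(logLik(null_fit)) # about 0.083, as reported by pR2()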
library(rcompanion)
compareGLM(mymodel3, mymodel2, mylogit)
## $Models
## Formula
## 1 "admit ~ rank"
## 2 "admit ~ gre + rank"
## 3 "admit ~ gre + gpa + rank"
##
## $Fit.criteria
## Rank Df.res AIC AICc BIC McFadden Cox.and.Snell Nagelkerke
## 1 4 396 485.0 485.1 504.9 0.05002 0.06061 0.08495
## 2 5 395 476.5 476.7 500.5 0.07089 0.08480 0.11890
## 3 6 394 472.5 472.8 500.5 0.08292 0.09846 0.13800
## p.value
## 1 7.399e-06
## 2 1.781e-07
## 3 3.528e-08
A well-fitting model shows no significant difference between the model and the observed data, i.e. the reported p-value should be greater than 0.05.
See: http://thestatsgeek.com/2014/02/16/the-hosmer-lemeshow-goodness-of-fit-test-for-logistic-regression/
#install.packages("generalhoslem")
library(generalhoslem)
logitgof(mydata$admit, fitted(mylogit), g = 10) # g should be larger than the number of predictors; df = g - 2
##
## Hosmer and Lemeshow test (binary model)
##
## data: mydata$admit, fitted(mylogit)
## X-squared = 11.085, df = 8, p-value = 0.1969
#Alternatively:
library(sjstats)
hoslem_gof(mylogit)
##
## # Hosmer-Lemeshow Goodness-of-Fit Test
##
## Chi-squared: 11.085
## df: 8
## p-value: 0.197
First, we need training and test subsamples, and we fit a model on the training data:
bound <- floor((nrow(mydata)/4) * 3) # cut-off: 75% of the rows go to the training set (could also be 50% or 60%)
set.seed(123)
df <- mydata[sample(nrow(mydata)), ] #sample 400 random rows out of the data
df.train <- df[1:bound, ] #get training set
df.test <- df[(bound + 1):nrow(df), ] #get test set
mylogit1 <- glm(admit ~ gre + gpa + rank, data = df.train, family = "binomial",
na.action = na.omit)
Then we count the proportion of outcomes accurately predicted on the test data:
pred <- format(round(predict(mylogit1, newdata = df.test, type = "response"))) # round the predicted probabilities, i.e. classify with a 0.5 threshold
#accuracy <- table(pred, df.test[,"admit"])
#sum(diag(accuracy))/sum(accuracy)
#table(pred, df.test$admit)
library(caret)
#confusionMatrix(data=pred, df.test$admit)
confusionMatrix(table(pred, df.test$admit))
## Confusion Matrix and Statistics
##
##
## pred 0 1
## 0 65 27
## 1 4 4
##
## Accuracy : 0.69
## 95% CI : (0.5897, 0.7787)
## No Information Rate : 0.69
## P-Value [Acc > NIR] : 0.5484
##
## Kappa : 0.0893
## Mcnemar's Test P-Value : 7.772e-05
##
## Sensitivity : 0.9420
## Specificity : 0.1290
## Pos Pred Value : 0.7065
## Neg Pred Value : 0.5000
## Prevalence : 0.6900
## Detection Rate : 0.6500
## Detection Prevalence : 0.9200
## Balanced Accuracy : 0.5355
##
## 'Positive' Class : 0
##
Conclusion: our model’s accuracy is not statistically significantly better than no model at all (the No Information Rate).
But there are only 300 training and 100 test observations; we need more observations and, probably, more predictors.
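For reference, the No Information Rate is just the share of the largest class in the test set; a minimal check:
max(prop.table(table(df.test$admit))) # about 0.69, matching the NIR reported above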
library(pROC)
# Compute AUC for predicting admit with continuous variables:
f1 <- roc(admit ~ gre + gpa, data = df.train)
plot(f1$gre, col = "red")f1$gre #This one is a bad fit (< 0.80)##
## Call:
## roc.formula(formula = admit ~ gre, data = df.train)
##
## Data: gre in 204 controls (admit 0) < 96 cases (admit 1).
## Area under the curve: 0.6255
plot(f1$gpa, col = "red")f1$gpa #This one is a bad fit, too (< 0.80)##
## Call:
## roc.formula(formula = admit ~ gpa, data = df.train)
##
## Data: gpa in 204 controls (admit 0) < 96 cases (admit 1).
## Area under the curve: 0.5933
See: https://cran.r-project.org/web/packages/margins/vignettes/Introduction.html
#install.packages("margins")
library(margins)
m <- margins(mylogit, type = "response")## Warning in warn_for_weights(model): 'weights' used in model estimation are
## currently ignored!
summary(m)
## factor AME SE z p lower upper
## gpa 0.1564 0.0630 2.4846 0.0130 0.0330 0.2798
## gre 0.0004 0.0002 2.1068 0.0351 0.0000 0.0009
## rank2 -0.1566 0.0736 -2.1272 0.0334 -0.3009 -0.0123
## rank3 -0.2872 0.0733 -3.9170 0.0001 -0.4309 -0.1435
## rank4 -0.3212 0.0802 -4.0052 0.0001 -0.4784 -0.1640
plot(m) # gre and gpa seem to be mixed up :( Give preference to the table values.
margins(mylogit, at = list(gre = fivenum(mydata$gre))) # fivenum are min, 25%, 50% (median), 75%, and maximum
## Warning in warn_for_weights(model): 'weights' used in model estimation are
## currently ignored!
## Average marginal effects at specified values
## glm(formula = admit ~ gre + gpa + rank, family = "binomial", data = mydata)
## at(gre) gre gpa rank2 rank3 rank4
## 220 0.0003054 0.1084 -0.1253 -0.2090 -0.2282
## 520 0.0004254 0.1510 -0.1571 -0.2810 -0.3120
## 580 0.0004465 0.1585 -0.1606 -0.2920 -0.3257
## 660 0.0004716 0.1675 -0.1632 -0.3039 -0.3411
## 800 0.0005043 0.1791 -0.1620 -0.3155 -0.3585
cplot(mylogit, "gre", what = "prediction", main = "Predicted admit, given gre")cplot(mylogit, "gre", what = "effect", main = "Average Marginal Effect of gre")## Warning in warn_for_weights(model): 'weights' used in model estimation are
## currently ignored!
margins(mylogit, at = list(gpa = fivenum(mydata$gpa))) # fivenum are min, 25%, 50% (median), 75%, and maximum
## Warning in warn_for_weights(model): 'weights' used in model estimation are
## currently ignored!
## Average marginal effects at specified values
## glm(formula = admit ~ gre + gpa + rank, family = "binomial", data = mydata)
## at(gpa) gre gpa rank2 rank3 rank4
## 2.260 0.0002921 0.1037 -0.1209 -0.2001 -0.2181
## 3.130 0.0004167 0.1479 -0.1561 -0.2768 -0.3068
## 3.395 0.0004501 0.1598 -0.1619 -0.2948 -0.3289
## 3.670 0.0004795 0.1703 -0.1645 -0.3087 -0.3470
## 4.000 0.0005052 0.1794 -0.1627 -0.3174 -0.3606
cplot(mylogit, "gpa", what = "prediction", main = "Predicted admit, given gpa")cplot(mylogit, "gpa", what = "effect", main = "Average Marginal Effect of gpa")## Warning in warn_for_weights(model): 'weights' used in model estimation are
## currently ignored!
See: https://data.princeton.edu/wws509/notes/c3s8
arm::binnedplot(mydata$gre, mydata$admit)
arm::binnedplot(mydata$gpa, mydata$admit)
car::residualPlots(mylogit)
## Test stat Pr(>|Test stat|)
## gre 0.1082 0.7422
## gpa 0.0835 0.7726
## rank
If the model is properly specified, no additional statistically significant predictors can be constructed from the existing ones.
If a continuous predictor’s test statistic here is significant, fit a new model that includes its quadratic term (see the sketch below).
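For example, a quadratic term for gre could be added and tested like this (a sketch; mylogit_sq is an illustrative name):
mylogit_sq <- update(mylogit, . ~ . + I(gre^2)) # add a squared gre term
anova(mylogit, mylogit_sq, test = "Chisq") # likelihood-ratio test of the quadratic term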
car::marginalModelPlots(mylogit)
## Warning in mmps(...): Interactions and/or factors skipped
Marginal model plots draw the response variable against each predictor and against the linear predictor; the fitted values are compared to the observed data.
library(car)
vif(mylogit)
## GVIF Df GVIF^(1/(2*Df))
## gre 1.134377 1 1.065071
## gpa 1.155902 1 1.075129
## rank 1.025759 3 1.004248
This looks OK: the GVIFs are close to 1, so multicollinearity is not a concern.
See: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4885900/
outlierTest(mylogit)
## No Studentized residuals with Bonferonni p < 0.05
## Largest |rstudent|:
## rstudent unadjusted p-value Bonferonni p
## 198 2.106376 0.035172 NA
influenceIndexPlot(mylogit, id.n = 3)
Obs. 198 has the largest residual, but it is not significant, so there is no problem here. We ask for the 3 “most outlying” observations to be labelled.
influencePlot(mylogit, col = "red", id.n. = 3)#obs.373 has the largest leverage; obs.198 is most influential.## StudRes Hat CookD
## 40 2.083822 0.01235387 0.015563562
## 156 2.057274 0.01487512 0.017574208
## 198 2.106376 0.01472214 0.019411921
## 356 1.199803 0.04133761 0.007569415
## 373 1.428058 0.04921401 0.015025521
model2 <- update(mylogit, subset = c(-373,-198))
compareCoefs(mylogit, model2) # no coefficients change significantly, none of them changes significance level
## Calls:
## 1: glm(formula = admit ~ gre + gpa + rank, family = "binomial", data =
## mydata)
## 2: glm(formula = admit ~ gre + gpa + rank, family = "binomial", data =
## mydata, subset = c(-373, -198))
##
## Model 1 Model 2
## (Intercept) -3.99 -4.33
## SE 1.14 1.17
##
## gre 0.00226 0.00232
## SE 0.00109 0.00111
##
## gpa 0.804 0.879
## SE 0.332 0.340
##
## rank2 -0.675 -0.626
## SE 0.316 0.319
##
## rank3 -1.340 -1.300
## SE 0.345 0.347
##
## rank4 -1.551 -1.598
## SE 0.418 0.429
##
The coefficients do not change sign and their magnitudes change only slightly, so there is no need to exclude any observations from mylogit.