Load all the libraries or functions that you will use for the rest of the assignment. It is helpful to define your libraries and functions at the top of a report, so that others know what they need for the report to compile correctly.
The data for this project has already been loaded. Here you will be distinguishing between the uses of let, allow, and permit. You should pick two of these verbs to examine and subset the dataset to only the rows containing those verbs. You can use the droplevels() function to help drop the empty level after you subset.
library(Rling)
data(let)
head(let)
## Year Reg Verb Neg Permitter Imper
## 1 2003 MAG allow No Inanim No
## 2 2005 SPOK allow No Inanim No
## 3 1990 SPOK let No Anim No
## 4 2007 SPOK allow No Inanim No
## 5 1997 MAG permit No Anim Yes
## 6 1996 MAG allow No Anim No
library(rms)
## Loading required package: Hmisc
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## Loading required package: ggplot2
##
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
##
## format.pval, units
## Loading required package: SparseM
##
## Attaching package: 'SparseM'
## The following object is masked from 'package:base':
##
## backsolve
library(visreg)
library(car)
## Loading required package: carData
##
## Attaching package: 'car'
## The following objects are masked from 'package:rms':
##
## Predict, vif
data(let)
let <- subset(let, Verb != "permit")
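As an optional alternative to calling droplevels() inside the later table() calls, you could drop the now-empty "permit" level from the whole data frame right after subsetting. A minimal sketch (let_sub is just an illustrative name, not part of the assignment):
# Drop empty factor levels from every column of the subset
let_sub <- droplevels(let)
levels(let_sub$Verb)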
The data is from COCA, the Corpus of Contemporary American English, and investigates the choice between the verbs let, allow, and permit. These are permissive constructions that are often paired with the word to. Predict the verb choice in the Verb column with the following independent variables: Reg (register: MAG or SPOK), Permitter (Anim, Inanim, or Undef), and Imper (Yes or No).
table(droplevels(let)$Verb)
##
## let allow
## 187 167
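Before fitting the model, it can help to glance at the distribution of each variable you plan to use. A minimal sketch with base R (output not shown):
# Quick summary of the outcome and the three predictors
summary(droplevels(let)[, c("Verb", "Reg", "Permitter", "Imper")])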
Create a logistic regression model predicting verb choice with the lrm() function from the rms package.
model = lrm(Verb ~ Reg + Permitter + Imper,
            data = let)
model
## Logistic Regression Model
##
## lrm(formula = Verb ~ Reg + Permitter + Imper, data = let)
##
## Model Likelihood Discrimination Rank Discrim.
## Ratio Test Indexes Indexes
## Obs 354 LR chi2 201.17 R2 0.579 C 0.876
## let 187 d.f. 4 g 2.417 Dxy 0.753
## allow 167 Pr(> chi2) <0.0001 gr 11.214 gamma 0.825
## max |deriv| 6e-10 gp 0.377 tau-a 0.376
## Brier 0.132
##
## Coef S.E. Wald Z Pr(>|Z|)
## Intercept -0.1935 0.2280 -0.85 0.3961
## Reg=SPOK 0.0530 0.3072 0.17 0.8629
## Permitter=Inanim 2.6069 0.4106 6.35 <0.0001
## Permitter=Undef 1.2888 0.6166 2.09 0.0366
## Imper=Yes -3.1051 0.5464 -5.68 <0.0001
##
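To make the lrm() coefficients easier to interpret, you could exponentiate them into odds ratios. A quick sketch (output not shown):
# Convert log-odds coefficients to odds ratios
exp(coef(model))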
Add the interaction Imper*Reg to the model, but remember you will need to fit a glm model. Use the anova() function to answer whether the addition of the interaction was significant.
Use the visreg library and function to visualize the interaction.
model1 = glm(Verb ~ Reg + Permitter + Imper,
             family = binomial,
             data = let)
model2 = glm(Verb ~ Permitter + Imper*Reg,
             family = binomial,
             data = let)
anova(model1, model2, test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: Verb ~ Reg + Permitter + Imper
## Model 2: Verb ~ Permitter + Imper * Reg
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 349 288.45
## 2 348 275.27 1 13.176 0.0002835 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
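As a complementary check to the likelihood ratio test above, you could also compare the two glm fits by AIC, where the lower value indicates the preferred model. A minimal sketch (output not shown):
# Compare the two models by AIC
AIC(model1, model2)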
summary(model2)
##
## Call:
## glm(formula = Verb ~ Permitter + Imper * Reg, family = binomial,
## data = let)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.19626 -0.57802 -0.00013 0.43343 1.93484
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.3432 0.2343 -1.465 0.1430
## PermitterInanim 2.6611 0.4142 6.425 1.32e-10 ***
## PermitterUndef 1.4209 0.6195 2.294 0.0218 *
## ImperYes -1.3615 0.5919 -2.300 0.0214 *
## RegSPOK 0.3667 0.3216 1.140 0.2543
## ImperYes:RegSPOK -17.2280 720.3052 -0.024 0.9809
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 489.62 on 353 degrees of freedom
## Residual deviance: 275.27 on 348 degrees of freedom
## AIC: 287.27
##
## Number of Fisher Scoring iterations: 17
visreg(model2, "Imper", by = "Reg")
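By default visreg() plots on the scale of the linear predictor; if you would rather see the panels on the predicted-probability scale, a hedged variant is:
# Same interaction plot, but on the probability scale
visreg(model2, "Imper", by = "Reg", scale = "response")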
table(droplevels(let)$Verb, let$Reg, let$Imper)
## , , = No
##
##
## MAG SPOK
## let 50 33
## allow 99 64
##
## , , = Yes
##
##
## MAG SPOK
## let 22 82
## allow 4 0
Use the car library and the influencePlot() function to create a picture of the outliers.
influencePlot(model2)
## StudRes Hat CookD
## 13 0.7803198 0.06328249 0.004091446
## 15 0.6635054 0.06591552 0.002970197
## 45 -2.2236205 0.01176981 0.020395281
## 67 1.9908803 0.03846154 0.038133333
## 111 1.9908803 0.03846154 0.038133333
## 132 -2.2236205 0.01176981 0.020395281
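If you want to inspect the observations flagged in the plot, influencePlot() returns the noteworthy points, so you can pull the matching rows out of the data by their row names. A sketch (infl is just an illustrative name; this will redraw the plot):
# Store the flagged points and look at the corresponding rows of the data
infl <- influencePlot(model2)
let[rownames(infl), ]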
Calculate the VIF values of the original model (not the interaction model) and determine if you meet the assumption of additivity (meaning no multicollinearity). All of the VIF values are close to 1, which shows that there is no multicollinearity, so the assumption is met.
rms::vif(model)
## Reg=SPOK Permitter=Inanim Permitter=Undef Imper=Yes
## 1.090869 1.044777 1.067404 1.055026
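A common rule of thumb is to flag VIF values above about 5; a minimal sketch of that check (all values here are close to 1, so nothing is flagged):
# TRUE would indicate a potential multicollinearity problem
any(rms::vif(model) > 5)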