Load the Libraries + Functions

Load all the libraries and functions that you will use for the rest of the assignment. It is helpful to define your libraries and functions at the top of a report, so that others know what they need for the report to compile correctly.

The data for this project has already been loaded. Here you will distinguish between the uses of let, allow, and permit. Pick two of these verbs to examine and subset the dataset to only the rows containing those verbs. After subsetting, the droplevels() function removes the now-empty factor level.

library(Rling)
data(let)
head(let)
##   Year  Reg   Verb Neg Permitter Imper
## 1 2003  MAG  allow  No    Inanim    No
## 2 2005 SPOK  allow  No    Inanim    No
## 3 1990 SPOK    let  No      Anim    No
## 4 2007 SPOK  allow  No    Inanim    No
## 5 1997  MAG permit  No      Anim   Yes
## 6 1996  MAG  allow  No      Anim    No
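
The subsetting step described above can be sketched as follows. This is only an illustration on a small made-up data frame: `toy` and `two_verbs` are hypothetical names, and `toy` stands in for the `let` data. It keeps the rows for two verbs and then calls droplevels() so the unused third level disappears.

```r
# Hypothetical stand-in for the `let` data: only the Verb column matters here.
toy <- data.frame(
  Verb = factor(c("let", "allow", "permit", "let", "allow", "permit"),
                levels = c("let", "allow", "permit")),
  Imper = factor(c("Yes", "No", "No", "Yes", "No", "No"))
)

# Keep only the rows for the two verbs of interest.
two_verbs <- subset(toy, Verb %in% c("let", "allow"))

# The factor still remembers "permit" until the empty level is dropped.
levels(two_verbs$Verb)

# droplevels() removes factor levels that no longer occur in the data.
two_verbs <- droplevels(two_verbs)
levels(two_verbs$Verb)
table(two_verbs$Verb)
```

The same pattern should carry over to the real data by replacing `toy` with `let` and choosing whichever two verbs you want to compare.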

Description of the Data

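The data come from the Corpus of Contemporary American English (COCA) and record the choice among the verbs let, allow, and permit. These are permissive constructions that are often paired with the word to. Predict the verb choice in the Verb column with the following independent variables:

- Reg: the register, MAG (magazine) or SPOK (spoken)
- Permitter: the type of permitter, Anim (animate), Inanim (inanimate), or Undef (undefined)
- Imper: whether the verb is in the imperative, Yes or No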

Sample Size Requirements

table(droplevels(let)$Verb)
## 
##    let  allow permit 
##    187    167    164
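
One common rule of thumb asks for roughly ten observations of the least frequent outcome per estimated coefficient. That cutoff is an assumption here, not something stated in the assignment. A minimal sketch using the counts printed above, with hypothetical variable names:

```r
# Verb counts copied from the table above.
verb_counts <- c(let = 187, allow = 167, permit = 164)

# Assumed rule of thumb: about 10 observations of the rarest outcome
# per estimated coefficient (intercepts not counted).
obs_per_coef <- 10
max_coefs <- floor(min(verb_counts) / obs_per_coef)

# Dummy-coded coefficients for Reg (1) + Permitter (2) + Imper (1).
n_coefs <- 4

max_coefs             # how many coefficients the rule would allow
n_coefs <= max_coefs  # TRUE if the planned model stays within the rule
```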

Running a Binary Logistic Regression

library(rms)
## Warning: package 'rms' was built under R version 3.6.3
## Loading required package: Hmisc
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## Loading required package: ggplot2
## 
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
## 
##     format.pval, units
## Loading required package: SparseM
## 
## Attaching package: 'SparseM'
## The following object is masked from 'package:base':
## 
##     backsolve
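# Note: Verb still has three levels here, so lrm() fits an ordinal model
# (see the y>= intercepts in the output); a binary model needs two verbs.
model = lrm(Verb ~ Reg + Permitter + Imper,
            data = let)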
model
## Logistic Regression Model
##  
##  lrm(formula = Verb ~ Reg + Permitter + Imper, data = let)
##  
##                        Model Likelihood     Discrimination    Rank Discrim.    
##                           Ratio Test           Indexes           Indexes       
##  Obs           518    LR chi2     248.15    R2       0.428    C       0.759    
##   let          187    d.f.             4    g        1.730    Dxy     0.518    
##   allow        167    Pr(> chi2) <0.0001    gr       5.643    gamma   0.596    
##   permit       164                          gp       0.302    tau-a   0.346    
##  max |deriv| 9e-08                          Brier    0.124                     
##  
##                   Coef    S.E.   Wald Z Pr(>|Z|)
##  y>=allow          0.7679 0.1768  4.34  <0.0001 
##  y>=permit        -1.1173 0.1834 -6.09  <0.0001 
##  Reg=SPOK          0.0076 0.1893  0.04  0.9681  
##  Permitter=Inanim  1.1607 0.2023  5.74  <0.0001 
##  Permitter=Undef   1.1276 0.3702  3.05  0.0023  
##  Imper=Yes        -3.4539 0.4223 -8.18  <0.0001 
## 
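
The coefficients above are on the log-odds scale, which is hard to read directly. A minimal sketch of converting them to odds ratios, using values copied from the output (the vector names are hypothetical). In this fit the odds are those of a verb further along the let < allow < permit ordering shown in the intercepts.

```r
# Coefficients copied from the lrm output above (log-odds scale).
coefs <- c("Reg=SPOK"         =  0.0076,
           "Permitter=Inanim" =  1.1607,
           "Permitter=Undef"  =  1.1276,
           "Imper=Yes"        = -3.4539)

# Exponentiating a log-odds coefficient gives an odds ratio:
# values above 1 raise the odds, values below 1 lower them.
odds_ratios <- exp(coefs)
round(odds_ratios, 3)
```

On a fitted model object, `exp(coef(model))` should give the same conversion without copying numbers by hand.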

Coefficients

- Reg: 
table(droplevels(let)$Verb, let$Reg)
##         
##          MAG SPOK
##   let     72  115
##   allow  103   64
##   permit 100   64
- Permitter: 
table(droplevels(let)$Verb, let$Permitter)
##         
##          Anim Inanim Undef
##   let     175      8     4
##   allow    64     91    12
##   permit   62     86    16
- Imper: 
table(droplevels(let)$Verb, let$Imper)
##         
##           No Yes
##   let     83 104
##   allow  163   4
##   permit 161   3
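
Raw counts are easier to compare across verbs once they are turned into within-verb proportions. A minimal sketch for the Imper table, with the counts copied from above (`imper_tab` and `row_props` are hypothetical names):

```r
# Counts copied from the Verb-by-Imper table above.
imper_tab <- matrix(c( 83, 104,
                      163,   4,
                      161,   3),
                    nrow = 3, byrow = TRUE,
                    dimnames = list(Verb  = c("let", "allow", "permit"),
                                    Imper = c("No", "Yes")))

# margin = 1 gives proportions within each verb (each row sums to 1).
row_props <- prop.table(imper_tab, margin = 1)
round(row_props, 3)
```

The same call should work directly on the table() output, for example `prop.table(table(let$Verb, let$Imper), margin = 1)`.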

Interactions

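# Note: with a factor response, family = binomial treats the first level (let)
# as failure and every other level as success.
model1 = glm(Verb ~ Reg + Permitter + Imper,
             family = binomial,
             data = let)
model2 = glm(Verb ~ Permitter + Imper*Reg,
             family = binomial,
             data = let)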
anova(model1, model2, test = "Chisq")
## Analysis of Deviance Table
## 
## Model 1: Verb ~ Reg + Permitter + Imper
## Model 2: Verb ~ Permitter + Imper * Reg
##   Resid. Df Resid. Dev Df Deviance  Pr(>Chi)    
## 1       513     393.38                          
## 2       512     379.40  1   13.973 0.0001855 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
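
The p-value in the table above comes from a likelihood-ratio test: the drop in residual deviance is compared against a chi-square distribution whose degrees of freedom equal the number of added coefficients. A minimal sketch reproducing it from the printed values (variable names are hypothetical):

```r
# Values copied from the analysis of deviance table above.
deviance_drop <- 13.973   # reduction in residual deviance from adding Imper:Reg
df_diff <- 513 - 512      # one extra coefficient in model 2

# Upper-tail chi-square probability = p-value of the likelihood-ratio test.
p_value <- pchisq(deviance_drop, df = df_diff, lower.tail = FALSE)
p_value
```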
summary(model2)
## 
## Call:
## glm(formula = Verb ~ Permitter + Imper * Reg, family = binomial, 
##     data = let)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.4667  -0.1557   0.3127   0.5391   2.9728  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        0.3500     0.1965   1.781 0.074895 .  
## PermitterInanim    2.6435     0.3956   6.683 2.35e-11 ***
## PermitterUndef     1.5051     0.5595   2.690 0.007140 ** 
## ImperYes          -1.6493     0.5007  -3.294 0.000989 ***
## RegSPOK            0.3559     0.2770   1.284 0.198985    
## ImperYes:RegSPOK  -3.4633     1.1406  -3.036 0.002395 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 677.54  on 517  degrees of freedom
## Residual deviance: 379.40  on 512  degrees of freedom
## AIC: 391.4
## 
## Number of Fisher Scoring iterations: 7
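
With an interaction in the model, the ImperYes coefficient on its own describes only the reference register (MAG); for SPOK the interaction term is added to it on the log-odds scale. A minimal sketch with the estimates copied from the summary above (variable names are hypothetical):

```r
# Estimates copied from the summary(model2) output above.
b_imper       <- -1.6493   # ImperYes (effect of imperative in MAG)
b_interaction <- -3.4633   # ImperYes:RegSPOK (extra effect in SPOK)

# Effect of an imperative within each register, on the log-odds scale.
imper_mag  <- b_imper
imper_spok <- b_imper + b_interaction

# Exponentiate to express each effect as an odds ratio
# for the outcome the model treats as "success".
round(exp(c(MAG = imper_mag, SPOK = imper_spok)), 4)
```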
library(visreg)
## Warning: package 'visreg' was built under R version 3.6.3
visreg(model2, "Imper", by = "Reg")

table(droplevels(let)$Verb, let$Reg, let$Imper)
## , ,  = No
## 
##         
##          MAG SPOK
##   let     50   33
##   allow   99   64
##   permit  98   63
## 
## , ,  = Yes
## 
##         
##          MAG SPOK
##   let     22   82
##   allow    4    0
##   permit   2    1
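
Very small or empty cells in a cross-tabulation like this can make interaction estimates unstable, which may be related to the larger standard error on ImperYes:RegSPOK. A minimal sketch for listing sparse cells, with the counts copied from the table above; the cutoff of 5 and the variable names are assumptions for illustration.

```r
# Counts copied from the three-way table above.
# array() fills Verb fastest, then Reg, then Imper.
counts <- array(
  c(50, 99, 98,  33, 64, 63,    # Imper = No:  MAG column, then SPOK column
    22,  4,  2,  82,  0,  1),   # Imper = Yes: MAG column, then SPOK column
  dim = c(3, 2, 2),
  dimnames = list(Verb  = c("let", "allow", "permit"),
                  Reg   = c("MAG", "SPOK"),
                  Imper = c("No", "Yes"))
)

# Long format: one row per Verb x Reg x Imper combination.
cells <- as.data.frame(as.table(counts))

# Assumed cutoff: flag cells with fewer than 5 observations.
small_cells <- cells[cells$Freq < 5, ]
small_cells
```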

Outliers

library(car)
## Loading required package: carData
## 
## Attaching package: 'car'
## The following objects are masked from 'package:rms':
## 
##     Predict, vif
influencePlot(model2)

##        StudRes         Hat       CookD
## 18  -0.7017301 0.035714286 0.001745854
## 36   3.1365078 0.012048193 0.168699187
## 45  -2.4917395 0.006173684 0.020789048
## 59   1.7935140 0.035714286 0.023472032
## 119 -2.0556891 0.033974915 0.038790368
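
The values above can be screened against rule-of-thumb cutoffs. A minimal sketch, assuming two commonly used conventions that are not taken from this assignment: absolute studentized residuals above 2 and Cook's distance above 4/n. The numbers are copied from the output, and the data frame and column names are hypothetical.

```r
# Influence measures copied from the influencePlot() output above.
infl <- data.frame(
  StudRes = c(-0.7017301, 3.1365078, -2.4917395, 1.7935140, -2.0556891),
  Hat     = c(0.035714286, 0.012048193, 0.006173684, 0.035714286, 0.033974915),
  CookD   = c(0.001745854, 0.168699187, 0.020789048, 0.023472032, 0.038790368),
  row.names = c("18", "36", "45", "59", "119")
)

n <- 518  # number of observations in the model

# Assumed cutoffs: |studentized residual| > 2 and Cook's distance > 4/n.
infl$big_resid <- abs(infl$StudRes) > 2
infl$big_cook  <- infl$CookD > 4 / n
infl
```

Different textbooks use different cutoffs, so the flags are a starting point for inspecting rows rather than a rule for deleting them.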

Assumptions

rms::vif(model)
##         Reg=SPOK Permitter=Inanim  Permitter=Undef        Imper=Yes 
##         1.049413         1.155355         1.114566         1.062978
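
Variance inflation factors are usually judged against a cutoff. A minimal sketch using the values printed above and an assumed cutoff of 5 (some sources use 10 instead); the variable names are hypothetical.

```r
# VIF values copied from the rms::vif(model) output above.
vifs <- c("Reg=SPOK"         = 1.049413,
          "Permitter=Inanim" = 1.155355,
          "Permitter=Undef"  = 1.114566,
          "Imper=Yes"        = 1.062978)

# Assumed rule of thumb: values above 5 suggest problematic multicollinearity.
vif_cutoff <- 5

any(vifs > vif_cutoff)   # TRUE would signal a potential problem
names(which.max(vifs))   # predictor with the largest VIF
```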

No Python in this section! You will use the functions from this week in a few assignments coming up!