Homework 9

library(foreign)
library(Zelig)

## ## 
## ##  Zelig (Version 3.5.3, built: 2011-11-29)
## ##  Please refer to http://gking.harvard.edu/zelig for full
## ##  documentation or help.zelig() for help with commands and
## ##  models supported by Zelig.
## ##
## 
## ##  Zelig project citations:
## ##    Kosuke Imai, Gary King, and Olivia Lau. (2009).
## ##    ``Zelig: Everyone's Statistical Software,''
## ##    http://gking.harvard.edu/zelig.
## ##  and
## ##    Kosuke Imai, Gary King, and Olivia Lau. (2008).
## ##    ``Toward A Common Framework for Statistical Analysis
## ##    and Development,'' Journal of Computational and
## ##    Graphical Statistics, Vol. 17, No. 4 (December)
## ##    pp. 892-913. 
## 
## ##  To cite individual Zelig models, please use the citation format printed with
## ##  each model run and in the documentation.
## ##

library(stargazer)
library(DescTools)
library(Zelig)
library(foreign)
library(knitr)
library(stargazer)
library(dplyr)
library(DescTools)
library(lme4)
library(npmlreg)
library(texreg)
library(magrittr)

## How to cite this model in Zelig:
## Kosuke Imai, Gary King, and Oliva Lau. 2007. "normal: Normal Regression for Continuous Dependent Variables" in Kosuke Imai, Gary King, and Olivia Lau, "Zelig: Everyone's Statistical Software," http://gking.harvard.edu/zelig

## 
## =============================================
##                       Dependent variable:    
##                   ---------------------------
##                              death           
## ---------------------------------------------
## sex1                   45.000*** (6.326)     
## Constant               14.269*** (4.473)     
## ---------------------------------------------
## Observations                  104            
## Log Likelihood             -508.819          
## Akaike Inf. Crit.          1,021.637         
## =============================================
## Note:             *p<0.1; **p<0.05; ***p<0.01

## 
## -------------------------------------------------------------------------
## 'data.frame':    13 obs. of  1 variable:
##  1 $ sexc: num  24.75 57.75 87.25 7.75 11.25 ...
## 
## ------------------------------------------------------------------------- 
## 1 - sexc (numeric)
## 
##   length      n    NAs unique     0s   mean meanSE
##       13     13      0    = n      0     45  7.677
## 
##      .05    .10    .25 median    .75    .90    .95
##    6.250  8.450 24.750 49.250 58.500 78.600 82.300
## 
##      rng     sd  vcoef    mad    IQR   skew   kurt
##   83.250 27.681  0.615 36.324 33.750 -0.056 -1.445
##  
## lowest : 4, 7.75, 11.25, 24.75, 35.25
## highest: 57.75, 58.5, 77, 79, 87.25
## 
## Shapiro-Wilks normality test  p.value : 0.55554

The data I am using is a study on Suicide Rates in the Republic of Ireland 1989-1998.

Region a factor with levels Cork , Dublin , EHB - Dub., Galway, Lim., Mid HB, MWHB-Lim., NEHB, NWHB, SEHB-Wat., SHB-Cork, Waterf., WHB-Gal.. sex a factor for gender with levels 0 (female) and 1 (male). death a numeric vector giving the total number of deaths. pop population of region population (Ahlbom, 1993). (death in suicides I assume)

m1_lme <- lmer(death ~ sex + (1|Region), data = irlsuicide)
summary(m1_lme)

## Linear mixed model fit by REML ['lmerMod']
## Formula: death ~ sex + (1 | Region)
##    Data: irlsuicide
## 
## REML criterion at convergence: 974.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.3654 -0.6604 -0.0931  0.5213  4.2800 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Region   (Intercept) 458.2    21.41   
##  Residual             609.0    24.68   
## Number of obs: 104, groups:  Region, 13
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   14.269      6.853   2.082
## sex1          45.000      4.840   9.298
## 
## Correlation of Fixed Effects:
##      (Intr)
## sex1 -0.353

m2_lme <- lmer(death ~ sex + (sex|Region), data = irlsuicide)
summary(m2_lme)

## Linear mixed model fit by REML ['lmerMod']
## Formula: death ~ sex + (sex | Region)
##    Data: irlsuicide
## 
## REML criterion at convergence: 943.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.0032 -0.2889 -0.0697  0.2623  4.0805 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  Region   (Intercept)  82.27    9.07        
##           sex1        693.73   26.34    1.00
##  Residual             411.59   20.29        
## Number of obs: 104, groups:  Region, 13
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   14.269      3.774   3.781
## sex1          45.000      8.318   5.410
## 
## Correlation of Fixed Effects:
##      (Intr)
## sex1 0.333

stargazer(cpooling, m1_lme, m2_lme, type = "text")

## 
## =================================================
##                          Dependent variable:     
##                     -----------------------------
##                                 death            
##                      normal         linear       
##                                  mixed-effects   
##                        (1)       (2)       (3)   
## -------------------------------------------------
## sex1                45.000*** 45.000*** 45.000***
##                      (6.326)   (4.840)   (8.318) 
##                                                  
## Constant            14.269*** 14.269**  14.269***
##                      (4.473)   (6.853)   (3.774) 
##                                                  
## -------------------------------------------------
## Observations           104       104       104   
## Log Likelihood      -508.819  -487.381  -471.532 
## Akaike Inf. Crit.   1,021.637  982.762   955.063 
## Bayesian Inf. Crit.            993.340   970.930 
## =================================================
## Note:                 *p<0.1; **p<0.05; ***p<0.01

In the above model we see that all three models are statistically significant at a 99% confidence level. To see which model to use we look at the Akaike, model 1 has the highest rate followed by 2 then 3.

The above shows that there is a statistical correlation between sex and death by suicide, specifically sex1 (male). With a p value of .01 or lower meaning that it is at a 99% confidence level.

The table below shows that there is a significant between death, sex and region.

screenreg(list(m1_lme, m2_lme))

## 
## ======================================================
##                               Model 1      Model 2    
## ------------------------------------------------------
## (Intercept)                     14.27 *      14.27 ***
##                                 (6.85)       (3.77)   
## sex1                            45.00 ***    45.00 ***
##                                 (4.84)       (8.32)   
## ------------------------------------------------------
## AIC                            982.76       955.06    
## BIC                            993.34       970.93    
## Log Likelihood                -487.38      -471.53    
## Num. obs.                      104          104       
## Num. groups: Region             13           13       
## Variance: Region.(Intercept)   458.23        82.27    
## Variance: Residual             609.04       411.59    
## Variance: Region.sex1                       693.73    
## ======================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

m3_lme <- lmer(death ~ sex + (sex|Region) + pop, data = irlsuicide)
m4_lme <- lmer(death ~ sex*pop + (sex|Region), data = irlsuicide)
screenreg(list(m3_lme, m4_lme))

## 
## ======================================================
##                               Model 1      Model 2    
## ------------------------------------------------------
## (Intercept)                     -1.51         9.67 ***
##                                 (2.82)       (2.84)   
## sex1                            45.14 ***    20.69 ***
##                                 (7.60)       (4.77)   
## pop                              0.00 ***     0.00 ** 
##                                 (0.00)       (0.00)   
## sex1:pop                                      0.00 ***
##                                              (0.00)   
## ------------------------------------------------------
## AIC                            922.69       883.69    
## BIC                            941.20       904.85    
## Log Likelihood                -454.34      -433.85    
## Num. obs.                      104          104       
## Num. groups: Region             13           13       
## Variance: Region.(Intercept)     0.00        29.06    
## Variance: Region.sex1          623.41       144.48    
## Variance: Residual             256.70       147.01    
## ======================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

irlsuicide %<>% mutate(death = expected - mean(pop))
m4a_lme <- lmer(death ~ sex*pop + (sex|Region), data = irlsuicide)
screenreg(list(m4_lme, m4a_lme))

## 
## ========================================================
##                               Model 1      Model 2      
## --------------------------------------------------------
## (Intercept)                      9.67 ***  -34500.58 ***
##                                 (2.84)         (3.06)   
## sex1                            20.69 ***      13.36 ***
##                                 (4.77)         (3.66)   
## pop                              0.00 **        0.00    
##                                 (0.00)         (0.00)   
## sex1:pop                         0.00 ***       0.00 ***
##                                 (0.00)         (0.00)   
## --------------------------------------------------------
## AIC                            883.69         835.81    
## BIC                            904.85         856.97    
## Log Likelihood                -433.85        -409.91    
## Num. obs.                      104            104       
## Num. groups: Region             13             13       
## Variance: Region.(Intercept)    29.06          76.42    
## Variance: Region.sex1          144.48          90.87    
## Variance: Residual             147.01          83.57    
## ========================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

m0_lme <- lmer(death ~ 1 + (1|Region), data = irlsuicide)
summary(m0_lme)

## Linear mixed model fit by REML ['lmerMod']
## Formula: death ~ 1 + (1 | Region)
##    Data: irlsuicide
## 
## REML criterion at convergence: 1032
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.8822 -0.5660 -0.2454  0.2597  4.8004 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Region   (Intercept)  506.4   22.50   
##  Residual             1045.4   32.33   
## Number of obs: 104, groups:  Region, 13
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   -34479          7   -4925

The pooling models above show that sex1 (male) have a p value of less than .001 meaning that it is statistically significant, in both model 1 and model 2. In model 1 under pop (population) the p value is also less than .001 meaning that it is also statistically significant and in model 2 it has a P value less than .01 meaning it is also statistically significant. In model 2 sex and population together has a p value of less than .001. Population density and being male have very high statistical correlation with suicide. The partial pooling models show the same type of correlations between sex, population density and suicide.

Homework 9

Robert Johnson

Sunday, April 26, 2015