Missing data homework

1. Use a data source you are familiar with

GSS 2021

2. Measure an outcome variable and at least 5 predictors.

Education is the outcome variable; it is measured categorically and is also treated as continuous for modeling purposes. The key predictors are Hispanic ethnicity, being born in the US, living in poverty (household income at age 16), being perceived as not smart, and being treated with disrespect. The two control variables are sex and the respondent's age.



3. Report the pattern of missingness among all of these variables

Across all of the variables, the two with the most missing values, and with almost identical amounts of missingness, are being perceived as not smart and being treated with disrespect. Each has about 1,430 missing values, roughly 35% of the 4,032 cases. Variables with this much missingness would normally be dropped, but they are kept here because the high number of missing values makes them a useful test of these imputation techniques.
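The per-variable counts behind this summary can be tabulated in a few lines. The sketch below is a minimal base-R illustration on a small made-up data frame; the helper name `missing_summary` and the `toy` data are hypothetical and are not part of the homework code.

```r
# Hypothetical helper: count and percentage of missing values per column,
# sorted so the variables with the most missingness come first.
missing_summary <- function(df) {
  n_miss <- colSums(is.na(df))
  out <- data.frame(
    variable    = names(n_miss),
    n_missing   = as.integer(n_miss),
    pct_missing = round(100 * as.numeric(n_miss) / nrow(df), 1),
    stringsAsFactors = FALSE
  )
  out[order(-out$n_missing), ]
}

# Toy data standing in for the recoded GSS variables (values are made up).
toy <- data.frame(
  notsmart = c(1, NA, 0, NA, 1, NA),
  disrspct = c(1, NA, NA, 0, 1, NA),
  born     = c(1, 1, 0, 1, NA, 1)
)

ms <- missing_summary(toy)
print(ms)
```

On the homework data the same helper could be applied to the recoded columns, for example `missing_summary(gss2021_ZERODraft[, c("subgroupnotsmart", "subgroupdisrspct")])`.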


4. Perform a mean (a mean for numeric data) or a modal imputation (for categorical data) of all values. Perform the analysis using this imputed data. What are your results?

A modal imputation was conducted for all key predictor variables as well as the outcome variable, education. The proportion tables and bar charts show little change for the variables with few missing values. For the two variables with heavy missingness the modal category grew noticeably: being disrespected rose from about .79 to .86, and being perceived as not smart rose from about .67 to .78. Overall, modal imputation left the distributions of most variables close to those in the original data.
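As an illustration of what modal imputation does to a categorical variable, the following is a minimal sketch on made-up data. The helper `impute_mode` is a hypothetical name, and it fills missing values by level label rather than by integer code, which is a slightly different route from the `ifelse()` approach used in the homework code.

```r
# Hypothetical helper: replace NA in a factor with its most common level.
impute_mode <- function(x) {
  stopifnot(is.factor(x))
  mode_level <- names(which.max(table(x)))  # table() ignores NA by default
  x[is.na(x)] <- mode_level                 # assign by label, levels unchanged
  x
}

# Toy factor with two missing values (values are made up).
toy <- factor(c("0", "1", "1", NA, "1", NA), levels = c("0", "1"))
toy_imp <- impute_mode(toy)

prop.table(table(toy))      # shares among observed cases only
prop.table(table(toy_imp))  # shares after filling NA with the mode
```

Because every missing case is assigned to the same category, the share of the modal category can only grow, and it grows more the larger the fraction of missing values.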


5. Perform a multiple imputation of all values. Perform the analysis using this imputed data set. What are your results?
A multiple imputation was then performed on those same variables. 

The pooled results, averaged over the 10 imputed data sets, show the following:

those who are non-Hispanic tend to have higher educational attainment than Hispanics,
those born outside of the US tend to have higher educational attainment than those born in the US,
those with secure economic resources at age 16 had higher educational attainment than those without,
those who were perceived as not smart had lower educational attainment than those who were not,
those who have been disrespected tended to have higher educational attainment than those who have never been disrespected.

For the control variables, men tended to have a college education, and age is positively associated with education, with the highest rates of education found between ages 24 and 39.

Turning to the lambda values for the pooled results, which indicate how stable the estimates are across imputations, nearly all variables are extremely stable, with values well under 40. The two exceptions are the variables Not Smart and Disrespect. Their lambdas are just over the 40 threshold, so some instability is present, but it is not extreme.
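For readers unfamiliar with lambda, it is the share of an estimate's total variance that comes from between-imputation variation under Rubin's rules. The sketch below computes it by hand in base R for a single coefficient. The helper `pool_rubin` and the estimates and standard errors are made up for illustration and are not results from this analysis; it is assumed here that the lambda reported by `mice` corresponds to this same quantity.

```r
# Hypothetical helper: pool one coefficient across m imputed-data fits
# using Rubin's rules and return the pooled estimate, its SE, and lambda.
pool_rubin <- function(est, se) {
  m      <- length(est)
  qbar   <- mean(est)              # pooled point estimate
  ubar   <- mean(se^2)             # within-imputation variance
  b      <- var(est)               # between-imputation variance
  total  <- ubar + (1 + 1 / m) * b # total variance
  lambda <- (1 + 1 / m) * b / total
  c(estimate = qbar, se = sqrt(total), lambda = lambda)
}

# Made-up estimates and standard errors from m = 5 imputed data sets.
est <- c(0.50, 0.55, 0.45, 0.52, 0.48)
se  <- c(0.10, 0.11, 0.10, 0.09, 0.10)

res <- pool_rubin(est, se)
print(round(res, 3))
```

A lambda near 0 means the imputations agree closely on the coefficient, while a value approaching 1 means most of the uncertainty comes from the missing data.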

Finally, to compare the multiply imputed model with the original model, a linear regression was fit to each. Using the predictors and outcome variable described above, a comparison of the R-squared values shows that the imputed model's R-squared rose by about .01, from .04 to .05.

Although the change in R-squared is slight, the statistical significance of every predictor that was significant in the original data was strengthened in the imputed model. Because of this gain in statistical significance among the predictors, the imputed regression model is judged to be a better fit than the regression model using the original data.

Were the results similar between the mean/modal and multiply imputed data sets?  How do the results compare to the results from the model fit with the data source with missing values?

I would say that the results were similar, yet multiple imputation is the better technique for observing true relationships. Comparing the original data and models with the imputed versions under both modal and multiple imputation, each variable shows a slight difference, but these differences did not have a statistically meaningful effect on the overall models; no real change was found. As noted above, the predictors in the multiply imputed model have stronger results, so for examining relationships between variables, multiple imputation is better supported than the other techniques.

That said, the overall model may not fit much better than the original model. In conclusion, multiple imputation does help increase the significance of the relationships between the predictors and the outcome variable. To improve the R-squared or overall goodness of fit, however, checking the predictors for large amounts of missingness and replacing them with similar predictors that have fewer missing values is suggested over multiple imputation. Multiple imputation is better used for analyzing those variables more accurately within the models.
library(VGAM)
## Loading required package: stats4
## Loading required package: splines
library(svyVGAM)
## Loading required package: survey
## Loading required package: grid
## Loading required package: Matrix
## Loading required package: survival
## 
## Attaching package: 'survey'
## The following object is masked from 'package:VGAM':
## 
##     calibrate
## The following object is masked from 'package:graphics':
## 
##     dotchart
library(mice)
## 
## Attaching package: 'mice'
## The following object is masked from 'package:stats':
## 
##     filter
## The following objects are masked from 'package:base':
## 
##     cbind, rbind
library(gtsummary)
library(pander)
library(factoextra)
## Warning: package 'factoextra' was built under R version 4.1.3
## Loading required package: ggplot2
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
library(haven)
library(janitor)
## 
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(scales)
library(sur)
library(plyr)
## ------------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## ------------------------------------------------------------------------------
## 
## Attaching package: 'plyr'
## The following objects are masked from 'package:dplyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
## The following object is masked from 'package:gtsummary':
## 
##     mutate
library(summarytools)
library(Rmisc)
## Loading required package: lattice
library(car)
## Loading required package: carData
## 
## Attaching package: 'carData'
## The following objects are masked from 'package:sur':
## 
##     Anscombe, States
## 
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
## 
##     recode
## The following object is masked from 'package:VGAM':
## 
##     logit
library(forcats)
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v tibble  3.1.6     v purrr   0.3.4
## v tidyr   1.1.4     v stringr 1.4.0
## v readr   2.1.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x plyr::arrange()     masks dplyr::arrange()
## x readr::col_factor() masks scales::col_factor()
## x purrr::compact()    masks plyr::compact()
## x plyr::count()       masks dplyr::count()
## x purrr::discard()    masks scales::discard()
## x tidyr::expand()     masks Matrix::expand()
## x plyr::failwith()    masks dplyr::failwith()
## x tidyr::fill()       masks VGAM::fill()
## x dplyr::filter()     masks mice::filter(), stats::filter()
## x plyr::id()          masks dplyr::id()
## x dplyr::lag()        masks stats::lag()
## x plyr::mutate()      masks dplyr::mutate(), gtsummary::mutate()
## x tidyr::pack()       masks Matrix::pack()
## x car::recode()       masks dplyr::recode()
## x plyr::rename()      masks dplyr::rename()
## x purrr::some()       masks car::some()
## x plyr::summarise()   masks dplyr::summarise()
## x plyr::summarize()   masks dplyr::summarize()
## x tidyr::unpack()     masks Matrix::unpack()
## x tibble::view()      masks summarytools::view()
library(survey)
library(stargazer)
## 
## Please cite as:
##  Hlavac, Marek (2018). stargazer: Well-Formatted Regression and Summary Statistics Tables.
##  R package version 5.2.2. https://CRAN.R-project.org/package=stargazer
library(grid)
library(Matrix)
library(caret)
## 
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
## 
##     lift
## The following object is masked from 'package:survival':
## 
##     cluster
## The following object is masked from 'package:VGAM':
## 
##     predictors
gss2021_ZERODraft<-read_dta("C:\\Users\\BTP\\Desktop\\STATS 2 FOLDER\\2021_sas\\gss2021.dta")

recode hispanic

gss2021_ZERODraft$subgrouphis <-Recode(gss2021_ZERODraft$hispanic, recodes="1 = 0; 2:50 = 1; else=NA",as.factor=T)
gss2021_ZERODraft %>%
  
tabyl(subgrouphis)
##  subgrouphis    n    percent valid_percent
##            0 3544 0.87896825     0.8864432
##            1  454 0.11259921     0.1135568
##         <NA>   34 0.00843254            NA
subgrouphis_1<-as.factor(ifelse(gss2021_ZERODraft$subgrouphis==1, "Hispanic", "Non Hispanic"))

recode education outcome variable for Multinomial model into four levels: less than high school (0-11 years), high school (12), college (13-16), and more than college (17-20)

gss2021_ZERODraft$AllEducLevels <-Recode(gss2021_ZERODraft$educ, recodes="0:11 = 1; 12 = 2; 13:16 = 3; 17:20 = 4; else=NA", as.factor=T)

gss2021_ZERODraft$AllEducLevels<-relevel(gss2021_ZERODraft$AllEducLevels, ref = "1")
gss2021_ZERODraft %>%
  tabyl(AllEducLevels)
##  AllEducLevels    n    percent valid_percent
##              1  230 0.05704365    0.05799294
##              2  829 0.20560516    0.20902673
##              3 1969 0.48834325    0.49646999
##              4  938 0.23263889    0.23651034
##           <NA>   66 0.01636905            NA

sex

gss2021_ZERODraft$subgroupsex <-Recode(gss2021_ZERODraft$sex, recodes="1:1 = 0; 2:2 = 1; else=NA")
gss2021_ZERODraft %>%
  
tabyl(subgroupsex)
##  subgroupsex    n    percent valid_percent
##            0 1736 0.43055556     0.4406091
##            1 2204 0.54662698     0.5593909
##           NA   92 0.02281746            NA
subgroupsex_1<-as.factor(ifelse(gss2021_ZERODraft$subgroupsex==1, "Women", "Men"))

income of household of respondent when age 16

gss2021_ZERODraft$subgroupincom16 <-Recode(gss2021_ZERODraft$incom16, recodes="1:2 = 0; 3:5 = 1; else=NA",as.factor=T)
gss2021_ZERODraft %>%
  
tabyl(subgroupincom16)
##  subgroupincom16    n    percent valid_percent
##                0 1434 0.35565476      0.374804
##                1 2392 0.59325397      0.625196
##             <NA>  206 0.05109127            NA
subgroupincom16_1<-as.factor(ifelse(gss2021_ZERODraft$subgroupincom16==1, "secure economic resources at 16", "insecure economic resources"))

if respondent was born in the US

gss2021_ZERODraft$subgroupBorn <-Recode(gss2021_ZERODraft$born, recodes="1:1 = 1; 2:2 = 0; else=NA",as.factor=T)
gss2021_ZERODraft %>%
  
tabyl(subgroupBorn)
##  subgroupBorn    n    percent valid_percent
##             0  444 0.11011905     0.1121212
##             1 3516 0.87202381     0.8878788
##          <NA>   72 0.01785714            NA
subgroupborn_1<-as.factor(ifelse(gss2021_ZERODraft$subgroupBorn==1, "Born in US", "Not born in US"))

if respondent has ever been disrespected

gss2021_ZERODraft$subgroupdisrspct <-Recode(gss2021_ZERODraft$disrspct, recodes="1:5 = 1; 6:6 = 0; else=NA",as.factor=T)
gss2021_ZERODraft %>%
  
tabyl(subgroupdisrspct)
##  subgroupdisrspct    n   percent valid_percent
##                 0  554 0.1374008      0.212995
##                 1 2047 0.5076885      0.787005
##              <NA> 1431 0.3549107            NA
subgroupdisrspct_1<-as.factor(ifelse(gss2021_ZERODraft$subgroupdisrspct==1, "respondent has been disrespected", "Not being disrespected"))

if respondent has ever been called or treated as if they were not smart

gss2021_ZERODraft$subgroupnotsmart <-Recode(gss2021_ZERODraft$notsmart, recodes="1:5 = 1; 6:6 = 0; else=NA",as.factor=T)
gss2021_ZERODraft %>%
  
tabyl(subgroupnotsmart)
##  subgroupnotsmart    n   percent valid_percent
##                 0  867 0.2150298     0.3332052
##                 1 1735 0.4303075     0.6667948
##              <NA> 1430 0.3546627            NA
subgroupnotsmart_1<-as.factor(ifelse(gss2021_ZERODraft$subgroupnotsmart==1, "respondent was told or treated as if they are not smart", "Never experienced that sort of treatment of not being smart"))

age cut into intervals

age1<-cut(gss2021_ZERODraft$age,
          breaks = c(0,24,39,59,79,99))
summary(gss2021_ZERODraft[, c("AllEducLevels","subgrouphis","subgroupBorn","subgroupsex","subgroupincom16","subgroupdisrspct","subgroupnotsmart","age")])
##  AllEducLevels subgrouphis subgroupBorn  subgroupsex     subgroupincom16
##  1   : 230     0   :3544   0   : 444    Min.   :0.0000   0   :1434      
##  2   : 829     1   : 454   1   :3516    1st Qu.:0.0000   1   :2392      
##  3   :1969     NA's:  34   NA's:  72    Median :1.0000   NA's: 206      
##  4   : 938                              Mean   :0.5594                  
##  NA's:  66                              3rd Qu.:1.0000                  
##                                         Max.   :1.0000                  
##                                         NA's   :92                      
##  subgroupdisrspct subgroupnotsmart      age       
##  0   : 554        0   : 867        Min.   :18.00  
##  1   :2047        1   :1735        1st Qu.:37.00  
##  NA's:1431        NA's:1430        Median :53.00  
##                                    Mean   :52.16  
##                                    3rd Qu.:66.00  
##                                    Max.   :89.00  
##                                    NA's   :333
## outcome variable 
table(gss2021_ZERODraft$AllEducLevels)
## 
##    1    2    3    4 
##  230  829 1969  938
## predictors
table(gss2021_ZERODraft$subgrouphis)
## 
##    0    1 
## 3544  454
table(gss2021_ZERODraft$subgroupnotsmart)
## 
##    0    1 
##  867 1735
table(gss2021_ZERODraft$subgroupBorn)
## 
##    0    1 
##  444 3516
table(gss2021_ZERODraft$subgroupincom16)
## 
##    0    1 
## 1434 2392
table(gss2021_ZERODraft$subgroupdisrspct)
## 
##    0    1 
##  554 2047
table(gss2021_ZERODraft$subgroupsex)
## 
##    0    1 
## 1736 2204
#find the most common value

mcv.AllEducLevels<-factor(names(which.max(table(gss2021_ZERODraft$AllEducLevels))), levels=levels(gss2021_ZERODraft$AllEducLevels))

mcv.notsmart<-factor(names(which.max(table(gss2021_ZERODraft$subgroupnotsmart))), levels=levels(gss2021_ZERODraft$subgroupnotsmart))

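# Note the spelling: subgrouphis. A misspelled name (e.g. subggrouphis) returns NULL,
# so mcv.his becomes an empty factor (factor(0)) and his.imp is left unimputed.
mcv.his<-factor(names(which.max(table(gss2021_ZERODraft$subgrouphis))), levels=levels(gss2021_ZERODraft$subgrouphis))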
mcv.disrespect<-factor(names(which.max(table(gss2021_ZERODraft$subgroupdisrspct))), levels=levels(gss2021_ZERODraft$subgroupdisrspct))

mcv.born<-factor(names(which.max(table(gss2021_ZERODraft$subgroupBorn))), levels=levels(gss2021_ZERODraft$subgroupBorn))

mcv.income16<-factor(names(which.max(table(gss2021_ZERODraft$subgroupincom16))), levels=levels(gss2021_ZERODraft$subgroupincom16))

mcv.his
## factor(0)
## Levels: 0 1
mcv.disrespect
## [1] 1
## Levels: 0 1
mcv.notsmart
## [1] 1
## Levels: 0 1
mcv.income16
## [1] 1
## Levels: 0 1
mcv.born
## [1] 1
## Levels: 0 1
mcv.AllEducLevels
## [1] 3
## Levels: 1 2 3 4
#impute the cases

gss2021_ZERODraft$AllEducLevels.imp<-as.factor(ifelse(is.na(gss2021_ZERODraft$AllEducLevels)==T, mcv.AllEducLevels, gss2021_ZERODraft$AllEducLevels))
levels(gss2021_ZERODraft$AllEducLevels.imp)<-levels(gss2021_ZERODraft$AllEducLevels)

gss2021_ZERODraft$his.imp<-as.factor(ifelse(is.na(gss2021_ZERODraft$subgrouphis)==T, mcv.his, gss2021_ZERODraft$subgrouphis))
levels(gss2021_ZERODraft$his.imp)<-levels(gss2021_ZERODraft$subgrouphis)

gss2021_ZERODraft$notsmart.imp<-as.factor(ifelse(is.na(gss2021_ZERODraft$subgroupnotsmart)==T, mcv.notsmart, gss2021_ZERODraft$subgroupnotsmart))
levels(gss2021_ZERODraft$notsmart.imp)<-levels(gss2021_ZERODraft$subgroupnotsmart)

gss2021_ZERODraft$disrespect.imp<-as.factor(ifelse(is.na(gss2021_ZERODraft$subgroupdisrspct)==T, mcv.disrespect, gss2021_ZERODraft$subgroupdisrspct))
levels(gss2021_ZERODraft$disrespect.imp)<-levels(gss2021_ZERODraft$subgroupdisrspct)

gss2021_ZERODraft$born.imp<-as.factor(ifelse(is.na(gss2021_ZERODraft$subgroupBorn)==T, mcv.born, gss2021_ZERODraft$subgroupBorn))
levels(gss2021_ZERODraft$born.imp)<-levels(gss2021_ZERODraft$subgroupBorn)

gss2021_ZERODraft$income16.imp<-as.factor(ifelse(is.na(gss2021_ZERODraft$subgroupincom16)==T, mcv.income16, gss2021_ZERODraft$subgroupincom16))
levels(gss2021_ZERODraft$income16.imp)<-levels(gss2021_ZERODraft$subgroupincom16)



prop.table(table(gss2021_ZERODraft$subgroupdisrspct))
## 
##        0        1 
## 0.212995 0.787005
prop.table(table(gss2021_ZERODraft$disrespect.imp))
## 
##         0         1 
## 0.1374008 0.8625992
prop.table(table(gss2021_ZERODraft$subgroupnotsmart))
## 
##         0         1 
## 0.3332052 0.6667948
prop.table(table(gss2021_ZERODraft$notsmart.imp))
## 
##         0         1 
## 0.2150298 0.7849702
prop.table(table(gss2021_ZERODraft$subgroupincom16))
## 
##        0        1 
## 0.374804 0.625196
prop.table(table(gss2021_ZERODraft$income16.imp))
## 
##         0         1 
## 0.3556548 0.6443452
prop.table(table(gss2021_ZERODraft$subgroupBorn))
## 
##         0         1 
## 0.1121212 0.8878788
prop.table(table(gss2021_ZERODraft$born.imp))
## 
##        0        1 
## 0.110119 0.889881
prop.table(table(gss2021_ZERODraft$subgrouphis))
## 
##         0         1 
## 0.8864432 0.1135568
prop.table(table(gss2021_ZERODraft$his.imp))
## 
##         0         1 
## 0.8864432 0.1135568
prop.table(table(gss2021_ZERODraft$AllEducLevels))
## 
##          1          2          3          4 
## 0.05799294 0.20902673 0.49646999 0.23651034
prop.table(table(gss2021_ZERODraft$AllEducLevels.imp))
## 
##          1          2          3          4 
## 0.05704365 0.20560516 0.50471230 0.23263889
barplot(prop.table(table(gss2021_ZERODraft$AllEducLevels)), main="Original Data All Education Levels", ylim=c(0, .6))

barplot(prop.table(table(gss2021_ZERODraft$AllEducLevels.imp)), main="Imputed Data Education",ylim=c(0, .6))

barplot(prop.table(table(gss2021_ZERODraft$subgrouphis)), main="Original Data Hispanics", ylim=c(0, .6))

barplot(prop.table(table(gss2021_ZERODraft$his.imp)), main="Imputed Data Hispanics",ylim=c(0, .6))

barplot(prop.table(table(gss2021_ZERODraft$subgroupBorn)), main="Original Data Born in US", ylim=c(0, .6))

barplot(prop.table(table(gss2021_ZERODraft$born.imp)), main="Imputed Data Born in US",ylim=c(0, .6))

barplot(prop.table(table(gss2021_ZERODraft$subgroupincom16)), main="Original Data income at 16", ylim=c(0, .6))

barplot(prop.table(table(gss2021_ZERODraft$income16.imp)), main="Imputed Data income at 16",ylim=c(0, .6))

barplot(prop.table(table(gss2021_ZERODraft$subgroupdisrspct)), main="Original Data being disrespected", ylim=c(0, .6))

barplot(prop.table(table(gss2021_ZERODraft$disrespect.imp)), main="Imputed Data being disrespected",ylim=c(0, .6))

barplot(prop.table(table(gss2021_ZERODraft$subgroupnotsmart)), main="Original Data Not Smart", ylim=c(0, .6))

barplot(prop.table(table(gss2021_ZERODraft$notsmart.imp)), main="Imputed Data Not Smart",ylim=c(0, .6))

#look at the patterns of missingness
md.pattern(gss2021_ZERODraft[,c("AllEducLevels", "subgrouphis", "subgroupBorn","subgroupincom16","subgroupdisrspct", "subgroupnotsmart", "subgroupsex", "age")])

##      subgrouphis AllEducLevels subgroupBorn subgroupsex subgroupincom16 age
## 2351           1             1            1           1               1   1
## 7              1             1            1           1               1   1
## 9              1             1            1           1               1   1
## 1184           1             1            1           1               1   1
## 124            1             1            1           1               1   0
## 2              1             1            1           1               1   0
## 1              1             1            1           1               1   0
## 88             1             1            1           1               1   0
## 57             1             1            1           1               0   1
## 33             1             1            1           1               0   1
## 9              1             1            1           1               0   0
## 2              1             1            1           1               0   0
## 10             1             1            1           1               0   0
## 3              1             1            1           0               1   1
## 3              1             1            1           0               1   1
## 3              1             1            1           0               1   0
## 1              1             1            1           0               1   0
## 2              1             1            1           0               0   1
## 1              1             1            1           0               0   1
## 1              1             1            1           0               0   0
## 24             1             1            1           0               0   0
## 6              1             1            0           1               1   1
## 3              1             1            0           1               1   1
## 1              1             1            0           1               1   0
## 1              1             1            0           1               0   1
## 2              1             1            0           1               0   1
## 1              1             1            0           1               0   0
## 1              1             1            0           0               1   0
## 5              1             1            0           0               0   0
## 4              1             0            1           1               1   1
## 2              1             0            1           1               1   1
## 2              1             0            1           1               1   0
## 2              1             0            1           1               1   0
## 2              1             0            1           1               0   1
## 2              1             0            1           1               0   1
## 1              1             0            1           1               0   0
## 1              1             0            0           1               1   1
## 1              1             0            0           1               0   1
## 1              1             0            0           1               0   0
## 1              1             0            0           0               1   1
## 2              1             0            0           0               0   1
## 42             1             0            0           0               0   0
## 13             0             1            1           1               1   1
## 5              0             1            1           1               1   1
## 2              0             1            1           1               1   0
## 5              0             1            1           1               1   0
## 1              0             1            1           1               0   1
## 1              0             1            1           1               0   1
## 1              0             1            1           1               0   0
## 1              0             1            0           1               1   1
## 1              0             1            0           1               1   0
## 1              0             1            0           0               0   0
## 1              0             0            1           1               0   1
## 1              0             0            1           0               0   0
## 1              0             0            0           0               0   0
##               34            66           72          92             206 333
##      subgroupnotsmart subgroupdisrspct     
## 2351                1                1    0
## 7                   1                0    1
## 9                   0                1    1
## 1184                0                0    2
## 124                 1                1    1
## 2                   1                0    2
## 1                   0                1    2
## 88                  0                0    3
## 57                  1                1    1
## 33                  0                0    3
## 9                   1                1    2
## 2                   1                0    3
## 10                  0                0    4
## 3                   1                1    1
## 3                   0                0    3
## 3                   1                1    2
## 1                   0                0    4
## 2                   1                1    2
## 1                   0                0    4
## 1                   1                1    3
## 24                  0                0    5
## 6                   1                1    1
## 3                   0                0    3
## 1                   0                1    3
## 1                   1                1    2
## 2                   0                0    4
## 1                   0                0    5
## 1                   0                0    5
## 5                   0                0    6
## 4                   1                1    1
## 2                   0                0    3
## 2                   1                1    2
## 2                   0                0    4
## 2                   1                1    2
## 2                   0                0    4
## 1                   0                0    5
## 1                   1                1    2
## 1                   0                0    5
## 1                   1                1    4
## 1                   1                1    3
## 2                   1                1    4
## 42                  0                0    7
## 13                  1                1    1
## 5                   0                0    3
## 2                   1                1    2
## 5                   0                0    4
## 1                   1                1    2
## 1                   0                0    4
## 1                   1                1    3
## 1                   1                1    2
## 1                   0                0    5
## 1                   1                1    5
## 1                   1                1    3
## 1                   1                0    6
## 1                   0                0    8
##                  1430             1431 3664
dat2<-gss2021_ZERODraft
samp2<-sample(1:dim(dat2)[1], replace = F, size = 500)
dat2$Eduknock<-dat2$AllEducLevels
dat2$Eduknock[samp2]<-NA
imp<-mice(data = dat2[,c("AllEducLevels", "subgrouphis", "subgroupBorn","subgroupincom16","subgroupdisrspct", "subgroupnotsmart", "educ", "subgroupsex", "age")], seed = 58, m = 10)
## 
##  iter imp variable
##   1   1  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   1   2  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   1   3  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   1   4  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   1   5  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   1   6  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   1   7  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   1   8  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   1   9  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   1   10  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   2   1  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   2   2  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   2   3  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   2   4  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   2   5  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   2   6  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   2   7  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   2   8  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   2   9  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   2   10  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   3   1  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   3   2  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   3   3  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   3   4  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   3   5  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   3   6  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   3   7  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   3   8  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   3   9  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   3   10  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   4   1  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   4   2  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   4   3  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   4   4  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   4   5  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   4   6  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   4   7  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   4   8  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   4   9  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   4   10  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   5   1  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   5   2  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   5   3  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   5   4  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   5   5  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   5   6  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   5   7  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   5   8  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   5   9  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
##   5   10  AllEducLevels  subgrouphis  subgroupBorn  subgroupincom16  subgroupdisrspct  subgroupnotsmart  educ  subgroupsex  age
print(imp)
## Class: mids
## Number of multiple imputations:  10 
## Imputation methods:
##    AllEducLevels      subgrouphis     subgroupBorn  subgroupincom16 
##        "polyreg"         "logreg"         "logreg"         "logreg" 
## subgroupdisrspct subgroupnotsmart             educ      subgroupsex 
##         "logreg"         "logreg"            "pmm"            "pmm" 
##              age 
##            "pmm" 
## PredictorMatrix:
##                  AllEducLevels subgrouphis subgroupBorn subgroupincom16
## AllEducLevels                0           1            1               1
## subgrouphis                  1           0            1               1
## subgroupBorn                 1           1            0               1
## subgroupincom16              1           1            1               0
## subgroupdisrspct             1           1            1               1
## subgroupnotsmart             1           1            1               1
##                  subgroupdisrspct subgroupnotsmart educ subgroupsex age
## AllEducLevels                   1                1    1           1   1
## subgrouphis                     1                1    1           1   1
## subgroupBorn                    1                1    1           1   1
## subgroupincom16                 1                1    1           1   1
## subgroupdisrspct                0                1    1           1   1
## subgroupnotsmart                1                0    1           1   1
plot(imp)

library(lattice)

## looking at education and Hispanics
stripplot(imp,educ~subgrouphis|.imp, pch=20)

## looking at education and those perceived as not smart
stripplot(imp,educ~subgroupnotsmart|.imp, pch=20)

dat.imp<-complete(imp, action = 1)
head(dat.imp, n=10)
##    AllEducLevels subgrouphis subgroupBorn subgroupincom16 subgroupdisrspct
## 1              2           0            1               0                1
## 2              3           0            1               1                1
## 3              2           1            1               0                1
## 4              3           0            1               1                1
## 5              3           0            0               1                1
## 6              4           0            1               1                0
## 7              2           1            0               1                0
## 8              4           0            1               1                1
## 9              3           0            1               0                1
## 10             3           0            0               1                1
##    subgroupnotsmart educ subgroupsex age
## 1                 0   12           1  65
## 2                 1   16           0  60
## 3                 1   12           0  29
## 4                 1   13           0  68
## 5                 0   14           1  43
## 6                 1   17           1  33
## 7                 0   12           0  20
## 8                 1   19           1  55
## 9                 1   13           0  76
## 10                1   16           1  61
# Now, I will look at the variability across the 10 imputations for the outcome
fit.edu <- with(
  data = imp,
  expr = lm(educ ~ subgrouphis + subgroupBorn + subgroupincom16 +
              subgroupnotsmart + subgroupdisrspct + subgroupsex + age1)
)
fit.edu
## call :
## with.mids(data = imp, expr = lm(educ ~ subgrouphis + subgroupBorn + 
##     subgroupincom16 + subgroupnotsmart + subgroupdisrspct + subgroupsex + 
##     age1))
## 
## call1 :
## mice(data = dat2[, c("AllEducLevels", "subgrouphis", "subgroupBorn", 
##     "subgroupincom16", "subgroupdisrspct", "subgroupnotsmart", 
##     "educ", "subgroupsex", "age")], m = 10, seed = 58)
## 
## nmis :
##    AllEducLevels      subgrouphis     subgroupBorn  subgroupincom16 
##               66               34               72              206 
## subgroupdisrspct subgroupnotsmart             educ      subgroupsex 
##             1431             1430               66               92 
##              age 
##              333 
## 
## analyses :
## [[1]]
## 
## Call:
## lm(formula = educ ~ subgrouphis + subgroupBorn + subgroupincom16 + 
##     subgroupnotsmart + subgroupdisrspct + subgroupsex + age1)
## 
## Coefficients:
##       (Intercept)       subgrouphis1      subgroupBorn1   subgroupincom161  
##           13.4206            -1.0588            -0.3297             0.7736  
## subgroupnotsmart1  subgroupdisrspct1        subgroupsex        age1(24,39]  
##           -0.1405             0.7433            -0.3713             1.1654  
##       age1(39,59]        age1(59,79]        age1(79,99]  
##            1.0410             0.9987             0.9131  
## 
## 
## [[2]]
## 
## Call:
## lm(formula = educ ~ subgrouphis + subgroupBorn + subgroupincom16 + 
##     subgroupnotsmart + subgroupdisrspct + subgroupsex + age1)
## 
## Coefficients:
##       (Intercept)       subgrouphis1      subgroupBorn1   subgroupincom161  
##          13.53968           -1.05000           -0.32333            0.74328  
## subgroupnotsmart1  subgroupdisrspct1        subgroupsex        age1(24,39]  
##          -0.07001            0.54376           -0.38512            1.17760  
##       age1(39,59]        age1(59,79]        age1(79,99]  
##           1.05963            0.99356            0.88763  
## 
## 
## [[3]]
## 
## Call:
## lm(formula = educ ~ subgrouphis + subgroupBorn + subgroupincom16 + 
##     subgroupnotsmart + subgroupdisrspct + subgroupsex + age1)
## 
## Coefficients:
##       (Intercept)       subgrouphis1      subgroupBorn1   subgroupincom161  
##          13.62579           -1.11141           -0.37773            0.75678  
## subgroupnotsmart1  subgroupdisrspct1        subgroupsex        age1(24,39]  
##          -0.06546            0.49962           -0.36878            1.14707  
##       age1(39,59]        age1(59,79]        age1(79,99]  
##           1.04352            0.97944            0.89857  
## 
## 
## [[4]]
## 
## Call:
## lm(formula = educ ~ subgrouphis + subgroupBorn + subgroupincom16 + 
##     subgroupnotsmart + subgroupdisrspct + subgroupsex + age1)
## 
## Coefficients:
##       (Intercept)       subgrouphis1      subgroupBorn1   subgroupincom161  
##        13.4611252         -0.9687635         -0.3345354          0.7770422  
## subgroupnotsmart1  subgroupdisrspct1        subgroupsex        age1(24,39]  
##        -0.0005685          0.5578796         -0.3966716          1.1732223  
##       age1(39,59]        age1(59,79]        age1(79,99]  
##         1.0629113          1.0083785          0.9222487  
## 
## 
## [[5]]
## 
## Call:
## lm(formula = educ ~ subgrouphis + subgroupBorn + subgroupincom16 + 
##     subgroupnotsmart + subgroupdisrspct + subgroupsex + age1)
## 
## Coefficients:
##       (Intercept)       subgrouphis1      subgroupBorn1   subgroupincom161  
##        13.5345362         -1.0767138         -0.3939369          0.8084543  
## subgroupnotsmart1  subgroupdisrspct1        subgroupsex        age1(24,39]  
##         0.0009698          0.5666276         -0.3932121          1.1377787  
##       age1(39,59]        age1(59,79]        age1(79,99]  
##         1.0311374          0.9876120          0.8780563  
## 
## 
## [[6]]
## 
## Call:
## lm(formula = educ ~ subgrouphis + subgroupBorn + subgroupincom16 + 
##     subgroupnotsmart + subgroupdisrspct + subgroupsex + age1)
## 
## Coefficients:
##       (Intercept)       subgrouphis1      subgroupBorn1   subgroupincom161  
##          13.64632           -1.09455           -0.37126            0.78098  
## subgroupnotsmart1  subgroupdisrspct1        subgroupsex        age1(24,39]  
##           0.09033            0.33343           -0.39926            1.14452  
##       age1(39,59]        age1(59,79]        age1(79,99]  
##           1.03552            0.99118            0.89918  
## 
## 
## [[7]]
## 
## Call:
## lm(formula = educ ~ subgrouphis + subgroupBorn + subgroupincom16 + 
##     subgroupnotsmart + subgroupdisrspct + subgroupsex + age1)
## 
## Coefficients:
##       (Intercept)       subgrouphis1      subgroupBorn1   subgroupincom161  
##           13.7271            -1.0583            -0.3145             0.7710  
## subgroupnotsmart1  subgroupdisrspct1        subgroupsex        age1(24,39]  
##           -0.2032             0.4627            -0.3898             1.1304  
##       age1(39,59]        age1(59,79]        age1(79,99]  
##            1.0032             0.9330             0.8112  
## 
## 
## [[8]]
## 
## Call:
## lm(formula = educ ~ subgrouphis + subgroupBorn + subgroupincom16 + 
##     subgroupnotsmart + subgroupdisrspct + subgroupsex + age1)
## 
## Coefficients:
##       (Intercept)       subgrouphis1      subgroupBorn1   subgroupincom161  
##          13.52409           -1.01183           -0.38627            0.78161  
## subgroupnotsmart1  subgroupdisrspct1        subgroupsex        age1(24,39]  
##          -0.02582            0.57428           -0.38888            1.17556  
##       age1(39,59]        age1(59,79]        age1(79,99]  
##           1.06517            0.99844            0.90446  
## 
## 
## [[9]]
## 
## Call:
## lm(formula = educ ~ subgrouphis + subgroupBorn + subgroupincom16 + 
##     subgroupnotsmart + subgroupdisrspct + subgroupsex + age1)
## 
## Coefficients:
##       (Intercept)       subgrouphis1      subgroupBorn1   subgroupincom161  
##           13.6504            -1.0359            -0.3119             0.7163  
## subgroupnotsmart1  subgroupdisrspct1        subgroupsex        age1(24,39]  
##           -0.2109             0.5777            -0.3812             1.1462  
##       age1(39,59]        age1(59,79]        age1(79,99]  
##            1.0308             0.9535             0.8238  
## 
## 
## [[10]]
## 
## Call:
## lm(formula = educ ~ subgrouphis + subgroupBorn + subgroupincom16 + 
##     subgroupnotsmart + subgroupdisrspct + subgroupsex + age1)
## 
## Coefficients:
##       (Intercept)       subgrouphis1      subgroupBorn1   subgroupincom161  
##          13.58699           -1.08145           -0.34305            0.78365  
## subgroupnotsmart1  subgroupdisrspct1        subgroupsex        age1(24,39]  
##           0.01975            0.42379           -0.38313            1.14731  
##       age1(39,59]        age1(59,79]        age1(79,99]  
##           1.03658            0.98029            0.89633
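The ten sets of coefficients above differ only because the imputed values differ. As a rough check on what pooling does with them, the sketch below takes the ten `subgrouphis1` estimates copied from the output above and computes their mean (the pooled point estimate) and their variance (the between-imputation variance, labelled `b` in mice output). The object names `his_coef`, `pooled_est`, and `between_var` are introduced here for illustration only.

```r
# subgrouphis1 coefficients copied from the ten fitted models printed above
his_coef <- c(-1.0588, -1.05000, -1.11141, -0.9687635, -1.0767138,
              -1.09455, -1.0583, -1.01183, -1.0359, -1.08145)

m <- length(his_coef)          # number of imputations
pooled_est <- mean(his_coef)   # pooled point estimate: average over imputations
between_var <- var(his_coef)   # between-imputation variance of the estimate

print(c(m = m, estimate = pooled_est, b = between_var))
```

Up to rounding in the printed coefficients, these two numbers should be close to the `estimate` and `b` columns that `pool()` reports for `subgrouphis1`.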
with(data=imp, expr=(sd(educ)))
## call :
## with.mids(data = imp, expr = (sd(educ)))
## 
## call1 :
## mice(data = dat2[, c("AllEducLevels", "subgrouphis", "subgroupBorn", 
##     "subgroupincom16", "subgroupdisrspct", "subgroupnotsmart", 
##     "educ", "subgroupsex", "age")], m = 10, seed = 58)
## 
## nmis :
##    AllEducLevels      subgrouphis     subgroupBorn  subgroupincom16 
##               66               34               72              206 
## subgroupdisrspct subgroupnotsmart             educ      subgroupsex 
##             1431             1430               66               92 
##              age 
##              333 
## 
## analyses :
## [[1]]
## [1] 2.803917
## 
## [[2]]
## [1] 2.796903
## 
## [[3]]
## [1] 2.798131
## 
## [[4]]
## [1] 2.807146
## 
## [[5]]
## [1] 2.801157
## 
## [[6]]
## [1] 2.799297
## 
## [[7]]
## [1] 2.79675
## 
## [[8]]
## [1] 2.80444
## 
## [[9]]
## [1] 2.799908
## 
## [[10]]
## [1] 2.796787
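The `nmis` counts printed above can be turned into percentages to see how much of each variable was imputed. This is a minimal sketch: the counts are copied from the output, while the total sample size `n_total` is an assumption, inferred from the smallest cell proportion in the education tables in this output (0.0002480159 is approximately 1/4032).

```r
# Missing-value counts per variable, copied from the nmis output above
nmis <- c(AllEducLevels = 66, subgrouphis = 34, subgroupBorn = 72,
          subgroupincom16 = 206, subgroupdisrspct = 1431,
          subgroupnotsmart = 1430, educ = 66, subgroupsex = 92, age = 333)

# ASSUMPTION: total number of rows, inferred from 1 / 0.0002480159
n_total <- 4032

# Percentage of each variable that is missing, largest first
pct_missing <- round(100 * nmis / n_total, 1)
print(sort(pct_missing, decreasing = TRUE))
```

Under that assumption, the two perception items are each missing for roughly a third of respondents, far more than any other variable.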
with(data=imp, expr=(prop.table(table(educ))))
## call :
## with.mids(data = imp, expr = (prop.table(table(educ))))
## 
## call1 :
## mice(data = dat2[, c("AllEducLevels", "subgrouphis", "subgroupBorn", 
##     "subgroupincom16", "subgroupdisrspct", "subgroupnotsmart", 
##     "educ", "subgroupsex", "age")], m = 10, seed = 58)
## 
## nmis :
##    AllEducLevels      subgrouphis     subgroupBorn  subgroupincom16 
##               66               34               72              206 
## subgroupdisrspct subgroupnotsmart             educ      subgroupsex 
##             1431             1430               66               92 
##              age 
##              333 
## 
## analyses :
## [[1]]
## educ
##            0            1            2            3            4            5 
## 0.0022321429 0.0002480159 0.0004960317 0.0007440476 0.0002480159 0.0004960317 
##            6            7            8            9           10           11 
## 0.0039682540 0.0012400794 0.0064484127 0.0081845238 0.0128968254 0.0210813492 
##           12           13           14           15           16           17 
## 0.2093253968 0.0709325397 0.1356646825 0.0525793651 0.2363591270 0.0654761905 
##           18           19           20 
## 0.0877976190 0.0287698413 0.0548115079 
## 
## [[2]]
## educ
##            0            1            2            3            4            5 
## 0.0022321429 0.0002480159 0.0004960317 0.0007440476 0.0002480159 0.0004960317 
##            6            7            8            9           10           11 
## 0.0037202381 0.0012400794 0.0062003968 0.0079365079 0.0131448413 0.0205853175 
##           12           13           14           15           16           17 
## 0.2085813492 0.0706845238 0.1371527778 0.0523313492 0.2378472222 0.0644841270 
##           18           19           20 
## 0.0877976190 0.0285218254 0.0553075397 
## 
## [[3]]
## educ
##            0            1            2            3            4            5 
## 0.0022321429 0.0002480159 0.0004960317 0.0007440476 0.0002480159 0.0004960317 
##            6            7            8            9           10           11 
## 0.0037202381 0.0012400794 0.0064484127 0.0081845238 0.0131448413 0.0205853175 
##           12           13           14           15           16           17 
## 0.2095734127 0.0701884921 0.1369047619 0.0520833333 0.2366071429 0.0654761905 
##           18           19           20 
## 0.0885416667 0.0287698413 0.0540674603 
## 
## [[4]]
## educ
##            0            1            2            3            4            5 
## 0.0022321429 0.0002480159 0.0004960317 0.0009920635 0.0002480159 0.0004960317 
##            6            7            8            9           10           11 
## 0.0037202381 0.0014880952 0.0062003968 0.0081845238 0.0131448413 0.0208333333 
##           12           13           14           15           16           17 
## 0.2093253968 0.0694444444 0.1359126984 0.0523313492 0.2375992063 0.0652281746 
##           18           19           20 
## 0.0885416667 0.0287698413 0.0545634921 
## 
## [[5]]
## educ
##            0            1            2            3            4            5 
## 0.0022321429 0.0002480159 0.0004960317 0.0007440476 0.0002480159 0.0004960317 
##            6            7            8            9           10           11 
## 0.0039682540 0.0012400794 0.0062003968 0.0079365079 0.0131448413 0.0208333333 
##           12           13           14           15           16           17 
## 0.2090773810 0.0694444444 0.1366567460 0.0523313492 0.2375992063 0.0644841270 
##           18           19           20 
## 0.0892857143 0.0285218254 0.0548115079 
## 
## [[6]]
## educ
##            0            1            2            3            4            5 
## 0.0022321429 0.0002480159 0.0004960317 0.0007440476 0.0002480159 0.0004960317 
##            6            7            8            9           10           11 
## 0.0037202381 0.0012400794 0.0062003968 0.0081845238 0.0128968254 0.0210813492 
##           12           13           14           15           16           17 
## 0.2088293651 0.0711805556 0.1366567460 0.0523313492 0.2353670635 0.0654761905 
##           18           19           20 
## 0.0890376984 0.0285218254 0.0548115079 
## 
## [[7]]
## educ
##            0            1            2            3            4            5 
## 0.0022321429 0.0002480159 0.0004960317 0.0007440476 0.0002480159 0.0004960317 
##            6            7            8            9           10           11 
## 0.0037202381 0.0012400794 0.0064484127 0.0079365079 0.0133928571 0.0205853175 
##           12           13           14           15           16           17 
## 0.2083333333 0.0699404762 0.1376488095 0.0523313492 0.2375992063 0.0652281746 
##           18           19           20 
## 0.0882936508 0.0280257937 0.0548115079 
## 
## [[8]]
## educ
##            0            1            2            3            4            5 
## 0.0024801587 0.0002480159 0.0004960317 0.0007440476 0.0002480159 0.0004960317 
##            6            7            8            9           10           11 
## 0.0037202381 0.0012400794 0.0062003968 0.0079365079 0.0128968254 0.0210813492 
##           12           13           14           15           16           17 
## 0.2083333333 0.0696924603 0.1371527778 0.0518353175 0.2378472222 0.0657242063 
##           18           19           20 
## 0.0887896825 0.0282738095 0.0545634921 
## 
## [[9]]
## educ
##            0            1            2            3            4            5 
## 0.0022321429 0.0002480159 0.0004960317 0.0007440476 0.0002480159 0.0004960317 
##            6            7            8            9           10           11 
## 0.0039682540 0.0012400794 0.0064484127 0.0079365079 0.0131448413 0.0208333333 
##           12           13           14           15           16           17 
## 0.2090773810 0.0699404762 0.1361607143 0.0515873016 0.2378472222 0.0659722222 
##           18           19           20 
## 0.0890376984 0.0282738095 0.0540674603 
## 
## [[10]]
## educ
##            0            1            2            3            4            5 
## 0.0022321429 0.0002480159 0.0004960317 0.0007440476 0.0002480159 0.0004960317 
##            6            7            8            9           10           11 
## 0.0037202381 0.0012400794 0.0064484127 0.0079365079 0.0131448413 0.0208333333 
##           12           13           14           15           16           17 
## 0.2073412698 0.0706845238 0.1366567460 0.0525793651 0.2366071429 0.0667162698 
##           18           19           20 
## 0.0885416667 0.0287698413 0.0543154762
with(data=imp, expr=(prop.table(table(subgrouphis))))
## call :
## with.mids(data = imp, expr = (prop.table(table(subgrouphis))))
## 
## call1 :
## mice(data = dat2[, c("AllEducLevels", "subgrouphis", "subgroupBorn", 
##     "subgroupincom16", "subgroupdisrspct", "subgroupnotsmart", 
##     "educ", "subgroupsex", "age")], m = 10, seed = 58)
## 
## nmis :
##    AllEducLevels      subgrouphis     subgroupBorn  subgroupincom16 
##               66               34               72              206 
## subgroupdisrspct subgroupnotsmart             educ      subgroupsex 
##             1431             1430               66               92 
##              age 
##              333 
## 
## analyses :
## [[1]]
## subgrouphis
##         0         1 
## 0.8856647 0.1143353 
## 
## [[2]]
## subgrouphis
##         0         1 
## 0.8856647 0.1143353 
## 
## [[3]]
## subgrouphis
##         0         1 
## 0.8864087 0.1135913 
## 
## [[4]]
## subgrouphis
##         0         1 
## 0.8861607 0.1138393 
## 
## [[5]]
## subgrouphis
##         0         1 
## 0.8851687 0.1148313 
## 
## [[6]]
## subgrouphis
##         0         1 
## 0.8861607 0.1138393 
## 
## [[7]]
## subgrouphis
##         0         1 
## 0.8861607 0.1138393 
## 
## [[8]]
## subgrouphis
##         0         1 
## 0.8856647 0.1143353 
## 
## [[9]]
## subgrouphis
##         0         1 
## 0.8861607 0.1138393 
## 
## [[10]]
## subgrouphis
##         0         1 
## 0.8854167 0.1145833
with(data=imp, expr=(prop.table(table(subgroupBorn))))
## call :
## with.mids(data = imp, expr = (prop.table(table(subgroupBorn))))
## 
## call1 :
## mice(data = dat2[, c("AllEducLevels", "subgrouphis", "subgroupBorn", 
##     "subgroupincom16", "subgroupdisrspct", "subgroupnotsmart", 
##     "educ", "subgroupsex", "age")], m = 10, seed = 58)
## 
## nmis :
##    AllEducLevels      subgrouphis     subgroupBorn  subgroupincom16 
##               66               34               72              206 
## subgroupdisrspct subgroupnotsmart             educ      subgroupsex 
##             1431             1430               66               92 
##              age 
##              333 
## 
## analyses :
## [[1]]
## subgroupBorn
##         0         1 
## 0.1135913 0.8864087 
## 
## [[2]]
## subgroupBorn
##         0         1 
## 0.1116071 0.8883929 
## 
## [[3]]
## subgroupBorn
##         0         1 
## 0.1133433 0.8866567 
## 
## [[4]]
## subgroupBorn
##         0         1 
## 0.1128472 0.8871528 
## 
## [[5]]
## subgroupBorn
##         0         1 
## 0.1121032 0.8878968 
## 
## [[6]]
## subgroupBorn
##         0         1 
## 0.1121032 0.8878968 
## 
## [[7]]
## subgroupBorn
##         0         1 
## 0.1128472 0.8871528 
## 
## [[8]]
## subgroupBorn
##         0         1 
## 0.1121032 0.8878968 
## 
## [[9]]
## subgroupBorn
##         0         1 
## 0.1140873 0.8859127 
## 
## [[10]]
## subgroupBorn
##         0         1 
## 0.1138393 0.8861607
with(data=imp, expr=(prop.table(table(subgroupnotsmart))))
## call :
## with.mids(data = imp, expr = (prop.table(table(subgroupnotsmart))))
## 
## call1 :
## mice(data = dat2[, c("AllEducLevels", "subgrouphis", "subgroupBorn", 
##     "subgroupincom16", "subgroupdisrspct", "subgroupnotsmart", 
##     "educ", "subgroupsex", "age")], m = 10, seed = 58)
## 
## nmis :
##    AllEducLevels      subgrouphis     subgroupBorn  subgroupincom16 
##               66               34               72              206 
## subgroupdisrspct subgroupnotsmart             educ      subgroupsex 
##             1431             1430               66               92 
##              age 
##              333 
## 
## analyses :
## [[1]]
## subgroupnotsmart
##         0         1 
## 0.3402778 0.6597222 
## 
## [[2]]
## subgroupnotsmart
##         0         1 
## 0.3377976 0.6622024 
## 
## [[3]]
## subgroupnotsmart
##         0         1 
## 0.3377976 0.6622024 
## 
## [[4]]
## subgroupnotsmart
##         0         1 
## 0.3301091 0.6698909 
## 
## [[5]]
## subgroupnotsmart
##         0         1 
## 0.3412698 0.6587302 
## 
## [[6]]
## subgroupnotsmart
##         0         1 
## 0.3353175 0.6646825 
## 
## [[7]]
## subgroupnotsmart
##         0         1 
## 0.3375496 0.6624504 
## 
## [[8]]
## subgroupnotsmart
##         0         1 
## 0.3382937 0.6617063 
## 
## [[9]]
## subgroupnotsmart
##         0         1 
## 0.3355655 0.6644345 
## 
## [[10]]
## subgroupnotsmart
##         0         1 
## 0.3380456 0.6619544
with(data=imp, expr=(prop.table(table(subgroupdisrspct))))
## call :
## with.mids(data = imp, expr = (prop.table(table(subgroupdisrspct))))
## 
## call1 :
## mice(data = dat2[, c("AllEducLevels", "subgrouphis", "subgroupBorn", 
##     "subgroupincom16", "subgroupdisrspct", "subgroupnotsmart", 
##     "educ", "subgroupsex", "age")], m = 10, seed = 58)
## 
## nmis :
##    AllEducLevels      subgrouphis     subgroupBorn  subgroupincom16 
##               66               34               72              206 
## subgroupdisrspct subgroupnotsmart             educ      subgroupsex 
##             1431             1430               66               92 
##              age 
##              333 
## 
## analyses :
## [[1]]
## subgroupdisrspct
##        0        1 
## 0.218998 0.781002 
## 
## [[2]]
## subgroupdisrspct
##         0         1 
## 0.2127976 0.7872024 
## 
## [[3]]
## subgroupdisrspct
##         0         1 
## 0.2157738 0.7842262 
## 
## [[4]]
## subgroupdisrspct
##         0         1 
## 0.2162698 0.7837302 
## 
## [[5]]
## subgroupdisrspct
##         0         1 
## 0.2256944 0.7743056 
## 
## [[6]]
## subgroupdisrspct
##         0         1 
## 0.2137897 0.7862103 
## 
## [[7]]
## subgroupdisrspct
##         0         1 
## 0.2127976 0.7872024 
## 
## [[8]]
## subgroupdisrspct
##         0         1 
## 0.2212302 0.7787698 
## 
## [[9]]
## subgroupdisrspct
##         0         1 
## 0.2234623 0.7765377 
## 
## [[10]]
## subgroupdisrspct
##         0         1 
## 0.2145337 0.7854663
with(data=imp, expr=(prop.table(table(subgroupsex))))
## call :
## with.mids(data = imp, expr = (prop.table(table(subgroupsex))))
## 
## call1 :
## mice(data = dat2[, c("AllEducLevels", "subgrouphis", "subgroupBorn", 
##     "subgroupincom16", "subgroupdisrspct", "subgroupnotsmart", 
##     "educ", "subgroupsex", "age")], m = 10, seed = 58)
## 
## nmis :
##    AllEducLevels      subgrouphis     subgroupBorn  subgroupincom16 
##               66               34               72              206 
## subgroupdisrspct subgroupnotsmart             educ      subgroupsex 
##             1431             1430               66               92 
##              age 
##              333 
## 
## analyses :
## [[1]]
## subgroupsex
##         0         1 
## 0.4397321 0.5602679 
## 
## [[2]]
## subgroupsex
##         0         1 
## 0.4409722 0.5590278 
## 
## [[3]]
## subgroupsex
##         0         1 
## 0.4419643 0.5580357 
## 
## [[4]]
## subgroupsex
##         0         1 
## 0.4392361 0.5607639 
## 
## [[5]]
## subgroupsex
##         0         1 
## 0.4427083 0.5572917 
## 
## [[6]]
## subgroupsex
##         0         1 
## 0.4404762 0.5595238 
## 
## [[7]]
## subgroupsex
##        0        1 
## 0.437004 0.562996 
## 
## [[8]]
## subgroupsex
##         0         1 
## 0.4392361 0.5607639 
## 
## [[9]]
## subgroupsex
##         0         1 
## 0.4404762 0.5595238 
## 
## [[10]]
## subgroupsex
##         0         1 
## 0.4409722 0.5590278
with(data=imp, expr=(prop.table(table(age1))))
## call :
## with.mids(data = imp, expr = (prop.table(table(age1))))
## 
## call1 :
## mice(data = dat2[, c("AllEducLevels", "subgrouphis", "subgroupBorn", 
##     "subgroupincom16", "subgroupdisrspct", "subgroupnotsmart", 
##     "educ", "subgroupsex", "age")], m = 10, seed = 58)
## 
## nmis :
##    AllEducLevels      subgrouphis     subgroupBorn  subgroupincom16 
##               66               34               72              206 
## subgroupdisrspct subgroupnotsmart             educ      subgroupsex 
##             1431             1430               66               92 
##              age 
##              333 
## 
## analyses :
## [[1]]
## age1
##     (0,24]    (24,39]    (39,59]    (59,79]    (79,99] 
## 0.04298459 0.24195729 0.33495539 0.33062990 0.04947283 
## 
## [[2]]
## age1
##     (0,24]    (24,39]    (39,59]    (59,79]    (79,99] 
## 0.04298459 0.24195729 0.33495539 0.33062990 0.04947283 
## 
## [[3]]
## age1
##     (0,24]    (24,39]    (39,59]    (59,79]    (79,99] 
## 0.04298459 0.24195729 0.33495539 0.33062990 0.04947283 
## 
## [[4]]
## age1
##     (0,24]    (24,39]    (39,59]    (59,79]    (79,99] 
## 0.04298459 0.24195729 0.33495539 0.33062990 0.04947283 
## 
## [[5]]
## age1
##     (0,24]    (24,39]    (39,59]    (59,79]    (79,99] 
## 0.04298459 0.24195729 0.33495539 0.33062990 0.04947283 
## 
## [[6]]
## age1
##     (0,24]    (24,39]    (39,59]    (59,79]    (79,99] 
## 0.04298459 0.24195729 0.33495539 0.33062990 0.04947283 
## 
## [[7]]
## age1
##     (0,24]    (24,39]    (39,59]    (59,79]    (79,99] 
## 0.04298459 0.24195729 0.33495539 0.33062990 0.04947283 
## 
## [[8]]
## age1
##     (0,24]    (24,39]    (39,59]    (59,79]    (79,99] 
## 0.04298459 0.24195729 0.33495539 0.33062990 0.04947283 
## 
## [[9]]
## age1
##     (0,24]    (24,39]    (39,59]    (59,79]    (79,99] 
## 0.04298459 0.24195729 0.33495539 0.33062990 0.04947283 
## 
## [[10]]
## age1
##     (0,24]    (24,39]    (39,59]    (59,79]    (79,99] 
## 0.04298459 0.24195729 0.33495539 0.33062990 0.04947283
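The `age1` proportions are identical in all ten analyses, and `age1` is not among the columns passed to `mice()`. This suggests it was read from the workspace rather than from the imputed data, so its missing values were not filled in. One possible remedy is to derive the age groups from the imputed `age` inside each completed data set. The sketch below shows the idea on two small made-up data frames standing in for completed data sets. The data values and the names `completed` and `age_breaks` are hypothetical, and the break points are inferred from the interval labels in the output above.

```r
# Hypothetical stand-ins for two completed data sets,
# e.g. what complete(imp, 1) and complete(imp, 2) would return
completed <- list(
  data.frame(age = c(22, 35, 47, 63, 81)),
  data.frame(age = c(22, 35, 52, 63, 81))
)

# ASSUMPTION: break points inferred from the labels (0,24], (24,39], ...
age_breaks <- c(0, 24, 39, 59, 79, 99)

# Recompute the age groups from the (imputed) age in every completed data set
completed <- lapply(completed, function(d) {
  d$age1 <- cut(d$age, breaks = age_breaks)
  d
})

# Age-group proportions per completed data set
props <- lapply(completed, function(d) prop.table(table(d$age1)))
print(props)
```

With a real `mids` object, the same transformation could be applied to the long-format data from `complete(imp, action = "long", include = TRUE)` and converted back with `as.mids()` before refitting the model.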
est.p<-pool(fit.edu)
print(est.p)
## Class: mipo    m = 10 
##                 term  m    estimate        ubar            b           t dfcom
## 1        (Intercept) 10 13.57166878 0.083605936 0.0087553597 0.093236832  3688
## 2       subgrouphis1 10 -1.05477171 0.022429196 0.0017371266 0.024340035  3688
## 3      subgroupBorn1 10 -0.34861735 0.022777981 0.0009525136 0.023825746  3688
## 4   subgroupincom161 10  0.76926041 0.008765909 0.0006395576 0.009469422  3688
## 5  subgroupnotsmart1 10 -0.06054336 0.012434372 0.0096957467 0.023099693  3688
## 6  subgroupdisrspct1 10  0.52830315 0.015897699 0.0119292211 0.029019842  3688
## 7        subgroupsex 10 -0.38572955 0.008257486 0.0001010958 0.008368692  3688
## 8        age1(24,39] 10  1.15451568 0.055843122 0.0002867405 0.056158536  3688
## 9        age1(39,59] 10  1.04093811 0.054193538 0.0003444165 0.054572396  3688
## 10       age1(59,79] 10  0.98240349 0.055038457 0.0005243987 0.055615296  3688
## 11       age1(79,99] 10  0.88345362 0.090757465 0.0013657507 0.092259791  3688
##            df         riv      lambda         fmi
## 1   672.00355 0.115193922 0.103294969 0.105951859
## 2  1021.23240 0.085194283 0.078506019 0.080305403
## 3  2005.39781 0.045999025 0.043976165 0.044928191
## 4  1103.33299 0.080255609 0.074293165 0.075966634
## 5    41.33929 0.857728979 0.461708349 0.485988920
## 6    43.07822 0.825411463 0.452178306 0.475956207
## 7  3394.77812 0.013467219 0.013288263 0.013869061
## 8  3618.80867 0.005648225 0.005616502 0.006165611
## 9  3590.04121 0.006990836 0.006942304 0.007495071
## 10 3495.36615 0.010480645 0.010371941 0.010937707
## 11 3276.00964 0.016553192 0.016283646 0.016883654
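The `t`, `riv`, and `lambda` columns above follow from `ubar` (the average within-imputation variance) and `b` (the between-imputation variance) through Rubin's rules. The sketch below recomputes them for three terms using values copied from the table. The object names are illustrative, and only a subset of rows is used to keep the example short.

```r
m <- 10  # number of imputations

# Within-imputation (ubar) and between-imputation (b) variances from the table above
ubar <- c("(Intercept)" = 0.083605936, subgrouphis1 = 0.022429196,
          subgroupnotsmart1 = 0.012434372)
b    <- c("(Intercept)" = 0.0087553597, subgrouphis1 = 0.0017371266,
          subgroupnotsmart1 = 0.0096957467)

total_var <- ubar + (1 + 1 / m) * b          # total variance (column t)
riv       <- (1 + 1 / m) * b / ubar          # relative increase in variance
lambda    <- (1 + 1 / m) * b / total_var     # share of variance due to missingness

print(round(cbind(total_var, riv, lambda), 4))
```

The large `lambda` for `subgroupnotsmart1` reflects how much of that coefficient's uncertainty comes from the imputed values rather than from ordinary sampling variation.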
summary(est.p)
##                 term    estimate  std.error  statistic         df      p.value
## 1        (Intercept) 13.57166878 0.30534707 44.4466976  672.00355 0.000000e+00
## 2       subgrouphis1 -1.05477171 0.15601293 -6.7607966 1021.23240 2.304579e-11
## 3      subgroupBorn1 -0.34861735 0.15435591 -2.2585294 2005.39781 2.401971e-02
## 4   subgroupincom161  0.76926041 0.09731096  7.9051779 1103.33299 6.439294e-15
## 5  subgroupnotsmart1 -0.06054336 0.15198583 -0.3983487   41.33929 6.924232e-01
## 6  subgroupdisrspct1  0.52830315 0.17035211  3.1012422   43.07822 3.393776e-03
## 7        subgroupsex -0.38572955 0.09148055 -4.2165196 3394.77812 2.545754e-05
## 8        age1(24,39]  1.15451568 0.23697792  4.8718280 3618.80867 1.153093e-06
## 9        age1(39,59]  1.04093811 0.23360735  4.4559304 3590.04121 8.607925e-06
## 10       age1(59,79]  0.98240349 0.23582895  4.1657458 3495.36615 3.178412e-05
## 11       age1(79,99]  0.88345362 0.30374297  2.9085566 3276.00964 3.655439e-03
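The `subgroupnotsmart1` and `subgroupdisrspct1` terms have far fewer degrees of freedom (about 41 and 43) than the other terms, because close to half of their total variance comes from the imputation step. As a rough check, the sketch below recomputes the degrees of freedom and p-value for `subgroupnotsmart1` from the printed values. It assumes the Barnard-Rubin small-sample adjustment, which is my understanding of what `mice::pool()` applies; the variable names are illustrative.

```r
# Values copied from the subgroupnotsmart1 row of the pooled output
m         <- 10
estimate  <- -0.06054336
total_var <- 0.023099693   # column t
lambda    <- 0.461708349
dfcom     <- 3688          # complete-data residual df

# Barnard-Rubin adjusted degrees of freedom (assumed to match pool())
df_old <- (m - 1) / lambda^2
df_obs <- (dfcom + 1) / (dfcom + 3) * dfcom * (1 - lambda)
df_adj <- df_old * df_obs / (df_old + df_obs)

# Two-sided t test on the pooled estimate
statistic <- estimate / sqrt(total_var)
p_value   <- 2 * pt(-abs(statistic), df = df_adj)

round(c(df = df_adj, statistic = statistic, p.value = p_value), 4)
```

The results can be compared with the `df`, `statistic`, and `p.value` columns reported for that term.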
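# Lambda (share of variance due to missingness) for each model term; label bars by term name, not row number
lam<-data.frame(lam=est.p$pooled$lambda, param=est.p$pooled$term)

ggplot(data=lam,aes(x=param, y=lam))+
  geom_col()+
  theme(axis.text.x = element_text(angle = 45, hjust = 1))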

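library(dplyr)
# Complete-case data: keep only respondents with no missing values on any model variable
bnmgss<-gss2021_ZERODraft%>%
  select(educ, subgrouphis, subgroupBorn, subgroupincom16, subgroupnotsmart, subgroupdisrspct, subgroupsex, age)%>%
  filter(complete.cases(.))%>%
  as.data.frame()

# Baseline model on the complete cases, for comparison with the imputed results
summary(lm(educ~subgrouphis+subgroupBorn+subgroupincom16+subgroupnotsmart+subgroupdisrspct+subgroupsex+age, data=bnmgss))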
## 
## Call:
## lm(formula = educ ~ subgrouphis + subgroupBorn + subgroupincom16 + 
##     subgroupnotsmart + subgroupdisrspct + subgroupsex + age, 
##     data = bnmgss)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.6688  -2.1415   0.4042   1.6836   6.8092 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       14.945258   0.305248  48.961  < 2e-16 ***
## subgrouphis1      -1.132534   0.196673  -5.758 9.60e-09 ***
## subgroupBorn1     -0.422627   0.195618  -2.160   0.0308 *  
## subgroupincom161   0.806355   0.116508   6.921 5.77e-12 ***
## subgroupnotsmart1 -0.098224   0.139085  -0.706   0.4801    
## subgroupdisrspct1  0.416147   0.158708   2.622   0.0088 ** 
## subgroupsex       -0.221005   0.112479  -1.965   0.0495 *  
## age               -0.003931   0.003403  -1.155   0.2482    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.697 on 2343 degrees of freedom
## Multiple R-squared:  0.04089,    Adjusted R-squared:  0.03803 
## F-statistic: 14.27 on 7 and 2343 DF,  p-value: < 2.2e-16
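# Same model on the full data; lm() drops incomplete rows by default, so the estimates match the complete-case fit
fit1<-lm(educ~subgrouphis+subgroupBorn+subgroupincom16+subgroupnotsmart+subgroupdisrspct+subgroupsex+age, data=gss2021_ZERODraft)
summary(fit1)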
## 
## Call:
## lm(formula = educ ~ subgrouphis + subgroupBorn + subgroupincom16 + 
##     subgroupnotsmart + subgroupdisrspct + subgroupsex + age, 
##     data = gss2021_ZERODraft)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.6688  -2.1415   0.4042   1.6836   6.8092 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       14.945258   0.305248  48.961  < 2e-16 ***
## subgrouphis1      -1.132534   0.196673  -5.758 9.60e-09 ***
## subgroupBorn1     -0.422627   0.195618  -2.160   0.0308 *  
## subgroupincom161   0.806355   0.116508   6.921 5.77e-12 ***
## subgroupnotsmart1 -0.098224   0.139085  -0.706   0.4801    
## subgroupdisrspct1  0.416147   0.158708   2.622   0.0088 ** 
## subgroupsex       -0.221005   0.112479  -1.965   0.0495 *  
## age               -0.003931   0.003403  -1.155   0.2482    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.697 on 2343 degrees of freedom
##   (1681 observations deleted due to missingness)
## Multiple R-squared:  0.04089,    Adjusted R-squared:  0.03803 
## F-statistic: 14.27 on 7 and 2343 DF,  p-value: < 2.2e-16
fit.imp<-lm(educ~subgrouphis+subgroupBorn+subgroupincom16+subgroupnotsmart+subgroupdisrspct+subgroupsex+age, data=dat.imp)

summary(fit.imp)
## 
## Call:
## lm(formula = educ ~ subgrouphis + subgroupBorn + subgroupincom16 + 
##     subgroupnotsmart + subgroupdisrspct + subgroupsex + age, 
##     data = dat.imp)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -15.626  -2.124   0.356   1.691   7.151 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       14.5522055  0.2284391  63.703  < 2e-16 ***
## subgrouphis1      -1.2174737  0.1426718  -8.533  < 2e-16 ***
## subgroupBorn1     -0.4893178  0.1411732  -3.466 0.000534 ***
## subgroupincom161   0.7993754  0.0898499   8.897  < 2e-16 ***
## subgroupnotsmart1 -0.1176638  0.1067295  -1.102 0.270333    
## subgroupdisrspct1  0.7392800  0.1206792   6.126 9.88e-10 ***
## subgroupsex       -0.3218888  0.0870016  -3.700 0.000219 ***
## age               -0.0005952  0.0026225  -0.227 0.820463    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.731 on 4024 degrees of freedom
## Multiple R-squared:  0.05275,    Adjusted R-squared:  0.0511 
## F-statistic: 32.01 on 7 and 4024 DF,  p-value: < 2.2e-16