1 Data Cleaning

1.1 Load Libraries

library(tidyverse) # for the map() command
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(psych) # for the describe() command
## 
## Attaching package: 'psych'
## 
## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha
library(naniar) # for the gg_miss_upset() command
library(expss) # for the cross_cases() command
## Loading required package: maditr
## 
## To select columns from data: columns(mtcars, mpg, vs:carb)
## 
## 
## Attaching package: 'maditr'
## 
## The following objects are masked from 'package:dplyr':
## 
##     between, coalesce, first, last
## 
## The following object is masked from 'package:purrr':
## 
##     transpose
## 
## The following object is masked from 'package:readr':
## 
##     cols
## 
## 
## Attaching package: 'expss'
## 
## The following object is masked from 'package:naniar':
## 
##     is_na
## 
## The following objects are masked from 'package:stringr':
## 
##     fixed, regex
## 
## The following objects are masked from 'package:dplyr':
## 
##     compute, contains, na_if, recode, vars, where
## 
## The following objects are masked from 'package:purrr':
## 
##     keep, modify, modify_if, when
## 
## The following objects are masked from 'package:tidyr':
## 
##     contains, nest
## 
## The following object is masked from 'package:ggplot2':
## 
##     vars
library(ggplot2) # already attached by tidyverse above, so this call is redundant
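The startup messages above report several masked functions: `dplyr::filter()` masks `stats::filter()`, and expss in turn masks dplyr functions such as `recode`, `contains`, `vars`, `na_if`, and `where`. One way to make sure the intended version runs, whatever the load order, is to qualify the call with its package name. The following is a minimal sketch on a made-up data frame (`toy` and its columns are hypothetical, not part of the EAMMi2 data):

```r
# Toy data frame (hypothetical; stands in for the survey data)
toy <- data.frame(id = 1:4, age = c(19, 22, NA, 25))

# Namespace-qualified call: this is always dplyr's filter(),
# regardless of which packages were attached afterwards
adults <- dplyr::filter(toy, !is.na(age), age >= 21)

# The masked base version is still reachable the same way;
# here stats::filter() computes a 3-point moving average
smoothed <- stats::filter(c(1, 2, 3, 4, 5), rep(1 / 3, 3))
```

The same `package::function()` form applies to the functions expss masks, for example `dplyr::recode()`.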

1.2 Import Data

df <- read.csv(file = "/Users/lydiaschwartz/Desktop/r studio/Data Cleaning and Basic Statistics HW/EAMMi2-Data1.2.csv", header = TRUE)
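Two optional safeguards around an import like this are checking that the file exists before reading it and deciding up front how blank text fields should be treated. The sketch below writes a tiny stand-in CSV to a temporary folder so it can run anywhere; the file name, the three columns, and the choice to treat empty strings as missing (`na.strings`) are all assumptions for illustration, not settings used in this report.

```r
# Write a tiny stand-in CSV to a temporary file (hypothetical data, not EAMMi2)
csv_path <- file.path(tempdir(), "toy-survey.csv")
writeLines(c("ResponseId,age,Q14_6_TEXT",
             "R_1,19,",
             "R_2,,other"),
           csv_path)

# Fail early with a clear error if the path is wrong
stopifnot(file.exists(csv_path))

# Read the file; treating "" as NA is an assumption about how
# blank free-text answers should be handled
toy <- read.csv(file = csv_path, header = TRUE,
                na.strings = c("", "NA"),
                stringsAsFactors = FALSE)

dim(toy)              # rows and columns actually read
colSums(is.na(toy))   # missing values per column
```

Comparing `dim()` against the expected number of respondents and variables is a quick check that the import did not silently drop or split columns.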

1.3 Viewing Data

names(df)
##   [1] "StartDate"             "EndDate"               "Status"               
##   [4] "Progress"              "Duration..in.seconds." "Finished"             
##   [7] "RecordedDate"          "ResponseId"            "RecipientLastName"    
##  [10] "RecipientFirstName"    "RecipientEmail"        "ExternalReference"    
##  [13] "DistributionChannel"   "informedconsent"       "moa1.1_1"             
##  [16] "moa1.1_2"              "moa1.1_3"              "moa1.1_4"             
##  [19] "moa1.1_5"              "moa1.1_6"              "moa1.1_7"             
##  [22] "moa1.1_8"              "moa1.1_9"              "moa1.1_10"            
##  [25] "moa1.2_1"              "moa1.2_2"              "moa1.2_3"             
##  [28] "moa1.2_4"              "moa1.2_5"              "moa1.2_6"             
##  [31] "moa1.2_7"              "moa1.2_8"              "moa1.2_9"             
##  [34] "moa1.2_10"             "moa2.1_1"              "moa2.1_2"             
##  [37] "moa2.1_3"              "moa2.1_4"              "moa2.1_5"             
##  [40] "moa2.1_6"              "moa2.1_7"              "moa2.1_8"             
##  [43] "moa2.1_9"              "moa2.1_10"             "moa2.2_1"             
##  [46] "moa2.2_2"              "moa2.2_3"              "moa2.2_4"             
##  [49] "moa2.2_5"              "moa2.2_6"              "moa2.2_7"             
##  [52] "moa2.2_8"              "moa2.2_9"              "moa2.2_10"            
##  [55] "adult_Q"               "MOA_IMP_biascheck"     "MOA_ach_biascheck"    
##  [58] "MOA_IMP_dummy"         "MOA.ACH_dummy"         "Q65_First.Click"      
##  [61] "Q65_Last.Click"        "Q65_Page.Submit"       "Q65_Click.Count"      
##  [64] "IDEA_1"                "IDEA_2"                "IDEA_3"               
##  [67] "IDEA_4"                "IDEA_5"                "IDEA_6"               
##  [70] "IDEA_7"                "IDEA_8"                "IDEA.biascheck"       
##  [73] "IDEA.bias.dummy"       "Q66_First.Click"       "Q66_Last.Click"       
##  [76] "Q66_Page.Submit"       "Q66_Click.Count"       "politics"             
##  [79] "party"                 "president"             "Q74_First.Click"      
##  [82] "Q74_Last.Click"        "Q74_Page.Submit"       "Q74_Click.Count"      
##  [85] "swb_1"                 "swb_2"                 "swb_3"                
##  [88] "swb_4"                 "swb_5"                 "swb_6"                
##  [91] "Q67_First.Click"       "Q67_Last.Click"        "Q67_Page.Submit"      
##  [94] "Q67_Click.Count"       "mindful_1"             "mindful_2"            
##  [97] "mindful_3"             "mindful_4"             "mindful_5"            
## [100] "mindful_6"             "mindful_7"             "mindful_8"            
## [103] "mindful_9"             "mindful_10"            "mindful_11"           
## [106] "mindful_12"            "mindful_13"            "mindful_14"           
## [109] "mindful_15"            "mindful_biascheck"     "mindful_bias_dummy"   
## [112] "Q68_First.Click"       "Q68_Last.Click"        "Q68_Page.Submit"      
## [115] "Q68_Click.Count"       "belong_1"              "belong_2"             
## [118] "belong_3"              "belong_4"              "belong_5"             
## [121] "belong_6"              "belong_7"              "belong_8"             
## [124] "belong_9"              "belong_10"             "belnow"               
## [127] "belong_biascheck"      "belong_bias_dummy"     "Q72_First.Click"      
## [130] "Q72_Last.Click"        "Q72_Page.Submit"       "Q72_Click.Count"      
## [133] "efficacy_1"            "efficacy_2"            "efficacy_3"           
## [136] "efficacy_4"            "efficacy_5"            "efficacy_6"           
## [139] "efficacy_7"            "efficacy_8"            "efficacy_9"           
## [142] "efficacy_10"           "efficacy_biascheck"    "efficacy_bias_dummy"  
## [145] "Q77_First.Click"       "Q77_Last.Click"        "Q77_Page.Submit"      
## [148] "Q77_Click.Count"       "support_1"             "support_2"            
## [151] "support_3"             "support_4"             "support_5"            
## [154] "support_6"             "support_7"             "support_8"            
## [157] "support_9"             "support_10"            "support_11"           
## [160] "support_12"            "support_biascheck"     "support_bias_dummy"   
## [163] "Q96_First.Click"       "Q96_Last.Click"        "Q96_Page.Submit"      
## [166] "Q96_Click.Count"       "SocMedia_1"            "SocMedia_2"           
## [169] "SocMedia_3"            "SocMedia_4"            "SocMedia_5"           
## [172] "SocMedia_6"            "SocMedia_7"            "SocMedia_8"           
## [175] "SocMedia_9"            "SocMedia_10"           "SocMedia_11"          
## [178] "SocMedia_biascheck"    "SocMedia_bias_dummy"   "Q80_First.Click"      
## [181] "Q80_Last.Click"        "Q80_Page.Submit"       "Q80_Click.Count"      
## [184] "usdream_1"             "usdream_2"             "usdream_3"            
## [187] "Q73_First.Click"       "Q73_Last.Click"        "Q73_Page.Submit"      
## [190] "Q73_Click.Count"       "freq"                  "transgres"            
## [193] "relation"              "relation_10_TEXT"      "fault"                
## [196] "feel"                  "common"                "attenion2"            
## [199] "Q78_First.Click"       "Q78_Last.Click"        "Q78_Page.Submit"      
## [202] "Q78_Click.Count"       "transgres_1"           "transgres_2"          
## [205] "transgres_3"           "transgres_4"           "Q79_First.Click"      
## [208] "Q79_Last.Click"        "Q79_Page.Submit"       "Q79_Click.Count"      
## [211] "NPI1"                  "NPI2"                  "NPI3"                 
## [214] "NPI4"                  "NPI5"                  "NPI6"                 
## [217] "NPI7"                  "NPI8"                  "NPI9"                 
## [220] "NPI10"                 "NPI11"                 "NPI12"                
## [223] "NPI13"                 "exploit_1"             "exploit_2"            
## [226] "exploit_3"             "NPI_biascheck"         "NPI_bias_dummy"       
## [229] "Q76_First.Click"       "Q76_Last.Click"        "Q76_Page.Submit"      
## [232] "Q76_Click.Count"       "Q11"                   "Q14_1"                
## [235] "Q14_2"                 "Q14_3"                 "Q14_4"                
## [238] "Q14_5"                 "Q14_6"                 "Q14_6_TEXT"           
## [241] "Q10_1"                 "Q10_2"                 "Q10_3"                
## [244] "Q10_4"                 "Q10_5"                 "Q10_6"                
## [247] "Q10_7"                 "Q10_8"                 "Q10_9"                
## [250] "Q10_10"                "Q10_11"                "Q10_12"               
## [253] "Q10_13"                "Q10_14"                "Q10_15"               
## [256] "Q71_First.Click"       "Q71_Last.Click"        "Q71_Page.Submit"      
## [259] "Q71_Click.Count"       "physSx_1"              "physSx_2"             
## [262] "physSx_3"              "physSx_4"              "physSx_5"             
## [265] "physSx_6"              "physSx_7"              "physSx_8"             
## [268] "physSx_9"              "physSx_10"             "physSx_11"            
## [271] "physSx_12"             "physSx_13"             "phys_sx_biaschec"     
## [274] "phys_sym_bias_dummy."  "Q70_First.Click"       "Q70_Last.Click"       
## [277] "Q70_Page.Submit"       "Q70_Click.Count"       "stress_1"             
## [280] "stress_2"              "stress_3"              "stress_4"             
## [283] "stress_5"              "stress_6"              "stress_7"             
## [286] "stress_8"              "stress_9"              "stress_10"            
## [289] "stress_biascheck"      "stress_bias_dummy"     "Q69_First.Click"      
## [292] "Q69_Last.Click"        "Q69_Page.Submit"       "Q69_Click.Count"      
## [295] "marriage1_1"           "marriage1_2"           "marriage1_3"          
## [298] "marriage1_4"           "marriage2"             "marriage3"            
## [301] "marriage4"             "marriage5"             "Q75_First.Click"      
## [304] "Q75_Last.Click"        "Q75_Page.Submit"       "Q75_Click.Count"      
## [307] "school"                "sex"                   "age"                  
## [310] "edu"                   "sibling"               "race"                 
## [313] "race_6_TEXT"           "Q82"                   "Q83"                  
## [316] "income"                "place2"                "Q80"                  
## [319] "place"                 "Q81"                   "Q81_First.Click"      
## [322] "Q81_Last.Click"        "Q81_Page.Submit"       "Q81_Click.Count"      
## [325] "comments"              "affiliation"           "response_bias_SUM"    
## [328] "school_coded"
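With 328 columns, scanning the full `names()` listing by eye is slow. Because the scale items share prefixes (for example `swb_`) and the page-timing columns share suffixes (for example `_First.Click`), pattern matching on the names can pull out or set aside whole groups at once. The sketch below works on a short, hand-copied subset of the names above; the vector `cols` and the regular expressions are illustrative assumptions.

```r
# Hypothetical subset of the column names listed above
cols <- c("ResponseId", "swb_1", "swb_2", "Q67_First.Click",
          "Q67_Last.Click", "Q67_Page.Submit", "Q67_Click.Count",
          "mindful_1")

# Columns belonging to one scale, matched by prefix
swb_cols <- grep("^swb_", cols, value = TRUE)

# Page-timing columns, matched by suffix
timing_cols <- grep("(First\\.Click|Last\\.Click|Page\\.Submit|Click\\.Count)$",
                    cols, value = TRUE)

# Everything that is not a timing column
kept <- setdiff(cols, timing_cols)
```

On the real data the same idea would be applied to `names(df)`, and the resulting character vector can be used to subset the data frame, for example `df[, kept]`.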
head(df)
##   StartDate  EndDate Status Progress Duration..in.seconds. Finished
## 1  12/02/16 12/02/16      0      100                  1839        1
## 2  11/16/16 11/16/16      0      100                  1467        1
## 3  11/09/16 11/09/16      0       99                  2185        0
## 4  11/07/16 11/07/16      0      100                  2904        1
## 5  11/18/16 11/18/16      0      100                  1229        1
## 6  11/07/16 11/07/16      0      100                  2068        1
##       RecordedDate        ResponseId RecipientLastName RecipientFirstName
## 1   12/2/2016 5:38 R_BJN3bQqi1zUMid3                NA                 NA
## 2 11/16/2016 11:53 R_2TGbiBXmAtxywsD                NA                 NA
## 3  11/16/2016 1:22 R_12G7bIqN2wB2N65                NA                 NA
## 4   11/7/2016 4:54 R_39pldNoon8CePfP                NA                 NA
## 5  11/18/2016 0:30 R_1QiKb2LdJo1Bhvv                NA                 NA
## 6  11/7/2016 14:42 R_pmwDTZyCyCycXwB                NA                 NA
##   RecipientEmail ExternalReference DistributionChannel informedconsent moa1.1_1
## 1             NA                NA           anonymous               1        4
## 2             NA                NA           anonymous               1        4
## 3             NA                NA           anonymous               1        4
## 4             NA                NA           anonymous               1        4
## 5             NA                NA           anonymous               1        4
## 6             NA                NA           anonymous               1        4
##   moa1.1_2 moa1.1_3 moa1.1_4 moa1.1_5 moa1.1_6 moa1.1_7 moa1.1_8 moa1.1_9
## 1        4        3        2        2        3        2        1        4
## 2        4        4        2        3        3        4        3        3
## 3        4        4        1        1        4        2        3        4
## 4        3        3        1        1        2        1        1        1
## 5        4        4        1        1        3        1        1        4
## 6        3        4        2        3        4        2        1        3
##   moa1.1_10 moa1.2_1 moa1.2_2 moa1.2_3 moa1.2_4 moa1.2_5 moa1.2_6 moa1.2_7
## 1         3        2        1        2        1        1        1        2
## 2         3        1        1        2        2        1        1        1
## 3         3        2        1        1        1        1        1        2
## 4         1        1        1        1        1        1        1        2
## 5         3        2        1        2        1        3        3        1
## 6         4        1        1        1        1        1        1        1
##   moa1.2_8 moa1.2_9 moa1.2_10 moa2.1_1 moa2.1_2 moa2.1_3 moa2.1_4 moa2.1_5
## 1        3        3         2        4        4        4        4        3
## 2        1        2         3        3        4        2        4        4
## 3        3        3         3        4        2        2        4        3
## 4        3        2         3        4        2        2        4        3
## 5        1        3         2        4        4        3        4        4
## 6        1        1         3        4        4        4        4        2
##   moa2.1_6 moa2.1_7 moa2.1_8 moa2.1_9 moa2.1_10 moa2.2_1 moa2.2_2 moa2.2_3
## 1        4        4        4        3         2        2        1        1
## 2        3        2        4        2         1        3        1        2
## 3        3        4        4        3         2        2        1        1
## 4        2        4        2        3         2        2        1        1
## 5        3        4        4        3         3        3        3        3
## 6        4        3        4        4         4        2        1        1
##   moa2.2_4 moa2.2_5 moa2.2_6 moa2.2_7 moa2.2_8 moa2.2_9 moa2.2_10 adult_Q
## 1        3        2        3        3        2        2         1       1
## 2        2        1        2        2        2        2         1       1
## 3        2        1        2        3        1        1         1       1
## 4        2        1        2        2        2        2         1       1
## 5        3        3        2        3        3        2         3       1
## 6        2        1        2        2        1        2         1       1
##   MOA_IMP_biascheck MOA_ach_biascheck MOA_IMP_dummy MOA.ACH_dummy
## 1                64                38             0             0
## 2                62                33             0             0
## 3                61                33             0             0
## 4                46                32             0             0
## 5                62                47             0             0
## 6                67                27             0             0
##   Q65_First.Click Q65_Last.Click Q65_Page.Submit Q65_Click.Count IDEA_1 IDEA_2
## 1          37.139        307.731         308.890              45      3      4
## 2         120.026        336.428         338.177              58      4      4
## 3          27.705        154.447         155.544              47      4      4
## 4          19.656        297.285         298.509              43      4      4
## 5          12.867        121.932         122.254              46      4      4
## 6          15.652        223.372         225.431              50      3      4
##   IDEA_3 IDEA_4 IDEA_5 IDEA_6 IDEA_7 IDEA_8 IDEA.biascheck IDEA.bias.dummy
## 1      4      3      4      4      4      4             30               0
## 2      4      4      3      4      4      4             31               0
## 3      4      4      4      4      3      3             30               0
## 4      3      3      4      4      4      4             30               0
## 5      3      4      3      3      3      4             28               0
## 6      3      3      4      4      3      2             26               0
##   Q66_First.Click Q66_Last.Click Q66_Page.Submit Q66_Click.Count politics party
## 1          44.705         86.585          87.514              11        2     3
## 2          19.927         65.200          67.162              13        1     4
## 3          23.170         51.401          52.408              11        2     8
## 4          27.467        172.797         174.119               9        8     8
## 5          23.952         52.176          53.355               9        1     8
## 6           9.475         72.935          73.937              13        8     8
##                                                                            president
## 1                                                                                   
## 2 None, but I was a US Citizen and had a gun next to my head, I would vote for Trump
## 3                                                                    Hillary Clinton
## 4                                                                    Hillary Clinton
## 5                                                                            Sanders
## 6                                                                            No one.
##   Q74_First.Click Q74_Last.Click Q74_Page.Submit Q74_Click.Count swb_1 swb_2
## 1          13.052         40.445          46.399               2     4     6
## 2           4.899         28.125          55.107               6     3     4
## 3          34.868         48.402          56.371               4     1     2
## 4          66.886        119.219         135.295               4     5     6
## 5          23.614         32.221          35.338               4     2     5
## 6          12.314         41.232          54.436               6     4     4
##   swb_3 swb_4 swb_5 swb_6 Q67_First.Click Q67_Last.Click Q67_Page.Submit
## 1     5     5     3     3           9.627         40.388          41.198
## 2     5     5     4     4           8.607         29.115          29.955
## 3     2     2     2     2          37.656         53.240          54.603
## 4     6     5     6     3          13.587         55.197          56.150
## 5     5     3     2     5           6.798         22.246          23.138
## 6     6     5     1     4           7.927         44.108          48.227
##   Q67_Click.Count mindful_1 mindful_2 mindful_3 mindful_4 mindful_5 mindful_6
## 1               7         4         2         2         2         4         1
## 2               7         2         2         2         1         3         1
## 3               9         2         3         1         2         3         1
## 4               7         2         2         1         2         2         1
## 5               7         4         5         3         2         4         1
## 6               7         1         4         3         3         5         1
##   mindful_7 mindful_8 mindful_9 mindful_10 mindful_11 mindful_12 mindful_13
## 1         2         2         2          2          2          4          1
## 2         1         1         2          2          1          2          1
## 3         2         2         5          3          2          1          1
## 4         2         2         3          2          2          3          3
## 5         4         2         1          2          2          6          5
## 6         5         6         3          5          1          3          2
##   mindful_14 mindful_15 mindful_biascheck mindful_bias_dummy Q68_First.Click
## 1          2          4                36                  0          32.692
## 2          1          5                27                  0          13.184
## 3          1          4                33                  0          48.022
## 4          2          4                33                  0         110.432
## 5          2          5                48                  0          81.124
## 6          4          5                51                  0          33.458
##   Q68_Last.Click Q68_Page.Submit Q68_Click.Count belong_1 belong_2 belong_3
## 1        154.123         157.391              15        4        2        4
## 2         76.856          77.629              22        2        3        1
## 3        142.665         143.398              20        4        4        2
## 4        255.734         257.134              17        3        4        1
## 5        134.499         135.848              16        4        3        3
## 6        212.770         213.511              25        2        3        2
##   belong_4 belong_5 belong_6 belong_7 belong_8 belong_9 belong_10 belnow
## 1        4        4        2        5        2        4         3      4
## 2        5        4        4        2        4        5         4      4
## 3        5        4        4        2        3        4         4      2
## 4        5        4        5        2        4        4         4      4
## 5        4        4        5        1        3        2         3      4
## 6        5        4        5        1        4        4         4      3
##   belong_biascheck belong_bias_dummy Q72_First.Click Q72_Last.Click
## 1               38                 0           8.221         80.460
## 2               38                 0           6.774         65.987
## 3               38                 0           5.697         71.192
## 4               40                 0           6.096         86.373
## 5               36                 0          58.613        121.436
## 6               37                 0           6.124         70.638
##   Q72_Page.Submit Q72_Click.Count efficacy_1 efficacy_2 efficacy_3 efficacy_4
## 1          82.781              13          4          3          4          3
## 2          67.158              18          3          3          3          4
## 3          72.176              13          3          3          1          2
## 4          87.231              14          4          1          2          3
## 5         122.620              11          3          3          2          3
## 6          72.503              15          3          2          3          2
##   efficacy_5 efficacy_6 efficacy_7 efficacy_8 efficacy_9 efficacy_10
## 1          3          4          3          3          4           3
## 2          4          4          3          3          3           4
## 3          2          3          1          3          2           2
## 4          2          4          2          3          4           3
## 5          3          3          3          3          4           3
## 6          1          3          2          3          3           2
##   efficacy_biascheck efficacy_bias_dummy Q77_First.Click Q77_Last.Click
## 1                 34                   0          15.372        104.315
## 2                 34                   0          11.711         74.768
## 3                 22                   0          38.964        179.667
## 4                 28                   0          16.050        168.886
## 5                 30                   0          32.150         59.030
## 6                 24                   0           6.259         69.817
##   Q77_Page.Submit Q77_Click.Count support_1 support_2 support_3 support_4
## 1         105.195              11         7         4         6         5
## 2          75.558              21         7         7         7         6
## 3         182.727              13         6         6         5         2
## 4         170.530              10         6         6         7         3
## 5          60.273              11         6         6         5         5
## 6          70.970              13         7         7         6         6
##   support_5 support_6 support_7 support_8 support_9 support_10 support_11
## 1         6         6         7         7         7          4          6
## 2         7         6         6         7         7          7          7
## 3         7         5         5         3         6          6          5
## 4         7         6         5         4         6          6          6
## 5         6         6         7         6         6          6          7
## 6         7         2         2         1         1          7          6
##   support_12 support_biascheck support_bias_dummy Q96_First.Click
## 1          7                72                  0          12.241
## 2          7                81                  0           7.482
## 3          6                62                  0          18.845
## 4          5                67                  0          30.307
## 5          6                72                  0          13.096
## 6          2                54                  0          24.841
##   Q96_Last.Click Q96_Page.Submit Q96_Click.Count SocMedia_1 SocMedia_2
## 1         91.497          92.381              17          4          2
## 2         34.247          35.467              19          3          2
## 3         74.388          76.037              14          3          3
## 4        150.285         151.869              18          4          2
## 5         43.727          45.041              12          3          3
## 6        110.942         111.636              18          1          1
##   SocMedia_3 SocMedia_4 SocMedia_5 SocMedia_6 SocMedia_7 SocMedia_8 SocMedia_9
## 1          5          3          5          5          5          4          5
## 2          4          2          1          1          1          1          2
## 3          4          2          3          4          4          2          3
## 4          5          2          2          4          4          1          3
## 5          5          2          2          4          4          2          4
## 6          2          1          1          1          2          1          1
##   SocMedia_10 SocMedia_11 SocMedia_biascheck SocMedia_bias_dummy
## 1           5           4                 47                   0
## 2           4           2                 23                   0
## 3           3           3                 34                   0
## 4           4           4                 35                   0
## 5           4           4                 37                   0
## 6           1           1                 13                   0
##   Q80_First.Click Q80_Last.Click Q80_Page.Submit Q80_Click.Count usdream_1
## 1          17.470         59.573          60.431              11         4
## 2          10.768         45.340          47.007              15         4
## 3          30.059         66.498          68.018              12         2
## 4         112.432        161.455         163.036              11         1
## 5          22.246         53.406          55.275              12         3
## 6           5.361         89.568          90.785              15         1
##   usdream_2 usdream_3 Q73_First.Click Q73_Last.Click Q73_Page.Submit
## 1         4         1          14.144         24.073          27.558
## 2         4         1           5.240         27.206          36.973
## 3         2         1           5.671         16.751          19.891
## 4         3         1          71.153         82.926          98.687
## 5         4         1          21.293         32.533          35.448
## 6         1         2           6.771         37.203          45.268
##   Q73_Click.Count freq
## 1               3    3
## 2               5    4
## 3               3    6
## 4               3    3
## 5               3    4
## 6               4    2
##                                                                      transgres
## 1 told my friend that something he did was wrong and hurtful to somebody else.
## 2                                                             got very passive
## 3                                                     you don't want to see me
## 4                                I tried to excuse myself and said I was sorry
## 5                                                                  Got angry. 
## 6                                                     why didn't you tell him?
##   relation relation_10_TEXT fault        feel common attenion2 Q78_First.Click
## 1        4                      6           3      3         7          22.366
## 2        7                      2 1,4,5,6,7,8      4         7           6.174
## 3        5                      3           6      5         7           9.705
## 4        4                      3      10,4,6      2         7          58.517
## 5        7                      2         5,6      3        NA           5.647
## 6        6                      3           1      4         7          15.862
##   Q78_Last.Click Q78_Page.Submit Q78_Click.Count transgres_1 transgres_2
## 1        148.934         153.134              12           3           1
## 2         72.626          74.609              22           4           3
## 3        109.167         111.847              10           3           1
## 4        195.290         199.227              14           4           2
## 5         61.894          65.304              13           4           2
## 6        174.894         177.914              21           1           3
##   transgres_3 transgres_4 Q79_First.Click Q79_Last.Click Q79_Page.Submit
## 1           1           1          16.239         27.208          28.488
## 2           2           1          20.400         31.697          32.941
## 3           4           1          57.079         77.311          82.171
## 4           1           1          23.005         49.246          50.303
## 5           2           1          11.727         23.987          25.130
## 6           1           1          14.701         35.077          40.711
##   Q79_Click.Count NPI1 NPI2 NPI3 NPI4 NPI5 NPI6 NPI7 NPI8 NPI9 NPI10 NPI11
## 1               4    1    1    1    2    1    2    2    1    2     2     1
## 2               4    2    2    1    2    1    2    1    1    1     1     2
## 3               5    2    1    1    2    1    2    1    1    1     2     2
## 4               5    2    2    1    2    1    2    2    1    1     2     1
## 5               8    1    2    1    2    1    2    2    2    1     2     1
## 6               6    2    2    2    1    2    2    2    2    1     2     1
##   NPI12 NPI13 exploit_1 exploit_2 exploit_3 NPI_biascheck NPI_bias_dummy
## 1     2     1         2         2         2            19              0
## 2     1     1         4         4         3            18              0
## 3     2     1         5         5         3            19              0
## 4     2     1         2         1         2            20              0
## 5     2     2         5         4         3            21              0
## 6     2     2         1         1         2            23              0
##   Q76_First.Click Q76_Last.Click Q76_Page.Submit Q76_Click.Count Q11 Q14_1
## 1          25.575        141.688         142.668              16   2     2
## 2          10.473        103.989         105.050              35   2     2
## 3          29.957        129.786         132.515              22   2     2
## 4          49.912        222.748         223.703              18   2     2
## 5           5.315        108.076         109.093              25   2     2
## 6           5.353         94.716          98.083              25   2     2
##   Q14_2 Q14_3 Q14_4 Q14_5 Q14_6 Q14_6_TEXT Q10_1 Q10_2 Q10_3 Q10_4 Q10_5 Q10_6
## 1     2     2     2     2     2               NA    NA    NA    NA    NA    NA
## 2     2     2     2     2    NA               NA    NA    NA    NA    NA    NA
## 3     2     2     1     2     2                4     3     4     2     4     5
## 4     2     2     2     2     2               NA    NA    NA    NA    NA    NA
## 5     2     2     2     2    NA               NA    NA    NA    NA    NA    NA
## 6     2     2     2     2     2               NA    NA    NA    NA    NA    NA
##   Q10_7 Q10_8 Q10_9 Q10_10 Q10_11 Q10_12 Q10_13 Q10_14 Q10_15 Q71_First.Click
## 1    NA    NA    NA     NA     NA     NA     NA     NA     NA           5.199
## 2    NA    NA    NA     NA     NA     NA     NA     NA     NA           6.497
## 3     3     1     4      1      3      1      4      2      2           4.533
## 4    NA    NA    NA     NA     NA     NA     NA     NA     NA           7.430
## 5    NA    NA    NA     NA     NA     NA     NA     NA     NA           5.020
## 6    NA    NA    NA     NA     NA     NA     NA     NA     NA          15.872
##   Q71_Last.Click Q71_Page.Submit Q71_Click.Count physSx_1 physSx_2 physSx_3
## 1         28.239          28.669               7        3        1        1
## 2         23.457          24.600               8        2        2        1
## 3        115.299         116.410              21        3        1        1
## 4        122.984         128.490              12        2        3        2
## 5          7.996          17.581               6        1        1        1
## 6         32.719          33.562               7        1        2        2
##   physSx_4 physSx_5 physSx_6 physSx_7 physSx_8 physSx_9 physSx_10 physSx_11
## 1        2        1        2        1        2        2         2         2
## 2        1        3        1        1        3        2         1         1
## 3        3        2        2        1        3        2         1         2
## 4        3        1        2        1        2        1         1         2
## 5        2        1        1        1        1        2         1         1
## 6        1        2        2        1        3        1         2         2
##   physSx_12 physSx_13 phys_sx_biaschec phys_sym_bias_dummy. Q70_First.Click
## 1         3         2               24                    0          13.703
## 2         3         3               24                    0           5.209
## 3         3         3               27                    0          36.694
## 4         3         2               25                    0          47.764
## 5         2         1               16                    0           4.561
## 6         3         2               24                    0          10.062
##   Q70_Last.Click Q70_Page.Submit Q70_Click.Count stress_1 stress_2 stress_3
## 1         54.318          60.887              16        2        4        5
## 2         27.841          28.982              17        4        5        5
## 3         79.053          80.519              18        4        4        5
## 4         77.308          79.068              13        2        4        4
## 5         18.707          20.343              14        4        3        3
## 6         80.372          82.827              18        3        3        5
##   stress_4 stress_5 stress_6 stress_7 stress_8 stress_9 stress_10
## 1        3        3        3        4        3        3         3
## 2        4        3        3        2        2        4         4
## 3        2        1        5        2        2        4         4
## 4        3        4        5        4        2        2         2
## 5        4        3        4        3        3        4         4
## 6        2        3        5        2        1        3         2
##   stress_biascheck stress_bias_dummy Q69_First.Click Q69_Last.Click
## 1               33                 0          10.383        100.428
## 2               36                 0           9.729         43.281
## 3               33                 0          42.213         89.329
## 4               32                 0          43.206        180.767
## 5               35                 0          63.797         76.765
## 6               29                 0          16.412         94.543
##   Q69_Page.Submit Q69_Click.Count marriage1_1 marriage1_2 marriage1_3
## 1         101.287              14          10          25          30
## 2          44.423              13          10          25          35
## 3          91.194              12           1           1          59
## 4         182.932              12           0           0          60
## 5          77.998              13          25          25          25
## 6          95.709              15          13          33          21
##   marriage1_4 marriage2 marriage3 marriage4 marriage5 Q75_First.Click
## 1          35         2        20         1         2          20.229
## 2          30         3        19         1         1          11.100
## 3          39         2        19         1         1          40.807
## 4          40         1        16         1         1          74.778
## 5          25         2        14         3         1          21.347
## 6          33         3        17         1         1          19.366
##   Q75_Last.Click Q75_Page.Submit Q75_Click.Count                     school sex
## 1        164.560         165.870              32                        ACG   2
## 2         60.223          61.354              17                  ACG,Deree   1
## 3        160.366         165.459              23 American College of Greece   1
## 4        155.971         156.974              13 American College of Greece   2
## 5         48.802          49.742              11 American College of Greece   1
## 6        181.747         183.288              25 American College of Greece   2
##   age edu sibling race race_6_TEXT Q82 Q83 income place2 Q80 place    Q81
## 1  20   2       2    1               3  NA      3      2  NA    NA Greece
## 2  23   5       2    1               3  NA      3      2  NA    NA Greece
## 3  23   2       5    1               3  NA      1      2  NA    NA Greece
## 4  22   2   3,5,7    6       Greek   3  NA      1      2  NA    NA Greece
## 5  18   2   3,5,7    1               3  NA      6      2  NA    NA Greece
## 6  23   2     2,4    1               3  NA      1      2  NA    NA Greece
##   Q81_First.Click Q81_Last.Click Q81_Page.Submit Q81_Click.Count
## 1           0.000          0.000           5.781               0
## 2           2.424          2.424           6.621               1
## 3           0.000          0.000           9.294               0
## 4           2.679          2.679           5.803               1
## 5           1.632          1.632           5.500               1
## 6           2.766          2.766           6.995               1
##                                                                                                                              comments
## 1                                                                                                        I have completed this survey
## 2                                                                                                        i have completed this survey
## 3 Didn't know my household income.\nI have completed the last question about disability wrong. I read "not" the moment I pressed ">>"
## 4                                                                                                        I have completed this survey
## 5                                                                                                        i have completed this survey
## 6                                                                                             The question saying "Select" was funny.
##   affiliation response_bias_SUM school_coded
## 1    acgreece                 0     acgreece
## 2    acgreece                 0     acgreece
## 3    acgreece                 0     acgreece
## 4    acgreece                 0     acgreece
## 5    acgreece                 0     acgreece
## 6    acgreece                 0     acgreece
str(df)
## 'data.frame':    3182 obs. of  328 variables:
##  $ StartDate            : chr  "12/02/16" "11/16/16" "11/09/16" "11/07/16" ...
##  $ EndDate              : chr  "12/02/16" "11/16/16" "11/09/16" "11/07/16" ...
##  $ Status               : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Progress             : int  100 100 99 100 100 100 99 99 100 100 ...
##  $ Duration..in.seconds.: int  1839 1467 2185 2904 1229 2068 1656 1839 1160 2134 ...
##  $ Finished             : int  1 1 0 1 1 1 0 0 1 1 ...
##  $ RecordedDate         : chr  "12/2/2016 5:38" "11/16/2016 11:53" "11/16/2016 1:22" "11/7/2016 4:54" ...
##  $ ResponseId           : chr  "R_BJN3bQqi1zUMid3" "R_2TGbiBXmAtxywsD" "R_12G7bIqN2wB2N65" "R_39pldNoon8CePfP" ...
##  $ RecipientLastName    : logi  NA NA NA NA NA NA ...
##  $ RecipientFirstName   : logi  NA NA NA NA NA NA ...
##  $ RecipientEmail       : logi  NA NA NA NA NA NA ...
##  $ ExternalReference    : logi  NA NA NA NA NA NA ...
##  $ DistributionChannel  : chr  "anonymous" "anonymous" "anonymous" "anonymous" ...
##  $ informedconsent      : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ moa1.1_1             : int  4 4 4 4 4 4 4 4 4 3 ...
##  $ moa1.1_2             : int  4 4 4 3 4 3 4 4 4 2 ...
##  $ moa1.1_3             : int  3 4 4 3 4 4 4 4 4 3 ...
##  $ moa1.1_4             : int  2 2 1 1 1 2 3 3 4 1 ...
##  $ moa1.1_5             : int  2 3 1 1 1 3 3 3 4 1 ...
##  $ moa1.1_6             : int  3 3 4 2 3 4 4 4 4 2 ...
##  $ moa1.1_7             : int  2 4 2 1 1 2 4 3 1 3 ...
##  $ moa1.1_8             : int  1 3 3 1 1 1 4 3 2 3 ...
##  $ moa1.1_9             : int  4 3 4 1 4 3 4 4 4 4 ...
##  $ moa1.1_10            : int  3 3 3 1 3 4 3 3 4 3 ...
##  $ moa1.2_1             : int  2 1 2 1 2 1 1 2 2 2 ...
##  $ moa1.2_2             : int  1 1 1 1 1 1 1 2 2 2 ...
##  $ moa1.2_3             : int  2 2 1 1 2 1 2 2 2 2 ...
##  $ moa1.2_4             : int  1 2 1 1 1 1 1 1 1 1 ...
##  $ moa1.2_5             : int  1 1 1 1 3 1 1 1 1 1 ...
##  $ moa1.2_6             : int  1 1 1 1 3 1 1 1 2 2 ...
##  $ moa1.2_7             : int  2 1 2 2 1 1 3 3 1 3 ...
##  $ moa1.2_8             : int  3 1 3 3 1 1 3 3 3 3 ...
##  $ moa1.2_9             : int  3 2 3 2 3 1 3 3 3 3 ...
##  $ moa1.2_10            : int  2 3 3 3 2 3 3 3 1 3 ...
##  $ moa2.1_1             : int  4 3 4 4 4 4 4 4 4 3 ...
##  $ moa2.1_2             : int  4 4 2 2 4 4 4 4 4 3 ...
##  $ moa2.1_3             : int  4 2 2 2 3 4 3 4 4 3 ...
##  $ moa2.1_4             : int  4 4 4 4 4 4 4 4 4 4 ...
##  $ moa2.1_5             : int  3 4 3 3 4 2 4 4 4 3 ...
##  $ moa2.1_6             : int  4 3 3 2 3 4 4 3 4 4 ...
##  $ moa2.1_7             : int  4 2 4 4 4 3 4 4 2 4 ...
##  $ moa2.1_8             : int  4 4 4 2 4 4 3 4 4 4 ...
##  $ moa2.1_9             : int  3 2 3 3 3 4 4 4 4 4 ...
##  $ moa2.1_10            : int  2 1 2 2 3 4 2 4 2 2 ...
##  $ moa2.2_1             : int  2 3 2 2 3 2 2 2 3 3 ...
##  $ moa2.2_2             : int  1 1 1 1 3 1 1 1 1 2 ...
##  $ moa2.2_3             : int  1 2 1 1 3 1 1 1 3 3 ...
##  $ moa2.2_4             : int  3 2 2 2 3 2 2 3 3 3 ...
##  $ moa2.2_5             : int  2 1 1 1 3 1 1 1 2 2 ...
##  $ moa2.2_6             : int  3 2 2 2 2 2 2 2 3 3 ...
##  $ moa2.2_7             : int  3 2 3 2 3 2 1 2 1 3 ...
##  $ moa2.2_8             : int  2 2 1 2 3 1 2 3 2 2 ...
##  $ moa2.2_9             : int  2 2 1 2 2 2 2 2 3 3 ...
##  $ moa2.2_10            : int  1 1 1 1 3 1 1 1 1 2 ...
##  $ adult_Q              : int  1 1 1 1 1 1 1 1 1 2 ...
##  $ MOA_IMP_biascheck    : int  64 62 61 46 62 67 73 74 71 59 ...
##  $ MOA_ach_biascheck    : int  38 33 33 32 47 27 34 39 40 48 ...
##  $ MOA_IMP_dummy        : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ MOA.ACH_dummy        : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Q65_First.Click      : num  37.1 120 27.7 19.7 12.9 ...
##  $ Q65_Last.Click       : num  308 336 154 297 122 ...
##  $ Q65_Page.Submit      : num  309 338 156 299 122 ...
##  $ Q65_Click.Count      : int  45 58 47 43 46 50 45 42 47 47 ...
##  $ IDEA_1               : int  3 4 4 4 4 3 4 4 4 4 ...
##  $ IDEA_2               : int  4 4 4 4 4 4 3 3 4 4 ...
##  $ IDEA_3               : int  4 4 4 3 3 3 4 3 3 2 ...
##  $ IDEA_4               : int  3 4 4 3 4 3 4 4 2 2 ...
##  $ IDEA_5               : int  4 3 4 4 3 4 3 3 4 4 ...
##  $ IDEA_6               : int  4 4 4 4 3 4 4 2 4 4 ...
##  $ IDEA_7               : int  4 4 3 4 3 3 3 2 3 3 ...
##  $ IDEA_8               : int  4 4 3 4 4 2 3 3 3 4 ...
##  $ IDEA.biascheck       : int  30 31 30 30 28 26 28 24 27 27 ...
##  $ IDEA.bias.dummy      : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Q66_First.Click      : num  44.7 19.9 23.2 27.5 24 ...
##  $ Q66_Last.Click       : num  86.6 65.2 51.4 172.8 52.2 ...
##  $ Q66_Page.Submit      : num  87.5 67.2 52.4 174.1 53.4 ...
##  $ Q66_Click.Count      : int  11 13 11 9 9 13 9 8 8 9 ...
##  $ politics             : int  2 1 2 8 1 8 4 2 8 4 ...
##  $ party                : int  3 4 8 8 8 8 4 1 3 8 ...
##  $ president            : chr  "" "None, but I was a US Citizen and had a gun next to my head, I would vote for Trump" "Hillary Clinton" "Hillary Clinton" ...
##  $ Q74_First.Click      : num  13.1 4.9 34.9 66.9 23.6 ...
##  $ Q74_Last.Click       : num  40.4 28.1 48.4 119.2 32.2 ...
##  $ Q74_Page.Submit      : num  46.4 55.1 56.4 135.3 35.3 ...
##  $ Q74_Click.Count      : int  2 6 4 4 4 6 3 3 3 3 ...
##  $ swb_1                : int  4 3 1 5 2 4 4 5 5 5 ...
##  $ swb_2                : int  6 4 2 6 5 4 4 5 5 4 ...
##  $ swb_3                : int  5 5 2 6 5 6 5 6 5 6 ...
##  $ swb_4                : int  5 5 2 5 3 5 4 5 5 4 ...
##  $ swb_5                : int  3 4 2 6 2 1 1 6 5 6 ...
##  $ swb_6                : int  3 4 2 3 5 4 4 6 5 4 ...
##  $ Q67_First.Click      : num  9.63 8.61 37.66 13.59 6.8 ...
##  $ Q67_Last.Click       : num  40.4 29.1 53.2 55.2 22.2 ...
##  $ Q67_Page.Submit      : num  41.2 30 54.6 56.1 23.1 ...
##  $ Q67_Click.Count      : int  7 7 9 7 7 7 7 7 6 13 ...
##  $ mindful_1            : int  4 2 2 2 4 1 5 3 3 5 ...
##  $ mindful_2            : int  2 2 3 2 5 4 6 2 3 6 ...
##  $ mindful_3            : int  2 2 1 1 3 3 5 4 4 4 ...
##  $ mindful_4            : int  2 1 2 2 2 3 4 2 3 5 ...
##  $ mindful_5            : int  4 3 3 2 4 5 6 2 3 6 ...
##   [list output truncated]

1.4 Subsetting Data

d <- subset(df, select=c(belnow, marriage2,marriage5,income, race, politics, party, efficacy_1, efficacy_2, efficacy_3, efficacy_4, efficacy_5, efficacy_6, efficacy_7, efficacy_8, efficacy_9, efficacy_10))

1.5 Basic Data Checking

1.5.1 Checking Values

d %>%
    map(table, useNA = "always")
## $belnow
## 
##    1    2    3    4    5 <NA> 
##  115  308  829 1371  555    4 
## 
## $marriage2
## 
##    1    2    3    4    5 <NA> 
##  154  361  746 1155  756   10 
## 
## $marriage5
## 
##    1    2    3    4 <NA> 
## 2137  736   48  251   10 
## 
## $income
## 
##    1    2    3    4    5    6    7    8    9 <NA> 
##  860  518  361  344  302  236  389  140    7   25 
## 
## $race
## 
##                   1       1,2     1,2,3 1,2,3,4,5   1,2,3,5     1,2,4   1,2,4,5 
##         9      2026        26         2         1         3         4         3 
##     1,2,5     1,2,6       1,3     1,3,4     1,3,5       1,4     1,4,5       1,5 
##         4         2        98         6         8        39         2        35 
##       1,6         2       2,3     2,3,4   2,3,4,5     2,3,5     2,3,6       2,4 
##        15       249         5         1         1         1         1         6 
##       2,5     2,5,6       2,6         3       3,4     3,4,5       3,5         4 
##         5         1         6       286         9         1         5       210 
##       4,6         5         6      <NA> 
##         3        12        97         0 
## 
## $politics
## 
##    1    2    3    4    5    6    7    8 <NA> 
##  235  772  373  568  308  332   57  532    5 
## 
## $party
## 
##    1    2    3    4    5    6    7    8 <NA> 
##  345  595  669  330  329  320  136  441   17 
## 
## $efficacy_1
## 
##    1    2    3    4 <NA> 
##   26  219 1789 1145    3 
## 
## $efficacy_2
## 
##    1    2    3    4 <NA> 
##   89  802 1841  448    2 
## 
## $efficacy_3
## 
##    1    2    3    4 <NA> 
##   62  536 1790  792    2 
## 
## $efficacy_4
## 
##    1    2    3    4 <NA> 
##   69  446 1849  815    3 
## 
## $efficacy_5
## 
##    1    2    3    4 <NA> 
##   54  455 1928  742    3 
## 
## $efficacy_6
## 
##    1    2    3    4 <NA> 
##   11  122 1551 1495    3 
## 
## $efficacy_7
## 
##    1    2    3    4 <NA> 
##  126  580 1647  826    3 
## 
## $efficacy_8
## 
##    1    2    3    4 <NA> 
##   29  388 1949  812    4 
## 
## $efficacy_9
## 
##    1    2    3    4 <NA> 
##   17  195 1963 1005    2 
## 
## $efficacy_10
## 
##    1    2    3    4 <NA> 
##   39  273 1909  958    3

1.5.2 Recoding data

table(d$race, useNA = "always")
## 
##                   1       1,2     1,2,3 1,2,3,4,5   1,2,3,5     1,2,4   1,2,4,5 
##         9      2026        26         2         1         3         4         3 
##     1,2,5     1,2,6       1,3     1,3,4     1,3,5       1,4     1,4,5       1,5 
##         4         2        98         6         8        39         2        35 
##       1,6         2       2,3     2,3,4   2,3,4,5     2,3,5     2,3,6       2,4 
##        15       249         5         1         1         1         1         6 
##       2,5     2,5,6       2,6         3       3,4     3,4,5       3,5         4 
##         5         1         6       286         9         1         5       210 
##       4,6         5         6      <NA> 
##         3        12        97         0
d$race_rc <- NA
d$race_rc[d$race == 1] <- "white"
d$race_rc[d$race == 2] <- "black"
d$race_rc[d$race == 3] <- "hispanic"
d$race_rc[d$race == 4] <- "asian"
d$race_rc[d$race == 5] <- "nativeamer"
d$race_rc[d$race == 6] <- "other"
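
The six assignments above can also be written as one lookup. This is a minimal sketch, assuming the same codes 1 to 6 as the recode above; `race_labels` and the toy `race` vector are illustrative names and are not part of the dataset. Responses that selected more than one category (such as "1,2") or left the item blank match no single code and stay NA, just as they do in the recode above.

```r
# Hypothetical lookup table: survey code -> label (same mapping as the recode above)
race_labels <- c("1" = "white", "2" = "black", "3" = "hispanic",
                 "4" = "asian", "5" = "nativeamer", "6" = "other")

# Toy input standing in for d$race (a character column in this dataset)
race <- c("1", "6", "1,2", "", "3")

# Index the lookup by name; values that are not in the table come back as NA
race_rc <- unname(race_labels[race])
race_rc
```

Running `table(race_rc, useNA = "always")` afterwards is a quick way to see how many responses ended up as NA.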

1.5.3 Factor Scores/Composite Variables

# use the str() command to check that your recoded variable is numeric so you can use mathematical operators on it
str(d)
## 'data.frame':    3182 obs. of  18 variables:
##  $ belnow     : int  4 4 2 4 4 3 4 4 4 3 ...
##  $ marriage2  : int  2 3 2 1 2 3 4 3 4 2 ...
##  $ marriage5  : int  2 1 1 1 1 1 1 1 1 4 ...
##  $ income     : int  3 3 1 1 6 1 2 3 7 1 ...
##  $ race       : chr  "1" "1" "1" "6" ...
##  $ politics   : int  2 1 2 8 1 8 4 2 8 4 ...
##  $ party      : int  3 4 8 8 8 8 4 1 3 8 ...
##  $ efficacy_1 : int  4 3 3 4 3 3 3 3 3 4 ...
##  $ efficacy_2 : int  3 3 3 1 3 2 3 3 3 3 ...
##  $ efficacy_3 : int  4 3 1 2 2 3 2 3 3 4 ...
##  $ efficacy_4 : int  3 4 2 3 3 2 1 3 3 4 ...
##  $ efficacy_5 : int  3 4 2 2 3 1 2 3 3 4 ...
##  $ efficacy_6 : int  4 4 3 4 3 3 3 3 3 4 ...
##  $ efficacy_7 : int  3 3 1 2 3 2 2 3 3 3 ...
##  $ efficacy_8 : int  3 3 3 3 3 3 2 3 3 4 ...
##  $ efficacy_9 : int  4 3 2 4 4 3 2 3 3 4 ...
##  $ efficacy_10: int  3 4 2 3 3 2 3 3 3 3 ...
##  $ race_rc    : chr  "white" "white" "white" "other" ...
d$efficacy <- (d$efficacy_1 + d$efficacy_2 + d$efficacy_3 + d$efficacy_4 + d$efficacy_5 + d$efficacy_6 + d$efficacy_7 + d$efficacy_8 + d$efficacy_9 + d$efficacy_10)/10

d$politicalviews <- (d$politics + d$party)/2

d$marriageimportance <- (d$marriage2)

d$belong <- (d$belnow)
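
A composite such as the efficacy score can also be computed with `rowMeans()` instead of typing out every item. The sketch below uses toy data with three items rather than ten; `toy` and `items` are hypothetical names. With `na.rm = FALSE`, a participant who is missing any item gets NA, which matches the behavior of adding the items and dividing by their count.

```r
# Toy data standing in for the efficacy items (three shown for brevity)
toy <- data.frame(efficacy_1 = c(4, 3, NA),
                  efficacy_2 = c(3, 3, 2),
                  efficacy_3 = c(4, 1, 3))

# Select the item columns by name pattern
items <- grep("^efficacy_[0-9]+$", names(toy), value = TRUE)

# Row-wise mean; any missing item makes the composite NA
toy$efficacy <- rowMeans(toy[, items], na.rm = FALSE)

# Same result as the manual sum divided by the number of items
manual <- (toy$efficacy_1 + toy$efficacy_2 + toy$efficacy_3) / 3
toy$efficacy
```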

1.6 Exporting Data

d2 <- subset(d, select=c(efficacy, belong, marriageimportance, race_rc, politicalviews, income))

write.csv(d2, file="/Users/lydiaschwartz/Desktop/r studio/Data Cleaning and Basic Statistics HW/final.csv", row.names = F)

2 Basic Statistics

2.1 Import Cleaned Data

d2 <- read.csv(file= "/Users/lydiaschwartz/Desktop/r studio/Data Cleaning and Basic Statistics HW/final.csv", header=T) 

2.2 Check Data

2.2.1 Formatting

head(d2)
##   efficacy belong marriageimportance race_rc politicalviews income
## 1      3.4      4                  2   white            2.5      3
## 2      3.4      4                  3   white            2.5      3
## 3      2.2      2                  2   white            5.0      1
## 4      2.8      4                  1   other            8.0      1
## 5      3.0      4                  2   white            4.5      6
## 6      2.4      3                  3   white            8.0      1
str(d2)
## 'data.frame':    3182 obs. of  6 variables:
##  $ efficacy          : num  3.4 3.4 2.2 2.8 3 2.4 2.3 3 3 3.7 ...
##  $ belong            : int  4 4 2 4 4 3 4 4 4 3 ...
##  $ marriageimportance: int  2 3 2 1 2 3 4 3 4 2 ...
##  $ race_rc           : chr  "white" "white" "white" "other" ...
##  $ politicalviews    : num  2.5 2.5 5 8 4.5 8 4 1.5 5.5 6 ...
##  $ income            : int  3 3 1 1 6 1 2 3 7 1 ...
d2$race_rc <- as.factor(d2$race_rc)
d2$income <- as.factor(d2$income)

2.3 Making a new subset with updated race (from numbers to written form)

d2 <- subset(d2, select=c(efficacy, belong, marriageimportance, race_rc, politicalviews, income))

2.4 Univariate Normality

describe(d2)
##                    vars    n mean   sd median trimmed  mad min max range  skew
## efficacy              1 3176 3.13 0.45    3.1    3.13 0.44   1   4     3 -0.29
## belong                2 3178 3.61 1.00    4.0    3.68 1.48   1   5     4 -0.62
## marriageimportance    3 3172 3.63 1.11    4.0    3.72 1.48   1   5     4 -0.59
## race_rc*              4 2880 4.95 1.75    6.0    5.28 0.00   1   6     5 -1.25
## politicalviews        5 3163 4.14 2.05    4.0    4.01 2.22   1   8     7  0.40
## income*               6 3157 3.54 2.30    3.0    3.37 2.97   1   9     8  0.47
##                    kurtosis   se
## efficacy               0.63 0.01
## belong                 0.04 0.02
## marriageimportance    -0.35 0.02
## race_rc*              -0.14 0.03
## politicalviews        -0.93 0.04
## income*               -1.12 0.04

2.5 Histograms

# use the hist() command to create a histogram for your continuous variables
hist(d2$efficacy)

hist(d2$belong)

hist(d2$marriageimportance)

hist(d2$politicalviews)

# use the table() command to create a table for your categorical variables (other than your ID variable)
table(d2$race_rc, useNA = "always")
## 
##      asian      black   hispanic nativeamer      other      white       <NA> 
##        210        249        286         12         97       2026        302
table(d2$income, useNA = "always")
## 
##    1    2    3    4    5    6    7    8    9 <NA> 
##  860  518  361  344  302  236  389  140    7   25

2.6 Missing Data

# use the gg_miss_upset() command to visualize your missing data
gg_miss_upset(data=d, nsets = 6)

# create a new dataframe with only your complete cases/observations
d2 <- na.omit(d)
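
Note that `na.omit()` drops a row when any column in the data frame is NA, so the resulting sample size depends on which columns are present. The sketch below, on toy data with hypothetical names, compares the full data frame with one restricted to the analysis variables.

```r
toy <- data.frame(efficacy = c(3.4, NA, 2.2, 3.0),
                  income   = c(3, 1, NA, 6),
                  extra    = c(NA, 2, 2, 2))  # a column not used in the analysis

# Dropping on every column also removes row 1, which is missing only `extra`
n_all <- nrow(na.omit(toy))

# Restricting to the analysis variables first keeps row 1
n_analysis <- nrow(na.omit(toy[, c("efficacy", "income")]))

c(all_columns = n_all, analysis_only = n_analysis)
```

Comparing the two counts shows whether columns outside the analysis are removing participants.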

2.7 Crosstabs & Scatterplots

2.7.1 Crosstabs

# use the cross_cases() command to create a crosstab of your categorical variables
cross_cases(d2, race_rc, income)
##                income
## race_rc           1   2   3   4   5   6   7   8   9
## asian            86  25  21  27  14   7  19  10
## black            72  54  29  37  11  12  18   2   1
## hispanic         63  45  57  67  26  11  12   4
## nativeamer        4   1   1   2   1       2   1
## other            41  18  10   8   1   6   4   4
## white           513 319 195 171 222 174 299 104   6
## #Total cases    779 462 313 312 275 210 354 125   7

2.7.2 Scatterplots

# use the plot() command to create scatterplots of your continuous variables
plot(d2$efficacy, d2$belong,
     main="scatterplot of efficacy and belonging",
     xlab = "efficacy",
     ylab = "belonging")

# use the plot() command to create scatterplots of your continuous variables
plot(d2$marriageimportance, d2$politicalviews,
     main="scatterplot of marriage importance and political views",
     xlab = "marriage importance",
     ylab = "political views")

# use the plot() command to create scatterplots of your continuous variables
plot(d2$efficacy, d2$marriageimportance,
     main="scatterplot of efficacy and marriage importance",
     xlab = "efficacy",
     ylab = "marriage importance")

# use the plot() command to create scatterplots of your continuous variables
plot(d2$politicalviews, d2$belong,
     main="scatterplot of political views and belonging",
     xlab = "political views",
     ylab = "belonging")

2.8 Boxplots

# use the boxplot() command to create boxplots of your continuous and categorical variables
boxplot(data=d2, income~race_rc,
        main="income and race",
        xlab = "race",
        ylab = "income")

boxplot(data=d2, marriageimportance~race_rc,
        main="race and marriage",
        xlab = "race",
        ylab = "marriage importance")

boxplot(data=d2, efficacy~race_rc,
        main="race and efficacy",
        xlab = "race",
        ylab = "efficacy")

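# count the missing values for each analysis variable
na_income <- sum(is.na(d$income))
na_race <- sum(is.na(d$race_rc))
na_belong <- sum(is.na(d$belong))
na_politicalviews <- sum(is.na(d$politicalviews))
na_marriageimportance <- sum(is.na(d$marriageimportance))
na_efficacy <- sum(is.na(d$efficacy))

# sample size after and before removing incomplete cases
sample_size_complete <- nrow(d2)
sample_size_original <- nrow(d)

# percentage of participants removed (original n = 3182, reported complete n = 3130)
item_non_response_percentage <- (3182 - 3130) / 3182 * 100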
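
The same bookkeeping can be done without typing the sample sizes by hand. This is a sketch on toy data; `toy`, `na_counts`, `n_original`, `n_complete`, and `pct_removed` are hypothetical names, and the numbers are not from the EAMMi2 data.

```r
toy <- data.frame(income   = c(1, NA, 3, 4, 5),
                  belong   = c(4, 4, NA, 2, 3),
                  efficacy = c(3.1, 2.9, 3.4, NA, 3.0))

# Missing values per variable in one call
na_counts <- colSums(is.na(toy))

# Sample sizes before and after listwise deletion
n_original <- nrow(toy)
n_complete <- sum(complete.cases(toy))

# Percentage of participants removed
pct_removed <- (n_original - n_complete) / n_original * 100
```

Computing the percentage from `nrow()` and `complete.cases()` keeps it in step with the data if the set of selected variables changes.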

3 Homework Write up

Do your continuous variables meet the criteria for univariate normality? Skew and kurtosis should be between -2 and +2.

Yes, my continuous variables meet these criteria.

Efficacy: skew = -0.24, kurtosis = 0.46

Marriage Opinions: skew = -0.60, kurtosis = -0.33

Political Views: skew = 0.40, kurtosis = -0.92

Belonging: skew = -0.62, kurtosis = 0.05
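
The -2 to +2 check can also be done in code. The sketch below computes moment-based skew and excess kurtosis in base R. The helper `skew_kurt()` and the toy vector are hypothetical, and the formula is assumed (not confirmed here) to correspond to the default used by `describe()`, so small differences from the values above are possible.

```r
# Sample skewness and excess kurtosis (moment-based, using sd() with n - 1)
skew_kurt <- function(x) {
  x <- x[!is.na(x)]
  z <- (x - mean(x)) / sd(x)
  c(skew = mean(z^3), kurtosis = mean(z^4) - 3)
}

# Toy symmetric data standing in for a continuous variable
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, NA)
sk <- skew_kurt(x)

# TRUE when both statistics fall inside the -2 to +2 range
within_range <- all(abs(sk) <= 2)
```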

Do you have any missing data? Once you have removed the cases/participants with missing data, what is your total sample size? Please discuss how much data is missing and whether it’s due to survey design or individual non-response.

I do have missing data. 25 participants did not report an income. 19 participants did not report their political views. 10 participants did not report their perceived marriage importance. 4 participants did not report their perceived belonging. 6 participants did not report their perceived self-efficacy.

Because of how the survey was designed (unit nonresponse), 19 participants did not see the income question, 15 did not see the political views question, 6 did not see the marriage importance question, and 3 did not see the efficacy question. 12 participants skipped some of the items selected for analysis (item nonresponse). The item nonresponse percentage, computed as (3182 - 3130) / 3182 * 100, is 1.63%, below the 5% cutoff that the literature treats as a concern. The original sample size was 3182 participants. After the cases with missing data were removed, the sample size was 3130.