Categorical Data Analysis

Research Question: I hypothesize that there is a relationship between ever having been on medication for depression and sex (male or female). Specifically, I expect that respondents who identify as female will be more likely to have been on medication for depression.

Prep

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(readr)
library(ggplot2)
Dataset<-read.csv('/Users/apple/Downloads/NHIS Data.csv')
head(Dataset)

##   psu sampweight year year_strata Demo_Race Demo_Hispanic
## 1   2       4316 1997    1997.514                Hispanic
## 2   2       2845 1997    1997.510                Hispanic
## 3   2       3783 1997    1997.510                Hispanic
## 4   2       2466 1997    1997.510                Hispanic
## 5   2       3794 1997    1997.510                Hispanic
## 6   1       1793 1997    1997.515                Hispanic
##                 Demo_RaceEthnicity Demo_Region Demo_sex_C Demo_sexorien_C
## 1 Hispanic (Race Identity Unknown)        West     female                
## 2 Hispanic (Race Identity Unknown)        West     female                
## 3 Hispanic (Race Identity Unknown)        West       male                
## 4 Hispanic (Race Identity Unknown)        West       male                
## 5 Hispanic (Race Identity Unknown)        West       male                
## 6 Hispanic (Race Identity Unknown)        West     female                
##   Demo_belowpovertyline_B Demo_age_N Demo_agerange_C Demo_marital_C
## 1                       1         33           30-39        Married
## 2                       0         52           50-59        Married
## 3                       0         41           40-49        Married
## 4                       0         67           60-69        Widowed
## 5                       1         25           18-29        Married
## 6                       0         61           60-69        Widowed
##   Demo_hourswrk_C MentalHealth_MentalIllnessK6_N MentalHealth_MentalIllnessK6_C
## 1           20-39                              0                       Low Risk
## 2            None                             NA                               
## 3           40-59                              0                       Low Risk
## 4            1-19                              0                       Low Risk
## 5           40-59                              0                       Low Risk
## 6            None                             11                            MMD
##   MentalHealth_SeriousMentalIllnessK6_B MentalHealth_depressionmeds_B
## 1                                     0                            NA
## 2                                    NA                            NA
## 3                                     0                            NA
## 4                                     0                            NA
## 5                                     0                            NA
## 6                                     0                            NA
##   Health_SelfRatedHealth_C Health_diagnosed_STD5yr_B Health_BirthControlNow_B
## 1                Excellent                        NA                       NA
## 2                Very Good                        NA                       NA
## 3                Excellent                        NA                       NA
## 4                Very Good                        NA                       NA
## 5                     Good                        NA                       NA
## 6                     Poor                        NA                       NA
##   Health_EverHaveHeartAttack_B Health_EverHaveHeartCondition_B
## 1                            0                               0
## 2                            0                               0
## 3                            0                               0
## 4                            1                               0
## 5                            0                               0
## 6                            0                               0
##   Health_EverHaveCancer_B Health_EverHaveDiabetes_B
## 1                       0                         0
## 2                       0                         0
## 3                       0                         0
## 4                       0                         1
## 5                       0                         0
## 6                       0                         1
##   Health_EverHavePrediabetes_B Health_EverHaveAsthma_B Health_StillHaveAsthma_B
## 1                           NA                       0                        0
## 2                           NA                       0                        0
## 3                           NA                       0                        0
## 4                           NA                       0                        0
## 5                           NA                       0                        0
## 6                           NA                       1                       NA
##   Health_HIVAidsRisk_C Health_HIVAidsHighRisk_B Health_EverTakeHIVTest_B
## 1                  Low                        0                        1
## 2                 None                        0                        1
## 3                 None                        0                        1
## 4                 None                        0                        1
## 5                 None                        0                        0
## 6                 None                        0                       NA
##   Health_EverHaveHypertension_B Health_BMI_N Health_BMI_C
## 1                             0        19.73       Normal
## 2                             0        25.73   Overweight
## 3                             0        36.48        Obese
## 4                             1        24.19       Normal
## 5                             0        24.80       Normal
## 6                             1        27.25   Overweight
##   Health_BMIOverweight_B Health_BMIObese_B Health_Weight_N Health_Height_N
## 1                      0                 0             115              64
## 2                      1                 0             150              64
## 3                      1                 1             233              67
## 4                      0                 0             141              64
## 5                      0                 0             140              63
## 6                      1                 0             135              59
##   Health_UsualPlaceHealthcare_C Health_UsualPlaceHealthcare_B
## 1                            No                             0
## 2                           Yes                             1
## 3                           Yes                             1
## 4                           Yes                             1
## 5                            No                             0
## 6                           Yes                             1
##   Health_AbnormalPapPast3yr_B Behav_EverSmokeCigs_B Behav_CigsPerDay_N
## 1                          NA                     0                  0
## 2                          NA                     0                  0
## 3                          NA                     1                  5
## 4                          NA                     0                  0
## 5                          NA                     0                  0
## 6                          NA                     1                  0
##   Behav_CigsPerDay_C Behav_AgeStartSmoking Behav_AlcDaysPerYear_N
## 1            00 None                    NA                     NA
## 2            00 None                    NA                     NA
## 3              01-09                    17                      1
## 4            00 None                    NA                      3
## 5            00 None                    NA                      2
## 6            00 None                     6                      5
##   Behav_AlcDaysPerWeek_N Behav_BingeDrinkDaysYear_N Behav_BingeDrinkDaysYear_C
## 1                     NA                         NA                           
## 2                     NA                         NA                           
## 3                      0                          0                     0 Days
## 4                      0                          0                     0 Days
## 5                      0                          0                     0 Days
## 6                      0                          2                001-10 Days

Select Variables

# Keep only the two variables of interest and drop rows with missing or other values
Dataset <- Dataset %>%
  select(Demo_sex_C, MentalHealth_depressionmeds_B) %>%
  filter(Demo_sex_C %in% c("male", "female"),
         MentalHealth_depressionmeds_B %in% c(0, 1))

Crosstab (Null: Demo_sex_C)

table(Dataset$Demo_sex_C)%>%
  prop.table()%>%
  round(2)

## 
## female   male 
##   0.55   0.45

Crosstab (Null: MentalHealth_depressionmeds_B)

table(Dataset$MentalHealth_depressionmeds_B)%>%
  prop.table()%>%
  round(2)

## 
##    0    1 
## 0.91 0.09

Crosstab (Actual)

table(Dataset$Demo_sex_C,Dataset$MentalHealth_depressionmeds_B)%>%
  prop.table()

##         
##                   0          1
##   female 0.48798856 0.06360519
##   male   0.42137922 0.02702703

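The observed proportions do not differ greatly from those expected under the null hypothesis of independence. If sex and medication use were unrelated, about 0.55 × 0.09 ≈ 0.05 of respondents would be females who have been on medication for depression, compared with the observed 0.064. For males, the expected share is about 0.45 × 0.09 ≈ 0.04, compared with the observed 0.027. In absolute terms, each cell is within about 1.5 percentage points of its expected value.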
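
The comparison between the null and actual crosstabs can be sketched in a few lines. This is only an illustration, assuming the joint proportions printed above are typed in by hand (the names `observed`, `p_sex`, `p_meds`, and `expected` are mine, not part of the original analysis). Under independence, each expected joint proportion is the product of the two marginal proportions.

```r
# Observed joint proportions, copied by hand from the crosstab output above
observed <- matrix(
  c(0.48798856, 0.06360519,
    0.42137922, 0.02702703),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("female", "male"), c("0", "1"))
)

# Marginal proportions for sex (rows) and medication use (columns)
p_sex  <- rowSums(observed)
p_meds <- colSums(observed)

# Under independence, each joint proportion is the product of its marginals
expected <- outer(p_sex, p_meds)

# Expected proportions and the gap between observed and expected
round(expected, 3)
round(observed - expected, 3)
```

With the full data, the same idea applies to the output of table() followed by prop.table().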

Relationship of Interest

table(Dataset$Demo_sex_C,Dataset$MentalHealth_depressionmeds_B)%>%
  prop.table(1)

##         
##                   0          1
##   female 0.88468834 0.11531166
##   male   0.93972647 0.06027353

About 88% of women responded that they have never been on medication for depression, compared with about 94% of men. Equivalently, about 11.5% of women and about 6.0% of men reported having been on medication for depression. This supports my hypothesis that women are more likely to have taken medication for depression.
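
To put a number on the gap between the two rows, the row proportions can be compared directly. The sketch below assumes the two values are copied by hand from the output above; `p_female`, `p_male`, `diff_pp`, and `ratio` are illustrative names only. It gives a gap of about 5.5 percentage points, with women roughly 1.9 times as likely as men to report having been on medication for depression.

```r
# Row proportions reporting medication use, copied by hand from the output above
p_female <- 0.11531166
p_male   <- 0.06027353

# Absolute gap in percentage points and relative ratio of the two proportions
diff_pp <- (p_female - p_male) * 100
ratio   <- p_female / p_male

round(diff_pp, 1)
round(ratio, 2)
```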

Chi-Square Statistical Test

chisq.test(Dataset$Demo_sex_C, Dataset$MentalHealth_depressionmeds_B)

## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  Dataset$Demo_sex_C and Dataset$MentalHealth_depressionmeds_B
## X-squared = 876.88, df = 1, p-value < 2.2e-16

The p-value is less than 2.2e-16, which is far below 0.05. This means the relationship between sex and ever having been on medication for depression is statistically significant, and the two variables are not independent.
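
As a rough check on what the test measures, the Pearson statistic sums (observed - expected)^2 / expected over the four cells. The sketch below is not part of the original analysis: it assumes the joint proportions from the actual crosstab are typed in by hand, so the sum it computes is the chi-squared statistic divided by the sample size (without the Yates correction). Its square root is the phi coefficient, a common measure of the strength of association in a 2x2 table. All variable names are illustrative.

```r
# Observed joint proportions, copied by hand from the actual crosstab above
observed <- matrix(
  c(0.48798856, 0.06360519,
    0.42137922, 0.02702703),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("female", "male"), c("0", "1"))
)

# Expected joint proportions under independence (product of the marginals)
expected <- outer(rowSums(observed), colSums(observed))

# Pearson statistic computed on proportions: equals chi-squared divided by N
chisq_per_n <- sum((observed - expected)^2 / expected)

# Phi coefficient: strength of association in a 2x2 table
phi <- sqrt(chisq_per_n)

round(phi, 3)
```

A phi value close to 0 indicates a weak association, so a very small p-value from a large survey sample does not by itself mean the relationship is strong.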