I hypothesize that poverty status and mental health are related. To test this hypothesis, I analyze survey responses on both variables and check whether mental health depends on poverty status.
library(readr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
data <- read.csv("/Users/Nazija/Downloads/SD2 Data(1).csv")
data <- data %>%
  select(poverty_status, mental_health)
unique(data$poverty_status)
## [1] above poverty below poverty
## Levels: above poverty below poverty
unique(data$mental_health)
## [1] Low Risk Moderate Mental Distress Serious Mental Illness
## Levels: Low Risk Moderate Mental Distress Serious Mental Illness
head(data)
## poverty_status mental_health
## 1 above poverty Low Risk
## 2 above poverty Low Risk
## 3 above poverty Low Risk
## 4 above poverty Moderate Mental Distress
## 5 above poverty Low Risk
## 6 above poverty Moderate Mental Distress
table(data$poverty_status) %>%
  prop.table() %>%
  round(2)
##
## above poverty below poverty
## 0.83 0.17
table(data$mental_health) %>%
  prop.table() %>%
  round(2)
##
## Low Risk Moderate Mental Distress Serious Mental Illness
## 0.80 0.16 0.03
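# Expected joint proportions if poverty status and mental health were independent:
# (share in poverty group) * (share in mental health category), using the rounded margins above
above_lr <- .83 * .80
above_moderate <- .83 * .16
above_serious <- .83 * .03
below_lr <- .17 * .80
below_moderate <- .17 * .16
below_serious <- .17 * .03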
print("Expected Values Above Poverty (low risk, moderate, serious)")
## [1] "Expected Values Above Poverty (low risk, moderate, serious)"
above_lr
## [1] 0.664
above_moderate
## [1] 0.1328
above_serious
## [1] 0.0249
print("Expected Values Below Poverty (low risk, moderate, serious)")
## [1] "Expected Values Below Poverty (low risk, moderate, serious)"
below_lr
## [1] 0.136
below_moderate
## [1] 0.0272
below_serious
## [1] 0.0051
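The products above use marginal shares rounded to two decimals. As a minimal sketch, the same expected shares can be computed from unrounded margins with `outer()`. The counts are copied from the `$observed` output of `chisq.test()` in this report; the object names (`observed`, `row_share`, `col_share`, `expected_prop`, `expected_count`) are illustrative and not part of the original analysis.

```r
# Observed counts copied from the chisq.test()$observed output in this report.
observed <- matrix(
  c(177273, 31235, 5105,
     28639, 10196, 3467),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    poverty_status = c("above poverty", "below poverty"),
    mental_health  = c("Low Risk", "Moderate Mental Distress", "Serious Mental Illness")
  )
)

n <- sum(observed)
row_share <- rowSums(observed) / n  # marginal share of each poverty status
col_share <- colSums(observed) / n  # marginal share of each mental health category

# Under independence, each joint share is the product of its two marginal shares.
expected_prop  <- outer(row_share, col_share)
expected_count <- expected_prop * n

round(expected_prop, 4)
round(expected_count, 1)
```

Multiplying the expected shares by the total number of respondents should give the expected counts that `chisq.test()` reports in its `$expected` element.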
chisq.test(data$poverty_status, data$mental_health)[7]
## $expected
## data$mental_health
## data$poverty_status Low Risk Moderate Mental Distress Serious Mental Illness
## above poverty 171875.35 34582.577 7155.074
## below poverty 34036.65 6848.423 1416.926
table(data$poverty_status, data$mental_health) %>%
  prop.table() %>%
  round(2)
##
## Low Risk Moderate Mental Distress Serious Mental Illness
## above poverty 0.69 0.12 0.02
## below poverty 0.11 0.04 0.01
chisq.test(data$poverty_status, data$mental_health)[6]
## $observed
## data$mental_health
## data$poverty_status Low Risk Moderate Mental Distress Serious Mental Illness
## above poverty 177273 31235 5105
## below poverty 28639 10196 3467
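The observed counts differ from the counts expected under independence. Respondents above poverty fall in the Low Risk category more often than expected (177,273 observed vs. about 171,875 expected) and in the Moderate Mental Distress and Serious Mental Illness categories less often than expected. Respondents below poverty show the opposite pattern and are experiencing more mental distress than expected (for example, 3,467 observed cases of Serious Mental Illness vs. about 1,417 expected).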
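One way to make this comparison explicit is to subtract the expected counts from the observed counts, cell by cell. The sketch below assumes the counts shown in the `$observed` output above; `diff_counts` and `pearson_resid` are hypothetical names introduced only for illustration.

```r
# Observed counts copied from the chisq.test()$observed output in this report.
observed <- matrix(
  c(177273, 31235, 5105,
     28639, 10196, 3467),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    poverty_status = c("above poverty", "below poverty"),
    mental_health  = c("Low Risk", "Moderate Mental Distress", "Serious Mental Illness")
  )
)

# Expected counts under independence: (row total * column total) / grand total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Positive values mean more respondents than expected in that cell.
diff_counts <- observed - expected
round(diff_counts)

# Pearson residuals scale each difference by the square root of the expected count,
# which makes cells of very different sizes easier to compare.
pearson_resid <- (observed - expected) / sqrt(expected)
round(pearson_resid, 1)
```

Cells with large positive residuals are over-represented relative to independence, and cells with large negative residuals are under-represented.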
table(data$poverty_status, data$mental_health) %>%
  prop.table(1) %>%
  round(2)
##
## Low Risk Moderate Mental Distress Serious Mental Illness
## above poverty 0.83 0.15 0.02
## below poverty 0.68 0.24 0.08
data %>%
  group_by(poverty_status, mental_health) %>%
  summarize(n = n()) %>%
  mutate(percentage = n / sum(n)) %>%
  ggplot() +
  geom_col(aes(x = poverty_status, y = percentage, fill = mental_health))
## `summarise()` regrouping output by 'poverty_status' (override with `.groups` argument)
The visualization shows that a greater share of people below poverty experience moderate or serious mental distress (about 32%, compared with about 17% of people above poverty), while a greater share of people above poverty are at low risk of mental health issues (83% compared with 68%).
chisq.test(data$poverty_status, data$mental_health)
##
## Pearson's Chi-squared test
##
## data: data$poverty_status and data$mental_health
## X-squared = 6539.4, df = 2, p-value < 2.2e-16
The results indicate a statistically significant relationship between individuals' poverty status and their mental health (X-squared = 6539.4, df = 2, p-value < 2.2e-16). Because the p-value is far below .05, we reject the null hypothesis that mental health is independent of poverty status.
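To show where the test statistic comes from, the sketch below recomputes it by hand as the sum of (observed - expected)^2 / expected over all six cells. It assumes the observed counts printed earlier in this report; `chi_sq`, `df`, `p_value`, and `log_p_value` are illustrative names.

```r
# Observed counts copied from the chisq.test()$observed output in this report.
observed <- matrix(
  c(177273, 31235, 5105,
     28639, 10196, 3467),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    poverty_status = c("above poverty", "below poverty"),
    mental_health  = c("Low Risk", "Moderate Mental Distress", "Serious Mental Illness")
  )
)

# Expected counts under independence: (row total * column total) / grand total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-squared statistic and its degrees of freedom: (rows - 1) * (columns - 1).
chi_sq <- sum((observed - expected)^2 / expected)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail p-value. It is too small to represent as a double,
# so the log of the p-value is computed as well.
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
log_p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE, log.p = TRUE)

round(chi_sq, 1)
df
p_value
log_p_value

# Cross-check against the built-in test run on the same table of counts.
unname(chisq.test(observed)$statistic)
```

Running `chisq.test()` on the matrix of counts rather than on the two raw columns should give the same statistic, since the test only depends on the cross-tabulated counts.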