library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.6 v dplyr 1.0.7
## v tidyr 1.2.0 v stringr 1.4.0
## v readr 2.1.1 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
data <- read.csv("https://raw.githubusercontent.com/AldataSci/Data606Project/main/survey.csv", header = TRUE)
data <- data %>%
  select("Age", "Gender", "Country",
         "state", "seek_help", "treatment", "benefits", "mental_vs_physical", "remote_work") %>%
  filter(Country == "United States")
You should phrase your research question in a way that matches up with the scope of inference your dataset allows for.
Does the frequency of respondents reaching out for help vary by age?
What are the cases, and how many are there?
There are 751 observations, and each case represents one respondent's answers to various questions about mental health.
Describe the method of data collection.
The dataset was found on Kaggle, but the original data were collected by Open Source Mental Illness and can be downloaded from them.
What type of study is this (observational/experiment)?
The data were collected from respondents answering questions on a survey, so the study is observational.
The dataset is available on Kaggle.
The response variable is the frequency of respondents reaching out for help (seek_help), and it is categorical.
The explanatory variable is the age of the respondents (Age), and it is numerical.
summary(data$Age)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -29.00 27.50 32.00 33.33 37.50 329.00
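The minimum of -29 and maximum of 329 in the summary above cannot be real ages, so these rows probably need to be dropped before any analysis. The following is a minimal sketch of one way to do that, not part of the original analysis. The toy data frame and the 18 to 100 cut-offs are assumptions chosen for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for the survey; the values are made up.
toy <- data.frame(
  Age = c(-29, 25, 32, 41, 329),
  seek_help = c("Yes", "No", "Don't know", "Yes", "No")
)

# Assumed plausibility bounds; adjust them to suit the survey population.
min_age <- 18
max_age <- 100

# Keep only rows whose Age falls inside the assumed range (inclusive).
toy_clean <- toy %>%
  filter(between(Age, min_age, max_age))

summary(toy_clean$Age)
```

Filtering the data itself, rather than only limiting the plot axis, keeps the summary statistics and the plots based on the same set of rows.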
ggplot(data, aes(x = Age)) +
  geom_histogram() +
  labs(title = "Age of Respondents") +
  xlim(0, 100)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 3 rows containing non-finite values (stat_bin).
## Warning: Removed 2 rows containing missing values (geom_bar).
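The message above suggests choosing a bin width explicitly. As a hedged sketch, the histogram could be drawn with a fixed width. The 5-year width, the `boundary = 0` alignment, and the toy ages below are all assumptions for illustration, not values from the survey.

```r
library(ggplot2)

# Hypothetical ages; made up for illustration only.
toy <- data.frame(Age = c(22, 25, 27, 31, 32, 35, 38, 44, 51))

# binwidth = 5 gives 5-year bins; boundary = 0 aligns bin edges to multiples of 5.
p <- ggplot(toy, aes(x = Age)) +
  geom_histogram(binwidth = 5, boundary = 0) +
  labs(title = "Age of Respondents", x = "Age", y = "Count")

p
```

With an explicit `binwidth`, ggplot2 no longer falls back to the default of 30 bins, so the `stat_bin()` message does not appear.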
ggplot(data, aes(x = seek_help)) +
  geom_bar() +
  labs(title = "Would Respondents Seek Help?", x = "Seek Help")
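To connect the two variables in the research question, one option is to summarise age within each seek_help category. This is a sketch on made-up data, not a result from the survey. The column names follow the dataset, while the `toy` data frame and its values are hypothetical.

```r
library(dplyr)

# Hypothetical toy data; the values are invented for illustration.
toy <- data.frame(
  Age = c(24, 30, 36, 28, 40, 33),
  seek_help = c("Yes", "Yes", "No", "No", "Don't know", "Don't know")
)

# Count respondents and summarise age within each seek_help response.
age_by_help <- toy %>%
  group_by(seek_help) %>%
  summarise(
    n = n(),
    mean_age = mean(Age),
    median_age = median(Age)
  )

age_by_help
```

A table like this gives a first look at whether typical ages differ between response groups before any formal test is chosen.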