The data is loaded from a GitHub repository.
library(tidyverse)
library(knitr)
library(kableExtra)
# load data
url <- "https://raw.githubusercontent.com/javernw/JWCUNYAssignments/master/master.csv"
master_file <- read.csv(url, stringsAsFactors = FALSE, header = TRUE)
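A column name such as `ï..country`, which shows up in the output printed in this document, typically appears when a CSV file begins with a UTF-8 byte-order mark (BOM) and the mark is read as part of the first header. The following is a minimal sketch of how `fileEncoding = "UTF-8-BOM"` removes it. It uses a small temporary file with made-up contents rather than the real data set, and the names `tmp`, `bom`, and `demo` are illustrative only.

```r
# Write a tiny CSV that starts with a UTF-8 byte-order mark (EF BB BF).
# The two data rows are illustrative, not taken from the real file.
tmp <- tempfile(fileext = ".csv")
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
writeBin(c(bom, charToRaw("country,year\nAlbania,1987\nAlbania,1988\n")), tmp)

# Declaring the encoding lets R drop the BOM before it parses the header.
demo <- read.csv(tmp, fileEncoding = "UTF-8-BOM", stringsAsFactors = FALSE)
print(names(demo))
unlink(tmp)
```

If the same argument were passed to the `read.csv()` call for the real file, the first column would presumably be named `country`, so any later code that refers to the column by name would need to use the new name.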
Do suicide rates increase or decrease with a country's standard of living?
Each case represents the suicide rate in one country, for one year, within one age group of males or females. There are 27,820 cases.
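# preview the first 20 rows; Bootstrap options are passed as one vector ("hover", not "hovered")
kable(head(master_file, 20)) %>% kable_styling(bootstrap_options = c("striped", "hover"), font_size = 11) %>% scroll_box(height = "500px")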
ï..country | year | sex | age | suicides_no | population | suicides.100k.pop | country.year | HDI.for.year | gdp_for_year.... | gdp_per_capita.... | generation |
---|---|---|---|---|---|---|---|---|---|---|---|
Albania | 1987 | male | 15-24 years | 21 | 312900 | 6.71 | Albania1987 | NA | 2,156,624,900 | 796 | Generation X |
Albania | 1987 | male | 35-54 years | 16 | 308000 | 5.19 | Albania1987 | NA | 2,156,624,900 | 796 | Silent |
Albania | 1987 | female | 15-24 years | 14 | 289700 | 4.83 | Albania1987 | NA | 2,156,624,900 | 796 | Generation X |
Albania | 1987 | male | 75+ years | 1 | 21800 | 4.59 | Albania1987 | NA | 2,156,624,900 | 796 | G.I. Generation |
Albania | 1987 | male | 25-34 years | 9 | 274300 | 3.28 | Albania1987 | NA | 2,156,624,900 | 796 | Boomers |
Albania | 1987 | female | 75+ years | 1 | 35600 | 2.81 | Albania1987 | NA | 2,156,624,900 | 796 | G.I. Generation |
Albania | 1987 | female | 35-54 years | 6 | 278800 | 2.15 | Albania1987 | NA | 2,156,624,900 | 796 | Silent |
Albania | 1987 | female | 25-34 years | 4 | 257200 | 1.56 | Albania1987 | NA | 2,156,624,900 | 796 | Boomers |
Albania | 1987 | male | 55-74 years | 1 | 137500 | 0.73 | Albania1987 | NA | 2,156,624,900 | 796 | G.I. Generation |
Albania | 1987 | female | 5-14 years | 0 | 311000 | 0.00 | Albania1987 | NA | 2,156,624,900 | 796 | Generation X |
Albania | 1987 | female | 55-74 years | 0 | 144600 | 0.00 | Albania1987 | NA | 2,156,624,900 | 796 | G.I. Generation |
Albania | 1987 | male | 5-14 years | 0 | 338200 | 0.00 | Albania1987 | NA | 2,156,624,900 | 796 | Generation X |
Albania | 1988 | female | 75+ years | 2 | 36400 | 5.49 | Albania1988 | NA | 2,126,000,000 | 769 | G.I. Generation |
Albania | 1988 | male | 15-24 years | 17 | 319200 | 5.33 | Albania1988 | NA | 2,126,000,000 | 769 | Generation X |
Albania | 1988 | male | 75+ years | 1 | 22300 | 4.48 | Albania1988 | NA | 2,126,000,000 | 769 | G.I. Generation |
Albania | 1988 | male | 35-54 years | 14 | 314100 | 4.46 | Albania1988 | NA | 2,126,000,000 | 769 | Silent |
Albania | 1988 | male | 55-74 years | 4 | 140200 | 2.85 | Albania1988 | NA | 2,126,000,000 | 769 | G.I. Generation |
Albania | 1988 | female | 15-24 years | 8 | 295600 | 2.71 | Albania1988 | NA | 2,126,000,000 | 769 | Generation X |
Albania | 1988 | female | 55-74 years | 3 | 147500 | 2.03 | Albania1988 | NA | 2,126,000,000 | 769 | G.I. Generation |
Albania | 1988 | female | 25-34 years | 5 | 262400 | 1.91 | Albania1988 | NA | 2,126,000,000 | 769 | Boomers |
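The `suicides.100k.pop` column appears to be the number of suicides per 100,000 people, that is, `suicides_no / population * 100000`. As a quick sanity check under that assumption, the sketch below recomputes the rate for the first three rows of the preview above. The data frame name `preview` and the column `reported` are introduced here only for illustration.

```r
# First three rows of the preview table (Albania, 1987).
preview <- data.frame(
  suicides_no = c(21, 16, 14),
  population  = c(312900, 308000, 289700),
  reported    = c(6.71, 5.19, 4.83)   # suicides.100k.pop as shown in the table
)

# Recompute the rate per 100,000 people and round to two decimals.
preview$recomputed <- round(preview$suicides_no / preview$population * 1e5, 2)
print(preview)
```

The recomputed values agree with the reported ones for these rows, which supports reading the column as a rate per 100,000 population.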
Secondary Data: This is quantitative research in which the data was collected by extracting information from an online database. The data was compiled from four different sources (United Nations Development Program (HDI), World Bank, World Health Organization, and Szamil) to identify any attributes that correlate with suicide rates globally.
This is an observational study, since the data was recorded without any kind of intervention.
The data was found on Kaggle.com.
Suicide rate. Quantitative variable
Gross Domestic Product. Quantitative variable
Generation. Qualitative variable
options(scipen = 999)
# convert GDP figures such as "2,156,624,900" from text to numbers
master_file$gdp_for_year.... <- readr::parse_number(master_file$gdp_for_year....)
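The GDP column is read as text because its values contain thousands separators, as in `2,156,624,900`. The sketch below shows, in base R only, roughly what the conversion step does for two values copied from the preview table. The vector names `gdp_text` and `gdp_num` are hypothetical, and this is an illustration of the idea rather than the exact routine used above.

```r
# Two GDP values as they appear in the raw file (text with commas).
gdp_text <- c("2,156,624,900", "2,126,000,000")

# Remove the thousands separators, then convert to numeric.
gdp_num <- as.numeric(gsub(",", "", gdp_text, fixed = TRUE))

print(class(gdp_num))
print(format(gdp_num, big.mark = ","))
```

Once the column is numeric, `summary()` can report quartiles and a mean for it instead of only its length and class.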
# summary of each variable
summary(master_file)
## ï..country year sex age
## Length:27820 Min. :1985 Length:27820 Length:27820
## Class :character 1st Qu.:1995 Class :character Class :character
## Mode :character Median :2002 Mode :character Mode :character
## Mean :2001
## 3rd Qu.:2008
## Max. :2016
##
## suicides_no population suicides.100k.pop country.year
## Min. : 0.0 Min. : 278 Min. : 0.00 Length:27820
## 1st Qu.: 3.0 1st Qu.: 97498 1st Qu.: 0.92 Class :character
## Median : 25.0 Median : 430150 Median : 5.99 Mode :character
## Mean : 242.6 Mean : 1844794 Mean : 12.82
## 3rd Qu.: 131.0 3rd Qu.: 1486143 3rd Qu.: 16.62
## Max. :22338.0 Max. :43805214 Max. :224.97
##
## HDI.for.year gdp_for_year.... gdp_per_capita....
## Min. :0.483 Min. : 46919625 Min. : 251
## 1st Qu.:0.713 1st Qu.: 8985352832 1st Qu.: 3447
## Median :0.779 Median : 48114688201 Median : 9372
## Mean :0.777 Mean : 445580969026 Mean : 16866
## 3rd Qu.:0.855 3rd Qu.: 260202429150 3rd Qu.: 24874
## Max. :0.944 Max. :18120714000000 Max. :126352
## NA's :19456
## generation
## Length:27820
## Class :character
## Mode :character
##
##
##
##
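The summary shows 19,456 missing values in `HDI.for.year` out of 27,820 cases, which matters if HDI is used as a measure of standard of living. The sketch below works out that share and shows one way to get the same figure for every column of a data frame. The data frame `toy` is made up for illustration and is not part of the data set.

```r
# Share of cases with a missing HDI value, from the counts in the summary.
n_cases       <- 27820
n_missing_hdi <- 19456
share_missing <- n_missing_hdi / n_cases
cat(sprintf("HDI is missing in %.1f%% of cases\n", 100 * share_missing))

# The same calculation for each column of a (made-up) data frame.
toy <- data.frame(hdi = c(NA, 0.7, NA, 0.8), gdp = c(796, 769, 800, 810))
print(colMeans(is.na(toy)))
```

Because roughly seven in ten cases have no HDI value, an analysis based on HDI would use a much smaller subset of the data than one based on GDP per capita.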
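Relationship between suicide rates and GDP per capita
# explanatory variable (GDP per capita) on the x-axis, response (suicide rate) on the y-axis
plot(master_file$gdp_per_capita...., master_file$suicides.100k.pop, xlab = "GDP per capita", ylab = "Suicides per 100k population")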
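In this data set, GDP per capita is recorded once per country and year, while the suicide rate is recorded separately for each sex and age group. In the preview table, for example, Albania in 1987 contributes twelve points that all share the same GDP value. One option, sketched here as an assumption rather than as the method used in this document, is to aggregate to a single rate per country and year before comparing it with GDP. The data frame `albania_1987` holds the twelve preview rows for that country and year.

```r
# The twelve Albania 1987 rows from the preview table.
albania_1987 <- data.frame(
  suicides_no = c(21, 16, 14, 1, 9, 1, 6, 4, 1, 0, 0, 0),
  population  = c(312900, 308000, 289700, 21800, 274300, 35600,
                  278800, 257200, 137500, 311000, 144600, 338200),
  gdp_per_capita = 796
)

# Overall rate per 100,000: total suicides over total population.
# Summing first weights each group by its size, unlike averaging the group rates.
overall_rate <- sum(albania_1987$suicides_no) / sum(albania_1987$population) * 1e5
print(round(overall_rate, 2))
```

Applied to every country and year, this would give one point per country-year in the scatterplot instead of one point per sex and age group.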
Suicide rate per generation
ggplot(master_file, aes(generation, suicides.100k.pop)) + geom_boxplot()
Suicide rates by sex
# bins = 30 states ggplot2's default explicitly, which avoids the stat_bin() message
ggplot(master_file, aes(x = suicides.100k.pop, fill = sex, color = sex)) + geom_histogram(bins = 30, alpha = 0.5, position = "identity")
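The histogram uses 30 bins, which is ggplot2's default rather than a choice based on the data. One common rule of thumb for a data-driven bin width is the Freedman-Diaconis rule, `2 * IQR / n^(1/3)`. The sketch below applies it to the quartiles of `suicides.100k.pop` reported in the summary output. It is offered as one reasonable option, not as the setting used in this document.

```r
# Values taken from the summary() output for suicides.100k.pop.
n  <- 27820   # number of cases
q1 <- 0.92    # first quartile
q3 <- 16.62   # third quartile

# Freedman-Diaconis bin width: twice the interquartile range over the cube root of n.
binwidth_fd <- 2 * (q3 - q1) / n^(1/3)
print(round(binwidth_fd, 2))
```

A width of about 1 suicide per 100,000 could then be passed to `geom_histogram()` through its `binwidth` argument in place of a fixed number of bins.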