Suicide often derives from deep feelings of hopelessness. People at risk are often unable to see solutions to their problems or to cope with challenging life circumstances, which leads them to see suicide as the only way out. According to the World Health Organization, suicide is a major health problem worldwide and a leading cause of death. Over 800,000 people die by suicide every year, a rate of roughly one person every 40 seconds. However, suicide is preventable when timely, effective interventions are implemented at national, municipal and individual levels. Suicide is not confined to high-income countries; it is a global phenomenon that occurs in all regions of the world. In fact, over 79% of global suicides in 2016 occurred in low- and middle-income countries. This study focuses on whether a country’s suicide rate changes with its standard of living, as measured by its GDP.
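The "one person every 40 seconds" figure follows from the annual count. A quick arithmetic check, assuming a 365-day year and using the 800,000 lower bound (the variable names are illustrative):

```r
# Rough check of the "one death every 40 seconds" figure
deaths_per_year   <- 800000               # lower bound quoted in the text
seconds_per_year  <- 365 * 24 * 60 * 60   # assumes a 365-day year
seconds_per_death <- seconds_per_year / deaths_per_year
round(seconds_per_death, 1)
```

Because the annual count is "over" 800,000, the true interval is slightly shorter than this result.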
The data used in this project were obtained from an online database (Kaggle.com). The dataset was compiled from four different sources (United Nations Development Program (HDI), World Bank, World Health Organization, and Szmali) to identify any attributes that correlate with suicide rates globally.
There are a total of 27,820 cases. Each case represents one country, year, sex, and age group, and records the suicide rate for that group in a year between 1985 and 2016 along with the country’s GDP at the time. As mentioned earlier, the research is concerned with discovering any relationship between suicide rate and GDP, so these two variables are the main focus. Both variables are quantitative.
This is an observational study, since the subjects are observed without any kind of interference. The goal is therefore to see whether there is any relationship between GDP and suicide. The population of interest is all persons aged 5 and up who died by suicide. The data come from countries around the world, so we can generalize our conclusions to the global population. However, because the study is observational, the findings cannot establish causal relationships, only correlation. As for potential bias, we have to assume that every country reported all suicide events equally completely; otherwise our conclusions may be incorrect. Suicide is seen as damaging to a country’s economic image, so if numbers go unreported, the results will be inaccurate.
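One common way to summarize the relationship between two quantitative variables is a correlation coefficient. The sketch below uses made-up numbers, not values from the dataset, and the names `toy`, `gdp_per_capita` and `suicide_per_100k` are illustrative. It shows the mechanics of a Pearson correlation test and is not necessarily the method this study uses.

```r
# Illustrative values only; these are NOT taken from the dataset
toy <- data.frame(
  gdp_per_capita   = c(800, 1500, 3000, 9000, 25000, 40000),
  suicide_per_100k = c(4.1, 6.3, 5.2, 9.8, 12.4, 11.0)
)

# Pearson correlation with a two-sided test of "correlation = 0"
result <- cor.test(toy$gdp_per_capita, toy$suicide_per_100k)
result$estimate   # sample correlation coefficient
result$p.value    # p-value of the test
```

Even a strong correlation here would describe association only, which fits the observational design.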
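# packages used throughout the analysis
library(tidyverse); library(countrycode); library(knitr); library(kableExtra)
library(gridExtra); library(rworldmap); library(gganimate)
url <- "https://raw.githubusercontent.com/javernw/JWCUNYAssignments/master/master.csv"
master_file <- read.csv(url, stringsAsFactors = FALSE, header = TRUE)
dim(master_file)
## [1] 27820 12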
We remove HDI because this column is almost empty, and country.year because it is repetitive.
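Before dropping a column for sparsity, its share of missing values can be checked. This is a minimal sketch on a toy data frame with made-up values; only the column name `HDI.for.year` comes from the raw file.

```r
# Toy data frame with made-up values; HDI.for.year mimics the raw column name
toy <- data.frame(
  country      = c("A", "A", "B", "B"),
  HDI.for.year = c(NA, NA, NA, 0.7),
  suicides_no  = c(3, 0, 12, 5)
)

# Share of missing values in each column (0 = complete, 1 = entirely missing)
missing_share <- colMeans(is.na(toy))
missing_share
```

Running the same `colMeans(is.na(...))` call on `master_file` would show how empty each column is.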
# remove useless columns
suicide <- master_file %>% dplyr::select(-HDI.for.year, -country.year)
# rename columns
names(suicide) <- str_to_title(names(suicide), locale = "en") %>% gsub("Ï..Country","Country", .) %>% gsub("Suicides.100k.pop", "Suicide_per_100k", .) %>% gsub("\\.\\.\\.\\.", "", .)
#add continent
suicide$Continent <- countrycode(sourcevar = suicide$Country,
origin = "country.name",
destination = "continent")
#rearrange columns
suicide <- suicide[, c("Continent", "Country", "Year", "Sex" ,"Age", "Suicides_no", "Population", "Suicide_per_100k", "Gdp_for_year", "Gdp_per_capita", "Generation")]
# only keep rows where data is not missing
suicide <- suicide[complete.cases(suicide), ]
suicide$Gdp_for_year <- parse_number(suicide$Gdp_for_year)
A view of the suicide data frame after these adjustments, showing the first 200 rows.
kable(head(suicide, 200)) %>% kable_styling(bootstrap_options = "striped", font_size = 11) %>% scroll_box(height = "500px")

| Continent | Country | Year | Sex | Age | Suicides_no | Population | Suicide_per_100k | Gdp_for_year | Gdp_per_capita | Generation |
|---|---|---|---|---|---|---|---|---|---|---|
| Europe | Albania | 1987 | male | 15-24 years | 21 | 312900 | 6.71 | 2156624900 | 796 | Generation X |
| Europe | Albania | 1987 | male | 35-54 years | 16 | 308000 | 5.19 | 2156624900 | 796 | Silent |
| Europe | Albania | 1987 | female | 15-24 years | 14 | 289700 | 4.83 | 2156624900 | 796 | Generation X |
| Europe | Albania | 1987 | male | 75+ years | 1 | 21800 | 4.59 | 2156624900 | 796 | G.I. Generation |
| Europe | Albania | 1987 | male | 25-34 years | 9 | 274300 | 3.28 | 2156624900 | 796 | Boomers |
| Europe | Albania | 1987 | female | 75+ years | 1 | 35600 | 2.81 | 2156624900 | 796 | G.I. Generation |
| Europe | Albania | 1987 | female | 35-54 years | 6 | 278800 | 2.15 | 2156624900 | 796 | Silent |
| Europe | Albania | 1987 | female | 25-34 years | 4 | 257200 | 1.56 | 2156624900 | 796 | Boomers |
| Europe | Albania | 1987 | male | 55-74 years | 1 | 137500 | 0.73 | 2156624900 | 796 | G.I. Generation |
| Europe | Albania | 1987 | female | 5-14 years | 0 | 311000 | 0.00 | 2156624900 | 796 | Generation X |
| Europe | Albania | 1987 | female | 55-74 years | 0 | 144600 | 0.00 | 2156624900 | 796 | G.I. Generation |
| Europe | Albania | 1987 | male | 5-14 years | 0 | 338200 | 0.00 | 2156624900 | 796 | Generation X |
| Europe | Albania | 1988 | female | 75+ years | 2 | 36400 | 5.49 | 2126000000 | 769 | G.I. Generation |
| Europe | Albania | 1988 | male | 15-24 years | 17 | 319200 | 5.33 | 2126000000 | 769 | Generation X |
| Europe | Albania | 1988 | male | 75+ years | 1 | 22300 | 4.48 | 2126000000 | 769 | G.I. Generation |
| Europe | Albania | 1988 | male | 35-54 years | 14 | 314100 | 4.46 | 2126000000 | 769 | Silent |
| Europe | Albania | 1988 | male | 55-74 years | 4 | 140200 | 2.85 | 2126000000 | 769 | G.I. Generation |
| Europe | Albania | 1988 | female | 15-24 years | 8 | 295600 | 2.71 | 2126000000 | 769 | Generation X |
| Europe | Albania | 1988 | female | 55-74 years | 3 | 147500 | 2.03 | 2126000000 | 769 | G.I. Generation |
| Europe | Albania | 1988 | female | 25-34 years | 5 | 262400 | 1.91 | 2126000000 | 769 | Boomers |
| Europe | Albania | 1988 | male | 25-34 years | 5 | 279900 | 1.79 | 2126000000 | 769 | Boomers |
| Europe | Albania | 1988 | female | 35-54 years | 4 | 284500 | 1.41 | 2126000000 | 769 | Silent |
| Europe | Albania | 1988 | female | 5-14 years | 0 | 317200 | 0.00 | 2126000000 | 769 | Generation X |
| Europe | Albania | 1988 | male | 5-14 years | 0 | 345000 | 0.00 | 2126000000 | 769 | Generation X |
| Europe | Albania | 1989 | male | 75+ years | 2 | 22500 | 8.89 | 2335124988 | 833 | G.I. Generation |
| Europe | Albania | 1989 | male | 25-34 years | 18 | 283600 | 6.35 | 2335124988 | 833 | Boomers |
| Europe | Albania | 1989 | male | 35-54 years | 15 | 318400 | 4.71 | 2335124988 | 833 | Silent |
| Europe | Albania | 1989 | male | 55-74 years | 6 | 142100 | 4.22 | 2335124988 | 833 | G.I. Generation |
| Europe | Albania | 1989 | male | 15-24 years | 12 | 323500 | 3.71 | 2335124988 | 833 | Generation X |
| Europe | Albania | 1989 | female | 35-54 years | 7 | 288600 | 2.43 | 2335124988 | 833 | Silent |
| Europe | Albania | 1989 | female | 15-24 years | 5 | 299900 | 1.67 | 2335124988 | 833 | Generation X |
| Europe | Albania | 1989 | female | 25-34 years | 2 | 266300 | 0.75 | 2335124988 | 833 | Boomers |
| Europe | Albania | 1989 | female | 55-74 years | 1 | 149600 | 0.67 | 2335124988 | 833 | G.I. Generation |
| Europe | Albania | 1989 | female | 5-14 years | 0 | 321900 | 0.00 | 2335124988 | 833 | Generation X |
| Europe | Albania | 1989 | female | 75+ years | 0 | 37000 | 0.00 | 2335124988 | 833 | G.I. Generation |
| Europe | Albania | 1989 | male | 5-14 years | 0 | 349700 | 0.00 | 2335124988 | 833 | Generation X |
| Europe | Albania | 1992 | male | 35-54 years | 12 | 343800 | 3.49 | 709452584 | 251 | Boomers |
| Europe | Albania | 1992 | male | 15-24 years | 9 | 263700 | 3.41 | 709452584 | 251 | Generation X |
| Europe | Albania | 1992 | male | 55-74 years | 5 | 159500 | 3.13 | 709452584 | 251 | Silent |
| Europe | Albania | 1992 | male | 25-34 years | 7 | 245500 | 2.85 | 709452584 | 251 | Boomers |
| Europe | Albania | 1992 | female | 15-24 years | 7 | 292400 | 2.39 | 709452584 | 251 | Generation X |
| Europe | Albania | 1992 | female | 25-34 years | 4 | 267400 | 1.50 | 709452584 | 251 | Boomers |
| Europe | Albania | 1992 | female | 35-54 years | 2 | 323100 | 0.62 | 709452584 | 251 | Boomers |
| Europe | Albania | 1992 | female | 55-74 years | 1 | 164900 | 0.61 | 709452584 | 251 | Silent |
| Europe | Albania | 1992 | female | 5-14 years | 0 | 336700 | 0.00 | 709452584 | 251 | Millenials |
| Europe | Albania | 1992 | female | 75+ years | 0 | 38700 | 0.00 | 709452584 | 251 | G.I. Generation |
| Europe | Albania | 1992 | male | 5-14 years | 0 | 362900 | 0.00 | 709452584 | 251 | Millenials |
| Europe | Albania | 1992 | male | 75+ years | 0 | 23900 | 0.00 | 709452584 | 251 | G.I. Generation |
| Europe | Albania | 1993 | male | 15-24 years | 18 | 243300 | 7.40 | 1228071038 | 437 | Generation X |
| Europe | Albania | 1993 | male | 55-74 years | 7 | 165000 | 4.24 | 1228071038 | 437 | Silent |
| Europe | Albania | 1993 | male | 75+ years | 1 | 24200 | 4.13 | 1228071038 | 437 | G.I. Generation |
| Europe | Albania | 1993 | male | 25-34 years | 9 | 230100 | 3.91 | 1228071038 | 437 | Boomers |
| Europe | Albania | 1993 | female | 15-24 years | 10 | 285300 | 3.51 | 1228071038 | 437 | Generation X |
| Europe | Albania | 1993 | male | 35-54 years | 10 | 350300 | 2.85 | 1228071038 | 437 | Boomers |
| Europe | Albania | 1993 | female | 25-34 years | 7 | 261800 | 2.67 | 1228071038 | 437 | Boomers |
| Europe | Albania | 1993 | female | 35-54 years | 7 | 331200 | 2.11 | 1228071038 | 437 | Boomers |
| Europe | Albania | 1993 | female | 55-74 years | 2 | 169500 | 1.18 | 1228071038 | 437 | Silent |
| Europe | Albania | 1993 | female | 5-14 years | 1 | 340300 | 0.29 | 1228071038 | 437 | Millenials |
| Europe | Albania | 1993 | male | 5-14 years | 1 | 367000 | 0.27 | 1228071038 | 437 | Millenials |
| Europe | Albania | 1993 | female | 75+ years | 0 | 39300 | 0.00 | 1228071038 | 437 | G.I. Generation |
| Europe | Albania | 1994 | male | 75+ years | 2 | 24600 | 8.13 | 1985673798 | 697 | G.I. Generation |
| Europe | Albania | 1994 | male | 55-74 years | 11 | 171400 | 6.42 | 1985673798 | 697 | Silent |
| Europe | Albania | 1994 | female | 75+ years | 2 | 39900 | 5.01 | 1985673798 | 697 | G.I. Generation |
| Europe | Albania | 1994 | male | 25-34 years | 6 | 231400 | 2.59 | 1985673798 | 697 | Boomers |
| Europe | Albania | 1994 | male | 35-54 years | 9 | 362800 | 2.48 | 1985673798 | 697 | Boomers |
| Europe | Albania | 1994 | male | 15-24 years | 6 | 242200 | 2.48 | 1985673798 | 697 | Generation X |
| Europe | Albania | 1994 | female | 15-24 years | 6 | 282600 | 2.12 | 1985673798 | 697 | Generation X |
| Europe | Albania | 1994 | female | 25-34 years | 4 | 261100 | 1.53 | 1985673798 | 697 | Boomers |
| Europe | Albania | 1994 | female | 35-54 years | 2 | 342500 | 0.58 | 1985673798 | 697 | Boomers |
| Europe | Albania | 1994 | female | 55-74 years | 1 | 174600 | 0.57 | 1985673798 | 697 | Silent |
| Europe | Albania | 1994 | male | 5-14 years | 1 | 371800 | 0.27 | 1985673798 | 697 | Millenials |
| Europe | Albania | 1994 | female | 5-14 years | 0 | 344400 | 0.00 | 1985673798 | 697 | Millenials |
| Europe | Albania | 1995 | male | 25-34 years | 13 | 232900 | 5.58 | 2424499009 | 835 | Generation X |
| Europe | Albania | 1995 | male | 55-74 years | 9 | 178000 | 5.06 | 2424499009 | 835 | Silent |
| Europe | Albania | 1995 | female | 75+ years | 2 | 40800 | 4.90 | 2424499009 | 835 | G.I. Generation |
| Europe | Albania | 1995 | female | 15-24 years | 13 | 283500 | 4.59 | 2424499009 | 835 | Generation X |
| Europe | Albania | 1995 | male | 15-24 years | 11 | 241200 | 4.56 | 2424499009 | 835 | Generation X |
| Europe | Albania | 1995 | male | 75+ years | 1 | 25100 | 3.98 | 2424499009 | 835 | G.I. Generation |
| Europe | Albania | 1995 | male | 35-54 years | 14 | 375900 | 3.72 | 2424499009 | 835 | Boomers |
| Europe | Albania | 1995 | female | 25-34 years | 7 | 264000 | 2.65 | 2424499009 | 835 | Generation X |
| Europe | Albania | 1995 | female | 35-54 years | 8 | 356400 | 2.24 | 2424499009 | 835 | Boomers |
| Europe | Albania | 1995 | male | 5-14 years | 6 | 376500 | 1.59 | 2424499009 | 835 | Millenials |
| Europe | Albania | 1995 | female | 55-74 years | 2 | 180400 | 1.11 | 2424499009 | 835 | Silent |
| Europe | Albania | 1995 | female | 5-14 years | 2 | 348700 | 0.57 | 2424499009 | 835 | Millenials |
| Europe | Albania | 1996 | male | 75+ years | 2 | 25400 | 7.87 | 3314898292 | 1127 | G.I. Generation |
| Europe | Albania | 1996 | male | 15-24 years | 17 | 243600 | 6.98 | 3314898292 | 1127 | Generation X |
| Europe | Albania | 1996 | male | 25-34 years | 14 | 235300 | 5.95 | 3314898292 | 1127 | Generation X |
| Europe | Albania | 1996 | female | 15-24 years | 16 | 287700 | 5.56 | 3314898292 | 1127 | Generation X |
| Europe | Albania | 1996 | female | 75+ years | 2 | 41200 | 4.85 | 3314898292 | 1127 | G.I. Generation |
| Europe | Albania | 1996 | female | 25-34 years | 10 | 267900 | 3.73 | 3314898292 | 1127 | Generation X |
| Europe | Albania | 1996 | male | 35-54 years | 12 | 379600 | 3.16 | 3314898292 | 1127 | Boomers |
| Europe | Albania | 1996 | female | 35-54 years | 9 | 362000 | 2.49 | 3314898292 | 1127 | Boomers |
| Europe | Albania | 1996 | male | 55-74 years | 3 | 179900 | 1.67 | 3314898292 | 1127 | Silent |
| Europe | Albania | 1996 | female | 55-74 years | 1 | 183100 | 0.55 | 3314898292 | 1127 | Silent |
| Europe | Albania | 1996 | male | 5-14 years | 2 | 380400 | 0.53 | 3314898292 | 1127 | Millenials |
| Europe | Albania | 1996 | female | 5-14 years | 1 | 354100 | 0.28 | 3314898292 | 1127 | Millenials |
| Europe | Albania | 1997 | male | 25-34 years | 36 | 236000 | 15.25 | 2359903108 | 793 | Generation X |
| Europe | Albania | 1997 | male | 15-24 years | 33 | 244400 | 13.50 | 2359903108 | 793 | Generation X |
| Europe | Albania | 1997 | male | 75+ years | 3 | 25400 | 11.81 | 2359903108 | 793 | G.I. Generation |
| Europe | Albania | 1997 | male | 35-54 years | 30 | 380800 | 7.88 | 2359903108 | 793 | Boomers |
| Europe | Albania | 1997 | female | 15-24 years | 21 | 294000 | 7.14 | 2359903108 | 793 | Generation X |
| Europe | Albania | 1997 | male | 55-74 years | 12 | 180300 | 6.66 | 2359903108 | 793 | Silent |
| Europe | Albania | 1997 | female | 25-34 years | 16 | 273900 | 5.84 | 2359903108 | 793 | Generation X |
| Europe | Albania | 1997 | female | 75+ years | 2 | 42100 | 4.75 | 2359903108 | 793 | G.I. Generation |
| Europe | Albania | 1997 | female | 35-54 years | 7 | 370100 | 1.89 | 2359903108 | 793 | Boomers |
| Europe | Albania | 1997 | female | 5-14 years | 6 | 361800 | 1.66 | 2359903108 | 793 | Millenials |
| Europe | Albania | 1997 | male | 5-14 years | 4 | 381500 | 1.05 | 2359903108 | 793 | Millenials |
| Europe | Albania | 1997 | female | 55-74 years | 0 | 187000 | 0.00 | 2359903108 | 793 | Silent |
| Europe | Albania | 1998 | male | 75+ years | 3 | 25800 | 11.63 | 2707123772 | 899 | G.I. Generation |
| Europe | Albania | 1998 | male | 15-24 years | 27 | 248800 | 10.85 | 2707123772 | 899 | Generation X |
| Europe | Albania | 1998 | female | 15-24 years | 32 | 295600 | 10.83 | 2707123772 | 899 | Generation X |
| Europe | Albania | 1998 | male | 25-34 years | 26 | 240400 | 10.82 | 2707123772 | 899 | Generation X |
| Europe | Albania | 1998 | male | 35-54 years | 29 | 388200 | 7.47 | 2707123772 | 899 | Boomers |
| Europe | Albania | 1998 | male | 55-74 years | 9 | 183800 | 4.90 | 2707123772 | 899 | Silent |
| Europe | Albania | 1998 | female | 25-34 years | 10 | 275300 | 3.63 | 2707123772 | 899 | Generation X |
| Europe | Albania | 1998 | female | 55-74 years | 6 | 188200 | 3.19 | 2707123772 | 899 | Silent |
| Europe | Albania | 1998 | female | 35-54 years | 9 | 372100 | 2.42 | 2707123772 | 899 | Boomers |
| Europe | Albania | 1998 | male | 5-14 years | 2 | 388400 | 0.51 | 2707123772 | 899 | Millenials |
| Europe | Albania | 1998 | female | 5-14 years | 1 | 363800 | 0.27 | 2707123772 | 899 | Millenials |
| Europe | Albania | 1998 | female | 75+ years | 0 | 42300 | 0.00 | 2707123772 | 899 | G.I. Generation |
| Europe | Albania | 1999 | male | 75+ years | 3 | 25900 | 11.58 | 3414760915 | 1127 | G.I. Generation |
| Europe | Albania | 1999 | male | 15-24 years | 24 | 250600 | 9.58 | 3414760915 | 1127 | Generation X |
| Europe | Albania | 1999 | female | 75+ years | 4 | 42400 | 9.43 | 3414760915 | 1127 | G.I. Generation |
| Europe | Albania | 1999 | male | 35-54 years | 31 | 391100 | 7.93 | 3414760915 | 1127 | Boomers |
| Europe | Albania | 1999 | male | 25-34 years | 19 | 242300 | 7.84 | 3414760915 | 1127 | Generation X |
| Europe | Albania | 1999 | male | 55-74 years | 14 | 185200 | 7.56 | 3414760915 | 1127 | Silent |
| Europe | Albania | 1999 | female | 15-24 years | 19 | 296800 | 6.40 | 3414760915 | 1127 | Generation X |
| Europe | Albania | 1999 | female | 25-34 years | 13 | 276500 | 4.70 | 3414760915 | 1127 | Generation X |
| Europe | Albania | 1999 | female | 55-74 years | 6 | 188800 | 3.18 | 3414760915 | 1127 | Silent |
| Europe | Albania | 1999 | female | 35-54 years | 5 | 373600 | 1.34 | 3414760915 | 1127 | Boomers |
| Europe | Albania | 1999 | female | 5-14 years | 1 | 365200 | 0.27 | 3414760915 | 1127 | Millenials |
| Europe | Albania | 1999 | male | 5-14 years | 0 | 391300 | 0.00 | 3414760915 | 1127 | Millenials |
| Europe | Albania | 2000 | male | 25-34 years | 17 | 232000 | 7.33 | 3632043908 | 1299 | Generation X |
| Europe | Albania | 2000 | male | 55-74 years | 10 | 177400 | 5.64 | 3632043908 | 1299 | Silent |
| Europe | Albania | 2000 | female | 75+ years | 2 | 37800 | 5.29 | 3632043908 | 1299 | G.I. Generation |
| Europe | Albania | 2000 | male | 75+ years | 1 | 24900 | 4.02 | 3632043908 | 1299 | G.I. Generation |
| Europe | Albania | 2000 | female | 15-24 years | 6 | 263900 | 2.27 | 3632043908 | 1299 | Generation X |
| Europe | Albania | 2000 | male | 15-24 years | 5 | 240000 | 2.08 | 3632043908 | 1299 | Generation X |
| Europe | Albania | 2000 | female | 35-54 years | 5 | 332200 | 1.51 | 3632043908 | 1299 | Boomers |
| Europe | Albania | 2000 | female | 25-34 years | 3 | 245800 | 1.22 | 3632043908 | 1299 | Generation X |
| Europe | Albania | 2000 | male | 35-54 years | 4 | 374700 | 1.07 | 3632043908 | 1299 | Boomers |
| Europe | Albania | 2000 | male | 5-14 years | 1 | 374900 | 0.27 | 3632043908 | 1299 | Millenials |
| Europe | Albania | 2000 | female | 5-14 years | 0 | 324700 | 0.00 | 3632043908 | 1299 | Millenials |
| Europe | Albania | 2000 | female | 55-74 years | 0 | 168000 | 0.00 | 3632043908 | 1299 | Silent |
| Europe | Albania | 2001 | male | 25-34 years | 22 | 206484 | 10.65 | 4060758804 | 1451 | Generation X |
| Europe | Albania | 2001 | male | 35-54 years | 34 | 378826 | 8.98 | 4060758804 | 1451 | Boomers |
| Europe | Albania | 2001 | male | 55-74 years | 11 | 196670 | 5.59 | 4060758804 | 1451 | Silent |
| Europe | Albania | 2001 | female | 75+ years | 2 | 47254 | 4.23 | 4060758804 | 1451 | Silent |
| Europe | Albania | 2001 | male | 15-24 years | 10 | 256039 | 3.91 | 4060758804 | 1451 | Millenials |
| Europe | Albania | 2001 | female | 15-24 years | 9 | 271359 | 3.32 | 4060758804 | 1451 | Millenials |
| Europe | Albania | 2001 | female | 35-54 years | 12 | 370191 | 3.24 | 4060758804 | 1451 | Boomers |
| Europe | Albania | 2001 | male | 75+ years | 1 | 31044 | 3.22 | 4060758804 | 1451 | Silent |
| Europe | Albania | 2001 | female | 55-74 years | 6 | 189799 | 3.16 | 4060758804 | 1451 | Silent |
| Europe | Albania | 2001 | male | 5-14 years | 6 | 321556 | 1.87 | 4060758804 | 1451 | Millenials |
| Europe | Albania | 2001 | female | 25-34 years | 4 | 222771 | 1.80 | 4060758804 | 1451 | Generation X |
| Europe | Albania | 2001 | female | 5-14 years | 2 | 307356 | 0.65 | 4060758804 | 1451 | Millenials |
| Europe | Albania | 2002 | male | 75+ years | 4 | 31007 | 12.90 | 4435078648 | 1573 | Silent |
| Europe | Albania | 2002 | male | 25-34 years | 23 | 206286 | 11.15 | 4435078648 | 1573 | Generation X |
| Europe | Albania | 2002 | male | 35-54 years | 35 | 382139 | 9.16 | 4435078648 | 1573 | Boomers |
| Europe | Albania | 2002 | male | 55-74 years | 13 | 198130 | 6.56 | 4435078648 | 1573 | Silent |
| Europe | Albania | 2002 | male | 15-24 years | 15 | 263067 | 5.70 | 4435078648 | 1573 | Millenials |
| Europe | Albania | 2002 | female | 15-24 years | 14 | 275970 | 5.07 | 4435078648 | 1573 | Millenials |
| Europe | Albania | 2002 | female | 35-54 years | 15 | 375113 | 4.00 | 4435078648 | 1573 | Boomers |
| Europe | Albania | 2002 | female | 25-34 years | 7 | 223685 | 3.13 | 4435078648 | 1573 | Generation X |
| Europe | Albania | 2002 | female | 75+ years | 1 | 47407 | 2.11 | 4435078648 | 1573 | Silent |
| Europe | Albania | 2002 | female | 55-74 years | 4 | 191712 | 2.09 | 4435078648 | 1573 | Silent |
| Europe | Albania | 2002 | female | 5-14 years | 1 | 304850 | 0.33 | 4435078648 | 1573 | Millenials |
| Europe | Albania | 2002 | male | 5-14 years | 1 | 319473 | 0.31 | 4435078648 | 1573 | Millenials |
| Europe | Albania | 2003 | female | 75+ years | 6 | 49088 | 12.22 | 5746945913 | 2021 | Silent |
| Europe | Albania | 2003 | male | 55-74 years | 16 | 201520 | 7.94 | 5746945913 | 2021 | Silent |
| Europe | Albania | 2003 | male | 35-54 years | 28 | 386196 | 7.25 | 5746945913 | 2021 | Boomers |
| Europe | Albania | 2003 | male | 15-24 years | 15 | 273235 | 5.49 | 5746945913 | 2021 | Millenials |
| Europe | Albania | 2003 | female | 15-24 years | 14 | 283709 | 4.93 | 5746945913 | 2021 | Millenials |
| Europe | Albania | 2003 | female | 55-74 years | 9 | 195699 | 4.60 | 5746945913 | 2021 | Silent |
| Europe | Albania | 2003 | male | 25-34 years | 9 | 205433 | 4.38 | 5746945913 | 2021 | Generation X |
| Europe | Albania | 2003 | female | 25-34 years | 9 | 222941 | 4.04 | 5746945913 | 2021 | Generation X |
| Europe | Albania | 2003 | female | 35-54 years | 13 | 381760 | 3.41 | 5746945913 | 2021 | Boomers |
| Europe | Albania | 2003 | male | 75+ years | 1 | 32667 | 3.06 | 5746945913 | 2021 | Silent |
| Europe | Albania | 2003 | male | 5-14 years | 4 | 313204 | 1.28 | 5746945913 | 2021 | Millenials |
| Europe | Albania | 2003 | female | 5-14 years | 0 | 298477 | 0.00 | 5746945913 | 2021 | Millenials |
| Europe | Albania | 2004 | male | 75+ years | 4 | 35526 | 11.26 | 7314865176 | 2544 | Silent |
| Europe | Albania | 2004 | male | 35-54 years | 39 | 391767 | 9.95 | 7314865176 | 2544 | Boomers |
| Europe | Albania | 2004 | male | 25-34 years | 16 | 203938 | 7.85 | 7314865176 | 2544 | Generation X |
| Europe | Albania | 2004 | female | 15-24 years | 20 | 292268 | 6.84 | 7314865176 | 2544 | Millenials |
| Europe | Albania | 2004 | male | 15-24 years | 19 | 286768 | 6.63 | 7314865176 | 2544 | Millenials |
| Europe | Albania | 2004 | female | 75+ years | 3 | 50970 | 5.89 | 7314865176 | 2544 | Silent |
| Europe | Albania | 2004 | female | 25-34 years | 11 | 222389 | 4.95 | 7314865176 | 2544 | Generation X |
| Europe | Albania | 2004 | male | 55-74 years | 10 | 207202 | 4.83 | 7314865176 | 2544 | Silent |
| Europe | Albania | 2004 | female | 35-54 years | 17 | 391436 | 4.34 | 7314865176 | 2544 | Boomers |
| Europe | Albania | 2004 | female | 55-74 years | 3 | 203841 | 1.47 | 7314865176 | 2544 | Silent |
| Europe | Albania | 2004 | female | 5-14 years | 3 | 286705 | 1.05 | 7314865176 | 2544 | Millenials |
| Europe | Albania | 2004 | male | 5-14 years | 1 | 302181 | 0.33 | 7314865176 | 2544 | Millenials |
| Europe | Albania | 2005 | female | 15-24 years | 0 | 281922 | 0.00 | 8158548717 | 2931 | Millenials |
| Europe | Albania | 2005 | female | 25-34 years | 0 | 190745 | 0.00 | 8158548717 | 2931 | Generation X |
| Europe | Albania | 2005 | female | 35-54 years | 0 | 386513 | 0.00 | 8158548717 | 2931 | Boomers |
| Europe | Albania | 2005 | female | 5-14 years | 0 | 276559 | 0.00 | 8158548717 | 2931 | Millenials |
| Europe | Albania | 2005 | female | 55-74 years | 0 | 210998 | 0.00 | 8158548717 | 2931 | Silent |
| Europe | Albania | 2005 | female | 75+ years | 0 | 53191 | 0.00 | 8158548717 | 2931 | Silent |
| Europe | Albania | 2005 | male | 15-24 years | 0 | 281675 | 0.00 | 8158548717 | 2931 | Millenials |
| Europe | Albania | 2005 | male | 25-34 years | 0 | 177519 | 0.00 | 8158548717 | 2931 | Generation X |
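The `Suicide_per_100k` column can be reproduced from `Suicides_no` and `Population`. The sketch below checks this on the first three rows of the table above (Albania, 1987); the vector names are illustrative.

```r
# First three rows of the table above (Albania, 1987)
suicides_no <- c(21, 16, 14)
population  <- c(312900, 308000, 289700)

# Suicides per 100,000 people in each sex and age group
rate <- round(suicides_no / population * 100000, 2)
rate
```

The results agree with the 6.71, 5.19 and 4.83 shown in the table.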
glimpse(suicide)
## Observations: 27,820
## Variables: 11
## $ Continent <chr> "Europe", "Europe", "Europe", "Europe", "Euro...
## $ Country <chr> "Albania", "Albania", "Albania", "Albania", "...
## $ Year <int> 1987, 1987, 1987, 1987, 1987, 1987, 1987, 198...
## $ Sex <chr> "male", "male", "female", "male", "male", "fe...
## $ Age <chr> "15-24 years", "35-54 years", "15-24 years", ...
## $ Suicides_no <int> 21, 16, 14, 1, 9, 1, 6, 4, 1, 0, 0, 0, 2, 17,...
## $ Population <int> 312900, 308000, 289700, 21800, 274300, 35600,...
## $ Suicide_per_100k <dbl> 6.71, 5.19, 4.83, 4.59, 3.28, 2.81, 2.15, 1.5...
## $ Gdp_for_year <dbl> 2156624900, 2156624900, 2156624900, 215662490...
## $ Gdp_per_capita <int> 796, 796, 796, 796, 796, 796, 796, 796, 796, ...
## $ Generation <chr> "Generation X", "Silent", "Generation X", "G....
summary(suicide)
## Continent Country Year Sex
## Length:27820 Length:27820 Min. :1985 Length:27820
## Class :character Class :character 1st Qu.:1995 Class :character
## Mode :character Mode :character Median :2002 Mode :character
## Mean :2001
## 3rd Qu.:2008
## Max. :2016
## Age Suicides_no Population Suicide_per_100k
## Length:27820 Min. : 0.0 Min. : 278 Min. : 0.00
## Class :character 1st Qu.: 3.0 1st Qu.: 97498 1st Qu.: 0.92
## Mode :character Median : 25.0 Median : 430150 Median : 5.99
## Mean : 242.6 Mean : 1844794 Mean : 12.82
## 3rd Qu.: 131.0 3rd Qu.: 1486143 3rd Qu.: 16.62
## Max. :22338.0 Max. :43805214 Max. :224.97
## Gdp_for_year Gdp_per_capita Generation
## Min. :4.692e+07 Min. : 251 Length:27820
## 1st Qu.:8.985e+09 1st Qu.: 3447 Class :character
## Median :4.811e+10 Median : 9372 Mode :character
## Mean :4.456e+11 Mean : 16866
## 3rd Qu.:2.602e+11 3rd Qu.: 24874
## Max. :1.812e+13 Max. :126352
mapped <- joinCountryData2Map(suicide, joinCode="NAME", nameJoinColumn="Country")
## 27508 codes from your data successfully matched countries in the map
## 312 codes from your data failed to match with a country code in the map
## 144 codes from the map weren't represented in your data
mapCountryData(mapped, nameColumnToPlot="Suicides_no", mapTitle="Suicide Throughout The World", catMethod = "pretty", colourPalette = "rainbow")
## You asked for 7 categories, 5 were used due to pretty() classification
From this perspective, it seems as though continents with stronger economies have more suicides. Later in this research we test whether this impression holds.
ggplot(suicide, aes(x=Continent, y=Suicide_per_100k, fill=Continent)) + geom_bar(stat="identity")+theme_minimal() + scale_fill_brewer(palette="Set3")
As the graph above shows, Europe has the highest suicide rates, followed by the Americas, while Africa has the lowest, which matches the world map shown before this plot. This does not necessarily mean that suicide is rare in Africa; some cases may simply not have been recorded.
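One caveat when reading the continent bar chart: `geom_bar(stat = "identity")` stacks every row, so a bar's height depends on how many rows a continent contributes as well as on how high its rates are. The sketch below uses made-up numbers and hypothetical group names to show that the sum and the mean of the same column can rank groups differently. It is an illustration, not a re-analysis of the data.

```r
# Made-up rates: group "A" has many low rows, group "B" has few high rows
toy <- data.frame(
  group = c(rep("A", 6), rep("B", 2)),
  rate  = c(2, 3, 2, 3, 2, 3, 6, 7)
)

sums  <- tapply(toy$rate, toy$group, sum)    # what stacked bars display
means <- tapply(toy$rate, toy$group, mean)   # average rate per group
sums
means
```

In this toy case group "A" has the larger sum while group "B" has the larger mean, so comparing mean rates per continent may be a useful cross-check on the chart.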
suicide2 <- suicide
data <- suicide2 %>%
dplyr::group_by(Country) %>%
summarise(Suiciderates = mean(Suicide_per_100k))
data$Country <- factor(data$Country, levels = rev( data$Country[order(-data$Suiciderates)]))
ggplot(data, aes(x=Country, y=Suiciderates, fill = Country)) + geom_bar(stat="identity")+ theme_minimal() + theme(axis.text=element_text(size=6)) + theme(legend.position = "none") + theme(axis.text.x = element_text(angle = 90, hjust = 1))
countries <- suicide %>%
dplyr::select(Country, Suicide_per_100k) %>%
dplyr::group_by(Country) %>%
dplyr::summarise(mean_suicide = mean(Suicide_per_100k)) %>%
dplyr::arrange(desc(mean_suicide)) %>%
data.frame()
## Warning: package 'bindrcpp' was built under R version 3.5.2
# show the five highest and five lowest countries in a single table
kable(rbind(head(countries, 5), tail(countries, 5)), row.names = FALSE) %>% kable_styling(bootstrap_options = "striped", font_size = 10)
average <- (sum(as.numeric(suicide$Suicides_no)) / sum(as.numeric(suicide$Population))) * 100000
suicide %>%
group_by(Year) %>%
summarise(population = sum(Population),
suicides = sum(Suicides_no),
suicides_per_100k = (suicides / population) * 100000) %>%
ggplot(aes(x = Year, y = suicides_per_100k)) +
geom_line(aes(color = Year), size = 1) +
geom_point(size = 2) +
geom_hline(yintercept = average, size = 1, linetype = 2) +
labs(title = "Global Suicides (per 100k) 1985 - 2016",
x = "Year",
y = "Suicides per 100k") +
scale_x_continuous(breaks = seq(1985, 2016, 2)) +
scale_y_continuous(breaks = seq(10, 30)) +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
Suicide rates have decreased over the years. One possible contributor is that people are becoming more aware and knowledgeable thanks to the number of programs being put into place. In addition, social media is part of our daily lives and people are more connected than ever, which may make help easier to reach.
suicide %>%
dplyr::group_by(Year) %>%
ggplot(aes(x = Year, y = Suicide_per_100k, group = Year)) +
geom_boxplot() +
labs(title = "Global Suicides (per 100k) 1985 - 2016",
x = "Year",
y = "Suicides per 100k") +
scale_x_continuous(breaks = seq(1985, 2016, 1)) +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
suicide %>%
group_by(Year, Age) %>%
summarise(population = sum(Population),
suicides = sum(Suicides_no),
suicides_per_100k = (suicides / population) * 100000) %>%
ggplot(aes(x = Year, y = suicides_per_100k, group = Age)) +
geom_line(aes(color = Age), size = 1) +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
labs(title = "Global Suicides By Age 1985 - 2016",
x = "Year",
y = "Suicides per 100k") +
scale_x_continuous(breaks = seq(1985, 2016, 2))
ggplot(suicide, aes(x = Generation, y = Suicide_per_100k, fill = Generation)) + geom_boxplot() + theme(axis.text.x = element_text(angle = 90, hjust = 1)) + theme(axis.text=element_text(size=8)) + theme(legend.position = "none")
plot1 <- ggplot(suicide, aes(x = Sex, y = Suicide_per_100k, fill = Sex)) + geom_bar(stat = "identity")
plot2 <- ggplot(suicide, aes(x =Suicide_per_100k, fill = Sex, color = Sex )) + geom_histogram(alpha=0.5, position="identity")
grid.arrange(plot1, plot2, ncol = 2)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Fewer women than men died by suicide. The distribution of suicide rates globally is unimodal and right-skewed (positively skewed) for both males and females.
suicide %>% filter(Sex == "female") %>% summarise(Female = mean(Suicides_no))
## Female
## 1 112.1143
suicide %>% filter(Sex == "male") %>% summarise(Male = mean(Suicides_no))
## Male
## 1 373.0345
Globally, each record (one country, year, and age group) averages about 373 suicides among men compared with about 112 among women.
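The two filtered summaries can also be computed in a single grouped call. This is a minimal base R sketch on made-up counts; the column names follow the cleaned data frame, and the values are illustrative only.

```r
# Made-up counts for illustration; column names follow the cleaned data
toy <- data.frame(
  Sex         = c("female", "male", "female", "male"),
  Suicides_no = c(10, 30, 20, 50)
)

# Mean count per sex in one call
mean_by_sex <- tapply(toy$Suicides_no, toy$Sex, mean)
mean_by_sex
```

With dplyr loaded, `suicide %>% group_by(Sex) %>% summarise(mean(Suicides_no))` would be the pipeline equivalent.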
Here is a look at the outliers associated with each sex on each continent.
ggplot(suicide, aes(x=Continent, y=Suicide_per_100k, fill=Sex)) +
geom_boxplot() +
theme_minimal() +
scale_fill_brewer(palette="Set3")
Overall, more men than women are dying by suicide.
suicide %>% ggplot(aes(x = Country, y = Gdp_per_capita, fill = Country)) + geom_bar(stat = "identity") + theme(axis.text.x = element_text(angle = 90, hjust = 1)) + theme(axis.text=element_text(size=6)) + theme(legend.position='none') + transition_time(Year) + labs(title = "Year: {frame_time}")
meangdp <- mean(suicide$Gdp_per_capita)
his <- suicide %>%
ggplot(aes(x = Gdp_per_capita, fill = Continent)) +
geom_histogram(position="identity", alpha=0.7) +
# Add mean lines
geom_vline(xintercept = meangdp, color = "red", linetype = "dashed") +
ggtitle("Distribution of GDP")
bar <- ggplot(suicide, aes(x=Continent, y=Gdp_per_capita, fill=Continent)) + geom_bar(stat="identity")+theme_minimal() + scale_fill_brewer(palette="Set3")
grid.arrange(his, bar, ncol = 2)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Notice that Europe has the most reported suicides (shown earlier) and also has the highest GDP per capita.
countries2 <- suicide %>%
dplyr::select(Country, Gdp_per_capita) %>%
dplyr::group_by(Country) %>%
dplyr::summarise(mean_gdppc = mean(Gdp_per_capita)) %>%
dplyr::arrange(desc(mean_gdppc)) %>%
data.frame()
# Stack the five highest and five lowest countries into one table
kable(rbind(head(countries2, 5), tail(countries2, 5))) %>% kable_styling(bootstrap_options = "striped", font_size = 10)
suicide %>%
ggplot(aes(x = Gdp_per_capita, y = Suicide_per_100k, group = Continent)) +
geom_point(aes(color = Continent), size = 1) + labs(y = "Suicide", x = "GDP Per Capita")
group_by(suicide, Continent) %>%
summarise(
size = n(),
mean = mean(Suicides_no, na.rm = TRUE),
sd = sd(Suicides_no, na.rm = TRUE)
)
## # A tibble: 5 x 4
## Continent size mean sd
## <chr> <int> <dbl> <dbl>
## 1 Africa 850 13.4 23.7
## 2 Americas 9214 194. 798.
## 3 Asia 5366 271. 838.
## 4 Europe 11418 299. 1061.
## 5 Oceania 972 87.3 148.
# Compute the analysis of variance
aov_continents <- aov(Suicides_no ~ Continent, data = suicide)
# Summary of the analysis
summary(aov_continents)
##                Df    Sum Sq  Mean Sq F value Pr(>F)
## Continent       4 1.300e+08 32501192   40.17 <2e-16 ***
## Residuals   27815 2.251e+10   809134
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value is very small (< 0.05), so we reject the null hypothesis of equal means in favor of the alternative that the average suicide count differs for at least one continent. Many other factors could account for these differences, such as the age of the population, religious orientation, race and ethnicity, education, economic status, social or ethnic customs and so on.
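As a quick check on the ANOVA table, the F value can be recovered from the reported sums of squares and degrees of freedom. The mean square for Continent is its sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares:

\[ MS_{Continent} = \frac{1.300 \times 10^{8}}{4} \approx 3.25 \times 10^{7}, \qquad F = \frac{MS_{Continent}}{MS_{Residuals}} = \frac{32501192}{809134} \approx 40.17 \]

This agrees with the F value of 40.17 shown in the output.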
\(H_0: \rho = 0\). There is no relationship between GDP and suicide rates. Suicide rates do not change with GDP.
\(H_A: \rho \neq 0\). There is a relationship between GDP and suicide rates. Suicide rates change when GDP does.
Check Conditions:
Independence: The sample consists of less than 10% of the population, so assuming independence is reasonable.
Normal: The distribution is strongly skewed; however, we can be lenient about the skewness because the sample contains far more than 30 cases.
plot(suicide[, -c(1, 2, 4, 5, 11)])
Looking at our plot, it does not appear that any of our quantitative predictor variables are highly correlated, or have a strong linear relationship with one another.
C <- cor(suicide[, -c(1, 2, 4, 5, 11)])
corrplot(C, type="upper", order="hclust",
col=brewer.pal(n=8, name="Spectral"))
Since none of the correlations is greater than 0.9, the correlation matrix confirms what the pairwise plot suggested.
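The 0.9 threshold can also be checked programmatically rather than by eye. The sketch below is illustrative only: `high_cor_pairs` is a hypothetical helper, and it is run on R's built-in `mtcars` data rather than on the suicide dataset.

```r
# Hypothetical helper: list variable pairs whose absolute correlation exceeds a cutoff
high_cor_pairs <- function(df, cutoff = 0.9) {
  C <- cor(df)
  # Look only at the upper triangle so each pair is reported once
  idx <- which(abs(C) > cutoff & upper.tri(C), arr.ind = TRUE)
  data.frame(
    var1 = rownames(C)[idx[, "row"]],
    var2 = colnames(C)[idx[, "col"]],
    r    = C[idx]
  )
}

# Built-in example data stand in for the numeric columns of the report's data frame
pairs_found <- high_cor_pairs(mtcars[, c("mpg", "cyl", "disp", "hp", "wt")])
print(pairs_found)
```

An empty result would correspond to the situation described above, where no pair of predictors crosses the threshold.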
We use the Akaike Information Criterion (AIC), through stepAIC from the MASS package, to select the best model for this data.
start.model <- lm(Suicide_per_100k ~ Gdp_per_capita + Continent + Country + Year + Sex + Age + Suicides_no + Population + Gdp_for_year + Generation, data = suicide)
lm.model <- lm(Suicide_per_100k ~ 1, data = suicide)
stepAIC(start.model, scope = list(upper = start.model, lower = lm.model), direction = "backward")
## Start: AIC=141789.4
## Suicide_per_100k ~ Gdp_per_capita + Continent + Country + Year +
## Sex + Age + Suicides_no + Population + Gdp_for_year + Generation
##
##
## Step: AIC=141789.4
## Suicide_per_100k ~ Gdp_per_capita + Country + Year + Sex + Age +
## Suicides_no + Population + Gdp_for_year + Generation
##
## Df Sum of Sq RSS AIC
## <none> 4509839 141789
## - Gdp_for_year 1 708 4510547 141792
## - Year 1 1731 4511571 141798
## - Generation 5 8308 4518147 141831
## - Gdp_per_capita 1 12214 4522053 141863
## - Population 1 36665 4546504 142013
## - Age 5 177620 4687460 142854
## - Suicides_no 1 292368 4802208 143535
## - Sex 1 1181868 5691708 148263
## - Country 100 1838596 6348435 151102
##
## Call:
## lm(formula = Suicide_per_100k ~ Gdp_per_capita + Country + Year +
## Sex + Age + Suicides_no + Population + Gdp_for_year + Generation,
## data = suicide)
##
## Coefficients:
## (Intercept) Gdp_per_capita
## 1.239e+02 -9.423e-05
## CountryAntigua and Barbuda CountryArgentina
## -2.306e+00 8.227e+00
## CountryArmenia CountryAruba
## -1.201e-01 8.314e+00
## CountryAustralia CountryAustria
## 1.213e+01 2.282e+01
## CountryAzerbaijan CountryBahamas
## -1.565e+00 -8.137e-03
## CountryBahrain CountryBarbados
## -4.842e-02 2.802e-01
## CountryBelarus CountryBelgium
## 2.694e+01 2.004e+01
## CountryBelize CountryBosnia and Herzegovina
## 2.871e+00 1.976e+00
## CountryBrazil CountryBulgaria
## 8.907e+00 1.592e+01
## CountryCabo Verde CountryCanada
## 8.179e+00 1.135e+01
## CountryChile CountryColombia
## 7.667e+00 3.372e+00
## CountryCosta Rica CountryCroatia
## 3.835e+00 2.012e+01
## CountryCuba CountryCyprus
## 1.792e+01 2.480e+00
## CountryCzech Republic CountryDenmark
## 1.581e+01 1.514e+01
## CountryDominica CountryEcuador
## -4.703e+00 3.178e+00
## CountryEl Salvador CountryEstonia
## 7.230e+00 2.477e+01
## CountryFiji CountryFinland
## 2.163e+00 2.209e+01
## CountryFrance CountryGeorgia
## 1.883e+01 9.408e-01
## CountryGermany CountryGreece
## 1.426e+01 2.252e+00
## CountryGrenada CountryGuatemala
## -1.058e+00 1.195e-01
## CountryGuyana CountryHungary
## 1.856e+01 2.927e+01
## CountryIceland CountryIreland
## 1.264e+01 1.016e+01
## CountryIsrael CountryItaly
## 7.449e+00 8.320e+00
## CountryJamaica CountryJapan
## -2.927e+00 1.565e+01
## CountryKazakhstan CountryKiribati
## 2.650e+01 2.697e+00
## CountryKuwait CountryKyrgyzstan
## 1.863e-01 1.075e+01
## CountryLatvia CountryLithuania
## 2.646e+01 3.734e+01
## CountryLuxembourg CountryMacau
## 1.919e+01 1.187e+01
## CountryMaldives CountryMalta
## -1.548e+00 2.541e+00
## CountryMauritius CountryMexico
## 8.356e+00 5.392e+00
## CountryMongolia CountryMontenegro
## 1.371e+01 6.967e+00
## CountryNetherlands CountryNew Zealand
## 1.021e+01 1.259e+01
## CountryNicaragua CountryNorway
## 3.692e+00 1.448e+01
## CountryOman CountryPanama
## -3.870e-01 2.828e+00
## CountryParaguay CountryPhilippines
## 8.572e-01 2.761e+00
## CountryPoland CountryPortugal
## 1.200e+01 8.924e+00
## CountryPuerto Rico CountryQatar
## 8.152e+00 4.766e+00
## CountryRepublic of Korea CountryRomania
## 2.147e+01 9.315e+00
## CountryRussian Federation CountrySaint Kitts and Nevis
## 2.010e+01 -3.864e+00
## CountrySaint Lucia CountrySaint Vincent and Grenadines
## 3.921e+00 2.530e+00
## CountrySan Marino CountrySerbia
## 5.446e+00 1.901e+01
## CountrySeychelles CountrySingapore
## 4.934e+00 1.681e+01
## CountrySlovakia CountrySlovenia
## 9.968e+00 2.597e+01
## CountrySouth Africa CountrySpain
## 5.425e-01 8.369e+00
## CountrySri Lanka CountrySuriname
## 3.025e+01 1.789e+01
## CountrySweden CountrySwitzerland
## 1.462e+01 2.120e+01
## CountryThailand CountryTrinidad and Tobago
## 5.362e+00 1.074e+01
## CountryTurkey CountryTurkmenistan
## 3.713e+00 5.208e+00
## CountryUkraine CountryUnited Arab Emirates
## 2.074e+01 2.260e+00
## CountryUnited Kingdom CountryUnited States
## 7.678e+00 1.185e+01
## CountryUruguay CountryUzbekistan
## 1.634e+01 5.146e+00
## Year Sexmale
## -6.534e-02 1.338e+01
## Age25-34 years Age35-54 years
## 2.969e+00 5.502e+00
## Age5-14 years Age55-74 years
## -8.061e+00 7.195e+00
## Age75+ years Suicides_no
## 1.480e+01 5.380e-03
## Population Gdp_for_year
## -7.596e-07 3.170e-13
## GenerationG.I. Generation GenerationGeneration X
## 7.106e-01 3.763e-01
## GenerationGeneration Z GenerationMillenials
## 2.202e+00 4.699e-01
## GenerationSilent
## -8.903e-01
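To make the AIC column in the stepwise output less opaque, here is a minimal sketch of backward selection on R's built-in `mtcars` data (not the suicide data). It also recomputes by hand the AIC that `extractAIC` reports for a linear model, n log(RSS / n) + 2 × (number of estimated coefficients). The model terms are chosen purely for illustration.

```r
library(MASS)  # provides stepAIC

# Full and intercept-only models on built-in example data
full <- lm(mpg ~ wt + hp + disp + qsec, data = mtcars)

# Backward elimination; trace = 0 suppresses the step-by-step printout
selected <- stepAIC(full,
                    scope = list(upper = formula(full), lower = ~ 1),
                    direction = "backward", trace = 0)

# Recompute the AIC of the full model by hand
n   <- nrow(mtcars)
rss <- deviance(full)            # residual sum of squares
edf <- n - df.residual(full)     # number of estimated coefficients
aic_by_hand <- n * log(rss / n) + 2 * edf

print(c(by_hand = aic_by_hand, extractAIC = extractAIC(full)[2]))
print(formula(selected))
```

Applied to the output above (RSS 4509839, n = 27820 and 117 estimated coefficients), the same formula gives roughly 141789, which matches the reported starting AIC.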
summary(lm.model)
##
## Call:
## lm(formula = Suicide_per_100k ~ 1, data = suicide)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.816 -11.896 -6.826 3.804 212.154
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.8161 0.1137 112.7 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.96 on 27819 degrees of freedom
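The intercept of this null model, which is simply the overall mean suicide rate (12.82 per 100k), is significantly different from zero (p < 0.05). The model contains no predictors, so on its own it says nothing about the effect of GDP.
To get a less cluttered view of the scatterplot, I grouped the data by country and continent. Each point in this graph is one country.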
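An intercept-only model such as the one summarised above estimates a single number, the sample mean of the response. The short sketch below illustrates this with a hypothetical vector `y` (made-up values, not taken from the dataset).

```r
# Hypothetical response values
y <- c(2.1, 5.4, 0.0, 12.3, 7.7)

# Intercept-only (null) model
fit0 <- lm(y ~ 1)

# The fitted intercept coincides with the sample mean
intercept <- unname(coef(fit0))
print(c(intercept = intercept, mean = mean(y)))
```

This is why the estimate of 12.8161 in the summary is the same as the average of Suicide_per_100k across all records.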
suicide %>%
group_by(Country, Continent) %>%
summarise(population = sum(as.numeric(Population)),
suicides = sum(as.numeric(Suicides_no)),
suicides_per_100k = (suicides / population) * 100000,
gdp_per_capita = mean(Gdp_per_capita)) %>%
ggplot(aes(x = gdp_per_capita, y = suicides_per_100k, group = Continent)) +
geom_point(aes(color = Continent), size = 1) + labs(y = "Suicide Rates", x = "GDP Per Capita") + geom_smooth(method=lm, linetype="dashed", color="darkgreen", fill="orange", aes(group = 1))
Correlation:
cor.test(suicide$Gdp_per_capita, suicide$Suicide_per_100k)
##
## Pearson's product-moment correlation
##
## data: suicide$Gdp_per_capita and suicide$Suicide_per_100k
## t = 0.29774, df = 27818, p-value = 0.7659
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.009966025 0.013535799
## sample estimates:
## cor
## 0.001785134
The trend is positive but extremely weak, and the large p-value (0.7659) means the correlation is not statistically significant.
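The t statistic in the output follows directly from the sample correlation \(r\) and the sample size \(n = 27820\) (so \(n - 2 = 27818\) degrees of freedom):

\[ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}} = \frac{0.001785134 \times \sqrt{27818}}{\sqrt{1 - 0.001785134^{2}}} \approx 0.2977 \]

With a correlation this close to zero, even a sample of nearly 28,000 records produces a t value far below the usual critical value of about 1.96.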
mod <- lm(Suicide_per_100k ~ Gdp_per_capita, data = suicide)
summary(mod)
##
## Call:
## lm(formula = Suicide_per_100k ~ Gdp_per_capita, data = suicide)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.012 -11.894 -6.827 3.802 212.152
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.279e+01 1.524e-01 83.888 <2e-16 ***
## Gdp_per_capita 1.792e-06 6.019e-06 0.298 0.766
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.96 on 27818 degrees of freedom
## Multiple R-squared: 3.187e-06, Adjusted R-squared: -3.276e-05
## F-statistic: 0.08865 on 1 and 27818 DF, p-value: 0.7659
Equation:
\[ \hat{S} = 12.78587 + 0.000001792 \times gdp\_per\_capita \]
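To see how small the fitted slope is in practical terms, the sketch below plugs two GDP per capita values into the equation. The coefficients are copied from the equation above; `predict_rate` and the two GDP values are hypothetical and used only for illustration.

```r
# Coefficients copied from the fitted equation
b0 <- 12.78587
b1 <- 0.000001792

# Hypothetical helper: predicted suicides per 100k for a given GDP per capita
predict_rate <- function(gdp_per_capita) b0 + b1 * gdp_per_capita

low  <- predict_rate(1000)
high <- predict_rate(50000)
print(c(low = low, high = high, difference = high - low))
```

A difference of 49,000 in GDP per capita changes the predicted rate by less than 0.1 suicides per 100k, which is consistent with the near-zero R-squared.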
layout(matrix(c(1,2,3,4),2,2))
plot(mod)
moddf <- broom::augment(mod)
ggplot(moddf, aes(x = .resid)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
The diagnostic plots above show that the model does not appear to meet the conditions of linearity, normality and constant variance.
The hypothesis test concludes that the correlation is not significantly different from zero since the p-value is large (> 0.05), so \(H_0\) is not rejected. In other words, there is insufficient evidence to conclude that there is a significant relationship between GDP and suicide. Therefore, we cannot use the linear regression line to model the relationship between GDP and suicide in the population. Even though the line in the scatterplot shows a weak linear trend, it may not be appropriate or reliable for prediction outside the range of GDP values observed in the dataset.
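When residuals are this skewed, two common options are to transform the response or to use a rank-based measure of association. The sketch below is not part of the original analysis: it uses simulated data (the `gdp` and `rate` vectors are made up) to show how `log1p` reduces residual skewness and how a Spearman correlation can be requested from `cor.test`.

```r
set.seed(42)
n <- 500
gdp  <- runif(n, 500, 60000)      # simulated GDP per capita
rate <- rexp(n, rate = 1 / 12)    # simulated right-skewed rate, unrelated to gdp

# Simple moment-based sample skewness
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

fit_raw <- lm(rate ~ gdp)          # model on the original scale
fit_log <- lm(log1p(rate) ~ gdp)   # model on the log(1 + rate) scale

skew_raw <- skewness(resid(fit_raw))
skew_log <- skewness(resid(fit_log))
print(c(raw = skew_raw, log1p = skew_log))

# Rank-based correlation, which does not assume normality or linearity
rho <- cor.test(gdp, rate, method = "spearman")
print(rho$estimate)
```

Whether either approach is appropriate for the real data would need to be checked against its own diagnostic plots.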
Suicide occurs throughout the world, affecting individuals of all nations, cultures, religions, genders and classes. Moreover, statistics show that the countries with the highest suicide rates in the world are remarkably diverse. For instance, in this report the top five include the eastern European countries of Lithuania (40.42 suicides per 100k) and Russia (34.89 suicides per 100k), as well as Sri Lanka, Hungary and Belarus with 34.89, 32.76 and 31.08 respectively.
In contrast, many of the most troubled nations in the world have comparatively low suicide rates; Mexico, for example, has 4.71 suicides per 100k. Perhaps people there are more preoccupied with day-to-day survival. It is not clear whether the suicide statistics for these countries reflect suicides due to mental health problems, terminal illnesses or conflicts within the country, all of which are plausible factors. The islands of the Caribbean seem to have the lowest suicide rates, especially Antigua and Barbuda (0.55), Jamaica (0.52), Dominica (0) and St. Kitts and Nevis (0).
Although GDP may not directly affect suicide, it could be one of many circumstantial factors, since it reflects a country's economic conditions. However, this analysis found no statistically significant evidence that a country's GDP per capita is related to its suicide rate. There are many reasons why people commit suicide, and not everyone is willing to talk about how they feel, but it is our job as a community to observe and lend a helping hand.
For future research, studies could investigate why countries with the best economic standing may have higher suicide rates than poor or developing countries.
Diez, David M., et al. OpenIntro Statistics. OpenIntro, 2016.
Lee, Lindsay, et al. “Suicide.” Our World in Data, 15 June 2015, ourworldindata.org/suicide.
“Suicide in America: Frequently Asked Questions.” National Institute of Mental Health, U.S. Department of Health and Human Services, www.nimh.nih.gov/health/publications/suicide-faq/index.shtml#pub9.