For Part 2 of this week’s assignment, I will be using the “foreveralone” data from Kaggle. The dataset contains demographics from the Reddit group reddit.com/r/foreveralone.
Dependent variable: 1. Suicide attempt - Yes or No
Independent variables: 1. Friends - Numeric 2. Age - Numeric 3. Virgin - Yes or No 4. Depressed - Yes or No
library(readr)
foreveralone<-read_csv("C:/Users/wroni/OneDrive/Documents/QC MADASR/SOC 712/foreveralone.csv")
I loaded the dataset into RStudio.
library(dplyr)
foreveralone2 <- mutate(foreveralone, attempt_suicide_binary= recode(attempt_suicide,`No` = 0, `Yes` = 1))
foreveralone3 <- foreveralone2 %>%
select(attempt_suicide_binary, age, friends, virgin, depressed)
head(foreveralone3)
foreveralone4 <- mutate(foreveralone3, virgin = factor(virgin))
foreveralone5 <- mutate(foreveralone4, depressed = factor(depressed))
I did some dplyr magic to get it ready for Zelig logit. I recoded Suicide Attempt as a binary (0 = No, 1 = Yes), kept only the variables I need, and made sure Virgin and Depressed are factors.
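For readers without dplyr installed, the same recoding can be sketched in base R. This is only an illustration on a made-up three-row data frame (the name toy and its values are hypothetical and not taken from the Kaggle file); it mirrors the steps above rather than replacing them.

```r
# Hypothetical toy data standing in for the real foreveralone file
toy <- data.frame(
  attempt_suicide = c("No", "Yes", "No"),
  virgin          = c("Yes", "No", "Yes"),
  depressed       = c("Yes", "Yes", "No"),
  stringsAsFactors = FALSE
)

# Recode the Yes/No outcome as 1/0
toy$attempt_suicide_binary <- ifelse(toy$attempt_suicide == "Yes", 1, 0)

# Turn the Yes/No predictors into factors ("No" becomes the reference level)
toy$virgin    <- factor(toy$virgin)
toy$depressed <- factor(toy$depressed)

str(toy)
```

Because factor levels are sorted alphabetically, "No" is the baseline, which is why the model output reports virginYes and depressedYes.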
library(Zelig)
FAZ=zlogit$new()
FAZ$zelig(attempt_suicide_binary ~ age + friends + virgin + depressed, data = foreveralone5)
summary(FAZ)
## Model:
##
## Call:
## FAZ$zelig(formula = attempt_suicide_binary ~ age + friends +
## virgin + depressed, data = foreveralone5)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.9526 -0.7444 -0.4824 -0.3203 2.4357
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.994399 0.713249 -2.796 0.00517
## age -0.003632 0.021633 -0.168 0.86667
## friends -0.040831 0.022994 -1.776 0.07578
## virginYes -0.527867 0.305075 -1.730 0.08358
## depressedYes 1.515713 0.359444 4.217 2.48e-05
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 443.92 on 468 degrees of freedom
## Residual deviance: 411.23 on 464 degrees of freedom
## AIC: 421.23
##
## Number of Fisher Scoring iterations: 7
##
## Next step: Use 'setx' method
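I fit a logit model with Zelig on my dataset.
The coefficients above are on the log-odds scale, each holding the other variables constant. Each additional year of age lowers the log-odds of a suicide attempt by 0.003632, each additional friend lowers them by 0.040831, being a virgin lowers them by 0.527867, and being depressed raises them by 1.515713. Only the depressed coefficient is statistically significant at the 0.05 level (p = 2.48e-05); friends (p = 0.076) and virgin (p = 0.084) are marginal, and age (p = 0.867) is not significant.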
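Log-odds are hard to read directly, so it can help to exponentiate them into odds ratios. The sketch below copies the estimates from the summary output by hand (the vector name coefs is my own) and uses only base R.

```r
# Estimates copied from the Zelig summary above (log-odds scale)
coefs <- c(
  age          = -0.003632,
  friends      = -0.040831,
  virginYes    = -0.527867,
  depressedYes =  1.515713
)

# exp() turns a log-odds coefficient into an odds ratio
odds_ratios <- exp(coefs)

# Percent change in the odds for a one-unit increase in each predictor
pct_change <- (odds_ratios - 1) * 100

print(round(odds_ratios, 3))
print(round(pct_change, 1))
```

An odds ratio above 1 means higher odds of a suicide attempt and below 1 means lower odds. On this scale, being depressed multiplies the odds by roughly 4.5, while each additional friend lowers them by about 4 percent.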
FAZ$setrange(friends = min(foreveralone5$friends) : max(foreveralone5$friends))
FAZ$sim()
FAZ$graph()
This simulation shows the expected probability of a suicide attempt across the observed range of friends, with all other variables set to their default values. The expected probability decreases as the number of friends increases, and it is highest for those with 0 friends.
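The shape of that curve can be approximated by hand from the point estimates using the inverse logit, plogis(). This is a rough sketch, not what Zelig does internally: Zelig simulates from the estimated coefficient distribution, and the values I hold fixed here (age 24, virgin = Yes, depressed = No, friends from 0 to 10) are assumptions chosen for illustration, not the defaults Zelig used.

```r
# Point estimates copied from the summary output above
b <- c(intercept = -1.994399, age = -0.003632, friends = -0.040831,
       virginYes = -0.527867, depressedYes = 1.515713)

# Assumed (hypothetical) values for the variables held constant
age_fixed    <- 24
virgin_yes   <- 1   # 1 = Yes, 0 = No
depressed_yes <- 0  # 1 = Yes, 0 = No
friends_grid <- 0:10

# Linear predictor on the log-odds scale for each value of friends
eta <- b[["intercept"]] + b[["age"]] * age_fixed +
  b[["friends"]] * friends_grid +
  b[["virginYes"]] * virgin_yes + b[["depressedYes"]] * depressed_yes

# The inverse logit converts log-odds to probabilities
p_attempt <- plogis(eta)

print(data.frame(friends = friends_grid, p_attempt = round(p_attempt, 3)))
```

Because the friends coefficient is negative, the probabilities fall as friends increases, which matches the direction of the simulated plot.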
FAZ$setrange(age = min(foreveralone5$age) : max(foreveralone5$age))
FAZ$sim()
FAZ$graph()
This simulation shows the expected probability of a suicide attempt across the observed range of age, with all other variables set to their default values. The expected probability decreases slightly as age increases, and the highest values are for ages 20-30.
FAZ=zlogit$new()
FAZ$zelig(attempt_suicide_binary ~ age + friends + virgin + depressed, data = foreveralone5)
FAZ$setx(virgin = "No")
FAZ$setx1(virgin = "Yes")
FAZ$sim()
par("mar")
## [1] 5.1 4.1 4.1 2.1
par(mar= c(1,1,1,1))
FAZ$graph()
This simulation shows the expected probability of a suicide attempt by virginity, with all other variables set to their default values. The expected probability is greater for those who are not virgins (about .30) than for those who are (about .20). The difference between the two groups is about .10.
FAZ=zlogit$new()
FAZ$zelig(attempt_suicide_binary ~ age + friends + virgin + depressed, data = foreveralone5)
FAZ$setx(depressed = "No")
FAZ$setx1(depressed = "Yes")
FAZ$sim()
par("mar")
## [1] 5.1 4.1 4.1 2.1
par(mar= c(1,1,1,1))
FAZ$graph()
This simulation shows the expected probability of a suicide attempt by depression, with all other variables set to their default values. The expected probability is greater for those who are depressed (about .20) than for those who are not (about .05). The difference between the two groups is about .15.
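The first difference that setx and setx1 produce can be roughly checked by hand from the point estimates. This sketch ignores the simulation uncertainty that Zelig reports, and the fixed values (age 24, 5 friends, virgin = Yes) are assumptions of mine rather than the defaults Zelig chose.

```r
# Point estimates copied from the summary output above
b <- c(intercept = -1.994399, age = -0.003632, friends = -0.040831,
       virginYes = -0.527867, depressedYes = 1.515713)

# Assumed (hypothetical) profile for the variables held constant
age_fixed     <- 24
friends_fixed <- 5
virgin_yes    <- 1  # 1 = Yes

# Log-odds for the shared profile, before adding the depression effect
eta_base <- b[["intercept"]] + b[["age"]] * age_fixed +
  b[["friends"]] * friends_fixed + b[["virginYes"]] * virgin_yes

p_not_depressed <- plogis(eta_base)                        # depressed = "No"
p_depressed     <- plogis(eta_base + b[["depressedYes"]])  # depressed = "Yes"

# First difference: change in probability when depressed goes from No to Yes
first_diff <- p_depressed - p_not_depressed

print(round(c(not_depressed = p_not_depressed,
              depressed = p_depressed,
              first_diff = first_diff), 3))
```

Under these assumed values the gap comes out in the same neighborhood as the simulated difference described above, though the exact numbers depend on the profile chosen.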
This interesting dataset lets us model the odds of a suicide attempt for the Reddit subgroup “foreveralone.” Overall, having more friends, being a virgin, and being older are associated with a lower chance of a suicide attempt, while being depressed is associated with a higher chance. If more online subgroups are analyzed, perhaps we can find more variables that lead to suicide attempts.