Load data
data(addiction, package = "catdata")
dtaA<-addiction
head(dtaA)
## ill gender age university
## 1 1 1 61 0
## 2 0 1 43 0
## 3 2 0 44 0
## 4 0 1 21 1
## 5 0 0 33 0
## 6 1 0 83 0
Create one indicator variable per response and define variables as factors
library(magrittr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
# One 0/1 indicator per value of ill: 0 = "weak", 1 = "disease", 2 = "both"
dtaA$Both <- ifelse(dtaA$ill == 2, "1", "0")
dtaA$disease <- ifelse(dtaA$ill == 1, "1", "0")
dtaA$weak <- ifelse(dtaA$ill == 0, "1", "0")
# Convert the predictors and the three indicators to factors
dtaA$gender <- as.factor(dtaA$gender)
dtaA$academic <- as.factor(dtaA$university)
dtaA$weak <- as.factor(dtaA$weak)
dtaA$disease <- as.factor(dtaA$disease)
dtaA$Both <- as.factor(dtaA$Both)
head(dtaA)
## ill gender age university Both disease weak academic
## 1 1 1 61 0 0 1 0 0
## 2 0 1 43 0 0 0 1 0
## 3 2 0 44 0 1 0 0 0
## 4 0 1 21 1 0 0 1 1
## 5 0 0 33 0 0 0 1 0
## 6 1 0 83 0 0 1 0 0
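The recoding above can be sanity-checked before any model is fitted. The sketch below is a minimal illustration on a made-up vector (the values of `ill` here are hypothetical, not the survey data): it confirms that every non-missing response sets exactly one of the three indicators, and that a missing `ill` stays missing rather than being silently coded as "0".

```r
# Hypothetical toy responses standing in for dtaA$ill (not the real data)
ill <- c(1, 0, 2, 0, NA, 1)

# Same recoding as in the document: one indicator per value of ill
weak    <- factor(ifelse(ill == 0, "1", "0"))
disease <- factor(ifelse(ill == 1, "1", "0"))
both    <- factor(ifelse(ill == 2, "1", "0"))

# Cross-tabulate the original variable against one indicator, keeping NA visible
print(table(ill, weak, useNA = "ifany"))

# Number of indicators equal to "1" per row: 1 for observed rows, NA when ill is missing
n_flags <- (weak == "1") + (disease == "1") + (both == "1")
print(n_flags)
```

On the real data, the same check would use `dtaA$ill`, `dtaA$weak`, `dtaA$disease` and `dtaA$Both` in place of the toy vectors.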
Three logistic regression models were fitted, one for each value of ill (0, 1, 2), each with gender (0 = male, 1 = female), age, and academic status (1 = academic, 0 = otherwise) as predictors.
The m1 output shows that the probability of labeling addiction as “weak” is lower for female respondents (odds ratio exp(-0.385) ≈ 0.68, about 32% lower odds than for men), for older respondents (odds ratio exp(-0.035) ≈ 0.97, about 3% lower odds per additional year of age), and for academics (odds ratio exp(-1.449) ≈ 0.23, about 77% lower odds than for non-academics). The coefficients in the table are on the log-odds scale, and each effect holds the other predictors fixed.
m1<-glm(weak~gender+age+academic, family = binomial(logit), data=dtaA)
summary(m1)
##
## Call:
## glm(formula = weak ~ gender + age + academic, family = binomial(logit),
## data = dtaA)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.3377 -0.8802 -0.5684 1.1237 2.7182
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.818028 0.245436 3.333 0.000859 ***
## gender1 -0.385307 0.180721 -2.132 0.033002 *
## age -0.034523 0.005856 -5.895 3.74e-09 ***
## academic1 -1.449038 0.245528 -5.902 3.60e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 827.03 on 681 degrees of freedom
## Residual deviance: 744.08 on 678 degrees of freedom
## (30 observations deleted due to missingness)
## AIC: 752.08
##
## Number of Fisher Scoring iterations: 4
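The estimates in the m1 coefficient table are log-odds, so they need to be exponentiated before they can be read as odds ratios. The sketch below does this by hand, using the estimates and standard errors copied from the table above; the Wald-type 95% interval (estimate ± 1.96 standard errors on the log-odds scale) is an assumption of this example, not something reported in the original output.

```r
# Estimates and standard errors copied from the summary(m1) coefficient table
est <- c(gender1 = -0.385307, age = -0.034523, academic1 = -1.449038)
se  <- c(gender1 =  0.180721, age =  0.005856, academic1 =  0.245528)

# Normal quantile for a two-sided 95% Wald interval (about 1.96)
z <- qnorm(0.975)

# Exponentiate the log-odds estimates and interval limits to get odds ratios
or_table <- data.frame(
  odds_ratio = exp(est),
  lower_95   = exp(est - z * se),
  upper_95   = exp(est + z * se)
)
print(round(or_table, 3))
```

With the fitted model in memory, `exp(coef(m1))` gives the same point estimates and `exp(confint.default(m1))` gives Wald intervals of the same kind. An odds ratio below 1 means lower odds of the “weak” label, and an interval that excludes 1 corresponds to a coefficient that is significant at the 5% level.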
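The m2 output shows that, at the 5% level, only academic status is significantly associated with labeling addiction as “disease”: the odds are about 2.9 times higher for academics than for non-academics (exp(1.075) ≈ 2.93). The positive coefficients for gender and age are not significant at that level (p = 0.064 and p = 0.089).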
m2<-glm(disease~gender+age+academic, family = binomial(logit), data=dtaA)
summary(m2)
##
## Call:
## glm(formula = disease ~ gender + age + academic, family = binomial(logit),
## data = dtaA)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.6399 -1.0087 -0.8759 1.2521 1.5125
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.895789 0.222383 -4.028 5.62e-05 ***
## gender1 0.297274 0.160650 1.850 0.0643 .
## age 0.007976 0.004686 1.702 0.0888 .
## academic1 1.074952 0.180018 5.971 2.35e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 941.48 on 681 degrees of freedom
## Residual deviance: 899.95 on 678 degrees of freedom
## (30 observations deleted due to missingness)
## AIC: 907.95
##
## Number of Fisher Scoring iterations: 4
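Odds ratios describe a change in odds, not in probability, so it can help to convert the m2 coefficients into predicted probabilities for concrete respondents. The sketch below does so with `plogis()`, the inverse logit, using the estimates copied from the table above. The two profiles (a 40-year-old man, non-academic versus academic) are hypothetical choices made only for illustration.

```r
# Estimates copied from the summary(m2) coefficient table
b <- c(intercept = -0.895789, gender1 = 0.297274,
       age = 0.007976, academic1 = 1.074952)

# Hypothetical profile: male (gender1 = 0), aged 40
age_years <- 40

# Linear predictors on the log-odds scale
eta_non_academic <- unname(b["intercept"] + b["age"] * age_years)
eta_academic     <- eta_non_academic + unname(b["academic1"])

# plogis() maps log-odds to probabilities: 1 / (1 + exp(-eta))
p_non_academic <- plogis(eta_non_academic)
p_academic     <- plogis(eta_academic)

print(round(c(non_academic = p_non_academic, academic = p_academic), 3))
```

For this profile the predicted probability of the “disease” label rises from roughly 0.36 to roughly 0.62. With the fitted model available, `predict(m2, newdata = ..., type = "response")` returns the same kind of quantity for any combination of predictors.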
The m3 output shows that the probability of labeling addiction as “Both” increases with age: each additional year of age multiplies the odds by exp(0.023) ≈ 1.02, that is, about 2% higher odds per year. Gender and academic status show no significant effect (p = 0.933 and p = 0.699).
m3<-glm(Both~ gender+age+academic, family = binomial(logit), data=dtaA)
summary(m3)
##
## Call:
## glm(formula = Both ~ gender + age + academic, family = binomial(logit),
## data = dtaA)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.1589 -0.7583 -0.6500 -0.5673 1.9242
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.977987 0.258269 -7.659 1.88e-14 ***
## gender1 0.015297 0.182855 0.084 0.933
## age 0.022885 0.005167 4.429 9.47e-06 ***
## academic1 -0.079499 0.205707 -0.386 0.699
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 756.98 on 681 degrees of freedom
## Residual deviance: 736.94 on 678 degrees of freedom
## (30 observations deleted due to missingness)
## AIC: 744.94
##
## Number of Fisher Scoring iterations: 4
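A per-year odds ratio close to 1 can look negligible even when the effect is clearly significant, as it is for age in m3. Because effects add on the log-odds scale, the coefficient can be rescaled to a more readable age difference. The sketch below uses the age estimate copied from the table above; the 10-year gap is an arbitrary choice for illustration.

```r
# Age estimate copied from the summary(m3) coefficient table (log-odds per year)
b_age <- 0.022885

# Odds ratio for one additional year of age
or_per_year <- exp(b_age)

# Odds ratio for a 10-year difference: multiply the coefficient before exponentiating
or_per_decade <- exp(10 * b_age)

print(round(c(per_year = or_per_year, per_decade = or_per_decade), 3))
```

Under this model, a respondent 10 years older has about 26% higher odds of choosing “Both” than an otherwise identical younger respondent.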