How closely an individual reports following COVID news
1: Very closely
0: Fairly/Not closely
| | Frequency | Percent | Cum Percent |
|---|---|---|---|
| fairly/not closely | 4023 | 35.04355 | 35.04355 |
| very closely | 7457 | 64.95645 | 100.00000 |
| Total | 11480 | 100.00000 | NA |
Whether the individual considers COVID-19 a crisis
1: crisis
0: not a crisis
| | Frequency | Percent | Cum Percent |
|---|---|---|---|
| considers it a crisis | 8707 | 75.84495 | 75.84495 |
| does not consider it a crisis | 2773 | 24.15505 | 100.00000 |
| Total | 11480 | 100.00000 | NA |
The number of days the individual felt anxious in the past week
0: 0-2 days
1: 3-7 days
| | Frequency | Percent | Cum Percent |
|---|---|---|---|
| 0 | 6226 | 54.23345 | 54.23345 |
| 1 | 5254 | 45.76655 | 100.00000 |
| Total | 11480 | 100.00000 | NA |
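Each frequency table above reports a count, a percentage, and a cumulative percentage per level. As a minimal sketch, the same columns can be rebuilt in base R from the counts alone. The helper name `freq_table` is hypothetical and is not part of the original analysis.

```r
# Hypothetical helper: rebuild Frequency / Percent / Cum Percent from counts.
freq_table <- function(counts) {
  pct <- 100 * counts / sum(counts)
  data.frame(
    Frequency   = counts,
    Percent     = pct,
    Cum.Percent = cumsum(pct),
    row.names   = names(counts)
  )
}

# Counts taken from the news-following table above.
covidfol_counts <- c("fairly/not closely" = 4023, "very closely" = 7457)
covidfol_freq <- freq_table(covidfol_counts)
print(round(covidfol_freq, 5))
```

The same call works for the crisis and anxiety counts by swapping in their frequencies.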
#### Bivariate graphs
#### Hypothesis testing
Ho: There is no relationship between how closely an individual follows the news surrounding COVID-19 and their anxiety levels.
Ha: There is a relationship between how closely an individual follows the news surrounding COVID-19 and their anxiety levels.
Explanatory: how closely an individual follows the news surrounding COVID-19 (covidfol with 2 levels)
Response: anxiety levels (anx with 2 levels)
| Test statistic | df | P value |
|---|---|---|
| 158.6 | 1 | 2.341e-36 * * * |
## mydata$covidfol
## mydata$anxlev fairly/not closely very closely
## 0-2 days 2503 3723
## 3-7 days 1520 3734
## mydata$covidfol
## mydata$anxlev fairly/not closely very closely
## 0-2 days 2181.812 4044.188
## 3-7 days 1841.188 3412.812
| | fairly/not closely | very closely |
|---|---|---|
| 0-2 days | 0.6222 | 0.4993 |
| 3-7 days | 0.3778 | 0.5007 |
## mydata$covidfol
## mydata$anxlev fairly/not closely very closely
## 0-2 days 0.5423345 0.5423345
## 3-7 days 0.4576655 0.4576655
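The test statistic, expected counts, and column proportions reported above can be reproduced from the observed counts alone. The sketch below assumes only base R; the object names `observed` and `test` are illustrative. Note that `chisq.test()` applies Yates' continuity correction to 2x2 tables by default.

```r
# Observed counts from the cross-tabulation above
# (rows: anxiety level, columns: how closely COVID news is followed).
observed <- matrix(
  c(2503, 3723,
    1520, 3734),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    anxlev   = c("0-2 days", "3-7 days"),
    covidfol = c("fairly/not closely", "very closely")
  )
)

# Chi-squared test of independence (Yates' correction is the 2x2 default).
test <- chisq.test(observed)

print(test$statistic)                    # chi-squared statistic
print(test$expected)                     # expected counts under independence
print(prop.table(observed, margin = 2))  # observed column proportions
```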
| Test statistic | df | P value |
|---|---|---|
| 427.9 | 1 | 4.736e-95 * * * |
## mydata$anxlev
## mydata$crisis 0-2 days 3-7 days
## considers it a crisis 4249 4458
## does not consider it a crisis 1977 796
## mydata$anxlev
## mydata$crisis 0-2 days 3-7 days
## considers it a crisis 4722.106 3984.894
## does not consider it a crisis 1503.894 1269.106
| | 0-2 days | 3-7 days |
|---|---|---|
| considers it a crisis | 0.6825 | 0.8485 |
| does not consider it a crisis | 0.3175 | 0.1515 |
## mydata$anxlev
## mydata$crisis 0-2 days 3-7 days
## considers it a crisis 0.7584495 0.7584495
## does not consider it a crisis 0.2415505 0.2415505
Does one’s perception of whether COVID-19 is a crisis moderate this relationship?
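Among respondents who consider COVID-19 a crisis:

Pearson’s Chi-squared test with Yates’ continuity correction: as.factor(x$anxlev) and x$covidfol

| Test statistic | df | P value |
|---|---|---|
| 50.36 | 1 | 1.281e-12 * * * |

| | fairly/not closely | very closely |
|---|---|---|
| 0-2 days | 0.5501 | 0.4646 |
| 3-7 days | 0.4499 | 0.5354 |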
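Among respondents who do not consider COVID-19 a crisis:

Pearson’s Chi-squared test with Yates’ continuity correction: as.factor(x$anxlev) and x$covidfol

| Test statistic | df | P value |
|---|---|---|
| 3.894 | 1 | 0.04847 * |

| | fairly/not closely | very closely |
|---|---|---|
| 0-2 days | 0.7274 | 0.6922 |
| 3-7 days | 0.2726 | 0.3078 |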
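The relationship remains significant in both groups: among respondents who consider COVID-19 a crisis (test statistic 50.36, p = 1.281e-12) and among those who do not (test statistic 3.894, p = 0.04847). It is much weaker in the latter group, where the share reporting 3-7 anxious days rises only from 0.2726 to 0.3078 with closer news following, compared with a rise from 0.4499 to 0.5354 among those who consider it a crisis.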
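The stratified tests can be sketched from aggregated counts. The counts below are an assumption in the sense that they are not printed as a table in this report: the "considers it a crisis" cells are taken from the class counts of the tree nodes reported later, and the "does not consider it a crisis" cells are obtained by subtracting them from the overall cross-tabulation. The object names are illustrative.

```r
# Counts of anxiety level by news-following, split by crisis perception.
# Dimensions: anxlev x covidfol x crisis (filled column-major).
counts <- array(
  c(1313, 1074, 2936, 3384,    # considers it a crisis
    1190,  446,  787,  350),   # does not consider it a crisis (by subtraction)
  dim = c(2, 2, 2),
  dimnames = list(
    anxlev   = c("0-2 days", "3-7 days"),
    covidfol = c("fairly/not closely", "very closely"),
    crisis   = c("considers it a crisis", "does not consider it a crisis")
  )
)

# One chi-squared test (Yates' correction by default) per crisis stratum.
strata <- dimnames(counts)$crisis
strata_tests <- lapply(strata, function(s) chisq.test(counts[, , s]))
names(strata_tests) <- strata

# Test statistic and p-value for each stratum.
print(sapply(strata_tests, function(t) {
  c(statistic = unname(t$statistic), p.value = t$p.value)
}))

# Column proportions within each stratum.
print(round(prop.table(counts[, , 1], margin = 2), 4))
print(round(prop.table(counts[, , 2], margin = 2), 4))
```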
#### Logistic regression
| | Estimate | Std. Error | z value | Pr(>\|z\|) |
|---|---|---|---|---|
| (Intercept) | -0.1704 | 0.03726 | -4.574 | 4.788e-06 |
| covidfolvery closely | 0.3009 | 0.0419 | 7.183 | 6.839e-13 |
| crisisdoes not consider it a crisis | -0.8673 | 0.04874 | -17.8 | 7.566e-71 |
(Dispersion parameter for binomial family taken to be 1 )
| Null deviance: | 15832 on 11479 degrees of freedom |
| Residual deviance: | 15339 on 11477 degrees of freedom |
| | OR | 2.5 % | 97.5 % |
|---|---|---|---|
| (Intercept) | 0.84 | 0.78 | 0.91 |
| covidfolvery closely | 1.35 | 1.24 | 1.47 |
| crisisdoes not consider it a crisis | 0.42 | 0.38 | 0.46 |
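The odds ratios are the exponentiated coefficients of the logistic model. As a rough sketch, the same additive model can be fitted to aggregated cell counts instead of the individual-level `mydata`. The cell counts are an assumption derived from the cross-tabulations and tree node counts reported in this document, and `cells`, `fit`, and `odds_ratios` are illustrative names.

```r
# One row per covidfol x crisis cell, with counts of anxious (3-7 days)
# and not anxious (0-2 days) respondents. Counts are derived, not printed
# as a table in the original report.
cells <- data.frame(
  covidfol = factor(rep(c("fairly/not closely", "very closely"), times = 2)),
  crisis   = factor(rep(c("considers it a crisis",
                          "does not consider it a crisis"), each = 2)),
  anxious     = c(1074, 3384,  446, 350),
  not_anxious = c(1313, 2936, 1190, 787)
)

# Grouped binomial logistic regression: successes and failures per cell.
fit <- glm(cbind(anxious, not_anxious) ~ covidfol + crisis,
           family = binomial, data = cells)

# Odds ratios are exp() of the log-odds coefficients.
odds_ratios <- exp(coef(fit))
print(round(coef(fit), 4))
print(round(odds_ratios, 2))
```

Read this way, following the news very closely multiplies the odds of 3-7 anxious days by about 1.35, holding crisis perception fixed, while not considering COVID-19 a crisis multiplies them by about 0.42.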
mytree=rpart(anx~covidfol+crisis, data=mydata, method="class", cp=0.01)
summary(mytree)
## Call:
## rpart(formula = anx ~ covidfol + crisis, data = mydata, method = "class",
## cp = 0.01)
## n= 11480
##
## CP nsplit rel error xerror xstd
## 1 0.04263418 0 1.0000000 1.0000000 0.01015988
## 2 0.01000000 2 0.9147316 0.9147316 0.01006060
##
## Variable importance
## crisis covidfol
## 89 11
##
## Node number 1: 11480 observations, complexity param=0.04263418
## predicted class=0 expected loss=0.4576655 P(node) =1
## class counts: 6226 5254
## probabilities: 0.542 0.458
## left son=2 (2773 obs) right son=3 (8707 obs)
## Primary splits:
## crisis splits as RL, improve=212.84880, (0 missing)
## covidfol splits as LR, improve= 78.95456, (0 missing)
##
## Node number 2: 2773 observations
## predicted class=0 expected loss=0.2870537 P(node) =0.2415505
## class counts: 1977 796
## probabilities: 0.713 0.287
##
## Node number 3: 8707 observations, complexity param=0.04263418
## predicted class=1 expected loss=0.4879982 P(node) =0.7584495
## class counts: 4249 4458
## probabilities: 0.488 0.512
## left son=6 (2387 obs) right son=7 (6320 obs)
## Primary splits:
## covidfol splits as LR, improve=25.33512, (0 missing)
##
## Node number 6: 2387 observations
## predicted class=0 expected loss=0.4499372 P(node) =0.2079268
## class counts: 1313 1074
## probabilities: 0.550 0.450
##
## Node number 7: 6320 observations
## predicted class=1 expected loss=0.464557 P(node) =0.5505226
## class counts: 2936 3384
## probabilities: 0.465 0.535
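rpart.plot(mytree, box.palette="RdBu", cex=0.55)
# predicted class for every observation in the training data
fit = predict(mytree,type="class")
# confusion matrix: rows are observed anx, columns are predicted class
cmatrix = table(mydata$anx,fit)
pander(cmatrix)
| | 0 | 1 |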
|---|---|---|
| 0 | 3290 | 2936 |
| 1 | 1870 | 3384 |
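# accuracy: correctly classified observations (diagonal) over all observations
accuracy <- sum(diag(cmatrix)) / sum(cmatrix)
# error rate: misclassified observations (off-diagonal) over all observations
error <- 1 - accuracy
pander(cbind(c("Accuracy","Error Rate"), c(round(accuracy,3),round(error,3))))
| Accuracy | 0.581 |
| Error Rate | 0.419 |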
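To put the accuracy in context, it helps to compare it with a no-information baseline that always predicts the majority class, and to separate it into sensitivity and specificity. The sketch below rebuilds the confusion matrix from the counts shown above; `baseline`, `sensitivity`, and `specificity` are illustrative names and these measures were not computed in the original analysis.

```r
# Confusion matrix from the output above: rows observed, columns predicted.
cmatrix <- matrix(
  c(3290, 2936,
    1870, 3384),
  nrow = 2, byrow = TRUE,
  dimnames = list(observed = c("0", "1"), predicted = c("0", "1"))
)

accuracy <- sum(diag(cmatrix)) / sum(cmatrix)

# Baseline: always predict the most common observed class.
baseline <- max(rowSums(cmatrix)) / sum(cmatrix)

# Sensitivity: share of observed 1s (3-7 anxious days) predicted as 1.
sensitivity <- cmatrix["1", "1"] / sum(cmatrix["1", ])

# Specificity: share of observed 0s (0-2 anxious days) predicted as 0.
specificity <- cmatrix["0", "0"] / sum(cmatrix["0", ])

print(round(c(accuracy = accuracy, baseline = baseline,
              sensitivity = sensitivity, specificity = specificity), 3))
```

On these counts the tree improves on the majority-class baseline by only a few percentage points, and the figures are computed on the training data rather than a held-out sample.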