How closely an individual reports following COVID news

1: Very closely

0: Fairly/Not closely

                    Frequency    Percent  Cum Percent
fairly/not closely       4023   35.04355     35.04355
very closely             7457   64.95645    100.00000
Total                   11480  100.00000           NA

If they consider it a crisis

1: crisis

0: not a crisis

                               Frequency    Percent  Cum Percent
considers it a crisis               8707   75.84495     75.84495
does not consider it a crisis       2773   24.15505    100.00000
Total                              11480  100.00000           NA

The number of days they felt anxious in the past week

0: 0-2 days

1: 3-7 days

       Frequency    Percent  Cum Percent
0           6226   54.23345     54.23345
1           5254   45.76655    100.00000
Total      11480  100.00000           NA
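
The three frequency tables above share one layout: counts, percentages, and cumulative percentages. A minimal sketch of that calculation in base R follows, using the counts reported for the news-following variable. The helper name `freq_table` is hypothetical and is not part of the original analysis, which appears to have used a table-formatting package.

```r
# Hypothetical helper: build a frequency table from a named vector of counts.
freq_table <- function(counts) {
  pct <- 100 * counts / sum(counts)
  data.frame(
    Frequency   = counts,
    Percent     = pct,
    Cum.Percent = cumsum(pct),
    row.names   = names(counts)
  )
}

# Counts copied from the news-following table above.
covidfol_counts <- c("fairly/not closely" = 4023, "very closely" = 7457)
ft <- freq_table(covidfol_counts)
print(ft)
```

The same helper could be applied to the crisis and anxiety counts by swapping in the corresponding named vector.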

Bivariate graphs

Hypothesis testing
H0: There is no relationship between how closely an individual follows the news surrounding COVID-19 and their anxiety levels.
Ha: There is a relationship between how closely an individual follows the news surrounding COVID-19 and their anxiety levels.

Explanatory: how closely an individual follows the news surrounding COVID-19 (covidfol with 2 levels)
Response: anxiety levels (anx with 2 levels)

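Pearson’s Chi-squared test with Yates’ continuity correction: mydata$anxlev and mydata$covidfol

Test statistic  df  P value
158.6           1   2.341e-36 * * *

Observed counts:
##              mydata$covidfol
## mydata$anxlev fairly/not closely very closely
##      0-2 days               2503         3723
##      3-7 days               1520         3734

Expected counts under independence:
##              mydata$covidfol
## mydata$anxlev fairly/not closely very closely
##      0-2 days           2181.812     4044.188
##      3-7 days           1841.188     3412.812

Column proportions (within each level of covidfol):
          fairly/not closely  very closely
0-2 days              0.6222        0.4993
3-7 days              0.3778        0.5007

Expected proportions under independence:
##              mydata$covidfol
## mydata$anxlev fairly/not closely very closely
##      0-2 days          0.5423345    0.5423345
##      3-7 days          0.4576655    0.4576655

With p < 0.001 we reject H0: 50.07% of those who follow the news very closely report 3-7 anxious days, compared with 37.78% of those who follow it fairly or not closely.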
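
The test above can be reproduced from the observed counts alone. The sketch below rebuilds the 2x2 table by hand (the raw `mydata` rows are not needed) and calls base R's `chisq.test()`; the object names `obs`, `res`, and `col_props` are chosen here for illustration.

```r
# Observed counts copied from the cross-tabulation above
# (rows: anxlev, columns: covidfol).
obs <- matrix(
  c(2503, 3723,
    1520, 3734),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    anxlev   = c("0-2 days", "3-7 days"),
    covidfol = c("fairly/not closely", "very closely")
  )
)

# For 2x2 tables chisq.test() applies Yates' continuity correction by default.
res <- chisq.test(obs)
res$statistic   # test statistic
res$parameter   # degrees of freedom
res$expected    # expected counts under independence

# Column proportions: share of each anxiety level within each news-following group.
col_props <- prop.table(obs, margin = 2)
round(col_props, 4)
```

Using `margin = 2` conditions on the explanatory variable (the columns here), which is the comparison the hypotheses are about.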
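Pearson’s Chi-squared test with Yates’ continuity correction: mydata$crisis and mydata$anxlev

Test statistic  df  P value
427.9           1   4.736e-95 * * *

Observed counts:
##                                mydata$anxlev
## mydata$crisis                   0-2 days 3-7 days
##   considers it a crisis             4249     4458
##   does not consider it a crisis     1977      796

Expected counts under independence:
##                                mydata$anxlev
## mydata$crisis                   0-2 days 3-7 days
##   considers it a crisis         4722.106 3984.894
##   does not consider it a crisis 1503.894 1269.106

Column proportions (within each level of anxlev):
                               0-2 days  3-7 days
considers it a crisis            0.6825    0.8485
does not consider it a crisis    0.3175    0.1515

Expected proportions under independence:
##                                mydata$anxlev
## mydata$crisis                    0-2 days  3-7 days
##   considers it a crisis         0.7584495 0.7584495
##   does not consider it a crisis 0.2415505 0.2415505

Crisis perception and anxiety are also significantly related (p < 0.001): 84.85% of those reporting 3-7 anxious days consider COVID-19 a crisis, compared with 68.25% of those reporting 0-2 days.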
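
To show what the continuity correction does, the statistic for the crisis-by-anxiety table can also be worked out by hand. This is a sketch of the standard Yates formula, not the code used in the original analysis; the names `obs`, `expected`, `chi_sq`, and `p_value` are illustrative.

```r
# Observed counts copied from the crisis-by-anxiety table above.
obs <- matrix(
  c(4249, 4458,
    1977,  796),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    crisis = c("considers it a crisis", "does not consider it a crisis"),
    anxlev = c("0-2 days", "3-7 days")
  )
)

# Expected counts under independence: (row total * column total) / grand total.
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Yates' correction shrinks each |observed - expected| by 0.5 before squaring.
chi_sq  <- sum((abs(obs - expected) - 0.5)^2 / expected)
p_value <- pchisq(chi_sq, df = 1, lower.tail = FALSE)

chi_sq
p_value
```

With cell counts this large the 0.5 adjustment changes the statistic only slightly; it matters more in small tables.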


Does one’s perception of whether COVID-19 is a crisis moderate this relationship?

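The relationship between following COVID-19 news and anxiety remains significant after adjusting for crisis perception: respondents who follow the news very closely have about 1.35 times the odds of reporting 3-7 anxious days (95% CI 1.245 to 1.467, p < 0.001). The model below contains main effects only, so it adjusts for crisis perception rather than testing moderation directly; a moderation test would add a covidfol-by-crisis interaction term.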

Logistic regression

                                     Estimate  Std. Error  z value   Pr(>|z|)
(Intercept)                           -0.1704     0.03726   -4.574  4.788e-06
covidfolvery closely                   0.3009     0.0419     7.183  6.839e-13
crisisdoes not consider it a crisis   -0.8673     0.04874  -17.8    7.566e-71

(Dispersion parameter for binomial family taken to be 1 )

Null deviance: 15832 on 11479 degrees of freedom
Residual deviance: 15339 on 11477 degrees of freedom

Odds ratios with 95% confidence intervals:

                                       OR   2.5 %  97.5 %
(Intercept)                          0.84  0.7839  0.9072
covidfolvery closely                 1.35  1.245   1.467
crisisdoes not consider it a crisis  0.42  0.3817  0.462
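
The sketch below refits the model from aggregated cell counts and converts the coefficients to odds ratios. Two assumptions are worth flagging. First, the eight cell counts are not printed anywhere as a single table; they are derived here from the anxlev-by-covidfol table, the crisis-by-anxlev table, and the tree node counts for nodes 6 and 7 (the "does not consider it a crisis" cells are obtained by subtraction). Second, `confint.default()` gives Wald intervals, whereas the intervals reported above come from profiling, so the bounds may differ slightly. The names `cells`, `fit_add`, `fit_int`, and `or_table` are illustrative.

```r
# Cell counts derived from the cross-tabulations and tree node counts in this
# report (an assumption of this sketch; the raw survey rows are not used).
cells <- data.frame(
  covidfol = factor(rep(c("fairly/not closely", "very closely"), times = 2)),
  crisis   = factor(rep(c("considers it a crisis",
                          "does not consider it a crisis"), each = 2)),
  anx1 = c(1074, 3384, 446, 350),   # reported anxiety on 3-7 days
  anx0 = c(1313, 2936, 1190, 787)   # reported anxiety on 0-2 days
)

# Main-effects (additive) model: adjusts for crisis perception.
fit_add <- glm(cbind(anx1, anx0) ~ covidfol + crisis,
               family = binomial, data = cells)

# Exponentiated coefficients are odds ratios; Wald intervals via confint.default().
or_table <- exp(cbind(OR = coef(fit_add), confint.default(fit_add)))
round(or_table, 3)

# Adding the interaction lets the covidfol effect differ by crisis perception,
# which is what a moderation question asks.
fit_int <- glm(cbind(anx1, anx0) ~ covidfol * crisis,
               family = binomial, data = cells)
anova(fit_add, fit_int, test = "Chisq")
```

Because the data are grouped, the deviances printed by this sketch will not match the null and residual deviances above, which were computed on the 11480 individual rows; the coefficients should agree.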
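
Classification tree

library(rpart)        # recursive partitioning (classification trees)
library(party)
library(rattle)
library(rpart.plot)   # rpart.plot() for drawing the fitted tree
library(RColorBrewer)

# Classification tree predicting anxiety (anx) from news following and crisis perception
mytree <- rpart(anx ~ covidfol + crisis, data = mydata, method = "class", cp = 0.01)
summary(mytree)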
## Call:
## rpart(formula = anx ~ covidfol + crisis, data = mydata, method = "class", 
##     cp = 0.01)
##   n= 11480 
## 
##           CP nsplit rel error    xerror       xstd
## 1 0.04263418      0 1.0000000 1.0000000 0.01015988
## 2 0.01000000      2 0.9147316 0.9147316 0.01006060
## 
## Variable importance
##   crisis covidfol 
##       89       11 
## 
## Node number 1: 11480 observations,    complexity param=0.04263418
##   predicted class=0  expected loss=0.4576655  P(node) =1
##     class counts:  6226  5254
##    probabilities: 0.542 0.458 
##   left son=2 (2773 obs) right son=3 (8707 obs)
##   Primary splits:
##       crisis   splits as  RL, improve=212.84880, (0 missing)
##       covidfol splits as  LR, improve= 78.95456, (0 missing)
## 
## Node number 2: 2773 observations
##   predicted class=0  expected loss=0.2870537  P(node) =0.2415505
##     class counts:  1977   796
##    probabilities: 0.713 0.287 
## 
## Node number 3: 8707 observations,    complexity param=0.04263418
##   predicted class=1  expected loss=0.4879982  P(node) =0.7584495
##     class counts:  4249  4458
##    probabilities: 0.488 0.512 
##   left son=6 (2387 obs) right son=7 (6320 obs)
##   Primary splits:
##       covidfol splits as  LR, improve=25.33512, (0 missing)
## 
## Node number 6: 2387 observations
##   predicted class=0  expected loss=0.4499372  P(node) =0.2079268
##     class counts:  1313  1074
##    probabilities: 0.550 0.450 
## 
## Node number 7: 6320 observations
##   predicted class=1  expected loss=0.464557  P(node) =0.5505226
##     class counts:  2936  3384
##    probabilities: 0.465 0.535
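
The `improve` values in the summary can be checked by hand. The sketch below assumes rpart's default Gini splitting for classification and computes the drop in size-weighted Gini impurity for the two candidate root splits; with the counts printed above it appears to reproduce the reported values. The helper names `weighted_gini` and `split_improve` are hypothetical.

```r
# Hypothetical helper: Gini impurity of a node scaled by its size, n * 2p(1 - p).
# Takes the class counts c(n0, n1) for a two-class node.
weighted_gini <- function(counts) {
  n <- sum(counts)
  p <- counts[2] / n
  n * 2 * p * (1 - p)
}

# Improvement of a split = parent impurity minus the children's impurity.
split_improve <- function(parent, left, right) {
  weighted_gini(parent) - weighted_gini(left) - weighted_gini(right)
}

# Class counts (anx = 0, anx = 1) copied from the summary output above.
root       <- c(6226, 5254)
not_crisis <- c(1977,  796)   # node 2
crisis     <- c(4249, 4458)   # node 3

improve_crisis <- split_improve(root, not_crisis, crisis)
improve_crisis

# Same calculation for the competing root split on covidfol,
# using the counts from the anxlev-by-covidfol table.
improve_covidfol <- split_improve(root, c(2503, 1520), c(3723, 3734))
improve_covidfol
```

The crisis split yields the larger improvement, which is consistent with it being chosen at the root and with crisis dominating the variable importance.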
rpart.plot(mytree, box.palette="RdBu", cex=0.55)

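# Predicted class for each respondent, cross-tabulated against observed anx
# (rows: observed, columns: predicted)
fit <- predict(mytree, type = "class")
cmatrix <- table(mydata$anx, fit)
cmatrix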
##    fit
##        0    1
##   0 3290 2936
##   1 1870 3384
# accuracy: correct predictions (diagonal) / total
(3290 + 3384) / 11480
## [1] 0.5813589
# error rate: misclassified (off-diagonal) / total
(2936 + 1870) / 11480
## [1] 0.4186411
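
To put the accuracy figure in context, the sketch below recomputes it from the confusion matrix and compares it with a baseline that always predicts the majority class. Sensitivity and specificity are included as an illustration; they were not reported in the original analysis. The matrix is rebuilt by hand from the output above so the block runs without `mydata`.

```r
# Confusion matrix copied from the output above
# (rows: observed anx, columns: predicted class).
cmatrix <- matrix(
  c(3290, 2936,
    1870, 3384),
  nrow = 2, byrow = TRUE,
  dimnames = list(observed = c("0", "1"), fit = c("0", "1"))
)

accuracy    <- sum(diag(cmatrix)) / sum(cmatrix)
error_rate  <- 1 - accuracy
sensitivity <- cmatrix["1", "1"] / sum(cmatrix["1", ])   # observed 1s classified as 1
specificity <- cmatrix["0", "0"] / sum(cmatrix["0", ])   # observed 0s classified as 0

# Baseline: always predict the majority class (anx = 0).
baseline <- max(rowSums(cmatrix)) / sum(cmatrix)

round(c(accuracy = accuracy, error_rate = error_rate,
        sensitivity = sensitivity, specificity = specificity,
        baseline = baseline), 4)
```

Comparing `accuracy` with `baseline` indicates how much the two predictors add beyond simply guessing the more common outcome.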