In the first few days of 2020, the novel coronavirus SARS-CoV-2, which causes the disease now known as COVID-19, was spreading around the world. There was little to no preparation, so it caught many individuals, and entire countries, by surprise.
This also led many to deny the claims or dismiss them as rumours, especially because the concerned parties did not have the necessary information to update world leaders and citizens.
While everyone continued with their day-to-day activities, the virus was silently spreading from location zero to other towns and cities, and even crossing borders into other countries. It made a home in its hosts (humans and, in some isolated cases, animals), creating new patient zeros at alarming rates.
Before the virus was given a name, or even genetically sequenced, identifying it was guesswork based on the symptoms that the infected presented, the most common being pneumonia-like symptoms. Unlike pneumonia, however, the coronavirus could wreak more havoc on vital organs within a very short time and lead to death.
As of 21 March 2021, according to Johns Hopkins University, there are:
all of which are global cases.
Given that this caught us all by surprise, the level of unpreparedness, or government response time, might have contributed to the death rate. In this short study, we look at a sample of 30 countries around the world to see whether their response time had an effect on the number of infections and deaths. The data was collected in the early days of the pandemic, between March and April 2020.
Install the necessary package(s) and load the library.
#install.packages("ggplot2")
library("ggplot2")
library("corrplot")
## corrplot 0.84 loaded
library("ggcorrplot")
#install.packages("ACSWR")
library("ACSWR")
Read in, and view the data.
covid <- read.csv(file.choose())
summary(covid)
## Country Population Cases Deaths
## Length:30 Min. : 341243 Min. : 212.0 Min. : 0.0
## Class :character 1st Qu.: 5058650 1st Qu.: 732.8 1st Qu.: 11.0
## Mode :character Median : 8871884 Median : 2252.0 Median : 32.5
## Mean :17854457 Mean :12713.6 Mean : 804.9
## 3rd Qu.:15748560 3rd Qu.:10324.0 3rd Qu.: 254.8
## Max. :83783942 Max. :97689.0 Max. :10779.0
## DeathRates Government_Integrity HDI Date.of.Infection
## Min. :0.001859 Min. :38.80 Min. :0.7910 Length:30
## 1st Qu.:0.006539 1st Qu.:55.10 1st Qu.:0.8600 Class :character
## Median :0.014822 Median :74.55 Median :0.8920 Mode :character
## Mean :0.024167 Mean :71.90 Mean :0.8891
## 3rd Qu.:0.032551 3rd Qu.:90.05 3rd Qu.:0.9287
## Max. :0.101937 Max. :96.10 Max. :0.9540
## Date_Alt Max Days_To_Prepare PopDeathRte
## Min. :-43898 Min. :-43854 Min. : 0.00 Min. :0.000000
## 1st Qu.:-43892 1st Qu.:-43854 1st Qu.:16.25 1st Qu.:0.002048
## Median :-43888 Median :-43854 Median :34.00 Median :0.005576
## Mean :-43882 Mean :-43854 Mean :28.40 Mean :0.021076
## 3rd Qu.:-43870 3rd Qu.:-43854 3rd Qu.:37.75 3rd Qu.:0.016674
## Max. :-43854 Max. :-43854 Max. :44.00 Max. :0.178278
View the column names.
names(covid)
## [1] "Country" "Population" "Cases"
## [4] "Deaths" "DeathRates" "Government_Integrity"
## [7] "HDI" "Date.of.Infection" "Date_Alt"
## [10] "Max" "Days_To_Prepare" "PopDeathRte"
Plot of the Death Rate based on the country preparedness / reaction time(s).
d_rate <- ggplot(covid, aes(Days_To_Prepare, PopDeathRte, label = Country)) +
  # colour and size are fixed values, so they are set outside aes()
  geom_text(color = "green", size = 3) +
  xlab("Days to Prepare") +
  ylab("Population Death Rate") +
  ggtitle("Covid Response Time Per Country") +
  theme_minimal()
d_rate
QUESTION: Is there any association between the number of days it took a country to take action against COVID-19 and the death rate? Well, let’s find out.
Null hypothesis (H0): there is no relationship between the country reaction time and the death rate. Alternative hypothesis (Ha): there is a relationship between the country reaction time and the death rate.
To get the number of days it took a country to enforce any measures, such as lockdown, mask wearing, social distancing among others, we took the difference between the day the mandate took effect / was announced, and the first day the SARS-CoV-2 was confirmed to be in the country.
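The date arithmetic described above can be sketched in R as follows. This is a minimal illustration only: the data frame, its column names (first_case, mandate) and the dates are hypothetical and are not taken from the data set.

```r
# Hypothetical dates for illustration only; not values from the data set.
dates <- data.frame(
  country    = c("A", "B"),
  first_case = as.Date(c("2020-01-25", "2020-03-01")),
  mandate    = as.Date(c("2020-03-10", "2020-03-15"))
)

# The difference between two Date vectors is a difftime; convert it to days.
dates$days_to_prepare <- as.numeric(
  difftime(dates$mandate, dates$first_case, units = "days")
)
dates
```

Storing the result as a plain number keeps it usable directly in cor.test() and lm().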
cor.test(covid$Days_To_Prepare, covid$PopDeathRte)
##
## Pearson's product-moment correlation
##
## data: covid$Days_To_Prepare and covid$PopDeathRte
## t = -2.9684, df = 28, p-value = 0.006073
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.7222229 -0.1565866
## sample estimates:
## cor
## -0.4892553
From the above output, we see a moderate negative correlation of about -0.49, with a p-value of 0.006073, which is less than the alpha level of 0.05; that is, r(28) = -0.49, p = 0.006. We therefore reject the null hypothesis that there is no relationship between the reaction time and the population death rate in the early days of COVID-19 in the sampled countries. Countries with more days to prepare tended to have lower population death rates, although a correlation alone does not establish that the extra time caused the lower rates.
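As a check on the test, the t statistic and p-value can be recomputed by hand from the reported correlation and sample size. The sketch below assumes only the two numbers printed by cor.test() above (r and df = 28, hence n = 30); the variable names are illustrative.

```r
# Values copied from the cor.test() output above.
r <- -0.4892553
n <- 30                      # 30 countries, so df = n - 2 = 28

# t statistic for testing H0: true correlation = 0
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

round(t_stat, 4)
p_value
p_value < 0.05               # TRUE means H0 is rejected at the 5% level
```

The recomputed t statistic agrees with the t = -2.9684 shown in the output, and the decision depends only on whether the p-value falls below alpha.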
Now that we have the correlation done, let’s perform a linear regression to see the results.
linearMod <- lm(covid$PopDeathRte ~ covid$Days_To_Prepare)
summary(linearMod)
##
## Call:
## lm(formula = covid$PopDeathRte ~ covid$Days_To_Prepare)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.052855 -0.011041 -0.005262 0.001314 0.127589
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.0603758 0.0147916 4.082 0.000337 ***
## covid$Days_To_Prepare -0.0013838 0.0004662 -2.968 0.006073 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03613 on 28 degrees of freedom
## Multiple R-squared: 0.2394, Adjusted R-squared: 0.2122
## F-statistic: 8.812 on 1 and 28 DF, p-value: 0.006073
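To make the fitted line concrete, the sketch below plugs the two coefficients from the output above into the regression equation. The helper predict_rate() is a hypothetical name introduced for illustration; with the fitted model available, predict(linearMod) would normally be used instead.

```r
# Coefficients copied from the summary(linearMod) output above.
intercept <- 0.0603758
slope     <- -0.0013838

# Fitted line: PopDeathRte = intercept + slope * Days_To_Prepare
predict_rate <- function(days) intercept + slope * days

predict_rate(c(0, 20, 44))

# For a regression with one predictor, R-squared equals the squared
# Pearson correlation reported by cor.test().
r <- -0.4892553
round(r^2, 4)
```

At 44 days, the largest value in the data, the fitted rate comes out slightly below zero, which a death rate cannot be. This is a reminder that the straight line is only a rough approximation near the edge of the observed range.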
Checking for linearity
par(mfrow = c(2, 2))  # plot() on an lm object draws four diagnostic plots
plot(linearMod)
par(mfrow = c(1, 1))  # back to a single panel for the scatter plot
scatter.smooth(x = covid$Days_To_Prepare, y = covid$PopDeathRte, main = "Reaction ~ Death Rate")
par(mfrow=c(1, 2)) # divide graph area in 2 columns
# boxplot.stats()$out holds the outlying values (not row numbers); collapse them into one subtitle string
boxplot(covid$Days_To_Prepare, main = "Days to Prepare", sub = paste("Outlier values:", paste(boxplot.stats(covid$Days_To_Prepare)$out, collapse = ", "))) # box plot for 'Days to Prepare'
boxplot(covid$PopDeathRte, main = "Population Death Rate", sub = paste("Outlier values:", paste(boxplot.stats(covid$PopDeathRte)$out, collapse = ", "))) # box plot for 'Population Death Rate'
covid_no_outliers <- covid[c(1:10, 12:18, 21:30), ]  # drop rows 11, 19 and 20
cor(covid_no_outliers$Days_To_Prepare, covid_no_outliers$PopDeathRte)
## [1] -0.3915138
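The rows dropped above were selected by position. One way such rows could be located programmatically is sketched below on toy data. The data frame toy and its values are invented for illustration, and this is not necessarily how the rows above were chosen.

```r
# Toy data for illustration only; not the COVID data set.
toy <- data.frame(id = 1:6, x = c(1, 2, 3, 4, 5, 100))

# boxplot.stats()$out returns the outlying values, not their positions.
out_values <- boxplot.stats(toy$x)$out

# which() recovers the row numbers of those values.
out_rows <- which(toy$x %in% out_values)
out_rows

# A logical filter drops the flagged rows and is safe when none are flagged.
toy_no_outliers <- toy[!(toy$x %in% out_values), ]
toy_no_outliers
```

Filtering with a logical condition avoids hard-coded row numbers, which silently become wrong if the input file is reordered.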
p <- ggplot(covid_no_outliers, aes(Days_To_Prepare, PopDeathRte, label = Country)) +
  # colour and size are fixed values, so they are set outside aes()
  geom_text(color = "green", size = 3) +
  xlab("Days to Prepare / React") +
  ylab("Population Death Rate") +
  ggtitle("Effect of COVID-19 Reaction Time on Population Death") +
  theme_minimal()
p
What is the effect of Days to prepare on Population Death Rate, controlling for Human Development Index (HDI) and population?
mod <- lm(PopDeathRte ~ Days_To_Prepare + HDI + Population, data = covid)
summary(mod)
##
## Call:
## lm(formula = PopDeathRte ~ Days_To_Prepare + HDI + Population,
## data = covid)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.064860 -0.012203 -0.003807 0.003080 0.118016
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.352e-02 1.638e-01 0.510 0.614
## Days_To_Prepare -9.777e-04 7.506e-04 -1.302 0.204
## HDI -4.748e-02 1.685e-01 -0.282 0.780
## Population 4.219e-10 4.265e-10 0.989 0.332
##
## Residual standard error: 0.03661 on 26 degrees of freedom
## Multiple R-squared: 0.2748, Adjusted R-squared: 0.1912
## F-statistic: 3.285 on 3 and 26 DF, p-value: 0.03657
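The estimated coefficient of Days_To_Prepare on Population Death Rate (PopDeathRte) is -9.777e-04, while that of HDI is -4.748e-02. Holding the other predictors constant, each additional day of preparation is associated with a 9.777e-04 decrease in the Population Death Rate, and each one-unit increase in HDI with a 4.748e-02 decrease. Neither coefficient is statistically significant at the 0.05 level (p = 0.204 and p = 0.780, respectively).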
What is the effect of Government Integrity on PopDeathRte, controlling for HDI?
mod2 <- lm(PopDeathRte ~ Government_Integrity + HDI, data = covid)
summary(mod2)
##
## Call:
## lm(formula = PopDeathRte ~ Government_Integrity + HDI, data = covid)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.036365 -0.022169 -0.008270 0.005279 0.147079
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.4298255 0.2559215 -1.680 0.1046
## Government_Integrity -0.0014328 0.0008768 -1.634 0.1139
## HDI 0.6230403 0.3489006 1.786 0.0854 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03989 on 27 degrees of freedom
## Multiple R-squared: 0.1059, Adjusted R-squared: 0.03972
## F-statistic: 1.6 on 2 and 27 DF, p-value: 0.2205
With the outliers removed, what is the effect of Government Integrity on Population Death Rate, controlling for HDI?
mod3 <- lm(PopDeathRte ~ Government_Integrity + HDI, data = covid_no_outliers)
mod3
##
## Call:
## lm(formula = PopDeathRte ~ Government_Integrity + HDI, data = covid_no_outliers)
##
## Coefficients:
## (Intercept) Government_Integrity HDI
## 0.0056911 0.0004551 -0.0310015
Next, we take the log transformation of population. What is the effect of the logged population, HDI and Government Integrity on the Population Death Rate?
mod_3 <- lm(PopDeathRte ~ log(Population) + HDI + Government_Integrity, data = covid_no_outliers)
summary(mod_3)
##
## Call:
## lm(formula = PopDeathRte ~ log(Population) + HDI + Government_Integrity,
## data = covid_no_outliers)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.018473 -0.006364 -0.000658 0.001182 0.025229
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.0220585 0.0881091 -0.250 0.805
## log(Population) 0.0022987 0.0018691 1.230 0.231
## HDI -0.0435049 0.1179734 -0.369 0.716
## Government_Integrity 0.0004898 0.0003027 1.618 0.119
##
## Residual standard error: 0.01226 on 23 degrees of freedom
## Multiple R-squared: 0.3123, Adjusted R-squared: 0.2226
## F-statistic: 3.481 on 3 and 23 DF, p-value: 0.03225
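Because Population enters this model on the natural-log scale, its coefficient is read per multiplicative change rather than per unit. The sketch below works this out from the coefficient printed above; effect_of_factor() is a hypothetical helper name, and the coefficient is not statistically significant (p = 0.231), so the figures are illustrative only.

```r
# Coefficient of log(Population) copied from the summary(mod_3) output above.
b_log_pop <- 0.0022987

# With a natural-log predictor, multiplying Population by a factor k
# shifts the fitted PopDeathRte by b_log_pop * log(k).
effect_of_factor <- function(k) b_log_pop * log(k)

effect_of_factor(1.01)  # a 1% larger population
effect_of_factor(2)     # a doubled population
```

Under this model, doubling the population corresponds to a rise of roughly 0.0016 in the fitted Population Death Rate, holding HDI and Government Integrity constant.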