What is your survival probability if you are infected by the new COVID-19? This is a complicated question because the answer depends on a large number of social and economic variables. However, you can get an approximate answer based on the country where you live.
Datasets about the new COVID-19 are generally available only at an aggregate level, with little detail, which prevents the construction of individual-level estimates. However, country-level data are easy to find. Based on this, we can calculate the rate at which infected people die each day as a consequence of COVID-19 complications.
The first step is to import the data on COVID-19 cases in several countries. For this, I'm using data made available by the Johns Hopkins University Center for Systems Science and Engineering. You can also browse this database at https://github.com/CSSEGISandData/COVID-19.
I use R for this process, with the packages listed below.
library(dplyr)
library(tidyr)
library(lubridate)
library(ggplot2)
library(tidyverse)
library(plotly)
The script below imports the data on confirmed COVID-19 cases for countries around the world.
# URL of the JHU CSSE time series of confirmed cases (one column per date)
data_address <- "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv"
# Read the wide table and reshape it to one row per location and date
cases <- read_csv(data_address) %>%
  rename(province = "Province/State", country_region = "Country/Region") %>%
  pivot_longer(-c(province, country_region, Lat, Long),
               names_to = "Date", values_to = "cumulative_cases")
## Parsed with column specification:
## cols(
## .default = col_double(),
## `Province/State` = col_character(),
## `Country/Region` = col_character()
## )
## See spec(...) for full column specifications.
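To make the reshaping step concrete, here is a minimal sketch on a made-up table. The object names `toy_wide` and `toy_long` and all the numbers are hypothetical; only the layout (one column per date) mirrors the JHU file.

```r
library(tidyr)
library(tibble)

# Hypothetical miniature of the JHU layout: one column per date
toy_wide <- tribble(
  ~country_region, ~`1/22/20`, ~`1/23/20`,
  "Italy",          0,          2,
  "Japan",          2,          2
)

# Turn every date column into a (Date, cumulative_cases) pair of columns
toy_long <- pivot_longer(toy_wide, -country_region,
                         names_to = "Date", values_to = "cumulative_cases")
print(toy_long)
```

Each location now contributes one row per date, which is the shape that the filtering and plotting steps work with.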
Now, I will choose some countries for this analysis (China, Italy, Brazil, Japan, Germany, Iran and France).
# Keep only the seven countries under study and the columns needed later
cases <- cases %>%
  filter(country_region %in% c("China", "Italy", "Brazil", "Japan",
                               "Germany", "Iran", "France")) %>%
  select(Date, cumulative_cases)
Now, I will import the database with the COVID-19 deaths information.
# URL of the JHU CSSE time series of deaths (same layout as the cases file)
deaths_address <- "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv"
# Reshape to long format, keep the seven countries, and rename Date to day
deaths <- read_csv(deaths_address) %>%
  rename(province = "Province/State", country = "Country/Region") %>%
  pivot_longer(-c(province, country, Lat, Long),
               names_to = "Date", values_to = "cumulative_deaths") %>%
  filter(country %in% c("China", "Italy", "Brazil", "Japan",
                        "Germany", "Iran", "France")) %>%
  rename(day = Date) %>%
  select(country, day, cumulative_deaths)
## Parsed with column specification:
## cols(
## .default = col_double(),
## `Province/State` = col_character(),
## `Country/Region` = col_character()
## )
## See spec(...) for full column specifications.
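Some countries, such as China, appear in the JHU files as several province-level rows. One possible refinement, which is not part of the script in this post, is to sum those rows to country totals before computing any ratio. The sketch below uses a made-up table; `toy` and `country_totals` are hypothetical names and the numbers are invented.

```r
library(dplyr)

# Hypothetical province-level rows for a single day (invented numbers)
toy <- tibble(
  country = c("China", "China", "Italy"),
  province = c("Hubei", "Beijing", NA),
  day = c("1/22/20", "1/22/20", "1/22/20"),
  cumulative_deaths = c(17, 0, 0)
)

# Collapse provinces so that each country has one row per day
country_totals <- toy %>%
  group_by(country, day) %>%
  summarise(cumulative_deaths = sum(cumulative_deaths), .groups = "drop")
print(country_totals)
```

With country totals, a province with very few cases can no longer dominate the minimum or maximum of a ratio computed afterwards.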
Now, I will join the two databases into a single table and compute the fatality rate (cumulative deaths divided by cumulative cases), stored in the column sacrifice.
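# Combine the two tables column-wise; this relies on both files listing
# the same locations and dates in the same order, so check it explicitly
data <- cbind(cases, deaths)
stopifnot(all(data$Date == data$day))
data <- data %>%
  mutate(Date_infection = mdy(Date),
         # Day index counted from the first date, so that time is chronological
         time = as.numeric(Date_infection - min(Date_infection)) + 1,
         sacrifice = ifelse(cumulative_cases == 0, 0, cumulative_deaths / cumulative_cases))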
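Because cbind pairs rows purely by position, a more defensive variant is to join the two tables on explicit keys. Note also that date strings such as "2/10/20" sort alphabetically before "2/9/20", so a day index is safest when derived from parsed dates. The following is only a sketch on invented numbers; `toy_cases`, `toy_deaths`, `toy` and the column `rate` are hypothetical names.

```r
library(dplyr)
library(lubridate)

# Hypothetical cumulative counts for two countries on two days
toy_cases <- tibble(
  country = c("Italy", "Italy", "Japan", "Japan"),
  Date = c("2/9/20", "2/10/20", "2/9/20", "2/10/20"),
  cumulative_cases = c(0, 100, 50, 80)
)
toy_deaths <- tibble(
  country = c("Italy", "Italy", "Japan", "Japan"),
  Date = c("2/9/20", "2/10/20", "2/9/20", "2/10/20"),
  cumulative_deaths = c(0, 5, 1, 2)
)

# Match rows by country and date instead of by position
toy <- inner_join(toy_cases, toy_deaths, by = c("country", "Date")) %>%
  mutate(Date_infection = mdy(Date),
         # Days since the first date in the table, starting at 1
         time = as.numeric(Date_infection - min(Date_infection)) + 1,
         # Deaths per confirmed case; defined as 0 while there are no cases
         rate = ifelse(cumulative_cases == 0, 0,
                       cumulative_deaths / cumulative_cases))
print(toy)
```

In this toy table, Italy on 2/10/20 has 5 deaths out of 100 cases, so its rate is 0.05.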
The database has information from 22 January 2020 to the current date. The descriptive statistics of the fatality rate (the sacrifice column) can be seen in the table below:
# Mean, maximum, minimum and standard deviation of the rate for each country
# (a named list replaces funs(), which is deprecated since dplyr 0.8.0)
dt <- data %>%
  group_by(country) %>%
  summarise_at(vars(sacrifice), list(mean = mean, max = max, min = min, sd = sd))
show(dt)
## # A tibble: 7 x 5
## country mean max min sd
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Brazil 0.00682 0.0437 0 0.0130
## 2 China 0.00974 1 0 0.0255
## 3 France 0.00495 0.111 0 0.0163
## 4 Germany 0.00222 0.0158 0 0.00401
## 5 Iran 0.0636 1 0 0.128
## 6 Italy 0.0421 0.123 0 0.0438
## 7 Japan 0.0163 0.0372 0 0.0131
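The table summarises the ratio over every date in the series. As a possible complement, which is my own suggestion rather than part of the analysis above, one could also look at the most recent value for each country. The sketch below assumes a table with the columns country, Date_infection and sacrifice; the data in `toy` are invented and `latest` is a hypothetical name.

```r
library(dplyr)

# Hypothetical daily rates for two countries (invented numbers)
toy <- tibble(
  country = c("A", "A", "B", "B"),
  Date_infection = as.Date(c("2020-03-01", "2020-03-02",
                             "2020-03-01", "2020-03-02")),
  sacrifice = c(0.01, 0.03, 0.00, 0.02)
)

# Keep, for each country, the row with the latest date
latest <- toy %>%
  group_by(country) %>%
  slice_max(Date_infection, n = 1, with_ties = FALSE) %>%
  ungroup()
print(latest)
```

Since the counts are cumulative, the latest ratio reflects the whole history up to that day, whereas the mean over dates gives a lot of weight to the early days with few cases.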
Now, I will plot the fatality rate by country with the script below. The size of each point grows with the number of confirmed cases in each country.
# Animated scatter plot: one frame per day, point size given by confirmed cases
ggplotly(
  ggplot(data = data, aes(x = time, y = sacrifice, group = cumulative_cases,
                          color = country, size = cumulative_cases)) +
    geom_point(aes(frame = time, ids = country)) +
    ylim(0, 0.25) + theme_classic() +
    scale_x_log10()
) %>%
  animation_opts(1000, easing = "elastic", redraw = FALSE)
## Warning: Ignoring unknown aesthetics: frame, ids
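For readers who cannot run the interactive animation, a static line chart is a simple alternative. This is only a sketch under the assumption that the table has the columns Date_infection, sacrifice and country; the data frame `toy` and the object `p` are hypothetical and the numbers are invented.

```r
library(ggplot2)

# Hypothetical rates for two countries over three days (invented numbers)
toy <- data.frame(
  country = rep(c("A", "B"), each = 3),
  Date_infection = rep(as.Date(c("2020-03-01", "2020-03-02", "2020-03-03")), 2),
  sacrifice = c(0.00, 0.01, 0.03, 0.00, 0.02, 0.02)
)

# One line per country, with calendar dates on the x axis
p <- ggplot(toy, aes(x = Date_infection, y = sacrifice, color = country)) +
  geom_line() +
  labs(x = "Date", y = "Deaths / confirmed cases") +
  theme_classic()
print(p)
```

A calendar-date axis makes it easier to see when each country's rate starts to rise.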
If you live in Italy or Iran, you have a higher probability of dying from complications of the new COVID-19. For now, living in Brazil, Germany or France is a more comfortable option.