Objective
Explain the objective of the original data visualisation and the targeted audience.
The chosen data visualisation was originally used by former US President Donald Trump to demonstrate how well his administration was performing in combating the COVID-19 pandemic. The figure plotted was the cumulative number of tests, as opposed to the number of tests per day. The visualisation was later reviewed and updated by the online news publication “The Conversation”, which overlaid the number of tests per day. The original was created to make it look as though the President and his administration were performing well, by using a part of the data that did not actually explain how the country was performing. It was aimed at the American public and was purposefully misleading.
The changes made by The Conversation show how misleading the graph is, in an attempt to show their readers the importance of correct data handling. However, whilst their point is made, the updated graph still doesn’t tell the full story of how the country was actually handling the pandemic.
The visualisation chosen had the following three main issues:
Issue 1: Using cumulative totals is misleading: a running total can only increase, so it makes the administration look as though it was performing significantly more tests than it actually was. Overlaying the daily test figures does illustrate that the data is misleading, but this extra information adds little to the story. Adding new cases, or the proportion of tests that returned a positive result, would paint a fuller picture of the mishandling of the pandemic by the US President.
Issue 2: The use of a bar chart for this information is messy and doesn’t help to demonstrate any meaningful insight. Changing the data being plotted would help, as would plotting it as a line graph or plotting the figures side by side, either of which would give a more accurate picture of the US’s performance.
Issue 3: The scale on the graph is too broad and leaves little room for comparison, especially for the smaller numbers. Again, the point of the graph is to show the administration’s choice to mislead the public, but this has only been done visually, without any way to discern the actual numbers as opposed to just comparing the sizes of the bars.
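To make Issue 1 concrete, the short R sketch below shows why a cumulative series keeps climbing even when daily testing falls, and how the share of positive results can be derived alongside it. The numbers are made up purely for illustration and are not taken from the dataset; the variable names are hypothetical.

```r
# Made-up illustrative figures, NOT real testing data
daily_tests    <- c(100, 120, 90, 80, 60)   # tests carried out each day
daily_positive <- c(10, 15, 14, 16, 15)     # positive results each day

# A running total can never decrease, even though daily tests fall after day 2
cumulative_tests <- cumsum(daily_tests)

# Differencing the running total recovers the daily series
recovered_daily <- diff(c(0, cumulative_tests))

# Share of each day's tests that came back positive
positive_share <- daily_positive / daily_tests

print(cumulative_tests)
print(round(positive_share, 3))
```

In this toy example the cumulative line rises every day while daily testing drops and the positive share grows, which is the kind of contrast a cumulative-only chart hides.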
Reference
The following code was used to fix the issues identified in the original.
library(ggplot2)
library(magrittr)
library(dplyr)
library(readr)
library(here)
library(lubridate)
getwd()
## [1] "/Users/andreasmccarthy/Downloads/Andreas_McCarthy_s3903983_Assignment_2"
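# Load the OWID COVID-19 dataset from the working directory
covid_df <- read.csv("owid-covid-data.csv")
covid_df %>% View()
# Keep only the rows for the United States and summarise every column
covid_df_usa <- covid_df %>% filter(location == "United States")
covid_df_usa %>% summary()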
## iso_code continent location date
## Length:556 Length:556 Length:556 Length:556
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## total_cases new_cases new_cases_smoothed total_deaths
## Min. : 1 Min. : 0 Min. : 0 Min. : 1
## 1st Qu.: 1975100 1st Qu.: 22833 1st Qu.: 22559 1st Qu.:131928
## Median : 8695830 Median : 44704 Median : 43984 Median :246038
## Mean :14644412 Mean : 62965 Mean : 62955 Mean :310759
## 3rd Qu.:29414694 3rd Qu.: 71218 3rd Qu.: 67426 3rd Qu.:544012
## Max. :34945468 Max. :300462 Max. :251085 Max. :613013
## NA's :1 NA's :6 NA's :38
## new_deaths new_deaths_smoothed total_cases_per_million
## Min. : 0.0 Min. : 0.0 Min. : 0
## 1st Qu.: 498.2 1st Qu.: 561.9 1st Qu.: 5967
## Median : 925.0 Median : 844.5 Median : 26271
## Mean :1183.4 Mean :1111.8 Mean : 44243
## 3rd Qu.:1559.5 3rd Qu.:1584.9 3rd Qu.: 88865
## Max. :4460.0 Max. :3425.3 Max. :105575
## NA's :38 NA's :6
## new_cases_per_million new_cases_smoothed_per_million total_deaths_per_million
## Min. : 0.00 Min. : 0.00 Min. : 0.003
## 1st Qu.: 68.98 1st Qu.: 68.15 1st Qu.: 398.573
## Median :135.06 Median :132.88 Median : 743.309
## Mean :190.22 Mean :190.20 Mean : 938.842
## 3rd Qu.:215.16 3rd Qu.:203.70 3rd Qu.:1643.528
## Max. :907.73 Max. :758.56 Max. :1851.988
## NA's :1 NA's :6 NA's :38
## new_deaths_per_million new_deaths_smoothed_per_million reproduction_rate
## Min. : 0.000 Min. : 0.000 Min. :0.710
## 1st Qu.: 1.506 1st Qu.: 1.697 1st Qu.:0.910
## Median : 2.795 Median : 2.551 Median :1.020
## Mean : 3.575 Mean : 3.359 Mean :1.137
## 3rd Qu.: 4.712 3rd Qu.: 4.788 3rd Qu.:1.140
## Max. :13.474 Max. :10.348 Max. :3.660
## NA's :38 NA's :6 NA's :45
## icu_patients icu_patients_per_million hosp_patients
## Min. : 3522 Min. :10.64 Min. : 12218
## 1st Qu.: 8140 1st Qu.:24.59 1st Qu.: 28083
## Median : 9797 Median :29.60 Median : 37754
## Mean :12752 Mean :38.53 Mean : 50177
## 3rd Qu.:16072 3rd Qu.:48.56 3rd Qu.: 64028
## Max. :28889 Max. :87.28 Max. :133214
## NA's :181 NA's :181 NA's :181
## hosp_patients_per_million weekly_icu_admissions
## Min. : 36.91 Min. : NA
## 1st Qu.: 84.84 1st Qu.: NA
## Median :114.06 Median : NA
## Mean :151.59 Mean :NaN
## 3rd Qu.:193.44 3rd Qu.: NA
## Max. :402.46 Max. : NA
## NA's :181 NA's :556
## weekly_icu_admissions_per_million weekly_hosp_admissions
## Min. : NA Min. : 13380
## 1st Qu.: NA 1st Qu.: 27490
## Median : NA Median : 35324
## Mean :NaN Mean : 47290
## 3rd Qu.: NA 3rd Qu.: 60438
## Max. : NA Max. :116323
## NA's :556 NA's :504
## weekly_hosp_admissions_per_million new_tests total_tests
## Min. : 40.42 Min. : 348 Min. : 348
## 1st Qu.: 83.05 1st Qu.: 560500 1st Qu.: 42919712
## Median :106.72 Median : 905975 Median :173529102
## Mean :142.87 Mean : 946038 Mean :208601568
## 3rd Qu.:182.59 3rd Qu.:1346530 3rd Qu.:371744417
## Max. :351.43 Max. :2319417 Max. :486263680
## NA's :504 NA's :42 NA's :42
## total_tests_per_thousand new_tests_per_thousand new_tests_smoothed
## Min. : 0.001 Min. :0.001 Min. : 1174
## 1st Qu.: 129.666 1st Qu.:1.693 1st Qu.: 612525
## Median : 524.253 Median :2.737 Median : 941992
## Mean : 630.211 Mean :2.858 Mean : 956538
## 3rd Qu.:1123.086 3rd Qu.:4.068 3rd Qu.:1265032
## Max. :1469.063 Max. :7.007 Max. :1910118
## NA's :42 NA's :42 NA's :49
## new_tests_smoothed_per_thousand positive_rate tests_per_case
## Min. :0.004 Min. :0.01800 Min. : 5.20
## 1st Qu.:1.851 1st Qu.:0.04400 1st Qu.:10.30
## Median :2.846 Median :0.05700 Median :17.50
## Mean :2.890 Mean :0.07264 Mean :18.82
## 3rd Qu.:3.821 3rd Qu.:0.09700 3rd Qu.:22.70
## Max. :5.771 Max. :0.19400 Max. :55.60
## NA's :49 NA's :61 NA's :61
## tests_units total_vaccinations people_vaccinated
## Length:556 Min. : 556208 Min. : 556208
## Class :character 1st Qu.: 65390299 1st Qu.: 45237143
## Mode :character Median :200299982 Median :127743096
## Mean :185152483 Mean :109766165
## 3rd Qu.:302548582 3rd Qu.:171310738
## Max. :344071595 Max. :189945907
## NA's :350 NA's :351
## people_fully_vaccinated new_vaccinations new_vaccinations_smoothed
## Min. : 1342086 Min. : 57909 Min. : 57909
## 1st Qu.: 30231520 1st Qu.: 888684 1st Qu.: 833990
## Median : 91175995 Median :1562682 Median :1401674
## Mean : 86426805 Mean :1698027 Mean :1547772
## 3rd Qu.:141839391 3rd Qu.:2289034 3rd Qu.:2194483
## Max. :163868916 Max. :4629928 Max. :3384387
## NA's :365 NA's :362 NA's :335
## total_vaccinations_per_hundred people_vaccinated_per_hundred
## Min. : 0.17 Min. : 0.17
## 1st Qu.: 19.55 1st Qu.:13.53
## Median : 59.89 Median :38.20
## Mean : 55.36 Mean :32.82
## 3rd Qu.: 90.46 3rd Qu.:51.22
## Max. :102.88 Max. :56.79
## NA's :350 NA's :351
## people_fully_vaccinated_per_hundred new_vaccinations_smoothed_per_million
## Min. : 0.40 Min. : 173
## 1st Qu.: 9.04 1st Qu.: 2494
## Median :27.26 Median : 4191
## Mean :25.84 Mean : 4628
## 3rd Qu.:42.41 3rd Qu.: 6562
## Max. :49.00 Max. :10120
## NA's :365 NA's :335
## stringency_index population population_density median_age
## Min. : 0.00 Min. :3.31e+08 Min. :35.61 Min. :38.3
## 1st Qu.:56.94 1st Qu.:3.31e+08 1st Qu.:35.61 1st Qu.:38.3
## Median :67.13 Median :3.31e+08 Median :35.61 Median :38.3
## Mean :59.76 Mean :3.31e+08 Mean :35.61 Mean :38.3
## 3rd Qu.:71.76 3rd Qu.:3.31e+08 3rd Qu.:35.61 3rd Qu.:38.3
## Max. :75.46 Max. :3.31e+08 Max. :35.61 Max. :38.3
## NA's :4
## aged_65_older aged_70_older gdp_per_capita extreme_poverty
## Min. :15.41 Min. :9.732 Min. :54225 Min. :1.2
## 1st Qu.:15.41 1st Qu.:9.732 1st Qu.:54225 1st Qu.:1.2
## Median :15.41 Median :9.732 Median :54225 Median :1.2
## Mean :15.41 Mean :9.732 Mean :54225 Mean :1.2
## 3rd Qu.:15.41 3rd Qu.:9.732 3rd Qu.:54225 3rd Qu.:1.2
## Max. :15.41 Max. :9.732 Max. :54225 Max. :1.2
##
## cardiovasc_death_rate diabetes_prevalence female_smokers male_smokers
## Min. :151.1 Min. :10.79 Min. :19.1 Min. :24.6
## 1st Qu.:151.1 1st Qu.:10.79 1st Qu.:19.1 1st Qu.:24.6
## Median :151.1 Median :10.79 Median :19.1 Median :24.6
## Mean :151.1 Mean :10.79 Mean :19.1 Mean :24.6
## 3rd Qu.:151.1 3rd Qu.:10.79 3rd Qu.:19.1 3rd Qu.:24.6
## Max. :151.1 Max. :10.79 Max. :19.1 Max. :24.6
##
## handwashing_facilities hospital_beds_per_thousand life_expectancy
## Min. : NA Min. :2.77 Min. :78.86
## 1st Qu.: NA 1st Qu.:2.77 1st Qu.:78.86
## Median : NA Median :2.77 Median :78.86
## Mean :NaN Mean :2.77 Mean :78.86
## 3rd Qu.: NA 3rd Qu.:2.77 3rd Qu.:78.86
## Max. : NA Max. :2.77 Max. :78.86
## NA's :556
## human_development_index excess_mortality
## Min. :0.926 Min. : 0.750
## 1st Qu.:0.926 1st Qu.: 6.525
## Median :0.926 Median :17.835
## Mean :0.926 Mean :19.922
## 3rd Qu.:0.926 3rd Qu.:28.183
## Max. :0.926 Max. :51.560
## NA's :484
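# Restrict the data to 5 March 2020 - 5 April 2020
# (date is stored as character, as shown in the summary above)
covid_usa_march_df <-
  covid_df_usa %>%
  filter(date >= as.Date("2020-03-05") & date <= as.Date("2020-04-05"))
covid_usa_march_df %>% View()
# Plot daily tests as bars, filled by each day's positive rate
covid_plot_fix <- ggplot(covid_usa_march_df, aes(x = date, y = new_tests, fill = factor(positive_rate))) +
  geom_bar(stat = "identity", position = position_stack(reverse = FALSE)) +
  scale_fill_discrete(name = "Positive Rate") +
  xlab("05/03/2020-05/04/2020") + ylab("COVID-19 Tests") +
  # Hide the individual date labels on the x axis
  scale_x_discrete(breaks = NULL)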
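As a possible alternative that follows the suggestions in Issues 2 and 3, the sketch below draws daily tests as a line with a labelled date axis and comma-formatted counts, so the actual numbers can be read off. It is only an illustration, not the code used for the reconstruction: `toy_df` is a hypothetical stand-in for `covid_usa_march_df` with made-up values, and it assumes `date` has been converted with `as.Date()`.

```r
library(ggplot2)

# Hypothetical stand-in for covid_usa_march_df; the values are made up
toy_df <- data.frame(
  date          = as.Date("2020-03-05") + 0:4,
  new_tests     = c(1000, 2500, 4000, 9000, 15000),
  positive_rate = c(0.05, 0.08, 0.12, 0.15, 0.19)
)

line_sketch <- ggplot(toy_df, aes(x = date, y = new_tests)) +
  geom_line() +
  # Colour each day's point by its positive rate on a continuous scale
  geom_point(aes(colour = positive_rate)) +
  # Show readable dates and comma-separated counts on the axes
  scale_x_date(date_labels = "%d %b") +
  scale_y_continuous(labels = scales::comma) +
  labs(x = "Date (2020)", y = "COVID-19 tests per day", colour = "Positive rate")

print(line_sketch)
```

Treating the positive rate as a continuous colour rather than a factor keeps the legend to a single gradient instead of one entry per distinct value.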
Data Reference
The following plot fixes the main issues in the original.