Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: The Conversation (2020).


Objective

Explain the objective of the original data visualisation and the targeted audience.

The chosen data visualisation was initially used by former US President Donald Trump to demonstrate how well his administration was performing in combating the COVID-19 pandemic. The figure he used was cumulative tests rather than daily tests. The online news publication “The Conversation” later reviewed and updated the visualisation by overlaying the number of tests per day. The original was created to make the President and his administration appear to be performing well, using a part of the data that did not actually explain how the country was performing. It was aimed at the American public and was purposefully misleading.

The changes made by The Conversation show how misleading the graph is and are an attempt to show their readers the importance of correct data handling. However, whilst their point is made, the updated visualisation does not tell the full story of how the country was actually handling the pandemic.

The visualisation chosen had the following three main issues:

Issue 1: Using cumulative totals is misleading, as it makes the administration look like it performed significantly more tests than it actually had. The addition of daily tests over the top does illustrate that the data is misleading, but this extra information does not tell much more of a story. Adding new cases, or the proportion of positive results per test, would paint a fuller picture of the mishandling of the pandemic by the US President.

Issue 2: The use of a bar chart to show this information is messy and does not help demonstrate any meaningful insight. Changing the data being used, plotting it as a line graph, or plotting the figures side by side would give a more accurate picture of the US's performance.

Issue 3: The scale on the graph is too broad and leaves little room for comparison, especially for the smaller numbers. Again, the point of the graph is to show the administration's choice to mislead the public, but this is done only visually, with no way to discern the actual numbers beyond comparing the size of the bars.
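
To make Issue 1 concrete, the following is a minimal sketch in R using made-up numbers (the daily counts below are hypothetical and are not taken from the data used in this assignment). It shows how a cumulative total keeps climbing even when daily testing is falling, and how the share of positive results can tell a different story.

```r
# Hypothetical daily test counts, for illustration only (not real data)
daily_tests <- c(100, 200, 400, 300, 200, 100)

# A cumulative total can never decrease, so it hides the decline after day 3
cumulative_tests <- cumsum(daily_tests)

# Hypothetical daily positive results, used to derive the share of positive tests
daily_positive <- c(10, 30, 80, 75, 60, 35)
positive_rate <- daily_positive / daily_tests

# Side-by-side view: the cumulative column always grows, the daily column does not
print(data.frame(
  day = seq_along(daily_tests),
  daily_tests,
  cumulative_tests,
  positive_rate
))
```

In this toy example the cumulative column rises every day while daily tests fall from day 4 onwards and the positive rate keeps increasing, which is the kind of detail a cumulative bar chart conceals.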

Reference

Code

The following code was used to fix the issues identified in the original.

library(ggplot2)  
library(magrittr) 
library(dplyr)    
library(readr)    
library(here)
library(lubridate)

getwd()
## [1] "/Users/andreasmccarthy/Downloads/Andreas_McCarthy_s3903983_Assignment_2"
covid_df <- read.csv("owid-covid-data.csv")
covid_df %>% View()
covid_df_usa <- covid_df %>% filter(location == "United States")
covid_df_usa %>% summary()
##    iso_code          continent           location             date          
##  Length:556         Length:556         Length:556         Length:556        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   total_cases         new_cases      new_cases_smoothed  total_deaths   
##  Min.   :       1   Min.   :     0   Min.   :     0     Min.   :     1  
##  1st Qu.: 1975100   1st Qu.: 22833   1st Qu.: 22559     1st Qu.:131928  
##  Median : 8695830   Median : 44704   Median : 43984     Median :246038  
##  Mean   :14644412   Mean   : 62965   Mean   : 62955     Mean   :310759  
##  3rd Qu.:29414694   3rd Qu.: 71218   3rd Qu.: 67426     3rd Qu.:544012  
##  Max.   :34945468   Max.   :300462   Max.   :251085     Max.   :613013  
##                     NA's   :1        NA's   :6          NA's   :38      
##    new_deaths     new_deaths_smoothed total_cases_per_million
##  Min.   :   0.0   Min.   :   0.0      Min.   :     0         
##  1st Qu.: 498.2   1st Qu.: 561.9      1st Qu.:  5967         
##  Median : 925.0   Median : 844.5      Median : 26271         
##  Mean   :1183.4   Mean   :1111.8      Mean   : 44243         
##  3rd Qu.:1559.5   3rd Qu.:1584.9      3rd Qu.: 88865         
##  Max.   :4460.0   Max.   :3425.3      Max.   :105575         
##  NA's   :38       NA's   :6                                  
##  new_cases_per_million new_cases_smoothed_per_million total_deaths_per_million
##  Min.   :  0.00        Min.   :  0.00                 Min.   :   0.003        
##  1st Qu.: 68.98        1st Qu.: 68.15                 1st Qu.: 398.573        
##  Median :135.06        Median :132.88                 Median : 743.309        
##  Mean   :190.22        Mean   :190.20                 Mean   : 938.842        
##  3rd Qu.:215.16        3rd Qu.:203.70                 3rd Qu.:1643.528        
##  Max.   :907.73        Max.   :758.56                 Max.   :1851.988        
##  NA's   :1             NA's   :6                      NA's   :38              
##  new_deaths_per_million new_deaths_smoothed_per_million reproduction_rate
##  Min.   : 0.000         Min.   : 0.000                  Min.   :0.710    
##  1st Qu.: 1.506         1st Qu.: 1.697                  1st Qu.:0.910    
##  Median : 2.795         Median : 2.551                  Median :1.020    
##  Mean   : 3.575         Mean   : 3.359                  Mean   :1.137    
##  3rd Qu.: 4.712         3rd Qu.: 4.788                  3rd Qu.:1.140    
##  Max.   :13.474         Max.   :10.348                  Max.   :3.660    
##  NA's   :38             NA's   :6                       NA's   :45       
##   icu_patients   icu_patients_per_million hosp_patients   
##  Min.   : 3522   Min.   :10.64            Min.   : 12218  
##  1st Qu.: 8140   1st Qu.:24.59            1st Qu.: 28083  
##  Median : 9797   Median :29.60            Median : 37754  
##  Mean   :12752   Mean   :38.53            Mean   : 50177  
##  3rd Qu.:16072   3rd Qu.:48.56            3rd Qu.: 64028  
##  Max.   :28889   Max.   :87.28            Max.   :133214  
##  NA's   :181     NA's   :181              NA's   :181     
##  hosp_patients_per_million weekly_icu_admissions
##  Min.   : 36.91            Min.   : NA          
##  1st Qu.: 84.84            1st Qu.: NA          
##  Median :114.06            Median : NA          
##  Mean   :151.59            Mean   :NaN          
##  3rd Qu.:193.44            3rd Qu.: NA          
##  Max.   :402.46            Max.   : NA          
##  NA's   :181               NA's   :556          
##  weekly_icu_admissions_per_million weekly_hosp_admissions
##  Min.   : NA                       Min.   : 13380        
##  1st Qu.: NA                       1st Qu.: 27490        
##  Median : NA                       Median : 35324        
##  Mean   :NaN                       Mean   : 47290        
##  3rd Qu.: NA                       3rd Qu.: 60438        
##  Max.   : NA                       Max.   :116323        
##  NA's   :556                       NA's   :504           
##  weekly_hosp_admissions_per_million   new_tests        total_tests       
##  Min.   : 40.42                     Min.   :    348   Min.   :      348  
##  1st Qu.: 83.05                     1st Qu.: 560500   1st Qu.: 42919712  
##  Median :106.72                     Median : 905975   Median :173529102  
##  Mean   :142.87                     Mean   : 946038   Mean   :208601568  
##  3rd Qu.:182.59                     3rd Qu.:1346530   3rd Qu.:371744417  
##  Max.   :351.43                     Max.   :2319417   Max.   :486263680  
##  NA's   :504                        NA's   :42        NA's   :42         
##  total_tests_per_thousand new_tests_per_thousand new_tests_smoothed
##  Min.   :   0.001         Min.   :0.001          Min.   :   1174   
##  1st Qu.: 129.666         1st Qu.:1.693          1st Qu.: 612525   
##  Median : 524.253         Median :2.737          Median : 941992   
##  Mean   : 630.211         Mean   :2.858          Mean   : 956538   
##  3rd Qu.:1123.086         3rd Qu.:4.068          3rd Qu.:1265032   
##  Max.   :1469.063         Max.   :7.007          Max.   :1910118   
##  NA's   :42               NA's   :42             NA's   :49        
##  new_tests_smoothed_per_thousand positive_rate     tests_per_case 
##  Min.   :0.004                   Min.   :0.01800   Min.   : 5.20  
##  1st Qu.:1.851                   1st Qu.:0.04400   1st Qu.:10.30  
##  Median :2.846                   Median :0.05700   Median :17.50  
##  Mean   :2.890                   Mean   :0.07264   Mean   :18.82  
##  3rd Qu.:3.821                   3rd Qu.:0.09700   3rd Qu.:22.70  
##  Max.   :5.771                   Max.   :0.19400   Max.   :55.60  
##  NA's   :49                      NA's   :61        NA's   :61     
##  tests_units        total_vaccinations  people_vaccinated  
##  Length:556         Min.   :   556208   Min.   :   556208  
##  Class :character   1st Qu.: 65390299   1st Qu.: 45237143  
##  Mode  :character   Median :200299982   Median :127743096  
##                     Mean   :185152483   Mean   :109766165  
##                     3rd Qu.:302548582   3rd Qu.:171310738  
##                     Max.   :344071595   Max.   :189945907  
##                     NA's   :350         NA's   :351        
##  people_fully_vaccinated new_vaccinations  new_vaccinations_smoothed
##  Min.   :  1342086       Min.   :  57909   Min.   :  57909          
##  1st Qu.: 30231520       1st Qu.: 888684   1st Qu.: 833990          
##  Median : 91175995       Median :1562682   Median :1401674          
##  Mean   : 86426805       Mean   :1698027   Mean   :1547772          
##  3rd Qu.:141839391       3rd Qu.:2289034   3rd Qu.:2194483          
##  Max.   :163868916       Max.   :4629928   Max.   :3384387          
##  NA's   :365             NA's   :362       NA's   :335              
##  total_vaccinations_per_hundred people_vaccinated_per_hundred
##  Min.   :  0.17                 Min.   : 0.17                
##  1st Qu.: 19.55                 1st Qu.:13.53                
##  Median : 59.89                 Median :38.20                
##  Mean   : 55.36                 Mean   :32.82                
##  3rd Qu.: 90.46                 3rd Qu.:51.22                
##  Max.   :102.88                 Max.   :56.79                
##  NA's   :350                    NA's   :351                  
##  people_fully_vaccinated_per_hundred new_vaccinations_smoothed_per_million
##  Min.   : 0.40                       Min.   :  173                        
##  1st Qu.: 9.04                       1st Qu.: 2494                        
##  Median :27.26                       Median : 4191                        
##  Mean   :25.84                       Mean   : 4628                        
##  3rd Qu.:42.41                       3rd Qu.: 6562                        
##  Max.   :49.00                       Max.   :10120                        
##  NA's   :365                         NA's   :335                          
##  stringency_index   population       population_density   median_age  
##  Min.   : 0.00    Min.   :3.31e+08   Min.   :35.61      Min.   :38.3  
##  1st Qu.:56.94    1st Qu.:3.31e+08   1st Qu.:35.61      1st Qu.:38.3  
##  Median :67.13    Median :3.31e+08   Median :35.61      Median :38.3  
##  Mean   :59.76    Mean   :3.31e+08   Mean   :35.61      Mean   :38.3  
##  3rd Qu.:71.76    3rd Qu.:3.31e+08   3rd Qu.:35.61      3rd Qu.:38.3  
##  Max.   :75.46    Max.   :3.31e+08   Max.   :35.61      Max.   :38.3  
##  NA's   :4                                                            
##  aged_65_older   aged_70_older   gdp_per_capita  extreme_poverty
##  Min.   :15.41   Min.   :9.732   Min.   :54225   Min.   :1.2    
##  1st Qu.:15.41   1st Qu.:9.732   1st Qu.:54225   1st Qu.:1.2    
##  Median :15.41   Median :9.732   Median :54225   Median :1.2    
##  Mean   :15.41   Mean   :9.732   Mean   :54225   Mean   :1.2    
##  3rd Qu.:15.41   3rd Qu.:9.732   3rd Qu.:54225   3rd Qu.:1.2    
##  Max.   :15.41   Max.   :9.732   Max.   :54225   Max.   :1.2    
##                                                                 
##  cardiovasc_death_rate diabetes_prevalence female_smokers  male_smokers 
##  Min.   :151.1         Min.   :10.79       Min.   :19.1   Min.   :24.6  
##  1st Qu.:151.1         1st Qu.:10.79       1st Qu.:19.1   1st Qu.:24.6  
##  Median :151.1         Median :10.79       Median :19.1   Median :24.6  
##  Mean   :151.1         Mean   :10.79       Mean   :19.1   Mean   :24.6  
##  3rd Qu.:151.1         3rd Qu.:10.79       3rd Qu.:19.1   3rd Qu.:24.6  
##  Max.   :151.1         Max.   :10.79       Max.   :19.1   Max.   :24.6  
##                                                                         
##  handwashing_facilities hospital_beds_per_thousand life_expectancy
##  Min.   : NA            Min.   :2.77               Min.   :78.86  
##  1st Qu.: NA            1st Qu.:2.77               1st Qu.:78.86  
##  Median : NA            Median :2.77               Median :78.86  
##  Mean   :NaN            Mean   :2.77               Mean   :78.86  
##  3rd Qu.: NA            3rd Qu.:2.77               3rd Qu.:78.86  
##  Max.   : NA            Max.   :2.77               Max.   :78.86  
##  NA's   :556                                                      
##  human_development_index excess_mortality
##  Min.   :0.926           Min.   : 0.750  
##  1st Qu.:0.926           1st Qu.: 6.525  
##  Median :0.926           Median :17.835  
##  Mean   :0.926           Mean   :19.922  
##  3rd Qu.:0.926           3rd Qu.:28.183  
##  Max.   :0.926           Max.   :51.560  
##                          NA's   :484
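# Keep only 5 March to 5 April 2020; `date` is read in as character, so convert it for the comparison
covid_usa_march_df <-
  covid_df_usa %>%
  filter(as.Date(date) >= as.Date("2020-03-05"),
         as.Date(date) <= as.Date("2020-04-05"))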

covid_usa_march_df %>% View()
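# Bars of daily tests, with each bar filled according to that day's positive rate
covid_plot_fix <- ggplot(covid_usa_march_df, aes(x = date, y = new_tests, fill = factor(positive_rate))) +
  geom_col(position = position_stack(reverse = FALSE)) +
  scale_fill_discrete(name = "Positive Rate") +
  xlab("05/03/2020-05/04/2020") + ylab("COVID-19 Tests") +
  scale_x_discrete(breaks = NULL)  # hide per-day tick labels; the axis title gives the date range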
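
Issue 2 mentions a line graph as a possible alternative to bars. The sketch below is one way that could look; it is not the plot used in this reconstruction. It assumes the ggplot2 package is installed and uses a small hypothetical data frame (`toy_df`, with made-up values) whose column names mirror `date`, `new_tests` and `positive_rate` from the data set above.

```r
library(ggplot2)

# Hypothetical rows, for illustration only; values are made up
toy_df <- data.frame(
  date = as.Date("2020-03-05") + 0:4,
  new_tests = c(1000, 2500, 4000, 8000, 15000),
  positive_rate = c(0.05, 0.08, 0.10, 0.15, 0.19)
)

# Line for daily tests, with points coloured on a continuous positive-rate scale
line_sketch <- ggplot(toy_df, aes(x = date, y = new_tests)) +
  geom_line() +
  geom_point(aes(colour = positive_rate)) +
  scale_x_date(date_labels = "%d %b") +
  scale_colour_gradient(name = "Positive rate", low = "grey70", high = "red") +
  labs(x = "Date", y = "Daily COVID-19 tests")

print(line_sketch)
```

Treating the date as a real `Date` and the positive rate as a continuous colour scale keeps the legend short and lets the reader read values off both axes, which may help with the scale concern raised in Issue 3.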

Data Reference

Reconstruction

The following plot fixes the main issues in the original.