Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
The objective of the original visualization is to show where successful space flight launches have occurred around the world since the year 2000, for the audience of reddit.com/r/dataisbeautiful. This subreddit is a general data visualization forum where users post visualizations and infographics. Since anyone can post, there is a heavy emphasis on interesting and aesthetically pleasing visualizations, but the overall quality is mixed and many users are untrained in spotting issues. The audience in this context is therefore quite general: a mix of data visualization practitioners, enthusiasts who care mostly about aesthetics, and general reddit users who may find the content interesting because of its aesthetics or the implications of its message. The author has chosen to use a proportional symbols map to communicate ‘Space Flights Launches since 2000’.
The visualization chosen had the following three main issues:
The author's choice of a proportional symbols map allows the data to be visualized succinctly, but it cannot communicate precisely where a launch occurred, nor does it allow the viewer to compare how many launches occurred at a given location. Since the objective is to communicate where launches have occurred, this can only be conveyed in an extremely general sense because of the size of the circles and the lack of any labels or legend. The data points give the viewer no indication of where a launch took place and rely purely on the viewer to deduce the location, which, along with the following mistakes, diminishes the usefulness of the visualization.
There are a number of apparent data errors in the original visualization, which make it misleading or incorrect and so again fail to communicate the location of successful space flight launches. In addition to the errors of scale detailed in the above point, data appears to be missing. For example, there have been a number of successful space launches by the company Rocket Lab from New Zealand which are not shown in the original visualization. There was also one successful launch into space from Alcântara, Brazil, which is likewise missing. This is disrespectful to Rocket Lab and the Brazilian Space Agency and is misleading to the general audience of reddit.
The visualization lacks a clear objective. The title of the visualization is ‘Space Flights Launches since 2000’; however, what actually appears to be shown is the location of successful space flight launches since 2000. There is a disconnect between what is shown in the visualization and what is proposed by the title of the author's post. Also, because of the inaccuracy of the scales and the way some circles appear to overlap perfectly, it is unclear what is defined as a location. Surely two launches from the same facility with different launchpads, or from the same geographical region, could be classed as the same place. This results in a failure of the trifecta checkup, as it is unclear what the data is trying to say and what question is being answered.
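To make the question of what counts as a location concrete, here is a minimal sketch in R. The location strings are made up in a "pad, facility, region, country" style, and the `Pad` and `Facility` column names are assumptions for illustration rather than anything taken from the original post. It shows how the same records yield a different number of places depending on whether launches are counted per launchpad or per facility.

```r
library(dplyr)
library(tidyr)

# Hypothetical launch records; these strings are illustrative only
launches <- tibble(
  Location = c(
    "LC-39A, Kennedy Space Center, Florida, USA",
    "LC-39B, Kennedy Space Center, Florida, USA",
    "LC-39A, Kennedy Space Center, Florida, USA",
    "Rocket Lab LC-1A, Mahia Peninsula, New Zealand"
  )
)

# Keep only the first two comma-separated pieces: pad and facility
parts <- launches %>%
  separate(Location, into = c("Pad", "Facility"), sep = ", ",
           extra = "drop", remove = FALSE)

# One row per launchpad versus one row per facility
by_pad      <- parts %>% count(Pad, Facility, name = "launches")
by_facility <- parts %>% count(Facility, name = "launches")
```

With these toy records, `by_pad` has three rows and `by_facility` has two, so a map drawn at pad level would stack two nearly identical circles on the same facility.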
Reference
[u/stagOverflow]. (2020, September). [OC] Successful space missions since 2000. Retrieved September 10, 2020 from https://www.reddit.com/r/dataisbeautiful/comments/iphhct/oc_successful_space_missions_since_2000/.
[Agirlcoding]. (2020, August). All Space Missions from 1957 Deep Dive in the Space Race, Version 6. Retrieved September 10, 2020 from https://www.kaggle.com/agirlcoding/all-space-missions-from-1957/metadata.
The following code was used to fix the issues identified in the original.
library(tidyr)
library(dplyr)
library(readr)
library(knitr)
library(ggplot2)
library(forcats)
space <- read_csv("Space_Corrected.csv")
# Split the date string to obtain the year
space <- space %>% separate(Datum, into = c("Date", "Dates"), sep = ",", remove = FALSE)
space <- space %>% separate(Dates, into = c("dummy", "Year", "Time", "Zone"), sep = " ", remove = TRUE)
#Separate location into different columns
space <- space %>% separate(Location, into = c("PadB", "Facility", "CountryB", "Country2"), sep = ",", remove = FALSE)
space$Year <- as.numeric(space$Year)
# Keep launches from the year 2000 onwards
spaceFilt <- space %>% filter(Year >= 2000)
# Recode facilities to more succinct location names
spaceFilt <- spaceFilt %>% mutate (Facility2 = case_when(
# Labels match the short names assigned further below so each site gets one bar
PadB == "Xichang Satellite Launch Center" ~ "Xichang",
PadB == "Svobodny Cosmodrome" ~ " Svobodny",
PadB == "Taiyuan Satellite Launch Center" ~ "Taiyuan",
PadB == "Tai Rui Barge" ~ "Yellow Sea",
PadB == "Uchinoura Space Center" ~ "Uchinoura",
PadB == "Jiuquan Satellite Launch Center" ~ "Jiuquan",
CountryB == " New Zealand" ~ " Mahia, New Zealand",
Country2 == " Brazil" ~ " Alcântara, Brazil",
CountryB == " French Guiana" ~ " Guiana, French Guiana",
CountryB == " Israel" ~ " Palmachim, Israel",
Facility == " Shahrud Missile Test Site" ~ "Shahrud, Iran",
Facility == " Semnan Space Center" ~ " Semnan, Iran",
))
# Recode countries into broader regions for faceting
spaceFilt <- spaceFilt %>% mutate (Country2 = case_when(
Location == " Xichang Satellite Launch Center" ~ " China",
Location == " Taiyuan Satellite Launch Center" ~ " China",
Location == " Svobodny Cosmodrome" ~ " Russia",
Facility == " China" ~ " China",
Facility == " Japan" ~ " Japan",
Facility == " Russia" ~ " Eastern Europe",
Facility == " Shahrud Missile Test Site" ~ " Middle East",
Facility == " Semnan Space Center" ~ " Middle East",
Facility == " Yellow Sea" ~ " Offshore",
Facility == " Ronald Reagan Ballistic Missile Defense Test Site" ~ " Pacific",
Country2 == " USA" ~ " N.America",
Country2 == " Brazil" ~ " S.America",
CountryB == " China" ~ " China",
CountryB == " Japan" ~ " Japan",
CountryB == " Algeria" ~ " Algeria",
CountryB == " Kazakhstan" ~ " Eastern Europe",
CountryB == " New Zealand" ~ " Pacific",
CountryB == " Russia" ~ " Eastern Europe",
CountryB == " French Guiana" ~ " S.America",
CountryB == " Iran" ~ " Middle East",
CountryB == " India" ~ " India",
CountryB == " Israel" ~ " Middle East",
CountryB == " Australia" ~ " Australia",
CountryB == " New Mexico" ~ " N.America",
CountryB == " Kenya" ~ " Kenya",
CountryB == " Gran Canaria" ~ " Gran Canaria",
CountryB == " Pacific Missile Range Facility" ~ " Pacific",
CountryB == " Barents Sea" ~ " Offshore" ,
CountryB == " Maranhão" ~ " Maranhão" ,
CountryB == " North Korea" ~ " Korea" ,
CountryB == " Pacific Ocean" ~ " Pacific" ,
CountryB == " South Korea" ~ " Korea" ,
CountryB == " Texas" ~ " USA" ,
CountryB == " Virginia" ~ " USA" ,
CountryB == " California" ~ " USA" ,
CountryB == " Marshall Islands" ~ " Pacific"
))
# Prefer the recoded label and fall back to the raw facility name
spaceFilt <- spaceFilt %>% mutate(Facility3 = coalesce(Facility2, Facility))
# Further recoding for succinctness
spaceFilt$Facility3[spaceFilt$Facility3 == ' Sohae Satellite Launching Station'] <- ' Sohae, North Korea'
spaceFilt$Facility3[spaceFilt$Facility3 == ' Tonghae Satellite Launching Ground'] <- ' Tonghae, North Korea'
spaceFilt$Facility3[spaceFilt$Facility3 == ' Naro Space Center'] <- 'Naro, South Korea'
spaceFilt$Facility3[spaceFilt$Facility3 == ' Barents Sea Launch Area'] <- 'Barents Sea'
spaceFilt$Facility3[spaceFilt$Facility3 == ' Kiritimati Launch Area'] <- 'Kiritimati, Kiribati'
spaceFilt$Facility3[spaceFilt$Facility3 == ' Kauai'] <- 'Kauai, Hawaii'
spaceFilt$Facility3[spaceFilt$Facility3 == " Ronald Reagan Ballistic Missile Defense Test Site"] <- " Marshall Islands"
spaceFilt$Facility3[spaceFilt$Facility3 == " M\u0101hia Peninsula"] <- " Mahia, New Zealand"
spaceFilt$Facility3[spaceFilt$Country2 == " Brazil"] <- " Alcântara, Brazil"
spaceFilt$Facility2[spaceFilt$Country2 == " Brazil"] <- " Alcântara, Brazil"
spaceFilt$Facility3[spaceFilt$Facility == " Yasny Cosmodrome"] <- "Yasny, Russia"
spaceFilt$Facility3[spaceFilt$Facility == " Plesetsk Cosmodrome"] <- "Plesetsk, Russia"
spaceFilt$Facility3[spaceFilt$Facility3 == " Svobodny"] <- "Svobodny, Russia"
spaceFilt$Facility3[spaceFilt$Facility == " Vostochny Cosmodrome"] <- "Vostochny, Russia"
spaceFilt$Facility3[spaceFilt$Facility == " Baikonur Cosmodrome"] <- "Baikonur, Kazakhstan"
spaceFilt$Facility3[spaceFilt$Facility3 == " Taiyuan Satellite Launch Center"] <- "Taiyuan"
spaceFilt$Facility3[spaceFilt$Facility3 == " Wenchang Satellite Launch Center"] <- "Wenchang"
spaceFilt$Facility3[spaceFilt$Facility3 == " Xichang Satellite Launch Center"] <- "Xichang"
spaceFilt$Facility3[spaceFilt$Facility3 == " Jiuquan Satellite Launch Center"] <- "Jiuquan"
spaceFilt$Facility3[spaceFilt$Facility == " Tanegashima Space Center"] <- "Tanegashima"
spaceFilt$Facility3[spaceFilt$Facility == " Uchinoura Space Center"] <- "Uchinoura"
spaceFilt$Facility3[spaceFilt$Facility == " Satish Dhawan Space Centre"] <- "Satish Dhawan"
spaceFilt$Facility3[spaceFilt$Facility3 == " Cape Canaveral"] <- "Cape Canaveral"
spaceFilt$Facility3[spaceFilt$Facility3 == " Pacific Spaceport Complex"] <- "Pacific Spaceport"
spaceFilt$Facility3[spaceFilt$Facility3 == " Wallops Flight Facility"] <- "Wallops"
spaceFilt$Facility3[spaceFilt$Facility3 == " Mojave Air and Space Port"] <- "Mojave"
# Obtain classes for faceting
countryClasses <- spaceFilt %>% distinct(Facility3,.keep_all = TRUE)
# Get count of all launches by location
facilityCount <- as.data.frame(table(spaceFilt$Facility3))
facilityCount <- facilityCount %>% arrange(Freq)
facilityCount <- facilityCount %>% mutate(Facility3 = fct_reorder(Var1, desc(Freq)))
facilityCount <- left_join(facilityCount, countryClasses, by = 'Facility3' )
facilityCount <- facilityCount %>% mutate(Facility = fct_reorder(Var1, Freq))
facilityCount <- facilityCount %>% arrange(Freq)
# Factorise Classes for location and country for better faceting
countryCount <- as.data.frame(table(spaceFilt$Country2))
countryCount <- countryCount %>% arrange(Freq)
countryCount <- countryCount %>% mutate(Country = fct_reorder(Var1, desc(Freq)))
countryClasses <- levels(countryCount$Country)
facilityCount$Country2 <- factor(facilityCount$Country2, levels = countryClasses, ordered = TRUE)
facilityCount <- facilityCount %>% mutate(Country3 = as.factor(Country2))
t <- ggplot(data = facilityCount, aes(y = Facility, x = Freq)) +
facet_grid(Country2 ~ ., switch = 'y', scales = "free", space = "free") +
geom_bar(stat = 'identity', width = 0.7, fill= 'skyblue4') +
theme(strip.placement = "outside",
axis.text.x = element_text(angle = 0, vjust = 0.5, hjust=1),
strip.text.y.left = element_text(angle = 0, size = 10, colour = 'white'),
panel.spacing = unit(0.5, "lines"),
plot.title = element_text(hjust = 0.5, size = 15, face = 'bold'),
axis.title.x=element_text(size=11,colour="black"),
axis.title.y=element_text(size=13,colour="black"),
panel.grid.major.x = element_line(colour = "grey50", linetype = "dashed"),
panel.background = element_rect(fill = NA),
# axis.text.y=element_text(size=11, colour="black"),
# axis.text.x=element_text(size=11,colour="black")
strip.text.y = element_text(
size = 20, color = "Black", face = "bold"),
strip.background = element_rect( color = 'white',fill='skyblue3', size=0.1),
strip.text = element_text(vjust=0.95)
) +
coord_cartesian(xlim = c(10, max(facilityCount$Freq))) +
xlab("\n Number of Successful Space Launches") +
ylab("Location of Launches") +
ggtitle("Number of Successful Space Launches since 2000 from Different Locations, Grouped by Region\n")
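The script as shown filters on the year only, while the plot labels refer to successful launches. The following is a minimal sketch of how the year and a mission outcome could be filtered together before counting. The toy rows, the `Status` column and its "Success" value are all assumptions for illustration; the actual column name and values in `Space_Corrected.csv` should be checked before adapting this.

```r
library(dplyr)

# Toy rows; column names and values are assumptions, not the real data
toy <- tibble(
  Datum    = c("Fri Aug 07, 2020 05:12 UTC", "Sat Dec 04, 1999 10:00 UTC",
               "Tue Mar 01, 2016", "Wed May 10, 2017 01:00 UTC"),
  Facility = c("Cape Canaveral", "Cape Canaveral", "Baikonur", "Baikonur"),
  Status   = c("Success", "Success", "Failure", "Success")
)

recent_success <- toy %>%
  # Pull the four-digit year that follows the comma in the date string
  mutate(Year = as.numeric(sub(".*, ([0-9]{4}).*", "\\1", Datum))) %>%
  # Keep launches from 2000 onwards that are marked as successful
  filter(Year >= 2000, Status == "Success") %>%
  # Count the remaining launches per facility
  count(Facility, name = "launches")
```

Extracting the year with a single regular expression also copes with rows that have no time or time zone, which the two-step `separate()` approach fills with `NA` and a warning.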
Data Reference
The following plot fixes the main issues in the original.