Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: infogram.com.


Objective

The objective of the original data visualisation is to convey monthly weather readings for Tokyo. The graph depicts three variables: rainfall, sea-level pressure and temperature.

This data may be useful to the general public, for example by helping them plan travel ahead of time. The rainfall data in particular can be useful for agriculture and transportation, and sailors or the navy might use the sea-level pressure data.

The visualisation chosen had the following three main issues:

  1. Triple Axes :

This is a more complex version of the dual-axis plot: the graph places three variables on separate y-axes.

All the variables are improperly scaled: the axes for rainfall, sea-level pressure and temperature step down by 50, 4 and 5 units respectively. This makes the visualisation misleading and very difficult to interpret. There is no meaningful relationship between the axes; the secondary and tertiary axes in particular are arbitrarily scaled.
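To see why arbitrary axis limits matter, the following minimal sketch maps the same readings onto a primary axis using two different choices of secondary-axis limits. The helper `rescale_to` and the temperature values are hypothetical and are used for illustration only; they are not the Tokyo data and not part of any package.

```r
# Linearly map values from their own axis range onto the primary axis range.
# `rescale_to` is a hypothetical helper written for this sketch.
rescale_to <- function(x, from, to) {
  (x - from[1]) / (from[2] - from[1]) * (to[2] - to[1]) + to[1]
}

# Made-up temperature readings, for illustration only (not the Tokyo data).
temperature <- c(5, 15, 25)

# Two equally arbitrary choices of secondary-axis limits,
# both drawn against a primary axis running from 0 to 300.
narrow <- rescale_to(temperature, from = c(0, 30),   to = c(0, 300))
wide   <- rescale_to(temperature, from = c(-30, 60), to = c(0, 300))

print(narrow)  # 50 150 250
print(wide)    # approximately 116.7 150.0 183.3
```

The same three readings look steep with the narrow limits and nearly flat with the wide ones, so any apparent agreement with the series on the primary axis depends on the limits chosen rather than on the data.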

  2. Deceptive and Misleading Visualisation :

Here multiple graphs are superimposed on one another. The major drawback is that a reader might assume the three variables are proportional to one another, when in fact these are three separate graphs merged into one, with no proportional relationship between them.

The axis for sea-level pressure is not even labelled properly. A reader might be deceived into believing that there is a direct relationship between rainfall and temperature and an inverse relationship between sea-level pressure and temperature, but that is not the case.

  3. Colour and Contrast Issues :

Because there is no proper grid in the background, the light colours used in the visualisation blend into the white background, resulting in poor contrast (for example, the green of the temperature line). There also seems to be no proper colour association: of the three axes, only the sea-level pressure axis follows the proper palette, which is not helpful when considering the visualisation as a whole.

As discussed above, the variables are improperly scaled. In addition, the graphs do not display the readings they represent.

The visualisation does not cater for red-green colour blindness, the most common form, because it uses a shade of green to represent temperature.
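One way to avoid red-green clashes is to pick series colours from a palette designed for colour-blind readers. The sketch below uses the Okabe-Ito palette that ships with base R (version 4.0.0 or later). The assignment of colours to series is our assumption for illustration and is not necessarily the palette used in the reconstruction.

```r
# Okabe-Ito palette shipped with base R (grDevices, R >= 4.0.0).
okabe_ito <- palette.colors(palette = "Okabe-Ito")

# Hypothetical assignment of colours to the three series.
# Positions 3, 6 and 2 of the palette are sky blue, blue and orange.
series_colours <- c(
  Rainfall           = okabe_ito[[3]],
  Sea_level_pressure = okabe_ito[[6]],
  Temperature        = okabe_ito[[2]]
)

print(series_colours)
```

A value such as `series_colours[["Temperature"]]` can then be passed wherever a plotting function accepts a colour, so no series relies on green.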

We resolved the above three issues by plotting the three variables separately, side by side on a single grid.

Each axis is scaled properly, and a suitable colour palette is used for all three graphs.
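As an alternative way to give each variable its own panel and its own y-axis, the data can be reshaped to long format and faceted. This is a minimal sketch of that different technique, not the code used for the reconstruction. The column names follow the CSV used in the Code tab, and the three rows of values are made up for illustration only.

```r
library(ggplot2)

# Made-up readings for three months, for illustration only (not the Tokyo data).
weather <- data.frame(
  Month = factor(c("Jan", "Feb", "Mar"), levels = c("Jan", "Feb", "Mar")),
  Rainfall = c(50, 60, 110),
  Sea_level_pressure = c(1016, 1015, 1014),
  Temperature = c(5, 6, 9)
)

# Reshape to long format with base R: one row per month and variable.
measures <- c("Rainfall", "Sea_level_pressure", "Temperature")
weather_long <- do.call(rbind, lapply(measures, function(m) {
  data.frame(Month = weather$Month, Variable = m, Value = weather[[m]])
}))

# One panel per variable, each with its own independent y-axis.
panels <- ggplot(weather_long, aes(x = Month, y = Value, group = 1)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ Variable, scales = "free_y") +
  theme_minimal()

print(panels)
```

Because every panel has a free y-axis, no variable is forced onto another variable's scale, which is the same goal the separate plots achieve.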

References

Code

The following code was used to fix the issues identified in the original.

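library(readr)    # read_csv()
library(plotly)   # loaded, but not used in the code below
library(ggplot2)  # plotting
library(cowplot)  # plot_grid()
library(ggpubr)   # annotate_figure(), text_grob()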

Tokyo_weather_data <- read_csv("C:/Users/ravit/OneDrive/Desktop/Data Visualization/Assignment-2/Submission/Tokyo weather data.csv")

# Rainfall: keep the Month and Rainfall columns (columns 1 and 2)
Tokyo_rainfall <- as.data.frame(Tokyo_weather_data[c(1,2)])
is.data.frame(Tokyo_rainfall)
## [1] TRUE
# Order the months chronologically rather than alphabetically
Tokyo_rainfall$Month <- factor(Tokyo_rainfall$Month, 
                               levels = c("Jan", "Feb", "Mar", "Apr", "May", "Jun","Jul", "Aug", "Sep", "Oct", "Nov", "Dec"))
# Bar chart with value labels and a y-axis starting at zero
TR <- ggplot(data=Tokyo_rainfall, aes(x=Month,y=Rainfall))
TR <- TR + geom_bar(stat="identity", fill="steelblue")+
  geom_text(aes(label=Rainfall), vjust=-0.3, size=3.5)+
  coord_cartesian(ylim=c(0,300))+
  scale_y_continuous(breaks = seq(0,300,50))+
  theme_minimal()



# Temperature: keep the Month and Temperature columns (columns 1 and 4)
Tokyo_temperature <- as.data.frame(Tokyo_weather_data[c(1,4)])
Tokyo_temperature$Month <- factor(Tokyo_temperature$Month, 
                                  levels = c("Jan", "Feb", "Mar", "Apr", "May", "Jun","Jul", "Aug", "Sep", "Oct", "Nov", "Dec"))

# Line chart with value labels; group = 1 joins the monthly points into one line
TT <- ggplot(Tokyo_temperature, aes(x = Month, y = Temperature, group = 1))+
  geom_text(aes(label=Temperature), vjust=-0.3, size=3.5)+
  geom_line(color = "orange")+
  coord_cartesian(ylim=c(0,30))+
  scale_y_continuous(breaks = seq(0,30,5))+
  geom_point()+
  theme_minimal()



# Sea-level pressure: keep the Month and Sea_level_pressure columns (columns 1 and 3)
Tokyo_pressure <- as.data.frame(Tokyo_weather_data[c(1,3)])
Tokyo_pressure$Month <- factor(Tokyo_pressure$Month, 
                               levels = c("Jan", "Feb", "Mar", "Apr", "May", "Jun","Jul", "Aug", "Sep", "Oct", "Nov", "Dec"))

# Line chart with value labels; the y-axis is limited to 1000-1030
TP <- ggplot(data=Tokyo_pressure, aes(x = Month, y = Sea_level_pressure, group = 1))+
  geom_text(aes(label=Sea_level_pressure), vjust=-0.3, size=3.5)+
  geom_line(color = "blue")+
  coord_cartesian(ylim=c(1000,1030))+
  scale_y_continuous(breaks = seq(1000,1030,5))+
  geom_point()+
  theme_minimal()


# Arrange the three plots side by side in a single row
Final <- plot_grid(TR, TT, TP, 
          labels = c("Average Monthly Rainfall", "Average Monthly Temperature", "Average Monthly Sea-Level Pressure"),
          ncol = 3, nrow = 1)

# Add an overall title and a data-source caption to the combined figure
Final2 <- annotate_figure(Final,
                top = text_grob("Tokyo Average Monthly Weather Data", color = "red", face = "bold", size = 20),
                bottom = text_grob("Data source: www.worldclimate.com", color = "blue",
                                   hjust = 1, x = 1, face = "italic", size = 10))

Data References

Reconstruction

The following plot fixes the main issues in the original.