Background

Climate Council Australia is an independent, community-funded organisation that provides authoritative advice to the Australian public on climate change and its solutions [1].

On 18 January 2022, the Council published an article contesting the proposal that Nuclear for Climate Australia made in Submission 493 to the House Standing Committee on the Environment and Energy’s review of the Climate Change Bill 2020 [2].

Nuclear for Climate Australia is the organisation that runs the Nuclear for Climate project, which proposes nuclear power as a source of energy for Australia. The project has analysed the energy grid, studied the available renewables and quantified the grid and the large costs involved in achieving real carbon reductions in Australia [3].

Nuclear for Climate Australia published on its website the sources of Australia’s CO2 emissions by sector, based on data from the 2016 Quarterly Update of the National Greenhouse Gas Inventory from Australia’s National Greenhouse Accounts, Department of Environment and Energy. The data have been collected since 1990 and are accessible from the website of the Department of Industry, Science, Energy and Resources [4][5].

The original visualisation published on Nuclear for Climate Australia’s website is shown in the Original tab below. It conveyed its message ineffectively to the target audiences, and this assignment intends to fix the issues behind that. The code written to fix the issues is shown in the Code tab and the reconstructed visualisation is shown in the Reconstruction tab.

Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: Nuclear for Climate Australia [3].


Objective

The Pie Chart shown above was one of the visuals used on Nuclear for Climate Australia’s website. It showed the percentage of carbon dioxide (CO2) emissions contributed by various sectors in Australia. The idea was to give insight into the sources of CO2 emissions before explaining the projections and the challenges of achieving CO2 reduction targets in Australia.

The chart and the website were targeted at the general public and at audiences with access to the blog discussions who may share an interest in nuclear power as a source of energy in Australia. The website was also used as a platform to seek donations, support and action from the public, private companies and organisations.

The target audiences may also need some technical understanding to comprehend the numbers presented on the website, such as the CO2 emissions projections, which are expected to grow until 2050 if no mitigation is in place. Audiences with technical backgrounds, specifically in climate, energy, land and water, may benefit more than general audiences.

Politicians and policy makers may also access the website to understand the advantages of nuclear energy, the technology involved and the proposed nuclear reactor sites for each Australian state, as well as to gain insight into public responses to nuclear proposals. Nuclear for Climate Australia acts as a subject matter expert, reference and adviser to the Government when the ongoing Bill is discussed and presented in Parliament [6].

The visual information provided in the Pie Chart however had the following three main issues:

Issue 1: The Pie Chart made it difficult to rank the sectors visually, from the highest to the lowest contributor to CO2 emissions in Australia. Some slices appeared to be about the same size even though their percentages differed. For example, the slices for Agriculture, Electricity - Manufacturing, Electricity - Commercial and Electricity - Residential looked very similar in area, but their reported values were 13%, 11.3%, 12.2% and 11.7% respectively.

Issue 2: Colors were used inconsistently to distinguish the categories and sectors. For example, the Fugitive Emissions, Transport and Stationary Energy sectors under the Energy category were shown in three different colors, but the Manufacturing, Commercial and Residential sectors under the Electricity category all shared one color. At first glance, Electricity can be perceived as one sector divided into three sub-sectors, while the sectors under Energy can be read as three separate sectors, like the other sectors in the chart.

Issue 3: The text positioning cluttered the chart. Each label was attached by a leader line to the midpoint of its slice’s arc, which constrained where the labels could be placed. The leader lines varied in length and shape, and together with the centered title this made the chart look untidy and weakened the seriousness of its message. Because the slice for the Land Use, Land Use Change and Forestry sector was too narrow to see, its leader line looked like the boundary line of the adjoining Waste slice.
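
The similar slice sizes described in Issue 1 can be made concrete with a little arithmetic. The sketch below converts the four percentages quoted there into pie-slice angles, assuming a full circle of 360 degrees. The variable names are illustrative and are not part of the original code.

```r
# Percentages quoted in Issue 1 for four similar-looking slices
pct <- c("Agriculture"                 = 13.0,
         "Electricity - Manufacturing" = 11.3,
         "Electricity - Commercial"    = 12.2,
         "Electricity - Residential"   = 11.7)

# Each percentage point spans 3.6 degrees of a 360-degree pie
angle <- pct / 100 * 360

# Slice angles in degrees, largest first
print(sort(angle, decreasing = TRUE))

# Gap between the largest and the smallest of the four slices, in degrees
angle_gap <- max(angle) - min(angle)
print(angle_gap)
```

The four slices differ by only about 6 degrees at most, which is hard to judge by eye, whereas bar lengths drawn from a common baseline are easier to compare.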

Reference

Code

The following code was used to fix the issues identified in the original. It is divided into three chunks: acquiring the data from two source files, preparing the data, and constructing the visualisation required for this assignment. The comments give brief information about the functions applied from pre-processing through to the finished visualisation.

Acquire Data, Load Files and Source Data Files

# LOAD REQUIRED PACKAGES (install any that are missing with install.packages() first)

library(ggplot2)
library(readxl)
library(magrittr)
library(dplyr)
library(tidyr)
library(lubridate)
library(stringr)
library(zoo)

# ACQUIRE DATA, LOAD FILES AND SOURCE DATA FILES

# Specify Original Data Sources
addr1 <- "https://www.industry.gov.au/sites/default/files/August%202021/document/nggi-quarterly-update-march-2021-data-sources.xlsx"
addr2 <- "https://www.industry.gov.au/sites/default/files/April%202021/document/national-inventory-by-economic-sector-2019-data-tables.xlsx"

# Specify Saved Data Locations
file1Loc <- "data/nggi-quarterly-update-march-2021-data-sources.xlsx"
file2Loc <- "data/national-inventory-by-economic-sector-2019-data-tables.xlsx"

# Load Data into RStudio Environment
emi_tbl1 <- read_xlsx(file1Loc, sheet = "Data Table 1A", range = "A6:K86")
emi_tbl2 <- read_xlsx(file2Loc, sheet = "Data Table 1", range = "A6:AE54") 
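
The chunk above defines `addr1` and `addr2` but reads from the saved copies, so it assumes both workbooks are already in the `data/` folder. One way to fill that gap, sketched here as an assumption rather than the method actually used, is a small helper that downloads a file only when no local copy exists. The name `fetch_if_missing` is hypothetical.

```r
# Hypothetical helper (not part of the original code): download a file
# only when no local copy exists yet.
fetch_if_missing <- function(url, dest) {
  if (!file.exists(dest)) {
    # Create the target folder (for example "data/") if it is missing
    dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
    # mode = "wb" keeps binary files such as .xlsx intact on Windows
    download.file(url, destfile = dest, mode = "wb")
  }
  # Return the local path so the call can feed straight into a reader
  invisible(dest)
}
```

Calling `fetch_if_missing(addr1, file1Loc)` and `fetch_if_missing(addr2, file2Loc)` before the `read_xlsx()` calls would make the chunk runnable on a fresh machine, provided the two URLs are still live.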

Prepare, Explore, Tidy, Manipulate and Transform Data

# PREPARE, EXPLORE, TIDY, MANIPULATE AND TRANSFORM DATA

# Add Column Names in Emission Table 1
names(emi_tbl1) <- c("Year", "Quarter", "Electricity", "Energy - Stationary Energy excl. Electricity",
                     "Energy - Transport", "Energy - Fugitive Emissions", "Industrial Processes and Product Use",
                     "Agriculture", "Waste", "Land Use, Land Use Change and Forestry",
                     "National Inventory Total")

# Format Column Quarter into Year Quarter Format in Emission Table 1
emi_tbl1$Quarter <- as.yearqtr(emi_tbl1$Quarter)

# Separate Data in Column Quarter into Columns Year and Quarter in Emission Table 1
emi_tbl1 %<>% separate(Quarter, into = c("Year", "Quarter"), sep = " ")

# Rearrange Order of the Columns in Emission Table 1
emi_tbl1 <- emi_tbl1[, c("Year", "Quarter", "Electricity", "Energy - Stationary Energy excl. Electricity",
                         "Energy - Transport", "Energy - Fugitive Emissions",
                         "Industrial Processes and Product Use", "Agriculture", "Waste",
                         "Land Use, Land Use Change and Forestry", "National Inventory Total")]

# Subset Data from Required Columns in Emission Table 1
emi_tbl1a <- emi_tbl1 %>% select("Year", "Energy - Stationary Energy excl. Electricity",
                                "Energy - Transport", "Energy - Fugitive Emissions", 
                                "Industrial Processes and Product Use",
                                "Agriculture", "Waste", "Land Use, Land Use Change and Forestry")

# Filter emi_tbl1a Data Frame for Year 2016 and Gather Required Columns under Variable Name Sector
emi_tbl1b <- emi_tbl1a %>% filter(Year == "2016")
emi_tbl1b %<>% gather(2:8, key = "Sector", value = "Emission")

# Add Column Names in Emission Table 2
names(emi_tbl2) <- c("Industry Classification", "1990", "1991", "1992", "1993", "1994", "1995", "1996", "1997",
                     "1998", "1999", "2000", "2001", "2002", "2003", "2004", "2005", "2006", "2007", "2008",
                     "2009", "2010", "2011", "2012", "2013", "2014", "2015", "2016", "2017", "2018", "2019")

# Filter Emission Data for Manufacturing, Commercial and Residential Sectors
emi_tbl2a <- emi_tbl2 %>% slice(c(12,32,46))

# Reshape emi_tbl2a Data Frame for Tidy Structure
emi_tbl2a %<>% gather(2:31, key = "Year", value = "Emission")
emi_tbl2a %<>% spread(key = "Industry Classification", value = "Emission")

# Rename Column Names with Required Variable Names
names(emi_tbl2a) <- c("Year", "Electricity - Manufacturing", "Electricity - Commercial",
                      "Electricity - Residential")

# Gather Required Columns under Variable Name Sector and Filter emi_tbl2a Data Frame for Year 2016
emi_tbl2b <- emi_tbl2a %>% gather(2:4, key = "Sector", value = "Emission")
emi_tbl2b %<>% filter(Year == "2016")

# Convert CO2 emission values from kilotonnes to million tonnes (1 Mt = 1,000 kt)
emi_tbl2b$Emission <- emi_tbl2b$Emission / 1000

# Merge emi_tbl1b Data Frame  to emi_tbl2b Data Frame
co2 <- bind_rows(emi_tbl1b, emi_tbl2b)

# Calculate Total co2 Emission for Each Sector
co2_total <- aggregate(x = co2$Emission,
                       by = list(co2$Sector),
                       FUN = sum)
# Rename Column Names
names(co2_total) <- c("Sector", "Total Emission")

# Calculate Percentage of CO2 Emission for Each Sector
# (abs() reports the magnitude of the share if a sector total is negative)
co2_total$Percentage <- abs((co2_total$`Total Emission`/sum(co2_total$`Total Emission`))*100)
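
The aggregation and percentage steps can be traced on a toy table. The sketch below uses made-up quarterly numbers, not the inventory data, and follows the same pattern of summing by sector and then taking `abs()` of each share. It also shows a side effect worth keeping in mind: if any sector total is negative (a net sink), the shares no longer add up to 100. Whether that applies to the 2016 data is not checked here.

```r
# Toy quarterly emissions in Mt (made-up values, not the inventory data)
toy <- data.frame(
  Sector   = rep(c("A", "B", "C"), each = 4),
  Emission = c(rep(10, 4), rep(5, 4), rep(-1, 4))
)

# Sum the four quarters for each sector
toy_total <- aggregate(Emission ~ Sector, data = toy, FUN = sum)

# Same pattern as above: share of the net total, reported as a magnitude
toy_total$Percentage <- abs(toy_total$Emission / sum(toy_total$Emission) * 100)
print(toy_total)

# With a negative sector present, the shares sum to more than 100
share_sum <- sum(toy_total$Percentage)
print(share_sum)
```

If shares that sum to 100 are required, one option is to divide by the sum of the absolute sector totals instead of the net total. That is a design choice and it changes what the percentages mean.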

Produce Data Visualisation and Presentation

# PRODUCE DATA VISUALISATION AND PRESENTATION

col <- c("#EFEA5A", "#2C699A", "#83E377","#B9E769","#16DB93","#F1C453","#F29E4C","#0DB39E", "#048BA8", "#54478C")

p <- ggplot(data = co2_total, mapping = aes(x = reorder(Sector, Percentage), y = Percentage)) +
  geom_bar(stat = "identity", fill = col) +
  scale_y_continuous(limits = c(0, 24.0), breaks = seq(0, 24,by = 2)) +
  labs(title = "Australia CO2 Emissions by Sector 2016", 
       subtitle = "Percentage Contribution of CO2 Emissions from Major Sectors in Australia",
       caption = "Source: Based on data from Australian Government: Department of Industry, Science, Energy and Resources websites.",
       x = "Sector", y = "Percentage, %") +
  theme(panel.background = element_rect(fill = "White"),
        panel.grid.major.x = element_line(color = "#A8BAC4", size = 0.3),
        axis.ticks.length = unit(1.0, "mm"),
        axis.text.x = element_text(family = "URWTimes", face = "plain", size = 10),
        axis.text.y = element_text(family = "URWTimes", face = "plain", size = 10),
        text = element_text(family = "URWTimes", face = "plain", size = 10),
        plot.title = element_text(family = "URWTimes", face = "bold", size = 12),
        plot.subtitle = element_text(family = "URWTimes", face = "plain", size = 10),
        plot.caption = element_text(family = "URWTimes", face = "italic", size = 8, vjust = -2),
        plot.title.position = "plot",
        plot.caption.position = "plot") +
  geom_text(aes(label = paste0(round(Percentage, 1), "%")), size = 3.5, nudge_y = 2, nudge_x = 0) +
  coord_flip()
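
One detail of the bar labels is worth illustrating. Pasting a rounded number drops a trailing zero, so a value of exactly 13 is labelled "13%" while its neighbours show one decimal place. The sketch below compares that approach with `sprintf()`, which always prints one decimal. The values are the four percentages quoted in Issue 1, used only as sample input.

```r
# Sample values (the four percentages quoted in Issue 1)
pct <- c(13, 11.3, 12.2, 11.7)

# Rounding then pasting: a whole number loses its ".0"
labels_paste <- paste0(round(pct, 1), "%")

# sprintf() with "%.1f" always keeps one decimal; "%%" prints a literal %
labels_fixed <- sprintf("%.1f%%", pct)

print(labels_paste)
print(labels_fixed)
```

Either form is valid. The fixed-width version simply keeps every label at the same precision.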

Data Reference

Reconstruction

The following plot fixes the main issues in the original.

Conclusion

The Bar Chart shown above fixes the issues found in the original Pie Chart from Nuclear for Climate Australia’s website. Note that the data behind the original Pie Chart may not be the same as the data used to produce the Bar Chart for this assignment. The Pie Chart page on Nuclear for Climate Australia’s website gave no link or source for the original data, so it could not be replicated directly. This assignment therefore used the data believed to be the closest match, taken from two data files on the Australian Government Department of Industry, Science, Energy and Resources websites [4][5].

The first issue, the visual ranking in the Pie Chart, is solved in the Bar Chart by sorting the sectors by percentage, from the biggest contributor to CO2 emissions in Australia at the top to the smallest at the bottom. The Bar Chart clearly shows that the longest bar, with the highest percentage of CO2 emissions, is Energy - Transport and the shortest bar, with the smallest percentage, is the Waste sector.
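
The sorting comes from `reorder()` in the plotting code, which turns the sector names into a factor whose levels follow the percentages. A minimal sketch with made-up percentages (illustrative only, not the computed values) shows the resulting level order:

```r
# Made-up percentages for three sectors (illustrative only)
sector <- factor(c("Agriculture", "Waste", "Energy - Transport"))
pct    <- c(13, 3, 18)

# reorder() returns a factor whose levels are sorted by pct, ascending
ordered_sector <- reorder(sector, pct)
print(levels(ordered_sector))
```

The levels run from the smallest to the largest value. On a flipped axis the first level is drawn at the bottom and the last at the top, which places the largest sector at the top of the chart.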

The issue of inconsistent colors for sectors of the same category is solved by treating each sector as a unique sector and assigning it its own color. With distinct colors and sorted percentages, the Bar Chart shows that within the Energy category the order from top to bottom is Energy - Transport, Energy - Stationary Energy excl. Electricity and Energy - Fugitive Emissions. Similarly, within the Electricity category the order from highest to lowest, Electricity - Residential, Electricity - Manufacturing and Electricity - Commercial, can be clearly identified.

The last issue, the management and placement of text in the original Pie Chart, is solved by placing the labels and text at more appropriate locations, which eliminates the leader lines of inconsistent length and shape. In the Bar Chart, the title and subtitle are at the top left, the sectors are shown on the y-axis and the percentages on the x-axis. The percentage labels are placed horizontally at the end of each bar so that the exact figures can be read from the chart. The same font family is used throughout, with different sizes and faces to mark different levels of detail and significance.