There is broad global scientific consensus that average global temperatures across the Earth are increasing, and at an increasing rate. In 2020, this has been topical with an especially severe bushfire season, for which some of the key causes are discussed here.
The Australian Bureau of Meteorology generously provides access to The Australian Climate Observations Reference Network – Surface Air Temperature (ACORN-SAT) dataset, which summarises over 100 years of daily minimum and maximum temperatures from 112 weather stations across Australia. These temperatures are adjusted & homogenised to control for biasing factors that can arise over time. The BOM website has further information on these methodologies
The data comprising the ACORN-SAT can be downloaded from this link, which provides 2 x 112 .csv files (minimum & maximum daily temperatures from 112 weather stations
Once the files have been downloaded, we can extract the data from two folders (acorn_sat_v2_daily_tmax & acorn_sat_v2_daily_tmin)
We will start with the ‘maximum’ temperature folder, and bind the 112 .csv’s into a single data-frame. This can be achieved via a nifty function created by Stack Overflow user leerssej (details of which can be found here)
Lets load in all required dependencies …
library(tidyverse)
library(purrr)
library(data.table)
library(kableExtra)
library(ggplot2)
library(plotly)
library(sf)
library(mapview)
Write leerssej’s “.csv reading” function …
# function to read in multiple csv's, plus the file-name
read_plus <- function(flnm) {
read_csv(flnm) %>%
mutate(filename = flnm)
}
Then build our dataframe from the 112 max-temperature files
tbl <-
list.files(path = "directory of max temp folder goes here",
pattern = "*.csv",
full.names = T) %>%
map_df(~read_plus(.))
One of the columns summarises the file-location & file-name. We can use regex & tidyverse tools to extract out each weather station location from the file-path name
# Extract part of the string after "//"
tbl$location <- sub('.*//', '', tbl$filename)
# Extract only the location information
tbl$location2 <- sapply(tbl$location, function(x) unlist(strsplit(x, "\\."))[2])
# remove .csv reference
tbl$location <- NULL
tbl$filename <- NULL
# extract out location names
location_names <- subset(tbl, is.na(date)) %>%
filter(location2 != "csv")
location_names <- select(location_names, `site name`, location2)
# remove NA rows
tbl <- tbl %>% drop_na(date)
# remove unnecessary columns
tbl$`site number` <- NULL
tbl$`site name` <- NULL
tbl$lat <- NULL
tbl$long <- NULL
tbl$elev <- NULL
# join back
tbl <- tbl %>%
left_join(location_names, by = "location2")
# Re-order Columns
tbl <- tbl[,c(1, 4, 2, 3)]
Producing the below dataframe
| location2 | site name | date | maximum temperature (degC) |
|---|---|---|---|
| 001019 | KALUMBURU | 1941-09-01 | 29.8 |
| 001019 | KALUMBURU | 1941-09-02 | 29.8 |
| 001019 | KALUMBURU | 1941-09-03 | 29.3 |
| 001019 | KALUMBURU | 1941-09-04 | 37.1 |
| 001019 | KALUMBURU | 1941-09-05 | 30.9 |
| 001019 | KALUMBURU | 1941-09-06 | NA |
For visualisation purposes, let’s add some further date dimensions
# Year from date
tbl$Year <- year(tbl$date)
# Month from date
library(anytime)
tbl$Month <- format(anydate(tbl$date), "%m")
# Week from date
library(lubridate)
tbl$Week <- isoweek(ymd(tbl$date))
# Day of the week
# tbl$DayofWeek <- weekdays(as.Date(tbl$Date))
# Season
tbl$Season[
tbl$Month == "12" |
tbl$Month == "01" |
tbl$Month == "02"] <-
"Summer"
tbl$Season[
tbl$Month == "03" |
tbl$Month == "04" |
tbl$Month == "05"] <-
"Autumn"
tbl$Season[
tbl$Month == "06" |
tbl$Month == "07" |
tbl$Month == "08"] <-
"Winter"
tbl$Season[
tbl$Month == "09" |
tbl$Month == "10" |
tbl$Month == "11"] <-
"Spring"
# Year-Month
setDT(tbl)[, Yr_Month := format(as.Date(date), "%Y-%m") ]
# Year-Week (created)
tbl$Yr_Week <- paste(tbl$Year, tbl$Week, sep="-")
tbl$Era <- ifelse(tbl$Year>=1910 & tbl$Year<=1919,"1910s",
ifelse(tbl$Year>=1920 & tbl$Year<=1929,"1920s",
ifelse(tbl$Year>=1930 & tbl$Year<=1939,"1930s",
ifelse(tbl$Year>=1940 & tbl$Year<=1949,"1940s",
ifelse(tbl$Year>=1950 & tbl$Year<=1959,"1950s",
ifelse(tbl$Year>=1960 & tbl$Year<=1969,"1960s",
ifelse(tbl$Year>=1970 & tbl$Year<=1979,"1970s",
ifelse(tbl$Year>=1980 & tbl$Year<=1989,"1980s",
ifelse(tbl$Year>=1990 & tbl$Year<=1999,"1990s",
ifelse(tbl$Year>=2000 & tbl$Year<=2009,"2000s",
ifelse(tbl$Year>=2010 & tbl$Year<=2019,"2010s", 0)))))))))))
Let’s preview this revised dataframe
| location2 | site name | date | maximum temperature (degC) | Year | Month | Week | Season | Yr_Month | Yr_Week | Era |
|---|---|---|---|---|---|---|---|---|---|---|
| 001019 | KALUMBURU | 1941-09-01 | 29.8 | 1941 | 09 | 36 | Spring | 1941-09 | 1941-36 | 1940s |
| 001019 | KALUMBURU | 1941-09-02 | 29.8 | 1941 | 09 | 36 | Spring | 1941-09 | 1941-36 | 1940s |
| 001019 | KALUMBURU | 1941-09-03 | 29.3 | 1941 | 09 | 36 | Spring | 1941-09 | 1941-36 | 1940s |
| 001019 | KALUMBURU | 1941-09-04 | 37.1 | 1941 | 09 | 36 | Spring | 1941-09 | 1941-36 | 1940s |
| 001019 | KALUMBURU | 1941-09-05 | 30.9 | 1941 | 09 | 36 | Spring | 1941-09 | 1941-36 | 1940s |
| 001019 | KALUMBURU | 1941-09-06 | NA | 1941 | 09 | 36 | Spring | 1941-09 | 1941-36 | 1940s |
The zip file includes a .txt/.csv file with coordinates for the weather stations. Let’s read in this file & join to our dataframe
coordinates$location2 <- NULL
tbl <- tbl %>%
left_join(coordinates, by = c("site name" = "site.name"))
Using the “mapview” package, we can visualise the location of the 112 weather stations in Australia. A benefit of this package is unlike Google’s map package, it can be used without an API key
# create a map of all weather stations
weather_stations <- st_as_sf(coordinates, coords = c("long", "lat"), crs = 4326)
Using the same approach implemented for maximum temperatures, we can read-in the minimum temperature files, & join the data to our existing dataframe
tbl_min <-
list.files(path = "directory of max temp folder goes here",
pattern = "*.csv",
full.names = T) %>%
map_df(~read_plus(.))
# this takes every part of the string after "//"
tbl_min$location <- sub('.*//', '', tbl_min$filename)
# now take only the location information
tbl_min$location2 <- sapply(tbl_min$location, function(x) unlist(strsplit(x, "\\."))[2])
# remove .csv reference
tbl_min$location <- NULL
tbl_min$filename <- NULL
# remove these NA columns
tbl_min <- tbl_min %>% drop_na(date)
# remove not needed columns
tbl_min$`site number` <- NULL
tbl_min$`site name` <- NULL
# join min temps to core-table
tbl <- left_join(tbl, tbl_min,
by = c("location2" = "location2", "date" = "date"))
# remove table
rm(tbl_min)
# re-order columns
tbl <- tbl[,c(1:3,15, 4:14)]
This results in a dataframe with ~ 3.5 million rows of minimum and maximum temperature data from 112 Australian weather stations since the beginning of the 20th century. With this, let’s begin some trend-analyses
| location2 | site name | date | minimum temperature (degC) | maximum temperature (degC) | Year | Month | Week | Season | Yr_Month | Yr_Week | Era | lat | long | elev |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 001019 | KALUMBURU | 1941-09-01 | 20.6 | 29.8 | 1941 | 09 | 36 | Spring | 1941-09 | 1941-36 | 1940s | -14.3 | 126.65 | 23 |
| 001019 | KALUMBURU | 1941-09-02 | 20.6 | 29.8 | 1941 | 09 | 36 | Spring | 1941-09 | 1941-36 | 1940s | -14.3 | 126.65 | 23 |
| 001019 | KALUMBURU | 1941-09-03 | 19.6 | 29.3 | 1941 | 09 | 36 | Spring | 1941-09 | 1941-36 | 1940s | -14.3 | 126.65 | 23 |
| 001019 | KALUMBURU | 1941-09-04 | 21.2 | 37.1 | 1941 | 09 | 36 | Spring | 1941-09 | 1941-36 | 1940s | -14.3 | 126.65 | 23 |
| 001019 | KALUMBURU | 1941-09-05 | 19.8 | 30.9 | 1941 | 09 | 36 | Spring | 1941-09 | 1941-36 | 1940s | -14.3 | 126.65 | 23 |
| 001019 | KALUMBURU | 1941-09-06 | 19.8 | NA | 1941 | 09 | 36 | Spring | 1941-09 | 1941-36 | 1940s | -14.3 | 126.65 | 23 |
We can efficiently summarise very large data-sets by combining tidyverse functions such as “group_by” with the visualisation capabilities of ggplot2. Below are two line charts depicting change to yearly-average minimum & maximum temperatures in Australia
# Minimalistic theme for visualisation
theme_plot_text <-
theme(
plot.title = element_text(size = 10, hjust = 0.5),
axis.title = element_text(size = 10),
legend.title = element_text(size = 10),
axis.text = element_text(size = 7))
# Min temperature visualisation
temp.min <- tbl %>%
group_by(Year) %>%
filter(Year != 2019) %>%
summarise(avgmin = mean(`minimum temperature (degC)`, na.rm = TRUE)) %>%
ggplot(
aes(x = Year,
y = avgmin,
colour = avgmin,
group = 1)) +
geom_point() +
geom_line() +
geom_smooth(method = "loess",
colour = "#44546A",
linetype = "dashed",
fill = "#FFEDB4",
size = 0.5) +
scale_colour_gradient2(low = "#2BAAED", mid = "#8CD04F" , high = "#FFC52B",
midpoint = 13) +
labs(title = "Figure 1. Average of daily minimum temperature across Australia between 1911 & 2018") +
theme_bw() +
theme(panel.border = element_blank(),
axis.line = element_line(colour = "black", size = 0.2),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
plot.title = element_text(size = 6, hjust = 0.5),
legend.text = element_text(size = 6)) +
theme_plot_text +
labs(color = '°C', size = 6)
# Max temperature visualisation
temp.max <- tbl %>%
group_by(Year) %>%
filter(Year != 2019) %>%
summarise(avgmax = mean(`maximum temperature (degC)`, na.rm = TRUE)) %>%
ggplot(
aes(x = Year,
y = avgmax,
colour = avgmax,
group = 1)) +
geom_point() +
geom_line() +
geom_smooth(method = "loess",
colour = "#80190B",
linetype = "dashed",
fill = "#FFEDB4",
size = 0.5) +
scale_colour_gradient2(low = "#E1B823", mid = "#DD8033" , high = "#C95343",
midpoint = 25) +
labs(title = "Figure 2. Average of daily maximum temperature across Australia between 1911 & 2018") +
theme_bw() +
theme(panel.border = element_blank(),
axis.line = element_line(colour = "black", size = 0.2),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
plot.title = element_text(size = 6, hjust = 0.5),
legend.text = element_text(size = 6)) +
theme_plot_text +
labs(color = '°C', size = 6)
These Figures show that across Australia, the lowest yearly-average minimum & maximum temperatures occurred between the 1940s & 1960s, with steady increases in temperatures occurring up until the present. Of particular note is that increasing temperatures begin to steepen from the 1990s onwards
Clearly, temperatures across Australia are rising. What may be interesting to explore is whether these effects vary by season. Lets create summary tables of decade-average minimum & maximum temperatures for each of the four seasons:
# MINIMUM
tbl.min.season <- tbl %>%
group_by(Era, Season) %>%
filter(Year != 2019) %>%
summarise(avgmin = mean(`minimum temperature (degC)`, na.rm = TRUE)) %>%
dcast(Season ~ Era) %>%
mutate(Change = `2010s` - `1910s`) %>%
mutate(Percent_change = (Change / `2010s`)*100)
# round to 1 decimal place
tbl.min.season[2:14]=round(tbl.min.season[2:14],1)
# MAXIMUM
tbl.max.season <- tbl %>%
group_by(Era, Season) %>%
filter(Year != 2019) %>%
summarise(avgmax = mean(`maximum temperature (degC)`, na.rm = TRUE)) %>%
dcast(Season ~ Era) %>%
mutate(Change = `2010s` - `1910s`) %>%
mutate(Percent_change = (Change / `2010s`)*100)
# round to 1 decimal place
tbl.max.season[2:14]=round(tbl.max.season[2:14],1)
Somewhat arbitrarily I have calculated the % change to min/max temperatures from decade to 1910 to 2010, at the exclusion of all decades in between. This approach shows all four seasons demonstrated increased temperatures, with slightly greater % increases observed in Autumn & Spring
| Season | 1910s | 1920s | 1930s | 1940s | 1950s | 1960s | 1970s | 1980s | 1990s | 2000s | 2010s | Change | Percent_change |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Autumn | 13.1 | 12.9 | 13.1 | 12.7 | 13.3 | 13.3 | 13.5 | 14.1 | 13.9 | 13.9 | 14.3 | 1.3 | 8.9 |
| Spring | 12.2 | 11.9 | 11.9 | 11.9 | 11.9 | 12.2 | 12.4 | 12.8 | 12.9 | 13.3 | 13.4 | 1.2 | 9.0 |
| Summer | 17.4 | 17.1 | 17.1 | 17.2 | 17.2 | 17.6 | 18.0 | 18.1 | 18.3 | 18.5 | 18.9 | 1.5 | 8.0 |
| Winter | 7.5 | 7.1 | 7.3 | 7.1 | 7.4 | 7.3 | 7.4 | 7.7 | 8.1 | 7.9 | 8.1 | 0.6 | 7.8 |
Looking at change in decade-average maximum temperatures, there is remarkable consistency in the increase from the 1910s to the 2010s for summer, autumn & winter (4.6% increase). Interestingly, spring shows a greater % increase (5.5%) in maximum temperatures, perhaps lending credibility to the idea of “longer summers”
| Season | 1910s | 1920s | 1930s | 1940s | 1950s | 1960s | 1970s | 1980s | 1990s | 2000s | 2010s | Change | Percent_change |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Autumn | 24.8 | 24.8 | 24.9 | 24.7 | 25.0 | 24.9 | 25.1 | 25.5 | 25.5 | 25.9 | 26.0 | 1.2 | 4.6 |
| Spring | 25.1 | 25.0 | 24.8 | 24.9 | 24.8 | 25.1 | 24.9 | 25.6 | 25.5 | 26.3 | 26.5 | 1.5 | 5.5 |
| Summer | 29.7 | 29.5 | 29.7 | 29.6 | 29.7 | 29.9 | 30.1 | 30.4 | 30.3 | 30.7 | 31.1 | 1.4 | 4.6 |
| Winter | 19.0 | 18.7 | 18.5 | 19.0 | 18.7 | 18.8 | 19.2 | 19.1 | 19.5 | 19.8 | 19.9 | 0.9 | 4.6 |
Let’s visualise seasonal change over time:
# min temperature Season visualisation
season.min <- tbl %>%
group_by(Year, Season) %>%
filter(Year != 2019) %>%
summarise(avgmin = mean(`minimum temperature (degC)`, na.rm = TRUE)) %>%
ggplot(
aes(x = Year,
y = avgmin,
colour = Season,
group = Season)) +
geom_point() +
geom_line() +
geom_smooth(method = "loess",
colour = "#80190B",
linetype = "dashed",
fill = "#FFEDB4",
size = 0.5) +
scale_colour_manual(values=c("#DD8033", "#43854C", "#C95343", "#77C4DF")) +
labs(title = "Figure 3. Average of daily minimum temperature across seasons between 1911 & 2018") +
theme_bw() +
theme(panel.border = element_blank(),
axis.line = element_line(colour = "black", size = 0.2),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
plot.title = element_text(size = 6, hjust = 0.5),
legend.text = element_text(size = 6)) +
theme_plot_text +
labs(color = 'Season', size = 6)
# Max temperature Season visualisation
season.max <- tbl %>%
group_by(Year, Season) %>%
filter(Year != 2019) %>%
summarise(avgmax = mean(`maximum temperature (degC)`, na.rm = TRUE)) %>%
ggplot(
aes(x = Year,
y = avgmax,
colour = Season,
group = Season)) +
geom_point() +
geom_line() +
geom_smooth(method = "loess",
colour = "#80190B",
linetype = "dashed",
fill = "#FFEDB4",
size = 0.5) +
scale_colour_manual(values=c("#DD8033", "#43854C", "#C95343", "#77C4DF")) +
labs(title = "Figure 4. Average of daily maximum temperature across seasons between 1911 & 2018") +
theme_bw() +
theme(panel.border = element_blank(),
axis.line = element_line(colour = "black", size = 0.2),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
plot.title = element_text(size = 6, hjust = 0.5),
legend.text = element_text(size = 6)) +
theme_plot_text +
labs(color = 'Season', size = 6)
We can also look at location-based differences in rising temperatures. Let’s create tables for decade-average minimum and maximum temepratures, by location.
max.loc <- tbl %>%
group_by(Era, `site name`) %>%
filter(Year != 2019) %>%
summarise(avgmax = mean(`maximum temperature (degC)`, na.rm = TRUE),
avgmin = mean(`minimum temperature (degC)`, na.rm = TRUE))
max.era <- max.loc %>%
subset(select = -c(4)) %>%
dcast(`site name` ~ Era) %>%
mutate(Change = `2010s` - `1910s`) %>%
mutate(Percent_change = (Change / `2010s`)*100) %>%
arrange(desc(Percent_change)) %>%
top_n(20)
# round to 1 decimal place
max.era[2:14]=round(max.era[2:14],1)
min.era <- max.loc %>%
subset(select = -c(3)) %>%
dcast(`site name` ~ Era) %>%
mutate(Change = `2010s` - `1910s`) %>%
mutate(Percent_change = (Change / `2010s`)*100) %>%
arrange(desc(Percent_change)) %>%
top_n(20)
# round to 1 decimal place
min.era[2:14]=round(min.era[2:14],1)
Examining the table below, each of the top-five weather stations in terms of increased minimum temperatures occupy south-east Australia. Four of these five locations are below national averages for minimum temperatures
| site name | 1910s | 1920s | 1930s | 1940s | 1950s | 1960s | 1970s | 1980s | 1990s | 2000s | 2010s | Change | Percent_change |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CANBERRA AIRPORT | 4.5 | 4.2 | 5.1 | 4.8 | 5.3 | 5.3 | 5.6 | 6.0 | 6.0 | 6.5 | 6.6 | 2.0 | 30.9 |
| INVERELL (RAGLAN ST) | 5.9 | 5.5 | 5.7 | 5.5 | 6.5 | 6.3 | 6.8 | 7.3 | 7.3 | 7.3 | 8.1 | 2.2 | 27.7 |
| WAGGA WAGGA AMO | 7.5 | 7.0 | 7.6 | 7.3 | 7.5 | 8.0 | 8.4 | 8.5 | 8.7 | 9.2 | 9.6 | 2.1 | 22.0 |
| RUTHERGLEN RESEARCH | 5.8 | 5.4 | 5.4 | 5.3 | 5.4 | 5.7 | 6.5 | 6.5 | 6.5 | 6.8 | 7.2 | 1.4 | 19.3 |
| CHARLEVILLE AERO | 11.6 | 11.8 | 11.8 | 11.9 | 12.3 | 12.4 | 12.3 | 13.0 | 13.1 | 13.9 | 14.2 | 2.6 | 18.2 |
| LAUNCESTON AIRPORT | 5.2 | 5.2 | 5.2 | 4.9 | 5.0 | 5.3 | 5.8 | 6.0 | 5.9 | 6.2 | 6.3 | 1.1 | 17.7 |
Regarding average maximum temperatures, curiously, five of the top six weather stations are located at airports. I have no idea what might be responsible for this, particularly considering the normalisation that goes into this dataset
| site name | 1910s | 1920s | 1930s | 1940s | 1950s | 1960s | 1970s | 1980s | 1990s | 2000s | 2010s | Change | Percent_change |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CANBERRA AIRPORT | 19.1 | 19.0 | 19.5 | 19.4 | 19.1 | 19.2 | 19.4 | 19.7 | 19.7 | 20.9 | 21.1 | 2.0 | 9.5 |
| EUCLA | 21.7 | 22.0 | 22.1 | 21.9 | 21.8 | 22.2 | 22.1 | 22.4 | 22.8 | 23.3 | 23.7 | 2.0 | 8.5 |
| PERTH AIRPORT | 23.4 | 23.7 | 24.0 | 24.1 | 23.9 | 24.1 | 24.7 | 24.7 | 24.8 | 25.0 | 25.5 | 2.1 | 8.3 |
| ALBANY AIRPORT | 19.3 | 19.3 | 19.6 | 19.7 | 19.7 | 19.8 | 19.8 | 19.9 | 20.0 | 20.1 | 20.9 | 1.5 | 7.4 |
| GERALDTON AIRPORT | 24.9 | 25.1 | 25.0 | 25.2 | 24.9 | 25.1 | 25.3 | 25.5 | 25.7 | 26.1 | 26.9 | 2.0 | 7.4 |
| ALICE SPRINGS AIRPORT | 27.4 | 27.7 | 27.7 | 28.4 | 28.7 | 28.8 | 28.8 | 29.2 | 29.7 | 29.5 | 29.4 | 2.0 | 6.9 |
One final thing we can do is visualise the location of weather stations demonstrating the largest increase in daily max/min temperature
# join our biggest temp increases to coordinates table
top_max <- inner_join(coordinates, max.era, by = c("site.name" = "site name"))
weather_stations_max <- st_as_sf(top_max, coords = c("long", "lat"), crs = 4326)
Examining maximum temperatures, it appears that southern austraila has been disproportionately impacted with 17 of the 20 highest temperature increasing stations being located below the mid-line of the country
# join our biggest temp increases to coordinates table
top_min <- inner_join(coordinates, min.era, by = c("site.name" = "site name"))
weather_stations_min <- st_as_sf(top_min, coords = c("long", "lat"), crs = 4326)
Examining minimum temperatures, a different trend is at play. Eastern austraila has been disproportionately impacted with 16/20 weather stations to record the largest % change in minimum temperatures existing on the eastern side of the country. Concerningly, but perhaps unsuprisingly, some of the biggest temperature changes by magnitude in Australia have been observed in Northern Victoria/Southern NSW, overlapping with regions impacted by the recent bushfires
As Australia deals with a steadily heating climate, historic temperature data can help inform about the uniformity & magnitude of change. Daily maximum temperatures have increased proportionally more in southern australia, and are most strongly observed around Spring time. Daily minimum temperatures have disproportionately increased on the eastern coast, where increases are slightly more pronounced in Autumn & Spring