What evidence is there that climate change has influenced wheat yields? An article by Demirhan (2020) argues that an increase of global average surface temperature of one-degree Celsius corresponds to a drop in total wheat production of 90.4 million tons. In this section of the tutorial, we will look at the evidence that is used to support this case. We will look at the data presented in Demirhan (2020), consider the causal relationships involved and you will work through some specific activities (look for the blue text).

I also want to demonstrate best practice in reproducible analysis, which means that anyone can see the process we have been through to come to our conclusions. You will see blocks of code before any table or figure. You will not need to edit any code in this tutorial – so if at first all this code is overwhelming to you, just focus on the figures and activities and come back to the code in the weeks to come. All code has some explanation commented out with a hash (#).

For this tutorial we will be using the R programming language in Noteable. First we need to import the libraries and data. Then we’ll look at the first few rows of the table (data frame) that we will be working with. This is how we do that:

library(pacman)
pacman::p_load(ggplot2, DiagrammeR, ggpubr, dLagM, smooth, gridExtra)

##Import libraries
suppressPackageStartupMessages(library(dLagM)) #By Demirhan, used to access the wheat data
suppressPackageStartupMessages(library(smooth)) #For calculating moving average
library(ggplot2) #For plotting figures
library(DiagrammeR) #For drawing diagrams
library(ggpubr) #For nicer looking ggplot figures
##Import data from dLagM package
data(wheat)

##Show the first few rows of the data frame
head(wheat)
Year CO2..ppm. CO2..tons. HarvestedArea..million.ha. Production..Mt. ProducationPerArea TempAnomaly..C.degrees.
1960 316.91 9410529753 202 241 1.193069 -0.02
1961 317.64 9452107735 202 227 1.123762 0.05
1962 318.45 9838202053 206 258 1.252427 0.04
1963 318.99 10379067957 209 241 1.153110 0.07
1964 319.62 10969325141 217 276 1.271889 -0.20
1965 320.04 11466150222 219 265 1.210046 -0.10

From this tabular preview of the data, we can see that it is aggregated on an annual basis, starting from 1960. Looking at the column names from left to right, we see that this dataset has information on atmospheric carbon dioxide concentrations (in parts per million), harvested area of wheat (in millions of hectares), total global production (in millions of tons), average yield (tons per hectare) and the magnitude of temperature anomalies. The author defines temperature anomalies as “the changes in surface air temperatures over the land and ice-free sea surface (with respect to 1951–1980 average temperatures)”. Next, we’ll present all this in a way that’s more easily interpreted – we’ll prepare a figure of the three core variables: CO2, temperature anomolies and yield.

##Prepare the figure using ggplot syntax. The last two lines of each of p1, p2 and p3 are simply specifying labels and styling.
p1 <- ggplot(wheat, aes(x = Year,y = CO2..ppm.)) + 
  geom_line(linewidth = 1, color = "blue") +
  scale_x_continuous(breaks=seq(1960,2017,5)) +
  labs(title= expression("(a) Atmospheric" ~ CO[2] ~ "concentration"), x = "Year", y = expression(CO[2] ~ "concentration (ppm)")) + theme_bw() +
  theme(
    plot.title = element_text(size = 10),  # Adjust title size
    axis.title.x = element_text(size = 8),  # Adjust x-axis label size
    axis.title.y = element_text(size = 8)   # Adjust y-axis label size
  )

p2 <- ggplot(wheat, aes(x = Year,y = TempAnomaly..C.degrees.)) + 
  geom_line(linewidth = 1, color = "red") +
  scale_x_continuous(breaks=seq(1960,2017,5)) +
  labs(title="(b) Temperature anomalies (global average)", x = "Year", y = "Temperature anomalies (°C deviation from 1951–1980 average)") + theme_bw() +
    theme(
    plot.title = element_text(size = 10),  # Adjust title size
    axis.title.x = element_text(size = 10),  # Adjust x-axis label size
    axis.title.y = element_text(size = 10)   # Adjust y-axis label size
  )

p3 <- ggplot(wheat, aes(x = Year,y = ProducationPerArea)) + 
  geom_line(linewidth = 1, color = "green") +
  scale_x_continuous(breaks=seq(1960,2017,5)) +
  labs(title="(c) Wheat yield (global average)", x = "Year", y = "Wheat yield (mean tons per hectare)") +  theme_bw() +
    theme(
    plot.title = element_text(size = 10),  # Adjust title size
    axis.title.x = element_text(size = 8),  # Adjust x-axis label size
    axis.title.y = element_text(size = 8)   # Adjust y-axis label size
  )

options(repr.plot.width=18, repr.plot.height=5) #Change plot output size
annotate_figure(ggarrange(p1, p2, p3, nrow = 1, ncol = 3), top = text_grob("Figure 1", size = 12)) #Plot side by side, with a heading at the top

This figure shows that atmospheric concentrations of CO2 have been increasing over time, as has temperature and yield per hectare. If we took a very simple approach of looking at the correlations between these variables, we might prematurely conclude that the increase in atmospheric concentrations of CO2 have resulted in increased temperature anomalies and increased yields. With the increasing trend in each series of this figure, it’s difficult to say what’s causing what.

The author of this article has a logical causal model in mind for the data – represented by the diagram below.

# Draw a diagram using the grVis function in the DiagrammR package. 
# The diagram will go from top to bottom
wheatDAG <- DiagrammeR::grViz(" digraph wheat1 {
    rankdir=TB;
    graph [compound = true, nodesep = .1, ranksep = .025];

    node [shape = ellipse, style = filled, fontsize = 2,  width=0.15, height = 0.2];
    edge [arrowhead = vee, penwidth = 0.2, weight = 0.1, arrowsize = 0.4, color = black];

    CO2 [label='CO2 concentrations', color = grey, fixedsize = true, width=0.25];
    temp [label='Temperature \n anomolies', color = grey, fixedsize = true, width = 0.25];
    yield [label='Wheat yield', color = OliveDrab, fixedsize = true, width = 0.25];

    CO2 -> temp;
    CO2 -> yield;
    temp -> yield;

    }")

wheatDAG

The directed acyclic graph (DAG) represents a causal model of the world where atmospheric concentrations of CO2 influence wheat yield directly through what’s referred to as the CO2 fertilisation effect. In this causal model, CO2 also impacts yields indirectly through temperature anomalies. There are other factors that influence wheat yield mentioned in the discussion of Demirhan (2020). These include the following:

Activity:

List at least 3 other factors that could causally influence wheat yields. Then, put the letter Y next to those factors which could be another indirect avenue for the impacts of climate change.

I’ll start you off with an example. Add as many as you can.

Pest outbreaks (e.g. locusts) Y. Yes, this factor could be influenced by climate change – increasing the occurrence of outbreaks in some locations. Read Deutsch et al. (2018) for more information on this factor

Your answers:

Demirhan (2020) estimates that through CO2 fertilisation, “every ppm increase in CO2 emissions results in a 32.322 tons increase in the world wheat production”. It is also estimated that a one degree warming results in “a drop in wheat yield of 90.4 million tons”. When the author uses the term “wheat yield” here, this can be interpreted as global wheat production

In this next section of the tutorial, we will focus on the relationship between warming and wheat yield per hectare. We will think about how focusing on yield per hectare instead of global wheat production would influence estimates. We will also consider the affect of using a more complete causal model.

The next block of code prepares the data so we can have a closer look at the relationships, without the distraction of having increasing trends in each series. The code block after that prepares visualisations of the relationship between temperature and yield per hectare.

#Prepare time series data for analysis
wheat$yield_movingAve <- sma(wheat$ProducationPerArea, h = 5)$fitted #Calculate 5 year moving average for yield per hectare
wheat$yield_percChange <- ((wheat$ProducationPerArea-wheat$yield_movingAve) / wheat$yield_movingAve) *100 #Calculate percentage change from moving average for yield per hectare

YIELD <- ts(wheat$ProducationPerArea, start = 1960)
TEMP <- ts(wheat$TempAnomaly..C.degrees., start = 1960)
Year <- seq(1960, 2017,1)
data <- data.frame(Year, YIELD, TEMP) #Combine variables into a data frame
data$tempAnomaly_detrend <- NA #Initialise column
data$yield_detrend <- NA #Initialise column

#Detrend data for comparison
data$tempAnomaly_detrend[2:nrow(data)] <- as.numeric(diff(data$TEMP, differences = 1)) #Assuming linear trend
data$yield_detrend[2:nrow(data)] <- as.numeric(diff(data$YIELD, differences = 1)) #Assuming linear trend
##Prepare the figure using ggplot syntax.
p1 <- ggplot(wheat, aes(x = TempAnomaly..C.degrees., y = yield_percChange)) + 
  geom_point() + geom_smooth(formula = y ~ x, method="lm") + 
  scale_x_continuous() +
  scale_y_continuous() +
  labs(title="a) Temperature anomaly and wheat yield per hectare - scatter") +
  xlab("Temperature anomalies (°C deviation from 1951–1980 average)") + ylab("Wheat yield (% change from moving ave.)") + theme_bw()


pDetrend1 <- ggplot(data[data$Year >= 2000,]) +
  geom_line(aes(x = Year, y = tempAnomaly_detrend, color = "Y1"), linewidth = 1) +
  geom_line(aes(x = Year, y = yield_detrend, color = "Y2"), linewidth = 1) + 
  #scale_x_continuous(breaks=seq(1965,2015,5)) +
  coord_cartesian(ylim = c(-0.8, 0.8)) +
  scale_color_manual(name = "Time series", values = c('Y1' = 'darkred',
    'Y2' = 'darkgreen'), labels = c("Temp anomaly", "Wheat yield")) +
  labs(title="b) Temperature and wheat yield over time - detrended") +
  ylab("Detrended value") +
  theme_bw() + theme(axis.text.x = element_text(size=8), axis.text.y = element_text(size=8), legend.title = element_text(size = 8), legend.text = element_text(size = 8))


options(repr.plot.width=20, repr.plot.height=6) #Change plot output size
annotate_figure(ggarrange(p1, pDetrend1, nrow = 1, ncol = 2), top = text_grob("Figure 2", size  = 12))

Figure 2 shows two different ways of visualising the relationship between temperature anomalies and wheat yield per hectare. Figure 2a shows the percentage change of wheat yield from a 5 year moving average, compared to temperature anomalies. Then Figure 2b, presents detrended time series between 2000 and 2015. Detrending the data means that we have removed the linear increase in temperature anomalies and yield per hectare. In Figure 2a we see quite a wide scatter of points around a predicted negative association. This suggests that as temperature anomalies increase, wheat yields will decrease. There is a high degree of uncertainty around this negative association, represented by grey shaded area around the linear regression fit.

In Figure 2b we can see this association over time. To make this easier to see, we are only viewing from 2000 onwards. Comparing these detrended series, we can see that there are several years where changes in temperature result in crop yields per hectare going in the opposite direction.

Activity: In assignment 1 you will provide a critical evaluation of the adequacy of existing data and make a case for intervention. To prepare for that, consider the uncertainty around Figure 2a and the implications for interventions.

Why is the impact on yield per hectare more informative than global annual wheat production? How would these different metrics influence the design of interventions?

In what ways would disaggregating yield and temperature anomalies by geography affect estimates of the impact on wheat yield per hectare? Refer to Figure 3 for one example.

How would disagregating by geography influence the design of interventions to maximise future wheat yields? Hint: budgets will be limited and interventions will need to be effective.

Figure 3 below is an extract from the IPCC report on Food security and production systems. The numbers in parentheses indicate sample sizes.