Climate data analysis project for the Caserta city area
knitr::opts_chunk$set(echo = TRUE)
# Libraries
library(knitr)
library(rmdformats)
library(ggplot2)
library(dplyr)
library(raster)
library(ncdf4)
library(tidyverse)
library(ncdf.tools)
library(magrittr)
library(PCICt)
library(readr)
library(DT)
library(kableExtra)
Abstract
During my third year of graduate school, I interned at the Caserta office of the Euro-Mediterranean Center on Climate Change (CMCC), a scientific research institution working in the field of climate science. CMCC aims to deepen the understanding of climate variability, its causes and its consequences by developing high-resolution simulations with global Earth system models and regional models, with a focus on the Mediterranean area.
The purpose of the internship was to carry out an in-depth climate analysis and build a detailed picture of the current climate in the Caserta area (bounding coordinates N: 41.19046, S: 41.00669, E: 14.39992, W: 14.20776). The analysis uses a set of indicators commonly found in the literature to characterize the climate and its evolution, both in terms of average values, such as temperature trends on an annual and seasonal scale, and in terms of extremes, i.e., values of the variables of interest (e.g., temperature) that depart markedly from the averages for the area in a given reference period. The indicators most commonly used to describe the intensity and frequency of such extreme events are those defined by the Expert Team on Climate Change Detection and Indices (ETCCDI).
It is important to point out that the study of climate implies, by definition, the use of long time scales; in particular, the World Meteorological Organization (WMO 2007) establishes 30 years as the standard duration over which to carry out statistical analyses that can be considered representative of the climate of a certain area.
The indicators are calculated from atmospheric data produced by a very high spatial resolution (about 2 km) climate reanalysis simulation, created by the CMCC Foundation and available over Italy for the period 1989-2020. This simulation (hereafter referred to as ERA5-2km) is obtained by dynamically downscaling the ERA5 reanalysis with COSMO-CLM, a regional climate model (RCM) developed by the CLM Assembly, with which the CMCC Foundation collaborates.
The results presented in this paper indicate that the temperature has been steadily increasing, especially over the past 10 years, an increase driven particularly by the summer and spring seasons. The number of extremely hot days and warm nights is increasing, while days with particularly low temperatures are decreasing. The hottest locations are precisely those in urban areas, while the coolest are hilly and sparsely populated areas.
While the fall months have presented a continuous and gradual increase in temperature over time, the summer months have shown a drastic increase in temperature over the past 10 years; in contrast, the spring months presented this change between 30 and 20 years ago. Spring and summer show an increase in the temperature variance between years, while autumn months show a decrease in the variance and some months have presented an increase in the average temperature difference over time.
TABLE OF CONTENTS
2. World of Change; Global Temperatures
- 4.1. Temperature Anomalies over Time
- 4.2. Max, Mean and Min Temperature over Year
- 4.3. Climate Change Indices
- 4.4. Number of Days above/below the average
5. Temperature Evolution over Time
- 6.1. Seasonal Distribution
- 6.2. Trend Analysis on Seasons
- 6.2. Seasons during Time
- 6.2. Monthly temperature change
Introduction
As part of the third year of my degree, I did an internship at the Caserta office of CMCC.
The CMCC (Euro-Mediterranean Center on Climate Change) collaborates with several international organisations specialising in advanced and applied research to carry out studies and build models of our climate system and its interactions with society. Its goal is to deliver reliable, timely and rigorous results that stimulate sustainable growth, protect the environment and support science-based adaptation and mitigation policies in the context of climate change.
Two other bodies that are particularly relevant to this work are the IPCC and CMCC's REMHI division:
IPCC (Intergovernmental Panel on Climate Change): its reports are politically neutral, and it does not conduct its own research or monitor climate-related data or parameters. The purpose of the IPCC is to report on the state of scientific, technical and socio-economic knowledge about climate change, its impacts and future risks, as well as options for reducing the rate of climate change, and to provide governments at all levels with scientific information to use in developing climate policies.
REMHI (Regional Models and geo-Hydrological Impacts): its mission is to address climate problems at the local level, using regional observations to obtain very detailed information so that statistical models can provide quantitative (not just qualitative) estimates of climate change trends, including an assessment of their uncertainty.
Before addressing the topic of statistical models, it is important to clarify some concepts that are often misused or treated as synonyms: weather versus climate, climate projections versus weather forecasts, and mitigation versus adaptation.
Regarding the difference between weather and climate: weather is the state of the atmosphere at a given place over short time scales and is studied by meteorology on a daily basis, while climate is the totality of weather conditions at a given location over a long period, at least 30 years.
The second distinction concerns climate projections and weather forecasts. Climate simulations typically start in the past, running from the pre-industrial era to the present, and over this historical period they are driven by estimates of past human-induced and natural climate forcings; projections then extend them into the future. Weather forecasts, by contrast, predict the state of the atmosphere for a given place and time: quantitative data on the current state of the atmosphere are collected, and a set of equations is used to estimate how the atmosphere will evolve.
The third distinction concerns the two approaches to the problem of climate change, mitigation and adaptation. Mitigation seeks to reduce the causes of climate change; in other words, it is the set of policies that serve to reduce CO2 and other greenhouse gases in the atmosphere. Adaptation seeks to manage the impacts of climate change that are already occurring; it is the set of policies that help society adjust in order to reduce those negative impacts.
Climate is the set of weather conditions that characterise a geographical region. Its variability is defined as the fluctuation of a specific climate variable around its mean value, where the mean is obtained from long-term measurements covering at least thirty years. More specifically, interannual and decadal variability are the year-to-year and decade-to-decade variations superimposed on that long-term mean.
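The definition of variability above can be sketched in a few lines of base R. This is a minimal illustration on synthetic data: the series `annual_mean`, its trend and its noise level are assumptions made up for the example, not values from the Caserta dataset.

```r
# Synthetic annual mean temperatures (degrees C) over a 30-year period.
# The numbers are illustrative only, not taken from the report's data.
set.seed(42)
years <- 1991:2020
annual_mean <- 16 + 0.03 * (years - 1991) + rnorm(length(years), sd = 0.4)

# Long-term (30-year) mean: the reference around which variability is measured.
long_term_mean <- mean(annual_mean)

# Interannual variability: year-to-year deviations from the long-term mean.
annual_anomaly <- annual_mean - long_term_mean

# Decadal variability: deviation of each decade's mean from the long-term mean.
decade <- 10 * ((years - 1) %/% 10) + 1   # 1991, 2001, 2011
decadal_mean <- tapply(annual_mean, decade, mean)
decadal_anomaly <- decadal_mean - long_term_mean

print(round(sd(annual_anomaly), 2))   # spread of the yearly fluctuations
print(round(decadal_anomaly, 2))      # one value per decade
```

By construction the annual anomalies average to zero over the 30 years, so any positive decadal anomaly is offset by a negative one elsewhere in the period.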
World of Change; Global Temperatures
Earth’s air temperature has been rising since the industrial revolution. Although natural variability plays some role, the preponderance of evidence indicates that human activities, particularly greenhouse gas emissions, are primarily responsible for warming our planet. According to an ongoing temperature analysis by scientists at NASA’s Goddard Institute for Space Studies (GISS), Earth’s global average temperature has increased by at least 1.1° Celsius (1.9° Fahrenheit) since 1880. Most of the warming has occurred since 1975, at a rate of about 0.15-0.20°C per decade. The image below shows global temperature anomalies in 2021, the sixth warmest year on record. Nine of the world’s 10 warmest years have occurred in the past decade.
knitr::include_graphics("C:/Users/claud_kcmwfzd/Desktop/STAGE/PROJECT/Report/img/Immagine1.jpg")
As the maps show, global warming does not mean that temperatures rise everywhere and always at the same rate. For example, exceptionally cold winters in one place might be balanced by extremely warm winters in another part of the world. Generally, warming is greater over land than over the oceans because water is slower to absorb and release heat (thermal inertia).
In the bar graph below, the years from 1880 to 1939 tend to be cooler,
then stabilize by the 1950s. The decades within the base period
(1951-1980) do not appear particularly warm or cold because they are the
standard against which other years are measured.
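The role of the base period can be made concrete with a short sketch. The series below is synthetic (the values, trend and noise are assumptions, not the GISS record); it only shows why the base-period years look neutral and what changes when a different baseline is chosen.

```r
# Synthetic global annual means (degrees C), 1880-2021. Illustrative, not GISS data.
set.seed(1)
years <- 1880:2021
temp <- 13.7 + 0.008 * (years - 1880) + rnorm(length(years), sd = 0.1)

# Anomalies relative to the 1951-1980 base period.
in_base <- years >= 1951 & years <= 1980
anomaly_base <- temp - mean(temp[in_base])

# Same series against a different, hypothetical baseline (1991-2020).
in_alt <- years >= 1991 & years <= 2020
anomaly_alt <- temp - mean(temp[in_alt])

# Changing the baseline shifts every anomaly by the same constant.
offset <- anomaly_base - anomaly_alt

print(round(mean(anomaly_base[in_base]), 10))  # zero by construction
print(round(range(offset), 3))                 # both ends equal: a constant shift
```

Within the base period the anomalies average to zero by construction, which is why those decades appear neither warm nor cold. Choosing another baseline moves the whole bar graph up or down but leaves its shape unchanged.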
The leveling off of temperatures in the mid-20th century can be explained by natural variability and the cooling effects of aerosols generated by factories, power plants, and motor vehicles in the years of rapid economic growth after World War II. Fossil fuel use also increased after the war (5 percent per year), increasing greenhouse gases. Cooling due to aerosol pollution occurred rapidly; in contrast, greenhouse gases accumulated slowly, but remain in the atmosphere for a much longer period of time.
knitr::include_graphics("C:/Users/claud_kcmwfzd/Desktop/STAGE/PROJECT/Report/img/Immagine2.jpg")
Experts generally agree that the Earth is warming. Globally, 14 of the 15 warmest years on record have occurred in the 21st century, and each of the last three decades has been warmer than the previous one. The climate changes recorded so far stem from changes in the concentration of climate-altering gases in the atmosphere due to human activities; these changes vary geographically, and the impacts also differ depending on the atmospheric variable considered. As the IPCC states, the human influence on the climate system is clear: after many years of observations, and thanks to statistical models and new technologies, it has been concluded that human activity is indeed the cause of climate change.
Since 1950, some changes have been observed in all sectors of the Earth’s climate system:
- Atmospheric concentrations of greenhouse gases are increasing;
- The atmosphere and oceans have warmed;
- The extent and volume of ice has shrunk;
- Sea levels have risen;
- The increase in CO2 has caused the pH of the oceans to decrease;
- Snow cover in the Northern Hemisphere has decreased.
This is why global warming is described as “virtually certain” (probability above 99% on the IPCC likelihood scale).
Dataset Description
To understand current climate change and extreme weather events, it is important that observations of the Earth system go back as far in time as possible. However, the observations have always been unevenly distributed and are accompanied by errors.
As part of the Copernicus Climate Change Service (C3S), ECMWF has produced the ERA5 reanalysis, which encapsulates a detailed record of the global atmosphere, land surface and ocean waves since 1950, updated daily with a latency of about 5 days. ERA5 benefits from a decade of developments in model physics, core dynamics and data assimilation and is the fifth generation of ECMWF reanalyses for global weather and climate over the past 4-7 decades. The reanalyses combine model data with observations from around the world into a comprehensive and globally consistent dataset using the laws of physics. This principle, called data assimilation, is based on the method used by numerical weather prediction centers, where every few hours (12 hours at ECMWF) a previous forecast is combined with newly available observations in an optimal way to produce a new best estimate of the state of the atmosphere, called an analysis, from which an updated and improved forecast is issued. Reanalysis works in the same way, but with reduced resolution to allow for the provision of a data set spanning several decades; in addition, it does not have the constraint of issuing timely forecasts, so there is more time to collect observations and, when going further back in time, to allow for the inclusion of improved versions of the original observations, which benefits the quality of the reanalysis product. In addition to a significantly improved horizontal resolution of 31 km, compared to the 80 km of ERA-Interim, ERA5 has hourly output over the entire area.
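The idea of combining a previous forecast with new observations "in an optimal way" can be illustrated with a toy scalar case. This is only a sketch under strong assumptions (a single variable, one observation, unbiased and uncorrelated errors with known variances); the function `analysis_update` and all numbers are hypothetical and do not represent the scheme actually used at ECMWF.

```r
# Toy scalar analysis step: blend a background forecast with one observation,
# weighting each by the inverse of its error variance.
# All names and numbers are hypothetical, chosen for illustration only.
analysis_update <- function(background, obs, var_background, var_obs) {
  # Weight given to the observation: large when the background is uncertain.
  gain <- var_background / (var_background + var_obs)
  # The analysis moves the background towards the observation by that weight.
  analysis <- background + gain * (obs - background)
  # The analysis error variance is smaller than either input variance.
  var_analysis <- (1 - gain) * var_background
  list(analysis = analysis, variance = var_analysis, gain = gain)
}

res <- analysis_update(background = 15.0, obs = 16.0,
                       var_background = 1.0, var_obs = 0.25)
print(res$analysis)
print(res$variance)
```

With these inputs the observation is four times more precise than the background, so the analysis lands much closer to the observation, and its variance is lower than that of both sources. A reanalysis repeats this kind of update over the whole historical record with a fixed model version.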
In addition to these global reanalysis projects, there are also high-resolution regional reanalysis activities for different regions, such as North America, Europe, or Australia. These regional reanalyses are usually based on a regional weather forecast model and use the boundary conditions of a global reanalysis.
With the Highlander project, CMCC, particularly the REMHI division, and its partners present a new dataset for the recent climate, developed by dynamically downscaling the ERA5 reanalysis from its original horizontal resolution of about 31 km to a resolution of 2.2 km. The dynamical downscaling was carried out with the COSMO-CLM regional climate model (RCM). The temporal resolution of the output is hourly (as for ERA5), and the runs cover the entire Italian territory (and neighboring areas as needed) to provide a very detailed (in terms of spatiotemporal resolution) and complete (in terms of meteorological fields) picture of climatological data for at least the last 30 years (01/1989-12/2020).
In this case, the dataset contains the ERA5 reanalysis dynamically downscaled to 2.2 km x 2.2 km over the area of Caserta bounded by N: 41.19046, S: 41.00669, E: 14.39992, W: 14.20776. In this area, temperature is available at 80 grid points, one every 2.2 km, forming a 10x8 grid that covers the entire city and its surroundings.
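As a rough consistency check on the grid size, the extent of the bounding box can be converted to kilometres and divided by the 2.2 km spacing. The sketch below assumes a spherical Earth with about 111.2 km per degree of latitude and counts the number of 2.2 km cells needed to cover each side; the actual model grid is defined by COSMO-CLM and may be laid out differently.

```r
# Bounding box of the study area, as given in the text (decimal degrees).
north <- 41.19046; south <- 41.00669
east  <- 14.39992; west  <- 14.20776

# Approximate length of one degree of latitude on a spherical Earth (assumption).
km_per_deg_lat <- 111.2
mid_lat <- (north + south) / 2

# North-south and east-west extent in km; a degree of longitude shrinks
# with the cosine of the latitude.
height_km <- (north - south) * km_per_deg_lat
width_km  <- (east - west) * km_per_deg_lat * cos(mid_lat * pi / 180)

# Number of 2.2 km cells needed to cover each side of the box.
spacing_km <- 2.2
n_rows <- ceiling(height_km / spacing_km)
n_cols <- ceiling(width_km / spacing_km)

print(round(c(height_km = height_km, width_km = width_km), 1))
print(c(n_rows = n_rows, n_cols = n_cols, n_points = n_rows * n_cols))
```

The box is roughly 20 km by 16 km, which gives 10 rows and 8 columns, in agreement with the 80 grid points mentioned above.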
library(GiNA)
library(ggmap)
library(rgdal)
library(sf)
Data_first <- read_csv("C:/Users/claud_kcmwfzd/Desktop/STAGE/PROJECT/Report/Data_first.csv")
Data_second <- read_csv("C:/Users/claud_kcmwfzd/Desktop/STAGE/PROJECT/Report/Data_second.csv")
lon_lat <- Data_first %>% dplyr::select(long, lat)
# st_as_sf() expects coords in (x, y) order, i.e. (longitude, latitude)
long_lat <- lon_lat %>% st_as_sf(coords = c("long", "lat"), crs = 4326)
stations <- st_as_sf(Data_first, coords = c("long", "lat"), crs = 4326)
library(pdp)
library(grDevices)
library(leaflet)
library(RColorBrewer)
library(htmltools)
mybins <- c(0,14,14.5,15,15.5,16,16.5,17,17.5,18,18.5,19, Inf)
mypalette <- colorBin( palette="YlOrBr", domain=Data_first$mean_T_2M, na.color="transparent", bins=mybins)
library(cowplot)
tag.map.title <- tags$style(HTML("
.leaflet-control.map-title {
left: 20%;
top: -90px;
background: rgba(255,255,255,0.75);
font-weight: bold;
font-size: 28px;
}
"))
title1 <- tags$div(
tag.map.title, HTML("Temperature sensors")
)
# Constant placeholder value so that every grid point gets the same colour on the map
df_long_tot_plot2 <- Data_first %>% dplyr::select(long, lat)
df_long_tot_plot2$mean_T_2M <- 30
# Round the coordinates to 3 decimals for the hover labels
lonlat_round1 <- Data_first %>%
  dplyr::select(long, lat)
lonlat_round1$long <- round(lonlat_round1$long, digits = 3)
lonlat_round1$lat <- round(lonlat_round1$lat, digits = 3)
mytext <- paste(
"lon: ", lonlat_round1$long,"<br/>",
"lat: ", lonlat_round1$lat,"<br/>",
sep="") %>%
lapply(htmltools::HTML)
# Final map
l1 <- leaflet(Data_first) %>%
addTiles() %>%
setView(lng = mean(Data_first$long), lat = mean(Data_first$lat), zoom = 11) %>%
addCircleMarkers(data = stations,
fillColor = ~mypalette(df_long_tot_plot2$mean_T_2M),
stroke=TRUE,
fillOpacity = 0.8,
color="white",
weight=0.3,
label = mytext,
labelOptions = labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "13px",
direction = "auto"
)
) %>%
addControl(title1, position = "topleft", className="map-title")
l1
The runs cover the entire territory to provide a very detailed (in terms of spatio-temporal resolution) and complete (in terms of meteorological fields) dataset of climatological data for at least the last 30 years (01/1989-12/2020). The temporal coverage of the dataset is from 01/01/1989 00:00 to 31/12/2020 23:00 and the temporal resolution is 1 hour.
Specifically, the data we are going to analyse represent:
mean_T_2M = hourly mean over all grid points (longitude-latitude pairs) in the area under analysis;
max_T_2M = maximum value reached in each hour over the entire area under analysis;
min_T_2M = minimum value reached in each hour over the entire area under analysis.
Link (CMCC DDS) = https://dds.cmcc.it/#/dataset/era5-downscaled-over-italy/VHR-REA_IT_1989_2020_hourly
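The three variables defined above can be read as a spatial aggregation carried out hour by hour. The sketch below shows that reading on a tiny made-up table; the data frame `grid_hourly`, its column names `point_id` and `T_2M`, and the values are assumptions for illustration, not the actual structure of the downloaded files.

```r
# Hypothetical long-format table: one row per grid point and hour.
# Three grid points and two hours only, with made-up temperatures (degrees C).
grid_hourly <- data.frame(
  Date     = rep("1989-01-01", 6),
  Time     = rep(c("00:00", "01:00"), each = 3),
  point_id = rep(1:3, times = 2),
  T_2M     = c(8.1, 7.4, 9.0, 7.9, 7.0, 8.8),
  stringsAsFactors = FALSE
)

# Group by hour, then summarise across the grid points of the area.
by_hour <- grid_hourly[c("Date", "Time")]
hourly_summary <- aggregate(grid_hourly["T_2M"], by = by_hour, FUN = mean)
names(hourly_summary)[3] <- "mean_T_2M"
hourly_summary$min_T_2M <- aggregate(grid_hourly["T_2M"], by = by_hour, FUN = min)$T_2M
hourly_summary$max_T_2M <- aggregate(grid_hourly["T_2M"], by = by_hour, FUN = max)$T_2M

print(hourly_summary)
```

Each row of `hourly_summary` corresponds to one hour, with the mean, minimum and maximum taken over the grid points. With the real data the same grouping would run over the 80 grid points of the Caserta area.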
Hourly temperature database:
df_1989_2020_by_hour <- read_csv("C:/Users/claud_kcmwfzd/Desktop/STAGE/PROJECT/Report/df_1989_2020_by_hour.csv")
# Keep the date, time and the three temperature summaries, rounded to 3 decimals
df_1989_2020_by_hour_plot <- df_1989_2020_by_hour %>%
  dplyr::select(Date, Time, mean_T_2M, min_T_2M, max_T_2M) %>%
  mutate_if(is.numeric, round, digits = 3)
datatable(df_1989_2020_by_hour_plot, rownames = FALSE, filter = "top", options = list(pageLength = 5, scrollX = TRUE))