March 27, 2025
Abstract
This study applies functional data analysis (FDA) to examine drought conditions in southern Portugal using the Consecutive Dry Days (CDD) index. By integrating FDA with spatial analyses, the research aims to provide a comprehensive understanding of drought dynamics, revealing detailed patterns of drought persistence, seasonal behavior, and spatial distribution. This integrated approach offers a novel perspective on drought characterization, enhancing predictive capabilities and informing more effective resource management and drought mitigation strategies.
The dataset covers 17 meteorological stations (MS) located in southern mainland Portugal, with data available for all stations from 1990 to 2020. For each station and month, it records the number of consecutive days with precipitation below a specified threshold, known as the Consecutive Dry Days (CDD) index.
Time series of the CDD index (top) and smoothed curves (bottom) fitted to the CDD index using cubic splines (with knots every 3 days). Dark grey boxes mark the autumn-winter periods.
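As an illustration of how a CDD value can be derived from daily records, the sketch below returns the longest run of consecutive dry days in a sequence of daily precipitation totals. The function name, the 1.0 mm threshold, the choice of the longest run, and the sample values are assumptions made for this example; the threshold and exact counting rule used for the dataset are not stated here.

```python
def longest_dry_spell(daily_precip_mm, threshold_mm=1.0):
    """Longest run of consecutive days with precipitation below threshold_mm.

    threshold_mm=1.0 is an assumed value, not taken from the study.
    """
    longest = 0
    current = 0
    for precip in daily_precip_mm:
        if precip < threshold_mm:
            # Dry day: extend the current spell.
            current += 1
            longest = max(longest, current)
        else:
            # Wet day: the spell is broken.
            current = 0
    return longest


# Hypothetical daily precipitation totals (mm), for illustration only.
example_days = [0.0, 0.2, 3.5, 0.0, 0.0, 0.0, 0.4, 12.1, 0.0, 0.0]
cdd = longest_dry_spell(example_days)
```

Applying such a function to each station and month would yield one monthly series per station, which is the kind of input the smoothing step works on.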
Covariance matrices between days (top: raw covariance; bottom: normalized covariance).
The functional data were decomposed into principal components based on the covariance matrix. The following plot shows the results of the functional principal component analysis for the first 20 components, namely the proportion of overall variability explained by each component (absolute and cumulative).
For example, as seen in the plot, the cumulative proportion explained by the first 4 components is 74%.
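The absolute and cumulative proportions follow directly from the eigenvalues of the covariance operator: each proportion is an eigenvalue divided by the sum of all eigenvalues. A minimal sketch, using hypothetical eigenvalues rather than the ones estimated in this analysis:

```python
from itertools import accumulate


def explained_variance(eigenvalues):
    """Return (absolute, cumulative) proportions of variability per component."""
    total = sum(eigenvalues)
    proportions = [value / total for value in eigenvalues]
    cumulative = list(accumulate(proportions))
    return proportions, cumulative


# Hypothetical eigenvalues, sorted in decreasing order (illustration only).
hypothetical_eigenvalues = [8.0, 4.0, 2.0, 1.0, 1.0]
proportions, cumulative = explained_variance(hypothetical_eigenvalues)
```

With these illustrative values the first two components account for 0.5 and 0.25 of the variability, so their cumulative proportion is 0.75.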
The following plots show the eigenfunctions of the first 4 components across time. For example, the eigenfunction associated with the first principal component (0.535) reveals the overall trend, which is dominated by a seasonal increase in the CDD index, more pronounced in the autumn-winters of 2000-01, 2002-03, 2015-16 and 2019-20. Stations with high positive scores on this component will show markedly above-average CDD index values in those periods.
Are the curves spatially correlated? That is, do curves from neighboring stations tend to be more similar than curves from stations further apart? The semi-variogram below indicates that they do, showing a clear spatial correlation.
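One common way to build such a semi-variogram for functional data is the trace-variogram, in which the semivariance for a pair of stations is half the integrated squared difference between their curves. The sketch below computes that quantity for every station pair, assuming curves sampled on a common, evenly spaced grid and planar coordinates; the function names, coordinates and curves are hypothetical.

```python
import math


def integrated_sq_diff(curve_a, curve_b, dt=1.0):
    """Trapezoidal approximation of the integral of (a(t) - b(t))^2."""
    squared = [(a - b) ** 2 for a, b in zip(curve_a, curve_b)]
    return dt * (sum(squared) - 0.5 * (squared[0] + squared[-1]))


def trace_variogram_cloud(coords, curves, dt=1.0):
    """Return one (distance, semivariance) pair per pair of stations."""
    cloud = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            distance = math.dist(coords[i], coords[j])
            semivariance = 0.5 * integrated_sq_diff(curves[i], curves[j], dt)
            cloud.append((distance, semivariance))
    return cloud


# Hypothetical station coordinates (km) and curves (illustration only).
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
curves = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
cloud = trace_variogram_cloud(coords, curves)
```

Averaging the semivariances within distance bins gives the empirical variogram; semivariance that grows with distance is what indicates spatial correlation.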
An exponential model with the following estimated parameters was fitted to the trace-variogram results:
The estimated parameters suggest that spatial correlation is present up to 80 km, but only part of that variability (0.42) is spatially correlated.
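For reference, an exponential variogram model is often written as a nugget plus a partial sill that is approached exponentially with distance. The sketch below uses that common parametrization, which is an assumption here since the exact form used in the fit is not stated; the parameter names and all numeric values are hypothetical.

```python
import math


def exponential_variogram(h, nugget, partial_sill, range_param):
    """Exponential model: nugget + partial_sill * (1 - exp(-h / range_param)).

    Assumed parametrization; software packages differ in how they scale
    the range parameter.
    """
    return nugget + partial_sill * (1.0 - math.exp(-h / range_param))


# Hypothetical parameter values (illustration only).
nugget, partial_sill, range_param = 0.5, 0.5, 25.0
near = exponential_variogram(0.0, nugget, partial_sill, range_param)
far = exponential_variogram(3.0 * range_param, nugget, partial_sill, range_param)
```

Under this parametrization the model reaches about 95% of the partial sill at three times the range parameter, a distance often called the practical range. The share of variability that is spatially structured is the partial sill divided by the total sill (nugget plus partial sill).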
Using the functional principal component results, we cluster the stations into groups based on the dissimilarities between their curves. For this purpose, a hierarchical clustering (HC) technique is used. HC computes a dissimilarity matrix from the pairwise dissimilarities between station curves and uses it to cluster the stations into homogeneous groups. Here we explicitly use the fitted spatial correlation model (semi-variogram) to weight the dissimilarity matrix, so that more weight is given to dissimilarities between curves when stations are closer to each other. The number of clusters is set by the user. Here we use 4 clusters (with the agglomeration method ‘Ward D2’).
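To make the agglomeration step concrete, the sketch below implements Ward's criterion on squared dissimilarities through the Lance-Williams update, which is how the "D2" variant is usually described. It accepts any symmetric dissimilarity matrix, weighted or not; how the spatial weights are built is not reproduced here. The function name and the example matrix are hypothetical, and in practice a library routine would normally be used instead.

```python
def ward_d2_clusters(dissim, n_clusters):
    """Agglomerative clustering with Ward's criterion on squared dissimilarities.

    dissim: symmetric matrix (list of lists) of pairwise dissimilarities.
    Returns one cluster label (1..n_clusters) per item.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    n = len(dissim)
    # Squared dissimilarities between current clusters, keyed by (low_id, high_id).
    d2 = {(i, j): dissim[i][j] ** 2 for i in range(n) for j in range(i + 1, n)}
    members = {i: [i] for i in range(n)}
    next_id = n
    while len(members) > n_clusters:
        # Merge the pair of clusters with the smallest squared dissimilarity.
        (a, b), d_ab = min(d2.items(), key=lambda item: item[1])
        size_a, size_b = len(members[a]), len(members[b])
        for k in members:
            if k in (a, b):
                continue
            size_k = len(members[k])
            d_ak = d2[(min(a, k), max(a, k))]
            d_bk = d2[(min(b, k), max(b, k))]
            # Lance-Williams update for Ward's method.
            d2[(k, next_id)] = (
                (size_a + size_k) * d_ak + (size_b + size_k) * d_bk - size_k * d_ab
            ) / (size_a + size_b + size_k)
        # Drop every entry that involves one of the two merged clusters.
        d2 = {key: val for key, val in d2.items() if a not in key and b not in key}
        members[next_id] = members.pop(a) + members.pop(b)
        next_id += 1
    labels = [0] * n
    for label, items in enumerate(sorted(members.values(), key=min), start=1):
        for item in items:
            labels[item] = label
    return labels


# Hypothetical dissimilarities between four items (illustration only).
example_dissim = [
    [0.0, 1.0, 10.0, 11.0],
    [1.0, 0.0, 9.0, 10.0],
    [10.0, 9.0, 0.0, 1.0],
    [11.0, 10.0, 1.0, 0.0],
]
labels = ward_d2_clusters(example_dissim, n_clusters=2)
```

In this toy matrix the first two items and the last two items are each close to one another, so they end up in separate clusters.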
The spatial pattern of the resulting classification follows the major landscape features of southern Portugal (drier in the south and inland, more humid in the north and along the coast).
The next plot shows the curves by cluster. The black line, shown in all panels, is the overall median curve. Distinct patterns of CDD index curves are visible across clusters. For example, the stations of cluster 4 show a clearly above-average CDD index across the 31-year period.
The relevance of each station to the different types of variability found is assessed through its functional scores on each component. Here we see the station scores, by cluster, in Cartesian planes formed by pairs of functional components 1, 2, 3 and 4.
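A functional score is usually defined as the inner product between a station's centered curve and an eigenfunction. The sketch below approximates that integral with a simple Riemann sum on an evenly spaced grid; the function name and all values are hypothetical and not taken from the analysis.

```python
def fpc_score(curve, mean_curve, eigenfunction, dt=1.0):
    """Approximate the integral of (curve(t) - mean(t)) * eigenfunction(t)."""
    products = [
        (x - m) * phi for x, m, phi in zip(curve, mean_curve, eigenfunction)
    ]
    return dt * sum(products)


# Hypothetical curve, mean curve and eigenfunction (illustration only).
score = fpc_score(
    curve=[2.0, 4.0, 6.0],
    mean_curve=[1.0, 1.0, 1.0],
    eigenfunction=[1.0, 0.0, -1.0],
)
```

A large positive score means the station departs from the mean curve in the direction of the eigenfunction, while a negative score, as in this toy example, means it departs in the opposite direction.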