Time series analysis and change detection of selected habitats

This document tries to analyse time series NDVI and NDMI indices data for selected habitats in Czechia.

V1.0

Data is manually created in GEE for one selected polygon. CSV file is then manually downloaded into users PC. After that, R is used to do several steps.

V 2.0 plans
- Script should be 100% in RGEE

V 3.0 plans
- automatic data export from gee of thousands of polygons into CSV
- automatic TS analysis and filtering based on Mann-Kendall tests and other tests
- identifying non-stationary TS, finding out whether its positive or negative trend

Steps
0. Install or activate selected libraries needed for TS analysis into the R software

# install pacman library if needed
if(!require("pacman")){install.packages("pacman")}
## Loading required package: pacman
# Load the necessary libraries
pacman::p_load(char = c('reticulate', 'rgee', 'remotes', 'sf', 'magrittr', 'tigris', 'tibble', 'stars', 'raster',
                         'dplyr', 'forecast', 'stars', 'st', 'lubridate', 'imputeTS', 'leaflet', 'classInt',
                        'RColorBrewer', 'ggplot2', 'googledrive', 'geojsonio', 'ggpubr', 'readr', 'zoo', 'tseries', 'trend', 'Kendall', 'forecast'), 
                        install = T, update = F, character.only = T)
  1. Upload the csv data into R via readr library
# Read in the CSV file (Use read_csv from readr package
data <- read_csv("C:\\Users\\prochazka5\\Desktop\\time series data gee\\ee-chart1.csv")
## Rows: 71 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): system:time_start
## dbl (1): NDVI
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
  1. Convert data to TS object
# Convert the data to a time series object (), rok a den startu měření, frekvence 12 (měsíční, nebo jiná) 
#(je třeba správně nastavit začátek a konec dle datasetu!)
data_ts <- ts(data, start = c(1984, 01), end = c(2011, 12), frequency = 12)
plot(data_ts[,2])

3. Eliminate NA values and model missing data using Kalman smoothing

na.cnt <- length(data_ts[is.na(data_ts)])
ndvi <- if(na.cnt > 0){imputeTS::na_kalman(data_ts, model = "auto.arima", smooth = T)} else {data_ts}
plot(ndvi[,2])

4. Decompose TS and plot the result

# Decompose the time series
decomposed <- stl(ndvi[,2], s.window = 'periodic')

#plot decomposed TS
plot(decomposed)

5. Plot various TS charts and seasonal charts
6. Forecasting

#A variation on the naïve method is to allow the forecasts to increase or decrease over time
#, where the amount of change over time (called the drift) is set to be the average change seen in the historical data
autoplot(ndvi[,2]) +
  autolayer(meanf(ndvi[,2], h=40),
            series="Mean", PI=FALSE) +
  autolayer(rwf(ndvi[,2], h=40),
            series="Naïve", PI=FALSE) +
  autolayer(rwf(ndvi[,2], drift=TRUE, h=40),
            series="Drift", PI=FALSE) +
  ggtitle("NDVI in specified region") +
  xlab("Year") + ylab("NDVI") +
  guides(colour=guide_legend(title="Forecast"))

7. Testing statistical hypotheses regarding stationarity of TS

#Augmented Dickey-Fuller test
fuller <- adf.test(ndvi[,2])
## Warning in adf.test(ndvi[, 2]): p-value smaller than printed p-value
#if p-value less than 0.05, we can reject the null hypothesis (TS is non-stationary)

#Mann-Kendall Trend Test
kendall <- MannKendall(ndvi [,2])
#H0: There is no discernible pattern in the data.
#H1: There is a trend in the data. (This could indicate a positive or negative trend.)

print(fuller)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ndvi[, 2]
## Dickey-Fuller = -16.673, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
print(kendall)
## tau = -0.0013, 2-sided pvalue =0.97208