Kericho Trees

Step 1.1: loading and filtering

Step 1.1 loads the dataset and filters it to the district under investigation. This script is designed to analyse the district of Kericho, Rift Valley region, Kenya.

## Parsed with column specification:
## cols(
##   .default = col_double(),
##   did = col_integer(),
##   district = col_character(),
##   year = col_integer(),
##   month = col_integer(),
##   prov = col_character(),
##   season = col_character(),
##   season_type = col_character(),
##   tmaxgdd = col_integer(),
##   psn_smonth = col_integer(),
##   psn_emonth = col_integer(),
##   psn_length = col_integer(),
##   ssn_smonth = col_integer(),
##   ssn_emonth = col_integer(),
##   ssn_length = col_integer(),
##   mcounter = col_integer(),
##   tmaxgdd_csum = col_integer(),
##   yearf = col_integer()
## )

## See spec(...) for full column specifications.

## Parsed with column specification:
## cols(
##   ENSO = col_double()
## )
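The "Parsed with column specification" messages above appear to come from readr's CSV reader. A minimal sketch of the load-and-filter step is given here in base R so that it runs without extra packages. The inline CSV text, its values, and the object names `all_districts`, `target_district` and `kericho` are hypothetical; only the column names `did`, `district`, `year`, `month` and `prov` are taken from the specification above.

```r
# Toy CSV standing in for the real input file (all values are invented).
csv_text <- "did,district,year,month,prov
1,Kericho,1983,1,Rift Valley
1,Kericho,1983,2,Rift Valley
2,Nakuru,1983,1,Rift Valley
1,Kericho,1984,1,Rift Valley"

# Read the table; stringsAsFactors = FALSE keeps text columns as character.
all_districts <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Keep only the rows for the district under investigation.
target_district <- "Kericho"
kericho <- all_districts[all_districts$district == target_district, ]

nrow(kericho)
```

Filtering on the `district` name rather than on `did` is a choice made for readability; either column would work if the identifiers are unique.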

Step 1.2: Maize yield climatological anomaly

Step 1.2 computes the maize yield climatological anomaly, obtained as follows: maize yield climatological anomaly = observed residuals (after linear detrending) - climatological mean (1983-2014).
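The anomaly definition can be sketched in base R as follows. This is an illustration under assumptions, not the script's actual code: the yield series is invented, the trend is fitted by ordinary least squares against the year, and the climatological mean is assumed to be the mean of the detrended residuals over 1983-2014. All object names are hypothetical.

```r
# Toy yield series (values are invented for illustration only).
set.seed(1)
years <- 1981:2016
yield <- 1.5 + 0.02 * (years - 1981) + rnorm(length(years), sd = 0.2)

# 1. Linear detrending: regress yield on year and keep the residuals.
trend_fit <- lm(yield ~ years)
resid_detrended <- residuals(trend_fit)

# 2. Climatological mean of the residuals over the reference period (assumption).
in_reference <- years >= 1983 & years <= 2014
clim_mean <- mean(resid_detrended[in_reference])

# 3. Anomaly = detrended residual minus climatological mean.
yield_anomaly <- resid_detrended - clim_mean
```

By construction, the anomalies average to zero over the reference period. If the trend were fitted on exactly the reference years, the residuals would already have zero mean and step 3 would change nothing.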

Step 2.1: fitting a fast-and-frugal tree (FFT) model

Step 2.1 fits fast-and-frugal tree (FFT) decision models to maize yield anomalies 6 months before harvesting. To fit an FFT model, I need two types of input: continuous (the predictors) and discrete (a binary criterion based on maize yield quantiles, which takes the value 1 when observed >= Qx and 0 when observed < Qx). The quantiles investigated are Q15%, Q20%, Q25%, Q30%, Q35% and Q40%.
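The discrete input described above can be sketched as one 0/1 column per quantile. The anomaly values below are invented and the object names are hypothetical; `quantile()` is used with its default interpolation, which is an assumption about how Qx was computed.

```r
# Toy anomaly series (invented values).
yield_anomaly <- c(-0.40, -0.25, -0.10, -0.05, 0.00, 0.05, 0.10, 0.20, 0.30, 0.45)

# Quantile levels investigated in the text.
probs <- c(0.15, 0.20, 0.25, 0.30, 0.35, 0.40)
thresholds <- quantile(yield_anomaly, probs = probs)

# One 0/1 column per quantile: 1 when observed >= Qx, 0 when observed < Qx.
criteria <- sapply(thresholds, function(qx) as.integer(yield_anomaly >= qx))
colnames(criteria) <- c("Q15", "Q20", "Q25", "Q30", "Q35", "Q40")

colSums(criteria)  # number of observations at or above each threshold
```

Each column of `criteria` can then serve as the binary criterion for a separate model, one per quantile.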

Before fitting an FFT model, I select the 5 most important predictors in my dataset by finding the ones that maximize Hits and minimize False Alarms for each quantile and each lead time before harvesting.
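The text does not spell out how Hits and False Alarms are combined into a ranking, so the following is only one possible sketch. It assumes each predictor is turned into a 0/1 forecast by thresholding at its median (trying both directions) and is scored by hit rate minus false alarm rate. The functions `hit_fa` and `score_predictor`, the scoring rule, and the toy data are all hypothetical.

```r
# Hit rate and false alarm rate of a 0/1 forecast against a 0/1 observation.
hit_fa <- function(forecast, observed) {
  c(hit = sum(forecast == 1 & observed == 1) / sum(observed == 1),
    fa  = sum(forecast == 1 & observed == 0) / sum(observed == 0))
}

# Score one continuous predictor: threshold at its median, try both directions,
# and keep the better value of (hit rate - false alarm rate).
score_predictor <- function(x, observed) {
  above <- hit_fa(as.integer(x >= median(x)), observed)
  below <- hit_fa(as.integer(x <  median(x)), observed)
  max(above["hit"] - above["fa"], below["hit"] - below["fa"])
}

# Toy data: 'signal' tracks the criterion, 'noise' does not (invented values).
observed <- c(0, 0, 0, 1, 1, 1, 1, 1)
predictors <- data.frame(
  signal = c(0.1, 0.2, 0.3, 0.6, 0.7, 0.8, 0.9, 1.0),
  noise  = c(0.5, 0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4)
)

# Rank the predictors by score and keep the best ones.
scores <- sapply(predictors, score_predictor, observed = observed)
n_keep <- 1  # the text keeps 5; the toy data has only 2 candidates
top_predictors <- names(sort(scores, decreasing = TRUE))[seq_len(n_keep)]
```

In the analysis described here, this ranking would be repeated for every quantile and lead time, with `n_keep` set to 5.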

With these 5 predictors, I fit an FFT model on k-1 elements and test it on the remaining one. The aim of this leave-one-out cross-validation step is to find the number of branches (from 1 to 5) to adopt in my final model, namely the one that maximizes the weighted accuracy (wacc) index.
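The leave-one-out loop can be sketched in base R. To stay runnable without the FFTrees package, the sketch uses a deliberately simple stand-in classifier (a majority vote over median-thresholded cues), which is not an FFT. Its tuning parameter `n_cues` plays the role of the number of branches, and in the real analysis the two stand-in functions would be replaced by the FFT fit and prediction. Weighted accuracy is computed as sens.w * sensitivity + (1 - sens.w) * specificity. All function names and the toy data are hypothetical.

```r
# Weighted accuracy: sens.w * sensitivity + (1 - sens.w) * specificity.
wacc <- function(pred, obs, sens.w = 0.75) {
  sens <- sum(pred == 1 & obs == 1) / sum(obs == 1)
  spec <- sum(pred == 0 & obs == 0) / sum(obs == 0)
  sens.w * sens + (1 - sens.w) * spec
}

# Stand-in classifier (NOT an FFT): thresholds the first n_cues predictors at
# their training medians and predicts 1 when at least half of the cues vote 1.
fit_standin <- function(train_x, n_cues) {
  list(cuts = sapply(train_x[, seq_len(n_cues), drop = FALSE], median),
       n_cues = n_cues)
}
predict_standin <- function(model, new_x) {
  cues <- as.matrix(new_x[, seq_len(model$n_cues), drop = FALSE])
  votes <- sweep(cues, 2, model$cuts, ">=")
  as.integer(rowMeans(votes) >= 0.5)
}

# Leave-one-out: fit on k - 1 rows, predict the held-out row, repeat for all rows.
loocv_wacc <- function(x, y, n_cues, sens.w = 0.75) {
  pred <- vapply(seq_len(nrow(x)), function(i) {
    model <- fit_standin(x[-i, , drop = FALSE], n_cues)
    predict_standin(model, x[i, , drop = FALSE])
  }, integer(1))
  wacc(pred, y, sens.w)
}

# Toy data: 5 predictors, of which only the first two carry signal (invented).
set.seed(42)
n <- 30
y <- rep(c(0, 1), times = c(10, 20))
x <- data.frame(p1 = y + rnorm(n, sd = 0.5), p2 = y + rnorm(n, sd = 1.0),
                p3 = rnorm(n), p4 = rnorm(n), p5 = rnorm(n))

# Cross-validated wacc for each candidate setting (1 to 5), then pick the best.
cv_scores <- sapply(1:5, function(k) loocv_wacc(x, y, n_cues = k))
best_n_cues <- which.max(cv_scores)
```

With sens.w = 0.75, a correctly predicted 1 counts three times as much as a correctly predicted 0 when the candidate settings are compared.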

To determine my final model, I fit the model on 100% of the data using the chosen tuning parameter.

Model 6 months before harvesting - this trial uses max.levels = 1 to 5, goal = wacc, sens.w = 0.75

## Loading required package: FFTrees

## FFTrees v1.3.5. Email: Nathaniel.D.Phillips.is@gmail.com

## FFTrees.guide() opens the package guide. Citation info at citation('FFTrees')

Step 2.2: fitting a fast-and-frugal tree (FFT) model

Model 5 months before harvesting - this trial uses max.levels = 1 to 5, goal = wacc, sens.w = 0.75

Step 2.3: fitting a fast-and-frugal tree (FFT) model

Model 4 months before harvesting - this trial uses max.levels = 1 to 5, goal = wacc, sens.w = 0.75

Step 2.4: fitting a fast-and-frugal tree (FFT) model

Model 3 months before harvesting - this trial uses max.levels = 1 to 5, goal = wacc, sens.w = 0.75

Step 2.5: fitting a fast-and-frugal tree (FFT) model

Model 2 months before harvesting - this trial uses max.levels = 1 to 5, goal = wacc, sens.w = 0.75

Step 2.6: fitting a fast-and-frugal tree (FFT) model

Model 1 month before harvesting - this trial uses max.levels = 1 to 5, goal = wacc, sens.w = 0.75

Step 3: plotting performance

Step 3 plots the probabilities of Hit Rate, False Alarm, Correct Rejection and Miss Rate for each lead time and quantile anomaly.
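The four probabilities can be computed from 0/1 predictions and observations and drawn as a grouped bar chart. The sketch below assumes that a hit is a predicted 1 matching an observed 1, and it normalizes hits and misses by the number of observed 1s, and false alarms and correct rejections by the number of observed 0s. The function `outcome_rates`, the toy predictions, and the two lead-time labels are hypothetical, and only a single quantile is shown.

```r
# Outcome rates from 0/1 predictions and 0/1 observations.
outcome_rates <- function(pred, obs) {
  c(hit_rate          = sum(pred == 1 & obs == 1) / sum(obs == 1),
    miss_rate         = sum(pred == 0 & obs == 1) / sum(obs == 1),
    false_alarm       = sum(pred == 1 & obs == 0) / sum(obs == 0),
    correct_rejection = sum(pred == 0 & obs == 0) / sum(obs == 0))
}

# Toy observations and predictions for two lead times (invented values).
obs <- c(1, 1, 1, 1, 0, 0, 0, 0)
pred_by_lead <- list(
  "6 months" = c(1, 1, 0, 0, 1, 1, 0, 0),
  "1 month"  = c(1, 1, 1, 0, 1, 0, 0, 0)
)

# One column per lead time, one row per outcome rate.
rates <- sapply(pred_by_lead, outcome_rates, obs = obs)

# Grouped bar chart written to a temporary PDF; drop pdf() and dev.off()
# to draw on the active graphics device instead.
pdf(file.path(tempdir(), "outcome_rates.pdf"))
barplot(rates, beside = TRUE, ylim = c(0, 1), ylab = "Probability",
        xlab = "Lead time before harvesting", legend.text = rownames(rates))
dev.off()
```

With these definitions, hit rate and miss rate sum to 1, as do false alarm and correct rejection, so each lead time is fully described by two of the four bars.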