Overview

A generalized additive model (GAM) with two predictor variables accounts for 71% of the observed deviance in log-transformed monthly Lake Champlain total phosphorus (TP) flux from 1990 to 2014.

Flux data are available from the USGS here.

The two predictor variables included in the model are month of the year (mon) and monthly precipitation (Precip), each entered as a smooth term.

On average, the highest loadings occur in March and April, followed by May and November; TP flux is positively correlated with monthly precipitation (Figure 1). Modeled trends are shown with observed values in Figure 2.

Data

head(df)
## # A tibble: 6 x 5
##   Month   mon    yr TP.flux Precip
##   <chr> <dbl> <dbl>   <dbl>  <dbl>
## 1 Mar       3  1990   223.    3.00
## 2 Mar       3  1991   138.    3.34
## 3 Mar       3  1992   131.    3.23
## 4 Mar       3  1993    69.3   3.30
## 5 Mar       3  1994    33.7   3.77
## 6 Mar       3  1995   103.    2.21

Model Summary

summary(G3)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## log(TP.flux) ~ s(mon, bs = "cs", k = 9) + s(Precip, bs = "cs", 
##     k = -1)
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.54940    0.04416   80.37   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##             edf Ref.df     F p-value    
## s(mon)    6.827      8 46.86  <2e-16 ***
## s(Precip) 7.104      9 26.61  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.688   Deviance explained = 70.7%
## GCV = 0.46999  Scale est. = 0.4388    n = 225
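
For readers who want to see how a summary like the one above might be produced, the following is a minimal sketch rather than the original analysis script. It assumes the mgcv package and uses a simulated stand-in data frame, `df_demo` (hypothetical name and values, not the USGS record), with the same column names as `df`. The March to November, 1990 to 2014 layout is an assumption chosen only because it yields 225 rows. The sketch also shows how "Deviance explained" relates to the residual and null deviance.

```r
library(mgcv)

# Placeholder data only: simulated values, NOT the USGS flux record.
# Layout (9 months x 25 years = 225 rows) is an assumption.
set.seed(1)
df_demo <- expand.grid(mon = 3:11, yr = 1990:2014)
df_demo$Precip <- runif(nrow(df_demo), min = 1, max = 7)
df_demo$TP.flux <- exp(2.5 + 0.6 * cos(2 * pi * (df_demo$mon - 3.5) / 12) +
                         0.3 * df_demo$Precip + rnorm(nrow(df_demo), sd = 0.6))

# Same formula as in the summary: shrinkage cubic regression splines ("cs");
# k = -1 asks mgcv to use its default basis dimension.
G_demo <- gam(log(TP.flux) ~ s(mon, bs = "cs", k = 9) +
                s(Precip, bs = "cs", k = -1),
              data = df_demo)

# Deviance explained = 1 - residual deviance / null deviance
dev_expl <- 1 - deviance(G_demo) / G_demo$null.deviance
all.equal(dev_expl, summary(G_demo)$dev.expl)
```

Replacing `df_demo` with the real `df` should give a fit comparable to `G3`, provided the original call used mgcv's default Gaussian family and smoothing-parameter selection, as the printed summary suggests.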

Model Estimates

Figure 1. Model estimates of TP flux by predictor variable. Note the log scale on the y-axis.


Figure 2. Model estimates of TP flux by predictor variable, shown with observed values. Note the log scale on the y-axis.
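
Because the model is fitted to log(TP.flux), estimates have to be exponentiated to return to flux units. The sketch below illustrates one way to do that with `predict()`; it is not the code behind the figures. It uses simulated placeholder data (`df_demo`, hypothetical values) and a hypothetical prediction grid, `new_dat`, that holds month at April while precipitation varies.

```r
library(mgcv)

# Placeholder data only: simulated values, NOT the USGS flux record.
set.seed(1)
df_demo <- expand.grid(mon = 3:11, yr = 1990:2014)
df_demo$Precip <- runif(nrow(df_demo), min = 1, max = 7)
df_demo$TP.flux <- exp(2.5 + 0.6 * cos(2 * pi * (df_demo$mon - 3.5) / 12) +
                         0.3 * df_demo$Precip + rnorm(nrow(df_demo), sd = 0.6))

G_demo <- gam(log(TP.flux) ~ s(mon, bs = "cs", k = 9) +
                s(Precip, bs = "cs", k = -1),
              data = df_demo)

# Hypothetical grid: month fixed at April, precipitation over its observed range
new_dat <- data.frame(mon = 4,
                      Precip = seq(min(df_demo$Precip), max(df_demo$Precip),
                                   length.out = 50))

# Predictions and standard errors are on the log scale
pr <- predict(G_demo, newdata = new_dat, se.fit = TRUE)

# Back-transform the fit and an approximate 95% band to flux units
new_dat$fit <- exp(pr$fit)
new_dat$lwr <- exp(pr$fit - 1.96 * pr$se.fit)
new_dat$upr <- exp(pr$fit + 1.96 * pr$se.fit)
head(new_dat)
```

Exponentiating a log-scale prediction gives an estimate closer to the median than to the arithmetic mean of flux, so back-transformed curves tend to sit below the average of the raw observations.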

Model Diagnostics

## predictor variable correlation
cor(df$mon, df$Precip)
## [1] 0.09888823

Figures 3-8. Diagnostic plots. The response variable and residuals are on the log scale.
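
The checks in this section could be reproduced along the following lines. This is a sketch under stated assumptions, not the code that generated Figures 3-8. It uses mgcv's `concurvity()` as a complement to the Pearson correlation, since `cor()` measures only linear association between predictors, and `gam.check()` for standard residual plots. The data frame `df_demo` is a simulated placeholder with hypothetical values.

```r
library(mgcv)

# Placeholder data only: simulated values, NOT the USGS flux record.
set.seed(1)
df_demo <- expand.grid(mon = 3:11, yr = 1990:2014)
df_demo$Precip <- runif(nrow(df_demo), min = 1, max = 7)
df_demo$TP.flux <- exp(2.5 + 0.6 * cos(2 * pi * (df_demo$mon - 3.5) / 12) +
                         0.3 * df_demo$Precip + rnorm(nrow(df_demo), sd = 0.6))

G_demo <- gam(log(TP.flux) ~ s(mon, bs = "cs", k = 9) +
                s(Precip, bs = "cs", k = -1),
              data = df_demo)

# Linear association between the two predictors
cor(df_demo$mon, df_demo$Precip)

# Concurvity: how well each term can be approximated by the other terms
# (values near 0 are reassuring, values near 1 indicate a problem)
cc <- concurvity(G_demo, full = TRUE)
cc

# Residual plots plus a basis-dimension check printed to the console
gam.check(G_demo)

# Deviance residuals, kept for any further custom plots
res <- residuals(G_demo, type = "deviance")
```

`gam.check()` draws four panels (a residual Q-Q plot, residuals against the linear predictor, a residual histogram, and response against fitted values), all on the log scale here because the response is log(TP.flux).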