A generalized additive model (GAM) with two predictor variables explains 71% of the deviance in log-transformed monthly Lake Champlain total phosphorus (TP) flux for 1990-2014.
Flux data are available from the USGS.
The two predictor variables included in the model are month of the year (mon) and monthly precipitation (Precip).
On average, the highest loadings occur in March and April, followed by May and November; TP flux is positively correlated with monthly precipitation (Figure 1). Modeled trends are shown with observed values in Figure 2.
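Read off the formula reported in the model summary, the model structure can be sketched as an additive model on the log scale. The symbols $\beta_0$, $f_1$, $f_2$, $\varepsilon_i$ and $\sigma^2$ are notation introduced here for illustration and do not appear in the source.

```latex
\log(\mathrm{TP.flux}_i) = \beta_0 + f_1(\mathrm{mon}_i) + f_2(\mathrm{Precip}_i) + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2)
```

Here $f_1$ and $f_2$ stand for the two smooth terms, s(mon) and s(Precip), and the Gaussian error with identity link matches the family reported in the summary.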
head(df)
## # A tibble: 6 x 5
##   Month   mon    yr TP.flux Precip
##   <chr> <dbl> <dbl>   <dbl>  <dbl>
## 1 Mar       3  1990   223.    3.00
## 2 Mar       3  1991   138.    3.34
## 3 Mar       3  1992   131.    3.23
## 4 Mar       3  1993    69.3   3.30
## 5 Mar       3  1994    33.7   3.77
## 6 Mar       3  1995   103.    2.21
summary(G3)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## log(TP.flux) ~ s(mon, bs = "cs", k = 9) + s(Precip, bs = "cs",
## k = -1)
##
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  3.54940    0.04416   80.37   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
##             edf Ref.df     F p-value
## s(mon)    6.827      8 46.86  <2e-16 ***
## s(Precip) 7.104      9 26.61  <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.688 Deviance explained = 70.7%
## GCV = 0.46999 Scale est. = 0.4388 n = 225
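The call that produced G3 is not shown. The sketch below fits a model with the same formula using gam() from the mgcv package, which is an assumption based on the layout of the summary output. The data are synthetic stand-ins, not the USGS record: the object names sim and fit, the months 3 to 11, and every simulated value are hypothetical, so the numbers it produces will not match the summary above.

```r
library(mgcv)  # provides gam() and s()

set.seed(1)

# Synthetic stand-in for df: 25 years x 9 months = 225 rows, matching n = 225.
# Months 3-11 and all simulated values are assumptions, not observed data.
sim <- expand.grid(mon = 3:11, yr = 1990:2014)
sim$Precip <- rgamma(nrow(sim), shape = 6, rate = 2)
sim$TP.flux <- exp(2.5 +
                   0.5 * cos(2 * pi * (sim$mon - 4) / 12) +
                   0.3 * sim$Precip +
                   rnorm(nrow(sim), sd = 0.6))

# Same formula as reported by summary(G3). The gaussian family with identity
# link is gam()'s default, and k = -1 asks for the default basis dimension.
fit <- gam(log(TP.flux) ~ s(mon, bs = "cs", k = 9) +
             s(Precip, bs = "cs", k = -1),
           data = sim)

# Proportion of deviance explained, as a fraction between 0 and 1
summary(fit)$dev.expl
```

With the real data, replacing sim with df in the gam() call should give a fit of the same form as G3, provided the original used mgcv defaults for everything not visible in the formula.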
Figure 1. Model estimates of TP flux by predictor variable. Note log-scale on y-axis.
Figure 2. Model estimates of TP flux by predictor variable with observed values. Note log-scale on y-axis.
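Because the model is fitted to log(TP.flux), estimates on the original flux scale come from exponentiating the fitted values. The relation below is a sketch in notation introduced here (hats denote fitted quantities), with the intercept taken from the summary output.

```latex
\widehat{\mathrm{TP.flux}} = \exp\bigl(\hat\beta_0 + \hat f_1(\mathrm{mon}) + \hat f_2(\mathrm{Precip})\bigr),
\qquad \exp(\hat\beta_0) = \exp(3.5494) \approx 34.8
```

Assuming the smooths are centered (the default in mgcv, if that package was used), the value 34.8 is roughly the geometric mean of monthly TP flux, in the units of TP.flux, which the source does not state. Under Gaussian errors on the log scale, the exponentiated fit estimates the median rather than the mean of flux.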
## predictor variable correlation
cor(df$mon, df$Precip)
## [1] 0.09888823
Figures 3-8. Diagnostic plots. Log-scaled response variable and residuals.