Generalized Additive Models by Simon N. Wood
The R Book by Michael Crawley
Linear Regression, GLMs and GAMs with R demonstrates how to use R to relax the basic assumptions and constraints of linear regression in order to specify, fit, and interpret generalized linear models (GLMs) and generalized additive models (GAMs). The course demonstrates the estimation of GLMs and GAMs by working through a series of practical examples from the book Generalized Additive Models: An Introduction with R by Simon N. Wood (Wood 2017).
Linear statistical models have a univariate response modeled as a linear function of predictor variables and a zero mean random error term. The assumption of linearity is a critical (and limiting) characteristic.
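In symbols, a minimal statement of this model might read as follows. The notation ($y_i$ for the response, $x_{ij}$ for the predictors, $\beta_j$ for the coefficients, $\varepsilon_i$ for the error) is assumed here for illustration and is not taken from the course text:

$$
y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \varepsilon_i, \qquad \mathrm{E}(\varepsilon_i) = 0, \quad i = 1, \dots, n.
$$

The right-hand side without the error term is the linear predictor, so the expected response is constrained to be exactly linear in the coefficients.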
Generalized linear models (GLMs) relax this assumption of linearity. They permit the expected value of the response variable to depend on a smooth (possibly non-linear) monotonic function of the linear predictor. GLMs also relax the assumption that the response variable is normally distributed by allowing it to follow any distribution from the exponential family (e.g. normal, Poisson, binomial, gamma).
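As a rough illustration of these two relaxations, the sketch below fits a Poisson GLM with a log link using `glm()` from base R. The data are simulated, and the variable names, sample size, and coefficient values are assumptions made for this example only.

```r
# Simulated count data (illustrative assumption): the log of the mean is linear in x
set.seed(1)
n  <- 200
x  <- runif(n, 0, 2)
mu <- exp(0.5 + 1.2 * x)          # mean obtained by inverting the log link
y  <- rpois(n, lambda = mu)       # Poisson response instead of a normal one

# Poisson GLM: the link function connects E(y) to the linear predictor
fit.glm <- glm(y ~ x, family = poisson(link = "log"))
coef(fit.glm)                     # estimates should lie near 0.5 and 1.2
```

The coefficients are interpreted on the scale of the link function, here the log of the expected count.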
Generalized additive models (GAMs) are extensions of GLMs in which the linear predictor includes smooth functions of the predictor variables, estimated from the data rather than specified parametrically. Nonparametric smoothers such as lowess (locally weighted scatterplot smoothing) fit a smooth curve to data using localized subsets of the data.
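A minimal sketch of a GAM fit, assuming the `mgcv` package (one of the recommended packages distributed with R) and simulated data. Note that `mgcv::gam()` represents the smooth term with penalized regression splines rather than lowess, so this illustrates the idea of a data-driven smooth, not the lowess smoother itself. All variable names and simulation settings are assumptions for this example.

```r
library(mgcv)                      # provides gam() and s()

# Simulated data (illustrative assumption): a sine curve observed with noise
set.seed(2)
n <- 300
x <- runif(n)
f <- sin(2 * pi * x)               # the "true" smooth function
y <- f + rnorm(n, sd = 0.3)

# One smooth term; smoothness is selected by REML
fit.gam <- gam(y ~ s(x), method = "REML")
summary(fit.gam)$edf               # effective degrees of freedom of s(x)
```

An effective degrees of freedom value well above 1 indicates that the fitted function departs from a straight line.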
This course provides an overview of fitting GLMs and GAMs in R. GLMs, and especially GAMs, have evolved into flexible, standard statistical methodologies. The course covers recent approaches to specifying, estimating, and interpreting GAMs, and uses the freely available R software to illustrate the practicalities of linear, generalized linear, and generalized additive models.
What you’ll learn
Who this course is for
Where
##      Galaxy    y     x
## 1   NGC0300  133  2.00
## 2   NGC0925  664  9.16
## 3  NGC1326A 1794 16.14
## 4   NGC1365 1594 17.95
## 5   NGC1425 1473 21.88
## 6   NGC2403  278  3.22
## 7   NGC2541  714 11.22
## 8   NGC2090  882 11.75
## 9   NGC3031   80  3.63
## 10  NGC3198  772 13.80
## 11  NGC3351  642 10.00
## 12  NGC3368  768 10.52
## 13  NGC3621  609  6.64
## 14  NGC4321 1433 15.21
## 15  NGC4414  619 17.70
## 16 NGC4496A 1424 14.86
## 17  NGC4548 1384 16.22
## 18  NGC4535 1444 15.78
## 19  NGC4536 1423 14.93
## 20  NGC4639 1403 21.98
## 21  NGC4725 1103 12.36
## 22   IC4182  318  4.49
## 23  NGC5253  232  3.15
## 24  NGC7331  999 14.72
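The code that printed the data and fitted the first model does not appear in this section. The sketch below is one way to reproduce it: the data frame is rebuilt from the values printed above so that the example is self-contained (the same data are, to my knowledge, distributed as `hubble` in the `gamair` package that accompanies Wood's book, which is an assumption about the original source). The model call is taken from the `Call:` line of the summary that follows, and the name `hub.mod` from the later code.

```r
# Rebuild the data printed above: y is velocity (km/s), x is distance (Mpc)
hubble <- data.frame(
  Galaxy = c("NGC0300", "NGC0925", "NGC1326A", "NGC1365", "NGC1425", "NGC2403",
             "NGC2541", "NGC2090", "NGC3031", "NGC3198", "NGC3351", "NGC3368",
             "NGC3621", "NGC4321", "NGC4414", "NGC4496A", "NGC4548", "NGC4535",
             "NGC4536", "NGC4639", "NGC4725", "IC4182", "NGC5253", "NGC7331"),
  y = c(133, 664, 1794, 1594, 1473, 278, 714, 882, 80, 772, 642, 768,
        609, 1433, 619, 1424, 1384, 1444, 1423, 1403, 1103, 318, 232, 999),
  x = c(2.00, 9.16, 16.14, 17.95, 21.88, 3.22, 11.22, 11.75, 3.63, 13.80,
        10.00, 10.52, 6.64, 15.21, 17.70, 14.86, 16.22, 15.78, 14.93, 21.98,
        12.36, 4.49, 3.15, 14.72)
)

# Regression through the origin: "- 1" removes the intercept,
# so the single slope coefficient estimates the Hubble constant
hub.mod <- lm(y ~ x - 1, data = hubble)
summary(hub.mod)
```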
##
## Call:
## lm(formula = y ~ x - 1, data = hubble)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -736.5 -132.5  -19.0  172.2  558.0
##
## Coefficients:
##   Estimate Std. Error t value Pr(>|t|)
## x   76.581      3.965   19.32 1.03e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 258.9 on 23 degrees of freedom
## Multiple R-squared: 0.9419, Adjusted R-squared: 0.9394
## F-statistic: 373.1 on 1 and 23 DF, p-value: 1.032e-15
The mean value of y is 924.4.
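plot(fitted(hub.mod),residuals(hub.mod),xlab="fitted values",ylab="residuals")  # residuals against fitted values
# Refit without observations 3 and 15, which have the largest positive (558.0) and negative (-736.5) residuals
hub.mod1 <- lm(y~x-1,data=hubble[-c(3,15),])
summary(hub.mod1)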
##
## Call:
## lm(formula = y ~ x - 1, data = hubble[-c(3, 15), ])
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -304.3 -141.9  -26.5  138.3  269.8
##
## Coefficients:
##   Estimate Std. Error t value Pr(>|t|)
## x    77.67       2.97   26.15   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 180.5 on 21 degrees of freedom
## Multiple R-squared: 0.9702, Adjusted R-squared: 0.9688
## F-statistic: 683.8 on 1 and 21 DF, p-value: < 2.2e-16
plot(fitted(hub.mod1),residuals(hub.mod1),xlab="fitted values",ylab="residuals")
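hubble.const <- c(coef(hub.mod),coef(hub.mod1))/3.09e19  # km/s/Mpc to 1/s, taking 1 Mpc = 3.09e19 km
age <- 1/hubble.const  # implied age of the universe in seconds
age
##            x            x
## 4.034934e+17 3.978221e+17
age/(60^2*24*365)  # seconds to years (365-day year)
##           x           x
## 12794692825 12614854757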
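The unit conversion behind these numbers can be written out for the first model. This is a worked restatement of the code above using the rounded estimate 76.581 from the first summary; the symbol $\hat{H}$ is introduced here for illustration.

$$
\hat{H} = 76.581\ \mathrm{km\,s^{-1}\,Mpc^{-1}}, \qquad 1\ \mathrm{Mpc} = 3.09 \times 10^{19}\ \mathrm{km}
\;\Rightarrow\;
\hat{H} \approx \frac{76.581}{3.09 \times 10^{19}}\ \mathrm{s^{-1}} \approx 2.478 \times 10^{-18}\ \mathrm{s^{-1}}
$$

$$
\text{age} \approx \frac{1}{\hat{H}} \approx 4.035 \times 10^{17}\ \mathrm{s}
\approx \frac{4.035 \times 10^{17}}{60^2 \times 24 \times 365}\ \text{years}
\approx 1.28 \times 10^{10}\ \text{years}
$$

The same steps applied to the estimate 77.67 from the second model give roughly $1.26 \times 10^{10}$ years.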
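cs.hubble <- 163000000  # hypothesized value of the Hubble constant to test against
t.stat<-(coef(hub.mod1)-cs.hubble)/summary(hub.mod1)$coefficients[2]  # (estimate - hypothesized value) / standard error
pt(t.stat,df=21)*2  # two-sided p-value; doubling the lower tail is valid because t.stat is negative
##             x
## 3.906388e-150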
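Doubling `pt(t.stat, df)` only gives a two-sided p-value when the statistic is negative. A slightly more general sketch, which works for either sign, is shown below. It uses the rounded estimate and standard error copied from the summary of `hub.mod1` above instead of the fitted object, so its result will differ slightly from the output shown; the variable names are assumptions for this example.

```r
est <- 77.67          # rounded slope estimate from the summary of hub.mod1
se  <- 2.97           # rounded standard error from the same summary
h0  <- 163000000      # hypothesized value, as in cs.hubble above
df  <- 21             # residual degrees of freedom

t.stat  <- (est - h0) / se
p.value <- 2 * pt(-abs(t.stat), df = df)   # two-sided p-value for any sign of t.stat
p.value
```

With a fitted model available, `est` and `se` would normally be read from `summary(hub.mod1)$coefficients` rather than typed in by hand.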
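sigb <- summary(hub.mod1)$coefficients[2]  # standard error of the slope
h.ci<-coef(hub.mod1)+qt(c(0.025,0.975),df=21)*sigb  # 95% confidence interval for the slope
h.ci
## [1] 71.49588 83.84995
h.ci<-h.ci*60^2*24*365.25/3.09e19 # convert to 1/years
sort(1/h.ci)  # corresponding interval for the age in years
## [1] 11677548698 13695361072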
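The interval computed above follows the usual t-based formula. Written out with the rounded values from the summary of `hub.mod1` (the symbols $\hat{\beta}$ and $\hat{\sigma}_{\hat{\beta}}$ are notation assumed here for illustration):

$$
\hat{\beta} \pm t_{21,\,0.975}\,\hat{\sigma}_{\hat{\beta}}
\approx 77.67 \pm 2.080 \times 2.97
\approx (71.5,\ 83.8)\ \mathrm{km\,s^{-1}\,Mpc^{-1}}
$$

Because the age is the reciprocal of the Hubble constant, the endpoints swap order when inverted, which is why the code sorts the result:

$$
\left( \frac{1}{\beta_{\text{upper}}},\ \frac{1}{\beta_{\text{lower}}} \right)
\approx (1.17 \times 10^{10},\ 1.37 \times 10^{10})\ \text{years}
$$

where both endpoints have first been converted from $\mathrm{km\,s^{-1}\,Mpc^{-1}}$ to $\text{years}^{-1}$ by multiplying by $60^2 \times 24 \times 365.25 / (3.09 \times 10^{19})$.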
For an easy way to write mathematical equations in R Markdown, see this YouTube video: https://www.youtube.com/watch?v=4I3PCDME5U8