Data Descriptions

The savings data frame (from the faraway package) contains the following columns:

sr: savings rate (personal saving divided by disposable income)

pop15: percent of population under age 15

pop75: percent of population over age 75

dpi: per-capita disposable income in dollars

ddpi: percent growth rate of dpi
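
As a quick check that the columns match this description, the data can be loaded from the faraway package and inspected (a minimal sketch; it assumes faraway is installed):

data(savings, package="faraway")
str(savings)   # 50 countries, with columns sr, pop15, pop75, dpi, ddpi
head(savings)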

The sm package: This package implements the nonparametric smoothing methods described in the book by Bowman & Azzalini (1997).

The mgcv package: Mixed GAM Computation Vehicle with GCV/AIC/REML smoothness estimation and GAMMs by REML/PQL

library(sm)
## Package 'sm', version 2.2-5.7: type help(sm) for summary information
library(mgcv)
## Warning: package 'mgcv' was built under R version 4.1.2
## Loading required package: nlme
## This is mgcv 1.8-40. For overview type 'help("mgcv-package")'.

Smoothing the savings rate as a function of growth (ddpi) and the percentage of the population under 15 (pop15). The first plot, with h=c(1,1), is too rough, while the second, with h=c(5,5), seems about right.

data(savings, package="faraway")
y <- savings$sr
x <- cbind(savings$pop15,savings$ddpi)
sm.regression(x,y,h=c(1,1),xlab="pop15",ylab="growth",zlab="savings rate")

sm.regression(x,y,h=c(5,5),xlab="pop15",ylab="growth",zlab="savings rate")

# cex.lab=1.5, cex.axis=1.5, cex.main=1.5, cex.sub=1.5
# h: a vector of length 1 or 2 giving the smoothing parameter. A normal kernel function is used and h is its standard deviation.
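
Since h is the standard deviation of the normal kernel in each direction, trying a small grid of bandwidths shows how the roughness of the fitted surface changes (a hedged sketch; the bandwidth values are illustrative, not recommendations):

for (bw in c(1, 3, 5)) {
  sm.regression(x, y, h=c(bw,bw), xlab="pop15", ylab="growth", zlab="savings rate")
}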

Smoothing spline fit with the default amount of smoothing:

# s(): defines smooth terms in a gam formula; s(pop15,ddpi) gives a bivariate smooth
amod <- gam(sr ~ s(pop15,ddpi), data=savings)
vis.gam(amod, col="gray", ticktype="detailed",theta=-35)
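
To see how much smoothing gam actually chose, summary() reports the effective degrees of freedom of the bivariate smooth; refitting with method="REML" is an alternative smoothness-selection criterion (a sketch, not part of the original analysis):

summary(amod)
amod2 <- gam(sr ~ s(pop15,ddpi), data=savings, method="REML")
vis.gam(amod2, col="gray", ticktype="detailed", theta=-35)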

2D smoothing using Loess:

lomod <- loess(sr ~ pop15 + ddpi, data=savings)
xg <- seq(21,48,len=20)
yg <- seq(0,17,len=20)
zg <- expand.grid(pop15=xg,ddpi=yg)
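# predict() on a data frame built by expand.grid() returns a matrix of fitted values
# with the grid's dimensions, which is the form persp() expects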
persp(xg, yg, predict(lomod, zg), theta=-35, ticktype="detailed", xlab="pop15", ylab="growth", zlab="savings rate", col="gray")
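
In loess the span argument (default 0.75) plays the role that h plays above: it controls the amount of smoothing, with smaller values giving a rougher surface (a hedged sketch; the value below is illustrative only):

lomod2 <- loess(sr ~ pop15 + ddpi, data=savings, span=0.5)
persp(xg, yg, predict(lomod2, zg), theta=-35, ticktype="detailed", xlab="pop15", ylab="growth", zlab="savings rate", col="gray")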

Checking the data, we find there are no countries in some parts of this prediction grid, so
the fitted surface there represents an extrapolation away from the observed data. The loess
method uses linear extrapolation, producing the surface seen in the plot. The kernel-based
method does not even attempt to predict outside the range of the data, whereas the spline method
produces a more restrained prediction. It is difficult to say which is best; we must
appeal to subject-matter knowledge to guide our choice.
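
One way to verify the sparsity claim is to plot the observed predictor values; corners of the prediction grid with no nearby points are regions of extrapolation (a minimal sketch):

plot(ddpi ~ pop15, data=savings, xlab="pop15", ylab="growth")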