Basado en Biostatistics

knitr::opts_chunk$set(echo = TRUE)

library(fda)

## Warning: package 'fda' was built under R version 4.4.3

## Cargando paquete requerido: splines

## Cargando paquete requerido: fds

## Cargando paquete requerido: rainbow

## Cargando paquete requerido: MASS

## Warning: package 'MASS' was built under R version 4.4.3

## Cargando paquete requerido: pcaPP

## Cargando paquete requerido: RCurl

## Cargando paquete requerido: deSolve

## Warning: package 'deSolve' was built under R version 4.4.3

## 
## Adjuntando el paquete: 'fda'

## The following object is masked from 'package:graphics':
## 
##     matplot

library(refund)

## Warning: package 'refund' was built under R version 4.4.3

Overview

$\circ\;$ Scatterplot smoothing using basis expansions

$\;$

Motivation
Some R packages used in FDA
The Smoothing Problem
Basis Systems
Statistical Inference

Motivation: Canadian Weather Data

data(CanadianWeather)
attach(CanadianWeather)
y.precip=dailyAv[,,2]
l = which(place=="Vancouver") 
t.day = 1:365  
y=y.precip[,l]
plot(t.day, y, xlab='Day', ylab='Precipitation', type='b')
title('Average Daily Precipitation in Vancouver (BC)')

Motivation: Canadian Weather Data

plot(t.day, y, xlab='Day', ylab='Precipitation', type='b')
title('Average Daily Precipitation in Vancouver (BC)')
lines(lowess(t.day,y),col=4, lwd=3)
lines(smooth.spline(t.day, y),lwd=3,col=2)
legend(100,8, c("Lowess Smoothing", "B-spline (GCV)"), col=c(4,2), lwd=3)

Some R packages for FDA

lme4: linear, generalized linear, and nonlinear mixed models
mgcv: generalized additive (mixed) models; semi-parametric smoothing
fda: functional data analysis in R
fpca: functional principal component analysis
refund: regression with functional data
face: fast covariance estimation for sparse functional data

The Smoothing Problem

Let $y_i \in \mathbb{R}$ and $x_i\in D$, $D$ compact

The smoothing problem consists in estimating the function $g(x)$ in the model

\[y_i(x_i) = g(x_i) + \epsilon_i(x_i), \mbox{ with } E[\epsilon_i(x_i)] = 0\]

Estimation is often carried out assuming $g(x):D \rightarrow \mathbb{R}$, in a function space with specific properties, e.g. continuity and differentiability … smoothness

$\;$

Because $g(\cdot)$ lives in a infinite-dim space, it is common to consider finite dimensional approximations to $g(x)$
Example: in regression we often assume $g(x) = \alpha + \beta x$

Basis Functions

Consider a set of known functions $\{B_j(x)\}_{j=1,\ldots,k}$, with $B_j(x):D\rightarrow \mathbb{R}$
Basis expansions construct an approximation to $g(x)$ through the linear combination

\[g(x)\approx \sum_{j=1}^k c_j B_j(x)\]

An infinite dimensional object $g(\cdot)$ is therefore represented using a $k$-dimensional vector of basis coefficients $(c_1,\ldots,c_k)$.
Formally, given a set of basis functions, a function space is spanned through linear combinations. Properties of the function space depend on the properties of the spanning basis functions.

Examples of Basis Functions

x <- seq(0,1,length=1000)
par(mfrow=c(1,3))
# Monomial
mon.b <- create.monomial.basis(c(0,1),5)
B.mon <- eval.basis(x, mon.b)
matplot(x, B.mon, type='l', lwd=2)
title('Monomial basis')
# Fourier
fou.b <- create.fourier.basis(c(0,1),5)
B.fou <- eval.basis(x, fou.b)
matplot(x, B.fou, type='l', lwd=2)
title('Fourier basis')
# B-splines
spl.b <- create.bspline.basis(c(0,1),nbasis=5)
B.spl <- eval.basis(x, spl.b)
matplot(x, B.spl, type='l', lwd=2)
title('B-spline basis')

Monomial Basis

Monomial Basis consist of powers of the argument \[\{B_j(x)\}_{j=1}^k = \{x^{(j-1)}\}_{j=1}^k\]
The ensuing approximation to $g(x)$ is a polynomial of degree $(k-1)$

\[g(x) \approx a_0 + a_1 x + a_2 x^2 + \cdots + a_k x^{(k-1)}\] Proposition [Polynomial Approximations]

Let $g\in C^\alpha[0,1]$, be the set of functions on $[0,1]$ with $\alpha$ continuous derivatives. There exists a constant $M < \infty$ that depends only on $\alpha$, s.t. $\forall g\in C^\alpha[0,1]$, and $k\in \mathbb{N}$, $\exists$ a polynomial $P$ of degree $k$, s.t.

\[||g - P||_\infty = \sup_{x\in D}|g(x) - P(x)| \leq M k^{-\alpha}||g||_{C^\alpha}\] - The uniform distance between a function $g$ and the closest polynomial of degree $k$ is of the order $k^{-\alpha}$

Monomial Basis

Advantages

Easy intuition about approximation of functions

Limitations

Numerically problems for high order polynomials
Numerical analysis using monomial basis is unstable
Modeling fast local changes in $g(\cdot)$ require a large number of polynomials

Splines

Monomial basis offer arbitrarily accurate polynomial approximations to a function $g\in C^\alpha[0,1]$,
A statistical solution to the approximation problem results in a basis system composed of spline functions

Definition [Spline Function] Given $k$ knots $a=t_0<t_1<\cdots<t_{k-1}=b$, a function $f(x):[a,b]\rightarrow\mathbb{R}$ is a spline of order $q$, if the restriction $f_{[t_j, ,t_{j+1}]}(x)$ of $f$ to any sub-interval $[t_j, t_j+1]$ is a polynomial of degree $q-1$, and $f\in C^{q-2}$, provided $q \geq 2$.

In short, a spline function is a piece-wise polynomial with a given degree of global smoothness.

B-Splines

There are many classes of spline functions. The class of $B$-splines is often used for computational convenience

The points at which the polynomial segments join are called knots
A system of B-splines is defined in relation to the order (degree + 1) of the local polynomial segments, and a set of knots spanning the evaluation domain $D$
A set of $B$-splines with $k$ knots of order $q$ includes $k+q$ basis functions

x <- seq(0,1,length=1000)
par(mfrow=c(1,3))
# B-splines
spl.b <- create.bspline.basis(c(0,1), breaks=c(0,0.333,0.666,1), norder=2)
B.spl <- eval.basis(x, spl.b)
matplot(x, B.spl, type='l', lwd=2)
abline(v=c(0,0.333,0.666,1))
points(c(0,0.333,0.666,1), rep(0,4), pch=19, cex=2)
title('B-splines (Order 2)')
#
spl.b <- create.bspline.basis(c(0,1), breaks=c(0,0.333,0.666,1), norder=3)
B.spl <- eval.basis(x, spl.b)
matplot(x, B.spl, type='l', lwd=2)
abline(v=c(0,0.333,0.666,1))
points(c(0,0.333,0.666,1), rep(0,4), pch=19, cex=2)
title('B-spline (Order 3)')
#
spl.b <- create.bspline.basis(c(0,1), breaks=c(0,0.333,0.666,1), norder=4)
B.spl <- eval.basis(x, spl.b)
matplot(x, B.spl, type='l', lwd=2)
abline(v=c(0,0.333,0.666,1))
points(c(0,0.333,0.666,1), rep(0,4), pch=19, cex=2)
title('B-spline (Order 4)')

B-Splines Properties

Let $\{B_j(x)\}_{j=1}^{J=(k+q)}$ be a set of B-spline basis functions with $k$ knots or order $q$. We have:

$\;$

$B_j(x) \geq 0$, $\forall \;x\in D$
$\sum_{j}^J B_j(x) = 1$, $\forall \;x\in D$
At most $q$ basis functions are non-zero at any given $x\in D$
Derivatives of a spline approximation are up to $q-2$ continuous
Knots placement: place more knots where $g(\cdot)$ exhibits more curvature
Discontinuities possible by repeating knots

B-Splines Properties

Proposition [Spline Approximations]

Let $g\in C^\alpha[0,1]$, be the set of functions on $[0,1]$ with $\alpha$ continuous derivatives. Consider a set of $J$ spline basis functions of order $q\geq\alpha>0$, s.t. \[g(x) \approx \sum_{j=1}^J\theta_j B_j(x) = \boldsymbol{\theta}^T B_J.\]

There exists a constant $M(q,\alpha) < \infty$ that depends only on $\alpha$ and $q$, s.t. $\forall g\in C^\alpha[0,1]$, we can find $\boldsymbol{\theta}\in\mathbb{R}^J$, with $||\boldsymbol{\theta}||_\infty < ||g||_{C^\alpha}$ ,s.t.

\[||g - \boldsymbol{\theta}^TB_J||_\infty \leq M(\alpha, q) J^{-\alpha}||g||_{C^\alpha}\] - The uniform distance between a function $g$ and the closest spline approximation of order $q\geq \alpha$ is of the order $J^{-\alpha}$

Smoothing with Splines

The smoothing problem requires estimating $g(x)$ in the model \[y(x) = g(x) + \epsilon(x),\;\;\;\;E[\epsilon(x)]=0\]
Given a set of $J$ spline basis functions $\{B_j(x)\}_{j=1}^J$, we assume

\[y(x) \approx \sum_{j=1}^J\theta_{j}B_j(x) + \epsilon(x) = \boldsymbol{\theta}^T{\bf B}_J(x) + \epsilon(x)\]

Given ${\bf y} = (y_1(x_1),\ldots, y_n(x_n))'$ observations, a spline expansion defines a design matrix ${\bf B}_J({\bf x}):n\times (J+1)$

\[{\bf B}_J({\bf x}) = [1, B_1({\bf x}),\ldots,B_J(\bf x)]\]

Least Square estimation yields

\[\boldsymbol{\hat{\theta}} = \{{\bf B}_J({\bf x})^T{\bf B}_J({\bf x})\}^{-1}{\bf B}_J({\bf x})^T {\bf y}\]

Smoothing with Splines (LS)

Given ${\bf y} = (y_1(x_1),\ldots, y_n(x_n))'$, least square estimation yields

\[\boldsymbol{\hat{\theta}} = \{{\bf B}_J({\bf x})^T{\bf B}_J({\bf x})\}^{-1}{\bf B}_J({\bf x})^T {\bf y}\]

Challenges

What spline order $q$?
How many knots $k$?
Where to place knots?

Example: Canadian Weather Data

plot(t.day, y, xlab='Day', ylab='Precipitation')
title('Average Daily Precipitation in Vancouver (BC)')
#
# B-splines
x <- t.day
spl.b1 <- create.bspline.basis(c(1,365), nbasis=5, norder=2)
B1 <- eval.basis(x, spl.b1)
th1 <- solve(t(B1)%*%B1, t(B1)%*%y)
#
spl.b2 <- create.bspline.basis(c(0,365), nbasis=5, norder=4)
B2 <- eval.basis(x, spl.b2)
th2 <- solve(t(B2)%*%B2, t(B2)%*%y)
#
spl.b3 <- create.bspline.basis(c(0,365), nbasis=20, norder=4)
B3 <- eval.basis(x, spl.b3)
th3 <- solve(t(B3)%*%B3, t(B3)%*%y)
#
lines(t.day,B1%*%th1, col=4, lwd=3)
lines(t.day,B2%*%th2, col=3, lwd=3)
lines(t.day,B3%*%th3, col=2, lwd=3)
legend(100,8, c("(J=5, q=2)", "(J=5, q=4)", "(J=20, q=4)"), col=c(4,3,2), lwd=3)

Selecting the Number of Basis Functions

We often fix $q$ to a low degree polynomial, e.g. $q=4$ (locally cubic)
A small number of basis functions assumes a simple function $g(\cdot)$
A large number of basis may even lead to interpolation of the data e.g. overfitting

$\;$

Mean Square Error of $\hat{g}_J(x) = \boldsymbol{\hat{\theta}}^TB_J(x)$,

\[MSE\{\hat{g}_J\} := E[\{\hat{g}_J - g\}^2] = [E\{\hat{g}_J - g\}]^2 + E[\{\hat{g}_J - E(\hat{g}_J)\}^2]\]

$\;$

A simple solution would be to set $J^* = arg\min_{J\in \mathbb{N}} MSE\{\hat{g}_J\}$
Estimation of $MSE\{\hat{g}_J\}$ is non-trivial and requires predictive considerations (CV, GCV, REML)

Estimation Using Roughness Penalties

An alternative to choosing the number of basis functions $J$ considers regularized estimation of a flexible family of functions
Typically we set $J \propto n$ so that the space of functions spanned by linear combinations of $J$ B-spline basis allows for the interpolation of the data.
Estimation is carried out minimizing the Penalized Least Squares

\[PSSE_\lambda = ||{\bf y} - g({\bf x})||^2 + \lambda P(g)\] where $P(g)$ measures ``roughness’’ of the function $g$ and $\lambda>0$ defines a trade off between goodness of fit (bias) and smoothness (variance)

Typically $P(g) = ||g''||^2 = \int [g(x)'']^2 dx$ $\rightarrow$ curvature

Penalized Regression Splines

Given a set of $J$ B-spline basis functions ${\bf B} = \{B_1, \ldots, B_J\}$, we assume \[g(x) = \sum_{j=1}^J \theta_j B_j(x) = \boldsymbol{\theta}^T{\bf B}(x)\]
The PLS criterion, given $\lambda>0$, defines \[\boldsymbol{\hat{\theta}}_{\lambda} = \mbox{arg min}_{\theta}\; \left({\bf y}-{\bf B}\boldsymbol{\theta}\right)^T\left({\bf y}-{\bf B}\boldsymbol{\theta}\right) + \lambda \boldsymbol{\theta}^TD\boldsymbol{\theta}\] with $D\geq 0 \in \mathbb{R}^{J\times J}$, known penalty matrix
Leading to \[\boldsymbol{\hat{\theta}}_{\lambda} = [{\bf B}^T{\bf B} + \lambda D]^{-1}{\bf B}^T{\bf y}\]

$\;$

Example: Canadian Weather

#
# B-splines
x <- t.day
spl.b <- create.bspline.basis(c(0,365), nbasis=365, norder=4)
B <- eval.basis(x, spl.b)
D <- bsplinepen(spl.b)
#
par(mfrow=c(1,3))
#
lambda <- 0.001
th <- solve(t(B)%*%B + lambda*D, t(B)%*%y)
plot(t.day, y, xlab='Day', ylab='Precipitation')
title('lambda=0.001')
lines(t.day,B%*%th, col=4, lwd=3)
#
lambda <- 10^2
th <- solve(t(B)%*%B + lambda*D, t(B)%*%y)
plot(t.day, y, xlab='Day', ylab='Precipitation')
title('lambda=10^2')
lines(t.day,B%*%th, col=3, lwd=3)
#
lambda <- 10^4
th <- solve(t(B)%*%B + lambda*D, t(B)%*%y)
plot(t.day, y, xlab='Day', ylab='Precipitation')
title('lambda=10^4')
lines(t.day,B%*%th, col=2, lwd=3)

Derivation of the penalty matrix $D$

The roughness penalty matrix $D$ arises as follows:

$$ \[\begin{array}{lll} \int [g(x)'']^2 dx &=&\int\,[\boldsymbol{\theta}^T{\bf B}(x)][\boldsymbol{\theta}^T{\bf B}(x)]^Tdx\\[.2in] &=& \boldsymbol{\theta}^T\int {\bf B}(x){\bf B}(x)^T dx\;\boldsymbol{\theta}\\[.2in] &=& \boldsymbol{\theta}^T D \;\boldsymbol{\theta} \end{array}\]

Overview

$\circ\;$ Scatterplot smoothing using basis expansions

$\;$

Review of Linear Smoothers
Inference For Linear Smoothers
Mixed Effects Model Representations
ML and REML for Model Selection

Motivation: Canadian Weather Data

data(CanadianWeather)
attach(CanadianWeather)

## The following objects are masked from CanadianWeather (pos = 5):
## 
##     coordinates, dailyAv, geogindex, monthlyPrecip, monthlyTemp, place,
##     province, region

y.precip=dailyAv[,,2]
l = which(place=="Vancouver") 
t.day = 1:365  
y=y.precip[,l]
plot(t.day, y, xlab='Day', ylab='Precipitation', type='b')
title('Average Daily Precipitation in Vancouver (BC)')

Smoothing with Splines

The smoothing problem requires estimating $g(x)$ in the model \[y(x) = g(x) + \epsilon(x),\;\;\;\;E[\epsilon(x)]=0\]
Given a set of $J$ spline basis functions $\{B_j(x)\}_{j=1}^J$, we assume

\[y(x) \approx \sum_{j=1}^J\theta_{j}B_j(x) + \epsilon(x) = \boldsymbol{\theta}^T{\bf B}_J(x) + \epsilon(x)\]

Given ${\bf y} = (y_1(x_1),\ldots, y_n(x_n))'$ observations, a spline expansion defines a design matrix ${\bf B}_J({\bf x}):n\times (J+1)$

\[{\bf B}_J({\bf x}) = [1, B_1({\bf x}),\ldots,B_J(\bf x)]\]

Least Square estimation yields

\[\boldsymbol{\hat{\theta}} = \{{\bf B}_J({\bf x})^T{\bf B}_J({\bf x})\}^{-1}{\bf B}_J({\bf x})^T {\bf y}\]

Estimation Using Roughness Penalties

An alternative to choosing the number of basis functions $J$ considers regularized estimation of a flexible family of functions
Typically we set $J \propto n$ so that the space of functions spanned by linear combinations of $J$ B-spline basis allows for the interpolation of the data.
Estimation is carried out minimizing the Penalized Least Squares

Typically $P(g) = ||g''||^2 = \int [g(x)'']^2 dx$ $\rightarrow$ curvature

Penalized Regression Splines

Given a set of $J$ B-spline basis functions ${\bf B} = \{B_1, \ldots, B_J\}$, we assume \[g(x) = \sum_{j=1}^J \theta_j B_j(x) = \boldsymbol{\theta}^T{\bf B}(x)\]
The PLS criterion, given $\lambda>0$, defines \[\boldsymbol{\hat{\theta}}_{\lambda} = \mbox{arg min}_{\theta}\; \left({\bf y}-{\bf B}\boldsymbol{\theta}\right)^T\left({\bf y}-{\bf B}\boldsymbol{\theta}\right) + \lambda \boldsymbol{\theta}^TD\boldsymbol{\theta}\] with $D\geq 0 \in \mathbb{R}^{J\times J}$, known penalty matrix
Leading to \[\boldsymbol{\hat{\theta}}_{\lambda} = [{\bf B}^T{\bf B} + \lambda D]^{-1}{\bf B}^T{\bf y}\]

Example: Canadian Weather

#
# B-splines
x <- t.day
spl.b <- create.bspline.basis(c(0,365), nbasis=365, norder=4)
B <- eval.basis(x, spl.b)
D <- bsplinepen(spl.b)
#
par(mfrow=c(1,3))
#
lambda <- 0.001
th <- solve(t(B)%*%B + lambda*D, t(B)%*%y)
plot(t.day, y, xlab='Day', ylab='Precipitation')
title('lambda=0.001')
lines(t.day,B%*%th, col=4, lwd=3)
#
lambda <- 10^2
th <- solve(t(B)%*%B + lambda*D, t(B)%*%y)
plot(t.day, y, xlab='Day', ylab='Precipitation')
title('lambda=10^2')
lines(t.day,B%*%th, col=3, lwd=3)
#
lambda <- 10^4
th <- solve(t(B)%*%B + lambda*D, t(B)%*%y)
plot(t.day, y, xlab='Day', ylab='Precipitation')
title('lambda=10^4')
lines(t.day,B%*%th, col=2, lwd=3)

Selection of Regularization $\lambda$ through Cross Validation

The regularization parameter $\lambda$ may be chosen to minimize the LOO-CV criterion \[CV(\lambda) = \sum_{i=1}^n \{y_i(x_i) - \hat{g}_{(i)}(x_i, \lambda)\}^2\] where $\hat{g}_{(i)}(x_i, \lambda)$ is the estimator of $g(x_i)$ obtained with $(x_i, y_i(x_i))$ witheld from the training sample.
For linear smoothers we can show that the naive computation of $CV(\lambda)$ can be accelerated to avoid fitting $n$ separate models

Inference for Linear Smoothers

Consider the estimated function $\hat{g}({\bf x})_{\lambda}$ over the observed grid ${\bf x}$

\[ \begin{array}{lllllll} \left[\begin{array}{c} \hat{g}_\lambda(x_1)\\ \vdots\\ \hat{g}_\lambda(x_n)\end{array}\right] & = & {\bf B}\hat{\boldsymbol{\theta}}_\lambda & = & {\bf B}\,[{\bf B}^T{\bf B} + \lambda D]^{-1}{\bf B}^T{\bf y} & = & S_\lambda\,{\bf y} \end{array} \]

If $\lambda > 0$, and $\sigma^2>0$ are assumed known, inference about $\hat{\boldsymbol{\theta}}_\lambda$ follows usual LS large sample properties

\[\left[\{{\bf B}^T{\bf B} + \lambda D\}^{-1}{\bf B}^T {\bf B} \,\{{\bf B}^T{\bf B} + \lambda D\}^{-1}\right]^{-1/2}(\hat{\boldsymbol{\theta}}_\lambda - \boldsymbol{\theta})\rightarrow_d N(0, \sigma^2 I)\]

We are however, interested in $\hat{g}_\lambda(x) = S_\lambda(x)^T{\bf y}$, with

\[ \begin{array}{lll} E\{\hat{g}_\lambda(x)\} & = & S_\lambda(x)^TE({\bf y}) = S_\lambda(x)^Tg(x)\\[.1in] Var\{\hat{g}_\lambda(x)\} & = & Var\{S_\lambda(x)^T {\bf y}\} = \sigma^2 S_\lambda(x)^T S_\lambda(x) = \sigma^2||S(x)_\lambda||^2 \end{array} \]

Inference for Linear Smoothers

We note that $E\{\hat{g}_\lambda(x)\} \neq g(x)$, and expect a bias term $b(x)$, s.t.

\[ \frac{\hat{g}_\lambda(x) - g(x) - b_\lambda(x)}{\sigma||S_\lambda(x)||}\rightarrow_d N(0,1) \]

$||S_\lambda(x)||^2$ defines a measure of ``local’’ sample size
Estimation of $b_\lambda(x)$ is often difficult $\rightarrow$ we often think of $b(x)$ as being ignorable
The resulting C.I. are interpreted as C.I. for $\bar{g}_\lambda(x) = \{\hat{g}_\lambda(x)\}$, with

\[ \frac{\hat{g}_\lambda(x) - g(x)}{\sigma||S_\lambda(x)||} = \frac{\hat{g}_\lambda(x) - \bar{g}_\lambda(x)}{\sigma||S_\lambda(x)||} + \frac{\bar{g}_\lambda(x) - g(x)}{\sigma||S_\lambda(x)||} = Z_n(x) + \frac{b_\lambda(x)}{\sigma||S_\lambda(x)||} \]

Smoothing as a Linear Mixed Model

Consider the following spline basis representation, for knots $(\eta_1,\ldots,, \eta_k)\in (0,1)$

\[\left\{1, x, \ldots,x^p,|x-\eta_1|^p_+,\ldots, |x-\eta_k|^p_+\right\}\]

x   <- seq(0, 1, length=500)
eta <- c(0.25,0.5,0.75)
sp1 <- function(x, eta, p){
  J <- p + length(eta) + 1
  B <- matrix(1, nrow=length(x), ncol=J)
  for(i in 2:(p+1)) B[,i] <- x^(i-1) 
  k <- 1
  for(i in (p+2):J){
    x1 <- (x-eta[k])
    x1[x1<0] <- 0.0
    B[,i] <- x1^p
    k <- k+1
  }
  return(B)
}
par(mfrow=c(1,3))
B1 <- sp1(x, eta, 1)
B2 <- sp1(x, eta, 2)
B3 <- sp1(x, eta, 3)
#
matplot(x, B1, type='l', xlab='x', ylab='B(x)', lwd=2)
abline(v=eta,, col='grey')
points(eta, rep(0,3), pch=19, cex=2)
title("Linear")
#
#
matplot(x, B2, type='l', xlab='x', ylab='B(x)', lwd=2)
abline(v=eta, col='grey')
points(eta, rep(0,3), pch=19, cex=2)
title("Quadratic")
#
#
matplot(x, B3, type='l', xlab='x', ylab='B(x)', lwd=2)
abline(v=eta, col='grey')
points(eta, rep(0,3), pch=19, cex=2)
title("Cubic")

Linear Mixed Model Representation

Let ${\bf y} = g(\bf{x}) + \epsilon(\bf{x})$, $\epsilon({\bf x}) \sim N_n(0, \sigma^2_\epsilon I_n)$
Using a spline basis of order $p+1$, we knots $(\eta_1,\ldots,\eta_k)$, we have \[g(x) = \beta_0 + \beta_1 x + \cdots + \beta_p x^p + \sum_{j=1}^kb_j|x-\eta_j|_+^p\]
In vector form ${\bf y} = X\boldsymbol{\beta} + Z{\bf b} + \boldsymbol{\epsilon}$
Assuming ${\bf b} = (b_1,\ldots, b_k)^T \sim N(0, \sigma^2_b I_k)$, with ${\bf b} \perp \epsilon({\bf x})$

\[ \begin{array}{ll} {\bf y \mid b} & \sim \hspace{2in}\\[.2in] {\bf y} &\sim \end{array} \]

Estimation of $g(x)$ through LMM

Let $V = \sigma^2_bZZ^T + \sigma^2_\epsilon I_n$, using standard ML theory we have

\[\hat{\boldsymbol{\beta}} = \{X^T\hat{V}^{-1}X\}X^T\hat{V}^{-1}{\bf y}\]

A point estimate of the random effects ${\bf b}$ is obtained via BLUP

\[{\bf\tilde{b}} = \hat{\sigma}^2_b Z^T\hat{V}^{-1}({\bf y} - X\hat{\boldsymbol{\beta}})\]

Let $C = [X, Z]$ and $\boldsymbol{\theta} = [\boldsymbol{\beta}, {\bf b}]$, $\lambda = \sigma^2_\epsilon/\sigma^2_b$. Show that

\[\hat{\boldsymbol \theta} = [C^TC + \lambda D]C^T{\bf y}\]

Estimation of Variance Components

As usual ML estimation of $\alpha = (\sigma^2_\epsilon, \sigma^2_b)$ is based on maximization of the profile log-likelihood

\[\ell_p(\alpha) = -\frac{1}{2}\log|V(\alpha)| - \frac{1}{2}({\bf y} - X^T\hat{\boldsymbol \beta}(\alpha))^TV(\alpha)^{-1}({\bf y} - X^T\hat{\boldsymbol \beta}(\alpha))\]

REML estimation is base on maximizing restricted log likelihood

\[\ell_R(\alpha) = -\frac{1}{2}\log|X^TV(\alpha)^{-1}X| + \ell_p(\alpha)\]

Estimators of $\lambda = \frac{\sigma^2_\epsilon}{\sigma^2_b}$ are obtained without resorting to CV

Bayesian Interpretation

LMM and Uncertainty around $\hat{g}(x)$

The LMM solution yeilds a linear smoother equivalent to PLS $\hat{g}(x) = S_\lambda(x){\bf y}$
Uncertainty evaluations using the conditional interpretation of $Var({\bf y}\mid {\bf b}) = \sigma^2_\epsilon I_n$, obtains

\[Var\{\hat{g}(x)\mid {\bf b}\} = \sigma^2_\epsilon ||S_\lambda(x)||^2\]

The same result is obtained using

\[Cov\left.\left(\begin{array}{c} \hat{\boldsymbol{\beta}}\\\tilde{\bf b}\end{array} \right. \mid b\right) = \sigma^2_\epsilon (C^TC + \lambda D) ^{-1}C^TC (CC+\lambda D)^{-1}\]

A Study of Bias

Conditioining on ${\bf b}$

\[ \begin{array}{lll} E\{\hat{g} - g \mid {\bf b}\} &=& E\{X\hat{\boldsymbol{\beta}} - X\boldsymbol{\beta} + Z\tilde{\bf b} - Z{\bf b}\}\\[.1in] &=&X\{E(\hat{\boldsymbol \beta}\mid {\bf b}) - \boldsymbol{\beta}\} + Z\{E(\tilde{\bf b}\mid {\bf b}) - {\bf b}\}\\[.1in] &=&-\frac{\sigma^2_\epsilon}{\sigma^2_b} C\left(C^TC + \frac{\sigma^2_\epsilon}{\sigma^2_b}D\right)^{-1} \left[\begin{array}{c}0\\{\bf b}\end{array}\right] \end{array} \]

Averaging of marginalizing over ${\bf b}$

\[E(\hat{g} - g) = 0\]

This also suggests an alternative to the PLS variance estimator

\[Var(\hat{g} - g) = \sigma^2_\epsilon \,C\left(CC + \frac{\sigma^2_\epsilon}{\sigma^2_b}D\right)^{-1}C^T\]

Canadian Weather

y.precip=dailyAv[,,2]
l = which(place=="Vancouver") 
t.day = 1:365  
y=y.precip[,l]
par(mfrow=c(2,3))
plot(t.day, y, xlab='Day', ylab='Precipitation')
title('GCV')
fit1 <- gam(y~s(t.day), method='GCV.Cp')
lines(t.day, fit1$fitted.values, lwd=2, col=2)
plot(t.day, y, xlab='Day', ylab='Precipitation')
title('REML')
fit2 <- gam(y~s(t.day), method='REML')
lines(t.day, fit2$fitted.values, lwd=2, col=3)
plot(t.day, y, xlab='Day', ylab='Precipitation')
title('ML')
fit3 <- gam(y~s(t.day), method='ML')
lines(t.day, fit3$fitted.values, lwd=2, col=4)
#
plot(fit1, residuals=T)
plot(fit2, residuals=T)
plot(fit3, residuals=T)

Functional Data

Jhonier Rangel

2025-04-24

Overview

Motivation: Canadian Weather Data

Motivation: Canadian Weather Data

Some R packages for FDA

The Smoothing Problem

Basis Functions

Examples of Basis Functions

Monomial Basis

Monomial Basis

Splines

B-Splines

B-Splines Properties

B-Splines Properties

Smoothing with Splines

Smoothing with Splines (LS)

Example: Canadian Weather Data

Selecting the Number of Basis Functions

Estimation Using Roughness Penalties

Penalized Regression Splines

Example: Canadian Weather

Derivation of the penalty matrix \(D\)

Overview

Motivation: Canadian Weather Data

Smoothing with Splines

Estimation Using Roughness Penalties

Penalized Regression Splines

Example: Canadian Weather

Selection of Regularization \(\lambda\) through Cross Validation

Inference for Linear Smoothers

Inference for Linear Smoothers

Smoothing as a Linear Mixed Model

Linear Mixed Model Representation

Estimation of \(g(x)\) through LMM

Estimation of Variance Components

Bayesian Interpretation

LMM and Uncertainty around \(\hat{g}(x)\)

A Study of Bias

Canadian Weather

THANK YOU