Here is the Gram-Schmidt code. It consists of two code chunks, gs.c and gs.R. The chunk gs.c defines Gram-Schmidt in C, in such a way that it can be called from R. It does not do pivoting, and it assumes the input matrix has full column rank. The chunk gs.R defines the R functions gs() and gsrc(): the first does Gram-Schmidt in R, the second calls the Gram-Schmidt in C.
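For reference, both implementations follow the classical Gram-Schmidt recursion. Writing \(x_1, \dots, x_m\) for the columns of \(X\), the factors are built up column by column as

\[
r_{lj} = q_l'x_j \ (l < j), \qquad \tilde q_j = x_j - \sum_{l < j} r_{lj} q_l, \qquad r_{jj} = \|\tilde q_j\|, \qquad q_j = \tilde q_j / r_{jj}.
\]

First the code in C.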
#include <math.h>

/* Classical Gram-Schmidt. x is n x m (column-major), q receives the
   orthonormal columns, r the m x m upper triangular factor.
   q must be zero on entry: column j accumulates projections before
   the subtraction step. No pivoting; x must have full column rank. */
void
gsc(double *x, double *q, double *r, int *n, int *m)
{
    int i, j, l, jn, ln, jm, imax = *n, jmax = *m;
    double s = 0.0;
    /* first column: normalize x[, 1] */
    for (i = 0; i < imax; i++)
        s += *(x + i) * *(x + i);
    s = sqrt(s);
    *r = s;
    for (i = 0; i < imax; i++)
        *(q + i) = *(x + i) / s;
    for (j = 1; j < jmax; j++) {
        jn = j * imax;              /* offset of column j in x and q */
        jm = j * jmax;              /* offset of column j in r */
        /* accumulate the projections of x[, j] on the previous q[, l] */
        for (l = 0; l < j; l++) {
            ln = l * imax;
            s = 0.0;
            for (i = 0; i < imax; i++)
                s += *(q + ln + i) * *(x + jn + i);
            *(r + jm + l) = s;
            for (i = 0; i < imax; i++)
                *(q + jn + i) += s * *(q + ln + i);
        }
        /* subtract the projections, then normalize the residual */
        for (i = 0; i < imax; i++)
            *(q + jn + i) = *(x + jn + i) - *(q + jn + i);
        s = 0.0;
        for (i = 0; i < imax; i++)
            s += *(q + jn + i) * *(q + jn + i);
        s = sqrt(s);
        *(r + jm + j) = s;
        for (i = 0; i < imax; i++)
            *(q + jn + i) /= s;
    }
}
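The pointer arithmetic works because R stores matrices in column-major order: element \((i, j)\) of an \(n \times m\) matrix sits at flat offset \(jn + i\) (0-based), which is exactly what the offsets jn = j * imax and jm = j * jmax compute. A minimal illustration in R (not part of the original chunks):

x <- matrix(1:6, nrow = 2)   # a 2 x 3 matrix, so n = 2
as.double(x)                 # 1 2 3 4 5 6: columns are stored contiguously
i <- 0; j <- 1; n <- 2       # 0-based indices, as in the C code
as.double(x)[j * n + i + 1] == x[i + 1, j + 1]   # TRUE: flat offset is j * n + i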
And now the code in R.
gs <- function(x) {
    n <- dim(x)[1]
    m <- dim(x)[2]
    q <- matrix(0, n, m)
    r <- matrix(0, m, m)
    # first column: normalize x[, 1]
    qi <- x[, 1]
    si <- sqrt(sum(qi ^ 2))
    q[, 1] <- qi / si
    r[1, 1] <- si
    for (i in 2:m) {   # assumes at least two columns
        # project x[, i] on the previous columns, subtract, normalize
        xi <- x[, i]
        qj <- q[, 1:(i - 1), drop = FALSE]
        rj <- t(qj) %*% xi
        qi <- xi - qj %*% rj
        r[1:(i - 1), i] <- rj
        si <- sqrt(sum(qi ^ 2))
        q[, i] <- qi / si
        r[i, i] <- si
    }
    return(list(q = q, r = r))
}

gsrc <- function(x) {
    n <- dim(x)[1]
    m <- dim(x)[2]
    q <- matrix(0, n, m)   # gsc() accumulates into q, so it must start at zero
    r <- matrix(0, m, m)
    qr <- .C("gsc",
             as.double(x),
             as.double(q),
             as.double(r),
             as.integer(n),
             as.integer(m))
    return(list(q = matrix(qr[[2]], n, m), r = matrix(qr[[3]], m, m)))
}
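As a quick sanity check (a minimal sketch, not part of the original chunks), we can verify on a small random matrix that gs() returns orthonormal columns and reproduces its input; once the shared library is loaded (see below), the same checks apply to gsrc().

set.seed(4321)
a <- matrix(rnorm(50), 10, 5)
z <- gs(a)
max(abs(crossprod(z$q) - diag(5)))   # close to machine precision: q'q = I
max(abs(z$q %*% z$r - a))            # close to machine precision: qr = a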
How does one use this? In RStudio, compilation and building of the shared library are taken care of automatically. Otherwise, we first compile gs.c to a shared library using

R CMD SHLIB gs.c

in the shell. Then in R we say

dyn.load("gs.so")
source("gs.R")

and we run a small example.
set.seed(12345)
x <- matrix(rnorm(1000000L), 10000L, 100L)
library(microbenchmark)
mb <- microbenchmark(R = gs(x), C = gsrc(x), Q = qr(x), times = 100L)
mb
## Unit: milliseconds
##  expr       min         lq       mean     median         uq       max neval cld
##     R 987.62120 1040.60052 1104.02168 1098.28220 1134.02603 1405.8157   100   c
##     C  83.81926   94.27173  110.80002  100.78089  133.08216  167.5365   100  b
##     Q  53.12226   57.91658   65.94093   60.34848   64.20902  107.1689   100 a
Thus for this example the C code is roughly ten times as fast as the R code. The QR decomposition that comes with R, based on Householder transformations, is in turn faster still, by roughly another factor of two.
Note: In a personal communication (01/18/16), Bill Venables pointed out that the above timing comparisons are somewhat unfavorable to our routines, because the standard qr routines in R still have to dig \(Q\) and \(R\) out of the qr structure. An alternative, and perhaps more suitable, comparison is
mb <- microbenchmark(R = gs(x), C = gsrc(x), Q = {qrx <- qr(x); list(q = qr.Q(qrx), r = qr.R(qrx))}, times = 100L)
mb
## Unit: milliseconds
##  expr       min        lq      mean     median        uq       max neval cld
##     R 961.36460 1062.7542 1124.3137 1098.15344 1171.5015 1458.0584   100   c
##     C  87.04156   93.5807  106.1242   98.99528  109.9939  162.4064   100 a
##     Q 136.50614  153.7398  173.2713  163.18265  192.7677  246.1762   100  b
Now gsrc is faster than qr, which here includes the cost of the copies and assignments needed to extract the factors. So a completely fair comparison lies somewhere in between the two benchmark results.