This pre-live session uses LaTeX to write out mathematical equations and matrices. When organizing your own work, you are not required to typeset the mathematics in R Markdown, although you are welcome to try. For any hand calculations, you can simply write them on paper, take a quick picture with your phone, and include it in the Markdown file, a PowerPoint slide, or a Word document to submit.
During the videos, Dr. Turner showed that matrix operations can be used to calculate basic statistics of variables. For example, suppose we have data stored in a vector \(X\) with 5 numbers. We can compute the average of the data by computing \[C'X\] where \(C'=(1/5,1/5,1/5,1/5,1/5)\).
#Computing an average using matrices
x<-c(1,2,3,4,5)
c.vec<-c(1/5,1/5,1/5,1/5,1/5)
mean(x)
## [1] 3
t(c.vec) %*% x
## [,1]
## [1,] 3
We also showed that you can stack different coefficients in a matrix to perform the computations simultaneously. For example, the following \(C\) matrix computes the average as well as the sum. \[C'=\begin{bmatrix} 1/5&1/5&1/5&1/5&1/5 \\ 1&1&1&1&1 \\ \end{bmatrix}\] In R, we compute the values by stacking the rows on top of each other, c.vec for the averages and c.vec2 for the sums, as follows:
#Summing up the variables
c.vec2<-c(1,1,1,1,1)
sum(x)
## [1] 15
t(c.vec2) %*% x
## [,1]
## [1,] 15
#Doing both calculations simultaneously
c.mat<-rbind(c.vec,c.vec2)
c.mat
## [,1] [,2] [,3] [,4] [,5]
## c.vec 0.2 0.2 0.2 0.2 0.2
## c.vec2 1.0 1.0 1.0 1.0 1.0
c.mat %*% x
## [,1]
## c.vec 3
## c.vec2 15
We also discussed sums of squares: \[x'x=\begin{bmatrix} 1&2&3&4&5 \\ \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4\\ 5\\ \end{bmatrix}= 1^2+2^2+3^2+4^2+5^2=55\]
t(x) %*% x
## [,1]
## [1,] 55
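Equivalently, squaring the elements and adding them with the built-in sum() function returns the same value (this check is an addition, not part of the original code):
#Check the sum of squares with sum()
sum(x^2)
## [1] 55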
#Apply the same coefficient matrix to several variables at once
c.mat
dat<-matrix(c(1:5,6:10,11:15),5,3)   #each column of dat is a variable
dat
result<-c.mat %*% dat                #row 1: averages, row 2: sums
result
The code multiplies the c.mat matrix, which contains the coefficients for calculating the average and the sum, by the dat matrix, whose columns hold the individual variables. This operation produces a new matrix with the average and sum of each variable.
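As a quick check (this verification is not in the original code), the built-in colMeans() and colSums() functions should reproduce the two rows of result:
#Verify against built-in functions
colMeans(dat)
## [1]  3  8 13
colSums(dat)
## [1] 15 40 65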
#Computing the standard deviation with matrix operations
step1<-mean(x)*c(1,1,1,1,1)   #a vector repeating the mean
step2<-x-step1                #deviations from the mean
step3<-t(step2) %*% step2     #sum of squared deviations
step4<-step3/(5-1)            #divide by n-1: the sample variance
step5<-sqrt(step4)            #square root: the standard deviation
#Verify
sd(x)
step5
The steps subtract the mean from each value, square and sum the deviations, divide by the sample size minus one to get the sample variance, and finally take the square root. This sequence of matrix operations mimics the manual calculation of the standard deviation.
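The whole calculation can also be written as a single matrix expression; this compact form (an addition, not shown in the videos) gives the same answer:
#The same computation in one line
sqrt(t(x-mean(x)) %*% (x-mean(x))/(length(x)-1))
## [,1]
## [1,] 1.581139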
Matrix multiplication can also be used to create new variables from existing ones. For example, multiplying a data matrix by the following \(C\) matrix produces the sum and the difference of the two variables: \[\begin{bmatrix} x_{11}&x_{21} \\ x_{12}&x_{22} \\ x_{13}&x_{23} \\ \end{bmatrix} \begin{bmatrix} 1&1 \\ 1&-1 \\ \end{bmatrix}\]
#Creating the sum and difference of two variables
C<-matrix(c(1,1,1,-1),2,2)    #columns give the sum and the difference
X<-matrix(c(1:5,6:10),5,2)    #two variables stored as columns
X %*% C
#Repeat the calculation with data.frame arithmetic
newX<-data.frame(X)
names(newX)
XC<-data.frame(V1=newX$X1+newX$X2,V2=newX$X1-newX$X2)
XC
This matrix multiplication transforms the original data, creating one new variable as the sum of the two originals and another as their difference.
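A quick check (not in the original code) confirms that the matrix product and the column-wise arithmetic agree:
#Confirm the two approaches give the same numbers
all.equal(as.vector(X %*% C), c(XC$V1, XC$V2))
## [1] TRUE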
Suppose we have four random variables that are multivariate normal with covariance matrix \[\Sigma=\begin{bmatrix} 4&1&0&0 \\ 1&9&0&0 \\ 0&0&16&6 \\ 0&0&6&9 \\ \end{bmatrix}\]
A. Variables 1 & 2 and 3 & 4 are correlated due to non-zero off-diagonal elements in their respective covariance matrix blocks. The others are uncorrelated.
B. The standard deviations are square roots of the diagonal elements: 2, 3, 4, and 3, respectively.
C. Each correlation is the covariance divided by the product of the standard deviations. For variables 1 & 2: \(1/(2\times 3)=1/6\approx 0.17\); for variables 3 & 4: \(6/(4\times 3)=0.5\) (see the check below).
D. The covariance matrix will be diagonal, with each diagonal element equal to the variance, 25.
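The standard deviations and correlations in parts B and C can be verified in R with cov2cor(), which converts a covariance matrix to a correlation matrix (this check is an addition, not part of the session):
#Check parts B and C numerically
Sigma<-matrix(c(4,1,0,0, 1,9,0,0, 0,0,16,6, 0,0,6,9),4,4)
sqrt(diag(Sigma))   #standard deviations: 2 3 4 3
cov2cor(Sigma)      #correlations: 1/6 for variables 1 & 2, 0.5 for variables 3 & 4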
This week’s videos covered the fact that regression coefficients of an MLR model can be computed using the design matrix, which is constructed using the columns of the predictor variables and an extra column of ones for the intercept: \[\hat{\beta}=(X'X)^{-1}X'Y\]
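As an illustration (the data below are simulated and not part of the session), the formula can be checked against R's lm() function:
#Check the least squares formula against lm() (simulated data)
set.seed(1)
x1<-rnorm(10)
x2<-rnorm(10)
y<-3+5*x1+2*x2+rnorm(10)
X<-model.matrix(~x1+x2)           #design matrix with a column of ones
solve(t(X) %*% X) %*% t(X) %*% y  #(X'X)^{-1} X'Y
coef(lm(y~x1+x2))                 #same estimates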
Suppose the estimated coefficients are \(\hat{\beta}_0=3\), \(\hat{\beta}_1=5\), and \(\hat{\beta}_2=2\), and consider the following new observations: \[X=\begin{bmatrix} 2&1 \\ 5&-1 \\ 0&7 \\ \end{bmatrix}\]
Note that the first column of \(X\) is \(x_1\) and the second column is \(x_2\).
Predictions for the given observations can be computed as follows:
Observation 1: 3 + 5(2) + 2(1) = 15
Observation 2: 3 + 5(5) + 2(-1) = 26
Observation 3: 3 + 5(0) + 2(7) = 17
Once we have our regression coefficients estimated from the data, making predictions on future data is straightforward: \[\hat{Y}_{new}=X_{new}\hat{\beta}\] Compute the following matrix multiplication and compare it to your answers in part A.
newX<-matrix(c(2,5,0,1,-1,7),3,2)  #the three new observations
newX<-model.matrix(~newX)          #adds the column of ones for the intercept
beta.hats<-c(3,5,2)
newX %*% beta.hats
## [,1]
## 1 15
## 2 26
## 3 17
Computing the matrix multiplication newX %*% beta.hats will yield the same predictions as in part A.
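Equivalently (an alternative not used in the session), the column of ones can be added by hand with cbind() instead of model.matrix():
#Build the design matrix manually
Xnew<-matrix(c(2,5,0,1,-1,7),3,2)
cbind(1,Xnew) %*% c(3,5,2)
## [,1]
## [1,] 15
## [2,] 26
## [3,] 17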