Computational Statistics

Review of RMarkdown, Vectors and Matrices

Ronald Wesonga (Ph.D)

18 Sep 2022

Practice How to Write these Statistical Equations in Markdown

We can write fractions: \(\frac{2}{3}\). We can also handle things like estimated population growth rate, e.g., \(\hat{\lambda}=1.02\). And, \(\sqrt{4}=2\).

\[\alpha, \beta, \gamma, \Gamma\]

\[a \pm b\] \[x \ge 15\] \[a_i \ge 0~~~\forall i\]

Practice How to Write Matrix Definition

\[A_{m,n} = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{pmatrix}\]

Practice How to Write Some Statistical Equations

The binomial probability: \[f(y|N,p) = \frac{N!}{y!(N-y)!}\cdot p^y \cdot (1-p)^{N-y} = {{N}\choose{y}} \cdot p^y \cdot (1-p)^{N-y}\]

To calculate the mean of observations of variable , you can use: \[\bar{x} = \frac{1}{n} \sum_{i=1}^{n}x_{i}\]

Note that this equation looks quite nice above where it’s in display math mode. It is more compact but not quite as nice looking if we present it using inline mode, e.g., \(\bar{x} = \frac{1}{n} \sum_{i=1}^{n}x_{i}\).

Let’s do the same with the equation for variance. First the inline version, which is \(\sigma^{2} = \frac{\sum\limits_{i=1}^{n} \left(x_{i} - \bar{x}\right)^{2}} {n-1}\). And then the display mode version: \[\sigma^{2} = \frac{\sum_{i=1}^{n} \left(x_{i} - \bar{x}\right)^{2}} {n-1}\]

Next, it’s good to look at the equation for covariance to see how it is just a generalization of variance to two variables. An inline version of the equation is \(cov_{x,y} = \frac{\sum\limits_{i=1}^{n}{(x_i-\overline{x}) \cdot (y_i-\overline{y})} }{n-1}\). And, the display mode is: \[cov_{x,y} = \frac{\sum\limits_{i=1}^{n}{(x_i-\overline{x}) \cdot (y_i-\overline{y})} }{n-1}\]

And, finally, we’ll end with the standard deviation. Here’s the inline version, \(\sigma = \sqrt{\frac{\sum\limits_{i=1}^{n} \left(x_{i} - \bar{x}\right)^{2}} {n-1}}\). And here’s the display version. \[\sigma = \sqrt{\frac{\sum\limits_{i=1}^{n} \left(x_{i} - \bar{x}\right)^{2}} {n-1}}\]

There are helpful online editors to help you learn code for various equations you might want to include. There is one at: http://visualmatheditor.equatheque.net/VisualMathEditor.html. You can work out the code there and then copy it over to your RMarkdown document in between dollar signs (1 or 2 on either end depending on whether you want the equation in line or in display mode).

Practice How To Define and Manipulate Vectors and Matrices

      x<-c(1, 2, 3)
      y<-c(4, 5, 6)
      z<-seq(1,10,by=0.5)
      w<-1:10
      A<-cbind(x,y)
      B<-rbind(x,y)
      ones<-rep(1,3)

Hint: To make sure R respects vector dimensions, save them as matrices

    x<-as.matrix(x)
    dim(x)
## [1] 3 1
    y<-as.matrix(y)
    dim(y)
## [1] 3 1
    ones<-as.matrix(ones)
    dim(ones)
## [1] 3 1

Practice How To Define and Manipulate Matrices

      A<-matrix(c(1, 2, 3, 4, 5, 6), byrow=T, ncol=3)
      A
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
      A[1,1]
## [1] 1
      A[1,]
## [1] 1 2 3
      A[,1]
## [1] 1 4
      B<-matrix(c(1, 2, 3, 4, 5, 6), byrow=F, ncol=3)
      B
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
      D<-diag(c(1,2,3)) # diagonal matrix
      ONE<-matrix(rep(1,9),ncol=3) # matrix of all ones
      D
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    2    0
## [3,]    0    0    3
      ONE
##      [,1] [,2] [,3]
## [1,]    1    1    1
## [2,]    1    1    1
## [3,]    1    1    1

More Hands On Operations with Vectors and Matrices

Transpose operation

      t(A)
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## [3,]    3    6
      t(B)
##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4
## [3,]    5    6
      t(D)
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    2    0
## [3,]    0    0    3

Element-wise operations

      A+B
##      [,1] [,2] [,3]
## [1,]    2    5    8
## [2,]    6    9   12
      A-B
##      [,1] [,2] [,3]
## [1,]    0   -1   -2
## [2,]    2    1    0
      A*B
##      [,1] [,2] [,3]
## [1,]    1    6   15
## [2,]    8   20   36
      A/B
##      [,1]      [,2] [,3]
## [1,]    1 0.6666667  0.6
## [2,]    2 1.2500000  1.0
      A^B
##      [,1] [,2]  [,3]
## [1,]    1    8   243
## [2,]   16  625 46656
      x+y
##      [,1]
## [1,]    5
## [2,]    7
## [3,]    9
      x-y
##      [,1]
## [1,]   -3
## [2,]   -3
## [3,]   -3
      x*y
##      [,1]
## [1,]    4
## [2,]   10
## [3,]   18
      x/y
##      [,1]
## [1,] 0.25
## [2,] 0.40
## [3,] 0.50
      y^x
##      [,1]
## [1,]    4
## [2,]   25
## [3,]  216

Vector and Matrix Operations

      # A%*%B will give an error message: non-conformable
      
      dim(A) # check the matrix dimension
## [1] 2 3
      dim(B)
## [1] 2 3
      A%*%t(B)
##      [,1] [,2]
## [1,]   22   28
## [2,]   49   64
      t(A)%*%B
##      [,1] [,2] [,3]
## [1,]    9   19   29
## [2,]   12   26   40
## [3,]   15   33   51
      t(B)%*%A
##      [,1] [,2] [,3]
## [1,]    9   12   15
## [2,]   19   26   33
## [3,]   29   40   51
      B%*%t(A)
##      [,1] [,2]
## [1,]   22   49
## [2,]   28   64
      x%*%t(y)
##      [,1] [,2] [,3]
## [1,]    4    5    6
## [2,]    8   10   12
## [3,]   12   15   18
      t(x)%*%y
##      [,1]
## [1,]   32
      t(x)%*%t(A)
##      [,1] [,2]
## [1,]   14   32
      B%*%D # multiplies each column of B by a number
##      [,1] [,2] [,3]
## [1,]    1    6   15
## [2,]    2    8   18
      diag(c(3,4))%*%B # multiplies each row of B by a number
##      [,1] [,2] [,3]
## [1,]    3    9   15
## [2,]    8   16   24

Determinant of Matrix

      det(D)
## [1] 6
      det(ONE)
## [1] 0

Inverse Matrix

      Di<-solve(D)
      D%*%Di
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    1    0
## [3,]    0    0    1
      Di%*%D
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    1    0
## [3,]    0    0    1
      N<-matrix(rnorm(9,sd=10^-6),3,3)
      Ii<-solve(ONE+N)
      (ONE+N)%*%Ii
##      [,1]          [,2] [,3]
## [1,]    1 -5.820766e-11    0
## [2,]    0  1.000000e+00    0
## [3,]    0  0.000000e+00    1
      Ii%*%(ONE+N)
##      [,1]         [,2]         [,3]
## [1,]    1 0.000000e+00 0.000000e+00
## [2,]    0 1.000000e+00 1.164153e-10
## [3,]    0 2.328306e-10 1.000000e+00

Hands On With Positive Definite Matrices, Quadratic Forms

Eigenvalues and Eigenvectors

      eigen(D)
## eigen() decomposition
## $values
## [1] 3 2 1
## 
## $vectors
##      [,1] [,2] [,3]
## [1,]    0    0    1
## [2,]    0    1    0
## [3,]    1    0    0
      M<-matrix(rnorm(9,sd=1),3,3)
      eigen(M)
## eigen() decomposition
## $values
## [1] -1.323489+0.000000i  1.118697+0.444746i  1.118697-0.444746i
## 
## $vectors
##                [,1]                  [,2]                  [,3]
## [1,] -0.08708167+0i -0.1135622+0.6059050i -0.1135622-0.6059050i
## [2,] -0.38344958+0i  0.6560852+0.0000000i  0.6560852+0.0000000i
## [3,] -0.91944723+0i -0.3916733-0.1900713i -0.3916733+0.1900713i
      eigen(var(M))
## eigen() decomposition
## $values
## [1] 1.303054e+00 6.281558e-01 2.220446e-16
## 
## $vectors
##           [,1]       [,2]       [,3]
## [1,] 0.7717766  0.4634804  0.4353697
## [2,] 0.6105456 -0.7314777 -0.3036023
## [3,] 0.1777495  0.5001262 -0.8475133

Random Data Using Quadratic Forms

N<-100
D<-matrix(rnorm(N),N,1)
E<-matrix(rnorm(N,sd=0.5),N,1)
D<-cbind(D,-D+E)
A<-var(D)
e<-eigen(A)
e
## eigen() decomposition
## $values
## [1] 1.7384200 0.1158355
## 
## $vectors
##            [,1]       [,2]
## [1,] -0.6314137 -0.7754462
## [2,]  0.7754462 -0.6314137
e$vectors %*% diag(e$values) %*% t(e$vectors) # the same as A
##            [,1]       [,2]
## [1,]  0.7627328 -0.7944617
## [2,] -0.7944617  1.0915227
A
##            [,1]       [,2]
## [1,]  0.7627328 -0.7944617
## [2,] -0.7944617  1.0915227

How To Statistics Random Matrix

x<-matrix(rnorm(6), ncol=2)
x
##           [,1]        [,2]
## [1,] 2.2277406 -0.85127152
## [2,] 1.1501472 -0.03577439
## [3,] 0.4009243  0.80914034
mean(x) # mean(x) DOES NOT produce what we want!!!
## [1] 0.6168177
apply(x,1,mean) # the column-wise mean
## [1] 0.6882345 0.5571864 0.6050323
apply(x,2,mean) # the row-wise mean
## [1]  1.25960401 -0.02596852

Matrix Representation of the Mean

      n<-dim(x)[1]
      ones<-matrix(rep(1,n),ncol=1)
      ones
##      [,1]
## [1,]    1
## [2,]    1
## [3,]    1
      mu<-t(x) %*% ones / n
      mu
##             [,1]
## [1,]  1.25960401
## [2,] -0.02596852

Variance | Standard Deviation of a Vector

      x
##           [,1]        [,2]
## [1,] 2.2277406 -0.85127152
## [2,] 1.1501472 -0.03577439
## [3,] 0.4009243  0.80914034
      var(x[,1])
## [1] 0.8433
      var(x[,2])
## [1] 0.689314
      sd(x[,1])
## [1] 0.9183137
      sd(x[,2])
## [1] 0.8302494
      var(x[,1], x[,2]) # covariance 
## [1] -0.7575119

Variance-Covariance Matrix

      var(x)
##            [,1]       [,2]
## [1,]  0.8433000 -0.7575119
## [2,] -0.7575119  0.6893140
      cor(x)
##            [,1]       [,2]
## [1,]  1.0000000 -0.9935502
## [2,] -0.9935502  1.0000000

Handling Deviations

      d1<-x[,1]-mu[1]*ones
      d2<-x[,2]-mu[2]*ones
      
      d1
##            [,1]
## [1,]  0.9681366
## [2,] -0.1094568
## [3,] -0.8586797
      d2
##              [,1]
## [1,] -0.825302999
## [2,] -0.009805867
## [3,]  0.835108866
      t(d1)%*%d2  # produces biased version of variance
##           [,1]
## [1,] -1.515024
      (n-1)*var(x[,1], x[,2])  
## [1] -1.515024

Sample Variance-Covariance Matrices

      ones%*%t(ones)  # 3x3 matrix of 1s
##      [,1] [,2] [,3]
## [1,]    1    1    1
## [2,]    1    1    1
## [3,]    1    1    1
      diag(3)         # identity matrix
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    1    0
## [3,]    0    0    1

Matrix Computation of S (unbiased Covariance)

      (1/(n-1)) * t(x) %*% (diag(3)-(1/n)*ones %*% t(ones)) %*% x
##            [,1]       [,2]
## [1,]  0.8433000 -0.7575119
## [2,] -0.7575119  0.6893140
      var(x)                          # ... produces the same result
##            [,1]       [,2]
## [1,]  0.8433000 -0.7575119
## [2,] -0.7575119  0.6893140

Just a Function called ellipse

ellipse<-function(mu,Sigma,R,col='red',add=FALSE,xlim=NULL,ylim=NULL,N=1000)
{   
    # Find coordinates of a circle
    t<-seq(0,2*pi,length.out=N)
    x<-R*cos(t)
    y<-R*sin(t)
    
    # Spectral decomposition of Sigma
    e<-eigen(Sigma)                    # spectral decomposition
    P<-e$vectors                       # eigenvectors
    L<-e$values 

  # Square root matrix
  S05<-P%*%sqrt(diag(L))%*%t(P)
    
  # Ellipse cordinates 
    vec<-cbind(x,y)
    vec<-t(vec%*%S05)
    x<-vec[1,]+mu[1]
    y<-vec[2,]+mu[2]
    
    if (add)
        {
            points(x,y,type='l',col=col)
        }   
    else
        plot(x,y,type='l',col=col,xlim=xlim,ylim=ylim)
}

ARE YOU REALLY DONE PRACTICING?