\(1.\) Let \(A\) and \(B\) be symmetric matrices of size \(p\) and let \(B\) be positive definite. \(λ\) is said to be a (generalized) eigenvalue of \(A\) with respect to \(B\) if there exists a nonzero vector \(x ∈ R^p\) such that \(Ax = λBx\). We will denote the ordered eigenvalues of \(A\) with respect to \(B\) as \(λ_1 ≥ λ_2 ≥ ··· ≥ λ_p\) with corresponding eigenvectors \(e_1, e_2, · · · , e_p\).

\((a)\) Show that if \(λ\) is an eigenvalue of \(A\) with respect to \(B\), then \(λ\) is also an eigenvalue of \((1)\) \(B^{−1}A\), \((2)\) \(AB^{−1}\) and \((3)\) \(B^{−1/2}AB^{−1/2}\).

\((1)\) Left-multiply \(Ax = \lambda Bx\) by \(B^{-1}\), which exists because \(B\) is positive definite:

\[\begin{equation} \begin{split} Ax &= \lambda Bx \\ B^{-1} Ax &= \lambda B^{-1}Bx \\ B^{-1} Ax &= \lambda x \end{split} \end{equation}\]

\((2)\) We have shown \(B^{-1} Ax = \lambda x\). Left-multiplying this by \(A\) gives:

\[\begin{equation} \begin{split} AB^{-1}Ax &= \lambda Ax \\ AB^{-1} y &= \lambda y \quad \text{where $y=Ax$} \end{split} \end{equation}\]

\((3)\) We know \(Ax = \lambda Bx\) and \(B\) is positive definite, so \(B^{1/2}\) and \(B^{-1/2}\) exist. Then

\[\begin{equation} \begin{split} B^{-\frac{1}{2}} Ax &= \lambda B^{-\frac{1}{2}} B^{\frac{1}{2}} B^{\frac{1}{2}} x \\ B^{-\frac{1}{2}} Ax &= \lambda B^{\frac{1}{2}} x \\ B^{-\frac{1}{2}} AB^{-\frac{1}{2}} y &= \lambda y \quad \text{where $y=B^{\frac{1}{2}}x$} \end{split} \end{equation}\]
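
As a quick numerical sanity check (a sketch only, using an arbitrary small \(A\) and positive definite \(B\); not part of the proof), the three matrices can be verified in R to share these eigenvalues:

# arbitrary 2 x 2 symmetric A and positive definite B for illustration
A <- matrix(c(2, 1, 1, 3), 2, 2)
B <- matrix(c(4, 1, 1, 2), 2, 2)

# B^{-1/2} via the spectral decomposition of B
eB <- eigen(B)
B_inv_sqrt <- eB$vectors %*% diag(1/sqrt(eB$values)) %*% t(eB$vectors)

sort(eigen(B_inv_sqrt %*% A %*% B_inv_sqrt)$values)  # (3)
sort(Re(eigen(solve(B) %*% A)$values))               # (1)
sort(Re(eigen(A %*% solve(B))$values))               # (2)
# the three sets of eigenvalues agree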

\((b)\) Show that

\[\max_{x\ne 0} \frac{x^\prime Ax}{x^\prime Bx}=\lambda_1\]

and that this maximum is attained at \(x = B^{−1/2}e_1\).

(If \(A\) is also invertible, \(Ax = \lambda Bx\) gives \(\frac{1}{\lambda} x = A^{-1}Bx\), so \(\frac{1}{\lambda}\) is an eigenvalue of \(B\) with respect to \(A\).) By part \(1a(3)\), the symmetric matrix \(B^{-1/2}AB^{-1/2}\) has eigenvalues \(\lambda_1 \ge \cdots \ge \lambda_p\), so write its spectral decomposition as \(B^{-1/2}AB^{-1/2} = P\Lambda P^\prime\). Then

\[\begin{equation} \begin{split} \max_{x \ne 0} \frac{x^\prime A x}{x^\prime B x} &= \max_{x \ne 0} \frac{x^\prime A x}{x^\prime B^{1/2} B^{1/2} x} \\ &= \max_{x \ne 0} \frac{x^\prime B^{1/2} \left( B^{-1/2} A B^{-1/2} \right) B^{1/2} x}{x^\prime B^{1/2} B^{1/2} x} \\ &= \max_{y \ne 0} \frac{y^\prime B^{-1/2} A B^{-1/2} y}{y^\prime y} \quad \text{where $y=B^{1/2}x$} \\ &= \max_{z \ne 0} \frac{z^\prime \Lambda z}{z^\prime z} \quad \text{where $z=P^\prime y$, since $y^\prime y = z^\prime z$} \\ &= \max_{z \ne 0} \frac{\sum^p_{i=1} \lambda_i z_i^2}{\sum^p_{i=1} z^2_i} \le \lambda_1\frac{\sum^p_{i=1} z_i^2}{\sum^p_{i=1} z^2_i}=\lambda_1 \end{split} \end{equation}\]

Setting \(y=e_1\), the eigenvector of \(B^{-1/2}AB^{-1/2}\) associated with \(\lambda_1\) (so that \(z = P^\prime e_1 = (1, 0, \ldots, 0)^\prime\) and the upper bound is attained), and combining with \(y=B^{\frac{1}{2}}x\), we have \(B^{\frac{1}{2}}x = e_1 \implies x=B^{-\frac{1}{2}}e_1\).

Thus, \[\max_{x\ne 0} \frac{x^\prime Ax}{x^\prime Bx}=\lambda_1\]

and this maximum is attained at \(x = B^{−1/2}e_1\).
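
A numerical check of the maximization result, reusing \(A\), \(B\) and B_inv_sqrt from the sketch in part (a) (again an illustration, not part of the proof):

# lambda_1 and e_1 from the symmetric matrix B^{-1/2} A B^{-1/2}
C_mat <- B_inv_sqrt %*% A %*% B_inv_sqrt
lambda_1 <- eigen(C_mat)$values[1]
e_1 <- eigen(C_mat)$vectors[, 1]

# the ratio at x = B^{-1/2} e_1 equals lambda_1
x_star <- B_inv_sqrt %*% e_1
as.numeric((t(x_star) %*% A %*% x_star) / (t(x_star) %*% B %*% x_star))
lambda_1

# and the ratio at random nonzero x never exceeds lambda_1
ratios <- replicate(1000, {x <- rnorm(2); as.numeric((t(x) %*% A %*% x) / (t(x) %*% B %*% x))})
max(ratios) <= lambda_1 + 1e-10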

\(2\). Consider the following one-factor MANOVA effects model

\[X_{li} = μ + τ_l + ε_{li}, i = 1, ..., n_l, l = 1, ..., g\]

where \(\left\{ε_{li}\right\}\) are \(i.i.d.\) \(N_p(0,Σ)\). Let \(n=n_1+n_2+···+n_g\) and \(μ= \frac{1}{n} \sum_l n_lμ_l\), so that \(τ_l =μ_l−μ\).

Recall the point estimates \(\hat{\mu}_l=\overline{X}_{l.}=\frac{1}{n_l}\sum_i X_{li}\), \(l=1, \ldots, g\), and \(\hat{\tau}_l=\overline{X}_{l.}-\overline{X}_{..}\), where \(\overline{X}_{..}=\frac{1}{n}\sum_l \sum_i X_{li}\).

\((a)\) Show that \(\overline{X}_{..}\sim N_p \left( \mu, \frac{1}{n}\Sigma \right)\).

\[\begin{equation} \begin{split} E(\bar{X}_{..}) &= E\left(\frac{1}{n}\sum_l \sum_i X_{li}\right) \\ &= \frac{1}{n} \sum_l \sum_i E(X_{li}) \\ &= \frac{1}{n} \sum_l n_l \mu_l \\ &= \mu \\ Var(\bar{X}_{..}) &= \frac{1}{n^2} \sum_l \sum_i Var(X_{li}) \\ &= \frac{1}{n^2} \sum_l n_l \Sigma \\ &= \frac{1}{n} \Sigma \end{split} \end{equation}\]

Since \(\bar{X}_{..}\) is a linear combination of the independent normal vectors \(X_{li}\), it is itself normal, so \(\overline{X}_{..}\sim N_p \left( \mu, \frac{1}{n}\Sigma \right)\).
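
A small simulation sketch (with hypothetical \(g=3\), \(n_l\), \(\mu_l\) and \(\Sigma\) chosen only for illustration; it uses mvrnorm from the MASS package) agrees with the mean \(\mu\) and covariance \(\frac{1}{n}\Sigma\) derived above:

library(MASS)   # for mvrnorm
set.seed(1)
n_l <- c(20, 30, 50)
mu_l <- list(c(0, 0), c(1, 2), c(-1, 1))
Sigma <- matrix(c(1, 0.5, 0.5, 2), 2, 2)
n <- sum(n_l)
mu <- Reduce(`+`, Map(function(m, nl) nl * m, mu_l, n_l)) / n

# simulate the grand mean vector many times
xbar_sim <- replicate(5000, {
  X <- do.call(rbind, Map(function(m, nl) mvrnorm(nl, m, Sigma), mu_l, n_l))
  colMeans(X)
})
rowMeans(xbar_sim)   # close to mu
cov(t(xbar_sim))     # close to Sigma / n
mu
Sigma / n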

\((b)\) Let \(\theta=\sum_l c_l \tau_l\) be a contrast (i.e. the real numbers \(\left\{ c_l \right\}\) satisfy the constraint \(\sum_l c_l=0\)). Let \(\hat{\theta}=\sum_l c_l \hat{\tau}_l\). Show that \[\hat{\theta}\sim N_p \left( \theta, \sum_l \frac{c^2_l}{n_l} \Sigma \right)\]

\[\begin{equation} \begin{split} E(\hat{\theta}) &= E\left(\sum_l c_l \hat{\tau}_l\right) \\ &= \sum_l c_l E(\hat{\tau}_l) \\ &= \sum_l c_l E(\bar{X}_{l.}-\bar{X}_{..}) \\ &= \sum_l c_l (\mu_l-\mu) \\ &= \sum_l c_l \tau_l \\ &= \theta \\ Var(\hat{\theta}) &= Var\left(\sum_l c_l \hat{\tau}_l\right) \\ &= Var\left(\sum_l c_l \bar{X}_{l.} - \Big(\sum_l c_l\Big)\bar{X}_{..}\right) \\ &= Var\left(\sum_l c_l \bar{X}_{l.}\right) \quad \text{since $\sum_l c_l=0$} \\ &= \sum_l c^2_l Var(\bar{X}_{l.}) \quad \text{since the $\bar{X}_{l.}$ are independent} \\ &= \sum_l \frac{c^2_l}{n_l} \Sigma \end{split} \end{equation}\]

Normality follows because \(\hat{\theta}=\sum_l c_l \bar{X}_{l.}\) is a linear combination of independent normal vectors, so \(\hat{\theta}\sim N_p \left( \theta, \sum_l \frac{c^2_l}{n_l} \Sigma \right)\).

Comment: The variance covariance matrix is denoted \(\Sigma\) and should not be confused with summation notation \(\sum_r\) which will always include an index in the subscript.
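
A similar sketch (reusing the hypothetical \(\mu_l\), \(n_l\) and \(\Sigma\) from the simulation in part (a)) checks the contrast result; note that because \(\sum_l c_l = 0\), \(\hat{\theta}\) reduces to \(\sum_l c_l \bar{X}_{l.}\):

c_l <- c(1, -0.5, -0.5)   # coefficients sum to zero, so this is a contrast
theta <- Reduce(`+`, Map(function(m, cl) cl * (m - mu), mu_l, c_l))

# simulate theta_hat many times
theta_sim <- replicate(5000, {
  xbars <- Map(function(m, nl) colMeans(mvrnorm(nl, m, Sigma)), mu_l, n_l)
  Reduce(`+`, Map(function(xb, cl) cl * xb, xbars, c_l))
})
rowMeans(theta_sim)        # close to theta
cov(t(theta_sim))          # close to sum(c_l^2 / n_l) * Sigma
theta
sum(c_l^2 / n_l) * Sigma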

\(3.\) Construct a one-way MANOVA using the width measurements (columns 3 and 5) from the iris data. Recall there are three groups.

\((a)\) Find \(S_1\), \(S_2\) and \(S_3\) and \(S_{pool}\). Do you think it is reasonable to assume the covariance matrices are equal?

setwd("/Users/terrysitu/Dropbox/Stat MS Classes/Regression Modeling/Data")
iris <- as.matrix(read.table("iris.txt", header=F))
head(iris, 3)
##      V1  V2  V3  V4  V5
## [1,]  1 5.1 3.5 1.4 0.2
## [2,]  1 4.9 3.0 1.4 0.2
## [3,]  1 4.7 3.2 1.3 0.2
iris <- iris[,c(1,3,5)]

# subset the data of iris to different groups #

iris1 <- as.matrix(subset(iris, iris[,1]==1)[,2:3])
iris2 <- as.matrix(subset(iris, iris[,1]==2)[,2:3])
iris3 <- as.matrix(subset(iris, iris[,1]==3)[,2:3])

# compute S_1, S_2 and S_3 #

S_1 <- var(iris1)
S_2 <- var(iris2)
S_3 <- var(iris3)

S_1
##             V3          V5
## V3 0.143689796 0.009297959
## V5 0.009297959 0.011106122
S_2
##            V3         V5
## V3 0.09846939 0.04120408
## V5 0.04120408 0.03910612
S_3
##            V3         V5
## V3 0.10400408 0.04762857
## V5 0.04762857 0.07543265
# compute S_pooled: the group sizes are equal (n_1 = n_2 = n_3 = 50), so the pooled
# estimate sum_l (n_l - 1) S_l / (n - g) reduces to the average of S_1, S_2 and S_3 #

S_pooled <-((nrow(iris1)-1)/(nrow(iris)-3))*(S_1+S_2+S_3)

S_pooled
##           V3         V5
## V3 0.1153878 0.03271020
## V5 0.0327102 0.04188163
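
For reference, a sketch of the same pooled estimate written in its general form (it does not rely on the equal group sizes and reproduces S_pooled above):

# weighted average of the group covariance matrices: sum_l (n_l - 1) S_l / (n - g)
S_list <- list(S_1, S_2, S_3)
n_vec  <- c(nrow(iris1), nrow(iris2), nrow(iris3))
S_pooled_general <- Reduce(`+`, Map(function(S, n) (n - 1) * S, S_list, n_vec)) / (sum(n_vec) - length(n_vec))
all.equal(S_pooled, S_pooled_general)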

Now we test whether it is reasonable to assume that the three groups share a common covariance matrix:

# sample sizes (all three groups have 50 observations) #
n1 <- n2 <- n3 <- nrow(iris1)

p <- 2
g <- 3

# compute the u #
u <- (1/(n1-1)+1/(n2-1)+1/(n3-1)-1/(n1+n2+n3-3)) * (2*p^2+3*p-1)/ (6*(p+1)*(g-1))
u 
## [1] 0.01965231
# compute the M #
M <- (n1+n2+n3-3)*log(det(S_pooled))-((n1-1)*log(det(S_1))+(n2-1)*log(det(S_2))+(n3-1)*log(det(S_3)))

# compute the C #
C <- (1-u)*M

# compute the df for the chi-square test #
v <- g*0.5*p*(p+1)-0.5*p*(p+1)

# conclude the test of equality of variances #
ifelse(C>qchisq(.95, v), "Reject null hypothesis", "Fail to reject null hypothesis")
## [1] "Reject null hypothesis"

\((b)\) Find \(W\), \(B\) and \(T\).

# compute x_bar_1, x_bar_2 and x_bar_3 #

x_bar_1 <- c(mean(iris1[,1]), mean(iris1[,2]))

x_bar_2 <- c(mean(iris2[,1]), mean(iris2[,2]))

x_bar_3 <- c(mean(iris3[,1]), mean(iris3[,2]))

# compute W = sum_l (n_l - 1) S_l (equal group sizes, so it equals (n_1 - 1)(S_1 + S_2 + S_3)) #
W <- (nrow(iris1)-1)*(S_1+S_2+S_3)
W
##         V3     V5
## V3 16.9620 4.8084
## V5  4.8084 6.1566
# compute B; x_bar is the grand mean (the unweighted average of group means, since the group sizes are equal) #
x_bar <- x_bar_1/3 + x_bar_2/3 + x_bar_3/3
x_bar
## [1] 3.057333 1.199333
B <- 50*(x_bar_1-x_bar)%*%t(x_bar_1-x_bar)+50*(x_bar_2-x_bar)%*%t(x_bar_2-x_bar)+50*(x_bar_3-x_bar)%*%t(x_bar_3-x_bar)
B
##           [,1]      [,2]
## [1,]  11.34493 -22.93267
## [2,] -22.93267  80.41333
# compute T #
T <- B+W
T
##           V3        V5
## V3  28.30693 -18.12427
## V5 -18.12427  86.56993
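
These matrices can be cross-checked against base R's manova() (a sketch; the SS component of its summary holds the treatment and residual sums-of-squares-and-cross-products matrices, which should match B and W):

fit <- manova(iris[, 2:3] ~ factor(iris[, 1]))
summary(fit)$SS   # treatment matrix matches B, "Residuals" matches W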

\((c)\) State the null and alternative hypothesis of interest and conduct the test, stating the test statistic and its distribution under the null hypothesis.

In English, we want to test whether there is any difference in treatment effects among the \(3\) iris groups.

In mathematics, \[H_0: \space \tau_1=\tau_2=\tau_3=0 \quad \text{vs.} \quad H_a: \space \text{not } H_0\]

The test statistic is Wilks' lambda, \(\Lambda^* = \frac{|W|}{|B+W|}\). For large \(n\), \(-\left(n-1-\frac{p+g}{2}\right)\ln\Lambda^*\) is approximately \(\chi^2_{p(g-1)}\) under \(H_0\); here \(p=2\) and \(g=3\), so the reference distribution is \(\chi^2_4\).

# compute big lambda star #
Lambda_star <- det(W) / det(T)

# conclude the test #
ifelse(-(nrow(iris)-1-(ncol(iris1)+3)/2) * log(Lambda_star) < qchisq(0.95, 2*(3-1)), "Fail to reject the null hypothesis", "Reject the null hypothesis")
## [1] "Reject the null hypothesis"

With \(\alpha=0.05\), we reject \(H_0\) and conclude that the group effects are not all equal.
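
For comparison, the same hypothesis can be tested with the manova() fit from the cross-check above, which reports Wilks' \(\Lambda^*\) with its exact F transformation rather than the chi-square approximation (a sketch):

summary(fit, test = "Wilks")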

\((d)\) Construct \(95\)% simultaneous confidence intervals for differences in the mean components for the two response variables for each pair of populations.

# compute the Bonferroni critical value t_{n-g}(alpha / (p g (g-1))) with alpha = .05, p = 2, g = 3 #
t <- qt(1-.05/(ncol(iris1)*max(iris[,1])*(max(iris[,1])-1)), nrow(iris)-max(iris[,1]))

# compute the difference in mean width vectors among the different groups #
tau_hat_1 <- x_bar_1-x_bar
tau_hat_2 <- x_bar_2-x_bar
tau_hat_3 <- x_bar_3-x_bar

tau_hat_1
## [1]  0.3706667 -0.9533333
tau_hat_2
## [1] -0.2873333  0.1266667
tau_hat_3
## [1] -0.08333333  0.82666667
# the differences in mean width effects between each pair of populations #

d_t1_t2 <- tau_hat_1-tau_hat_2
d_t1_t3 <- tau_hat_1-tau_hat_3
d_t2_t3 <- tau_hat_2-tau_hat_3

# 95% simultaneous confidence intervals for treatment effects #
upper <- d_t1_t2 + t*sqrt((1/nrow(iris1)+1/nrow(iris2))*(1/(nrow(iris)-max(iris[,1])))*diag(W))
lower <- d_t1_t2 - t*sqrt((1/nrow(iris1)+1/nrow(iris2))*(1/(nrow(iris)-max(iris[,1])))*diag(W))

table <- cbind(lower, upper)
rownames(table) <- c("t_13-t_23", "t_15-t_25")
table
##                lower      upper
## t_13-t_23  0.4763056  0.8396944
## t_15-t_25 -1.1894645 -0.9705355
upper <- d_t1_t3 + t*sqrt((1/nrow(iris1)+1/nrow(iris2))*(1/(nrow(iris)-max(iris[,1])))*diag(W))
lower <- d_t1_t3 - t*sqrt((1/nrow(iris1)+1/nrow(iris2))*(1/(nrow(iris)-max(iris[,1])))*diag(W))

table <- cbind(lower, upper)
rownames(table) <- c("t_13-t_33", "t_15-t_35")
table
##                lower      upper
## t_13-t_33  0.2723056  0.6356944
## t_15-t_35 -1.8894645 -1.6705355
upper <- d_t2_t3 + t*sqrt((1/nrow(iris1)+1/nrow(iris2))*(1/(nrow(iris)-max(iris[,1])))*diag(W))
lower <- d_t2_t3 - t*sqrt((1/nrow(iris1)+1/nrow(iris2))*(1/(nrow(iris)-max(iris[,1])))*diag(W))

table <- cbind(lower, upper)
rownames(table) <- c("t_23-t_33", "t_25-t_35")
table
##                lower       upper
## t_23-t_33 -0.3856944 -0.02230564
## t_25-t_35 -0.8094645 -0.59053548

None of the six intervals contains \(0\), so every pairwise difference in the width effects is significant at the simultaneous \(95\)% level.
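
The three interval computations above repeat the same arithmetic; a small helper (a hypothetical function, bonf_ci, written only as a sketch) makes the construction reusable:

# Bonferroni simultaneous CIs for the components of tau_k - tau_l
bonf_ci <- function(xbar_k, xbar_l, n_k, n_l, W, n, g, p, alpha = 0.05) {
  tcrit <- qt(1 - alpha/(p*g*(g - 1)), df = n - g)
  diff  <- xbar_k - xbar_l                       # tau_hat_k - tau_hat_l
  half  <- tcrit * sqrt((1/n_k + 1/n_l) * diag(W)/(n - g))
  cbind(lower = diff - half, upper = diff + half)
}

bonf_ci(x_bar_1, x_bar_2, 50, 50, W, n = 150, g = 3, p = 2)   # reproduces the first table above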

\(4\). Consider the following summary statistics from a study on electrical usage. Samples of size \(n_1 = 45\) and \(n_2 = 55\) were taken of Wisconsin homeowners with (Group \(1\)) and without (Group \(2\)) air conditioning, respectively. The first measurement is total on-peak consumption during July and the second measurement is total off-peak consumption in July. We find

\[S_1 = \begin{pmatrix} 13825.3 & 23823.4 \\ & 73107.4 \end{pmatrix}\]

and

\[S_2 = \begin{pmatrix} 8632 & 19616.7 \\ & 55964.5 \end{pmatrix}\]

Use Bartlett’s Test (Box’s M-test in book) to test \(H_0 :\space Σ_1 = Σ_2\). Use \(α = 0.05\).

# sample size #
n1 <- 45
n2 <- 55
p <- 2
g <- 2

# construct S matrices #
S1 <- matrix(c(13825.3, 23823.4, 23823.4, 73107.4), byrow=T, ncol=2)
S2 <- matrix(c(8632, 19616.7, 19616.7, 55964.5), byrow=T, ncol=2)
S_pooled <- (n1-1)*S1/(n1+n2-2) + (n2-1)*S2/(n1+n2-2)

# compute the u #
u <- (1/(n1-1)+1/(n2-1)-1/(n1+n2-2)) * (2*p^2+3*p-1)/ (6*(p+1)*(g-1))
u 
## [1] 0.02241901
# compute the M #
M <- (n1+n2-2)*log(det(S_pooled))-((n1-1)*log(det(S1))+(n2-1)*log(det(S2)))

# compute the C #
C <- (1-u)*M
C
## [1] 18.93306
# compute the df for the chi-square test #
v <- g*0.5*p*(p+1)-0.5*p*(p+1)

# conclude the test #
ifelse(C>qchisq(.95, v), "Reject null hypothesis", "Fail to reject null hypothesis")
## [1] "Reject null hypothesis"

Since \(C \approx 18.93\) exceeds \(\chi^2_{0.95}(3) \approx 7.81\), with \(\alpha = 0.05\) we reject \(H_0: \Sigma_1 = \Sigma_2\).