Introduction

This notebook contains a series of practical exercises on Hotelling's \(T^2\), drawn from Alvin Rencher's “Multivariate Statistical Inference and Applications” (1998) and “Methods of Multivariate Analysis” (2002).

Exercise 3.1

Show that the use of \(Z^2=n(\boldsymbol{\bar{y}}-\boldsymbol{\mu_0})'\boldsymbol{\Sigma^{-1}}(\boldsymbol{\bar{y}}-\boldsymbol{\mu_0})\) in (3.1) to test \(H_0: \mu=\mu_0\) is equivalent to the likelihood ratio test.

Solution 3.1

If we start from \(\boldsymbol{y_1},\boldsymbol{y_2},\ldots,\boldsymbol{y_n} \sim N_p(\boldsymbol{\mu},\boldsymbol{\Sigma})\),

the density of each observation is

\[f(\boldsymbol{y_i}; \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(\sqrt{2\pi})^{p} |\boldsymbol{\Sigma}|^{1/2}} e^{-{(\boldsymbol{y_i}-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1} (\boldsymbol{y_i}-\boldsymbol{\mu})}/2}\]

To form the likelihood ratio test we need the likelihood

\[L(\boldsymbol{\mu}, \boldsymbol{\Sigma}) = \prod_{i=1}^{n} \frac{1}{(\sqrt{2\pi})^{p} |\boldsymbol{\Sigma}|^{1/2}} e^{-{(\boldsymbol{y_i}-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1} (\boldsymbol{y_i}-\boldsymbol{\mu})}/2}\]

\[L(\boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(\sqrt{2\pi})^{np} |\boldsymbol{\Sigma}|^{n/2}} e^{-\sum_{i=1}^{n}{(\boldsymbol{y_i}-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1} (\boldsymbol{y_i}-\boldsymbol{\mu})}/2}\]

Since we use \(Z^2=n(\boldsymbol{\bar{y}}-\boldsymbol{\mu_0})'\boldsymbol{\Sigma^{-1}}(\boldsymbol{\bar{y}}-\boldsymbol{\mu_0})\), we assume \(\boldsymbol{\Sigma}\) known, and \(\boldsymbol{\hat{\mu}} = \boldsymbol{\bar{y}}\).

We want to construct

\[L.R = \frac{max_{H_0} L}{max_{H_1} L}\]

\[max_{H_0} L = \frac{1}{(\sqrt{2\pi})^{np} |\boldsymbol{\Sigma}|^{n/2}} e^{-\sum_{i=1}^{n}{(\boldsymbol{y_i}-\boldsymbol{\mu_0})'\boldsymbol{\Sigma}^{-1} (\boldsymbol{y_i}-\boldsymbol{\mu_0})}/2}\]

and

\[max_{H_1} L = \frac{1}{(\sqrt{2\pi})^{np} |\boldsymbol{\Sigma}|^{n/2}} e^{-\sum_{i=1}^{n}{(\boldsymbol{y_i}-\boldsymbol{\bar{y}})'\boldsymbol{\Sigma}^{-1} (\boldsymbol{y_i}-\boldsymbol{\bar{y}})}/2}\]

Thus

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \frac{\frac{1}{(\sqrt{2\pi})^{np} |\boldsymbol{\Sigma}|^{n/2}} e^{-\sum_{i=1}^{n}{(\boldsymbol{y_i}-\boldsymbol{\mu_0})'\boldsymbol{\Sigma}^{-1} (\boldsymbol{y_i}-\boldsymbol{\mu_0})}/2}}{\frac{1}{(\sqrt{2\pi})^{np} |\boldsymbol{\Sigma}|^{n/2}} e^{-\sum_{i=1}^{n}{(\boldsymbol{y_i}-\boldsymbol{\bar{y}})'\boldsymbol{\Sigma}^{-1} (\boldsymbol{y_i}-\boldsymbol{\bar{y}})}/2}}\]

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \frac{ e^{-\sum_{i=1}^{n}{(\boldsymbol{y_i}-\boldsymbol{\mu_0})'\boldsymbol{\Sigma}^{-1} (\boldsymbol{y_i}-\boldsymbol{\mu_0})}/2}\, (\sqrt{2\pi})^{np}|\boldsymbol{\Sigma}|^{n/2}}{e^{-\sum_{i=1}^{n}{(\boldsymbol{y_i}-\boldsymbol{\bar{y}})'\boldsymbol{\Sigma}^{-1} (\boldsymbol{y_i}-\boldsymbol{\bar{y}})}/2}\,(\sqrt{2\pi})^{np}|\boldsymbol{\Sigma}|^{n/2}}\]

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \frac{ e^{-\sum_{i=1}^{n}{(\boldsymbol{y_i}-\boldsymbol{\mu_0})'\boldsymbol{\Sigma}^{-1} (\boldsymbol{y_i}-\boldsymbol{\mu_0})}/2} }{e^{-\sum_{i=1}^{n}{(\boldsymbol{y_i}-\boldsymbol{\bar{y}})'\boldsymbol{\Sigma}^{-1} (\boldsymbol{y_i}-\boldsymbol{\bar{y}})}/2}}\]

Since each exponent is a scalar, we can apply the trace to it, because the trace of a scalar is the scalar itself.

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \frac{ e^{-\sum_{i=1}^{n}tr[(\boldsymbol{y_i}-\boldsymbol{\mu_0})'\boldsymbol{\Sigma}^{-1} (\boldsymbol{y_i}-\boldsymbol{\mu_0})]/2} }{e^{-\sum_{i=1}^{n}tr[(\boldsymbol{y_i}-\boldsymbol{\bar{y}})'\boldsymbol{\Sigma}^{-1} (\boldsymbol{y_i}-\boldsymbol{\bar{y}})]/2}}\]

Moreover, since \(tr(AB)=tr(BA)\),

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \frac{exp\{-tr[\boldsymbol{\Sigma}^{-1} \sum_i (\boldsymbol{y_i}-\boldsymbol{\mu_0})(\boldsymbol{y_i}-\boldsymbol{\mu_0})']/2 \}}{exp\{-tr[\boldsymbol{\Sigma}^{-1} \sum_i (\boldsymbol{y_i}-\boldsymbol{\bar{y}})(\boldsymbol{y_i}-\boldsymbol{\bar{y}})']/2 \}}\]

Adding and subtracting \(\boldsymbol{\bar{y}}\) we obtain

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \frac{exp\{-tr[\boldsymbol{\Sigma}^{-1} \sum_i (\boldsymbol{y_i}-\boldsymbol{\bar{y}}+\boldsymbol{\bar{y}}-\boldsymbol{\mu_0})(\boldsymbol{y_i}-\boldsymbol{\bar{y}}+\boldsymbol{\bar{y}}-\boldsymbol{\mu_0})']/2 \}}{exp\{-tr[\boldsymbol{\Sigma}^{-1} \sum_i (\boldsymbol{y_i}-\boldsymbol{\bar{y}}+\boldsymbol{\bar{y}}-\boldsymbol{\bar{y}})(\boldsymbol{y_i}-\boldsymbol{\bar{y}}+\boldsymbol{\bar{y}}-\boldsymbol{\bar{y}})']/2 \}} \]

where we expand the products and use \(\sum_{i=1}^{n}(\boldsymbol{y_i}-\boldsymbol{\bar{y}})=\boldsymbol{0}\) to eliminate the cross terms.

Thus

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \frac{exp\{ -tr[\boldsymbol{\Sigma^{-1}} \sum_i(\boldsymbol{y_i}-\boldsymbol{\bar{y}})(\boldsymbol{y_i}-\boldsymbol{\bar{y}})' +n(\boldsymbol{\bar{y}}-\boldsymbol{\mu_0})(\boldsymbol{\bar{y}}-\boldsymbol{\mu_0})']/2 \}}{exp\{ -tr[\boldsymbol{\Sigma^{-1}} \sum_i(\boldsymbol{y_i}-\boldsymbol{\bar{y}})(\boldsymbol{y_i}-\boldsymbol{\bar{y}})']/2 \}} \]

Since \(tr(A+B)=tr(A)+tr(B)\),

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \frac{e^{-tr[\boldsymbol{\Sigma^{-1}} \sum_i (\boldsymbol{y_i-\bar{y}})(\boldsymbol{y_i-\bar{y}})']/2 }\, e^{-tr[\boldsymbol{\Sigma^{-1}} n(\boldsymbol{\bar{y} - \mu_0})(\boldsymbol{\bar{y} - \mu_0})']/2}}{e^{-tr[\boldsymbol{\Sigma^{-1}} \sum_i (\boldsymbol{y_i-\bar{y}})(\boldsymbol{y_i-\bar{y}})']/2 }}\]

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = e^{-tr[\boldsymbol{\Sigma^{-1}} n(\boldsymbol{\bar{y} - \mu_0})(\boldsymbol{\bar{y} - \mu_0})']/2}\]

And since \(tr(AB)=tr(BA)\),

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = e^{-n(\boldsymbol{\bar{y} - \mu_0})'\boldsymbol{\Sigma^{-1}}(\boldsymbol{\bar{y} - \mu_0})/2} = e^{-Z^2/2} \;\; \blacksquare\]

This shows that \(L.R\) is a monotone function of \(Z^2\); solving explicitly for \(Z^2\) makes the rejection region clear:

\[e^{-Z^2/2} \le \lambda_0\] \[ln(e^{-Z^2/2}) \le ln(\lambda_0)\] \[-Z^2/2 \le ln(\lambda_0)\] \[ \Rightarrow Z^2 \ge -2ln(\lambda_0) \Rightarrow Z^2 \ge c \;\;\blacksquare\]
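As a quick numerical sanity check, the minimal sketch below (simulated data with a known \(\boldsymbol{\Sigma}\); all values are hypothetical and MASS::mvrnorm draws the sample) verifies the identity \(L.R = e^{-Z^2/2}\):

set.seed(1)
n <- 30; p <- 3
Sigma <- diag(p)                      # known covariance matrix
mu0   <- rep(0, p)                    # hypothesized mean
y     <- MASS::mvrnorm(n, mu = rep(0.3, p), Sigma = Sigma)
ybar  <- colMeans(y)

# Z^2 = n (ybar - mu0)' Sigma^{-1} (ybar - mu0)
z2 <- as.numeric(n * t(ybar - mu0) %*% solve(Sigma) %*% (ybar - mu0))

# Likelihood ratio computed directly from the two exponents
num <- -sum(apply(y, 1, function(yi) t(yi - mu0)  %*% solve(Sigma) %*% (yi - mu0)))/2
den <- -sum(apply(y, 1, function(yi) t(yi - ybar) %*% solve(Sigma) %*% (yi - ybar)))/2
LR  <- exp(num - den)

all.equal(LR, exp(-z2/2))             # TRUE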

Exercise 3.6

If a random sample \(\boldsymbol{y_1,y_2,..,y_n}\sim N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})\), show that \(\boldsymbol{\bar{v}} = \sqrt{n}(\boldsymbol{\bar{y}}-\boldsymbol{\mu_0})\) is \(N_p(\boldsymbol{0}, \boldsymbol{\Sigma})\) under \(H_0:\boldsymbol{\mu}=\boldsymbol{\mu_0}\), that \(\boldsymbol{W} = (n-1)\boldsymbol{S}\) is \(W_p(n-1, \boldsymbol{\Sigma})\) and that \(\boldsymbol{\bar{v}}\) and \(\boldsymbol{W}\) are independent, thus verifying that \(T^2 = n(\boldsymbol{\bar{y} - \mu_0})'\boldsymbol{S}^{-1}(\boldsymbol{\bar{y} - \mu_0})\) satisfies the formal definition (3.16).

Solution 3.6

First we need to verify the distributions of

\[\boldsymbol{\bar{v}} = \sqrt{n}(\boldsymbol{\bar{y}}-\boldsymbol{\mu_0})\] \[\boldsymbol{W} = (n-1)\boldsymbol{S}\]

assuming that \(\boldsymbol{y_1,y_2,\ldots,y_n}\sim N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})\).

For the first statistic, since we know that \(\boldsymbol{y_1,\ldots,y_n}\sim N_p(\boldsymbol{\mu, \Sigma})\),

it follows that

\[\boldsymbol{\bar{y}} \sim N_p\left(\boldsymbol{\mu}, \frac{1}{n}\boldsymbol{\Sigma}\right)\]

Thus

\[m_{\bar{v}}(t) = E[e^{t'\bar{v}}] = E[e^{t'(\sqrt{n}\bar{y}-\sqrt{n} \mu_0)}] = E[e^{t'\sqrt{n}\bar{y}}e^{-t'\sqrt{n} \mu_0}] = e^{-t'\sqrt{n} \mu_0}E[e^{t'\sqrt{n}\bar{y}}]\]

\[m_{\bar{v}}(t) = e^{-t'\sqrt{n} \mu_0}E[e^{(\sqrt{n}t)'\bar{y}}] = e^{-t'\sqrt{n} \mu_0} \cdot m_{\bar{y}}(\sqrt{n}t)\]

and considering that

\[m_{\bar{y}}(\sqrt{n}t) = e^{t'\sqrt{n}\mu + t'\sqrt{n}\frac{\boldsymbol{\Sigma}}{n}\sqrt{n}t/2}\]

\[\Rightarrow m_{\bar{v}}(t) = e^{-t'\sqrt{n} \mu_0}\, e^{t'\sqrt{n}\mu + t'\sqrt{n}\frac{\boldsymbol{\Sigma}}{n}\sqrt{n}t/2}\]

Under \(H_0 : \mu=\mu_0\),

\[m_{\bar{v}}(t) = e^{-t'\sqrt{n} \mu}\, e^{t'\sqrt{n}\mu + t'\sqrt{n}\frac{\boldsymbol{\Sigma}}{n}\sqrt{n}t/2}\]

\[\Rightarrow m_{\bar{v}}(t) = e^{t'\boldsymbol{\Sigma}t/2}\]

Thus \(\bar{v}\sim N_p(\boldsymbol{0},\boldsymbol{\Sigma}) \;\; \square\)

For the second statistic we use Theorem 2.3.I from Rencher's inference book, which states:

Theorem 2.3.I: If \(\boldsymbol{S}\) is based on a random sample from \(N_p(\boldsymbol{\mu, \Sigma})\), then

\[(n-1)\boldsymbol{S}=\sum_{i=1}^{n} \boldsymbol{(y_i-\bar{y})(y_i-\bar{y})'} \sim W_p(n-1, \boldsymbol{\Sigma})\]

From this we conclude directly that \(W\sim W_p(n-1,\boldsymbol{\Sigma}) \;\; \square\)

Now, to establish that \(\bar{v}\) and \(\boldsymbol{W}\) are independent, we use Theorem 2.3.D from the same book.

Theorem 2.3.D: If \(\boldsymbol{\bar{y}}\) and \(\boldsymbol{S}\) are computed from \(\boldsymbol{y_1,\ldots,y_n}\sim N_p(\boldsymbol{\mu,\Sigma})\), then \(\boldsymbol{\bar{y}}\) and \(\boldsymbol{S}\) are independent.

In our case \(\boldsymbol{\bar{v}}\) and \(\boldsymbol{W}\) are functions, indeed linear functions, of \(\boldsymbol{\bar{y}}\) and \(\boldsymbol{S}\) respectively, so the independence carries over; we conclude that \(\boldsymbol{\bar{v}}\) and \(\boldsymbol{W}\) are independent. \(\; \square\)

With this, the conditions are satisfied for \(T^2\) to be defined as

\[T^2=\boldsymbol{\bar{v}}'\left(\frac{W}{\vartheta}\right)^{-1}\boldsymbol{\bar{v}}\]

\(\vartheta = \text{degrees of freedom}\)

Thus

\[T^2=\boldsymbol{[\sqrt{n}(\bar{y}-\mu_0)]}'\left(\frac{W}{n-1}\right)^{-1}\boldsymbol{\sqrt{n}(\bar{y}-\mu_0)}\]

\[T^2=\boldsymbol{[\sqrt{n}(\bar{y}-\mu_0)]}'\left(\frac{(n-1)\boldsymbol{S}}{n-1}\right)^{-1}\boldsymbol{\sqrt{n}(\bar{y}-\mu_0)}\]

and, \(\sqrt{n}\) being a scalar,

\[T^2=n\boldsymbol{(\bar{y}-\mu_0)}'\boldsymbol{S}^{-1}\boldsymbol{(\bar{y}-\mu_0)} \;\; \blacksquare\]
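As a numerical sanity check, this minimal sketch (simulated data and hypothetical values, with MASS::mvrnorm for the draws) confirms that the formal definition and the usual formula for \(T^2\) agree:

set.seed(2)
n <- 25; p <- 4
y    <- MASS::mvrnorm(n, mu = rep(0, p), Sigma = diag(p))
mu0  <- rep(0, p)
ybar <- colMeans(y)
S    <- cov(y)

v <- sqrt(n) * (ybar - mu0)           # v ~ N_p(0, Sigma) under H0
W <- (n - 1) * S                      # W ~ W_p(n-1, Sigma)

t2.def  <- as.numeric(t(v) %*% solve(W / (n - 1)) %*% v)
t2.form <- as.numeric(n * t(ybar - mu0) %*% solve(S) %*% (ybar - mu0))
all.equal(t2.def, t2.form)            # TRUE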

Exercise 3.9

Show that \(T^2=n\boldsymbol{(\bar{y}-\mu_0)}'\boldsymbol{S}^{-1}\boldsymbol{(\bar{y}-\mu_0)}\) is unchanged by the transformation \(\boldsymbol{z_i}=\boldsymbol{Ay_i+b}\) where \(\boldsymbol{A}\) is nonsingular, thus verifying property 1 in Section 3.3.6.

Solution 3.9

We want to show that \(T^2_z = T^2_y\) under a transformation of the form \(\boldsymbol{z_i}=\boldsymbol{Ay_i+b}\),

where \(T^2_y\) has the form

\[T^2_y = n(\boldsymbol{\bar{y}} - \boldsymbol{\mu_{0y}})'\boldsymbol{S}^{-1} (\boldsymbol{\bar{y}} - \boldsymbol{\mu_{0y}})\]

and we want it to equal

\[T^2_z = n(\boldsymbol{\bar{z}} - \boldsymbol{\mu_{0z}})'\boldsymbol{S_z}^{-1} (\boldsymbol{\bar{z}} - \boldsymbol{\mu_{0z}})\]

For this we need \(\boldsymbol{\bar{z}}\), \(\boldsymbol{\mu_{0z}}\) and \(\boldsymbol{S_z}\),

which are obtained from

\[\boldsymbol{z_i}=\boldsymbol{Ay_i+b}\]

For \(\boldsymbol{\bar{z}}\):

\[\boldsymbol{\bar{z}} = \frac{1}{n}\cdot\underset{p\times p}{\boldsymbol{A}} \, \underset{p\times 1}{(\boldsymbol{y_1+\cdots+y_n})} + \underset{p\times 1}{\boldsymbol{b}}\] \[\Rightarrow \boldsymbol{\bar{z}} = \boldsymbol{A\bar{y} + b} \;\; \square\]

For \(\boldsymbol{S_z}\):

\[\boldsymbol{S_z} = \frac{1}{n-1} \sum_{i=1}^{n} \boldsymbol{(z_i - \bar{z})(z_i-\bar{z})'} = \frac{1}{n-1} \sum_{i=1}^{n} \boldsymbol{(Ay_i+b-A\bar{y}-b)(Ay_i+b-A\bar{y}-b)'}\]

\[\boldsymbol{S_z} = \frac{1}{n-1} \sum_{i=1}^{n} \boldsymbol{(Ay_i-A\bar{y})(Ay_i-A\bar{y})'}\]

\[\boldsymbol{S_z} = \boldsymbol{A \left[ \frac{1}{n-1} \sum_{i=1}^{n} \boldsymbol{(y_i-\bar{y})(y_i-\bar{y})'}\right] A'}\]

\[\Rightarrow \boldsymbol{S_z} = \boldsymbol{ASA'} \;\; \square\]

And for \(\boldsymbol{\mu_{0z}}\):

\[E[\boldsymbol{z_i}] = E[\boldsymbol{Ay_i + b}] = E[\boldsymbol{Ay_i}] + E[\boldsymbol{b}] = \boldsymbol{A}E[\boldsymbol{y_i}] + \boldsymbol{b} = \boldsymbol{A\mu_y + b}\]

Assuming \(H_0:\boldsymbol{\mu_z=\mu_{0z}}\),

\[E[\boldsymbol{z_i}] = \boldsymbol{\mu_{z}} = \boldsymbol{\mu_{0z}} = \boldsymbol{A\mu_y + b} \;\; \square\]

We can now substitute these values into

\[T^2_z = n(\boldsymbol{\bar{z}} - \boldsymbol{\mu_{0z}})'\boldsymbol{S_z}^{-1} (\boldsymbol{\bar{z}} - \boldsymbol{\mu_{0z}})\]

\[T^{2}_{z} = n(\boldsymbol{A\bar{y} + b} - (\boldsymbol{A\mu_y + b}))' (\boldsymbol{ASA'})^{-1}(\boldsymbol{A\bar{y} + b} - (\boldsymbol{A\mu_y + b}))\]

\[T^{2}_{z} = n(\boldsymbol{A\bar{y} + b} - \boldsymbol{A\mu_y - b})' (\boldsymbol{ASA'})^{-1}(\boldsymbol{A\bar{y} + b} - \boldsymbol{A\mu_y - b})\]

\[T^{2}_{z} = n(\boldsymbol{A\bar{y} } - \boldsymbol{A\mu_y })' (\boldsymbol{ASA'})^{-1}(\boldsymbol{A\bar{y} } - \boldsymbol{A\mu_y })\]

\[T^{2}_{z} = n(\boldsymbol{\bar{y} } - \boldsymbol{\mu_y })'\boldsymbol{A}' (\boldsymbol{ASA'})^{-1}\boldsymbol{A}(\boldsymbol{\bar{y} } - \boldsymbol{\mu_y })\]

\[T^{2}_{z} = n(\boldsymbol{\bar{y} } - \boldsymbol{\mu_y })' \boldsymbol{S}^{-1}(\boldsymbol{\bar{y} } - \boldsymbol{\mu_y })\]

\[T^2_z = T^2_y \;\; \blacksquare\]

This shows that \(T^2\) is invariant under transformations of the form \(\boldsymbol{z_i = Ay_i +b}\) with \(\boldsymbol{A}\) nonsingular.
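A minimal numerical sketch of this invariance (the matrix A, the shift b and the data are all hypothetical, simulated with MASS::mvrnorm):

set.seed(3)
n <- 20; p <- 3
y   <- MASS::mvrnorm(n, mu = rep(1, p), Sigma = diag(p))
mu0 <- rep(1, p)

A <- matrix(c(2,0,1, 0,1,0, 1,0,3), p, p)    # nonsingular (det = 5)
b <- c(5, -2, 7)
z    <- t(A %*% t(y) + b)                    # z_i = A y_i + b
mu0z <- as.numeric(A %*% mu0 + b)

T2y <- n * t(colMeans(y) - mu0)  %*% solve(cov(y)) %*% (colMeans(y) - mu0)
T2z <- n * t(colMeans(z) - mu0z) %*% solve(cov(z)) %*% (colMeans(z) - mu0z)
all.equal(as.numeric(T2y), as.numeric(T2z))  # TRUE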

Exercise 3.20

Show that \(\boldsymbol{S_{pl}} = [(n_1-1)\boldsymbol{S_1}+(n_2-1)\boldsymbol{S_2}]/(n_1+n_2-2)\) is an unbiased estimator of the common population covariance matrix \(\boldsymbol{\Sigma}\)

Solution 3.20

Starting from

\[\boldsymbol{S_{pl}} = \frac{(n_1-1)\boldsymbol{S_1} + (n_2-1)\boldsymbol{S_2}}{n_1+n_2-2}\]

we have

\[E[\boldsymbol{S_{pl}}] = E\left[\frac{(n_1-1)\boldsymbol{S_1} + (n_2-1)\boldsymbol{S_2}}{n_1+n_2-2}\right] = E\left[\frac{(n_1-1)\boldsymbol{S_1}}{n_1+n_2-2} + \frac{(n_2-1)\boldsymbol{S_2}}{n_1+n_2-2}\right]\]

\[E[\boldsymbol{S_{pl}}] = \frac{(n_1-1)}{(n_1+n_2-2)}E[\boldsymbol{S_1}]+\frac{(n_2-1)}{(n_1+n_2-2)}E[\boldsymbol{S_2}]\]

\[E[\boldsymbol{S_{pl}}] = \frac{(n_1-1)}{(n_1+n_2-2)}\boldsymbol{\Sigma_1}+\frac{(n_2-1)}{(n_1+n_2-2)}\boldsymbol{\Sigma_2}\]

Assuming a common covariance matrix, \(\boldsymbol{\Sigma_1 = \Sigma_2 = \Sigma}\),

\[E[\boldsymbol{S_{pl}}] = \frac{(n_1-1)}{(n_1+n_2-2)}\boldsymbol{\Sigma}+\frac{(n_2-1)}{(n_1+n_2-2)}\boldsymbol{\Sigma} = \boldsymbol{\Sigma} \left[\frac{(n_1-1)}{(n_1+n_2-2)}+\frac{(n_2-1)}{(n_1+n_2-2)} \right]\]

\[E[\boldsymbol{S_{pl}}] = \boldsymbol{\Sigma} \left[\frac{n_1+n_2-2}{n_1+n_2-2} \right] = \boldsymbol{\Sigma} \;\; \blacksquare\]

which shows that \(\boldsymbol{S_{pl}}\) is an unbiased estimator of \(\boldsymbol{\Sigma}\).
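A small Monte Carlo sketch of this unbiasedness (hypothetical sizes and covariance; MASS::mvrnorm simulates the samples) shows the average of \(\boldsymbol{S_{pl}}\) approaching \(\boldsymbol{\Sigma}\):

set.seed(4)
p <- 2; n1 <- 8; n2 <- 12
Sigma <- matrix(c(2, 0.5, 0.5, 1), p, p)   # common covariance matrix

Spl.sum <- matrix(0, p, p)
B <- 5000
for (b in 1:B) {
  S1 <- cov(MASS::mvrnorm(n1, rep(0, p), Sigma))
  S2 <- cov(MASS::mvrnorm(n2, rep(0, p), Sigma))
  Spl.sum <- Spl.sum + ((n1-1)*S1 + (n2-1)*S2) / (n1+n2-2)
}
Spl.sum / B    # close to Sigma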

Exercise 3.21

Show that the two-sample \(T^2\)-statistic given in (3.57) satisfies the formal definition of a random variable in (3.16)

Solution 3.21

We wish to show that

\[T^2 = \frac{n_1n_2}{n_1+n_2}\boldsymbol{(\bar{y_1}-\bar{y_2})'S_{pl}^{-1}(\bar{y_1}-\bar{y_2})}\]

satisfies

\[T^2=\boldsymbol{z'}\left( \frac{\boldsymbol{W}}{\vartheta} \right)^{-1}\boldsymbol{z}\]

We propose the transformations

\[\boldsymbol{\bar{z}} = \sqrt{\frac{n_1n_2}{n_1+n_2}} (\boldsymbol{\bar{y_1}-\bar{y_2}})\]

\[\boldsymbol{W} = (n_1+n_2-2)\boldsymbol{S_{pl}}\]

where, based on Theorem 2.3.I,

\[W=(n_1+n_2-2)\boldsymbol{S_{pl}} \sim W_p(n_1+n_2-2,\boldsymbol{\Sigma}) \;\; \square\]

For \(\boldsymbol{\bar{z}}\) we determine its distribution from the fact that

\[\boldsymbol{y_{11},y_{12},\ldots,y_{1n_1}} \sim N_p(\boldsymbol{\mu_1, \Sigma_1})\] \[\boldsymbol{y_{21},y_{22},\ldots,y_{2n_2}} \sim N_p(\boldsymbol{\mu_2, \Sigma_2})\]

Therefore

\[\boldsymbol{\bar{y_1}} \sim N_p \left(\boldsymbol{\mu_1}, \frac{1}{n_1}\boldsymbol{\Sigma_1} \right)\] \[\boldsymbol{\bar{y_2}} \sim N_p \left(\boldsymbol{\mu_2}, \frac{1}{n_2}\boldsymbol{\Sigma_2} \right)\]

and, moreover, \(\boldsymbol{\bar{y_1}}\) and \(\boldsymbol{\bar{y_2}}\) are independent.

Thus

\[m_{\bar{z}}(t) = E\left[exp\left\{t'\left( \sqrt{\frac{n_1n_2}{n_1+n_2}}[\boldsymbol{\bar{y_1}}-\boldsymbol{\bar{y_2}}] \right)\right\} \right]\]

Writing \(\theta = \frac{n_1n_2}{n_1+n_2}\),

\[m_{\bar{z}}(t) = E\left[exp\left\{t'\left( \sqrt{\theta}[\boldsymbol{\bar{y_1}}-\boldsymbol{\bar{y_2}}] \right)\right\} \right]\]

\[m_{\bar{z}}(t) = E\left[exp\left\{t'\left( \sqrt{\theta}\boldsymbol{\bar{y_1}} \right)\right\}exp\left\{-t'\left( \sqrt{\theta}\boldsymbol{\bar{y_2}} \right)\right\} \right]\]

\[m_{\bar{z}}(t) = E\left[ exp\left\{\left(\sqrt{\theta}t \right)' \boldsymbol{\bar{y}_1} \right\}exp\left\{\left(-\sqrt{\theta}t \right)' \boldsymbol{\bar{y}_2} \right\}\right]\]

Since the two sample means are independent,

\[m_{\bar{z}}(t) = m_{\bar{y_1}}\left(\sqrt{\theta}t\right)\cdot m_{\bar{y_2}}\left(-\sqrt{\theta}t \right)\]

so that

\[m_{\bar{z}}(t) = exp\left\{\sqrt{\theta}\,t'\boldsymbol{\mu_1} + \theta\, t'\boldsymbol{\Sigma_1}t/(2n_1)\right\} \cdot exp\left\{-\sqrt{\theta}\,t'\boldsymbol{\mu_2} + \theta\, t'\boldsymbol{\Sigma_2}t/(2n_2)\right\}\]

Under the assumption \(H_0: \boldsymbol{\mu_1 = \mu_2 = \mu}\),

\[m_{\bar{z}}(t) = exp\left\{\sqrt{\theta}\,t'\boldsymbol{\mu} + \theta\, t'\boldsymbol{\Sigma_1}t/(2n_1)\right\} \cdot exp\left\{-\sqrt{\theta}\,t'\boldsymbol{\mu} + \theta\, t'\boldsymbol{\Sigma_2}t/(2n_2)\right\}\]

\[m_{\bar{z}}(t) = exp\left\{\theta\, t'\boldsymbol{\Sigma_1}t/(2n_1) + \theta\, t'\boldsymbol{\Sigma_2}t/(2n_2)\right\}\]

\[m_{\bar{z}}(t) = exp\left\{t'\frac{n_1n_2}{n_1+n_2}\frac{1}{n_1} \boldsymbol{\Sigma_1} t/2 + t'\frac{n_1n_2}{n_1+n_2}\frac{1}{n_2} \boldsymbol{\Sigma_2} t/2\right\}\]

\[m_{\bar{z}}(t) = exp\left\{t'\frac{n_2}{n_1+n_2} \boldsymbol{\Sigma_1} t/2 + t'\frac{n_1}{n_1+n_2} \boldsymbol{\Sigma_2} t/2\right\}\]

With \(\boldsymbol{\Sigma_1=\Sigma_2=\Sigma}\),

\[m_{\bar{z}}(t) = exp\left\{t' \boldsymbol{\Sigma} t/2 \left[ \frac{n_1}{n_1+n_2} + \frac{n_2}{n_1+n_2} \right]\right\}\]

\[m_{\bar{z}}(t) = exp\left\{t' \boldsymbol{\Sigma} t/2 \right\} = e^{t'\boldsymbol{\Sigma}t/2}\]

Hence \(\boldsymbol{\bar{z}} \sim N_p(\boldsymbol{0},\boldsymbol{\Sigma}) \;\; \square\)

Finally, based on

Theorem 2.3.D: If \(\boldsymbol{\bar{y}}\) and \(\boldsymbol{S}\) are computed from \(\boldsymbol{y_1,\ldots,y_n}\sim N_p(\boldsymbol{\mu,\Sigma})\), then \(\boldsymbol{\bar{y}}\) and \(\boldsymbol{S}\) are independent.

In our case \(\boldsymbol{\bar{z}}\) and \(\boldsymbol{W}\) are (linear) functions of the sample means and covariance matrices, so the independence carries over; we conclude that \(\boldsymbol{\bar{z}}\) and \(\boldsymbol{W}\) are independent. \(\; \square\)

In this way, for

\[T^2=\boldsymbol{z'}\left( \frac{\boldsymbol{W}}{\vartheta} \right)^{-1}\boldsymbol{z}\]

\[T^2= \left(\sqrt{\frac{n_1n_2}{n_1+n_2}} (\boldsymbol{\bar{y_1}-\bar{y_2}})\right)'\left( \frac{\boldsymbol{W}}{n_1+n_2-2} \right)^{-1}\left(\sqrt{\frac{n_1n_2}{n_1+n_2}} (\boldsymbol{\bar{y_1}-\bar{y_2}})\right)\]

\[T^2= \left(\sqrt{\frac{n_1n_2}{n_1+n_2}} (\boldsymbol{\bar{y_1}-\bar{y_2}})\right)'\left( \frac{(n_1+n_2-2)\boldsymbol{S_{pl}} }{n_1+n_2-2} \right)^{-1}\left(\sqrt{\frac{n_1n_2}{n_1+n_2}} (\boldsymbol{\bar{y_1}-\bar{y_2}})\right)\]

\[T^2 = \frac{n_1n_2}{n_1+n_2} (\boldsymbol{\bar{y_1}-\bar{y_2}})'\boldsymbol{S_{pl}}^{-1}(\boldsymbol{\bar{y_1}-\bar{y_2}}) \;\; \blacksquare\]
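As with the one-sample case, a minimal numerical sketch (hypothetical sizes, simulated data via MASS::mvrnorm) confirms that the formal definition reproduces the two-sample statistic:

set.seed(5)
p <- 3; n1 <- 15; n2 <- 18
y1 <- MASS::mvrnorm(n1, rep(0, p), diag(p))
y2 <- MASS::mvrnorm(n2, rep(0, p), diag(p))
Spl <- ((n1-1)*cov(y1) + (n2-1)*cov(y2)) / (n1+n2-2)

z <- sqrt(n1*n2/(n1+n2)) * (colMeans(y1) - colMeans(y2))
W <- (n1+n2-2) * Spl

t2.def  <- as.numeric(t(z) %*% solve(W/(n1+n2-2)) %*% z)
t2.form <- as.numeric(n1*n2/(n1+n2) *
             t(colMeans(y1)-colMeans(y2)) %*% solve(Spl) %*%
             (colMeans(y1)-colMeans(y2)))
all.equal(t2.def, t2.form)            # TRUE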

Exercise 3.24

Prove Theorem 3.5.C

Solution 3.24

The theorem states:

Theorem 3.5.C: If \(y_{11}, y_{12}, \ldots, y_{1n_1}\) and \(y_{21},\ldots,y_{2n_2}\) are two independent random samples from \(N_p(\boldsymbol{\mu_1}, \boldsymbol{\Sigma_1})\) and \(N_p(\boldsymbol{\mu_2}, \boldsymbol{\Sigma_2})\), respectively, where \(\boldsymbol{\Sigma_1=\Sigma_2}\), then the likelihood ratio approach leads to the \(T^2\)-statistic.

We now prove this.

We start from

\[L(\boldsymbol{\mu_1,\mu_2,\Sigma_1,\Sigma_2}) = \prod_{i=1}^{n_1} f(\boldsymbol{y_{1i};\mu_1,\Sigma_1}) \cdot \prod_{i=1}^{n_2} f(\boldsymbol{y_{2i};\mu_2,\Sigma_2})\]

where we know that \(\boldsymbol{\Sigma_1=\Sigma_2}\), so

\[L(\boldsymbol{\mu_1,\mu_2,\Sigma}) = \prod_{i=1}^{n_1} f(\boldsymbol{y_{1i};\mu_1,\Sigma}) \cdot \prod_{i=1}^{n_2} f(\boldsymbol{y_{2i};\mu_2,\Sigma})\]

Thus

\[L(\boldsymbol{\mu_1,\mu_2,\Sigma}) = \prod_{i=1}^{n_1} \frac{1}{(\sqrt{2\pi})^{p}|\boldsymbol{\Sigma}|^{1/2}}e^{-\frac{1}{2}\boldsymbol{(y_{1i}-\mu_1)'\Sigma^{-1}}(y_{1i}-\mu_1)} \cdot \prod_{i=1}^{n_2} \frac{1}{(\sqrt{2\pi})^{p}|\boldsymbol{\Sigma}|^{1/2}}e^{-\frac{1}{2}\boldsymbol{(y_{2i}-\mu_2)'\Sigma^{-1}}(y_{2i}-\mu_2)}\]

\[L(\boldsymbol{\mu_1,\mu_2,\Sigma}) = \frac{1}{(\sqrt{2\pi})^{(n_1+n_2)p} |\boldsymbol{\Sigma}|^{(n_1+n_2)/2}} e^{-\frac{1}{2} \sum_{i=1}^{n_1} \boldsymbol{(y_{1i}-\mu_1)'\Sigma^{-1} (y_{1i}-\mu_1)} -\frac{1}{2} \sum_{i=1}^{n_2} \boldsymbol{(y_{2i}-\mu_2)'\Sigma^{-1} (y_{2i}-\mu_2)} }\]

We want to construct

\[L.R = \frac{max_{H_0} L}{max_{H_1} L}\]

For \(max_{H_1} L\) we need the maximum likelihood estimators. Since each exponent is a quadratic form, and therefore a scalar, we apply the trace in the exponents:

\[L(\boldsymbol{\mu_1,\mu_2,\Sigma}) = \frac{1}{(\sqrt{2\pi})^{(n_1+n_2)p} |\boldsymbol{\Sigma}|^{(n_1+n_2)/2}} e^{-\frac{1}{2} \sum_{i=1}^{n_1} tr\boldsymbol{(y_{1i}-\mu_1)'\Sigma^{-1} (y_{1i}-\mu_1)} -\frac{1}{2} \sum_{i=1}^{n_2} tr\boldsymbol{(y_{2i}-\mu_2)'\Sigma^{-1} (y_{2i}-\mu_2)} }\]

Using \(tr(AB)=tr(BA)\),

\[L(\boldsymbol{\mu_1,\mu_2,\Sigma}) = \frac{1}{(\sqrt{2\pi})^{(n_1+n_2)p} |\boldsymbol{\Sigma}|^{(n_1+n_2)/2}} e^{-\frac{1}{2} tr\Sigma^{-1}\sum_{i=1}^{n_1} \boldsymbol{ (y_{1i}-\mu_1)(y_{1i}-\mu_1)'} -\frac{1}{2} tr\Sigma^{-1}\sum_{i=1}^{n_2} \boldsymbol{ (y_{2i}-\mu_2)(y_{2i}-\mu_2)'} }\]

Expanding \(\sum_{i=1}^{n_1} \boldsymbol{ (y_{1i}-\mu_1)(y_{1i}-\mu_1)'}\) inside the trace,

\[\sum_{i=1}^{n_1} \boldsymbol{ (y_{1i}-\mu_1)(y_{1i}-\mu_1)'} = \sum_{i=1}^{n_1} \boldsymbol{ (y_{1i}-\bar{y_1}+\bar{y_1}-\mu_1)(y_{1i}-\bar{y_1}+\bar{y_1}-\mu_1)'}\]

\[\sum_{i=1}^{n_1} \boldsymbol{ (y_{1i}-\mu_1)(y_{1i}-\mu_1)'} = \sum_{i=1}^{n_1}\boldsymbol{ (y_{1i}-\bar{y_1})(y_{1i}-\bar{y_1})'}+n_1\boldsymbol{(\bar{y_1}-\mu_1)(\bar{y_1}-\mu_1)'} \]

\[\sum_{i=1}^{n_1} \boldsymbol{ (y_{1i}-\mu_1)(y_{1i}-\mu_1)'} =\boldsymbol{W_1}+n_1\boldsymbol{(\bar{y_1}-\mu_1)(\bar{y_1}-\mu_1)'}\]

and similarly

\[\sum_{i=1}^{n_2} \boldsymbol{ (y_{2i}-\mu_2)(y_{2i}-\mu_2)'} =\boldsymbol{W_2}+n_2\boldsymbol{(\bar{y_2}-\mu_2)(\bar{y_2}-\mu_2)'}\]

Thus

\[L(\boldsymbol{\mu_1,\mu_2,\Sigma}) = \frac{1}{(\sqrt{2\pi})^{(n_1+n_2)p} |\boldsymbol{\Sigma}|^{(n_1+n_2)/2}} e^{-\frac{1}{2} tr[\boldsymbol{\Sigma^{-1}}(\boldsymbol{W_1}+n_1\boldsymbol{(\bar{y_1}-\mu_1)(\bar{y_1}-\mu_1)'})] -\frac{1}{2} tr[\boldsymbol{\Sigma^{-1}}(\boldsymbol{W_2}+n_2\boldsymbol{(\bar{y_2}-\mu_2)(\bar{y_2}-\mu_2)'})] }\]

Taking logarithms,

\[\begin{multline*} ln(L(\boldsymbol{\mu_1,\mu_2,\Sigma})) = -(n_1+n_2)p\,ln(\sqrt{2\pi}) - \frac{n_1+n_2}{2}ln|\boldsymbol{\Sigma}| -\frac{1}{2}tr(\boldsymbol{\Sigma^{-1}W_1})-\frac{n_1}{2}\boldsymbol{(\bar{y_1}-\mu_1)'\Sigma^{-1}(\bar{y_1}-\mu_1)}\\ -\frac{1}{2}tr(\boldsymbol{\Sigma^{-1}W_2})-\frac{n_2}{2}\boldsymbol{(\bar{y}_2-\mu_2)'\Sigma^{-1}(\bar{y}_2-\mu_2)} \end{multline*}\]

Differentiating with respect to \(\boldsymbol{\mu_1}\),

\[\frac{\partial ln(L(\boldsymbol{\mu_1,\mu_2,\Sigma}))}{\partial \boldsymbol{\mu_1}} = n_1\boldsymbol{\Sigma^{-1}(\bar{y_1}-\mu_1})=0 \Rightarrow \boldsymbol{\hat{\mu_1} = \bar{y}_1}\]

For \(\boldsymbol{\mu_2}\),

\[\frac{\partial ln(L(\boldsymbol{\mu_1,\mu_2,\Sigma}))}{\partial \boldsymbol{\mu_2}} = n_2\boldsymbol{\Sigma^{-1}(\bar{y_2-\mu_2})}=0 \Rightarrow \boldsymbol{\hat{\mu_2} = \bar{y}_2}\]

For \(\boldsymbol{\Sigma}\) we begin by substituting \(\boldsymbol{\hat{\mu}_1}\) and \(\boldsymbol{\hat{\mu}_2}\),

which gives

\[\Rightarrow -(n_1+n_2)pln(\sqrt{2\pi}) - \frac{(n_1+n_2)}{2} ln|\boldsymbol{\Sigma}| - \frac{1}{2}tr(\boldsymbol{\Sigma^{-1}W_1})-\frac{1}{2}tr(\boldsymbol{\Sigma^{-1}W_2})\]

where \(-ln|\boldsymbol{\Sigma}|=ln|\boldsymbol{\Sigma^{-1}}|\).

Thus

\[\Rightarrow -(n_1+n_2)pln(\sqrt{2\pi}) + \frac{(n_1+n_2)}{2} ln|\boldsymbol{\Sigma^{-1}}| - \frac{1}{2}tr(\boldsymbol{\Sigma^{-1}W_1})-\frac{1}{2}tr(\boldsymbol{\Sigma^{-1}W_2})\]

Now, using (for symmetric \(A\))

\[\frac{\partial\, ln(|A|)}{\partial A} = 2A^{-1}-diag(A^{-1})\] \[\frac{\partial\, tr(AB)}{\partial A} = B + B' -diag(B)\]

\[\begin{multline*} \frac{\partial ln(L(\boldsymbol{\hat{\mu_1},\hat{\mu_2},\Sigma}))}{\partial \boldsymbol{\boldsymbol{\Sigma^{-1}}}} = \frac{n_1+n_2}{2}\left[ 2\boldsymbol{\Sigma}-diag(\boldsymbol{\Sigma}) \right]-\frac{1}{2}[\boldsymbol{W_1+W_1'}-diag(\boldsymbol{W_1})]\\ -\frac{1}{2}[\boldsymbol{W_2+W_2'}-diag(\boldsymbol{W_2})] \end{multline*}\]

With \(\boldsymbol{W_1}\) and \(\boldsymbol{W_2}\) symmetric, setting the derivative equal to zero gives

\[(n_1+n_2)\boldsymbol{\hat{\Sigma}}-\frac{n_1+n_2}{2}diag(\boldsymbol{\hat{\Sigma}}) = \boldsymbol{W_1} - \frac{1}{2}diag(\boldsymbol{W_1})+\boldsymbol{W_2} - \frac{1}{2}diag(\boldsymbol{W_2})\]

Case (i): \(i\ne j\)

\[(n_1+n_2)\hat{\sigma}_{ij} = w_{1ij}+w_{2ij}\Rightarrow \hat{\sigma}_{ij} = \frac{w_{1ij}+w_{2ij}}{(n_1+n_2)}\]

Case (ii): \(i=j\)

\[(n_1+n_2)\hat{\sigma}_{ii}-\frac{n_1+n_2}{2}\hat{\sigma}_{ii} = w_{1ii}-\frac{1}{2}w_{1ii}+w_{2ii}-\frac{1}{2}w_{2ii}\] \[\frac{n_1+n_2}{2}\hat{\sigma}_{ii} = \frac{1}{2}(w_{1ii}+w_{2ii})\] \[\hat{\sigma}_{ii} = \frac{w_{1ii} + w_{2ii}}{n_1+n_2}\]

Thus

\[\Rightarrow \boldsymbol{\hat{\Sigma}} = \frac{\boldsymbol{W_1+W_2}}{n_1+n_2}\]

Substituting this into

\[L(\boldsymbol{\mu_1,\mu_2,\Sigma}) = \frac{1}{(\sqrt{2\pi})^{(n_1+n_2)p} |\boldsymbol{\Sigma}|^{(n_1+n_2)/2}} e^{-\frac{1}{2} tr[\boldsymbol{\Sigma^{-1}}(\boldsymbol{W_1}+n_1\boldsymbol{(\bar{y_1}-\mu_1)(\bar{y_1}-\mu_1)'})] -\frac{1}{2} tr[\boldsymbol{\Sigma^{-1}}(\boldsymbol{W_2}+n_2\boldsymbol{(\bar{y_2}-\mu_2)(\bar{y_2}-\mu_2)'})] }\]

we obtain

\[max_{H_1}L=\frac{1}{ (\sqrt{2\pi})^{(n_1+n_2)p} \left| \frac{\boldsymbol{W_1+W_2}}{n_1+n_2} \right|^{(n_1+n_2)/2} } e^{-\frac{1}{2} tr\left\{ \left( \frac{\boldsymbol{W_1+W_2}}{n_1+n_2} \right)^{-1} [\boldsymbol{W_1+W_2}] \right\}}\]

\[max_{H_1}L=\frac{(n_1+n_2)^{(n_1+n_2)p/2}}{ (\sqrt{2\pi})^{(n_1+n_2)p} \left| \boldsymbol{W_1+W_2} \right|^{(n_1+n_2)/2} } e^{-\frac{(n_1+n_2)p}{2}}\]

For \(max_{H_0} L\) we assume \(H_0:\boldsymbol{\mu_1=\mu_2=\mu}\).

Then, estimating \(\boldsymbol{\hat{\mu}}\),

we obtain

\[\frac{\partial ln(L(\boldsymbol{\mu,\Sigma}))}{\partial \boldsymbol{\mu}} = n_1\boldsymbol{\Sigma^{-1}(\bar{y_1}-\mu})+n_2\boldsymbol{\Sigma^{-1}(\bar{y_2}-\mu}) = 0 \] \[\Rightarrow \boldsymbol{\mu}(n_1+n_2)=n_1\boldsymbol{\bar{y}_1}+n_2\boldsymbol{\bar{y}_2}\] \[\Rightarrow \boldsymbol{\hat{\mu}} = \frac{n_1\boldsymbol{\bar{y}_1}+n_2\boldsymbol{\bar{y}}_2}{n_1+n_2}\]

Similarly, for \(\boldsymbol{\hat{\Sigma}_0}\):

this can be found by taking \(H_0\) as true and going back a few steps to where the trace in the exponent was expanded by adding and subtracting terms, only that in this case we add and subtract \(\boldsymbol{\hat{\mu}}\); continuing the algebra and differentiating with respect to \(\boldsymbol{\Sigma^{-1}}\) yields

\[\boldsymbol{\hat{\Sigma}_0} = \frac{\sum_{i=1}^{n_1} \boldsymbol{(y_{1i}-\hat{\mu})(y_{1i}-\hat{\mu})'} + \sum_{i=1}^{n_2}\boldsymbol{(y_{2i}-\hat{\mu})(y_{2i}-\hat{\mu}})'}{n_1+n_2}\]

Substituting into

\[L(\boldsymbol{\mu,\Sigma}) = \frac{1}{(\sqrt{2\pi})^{(n_1+n_2)p} |\boldsymbol{\hat{\Sigma}_0}|^{(n_1+n_2)/2}} e^{-\frac{1}{2} tr\hat{\Sigma}_{0}^{-1}\sum_{i=1}^{n_1} \boldsymbol{ (y_{1i}-\hat{\mu})(y_{1i}-\hat{\mu})'} -\frac{1}{2} tr\hat{\Sigma}_0^{-1}\sum_{i=1}^{n_2} \boldsymbol{ (y_{2i}-\hat{\mu})(y_{2i}-\hat{\mu})'} }\]

we obtain

\[max_{H_0} L = \frac{(n_1+n_2)^{(n_1+n_2)p/2} }{ (\sqrt{2\pi})^{(n_1+n_2)p} |\sum_{i=1}^{n_1} \boldsymbol{(y_{1i}-\hat{\mu})(y_{1i}-\hat{\mu})'}+\sum_{i=1}^{n_2} \boldsymbol{(y_{2i}-\hat{\mu})(y_{2i}-\hat{\mu})'}|^{(n_1+n_2)/2}} e^{-\frac{(n_1+n_2)p}{2}}\]

Thus

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \frac{\frac{(n_1+n_2)^{(n_1+n_2)p/2} }{ (\sqrt{2\pi})^{(n_1+n_2)p} |\sum_{i=1}^{n_1} \boldsymbol{(y_{1i}-\hat{\mu})(y_{1i}-\hat{\mu})'}+\sum_{i=1}^{n_2} \boldsymbol{(y_{2i}-\hat{\mu})(y_{2i}-\hat{\mu})'}|^{(n_1+n_2)/2}} e^{-\frac{(n_1+n_2)p}{2}}}{\frac{(n_1+n_2)^{(n_1+n_2)p/2}}{ (\sqrt{2\pi})^{(n_1+n_2)p} \left| \boldsymbol{W_1+W_2} \right|^{(n_1+n_2)/2} } e^{-\frac{(n_1+n_2)p}{2}}}\]

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \frac{|\boldsymbol{W_1+W_2}|^{(n_1+n_2)/2}}{|\sum_{i=1}^{n_1} \boldsymbol{(y_{1i}-\hat{\mu})(y_{1i}-\hat{\mu})'} + \sum_{i=1}^{n_2}\boldsymbol{(y_{2i}-\hat{\mu})(y_{2i}-\hat{\mu}})'|^{(n_1+n_2)/2}}\]

Expanding the denominator,

\[\begin{multline*} \sum_{i=1}^{n_1} \boldsymbol{(y_{1i}-\hat{\mu})(y_{1i}-\hat{\mu})'} + \sum_{i=1}^{n_2}\boldsymbol{(y_{2i}-\hat{\mu})(y_{2i}-\hat{\mu}})' = \boldsymbol{W_1+W_2}+n_1\boldsymbol{(\bar{y}_1-\hat{\mu})(\bar{y}_1-\hat{\mu})'}\\ +n_2\boldsymbol{(\bar{y}_2-\hat{\mu})(\bar{y}_2-\hat{\mu})'} \end{multline*}\]

and substituting \(\boldsymbol{\hat{\mu}}\),

\[\begin{multline*} \Rightarrow \boldsymbol{W_1+W_2}+n_1\boldsymbol{(\bar{y}_1-\frac{n_1\boldsymbol{\bar{y}_1}+n_2\boldsymbol{\bar{y}}_2}{n_1+n_2})(\bar{y}_1-\frac{n_1\boldsymbol{\bar{y}_1}+n_2\boldsymbol{\bar{y}}_2}{n_1+n_2})'}\\ +n_2\boldsymbol{(\bar{y}_2-\frac{n_1\boldsymbol{\bar{y}_1}+n_2\boldsymbol{\bar{y}}_2}{n_1+n_2})(\bar{y}_2-\frac{n_1\boldsymbol{\bar{y}_1}+n_2\boldsymbol{\bar{y}}_2}{n_1+n_2})'} \end{multline*}\]

Factoring,

\[\begin{multline*} \Rightarrow \boldsymbol{W_1+W_2}+ \frac{n_1}{(n_1+n_2)^2}[\boldsymbol{(n_2\bar{y}_1 -n_2\bar{y_2})(n_2\bar{y}_1 -n_2\bar{y_2})' }] \\ +\frac{n_2}{(n_1+n_2)^2}[\boldsymbol{(n_1\bar{y}_1 -n_1\bar{y_2})(n_1\bar{y}_1 -n_1\bar{y}_2)' }] \end{multline*}\]

\[\begin{multline*} \Rightarrow \boldsymbol{W_1+W_2}+ \frac{n_1n_2^2}{(n_1+n_2)^2}[\boldsymbol{(\bar{y}_1 -\bar{y_2})(\bar{y}_1 -\bar{y_2})' }] \\ +\frac{n_1^2n_2}{(n_1+n_2)^2}[\boldsymbol{(\bar{y}_1 -\bar{y_2})(\bar{y}_1 -\bar{y}_2)' }] \end{multline*}\]

\[\Rightarrow \boldsymbol{W_1+W_2}+ \left[\frac{n_1n_2^2}{(n_1+n_2)^2} +\frac{n_1^2n_2}{(n_1+n_2)^2}\right][\boldsymbol{(\bar{y}_1 -\bar{y_2})(\bar{y}_1 -\bar{y}_2)' }]\]

\[\Rightarrow \boldsymbol{W_1+W_2}+ \left[\frac{n_1n_2}{(n_1+n_2)^2}(n_1+n_2) \right][\boldsymbol{(\bar{y}_1 -\bar{y_2})(\bar{y}_1 -\bar{y}_2)' }]\] \[\Rightarrow \boldsymbol{W_1+W_2}+ \left[\frac{n_1n_2}{(n_1+n_2)}\right][\boldsymbol{(\bar{y}_1 -\bar{y_2})(\bar{y}_1 -\bar{y}_2)' }]\]

Returning to the test, we substitute into

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \frac{|\boldsymbol{W_1+W_2}|^{(n_1+n_2)/2}}{|\sum_{i=1}^{n_1} \boldsymbol{(y_{1i}-\hat{\mu})(y_{1i}-\hat{\mu})'} + \sum_{i=1}^{n_2}\boldsymbol{(y_{2i}-\hat{\mu})(y_{2i}-\hat{\mu}})'|^{(n_1+n_2)/2}}\]

and obtain

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \frac{|\boldsymbol{W_1+W_2}|^{(n_1+n_2)/2}}{\left|\boldsymbol{W_1+W_2}+ \left[\frac{n_1n_2}{(n_1+n_2)}\right][\boldsymbol{(\bar{y}_1 -\bar{y_2})(\bar{y}_1 -\bar{y}_2)' }]\right|^{(n_1+n_2)/2}}\]

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \left[ \frac{|\boldsymbol{W_1+W_2}|}{\left|\boldsymbol{W_1+W_2}+ \frac{n_1n_2}{n_1+n_2}\boldsymbol{(\bar{y}_1 -\bar{y}_2)(\bar{y}_1 -\bar{y}_2)'}\right|} \right]^{(n_1+n_2)/2}\]

Dividing the matrix inside each determinant by \(n_1+n_2-2\) (which multiplies the ratio by one),

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \left[\frac{\left|\frac{\boldsymbol{W_1+W_2}}{n_1+n_2-2}\right|}{\left|\frac{\boldsymbol{W_1+W_2}}{n_1+n_2-2}+ \frac{n_1n_2}{(n_1+n_2)(n_1+n_2-2)}\boldsymbol{(\bar{y}_1 -\bar{y}_2)(\bar{y}_1 -\bar{y}_2)'}\right|} \right]^{(n_1+n_2)/2}\]

where we know that \(\boldsymbol{S_{pl}} = \frac{\boldsymbol{W_1+W_2}}{n_1+n_2-2}\), so

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \left[ \frac{|\boldsymbol{S_{pl}}|}{\left|\boldsymbol{S_{pl}}+ \frac{n_1n_2}{(n_1+n_2)(n_1+n_2-2)}\boldsymbol{(\bar{y}_1 -\bar{y}_2)(\bar{y}_1 -\bar{y}_2)'}\right|} \right]^{(n_1+n_2)/2}\]

Using \(|B+cc'|=|B|(1+c'B^{-1}c)\) we obtain

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \frac{|\boldsymbol{S_{pl}}|^{(n_1+n_2)/2}}{|\boldsymbol{S_{pl}}|^{(n_1+n_2)/2} \left( 1 + \frac{n_1n_2}{n_1+n_2}[\boldsymbol{(\bar{y}_1 - \bar{y}_2)'S_{pl}^{-1}(\bar{y}_1 - \bar{y}_2)}]/(n_1+n_2-2)\right)^{(n_1+n_2)/2}} \]

where

\[T^2 = \frac{n_1n_2}{n_1+n_2}[\boldsymbol{(\bar{y}_1 - \bar{y}_2)'S_{pl}^{-1}(\bar{y}_1 - \bar{y}_2)}]\]

\[L.R = \frac{max_{H_0} L}{max_{H_1} L} = \frac{1}{\left(1+\frac{T^2}{n_1+n_2-2}\right)^{(n_1+n_2)/2}} \;\; \blacksquare\]

We conclude that \(H_0: \boldsymbol{\mu_1=\mu_2}\) is rejected for small values of \(L.R\), equivalently for large values of \(T^2\), which proves the theorem.
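A numerical sketch of the final identity (hypothetical sizes, simulated data via MASS::mvrnorm), computing \(L.R\) from the determinants and comparing it with \(\left(1+\frac{T^2}{n_1+n_2-2}\right)^{-(n_1+n_2)/2}\):

set.seed(6)
p <- 2; n1 <- 10; n2 <- 12
y1 <- MASS::mvrnorm(n1, rep(0, p), diag(p))
y2 <- MASS::mvrnorm(n2, rep(0, p), diag(p))
W1 <- (n1-1)*cov(y1); W2 <- (n2-1)*cov(y2)
d  <- colMeans(y1) - colMeans(y2)

# Likelihood ratio from the determinant form
LR <- det(W1 + W2)^((n1+n2)/2) /
      det(W1 + W2 + (n1*n2/(n1+n2)) * d %*% t(d))^((n1+n2)/2)

Spl <- (W1 + W2)/(n1+n2-2)
T2  <- as.numeric(n1*n2/(n1+n2) * t(d) %*% solve(Spl) %*% d)
all.equal(LR, (1 + T2/(n1+n2-2))^(-(n1+n2)/2))   # TRUE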


Exercise 3.28

Using the probe word data in Table 1.3 do the following:

  1. Test \(H_0: \mu = (30,25,40,25,30)'\)

  2. Obtain 95% simultaneous confidence intervals for \(\mu_1,\mu_2,...,\mu_5\)

Solution

First we load the data

# Loading the data
sujeto <- c(1,2,3,4,5,6,7,8,9,10,11)
y1 <- c(51,27,37,42,27,43,41,38,36,26,29)
y2 <- c(36,20,22,36,18,32,22,21,23,31,20)
y3 <- c(50,26,41,32,33,43,36,31,27,31,25)
y4 <- c(35,17,37,34,14,35,25,20,25,32,26)
y5 <- c(42,27,30,27,29,40,38,16,28,36,25)

# Building the data frame
datos_1 <- data.frame(sujeto,y1,y2,y3,y4,y5)
# Displaying the data
DT::datatable(data = datos_1)

3.28 a

Hypothesis test

\[H_0: (\mu_1,\mu_2,\mu_3,\mu_4,\mu_5)' = (30,25,40,25,30)' \\ vs\\ H_1: (\mu_1,\mu_2,\mu_3,\mu_4,\mu_5)' \ne (30,25,40,25,30)'\]

Manual computation

Mean vector

# Mean vector
vector.medias <- apply(datos_1[,-1], MARGIN = 2, FUN = mean)
vector.medias
##       y1       y2       y3       y4       y5 
## 36.09091 25.54545 34.09091 27.27273 30.72727

Covariance matrix

# Covariance matrix
S <- cov(datos_1[,-1])
DT::datatable(round(S, 3))

Initial values

\[T^2 = n(\bar{y}-\mu)'S^{-1}(\bar{y}-\mu)\]

# Hypothesized mean vector
mu <- c(30,25,40,25,30)
# Number of observations
n <- nrow(datos_1)
# Number of variables
p <- ncol(datos_1[,-1])

# Computing the statistic
t2 <- as.numeric(n*(t((vector.medias-mu))%*%solve(S)%*%(vector.medias-mu)))
t2
## [1] 85.3327

With a tabulated value of \(T^2_{0.05,\,5,\,10}\):

# Tabulated value of T^2
((n*p-p)/(n-p))*qf(0.95,p,n-p)
## [1] 36.56145

Since the test statistic exceeds the tabulated value (\(85.33 > 36.56\)), we reject \(H_0\): the probe-word means differ significantly from \(\mu = (30,25,40,25,30)'\) at the 5% significance level.


Using ICSNP: Tools for Multivariate Nonparametrics,

# Computing the statistic
t_hotelling <- ICSNP::HotellingsT2(datos_1[,-1],
                                   mu = c(30,25,40,25,30),
                                   test = "chi")
# Printing the output
print(t_hotelling)
## 
##  Hotelling's one sample T2-test
## 
## data:  datos_1[, -1]
## T.2 = 85.333, df = 5, p-value < 2.2e-16
## alternative hypothesis: true location is not equal to c(30,25,40,25,30)

The chi-square approximation above reports \(p < 2.2\times 10^{-16}\); the exact \(F\)-based p-value is \(0.0067\), computed below. Either way, with \(\alpha=0.05\) we reject the null hypothesis: the probe-word means differ significantly from \(\mu = (30,25,40,25,30)'\).
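A short check of the exact p-value, using the variables n, p and t2 defined above and the relation \(\frac{n-p}{(n-1)p}T^2 \sim F_{p,\,n-p}\):

# Exact p-value via the F transformation of T^2
f.stat <- ((n - p)/((n - 1)*p)) * t2
1 - pf(f.stat, p, n - p)              # ~ 0.0067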

3.28 b

  1. Obtain 95% simultaneous confidence intervals for \(\mu_1,\mu_2,...,\mu_5\)
# The intervals are estimated with this command
intervalos <- mvdalab::MVcis(data = datos_1[,-1], level = 0.95, include.zero = T)

# Displaying the intervals
DT::datatable(round(intervalos, 4),
              colnames = c("Lower","Upper"))

Exercise 3.41

Lubischev (1962) measured the following four variables on two species of flea beetles:

\(y_1 = \text{the distance of the transverse groove from the posterior border of the prothorax} \;(\mu m),\) \(y_2 = \text{the length of the elytra (in 0.01 mm)},\) \(y_3 = \text{the length of the second antennal joint}\; (\mu m),\) \(y_4 = \text{the length of the third antennal joint} \; (\mu m)\)

# Data for the first population
y11 <- c(189,192,217,221,171,192,213,192,170,201,195,205,180,192,200,192,200,181,192)
y21 <- c(245,260,276,299,239,262,278,255,244,276,242,263,252,283,294,277,287,255,287)
y31 <- c(137,132,141,142,128,147,136,128,128,146,128,147,121,138,138,150,136,146,141)
y41 <- c(163,217,192,213,158,173,201,185,192,186,192,192,167,183,188,177,173,183,198)

# Building the data frame
datos_2 <- data.frame(y11,y21,y31,y41)

# Second population
y12 <- c(181,158,184,171,181,181,177,198,180,177,176,192,176,169,164,181,192,181,175,197)
y22 <- c(305,237,300,273,297,308,301,308,286,299,317,312,285,287,265,308,276,278,271,303)
y32 <- c(184,133,166,162,163,160,166,141,146,171,166,166,141,162,147,157,154,149,140,170)
y42 <- c(209,188,231,213,224,223,221,197,214,192,213,209,200,214,192,204,209,235,192,205)

# Building the data frame
datos_3 <- data.frame(y12,y22,y32,y42)

  1. Test \(H_0: \mu_1 = \mu_2\) using \(T^2\)

  2. Calculate the discriminant function coefficient vector \(a = S_{pl}^{-1}(\bar{y}_1-\bar{y}_2)\)

3.41 a

\[H_0: \mu_1 = \mu_2\\ vs\\ H_1: \mu_1 \ne \mu_2\]

# Running the test
t_hotelling_2 <- ICSNP::HotellingsT2(datos_2, datos_3,
                                     test = "chi")

# Printing the output
print(t_hotelling_2)
## 
##  Hotelling's two sample T2-test
## 
## data:  datos_2 and datos_3
## T.2 = 133.49, df = 4, p-value < 2.2e-16
## alternative hypothesis: true location difference is not equal to c(0,0,0,0)

Thus, at a significance level of \(\alpha = 0.05\) and with a tabulated value of \(T^2_{0.05,4,37} \approx 11.35\), which is far below the observed \(133.49\), and with a p-value \(< 2.2\times 10^{-16} < \alpha\), we reject the null hypothesis: there is enough evidence in the sample to conclude that the mean of at least one variable differs between the two populations.
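The tabulated value can be checked with the relation \(T^2_{\alpha,p,\nu} = \frac{\nu p}{\nu-p+1}F_{\alpha,p,\nu-p+1}\); a sketch (note that the exact value for \(\nu = n_1+n_2-2 = 37\) is slightly above the 11.35 quoted above, which presumably comes from the nearest table entry):

# Critical value of T^2 for alpha = 0.05, p = 4, nu = 37
nu <- 19 + 20 - 2
(nu*4/(nu - 4 + 1)) * qf(0.95, 4, nu - 4 + 1)   # ~ 11.5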


3.41 b

  1. Calculate the discriminant function coefficient vector \(a = S_{pl}^{-1}(\bar{y}_1-\bar{y}_2)\)

where

\[S_{pl}=\frac{(n_1-1)S_1+(n_2-1)S_2}{(n_1+n_2-2)}\]

# Mean vectors
y1.barra <- colMeans(datos_2)
y2.barra <- colMeans(datos_3)

# Pooled covariance
n1 <- nrow(datos_2)
n2 <- nrow(datos_3)
S1 <- cov(datos_2)
S2 <- cov(datos_3)

# Computation
Spl <- ((n1-1)*S1+(n2-1)*S2)/(n1+n2-2)

# Discriminant function coefficient vector
a <- solve(Spl)%*%(y1.barra-y2.barra)

# Displaying the results
DT::datatable(round(a, 4), colnames = c("Contribution"))

Here we see each variable's contribution to the rejection; most of it comes from the distance of the transverse groove from the posterior border of the prothorax, a finding to be discussed with the investigator leading the study.


Note: the data for Exercise 5.19 are not available.

Exercise 5.18

Done in place of 5.19, which cannot be carried out for lack of data.

Twenty engineer apprentices and 20 pilots were given six tests (Travers 1939). The variables were

\(y_1: \text{Intelligence}\)

\(y_2: \text{Form Relations}\)

\(y_3: \text{Dynamometer}\)

\(y_4: \text{Dotting}\)

\(y_5: \text{Sensory motor coordination}\)

\(y_6: \text{Perseveration}\)

# First population
x11 <- c(121,108,122,77,140,108,124,130,149,129,154,145,112,120,118,141,135,151,97,109)
x21 <- c(22,30,49,37,35,37,39,34,55,38,37,33,40,39,21,42,49,37,46,42)
x31 <- c(74,80,87,66,71,57,52,89,91,72,87,88,60,73,83,80,73,76,83,82)
x41 <- c(223,175,266,178,175,241,194,200,198,162,170,208,232,159,152,195,152,223,164,188)
x51 <- c(54,40,41,80,38,59,72,85,50,47,60,51,29,39,88,36,42,74,31,57)
x61 <- c(254,300,223,209,261,245,242,242,277,268,244,228,279,233,233,241,249,268,243,267)
# Building the data frame
datos_4 <- data.frame(x11,x21,x31,x41,x51,x61)


# Second population
x12 <- c(132,123,129,131,110,47,125,129,130,147,159,135,100,149,149,153,136,97,141,164)
x22 <- c(17,32,31,23,24,22,32,29,26,47,37,41,35,37,38,27,31,36,37,32)
x32 <- c(77,79,96,67,96,87,87,102,104,82,80,83,83,94,78,89,83,100,105,76)
x42 <- c(232,192,250,291,239,231,227,234,256,240,227,216,183,227,258,283,257,252,250,187)
x52 <- c(50,64,55,48,42,40,30,58,58,30,58,39,57,30,42,66,31,30,27,30)
x62 <- c(249,315,319,310,268,217,324,300,270,322,317,306,242,240,271,291,311,225,243,264)
# Building the data frame
datos_5 <- data.frame(x12,x22,x32,x42,x52,x62)

  1. Test \(H_0:\mu_1=\mu_2\)

  2. If the \(T^2\)-test in part (a) rejects \(H_0\), carry out a \(t\)-test for each variable as in (5.15)

5.18 a

\[H_0: \mu_1 = \mu_2\\ vs\\ H_1: \mu_1 \ne \mu_2\]

# Running the test
t_hotelling_3 <- ICSNP::HotellingsT2(datos_4, datos_5,
                                     test = "chi")
# Printing the output
print(t_hotelling_3)
## 
##  Hotelling's two sample T2-test
## 
## data:  datos_4 and datos_5
## T.2 = 66.66, df = 6, p-value = 1.975e-12
## alternative hypothesis: true location difference is not equal to c(0,0,0,0,0,0)

With \(T^2=66.66\) and a tabulated value of \(T^2_{0.05,6,38} \approx 16.26\), we reject \(H_0\); likewise, with \(\alpha = 0.05\) and a p-value of 1.975e-12, we reject the null hypothesis. The sample therefore provides enough evidence of differences between the mean vectors of the two populations on the variables on which the apprentices and pilots were evaluated.
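As in Exercise 3.41, the critical value can be checked with \(T^2_{\alpha,p,\nu} = \frac{\nu p}{\nu-p+1}F_{\alpha,p,\nu-p+1}\) (here \(\nu = 20+20-2 = 38\); the exact value is close to the table figure quoted above):

# Critical value of T^2 for alpha = 0.05, p = 6, nu = 38
nu <- 20 + 20 - 2
(nu*6/(nu - 6 + 1)) * qf(0.95, 6, nu - 6 + 1)   # ~ 16.5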


5.18 b

  1. If the \(T^2\)-test in part (a) rejects \(H_0\), carry out a \(t\)-test for each variable as in (5.15)

Using

\[t_j = \frac{\bar{y}_{1j}-\bar{y}_{2j}}{\sqrt{\left(\frac{n_1+n_2}{n_1n_2}\right)s_{jj}}},\;\; j=1,2,...,p\]

where \(H_0: \mu_{1j}=\mu_{2j}\) is rejected if \(|t_j|>t_{\alpha/2,\,n_1+n_2-2}\),

and \(s_{jj}\) are the diagonal elements of \(S_{pl}\).

# Mean vectors
y1j.barra <- colMeans(datos_4)
y2j.barra <- colMeans(datos_5)

# Sample sizes
n1 <- nrow(datos_4)
n2 <- nrow(datos_5)

# Individual covariance matrices
S.1 <- cov(datos_4)
S.2 <- cov(datos_5)

# Pooled covariance matrix
Spl.1 <- ((n1-1)*S.1+(n2-1)*S.2)/(n1+n2-2)
# Diagonal elements
sjj <- diag(Spl.1)

# Computing the statistic (R recycles scalars over vectors,
# so this works without writing a separate function)
tj <- (y1j.barra - y2j.barra)/(sqrt((n1+n2)/(n1*n2)*sjj))

# Displaying the results
knitr::kable(round(tj, 4))
|     |       x |
|:----|--------:|
| x11 | -0.6556 |
| x21 |  2.6139 |
| x31 | -3.2884 |
| x41 | -4.6315 |
| x51 |  1.8873 |
| x61 | -3.2205 |

More conveniently, with a loop:

# Vector of differences (numerator)
diffr <- y1j.barra - y2j.barra

# Vector to hold the loop's statistics
tjf = c()
# Loop over the variables
for (i in 1:6) {
  tjf[i] = diffr[i]/sqrt((n1+n2)/(n1*n2)*sjj[i])
} 

# Displaying the results
result <- data.frame(ti = c("t1", "t2", "t3", "t4", "t5", "t6"),
                     tj = round(tjf, 4))
DT::datatable(result)

Recall that \(H_0: \mu_{1j}=\mu_{2j}\) is rejected if \(|t_j|>t_{\alpha/2,\,n_1+n_2-2}\).

Taking \(\alpha = 0.05\),

we have \(t_{\alpha/2,\,n_1+n_2-2} = 2.02\) in this case.
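This critical value can be computed directly:

qt(1 - 0.05/2, n1 + n2 - 2)   # ~ 2.0244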

Evaluating, we reject \(H_0\) for \(t_2, t_3, t_4, t_6\).

That is, at the 95% confidence level, the average test responses of the two groups differ on “form relations”, “dynamometer”, “dotting”, and the final “perseveration” test.


Conclusion

This concludes the theoretical and practical exercises on Hotelling's \(T^2\) test, along with their implementation in R.


References

Rencher, A. C. (1998). Multivariate Statistical Inference and Applications. New York: Wiley.

Rencher, A. C. (2002). Methods of Multivariate Analysis (2nd ed.). New York: Wiley.