In this section we explain the concept of collinearity and how understanding it helps us avoid designing experiments in which we ultimately cannot estimate the effects we are interested in.

Let’s motivate this with a system of equations:

The following system of equations \[ \left.\begin{aligned}a+c&=1\\b-c&=1\\a+b&=2\end{aligned}\right\}\quad(\#)\] has more than one solution: for any value of \(c\), setting \(a=1-c\) and \(b=1+c\) satisfies all three equations.

Two examples: \(a=1,b=1,c=0\) and \(a=0,b=2,c=1\).

Using linear algebra, we can see this immediately because the third column of the coefficient matrix can be written as a linear combination of the first two.

The system \((\#)\) can be written in matrix form as follows: \[\begin{pmatrix}1&0&1\\0&1&-1\\ 1&1&0\end{pmatrix}\begin{pmatrix}a\\b\\c\end{pmatrix}=\begin{pmatrix}1\\1\\2\end{pmatrix}\] Note that the third column is the first column minus the second:

\[\begin{pmatrix}1\\0\\1\end{pmatrix}-\begin{pmatrix}0\\1\\1\end{pmatrix}=\begin{pmatrix}1\\-1\\0\end{pmatrix}\] So we say that the third column is collinear with the other two. This implies that we can rewrite the matrix product using only the first two columns: \[\begin{aligned}\begin{pmatrix}1&0&1\\0&1&-1\\ 1&1&0\end{pmatrix}\begin{pmatrix}a\\b\\c\end{pmatrix}&=a\begin{pmatrix}1\\0\\1\end{pmatrix}+b\begin{pmatrix}0\\1\\1\end{pmatrix}+c\begin{pmatrix}1-0\\0-1\\1-1\end{pmatrix}\\ &=(a+c)\begin{pmatrix}1\\0\\1\end{pmatrix}+(b-c)\begin{pmatrix}0\\1\\1\end{pmatrix}\end{aligned}\] The third column therefore does not add a constraint: the system only determines the combinations \(a+c\) and \(b-c\).

Note that infinitely many pairs of values of \(a\) and \(c\) give the same \(a+c\).
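As a quick numerical check, the following R sketch builds the coefficient matrix of the system, computes its rank, and verifies that members of the family \(a=1-c\), \(b=1+c\) solve it for several values of \(c\). The names `M`, `rhs`, `solves_system`, and `c_values` are chosen here for illustration only.

```r
# Coefficient matrix and right-hand side of the system
M <- matrix(c(1, 0,  1,
              0, 1, -1,
              1, 1,  0), nrow = 3, byrow = TRUE)
rhs <- c(1, 1, 2)

# The third column equals the first minus the second, so the rank is 2, not 3
rank_M <- qr(M)$rank

# Check whether (a, b, c) = (1 - c, 1 + c, c) solves the system for a given c
solves_system <- function(c_val) {
  x <- c(1 - c_val, 1 + c_val, c_val)
  isTRUE(all.equal(as.vector(M %*% x), rhs))
}

# Try several values of c, including c = 0 and c = 1
c_values <- c(-2, 0, 0.5, 1, 10)
all_solve <- all(sapply(c_values, solves_system))

cat("rank =", rank_M, " all candidates solve the system:", all_solve, "\n")
```

Any other value of \(c\) could be added to `c_values`; the check passes because the system only pins down \(a+c\) and \(b-c\).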

In data analysis, this problem comes up when we try to fit a model in which one of the columns of the design matrix is collinear with the others.

A simple example:

Consider a design matrix \(X\) with two collinear columns:

\[X=(1 \ \ X_1 \ \ X_{2} \ \ X_{3}) \ \ \text{with, say} \ \ X_{3}=-X_{2}\] We can rewrite the residuals like this: \[\begin{aligned}&Y-\{1\beta_{0}+X_{1}\beta_{1}+X_{2}\beta_{2}+X_{3}\beta_{3}\}\\ &=Y-\{1\beta_{0}+X_{1}\beta_{1}+X_{2}\beta_{2}-X_{2}\beta_{3}\} \\ &=Y-\{1\beta_{0}+X_{1}\beta_{1}+X_{2}(\beta_{2}-\beta_{3})\}\end{aligned}\] The residuals depend on \(\beta_{2}\) and \(\beta_{3}\) only through the difference \(\beta_{2}-\beta_{3}\). So if \(\hat{\beta}_{1},\hat{\beta}_{2},\hat{\beta}_{3}\) is a solution, then \(\hat{\beta}_{1},\hat{\beta}_{2}+1,\hat{\beta}_{3}+1\) is also a solution.

So the least squares problem does not have a unique solution, and we can’t estimate \(\beta_{2}\) and \(\beta_{3}\) separately.
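To make this concrete, here is a minimal sketch with simulated data. The sample size, the coefficients used to generate `Y`, and names such as `beta_a` and `beta_b` are assumptions made for illustration, not values from the text. It checks that two coefficient vectors differing by 1 in both \(\beta_{2}\) and \(\beta_{3}\) give identical residuals, and it fits the model with `lm` to see how R handles the redundant column.

```r
set.seed(1)
n  <- 20
X1 <- rnorm(n)
X2 <- rnorm(n)
X3 <- -X2                              # collinear column: X3 = -X2
Y  <- 1 + 2 * X1 + 3 * X2 + rnorm(n)   # simulated response (hypothetical values)

X <- cbind(1, X1, X2, X3)

# Two coefficient vectors that differ only by adding 1 to beta_2 and beta_3
beta_a <- c(1, 2, 3, 0)
beta_b <- c(1, 2, 4, 1)

res_a <- Y - X %*% beta_a
res_b <- Y - X %*% beta_b
same_residuals <- isTRUE(all.equal(res_a, res_b))

# lm() detects the rank deficiency and reports NA for the aliased column X3
fit <- lm(Y ~ X1 + X2 + X3)

cat("rank =", qr(X)$rank, "of", ncol(X), "columns\n")
cat("identical residuals:", same_residuals, "\n")
print(coef(fit))
```

The `NA` coefficient is R’s way of signalling that the column carries no information beyond the columns that precede it in the formula.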

Imagine that you’re interested in studying the effect of four drugs, \(A\), \(B\), \(C\), and \(D\), and you’re going to treat mice with these drugs. You assign two mice to each group. After starting the experiment by giving \(A\) and \(B\) to female mice, you realize there might be a sex effect. You decide to give \(C\) and \(D\) to males in the hope of estimating this sex effect as well.

Now we write down the design matrix, omitting the intercept for the purposes of illustration:

\[\begin{pmatrix}\text{Sex} & A & B & C & D\\0 & 1 & 0 & 0 & 0 \\0 & 1 & 0 & 0 & 0 \\0 & 0 & 1 & 0 & 0 \\0 & 0 & 1 & 0 & 0 \\1 & 0 & 0 & 1 & 0 \\1 & 0 & 0 & 1 & 0 \\1 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 1\end{pmatrix}\] We can see that the “Sex” column is equal to the C-column plus the D-column: \[\begin{pmatrix}\text{Sex}\\0\\0\\0\\0\\1\\1\\1\\1\end{pmatrix}=\begin{pmatrix}C\\0\\0\\0\\0\\1\\1\\0\\0\end{pmatrix}+\begin{pmatrix}D\\0\\0\\0\\0\\0\\0\\1\\1\end{pmatrix}\] Therefore the “Sex” column is collinear with the drug columns, and we can’t obtain unique estimates of the sex effect and the drug effects. We can see this in R very quickly.

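The rank of a matrix is the maximum number of its columns that are linearly independent. If the rank is smaller than the number of columns, the least squares estimates (LSE) are not unique.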

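# Indicator (0/1) columns for sex and for each of the four drugs, two mice per drug
Sex <- c(0,0,0,0,1,1,1,1)
A <- c(1,1,0,0,0,0,0,0)
B <- c(0,0,1,1,0,0,0,0)
C <- c(0,0,0,0,1,1,0,0)
D <- c(0,0,0,0,0,0,1,1)
# Design matrix without an intercept
X <- model.matrix(~0+Sex+A+B+C+D)
# Compare the number of columns with the rank from the QR decomposition
cat("number of columns=",ncol(X),"rank=",qr(X)$rank,"\n")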
number of columns= 5 rank= 4 
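For comparison, here is a sketch of a design that avoids the problem, assuming each drug had instead been given to one female and one male mouse. This balanced layout is a hypothetical alternative, not part of the experiment described above, and the `_bal` names are made up for the illustration.

```r
# Hypothetical balanced design: each drug goes to one female (0) and one male (1)
Sex_bal <- c(0, 1, 0, 1, 0, 1, 0, 1)
A_bal   <- c(1, 1, 0, 0, 0, 0, 0, 0)
B_bal   <- c(0, 0, 1, 1, 0, 0, 0, 0)
C_bal   <- c(0, 0, 0, 0, 1, 1, 0, 0)
D_bal   <- c(0, 0, 0, 0, 0, 0, 1, 1)

# Design matrix without an intercept, built the same way as before
X_bal <- model.matrix(~ 0 + Sex_bal + A_bal + B_bal + C_bal + D_bal)

# Sex now varies within every drug group, so it is not a sum of drug columns
rank_bal <- qr(X_bal)$rank
cat("number of columns=", ncol(X_bal), "rank=", rank_bal, "\n")
```

In this layout the rank equals the number of columns, so the sex effect and the four drug effects can all be estimated.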