Here I will explore interpretation details of two particular contrast coding systems for categorical variables (CVs) in generalized linear models (glms): Dummy coding and Deviation coding. I will focus on the interpretation of the parameters derived from the glm under both coding systems, as well as on how interaction terms work. A good resource can be seen here. Coding systems are necessary because, in order to include categorical variables in a glm, we need to convert (or encode) them into numerical variables, since a glm is formulated in terms of numerical variables.
We will work through these systems on a simulated data set inspired by a biological problem.
As said earlier, in order to include a CV in a glm we need to encode it first using numbers. The Dummy coding system is pretty simple. Assume we have a CV \(\textbf{C}\) with \(k\) distinct values \(C_1,C_2,\dots,C_k\); then the Dummy coding system encodes this variable as follows:
\[I_{C_i}(x)= \begin{cases} 1 & x = C_i \\ 0 & x \neq C_i \end{cases} \] for every \(C_i\) other than the reference category (here \(C_1\); this matches R's default of taking the first level of a factor as the reference).
Thus, to encode a CV with \(k\) distinct categories we need \(k-1\) indicator (Dummy) variables. Let us show how this encoding works for the beetles data set. For the case of sex we have
sex | \(X_1\) |
---|---|
F | 0 |
M | 1 |
And for the case of growth status we have
status | \(X_1\) | \(X_2\) |
---|---|---|
A | 0 | 0 |
B | 1 | 0 |
C | 0 | 1 |
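These indicator variables are exactly R's default "treatment" contrasts. A minimal way to inspect them (a sketch; contrasts(z0$sex) and contrasts(z0$status) would show the same matrices, assuming those columns of z0 are factors):
contr.treatment(2)   # dummy coding for a two-level CV such as sex
contr.treatment(3)   # dummy coding for a three-level CV such as growth status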
Now let us show how these encoded CVs can be included in a glm.
In our first example we are going to build a linear model for weight on sex. This glm will specify the expected value of weight conditional on a given sex, that is, \(E(W|Sex=s)\) where \(s \in \{F,M\}\). Since we encoded sex using Dummy coding, we can rewrite this expected value as \(E(W|X_1=x_1)\) where
\[x_1= \begin{cases} 0 & s = F \\ 1 & s = M \end{cases} \]
Since \(X_1\) is now a numerical variable, we can specify a glm as follows:
\[\begin{equation} E(W|Sex=s) = E(W|X_1=x_1)=\beta_0+\beta_1x_1\tag{2.1} \end{equation}\]
Let us explore how this model behaves for each possible sex. First for the reference category females:
\[\begin{equation} E(W|Sex=F) = E(W|X_1=0)=\beta_0+\beta_1\times 0=\beta_0\tag{2.2} \end{equation}\]
that means that the usual intercept is now the conditional expected value of weight for female beetles. Let us now explore the model (2.1) for males:
\[\begin{equation} E(W|Sex=M) = E(W|X_1=1)=\beta_0+\beta_1\times 1=\beta_0+\beta_1\tag{2.3} \end{equation}\]
equation (2.3) implies that
\[\begin{equation} \beta_1= E(W|Sex=M)-\beta_0=E(W|Sex=M)-E(W|Sex=F)\tag{2.4} \end{equation}\]
Therefore \(\beta_1\) (the usual “slope” term) is the mean difference of weight between males and females.
Let us do the same now for the growth status variable. We will now work with a model for \(E(W|SG=sg)\), where \(SG\) denotes growth status and \(sg \in \{A,B,C\}\). Following table 2.3, we now need two dummy variables, \(x_1\) and \(x_2\), where
\[x_1= \begin{cases}1 & sg = B \\ 0 & sg \neq B \end{cases} \]
and \[x_2= \begin{cases}1 & sg = C \\ 0 & sg \neq C \end{cases} \]
Therefore
\[\begin{equation} E(W|SG=sg) = E(W|X_1=x_1,X_2=x_2)=\beta_0+\beta_1x_1+\beta_2x_2\tag{2.5} \end{equation}\]
As before, let us explore the model behavior for each of the possible growth statuses. First for the reference category A:
\[\begin{equation} E(W|SG=A) = E(W|X_1=0,X_2=0)=\beta_0+\beta_1\times 0+\beta_2\times 0=\beta_0\tag{2.6} \end{equation}\]
As before, the intercept is just the expected value of weight for the reference category A. Now for category B:
\[\begin{equation} E(W|SG=B) = E(W|X_1=1,X_2=0)=\beta_0+\beta_1\times 1+\beta_2\times 0=\beta_0+\beta_1\tag{2.7} \end{equation}\]
Which implies \[\begin{equation} \beta_1= E(W|SG=B)-\beta_0 = E(W|SG=B)-E(W|SG=A)\tag{2.8} \end{equation}\]
That is the mean weight difference between groups B and A. Similarly, for group C:
\[\begin{equation} E(W|SG=C) = E(W|X_1=0,X_2=1)=\beta_0+\beta_1\times 0+\beta_2\times 1=\beta_0+\beta_2\tag{2.9} \end{equation}\]
Which implies \[\begin{equation} \beta_2= E(W|SG=C)-\beta_0 = E(W|SG=C)-E(W|SG=A)\tag{2.10} \end{equation}\]
That is the mean weight difference between groups C and A.
Let us now fit these models in R. Our first model will be weight on sex:
mws<-lm(weight~sex,data=z0)
summary(mws)
##
## Call:
## lm(formula = weight ~ sex, data = z0)
##
## Residuals:
## Min 1Q Median 3Q Max
## -44.913 -18.577 -3.502 16.134 65.057
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 62.747 1.486 42.216 < 2e-16 ***
## sexM 6.413 2.115 3.033 0.00255 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 23.64 on 498 degrees of freedom
## Multiple R-squared: 0.01813, Adjusted R-squared: 0.01616
## F-statistic: 9.198 on 1 and 498 DF, p-value: 0.00255
You can check that (Intercept) is just the mean weight for the reference sex F, as stated in equation (2.2), while the estimate for sexM is just the mean difference in weight between males and females, as stated in equation (2.4). You should verify this by computing the means and making the appropriate subtractions from the data itself.
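For instance, a minimal way to do that check (assuming, as in the model call above, that z0 contains the weight and sex columns):
mu_sex <- with(z0, tapply(weight, sex, mean))   # group means E(W|Sex=F) and E(W|Sex=M)
mu_sex["M"] - mu_sex["F"]                       # should reproduce the sexM estimate, cf. (2.4)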
The same applies for growth status:
mwsg<-lm(weight~status,data=z0)
summary(mwsg)
##
## Call:
## lm(formula = weight ~ status, data = z0)
##
## Residuals:
## Min 1Q Median 3Q Max
## -42.536 -7.455 -0.360 7.316 42.057
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 42.2446 0.9779 43.20 <2e-16 ***
## statusB 22.2481 1.3709 16.23 <2e-16 ***
## statusC 49.9156 1.3980 35.71 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.64 on 497 degrees of freedom
## Multiple R-squared: 0.72, Adjusted R-squared: 0.7189
## F-statistic: 639.1 on 2 and 497 DF, p-value: < 2.2e-16
Compare these results with equations (2.6), (2.8) and (2.10) respectively.
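It is also instructive to look at the design matrix that lm() built behind the scenes; its statusB and statusC columns are exactly the dummy variables of table 2.3 (a sketch reusing the mwsg fit from above):
head(model.matrix(mwsg))                        # columns: (Intercept), statusB, statusC
# equivalently: head(model.matrix(~ status, data = z0))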
For a given dependent variable (DV) \(Y\) and a CV \(C\) with \(k\) categories \(C_1,C_2,\dots,C_k\) we will have the following linear model:
\[\begin{equation} E(Y|C=c) = \\ E(Y|X_1=x_1,X_2=x_2,\dots,X_{k-1}=x_{k-1})=\beta_0+\beta_1x_1+\beta_2x_2+\dots+\beta_{k-1}x_{k-1}\tag{2.11} \end{equation}\]
where \(C_k\) is the reference category (the labelling of the reference is arbitrary; in the earlier examples it was the first category) and \[x_i= \begin{cases}1 & c = C_i \\ 0 & c \neq C_i \end{cases} \] for all \(i \neq k\). For this model it can be shown that
\[\begin{equation} E(Y|C=C_k) = \beta_0 \\ E(Y|C=C_i) = \beta_0+\beta_i \; for \; i\neq k \\ \beta_i =E(Y|C=C_i)-E(Y|C=C_k) \; for \; i\neq k \\ \tag{2.12} \end{equation}\]
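As a small generic check of (2.12), here is a sketch with a toy four-level CV (purely hypothetical data, not the beetles set). Note that lm() takes the first factor level as the reference, so C1 below plays the role that (2.12) labels \(C_k\):
set.seed(1)
g <- factor(sample(paste0("C", 1:4), 200, replace = TRUE))
y <- rnorm(200, mean = c(C1 = 10, C2 = 12, C3 = 9, C4 = 15)[as.character(g)])
coef(lm(y ~ g))      # intercept = mean of C1; gC2, gC3, gC4 = mean differences vs C1
tapply(y, g, mean)   # compare by hand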
The general linear model formulation using linear algebra goes as follows:
\[\begin{equation} E(\textbf{Y}|\textbf{X}=X) = X\beta \tag{2.13} \end{equation}\]
where \(\textbf{Y}_{n \times 1}\) is the vector of responses of the dependent variable, \(X_{n \times p}\) is known as the design matrix with \(p-1\) covariates, and \(\beta_{p \times 1}=(\beta_0,\beta_1,\dots,\beta_{p-1})^T\) is the parameter vector. We are working with \(p-1\) covariates since the first column of \(X_{n \times p}\) is a vector of constant ones (associated with the intercept parameter \(\beta_0\)).
We are going to show how equation (2.12) is derived. Assume we have \(E(\textbf{Y}|\textbf{C}=C)=(E(Y|C=C_k),E(Y|C=C_1),\dots,E(Y|C=C_{k-1}))^T\). Then the design matrix \(X\) has the following form:
\[\begin{equation} X_{k \times k}=\begin{pmatrix} 1 & 0 & 0 & 0 & \cdots & 0 \\ 1 & 1 & 0 & 0 & \cdots & 0 \\ 1 & 0 & 1 & 0 & \cdots & 0 \\ \vdots &\cdots & \cdots &\ddots & \vdots & \vdots\\ \vdots &\cdots & \cdots &\cdots & \ddots & \vdots\\ 1 & 0 & 0 & \cdots & 0 & 1 \\ \end{pmatrix} \tag{2.14} \end{equation}\] It can be shown that
\[\begin{equation} X_{k \times k}^{-1}=\begin{pmatrix} 1 & 0 & 0 & 0 & \cdots & 0 \\ -1 & 1 & 0 & 0 & \cdots & 0 \\ -1 & 0 & 1 & 0 & \cdots & 0 \\ \vdots &\cdots & \cdots &\ddots & \vdots & \vdots\\ \vdots &\cdots & \cdots &\cdots & \ddots & \vdots\\ -1 & 0 & 0 & \cdots & 0 & 1 \\ \end{pmatrix} \tag{2.15} \end{equation}\]
so
\[\begin{equation} \beta = X_{k \times k}^{-1}E(\textbf{Y}|\textbf{C}=C)= \\ = (E(Y|C=C_k),E(Y|C=C_1)-E(Y|C=C_k),\dots,E(Y|C=C_{k-1})-E(Y|C=C_k))^T \tag{2.16} \end{equation}\]
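Here is a minimal numerical check of (2.14)-(2.16) for the growth status model (a sketch assuming only base R and the z0 data frame used above):
mu_sg <- with(z0, tapply(weight, status, mean))   # (E(W|SG=A), E(W|SG=B), E(W|SG=C))
X3    <- cbind(1, contr.treatment(3))             # the design matrix of (2.14) for k = 3
solve(X3)                                         # matches (2.15)
drop(solve(X3) %*% mu_sg)                         # beta as in (2.16); compare with coef(mwsg)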
In this section we will explore the behavior of a model with two CVs as covariates and how its parameters can be interpreted in the case of an additive model as well as in the case of interactions. First we will explore an additive model.
We will keep working with the coding system presented in tables 2.1 and 2.3. We now want to address the following expected values: \(E(W|Sex=s,SG=sg)\). Using the dummy encodings referenced above, we will have the following linear model:
\[\begin{equation} E(W|Sex=s,SG=sg)= \\ E(W|I_{sex-M}=x_1,I_{sg-B}=x_2,I_{sg-C}=x_3)=\beta_0+\beta_1x_1+\beta_2x_2+\beta_3x_3 \tag{2.17} \end{equation}\]
As before, we will evaluate the model (2.17) in all the possible combinations of \(Sex\) and \(SG\).
\[\begin{equation} E(W|Sex=F,SG=A)= \\ E(W|I_{sex-M}=0,I_{sg-B}=0,I_{sg-C}=0)=\beta_0 \tag{2.18} \end{equation}\]
Therefore \(\beta_0\) is just the expected value of weight for females and growth status \(A\).
\[\begin{equation} E(W|Sex=M,SG=A)= \\ E(W|I_{sex-M}=1,I_{sg-B}=0,I_{sg-C}=0)=\beta_0+\beta_1 \tag{2.19} \end{equation}\]
therefore
\[\begin{equation} \beta_1 = E(W|Sex=M,SG=A) - E(W|Sex=F,SG=A) \\\tag{2.20} \end{equation}\]
\[\begin{equation} E(W|Sex=F,SG=B)= \\ E(W|I_{sex-M}=0,I_{sg-B}=1,I_{sg-C}=0)=\beta_0+\beta_2 \tag{2.21} \end{equation}\]
therefore
\[\begin{equation} \beta_2 = E(W|Sex=F,SG=B) - E(W|Sex=F,SG=A) \\\tag{2.22} \end{equation}\]
\[\begin{equation} E(W|Sex=M,SG=B)= \\ E(W|I_{sex-M}=1,I_{sg-B}=1,I_{sg-C}=0)=\beta_0+\beta_1+\beta_2 \tag{2.23} \end{equation}\]
\[\begin{equation} E(W|Sex=F,SG=C)= \\ E(W|I_{sex-M}=0,I_{sg-B}=0,I_{sg-C}=1)=\beta_0+\beta_3 \tag{2.24} \end{equation}\]
therefore
\[\begin{equation} \beta_3 = E(W|Sex=F,SG=C) - E(W|Sex=F,SG=A) \\\tag{2.25} \end{equation}\]
\[\begin{equation} E(W|Sex=M,SG=C)= \\ E(W|I_{sex-M}=1,I_{sg-B}=0,I_{sg-C}=1)=\beta_0+\beta_1+\beta_3 \end{equation}\]
Note how equations (2.20), (2.22) and (2.25) give meaning to the corresponding parameters: they are mean differences across sex or growth status with respect to the corresponding reference categories (sex female and growth status \(A\)). Let us take a closer look at, for example, result (2.20): \(\beta_1\) is the mean difference of males vs. females among beetles with \(A\) as their growth status (the reference category for \(SG\)). Something similar is happening for \(\beta_2\) and \(\beta_3\): they compare \(SG\) levels \(B\) and \(C\) against \(A\) for the reference sex, female.
However, what happens if we ask for the following contrast (not encoded as a parameter in this model):
\[E(W|Sex=M,SG=B) - E(W|Sex=F,SG=B)\] Using model equations (2.23) and (2.21) it can be shown that \[\begin{equation} E(W|Sex=M,SG=B) - E(W|Sex=F,SG=B) = \beta_0+\beta_1+\beta_2-(\beta_0+\beta_2)=\beta_1 \tag{2.26} \end{equation}\]
Studying similar contrasts, it can be shown that:
\[\begin{equation} \beta_1=E(W|Sex=M,SG=A) - E(W|Sex=F,SG=A) \\ =E(W|Sex=M,SG=B) - E(W|Sex=F,SG=B) \\ =E(W|Sex=M,SG=C) - E(W|Sex=F,SG=C) \tag{2.27} \end{equation}\]
and
\[\begin{equation} \beta_2=E(W|Sex=F,SG=B) - E(W|Sex=F,SG=A) \\ =E(W|Sex=M,SG=B) - E(W|Sex=M,SG=A) \\ \tag{2.28} \end{equation}\]
and
\[\begin{equation} \beta_3=E(W|Sex=F,SG=C) - E(W|Sex=F,SG=A)\\ =E(W|Sex=M,SG=C) - E(W|Sex=M,SG=A)\\ \tag{2.29} \end{equation}\]
In short, equations (2.27), (2.28) and (2.29) show that the effects of \(Sex\) and \(SG\) are independent of each other, allowing us to rewrite them as
\[\begin{equation} \beta_1=E(W|Sex=M) - E(W|Sex=F)\\ \beta_2=E(W|SG=B) - E(W|SG=A)\\ \beta_3=E(W|SG=C) - E(W|SG=A)\\ \tag{2.30} \end{equation}\]
Let us fit this model to our data set:
mwsexsg<-lm(weight~sex+status,data=z0)
summary(mwsexsg)
##
## Call:
## lm(formula = weight ~ sex + status, data = z0)
##
## Residuals:
## Min 1Q Median 3Q Max
## -39.233 -7.908 -0.237 7.943 38.670
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 38.879 1.092 35.614 < 2e-16 ***
## sexM 6.691 1.091 6.133 1.76e-09 ***
## statusB 22.365 1.323 16.903 < 2e-16 ***
## statusC 49.977 1.349 37.043 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.2 on 496 degrees of freedom
## Multiple R-squared: 0.7398, Adjusted R-squared: 0.7382
## F-statistic: 470 on 3 and 496 DF, p-value: < 2.2e-16
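As a quick illustration of (2.27): under this additive fit, the predicted male-female difference is the same within every growth status group. A sketch using predict() (it assumes the sex and status levels are coded F/M and A/B/C, as in the tables above):
newd  <- expand.grid(sex = c("F", "M"), status = c("A", "B", "C"))
preds <- matrix(predict(mwsexsg, newdata = newd), nrow = 2,
                dimnames = list(sex = c("F", "M"), status = c("A", "B", "C")))
preds["M", ] - preds["F", ]                     # each entry equals coef(mwsexsg)["sexM"]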
Let us also compare these estimates with the previous ones obtained in models mws and mwsg:
model_list<-list(mwsexsg=mwsexsg,mws=mws,mwsg=mwsg)
model_list %>% map(coef)
## $mwsexsg
## (Intercept) sexM statusB statusC
## 38.879016 6.691093 22.364843 49.977436
##
## $mws
## (Intercept) sexM
## 62.747171 6.413481
##
## $mwsg
## (Intercept) statusB statusC
## 42.24460 22.24812 49.91558
Some say that model mwsexsg estimates effects of sex adjusted by growth status and, vice versa, effects of growth status adjusted by sex. Those same people point out that the differences in parameter estimates seen when comparing the three models built so far are the result of this correction or adjustment. The truth is that model mwsexsg assumes the effects of \(Sex\) and \(SG\) are independent of each other. Next we will see how this assumption can be dropped, allowing the effects to depend on each other.
Let \[\begin{equation} E(\textbf{Y}|\textbf{X}=X)= \\ \left(E(Y|Sex=F,SG=A),E(Y|Sex=M,SG=A),E(Y|Sex=F,SG=B),\\ E(Y|Sex=M,SG=B),E(Y|Sex=F,SG=C),E(Y|Sex=M,SG=C)\right)^T \tag{2.31} \end{equation}\]
As before \(E(\textbf{Y}|\textbf{X}=X)=X\beta\) where
\[\begin{equation} X=\begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 1 & 0 & 1 \\ \end{pmatrix} \tag{2.32} \end{equation}\]
To solve for \(\beta\) we get: \[\begin{equation} \beta=\left[ \left( X^TX\right)^{-1}X^T\right]E(\textbf{Y}|\textbf{X}=X) \tag{2.33} \end{equation}\]
in particular \[\begin{equation} \left( X^TX \right)^{-1}X^T=\begin{pmatrix} 2/3 & 1/3 & 1/6 & -1/6 & 1/6 & - 1/6 \\ -1/3 & 1/3 & -1/3 & 1/3 & -1/3 & 1/3 \\ -1/2 & -1/2 & 1/2 & 1/2 & 0 & 0 \\ -1/2 & -1/2 & 0 & 0 & 1/2 & 1/2 \\ \end{pmatrix} \tag{2.34} \end{equation}\]
Therefore it can be seen, for example from the second row of (2.34), that solving for \(\beta_1\) gives: \[\begin{equation} \beta_1= \frac{E(Y|Sex=M,SG=A)-E(Y|Sex=F,SG=A)}{3}+ \\ \frac{E(Y|Sex=M,SG=B)-E(Y|Sex=F,SG=B)}{3}+ \\ \frac{E(Y|Sex=M,SG=C)-E(Y|Sex=F,SG=C)}{3} \tag{2.35} \end{equation}\]
That is, \(\beta_1\) is the average of the mean differences of males vs. females across all growth status groups. A similar result can be shown for \(\beta_2\) and \(\beta_3\). This, however, assumes that \(E(\textbf{Y}|\textbf{X}=X)\) can take arbitrary values, which is not the case if it follows equation (2.17).
It is also worth noting that if we exclude the rows of (2.32) with more than two ones, we recover results (2.20), (2.22) and (2.25) in a similar way as in (2.16), since removing those rows leaves a square invertible matrix of the form (2.14), whose inverse has the form (2.15).
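For completeness, here is a quick numerical check of (2.34), writing the design matrix of (2.32) out by hand (a sketch; MASS::fractions is used only to pretty-print the result as fractions and is an extra assumption):
X6 <- cbind(1,
            sexM    = c(0, 1, 0, 1, 0, 1),
            statusB = c(0, 0, 1, 1, 0, 0),
            statusC = c(0, 0, 0, 0, 1, 1))
MASS::fractions(solve(t(X6) %*% X6) %*% t(X6))  # reproduces the matrix in (2.34)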
We are now going to explore a model with interaction terms between \(Sex\) and \(SG\):
\[\begin{equation} E(W|Sex=s,SG=sg)= E(W|I_{sex-M}=x_1,I_{sg-B}=x_2,I_{sg-C}=x_3)=\\ \beta_0+\beta_1x_1+\beta_2x_2+\beta_3x_3+\beta_4(x_1x_2)+\beta_5(x_1x_3) \tag{2.36} \end{equation}\]
Parameters \(\beta_4\) and \(\beta_5\), associated with the products \(x_1x_2\) and \(x_1x_3\) respectively, are the new interaction terms. Let us evaluate the model at each of the six possible combinations as before:
\[\begin{equation} E(W|Sex=F,SG=A)= \\ E(W|I_{sex-M}=0,I_{sg-B}=0,I_{sg-C}=0)=\beta_0 \tag{2.37} \end{equation}\]
Therefore \(\beta_0\) is just the expected value of weight for females and growth status \(A\) (the same result as in the additive model).
\[\begin{equation} E(W|Sex=M,SG=A)= \\ E(W|I_{sex-M}=1,I_{sg-B}=0,I_{sg-C}=0)=\beta_0+\beta_1 \tag{2.38} \end{equation}\]
therefore
\[\begin{equation} \beta_1 = E(W|Sex=M,SG=A) - E(W|Sex=F,SG=A) \\\tag{2.39} \end{equation}\]
\[\begin{equation} E(W|Sex=F,SG=B)= \\ E(W|I_{sex-M}=0,I_{sg-B}=1,I_{sg-C}=0)=\beta_0+\beta_2 \tag{2.40} \end{equation}\]
therefore
\[\begin{equation} \beta_2 = E(W|Sex=F,SG=B) - E(W|Sex=F,SG=A) \\\tag{2.41} \end{equation}\]
\[\begin{equation} E(W|Sex=M,SG=B)= \\ E(W|I_{sex-M}=1,I_{sg-B}=1,I_{sg-C}=0)=\beta_0+\beta_1+\beta_2+\beta_4 \tag{2.42} \end{equation}\]
\[\begin{equation} E(W|Sex=F,SG=C)= \\ E(W|I_{sex-M}=0,I_{sg-B}=0,I_{sg-C}=1)=\beta_0+\beta_3 \tag{2.43} \end{equation}\]
therefore
\[\begin{equation} \beta_3 = E(W|Sex=F,SG=C) - E(W|Sex=F,SG=A) \\\tag{2.44} \end{equation}\]
\[\begin{equation} E(W|Sex=M,SG=C)= \\ E(W|I_{sex-M}=1,I_{sg-B}=0,I_{sg-C}=1)=\beta_0+\beta_1+\beta_3+\beta_5 \tag{2.45} \end{equation}\]
We need to work a little more to get a meaning for the interaction parameters \(\beta_4\) and \(\beta_5\):
\[\beta_4=E(W|Sex=M,SG=B)-\beta_0-\beta_1-\beta_2 \\ \Leftrightarrow \\ \beta_4=E(W|Sex=M,SG=B) -E(W|Sex=F,SG=A) \\ - (E(W|Sex=M,SG=A)-E(W|Sex=F,SG=A)) \\ - (E(W|Sex=F,SG=B)-E(W|Sex=F,SG=A)) \\ \Leftrightarrow \\ \beta_4=E(W|Sex=M,SG=B) +E(W|Sex=F,SG=A) \\ -E(W|Sex=M,SG=A)-E(W|Sex=F,SG=B)\] which can be rewritten as:
\[\begin{equation} \beta_4=E(W|Sex=M,SG=B)-E(W|Sex=M,SG=A)\\ -(E(W|Sex=F,SG=B)-E(W|Sex=F,SG=A)) \\ OR \\ \beta_4= E(W|Sex=M,SG=B)-E(W|Sex=F,SG=B) \\ -(E(W|Sex=M,SG=A)-E(W|Sex=F,SG=A)) \tag{2.46} \end{equation}\]
We can say that \(\beta_4\) is a measure of the difference in the effect of sex when comparing \(SG\) groups B vs. A, or, vice versa, a measure of the difference in the effect of \(SG\) (B vs. A) when comparing males vs. females.
A similar result can be obtained for \(\beta_5\); can you derive it?
\[\begin{equation} \beta_5=E(W|Sex=M,SG=C)-E(W|Sex=M,SG=A)\\ -(E(W|Sex=F,SG=C)-E(W|Sex=F,SG=A)) \\ OR \\ \beta_5= E(W|Sex=M,SG=C)-E(W|Sex=F,SG=C) \\ -(E(W|Sex=M,SG=A)-E(W|Sex=F,SG=A)) \tag{2.47} \end{equation}\]
Let us fit this model in R:
mwsexsgint<-lm(weight~sex*status, data=z0)
summary(mwsexsgint)
##
## Call:
## lm(formula = weight ~ sex * status, data = z0)
##
## Residuals:
## Min 1Q Median 3Q Max
## -29.1624 -6.1178 0.2209 6.3394 28.3450
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 44.583 1.094 40.760 <2e-16 ***
## sexM -4.650 1.542 -3.015 0.0027 **
## statusB 20.506 1.521 13.485 <2e-16 ***
## statusC 34.203 1.556 21.976 <2e-16 ***
## sexM:statusB 3.421 2.163 1.582 0.1143
## sexM:statusC 31.736 2.205 14.393 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.965 on 494 degrees of freedom
## Multiple R-squared: 0.827, Adjusted R-squared: 0.8252
## F-statistic: 472.2 on 5 and 494 DF, p-value: < 2.2e-16
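Finally, we can check interpretations (2.46) and (2.47) directly against the raw cell means (a minimal sketch reusing z0; since this model is saturated in sex and status, the differences of differences should match the interaction estimates exactly, up to rounding):
cell <- with(z0, tapply(weight, list(sex, status), mean))
cell                                            # 2 x 3 table of cell means E(W|Sex,SG)
(cell["M", "B"] - cell["F", "B"]) - (cell["M", "A"] - cell["F", "A"])   # cf. sexM:statusB
(cell["M", "C"] - cell["F", "C"]) - (cell["M", "A"] - cell["F", "A"])   # cf. sexM:statusC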