Question 1

In this exercise, students will work together to co-construct knowledge about a useful statistics technique called polynomial contrasts. For convenience, we will consider the Rehabilitation Therapy data set described in Exercise 16.9 (which you had studied in a previous assignment), but now replace the prior fitness status by a certain fitness score (whose value is a real number). In particular, ‘below average’ is replaced by a score of 10, ‘average’ is replaced by a score of 20, and ‘above average’ is replaced by 30; the observed values of the response variable are kept the same. Please work together on the following questions.

  1. Provide a description of polynomial contrasts, and describe the situations where polynomial contrasts can be considered, and the types of inferences and conclusions we will be able to make by using polynomial contrasts. [5pts]

Answer: Polynomial contrasts are orthogonal contrasts that generalize to comparisons among several means (more than 2). They can be used to construct F tests for whether there is a polynomial (linear, quadratic, and so on) relationship between the factor levels and the means. These contrasts can be considered when:

1) The contrasts are orthogonal; that is, the two contrasts \[\sum{c_i\bar{Y}_i} \quad \text{and} \quad \sum{d_i\bar{Y}_i}\] satisfy \[\sum{c_i} = \sum{d_i} = 0\] and, further, \[\sum{\frac{c_id_i}{n_i}} =0\] where \(n_i\) is the number of observations in class \(i\).

2) The spacing between the levels is constant.

df = read.table("http://www.stat.ufl.edu/~rrandles/sta4210/Rclassnotes/data/textdatasets/KutnerData/Chapter%2016%20Data%20Sets/CH16PR09.txt")
df$Recover = df$V1
df$Fitness = df$V2
df$V1 <- df$V2 <- df$V3 <- NULL 
df$Fitness[df$Fitness ==1] = 10
df$Fitness[df$Fitness ==2] = 20
df$Fitness[df$Fitness ==3] = 30
  1. Can the polynomial contrasts technique be considered for the data in this exercise? Why or Why not? [5pts]

The spacing between the factor levels (10, 20, 30) is constant.

One can pick \(c = (1,-1,0)\) and \(d=(4/9,5/9,-1)\), so that \(\sum{c_i} = \sum{d_i} =0\) and, with group sizes \(n = (8, 10, 6)\), \[\sum{\frac{c_id_i}{n_i}} = \frac{4/9}{8} + \frac{-5/9}{10}+\frac{0 \cdot (-1)}{6} = \frac{1}{18} - \frac{1}{18} = 0\]

So yes, polynomial contrasts can be used with this data.
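The two conditions checked by hand above can also be verified numerically. The following is a minimal sketch, not part of the original solution: the helper names `is_contrast` and `are_orthogonal` are hypothetical, and the group sizes 8, 10, 6 are taken from the denominators in the calculation above.

```r
# Group sizes and the two candidate contrasts from the answer above
n  <- c(8, 10, 6)
c1 <- c(1, -1, 0)
d1 <- c(4/9, 5/9, -1)

# A contrast has coefficients that sum to zero (up to rounding error)
is_contrast <- function(w, tol = 1e-12) abs(sum(w)) < tol

# With unequal group sizes, orthogonality means sum(c_i * d_i / n_i) = 0
are_orthogonal <- function(a, b, n, tol = 1e-12) abs(sum(a * b / n)) < tol

is_contrast(c1)
is_contrast(d1)
are_orthogonal(c1, d1, n)

# For comparison: the usual equal-n linear and quadratic coefficients
are_orthogonal(c(-1, 0, 1), c(1, -2, 1), n)
```

The last line illustrates why the weights \(1/n_i\) matter: coefficients that are orthogonal for equal group sizes need not satisfy the weighted condition when the \(n_i\) differ.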

  1. If your answer to b) is yes, define a set of polynomial contrasts that you may consider for this data set, and show that they indeed are contrasts and they are orthogonal to each other (hint. You will need to find the definition of orthogonal contrasts). [5pts]

Answer: See part b.

  1. As in Exercise 16.9(g), describe the relationship between physical fitness scores and duration of required physical therapy. If possible, use polynomial contrasts to draw a conclusion. [5pts]

library(devtools)
## Warning: package 'devtools' was built under R version 3.4.2
library(easyGgplot2)
## Loading required package: ggplot2
library("ggplot2")
ggplot2.dotplot(data=df, xName='Fitness',yName='Recover', groupName='Fitness'
,legendPosition="top",addBoxplot=TRUE)
## `stat_bindot()` using `bins = 30`. Pick better value with `binwidth`.

There seems to be a linear relationship between the fitness score and the group means.
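One way to back up the visual impression with a formal test is sketched below. This is an assumption about how the analysis could be done, not the method used in the original answer. The helper `trend_fit` is hypothetical, and it assumes a data frame with numeric columns `Recover` and `Fitness` like the `df` built earlier. `poly(Fitness, 2)` constructs orthogonal linear and quadratic terms from the observed scores, so the t tests on the two terms act as linear and quadratic trend tests.

```r
# Fit orthogonal linear and quadratic trend terms and return the
# coefficient table (estimate, standard error, t value, p value).
trend_fit <- function(data) {
  fit <- lm(Recover ~ poly(Fitness, 2), data = data)
  coef(summary(fit))
}

# Usage with the data frame from this exercise (assumed to exist):
# trend_fit(df)
```

A small p value on the first term together with a large one on the second would be consistent with a linear relationship between the means.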

  1. Discuss if it is possible to predict the average value of the response variable when the fitness score is 15. If yes, provide a method for making this prediction. [5pts]

Yes, we can make a prediction by interpolating the “linear regression” fit given by the polynomial contrast, since a score of 15 lies between the observed scores 10 and 20.
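A minimal sketch of such a prediction follows, assuming a straight-line model is adequate and that the data frame has numeric columns `Recover` and `Fitness` as built earlier. The helper name `predict_at` is hypothetical.

```r
# Fit a straight line in the fitness score and predict the mean response
# at a new score, with a 95% confidence interval for that mean.
predict_at <- function(data, score) {
  fit <- lm(Recover ~ Fitness, data = data)
  predict(fit, newdata = data.frame(Fitness = score), interval = "confidence")
}

# Usage with the data frame from this exercise (assumed to exist):
# predict_at(df, 15)
```

The result is a one-row matrix with columns `fit`, `lwr`, and `upr`.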

Question 2

df = read.table("http://www.stat.ufl.edu/~rrandles/sta4210/Rclassnotes/data/textdatasets/KutnerData/Chapter%2019%20Data%20Sets/CH19PR10.txt")
df$Cash = df$V1
df$Age = df$V2
df$Gender = df$V3
df$V1 <- df$V2 <- df$V3 <- df$V4 <- NULL 
df$Age[df$Age ==1] = "Young"
df$Age[df$Age ==2] = "Middle"
df$Age[df$Age ==3] = "Elderly"
df$Gender[df$Gender == 1] = "Male"
df$Gender[df$Gender == 2] = "Female"
# Two-way ANOVA; Age * Gender expands to Age + Gender + Age:Gender
av = aov(Cash ~ Age * Gender, data = df)
anova(av)
## Analysis of Variance Table
## 
## Response: Cash
##            Df Sum Sq Mean Sq F value    Pr(>F)    
## Age         2 316.72 158.361 66.2907 9.789e-12 ***
## Gender      1   5.44   5.444  2.2791    0.1416    
## Age:Gender  2   5.06   2.528  1.0581    0.3597    
## Residuals  30  71.67   2.389                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  1. Test whether or not interaction effects are present with alpha =.05

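\(H_0: \text{all } (\alpha\beta)_{ij} = 0\) \(H_a: \text{not all } (\alpha\beta)_{ij} = 0\). The p-value for Age:Gender in the ANOVA table is \(0.3597 > 0.05\), so we fail to reject \(H_0\) and conclude that there is no evidence of interaction effects.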

  1. \(H_0: \alpha_i = 0 \ \forall i\) \(H_a: \exists i: \alpha_i \neq 0\). The p-value in the ANOVA table is \(9.789 \times 10^{-12} < 0.05\), so we reject \(H_0\) and conclude that the Age main effect is significant.

\(H_0: \beta_j = 0 \ \forall j\) \(H_a: \exists j: \beta_j \neq 0\). The p-value in the ANOVA table is \(0.1416 > 0.05\), so we fail to reject \(H_0\) and conclude that the Gender main effect is not significant.

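  1. With each of the three tests above carried out at level \(\alpha_k = 0.05\), the family level of significance satisfies \(\alpha \leq 1-(1-\alpha_1)(1-\alpha_2)(1-\alpha_3) = 1-(0.95)^3 \approx 0.143\).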
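The arithmetic behind this bound can be reproduced as a quick check. The variable names are illustrative only, and the Bonferroni bound is included purely for comparison.

```r
alpha_each <- 0.05   # level used for each individual test
k <- 3               # interaction test plus the two main-effect tests

# Upper bound on the family level used in the answer above
family_bound <- 1 - (1 - alpha_each)^k

# Simpler (slightly looser) Bonferroni bound for comparison
bonferroni_bound <- k * alpha_each

round(family_bound, 3)
bonferroni_bound
```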