Problem 1

dat<-data.frame(kg=c(1.5,2.1,1.9,2.8,1.4,1.8,
1.8,2.0,2.0,2.7,1.6,2.3,
1.9,2.5,2.5,2.6,2.1,2.4),
date=factor(rep(c("Sept 1","Sept 15","Sept30"),each=6)),
block=factor(rep(1:6,3)))

Part A.

Hypothesis: \(H_0 = \tau_1 = \tau_2 = \dots = \tau_k\) \(H_a\) = Not all \(\tau_i\) are the same.

Test: Below, we complete an F test for randomized complete block design. We obtain a p-value of 0.03.

Conclusion: With a p-value obtained from our permutation test of 0.03 < \(\alpha = 0.05\), we reject our \(H_0\). That is, there is sufficient evidence to claim that the treatement effects are the not the same and there is a difference among cutting dates.

dat<-dat[order(dat$block),]

#Usual F-test
lmOut<-lm(kg~date+block,data=dat)
Fobs <- anova(lmOut)$'F value'[1]

set.seed(1234)
nsim<-1000
fVec<-rep(NA,nsim)
for (i in 1:nsim){
datPermute<-dat
#Shuffle WITHIN blocks
datPermute$kg<-unlist(tapply(datPermute$kg,datPermute$block,
function(x){sample(x,length(x),replace=FALSE)}))
#fit the linear model
lmOut<-lm(kg~date+block,data=datPermute)
#Compute the F statistic for the treatments.
fVec[i]<-anova(lmOut)$'F value'[1]
}

#Pvalue
sum(fVec >= Fobs)/nsim

## [1] 0.03

Part B.

From the Friedman test, we obtain a p-value of 0.1146 > \(\alpha = 0.05\). We fail to reject the \(H_0\). We cannot conclude there is a differece between the dates.

friedman.test(dat$kg,dat$date,dat$block)

## 
##  Friedman rank sum test
## 
## data:  dat$kg, dat$date and dat$block
## Friedman chi-squared = 4.3333, df = 2, p-value = 0.1146

Part C.

The ANOVA provides a p-value of 0.012927 < \(\alpha = 0.05\). This does not agree with the Friedman test of the above results because it may violate the assumption of normality. It is closer to the first permutation test, but thats because the permutation test relies on the ANOVA.

We reject our \(H_0\) based on the ANOVA and investigate where the differences in treatment lie.

fit<-lm(kg~date+block,data=dat)
anova(fit)

## Analysis of Variance Table
## 
## Response: kg
##           Df  Sum Sq Mean Sq F value   Pr(>F)   
## date       2 0.53444 0.26722  6.9308 0.012927 * 
## block      5 2.00944 0.40189 10.4236 0.001022 **
## Residuals 10 0.38556 0.03856                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Part D.

From Tukey’s HSD Test below, we see that there is a significant difference between the September 1 treatment and the September 30th treatment.

a1 <- aov(kg~date+block,data=dat)
tuk <- TukeyHSD(x=a1,  conf.level=0.95)
tuk

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = kg ~ date + block, data = dat)
## 
## $date
##                     diff         lwr       upr     p adj
## Sept 15-Sept 1 0.1500000 -0.16076969 0.4607697 0.4149464
## Sept30-Sept 1  0.4166667  0.10589698 0.7274364 0.0108584
## Sept30-Sept 15 0.2666667 -0.04410302 0.5774364 0.0938611
## 
## $block
##            diff          lwr         upr     p adj
## 2-1  0.46666667 -0.090188952  1.02352229 0.1167609
## 3-1  0.40000000 -0.156855619  0.95685562 0.2126457
## 4-1  0.96666667  0.409811048  1.52352229 0.0012854
## 5-1 -0.03333333 -0.590188952  0.52352229 0.9999268
## 6-1  0.43333333 -0.123522286  0.99018895 0.1583077
## 3-2 -0.06666667 -0.623522286  0.49018895 0.9978919
## 4-2  0.50000000 -0.056855619  1.05685562 0.0855800
## 5-2 -0.50000000 -1.056855619  0.05685562 0.0855800
## 6-2 -0.03333333 -0.590188952  0.52352229 0.9999268
## 4-3  0.56666667  0.009811048  1.12352229 0.0455544
## 5-3 -0.43333333 -0.990188952  0.12352229 0.1583077
## 6-3  0.03333333 -0.523522286  0.59018895 0.9999268
## 5-4 -1.00000000 -1.556855619 -0.44314438 0.0009844
## 6-4 -0.53333333 -1.090188952  0.02352229 0.0624913
## 6-5  0.46666667 -0.090188952  1.02352229 0.1167609

Problem 2.

Reject \(H_0\) if S \(\geq s_\alpha\)

ozone <- data.frame(effect = c(-0.08, .21, .50, .14, 0.01,.17,-.11, 0.07,0.06,0.19,.34, 0.14), trt = factor(rep(c("After 0.1", "After 0.6", "After 1.0"),each=4)), block=factor(rep(1:4,3)))

Below, we find from our Friedman test that S = 3.6. Our critical value is s_= 6.5. Since S < \(s_\alpha\), and our p-value is 0.1653 > \(\alpha = 0.05\), we fail to reject \(H_0\). The treatment effects can be argued to be equal.

friedman.test(ozone$effect,ozone$trt,ozone$block)

## 
##  Friedman rank sum test
## 
## data:  ozone$effect, ozone$trt and ozone$block
## Friedman chi-squared = 3.6, df = 2, p-value = 0.1653

library(sf)

## Linking to GEOS 3.6.1, GDAL 2.1.3, PROJ 4.9.3

library(NSM3)

## Loading required package: combinat

## 
## Attaching package: 'combinat'

## The following object is masked from 'package:utils':
## 
##     combn

## Loading required package: MASS

## Loading required package: partitions

## Loading required package: survival

## fANCOVA 0.5-1 loaded

cFrd(0.05,3,4)

## Number of blocks: n=4
## Number of treatments: k=3
## For the given alpha=0.05, the upper cutoff value is Friedman, Kendall-Babington Smith S=6.5, with true alpha level=0.0417

Problem 3.

Friedman’s test could be applied to data from a one-way layout in which there are the same number, n, or observations from each of the k treatments. We would have to alter the data by adding some blocking factor to it. But, we really should not be doing that because we cannot appropriately add a factor like this without making our ultimate test results inaccurate.

Problem 4.

Suppose k=3 and n=3. Obtain the form of the exact null distribution of S for the case of no-tied observations.

p=3; 
n=3
Rsum = rep(NA,6^3)

library(gtools)
perm = permutations(3,3)

start = 1
for (i in 1:6)
{
   for (j in 1:6)
   {
         for (k in 1:6)
         {
          start = start + 1   
           matrix = matrix(data =       c(perm[i,1],perm[i,2],perm[i,3],perm[j,1],perm[j,2],perm[j,3],perm[k,1],perm[k,2],perm[k,3]), nrow = 3, ncol = 3)
         Rsum[start] = sum(apply(matrix, 1, sum)^2)
             
         }
   }
}


S = 12/(n*p*(p+1)) *(Rsum) - 3*n*(p+1)
hist(S)

Problem 5

library(gtools)
height<-c(68,70,74)
weight<-c(145,155,160)

rhoS <- cor(height,weight, method = "spearman")
cor.test(height,weight,method="spearman")

## 
##  Spearman's rank correlation rho
## 
## data:  height and weight
## S = 0, p-value = 0.3333
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho 
##   1

perms <- permutations(3,3)
spear <-rep(NA,dim(perms)[1])
for (i in 1:dim(perms)[1]){
spear[i] <- cor(height,weight[perms[i,]], method="spearman")
}
hist(spear, xlim = c(-1,1))
abline(v=rhoS, col='red', lwd=5)

rhoS <- cor(height,weight, method = "kendall")
cor.test(height,weight,method="kendall")

## 
##  Kendall's rank correlation tau
## 
## data:  height and weight
## T = 3, p-value = 0.3333
## alternative hypothesis: true tau is not equal to 0
## sample estimates:
## tau 
##   1

perms <- permutations(3,3)
kend <-rep(NA,dim(perms)[1])
for (i in 1:dim(perms)[1]){
kend[i] <- cor(height,weight[perms[i,]], method="kendall")
}
hist(kend, xlim = c(-1,1))
abline(v=rhoS, col='red', lwd=5)

#Problem 6

age<-c(3,7,15,24,85,180,360)
strength<-c(2500,3200,4300,5300,5900,6700,6900)

Part A.

#Plot Curve
plot(strength~age)

#Definitely nonlinear, almost looks like an ROC curve! 

#Pearson's Correlation
rho <-cor(age,strength)
rho

## [1] 0.7858418

#Spearman's Correlation
cor(age,strength,method="spearman")

## [1] 1

#Kendall's Correlation
cor(age,strength,method="kendall")

## [1] 1

Part C.

Hypothesis: \(H_0 : \rho = 0\) \(H_a : \rho \neq 0\)

The Pearson test concludes that the true correlation is not equal to 0. That is, there is linear some association between age and strength. This is not a good test to use though. Based on the plot, it is clear that the data is not linear.

With a p-value of 0.0003968 < \(\alpha=0.05\), we reject the \(H_0\). The Spearman test concludes that there is some strictly monotonic relationship between the two variables age and strength. This would be a better test to use over the pearson test.

With a p-value of 0.0003968 < \(\alpha=0.05\), we reject the \(H_0\). The Kendall test conludes that there is some association between the two variables age and strength.

cor.test(age,strength)

## 
##  Pearson's product-moment correlation
## 
## data:  age and strength
## t = 2.8414, df = 5, p-value = 0.03619
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.08030986 0.96677652
## sample estimates:
##       cor 
## 0.7858418

cor.test(age,strength, method="spearman")

## 
##  Spearman's rank correlation rho
## 
## data:  age and strength
## S = 1.2434e-14, p-value = 0.0003968
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho 
##   1

cor.test(age, strength, method ="kendall")

## 
##  Kendall's rank correlation tau
## 
## data:  age and strength
## T = 21, p-value = 0.0003968
## alternative hypothesis: true tau is not equal to 0
## sample estimates:
## tau 
##   1

STAT 488 NP HW 3

Kajal Chokshi

2/22/2019

Problem 1

Part A.

Part B.

Part C.

Part D.

Problem 2.

Problem 3.

Problem 4.

Problem 5

Part A.

Part C.