I am going to try to complete the homework and project by walking through some examples of how to code these non-parametric tests in R.

To keep this assignment simple, we are going to use the built-in ChickWeight dataset, which ships with R's datasets package.

head(data)
##   Weight Time Chick Diet
## 1     42    0     1    1
## 2     51    2     1    1
## 3     59    4     1    1
## 4     64    6     1    1
## 5     76    8     1    1
## 6     93   10     1    1
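
The chunk that builds `data` is not shown here. A minimal sketch of one way to get a data frame with the same column names, assuming `data` is simply a copy of the built-in `ChickWeight` with its lowercase `weight` column renamed to `Weight` (the renaming is my assumption, inferred from the header above):

```r
# Assumption: `data` is a plain copy of the built-in ChickWeight dataset.
data <- as.data.frame(datasets::ChickWeight)

# The built-in column is lowercase `weight`; rename it to match the header above.
names(data)[names(data) == "weight"] <- "Weight"

# Column names and dimensions of the copy
names(data)
dim(data)
```

If the original `data` was filtered or edited before these tests were run, the numbers printed in this write-up will not match what this copy gives.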

Wilcoxon Signed-Rank Test

I am going to look at the difference between weight and time and see whether the chick makes a difference.

# Keep two chicks to compare (IDs 1 and 2 are an illustrative choice)
df <- data[data$Chick %in% c(1, 2), ]
# Per-row difference between weight and time
df$ChickWeight.Difference <- df$Weight - df$Time

With that all cleaned up, we run the test.

wilcox.test(data$Time, data$Chick, paired = TRUE)
## Warning in wilcox.test.default(data$Time, data$Chick, paired = TRUE): cannot
## compute exact p-value with ties
## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  data$Time and data$Chick
## V = 378.5, p-value = 5.994e-05
## alternative hypothesis: true location shift is not equal to 0

So, with p = 5.994e-05 (well below 0.05), we are able to reject the null hypothesis that the location shift between the paired values is zero.
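
For comparison, here is a minimal sketch of the signed-rank test on measurements that are naturally paired: each chick's weight on day 0 and on day 2. It uses the built-in `ChickWeight` directly (so the column is lowercase `weight`), and the names `cw`, `day0`, `day2`, `paired`, and `res` are my own, not part of the homework.

```r
# Assumption: work from the built-in ChickWeight (column `weight` is lowercase).
cw <- as.data.frame(datasets::ChickWeight)

# One row per chick at each of the two time points
day0 <- cw[cw$Time == 0, c("Chick", "weight")]
day2 <- cw[cw$Time == 2, c("Chick", "weight")]

# Match the two measurements chick by chick so the pairs line up
paired <- merge(day0, day2, by = "Chick", suffixes = c(".day0", ".day2"))

# Paired test; exact = FALSE asks for the normal approximation up front,
# since an exact p-value is not available when there are ties
res <- wilcox.test(paired$weight.day2, paired$weight.day0,
                   paired = TRUE, exact = FALSE)
res$p.value
```

Merging on `Chick` before testing matters because `paired = TRUE` compares the vectors position by position, so both must be in the same chick order.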

by(data$Chick, data$Time, median)
## data$Time: 0
## [1] 1
## ------------------------------------------------------------ 
## data$Time: 2
## [1] 1
## ------------------------------------------------------------ 
## data$Time: 4
## [1] 1
## ------------------------------------------------------------ 
## data$Time: 6
## [1] 1
## ------------------------------------------------------------ 
## data$Time: 7
## [1] 1
## ------------------------------------------------------------ 
## data$Time: 8
## [1] 1
## ------------------------------------------------------------ 
## data$Time: 10
## [1] 1

Visualize the data with a boxplot.

boxplot(Weight ~ Time, data = data)

Kruskal-Wallis

by(data$Weight, data$Time, median)
## data$Time: 0
## [1] 51
## ------------------------------------------------------------ 
## data$Time: 2
## [1] 59
## ------------------------------------------------------------ 
## data$Time: 4
## [1] 64
## ------------------------------------------------------------ 
## data$Time: 6
## [1] 78.5
## ------------------------------------------------------------ 
## data$Time: 7
## [1] 76
## ------------------------------------------------------------ 
## data$Time: 8
## [1] 84.5
## ------------------------------------------------------------ 
## data$Time: 10
## [1] 93
kruskal.test(Weight ~ Time, data = data)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  Weight by Time
## Kruskal-Wallis chi-squared = 3.1178, df = 6, p-value = 0.7939

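Here I am not able to reject the null hypothesis: the p-value of 0.7939 is well above 0.05, so this test gives no evidence that the weight distributions differ across the time points.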
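
The decision rule can be written out in code. This is a minimal sketch, assuming a 0.05 significance level and using the built-in `ChickWeight` to compare final-day weights across the four diets. That grouping is different from the one above and is chosen only for illustration; `cw`, `final`, `kw`, `alpha`, and `decision` are my own names.

```r
# Assumption: work from the built-in ChickWeight (column `weight` is lowercase).
cw <- as.data.frame(datasets::ChickWeight)

# Keep only the last measurement day so each chick contributes one value
final <- cw[cw$Time == 21, ]

# Kruskal-Wallis rank sum test of weight across the four diets
kw <- kruskal.test(weight ~ Diet, data = final)

# Assumed significance level
alpha <- 0.05

# Reject only when the p-value falls below alpha
decision <- if (kw$p.value < alpha) {
  "reject the null hypothesis"
} else {
  "fail to reject the null hypothesis"
}
decision
```

Using one row per chick keeps the observations within each group independent, which the Kruskal-Wallis test assumes.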

ggplot

library(ggplot2)
# Refer to columns by bare name inside aes() rather than with data$
ggplot(data = data, aes(Weight, Time, color = Chick)) +
  geom_point()