aliens <- read.csv("aliens.csv", header = TRUE, stringsAsFactors = TRUE)
source('special_functions.R')
my_sample <- make.my.sample(33002176, 100, aliens)
## Warning in RNGkind("Mersenne-Twister", "Inversion", "Rounding"): non-uniform
## 'Rounding' sampler used
library(lsr)
boxplot(my_sample$sleep~my_sample$antennae, ylim = c(0, 10))
boxplot(my_sample$sleep~my_sample$color, ylim = c(0, 10))
boxplot(my_sample$food1~my_sample$antennae, ylim = c(0, 20))
boxplot(my_sample$food1~my_sample$color, ylim = c(0, 20))
Based on my box plots, I think that neither sleep nor food consumption depends on antennae shape or color. I think this because, looking at the graphs, there is no clear difference between the two boxes in any of the plots.
t.test(my_sample$sleep~my_sample$antennae, alternative='two.sided')
##
## Welch Two Sample t-test
##
## data: my_sample$sleep by my_sample$antennae
## t = -0.19093, df = 31.039, p-value = 0.8498
## alternative hypothesis: true difference in means between group Curly and group Straight is not equal to 0
## 95 percent confidence interval:
## -0.5942912 0.4925432
## sample estimates:
## mean in group Curly mean in group Straight
## 5.923810 5.974684
I would fail to reject the null hypothesis. The alternative hypothesis states that the true difference in means between group Curly and group Straight is not equal to 0, but the p-value (0.8498) is above 0.05, so there is not enough evidence to support it.
t.test(my_sample$sleep~my_sample$color, alternative='two.sided')
##
## Welch Two Sample t-test
##
## data: my_sample$sleep by my_sample$color
## t = -0.98126, df = 64.458, p-value = 0.3301
## alternative hypothesis: true difference in means between group Blue and group Pink is not equal to 0
## 95 percent confidence interval:
## -0.6858461 0.2339779
## sample estimates:
## mean in group Blue mean in group Pink
## 5.817143 6.043077
I would not reject the null hypothesis. The alternative hypothesis states that the true difference in means between group Blue and group Pink is not equal to 0, but the p-value (0.3301) is above 0.05, so there is not enough evidence to support it.
t.test(my_sample$food1~my_sample$antennae, alternative='two.sided')
##
## Welch Two Sample t-test
##
## data: my_sample$food1 by my_sample$antennae
## t = -1.8098, df = 31.038, p-value = 0.08002
## alternative hypothesis: true difference in means between group Curly and group Straight is not equal to 0
## 95 percent confidence interval:
## -2.0191655 0.1204313
## sample estimates:
## mean in group Curly mean in group Straight
## 8.000000 8.949367
I would not reject the null hypothesis. The alternative hypothesis states that the true difference in means between group Curly and group Straight is not equal to 0, but the p-value (0.08002) is above 0.05, so there is not enough evidence to support it.
t.test(my_sample$food1~my_sample$color, alternative='two.sided')
##
## Welch Two Sample t-test
##
## data: my_sample$food1 by my_sample$color
## t = -2.6834, df = 81.033, p-value = 0.008832
## alternative hypothesis: true difference in means between group Blue and group Pink is not equal to 0
## 95 percent confidence interval:
## -1.9328428 -0.2869374
## sample estimates:
## mean in group Blue mean in group Pink
## 8.028571 9.138462
I would reject the null hypothesis. The alternative hypothesis states that the true difference in means between group Blue and group Pink is not equal to 0, and the p-value (0.008832) is below 0.05, so the data support a difference in food consumption between the two colors.
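The decision rule behind these four answers can be sketched in a few lines. This is only an illustration: the p-values are copied by hand from the t-test output above, and the 0.05 cutoff and the names `p_values`, `alpha`, and `decision` are assumptions made for the example.

```r
# p-values copied from the four Welch t-test outputs above
p_values <- c(
  sleep_antennae = 0.8498,
  sleep_color    = 0.3301,
  food1_antennae = 0.08002,
  food1_color    = 0.008832
)

alpha <- 0.05  # assumed significance level

# Reject H0 only when the p-value is below alpha
decision <- ifelse(p_values < alpha, "reject H0", "fail to reject H0")
print(decision)
```

Only the food1 by color test falls below the cutoff.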
t.test(my_sample$sleep~my_sample$antennae, alternative='two.sided')
##
## Welch Two Sample t-test
##
## data: my_sample$sleep by my_sample$antennae
## t = -0.19093, df = 31.039, p-value = 0.8498
## alternative hypothesis: true difference in means between group Curly and group Straight is not equal to 0
## 95 percent confidence interval:
## -0.5942912 0.4925432
## sample estimates:
## mean in group Curly mean in group Straight
## 5.923810 5.974684
In this t-test, the 95 percent confidence interval (-0.594 to 0.493) contains 0, so the result is not statistically significant and we do not reject the null hypothesis.
t.test(my_sample$sleep~my_sample$color, alternative='two.sided')
##
## Welch Two Sample t-test
##
## data: my_sample$sleep by my_sample$color
## t = -0.98126, df = 64.458, p-value = 0.3301
## alternative hypothesis: true difference in means between group Blue and group Pink is not equal to 0
## 95 percent confidence interval:
## -0.6858461 0.2339779
## sample estimates:
## mean in group Blue mean in group Pink
## 5.817143 6.043077
In this t-test, the 95 percent confidence interval (-0.686 to 0.234) contains 0, so the result is not statistically significant and we do not reject the null hypothesis.
t.test(my_sample$food1~my_sample$antennae, alternative='two.sided')
##
## Welch Two Sample t-test
##
## data: my_sample$food1 by my_sample$antennae
## t = -1.8098, df = 31.038, p-value = 0.08002
## alternative hypothesis: true difference in means between group Curly and group Straight is not equal to 0
## 95 percent confidence interval:
## -2.0191655 0.1204313
## sample estimates:
## mean in group Curly mean in group Straight
## 8.000000 8.949367
In this t-test, the 95 percent confidence interval (-2.019 to 0.120) contains 0, so the result is not statistically significant and we do not reject the null hypothesis.
t.test(my_sample$food1~my_sample$color, alternative='two.sided')
##
## Welch Two Sample t-test
##
## data: my_sample$food1 by my_sample$color
## t = -2.6834, df = 81.033, p-value = 0.008832
## alternative hypothesis: true difference in means between group Blue and group Pink is not equal to 0
## 95 percent confidence interval:
## -1.9328428 -0.2869374
## sample estimates:
## mean in group Blue mean in group Pink
## 8.028571 9.138462
In this t-test, the 95 percent confidence interval (-1.933 to -0.287) does not contain 0, so the result is statistically significant and we reject the null hypothesis.
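The confidence-interval version of the same decision can be checked directly: a two-sided test at the 0.05 level is significant exactly when the 95 percent interval for the difference in means excludes 0. The sketch below uses the interval endpoints copied from the output above; the object names `ci` and `contains_zero` are chosen only for this example.

```r
# 95% confidence intervals copied from the four t-test outputs above
ci <- rbind(
  sleep_antennae = c(-0.5942912,  0.4925432),
  sleep_color    = c(-0.6858461,  0.2339779),
  food1_antennae = c(-2.0191655,  0.1204313),
  food1_color    = c(-1.9328428, -0.2869374)
)
colnames(ci) <- c("lower", "upper")

# TRUE means 0 is a plausible difference, so H0 is not rejected
contains_zero <- ci[, "lower"] <= 0 & ci[, "upper"] >= 0
print(contains_zero)
```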
cohensD(my_sample$sleep~my_sample$antennae)
## [1] 0.04738408
This is below 0.2, so it would be considered a negligible effect (smaller than "small") based on the Cohen’s d rule of thumb.
cohensD(my_sample$sleep~my_sample$color)
## [1] 0.2114845
This would be considered a small effect based on the Cohen’s d rule of thumb.
cohensD(my_sample$food1~my_sample$antennae)
## [1] 0.4491698
This would be considered a small effect, close to the 0.5 cutoff for medium, based on the Cohen’s d rule of thumb.
cohensD(my_sample$food1~my_sample$color)
## [1] 0.5331372
This would be considered a medium effect based on the Cohen’s d rule of thumb.
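As a rough check on these labels, the rule of thumb can be written as a lookup. The cutoffs 0.2, 0.5, and 0.8 are the conventional ones for small, medium, and large; calling anything below 0.2 "negligible" is an assumption of this sketch, as are the names `d` and `label`.

```r
# Effect sizes copied from the cohensD() output above
d <- c(
  sleep_antennae = 0.04738408,
  sleep_color    = 0.2114845,
  food1_antennae = 0.4491698,
  food1_color    = 0.5331372
)

# Conventional cutoffs: 0.2 small, 0.5 medium, 0.8 large
# (the "negligible" label for d < 0.2 is an assumption of this sketch)
label <- cut(
  abs(d),
  breaks = c(0, 0.2, 0.5, 0.8, Inf),
  labels = c("negligible", "small", "medium", "large"),
  right  = FALSE
)
print(data.frame(d = d, label = label))
```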
The expected Type I error rate for each of the tests I did in Question 4 is 0.05. The probability of making at least one Type I error, if all four null hypotheses are actually true, would be about 19%, or 0.185 (1 - 0.95^4, treating the four tests as independent). I do think there might be a Type I error in my results, because a chance of roughly 1 in 5 is not negligible when running four tests.
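The family-wise probability can be worked out in one line. This sketch assumes a per-test level of 0.05 and treats the four tests as independent, which is a simplification because they share the same sample; `alpha`, `k`, and `fwer` are names picked for the example.

```r
alpha <- 0.05  # assumed per-test Type I error rate
k     <- 4     # number of tests run

# P(at least one Type I error) = 1 - P(no Type I error in any test),
# assuming the tests are independent
fwer <- 1 - (1 - alpha)^k
print(round(fwer, 4))  # 0.1855
```

Note that 0.95^4 = 0.8145 is the probability of making no Type I error at all, so it is the complement of the quantity asked for.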
my_sample2 <- make.my.sample(33002177, 100, aliens)
## Warning in RNGkind("Mersenne-Twister", "Inversion", "Rounding"): non-uniform
## 'Rounding' sampler used
t.test(my_sample2$sleep~my_sample2$antennae, alternative='two.sided')
##
## Welch Two Sample t-test
##
## data: my_sample2$sleep by my_sample2$antennae
## t = 0.52682, df = 17.724, p-value = 0.6049
## alternative hypothesis: true difference in means between group Curly and group Straight is not equal to 0
## 95 percent confidence interval:
## -0.4740912 0.7909539
## sample estimates:
## mean in group Curly mean in group Straight
## 5.986667 5.828235
My decision would not change for this test: the p-value (0.6049) is above 0.05 and the confidence interval contains 0, so I would still not reject the null hypothesis. The estimates vary a little between the two samples, but adding 1 to my student ID does not change my conclusion.
t.test(my_sample2$sleep~my_sample2$color, alternative='two.sided')
##
## Welch Two Sample t-test
##
## data: my_sample2$sleep by my_sample2$color
## t = 0.52905, df = 57.01, p-value = 0.5988
## alternative hypothesis: true difference in means between group Blue and group Pink is not equal to 0
## 95 percent confidence interval:
## -0.3128893 0.5375840
## sample estimates:
## mean in group Blue mean in group Pink
## 5.927273 5.814925
My decision would not change for this test: the p-value (0.5988) is above 0.05 and the confidence interval contains 0, so I would still not reject the null hypothesis. The estimates vary a little between the two samples, but adding 1 to my student ID does not change my conclusion.
t.test(my_sample2$food1~my_sample2$antennae, alternative='two.sided')
##
## Welch Two Sample t-test
##
## data: my_sample2$food1 by my_sample2$antennae
## t = -1.339, df = 24.281, p-value = 0.193
## alternative hypothesis: true difference in means between group Curly and group Straight is not equal to 0
## 95 percent confidence interval:
## -1.8928286 0.4026325
## sample estimates:
## mean in group Curly mean in group Straight
## 8.066667 8.811765
My decision would not change for this test: the p-value (0.193) is above 0.05 and the confidence interval contains 0, so I would still not reject the null hypothesis. The estimates vary a little between the two samples, but adding 1 to my student ID does not change my conclusion.
t.test(my_sample2$food1~my_sample2$color, alternative='two.sided')
##
## Welch Two Sample t-test
##
## data: my_sample2$food1 by my_sample2$color
## t = -3.8742, df = 68.512, p-value = 0.0002418
## alternative hypothesis: true difference in means between group Blue and group Pink is not equal to 0
## 95 percent confidence interval:
## -2.8162189 -0.9015559
## sample estimates:
## mean in group Blue mean in group Pink
## 7.454545 9.313433
My decision would not change for this test: the p-value (0.0002418) is below 0.05 and the confidence interval does not contain 0, so I would reject the null hypothesis. In the first sample this test gave p = 0.008832, which is also below 0.05. The estimates vary a little between the two samples, but adding 1 to my student ID does not change my conclusion.
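To compare the two samples side by side, the decisions can be tabulated from the p-values in the outputs above. This is a sketch under the assumption of a 0.05 cutoff; the data frame `comparison` and its column names are introduced only for this illustration.

```r
# p-values copied from the t-test outputs for each sample
p_sample1 <- c(sleep_antennae = 0.8498, sleep_color = 0.3301,
               food1_antennae = 0.08002, food1_color = 0.008832)
p_sample2 <- c(sleep_antennae = 0.6049, sleep_color = 0.5988,
               food1_antennae = 0.193, food1_color = 0.0002418)

alpha <- 0.05  # assumed significance level

comparison <- data.frame(
  p_sample1 = p_sample1,
  p_sample2 = p_sample2,
  reject1   = p_sample1 < alpha,
  reject2   = p_sample2 < alpha
)
# A decision "changes" when one sample rejects H0 and the other does not
comparison$changed <- comparison$reject1 != comparison$reject2
print(comparison)
```

With these values no decision changes: only the food1 by color test rejects the null hypothesis, and it does so in both samples.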