library(tidyverse)
library(Stat2Data)
library(skimr)
library(agricolae)
Some students were interested in how an acidic environment might affect the growth of plants. They planted alfalfa seed in 15 cups and randomly chose five to get plain water, five to get a moderate amount of acid (1.5M HCl), and five to get a stronger acid solution (3.0M HCl). The plants were grown in an indoor room, so the students assumed that the distance from the main source of daylight (a window) might have an effect on growth rates. For this reason, they arranged the cups in five rows of three, with one cup from each Acid level in each row. These are labeled in the dataset as Row: a = farthest from the window through e = nearest to the window. Each cup was an experimental unit, and the response variable was the average height of the alfalfa sprouts in each cup after four days (Ht4). The data are shown in the table below and stored in the Alfalfa file.
table <- matrix(c(1.45,2.79,1.93,2.33,4.85,1.00,0.70,1.37,2.80,1.46,1.03,1.22,0.45,1.65,1.07),ncol=5,byrow=TRUE)
colnames(table) <- c("a", "b", "c", "d", "e")
rownames(table) <- c("water","1.5 HCl","3.0 HCl")
table <- as.table(table)
table
##            a    b    c    d    e
## water   1.45 2.79 1.93 2.33 4.85
## 1.5 HCl 1.00 0.70 1.37 2.80 1.46
## 3.0 HCl 1.03 1.22 0.45 1.65 1.07
table <- matrix(c(1.45,2.79,1.93,2.33,4.85,2.67,1.00,0.70,1.37,2.80,1.46,1.466,1.03,1.22,0.45,1.65,1.07,1.084,1.16,1.57,1.25,2.26,2.46,1.74),ncol=6,byrow=TRUE)
colnames(table) <- c("a", "b", "c", "d", "e","Avg")
rownames(table) <- c("water","1.5 HCl","3.0 HCl","Avg")
table <- as.table(table)
table
##             a     b     c     d     e   Avg
## water   1.450 2.790 1.930 2.330 4.850 2.670
## 1.5 HCl 1.000 0.700 1.370 2.800 1.460 1.466
## 3.0 HCl 1.030 1.220 0.450 1.650 1.070 1.084
## Avg     1.160 1.570 1.250 2.260 2.460 1.740
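The row and column averages typed in by hand above can also be computed directly from the raw heights. The following is a minimal sketch, assuming the 15 heights are stored in a plain matrix; the names `heights` and `with_avg` are introduced here for illustration only.

```r
# Raw heights: rows are Acid levels, columns are Rows a (farthest) through e (nearest)
heights <- matrix(
  c(1.45, 2.79, 1.93, 2.33, 4.85,
    1.00, 0.70, 1.37, 2.80, 1.46,
    1.03, 1.22, 0.45, 1.65, 1.07),
  ncol = 5, byrow = TRUE,
  dimnames = list(c("water", "1.5 HCl", "3.0 HCl"), c("a", "b", "c", "d", "e"))
)

# Append the row means as an "Avg" column, then the column means
# (plus the grand mean in the corner) as an "Avg" row
with_avg <- rbind(
  cbind(heights, Avg = rowMeans(heights)),
  Avg = c(colMeans(heights), mean(heights))
)
round(with_avg, 3)
```

Computing the margins this way avoids transcription slips when the averages are reused later as treatment and block means.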
data("Alfalfa")
mydata <- Alfalfa$Ht4
mymean <- mean(mydata)
mysd <- sd(mydata)
data(Alfalfa)
a1 <- aov(Ht4~Acid + Row,data=Alfalfa)
summary(a1)
##             Df Sum Sq Mean Sq F value Pr(>F)
## Acid         2  6.852   3.426   4.513 0.0487 *
## Row          4  4.183   1.046   1.378 0.3235
## Residuals    8  6.072   0.759
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
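# Treatment (Acid) effects: treatment mean minus grand mean
t_water = 2.67-mymean
t_hcl1 = 1.466-mymean
t_hcl3 = 1.084-mymean
# Block (Row) effects: row mean minus grand mean
b_a = 1.16-mymean
b_b = 1.57-mymean
b_c = 1.25-mymean
b_d = 2.26-mymean
b_e = 2.46-mymean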
# SSA = J * sum_i (ybar_i. - ybar..)^2, with J = 5 rows per acid level
SSA = 5*((t_water^2)+(t_hcl1^2)+(t_hcl3^2))
# SSB = I * sum_j (ybar_.j - ybar..)^2, with I = 3 acid levels per row
SSB = 3*((b_a^2)+(b_b^2)+(b_c^2)+(b_d^2)+(b_e^2))
# SST = (n-1) * s_y^2, with n = I*J = 15
SST = ((3*5)-1)*(mysd^2)
# SSE = SST - SSA - SSB
SSE = SST-SSA-SSB
# MSA = SSA/(I-1)
MSA = SSA/2
# MSB = SSB/(J-1)
MSB = SSB/4
# MSE = SSE/((I-1)(J-1))
MSE = SSE/8
# F = MSA/MSE for Acid, F = MSB/MSE for Row
F1 = MSA/MSE
F2 = MSB/MSE
# p-values: upper-tail F probabilities for Acid (2, 8 df) and Row (4, 8 df)
pf(F1, 2, 8, lower.tail=FALSE)
## [1] 0.04873771
pf(F2, 4, 8, lower.tail=FALSE)
## [1] 0.3235159
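For reference, the hand calculation above follows the usual decomposition for a two-way additive (randomized complete block) design with one observation per cell. The notation is an assumption introduced here: \(I = 3\) acid levels, \(J = 5\) rows, \(\bar{y}_{i\cdot}\) the acid-level means, \(\bar{y}_{\cdot j}\) the row means, \(\bar{y}_{\cdot\cdot}\) the grand mean, and \(s_y\) the standard deviation of all 15 heights.

\[
\begin{aligned}
SSA &= J \sum_{i=1}^{I} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2, & MSA &= \frac{SSA}{I-1}, \\
SSB &= I \sum_{j=1}^{J} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2, & MSB &= \frac{SSB}{J-1}, \\
SST &= (IJ - 1)\, s_y^2, & SSE &= SST - SSA - SSB, \\
MSE &= \frac{SSE}{(I-1)(J-1)}, & F_A &= \frac{MSA}{MSE}, \quad F_B = \frac{MSB}{MSE}.
\end{aligned}
\]

Here \(F_A\) is compared with an F distribution on \((I-1, (I-1)(J-1)) = (2, 8)\) degrees of freedom and \(F_B\) with one on \((J-1, (I-1)(J-1)) = (4, 8)\), matching the `pf()` calls.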
| | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| Acid | 2 | 6.852 | 3.426 | 4.513 | 0.0487 |
| Row | 4 | 4.18 | 1.046 | 1.378 | 0.3235 |
| Residuals | 8 | 6.07 | 0.759 | | |
plot(a1,which=1)
plot(a1,which=2)
Sea slugs, common on the coast of Southern California, live on vaucherian seaweed. But the larvae from these sea slugs need to locate this type of seaweed to survive. A study was done to try to determine whether chemicals that leach out of the seaweed attract the larvae. Seawater was collected over a patch of this kind of seaweed at 5-minute intervals as the tide was coming in and, presumably, mixing with the chemicals. The idea was that as more seawater came in, the concentration of the chemicals was reduced. Each sample of water was divided into six parts. Larvae were then introduced to this seawater to see what percentage metamorphosed. Is there a difference in this percentage over the six time periods? Open the dataset SeaSlugs.
data("SeaSlugs")
a1 <- aov(Percent~factor(Time), data=SeaSlugs)
anova(a1)
## Analysis of Variance Table
##
## Response: Percent
##              Df  Sum Sq  Mean Sq F value    Pr(>F)
## factor(Time)  5 0.63091 0.126182  5.9648 0.0006067 ***
## Residuals    30 0.63464 0.021155
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(LSD.test(a1,"factor(Time)"))
## $statistics
##      MSerror Df      Mean       CV  t.value       LSD
##   0.02115452 30 0.2716667 53.53838 2.042272 0.1714963
##
## $parameters
##         test p.ajusted       name.t ntr alpha
##   Fisher-LSD      none factor(Time)   6  0.05
##
## $means
##      Percent       std r        LCL       UCL   Min   Max     Q25    Q50
## 0  0.5356667 0.1687859 6 0.41440050 0.6569328 0.357 0.857 0.47525 0.5000
## 10 0.1776667 0.1238881 6 0.05640050 0.2989328 0.067 0.333 0.08350 0.1330
## 15 0.1833333 0.1470397 6 0.06206716 0.3045995 0.000 0.333 0.05350 0.2405
## 20 0.2191667 0.1383914 6 0.09790050 0.3404328 0.067 0.437 0.10775 0.2335
## 25 0.1686667 0.1484650 6 0.04740050 0.2899328 0.000 0.412 0.08350 0.1330
## 5  0.3455000 0.1423921 6 0.22423383 0.4667662 0.125 0.467 0.26050 0.4000
##        Q75
## 0  0.52475
## 10 0.28300
## 15 0.28125
## 20 0.26700
## 25 0.23350
## 5  0.45025
##
## $comparison
## NULL
##
## $groups
##      Percent groups
## 0  0.5356667      a
## 5  0.3455000      b
## 20 0.2191667     bc
## 15 0.1833333     bc
## 10 0.1776667     bc
## 25 0.1686667      c
##
## attr(,"class")
## [1] "group"
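The LSD value reported under `$statistics` can be reproduced from the ANOVA quantities. This is a minimal sketch, assuming equal replication (six observations per time) and the usual Fisher LSD formula, critical t times sqrt(2 * MSE / r). The variable names are introduced here for illustration.

```r
mse   <- 0.02115452   # MSerror, copied from the LSD.test output
df_e  <- 30           # residual degrees of freedom
r     <- 6            # observations per time period
alpha <- 0.05

t_crit <- qt(1 - alpha / 2, df_e)      # two-sided critical t value
lsd    <- t_crit * sqrt(2 * mse / r)   # least significant difference
c(t.value = t_crit, LSD = lsd)
```

Two time periods whose sample means differ by more than this amount receive different letters. For example, the means at 5 and 25 minutes differ by about 0.177, just above the LSD, so they share no letter.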
a1 <- aov(Percent~factor(Time), data=SeaSlugs)
TukeyHSD(a1)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = Percent ~ factor(Time), data = SeaSlugs)
##
## $`factor(Time)`
##               diff        lwr         upr     p adj
## 5-0   -0.190166667 -0.4455792  0.06524590 0.2397208
## 10-0  -0.358000000 -0.6134126 -0.10258743 0.0023231
## 15-0  -0.352333333 -0.6077459 -0.09692077 0.0027831
## 20-0  -0.316500000 -0.5719126 -0.06108743 0.0085222
## 25-0  -0.367000000 -0.6224126 -0.11158743 0.0017407
## 10-5  -0.167833333 -0.4232459  0.08757923 0.3666256
## 15-5  -0.162166667 -0.4175792  0.09324590 0.4038772
## 20-5  -0.126333333 -0.3817459  0.12907923 0.6641386
## 25-5  -0.176833333 -0.4322459  0.07857923 0.3114499
## 15-10  0.005666667 -0.2497459  0.26107923 0.9999998
## 20-10  0.041500000 -0.2139126  0.29691257 0.9960188
## 25-10 -0.009000000 -0.2644126  0.24641257 0.9999978
## 20-15  0.035833333 -0.2195792  0.29124590 0.9980127
## 25-15 -0.014666667 -0.2700792  0.24074590 0.9999748
## 25-20 -0.050500000 -0.3059126  0.20491257 0.9901287
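Because the design is balanced, every Tukey interval above has the same half-width. The sketch below reproduces it, assuming the standard Tukey HSD formula q * sqrt(MSE / r), with q taken from the studentized range distribution via `qtukey()`. The MSE and group means are copied from the earlier output, and the variable names are illustrative.

```r
mse  <- 0.02115452   # residual mean square from the one-way ANOVA
df_e <- 30           # residual degrees of freedom
k    <- 6            # number of time periods compared
r    <- 6            # observations per time period

q_crit <- qtukey(0.95, nmeans = k, df = df_e)   # studentized range quantile
hsd    <- q_crit * sqrt(mse / r)                # half-width of each 95% interval
hsd

# Interval for the 5-0 comparison: difference in means plus or minus hsd
diff_5_0 <- 0.3455000 - 0.5356667
c(lwr = diff_5_0 - hsd, upr = diff_5_0 + hsd)
```

The half-width (about 0.255) is wider than the Fisher LSD of about 0.171, which is why the 5-0 difference of 0.190 is flagged by the LSD grouping but not by the Tukey comparison.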
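# Convert probabilities to odds: odds = p/(1-p)
0.8/(1-0.8)
## [1] 4
0.25/(1-0.25)
## [1] 0.3333333
0.6/(1-0.6)
## [1] 1.5
# Convert odds to probabilities: p = 1/(1/odds + 1) = odds/(1 + odds)
1/((1/(1/3))+1)
## [1] 0.25
1/((1/(5/2))+1)
## [1] 0.7142857
1/((1/(1/9))+1)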
## [1] 0.1
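The six conversions above apply two formulas repeatedly, so they can be wrapped in small helpers. This is a sketch only: `prob_to_odds` and `odds_to_prob` are hypothetical names introduced here, not functions from any package.

```r
# Hypothetical helpers (not from a package)
prob_to_odds <- function(p) p / (1 - p)            # odds = p / (1 - p)
odds_to_prob <- function(odds) odds / (1 + odds)   # p = odds / (1 + odds)

prob_to_odds(c(0.8, 0.25, 0.6))     # probabilities used above
odds_to_prob(c(1/3, 5/2, 1/9))      # odds used above
```

The form `odds / (1 + odds)` is algebraically the same as `1/((1/odds) + 1)` used in the calculations above.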
Two types of dementia are Dementia with Lewy Bodies and Alzheimer’s disease. Some people are afflicted with both of these. The file LewyBody2Groups includes the variable Type, which has two levels: “DLB/AD” for the 20 subjects with both types of dementia and “DLB” for the 19 subjects with only Lewy Body dementia. The variable MMSE measures change in functional performance on the Mini Mental State Examination. We are interested in using MMSE to predict whether or not Alzheimer’s disease is present. A fitted logistic model is
\[ \log\left(\frac{\hat{\pi}}{1-\hat{\pi}}\right) = -0.742 - 0.294\,\mathrm{MMSE} \]
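The fitted model is stated on the log-odds scale. Exponentiating both sides gives the estimated odds, and solving for \(\hat{\pi}\) gives the estimated probability. This restatement is derived from the fitted equation above and adds no new estimates:

\[
\frac{\hat{\pi}}{1-\hat{\pi}} = e^{-0.742 - 0.294\,\mathrm{MMSE}},
\qquad
\hat{\pi} = \frac{e^{-0.742 - 0.294\,\mathrm{MMSE}}}{1 + e^{-0.742 - 0.294\,\mathrm{MMSE}}}
\]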
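# Estimated odds at MMSE = -4
exp(-0.742-0.294*(-4))
## [1] 1.543419
# Estimated probability at MMSE = -4: odds/(1 + odds)
(exp(-0.742-0.294*(-4)))/(1+exp(-0.742-0.294*(-4)))
## [1] 0.6068284
# Change in estimated odds when MMSE goes from -4 to -3
o_4 = exp(-0.742-0.294*(-4))
o_3 = exp(-0.742-0.294*(-3))
o_3-o_4
## [1] -0.3931451
# Change in estimated probability when MMSE goes from -4 to -3
p_4 = (exp(-0.742-0.294*(-4)))/(1+exp(-0.742-0.294*(-4)))
p_3 = (exp(-0.742-0.294*(-3)))/(1+exp(-0.742-0.294*(-3)))
p_3-p_4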
## [1] -0.07188548
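The same quantities can be obtained with less repetition by wrapping the fitted equation in functions. This is a sketch under the assumption that the coefficients are exactly the rounded values printed in the fitted model; `fitted_odds` and `fitted_prob` are hypothetical names, while `plogis()` is base R's inverse logit.

```r
b0 <- -0.742   # intercept from the fitted model
b1 <- -0.294   # slope for MMSE from the fitted model

fitted_odds <- function(mmse) exp(b0 + b1 * mmse)      # estimated odds
fitted_prob <- function(mmse) plogis(b0 + b1 * mmse)   # estimated probability

fitted_odds(-4)
fitted_prob(-4)
fitted_odds(-3) - fitted_odds(-4)   # change in odds from MMSE = -4 to -3
fitted_prob(-3) - fitted_prob(-4)   # change in probability from MMSE = -4 to -3
fitted_odds(-3) / fitted_odds(-4)   # odds ratio per one-unit increase, equal to exp(b1)
```

The differences in odds and probability depend on the starting MMSE value, whereas the odds ratio `exp(b1)` is the same for any one-unit increase in MMSE.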