library(tidyverse)
library(Stat2Data)
library(skimr)
library(agricolae)

6.36 Alfalfa sprouts

Some students were interested in how an acidic environment might affect the growth of plants. They planted alfalfa seed in 15 cups and randomly chose five to get plain water, five to get a moderate amount of acid (1.5M HCl), and five to get a stronger acid solution (3.0M HCl). The plants were grown in an indoor room, so the students assumed that the distance from the main source of daylight (a window) might have an effect on growth rates. For this reason, they arranged the cups in five rows of three, with one cup from each Acid level in each row. These are labeled in the dataset as Row: a = farthest from the window through e = nearest to the window. Each cup was an experimental unit, and the response variable was the average height of the alfalfa sprouts in each cup after four days (Ht4). The data are shown in the table below and stored in the Alfalfa file.

table <- matrix(c(1.45,2.79,1.93,2.33,4.85,1.00,0.70,1.37,2.80,1.46,1.03,1.22,0.45,1.65,1.07),ncol=5,byrow=TRUE)
colnames(table) <- c("a", "b", "c", "d", "e")
rownames(table) <- c("water","1.5 HCl","3.0 HCl")
table <- as.table(table)
table
##            a    b    c    d    e
## water   1.45 2.79 1.93 2.33 4.85
## 1.5 HCl 1.00 0.70 1.37 2.80 1.46
## 3.0 HCl 1.03 1.22 0.45 1.65 1.07
  1. Find the means for each row of cups (a, b, …, e) and each treatment (water, 1.5 HCl, 3.0 HCl). Also find the average and standard deviation for the growth in all 15 cups.
table <- matrix(c(1.45,2.79,1.93,2.33,4.85,2.67,1.00,0.70,1.37,2.80,1.46,1.466,1.03,1.22,0.45,1.65,1.07,1.084,1.16,1.57,1.25,2.26,2.46,1.74),ncol=6,byrow=TRUE)
colnames(table) <- c("a", "b", "c", "d", "e","Avg")
rownames(table) <- c("water","1.5 HCl","3.0 HCl","Avg")
table <- as.table(table)
table
##             a     b     c     d     e   Avg
## water   1.450 2.790 1.930 2.330 4.850 2.670
## 1.5 HCl 1.000 0.700 1.370 2.800 1.460 1.466
## 3.0 HCl 1.030 1.220 0.450 1.650 1.070 1.084
## Avg     1.160 1.570 1.250 2.260 2.460 1.740
data("Alfalfa")
mydata <- Alfalfa$Ht4
mymean <- mean(mydata)
mysd <- sd(mydata)
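The averages in the table above were typed in by hand. As a cross-check, here is a minimal base R sketch that computes the same margins from the raw heights. The matrix `ht` and the names `trt_means`, `blk_means`, and `with_avg` are introduced here for illustration only; the values are copied from the problem's table rather than read from the Alfalfa file.

```r
# Heights (Ht4) copied from the table in the problem statement
ht <- matrix(c(1.45, 2.79, 1.93, 2.33, 4.85,
               1.00, 0.70, 1.37, 2.80, 1.46,
               1.03, 1.22, 0.45, 1.65, 1.07),
             ncol = 5, byrow = TRUE,
             dimnames = list(c("water", "1.5 HCl", "3.0 HCl"),
                             c("a", "b", "c", "d", "e")))

trt_means  <- rowMeans(ht)        # one mean per acid treatment
blk_means  <- colMeans(ht)        # one mean per row of cups (block)
grand_mean <- mean(ht)            # average of all 15 cups
grand_sd   <- sd(as.vector(ht))   # standard deviation of all 15 cups

# Append the averages as an extra column and an extra row
with_avg <- rbind(cbind(ht, Avg = trt_means),
                  Avg = c(blk_means, grand_mean))
round(with_avg, 3)
```

Computing the margins this way avoids transcription errors and keeps the table in sync with the data if a value is corrected.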
  1. Construct a two-way main effects ANOVA table for testing for differences in average growth due to the acid treatments using the rows as a blocking variable.
data(Alfalfa)
a1 <- aov(Ht4~Acid + Row,data=Alfalfa)
summary(a1)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## Acid         2  6.852   3.426   4.513 0.0487 *
## Row          4  4.183   1.046   1.378 0.3235  
## Residuals    8  6.072   0.759                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
t_water = 2.67-mymean
t_hcl1 = 1.466-mymean
t_hcl3 = 1.084-mymean
b_a = 1.16-mymean
b_b = 1.57-mymean
b_c = 1.25-mymean
b_d = 2.26-mymean
b_e = 2.46-mymean

# SSA = J * sum((ybar_i - ybar)^2), with J = 5 blocks per treatment
SSA = 5*((t_water^2)+(t_hcl1^2)+(t_hcl3^2))

# SSB = I * sum((ybar_j - ybar)^2), with I = 3 treatments per block
SSB = 3*((b_a^2)+(b_b^2)+(b_c^2)+(b_d^2)+(b_e^2))

# SST = (n-1)*s_y^2
SST = ((3*5)-1)*(mysd^2)

# SSE = SST - SSA - SSB
SSE = SST-SSA-SSB

# MSA = SSA/(I-1)
MSA = SSA/2

# MSB = SSB/(J-1)
MSB = SSB/4

# MSE = SSE/((I-1)(J-1))
MSE = SSE/8

# F=MSA/MSE or F=MSB/MSE
F1 = MSA/MSE
F2 = MSB/MSE

# p-value
pf(F1, 2, 8, lower.tail=FALSE)
## [1] 0.04873771
pf(F2, 4, 8, lower.tail=FALSE)
## [1] 0.3235159
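|           | Df | Sum Sq | Mean Sq | F value | Pr(\(>\)F) |
|-----------|----|--------|---------|---------|------------|
| Acid      | 2  | 6.852  | 3.426   | 4.513   | 0.0487     |
| Row       | 4  | 4.18   | 1.046   | 1.378   | 0.3235     |
| Residuals | 8  | 6.07   | 0.759   |         |            |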
  1. Check the conditions required for the ANOVA model.
plot(a1,which=1)

plot(a1,which=2)
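The two plots can be complemented with rough numeric checks. The sketch below is self-contained: it rebuilds the data from the problem's table instead of loading Alfalfa, so the data frame `alf` and its factor labels are assumptions and may not match the labels in the dataset. It compares the treatment standard deviations (a common rule of thumb treats a largest-to-smallest ratio under about 2 as acceptable) and runs a Shapiro-Wilk test on the residuals. With only 15 cups, neither check is conclusive.

```r
# Heights copied from the problem's table; factor labels are assumed
alf <- data.frame(
  Ht4  = c(1.45, 2.79, 1.93, 2.33, 4.85,
           1.00, 0.70, 1.37, 2.80, 1.46,
           1.03, 1.22, 0.45, 1.65, 1.07),
  Acid = factor(rep(c("water", "1.5HCl", "3.0HCl"), each = 5)),
  Row  = factor(rep(c("a", "b", "c", "d", "e"), times = 3))
)

fit <- aov(Ht4 ~ Acid + Row, data = alf)
res <- residuals(fit)

# Equal-variance check: largest treatment SD divided by smallest
group_sd <- tapply(alf$Ht4, alf$Acid, sd)
sd_ratio <- max(group_sd) / min(group_sd)
sd_ratio

# Normality check on the residuals, a companion to the normal Q-Q plot
sw <- shapiro.test(res)
sw$p.value
```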

  1. Based on the ANOVA, would you conclude that there is a significant difference in average growth due to the treatments? Explain why or why not.
  1. Based on the ANOVA, would you conclude that there is a significant difference in average growth due to the distance from the window? Explain why or why not.

8.14 Sea slugs

Sea slugs, common on the coast of Southern California, live on vaucherian seaweed. But the larvae from these sea slugs need to locate this type of seaweed to survive. A study was done to try to determine whether chemicals that leach out of the seaweed attract the larvae. Seawater was collected over a patch of this kind of seaweed at 5-minute intervals as the tide was coming in and, presumably, mixing with the chemicals. The idea was that as more seawater came in, the concentration of the chemicals was reduced. Each sample of water was divided into six parts. Larvae were then introduced to this seawater to see what percentage metamorphosed. Is there a difference in this percentage over the five time periods? Open the dataset SeaSlugs.

  1. Use Fisher’s LSD intervals to find any differences that exist between the percent of larvae that metamorphosed in the different water conditions.
data("SeaSlugs")
a1 <- aov(Percent~factor(Time), data=SeaSlugs)
anova(a1)
## Analysis of Variance Table
## 
## Response: Percent
##              Df  Sum Sq  Mean Sq F value    Pr(>F)    
## factor(Time)  5 0.63091 0.126182  5.9648 0.0006067 ***
## Residuals    30 0.63464 0.021155                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(LSD.test(a1,"factor(Time)"))
## $statistics
##      MSerror Df      Mean       CV  t.value       LSD
##   0.02115452 30 0.2716667 53.53838 2.042272 0.1714963
## 
## $parameters
##         test p.ajusted       name.t ntr alpha
##   Fisher-LSD      none factor(Time)   6  0.05
## 
## $means
##      Percent       std r        LCL       UCL   Min   Max     Q25    Q50
## 0  0.5356667 0.1687859 6 0.41440050 0.6569328 0.357 0.857 0.47525 0.5000
## 10 0.1776667 0.1238881 6 0.05640050 0.2989328 0.067 0.333 0.08350 0.1330
## 15 0.1833333 0.1470397 6 0.06206716 0.3045995 0.000 0.333 0.05350 0.2405
## 20 0.2191667 0.1383914 6 0.09790050 0.3404328 0.067 0.437 0.10775 0.2335
## 25 0.1686667 0.1484650 6 0.04740050 0.2899328 0.000 0.412 0.08350 0.1330
## 5  0.3455000 0.1423921 6 0.22423383 0.4667662 0.125 0.467 0.26050 0.4000
##        Q75
## 0  0.52475
## 10 0.28300
## 15 0.28125
## 20 0.26700
## 25 0.23350
## 5  0.45025
## 
## $comparison
## NULL
## 
## $groups
##      Percent groups
## 0  0.5356667      a
## 5  0.3455000      b
## 20 0.2191667     bc
## 15 0.1833333     bc
## 10 0.1776667     bc
## 25 0.1686667      c
## 
## attr(,"class")
## [1] "group"
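The LSD value reported above can be reproduced by hand, which makes the grouping letters easier to read. This is a sketch using numbers copied from the `LSD.test` output (MSerror, error Df, six replicates per time, group means); the variable names are introduced here for illustration.

```r
# Values copied from the LSD.test output above
mse    <- 0.02115452   # MSerror
df_err <- 30           # error degrees of freedom
n_per  <- 6            # replicates per time period (column r)

# LSD = t* x sqrt(MSE * (1/n_i + 1/n_j)), with equal group sizes here
t_crit <- qt(1 - 0.05 / 2, df_err)
lsd    <- t_crit * sqrt(mse * (1 / n_per + 1 / n_per))
lsd

# Group means from the $means table, named by time
means <- c("0" = 0.5356667, "5" = 0.3455000, "10" = 0.1776667,
           "15" = 0.1833333, "20" = 0.2191667, "25" = 0.1686667)

# TRUE where two means differ by more than the LSD
sig <- abs(outer(means, means, "-")) > lsd
sig
```

Any pair marked TRUE corresponds to two times that do not share a letter in the `$groups` table.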
  1. Use Tukey’s HSD intervals to find any differences that exist between the percent of larvae that metamorphosed in the different water conditions.
a1 <- aov(Percent~factor(Time), data=SeaSlugs)
TukeyHSD(a1)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Percent ~ factor(Time), data = SeaSlugs)
## 
## $`factor(Time)`
##               diff        lwr         upr     p adj
## 5-0   -0.190166667 -0.4455792  0.06524590 0.2397208
## 10-0  -0.358000000 -0.6134126 -0.10258743 0.0023231
## 15-0  -0.352333333 -0.6077459 -0.09692077 0.0027831
## 20-0  -0.316500000 -0.5719126 -0.06108743 0.0085222
## 25-0  -0.367000000 -0.6224126 -0.11158743 0.0017407
## 10-5  -0.167833333 -0.4232459  0.08757923 0.3666256
## 15-5  -0.162166667 -0.4175792  0.09324590 0.4038772
## 20-5  -0.126333333 -0.3817459  0.12907923 0.6641386
## 25-5  -0.176833333 -0.4322459  0.07857923 0.3114499
## 15-10  0.005666667 -0.2497459  0.26107923 0.9999998
## 20-10  0.041500000 -0.2139126  0.29691257 0.9960188
## 25-10 -0.009000000 -0.2644126  0.24641257 0.9999978
## 20-15  0.035833333 -0.2195792  0.29124590 0.9980127
## 25-15 -0.014666667 -0.2700792  0.24074590 0.9999748
## 25-20 -0.050500000 -0.3059126  0.20491257 0.9901287
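Every Tukey interval above has the same half-width because the groups are the same size. A minimal sketch of that margin, using MSE and degrees of freedom copied from the ANOVA table (variable names are illustrative):

```r
# Values copied from the ANOVA output for the SeaSlugs fit
mse    <- 0.02115452   # mean square error
df_err <- 30           # error degrees of freedom
k      <- 6            # number of time periods compared
n_per  <- 6            # replicates per time period

# HSD = (q* / sqrt(2)) x sqrt(MSE * (1/n_i + 1/n_j))
q_crit <- qtukey(0.95, nmeans = k, df = df_err)
hsd    <- q_crit / sqrt(2) * sqrt(mse * (2 / n_per))
hsd
```

A pair of means is flagged by Tukey's method only when their difference exceeds `hsd`, which is larger than the Fisher LSD of 0.1715 reported by `LSD.test`.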
  1. Were your conclusions to (a) and (b) different? Explain. If so, which would you prefer to use in this case and why?

9.4 Probability to odds

  1. If the probability of an event occurring is 0.8, what are the odds?
0.8/(1-0.8)
## [1] 4
  1. If the probability of an event occurring is 0.25, what are the odds?
0.25/(1-0.25)
## [1] 0.3333333
  1. If the probability of an event occurring is 0.6, what are the odds?
0.6/(1-0.6)
## [1] 1.5
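The three calculations above all apply odds = p/(1 - p). A small helper, named `prob_to_odds` here purely for illustration, handles them in one vectorized call:

```r
# Convert probabilities to odds: odds = p / (1 - p)
prob_to_odds <- function(p) {
  stopifnot(all(p >= 0 & p < 1))  # odds are undefined at p = 1
  p / (1 - p)
}

odds <- prob_to_odds(c(0.8, 0.25, 0.6))
odds
```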

9.6 Odds to probabilities

  1. If the odds of an event occurring are 1:3, what is the probability?
1/((1/(1/3))+1)
## [1] 0.25
  1. If the odds of an event occurring are 5:2, what is the probability?
1/((1/(5/2))+1)
## [1] 0.7142857
  1. If the odds of an event occurring are 1:9, what is the probability?
1/((1/(1/9))+1)
## [1] 0.1
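Odds stated as a:b convert to a probability of a/(a + b), which is the same as odds/(1 + odds). A sketch with a hypothetical helper `odds_to_prob` that takes the two sides of the ratio:

```r
# Convert odds of a:b in favor of an event to a probability
odds_to_prob <- function(a, b) {
  stopifnot(all(a >= 0), all(b > 0))
  a / (a + b)
}

probs <- odds_to_prob(a = c(1, 5, 1), b = c(3, 2, 9))
probs
```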

9.22 Dementia: Odds and probability

Two types of dementia are Dementia with Lewy Bodies and Alzheimer’s disease. Some people are afflicted with both of these. The file LewyBody2Groups includes the variable Type, which has two levels: “DLB/AD” for the 20 subjects with both types of dementia and “DLB” for the 19 subjects with only Lewy Body dementia. The variable MMSE measures change in functional performance on the Mini Mental State Examination. We are interested in using MMSE to predict whether or not Alzheimer’s disease is present. A fitted logistic model is

\[ \log\left(\frac{\hat{\pi}}{1-\hat{\pi}}\right) = -0.742 - 0.294\,MMSE \]

  1. Use this model to estimate the odds of Alzheimer’s disease, \(\pi/(1-\pi)\), if a patient’s \(MMSE\) is -4.
exp(-0.742-0.294*(-4))
## [1] 1.543419
  1. Use this model to estimate the probability of Alzheimer’s disease if a patient’s \(MMSE\) is -4.
(exp(-0.742-0.294*(-4)))/(1+exp(-0.742-0.294*(-4)))
## [1] 0.6068284
  1. How much do the estimated odds change if the \(MMSE\) changes from -4 to -3?
o_4 = exp(-0.742-0.294*(-4))
o_3 = exp(-0.742-0.294*(-3))
o_3-o_4
## [1] -0.3931451
  1. How much does the estimate of \(\pi\) change if the \(MMSE\) changes from -4 to -3?
p_4 = (exp(-0.742-0.294*(-4)))/(1+exp(-0.742-0.294*(-4)))
p_3 = (exp(-0.742-0.294*(-3)))/(1+exp(-0.742-0.294*(-3)))
p_3-p_4
## [1] -0.07188548
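The four answers above repeat the same expression, so it can help to wrap the fitted model in functions. The sketch below uses the coefficients from the fitted equation; the function names `log_odds`, `est_odds`, and `est_prob` are introduced here for illustration. It also shows that a one-unit increase in MMSE multiplies the estimated odds by exp(-0.294), whatever the starting value, whereas the change in probability depends on where you start.

```r
# Fitted model: log-odds = -0.742 - 0.294 * MMSE
log_odds <- function(mmse) -0.742 - 0.294 * mmse
est_odds <- function(mmse) exp(log_odds(mmse))
est_prob <- function(mmse) plogis(log_odds(mmse))  # plogis(x) = exp(x) / (1 + exp(x))

est_odds(-4)
est_prob(-4)

# Additive changes when MMSE goes from -4 to -3
odds_diff <- est_odds(-3) - est_odds(-4)
prob_diff <- est_prob(-3) - est_prob(-4)
odds_diff
prob_diff

# Multiplicative change in odds: the odds ratio for a one-unit increase
odds_ratio <- est_odds(-3) / est_odds(-4)
odds_ratio
```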