1. Problem and Data Set
A researcher is conducting an experiment to understand the joint effects of temperature and humidity on the growth of a specific plant species. Temperature has two levels, Low (20°C) and High (30°C), and humidity also has two levels, Low (40%) and High (80%). Each combination of temperature and humidity will be tested on 16 plant samples. For each combination, the researcher measures the height of 12 randomly selected plants from the 3 different lots after two weeks. The growth, in centimeters, was recorded as follows.
| Temperature | Humidity | Lot | Plant Growth |
|---|---|---|---|
| Low | Low | 1 | 12.5 |
| Low | High | 1 | 15.2 |
| High | Low | 1 | 14.8 |
| High | High | 1 | 18.3 |
| Low | Low | 2 | 12.8 |
| Low | High | 2 | 16.3 |
| High | Low | 2 | 13.4 |
| High | High | 2 | 17.9 |
| Low | Low | 3 | 13.0 |
| Low | High | 3 | 14.5 |
| High | Low | 3 | 14.0 |
| High | High | 3 | 16.5 |
2. Write your experimental question.
Is there a significant effect of temperature, of humidity, and of their interaction on the growth of a specific plant species at \(α=0.05\)?
3. Construct your null and alternative hypotheses.
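The report does not list the hypotheses under this item. One standard way to state them for a two-factor model with interaction is sketched below. The notation is an assumption, not taken from the report: \(\tau_i\) is the temperature effect, \(\beta_j\) the humidity effect, and \((\tau\beta)_{ij}\) their interaction. \(\tau\) is used instead of the more common \(\alpha\) to avoid a clash with the significance level.

```latex
\begin{aligned}
y_{ijk} &= \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \varepsilon_{ijk},
  \qquad i, j \in \{1, 2\},\; k \in \{1, 2, 3\} \\[4pt]
\text{Temperature:}\quad & H_0:\ \tau_1 = \tau_2 = 0
  \quad\text{vs.}\quad H_1:\ \text{at least one } \tau_i \neq 0 \\
\text{Humidity:}\quad & H_0:\ \beta_1 = \beta_2 = 0
  \quad\text{vs.}\quad H_1:\ \text{at least one } \beta_j \neq 0 \\
\text{Interaction:}\quad & H_0:\ (\tau\beta)_{ij} = 0 \text{ for all } i, j
  \quad\text{vs.}\quad H_1:\ \text{at least one } (\tau\beta)_{ij} \neq 0
\end{aligned}
```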
4. Fit a full factorial model with interaction to the data.
Temperature Humidity Lot Plantgrowth
1 Low Low 1 12.5
2 Low Low 2 12.8
3 Low Low 3 13.0
4 Low High 1 15.2
5 Low High 2 16.3
6 Low High 3 14.5
7 High Low 1 14.8
8 High Low 2 13.4
9 High Low 3 14.0
10 High High 1 18.3
11 High High 2 17.9
12 High High 3 16.5
Call:
lm(formula = Plantgrowth ~ Temperature + Humidity + Temperature *
Humidity, data = lab5)
Residuals:
Min 1Q Median 3Q Max
-1.06667 -0.36667 -0.01667 0.43333 0.96667
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 17.5667 0.4353 40.360 1.56e-10 ***
TemperatureLow -2.2333 0.6155 -3.628 0.006702 **
HumidityLow -3.5000 0.6155 -5.686 0.000462 ***
TemperatureLow:HumidityLow 0.9333 0.8705 1.072 0.314917
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.7539 on 8 degrees of freedom
Multiple R-squared: 0.8922, Adjusted R-squared: 0.8517
F-statistic: 22.06 on 3 and 8 DF, p-value: 0.000318
5. Is there a significant interaction effect between temperature and humidity?
Df Sum Sq Mean Sq F value Pr(>F)
Temperature 1 9.363 9.363 16.48 0.003639 **
Humidity 1 27.603 27.603 48.57 0.000116 ***
Temperature:Humidity 1 0.653 0.653 1.15 0.314917
Residuals 8 4.547 0.568
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Based on the ANOVA table above, the interaction between temperature and humidity has a p-value of 0.314917, which is greater than 0.05. Hence, there is no significant interaction effect between temperature and humidity.
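The same conclusion can be cross-checked with a partial F-test that compares the additive model with the full factorial model. This is a minimal sketch, not the report's own code; the object names `fit_additive`, `fit_full`, and `cmp` are illustrative, and the data frame is rebuilt from the table in Section 1.

```r
# Rebuild the data from the table in Section 1.
lab5 <- data.frame(
  Temperature = rep(c("Low", "Low", "High", "High"), each = 3),
  Humidity    = rep(c("Low", "High", "Low", "High"), each = 3),
  Lot         = rep(1:3, times = 4),
  Plantgrowth = c(12.5, 12.8, 13.0, 15.2, 16.3, 14.5,
                  14.8, 13.4, 14.0, 18.3, 17.9, 16.5)
)

# Additive model (no interaction) versus the full factorial model.
fit_additive <- lm(Plantgrowth ~ Temperature + Humidity, data = lab5)
fit_full     <- lm(Plantgrowth ~ Temperature * Humidity, data = lab5)

# Partial F-test: does adding the interaction term reduce the residual
# sum of squares by more than chance would explain?
cmp <- anova(fit_additive, fit_full)
print(cmp)

# Extract the p-value of the interaction and compare it with alpha = 0.05.
interaction_p <- cmp[["Pr(>F)"]][2]
print(interaction_p > 0.05)
```

Because the interaction is the last term in the model, this partial F-test should agree with the Temperature:Humidity row of the ANOVA table.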
6. Is there evidence of confounding between main effects and lot effects?
Call:
lm(formula = Plantgrowth ~ Temperature + Humidity + Lot, data = lab5)
Residuals:
Min 1Q Median 3Q Max
-0.9000 -0.5417 0.1000 0.5792 0.8167
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 18.0333 0.6290 28.670 2.37e-09 ***
TemperatureLow -1.7667 0.4193 -4.213 0.00294 **
HumidityLow -3.0333 0.4193 -7.234 8.94e-05 ***
Lot -0.3500 0.2568 -1.363 0.21000
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.7263 on 8 degrees of freedom
Multiple R-squared: 0.8999, Adjusted R-squared: 0.8624
F-statistic: 23.98 on 3 and 8 DF, p-value: 0.0002368
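Based on the output above, the p-value for Lot is 0.21, which is greater than 0.05, so the lot effect is not statistically significant. Lot enters this model as a single numeric term because it is coded 1 to 3. In addition, each lot contains every temperature-humidity combination exactly once, so lot is balanced across the treatments. Thus, there is no evidence of confounding between the lot effects and the main effects.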
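A significance test on Lot is only an indirect check for confounding. A more direct one is to see whether lots are balanced across the treatment combinations and whether the main-effect estimates move when Lot is added. The sketch below does both. Treating Lot as a categorical blocking factor through `factor(Lot)` is an assumption made here; the report fits Lot as a number.

```r
# Rebuild the data from the table in Section 1.
lab5 <- data.frame(
  Temperature = rep(c("Low", "Low", "High", "High"), each = 3),
  Humidity    = rep(c("Low", "High", "Low", "High"), each = 3),
  Lot         = rep(1:3, times = 4),
  Plantgrowth = c(12.5, 12.8, 13.0, 15.2, 16.3, 14.5,
                  14.8, 13.4, 14.0, 18.3, 17.9, 16.5)
)

# Cross-tabulate lots against treatment combinations.
# A table of all ones means every lot received every combination once.
balance <- table(lab5$Lot, interaction(lab5$Temperature, lab5$Humidity))
print(balance)

# Main-effects models without Lot and with Lot as a blocking factor.
fit_no_lot <- lm(Plantgrowth ~ Temperature + Humidity, data = lab5)
fit_lot    <- lm(Plantgrowth ~ Temperature + Humidity + factor(Lot), data = lab5)

# If Lot were confounded with a treatment, these estimates would shift.
main_terms <- c("TemperatureLow", "HumidityLow")
print(rbind(without_lot = coef(fit_no_lot)[main_terms],
            with_lot    = coef(fit_lot)[main_terms]))
```

With a balanced layout, the two rows of estimates should coincide, which is what orthogonality between lot and the treatments implies.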
7. How would you modify the design to remove confounding?
Several design strategies can minimize confounding and strengthen causal conclusions. One of these is blocking: experimental units with shared characteristics are grouped into blocks, and treatments are randomly assigned within each block. In this case, however, there is no sufficient evidence of confounding between the main effects and the lot effects, so there is no need to modify the design.
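For illustration only, the blocking strategy described above could be laid out as follows if lots were used as blocks. This is a hypothetical randomization sketch, not part of the original analysis; the seed, the `Position` column, and the object names are all assumptions.

```r
set.seed(123)  # hypothetical seed, used only to make the layout reproducible

# The four temperature-humidity combinations.
treatments <- expand.grid(
  Temperature = c("Low", "High"),
  Humidity    = c("Low", "High"),
  stringsAsFactors = FALSE
)

# Within each lot (block), randomize the order of the four combinations.
layout <- do.call(rbind, lapply(1:3, function(lot) {
  shuffled <- treatments[sample(nrow(treatments)), ]
  rownames(shuffled) <- NULL
  cbind(Lot = lot, Position = seq_len(nrow(shuffled)), shuffled)
}))
rownames(layout) <- NULL
print(layout)
```

Every lot receives each combination exactly once, so lot-to-lot differences cannot be mistaken for a temperature or humidity effect.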
8. Compare the precision of estimating main effects under the current vs modified design.
Since there is no need to modify the design, we discuss only the precision of estimating the main effects under the current design. The temperature and humidity coefficients have small standard errors relative to their estimates (0.6155 each in the full factorial model and 0.4193 each in the model with Lot), which indicates that these effects are estimated precisely. The residual standard error is also small (0.7539 and 0.7263, respectively). A small residual standard error implies that the fitted values lie close to the observed values, indicating a good fit.
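To put the precision comparison in one place, the sketch below collects the main-effect standard errors and the residual standard error from three main-effects models: without Lot, with Lot as a number (as in the report), and with Lot as a blocking factor. The third model and all object names are assumptions added for illustration.

```r
# Rebuild the data from the table in Section 1.
lab5 <- data.frame(
  Temperature = rep(c("Low", "Low", "High", "High"), each = 3),
  Humidity    = rep(c("Low", "High", "Low", "High"), each = 3),
  Lot         = rep(1:3, times = 4),
  Plantgrowth = c(12.5, 12.8, 13.0, 15.2, 16.3, 14.5,
                  14.8, 13.4, 14.0, 18.3, 17.9, 16.5)
)

fits <- list(
  additive    = lm(Plantgrowth ~ Temperature + Humidity, data = lab5),
  lot_numeric = lm(Plantgrowth ~ Temperature + Humidity + Lot, data = lab5),
  lot_block   = lm(Plantgrowth ~ Temperature + Humidity + factor(Lot), data = lab5)
)

# For each model: standard errors of the two main effects, the residual
# standard error, and the residual degrees of freedom.
main_terms <- c("TemperatureLow", "HumidityLow")
precision <- t(sapply(fits, function(fit) {
  se <- coef(summary(fit))[main_terms, "Std. Error"]
  c(se, resid_se = sigma(fit), resid_df = df.residual(fit))
}))
print(round(precision, 4))
```

Adding a lot term trades residual degrees of freedom for a possibly smaller residual variance, so whether it improves precision depends on how much variation the lots explain.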
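9. Provide interpretations of your findings.
Based on the ANOVA table, the p-values of temperature (0.003639) and humidity (0.000116) are both less than 0.05, and their F-values of 16.4751 and 48.5689 are greater than the tabulated F = 5.32. Thus, we reject the null hypotheses for both factors: the main effects of temperature and humidity are statistically significant. The negative coefficients for the Low levels indicate that growth was higher at high temperature and at high humidity. For the interaction, the p-value of 0.314917 is greater than 0.05 and its F-value of 1.15 is less than 5.32, so we fail to reject the null hypothesis. This suggests that the interaction effect between temperature and humidity is not statistically significant.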
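The tabulated F used above can be reproduced with `qf()`, using 1 numerator and 8 denominator degrees of freedom as in the ANOVA table. The sketch below is illustrative; the F-values are copied from the ANOVA output rather than recomputed, and the object names are assumptions.

```r
alpha  <- 0.05
f_crit <- qf(1 - alpha, df1 = 1, df2 = 8)  # upper 5% point of F(1, 8)

# F-values as reported in the ANOVA table.
f_obs <- c(Temperature = 16.48,
           Humidity = 48.57,
           `Temperature:Humidity` = 1.15)

# Reject H0 when the observed F exceeds the critical value.
decision <- ifelse(f_obs > f_crit, "reject H0", "fail to reject H0")
print(round(f_crit, 2))
print(decision)
```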
10. Give details on the syntax used to produce your answer.
This code inserts the picture of the given data.
knitr::include_graphics("EX1.png", error = FALSE)
The data were then entered into an Excel file and read into R.
library(readxl)
LAB5 <- read_xlsx("D:/stat//lab5.xlsx")
The same file is read again and displayed as a styled HTML table.
library(readxl)
library(kableExtra)
A <- read_xlsx("D:/stat//LAB5.xlsx")
# Header row in bold on a light gray background; rows 1 to 4 on white.
knitr::kable(A, format = "html") %>%
  kable_styling(full_width = FALSE) %>%
  row_spec(0, bold = TRUE, color = "black", background = "lightgray") %>%
  row_spec(1:4, background = "white")
| Temperature | Humidity | Lot | Plant Growth |
|---|---|---|---|
| Low | Low | 1 | 12.5 |
| Low | High | 1 | 15.2 |
| High | Low | 1 | 14.8 |
| High | High | 1 | 18.3 |
| Low | Low | 2 | 12.8 |
| Low | High | 2 | 16.3 |
| High | Low | 2 | 13.4 |
| High | High | 2 | 17.9 |
| Low | Low | 3 | 13.0 |
| Low | High | 3 | 14.5 |
| High | Low | 3 | 14.0 |
| High | High | 3 | 16.5 |
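For convenience, we also built the same data directly as a data.frame.
# Rows are ordered by treatment combination, with lots 1 to 3 inside each.
# Lot is stored as a number, not as a factor.
lab5 <- data.frame(
  Temperature = rep(c("Low", "Low", "High", "High"), each = 3),
  Humidity    = rep(c("Low", "High", "Low", "High"), each = 3),
  Lot         = rep(1:3, times = 4),
  Plantgrowth = c(12.5, 12.8, 13.0, 15.2, 16.3, 14.5,
                  14.8, 13.4, 14.0, 18.3, 17.9, 16.5)
)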
lab5
Temperature Humidity Lot Plantgrowth
1 Low Low 1 12.5
2 Low Low 2 12.8
3 Low Low 3 13.0
4 Low High 1 15.2
5 Low High 2 16.3
6 Low High 3 14.5
7 High Low 1 14.8
8 High Low 2 13.4
9 High Low 3 14.0
10 High High 1 18.3
11 High High 2 17.9
12 High High 3 16.5
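We use lm() to fit the full factorial model. In R's formula syntax, Temperature*Humidity already expands to both main effects plus their interaction, so the explicit main-effect terms are redundant but harmless.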
C<-lm(Plantgrowth ~ Temperature + Humidity + Temperature*Humidity, lab5)
summary(C)
Call:
lm(formula = Plantgrowth ~ Temperature + Humidity + Temperature *
Humidity, data = lab5)
Residuals:
Min 1Q Median 3Q Max
-1.06667 -0.36667 -0.01667 0.43333 0.96667
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 17.5667 0.4353 40.360 1.56e-10 ***
TemperatureLow -2.2333 0.6155 -3.628 0.006702 **
HumidityLow -3.5000 0.6155 -5.686 0.000462 ***
TemperatureLow:HumidityLow 0.9333 0.8705 1.072 0.314917
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.7539 on 8 degrees of freedom
Multiple R-squared: 0.8922, Adjusted R-squared: 0.8517
F-statistic: 22.06 on 3 and 8 DF, p-value: 0.000318
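Because R picks "High" as the reference level for both factors (alphabetical order), the coefficients in this summary are differences from the High/High cell and not main effects in the ANOVA sense. The sketch below, with illustrative object names, rebuilds the four cell means from the coefficients to make that reading concrete.

```r
# Rebuild the data from the table in Section 1.
lab5 <- data.frame(
  Temperature = rep(c("Low", "Low", "High", "High"), each = 3),
  Humidity    = rep(c("Low", "High", "Low", "High"), each = 3),
  Lot         = rep(1:3, times = 4),
  Plantgrowth = c(12.5, 12.8, 13.0, 15.2, 16.3, 14.5,
                  14.8, 13.4, 14.0, 18.3, 17.9, 16.5)
)

fit <- lm(Plantgrowth ~ Temperature * Humidity, data = lab5)
b <- coef(fit)

# Names read as Temperature.Humidity.
cell_means_from_coef <- c(
  High.High = b[["(Intercept)"]],                          # reference cell
  Low.High  = b[["(Intercept)"]] + b[["TemperatureLow"]],  # lower temperature only
  High.Low  = b[["(Intercept)"]] + b[["HumidityLow"]],     # lower humidity only
  Low.Low   = sum(b)                                       # both, plus interaction
)

# Observed cell means (rows: Temperature, columns: Humidity).
cell_means_observed <- with(
  lab5, tapply(Plantgrowth, list(Temperature, Humidity), mean)
)
print(round(cell_means_from_coef, 4))
print(round(cell_means_observed, 4))
```

The intercept is the mean growth at high temperature and high humidity, and each remaining coefficient adjusts that mean for one or both factors being at the Low level.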
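We use aov() and summary() to get the ANOVA table.
# The result is stored as anova_fit because `anova` is also the name of an R function.
anova_fit <- aov(C)
summary(anova_fit)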
Df Sum Sq Mean Sq F value Pr(>F)
Temperature 1 9.363 9.363 16.48 0.003639 **
Humidity 1 27.603 27.603 48.57 0.000116 ***
Temperature:Humidity 1 0.653 0.653 1.15 0.314917
Residuals 8 4.547 0.568
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
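We use lm() to fit the main-effects model with Lot added, to check for lot effects. Because Lot is stored as a number, it enters the model as a single numeric term.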
B <- lm(Plantgrowth ~ Temperature + Humidity + Lot, data = lab5)
summary(B)
Call:
lm(formula = Plantgrowth ~ Temperature + Humidity + Lot, data = lab5)
Residuals:
Min 1Q Median 3Q Max
-0.9000 -0.5417 0.1000 0.5792 0.8167
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 18.0333 0.6290 28.670 2.37e-09 ***
TemperatureLow -1.7667 0.4193 -4.213 0.00294 **
HumidityLow -3.0333 0.4193 -7.234 8.94e-05 ***
Lot -0.3500 0.2568 -1.363 0.21000
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.7263 on 8 degrees of freedom
Multiple R-squared: 0.8999, Adjusted R-squared: 0.8624
F-statistic: 23.98 on 3 and 8 DF, p-value: 0.0002368