The data analyzed in this recipe was from a study completed by researchers at the Department of Resources and Environmental Science at Shihezi University in the People's Republic of China.
The experiment analyzed the effect that nitrogen application rate and irrigation water salinity have on the response variables of root distribution and growth of cotton under drip irrigation.
# Read in dataset as a .csv file.
recipe4 <- read.csv("C:/Users/braunj6/Documents/Fall 2014/Design of Experiments/recipe4.csv")
x<-recipe4
# Recategorize Year (read in as an integer) as a factor so it can be used as the blocking variable
x$Year <- as.factor(x$Year)
The experiment was completed using a 3x2 factorial, completely randomized block design. It is 3x2 because the first factor, irrigation water salinity, has three levels (fresh, brackish, and saline), while the second factor, rate of nitrogen application, has two levels (0 and 360 kg N/ha).
head(x)
## N.Rate Water.Salinity Year Shoot.Biomass.Total Root.shoot.ratio
## 1 N0 FW 2011 12832 0.067
## 2 N0 BW 2011 9551 0.074
## 3 N0 SW 2011 6251 0.089
## 4 N0 FW 2012 12898 0.064
## 5 N0 BW 2012 8602 0.075
## 6 N0 SW 2012 5632 0.088
summary(x)
## N.Rate Water.Salinity Year Shoot.Biomass.Total Root.shoot.ratio
## N0 :6 BW:4 2011:6 Min. : 5632 Min. :0.0560
## N360:6 FW:4 2012:6 1st Qu.: 7435 1st Qu.:0.0663
## SW:4 Median :10469 Median :0.0735
## Mean :10366 Mean :0.0738
## 3rd Qu.:12848 3rd Qu.:0.0820
## Max. :14972 Max. :0.0890
# As seen below, there are 12 observations of 5 variables - 3 factors and 2 response variables.
str(x)
## 'data.frame': 12 obs. of 5 variables:
## $ N.Rate : Factor w/ 2 levels "N0","N360": 1 1 1 1 1 1 2 2 2 2 ...
## $ Water.Salinity : Factor w/ 3 levels "BW","FW","SW": 2 1 3 2 1 3 2 1 3 2 ...
## $ Year : Factor w/ 2 levels "2011","2012": 1 1 1 2 2 2 1 1 1 2 ...
## $ Shoot.Biomass.Total: int 12832 9551 6251 12898 8602 5632 14886 12568 7462 14972 ...
## $ Root.shoot.ratio : num 0.067 0.074 0.089 0.064 0.075 0.088 0.063 0.071 0.085 0.056 ...
The continuous variables in this dataset are the total shoot biomass, ranging from 5,632 to 14,972 kg/ha with a mean of 10,366 kg/ha, and the root/shoot ratio, ranging from 0.056 to 0.089 with a mean of 0.074.
The response variables analyzed in this dataset are these same continuous variables: the total shoot biomass (in kg/ha) and the root/shoot ratio, which is unitless.
The data is organized into 5 columns with 12 observations. N Rate's two treatment levels were 0 and 360 kg N/ha, represented by N0 and N360. Water Salinity's three treatment levels were fresh, brackish, and saline, represented by FW, BW, and SW.
This experiment was completed as a completely randomized block design for the three levels of water salinity and 2 levels of N Rate.
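The layout described above can be enumerated directly. The following is a minimal sketch, not the authors' code: it only rebuilds the treatment grid (no response values), and the object names `design`, `n_runs`, and `cell_counts` are hypothetical.

```r
# Enumerate every treatment combination of the 3 x 2 factorial,
# repeated in each year (the blocking variable).
design <- expand.grid(
  Water.Salinity = c("FW", "BW", "SW"),
  N.Rate         = c("N0", "N360"),
  Year           = c("2011", "2012"),
  stringsAsFactors = TRUE
)

# Total number of runs: 3 salinity levels x 2 N rates x 2 years.
n_runs <- nrow(design)

# Number of observations in each N.Rate x Water.Salinity cell
# (one per year, so every cell should hold the same count).
cell_counts <- table(design$N.Rate, design$Water.Salinity)

print(design)
print(cell_counts)
```

A balanced grid like this is what gives each salinity level 4 observations and each N rate 6 observations in the `summary(x)` output.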
The analysis uses a two-factor analysis of variance. N Rate and Water Salinity are the factors, and the analysis measures the variation in both Total Shoot Biomass and Root-Shoot Ratio among the treatment groups.
The rationale for this type of analysis is that the goal was to compare the means of the groups. The ANOVA was set up to include the interaction between the two factors, with year as the blocking variable. This makes it possible to compare the variation between groups with the variation within groups.
Treatments were randomized, in keeping with the completely randomized block design described above.
The experiment had repeated measures: all treatment combinations were run in both 2011 and 2012.
Blocking was used for year (2011 or 2012).
attach(x)
boxplot(Shoot.Biomass.Total ~ N.Rate, main = "Total Shoot Biomass by N Rate", xlab = "N Rate (kg N/ha)", ylab = "Total Shoot Biomass (kg/ha)")
# There appears to be a slight difference in medians between the two groups, with N360 producing a slightly greater biomass than N0.
boxplot(Shoot.Biomass.Total ~ Water.Salinity, main = "Total Shoot Biomass by Water Salinity", xlab = "Water Salinity (ds/m)", ylab = "Total Shoot Biomass (kg/ha)")
# Water salinity appears to affect biomass. Fresh water has the greatest biomass, while saline water has the smallest.
boxplot(Shoot.Biomass.Total ~ Year, main = "Total Shoot Biomass by Year", xlab = "Year", ylab = "Total Shoot Biomass (kg/ha)")
# Year is the blocking variable in the experiment and does not appear to have any effect on biomass.
boxplot(Root.shoot.ratio ~ N.Rate, main = "Root-Shoot Ratio by N Rate", xlab = "N Rate (kg N/ha)", ylab = "Root-Shoot Ratio")
# N rate only slightly affects the root-shoot ratio, and N0 appears to lead to a greater ratio than N360.
boxplot(Root.shoot.ratio ~ Water.Salinity, main = "Root-Shoot Ratio by Water Salinity", xlab = "Water Salinity (ds/m)", ylab = "Root-Shoot Ratio")
# Water salinity appears to affect root-shoot ratio. Fresh water leads to the smallest root-shoot ratio, while saline water leads to the greatest.
boxplot(Root.shoot.ratio ~ Year, main = "Root-Shoot Ratio by Year", xlab = "Year", ylab = "Root-Shoot Ratio")
# As stated before, year is the blocking variable. The boxplot suggests some difference in root-shoot ratio between years, although a boxplot alone cannot establish significance.
detach(x)
The focus of this recipe was a two-factor ANOVA, with one factor at two levels (N rate) and the other at three levels (water salinity).
#ANOVA Testing
#Comparing total shoot biomass to two factors: N rate and water salinity - blocked for year.
# H0: There is no difference between the group means. The variation in biomass is not caused by anything other than randomization.
# Ha: The variation in biomass can be caused by something other than randomization.
model1 <- aov(Shoot.Biomass.Total ~ N.Rate * Water.Salinity + Year, data = x)
summary(model1)
## Df Sum Sq Mean Sq F value Pr(>F)
## N.Rate 1 1.38e+07 13790208 92.98 0.0002 ***
## Water.Salinity 2 1.04e+08 52234725 352.19 4.2e-06 ***
## Year 1 6.09e+05 609301 4.11 0.0985 .
## N.Rate:Water.Salinity 2 1.04e+06 518889 3.50 0.1121
## Residuals 5 7.42e+05 148312
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The summary of the ANOVA gives the p-value for each individual factor, along with the p-value for the interaction between the factors. The null hypothesis states that the variation in the response variable, biomass, cannot be explained by anything other than randomization. The p-values for N Rate (0.0002) and Water Salinity (4.2e-06) are both less than an alpha of 0.05, so the null hypothesis is rejected for both: it is likely that the variation in biomass can be explained by N rate and water salinity. However, the p-value for the interaction between N rate and salinity (0.1121) is not significant, so there is no evidence that the effect of one factor depends on the level of the other. The blocking variable, year, is also not significant at the 0.05 level (p = 0.0985).
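To make the link between the table columns explicit, the F values and p-values can be recomputed from the printed Mean Sq and Df columns. This is a sketch that uses only the rounded numbers shown in the `summary(model1)` output, so small rounding differences are possible; the variable names are hypothetical.

```r
# Mean squares copied from the printed summary(model1) table.
ms_n_rate   <- 13790208
ms_salinity <- 52234725
ms_resid    <- 148312

# Residual degrees of freedom: (12 observations - 1) minus the df used by
# N.Rate (1), Water.Salinity (2), Year (1) and the interaction (2).
df_resid <- (12 - 1) - (1 + 2 + 1 + 2)

# Each F value is the factor's mean square divided by the residual mean square.
f_n_rate   <- ms_n_rate / ms_resid
f_salinity <- ms_salinity / ms_resid

# The p-value is the upper tail of the F distribution with (factor df, residual df).
p_n_rate   <- pf(f_n_rate,   df1 = 1, df2 = df_resid, lower.tail = FALSE)
p_salinity <- pf(f_salinity, df1 = 2, df2 = df_resid, lower.tail = FALSE)

print(c(F = f_n_rate, p = p_n_rate))
print(c(F = f_salinity, p = p_salinity))
```

With only 5 residual degrees of freedom, the F distribution has a heavy tail, which is why even a fairly large F value such as the one for Year does not reach significance.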
#Comparing root-shoot ratio to two factors: N rate and water salinity - blocked for year.
# H0: There is no difference between the group means. The variation in root-shoot ratio is not caused by anything other than randomization.
# Ha: The variation in root-shoot ratio can be caused by something other than randomization.
model2 <- aov(Root.shoot.ratio ~ N.Rate * Water.Salinity + Year, data = x)
summary(model2)
## Df Sum Sq Mean Sq F value Pr(>F)
## N.Rate 1 0.000065 0.000065 11.67 0.0189 *
## Water.Salinity 2 0.001083 0.000542 96.71 0.0001 ***
## Year 1 0.000012 0.000012 2.14 0.2031
## N.Rate:Water.Salinity 2 0.000007 0.000004 0.64 0.5657
## Residuals 5 0.000028 0.000006
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The results of this analysis of variance were similar to those of the first model.
The null hypothesis states that the variation in the response variable, root-shoot ratio, cannot be explained by anything other than randomization. The p-values for N Rate (0.0189) and Water Salinity (0.0001) are both less than an alpha of 0.05, so it is likely that the variation in root-shoot ratio can be explained by N rate and water salinity. However, the p-value for the interaction between N rate and salinity (0.5657) is not significant, so there is no evidence that the effect of one factor depends on the level of the other. The blocking variable, year, is also not significant (p = 0.2031).
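One way to see how much of the variation each term accounts for is to express each Sum Sq as a share of the total. This is an illustrative sketch based on the rounded Sum Sq column printed in `summary(model2)`; it is not part of the original analysis, and the names `ss` and `share` are hypothetical.

```r
# Sums of squares copied from the printed summary(model2) table.
ss <- c(
  "N.Rate"                = 0.000065,
  "Water.Salinity"        = 0.001083,
  "Year"                  = 0.000012,
  "N.Rate:Water.Salinity" = 0.000007,
  "Residuals"             = 0.000028
)

# Share of the total sum of squares attributed to each term.
share <- ss / sum(ss)

# Report as percentages, largest first.
print(round(100 * sort(share, decreasing = TRUE), 1))
```

Under these rounded values, water salinity accounts for most of the variation in root-shoot ratio, which agrees with its very small p-value.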
par(mfrow = c(1,1))
qqnorm(residuals(model1))
qqline(residuals(model1))
par(mfrow = c(1,1))
qqnorm(residuals(model2))
qqline(residuals(model2))
The Q-Q normality plots of the residuals appeared to be mostly normal. When residuals are normally distributed, the points lie close to the straight reference line.
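A Q-Q plot is a visual check; a formal test such as Shapiro-Wilk can complement it. The sketch below is an assumption-laden illustration: because the full dataset is not reproduced in this recipe, it uses simulated placeholder responses (the data frame `sim` and the column `response` are hypothetical and are not the study's data). With the real data, the same call would be `shapiro.test(residuals(model1))`.

```r
set.seed(42)

# Same 3 x 2 x 2 layout as the experiment, but with SIMULATED responses.
sim <- expand.grid(
  Water.Salinity = c("FW", "BW", "SW"),
  N.Rate         = c("N0", "N360"),
  Year           = c("2011", "2012")
)
sim$response <- rnorm(nrow(sim), mean = 100, sd = 10)

# Fit the same model structure used in the recipe.
fit <- aov(response ~ N.Rate * Water.Salinity + Year, data = sim)

# Shapiro-Wilk test: H0 is that the residuals come from a normal distribution.
sw <- shapiro.test(residuals(fit))
print(sw)
```

A large p-value here means the test found no evidence against normality; with only 12 observations, the test has little power, so it should be read alongside the Q-Q plot rather than in place of it.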
interaction.plot(x$Water.Salinity, x$N.Rate, x$Shoot.Biomass.Total)
# The interaction plot between salinity and N rate for biomass seems to suggest some interaction between the factors, because the two lines have different slopes. However, the interaction term in the ANOVA was not significant (p = 0.1121).
interaction.plot(x$Water.Salinity, x$N.Rate, x$Root.shoot.ratio)
# The interaction plot between salinity and N rate for root-shoot ratio also seems to suggest interaction between the two factors, because the slopes of the two lines differ. Again, the interaction term in the ANOVA was not significant (p = 0.5657).
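What "different slopes" means can be shown numerically: if the lines were parallel, the gain from N360 over N0 would be the same at every salinity level. The sketch below uses only the six 2011 shoot biomass values visible in the `head(x)` and `str(x)` output above, so it is a single-year illustration rather than the full analysis; the names `y2011`, `cells`, and `n_effect` are hypothetical.

```r
# 2011 shoot biomass values (kg/ha) as printed in head(x) and str(x).
y2011 <- data.frame(
  N.Rate = factor(rep(c("N0", "N360"), each = 3)),
  Water.Salinity = factor(rep(c("FW", "BW", "SW"), times = 2),
                          levels = c("FW", "BW", "SW")),
  Shoot.Biomass.Total = c(12832, 9551, 6251, 14886, 12568, 7462)
)

# Cell means arranged as N.Rate (rows) by Water.Salinity (columns).
cells <- with(y2011, tapply(Shoot.Biomass.Total,
                            list(N.Rate, Water.Salinity), mean))

# Effect of nitrogen at each salinity level; equal values would mean
# parallel lines in the interaction plot (no interaction).
n_effect <- cells["N360", ] - cells["N0", ]

print(cells)
print(n_effect)
```

The nitrogen effect differs across salinity levels in these rows, which is the non-parallelism seen in the plot; whether that difference exceeds the residual noise is what the interaction term in the ANOVA tests.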
plot(fitted(model1),residuals(model1))
plot(fitted(model2),residuals(model2))
The plots of the fitted values versus the residuals for both models are generally scattered and spread evenly throughout the plot, with no obvious pattern, which is consistent with the constant-variance assumption.
Tukey's range tests are used alongside ANOVAs to find means that are significantly different from each other by comparing pairs of means. The test compares the mean of each treatment level to the mean of every other treatment level.
The null hypothesis for a Tukey test states that there is no difference between the means of a pair of data, while the alternative states that there is a significant difference between the means.
tukey1 <- TukeyHSD(aov(Shoot.Biomass.Total ~ Water.Salinity, data = x))
tukey1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = Shoot.Biomass.Total ~ Water.Salinity, data = x)
##
## $Water.Salinity
## diff lwr upr p adj
## FW-BW 3370 723 6017 0.0153
## SW-BW -3852 -6499 -1205 0.0072
## SW-FW -7222 -9869 -4575 0.0001
plot(tukey1)
# The Tukey test gives adjusted p-values below an alpha of 0.05 for all pairs of water salinity groups. This shows that the mean shoot biomass differs significantly between every pair of salinity levels.
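The width of the Tukey intervals above can be reconstructed by hand, which shows where `lwr` and `upr` come from. This is a sketch under stated assumptions: the one-way residual sum of squares is not printed in this recipe, so it is rebuilt here from the rounded two-way ANOVA table (everything except the Water.Salinity term is pooled into the error), and the variable names are hypothetical.

```r
# One-way residual SS = N.Rate + Year + interaction + residual SS from
# the printed two-way table (Mean Sq x Df for each term).
ss_resid_oneway <- 13790208 * 1 + 609301 * 1 + 518889 * 2 + 148312 * 5

# One-way design: 12 observations, 3 salinity groups -> 9 residual df.
df_resid    <- 12 - 3
mse_oneway  <- ss_resid_oneway / df_resid
n_per_group <- 4

# Critical value of the studentized range for 3 means and 9 residual df.
q_crit <- qtukey(0.95, nmeans = 3, df = df_resid)

# Half-width of each 95% family-wise interval for equal group sizes.
half_width <- q_crit * sqrt(mse_oneway / n_per_group)

print(half_width)
# Compare with (upr - lwr) / 2 for FW-BW in the printed output.
print((6017 - 723) / 2)
```

Because the one-way model leaves the N rate and year variation in the error term, these intervals are wider than they would be if the comparison were made on the full blocked model.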
tukey2 <- TukeyHSD(aov(Root.shoot.ratio ~ Water.Salinity, data = x))
tukey2
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = Root.shoot.ratio ~ Water.Salinity, data = x)
##
## $Water.Salinity
## diff lwr upr p adj
## FW-BW -0.01075 -0.01773 -0.00377 0.0051
## SW-BW 0.01250 0.00552 0.01948 0.0019
## SW-FW 0.02325 0.01627 0.03023 0.0000
plot(tukey2)
# The Tukey test gives adjusted p-values below an alpha of 0.05 for all pairs of water salinity groups. This shows that the mean root-shoot ratio differs significantly between every pair of salinity levels.
Based on the model adequacy testing, it appears that the residuals are fairly normally distributed. Because normality is an assumption of ANOVA, a nonparametric analysis, which does not rely on that assumption, can be used to verify the results of the ANOVA.
A Kruskal-Wallis test can be used to evaluate whether groups' distributions are identical, without the assumption that the data is normally distributed.
kruskal.test(Shoot.Biomass.Total ~ N.Rate, data = x)
##
## Kruskal-Wallis rank sum test
##
## data: Shoot.Biomass.Total by N.Rate
## Kruskal-Wallis chi-squared = 0.9231, df = 1, p-value = 0.3367
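# The p-value of this Kruskal-Wallis test is greater than an alpha of 0.05. Therefore, you must fail to reject the null - that variation in shoot biomass cannot be explained by anything other than randomization. Unlike the ANOVA above, this one-way test does not account for water salinity or year, which may explain why N rate is not significant here.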
kruskal.test(Shoot.Biomass.Total ~ Water.Salinity, data = x)
##
## Kruskal-Wallis rank sum test
##
## data: Shoot.Biomass.Total by Water.Salinity
## Kruskal-Wallis chi-squared = 9.846, df = 2, p-value = 0.007277
# The p-value of this Kruskal-Wallis test is less than an alpha of 0.05. Therefore, it is possible to reject the null - that variation in shoot biomass cannot be explained by anything other than randomization- and instead it is possible that variation can be explained by water salinity.
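The Kruskal-Wallis statistic depends only on ranks, so it can be reproduced without the raw values. The sketch below assumes the three salinity groups do not overlap at all (every SW value below every BW value, and every BW value below every FW value); that assumption is consistent with the rows printed earlier and it reproduces the printed statistic, but it is an inference, not something stated in the recipe. The variable names are hypothetical.

```r
# Ranks 1..12 under the assumption of complete separation between groups.
ranks    <- 1:12
salinity <- factor(rep(c("SW", "BW", "FW"), each = 4))

N         <- length(ranks)
rank_sums <- tapply(ranks, salinity, sum)
n_i       <- tapply(ranks, salinity, length)

# Kruskal-Wallis H statistic (no ties, so no tie correction is needed).
H <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)

# Under H0, H is approximately chi-squared with (groups - 1) degrees of freedom.
p_value <- pchisq(H, df = nlevels(salinity) - 1, lower.tail = FALSE)

# Cross-check against the built-in test applied to the same ranks.
kw <- kruskal.test(ranks ~ salinity)

print(c(H = H, p = p_value))
print(kw)
```

For three groups of four observations, complete separation gives the largest H possible, so the p-value printed above is the smallest this test can return for a design of this size.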
kruskal.test(Root.shoot.ratio ~ N.Rate, data = x)
##
## Kruskal-Wallis rank sum test
##
## data: Root.shoot.ratio by N.Rate
## Kruskal-Wallis chi-squared = 0.9231, df = 1, p-value = 0.3367
# The p-value of this Kruskal-Wallis test is greater than an alpha of 0.05. Therefore, you must fail to reject the null - that variation in root-shoot ratio cannot be explained by anything other than randomization.
kruskal.test(Root.shoot.ratio ~ Water.Salinity, data = x)
##
## Kruskal-Wallis rank sum test
##
## data: Root.shoot.ratio by Water.Salinity
## Kruskal-Wallis chi-squared = 9.846, df = 2, p-value = 0.007277
# The p-value of this Kruskal-Wallis test is less than an alpha of 0.05. Therefore, it is possible to reject the null - that variation in root-shoot ratio cannot be explained by anything other than randomization- and instead it is possible that variation can be explained by water salinity.
“Root distribution and growth of cotton as affected by drip irrigation with saline water”
Wei Min, Huijuan Guo, Guangwei Zhou, Wen Zhang, Lijuan Ma, Jun Ye, Zhenan Hou. Department of Resources and Environmental Science, Shihezi University, Shihezi 832003, Xinjiang, People's Republic of China.
Received 15 July 2014, Revised 5 September 2014, Accepted 6 September 2014, Available online 29 September 2014
http://www.sciencedirect.com/science/article/pii/S0378429014002524