Recipe 4 - Completely Randomized Block Designs From the Literature

Cheryl Tran

RPI

10/16/2014 Version 1

1. Setting

System under test

This experiment studied the effect of irrigation water salinity and nitrogen (N) application rate on nitrogen uptake in cotton stems. Fresh water is in short supply in some arid and semi-arid regions, so there is interest in using saline and brackish irrigation water to help increase crop yields; however, this carries a risk of soil salinization.

uptakec<-read.csv("C:/Users/tranc3/Downloads/table (1).csv")

Factors and Levels

In this experiment, the two factors are the N rate and the irrigation water salinity. Nrate has two levels, 0 and 360 kg N/ha. Irrigation water salinity has three levels: fresh water (FW), brackish water (BW), and saline water (SW).

head(uptakec)

##   Nrate Water.Salinity Year  Stem Leaves  Bolls  Total
## 1    N0             FW 2011  7.02  19.52  35.81  62.35
## 2    N0             BW 2011  7.23  21.33  36.68  65.24
## 3    N0             SW 2011  5.81  20.22  35.39  61.42
## 4  N360             FW 2011 19.32  68.51 120.52 208.33
## 5  N360             BW 2011 20.05  76.82 106.90 203.70
## 6  N360             SW 2011 15.51  62.71  91.42 169.64

tail(uptakec)

##    Nrate Water.Salinity Year  Stem Leaves  Bolls  Total
## 7     N0             FW 2012 14.81  38.53  48.05 101.39
## 8     N0             BW 2012 10.41  29.56  39.96  79.93
## 9     N0             SW 2012  8.88  22.20  37.74  68.82
## 10  N360             FW 2012 33.30  91.02 139.86 264.18
## 11  N360             BW 2012 17.76  85.09 112.89 215.73
## 12  N360             SW 2012 15.54  56.24  99.90 171.68

summary(uptakec)

##   Nrate   Water.Salinity      Year           Stem           Leaves    
##  N0  :6   BW:4           Min.   :2011   Min.   : 5.81   Min.   :19.5  
##  N360:6   FW:4           1st Qu.:2011   1st Qu.: 8.47   1st Qu.:22.0  
##           SW:4           Median :2012   Median :15.16   Median :47.4  
##                          Mean   :2012   Mean   :14.64   Mean   :49.3  
##                          3rd Qu.:2012   3rd Qu.:18.15   3rd Qu.:70.6  
##                          Max.   :2012   Max.   :33.30   Max.   :91.0  
##      Bolls           Total      
##  Min.   : 35.4   Min.   : 61.4  
##  1st Qu.: 37.5   1st Qu.: 67.9  
##  Median : 69.7   Median :135.5  
##  Mean   : 75.4   Mean   :139.4  
##  3rd Qu.:108.4   3rd Qu.:204.9  
##  Max.   :139.9   Max.   :264.2

Continuous variables (if any)

The continuous variables in the data set are Stem, Leaves, Bolls, and Total.

Response variables

In this experiment, the response variable is the nitrogen uptake in the cotton stem (the Stem column).

The Data: How is it organized and what does it look like?

The dataset has 7 variables: Nrate, Water.Salinity, Year, Stem, Leaves, Bolls, and Total. There are 12 observations, one for each combination of Nrate and water salinity in each of 2011 and 2012.
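The CSV path used earlier points to a file on the author's machine, so a reader cannot rerun the analysis from it. As a sketch, the twelve rows printed by head() and tail() can be typed in directly. The name uptake_copy is hypothetical and is chosen so that it does not overwrite uptakec.

```r
# Rebuild the data set from the rows printed by head() and tail().
# `uptake_copy` is a hypothetical name, not part of the original analysis.
uptake_copy <- data.frame(
  Nrate          = factor(rep(rep(c("N0", "N360"), each = 3), times = 2)),
  Water.Salinity = factor(rep(c("FW", "BW", "SW"), times = 4)),
  Year           = rep(c(2011L, 2012L), each = 6),
  Stem   = c(7.02, 7.23, 5.81, 19.32, 20.05, 15.51,
             14.81, 10.41, 8.88, 33.30, 17.76, 15.54),
  Leaves = c(19.52, 21.33, 20.22, 68.51, 76.82, 62.71,
             38.53, 29.56, 22.20, 91.02, 85.09, 56.24),
  Bolls  = c(35.81, 36.68, 35.39, 120.52, 106.90, 91.42,
             48.05, 39.96, 37.74, 139.86, 112.89, 99.90),
  Total  = c(62.35, 65.24, 61.42, 208.33, 203.70, 169.64,
             101.39, 79.93, 68.82, 264.18, 215.73, 171.68)
)

# Inspect the structure: 12 rows, 7 columns, two factors.
str(uptake_copy)
```

The factor levels sort alphabetically (BW, FW, SW), which matches the order shown by summary(uptakec).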

Randomization

The experiment is analyzed as a randomized complete block design, with year as the block.

2. (Experimental) Design

How will the experiment be organized and conducted to test the hypothesis?

The ANOVA tests whether the variation in nitrogen uptake in the cotton stem can be attributed to variation in Nrate and irrigation water salinity. The null hypothesis is that the variation in stem nitrogen uptake cannot be attributed to variation in Nrate or irrigation water salinity. The alternative is that the variation can be attributed to variation in Nrate or irrigation water salinity.

What is the rationale for this design?

ANOVA is used to analyze the observed variance in a response variable. The variance is partitioned among the factors, and each factor is tested to determine whether it explains a significant share of the variation. One might expect Nrate or irrigation water salinity to increase the nitrogen uptake of cotton. However, this may not be true, so the experiment is used to test the hypothesis.

Randomize: What is the Randomization Scheme?

Randomization was used during data collection.

Replicate: Are there replicates and/or repeated measures?

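There are repeated measures because the same measurements of nitrogen uptake in cotton were taken in both 2011 and 2012. There is no replication within a year: each combination of Nrate and water salinity appears once per year.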

Block: Did you use blocking in the design?

In the design, the data was blocked by year.
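The blocking structure can be checked directly. The sketch below rebuilds the Stem values from the printed rows into a hypothetical data frame, stem_df, adds a hypothetical Block column that treats Year as a factor, and confirms that every treatment combination appears once in each block. It also compares the ANOVA with Year as a number against the ANOVA with Block as a factor; with only two years the two should agree.

```r
# Hypothetical copy of the Stem data, typed in from the printed rows.
stem_df <- data.frame(
  Nrate          = factor(rep(rep(c("N0", "N360"), each = 3), times = 2)),
  Water.Salinity = factor(rep(c("FW", "BW", "SW"), times = 4)),
  Year           = rep(c(2011, 2012), each = 6),
  Stem = c(7.02, 7.23, 5.81, 19.32, 20.05, 15.51,
           14.81, 10.41, 8.88, 33.30, 17.76, 15.54)
)

# Treat the year as a categorical block rather than a number.
stem_df$Block <- factor(stem_df$Year)

# Count observations per Nrate x salinity x block cell.
cell_counts <- table(stem_df$Nrate, stem_df$Water.Salinity, stem_df$Block)
print(cell_counts)

# Compare F values with Year as numeric and as a factor.
fit_num <- anova(aov(Stem ~ Nrate + Year,  data = stem_df))
fit_fac <- anova(aov(Stem ~ Nrate + Block, data = stem_df))
same_f  <- all.equal(fit_num$`F value`, fit_fac$`F value`)
print(same_f)
```

Coding the block as a factor matters once there are more than two blocks, because a numeric Year would then be fitted as a single linear trend instead of one effect per block.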

3. (Statistical) Analysis

(Exploratory Data Analysis) Graphics and descriptive summary

# histogram of stem nitrogen uptake, and boxplots by N rate and by water salinity
hist(uptakec$Stem)

boxplot(uptakec$Stem~uptakec$Nrate, xlab="N Rate", ylab="Nitrogen uptake of cotton", main="Stem")

boxplot(uptakec$Stem~uptakec$Water.Salinity, xlab="Water Salinity", ylab="Nitrogen uptake of cotton", main="Stem")

The distribution of nitrogen uptake in stems is skewed right. In the boxplots by Nrate, the range of values for N0 is smaller than the range for N360, and the N360 values are clearly higher, which suggests that stem nitrogen uptake depends on Nrate; this is tested formally in the Testing section. For water salinity, fresh water shows the widest range of values, while brackish and saline water show less variation.
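The visual impressions from the boxplots can be backed with numbers. This is a minimal sketch that rebuilds the Stem values from the printed rows into a hypothetical data frame, stem_df, and computes group means and ranges; the helper spread is also hypothetical.

```r
# Hypothetical copy of the Stem data, typed in from the printed rows.
stem_df <- data.frame(
  Nrate          = factor(rep(rep(c("N0", "N360"), each = 3), times = 2)),
  Water.Salinity = factor(rep(c("FW", "BW", "SW"), times = 4)),
  Year           = rep(c(2011, 2012), each = 6),
  Stem = c(7.02, 7.23, 5.81, 19.32, 20.05, 15.51,
           14.81, 10.41, 8.88, 33.30, 17.76, 15.54)
)

# Range of a group: largest value minus smallest value.
spread <- function(x) diff(range(x))

# Group means and ranges for each factor.
nrate_means  <- tapply(stem_df$Stem, stem_df$Nrate, mean)
nrate_ranges <- tapply(stem_df$Stem, stem_df$Nrate, spread)
sal_ranges   <- tapply(stem_df$Stem, stem_df$Water.Salinity, spread)

print(round(nrate_means, 2))
print(round(nrate_ranges, 2))
print(round(sal_ranges, 2))
```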

Testing

model1=aov(Stem~Nrate+Year, data=uptakec)
anova(model1)

## Analysis of Variance Table
## 
## Response: Stem
##           Df Sum Sq Mean Sq F value Pr(>F)   
## Nrate      1    378     378   15.47 0.0034 **
## Year       1     55      55    2.26 0.1666   
## Residuals  9    220      24                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

model2=aov(Stem~Water.Salinity+Year, data=uptakec)
anova(model2)

## Analysis of Variance Table
## 
## Response: Stem
##                Df Sum Sq Mean Sq F value Pr(>F)
## Water.Salinity  2    107    53.3    0.87   0.46
## Year            1     55    55.3    0.90   0.37
## Residuals       8    491    61.3

model3=aov(Stem~Water.Salinity*Nrate+Year, data=uptakec)
anova(model3)

## Analysis of Variance Table
## 
## Response: Stem
##                      Df Sum Sq Mean Sq F value Pr(>F)   
## Water.Salinity        2    107      53    3.13 0.1314   
## Nrate                 1    378     378   22.18 0.0053 **
## Year                  1     55      55    3.25 0.1314   
## Water.Salinity:Nrate  2     28      14    0.82 0.4917   
## Residuals             5     85      17                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Based on the first ANOVA, we reject the null hypothesis for Nrate: the variation in stem nitrogen uptake can be attributed to Nrate rather than to random variation alone. Under the null hypothesis, the probability of an F value at least as large as 15.47 is 0.0034. The block, Year, is not significant (p = 0.17).

For the second ANOVA, we fail to reject the null hypothesis for Water.Salinity. The probability of an F value at least as large as 0.87 under the null hypothesis is 0.46, so the variation across salinity levels cannot be attributed to anything other than random variation.

In the third ANOVA, which includes both factors and their interaction, Nrate remains significant (F = 22.18, p = 0.0053), while Water.Salinity (p = 0.13) and the Water.Salinity:Nrate interaction (p = 0.49) are not. Because the interaction is not significant, there is no evidence that the effect of one factor differs across the levels of the other.
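The p-value in an ANOVA table is the upper-tail probability of the F distribution at the observed F value. As a sketch, the first model can be refitted on a hypothetical copy of the data, stem_df, typed in from the printed rows, and the Nrate p-value recomputed by hand with pf().

```r
# Hypothetical copy of the Stem data, typed in from the printed rows.
stem_df <- data.frame(
  Nrate          = factor(rep(rep(c("N0", "N360"), each = 3), times = 2)),
  Water.Salinity = factor(rep(c("FW", "BW", "SW"), times = 4)),
  Year           = rep(c(2011, 2012), each = 6),
  Stem = c(7.02, 7.23, 5.81, 19.32, 20.05, 15.51,
           14.81, 10.41, 8.88, 33.30, 17.76, 15.54)
)

# Refit the first model and pull out the pieces of the Nrate row.
tab     <- anova(aov(Stem ~ Nrate + Year, data = stem_df))
f_nrate <- tab["Nrate", "F value"]
df_num  <- tab["Nrate", "Df"]
df_den  <- tab["Residuals", "Df"]

# P(F >= observed F) under the null hypothesis.
p_manual <- pf(f_nrate, df1 = df_num, df2 = df_den, lower.tail = FALSE)
p_table  <- tab["Nrate", "Pr(>F)"]

print(c(F = f_nrate, p_manual = p_manual, p_table = p_table))
```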

Diagnostics/Model Adequacy Checking

qqnorm(residuals(model3))
qqline(residuals(model3))

shapiro.test(uptakec$Stem)

## 
##  Shapiro-Wilk normality test
## 
## data:  uptakec$Stem
## W = 0.8916, p-value = 0.1237

plot(fitted(model3), residuals(model3))

interaction.plot(uptakec$Water.Salinity, uptakec$Nrate, uptakec$Stem)

A Q-Q plot compares the distribution of the residuals with a normal distribution. The points in the Q-Q plot of the model3 residuals appear to follow the Q-Q line, so the residuals look approximately normal. The Shapiro-Wilk test is used to check normality; with a p-value of 0.1237 (> 0.1), we fail to reject the null hypothesis of normality. Note that the test here was applied to the raw Stem values; the ANOVA assumption concerns the residuals, so testing residuals(model3) would be the more direct check. The plot of fitted values against residuals appears randomly scattered, which supports the assumption of constant variance. In the interaction plot, the lines for N0 and N360 have different slopes, which hints at an interaction, but the third ANOVA does not find it significant (p = 0.49).
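Normality in ANOVA is an assumption about the residuals rather than the raw response. The sketch below refits the third model on a hypothetical copy of the data, stem_df, typed in from the printed rows, and applies the Shapiro-Wilk test to the residuals. With 12 observations the test has limited power, so it is best read alongside the Q-Q plot.

```r
# Hypothetical copy of the Stem data, typed in from the printed rows.
stem_df <- data.frame(
  Nrate          = factor(rep(rep(c("N0", "N360"), each = 3), times = 2)),
  Water.Salinity = factor(rep(c("FW", "BW", "SW"), times = 4)),
  Year           = rep(c(2011, 2012), each = 6),
  Stem = c(7.02, 7.23, 5.81, 19.32, 20.05, 15.51,
           14.81, 10.41, 8.88, 33.30, 17.76, 15.54)
)

# Same formula as model3: both factors, their interaction, and the block.
fit <- aov(Stem ~ Water.Salinity * Nrate + Year, data = stem_df)
res <- residuals(fit)

# Shapiro-Wilk test on the residuals instead of the raw Stem values.
sw <- shapiro.test(res)
print(sw)
```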

T1<-TukeyHSD(aov(Stem~Water.Salinity*Nrate, data=uptakec))
T1

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Stem ~ Water.Salinity * Nrate, data = uptakec)
## 
## $Water.Salinity
##         diff     lwr   upr  p adj
## FW-BW  4.750  -5.747 15.25 0.4038
## SW-BW -2.428 -12.925  8.07 0.7671
## SW-FW -7.178 -17.675  3.32 0.1704
## 
## $Nrate
##          diff   lwr   upr p adj
## N360-N0 11.22 4.385 18.06 0.007
## 
## $`Water.Salinity:Nrate`
##                    diff      lwr    upr  p adj
## FW:N0-BW:N0       2.095 -17.1606 21.351 0.9970
## SW:N0-BW:N0      -1.475 -20.7306 17.781 0.9994
## BW:N360-BW:N0    10.085  -9.1706 29.341 0.3989
## FW:N360-BW:N0    17.490  -1.7656 36.746 0.0740
## SW:N360-BW:N0     6.705 -12.5506 25.961 0.7352
## SW:N0-FW:N0      -3.570 -22.8256 15.686 0.9691
## BW:N360-FW:N0     7.990 -11.2656 27.246 0.5998
## FW:N360-FW:N0    15.395  -3.8606 34.651 0.1195
## SW:N360-FW:N0     4.610 -14.6456 23.866 0.9177
## BW:N360-SW:N0    11.560  -7.6956 30.816 0.2890
## FW:N360-SW:N0    18.965  -0.2906 38.221 0.0533
## SW:N360-SW:N0     8.180 -11.0756 27.436 0.5800
## FW:N360-BW:N360   7.405 -11.8506 26.661 0.6617
## SW:N360-BW:N360  -3.380 -22.6356 15.876 0.9753
## SW:N360-FW:N360 -10.785 -30.0406  8.471 0.3432

plot(T1)

In a Tukey test, the null hypothesis for each pair is that the two means are equal; the alternative is that they differ. The Tukey test builds a set of confidence intervals on the differences between the means of the levels of a factor, with the specified family-wise probability of coverage. Zero is included in every interval for Water.Salinity and for the Water.Salinity:Nrate combinations, so none of those pairs differ significantly. The exception is Nrate: the N360-N0 interval (4.385, 18.06) excludes zero (adjusted p = 0.007), so mean stem uptake is significantly higher at N360 than at N0.
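Reading twenty intervals by eye is error-prone, so the comparison can be automated. As a sketch, the code below reruns the same Tukey model on a hypothetical copy of the data, stem_df, typed in from the printed rows, and lists the comparisons whose interval lies entirely above or entirely below zero.

```r
# Hypothetical copy of the Stem data, typed in from the printed rows.
stem_df <- data.frame(
  Nrate          = factor(rep(rep(c("N0", "N360"), each = 3), times = 2)),
  Water.Salinity = factor(rep(c("FW", "BW", "SW"), times = 4)),
  Year           = rep(c(2011, 2012), each = 6),
  Stem = c(7.02, 7.23, 5.81, 19.32, 20.05, 15.51,
           14.81, 10.41, 8.88, 33.30, 17.76, 15.54)
)

# Same model as T1: both factors and their interaction, without Year.
tk <- TukeyHSD(aov(Stem ~ Water.Salinity * Nrate, data = stem_df))

# For each term, keep the comparisons whose interval excludes zero.
excludes_zero <- lapply(tk, function(m) {
  rownames(m)[m[, "lwr"] > 0 | m[, "upr"] < 0]
})
print(excludes_zero)
```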

4. Contingencies

A non-parametric test could be used to test the hypothesis instead, for example the Kruskal-Wallis test or Friedman's test. Both are rank-based tests, and neither assumes that the residuals are normally distributed.

kruskal.test(Stem~Nrate, data=uptakec)

## 
##  Kruskal-Wallis rank sum test
## 
## data:  Stem by Nrate
## Kruskal-Wallis chi-squared = 8.308, df = 1, p-value = 0.003948

kruskal.test(Stem~Water.Salinity, data=uptakec)

## 
##  Kruskal-Wallis rank sum test
## 
## data:  Stem by Water.Salinity
## Kruskal-Wallis chi-squared = 1.077, df = 2, p-value = 0.5836

The null hypothesis is that there is no difference between treatments. For Nrate (p = 0.0039), we reject the null hypothesis, so the variation is due to something other than random variation. For Water.Salinity (p = 0.58), we fail to reject the null hypothesis, so the variation cannot be attributed to anything other than random variation. Both results agree with the ANOVA.
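Friedman's test is mentioned as an option but is not run; unlike Kruskal-Wallis, it can account for the year blocks. It needs exactly one observation per treatment in each block, so the sketch below combines the two factors into a single six-level treatment, trt, which is a hypothetical column added to a hypothetical copy of the data, stem_df. With only two blocks the test has very little power, so this is an illustration of the call rather than a recommended analysis.

```r
# Hypothetical copy of the Stem data, typed in from the printed rows.
stem_df <- data.frame(
  Nrate          = factor(rep(rep(c("N0", "N360"), each = 3), times = 2)),
  Water.Salinity = factor(rep(c("FW", "BW", "SW"), times = 4)),
  Year           = rep(c(2011, 2012), each = 6),
  Stem = c(7.02, 7.23, 5.81, 19.32, 20.05, 15.51,
           14.81, 10.41, 8.88, 33.30, 17.76, 15.54)
)

# One treatment label per Nrate x salinity combination (6 levels).
stem_df$trt <- interaction(stem_df$Nrate, stem_df$Water.Salinity)

# Friedman rank sum test: response ~ treatment | block.
ft <- friedman.test(Stem ~ trt | Year, data = stem_df)
print(ft)
```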

5. References to the literature

http://www.sciencedirect.com/science/article/pii/S0378429014002524
