Here I look at data taken from the journal article ‘Postharvest Quality and Shelf Life of Some Hot Pepper Varieties’ by Samira et al. The data describe the effect of storage conditions on the moisture content of the peppers.
Below are the first and last 6 lines of the CSV file which contains the data.
Peppers<-read.csv("C:/Users/Anthony/Desktop/Peppers_CSV.csv",header=TRUE)
# Define Treatment, Pepper Type, and Storage Period as factors
Peppers$Storage_Periods=as.factor(Peppers$Storage_Periods)
Peppers$Treatment=as.factor(Peppers$Treatment)
Peppers$Pepper_Type=as.factor(Peppers$Pepper_Type)
head(Peppers)
## Treatment Pepper_Type Storage_Periods Moisture_Content
## 1 Evaporative Cooling Melka Dima 0 91.7
## 2 Evaporative Cooling Melka Dima 4 91.5
## 3 Evaporative Cooling Melka Dima 8 91.1
## 4 Evaporative Cooling Melka Dima 12 90.4
## 5 Evaporative Cooling Melka Dima 16 89.8
## 6 Evaporative Cooling Melka Dima 20 89.0
tail(Peppers)
## Treatment Pepper_Type Storage_Periods Moisture_Content
## 60 Ambient Storage Mareko Fana 16 78.9
## 61 Ambient Storage PBC 600 0 89.4
## 62 Ambient Storage PBC 600 4 84.5
## 63 Ambient Storage PBC 600 8 83.6
## 64 Ambient Storage PBC 600 12 84.4
## 65 Ambient Storage PBC 600 16 75.0
As part of a three factor, multi-level analysis, I will be focusing on the response variable of ‘Moisture Content’.
The factor that I will be blocking for is labelled as ‘Treatment’ and has two levels: ‘Evaporative Cooling’ and ‘Ambient Storage’.
The first factor of interest for this analysis is the ‘Pepper Type’. There were five levels of this factor (five pepper types): “Melka Dima”, “Melka Eshet”, “Melka Zala”, “Mareko Fana”, and “PBC 600.”
The second factor of interest is the ‘Storage Period’ of the peppers. For peppers exposed to evaporative cooling there were eight levels of this factor: 0, 4, 8, 12, 16, 20, 24, and 28 days. For the peppers exposed to ambient conditions there were only five levels: 0, 4, 8, 12, and 16 days.
summary(Peppers)
##                Treatment      Pepper_Type Storage_Periods
##  Ambient Storage    :25   Mareko Fana:13   0      :10
##  Evaporative Cooling:40   Melka Dima :13   4      :10
##                           Melka Eshet:13   8      :10
##                           Melka Zala :13   12     :10
##                           PBC 600    :13   16     :10
##                                            20     : 5
##                                            (Other):10
##  Moisture_Content
##  Min.   :75.0
##  1st Qu.:85.4
##  Median :87.7
##  Mean   :87.4
##  3rd Qu.:89.7
##  Max.   :92.5
##
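The level counts in the summary above follow directly from the unbalanced layout described earlier (eight storage periods under evaporative cooling, five under ambient storage). As a minimal sketch, the layout can be rebuilt without any moisture values and cross-tabulated. The object names `design` and `design_counts` are illustrative and are not part of the original analysis.

```r
# Rebuild only the factor layout described in the text (no moisture values).
pepper_types <- c("Mareko Fana", "Melka Dima", "Melka Eshet",
                  "Melka Zala", "PBC 600")

# Evaporative cooling: storage periods 0 to 28 days in 4-day steps
evaporative <- expand.grid(
  Pepper_Type      = pepper_types,
  Storage_Periods  = seq(0, 28, by = 4),
  Treatment        = "Evaporative Cooling",
  stringsAsFactors = FALSE
)

# Ambient storage: storage periods 0 to 16 days in 4-day steps
ambient <- expand.grid(
  Pepper_Type      = pepper_types,
  Storage_Periods  = seq(0, 16, by = 4),
  Treatment        = "Ambient Storage",
  stringsAsFactors = FALSE
)

design <- rbind(evaporative, ambient)

# Rows per treatment and storage period; ambient storage has no rows
# at 20, 24, or 28 days, which is why those periods have fewer rows overall.
design_counts <- table(design$Treatment, design$Storage_Periods)
print(design_counts)
print(table(design$Pepper_Type))
```

Because the ambient block stops at 16 days, any comparison involving the 20, 24, and 28 day levels rests on evaporative-cooling rows only.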
The only continuous variable that this experiment looked at was the response variable of ‘Moisture Content’. This variable is represented in percentages of moisture remaining in the peppers as a response to their respective pepper type and storage periods.
As previously mentioned, the response variable of this experimentation is ‘Moisture Content’. This variable refers to the amount of moisture retained in the peppers after exposure to their randomly assigned treatment conditions and storage periods.
This data set contains four different columns labeled ‘Treatment’, ‘Pepper Type’, ‘Storage Periods’, and ‘Moisture Content’. As previously discussed, the ‘Moisture Content’ is the response variable and the other three columns are the factors that will influence the response.
This data was collected using a completely randomized block design. Peppers were collected randomly from the nursery beds. The peppers were then divided randomly into the ‘evaporative cooling’ treatment level or the ‘ambient storage’ treatment level. Peppers were then chosen randomly from the treatment conditions for moisture content analysis.
How will the experiment be organized and conducted to test the hypothesis?
This experiment was organized by initially randomly selecting the peppers for experimentation as described in the previous “Randomization” section. Each combination of factor levels was performed with three replicates (n=3).
In this statistical analysis I will begin by analyzing all levels of the factor ‘Pepper Type’ while blocking for the treatment method. This will allow me to examine the effect of pepper type on the response variable ‘moisture content’.
Next, I will look at the second factor which is ‘Storage Period’ while still blocking for treatment method. This will allow me to examine the effect of the storage period on the response variable ‘moisture content’.
What is the rationale for this design?
This design was used to analyze the effects of the type of pepper, and the storage period on the response variable while blocking for the effects of the treatment method.
Randomize: What is the Randomization Scheme?
As previously described, this data was collected using a completely randomized block design. Peppers were collected randomly from the nursery beds. The peppers were then divided randomly into the ‘evaporative cooling’ treatment level or the ‘ambient storage’ treatment level. Peppers were then chosen randomly from the treatment conditions for moisture content analysis.
Replicate: Are there replicates and/or repeated measures?
Yes, each combination of treatment, pepper type, and storage period was measured three times (n=3).
Block: Did you use blocking in the design?
Yes, blocking was used to separate the pepper populations by their treatment method.
Below are boxplots of the Moisture Content of all levels of the two factors of interest.
Axis labels are abbreviated as follows: MF=Mareko Fana, MD=Melka Dima, ME=Melka Eshet, MZ=Melka Zala, P=PBC600, A=Ambient storage, E=Evaporative cooling.
In the second box plot, the numeric axis labels refer to the storage period in days.
par(mgp=c(3,1,0))
boxplot(Moisture_Content~Pepper_Type+Treatment,data=Peppers, ylab="Moisture Content (%)",xlab="Pepper Type and Treatment",names=c("MF.A","MD.A","ME.A","MZ.A","P.A","MF.E","MD.E","ME.E","MZ.E","P.E"),las=2)
boxplot(Moisture_Content~Storage_Periods+Treatment,data=Peppers, ylab="Moisture Content (%)",xlab="Storage Period and Treatment",names=c("0.A","4.A","8.A","12.A","16.A","20.A","24.A","28.A","0.E","4.E","8.E","12.E","16.E","20.E","24.E","28.E"),las=2)
Looking at the first box plot above, it appears that the evaporative cooling treatment resulted in a higher moisture content in the peppers overall. Comparing the moisture content of each pepper type within each block, both sets of samples appear to follow a very similar trend, with the highest moisture content in the Melka Eshet peppers and the lowest in the Mareko Fana and PBC 600 peppers.
While analyzing the second box plot the obvious trend that stands out is that there is a decrease in moisture content as the storage period increases. This holds true for both treatment blocks. Once again it is also obvious that evaporative cooling resulted in a higher moisture content in the peppers.
Below I perform two analyses of variance (ANOVA) to determine if there are significant statistical differences among the mean moisture contents of the peppers with regards to pepper type and storage periods while blocking to reduce the effect of the treatment.
# Assign models for the data of interest
model_Pepper<-aov(Moisture_Content~Pepper_Type+Treatment,data=Peppers)
model_StoragePeriods=aov(Moisture_Content~Storage_Periods+Treatment,data=Peppers)
# Perform ANOVA on the two models
anova(model_Pepper)
## Analysis of Variance Table
##
## Response: Moisture_Content
## Df Sum Sq Mean Sq F value Pr(>F)
## Pepper_Type 4 155 38.9 5.10 0.0014 **
## Treatment 1 64 64.2 8.42 0.0052 **
## Residuals 59 450 7.6
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(model_StoragePeriods)
## Analysis of Variance Table
##
## Response: Moisture_Content
## Df Sum Sq Mean Sq F value Pr(>F)
## Storage_Periods 7 257 36.7 7.3 3.3e-06 ***
## Treatment 1 132 131.5 26.2 3.9e-06 ***
## Residuals 56 281 5.0
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The first ANOVA above analyzes the effect of pepper type on moisture content. After blocking for treatment, we see that pepper type has a statistically significant effect on moisture content. I reached this conclusion because the p-value of 0.0014 (F value of 5.10) is less than 0.05, which indicates that the variation among these groups is unlikely to be due to random chance alone. Thus we reject the null hypothesis that pepper type has no effect on moisture content.
The second ANOVA above analyzes the effect of storage period on moisture content. In this ANOVA I also blocked for treatment, and the very low p-value (3.3e-06, F value of 7.3) indicates that the variation in the response variable with regard to this factor is unlikely to be due to random chance alone. Thus we also reject the null hypothesis that storage period has no effect on moisture content.
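The Pr(>F) column in the ANOVA tables above is the upper-tail probability of the F distribution at the reported F value, with the factor and residual degrees of freedom. The following sketch recomputes it from the printed (rounded) table entries, so the results should land near the printed p-values rather than match them digit for digit. The variable names are illustrative.

```r
# Values copied from the two ANOVA tables above (rounded as printed there).
f_pepper  <- 5.10   # F value for Pepper_Type
f_storage <- 7.3    # F value for Storage_Periods

# Upper-tail F probabilities: pf(F, factor df, residual df)
p_pepper  <- pf(f_pepper,  df1 = 4, df2 = 59, lower.tail = FALSE)
p_storage <- pf(f_storage, df1 = 7, df2 = 56, lower.tail = FALSE)

# The F value itself is the factor mean square over the residual mean square.
# Using the rounded mean squares gives a value close to, but not exactly,
# the printed 5.10.
f_from_mean_squares <- 38.9 / 7.6

print(c(p_pepper = p_pepper, p_storage = p_storage))
print(f_from_mean_squares)
```

The decision rule compares these p-values, not the F values, against the 0.05 significance level.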
To expand on the results of the ANOVA tests, I perform Tukey’s Honest Significant Differences (HSD) test to determine which levels of each factor produce mean moisture contents that are statistically different from each other.
TukeyHSD(model_Pepper,conf.level=0.95)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = Moisture_Content ~ Pepper_Type + Treatment, data = Peppers)
##
## $Pepper_Type
## diff lwr upr p adj
## Melka Dima-Mareko Fana 2.9462 -0.1023 5.9946 0.0630
## Melka Eshet-Mareko Fana 2.9462 -0.1023 5.9946 0.0630
## Melka Zala-Mareko Fana 1.7385 -1.3100 4.7869 0.5004
## PBC 600-Mareko Fana -0.8615 -3.9100 2.1869 0.9310
## Melka Eshet-Melka Dima 0.0000 -3.0485 3.0485 1.0000
## Melka Zala-Melka Dima -1.2077 -4.2561 1.8408 0.7981
## PBC 600-Melka Dima -3.8077 -6.8561 -0.7592 0.0073
## Melka Zala-Melka Eshet -1.2077 -4.2561 1.8408 0.7981
## PBC 600-Melka Eshet -3.8077 -6.8561 -0.7592 0.0073
## PBC 600-Melka Zala -2.6000 -5.6485 0.4485 0.1297
##
## $Treatment
## diff lwr upr p adj
## Evaporative Cooling-Ambient Storage 2.043 0.6339 3.452 0.0052
TukeyHSD(model_StoragePeriods,conf.level=0.95)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = Moisture_Content ~ Storage_Periods + Treatment, data = Peppers)
##
## $Storage_Periods
## diff lwr upr p adj
## 4-0 -2.39 -5.546 0.766204 0.2693
## 8-0 -3.00 -6.156 0.156204 0.0738
## 12-0 -4.25 -7.406 -1.093796 0.0020
## 16-0 -6.19 -9.346 -3.033796 0.0000
## 20-0 -3.66 -7.526 0.205545 0.0759
## 24-0 -4.32 -8.186 -0.454455 0.0184
## 28-0 -6.26 -10.126 -2.394455 0.0001
## 8-4 -0.61 -3.766 2.546204 0.9986
## 12-4 -1.86 -5.016 1.296204 0.5862
## 16-4 -3.80 -6.956 -0.643796 0.0083
## 20-4 -1.27 -5.136 2.595545 0.9670
## 24-4 -1.93 -5.796 1.935545 0.7646
## 28-4 -3.87 -7.736 -0.004455 0.0495
## 12-8 -1.25 -4.406 1.906204 0.9139
## 16-8 -3.19 -6.346 -0.033796 0.0458
## 20-8 -0.66 -4.526 3.205545 0.9994
## 24-8 -1.32 -5.186 2.545545 0.9594
## 28-8 -3.26 -7.126 0.605545 0.1585
## 16-12 -1.94 -5.096 1.216204 0.5335
## 20-12 0.59 -3.276 4.455545 0.9997
## 24-12 -0.07 -3.936 3.795545 1.0000
## 28-12 -2.01 -5.876 1.855545 0.7262
## 20-16 2.53 -1.336 6.395545 0.4523
## 24-16 1.87 -1.996 5.735545 0.7919
## 28-16 -0.07 -3.936 3.795545 1.0000
## 24-20 -0.66 -5.124 3.803547 0.9998
## 28-20 -2.60 -7.064 1.863547 0.6004
## 28-24 -1.94 -6.404 2.523547 0.8675
##
## $Treatment
## diff lwr upr p adj
## Evaporative Cooling-Ambient Storage 2.636 1.491 3.781 0
Above are the Tukey’s HSD tests, which compare the mean moisture content between each pair of pepper types and each pair of storage periods. For a pairwise comparison with an adjusted p-value (p adj) below 0.05, the difference between the mean responses of those two levels is statistically significant at a 95% family-wise confidence level, meaning it is unlikely to be due to random chance alone. For instance, in the second set of Tukey’s HSD tests, the comparison of the 12-day and 0-day storage periods returns an adjusted p-value of 0.002, which indicates an honest significant difference between the mean moisture content values of these two levels.
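To make the Tukey output more concrete, the sketch below recomputes one row of the pepper-type comparison (Melka Dima versus Mareko Fana) from quantities printed earlier: the residual sum of squares and degrees of freedom from the ANOVA table, and the 13 observations per pepper type from the summary. It assumes equal group sizes, which holds for pepper type here. Because the inputs are rounded, the results should be close to the printed interval and p adj rather than identical.

```r
# Inputs taken from the output above (rounded as printed).
mse        <- 450 / 59   # residual Sum Sq / Df from the pepper-type ANOVA
df_resid   <- 59         # residual degrees of freedom
k_levels   <- 5          # number of pepper types
n_level    <- 13         # observations per pepper type
diff_md_mf <- 2.9462     # Melka Dima - Mareko Fana, from the Tukey table

# Standard error used by the studentized range statistic (equal group sizes)
se <- sqrt(mse / n_level)

# Studentized range statistic and its upper-tail probability (the "p adj")
q_stat <- abs(diff_md_mf) / se
p_adj  <- ptukey(q_stat, nmeans = k_levels, df = df_resid, lower.tail = FALSE)

# Half-width of the 95% family-wise confidence interval
half_width <- qtukey(0.95, nmeans = k_levels, df = df_resid) * se

print(c(q_stat = q_stat, p_adj = p_adj))
print(c(lwr = diff_md_mf - half_width, upr = diff_md_mf + half_width))
```

The interval for this pair includes zero, which is consistent with its adjusted p-value sitting just above 0.05.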
To check the adequacy of the ANOVA as a means of analyzing this set of data, I created Quantile-Quantile (Q-Q) plots of the residuals to determine whether the residuals follow a normal distribution. I also created an interaction plot to see if there was an interaction effect between the two factors.
The nearly linear fit of the residuals in the two Q-Q plots is an indication that the ANOVA model may be adequate for this analysis.
The interaction plot following the Q-Q plots suggests that the two factors interact with each other in their effect on the response variable. This is indicated wherever two or more curves intersect or differ in slope.
The third type of plot is a Residuals vs. Fits plot, which is used to check for patterns in the residuals and to identify any outlying values. Because the residuals appear to be scattered around zero without an obvious pattern, the models used in these analyses appear reasonable for assessing the effect of pepper type and storage period on the moisture content of the peppers.
# QQ plot for residuals in the analysis of pepper type effect on moisture content
qqnorm(residuals(model_Pepper))
qqline(residuals(model_Pepper))
# QQ plot for residuals in the analysis of storage period effect on moisture content
qqnorm(residuals(model_StoragePeriods))
qqline(residuals(model_StoragePeriods))
interaction.plot(Peppers$Pepper_Type,Peppers$Storage_Periods,Peppers$Moisture_Content)
plot(fitted(model_Pepper),residuals(model_Pepper))
plot(fitted(model_StoragePeriods),residuals(model_StoragePeriods))
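One point worth keeping in mind when reading residual plots is that least-squares residuals from a model with an intercept always average to zero, so the useful signal is the spread and shape of the residuals rather than their center. The sketch below illustrates this on synthetic placeholder data (not the pepper measurements) and adds a Shapiro-Wilk test as a numeric companion to the Q-Q plot. The Shapiro-Wilk test was not part of the original analysis, and the simulated values and the names `sim` and `fit` are assumptions made only for illustration.

```r
set.seed(1)

# Synthetic placeholder data: five pepper types by five storage periods.
# These values are simulated and are NOT the published measurements.
sim <- expand.grid(
  Pepper_Type     = factor(c("Mareko Fana", "Melka Dima", "Melka Eshet",
                             "Melka Zala", "PBC 600")),
  Storage_Periods = factor(seq(0, 16, by = 4))
)
days <- as.numeric(as.character(sim$Storage_Periods))
sim$Moisture_Content <- 90 - 0.3 * days + rnorm(nrow(sim), sd = 1)

# Additive two-factor model, same form as the models used above
fit <- aov(Moisture_Content ~ Pepper_Type + Storage_Periods, data = sim)
res <- residuals(fit)

# The residual mean is zero up to floating-point error, by construction
mean_res <- mean(res)
print(mean_res)

# Numeric normality check to read alongside qqnorm()/qqline()
sw <- shapiro.test(res)
print(sw$p.value)

# Same diagnostic plots as in the analysis, side by side
par(mfrow = c(1, 2))
qqnorm(res); qqline(res)
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

A funnel shape or curvature in the right-hand plot would point to unequal variance or a missing term even though the residuals still average to zero.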
The data for this experimental analysis was obtained from the article ‘Postharvest Quality and Shelf Life of Some Hot Pepper Varieties’ by Samira et al., published in the Journal of Food Science and Technology (2013).