1 Completely Randomized Design (Peter, Jai Sai Teja, Ravi Teja)

A completely randomized design for comparing the mean of multiple populations calls for an ANOVA test. Usually in the experimental design portion of an ANOVA test, there are a few characteristics known. The alpha, or probability of type one error, and the beta, probability of type two error, are quantities that are set in advance.
CRD is mainly applicable when they are two or more populations or treatments of a single dependent variable to be tested. say, $Pop_{1},Pop_{2},Pop_{3},....,Pop_{I}$ [ I, total # of populations ]
Equal number of samples to be collected from each population. Let say ‘n’ is the number of samples collected from a population. so, ‘n’ should be equal for all the populations in the test.
However, the order of collecting data samples will be random for all the populations or treatments. \[ Pop_{1}: \_ , \_ ,s_{3}, \_ , ...,n_{samples}.\\Pop_{2}: \_ ,s_{1}, \_ , \_ , ...,n_{samples} \\Pop_{3}: \_ , \_ , \_ ,s_{2}, ...,n_{samples} \\Pop_{I}:s_{j}, \_ , \_ , \_ , ...,n_{samples} \] where, s= samples of the populations

1.1 Assumptions

The data is assumed to be normally distributed for each treatment level.
Variance of the distribution is assumed to be equal.

These assumptions with a short example were discussed briefly under the preliminary plots section.

1.2 Process

1.2.1 Statistical Test

Fixed or Random effect: The term “Fixed effects” refers to the treatment variables that the researcher has chosen particularly for the study. Categorical variables, such as varying degrees of an independent variable, are frequently used to denote the test treatments. In order to determine if there are statistically significant changes between the fixed treatment groups, the fixed effects in CRD are commonly examined using the ANOVA approach. Random effects are sources of variability that are not chosen by the researcher. These are frequently used to take into consideration for nuisances or uncontrolled variability. CRD compares fixed treatment effects while assuming that any uncontrolled sources of variation are random.
- For Fixed effects: $H_0:\tau_i=0$ and $H_a:\tau_i \not= 0$
- For random effects: $H_0:\sigma_{\tau i}^2=0$ and $H_a:\sigma_{\tau i}^2\not=0$
Each sample observation of the data will be represented as \[y_{ij} \Rightarrow j^{th} observation \ from\ the\ Population \ 'i' \\ \bar{y_{i.}} \Rightarrow Average \ of\ population\ 'i'\]
Linear Effects Model for CRD:\[y_{ij} = \mu_{i} + \epsilon_{ij}\ \forall \ i,j\]

As we understood that each population average can be defined with grand mean and the treatment effect as shown below.\[\mu_{i} = \mu + \tau_{i} \ \forall \ i\]

Finally, we can say that,\[y_{ij} = \mu + \tau_{i} + \epsilon_{ij}\ \forall \ i,j\]where,

$\mu_{i} = Mean\ of\ each\ population.\\ \epsilon_{ij} = Random\ error\ effect[i.e., Difference\ between\ observations(y_{ij}) with\ respect\ to\ population\ average(\mu_{i})]\\ \mu = Grand\ mean [i,e,. Average\ of\ all\ the\ populations\ mean]\\ \tau_{i} = Treatment\ effect\ with\ respect\ to\ each\ population [i.e.,Difference\ between\ population\ average(\mu_{i}) with\ respect\ to\ grand\ mean(\mu)]$
Hypothesis model:

Null Hypothesis: All the treatment level means are equal, which means No significant difference between the means.\[H_0 : \mu _{1} = \mu _{2} =\ . \ .\ . \ = \mu _{k} \\ or \\H_0 : \tau_{i} = 0 \ , \forall \ i\]

Alternative Hypothesis: At least one population mean is significantly different from others.\[H_a : Atleast \ one\ \mu_{i} \ differ\ from\ other\\or\\ \tau_{i} \neq 0 \ , \exists \ i\]
Single factor or One-Way ANOVA:

The name ANOVA derived from a partitioning of total variability into its components. We can say the following using sum of squares: \[SS_{T} = SS_{treatmenets} + SS_{E}\] i.e., \[\sum_{i=1}^{I}\sum_{j=1}^{n} \left ( y_{ij}-\bar{y}_{\cdot \cdot } \right )^{2} = n\sum_{i=1}^{I}\left ( \bar{y}_{i\cdot }-\bar{y}_{\cdot \cdot } \right )^{2} +\sum_{i=1}^{I}\sum_{j=1}^{n}\left ( y_{ij}-\bar{y}_{i \cdot } \right )^{2}\]

$SS_{T}$ = Sum of squares of total.

$SS_{treatments}$ = Sum of squares of treatments.

$SS_{E}$ = Sum of squares of errors.
- Degree of freedom for each component of variability.
  
  Dof for treatments : Number of treatments - 1 [i.e., I-1]
  
  Dof for errors : Total number of observations(I*J) - Number of treatments(I) [i.e., IJ-I)]
  
  Dof for Total : IJ - 1
Following up with Mean square(MS) for each component of the total variability. \[MS_{\left ( treatments \right )} = \frac{SS_{treatments}}{Dof_{\left ( treatments \right )}} = \frac{SS_{treatments}}{I-1}\] \[MS_{\left ( error \right )} = \frac{SS_{error}}{Dof_{\left ( error \right )}} = \frac{SS_{error}}{IJ-I}\]
After performing statistical analysis of these components, we have to perform “F”test. The F-test is a statistical test that compares the variances of two or more groups or populations. The ANOVA F statistic is the mean square treatment over the mean squared error, which are both $\chi^2$ distributions.

\[F_{0} = \frac{MS_{\left ( treatment \right )}}{MS_{\left ( error \right )}}\]

The ANOVA table compiles all of the previous information into one location. It is commonly used to easily assist in finding the F statistic.

Source	Degrees of Freedom	Sum of Squares	Mean Square	F Stastic
Treatment (Between)	\(\text{I-1}\)	\(\text{Sum of Square}\) \(\text{Treatment (SSTr)}\)	\(\text{Mean Square}\) \(\text{Treatment (MSTr)}\)	\(\text{F}_0=\frac{\text{MSTr}}{\text{MSE}}\)
Error (Within)	\(\text{I(J-1)} or \text{(IJ-I)}\)	\(\text{Sum of Square}\) \(\text{Error (SSE)}\)	\(\text{Mean Square}\) \(\text{Error (MSE)}\)
Total	\(\text{IJ-1}\)	\(\text{Sum of Square}\) \(\text{Total (SST)}\)

Treatment

(Between)

$\text{I-1}$

$\text{Sum of Square}$ $\text{Treatment (SSTr)}$

$\text{Mean Square}$ $\text{Treatment (MSTr)}$

$\text{F}_0=\frac{\text{MSTr}}{\text{MSE}}$

Error

(Within)

$\text{I(J-1)} or \text{(IJ-I)}$

$\text{Sum of Square}$ $\text{Error (SSE)}$

$\text{Mean Square}$

$\text{Error (MSE)}$

Total

$\text{IJ-1}$

$\text{Sum of Square}$ $\text{Total (SST)}$

1.2.2 Sample Size Determination

When Designing an experiment, the alpha, or probability of type one error, and the beta, probability of type two error, are quantities that are set in advance. This is similar to a T test. The difference comes in when calculating the number of samples that are required to have the required power. When using a T test with a normal distribution, alpha beta and n(sample size) are all related directly and if you have two, you can find the third. This is more complicated when using an ANOVA test to compare multiple populations because the ANOVA test follows an F distribution. The F distribution changes as the parameters change. Knowing the alpha and beta of your experiment doesn’t mean that it is possible to calculate the number of samples required directly.

The ANOVA F statistic is the mean square treatment over the mean squared error, which are both $\chi^2$ distributions. Combined, they make an F distribution with two degrees of freedom.

$F_0=\left[\frac{MST_r}{MSE}\right] \sim F_{\nu_1,\nu_2}$ .

$\nu_1=$ the degrees of freedom of the mean square treatment

$\nu_2=$ the degrees of freedom of the mean square error

To find the expected value of F, you look at the expected value of the mean square treatment over the expected value of the mean square error.

$E[MST_r]=\sigma^2+\sigma_{Trt}^2$

$E[MSE]=\sigma^2$
$E[F_0]=\left[\frac{E[MST_r]}{E[MSE]}\right]=1+\left[\frac{\sigma_{Trt}^2}{\sigma^2}\right]$

The value for $\sigma_{Trt}^2$ changes between fixed effects and random effects models. For fixed effects, $\sigma_{Trt}^2=\frac{J\sum{\tau_i}}{I-1}$ For random effects, $\sigma_{Trt}^2=\sigma_{\tau_i}^2$. The hypothesis of the ANOVA test changes with fixed effects and random effects models.

For Fixed effects: $H_0:\tau_i=0$ and $H_a:\tau_i \not= 0$

For random effects: $H_0:\sigma_{\tau i}^2=0$ and $H_0:\sigma_{\tau i}^2\not=0$

If the null hypothesis is true, $\sigma_{Trt}^2$ will equal zero. This means that the expected value for F will be driven to 1. When the expected value of F is one, you have a central F distribution. However, Whenever $\sigma_{Trt}^2$ does not equal 0, the expected value of F will change. This causes the null hypothesis to be rejected in favor of the alternative hypothesis. The term $\left[\frac{\sigma_{Trt}^2}{\sigma^2}\right]$ is called the “non centrality parameter”

When you have a non centrality parameter, your non central F now has 3 parameters. This changes the shape of the F distribution.

$F_0 \sim F_{\nu_1,\nu_2,[non\ centrality\ parameter] }$

It is worth noting that there are multiple ways for this to be written. It is popular for the same concept to be written as:

$E[F_0]=1+\left[\frac{\sigma_{Trt}^2}{\sigma^2}\right]=1+\left[\frac{\sigma_{between}^2}{\sigma_{within}^2}\right]$

The power calculation process is very taxing and is not done by hand anymore. There used to be tables that would be referenced when looking for an F statistic with certain parameters and certain power. Now, We have programs like R to do all of the work for us. This guide will show how to use R to calculate the power and number of samples in an ANOVA test.

To find the number of samples required for an experimental design in ANOVA, a power analysis of the experimental design can be done. This analysis requires that there is already information known about the data sets. This is information on the variability of the data. Both the $\sigma_{between}^2$ and the $\sigma_{within}^2$ must be known or estimated.

$\sigma_{between}^2$ can be estimated by $S^2$, $S^2=\frac{\sum (\mu_i-\mu.)^2}{n-1}$

1.2.2.1 power.anova.test

To do the power analysis in R, the power.anova.test function will be used. Following is the usage guide provided by R for the power.anova.test

power.anova.test(groups = NULL, n = NULL,between.var = NULL, within.var = NULL, sig.level = 0.05, power = NULL)

Arguments

`groups`	Number of groups
`n`	Number of observations (per group)
`between.var`	Between group variance
`within.var`	Within group variance
`sig.level`	Significance level (Type I error probability)
`power`	Power of test (1 minus Type II error probability)

Details

Exactly one of the parameters groups, n, between.var, power, within.var, and sig.level must be passed as NULL, and that parameter is determined from the others. Notice that sig.level has non-NULL default so NULL must be explicitly passed if you want it computed.

1.2.2.2 pwr.anova.test

What if the variances aren’t known values? This proves to be an issue because the power.anova.test is no longer usable. Instead the function pwr.anova.test is used. First, the library “pwr” must be added to your script using “library(pwr)” command. The variability is now estimated using Cohen’s method.

The famous statistician Jacob Cohen created a method for estimating the variability when the exact values of the means are unknown. This is done by choosing one of three options: minimum variability, intermediate variability, and maximum variability. Each option is selected based on the predicted spread of the means. He classifies each group as having an effect represented by f(different from the F statistic). Each option has a different equation to find the effect.

These equations use the following variables:

$\sigma =$ Standard deviation

$d=\frac{\mu_{max} - \mu_{min}}{\sigma}$

$k=$ the number of populations(Groups)

Minimum variability is chosen if the bulk of the means are clustered around the central part of the data. An example of this would be if the means of 6 populations are thought to be (2.0, 2.4, 2.5, 2.5, 2.6, 3.0). The means are clustered around the midpoint, and there is approximately equal space between the cluster and the high and low means.

$\text{f}_{min}=d\sqrt{\frac{1}{2k}}$

Intermediate variability is chosen of the means are equally distributed across the group of the data. For example, if there were five populations with the following estimated means, they would be a good candidate for using the intermediate variability model. (7.0, 8.5, 10.1, 11.6, 12.9). All of the means are almost equally spaced out.

$\text{f}_{int}=\frac{d}{2}\sqrt{\frac{k+1}{3(k-1)}}$

Maximum variability is used for situations where there is bi-modal data with two clusters of means at the high and low end of the spread. An example of this would be if six populations had means predicted to be (100, 101, 98, 176, 174, 175). All of the data is very clustered around the two end points.

For odd observations:

$\text{f}_{int}=\frac{d}{2}\sqrt{\frac{k+1}{3(k-1)}}$

For even observations:

$\text{f}_{max}=\frac{d}{2}$

This r value will now be used with the pwr.anova.test function to calculate either the power or number of samples required per group. Following is the usage guide provided by R for the pwr.anova.test.

pwr.anova.test(k = NULL, n = NULL, f = NULL, sig.level = 0.05, power = NULL)

Arguments

`k`	Number of groups
`n`	Number of observations (per group)
`f`	Effect size
`sig.level`	Significance level (Type I error probability)
`power`	Power of test (1 minus Type II error probability)

Details

Exactly one of the parameters ‘k’,‘n’,‘f’,‘power’ and ‘sig.level’ must be passed as NULL, and that parameter is determined from the others. Notice that the last one has non-NULL default so NULL must be explicitly passed if you want to compute it.

1.2.3 Design Layout

With the power test complete, all of the information needed to plan the data collection is now in hand. The size has been determined. The type one and two errors, and thus power, have been predetermined. The next step before data collection is the experimental design. To be a completely randomized design, the data collection order needs to be randomized.

The package “agricolae” in R will be used to generate the experimental design. This is a package created for analysis of many experiments commonly used in agricultural fields. It has a good analysis of CRD that is useful to us.

To use this package, first download it using the install.packages(“agricolae”) command. Next load the package using the library(agricolae) command. The package is now ready for use in your R script.

The design.crd is a package inside of the agricolae package. To set up the use of the design.crd package, format your information in the following way:

For a CRD with I=4 levels (lvl1, lvl2, lvl3, lvl4), specify the I levels of a factor/treatment in a vector.

Ex.

 trt1<-c("lvl1","lvl2","lvl3","lvl4")

Following is the usage guide provided by R for the design.crd

design.crd(trt, r, serie = 2, seed = 0, kinds = "Super-Duper",randomization=TRUE)

Arguments

`trt`	Treatments
`r`	Replications
`serie`	number plot, 1: 11,12; 2: 101,102; 3: 1001,1002
`seed`	seed
`kinds`	method for to randomize
`randomization`	TRUE or FALSE - randomize

For our purposes, the only options needed are trt, r, and seed. The other options have to do with the type of randomization algorithm. for the trt, input the trt1 vector that was just created. For r put in the number of replications needed (which is the n calculated by the power test) and for the seed number, put any number you would like. The seed number is the start to the random number generator that randomizes the results. The default settings will work for our experimental design for the rest of the variables. Continuing with our example for a CRD with four levels and six replications, the R code would look like the following:

library(agricolae) 
trt1<-c("lvl1","lvl2","lvl3","lvl4") 
design<-design.crd(trt=trt1,serie=0,r=6,seed=8240) 
design$book

##    plots r trt1
## 1      1 1 lvl4
## 2      2 1 lvl2
## 3      3 2 lvl4
## 4      4 1 lvl3
## 5      5 1 lvl1
## 6      6 2 lvl2
## 7      7 2 lvl1
## 8      8 3 lvl1
## 9      9 3 lvl2
## 10    10 4 lvl1
## 11    11 2 lvl3
## 12    12 3 lvl4
## 13    13 3 lvl3
## 14    14 4 lvl2
## 15    15 4 lvl4
## 16    16 5 lvl2
## 17    17 5 lvl1
## 18    18 4 lvl3
## 19    19 5 lvl3
## 20    20 5 lvl4
## 21    21 6 lvl2
## 22    22 6 lvl3
## 23    23 6 lvl4
## 24    24 6 lvl1

write.csv(design$book, file = "design.csv")

Calling the book column of the design shows the order that the samples should be collected. In this example, the first sample collected is from level 4. The second sample collected is from level 2. The third sample collected is from level 4 again. This continues until six samples of each level have been collected. This random order is critical for the validity of the results. The last line writes the experiment order to a .csv file named “Design”. This has a column called “plots” which keeps track of the sample number. It has another column called “r” which keeps track of a summation of how many samples from each population have been collected. It has another column called trt1(which is the name of your data set) and it tells you the order of sample collection. All of this is in a .csv file and can be edited in excel.

1.2.4 Collect Data

Using the file created in the previous section, another column should be made called “data”. This should record the observations of the data. Each sample should be collected in the order prescribed by the fourth column, in our case”trt1”. With all of the data collected, it can be read into R using the read.csv command followed by the location of the file.

CRD_Data<- read.csv("https://raw.githubusercontent.com/Jaisai0611/CRD-example/main/design.csv")
head(CRD_Data)

##   X plots r trt1 Data
## 1 1     1 1 lvl4    5
## 2 2     2 1 lvl2    6
## 3 3     3 2 lvl4    7
## 4 4     4 1 lvl3    5
## 5 5     5 1 lvl1    6
## 6 6     6 2 lvl2    5

1.2.5 Preliminary Plots

Let’s recap the assumptions considered earlier with a short example.

Normality of Data:

In R, we can use the “qqnorm” and “qqline” commands. These programs generate a quantile-quantile (Q-Q) plot, which allows you to visually verify the normality of your data. Your data is most likely regularly distributed if the points nearly follow a straight diagonal line.

The better the fit to a normal distribution, the closer the points are to the line. If the Q-Q plot exhibits a “S” curve, it indicates that your data has heavy tails and is not normally distributed. In that case we might consider doing data transformations or non-parametric testing.
```
pop1 <- c(21,35,42,47,28,36,40)
pop2 <- c(43,27,33,39,26,31,29)
pop3 <- c(30,38,41,25,35,37,45)

qqnorm(c(pop1,pop2,pop3), col = 'blue')
qqline(c(pop1,pop2,pop3), col = 'red')
```

Comment on Q-Q Plot: The observations are aligned closely to the Normality line without much deviation. So, we assume that the all the populations are normally distributed.

Constant Variance:

Boxplots can be used to visually test variance equality in a Completely Randomized Design. It gives an easy approach to examine the dispersion and distribution of data among multiple treatment populations. If the boxplots have similar box widths and box lengths and there are no obvious outliers indicating high variance discrepancies, this indicates that the variances are generally equal.
```
boxplot(pop1,pop2,pop3,col="lightblue", names = c("pop1","pop2","pop3"), main = "Box-Plot of 3 Random Populations", xlab= "Populations", ylab = "Observations")
```

Comment on Boxplot: The assumed populations seem to have an equal variance as the width of the each box is relatively similar to one another.

1.2.6 Residuals (Model Adequacy)

The ANOVA test is very sensitive to changes in variance. If the assumption of constant variance is not true, the ANOVA will not be an accurate test. To check the model adequacy, a boxcox plot can be referenced. This can lead to a variance stabilizing transformation if needed. Because these are not covered in this guide, only the instructions to read a boxcox plot are described. A boxcox plot will have a large arc on it and a confidence interval marked out. When comparing the confidence interval with the x axis, if 1 is included, the data is adequate for analysis by an ANOVA test. If the confidence interval does not include 1, the data collected is not suitable for an ANOVA test and a variance stabilizing transformation must be performed.

Another way to check the model adequacy is using the residuals. A residual is the difference between the the individual value in a population and the average value from this same population:

\[\text{e}_i = y_i.-\bar y_i.\]

To create the residual plot the aov function is used. When the function is run, it will output four different graphs. The ones that are important for our purposes are the first two. These plots are the residual spread plot and the normal probability plot of the residuals.

Following is a brief example that illustrates the aov function and shows the residual graphs

Lvl1<-c(1000,1500,1200,1800,1600,1100)
Lvl2<-c(1500,1800,2000,1200,2000,1700)
Lvl3<-c(900,1000,1200,1500,1200,1550)
Lvl4<-c(950,1050,1250,1550,1250,1600)

dat<-data.frame(Lvl1,Lvl2,Lvl3,Lvl4)
library(tidyr)
dat<-pivot_longer(dat,c(Lvl1,Lvl2,Lvl3,Lvl4),names_to = "Treatments")
aov.model<-aov(value~Treatments,data=dat)
plot(aov.model,1)

plot(aov.model,2)

Looking to the plots that are generated from the aov function, the first is the fitted values versus the residuals. What is important in this graph is that all of the values on the graph lie within two parallel lines. The approximate height and location of each column of values should be the same. The residuals should have about the same max and min values for each fitted value. Constant variance isn’t shown if there are some fitted values that are much higher and lower then the others, or if the values on the graph seem to make a large “less then”(<) or “greater then” sign(>) with wide variation on some fitted values and small variation on others. The second graph is the normal probability plot of the residuals. All of the values should show up in a straight line, similar to the previously explained normal probability plot with the actual data.

Alternatively, the data can be checked for constant variance using Levene’s test or Fisher’s test but those are not covered in this tutorial. If the data turns out to not validate the assumption of constant variance, use a type of transform to get the data to meet your assumptions. Variance stabilizing transforms are not covered in this tutorial. If this does not work. you can do a generalized linear model(GLM). GLM’s are not covered in this tutorial.

1.2.7 Inference

Now To determine statistical significance, compare the F statistic ($F_{0}$) to the critical value from the F-distribution table. The critical value is determined by your selected significance level $(\alpha )$ and the degrees of freedom for the F-distribution’s numerator $(v_{1})$ and denominator $(v_{2})$.

$v_{1}\rightarrow Dof_{\left ( treatments \right )} \\ v_{2}\rightarrow Dof_{\left ( error \right )}$

If the estimated F statistic is greater than the crucial value, then “REJECT THE NULL HYPOTHESIS”. This indicates that at least one group has significantly different variance than the rest. \[F_{0}> F\left ( \alpha ,v_{1} ,v_{2}\right )\\ i.e., \ P_{value}< \alpha \]If the estimated F statistic is less than or equal to the crucial value, then “FAILED TO REJECT THE NULL HYPOTHESIS”. This implies that there is no statistically significant variation in variances across groups.\[ F_{0}>F\left ( \alpha ,v_{1} ,v_{2}\right )\\ i.e.,\ P_{value}> \alpha \]

1.2.8 Multiple Comparisons

If the ANOVA results in a rejection of $H_0$ in favor of $H_a$, the conclusion that at least one mean significantly differs from the grand mean can be made. The next question is which mean or means differ significantly from the grand mean. To analyse this a multiple comparison test is run.

There is a multiple comparison test called Least Significant Difference(LSD) It is a test that resembles a 2 sample T test but with different degrees of freedom on $\sigma$. This is taught in some text books but it is not the best test to use because it does not control for family wise error(type 1 error).

The test that we will use is Tukey’s test. It is also known as the Honest Significant Difference test(HSD). This test does account for family wise error(type 1 error).

Tukey’s test uses the studentized range statistic, q. This is the difference between the largest and smallest data in a sample that is normalized by the sample standard deviation.

$q=\frac{\bar{y}_{max}-\bar{y}_{min}}{\sqrt{\frac{MSE}{n}}}\sim\ q_ {Dof\ of\ MSE}$

Using q, we set up a confidence interval between the difference of the means. If zero is not in the interval, you reject $H_O$. If zero is in the interval, you fail to reject $H_O$.

This is the interval that Tukey’s test checks:

$\bar{y}_i.-\bar{y}_j.- q_{\alpha}(\text{I,f})\sqrt{MSE(\frac{1}{n_i}+\frac{1}{n_j})} \eqslantless \mu_i-\mu_j \eqslantless \bar{y}_i.-\bar{y}_j.+ q_{\alpha}(\text{I,f})\sqrt{MSE(\frac{1}{n_i}+\frac{1}{n_j})}$

To do Tukey’s test in R, an aov(Analysis of variance) test is first used on the data. To do this test, your data must be saved in a data frame with the different populations in one column and the data in another column. The following example shows how to save data as vectors, then put them into a data frame. Then it shows how to use tidyr to organize the data into the previously described format. With the data set, an aov test is run and the data is saved to aov.model. Last Tukeys test is performed and plotted.

plot(TukeyHSD(aov.model),las=1,cex.axis = 0.75)

Looking at the results of the Tukey’s HSD test, it is clear that there is only one comparison where the confidence interval does not include zero. This is the difference in Level 3 and Level 2. This shows that the means between these to populations significantly differ. The rest of the comparisons include zero and so they are not significantly different from one another.

1.3 Example

This one is sourced from the Design and Analysis of Experiments textbook by D.C. Montgomery (8th ed.) Exercise 3.21 (pg.133).

EXPERIMENT PERFORMED and DATA:

An Article in Environmental International (Vol. 18, No.4, 1992) describes an experiment in which the amount of radon released in showers was investigated. Radon-enriched water was used in the experiment, and six different orifice diameters were tested in shower heads. The data from the experiment are shown in the following table:

*Data from the experiment (This data is stored as a CSV file in GitHub:* https://raw.githubusercontent.com/Jaisai0611/CRD-example/main/EX%203.21.csv)
Orifice Diameter	Radon Released (%) Sample 1	Radon Released (%) Sample 2	Radon Released (%) Sample 3	Radon Released (%) Sample 4
0.37	80	83	83	85
0.51	75	75	79	79
0.71	74	73	76	77
1.02	67	72	74	74
1.40	62	62	67	69
1.99	60	61	64	66

1.3.1 Experimental Data fits the CRD model or not? if CRD, Factor levels (Orifice Diameter) Fixed or Random Effects?

Observing the given data table, we can clearly understand that the experiment conducted or observations chosen would be in a random manner so that the environment in which the treatments are applied is as uniform as possible. Thus, It can be considered as a CRD Model.

The treatment effects, according to the study conducted author considered particularly 6 variants in orifice diameter (0.37, 0.51, 0.71, 1.02, 1.40, 1.99). Thus we can consider it as fixed effects (Single factor ANOVA). The same experiment would have also been best suitable for Random Effects if orifice diameters were randomly chosen out of a big range of diameters, especially to perform Statistical analysis.

1.3.2 Read CSV file into R and perform some data wrangling to best fit for ANOVA

The CSV file was read into R from the GitHub link. Using the pivot_longer command, we made sure to fit all the observations in a single column. Then, converted the data of Orifice Diameter as a factor. This was shown in the following Code chunk.

#Read the CSV file into R
ex3.21<- read.csv("https://raw.githubusercontent.com/Jaisai0611/CRD-example/main/EX%203.21.csv")

#Data wrangling to make it fit for ANOVA
library(tidyr)
ex3.21<- pivot_longer(ex3.21,2:5,names_to = "samples", values_to = "Radon Released%")
ex3.21$Orifice.Diameter<- as.factor(ex3.21$Orifice.Diameter)

1.3.3 Check for Assumptions

As discussed earlier, we must ensure that all the given data is normally distributed and that constant variance among all the factor levels of treatment.

Check for Normality of data
```
#Assumption1: Check for Normality of data
qqnorm(ex3.21$`Radon Released%`, ylab = "Radon Released %", col ='BLUE')
qqline(ex3.21$`Radon Released%`, col = 'RED')
```
From the above NORMAL Q-Q Plot, most of the data points lie upon the normality line or closer to it, and no major deviations from the normal line were observed. Thus it can be interpreted that the data fits into a Normal Distribution.
Check for Constant Variance
```
#Assumption2: Check for Variance equality among different levels of Treatments
boxplot(ex3.21$`Radon Released%`~ ex3.21$Orifice.Diameter, main = "Boxplot for Orifice Diameters", xlab = "Orifice Diameters", ylab = "% of Radon Released", col = 'lightblue')
```
From the plotted Boxplot, based on the size of each box which represents the variance of Radon released percentage for respective orifice diameter, it can be only interpreted as approximately constant variance.
However, by performing Boxcox, we can check the Model adequacy to perform ANOVA.

1.3.4 Model Adequacy check using Boxcox

library(MASS)
boxcox(ex3.21$`Radon Released%`~ ex3.21$Orifice.Diameter,lambda = seq(-5, 8))

we can observe that ( $\lambda$ =1) lies within the confidence interval, thus confirming that our data is valid to proceed with ANOVA.

1.3.5 ANOVA (Analysis of Variance)

HYPOTHESIS: (Recap Discussions)
- Null Hypothesis: All the treatment level means are equal, which means No significant difference between the means of Radon released (%) with respect to Orifice Diameter.
  
  \[ H_{0}: \mu_{1} =\mu_{2} =\mu_{3} =\mu_{4} =\mu_{5} =\mu_{6} \\ (Or)\\ H_{0}: \tau_{i} = 0\ \ \ \forall i \]
- Alternate Hypothesis: At least one mean is significantly different from others, which means at least one of the means out of 6 types of orifice diameters significantly differs from the rest.
  
  \[ H_a: atleast\ one\ mean\ significantly\ differs\\ (Or)\\ H_a: \tau_i\ \neq\ 0\ \ \ \exists\ i \]

Applying Hypothesis to our example:

#Performing single factor ANOVA
Anova_3.21<- aov(`Radon Released%`~ Orifice.Diameter, data = ex3.21)
summary(Anova_3.21)

##                  Df Sum Sq Mean Sq F value   Pr(>F)    
## Orifice.Diameter  5 1133.4  226.68   30.85 3.16e-08 ***
## Residuals        18  132.2    7.35                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can observe that p-value (3.16e-08) is far less than our significant value (say 0.001 or 99% Conf. interval), which leads to reject the null hypothesis. Hence, there is a significant difference between the various types of Orifice diameters w.r.t Radon Released (%).

Although we rejected null hypothesis, we are still unaware about which pair of means exactly differs. lets check the residual plots and later perform TukeysHSD test.

1.3.6 Residual plots from ANOVA

plot(Anova_3.21,1, col = 'BLUE', col.lab = 'red', col.axis = 'blue')

Comment: Analyzing the Residuals vs Fitted values plot, widespread residual points can be observed which tells us that it has higher variability. however across all the treatments variance spread seems to be the same which was indicated by the red line over the center of the plot, and even spread can be observed on either side of the line.

plot(Anova_3.21,2, col = 'BLUE', col.lab = 'red', col.axis = 'blue')

Comment: Observing the normal probability plot of the residuals, hard to interpret it as a Normality because we can see a few outliers from the normality line which is not even closely bounded.

However, we can’t say that this data doesn’t fit exactly for this type of statistical model, but let’s proceed with the HSD test just for sake of identification of pairs of mean that are significantly different. This step accomplishes the ultimate goal of performing this statistical test (ANOVA with CRD)

1.3.7 Tukeys-HSD Test

#Multiple pairwise means comparison using tukeys-HSD test
library(car)
results<-TukeyHSD(Anova_3.21)
results_matrix <- as.matrix (results) 
df_res <- as.data.frame(results_matrix[1]) #this conversion of results to matrix and data fram is for sake of ease in interpretting the color coded plot to identify the pairs of mean which significantly differ.
plot(results, col= ifelse(df_res[,4]<0.05, 'red', 'darkgreen'), las = 1, xlim =c(-26,4), cex.axis = 0.75) # Aguments like las (axis lable directin), xlim (xasis scale limits), cex.axis (font size of axis labels) were used here in order to ehance the looks of graphics and also due to space contraints.

Observing the Tukeys HSD plot, a few pairs of orifice diameters whose differences in mean levels of oriffice diameter intersect with “0”, meaning those pairs are not significantly different. But, all those pairs on the plot, highlighted with red color are significantly different.

1.3.8 Conclusion:

On an overall scale, since more no. of pairs of mean (Orifice diameters) are significantly different. Hence, This statistical analysis strongly believes that Orifice diameter is affecting the percentage of Radon Released.

1.3.9 Complete Code

#Read the CSV file into R
ex3.21<- read.csv("https://raw.githubusercontent.com/Jaisai0611/CRD-example/main/EX%203.21.csv")

#Data wrangling to make it fit for ANOVA
library(tidyr)
ex3.21<- pivot_longer(ex3.21,2:5,names_to = "samples", values_to = "Radon Released%")
ex3.21$Orifice.Diameter<- as.factor(ex3.21$Orifice.Diameter)

#Assumption1: Check for Normality of data
qqnorm(ex3.21$`Radon Released%`, ylab = "Radon Released %", col ='BLUE')
qqline(ex3.21$`Radon Released%`, col = 'RED')

#Assumption2: Check for Variance equality among different levels of Treatments
boxplot(ex3.21$`Radon Released%`~ ex3.21$Orifice.Diameter, main = "Boxplot for Orifice Diameters", xlab = "Orifice Diameters", ylab = "% of Radon Released", col = 'lightblue')

#Check if transformation of data required or not (MODEL ADEQUECY)
library(MASS)
boxcox(ex3.21$`Radon Released%`~ ex3.21$Orifice.Diameter,lambda = seq(-5, 8, 1/2))

#Performing single factor ANOVA
Anova_3.21<- aov(`Radon Released%`~ Orifice.Diameter, data = ex3.21)
summary(Anova_3.21)
plot(Anova_3.21)

#Multiple pairwise means comparison using tukeys-HSD test
library(car)
comparision<-TukeyHSD(Anova_3.21)
results_matrix <- as.matrix (results) 
df_res <- as.data.frame(results_matrix[1]) #this conversion of results to matrix and data fram is for sake of ease in interpretting the color coded plot to identify the pairs of mean which significantly differ.
plot(results_matrix, col= ifelse(df_res[,4]<0.05, 'red', 'black'), las = 1, xlim =c(-26,4), cex.axis = 0.75) # Aguments like las (axis lable directin), xlim (xasis scale limits), cex.axis (font size of axis labels) were used here in order to ehance the looks of graphics and also due to space contraints.

1.4 Acknowledgement:

All the technical content and equations used in this cookbook are sourced from the lectures of Dr.Timothy Matis.

CRD EXAMPLE (TEXT_Ex3.21)

Author 1 “Keller Peter”

Author 2 “Jai Sai Teja Reddy Devalampeta”

Author 3 “Ravi Teja Vijjana”

Last compiled on October 22, 2023 at 8:46 PM - CDT

1 Completely Randomized Design (Peter, Jai Sai Teja, Ravi Teja)

1.1 Assumptions

1.2 Process

1.2.1 Statistical Test

1.2.2 Sample Size Determination

1.2.2.1 power.anova.test

1.2.2.2 pwr.anova.test

1.2.3 Design Layout

1.2.4 Collect Data

1.2.5 Preliminary Plots

1.2.6 Residuals (Model Adequacy)

1.2.7 Inference

1.2.8 Multiple Comparisons

1.3 Example

1.3.1 Experimental Data fits the CRD model or not? if CRD, Factor levels (Orifice Diameter) Fixed or Random Effects?

1.3.2 Read CSV file into R and perform some data wrangling to best fit for ANOVA

1.3.3 Check for Assumptions

1.3.4 Model Adequacy check using Boxcox

1.3.5 ANOVA (Analysis of Variance)

1.3.6 Residual plots from ANOVA

1.3.7 Tukeys-HSD Test

1.3.8 Conclusion:

1.3.9 Complete Code

1.4 Acknowledgement: