N. Uttam Singh¹

Abhishek Thakur¹*

Eric Rani¹

*Corresponding Author Email:

¹ICAR Research Complex for NEH Region

Umiam, Meghalaya

Introduction

Analysis of Variance (ANOVA) is one of the most widely used statistical techniques in agricultural research. It is used to compare treatment means and determine whether significant differences exist among treatments under experimental conditions.

ANOVA helps researchers identify treatment effects and distinguish them from random experimental variation.

RStudio provides an efficient and user-friendly environment for performing ANOVA, graphical visualization, and statistical interpretation.

Objectives

The objectives of this practical tutorial are:

  • To understand the concept of ANOVA
  • To import agricultural datasets into RStudio
  • To perform ANOVA using R
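  • To calculate the critical difference (CD), coefficient of variation (CV), standard error of mean (SEm), and standard error of difference (SEd)
  • To perform Duncan's Multiple Range Test (DMRT) for multiple comparisons
  • To generate graphical representations of treatment means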

Software Requirements

Software      Purpose
R Software    Statistical Computing
RStudio       Integrated Development Environment

Introduction to R and RStudio

R is an open-source programming language widely used for statistical analysis, data visualization, and predictive modelling.

RStudio is an Integrated Development Environment (IDE) for R.

Main Components of RStudio

  1. Source Editor
  2. Console
  3. Environment/History
  4. Files/Plots/Packages/Help

Agricultural Dataset Description

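In this tutorial, agricultural experimental data consisting of treatments and replications are analysed using ANOVA techniques. The example dataset contains 24 observations: eight treatments (T1 to T8), each replicated three times (R1 to R3), with Yield as the response variable.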

Required Package Installation

Run these commands once in the R Console:

install.packages("readxl")
install.packages("dplyr")
install.packages("ggplot2")
install.packages("agricolae")
install.packages("emmeans")
install.packages("knitr")
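Calling install.packages() downloads a package again even when it is already installed, so it can help to check first which packages are actually missing. The following is a minimal sketch of such a check and is not part of the original tutorial; the helper name missing_packages is hypothetical.

```r
# Packages used in this tutorial
required <- c("readxl", "dplyr", "ggplot2", "agricolae", "emmeans", "knitr")

# Hypothetical helper: return the names in `pkgs` that are not installed.
# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required)
print(to_install)  # character(0) means nothing is missing
```

If to_install is not empty, install.packages(to_install) installs only those packages.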

Load Required Packages

library(readxl)
library(ggplot2)
library(dplyr)
library(knitr)
library(agricolae)
library(emmeans)

Import Data (Not for Users)

data <- read_excel("rbddata.xlsx")

Command to be Used (For Users)

data <- read_excel("C:/Users/Abhisek/Desktop/RBD in R studio/rbddata.xlsx")
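A path copied from Windows File Explorer contains backslashes, which R treats as escape characters inside a quoted string. One way to convert such a path is sketched below, assuming the same example location as in the command above; the variable names windows_path and r_path are illustrative only.

```r
# Path as copied from Windows File Explorer. Inside an R string each
# backslash must itself be written as "\\".
windows_path <- "C:\\Users\\Abhisek\\Desktop\\RBD in R studio\\rbddata.xlsx"

# Replace every backslash with a forward slash
# (fixed = TRUE: treat the pattern as plain text, not a regular expression)
r_path <- gsub("\\", "/", windows_path, fixed = TRUE)
print(r_path)

# Check that the file is really there before trying to import it
print(file.exists(r_path))
```

When file.exists() reports TRUE, read_excel(r_path) can be used to import the file.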

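Note 1: To import the data, use the second command (under "Command to be Used"), replacing the path with the location of the Excel file on your computer. The first command is given for reference only: it is the one used in the R Markdown source of this tutorial and works only when rbddata.xlsx is in the current working directory.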

Note 2: When specifying the file path in R, use forward slashes (/) instead of backslashes (\), because R treats a backslash inside a quoted string as an escape character. For example:

C:/Users/Abhisek/Desktop/RBD in R studio/rbddata.xlsx

Preview of Dataset

head(data)
## # A tibble: 6 × 3
##   Treatment Replication Yield
##   <chr>     <chr>       <dbl>
## 1 T1        R1             45
## 2 T1        R2             47
## 3 T1        R3             46
## 4 T2        R1             52
## 5 T2        R2             54
## 6 T2        R3             53

Convert Variables into Factor

data$Treatment <- as.factor(data$Treatment)
data$Replication <- as.factor(data$Replication)

Structure of Dataset

str(data)
## tibble [24 × 3] (S3: tbl_df/tbl/data.frame)
##  $ Treatment  : Factor w/ 8 levels "T1","T2","T3",..: 1 1 1 2 2 2 3 3 3 4 ...
##  $ Replication: Factor w/ 3 levels "R1","R2","R3": 1 2 3 1 2 3 1 2 3 1 ...
##  $ Yield      : num [1:24] 45 47 46 52 54 53 58 57 59 63 ...

Summary Statistics

summary(data)
##    Treatment Replication     Yield      
##  T1     :3   R1:8        Min.   :45.00  
##  T2     :3   R2:8        1st Qu.:51.75  
##  T3     :3   R3:8        Median :59.50  
##  T4     :3               Mean   :59.75  
##  T5     :3               3rd Qu.:66.00  
##  T6     :3               Max.   :77.00  
##  (Other):6

Treatment Means

aggregate(Yield ~ Treatment,
          data=data,
          mean)
##   Treatment Yield
## 1        T1    46
## 2        T2    53
## 3        T3    58
## 4        T4    64
## 5        T5    70
## 6        T6    61
## 7        T7    50
## 8        T8    76
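The dplyr package loaded earlier offers another way to summarise treatments, returning the number of observations and the standard deviation alongside the mean. The sketch below is an illustration rather than part of the original analysis: the helper name treatment_summary is hypothetical, and it is demonstrated on the six rows printed by head(data) so that it runs on its own.

```r
library(dplyr)

# Hypothetical helper: per-treatment count, mean and standard deviation of Yield.
# Assumes `df` has columns named Treatment and Yield.
treatment_summary <- function(df) {
  df %>%
    group_by(Treatment) %>%
    summarise(n = n(),
              mean_yield = mean(Yield),
              sd_yield = sd(Yield),
              .groups = "drop")
}

# Demonstration on the six rows shown by head(data) in this tutorial
first_rows <- data.frame(
  Treatment   = rep(c("T1", "T2"), each = 3),
  Replication = rep(c("R1", "R2", "R3"), times = 2),
  Yield       = c(45, 47, 46, 52, 54, 53)
)

print(treatment_summary(first_rows))
```

With the full dataset the call would be treatment_summary(data).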

Boxplot of Treatments

ggplot(data,
       aes(x=Treatment,
           y=Yield,
           fill=Treatment)) +
  geom_boxplot() +
  labs(title="Treatment-wise Yield",
       x="Treatment",
       y="Yield") +
  theme_minimal()

Analysis of Variance (ANOVA)

Fitting ANOVA Model

# One-way model: Treatment is the only term (Replication is not fitted as a block)
model <- aov(Yield ~ Treatment,
             data=data)

model
## Call:
##    aov(formula = Yield ~ Treatment, data = data)
## 
## Terms:
##                 Treatment Residuals
## Sum of Squares     2164.5      16.0
## Deg. of Freedom         7        16
## 
## Residual standard error: 1
## Estimated effects may be unbalanced

ANOVA Table

summary(model)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## Treatment    7   2164   309.2   309.2 7.34e-16 ***
## Residuals   16     16     1.0                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
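The model fitted above contains Treatment only, so any variation between replications remains in the residual term. Because the data file is named as an RBD dataset and has a Replication column, one might also fit Replication as a block. The following is a sketch under that assumption, not the analysis reported in this tutorial; fit_rbd is a hypothetical helper name.

```r
# Hypothetical helper: fit an RBD-style model with Replication as the block term.
# Assumes `df` has columns named Treatment, Replication and Yield.
fit_rbd <- function(df) {
  df$Treatment   <- as.factor(df$Treatment)
  df$Replication <- as.factor(df$Replication)
  aov(Yield ~ Replication + Treatment, data = df)
}
```

With the tutorial dataset this would be called as summary(fit_rbd(data)). For 8 treatments and 3 replications with no missing values, the error degrees of freedom become (8 - 1)(3 - 1) = 14 instead of 16, so the error mean square and every statistic derived from it (SEm, SEd, CD, CV) would change accordingly.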

Interpretation of ANOVA Table

The ANOVA table provides:

  • Degrees of freedom
  • Sum of squares
  • Mean squares
  • F-value
  • p-value
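As a worked check, the entries of the ANOVA table above are linked by the relations below. The symbols SS (sum of squares), MS (mean square) and df (degrees of freedom) are notation introduced here for illustration; the numbers are taken from the table.

```latex
\begin{aligned}
MS_{\text{Treatment}} &= \frac{SS_{\text{Treatment}}}{df_{\text{Treatment}}} = \frac{2164.5}{7} \approx 309.2 \\
MS_{\text{Residuals}} &= \frac{SS_{\text{Residuals}}}{df_{\text{Residuals}}} = \frac{16}{16} = 1.0 \\
F &= \frac{MS_{\text{Treatment}}}{MS_{\text{Residuals}}} = \frac{309.2}{1.0} \approx 309.2
\end{aligned}
```

A large F-value means that the variation among treatment means is large relative to the variation within treatments.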

Decision Rule

p-value      Interpretation
p < 0.05     Significant treatment effect
p ≥ 0.05     Non-significant treatment effect
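The decision rule can also be applied in code. The sketch below recomputes the F-value and its p-value from the sums of squares and degrees of freedom printed in the ANOVA table of this tutorial, using base R's pf(); the variable names are illustrative, and the 5% level is the one used in the rule above.

```r
# Values read from the ANOVA table in this tutorial
ss_treatment <- 2164.5
df_treatment <- 7
ss_error     <- 16
df_error     <- 16

# F = treatment mean square / error mean square
f_value <- (ss_treatment / df_treatment) / (ss_error / df_error)

# Upper-tail probability of the F distribution, i.e. Pr(>F) in the ANOVA table
p_value <- pf(f_value, df_treatment, df_error, lower.tail = FALSE)

alpha <- 0.05  # 5% level of significance
decision <- if (p_value < alpha) "Significant" else "Non-significant"

print(f_value)
print(p_value)
print(decision)
```

The printed p-value should agree with the Pr(>F) column of the ANOVA table, since it is the same calculation.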

Coefficient of Variation (CV)

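# CV = sqrt(error mean square) / grand mean * 100
# The error mean square divides the residual sum of squares by the
# residual degrees of freedom, not by the number of observations.
cv <- (sqrt(deviance(model)/df.residual(model)) /
         mean(data$Yield)) * 100

cv
## [1] 1.67364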

Interpretation of CV

  • Lower CV indicates better experimental precision and reliability
  • Higher CV indicates greater experimental variability
  • In a Randomized Block Design (RBD), CV values of about 5–15% are generally considered acceptable for agricultural experiments
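For reference, the CV is conventionally computed from the error mean square (MSE), that is, the residual sum of squares divided by the error degrees of freedom rather than by the number of observations. A worked version using MSE = 1.0 from the ANOVA table and the grand mean of 59.75 from the summary statistics is sketched below; it agrees with the CV of 1.67364 reported in the duncan.test() output later in this tutorial.

```latex
CV\,(\%) = \frac{\sqrt{MSE}}{\bar{Y}} \times 100
         = \frac{\sqrt{1.0}}{59.75} \times 100 \approx 1.67
```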

Standard Error of Mean (SEm)

# Error mean square (MSE) from the fitted model
mse <- deviance(model)/df.residual(model)

# Number of replications per treatment
r <- 3

SEm <- sqrt(mse/r)

SEm
## [1] 0.5773503

Standard Error of Difference (SEd)

SEd <- sqrt(2*mse/r)

SEd
## [1] 0.8164966

Critical Difference (CD)

# Two-sided t value at the 5% level of significance,
# on the error degrees of freedom
tvalue <- qt(0.975,
             df.residual(model))

CD <- tvalue * SEd

CD
## [1] 1.730895
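The four precision statistics depend only on the error mean square, the number of replications, the error degrees of freedom and the grand mean, so they can be gathered in one small function. The sketch below assumes equal replication for all treatments; precision_stats and its argument names are hypothetical, and the input values are the ones reported in this tutorial.

```r
# Hypothetical helper: precision statistics for equally replicated treatments.
#   mse        error mean square from the ANOVA table
#   r          number of replications per treatment
#   df_error   error degrees of freedom
#   grand_mean overall mean of the response
#   alpha      level of significance used for the CD
precision_stats <- function(mse, r, df_error, grand_mean, alpha = 0.05) {
  sem <- sqrt(mse / r)                       # standard error of a treatment mean
  sed <- sqrt(2 * mse / r)                   # standard error of a difference
  cd  <- qt(1 - alpha / 2, df_error) * sed   # critical difference (two-sided t)
  cv  <- sqrt(mse) / grand_mean * 100        # coefficient of variation, %
  c(SEm = sem, SEd = sed, CD = cd, CV = cv)
}

# Values from this tutorial: MSE = 1, r = 3, error df = 16, grand mean = 59.75
stats_5 <- precision_stats(mse = 1, r = 3, df_error = 16, grand_mean = 59.75)
print(round(stats_5, 4))

# The same helper gives the CD at the 1% level
stats_1 <- precision_stats(mse = 1, r = 3, df_error = 16, grand_mean = 59.75,
                           alpha = 0.01)
print(round(stats_1["CD"], 4))
```

Two treatment means are declared significantly different when their absolute difference exceeds the CD at the chosen level.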

Duncan's Multiple Range Test (DMRT)

dmrt <- agricolae::duncan.test(model,
                               "Treatment",
                               group=TRUE)

dmrt
## $statistics
##   MSerror Df  Mean      CV
##         1 16 59.75 1.67364
## 
## $parameters
##     test    name.t ntr alpha
##   Duncan Treatment   8  0.05
## 
## $duncan
##      Table CriticalRange
## 2 2.997999      1.730895
## 3 3.143802      1.815075
## 4 3.234945      1.867696
## 5 3.297445      1.903780
## 6 3.342599      1.929850
## 7 3.376283      1.949298
## 8 3.401918      1.964098
## 
## $means
##    Yield std r        se Min Max  Q25 Q50  Q75
## T1    46   1 3 0.5773503  45  47 45.5  46 46.5
## T2    53   1 3 0.5773503  52  54 52.5  53 53.5
## T3    58   1 3 0.5773503  57  59 57.5  58 58.5
## T4    64   1 3 0.5773503  63  65 63.5  64 64.5
## T5    70   1 3 0.5773503  69  71 69.5  70 70.5
## T6    61   1 3 0.5773503  60  62 60.5  61 61.5
## T7    50   1 3 0.5773503  49  51 49.5  50 50.5
## T8    76   1 3 0.5773503  75  77 75.5  76 76.5
## 
## $comparison
## NULL
## 
## $groups
##    Yield groups
## T8    76      a
## T5    70      b
## T4    64      c
## T6    61      d
## T3    58      e
## T2    53      f
## T7    50      g
## T1    46      h
## 
## attr(,"class")
## [1] "group"
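In the $groups table, treatments that share a letter do not differ significantly at the 5% level; here each of the eight treatments carries its own letter, so every pair of treatment means differs significantly. The $duncan table can be reproduced by hand, which may clarify where the critical ranges come from. The sketch below assumes the usual Duncan formulation (studentized range quantile at probability (1 - alpha)^(p - 1), multiplied by SEm) and uses base R's qtukey(); it is an illustration, not a replacement for duncan.test().

```r
alpha    <- 0.05
mse      <- 1     # error mean square (MSerror in the output above)
r        <- 3     # replications per treatment
df_error <- 16    # error degrees of freedom
p        <- 2:8   # number of ordered means spanned by a comparison

# Duncan's table value: studentized range quantile at
# protection level (1 - alpha)^(p - 1)
table_value <- qtukey((1 - alpha)^(p - 1), p, df_error)

# Critical range = table value * standard error of a treatment mean
critical_range <- table_value * sqrt(mse / r)

print(data.frame(p = p, Table = table_value, CriticalRange = critical_range))
```

For p = 2 the critical range coincides with the CD computed earlier (1.730895); for means further apart in rank, Duncan's test requires a progressively larger difference.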

Graphical Representation of Treatment Means

means <- aggregate(Yield ~ Treatment,
                   data=data,
                   mean)

ggplot(means,
       aes(x=Treatment,
           y=Yield,
           fill=Treatment)) +
  geom_bar(stat="identity") +
  labs(title="Treatment Means",
       x="Treatment",
       y="Mean Yield") +
  theme_minimal()
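Bar charts of treatment means are often presented with error bars and the DMRT letters above each bar. The sketch below shows one possible way to do this with ggplot2. To keep it runnable on its own, the means and letters are copied from the $groups output and the standard error (0.5773503) from the SEm computed earlier; the object names dmrt_groups and mean_plot are illustrative.

```r
library(ggplot2)

# Treatment means and DMRT letters copied from the $groups output above;
# se is the standard error of a treatment mean (SEm) computed earlier.
dmrt_groups <- data.frame(
  Treatment = c("T8", "T5", "T4", "T6", "T3", "T2", "T7", "T1"),
  Yield     = c(76, 70, 64, 61, 58, 53, 50, 46),
  groups    = c("a", "b", "c", "d", "e", "f", "g", "h"),
  se        = 0.5773503
)

mean_plot <- ggplot(dmrt_groups,
                    aes(x = Treatment, y = Yield, fill = Treatment)) +
  geom_col() +
  # Error bars of one SEm above and below each mean
  geom_errorbar(aes(ymin = Yield - se, ymax = Yield + se), width = 0.2) +
  # DMRT letter placed just above the upper error bar
  geom_text(aes(y = Yield + se, label = groups), vjust = -0.5) +
  labs(title = "Treatment Means with DMRT Grouping",
       x = "Treatment",
       y = "Mean Yield") +
  theme_minimal()

print(mean_plot)
```

With the objects from this tutorial, the same data frame can be built from dmrt$groups, whose row names hold the treatment labels.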

Applications in Agriculture

ANOVA has wide applications in agricultural sciences.

Major Applications

  • Crop variety evaluation
  • Fertilizer response studies
  • Irrigation management
  • Soil nutrient experiments
  • Agronomic research
  • Plant breeding studies
  • Pest and disease management trials

Conclusion

ANOVA is an important statistical tool for analysing agricultural experimental data and comparing treatment effects. RStudio provides a powerful environment for performing ANOVA, calculating statistical parameters, conducting multiple comparison tests, and generating graphical outputs. The procedures demonstrated in this tutorial can be extended to advanced experimental designs and agricultural data analysis.