This 433/533 class exercise will focus on the analysis of fungicide application timing and the efficacy of different products on crop yield using field plot data. I extracted the first set of data (mock_yield_data) from the picture (multi-location trial) we used in one of our classes. Another set of data is created from multi-location, multi-product trials (test, test2, test4, test4, etc.). The dataset is stored in an Excel file that is in the working directory and includes different treatments applied at various locations. Make sure use a correct folder path where you have all the files for subsequent analysis.
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com. It’s a combination of: - Markdown (text with formatting) - R code chunks (sections of code you can run) - Output (plots, summaries, tables)
# Webling: https://posit.co/download/rstudio-desktop/
# Note first install 'R' then 'Rstudio':
Packages are tools that add functions to R. Run on console to get required packages. >install.packages(“librarian”) >install.packages(“bookdown”)
packages <- c(“ggplot2”, “dplyr”, “readxl”, “knitr”, “tidyr”, “agricolae”, “here”, “bookdown”)
new_packages <- packages[!(packages %in% installed.packages()[,“Package”])] if(length(new_packages)) install.packages(new_packages)
librarian::shelf(readr, knitr,readxl, dplyr, tidyr, ggplot2, agricolae, here, bookdown)
# The 'librarian' package checks if packages are installed, verifies their version,
# installs only if needed, and then loads all desired packages into the R session.
# however it required CRAN mirriring, otheresie fails.
librarian::shelf(readr, knitr,readxl, dplyr, tidyr, ggplot2, agricolae, here)
This is the folder where R will look for your files.
setwd("C:/Users/huma0/Dropbox/SDSU/Teaching/Diseases_of_Field_Crops_Major_years/2025_Spring/2025_Data_class_35")
#If you get error: Classic Windows path issue in R:backslashes (\) in the file path
# Use forward slashes /: if you are copying path from Windows
# use your working directory. Can I make red colored?
💡 Note: Make sure the folder path above is correct for your computer.
Example path: C:/Users/huma0/Desktop/Data
We will load a CSV file named mock_yield_data.csv.
data <- read_csv("mock_yield_data.csv")
## Rows: 96 Columns: 5
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr (2): Location, Treatment
## dbl (3): Rep, Yield, Severity
##
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
#use read_csv or read_excel
# View the first few rows
#Explore the Data
head(data)
## # A tibble: 6 x 5
## Location Treatment Rep Yield Severity
## <chr> <chr> <dbl> <dbl> <dbl>
## 1 ClayCenter Check 1 77.5 8
## 2 ClayCenter Check 2 78.8 7
## 3 ClayCenter Check 3 85.9 7
## 4 ClayCenter Check 4 80.0 9
## 5 ClayCenter R3 1 77.2 3.1
## 6 ClayCenter R3 2 83.3 3.2
str(data)
## spc_tbl_ [96 x 5] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ Location : chr [1:96] "ClayCenter" "ClayCenter" "ClayCenter" "ClayCenter" ...
## $ Treatment: chr [1:96] "Check" "Check" "Check" "Check" ...
## $ Rep : num [1:96] 1 2 3 4 1 2 3 4 1 2 ...
## $ Yield : num [1:96] 77.5 78.8 85.9 80 77.2 ...
## $ Severity : num [1:96] 8 7 7 9 3.1 3.2 3.4 3.4 6 5.4 ...
## - attr(*, "spec")=
## .. cols(
## .. Location = col_character(),
## .. Treatment = col_character(),
## .. Rep = col_double(),
## .. Yield = col_double(),
## .. Severity = col_double()
## .. )
## - attr(*, "problems")=<externalptr>
summary(data)
## Location Treatment Rep Yield
## Length:96 Length:96 Min. :1.00 Min. :58.95
## Class :character Class :character 1st Qu.:1.75 1st Qu.:72.21
## Mode :character Mode :character Median :2.50 Median :77.79
## Mean :2.50 Mean :77.92
## 3rd Qu.:3.25 3rd Qu.:84.82
## Max. :4.00 Max. :94.99
## Severity
## Min. : 2.700
## 1st Qu.: 3.200
## Median : 4.550
## Mean : 5.146
## 3rd Qu.: 6.425
## Max. :10.000
Summarize by Group. We want the mean, standard deviation, and standard error by location and treatment.
# Summary Statistics
summary_df <- data %>%
group_by(Location, Treatment) %>%
summarize(
MeanYield = mean(Yield, na.rm = TRUE),
SD = sd(Yield, na.rm = TRUE),
n = n(),
SE = SD / sqrt(n),
.groups = "drop"
)
head(summary_df)
## # A tibble: 6 x 6
## Location Treatment MeanYield SD n SE
## <chr> <chr> <dbl> <dbl> <int> <dbl>
## 1 ClayCenter Check 80.5 3.73 4 1.86
## 2 ClayCenter R3 77.7 4.70 4 2.35
## 3 ClayCenter V4 84.0 3.61 4 1.81
## 4 ClayCenter V4R3 85.3 4.12 4 2.06
## 5 Clinton Check 84.4 5.22 4 2.61
## 6 Clinton R3 84.1 1.71 4 0.856
# Set plotting order for Treatment
summary_df$Treatment <- factor(summary_df$Treatment, levels = c("Check", "V4", "R3", "V4R3"))
head(summary_df)
## # A tibble: 6 x 6
## Location Treatment MeanYield SD n SE
## <chr> <fct> <dbl> <dbl> <int> <dbl>
## 1 ClayCenter Check 80.5 3.73 4 1.86
## 2 ClayCenter R3 77.7 4.70 4 2.35
## 3 ClayCenter V4 84.0 3.61 4 1.81
## 4 ClayCenter V4R3 85.3 4.12 4 2.06
## 5 Clinton Check 84.4 5.22 4 2.61
## 6 Clinton R3 84.1 1.71 4 0.856
This compares treatment effects statistically.
lsd_results <- data %>%
group_by(Location) %>%
do({
model <- aov(Yield ~ Treatment, data = .)
pval <- summary(model)[[1]]$`Pr(>F)`[1]
lsd_out <- LSD.test(model, "Treatment", alpha = 0.10, group = FALSE)
data.frame(LSD = lsd_out$statistics$LSD, p_value = pval)
}) %>%
ungroup()
lsd_results
## # A tibble: 6 x 3
## Location LSD p_value
## <chr> <dbl> <dbl>
## 1 ClayCenter 5.12 0.0806
## 2 Clinton 5.36 0.735
## 3 Janesville 5.05 0.769
## 4 Malta 3.13 0.00257
## 5 Slater 5.38 0.525
## 6 Waterloo 3.08 0.286
annotation_df <- summary_df %>%
group_by(Location) %>%
summarize(MaxMean = max(MeanYield)) %>%
left_join(lsd_results, by = "Location") %>%
mutate(
LSD_label = ifelse(
p_value < 0.10,
paste0("LSD(0.10) = ", round(LSD, 1)),
"LSD(0.10) = NS"
)
)
annotation_df
## # A tibble: 6 x 5
## Location MaxMean LSD p_value LSD_label
## <chr> <dbl> <dbl> <dbl> <chr>
## 1 ClayCenter 85.3 5.12 0.0806 LSD(0.10) = 5.1
## 2 Clinton 87.0 5.36 0.735 LSD(0.10) = NS
## 3 Janesville 87.7 5.05 0.769 LSD(0.10) = NS
## 4 Malta 71.8 3.13 0.00257 LSD(0.10) = 3.1
## 5 Slater 76.5 5.38 0.525 LSD(0.10) = NS
## 6 Waterloo 75.7 3.08 0.286 LSD(0.10) = NS
# this code chunk has a figure plotted, so 'fig.cap' needed.
my_colors <- c(
"Check" = "#FDB813",
"R3" = "#FFE135",
"V4" = "#0073CF",
"V4R3" = "#333333"
)
#"Check" = "#E69F00",
# "R3" = "#F0E442",
# "V4" = "#0072B2",
# "V4R3" = "#444444"
#)
yield_plot <- ggplot(summary_df, aes(x = Location, y = MeanYield, fill = Treatment)) +
geom_col(position = position_dodge(0.8), width = 0.7, color = "black") +
geom_errorbar(
aes(ymin = MeanYield - SE, ymax = MeanYield + SE),
position = position_dodge(0.8), width = 0.3
) +
geom_text( # this will write mean value inside the bar.
aes(label = round(MeanYield, 1),group = Treatment),
position = position_dodge(0.8),
vjust = .7,
hjust = 4,
angle = 90,
fontface = "bold",
color = "white",
size = 3
) +
geom_text( # this will write LSD value,font, and its placing.
data = annotation_df,
aes(x = Location, y = MaxMean + 5, label = LSD_label),
inherit.aes = FALSE, size = 2.5, fontface = "bold"
) +
scale_fill_manual(values = my_colors) + # this will modify axis Y.
scale_y_continuous(limits = c(0, 100), expand = expansion(mult = c(0, 0.12))) +
labs(
title = "Effect of Fungicide Timing on Yield",
x = "Location", y = "Yield (bu/A)", fill = NULL
) +
theme_minimal(base_size = 14) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, face = "bold", size = 12, color = "black"),
axis.text.y = element_text(face = "bold", size = 12, color = "black"),
axis.title = element_text(face = "bold", size = 14, color = "black"),
axis.line = element_line(color = "black"),
axis.ticks = element_line(color = "black"),
panel.grid.major.x = element_blank(),
panel.grid.minor = element_blank(),
plot.title = element_text(hjust = 0.5, face = "bold", size = 16)
)
print(yield_plot)
Figure 4.1: Effect of Fungicide Timing on Yield
ggsave("Yield_Fungicide_Effect.png", plot = yield_plot, dpi = 900, width = 12, height = 10, units = "in")
You have completed a full R analysis workflow on given a dataset: - Loaded data - Summarized results - Ran statistical tests - Created and saved a custom plot
Let me know if you have questions, or experiment by changing some numbers in the dataset and re-running the analysis!