Intro to R and Packages for Field Data Analysis

0.1 1. Welcome to R!

R is a free, open-source programming language designed for data analysis, statistics, and visualization.

R is especially popular in agriculture, biology, economics, and health research.

We use RStudio, which is a friendly interface for writing and running R code.

0.2 2. Why Are We Using R in This Course?

To learn data handling and statistical thinking
To analyze real-world field trial datasets
To create custom plots and summaries that communicate results clearly
To build skills needed in research, graduate studies, and industry work

[Figure: Example of a field experiment setup]

0.3 3. What is R Markdown?

R Markdown combines code, results, and explanations into one file.

It is used to create reports, assignments, or publications with both analysis and interpretation.

R Markdown files have the extension .Rmd and include:

- Text written in Markdown (like this!)
- Code chunks inside triple backticks

Example:


``` r
# This is an example of a code chunk
summary(cars)  # This summarizes a built-in dataset
```

```
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00
```

0.4 4. Packages: R’s Toolboxes

Packages are collections of functions that add specific abilities to R.

You install a package once with install.packages("packagename"), and then load it in every new R session with library(packagename).
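As a minimal sketch of the install-once, load-every-session pattern, the helper below installs a package only when it is missing and then loads it. The function name `use_package` is hypothetical (it is not part of R or of any package), and the example uses `tools` only because it ships with R and never needs a download.

``` r
# Hypothetical helper (not part of any package): install a package only if it
# is missing, then load it for the current session.
use_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)              # one-time download from CRAN
  }
  library(pkg, character.only = TRUE)  # needed in every new R session
  invisible(TRUE)
}

# "tools" ships with R, so this call never triggers a download
use_package("tools")

# Confirm the package is now attached to the session
"package:tools" %in% search()
```

In practice you would call the same helper with a course package name, for example `use_package("ggplot2")`.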

0.5 5. Packages We Will Use

Here are the key packages we’ll use in this course:

0.5.1 1. `readr` and `readxl`

For reading data from CSV and Excel files

0.5.2 2. `dplyr` and `tidyr`

For data cleaning, summarizing, and reshaping

0.5.3 3. `ggplot2`

For making beautiful and customizable graphs

0.5.4 4. `agricolae`

For agriculture-specific statistics like LSD tests

0.5.5 5. `here`

Helps manage file paths in a reliable and reproducible way

0.5.6 6. `librarian`

A helper package that can install and load multiple packages at once

0.6 6. Understanding Basic Statistical Concepts

0.6.1 Mean (Average)

The mean is the sum of all values divided by the number of values. It summarizes the center of a dataset in a single number.

Formula: \(\text{Mean} = \frac{\sum x}{n}\)

``` r
mean(c(5, 10, 15, 20))
```

```
## [1] 12.5
```
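To connect the formula to the function, the sketch below computes the mean by hand with `sum()` and `length()` and compares it with `mean()`. It also shows what happens when a value is missing, which is common in field data. The vector `y` with a missing value is a made-up illustration, not course data.

``` r
x <- c(5, 10, 15, 20)

# Apply the formula directly: sum of the values divided by how many there are
manual_mean <- sum(x) / length(x)
builtin_mean <- mean(x)
manual_mean == builtin_mean   # TRUE, both are 12.5

# Made-up example with a missing plot value (NA)
y <- c(5, 10, NA, 20)
mean_with_na <- mean(y)                   # NA, because one value is unknown
mean_without_na <- mean(y, na.rm = TRUE)  # drops the NA before averaging
```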

0.6.2 Normal Distribution

Many biological traits (like crop yield) approximately follow a bell-shaped curve called the normal distribution.

``` r
x <- seq(-4, 4, length=100)
y <- dnorm(x)
plot(x, y, type="l", lwd=2, main="Normal Distribution", ylab="Density")
```

[Figure: Normal Distribution Curve]
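One useful property of the bell curve is that a fixed share of values falls within a given number of standard deviations of the mean. The sketch below checks this with `pnorm()` and then with simulated yields. The yield mean of 60 and standard deviation of 8 are assumptions chosen only for illustration.

``` r
# Share of a normal distribution within 1 and 2 standard deviations of the mean
within_1sd <- pnorm(1) - pnorm(-1)   # about 0.683
within_2sd <- pnorm(2) - pnorm(-2)   # about 0.954

# Simulated (not real) yields: mean 60 and sd 8 are illustrative assumptions
set.seed(1)
yield <- rnorm(1000, mean = 60, sd = 8)

# Proportion of simulated yields within one sd of the assumed mean
share_within_1sd <- mean(abs(yield - 60) <= 8)

# Quick visual check that the simulated values look bell-shaped
hist(yield, breaks = 30, main = "Simulated yields", xlab = "Yield")
```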

0.6.3 T-test vs ANOVA

A t-test compares the means of 2 groups (e.g., treated vs control)
ANOVA compares the means of 3 or more groups (e.g., Check, V4, R3, V4R3)
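The two tests can be sketched in base R with `t.test()` and `aov()`. The data below are simulated, not from a real trial: the treatment names come from the example above, while the yields, the 4 replicates per treatment, and the assumed treatment means are made up for illustration.

``` r
# Simulated (not real) trial: 4 treatments x 4 replicates
set.seed(42)
treatments <- c("Check", "V4", "R3", "V4R3")
trial <- data.frame(
  treatment = factor(rep(treatments, each = 4), levels = treatments),
  yield = rnorm(16, mean = rep(c(50, 55, 56, 60), each = 4), sd = 3)
)

# t-test: compare the means of exactly two groups
two_groups <- droplevels(subset(trial, treatment %in% c("Check", "V4")))
t_res <- t.test(yield ~ treatment, data = two_groups)
t_res$p.value

# ANOVA: compare the means of all four groups at once
fit <- aov(yield ~ treatment, data = trial)
summary(fit)

# Extract the ANOVA p-value for the treatment effect
anova_p <- summary(fit)[[1]][["Pr(>F)"]][1]
```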

0.6.4 LSD (Least Significant Difference)

Used after a significant ANOVA to tell which groups differ from each other
Available in R as agricolae::LSD.test()
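To show what the LSD represents, the sketch below computes it by hand for equally replicated treatments as the critical t value times the standard error of a difference between two means, then flags treatment pairs whose means differ by more than that amount. The data are simulated for illustration only. The `agricolae::LSD.test()` call at the end runs only if the package is installed.

``` r
# Simulated (not real) trial: 4 treatments x 4 replicates
set.seed(42)
treatments <- c("Check", "V4", "R3", "V4R3")
reps <- 4
trial <- data.frame(
  treatment = factor(rep(treatments, each = reps), levels = treatments),
  yield = rnorm(length(treatments) * reps,
                mean = rep(c(50, 55, 56, 60), each = reps), sd = 3)
)

fit <- aov(yield ~ treatment, data = trial)

# Error degrees of freedom and mean square error from the ANOVA
df_error <- df.residual(fit)
mse <- deviance(fit) / df_error

# LSD at alpha = 0.05 for equal replication
lsd <- qt(0.975, df_error) * sqrt(2 * mse / reps)

# Two treatments differ if their means are further apart than the LSD
trt_means <- tapply(trial$yield, trial$treatment, mean)
diffs <- abs(outer(trt_means, trt_means, "-"))
differs <- diffs > lsd

# Same comparison via agricolae, if it is installed
if (requireNamespace("agricolae", quietly = TRUE)) {
  lsd_out <- agricolae::LSD.test(fit, "treatment")
  print(lsd_out$groups)
}
```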

0.6.5 Significance

If p < 0.05, the result is conventionally called statistically significant
This means a difference this large would be unlikely if the treatment had no effect

0.7 7. Agriculture Field Trial Design

In ag research, we use replicated plots in a field to test treatments.

Each treatment is applied to multiple replicates
This helps account for variability in weather, soil, etc.
Common designs: Randomized Complete Block (RCBD), Split Plot
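As a minimal sketch of how an RCBD layout can be randomized in base R, each block below receives every treatment exactly once, in a freshly shuffled order. The treatment names reuse the earlier example. The number of blocks and the seed are assumptions chosen for illustration.

``` r
set.seed(2025)                      # arbitrary seed so the layout is reproducible
treatments <- c("Check", "V4", "R3", "V4R3")
n_blocks <- 4                       # assumed number of blocks (replicates)

# Shuffle the treatments independently within each block
layout <- do.call(rbind, lapply(seq_len(n_blocks), function(b) {
  data.frame(
    block = b,
    plot = seq_along(treatments),
    treatment = sample(treatments)
  )
}))

layout

# Check the design: every treatment appears exactly once in every block
design_check <- table(layout$block, layout$treatment)
all(design_check == 1)
```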

[Figure: Example of RCBD layout. Source: ResearchGate]

0.8 8. Try It: Interactive Figure with `plotly`

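``` r
library(ggplot2)
library(plotly)

# Build a static scatter plot from the built-in mtcars dataset
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  labs(title = "MPG vs Weight", x = "Weight (1000 lbs)", y = "Miles per Gallon", color = "Cylinders") +
  theme_minimal()

# Convert the ggplot object into an interactive plot (hover, zoom, pan)
ggplotly(p)
```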

0.9 9. Summary

You’ve just reviewed the basics of:

- What R and R Markdown are
- What packages we’ll use
- What common statistical concepts mean
- How field trials are designed
- How to view interactive figures

👍 Ready to try hands-on examples in the next session!

Shyam Solanki, Ph.D.

2025-04-03
