Part 1: Important Operators and Keyboard Shortcuts

Keyboard Shortcuts

Essential shortcuts to speed up your workflow:

Action	Windows/Linux	Mac
Running Code
Run current line/selection	`Ctrl + Enter`	`Cmd + Enter`
Run entire script	`Ctrl + Shift + Enter`	`Cmd + Shift + Enter`
Run from beginning to current line	`Ctrl + Alt + B`	`Cmd + Option + B`

Typing Helpers
Assignment `<-`	`Alt + -`	`Option + -`
Pipe `%>%`	`Ctrl + Shift + M`	`Cmd + Shift + M`
Comment/uncomment line	`Ctrl + Shift + C`	`Cmd + Shift + C`

File Operations
Save script	`Ctrl + S`	`Cmd + S`
New script	`Ctrl + Shift + N`	`Cmd + Shift + N`
Open file	`Ctrl + O`	`Cmd + O`

Editing
Find and replace	`Ctrl + F`	`Cmd + F`
Find in files	`Ctrl + Shift + F`	`Cmd + Shift + F`
Undo	`Ctrl + Z`	`Cmd + Z`
Redo	`Ctrl + Shift + Z`	`Cmd + Shift + Z`
Indent	`Tab`	`Tab`
Outdent	`Shift + Tab`	`Shift + Tab`

RStudio
Clear console	`Ctrl + L`	`Cmd + L`
Restart R session	`Ctrl + Shift + F10`	`Cmd + Shift + F10`
Knit R Markdown	`Ctrl + Shift + K`	`Cmd + Shift + K`
Insert code chunk (Rmd)	`Ctrl + Alt + I`	`Cmd + Option + I`
Show help for function	`F1` (cursor on function)	`F1`
Autocomplete	`Tab`	`Tab`

Operators

Assignment Operators

Used to store values in objects:

Operator	Meaning	Example	Result
`<-`	Assign (preferred)	`x <- 5`	x contains 5
`=`	Assign (also works)	`x = 5`	x contains 5
`->`	Assign right	`5 -> x`	x contains 5

Tip: Use <- for assignment. The shortcut is Alt + - (Windows/Linux) or Option + - (Mac).

Comparison Operators

Used to compare values. Each comparison returns TRUE or FALSE (or NA when a value is missing):

Operator	Meaning	Example	Result
`==`	Equal to	`5 == 5`	`TRUE`
`!=`	Not equal to	`5 != 3`	`TRUE`
`<`	Less than	`3 < 5`	`TRUE`
`>`	Greater than	`5 > 3`	`TRUE`
`<=`	Less than or equal	`5 <= 5`	`TRUE`
`>=`	Greater than or equal	`5 >= 3`	`TRUE`
`%in%`	Is in set	`"a" %in% c("a","b","c")`	`TRUE`

⚠️ Common Mistake: Don’t confuse = (assignment) with == (comparison)!

x = 5 → Stores 5 in x

x == 5 → Asks “Is x equal to 5?”

Examples in filtering:

# Filter rows where habitat equals "forest"
filter(data, habitat == "forest")

# Filter rows where abundance is greater than 50
filter(data, abundance > 50)

# Filter rows where habitat is forest OR grassland
filter(data, habitat %in% c("forest", "grassland"))
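
A related pitfall can be sketched in base R with a made-up `habitat` vector (not workshop data): comparing against several values with `==` recycles the right-hand side position by position, whereas `%in%` tests membership in the set.

```r
# Hypothetical example vector, for illustration only
habitat <- c("grassland", "forest", "forest", "agriculture")

# == recycles c("forest", "grassland") along habitat and compares pairwise
pairwise <- habitat == c("forest", "grassland")

# %in% asks, for each element, whether it appears anywhere in the set
membership <- habitat %in% c("forest", "grassland")

print(pairwise)    # FALSE FALSE  TRUE FALSE
print(membership)  #  TRUE  TRUE  TRUE FALSE
```

The pairwise result silently misses two matching rows, which is why `%in%` is the safer choice when matching several values.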

Logical Operators

Used to combine multiple conditions:

Operator	Meaning	Example	Result
`&`	AND (both must be TRUE)	`TRUE & FALSE`	`FALSE`
`|`	OR (at least one TRUE)	`TRUE | FALSE`	`TRUE`
`!`	NOT (reverses TRUE/FALSE)	`!TRUE`	`FALSE`

Examples:

# AND: Both conditions must be true
filter(data, habitat == "forest" & abundance > 50)

# OR: At least one condition must be true
filter(data, habitat == "forest" | habitat == "grassland")

# NOT: Exclude missing values
filter(data, !is.na(abundance))

Arithmetic Operators

Used for mathematical calculations:

Operator	Meaning	Example	Result
`+`	Addition	`5 + 3`	`8`
`-`	Subtraction	`5 - 3`	`2`
`*`	Multiplication	`5 * 3`	`15`
`/`	Division	`6 / 2`	`3`
`^`	Exponent (power)	`2^3`	`8`
`%%`	Modulo (remainder)	`7 %% 3`	`1`
`%/%`	Integer division	`7 %/% 3`	`2`
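
As a small illustration of how `%/%` and `%%` relate, the sketch below splits a number into quotient and remainder and rebuilds it. The variable names are arbitrary.

```r
x <- 7
y <- 3

quotient  <- x %/% y   # integer division: 2
remainder <- x %% y    # modulo: 1

# The two pieces rebuild the original value
rebuilt <- quotient * y + remainder   # 7

# A common use of %%: flag even numbers
is_even <- (1:6) %% 2 == 0   # FALSE TRUE FALSE TRUE FALSE TRUE
```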

Special Operators

Operator	Meaning	Example	Description
`%>%`	Pipe	`data %>% filter()`	Passes left side to right side. Read as “THEN”
`$`	Extract column	`data$column`	Gets a column from a data frame
`[ ]`	Subset/index	`data[1, 2]`	Extracts elements by position
`[[ ]]`	Extract single element	`list[[1]]`	Extracts single element from list
`:`	Sequence	`1:10`	Creates sequence from 1 to 10
`::`	Package function	`dplyr::filter()`	Uses function from specific package
`~`	Formula	`y ~ x`	Used in models and statistics
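
The extraction operators in the table are easy to mix up. The following base R sketch compares them on a tiny invented data frame; `insects` and its columns are hypothetical.

```r
# Hypothetical toy data frame, for illustration only
insects <- data.frame(
  site      = c("A", "B", "C"),
  abundance = c(12, 40, 7)
)

col_vector <- insects$abundance       # the column as a numeric vector
one_cell   <- insects[2, 2]           # row 2, column 2: 40
sub_frame  <- insects["abundance"]    # still a data frame with one column
col_again  <- insects[["abundance"]]  # the column itself, same as $

first_three <- 1:3                    # the sequence 1 2 3
```

Single brackets with one name keep the data frame structure, while `$` and `[[ ]]` return the column's contents.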

The Pipe Operator `%>%`

The pipe takes the output from the left side and passes it as the first argument to the function on the right side. We explore it in more detail in Part 3.

Read %>% as “THEN”

Shortcut: Ctrl + Shift + M (Windows/Linux) or Cmd + Shift + M (Mac)

Common Mistakes to Avoid

Wrong ❌	Correct ✅	Explanation
`habitat = "forest"`	`habitat == "forest"`	Use `==` to compare, `=` assigns
`habitat == forest`	`habitat == "forest"`	Text needs quotes
`setwd("C:/Users/...")`	Use R Projects	Absolute paths break on other computers
`NA == NA`	`is.na(x)`	`NA == NA` returns `NA`, not `TRUE`; use `is.na()` to check for missing values
`x = TRUE` in filter	`x == TRUE`	Use `==` for comparison in functions
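
The missing-value row of the table can be checked directly in base R. The `abundance` vector here is invented for the example.

```r
abundance <- c(12, NA, 7)   # hypothetical counts with one missing value

na_vs_na   <- NA == NA          # NA: two unknowns cannot be compared
naive_test <- abundance == NA   # NA NA NA: never useful as a condition
proper     <- is.na(abundance)  # FALSE TRUE FALSE

n_missing <- sum(is.na(abundance))          # 1
complete  <- abundance[!is.na(abundance)]   # 12 7
```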

Part 2: R Basics

Assignment & Objects

x <- 5                    # Assign value to object
my_data <- c(1, 2, 3)     # Create vector

Data Types

Type	Example	Check with
Numeric	`42`, `3.14`	`is.numeric(x)` (or `class(x)` for any object)
Character	`"forest"`	`is.character(x)`
Logical	`TRUE`, `FALSE`	`is.logical(x)`
Factor	`factor(c("low", "high"))`	`is.factor(x)`
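
A short base R sketch of the types above; `level` is a hypothetical object name. Note that `factor()` takes one vector of values, with the level order given separately through `levels`.

```r
# factor() takes a single vector of values; levels set the order
level <- factor(c("low", "high", "low"), levels = c("low", "high"))

class(42)          # "numeric"
class("forest")    # "character"
class(TRUE)        # "logical"
class(level)       # "factor"

levels(level)      # "low" "high"
as.integer(level)  # 1 2 1 (factors are stored as integer codes)
```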

Functions

function_name(argument1, argument2)
mean(c(1, 2, 3))           # Calculate mean
round(3.14159, digits = 2) # Named argument
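
To make the calling conventions concrete, here is a minimal sketch of positional and named arguments, plus a user-defined function. `standard_error` and its default are illustrative and not part of the workshop material.

```r
# Positional matching: arguments are taken in the order the function defines
a <- round(3.14159, 2)

# Named matching: the name makes the intent explicit
b <- round(3.14159, digits = 2)

# A user-defined function; its name and default are illustrative only
standard_error <- function(x, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]   # drop missing values before computing
  sd(x) / sqrt(length(x))
}

# Named arguments may be supplied in any order
se <- standard_error(na.rm = TRUE, x = c(2, 4, 6, NA))
```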

Getting Help

?mean                     # Help for function
??diversity               # Search help

Part 3: Data Wrangling (tidyverse)

Core Verbs

Verb	Purpose	Example
`filter()`	Keep rows	`filter(data, habitat == "forest")`
`select()`	Choose columns	`select(data, site, abundance)`
`mutate()`	Create columns	`mutate(data, log_ab = log(abundance))`
`arrange()`	Sort rows	`arrange(data, desc(richness))`
`summarise()`	Summarize	`summarise(data, mean = mean(x))`
`group_by()`	Group data	`group_by(data, habitat)`

Filtering Patterns

# Exact match
filter(data, habitat == "forest")

# Not equal
filter(data, habitat != "agriculture")

# Multiple options
filter(data, habitat %in% c("forest", "grassland"))

# Greater/less than
filter(data, abundance > 50)
filter(data, abundance <= 100)

# Range (between two values)
filter(data, abundance >= 10 & abundance <= 100)

# Combine conditions (AND)
filter(data, habitat == "forest" & abundance > 50)

# Combine conditions (OR)
filter(data, habitat == "forest" | habitat == "grassland")

# Remove missing values
filter(data, !is.na(abundance))

Creating and Modifying

# Create new column
mutate(data, log_abundance = log(abundance + 1))

# Create column with conditions
mutate(data, 
  size_class = case_when(
    abundance < 10 ~ "low",
    abundance < 50 ~ "medium",
    TRUE ~ "high"
  )
)

Summarizing

# Basic summary
summarise(data, 
  mean_ab = mean(abundance),
  sd_ab = sd(abundance),
  n = n()
)

# Summary by group
data %>%
  group_by(habitat) %>%
  summarise(
    mean_ab = mean(abundance),
    n_species = n_distinct(species)
  )
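
One thing to watch when summarizing: `mean()` and `sd()` return `NA` if any value is missing. The sketch below uses an invented `toy` table and assumes dplyr is installed; `na.rm = TRUE` drops missing values before the calculation.

```r
library(dplyr)

# Hypothetical toy data with one missing count
toy <- tibble(
  habitat   = c("forest", "forest", "grassland", "grassland"),
  species   = c("sp1", "sp2", "sp1", "sp1"),
  abundance = c(10, NA, 4, 6)
)

by_habitat <- toy %>%
  group_by(habitat) %>%
  summarise(
    mean_ab   = mean(abundance, na.rm = TRUE),  # drop NA before averaging
    n_rows    = n(),                            # rows per group, NA included
    n_species = n_distinct(species)
  )
```

Without `na.rm = TRUE`, the forest group's mean would be `NA`.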

The Pipe Operator: `%>%`

What Is the Pipe?

The pipe %>% is a special operator from the magrittr package (loaded automatically with tidyverse). It takes the output from the left side and passes it as the first argument to the function on the right side.

Aspect	Description
Symbol	`%>%` or `|>`
Read as	“THEN” or “AND THEN”
What it does	Passes left side as first argument to right side
Why use it	Makes multi-step operations readable
Shortcut	Ctrl + Shift + M (Windows/Linux) / Cmd + Shift + M (Mac)
Package	Loaded with `tidyverse` (from `magrittr`)

The Basic Concept

# These two lines do EXACTLY the same thing:

# Without pipe - function wraps around data
mean(c(1, 2, 3, 4, 5))

# With pipe - data flows into function
c(1, 2, 3, 4, 5) %>% mean()

# Read as: "Take 1, 2, 3, 4, 5 THEN calculate mean"

Why Use the Pipe?

Problem: Nested functions are hard to read.

When you need to do multiple operations, code becomes confusing:

# HARD TO READ - You must read from inside out!
round(mean(sqrt(abs(c(-4, 9, -16, 25)))), 2)

# What's happening?
# 1. c(-4, 9, -16, 25)  - create vector
# 2. abs(...)           - absolute value
# 3. sqrt(...)          - square root
# 4. mean(...)          - average
# 5. round(..., 2)      - round to 2 decimals

# This is like reading a sentence backwards!

Solution: The pipe makes code flow naturally

# EASY TO READ - Read from top to bottom, left to right!
c(-4, 9, -16, 25) %>%       # start with these numbers, THEN
  abs() %>%                 # take absolute value, THEN
  sqrt() %>%                # take square root, THEN
  mean() %>%                # calculate mean, THEN
  round(2)                  # round to 2 decimals
  
# This reads like a recipe - step by step!

How the Pipe works technically

The pipe takes whatever is on the LEFT and inserts it as the FIRST ARGUMENT of the function on the RIGHT.

# These are equivalent:
x %>% f()
f(x)

# These are also equivalent:
x %>% f(y)
f(x, y)

# And also these:
x %>% f(y, z)
f(x, y, z)
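
These equivalences can be verified with real functions. The sketch assumes the magrittr package is installed; the last line uses magrittr's dot placeholder, which sends the left side to an argument other than the first.

```r
library(magrittr)

x <- c(1.234, 5.678)

piped  <- x %>% round(1)   # x becomes the first argument
direct <- round(x, 1)      # the same call written directly

# The dot placeholder puts the left side wherever the dot appears;
# here the piped value 1 is used as `digits` instead of as `x`
placed <- 1 %>% round(x, digits = .)
```

All three objects hold the same rounded values.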

Real examples with Data Frames

# WITHOUT pipe - nested and confusing
summary(filter(select(insect_data, site, habitat, abundance), habitat == "forest"))

# WITH pipe - clear step-by-step workflow
insect_data %>% 
  select(site, habitat, abundance) %>%    # choose columns, THEN
  filter(habitat == "forest") %>%         # keep forest rows, THEN
  summary()                               # show summary
  
# Each step:
# 1. Start with insect_data.
# 2. select() receives insect_data as first argument.
# 3. filter() receives the result of select() as first argument.
# 4. summary() receives the result of filter() as first argument.

The Cooking Recipe Analogy

Think of the pipe like following a recipe:


# Recipe WITHOUT pipe (confusing):
serve(
  plate(
    garnish(
      cook(
        season(
          chop(vegetables)
        )
      )
    )
  )
)

# Recipe WITH pipe:
vegetables %>% 
  chop() %>% 
  season() %>% 
  cook() %>% 
  garnish() %>% 
  plate() %>% 
  serve()

# Just like a real recipe:
# 1. Take vegetables
# 2. Chop them
# 3. Season them
# 4. Cook them
# 5. Garnish
# 6. Plate
# 7. Serve

Common Patterns in Ecological Data Analysis


# Pattern 1: Data Exploration
insect_data %>% 
  filter(order == "Coleoptera") %>% 
  group_by(habitat) %>% 
  summarise(
    total = sum(abundance),
    richness = n_distinct(morphospecies)
  )

# Read as:
# "Take insect_data, THEN
# filter to Coleoptera, THEN
# group by habitat, THEN
# summarise total and richness"

# Pattern 2: Creating community matrix
insect_data %>% 
  filter(order == "Coleoptera") %>% 
  group_by(site, morphospecies) %>% 
  summarise(abundance = sum(abundance), .groups = "drop") %>% 
  pivot_wider(names_from = morphospecies, values_from = abundance, values_fill = 0)

# Pattern 3: Saving results along the way
beetle_summary <- insect_data %>%     # save the final result
  filter(order == "Coleoptera") %>% 
  group_by(habitat) %>% 
  summarise(mean_abundance = mean(abundance))

The New Native Pipe: `|>`

R 4.1 introduced a built-in pipe, `|>`, that works similarly:


# magrittr pipe (from tidyverse)
x %>% mean()

# Native R pipe (R 4.1+)
x |> mean()

# Both work! The native pipe is slightly faster but has fewer features.
# For this workshop, we use %>% because it's more common in existing code.
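
A minimal sketch of the native pipe, assuming R 4.1 or later and no extra packages. One practical difference is that the right-hand side must be written as a function call with parentheses.

```r
# Requires R 4.1 or later; no packages needed
total <- c(1, 4, 9) |> sqrt() |> sum()   # 1 + 2 + 3 = 6

# The native pipe needs the parentheses: `x |> sqrt` is a syntax error,
# whereas magrittr also accepts `x %>% sqrt`
rounded <- c(1.234, 5.678) |> round(1)
```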

When NOT to Use the Pipe

The pipe isn’t always the best choice:


# Don't pipe into functions that don't take data as first argument
c(1, 2, 3) %>% plot(main = "My Plot")   # Works, but...

# Don't use for simple one-step operations (unnecessary)
x %>% mean()                            # Just use: mean(x)

# Don't make chains too long (they become hard to debug)
# If your chain is >10 steps, consider breaking it up and saving intermediate results.

# DO use pipes when you have > 2 steps that transform data
data %>% 
  step1() %>% 
  step2() %>% 
  step3()

Reshaping

pivot_longer(data, cols, names_to, values_to)  # Wide → Long
pivot_wider(data, names_from, values_from)     # Long → Wide
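
A round trip between the two shapes might look like the sketch below. It assumes tidyr is installed, and the `wide` table with its `site`, `sp1`, and `sp2` columns is invented for illustration.

```r
library(tidyr)

# Hypothetical wide community table: one row per site, one column per species
wide <- data.frame(
  site = c("A", "B"),
  sp1  = c(3, 0),
  sp2  = c(1, 5)
)

# Wide to long: every species column becomes rows of (species, abundance)
long <- pivot_longer(wide, cols = -site,
                     names_to = "species", values_to = "abundance")

# Long to wide: species names become column headers again
back <- pivot_wider(long, names_from = species, values_from = abundance)
```

Here `long` has four rows (two sites times two species), and `back` has the same columns as `wide`.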

Part 4: Community Ecology (vegan)

Alpha Diversity

specnumber(comm_matrix)              # Richness
diversity(comm_matrix, "shannon")    # Shannon H'
diversity(comm_matrix, "simpson")    # Simpson 1-D
rarefy(comm_matrix, sample = n)      # Rarefied richness
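
For intuition, the indices can be computed by hand for a single site. This base R sketch assumes the definitions documented for vegan's `diversity()`: Shannon as -sum(p * ln p) with the natural logarithm, and Simpson as 1 - sum(p^2). The `counts` vector is invented.

```r
# Hypothetical counts for one site, four equally abundant species
counts <- c(sp1 = 10, sp2 = 10, sp3 = 10, sp4 = 10)

p <- counts / sum(counts)   # relative abundances
p <- p[p > 0]               # drop absent species so log(p) is defined

richness <- sum(counts > 0)     # number of species present: 4
shannon  <- -sum(p * log(p))    # equals log(4), about 1.386
simpson  <- 1 - sum(p^2)        # 0.75
```

With perfectly even abundances, Shannon reaches its maximum for the given richness, `log(richness)`.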

Distance & Ordination

vegdist(comm_matrix, "bray")         # Bray-Curtis distance
metaMDS(comm_matrix, k = 2)          # NMDS
pcoa(dist_matrix)                    # PCoA (ape package)
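
The Bray-Curtis dissimilarity between two sites can be worked out by hand as the summed absolute differences divided by the summed abundances. This is a base R sketch with invented counts, not a replacement for `vegdist()`.

```r
# Hypothetical abundances of three species at two sites
site_a <- c(10, 0, 5)
site_b <- c(5, 5, 5)

# Sum of absolute differences over sum of all abundances
bray <- sum(abs(site_a - site_b)) / sum(site_a + site_b)   # 10 / 25 = 0.4
```

A value of 0 means identical communities, and 1 means the sites share no species.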

Statistical Tests

adonis2(comm ~ habitat, data = env)  # PERMANOVA
betadisper(dist, groups)             # Dispersion test
simper(comm, groups)                 # Species contributions

Part 5: Interpretation Guidelines

NMDS Stress

Value	Quality
< 0.05	Excellent
0.05-0.10	Good
0.10-0.20	Acceptable
> 0.20	Poor

PERMANOVA R²

Value	Effect Size
< 0.05	Tiny
0.05-0.10	Small
0.10-0.25	Medium
> 0.25	Large

Part 6: Visualization (ggplot2)

Basic Structure

ggplot(data, aes(x = var1, y = var2)) +
  geom_point()            # Add geometry

Common Geoms

Geom	Plot Type
`geom_point()`	Scatter plot
`geom_boxplot()`	Box plot
`geom_col()`	Bar plot
`geom_histogram()`	Histogram
`geom_smooth()`	Trend line

Customization

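# Add layers with +, placed at the end of each line
ggplot(data, aes(x = var1, y = var2)) +
  geom_point() +
  labs(x = "Label", y = "Label", title = "Title") +
  theme_bw() +
  scale_color_manual(values = c("red", "blue")) +
  facet_wrap(~ variable)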

Part 7: Common Errors & Fixes

Error	Likely Cause	Fix
`object not found`	Typo or not created	Check spelling, run creation code
`could not find function`	Package not loaded	`library(package)`
`unexpected symbol`	Missing comma/parenthesis	Check syntax
`subscript out of bounds`	Wrong index	Check dimensions with `dim()`
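
Two of these errors can be reproduced safely in base R. The helper `error_text()` is a hypothetical convenience written for this sketch; it captures the message instead of stopping the script. Message wording assumes an English-language R session.

```r
# Hypothetical helper: return the error message rather than halting
error_text <- function(expr) {
  tryCatch({ expr; "no error" }, error = function(e) conditionMessage(e))
}

m <- matrix(1:4, nrow = 2)

msg_index  <- error_text(m[3, 1])                # subscript out of bounds
msg_object <- error_text(undefined_object_xyz)   # object ... not found

dim(m)   # 2 2, so row 3 does not exist
```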

Keep this card handy during the workshop!

R Quick Reference Card

Workshop: Introduction to R Statistics for Insect Ecology

Amanda Mawan

11-12 February 2026
