Essential shortcuts to speed up your workflow:
| Action | Windows/Linux | Mac |
|---|---|---|
| Running Code | | |
| Run current line/selection | Ctrl + Enter | Cmd + Enter |
| Run entire script | Ctrl + Shift + Enter | Cmd + Shift + Enter |
| Run from beginning to current line | Ctrl + Alt + B | Cmd + Option + B |
| Typing Helpers | | |
| Assignment `<-` | Alt + - | Option + - |
| Pipe `%>%` | Ctrl + Shift + M | Cmd + Shift + M |
| Comment/uncomment line | Ctrl + Shift + C | Cmd + Shift + C |
| File Operations | | |
| Save script | Ctrl + S | Cmd + S |
| New script | Ctrl + Shift + N | Cmd + Shift + N |
| Open file | Ctrl + O | Cmd + O |
| Editing | | |
| Find and replace | Ctrl + F | Cmd + F |
| Find in files | Ctrl + Shift + F | Cmd + Shift + F |
| Undo | Ctrl + Z | Cmd + Z |
| Redo | Ctrl + Shift + Z | Cmd + Shift + Z |
| Indent | Tab | Tab |
| Outdent | Shift + Tab | Shift + Tab |
| RStudio | | |
| Clear console | Ctrl + L | Ctrl + L |
| Restart R session | Ctrl + Shift + F10 | Cmd + Shift + F10 |
| Knit R Markdown | Ctrl + Shift + K | Cmd + Shift + K |
| Insert code chunk (Rmd) | Ctrl + Alt + I | Cmd + Option + I |
| Show help for function | F1 (cursor on function) | F1 |
| Autocomplete | Tab | Tab |

Used to store values in objects:
| Operator | Meaning | Example | Result |
|---|---|---|---|
| `<-` | Assign (preferred) | `x <- 5` | x contains 5 |
| `=` | Assign (also works) | `x = 5` | x contains 5 |
| `->` | Assign right | `5 -> x` | x contains 5 |

Tip: Use `<-` for assignment. The shortcut is Alt + - (Windows) or Option + - (Mac).
Used to compare values. They return TRUE or FALSE (or NA when a value is missing):
| Operator | Meaning | Example | Result |
|---|---|---|---|
| `==` | Equal to | `5 == 5` | TRUE |
| `!=` | Not equal to | `5 != 3` | TRUE |
| `<` | Less than | `3 < 5` | TRUE |
| `>` | Greater than | `5 > 3` | TRUE |
| `<=` | Less than or equal | `5 <= 5` | TRUE |
| `>=` | Greater than or equal | `5 >= 3` | TRUE |
| `%in%` | Is in set | `"a" %in% c("a","b","c")` | TRUE |

⚠️ Common Mistake: Don’t confuse `=` (assignment) with `==` (comparison)!
`x = 5` → Stores 5 in x
`x == 5` → Asks “Is x equal to 5?”
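The difference is easy to check in the console. The short sketch below uses only base R; the object names (`is_five`, `over_fifty`) are made up for illustration.

```r
# `=` at the top level assigns, just like `<-`
x = 5

# `==` compares and returns a logical value
is_five  <- x == 5   # TRUE
is_three <- x == 3   # FALSE

# Comparisons are vectorised: one answer per element
abundance  <- c(12, 50, 87)
over_fifty <- abundance > 50   # FALSE FALSE TRUE

print(is_five)
print(over_fifty)
```

Functions such as `filter()` use these logical vectors to decide which rows to keep.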
Examples in filtering:
# Filter rows where habitat equals "forest"
filter(data, habitat == "forest")
# Filter rows where abundance is greater than 50
filter(data, abundance > 50)
# Filter rows where habitat is forest OR grassland
filter(data, habitat %in% c("forest", "grassland"))
Used to combine multiple conditions:
| Operator | Meaning | Example | Result |
|---|---|---|---|
| `&` | AND (both must be TRUE) | `TRUE & FALSE` | FALSE |
| `\|` | OR (at least one TRUE) | `TRUE \| FALSE` | TRUE |
| `!` | NOT (reverses TRUE/FALSE) | `!TRUE` | FALSE |

Examples:
# AND: Both conditions must be true
filter(data, habitat == "forest" & abundance > 50)
# OR: At least one condition must be true
filter(data, habitat == "forest" | habitat == "grassland")
# NOT: Exclude missing values
filter(data, !is.na(abundance))
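To see what these filters actually keep, here is a minimal runnable sketch. The `toy` data frame is invented for illustration (it is not the workshop data), and the code assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
toy <- tibble(
  site      = c("A", "B", "C", "D", "E"),
  habitat   = c("forest", "forest", "grassland", "agriculture", "forest"),
  abundance = c(80, 20, 65, NA, 55)
)

# AND: forest sites with more than 50 individuals (sites A and E)
forest_high <- filter(toy, habitat == "forest" & abundance > 50)

# OR: forest or grassland sites (sites A, B, C and E)
natural <- filter(toy, habitat == "forest" | habitat == "grassland")

# NOT: drop rows where abundance is missing (removes site D)
complete <- filter(toy, !is.na(abundance))

print(forest_high)
```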
Used for mathematical calculations:
| Operator | Meaning | Example | Result |
|---|---|---|---|
| `+` | Addition | `5 + 3` | 8 |
| `-` | Subtraction | `5 - 3` | 2 |
| `*` | Multiplication | `5 * 3` | 15 |
| `/` | Division | `6 / 2` | 3 |
| `^` | Exponent (power) | `2^3` | 8 |
| `%%` | Modulo (remainder) | `7 %% 3` | 1 |
| `%/%` | Integer division | `7 %/% 3` | 2 |

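Integer division and modulo are two halves of the same calculation: the quotient times the divisor, plus the remainder, gives back the original number. A small base R sketch (variable names are illustrative):

```r
x <- 7
y <- 3

quotient  <- x %/% y   # 2: how many whole times 3 fits into 7
remainder <- x %% y    # 1: what is left over

# Putting the two pieces back together recovers x
rebuilt <- quotient * y + remainder   # 7

# A common use of modulo: flag even numbers
counts  <- c(4, 7, 10, 13)
is_even <- counts %% 2 == 0   # TRUE FALSE TRUE FALSE

print(rebuilt)
print(is_even)
```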
| Operator | Meaning | Example | Description |
|---|---|---|---|
| `%>%` | Pipe | `data %>% filter()` | Passes left side to right side. Read as “THEN” |
| `$` | Extract column | `data$column` | Gets a column from a data frame |
| `[ ]` | Subset/index | `data[1, 2]` | Extracts elements by position |
| `[[ ]]` | Extract single element | `list[[1]]` | Extracts single element from list |
| `:` | Sequence | `1:10` | Creates sequence from 1 to 10 |
| `::` | Package function | `dplyr::filter()` | Uses function from specific package |
| `~` | Formula | `y ~ x` | Used in models and statistics |

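The extraction operators are easiest to tell apart side by side. This base R sketch uses a made-up `toy` data frame and a made-up `results` list:

```r
# Hypothetical toy data, invented for illustration only
toy <- data.frame(site = c("A", "B", "C"), abundance = c(12, 50, 87))

col_by_name <- toy$abundance   # whole column as a vector: 12 50 87
one_cell    <- toy[2, 2]       # row 2, column 2: 50
sub_table   <- toy[1:2, ]      # rows 1 to 2, all columns (still a data frame)

results  <- list(sites = toy$site, total = sum(toy$abundance))
as_list  <- results["total"]     # [ ] keeps the list wrapper
as_value <- results[["total"]]   # [[ ]] returns the element itself: 149

# :: names the package explicitly, here base R's own mean()
base_mean <- base::mean(col_by_name)

print(one_cell)
print(as_value)
```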
The pipe `%>%` takes the output from the left side and passes it as the first argument to the function on the right side. We will explore more in Part 2.
Read `%>%` as “THEN”.
Shortcut: Ctrl + Shift + M (Windows) or Cmd + Shift + M (Mac)
| Wrong ❌ | Correct ✅ | Explanation |
|---|---|---|
| `habitat = "forest"` | `habitat == "forest"` | Use `==` to compare; `=` assigns |
| `habitat == forest` | `habitat == "forest"` | Text needs quotes |
| `setwd("C:/Users/...")` | Use R Projects | Absolute paths break on other computers |
| `NA == NA` | `is.na(x)` | Use `is.na()` to check for missing values |
| `x = TRUE` in filter | `x == TRUE` | Use `==` for comparison in functions |

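The missing-value row deserves a quick demonstration, because comparing with `NA` does not raise an error: it quietly returns `NA`. A base R sketch with made-up values:

```r
x <- c(5, NA, 12)

compared <- x == NA     # NA NA NA: never TRUE, even for the missing value
missing  <- is.na(x)    # FALSE TRUE FALSE
n_missing <- sum(missing)   # 1

# Missing values also propagate through summaries unless removed
mean_with_na    <- mean(x)                 # NA
mean_without_na <- mean(x, na.rm = TRUE)   # 8.5

print(missing)
print(mean_without_na)
```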
x <- 5 # Assign value to object
my_data <- c(1, 2, 3) # Create vector
| Type | Example | Check with |
|---|---|---|
| Numeric | `42`, `3.14` | `class(x)` |
| Character | `"forest"` | `is.character(x)` |
| Logical | `TRUE`, `FALSE` | `is.logical(x)` |
| Factor | `factor(c("low", "high"))` | `is.factor(x)` |

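A quick way to check the types is to ask R directly. The sketch below is base R only; the `values` list is made up for illustration. Note that `factor()` takes one vector of values, with `levels` as a separate second argument.

```r
values <- list(
  numeric   = 3.14,
  character = "forest",
  logical   = TRUE,
  factor    = factor(c("low", "high", "low"), levels = c("low", "high"))
)

# class() reports the type of each element
types <- sapply(values, class)
print(types)

# Levels control the order of the categories
factor_levels <- levels(values$factor)   # "low" "high"

# Pitfall: here "high" is read as the only allowed level, so "low" becomes NA
bad <- factor("low", "high")
print(is.na(bad))   # TRUE
```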
function_name(argument1, argument2)
mean(c(1, 2, 3)) # Calculate mean
round(3.14159, digits = 2) # Named argument
?mean # Help for function
??diversity # Search help
| Verb | Purpose | Example |
|---|---|---|
| `filter()` | Keep rows | `filter(data, habitat == "forest")` |
| `select()` | Choose columns | `select(data, site, abundance)` |
| `mutate()` | Create columns | `mutate(data, log_ab = log(abundance))` |
| `arrange()` | Sort rows | `arrange(data, desc(richness))` |
| `summarise()` | Summarize | `summarise(data, mean = mean(x))` |
| `group_by()` | Group data | `group_by(data, habitat)` |

# Exact match
filter(data, habitat == "forest")
# Not equal
filter(data, habitat != "agriculture")
# Multiple options
filter(data, habitat %in% c("forest", "grassland"))
# Greater/less than
filter(data, abundance > 50)
filter(data, abundance <= 100)
# Range (between two values)
filter(data, abundance >= 10 & abundance <= 100)
# Combine conditions (AND)
filter(data, habitat == "forest" & abundance > 50)
# Combine conditions (OR)
filter(data, habitat == "forest" | habitat == "grassland")
# Remove missing values
filter(data, !is.na(abundance))
# Create new column
mutate(data, log_abundance = log(abundance + 1))
# Create column with conditions
mutate(data,
  size_class = case_when(
    abundance < 10 ~ "low",
    abundance < 50 ~ "medium",
    TRUE ~ "high"
  )
)
# Basic summary
summarise(data,
  mean_ab = mean(abundance),
  sd_ab = sd(abundance),
  n = n()
)
# Summary by group
data %>%
  group_by(habitat) %>%
  summarise(
    mean_ab = mean(abundance),
    n_species = n_distinct(species)
  )
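For a version of the grouped summary you can run as is, here is a sketch on a tiny invented data set. The `toy` table and its values are assumptions made for illustration, and the code assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
toy <- tibble(
  habitat   = c("forest", "forest", "forest", "grassland", "grassland"),
  species   = c("sp1", "sp2", "sp1", "sp1", "sp3"),
  abundance = c(10, 30, 20, 5, 15)
)

by_habitat <- toy %>%
  group_by(habitat) %>%
  summarise(
    mean_ab   = mean(abundance),       # forest: 20, grassland: 10
    n_species = n_distinct(species)    # forest: 2,  grassland: 2
  )

print(by_habitat)
```

The result has one row per habitat, which is the usual shape of a `group_by()` plus `summarise()` output.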
What Is the Pipe `%>%`?
The pipe `%>%` is a special operator from the magrittr package (loaded automatically with tidyverse). It takes the output from the left side and passes it as the first argument to the function on the right side.
| Aspect | Description |
|---|---|
| Symbol | `%>%` or `\|>` |
| Read as | “THEN” or “AND THEN” |
| What it does | Passes left side as first argument to right side |
| Why use it | Makes multi-step operations readable |
| Shortcut | Ctrl + Shift + M (Windows/Linux) / Cmd + Shift + M (Mac) |
| Package | Loaded with tidyverse (from magrittr) |
The Basic Concept
# These two lines do EXACTLY the same thing:
# Without pipe - function wraps around data
mean(c(1, 2, 3, 4, 5))
# With pipe - data flows into function
c(1, 2, 3, 4, 5) %>% mean()
# Read as: "Take 1, 2, 3, 4, 5 THEN calculate mean"
Why Use the Pipe?
Problem: Nested functions are hard to read. When you need to do multiple operations, code becomes confusing:
# HARD TO READ - You must read from inside out!
round(mean(sqrt(abs(c(-4, 9, -16, 25)))), 2)
# What's happening?
# 1. c(-4, 9, -16, 25) - create vector
# 2. abs(...) - absolute value
# 3. sqrt(...) - square root
# 4. mean(...) - average
# 5. round(..., 2) - round to 2 decimals
# This is like reading a sentence backwards!
Solution: The pipe makes code flow naturally
# EASY TO READ - Read from top to bottom, left to right!
c(-4, 9, -16, 25) %>% # start with these numbers, THEN
  abs() %>% # take absolute value, THEN
  sqrt() %>% # take square root, THEN
  mean() %>% # calculate mean, THEN
  round(2) # round to 2 decimals
# This reads like a recipe - step by step!
How the Pipe works technically
The pipe takes whatever is on the LEFT and inserts it as the FIRST ARGUMENT of the function on the RIGHT.
# These are equivalent:
x %>% f()
f(x)
# These are also equivalent:
x %>% f(y)
f(x, y)
# And also these:
x %>% f(y, z)
f(x, y, z)
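The equivalence can be verified with real functions. This sketch assumes the magrittr package is installed (it comes with the tidyverse); the numbers are arbitrary.

```r
library(magrittr)

x <- c(2.345, 7.891)

# x %>% f(y) is the same call as f(x, y)
piped  <- x %>% round(1)
direct <- round(x, 1)
print(identical(piped, direct))   # TRUE

# A two-step chain is the same as the nested call, read inside out
chained <- x %>% sum() %>% round(1)
nested  <- round(sum(x), 1)
print(identical(chained, nested))   # TRUE
```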
Real examples with Data Frames
# WITHOUT pipe - nested and confusing
summary(filter(select(insect_data, site, habitat, abundance), habitat == "forest"))
# WITH pipe - clear step-by-step workflow
insect_data %>%
  select(site, habitat, abundance) %>% # choose columns, THEN
  filter(habitat == "forest") %>% # keep forest rows, THEN
  summary() # show summary
# Each step:
# 1. Start with insect_data.
# 2. select() receives insect_data as first argument.
# 3. filter() receives the result of select() as first argument.
# 4. summary() receives the result of filter() as first argument.
The Cooking Recipe Analogy
Think of the pipe like following a recipe:
# Recipe WITHOUT pipe (confusing):
serve(
  plate(
    garnish(
      cook(
        season(
          chop(vegetables)
        )
      )
    )
  )
)
# Recipe WITH pipe:
vegetables %>%
  chop() %>%
  season() %>%
  cook() %>%
  garnish() %>%
  plate() %>%
  serve()
# Just like a real recipe:
# 1. Take vegetables
# 2. Chop them
# 3. Season them
# 4. Cook them
# 5. Garnish
# 6. Plate
# 7. Serve
Common Patterns in Ecological Data Analysis
# Pattern 1: Data Exploration
insect_data %>%
  filter(order == "Coleoptera") %>%
  group_by(habitat) %>%
  summarise(
    total = sum(abundance),
    richness = n_distinct(morphospecies)
  )
# Read as:
# "Take insect_data, THEN
# filter to Coleoptera, THEN
# group by habitat, THEN
# summarise total and richness"
# Pattern 2: Creating community matrix
insect_data %>%
  filter(order == "Coleoptera") %>%
  group_by(site, morphospecies) %>%
  summarise(abundance = sum(abundance), .groups = "drop") %>%
  pivot_wider(names_from = morphospecies, values_from = abundance, values_fill = 0)
# Pattern 3: Saving results along the way
beetle_summary <- insect_data %>% # save the final result
  filter(order == "Coleoptera") %>%
  group_by(habitat) %>%
  summarise(mean_abundance = mean(abundance))
The New Native Pipe: `|>`
R version 4.1+ introduced a built-in pipe `|>` that works similarly:
# magrittr pipe (from tidyverse)
x %>% mean()
# Native R pipe (R 4.1+)
x |> mean()
# Both work! The native pipe is slightly faster but has fewer features.
# For this workshop, we use %>% because it's more common in existing code.
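One of the "fewer features" is the `.` placeholder, which only magrittr provides. The sketch below compares the two pipes; it assumes R 4.1 or newer (for `|>` and the `\(d)` function shorthand) and an installed magrittr. The numbers are arbitrary.

```r
library(magrittr)

x <- c(4, 9, 16)

# For simple chains both pipes give the same result
magrittr_result <- x %>% sqrt() %>% sum()   # 2 + 3 + 4 = 9
native_result   <- x |> sqrt() |> sum()     # 9

# magrittr: `.` sends the left side to an argument other than the first
digits_magrittr <- 2 %>% round(3.14159, digits = .)   # 3.14

# Native pipe in R 4.1: wrap the step in an anonymous function instead
digits_native <- 2 |> (\(d) round(3.14159, digits = d))()   # 3.14

print(c(magrittr_result, native_result))
print(c(digits_magrittr, digits_native))
```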
When NOT to Use the Pipe
The pipe isn’t always the best choice:
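# Be careful piping into functions that don't take data as their first argument
# (for example, lm() expects a formula first, so data %>% lm(y ~ x) does not do what you want)
c(1, 2, 3) %>% plot(main = "My Plot") # Works because x is plot()'s first argument, but plot(c(1, 2, 3), main = "My Plot") is just as clear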
# Don't use for simple one-step operations (unnecessary)
x %>% mean() # Just use: mean(x)
# Don't make chains too long (it will be hard to debug)
# If your chain is >10 steps, consider breaking it up and saving intermediate results.
# DO use pipes when you have > 2 steps that transform data
data %>%
  step1() %>%
  step2() %>%
  step3()
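When a function does not take the data as its first argument, magrittr's `.` placeholder offers one way to keep piping. This sketch uses `lm()` with the `mtcars` data set that ships with R, and assumes magrittr is installed; it is an illustration, not part of the workshop analysis.

```r
library(magrittr)

# lm() takes a formula first and the data second,
# so `.` tells the pipe where the left side should go
model <- mtcars %>% lm(mpg ~ wt, data = .)

# The same model written without the pipe
same_model <- lm(mpg ~ wt, data = mtcars)

print(coef(model))
print(all.equal(coef(model), coef(same_model)))   # TRUE
```

For a single step like this, the version without the pipe is arguably the clearer of the two.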
pivot_longer(data, cols, names_to, values_to) # Wide → Long
pivot_wider(data, names_from, values_from) # Long → Wide
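The two reshaping functions undo each other, which is easiest to see on a tiny table. The `long` data frame below is invented for illustration, and the code assumes tidyr is installed.

```r
library(tidyr)

# Hypothetical long-format data: one row per site and species
long <- data.frame(
  site      = c("A", "A", "B"),
  species   = c("sp1", "sp2", "sp1"),
  abundance = c(3, 5, 2)
)

# Long -> Wide: one row per site, one column per species;
# combinations that were never recorded are filled with 0
wide <- pivot_wider(long,
  names_from = species, values_from = abundance, values_fill = 0
)
print(wide)

# Wide -> Long: back to one row per site and species
back_to_long <- pivot_longer(wide,
  cols = c(sp1, sp2), names_to = "species", values_to = "abundance"
)
print(back_to_long)
```

Note that the round trip returns four rows rather than three, because the filled-in zero for site B and sp2 is now an explicit row.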
specnumber(comm_matrix) # Richness
diversity(comm_matrix, "shannon") # Shannon H'
diversity(comm_matrix, "simpson") # Simpson 1-D
rarefy(comm_matrix, sample = n) # Rarefied richness
vegdist(comm_matrix, "bray") # Bray-Curtis distance
metaMDS(comm_matrix, k = 2) # NMDS
pcoa(dist_matrix) # PCoA (ape package)
adonis2(comm ~ habitat, data = env) # PERMANOVA
betadisper(dist, groups) # Dispersion test
simper(comm, groups) # Species contributions
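To make the first three indices less of a black box, here they are computed by hand for a single site in base R. This is a sketch of the textbook formulas (richness as the number of species present, Shannon H' = -sum(p * log(p)) with natural logs, Simpson as 1 - sum(p^2)), not a replacement for the vegan functions; the `counts` vector is made up.

```r
# Hypothetical counts for one site, invented for illustration only
counts <- c(sp1 = 10, sp2 = 10, sp3 = 20, sp4 = 0)

# Richness: number of species actually present
richness <- sum(counts > 0)   # 3

# Proportions of the species that are present
p <- counts[counts > 0] / sum(counts)   # 0.25 0.25 0.50

# Shannon H' with natural logarithms
shannon <- -sum(p * log(p))   # about 1.04

# Simpson index in its 1 - D form
simpson <- 1 - sum(p^2)   # 0.625

print(c(richness = richness, shannon = shannon, simpson = simpson))
```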
| NMDS stress | Fit quality |
|---|---|
| < 0.05 | Excellent |
| 0.05-0.10 | Good |
| 0.10-0.20 | Acceptable |
| > 0.20 | Poor |
| Value | Effect Size |
|---|---|
| < 0.05 | Tiny |
| 0.05-0.10 | Small |
| 0.10-0.25 | Medium |
| > 0.25 | Large |
ggplot(data, aes(x = var1, y = var2)) +
  geom_point() # Add geometry
| Geom | Plot Type |
|---|---|
| `geom_point()` | Scatter plot |
| `geom_boxplot()` | Box plot |
| `geom_col()` | Bar plot |
| `geom_histogram()` | Histogram |
| `geom_smooth()` | Trend line |

+ labs(x = "Label", y = "Label", title = "Title")
+ theme_bw()
+ scale_color_manual(values = c("red", "blue"))
+ facet_wrap(~ variable)
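Put together, the layers above form one plot. In a script, each `+` goes at the end of a line, so that R knows the plot continues on the next one. The sketch below assumes ggplot2 is installed; the `toy` data and the axis labels are invented for illustration.

```r
library(ggplot2)

# Hypothetical toy data, invented for illustration only
toy <- data.frame(
  richness  = c(5, 8, 12, 4, 9, 15),
  abundance = c(20, 35, 60, 15, 40, 80),
  habitat   = rep(c("forest", "grassland"), each = 3)
)

p <- ggplot(toy, aes(x = richness, y = abundance, color = habitat)) +
  geom_point() +                                    # geometry
  labs(x = "Species richness", y = "Abundance",
       title = "Abundance vs. richness") +          # labels
  theme_bw() +                                      # theme
  scale_color_manual(values = c("red", "blue")) +   # one colour per habitat
  facet_wrap(~ habitat)                             # one panel per habitat

print(p)
```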
| Error | Likely Cause | Fix |
|---|---|---|
| `object not found` | Typo or not created | Check spelling, run creation code |
| `could not find function` | Package not loaded | `library(package)` |
| `unexpected symbol` | Missing comma/parenthesis | Check syntax |
| `subscript out of bounds` | Wrong index | Check dimensions with `dim()` |

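Three of these errors can be triggered on purpose to see what they look like. The sketch below uses base R's `tryCatch()` to capture each message instead of stopping the script; the names `undefined_object` and `undefined_function` are deliberately not defined anywhere. The messages in the comments are those of an English-language R session.

```r
# Using an object that was never created
msg_object <- tryCatch(undefined_object + 1, error = conditionMessage)
print(msg_object)    # object 'undefined_object' not found

# Calling a function whose package is not loaded (or that does not exist)
msg_function <- tryCatch(undefined_function(1), error = conditionMessage)
print(msg_function)  # could not find function "undefined_function"

# Asking for row 3 of a matrix that only has 2 rows
m <- matrix(1:6, nrow = 2)
msg_index <- tryCatch(m[3, 1], error = conditionMessage)
print(msg_index)     # subscript out of bounds

# dim() shows the valid range of indices: 2 rows, 3 columns
print(dim(m))
```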
Keep this card handy during the workshop!