📦 Load Required Packages

We’ll use the following packages for this analysis:

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(readxl)

📁 Load the Data

We’re working with the dataset individualData_noStages.xlsx.

plants <- read_excel("individualData_noStages.xlsx")

summary(plants)
##        id           size_t1           fruit          survived    
##  Min.   :  1.0   Min.   : 1.185   Min.   : 0.00   Min.   :0.000  
##  1st Qu.:125.8   1st Qu.: 3.362   1st Qu.: 4.00   1st Qu.:0.000  
##  Median :250.5   Median : 4.528   Median : 8.00   Median :0.000  
##  Mean   :250.5   Mean   : 5.139   Mean   :10.15   Mean   :0.256  
##  3rd Qu.:375.2   3rd Qu.: 6.313   3rd Qu.:13.00   3rd Qu.:1.000  
##  Max.   :500.0   Max.   :22.658   Max.   :88.00   Max.   :1.000  
##                                                                  
##     n_seeds         size_t2      
##  Min.   : 0.00   Min.   : 3.380  
##  1st Qu.: 4.00   1st Qu.: 7.336  
##  Median : 8.00   Median : 9.862  
##  Mean   :10.15   Mean   :10.139  
##  3rd Qu.:13.00   3rd Qu.:12.295  
##  Max.   :88.00   Max.   :27.967  
##                  NA's   :372

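Before going further, it is worth plotting the data as you did in the practice set, for example a histogram of size_t1 or a scatterplot of size_t2 against size_t1. Note that size_t2 is NA for 372 of the 500 individuals, which matches the number that did not survive (the mean of survived is 0.256, i.e. 128 survivors).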


🧱 Define Size Classes

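To assign individuals to size classes, we first define custom breakpoints: sizes up to 5 fall in class 1 (small), sizes above 5 and up to 10 in class 2 (medium), and sizes above 10 in class 3 (large).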

# Breaks for small, medium, large individuals
size_breaks <- c(0, 5, 10, Inf)

# Create size-class columns for time 1 and time 2.
# labels = FALSE returns integer codes (1, 2, 3); a missing size stays NA.
plants <- plants %>%
  mutate(sizeClass_t1 = cut(
    size_t1,
    breaks = size_breaks,
    include.lowest = TRUE,
    labels = FALSE
  )) %>%
  mutate(sizeClass_t2 = cut(
    size_t2,
    breaks = size_breaks,
    include.lowest = TRUE,
    labels = FALSE
  ))
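
To see how these breaks behave at the boundaries, here is a small sketch on made-up sizes (the vector example_sizes is hypothetical, not taken from the dataset). By default cut() builds right-closed intervals, so a size of exactly 5 lands in class 1 and a size of exactly 10 in class 2.

```r
# Same breakpoints as in the tutorial
size_breaks <- c(0, 5, 10, Inf)

# Hypothetical sizes chosen to probe the boundaries (not real data)
example_sizes <- c(1.2, 5, 5.1, 10, 22.7, NA)

# labels = FALSE returns integer class codes instead of a factor
example_classes <- cut(
  example_sizes,
  breaks = size_breaks,
  include.lowest = TRUE,
  labels = FALSE
)

example_classes
## [1]  1  1  2  2  3 NA
```

A missing size stays NA, so the 372 individuals with no size_t2 in the summary above also have a missing sizeClass_t2.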

🔄 Transition Probabilities from Stage 1

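We’ll now calculate the probability that an individual in Stage 1 at time 1 is found in Stage 1, 2, or 3 at time 2. Each probability is a count divided by the total number of Stage 1 individuals at time 1, including those that did not survive.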

Total individuals in Stage 1 at time 1:

stage1Total <- plants %>%
  filter(sizeClass_t1 == 1) %>%
  nrow()

stage1Total
## [1] 296

Survived and remained in Stage 1:

stage1Remain <- plants %>%
  filter(sizeClass_t1 == 1, sizeClass_t2 == 1) %>%
  nrow()

t1_1 <- stage1Remain / stage1Total
t1_1
## [1] 0.01351351

Survived and moved to Stage 2:

stage1ToStage2 <- plants %>%
  filter(sizeClass_t1 == 1, sizeClass_t2 == 2) %>%
  nrow()

t1_2 <- stage1ToStage2 / stage1Total
t1_2
## [1] 0.06081081

Survived and moved to Stage 3:

stage1ToStage3 <- plants %>%
  filter(sizeClass_t1 == 1, sizeClass_t2 == 3) %>%
  nrow()

t1_3 <- stage1ToStage3 / stage1Total
t1_3
## [1] 0
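
The three probabilities above do not sum to 1. The denominator, stage1Total, includes every individual that started in Stage 1, while filter() drops rows where sizeClass_t2 is NA (presumably the individuals that died). As a rough check, the sketch below recomputes the fates of Stage 1 individuals from counts back-calculated from the printed proportions (4 and 18 out of 296; these counts are inferred, not printed in the output) and treats the remainder as lost.

```r
# Counts inferred from the printed output (assumption: 0.01351351 * 296 = 4
# and 0.06081081 * 296 = 18); the tutorial does not print them directly
stage1Total    <- 296
stage1Remain   <- 4
stage1ToStage2 <- 18
stage1ToStage3 <- 0

t1_1 <- stage1Remain   / stage1Total
t1_2 <- stage1ToStage2 / stage1Total
t1_3 <- stage1ToStage3 / stage1Total

# Whatever is left over was not recorded in any size class at time 2
stage1Lost <- 1 - (t1_1 + t1_2 + t1_3)

round(c(stay = t1_1, to2 = t1_2, to3 = t1_3, lost = stage1Lost), 4)
```

Here the leftover share is 274 of 296, or roughly 0.93, so most Stage 1 individuals do not appear in any class at time 2.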

🔁 Repeat for Stages 2 and 3

Now repeat the same process to calculate the transition probabilities from Stage 2 and Stage 3.

💡 Tip: Use the same structure — filter by sizeClass_t1 == 2 or 3, and then count how many individuals moved to each sizeClass_t2.
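
One way to avoid copying the same block for every pair of stages is to wrap the pattern in a small helper. The sketch below is only an illustration: transition_prob() and the toy data frame are hypothetical and not part of the tutorial's dataset. The helper divides the number of individuals moving from one class to another by everyone who started in the source class, so individuals with a missing sizeClass_t2 still count in the denominator.

```r
library(dplyr)

# Hypothetical helper: probability of moving from class `from` to class `to`
transition_prob <- function(data, from, to) {
  # Everyone who started in the source class, survivors or not
  n_start <- data %>%
    filter(sizeClass_t1 == from) %>%
    nrow()
  # Those who ended up in the target class (rows with NA are dropped by filter)
  n_moved <- data %>%
    filter(sizeClass_t1 == from, sizeClass_t2 == to) %>%
    nrow()
  n_moved / n_start
}

# Toy data (made up): four individuals start in class 2,
# and one of them has no class at time 2
toy <- data.frame(
  sizeClass_t1 = c(2, 2, 2, 2, 3),
  sizeClass_t2 = c(2, 3, 3, NA, 3)
)

transition_prob(toy, from = 2, to = 2)  # 1 of 4 stayed in class 2
transition_prob(toy, from = 2, to = 3)  # 2 of 4 moved to class 3
```

With the real data the call would look like transition_prob(plants, from = 2, to = 1), and so on for each combination of starting and ending class.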


🌱 Reproduction: Estimating Seedling Input

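We’ll now estimate how many seedlings each individual contributed, on average, in each stage class. The approach is to allocate the seedlings observed at time 2 to stage classes in proportion to each class’s seed production, then divide by the number of individuals in that class.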

Let’s assume 409 seedlings were observed at time 2.

Seedlings_t2 <- 409

Step 1: Total seed production by stage class

reprod <- plants %>%
  group_by(sizeClass_t1) %>%
  summarise(
    nIndivs = n(),
    totalSeeds = sum(n_seeds)
  ) %>%
  mutate(prop_seeds = totalSeeds / sum(totalSeeds))

reprod
## # A tibble: 3 × 4
##   sizeClass_t1 nIndivs totalSeeds prop_seeds
##          <int>   <int>      <dbl>      <dbl>
## 1            1     296       1535      0.302
## 2            2     172       2443      0.481
## 3            3      32       1098      0.216

This gives the proportion of seeds produced by each stage class, which we’ll use to distribute seedlings.


Step 2: Distribute seedlings proportionally by stage

reprod <- reprod %>%
  mutate(seedlingsPerStage = Seedlings_t2 * prop_seeds)

reprod
## # A tibble: 3 × 5
##   sizeClass_t1 nIndivs totalSeeds prop_seeds seedlingsPerStage
##          <int>   <int>      <dbl>      <dbl>             <dbl>
## 1            1     296       1535      0.302             124. 
## 2            2     172       2443      0.481             197. 
## 3            3      32       1098      0.216              88.5

Step 3: Estimate per-individual reproduction

reprod <- reprod %>%
  mutate(seedlingsPerIndiv = seedlingsPerStage / nIndivs)

reprod
## # A tibble: 3 × 6
##   sizeClass_t1 nIndivs totalSeeds prop_seeds seedlingsPerStage seedlingsPerIndiv
##          <int>   <int>      <dbl>      <dbl>             <dbl>             <dbl>
## 1            1     296       1535      0.302             124.              0.418
## 2            2     172       2443      0.481             197.              1.14 
## 3            3      32       1098      0.216              88.5             2.76

Now we have an estimate of how many seedlings each individual in each stage class contributed, on average.
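
The three steps collapse into a single expression: seedlings per individual = Seedlings_t2 * (totalSeeds / sum(totalSeeds)) / nIndivs. As a quick cross-check, the sketch below plugs in the counts printed in the tables above, using plain vectors instead of the reprod tibble (the vector names mirror the column names and are only for illustration).

```r
# Values copied from the printed reprod table
nIndivs      <- c(296, 172, 32)
totalSeeds   <- c(1535, 2443, 1098)
Seedlings_t2 <- 409

# Share of all seeds produced by each stage class
prop_seeds <- totalSeeds / sum(totalSeeds)

# Seedlings attributed to each class, then averaged over its individuals
seedlingsPerStage <- Seedlings_t2 * prop_seeds
seedlingsPerIndiv <- seedlingsPerStage / nIndivs

# Should match the last column of the table: 0.418, 1.14, 2.76
signif(seedlingsPerIndiv, 3)

# The per-stage values add back up to the 409 observed seedlings
sum(seedlingsPerStage)
```

The per-stage values sum to 409 by construction, because prop_seeds sums to 1. The allocation implicitly assumes that a seed is equally likely to become a seedling regardless of the size class of its parent.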


✅ Summary

In this tutorial, you:

| Step | What You Did |
|---|---|
| Defined size classes | Used cut() with custom breaks |
| Calculated transitions | Counted survival and growth transitions |
| Estimated reproduction | Linked seed output to seedling contribution |