Here I document my initial exploration of our REU2025 butternut health assessment data from the first 4 weeks of field days.

Uploading

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.2     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.4     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readr)
health_assess <- read_csv("butternut_health_assessment_form_jun_2025.csv")
## Rows: 47 Columns: 54
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (45): Timestamp, Email Address, Date, Site Number or Initial: JC-W-_____...
## dbl  (4): Slope (degree), Shape of terminal bud, East, South
## lgl  (5): If VOUCHERS were collected, how many?, If LEAVES were collected, h...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#View(health_assess)

Pre-processing

health_assess %>% head()
## # A tibble: 6 × 54
##   Timestamp  `Email Address` Date  Site Number or Initi…¹ Plant Number (e.g. 4…²
##   <chr>      <chr>           <chr> <chr>                  <chr>                 
## 1 6/5/2025 … eleavens@morto… 6/5/… SH                     70                    
## 2 6/5/2025 … eleavens@morto… 6/5/… SH                     67                    
## 3 6/5/2025 … eleavens@morto… 6/5/… SH                     71                    
## 4 6/5/2025 … eleavens@morto… 6/5/… SH                     72                    
## 5 6/5/2025 … eleavens@morto… 6/5/… SH                     18                    
## 6 6/5/2025 … eleavens@morto… 6/5/… SH                     17                    
## # ℹ abbreviated names: ¹​`Site Number or Initial: JC-W-_______`,
## #   ²​`Plant Number (e.g. 4th tree assessed will be 4)`
## # ℹ 49 more variables: `Number of the 1st photo taken` <chr>,
## #   `GPS location NORTH` <chr>, `GPS Location WEST` <chr>,
## #   `Slope (degree)` <dbl>, `Aspect (N, NE, E, etc)` <chr>,
## #   `Plant Height (in FEET)` <chr>, `DBH (in CENTIMETERS)` <chr>,
## #   `Producing seed?` <chr>, `Roughly how many seeds are on the tree?` <chr>, …

Ignoring the first 16 rows

The first few rows are ignored for this initial exploration because they were entered while we were still developing the form, so their values are not recorded accurately in the table.

We can remove them with the slice command: slice(17:n()) says “keep entries 17 to the end,” where n() is the total number of rows. Row 17 itself is kept, so the first 16 individuals are dropped and the remaining 31 of the 47 are kept.

health_assess <- health_assess %>% slice(17:n())
view(health_assess)
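As a quick check of what `slice(17:n())` keeps, the sketch below runs the same call on a stand-in tibble with 47 rows (the row count reported by `read_csv()` above). The `toy` tibble and its `row_id` column are made up for illustration.

```r
library(dplyr)
library(tibble)

# Stand-in for the form export: 47 rows with a made-up row_id column
toy <- tibble(row_id = 1:47)

# Same call as above: keep row 17 through the last row
kept <- toy %>% slice(17:n())

nrow(kept)        # number of rows kept
min(kept$row_id)  # first row kept
```

Row 17 is the first row kept, so 16 rows are dropped and 31 remain, which matches the 31-row tibbles printed later in this document.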

This leaves only the data from 06-12-2025 and 06-19-2025.

Ignoring columns that aren’t directly necessary

Since some of the columns aren’t directly necessary, I am going to ignore them for ease of use.

health_assess <- read_csv("butternut_health_assessment_form_jun_2025.csv")
health_assess <- health_assess %>% slice(17:n())

health_assess <- health_assess %>% select(
  -`Producing seed?`,
  -`Roughly how many seeds are on the tree?`,
  -`How many seed are in each bunch (average estimate)?`,
  
  # Collections
  -`What did you collect?`,
  -`If VOUCHERS were collected, how many?`,
  -`If LEAVES were collected, how many?`,
  -`If CUTTINGS were collected, how many?`,
  -`If SEEDS were collected, how many?`,
  -`If other collections were made, please describe them here including the number collected.`,
  
  -`How deep are the furrows in the bark?`,
  -`What shade (from light/white to dark) is the tree bark?`,
  
  #Editing
  -`Edited after field collection? (Y/N)`,
  -`If edited, what date:`,
  -`If edited, what:`
  )
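A more compact variant, sketched here on a made-up three-column tibble: collect the unwanted headings in a character vector and drop them with `any_of()`, which skips names that are not present (handy if the form changes). The `toy` values and the `drop_cols` name are illustrative, not part of the real data.

```r
library(dplyr)
library(tibble)

# Made-up rows using three of the real column headings
toy <- tibble(
  `Producing seed?` = c("Yes", "No"),
  `What did you collect?` = c("Leaves", NA),
  `Visible cankers?` = c("Yes", "Yes")
)

# Headings to drop; the last one is absent from toy on purpose
drop_cols <- c(
  "Producing seed?",
  "What did you collect?",
  "If edited, what:"
)

# any_of() ignores names that are not in the data instead of erroring
trimmed <- toy %>% select(-any_of(drop_cols))
names(trimmed)
```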

Renaming the important columns/variables

Because we are using a Google Form, every column heading is the full question text. This is one of the disadvantages of the Google Form; however, we can rename the columns to make the data easier to work with.

# Note: to rename a column you need to copy its name exactly as it is stored. You can see the column names using the 'names' command.
names(health_assess)
##  [1] "Timestamp"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
##  [2] "Email Address"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
##  [3] "Date"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
##  [4] "Site Number or Initial: JC-W-_______"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
##  [5] "Plant Number (e.g. 4th tree assessed will be 4)"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
##  [6] "Number of the 1st photo taken"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
##  [7] "GPS location NORTH"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
##  [8] "GPS Location WEST"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
##  [9] "Slope (degree)"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
## [10] "Aspect (N, NE, E, etc)"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
## [11] "Plant Height (in FEET)"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
## [12] "DBH (in CENTIMETERS)"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [13] "Percent live canopy (estimate to the nearest 10% increment, being sure to only include live branches in assessment)\r\n\r\nNote: This is a measure of crown density. In order to estimate this, first envision the amount of canopy there would be if the tree were fully healthy. Butternuts do not typically have a tightly formed canopy even when healthy so be sure to evaluate based on branch presence and location. Then estimate what percent of the envisioned canopy is actually present. This will be your estimate of percent live canopy."
## [14] "What is the crown class of this individual?"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
## [15] "Number of epicormic branches / sprouts from the base"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [16] "Number of epicormic branches / sprouts from the trunk"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## [17] "Visible cankers?"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
## [18] "If large cankers are present, are they being calloused over?"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
## [19] "How much area of the trunk below first main branch is infected by canker, measured as a percentage of total trunk with cankers visible (including cankering visible underneath uplifted bark)?"                                                                                                                                                                                                                                                                                                                                                         
## [20] "At the part of the trunk that appears most girdled by canker, what portion of the circumference of the trunk is girdled?"                                                                                                                                                                                                                                                                                                                                                                                                                               
## [21] "How much area of the base/ root flare is infected by canker, e.g. as a percentage of root flare (up to 10 cm above soil) with cankers visible (including underneath bark)?"                                                                                                                                                                                                                                                                                                                                                                             
## [22] "Assess severity of infection. Focus on the bottom 10 feet of the tree when assessing the number and size of cankers, noting that cankers can be hard to see on old trees with thick bark. CANKERS:"                                                                                                                                                                                                                                                                                                                                                     
## [23] "Assess severity of infection. CANOPY:"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## [24] "Shape of terminal bud"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## [25] "Shape of bud scar"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
## [26] "Shape / length of lenticels"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
## [27] "Hairs on the end of the twigs"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## [28] "North"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## [29] "East"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [30] "South"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## [31] "West"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [32] "Riparian or upland?"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
## [33] "Associated tree species within 20 meters."                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## [34] "Any additional notes?"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## [35] "Number of the last photo taken"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
## [36] "Is this individual a seedling?"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
## [37] "What competition is potentially threatening this tree?"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
## [38] "If you answered \"Other\" above, please explain."                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
## [39] "Does this seedling show signs of damage from any of the following?"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
## [40] "Does this tree show any signs of any of the following?"
# Percent_live_canopy
health_assess <- health_assess %>% rename(percent_live_canopy = `Percent live canopy (estimate to the nearest 10% increment, being sure to only include live branches in assessment)\r\n\r\nNote: This is a measure of crown density. In order to estimate this, first envision the amount of canopy there would be if the tree were fully healthy. Butternuts do not typically have a tightly formed canopy even when healthy so be sure to evaluate based on branch presence and location. Then estimate what percent of the envisioned canopy is actually present. This will be your estimate of percent live canopy.`)

# crown class
health_assess <- health_assess %>% rename(crown_class = `What is the crown class of this individual?`)

#base_epicormics
health_assess <- health_assess %>% rename(base_epicormics = `Number of epicormic branches / sprouts from the base`)

#trunk_epicormics
health_assess <- health_assess %>% rename(trunk_epicormics = `Number of epicormic branches / sprouts from the trunk`)

#has_canker
health_assess <- health_assess %>% rename(has_canker = `Visible cankers?`)

#has_callous
health_assess <- health_assess %>% rename(has_callous = `If large cankers are present, are they being calloused over?`)

#trunk_canker_area
health_assess <- health_assess %>% rename(trunk_canker_area = `How much area of the trunk below first main branch is infected by canker, measured as a percentage of total trunk with cankers visible (including cankering visible underneath uplifted bark)?`)


#girdled_circum
health_assess <- health_assess %>% rename(girdled_circum = `At the part of the trunk that appears most girdled by canker, what portion of the circumference of the trunk is girdled?`)

#base_canker_area
health_assess <- health_assess %>% rename(base_canker_area = `How much area of the base/ root flare is infected by canker, e.g. as a percentage of root flare (up to 10 cm above soil) with cankers visible (including underneath bark)?`)

#purdue_severity_based_on_canker
health_assess <- health_assess %>% rename(purdue_severity_based_on_canker = `Assess severity of infection. Focus on the bottom 10 feet of the tree when assessing the number and size of cankers, noting that cankers can be hard to see on old trees with thick bark. CANKERS:`)

#purdue_severity_based_on_canopy
health_assess <- health_assess %>% rename(purdue_severity_based_on_canopy = `Assess severity of infection. CANOPY:`)
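The renames above can also be written as one call driven by a lookup vector. This is a minimal sketch on a made-up tibble that reuses three of the real headings; the `toy` values and the `lookup` name are assumptions for illustration.

```r
library(dplyr)
library(tibble)

# Made-up rows using three of the real column headings
toy <- tibble(
  `Visible cankers?` = c("Yes", "No"),
  `What is the crown class of this individual?` = c("A", "B"),
  `Assess severity of infection. CANOPY:` = c("1", "3")
)

# Named vector: new_name = "old name"
lookup <- c(
  has_canker = "Visible cankers?",
  crown_class = "What is the crown class of this individual?",
  purdue_severity_based_on_canopy = "Assess severity of infection. CANOPY:"
)

# all_of() errors if an old name is missing, which catches typos early
renamed <- toy %>% rename(all_of(lookup))
names(renamed)
```

Keeping the old and new names together in one vector makes it easier to update the mapping when the form wording changes.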

Now the variables read:

names(health_assess)
##  [1] "Timestamp"                                                         
##  [2] "Email Address"                                                     
##  [3] "Date"                                                              
##  [4] "Site Number or Initial: JC-W-_______"                              
##  [5] "Plant Number (e.g. 4th tree assessed will be 4)"                   
##  [6] "Number of the 1st photo taken"                                     
##  [7] "GPS location NORTH"                                                
##  [8] "GPS Location WEST"                                                 
##  [9] "Slope (degree)"                                                    
## [10] "Aspect (N, NE, E, etc)"                                            
## [11] "Plant Height (in FEET)"                                            
## [12] "DBH (in CENTIMETERS)"                                              
## [13] "percent_live_canopy"                                               
## [14] "crown_class"                                                       
## [15] "base_epicormics"                                                   
## [16] "trunk_epicormics"                                                  
## [17] "has_canker"                                                        
## [18] "has_callous"                                                       
## [19] "trunk_canker_area"                                                 
## [20] "girdled_circum"                                                    
## [21] "base_canker_area"                                                  
## [22] "purdue_severity_based_on_canker"                                   
## [23] "purdue_severity_based_on_canopy"                                   
## [24] "Shape of terminal bud"                                             
## [25] "Shape of bud scar"                                                 
## [26] "Shape / length of lenticels"                                       
## [27] "Hairs on the end of the twigs"                                     
## [28] "North"                                                             
## [29] "East"                                                              
## [30] "South"                                                             
## [31] "West"                                                              
## [32] "Riparian or upland?"                                               
## [33] "Associated tree species within 20 meters."                         
## [34] "Any additional notes?"                                             
## [35] "Number of the last photo taken"                                    
## [36] "Is this individual a seedling?"                                    
## [37] "What competition is potentially threatening this tree?"            
## [38] "If you answered \"Other\" above, please explain."                  
## [39] "Does this seedling show signs of damage from any of the following?"
## [40] "Does this tree show any signs of any of the following?"

Summary Graphs

has_canker

health_assess %>% ggplot(aes(x=has_canker)) + geom_bar()

has_callous

A majority of individuals so far don’t have callousing.

health_assess %>% ggplot(aes(x=has_callous)) + geom_bar()
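To put numbers next to a bar chart like this one, `count()` gives the tallies behind the bars. The sketch below uses made-up responses, not the real data, and the `prop` column is an added convenience.

```r
library(dplyr)
library(tibble)

# Made-up responses for illustration only
toy <- tibble(has_callous = c("No", "No", "Yes", NA, "No"))

# One row per response, with its count and share of all rows
callous_counts <- toy %>%
  count(has_callous) %>%
  mutate(prop = n / sum(n))

callous_counts
```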

seedlings

health_assess %>% ggplot(aes(x=`Is this individual a seedling?`)) + geom_bar()

riparian or upland

health_assess %>% ggplot(aes(x=`Riparian or upland?`)) + geom_bar()

densiometer

select(health_assess, North)
## # A tibble: 31 × 1
##    North
##    <chr>
##  1 <NA> 
##  2 <NA> 
##  3 <NA> 
##  4 64   
##  5 <NA> 
##  6 5    
##  7 14   
##  8 <NA> 
##  9 <NA> 
## 10 <NA> 
## # ℹ 21 more rows
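`North` prints as `<chr>` here, and the import message lists only `East` and `South` among the `dbl` columns, so the four readings cannot be combined as they are. A minimal sketch on made-up values, assuming the four columns are the densiometer readings taken facing each direction, that `West` is also stored as text, and that a simple per-tree mean is what is wanted:

```r
library(dplyr)
library(readr)
library(tibble)

# Made-up readings; North and West stored as text, East and South as numbers
toy <- tibble(
  North = c("64", "5", NA),
  East  = c(60, 8, NA),
  South = c(70, 3, NA),
  West  = c("58", "4", NA)
)

densio <- toy %>%
  # Convert the text columns to numbers
  mutate(across(c(North, West), parse_number)) %>%
  # Mean of the four readings for each tree; NA if any reading is missing
  mutate(densio_mean = rowMeans(across(North:West)))

densio
```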

Displays of how the variables need to be updated

Changes impacting survey

DBH needs to be standardized as a number

health_assess %>% ggplot(aes(x=`DBH (in CENTIMETERS)`)) + geom_bar()

Plant height also needs to be standardized, perhaps into inches?

Right now I can’t make a histogram because the column is stored as text, so there is no continuous variable to base it on.

health_assess %>% ggplot(aes(x=`Plant Height (in FEET)`)) + geom_bar()
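Once the column holds numbers, a histogram becomes possible. The sketch below assumes entries look like the made-up ones shown (a number with optional unit text); `readr::parse_number()` keeps the first number it finds and drops the surrounding text. Entries such as ranges or feet-and-inches would need separate handling, and the `height_ft` name and the bin width are arbitrary choices.

```r
library(dplyr)
library(readr)
library(ggplot2)
library(tibble)

# Made-up entries for illustration only
toy <- tibble(`Plant Height (in FEET)` = c("30", "45 ft", "12.5", NA))

# Pull the number out of each text entry; NA stays NA
toy <- toy %>%
  mutate(height_ft = parse_number(`Plant Height (in FEET)`))

# Histogram of the numeric heights, leaving out missing values
height_plot <- toy %>%
  filter(!is.na(height_ft)) %>%
  ggplot(aes(x = height_ft)) +
  geom_histogram(binwidth = 10)
```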

Our continuous area variables are stored as strings, not numbers

My thought is that each of these columns needs to be standardized as a number; in a future version of the form, this could mean restricting the question to number-only input.

trunk_canker_area

health_assess %>% ggplot(aes(x=trunk_canker_area)) + geom_bar()

Right now the categories are ordered as character strings (alphabetically), not by their numeric value.

health_assess <- health_assess %>%
  mutate(trunk_canker_area = fct_reorder(trunk_canker_area, trunk_canker_area, .desc = TRUE))
## Warning: There were 2 warnings in `mutate()`.
## The first warning was:
## ℹ In argument: `trunk_canker_area = fct_reorder(trunk_canker_area,
##   trunk_canker_area, .desc = TRUE)`.
## Caused by warning:
## ! `fct_reorder()` removing 7 missing values.
## ℹ Use `.na_rm = TRUE` to silence this message.
## ℹ Use `.na_rm = FALSE` to preserve NAs.
## ℹ Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.
health_assess %>% ggplot(aes(x=trunk_canker_area)) + geom_bar()

health_assess %>% select(trunk_canker_area)
## # A tibble: 31 × 1
##    trunk_canker_area
##    <fct>            
##  1 <NA>             
##  2 <NA>             
##  3 5%               
##  4 5%               
##  5 25%              
##  6 15%              
##  7 10               
##  8 <NA>             
##  9 30%              
## 10 5%               
## # ℹ 21 more rows
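The `fct_reorder()` call above passes the character column as its own ordering variable, so it has no numeric value to sort on. One way to get numeric ordering, sketched on the ten values printed above: parse the number out of each entry and order the factor levels by it. Rows with no recorded value are dropped first; the `trunk_canker_pct` name is an assumption.

```r
library(dplyr)
library(readr)
library(forcats)
library(tibble)

# First ten values as printed above
toy <- tibble(
  trunk_canker_area = c(NA, NA, "5%", "5%", "25%", "15%", "10", NA, "30%", "5%")
)

ordered_area <- toy %>%
  filter(!is.na(trunk_canker_area)) %>%
  mutate(
    # "5%" becomes 5 and "10" becomes 10
    trunk_canker_pct = parse_number(trunk_canker_area),
    # Order the text labels by their numeric value
    trunk_canker_area = fct_reorder(trunk_canker_area, trunk_canker_pct)
  )

levels(ordered_area$trunk_canker_area)
```

Plotting `trunk_canker_pct` directly is another option, since it treats "10" and "10%" as the same value.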

girdled_circum

health_assess %>% ggplot(aes(x=girdled_circum)) + geom_bar()

health_assess %>% select(girdled_circum)
## # A tibble: 31 × 1
##    girdled_circum
##    <chr>         
##  1 <NA>          
##  2 <NA>          
##  3 20%           
##  4 50%           
##  5 65%           
##  6 40%           
##  7 40            
##  8 <NA>          
##  9 25%           
## 10 20%           
## # ℹ 21 more rows

base_canker_area

health_assess %>% ggplot(aes(x=base_canker_area)) + geom_bar()

health_assess %>% select(base_canker_area)
## # A tibble: 31 × 1
##    base_canker_area
##    <chr>           
##  1 <NA>            
##  2 <NA>            
##  3 30%             
##  4 40%             
##  5 80%             
##  6 0%              
##  7 15              
##  8 <NA>            
##  9 30%             
## 10 20%             
## # ℹ 21 more rows
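All three percentage columns mix entries with and without a percent sign, so the same conversion can be applied to them together. A minimal sketch using rows 3 to 7 of each column as printed above; the `pct_cols` name is an assumption, and `as.character()` is there in case a column has already been turned into a factor.

```r
library(dplyr)
library(readr)
library(tibble)

# Rows 3 to 7 of the three columns as printed above
toy <- tibble(
  trunk_canker_area = c("5%", "5%", "25%", "15%", "10"),
  girdled_circum    = c("20%", "50%", "65%", "40%", "40"),
  base_canker_area  = c("30%", "40%", "80%", "0%", "15")
)

pct_cols <- c("trunk_canker_area", "girdled_circum", "base_canker_area")

# Convert every listed column from text to a number, ignoring the "%" sign
numeric_pct <- toy %>%
  mutate(across(all_of(pct_cols), ~ parse_number(as.character(.x))))

numeric_pct
```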

Formatting changes

Canker severity categories need to be mapped to shorter names (just the numbers).

health_assess %>% ggplot(aes(x=purdue_severity_based_on_canker)) + geom_bar()
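One possible way to shorten the labels, assuming each category begins with its numeric score. The labels below are hypothetical placeholders, since the real form wording is not shown in this document, and `canker_severity` is a made-up column name.

```r
library(dplyr)
library(stringr)
library(tibble)

# Hypothetical labels: the real category text is not shown here
toy <- tibble(
  purdue_severity_based_on_canker = c(
    "1 (placeholder description)",
    "3 (placeholder description)",
    "2 (placeholder description)",
    NA
  )
)

# Keep only the leading digits of each label as an integer score
toy <- toy %>%
  mutate(
    canker_severity = as.integer(
      str_extract(purdue_severity_based_on_canker, "^\\d+")
    )
  )

toy$canker_severity
```

If the real labels do not start with a number, the pattern passed to `str_extract()` would need to change.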