Excel File Path:
C:/Users/tlittmann/OneDrive - USDA/Juniper mastication/BLM/data/CedarCreekMastication.xlsx
Sheets Used: CoarseWoodyDebris, BeltTransects, Clipping, PlantComp, belt revision, SpeciesKeys
Hardcopies:
All field datasheets have been scanned and archived in the Juniper
mastication project folder (see filepath).
Year | Sampling Dates | Locations |
---|---|---|
2024 | June 24, 25 / July 1 | 26 |
2025 | June 9, 10, 11, 26 | 22 |
Data Type | Protocol/Descriptor | Expected Target |
---|---|---|
Brown’s lines | By size class (25 m transect) | 26 transects (’24), 22 transects (’25) |
Belt Transect | # hits by species (1 × 25 m transect) | 26 transects (’24), 22 transects (’25) |
Productivity | # of samples per transect | 5/transect |
Composition | # of quadrats examined per transect | 5/transect |
All data from both 2024 and 2025 sampling events have been compiled into a single Excel file, with individual sheets assigned for each sampling category.
Summary of Recorded Observations (data entry):
The table below describes the observations recorded in the Excel file for each transect.
year | transect | Browns_Lines | BeltTransect | Composition | Clips |
---|---|---|---|---|---|
2024 | 37 | S: 32, M: 17, L: 4, XL: 1, Mass: 0 | Species: 3, Hits: 48 | 5 | 5 |
2024 | 38 | S: 51, M: 14, L: 4, XL: 0, Mass: 0 | Species: 5, Hits: 160 | 5 | 5 |
2024 | 39 | S: 52, M: 10, L: 0, XL: 0, Mass: 0 | Species: 5, Hits: 14 | 5 | 5 |
2024 | 40 | S: 17, M: 4, L: 0, XL: 0, Mass: 0 | Species: 3, Hits: 7 | 5 | 5 |
2024 | 41 | S: 90, M: 21, L: 15, XL: 2, Mass: 0 | Species: 4, Hits: 14 | 5 | 5 |
2024 | 42 | S: 14, M: 36, L: 6, XL: 1, Mass: 0 | Species: 4, Hits: 16 | 5 | 5 |
2024 | 43 | S: 21, M: 12, L: 3, XL: 0, Mass: 0 | Species: 4, Hits: 13 | 5 | 5 |
2024 | 44 | S: 70, M: 17, L: 1, XL: 0, Mass: 0 | Species: 2, Hits: 8 | 5 | 5 |
2024 | 45 | S: 38, M: 17, L: 7, XL: 3, Mass: 0 | Species: 6, Hits: 22 | 5 | 5 |
2024 | 46 | S: 45, M: 13, L: 9, XL: 2, Mass: 0 | Species: 4, Hits: 25 | 5 | 5 |
2024 | 47 | S: 30, M: 8, L: 3, XL: 0, Mass: 0 | Species: 6, Hits: 19 | 5 | 5 |
2024 | 48 | S: 46, M: 11, L: 0, XL: 0, Mass: 0 | Species: 3, Hits: 20 | 5 | 5 |
2024 | 49 | S: 14, M: 8, L: 0, XL: 0, Mass: 0 | Species: 6, Hits: 10 | 5 | 5 |
2024 | 50 | S: 16, M: 0, L: 1, XL: 0, Mass: 0 | Species: 4, Hits: 8 | 5 | 5 |
2024 | 51 | S: 23, M: 9, L: 0, XL: 0, Mass: 0 | Species: 5, Hits: 9 | 5 | 5 |
2024 | 52 | S: 18, M: 5, L: 1, XL: 0, Mass: 0 | Species: 4, Hits: 7 | 5 | 5 |
2024 | 53 | S: 7, M: 7, L: 1, XL: 0, Mass: 0 | Species: 1, Hits: 3 | 5 | 5 |
2024 | 54 | S: 22, M: 1, L: 0, XL: 0, Mass: 0 | Species: 4, Hits: 9 | 5 | 5 |
2024 | 55 | S: 27, M: 4, L: 1, XL: 0, Mass: 0 | Species: 2, Hits: 4 | 5 | 5 |
2024 | 56 | S: 37, M: 11, L: 2, XL: 2, Mass: 0 | Species: 4, Hits: 21 | 5 | 5 |
2024 | 57 | S: 16, M: 9, L: 5, XL: 0, Mass: 0 | Species: 6, Hits: 12 | 5 | 5 |
2024 | 58 | S: 10, M: 5, L: 2, XL: 0, Mass: 0 | Species: 4, Hits: 15 | 5 | 5 |
2025 | 101A | S: 20, M: 15, L: 5, XL: 0, Mass: 0 | Species: 8, Hits: 11 | 5 | 5 |
2025 | 101B | S: 11, M: 11, L: 0, XL: 0, Mass: 0 | Species: 4, Hits: 5 | 5 | 5 |
2025 | 102A | S: 25, M: 17, L: 0, XL: 0, Mass: 0 | Species: 3, Hits: 3 | 5 | 5 |
2025 | 102B | S: 3, M: 4, L: 0, XL: 0, Mass: 0 | Species: 2, Hits: 4 | 5 | 5 |
2025 | 103A | S: 13, M: 14, L: 1, XL: 0, Mass: 0 | Species: 7, Hits: 9 | 5 | 5 |
2025 | 103B | S: 0, M: 0, L: 0, XL: 0, Mass: 0 | Species: 3, Hits: 5 | 5 | 5 |
2025 | 104A | S: 13, M: 8, L: 2, XL: 0, Mass: 0 | Species: 5, Hits: 5 | 5 | 5 |
2025 | 104B | S: 0, M: 0, L: 0, XL: 0, Mass: 0 | Species: 3, Hits: 4 | 5 | 5 |
2025 | 105A | S: 13, M: 5, L: 0, XL: 0, Mass: 0 | Species: 5, Hits: 6 | 5 | 5 |
2025 | 105B | S: 0, M: 0, L: 0, XL: 0, Mass: 0 | Species: 2, Hits: 7 | 5 | 5 |
2025 | 106A | S: 7, M: 8, L: 0, XL: 0, Mass: 0 | Species: 4, Hits: 5 | 5 | 5 |
2025 | 106B | S: 4, M: 3, L: 0, XL: 0, Mass: 0 | Species: 3, Hits: 7 | 5 | 5 |
2025 | 107A | S: 21, M: 13, L: 1, XL: 0, Mass: 0 | Species: 6, Hits: 11 | 5 | 5 |
2025 | 107B | S: 1, M: 2, L: 0, XL: 0, Mass: 0 | Species: 3, Hits: 3 | 5 | 5 |
2025 | 41 | S: 20, M: 23, L: 0, XL: 0, Mass: 4 | Species: 3, Hits: 3 | 5 | 5 |
2025 | 42 | S: 0, M: 0, L: 0, XL: 0, Mass: 1 | Species: 1, Hits: 1 | 5 | 5 |
2025 | 43 | S: 19, M: 27, L: 12, XL: 0, Mass: 3 | Species: 2, Hits: 2 | 5 | 5 |
2025 | 44 | S: 40, M: 30, L: 7, XL: 1, Mass: 4 | Species: 2, Hits: 2 | 5 | 5 |
2025 | 47 | S: 77, M: 75, L: 3, XL: 0, Mass: 2 | Species: 5, Hits: 5 | 5 | 5 |
2025 | 48 | S: 39, M: 8, L: 4, XL: 1, Mass: 2 | Species: 3, Hits: 3 | 5 | 5 |
2025 | 49 | S: 47, M: 59, L: 7, XL: 0, Mass: 2 | Species: 4, Hits: 4 | 5 | 5 |
2025 | 50 | S: 112, M: 76, L: 24, XL: 0, Mass: 2 | Species: 2, Hits: 5 | 5 | 5 |
2025 | 51 | S: 59, M: 64, L: 16, XL: 0, Mass: 2 | Species: 4, Hits: 4 | 5 | 5 |
2025 | 52 | S: 69, M: 36, L: 4, XL: 0, Mass: 2 | Species: 4, Hits: 4 | 5 | 5 |
2025 | 53 | S: 70, M: 111, L: 4, XL: 1, Mass: 2 | Species: 3, Hits: 3 | 5 | 5 |
2025 | 54 | S: 33, M: 50, L: 4, XL: 0, Mass: 2 | Species: 3, Hits: 3 | 5 | 5 |
Total_small | Total_medium | Total_large | Total_Above8 | Total_Mass_Ranges |
---|---|---|---|---|
1412 | 898 | 159 | 14 | 28 |
year | date | transect | small | medium | large | Above8 | mass |
---|---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Duplicates |
---|
0 |
A “Net weight” variable was added to inspect outlying values after tare weights were applied. The summary confirms that no data entries are missing. Duplicates were counted by flagging any combination of year, transect, and quadrat that appears more than once (n() > 1).
- Duplicates were detected on the initial run. The affected entries were examined (reweighed for GrossMass and cross-referenced). Raw data were changed or removed as needed, and rerunning the script showed no further duplicates.
Year | Transect | Quadrat | Gross | NetMass | Duplicates |
---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 |
transect | quadrat | year | Year | Transect | Quadrat | Gross_mass | Net_Weight |
---|---|---|---|---|---|---|---|
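The net-weight and duplicate checks described above can be sketched as follows. This is a minimal illustration in base R on made-up values, not the project data. The column names (`GrossMass`, `BagType`) and the tare weights follow the script at the end of this report; the toy data frame `clips` and the helper objects `negative_rows` and `duplicate_rows` are hypothetical.

```r
# Toy clipping records (hypothetical values, not project data)
clips <- data.frame(
  year      = c(2024, 2024, 2024, 2024),
  transect  = c("37", "37", "37", "38"),
  quadrat   = c(1, 2, 2, 1),
  BagType   = c("B", "S", "S", "NH"),
  GrossMass = c(12.884, 4.000, 4.000, 3.5),
  stringsAsFactors = FALSE
)

# Tare weights (g) by bag type, as defined in the report's script
tare_weights <- c(B = 8.884, S = 4.418, L = 21.294, NH = 0)

# Net weight = gross mass minus the tare of the bag that was used
clips$net_weight_g <- clips$GrossMass - unname(tare_weights[clips$BagType])

# Negative net weights point to a wrong bag type or a weighing error
negative_rows <- clips[clips$net_weight_g < 0, ]

# A year/transect/quadrat key that occurs more than once is a duplicate
key <- paste(clips$year, clips$transect, clips$quadrat, sep = "_")
n_per_key <- table(key)
duplicate_rows <- clips[key %in% names(n_per_key)[n_per_key > 1], ]
```

Listing the flagged rows, rather than only counting them, makes it easier to pull the matching hard copy and decide whether to reweigh or remove an entry.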
Year | Transect | Quadrat | Code | Cover | Duplicates |
---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 |
year | date | transect | quadrat | code | cover | sheetID | binomial | common |
---|---|---|---|---|---|---|---|---|
Class Codes
After removing all zero entries, every cover value was counted to ensure that only values specified by the Daubenmire method were used.
Daubenmire class entries that were flagged, checked against the hard copies, and changed:
Year | Transect/code | Daubenmire value change |
---|---|---|
2024 | 45 JUSC | 4 changed to 3 |
2024 | 58 CAREX | 68 changed to 86 |
cover | n |
---|---|
1 | 77 |
3 | 771 |
16 | 489 |
38 | 221 |
63 | 102 |
86 | 43 |
98 | 48 |
100 | 15 |
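A check of this kind can be sketched in a few lines of base R. The accepted values below are the ones that appear in the cover table above and in `break_values` in the script; the `cover` vector is hypothetical, with 68 standing in for a transposed 86 like the one corrected for transect 58.

```r
# Cover values accepted in this dataset (from the cover table above)
valid_midpoints <- c(1, 3, 16, 38, 63, 86, 98, 100)

# Toy cover entries (hypothetical); 68 mimics a transposed 86
cover <- c(3, 16, 0, 68, 38, NA, 86)

# Drop zeros and missing values, as in the report, then flag anything
# that is not one of the accepted values
recorded <- cover[!is.na(cover) & cover != 0]
invalid  <- recorded[!recorded %in% valid_midpoints]

# Count how often each accepted class occurs
class_counts <- table(factor(recorded[recorded %in% valid_midpoints],
                             levels = valid_midpoints))
```

Any value left in `invalid` would then be looked up on the scanned datasheet before the raw data are changed.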
Plant Codes
After removing all non-plant codes, the table below shows the ten most frequently recorded plant species and their mean cover.
code | Frequency | Mean_Cover |
---|---|---|
CAREX | 123 | 21.658537 |
PASM | 116 | 15.491379 |
BRJA | 70 | 5.742857 |
POTA | 67 | 18.298507 |
ANPA | 48 | 10.541667 |
PSTE | 41 | 5.073171 |
ACMI | 40 | 6.375000 |
ASCR | 33 | 4.181818 |
DEPI | 31 | 7.387097 |
JUSC | 30 | 35.000000 |
The distribution of plant cover for 2024 and 2025 is shown below (note that Daubenmire classes are recorded as specific midpoint values).
Range vs HeightHits
Belt transect data were divided into two variables (Range and HeightHit) so that missing or mistaken entries could be accounted for.
year | transect | n |
---|---|---|
2024 | 37 | 48 |
2024 | 38 | 160 |
2024 | 39 | 19 |
2024 | 40 | 11 |
2024 | 41 | 21 |
2024 | 42 | 31 |
2024 | 43 | 18 |
2024 | 44 | 10 |
2024 | 45 | 78 |
2024 | 46 | 114 |
2024 | 47 | 24 |
2024 | 48 | 20 |
2024 | 49 | 25 |
2024 | 50 | 32 |
2024 | 51 | 10 |
2024 | 52 | 10 |
2024 | 53 | 3 |
2024 | 54 | 9 |
2024 | 55 | 5 |
2024 | 56 | 21 |
2024 | 57 | 19 |
2024 | 58 | 129 |
2025 | 101A | 49 |
2025 | 101B | 22 |
2025 | 102A | 12 |
2025 | 102B | 22 |
2025 | 103A | 38 |
2025 | 103B | 21 |
2025 | 104A | 22 |
2025 | 104B | 22 |
2025 | 105A | 33 |
2025 | 105B | 43 |
2025 | 106A | 14 |
2025 | 106B | 44 |
2025 | 107A | 64 |
2025 | 107B | 14 |
2025 | 41 | 6 |
2025 | 42 | 8 |
2025 | 43 | 8 |
2025 | 44 | 8 |
2025 | 47 | 17 |
2025 | 48 | 5 |
2025 | 49 | 12 |
2025 | 50 | 44 |
2025 | 51 | 8 |
2025 | 52 | 15 |
2025 | 53 | 12 |
2025 | 54 | 7 |
year | date | transect | code | HeightHit |
---|---|---|---|---|
0 | 0 | 0 | 0 | 0 |
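As a rough illustration of dividing belt transect entries into Range and HeightHit, the base R sketch below separates a single comma-separated field entry into the two kinds of record. The entry itself is hypothetical, and treating any value with a hyphen as a start-end range is an assumption about how the field data were written down.

```r
# Hypothetical raw field entry mixing both kinds of record
raw_entry <- "12, 45, 130-210, 8, 300-340"

# Split on commas and trim whitespace
parts <- trimws(strsplit(raw_entry, ",", fixed = TRUE)[[1]])

# Entries containing a hyphen are start-end ranges; the rest are heights
is_range   <- grepl("-", parts, fixed = TRUE)
ranges     <- parts[is_range]
height_hit <- suppressWarnings(as.numeric(parts[!is_range]))

# Anything that is neither a range nor a clean number surfaces here
unparsed <- parts[!is_range][is.na(height_hit)]
```

Keeping `unparsed` as its own vector gives a short list of entries to check against the hard copies.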
Range
Range entries were checked for valid formatting (start-end) and logical order (start less than end) so that improper range measurements are caught.
## No invalid range hits found in the dataset.
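The two checks, format and logical order, can be illustrated with a small base R sketch. The entries are hypothetical, and the pattern `^[0-9]+-[0-9]+$` is written to match the same strings as the script's `^\\d+-\\d+$`.

```r
# Hypothetical range entries: one valid, one reversed, one malformed
ranges <- c("130-210", "340-300", "12to40")

# Format check: digits, a hyphen, digits
valid_format <- grepl("^[0-9]+-[0-9]+$", ranges)

# Extract start and end points; malformed entries become NA
start <- suppressWarnings(as.numeric(sub("-.*$", "", ranges)))
end   <- suppressWarnings(as.numeric(sub("^.*-", "", ranges)))

# Logical check: the start point must come before the end point
logical_order <- !is.na(start) & !is.na(end) & start < end

# An entry is invalid if it fails either check
invalid <- ranges[!(valid_format & logical_order)]
```

Here the reversed entry passes the format check but fails the order check, which is why both conditions are tested.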
HeightHits
Below is a visualization of the woody plant height distribution for 2024 and 2025.
Heights of plants under 1 m were examined; no outliers above 100 cm or below 0 cm were detected.
## ✅ No HeightHit outliers found above 100 or below 0.
Coverage (canopy)
After the range values were separated into individual length measurements, outliers were checked by flagging lengths below 0 cm or above 20 cm.
year | transect | species | length_cm |
---|---|---|---|
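The length check can be sketched in base R as shown here. The start and end points are hypothetical, the name `length_cm` follows the script, and the thresholds of 0 and 20 are the ones stated above.

```r
# Hypothetical start and end points already parsed from range entries
start <- c(130, 300, 500, 620)
end   <- c(140, 340, 495, 625)

# Length covered by each range
length_cm <- end - start

# Thresholds used in the report: below 0 or above 20 is flagged
lower <- 0
upper <- 20
is_outlier <- length_cm < lower | length_cm > upper
outliers <- data.frame(start, end, length_cm)[is_outlier, ]
```

Negative lengths can only be flagged if they are still present when the thresholds are applied, so any filtering on `length_cm` should happen after this step.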
Proper Species Mapping for Belt Transects
year | date | transect | code | HeightHit | sheetID | binomial | common |
---|---|---|---|---|---|---|---|
The data summarized above have been corrected as needed and prepared for future use. Actions to verify, correct, or modify data were carried out through R scripts, physical examination of samples, review of the Excel spreadsheets, and inspection of hard-copy materials.
• Improved categorization of data: information entered into a spreadsheet should be uniform, with every entry meeting the conditions of the data type being recorded. (Case: Belt Transect entries should have had a separate column for each data type; i.e., “Height measured/≤1m” and “start-end points/≥1m” should be recorded as different types of data.)
• Data wrangling
• Update/modify/clarify species list to ensure accuracy
• Protocol to methods drafting
knitr::opts_chunk$set(echo = FALSE,
warning = FALSE,
message = FALSE)
library(ggplot2)
library(tidyverse)
library(dplyr)
library(tidyr)
library(knitr)
library(readxl)
library(stringr)
if (!require("pacman")) install.packages("pacman")
# Path to the project workbook, defined once and reused for every sheet
xlsx_path <- "C:/Users/tlittmann/USDA/Rangeland responses to fire - Woody plants/Juniper mastication/BLM/data/CedarCreekMastication.xlsx"
WoodyDebris <- read_excel(xlsx_path, sheet = "CoarseWoodyDebris")
BeltTransect <- read_excel(xlsx_path, sheet = "BeltTransects")
Clipping <- read_excel(xlsx_path, sheet = "Clipping")
PlantComp <- read_excel(
  xlsx_path,
  sheet = "PlantComp",
  col_types = c("numeric", "date", "text", "text", "text", "numeric")
)
Revision_Belttransect <- read_excel(xlsx_path, sheet = "belt revision")
SpeciesKey <- read_excel(xlsx_path, sheet = "SpeciesKeys")
WoodyDebris <- WoodyDebris %>% mutate(transect = as.character(transect))
BeltTransect <- BeltTransect %>% mutate(transect = as.character(transect))
PlantComp <- PlantComp %>% mutate(transect = as.character(transect), quadrat = as.character(quadrat))
Clipping <- Clipping %>% mutate(transect = as.character(transect))
Above8_count <- WoodyDebris %>%
separate_longer_delim(Above8, ",") %>%
group_by(transect, year) %>%
summarize(Above8 = sum(Above8 != 0), .groups = "drop")
mass_count <- WoodyDebris %>%
separate_longer_delim(mass, ",") %>%
group_by(transect, year) %>%
summarize(mass = sum(!is.na(mass)), .groups = "drop")
WoodyD_trimmed <- WoodyDebris %>%
select(-Above8, -mass)
WoodyD_combined <- WoodyD_trimmed %>%
full_join(Above8_count, by = c("transect", "year")) %>%
full_join(mass_count, by = c("transect", "year"))
WD_sum <- WoodyD_combined %>%
mutate(WoodyDebris = paste0("S: ", small, ", M: ", medium,
", L: ", large, ", XL: ", Above8,
", Mass: ", mass)) %>%
select(transect, year, WoodyDebris)
BeltTransect_summary <- BeltTransect %>%
group_by(year, transect) %>%
summarise(
BeltTransect_HeightHits = sum(!is.na(HeightHit)),
BeltTransect_Species = n_distinct(code),
.groups = "drop"
)
Composition_summary <- PlantComp %>%
mutate(quadrat = as.character(quadrat)) %>% # Ensure 'quadrat' is treated as a character
group_by(year, transect) %>%
summarise(n_quadrats = n_distinct(quadrat), .groups = "drop")
Clipping_summary <- Clipping %>%
group_by(year, transect) %>%
summarise(Clipping_Quadrats = n(), .groups = "drop")
summary_table <- WD_sum %>%
full_join(BeltTransect_summary, by = c("year", "transect")) %>%
full_join(Composition_summary, by = c("year", "transect")) %>%
full_join(Clipping_summary, by = c("year", "transect")) %>%
arrange(year, transect)
summary_table_clean <- summary_table %>%
rename(HeightHits = BeltTransect_HeightHits,
Species = BeltTransect_Species,
Composition = n_quadrats,
Clips = Clipping_Quadrats
)
summary_table_clean %>%
mutate(
Browns_Lines = WoodyDebris,
BeltTransect = paste0("Species: ", Species, ", Hits: ", HeightHits)
) %>%
select(year, transect, Browns_Lines, BeltTransect, Composition, Clips) %>%
arrange(year, transect) %>%
kable(
caption = "Transect Summary Table",
align = "c")
WoodyDebris %>%
select(year, transect, small, medium, large, Above8, mass) %>%
group_by(year, transect) %>%
summarise(Row_Count = n(), .groups = "drop")%>%
kable(caption = "Row Count")
# Splitting the comma-separated mass column with separate_rows() repeats
# small/medium/large on every split row, so the size-class totals are summed
# from the unsplit data to avoid double counting.
size_totals <- WoodyDebris %>%
  summarise(
    Total_small = sum(small, na.rm = TRUE),
    Total_medium = sum(medium, na.rm = TRUE),
    Total_large = sum(large, na.rm = TRUE)
  )
# Above8 and mass hold comma-separated entries; their totals come from the
# per-transect tallies computed earlier (Above8_count and mass_count).
summary_data <- size_totals %>%
  mutate(
    Total_Above8 = sum(Above8_count$Above8, na.rm = TRUE),
    Total_Mass_Ranges = sum(mass_count$mass, na.rm = TRUE)
  )
kable(summary_data, caption = "Debris/Mass Totals")
# Step 3: Create the plot using pivot_longer
#summary_data %>%
# pivot_longer(everything(), names_to = "size", values_to = "count") %>%
#ggplot(aes(x = size, y = count, fill = size)) +
#geom_col(show.legend = FALSE) +
#labs(
# title = "Total Woody Debris Counts by Size Class",
#x = "Size Class",
#y = "Number of Hits"
#) +
#theme_minimal()
missing_summary <- WoodyD_combined %>%
summarise(across(everything(), ~sum(is.na(.)), .names = "{col}"))
missing_summary %>%
kable(caption = "Missing Values in Woody Debris Dataset")
duplicates <- WoodyDebris %>%
group_by(transect, year, small, medium, large) %>%
filter(n() > 1) %>%
ungroup() %>%
summarise(Duplicates = n())
kable(duplicates, caption = "Total Number of Duplicate Entries")
Clipping_summary <- Clipping %>%
group_by(year, transect) %>%
summarise(Clipping_Quadrats = n(), .groups = "drop")
# Define tare weights
tare_weights <- c(
B = 8.884,
S = 4.418,
L = 21.294,
NH = 0
)
Clip_net<- Clipping %>%
mutate(
net_weight_g = GrossMass - tare_weights[BagType],
net_weight_g = ifelse(net_weight_g < 0, 0, net_weight_g)
)
missing_summary <- Clip_net %>%
summarise(
Year = sum(is.na(year)),
Transect = sum(is.na(transect)),
Quadrat = sum(is.na(quadrat)),
Gross = sum(is.na(GrossMass)),
NetMass = sum(is.na(net_weight_g)),
Duplicates = n() - n_distinct(year, transect, quadrat)
)
kable(missing_summary, caption = "Missing Values and Duplicates", align = "c")
ggplot(Clip_net, aes(x = net_weight_g)) +
geom_histogram(binwidth = 1, fill = "steelblue", color = "white") +
labs(
title = "Distribution of Net Clipping Weights",
x = "Net Weight (g)",
y = "Frequency"
) +
theme_minimal()
Clip_net <- Clipping %>%
group_by(year, transect, quadrat) %>%
mutate(GrossMass = mean(GrossMass, na.rm = FALSE)) %>%
distinct(year, transect, quadrat, .keep_all = TRUE) %>%
ungroup() %>%
mutate(net_weight_g = GrossMass - tare_weights[BagType])
negative_v <- Clip_net %>%
mutate(negative_weight = if_else(!is.na(net_weight_g) & net_weight_g < 0, TRUE, FALSE)) %>%
filter(negative_weight) %>%
group_by(transect, quadrat, year) %>%
summarise(
Year = sum(is.na(year)),
Transect = sum(is.na(transect)),
Quadrat = sum(is.na(quadrat)),
Gross_mass = sum(is.na(GrossMass)),
Net_Weight = sum(is.na(net_weight_g)),
.groups = "drop"
)
kable(negative_v, caption = "Negative Values", align = "c")
PlantComp %>%
mutate(
year = as.character(year),
transect = as.character(transect),
quadrat = as.character(quadrat),
code = as.character(code),
cover = as.character(cover)
) %>%
select(year, transect, quadrat, code, cover) %>%
summarise(
Year = sum(is.na(year)),
Transect = sum(is.na(transect)),
Quadrat = sum(is.na(quadrat)),
Code = sum(is.na(code)),
Cover = sum(is.na(cover)),
Duplicates = n() - n_distinct(year, transect, quadrat, code, cover)
) %>%
kable(caption = "Missing Values and Duplicate Count")
Species_Map <- PlantComp %>%
left_join(SpeciesKey %>% select(code, binomial), by = "code") %>%
filter(!complete.cases(.)) %>%
group_by(transect, quadrat, code, year) %>%
summarize(instances = n(), .groups = "drop") %>%
pivot_wider(names_from = year, values_from = instances)
unmatched <- PlantComp %>%
left_join(SpeciesKey, by = "code") %>%
filter(is.na(binomial))
kable(unmatched, caption = "Unmatched Species Instances")
PlantComp %>%
filter(!is.na(cover) & cover != 0) %>%
count(cover) %>%
arrange(cover) %>%
knitr::kable(
caption = "Counts of Daubenmire Cover Classes",
align = "c"
)
PlantComp %>%
filter(
!is.na(cover) & cover != 0,
!code %in% c("HERBLIT", "SWODE", "BAREG", "SDHERB", "CRUST")
) %>%
distinct(year, transect, quadrat, code, .keep_all = TRUE) %>%
group_by(code) %>%
summarise(
Frequency = n(),
Mean_Cover = mean(cover, na.rm = TRUE),
.groups = "drop"
) %>%
arrange(desc(Frequency)) %>%
head(10) %>%
knitr::kable(
caption = "Top 10 Plant Species (Frequency and Mean)",
align = "c"
)
break_values <- c(1, 3, 16, 38, 63, 86, 98, 100)
PlantComp %>%
filter(!is.na(cover) & cover != 0) %>%
filter(cover %in% break_values) %>% # Keep only the specified cover values
mutate(cover = as.factor(cover)) %>% # Treat cover as categorical
group_by(year, cover) %>%
summarize(count = n(), .groups = "drop") %>%
ggplot(aes(x = cover, y = count)) +
geom_bar(stat = "identity", fill = "lightgreen", color = "white") +
facet_wrap(~year) +
labs(
title = "Plant Cover Distribution (Daubenmire)",
x = "Cover Class (%)",
y = "# of Instances"
) +
theme_minimal(base_size = 14)
BeltTransect %>%
separate_longer_delim(HeightHit, ",") %>%
mutate(hit = as.character(HeightHit)) %>%
group_by(year, transect) %>%
summarise(n = n(), .groups = "drop") %>%
arrange(year, transect) %>%
knitr::kable(
caption = "Counts of Hits",
align = "c"
)
missing_BT <- BeltTransect %>%
summarise(across(everything(), ~sum(is.na(.)), .names = "{col}"))
missing_BT %>%
kable(caption = "Woody Transect Missing Values")
DF_clean <- Revision_Belttransect %>%
separate_rows(Range, sep = ",") %>%
mutate(Range = str_trim(Range))
valid_range_pattern <- "^\\d+-\\d+$"
DF_validated <- DF_clean %>%
mutate(
Range_valid_format = str_detect(Range, valid_range_pattern),
X = as.numeric(str_extract(Range, "^\\d+")),
Y = as.numeric(str_extract(Range, "(?<=-)\\d+")),
Range_logical = X < Y,
Range_valid = Range_valid_format & Range_logical
)
# Keep only entries that fail the format check or the start < end check
DF_invalid_hits <- DF_validated %>%
  filter(!is.na(Range), Range != "", !Range_valid) %>%
  select(year, transect, species, HeightHit, Range, X, Y)
# Conditional output
if (nrow(DF_invalid_hits) > 0) {
  kable(DF_invalid_hits, caption = "Invalid Range Hits")
} else {
  cat("No invalid range hits found in the dataset.\n")
}
Revision_Belttransect %>%
mutate(HeightHit = as.numeric(HeightHit)) %>%
filter(!is.na(HeightHit), !is.na(year)) %>%
ggplot(aes(x = HeightHit)) +
geom_histogram(binwidth = 5, fill = "steelblue", color = "white") +
facet_wrap(~ year) +
labs(title = "Distribution of HeightHit by Year", x = "HeightHit", y = "Count") +
theme_minimal()
library(tidyverse)
library(knitr)
if (
Revision_Belttransect %>%
mutate(HeightHit = as.numeric(HeightHit)) %>%
filter(!is.na(HeightHit)) %>%
filter(HeightHit > 100 | HeightHit < 0) %>%
nrow() > 0
) {
Revision_Belttransect %>%
mutate(HeightHit = as.numeric(HeightHit)) %>%
filter(!is.na(HeightHit)) %>%
filter(HeightHit > 100 | HeightHit < 0) %>%
select(year, transect, species, HeightHit) %>%
arrange(desc(HeightHit)) %>%
kable(caption = "HeightHit Outliers: Values Above 100 or Below 0")
} else {
cat("✅ No HeightHit outliers found above 100 or below 0.\n")
}
Range_lengths <- Revision_Belttransect %>%
mutate(
Range_parts = str_split(Range, ","),
Range_clean = map(Range_parts, str_trim),
Range_hits = map(Range_clean, ~ .x[str_detect(.x, "-")])
) %>%
select(year, transect, species, Range_hits) %>%
unnest(Range_hits) %>%
separate(Range_hits, into = c("start", "end"), sep = "-", remove = FALSE) %>%
mutate(
start = as.numeric(str_trim(start)),
end = as.numeric(str_trim(end)),
length_cm = end - start
) %>%
filter(!is.na(length_cm)) # keep zero and negative lengths so the outlier check can flag them
# Plot histogram
ggplot(Range_lengths, aes(x = length_cm)) +
geom_histogram(binwidth = 3, fill = "red", color = "white") +
labs(
title = "Canopy Coverage (range length)",
x = "Length (cm)",
y = "Frequency"
) +
facet_wrap(~ year)+
theme_minimal()
Range_lengths %>%
filter(!is.na(length_cm)) %>%
filter(length_cm > 20| length_cm < 0) %>%
select(year, transect, species, length_cm) %>%
arrange(desc(length_cm)) %>%
kable(caption = "Coverage (Range Length) Outliers")
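# Two views of codes missing from SpeciesKey: Species_Map counts incomplete rows
# per transect and code by year; unmatched lists every row with no binomial match.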
Species_Map <- BeltTransect %>%
left_join(SpeciesKey %>% select(code, binomial), by = "code") %>%
filter(!complete.cases(.)) %>%
group_by(transect, code, year) %>%
summarize(instances = n(), .groups = "drop") %>%
pivot_wider(names_from = year, values_from = instances)
unmatched <- BeltTransect %>%
left_join(SpeciesKey, by = "code") %>%
filter(is.na(binomial))
kable(unmatched, caption = "Unmatched Species Instances")