BLM Juniper Mastication QA/QC

File Sources and Organization

Excel File Path:
C:/Users/tlittmann/OneDrive - USDA/Juniper mastication/BLM/data/CedarCreekMastication.xlsx
Sheets Used:
- CoarseWoodyDebris
- BeltTransects
- Clipping
- PlantComp
Hardcopies:
All field datasheets have been scanned and archived in the Juniper mastication project folder (see filepath).

Sampling Dates

Year	Sampling Dates	Locations
2024	June 24, 25 / July 1	26
2025	June 9, 10, 11, 26	22

Sampling Effort

Data Type	Protocol/Descriptor	Expected Target
Brown’s lines	By class size (25 m transect)	26 transects (’24), 22 transects (’25)
Belt Transect	# hits by species (1 x 25 m transect)	26 transects (’24), 22 transects (’25)
Productivity	# of samples per transect	5/transect
Composition	# of quadrats examined per transect	5/transect

Data Summary Overview

All data from both 2024 and 2025 sampling events have been compiled into a single Excel file, with individual sheets assigned for each sampling category.

Summary of Recorded Observations (data entry):

Below are descriptors for all observations recorded in the Excel file.

Coarse Woody Debris:
- 47 total entries
- 7 variables
- Size classes include small, medium, and large debris
- Pieces > 8 cm measured to exact dimensions
Tree/Shrub Coverage:
- 589 total entries
- 5 variables
- Measurements taken using the “HeightHits” protocol
- Heights < 1 m measured in cm (Height)
- Heights > 1 m recorded as start-end points (Range)
Plant Composition:
- 2540 total entries
- 6 variables
- Quadrats assessed with Daubenmire method
- Percentage range midpoints used as recorded values
- Species identified using standardized plant codes
Productivity:
- 241 total entries
- 5 variables
- Herbaceous biomass clipped to 3 cm from ground
- Woody and non-target debris excluded
- Weights recorded post-tare using averaged bag weight
Documentation:
- All hardcopies of datasheets have been scanned
- Digital backups stored alongside electronic data

Transect Summary Table
year	transect	Browns_Lines	BeltTransect	Composition	Clips
2024	37	S: 32, M: 17, L: 4, XL: 1, Mass: 0	Species: 3, Hits: 48	5	5
2024	38	S: 51, M: 14, L: 4, XL: 0, Mass: 0	Species: 5, Hits: 160	5	5
2024	39	S: 52, M: 10, L: 0, XL: 0, Mass: 0	Species: 5, Hits: 14	5	5
2024	40	S: 17, M: 4, L: 0, XL: 0, Mass: 0	Species: 3, Hits: 7	5	5
2024	41	S: 90, M: 21, L: 15, XL: 2, Mass: 0	Species: 4, Hits: 14	5	5
2024	42	S: 14, M: 36, L: 6, XL: 1, Mass: 0	Species: 4, Hits: 16	5	5
2024	43	S: 21, M: 12, L: 3, XL: 0, Mass: 0	Species: 4, Hits: 13	5	5
2024	44	S: 70, M: 17, L: 1, XL: 0, Mass: 0	Species: 2, Hits: 8	5	5
2024	45	S: 38, M: 17, L: 7, XL: 3, Mass: 0	Species: 6, Hits: 22	5	5
2024	46	S: 45, M: 13, L: 9, XL: 2, Mass: 0	Species: 4, Hits: 25	5	5
2024	47	S: 30, M: 8, L: 3, XL: 0, Mass: 0	Species: 6, Hits: 19	5	5
2024	48	S: 46, M: 11, L: 0, XL: 0, Mass: 0	Species: 3, Hits: 20	5	5
2024	49	S: 14, M: 8, L: 0, XL: 0, Mass: 0	Species: 6, Hits: 10	5	5
2024	50	S: 16, M: 0, L: 1, XL: 0, Mass: 0	Species: 4, Hits: 8	5	5
2024	51	S: 23, M: 9, L: 0, XL: 0, Mass: 0	Species: 5, Hits: 9	5	5
2024	52	S: 18, M: 5, L: 1, XL: 0, Mass: 0	Species: 4, Hits: 7	5	5
2024	53	S: 7, M: 7, L: 1, XL: 0, Mass: 0	Species: 1, Hits: 3	5	5
2024	54	S: 22, M: 1, L: 0, XL: 0, Mass: 0	Species: 4, Hits: 9	5	5
2024	55	S: 27, M: 4, L: 1, XL: 0, Mass: 0	Species: 2, Hits: 4	5	5
2024	56	S: 37, M: 11, L: 2, XL: 2, Mass: 0	Species: 4, Hits: 21	5	5
2024	57	S: 16, M: 9, L: 5, XL: 0, Mass: 0	Species: 6, Hits: 12	5	5
2024	58	S: 10, M: 5, L: 2, XL: 0, Mass: 0	Species: 4, Hits: 15	5	5
2025	101A	S: 20, M: 15, L: 5, XL: 0, Mass: 0	Species: 8, Hits: 11	5	5
2025	101B	S: 11, M: 11, L: 0, XL: 0, Mass: 0	Species: 4, Hits: 5	5	5
2025	102A	S: 25, M: 17, L: 0, XL: 0, Mass: 0	Species: 3, Hits: 3	5	5
2025	102B	S: 3, M: 4, L: 0, XL: 0, Mass: 0	Species: 2, Hits: 4	5	5
2025	103A	S: 13, M: 14, L: 1, XL: 0, Mass: 0	Species: 7, Hits: 9	5	5
2025	103B	S: 0, M: 0, L: 0, XL: 0, Mass: 0	Species: 3, Hits: 5	5	5
2025	104A	S: 13, M: 8, L: 2, XL: 0, Mass: 0	Species: 5, Hits: 5	5	5
2025	104B	S: 0, M: 0, L: 0, XL: 0, Mass: 0	Species: 3, Hits: 4	5	5
2025	105A	S: 13, M: 5, L: 0, XL: 0, Mass: 0	Species: 5, Hits: 6	5	5
2025	105B	S: 0, M: 0, L: 0, XL: 0, Mass: 0	Species: 2, Hits: 7	5	5
2025	106A	S: 7, M: 8, L: 0, XL: 0, Mass: 0	Species: 4, Hits: 5	5	5
2025	106B	S: 4, M: 3, L: 0, XL: 0, Mass: 0	Species: 3, Hits: 7	5	5
2025	107A	S: 21, M: 13, L: 1, XL: 0, Mass: 0	Species: 6, Hits: 11	5	5
2025	107B	S: 1, M: 2, L: 0, XL: 0, Mass: 0	Species: 3, Hits: 3	5	5
2025	41	S: 20, M: 23, L: 0, XL: 0, Mass: 4	Species: 3, Hits: 3	5	5
2025	42	S: 0, M: 0, L: 0, XL: 0, Mass: 1	Species: 1, Hits: 1	5	5
2025	43	S: 19, M: 27, L: 12, XL: 0, Mass: 3	Species: 2, Hits: 2	5	5
2025	44	S: 40, M: 30, L: 7, XL: 1, Mass: 4	Species: 2, Hits: 2	5	5
2025	47	S: 77, M: 75, L: 3, XL: 0, Mass: 2	Species: 5, Hits: 5	5	5
2025	48	S: 39, M: 8, L: 4, XL: 1, Mass: 2	Species: 3, Hits: 3	5	5
2025	49	S: 47, M: 59, L: 7, XL: 0, Mass: 2	Species: 4, Hits: 4	5	5
2025	50	S: 112, M: 76, L: 24, XL: 0, Mass: 2	Species: 2, Hits: 5	5	5
2025	51	S: 59, M: 64, L: 16, XL: 0, Mass: 2	Species: 4, Hits: 4	5	5
2025	52	S: 69, M: 36, L: 4, XL: 0, Mass: 2	Species: 4, Hits: 4	5	5
2025	53	S: 70, M: 111, L: 4, XL: 1, Mass: 2	Species: 3, Hits: 3	5	5
2025	54	S: 33, M: 50, L: 4, XL: 0, Mass: 2	Species: 3, Hits: 3	5	5

Coarse Woody Debris

Debris/Mass Totals
Total_small	Total_medium	Total_large	Total_Above8	Total_Mass_Ranges
1412	898	159	14	28
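
As a minimal sketch of how totals like these can be tallied when one column holds comma-separated entries, the base R example below sums size-class counts on the original rows and counts mass entries separately. The `debris` data frame and its mass strings are hypothetical; only the column names `small` and `mass` follow the workbook. The last step illustrates that summing after the mass column has been expanded into rows repeats each count once per mass entry.

```r
# Hypothetical rows: one per transect, with a comma-separated mass column
debris <- data.frame(
  transect = c("41", "42"),
  small = c(20, 0),
  mass = c("1.2, 0.8, 2.0, 1.1", "0.5"),
  stringsAsFactors = FALSE
)

# Sum size-class counts on the original rows (one row per transect)
total_small <- sum(debris$small, na.rm = TRUE)

# Split the mass strings and count the non-empty entries
mass_entries <- trimws(unlist(strsplit(debris$mass, ",")))
total_mass_entries <- sum(!is.na(mass_entries) & mass_entries != "")

# Summing after expansion repeats each count once per mass entry
entries_per_row <- lengths(strsplit(debris$mass, ","))
inflated_small <- sum(rep(debris$small, times = entries_per_row))
```

Here `total_small` reflects the recorded counts, while `inflated_small` shows how much larger the figure becomes if the rows are expanded first.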

Missing Values in Woody Debris Dataset
year	date	transect	small	medium	large	Above8	mass
0	0	0	0	0	0	0	0

Total Number of Duplicate Entries
Duplicates
0

Productivity

A “Net weight” variable was added to check for outlying values after the tare was applied. The summary confirms that no data entries are missing. Duplicate entries were detected by flagging spreadsheet rows that occur more than once (n() > 1).

- Duplicates were detected on the initial run. The affected entries were examined (samples reweighed for GrossMass and cross-referenced). Raw data were changed or removed as needed. Rerunning the script showed no further duplicates.
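
The tare and duplicate checks described above can be sketched in base R as follows. The tare weights and the `BagType` and `GrossMass` column names follow the report's script; the `clips` rows are hypothetical values made up for illustration.

```r
# Tare weights by bag type, as defined in the report's script
tare_weights <- c(B = 8.884, S = 4.418, L = 21.294, NH = 0)

# Hypothetical clipping records; the second and third rows share a key
clips <- data.frame(
  year = c(2024, 2024, 2024, 2025),
  transect = c("37", "37", "37", "41"),
  quadrat = c(1, 2, 2, 1),
  BagType = c("B", "S", "S", "NH"),
  GrossMass = c(12.884, 6.418, 6.418, 3.2),
  stringsAsFactors = FALSE
)

# Net weight = gross mass minus the tare for that bag type
clips$net_weight_g <- clips$GrossMass - unname(tare_weights[clips$BagType])

# Rows whose year/transect/quadrat key occurs more than once
key <- paste(clips$year, clips$transect, clips$quadrat, sep = "_")
dup_rows <- clips[key %in% key[duplicated(key)], ]

# Rows with a negative net weight (gross mass lighter than the bag)
negative_rows <- clips[!is.na(clips$net_weight_g) & clips$net_weight_g < 0, ]
```

Keeping negative net weights visible, rather than clamping them to zero, makes it easier to spot a wrong bag type or a mistyped gross mass.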

Missing Values and Duplicates
Year	Transect	Quadrat	Gross	NetMass	Duplicates
0	0	0	0	0	0

Negative Values
transect	quadrat	year	Year	Transect	Quadrat	Gross_mass	Net_Weight

Plant Composition

Missing Values and Duplicate Count
Year	Transect	Quadrat	Code	Cover	Duplicates
0	0	0	0	0	0

Unmatched Species Instances
year	date	transect	quadrat	code	cover	sheetID	binomial	common

Class Codes

After removing all zero entries, all class values were counted to ensure that only values specified by the Daubenmire method were used.
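
A minimal sketch of this check in base R, assuming the allowed values are the midpoints used in the report's script (`1, 3, 16, 38, 63, 86, 98, 100`). The `comp` data frame is a hypothetical stand-in for the PlantComp sheet; its first two rows reuse the two uncorrected values noted in this report.

```r
# Cover values accepted as Daubenmire class midpoints in this dataset
allowed_cover <- c(1, 3, 16, 38, 63, 86, 98, 100)

# Hypothetical composition records, including two values that are not valid classes
comp <- data.frame(
  year = c(2024, 2024, 2024),
  transect = c("45", "58", "58"),
  code = c("JUSC", "CAREX", "PASM"),
  cover = c(4, 68, 16),
  stringsAsFactors = FALSE
)

# Non-zero cover values that are not in the allowed set need a hard copy check
bad_cover <- comp[comp$cover != 0 & !(comp$cover %in% allowed_cover), ]
```

Each row in `bad_cover` identifies a transect and species code to look up on the scanned datasheet.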

Daubenmire class entries that did not match a valid class value were checked against the hard copies and corrected:

Year	Transect/code	Daubenmire value change
2024	45 JUSC	4 changed to 3
2024	58 CAREX	68 changed to 86

Counts of Daubenmire Cover Classes
cover	n
1	77
3	771
16	489
38	221
63	102
86	43
98	48
100	15

Plant Codes

After removing all non-plant codes, the table below lists the ten most frequently recorded plant species.

Top 10 Plant Species (Frequency and Mean)
code	Frequency	Mean_Cover
CAREX	123	21.658537
PASM	116	15.491379
BRJA	70	5.742857
POTA	67	18.298507
ANPA	48	10.541667
PSTE	41	5.073171
ACMI	40	6.375000
ASCR	33	4.181818
DEPI	31	7.387097
JUSC	30	35.000000

The figure below shows the distribution of plant cover for 2024 and 2025 (note that Daubenmire classes take specific midpoint values).

Belt Transects

Range vs HeightHits

Belt transect data were divided into two variables (Range and HeightHit) so that missing or mistaken entries could be traced for each type of measurement.

Counts of Hits
year	transect	n
2024	37	48
2024	38	160
2024	39	19
2024	40	11
2024	41	21
2024	42	31
2024	43	18
2024	44	10
2024	45	78
2024	46	114
2024	47	24
2024	48	20
2024	49	25
2024	50	32
2024	51	10
2024	52	10
2024	53	3
2024	54	9
2024	55	5
2024	56	21
2024	57	19
2024	58	129
2025	101A	49
2025	101B	22
2025	102A	12
2025	102B	22
2025	103A	38
2025	103B	21
2025	104A	22
2025	104B	22
2025	105A	33
2025	105B	43
2025	106A	14
2025	106B	44
2025	107A	64
2025	107B	14
2025	41	6
2025	42	8
2025	43	8
2025	44	8
2025	47	17
2025	48	5
2025	49	12
2025	50	44
2025	51	8
2025	52	15
2025	53	12
2025	54	7

Woody Transect Missing Values
year	date	transect	code	HeightHit
0	0	0	0	0

Range

Range entries were checked for valid formatting (start-end) and for logical consistency (start less than end) so that improper range measurements are caught.
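
A minimal sketch of such a check in base R. The `ranges` vector is hypothetical, and the rule that the start must be smaller than the end is an assumption taken from the report's script.

```r
# Hypothetical range entries: one valid, one reversed, one missing its end point,
# one non-numeric, and one valid entry with stray whitespace
ranges <- c("120-180", "250-210", "130", "a-b", " 300-340 ")
ranges <- trimws(ranges)

# Format check: digits, a hyphen, digits
format_ok <- grepl("^[0-9]+-[0-9]+$", ranges)

# Logical check: the start point must be smaller than the end point
start <- suppressWarnings(as.numeric(sub("-.*$", "", ranges)))
end <- suppressWarnings(as.numeric(sub("^.*-", "", ranges)))
logical_ok <- !is.na(start) & !is.na(end) & start < end

# Anything that fails either check is reported as invalid
invalid <- ranges[!(format_ok & logical_ok)]
```

Reporting the entries that fail, rather than the ones that pass, keeps the output empty when the data are clean.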

## No invalid range hits found in the dataset.

HeightHits

Below is a visualization of the woody plant height distribution for both 2024 and 2025.

Upon examining heights of plants < 1 m, no outliers above 100 cm or below 0 cm were detected.

## ✅ No HeightHit outliers found above 100 or below 0.

Coverage (canopy)

After separating the range values into individual measurements, outliers were checked by flagging range lengths below 0 or above 20 cm.

Coverage (Range Length) Outliers
year	transect	species	length_cm

Proper Species Mapping for Belt Transects

Unmatched Species Instances
year	date	transect	code	HeightHit	sheetID	binomial	common

Assessment and Recommended Mitigations:

Final statement:

The data summarized above have been corrected as needed and prepared for future use. Actions to verify, correct, or modify data were carried out through R script manipulations, physical examination of samples, review of Excel spreadsheets, and inspection of hard copy materials.

Recommended mitigations:

• Improved categorization of data: entries on a spreadsheet should be uniform, with every value in a column meeting the conditions for that type of data. (Case: Belt Transect entries should have had two columns, one for each data type; i.e., “Height measured/≤1m” and “start-end points/≥1m” should be recorded separately as different types of data.)
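
As one possible illustration of this recommendation, the base R sketch below splits a mixed entry column into a height column and a range column. The `entries` vector and the column names `height_cm` and `range` are hypothetical and do not reflect the current sheet layout.

```r
# Hypothetical mixed entries: single heights and start-end ranges in one column
entries <- c("45", "120-180", "88", "250-310")

# An entry containing a hyphen is treated as a start-end range
is_range <- grepl("-", entries, fixed = TRUE)

# One column per data type; the column that does not apply is left as NA
belt <- data.frame(
  raw = entries,
  height_cm = ifelse(is_range, NA, suppressWarnings(as.numeric(entries))),
  range = ifelse(is_range, entries, NA),
  stringsAsFactors = FALSE
)
```

With separate columns, each one can be validated with a single rule (a numeric bound for heights, a format check for ranges).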

Ongoing work:

• Data wrangling

• Update/modify/clarify species list to ensure accuracy

• Drafting the methods from the protocol

knitr::opts_chunk$set(echo = FALSE, 
                      warning = FALSE, 
                      message = FALSE)

# tidyverse attaches ggplot2, dplyr, tidyr, stringr, and purrr
library(tidyverse)
library(knitr)
library(readxl)



# Define the workbook path once so every sheet is read from the same file
xlsx_path <- "C:/Users/tlittmann/USDA/Rangeland responses to fire - Woody plants/Juniper mastication/BLM/data/CedarCreekMastication.xlsx"

WoodyDebris <- read_excel(xlsx_path, sheet = "CoarseWoodyDebris")

BeltTransect <- read_excel(xlsx_path, sheet = "BeltTransects")

Clipping <- read_excel(xlsx_path, sheet = "Clipping")

PlantComp <- read_excel(
  xlsx_path,
  sheet = "PlantComp",
  col_types = c("numeric", "date", "text", "text", "text", "numeric")
)

Revision_Belttransect <- read_excel(xlsx_path, sheet = "belt revision")

SpeciesKey <- read_excel(xlsx_path, sheet = "SpeciesKeys")


WoodyDebris <- WoodyDebris %>% mutate(transect = as.character(transect))
BeltTransect <- BeltTransect %>% mutate(transect = as.character(transect))
PlantComp <- PlantComp %>% mutate(transect = as.character(transect), quadrat = as.character(quadrat))
Clipping <- Clipping %>% mutate(transect = as.character(transect))


Above8_count <- WoodyDebris %>%
  separate_longer_delim(Above8, ",") %>%
  group_by(transect, year) %>%
  summarize(Above8 = sum(Above8 != 0), .groups = "drop")



mass_count <- WoodyDebris %>%
  separate_longer_delim(mass, ",") %>%
  group_by(transect, year) %>%
  summarize(mass = sum(!is.na(mass)), .groups = "drop")


WoodyD_trimmed <- WoodyDebris %>%
  select(-Above8, -mass)

WoodyD_combined <- WoodyD_trimmed %>%
  full_join(Above8_count, by = c("transect", "year")) %>%
  full_join(mass_count, by = c("transect", "year"))

WD_sum <- WoodyD_combined %>%
  mutate(WoodyDebris = paste0("S: ", small, ", M: ", medium,
                         ", L: ", large, ", XL: ", Above8,
                         ", Mass: ", mass)) %>%
  select(transect, year, WoodyDebris)

BeltTransect_summary <- BeltTransect %>%
  group_by(year, transect) %>%
  summarise(
    BeltTransect_HeightHits = sum(!is.na(HeightHit)),  
    BeltTransect_Species = n_distinct(code),     
    .groups = "drop"
  )

Composition_summary <- PlantComp %>%
  mutate(quadrat = as.character(quadrat)) %>%  # Ensure 'quadrat' is treated as a character
  group_by(year, transect) %>%
  summarise(n_quadrats = n_distinct(quadrat), .groups = "drop") 


Clipping_summary <- Clipping %>%
  group_by(year, transect) %>%
  summarise(Clipping_Quadrats = n(), .groups = "drop")

summary_table <- WD_sum %>%
  full_join(BeltTransect_summary, by = c("year", "transect")) %>%
  full_join(Composition_summary, by = c("year", "transect")) %>%
  full_join(Clipping_summary, by = c("year", "transect")) %>%
  arrange(year, transect)


summary_table_clean <- summary_table %>%
  rename(HeightHits = BeltTransect_HeightHits,
    Species = BeltTransect_Species,
    Composition = n_quadrats,
    Clips = Clipping_Quadrats
  )

summary_table_clean %>%
  mutate(
    Browns_Lines = WoodyDebris,
    BeltTransect = paste0("Species: ", Species, ", Hits: ", HeightHits)
  ) %>%
  select(year, transect, Browns_Lines, BeltTransect, Composition, Clips) %>%
  arrange(year, transect) %>%
  kable(
    caption = "Transect Summary Table",
    align = "c")

WoodyDebris %>%
  select(year, transect, small, medium, large, Above8, mass) %>%
  group_by(year, transect) %>%
  summarise(Row_Count = n(), .groups = "drop")%>%
  kable(caption = "Row Count")

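# Size-class totals are summed on the original rows (one per transect and year).
# Expanding the comma-separated mass column first would repeat each row once per
# mass entry and inflate the small/medium/large sums.
size_totals <- WoodyDebris %>%
  summarise(
    Total_small = sum(small, na.rm = TRUE),
    Total_medium = sum(medium, na.rm = TRUE),
    Total_large = sum(large, na.rm = TRUE)
  )

# Count individual >8 cm pieces after splitting the comma-separated entries
Above8_total <- WoodyDebris %>%
  mutate(Above8 = as.character(Above8)) %>%
  separate_longer_delim(Above8, ",") %>%
  mutate(Above8 = trimws(Above8)) %>%
  summarise(Total_Above8 = sum(Above8 != "0" & Above8 != "", na.rm = TRUE))

# Count individual mass entries the same way
mass_total <- WoodyDebris %>%
  mutate(mass = as.character(mass)) %>%
  separate_longer_delim(mass, ",") %>%
  mutate(mass = trimws(mass)) %>%
  summarise(Total_Mass_Ranges = sum(mass != "" & !is.na(mass)))

summary_data <- bind_cols(size_totals, Above8_total, mass_total)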

kable(summary_data, caption = "Debris/Mass Totals")

# Step 3: Create the plot using pivot_longer
#summary_data %>%
 # pivot_longer(everything(), names_to = "size", values_to = "count") %>%
  #ggplot(aes(x = size, y = count, fill = size)) +
  #geom_col(show.legend = FALSE) +
  #labs(
   # title = "Total Woody Debris Counts by Size Class",
    #x = "Size Class",
    #y = "Number of Hits"
  #) +
  #theme_minimal()
missing_summary <- WoodyD_combined %>%
  summarise(across(everything(), ~sum(is.na(.)), .names = "{col}"))

missing_summary %>%
  kable(caption = "Missing Values in Woody Debris Dataset")

duplicates <- WoodyDebris %>%
  group_by(transect, year, small, medium, large) %>%
  filter(n() > 1) %>%
  ungroup() %>%
  summarise(Duplicates = n())

kable(duplicates, caption = "Total Number of Duplicate Entries")







# Define tare weights
tare_weights <- c(
  B = 8.884,
  S = 4.418,
  L = 21.294,
  NH = 0
)

Clip_net<- Clipping %>%
  mutate(
    net_weight_g = GrossMass - tare_weights[BagType],
    net_weight_g = ifelse(net_weight_g < 0, 0, net_weight_g)
  )



missing_summary <- Clip_net %>%
  summarise(
    Year = sum(is.na(year)),
    Transect = sum(is.na(transect)),
    Quadrat = sum(is.na(quadrat)),
    Gross = sum(is.na(GrossMass)),
    NetMass = sum(is.na(net_weight_g)),
    Duplicates = n() - n_distinct(year, transect, quadrat)
  )

kable(missing_summary, caption = "Missing Values and Duplicates", align = "c")


ggplot(Clip_net, aes(x = net_weight_g)) +
  geom_histogram(binwidth = 1, fill = "steelblue", color = "white") +
  labs(
    title = "Distribution of Net Clipping Weights",
    x = "Net Weight (g)",
    y = "Frequency"
  ) +
  theme_minimal()


Clip_net <- Clipping %>%
  group_by(year, transect, quadrat) %>%
  mutate(GrossMass = mean(GrossMass, na.rm = FALSE)) %>%
  distinct(year, transect, quadrat, .keep_all = TRUE) %>%
  ungroup() %>%
  mutate(net_weight_g = GrossMass - tare_weights[BagType])

# List any quadrats whose net weight is negative after the tare is subtracted
negative_v <- Clip_net %>%
  filter(!is.na(net_weight_g), net_weight_g < 0) %>%
  select(year, transect, quadrat, BagType, GrossMass, net_weight_g)

kable(negative_v, caption = "Negative Values", align = "c")



PlantComp %>%
  mutate(
    year = as.character(year),
    transect = as.character(transect),
    quadrat = as.character(quadrat),
    code = as.character(code),
    cover = as.character(cover)
  ) %>%
  select(year, transect, quadrat, code, cover) %>%
  summarise(
    Year = sum(is.na(year)),
    Transect = sum(is.na(transect)),
    Quadrat = sum(is.na(quadrat)),
    Code = sum(is.na(code)),
    Cover = sum(is.na(cover)),
    Duplicates = n() - n_distinct(year, transect, quadrat, code, cover)
  ) %>%
  kable(caption = "Missing Values and Duplicate Count")





Species_Map <- PlantComp %>%
  left_join(SpeciesKey %>% select(code, binomial), by = "code") %>%
  filter(!complete.cases(.)) %>%
  group_by(transect, quadrat, code, year) %>%
  summarize(instances = n(), .groups = "drop") %>%
  pivot_wider(names_from = year, values_from = instances)


unmatched <- PlantComp %>%
  left_join(SpeciesKey, by = "code") %>%
  filter(is.na(binomial))

kable(unmatched, caption = "Unmatched Species Instances")







PlantComp %>%
  filter(!is.na(cover) & cover != 0) %>%
  count(cover) %>%
  arrange(cover) %>%
  knitr::kable(
    caption = "Counts of Daubenmire Cover Classes",
    align = "c"
  )
PlantComp %>%
  filter(
    !is.na(cover) & cover != 0,
    !code %in% c("HERBLIT", "SWODE", "BAREG", "SDHERB", "CRUST")
  ) %>%
  distinct(year, transect, quadrat, code, .keep_all = TRUE) %>%
  group_by(code) %>%
  summarise(
    Frequency = n(),
    Mean_Cover = mean(cover, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(Frequency)) %>%
  head(10) %>%
  knitr::kable(
    caption = "Top 10 Plant Species (Frequency and Mean)",
    align = "c"
  )


break_values <- c(1, 3, 16, 38, 63, 86, 98, 100)

PlantComp %>%
  filter(!is.na(cover) & cover != 0) %>%
  filter(cover %in% break_values) %>%  # Keep only the specified cover values
  mutate(cover = as.factor(cover)) %>% # Treat cover as categorical
  group_by(year, cover) %>%
  summarize(count = n(), .groups = "drop") %>%
  ggplot(aes(x = cover, y = count)) +
  geom_bar(stat = "identity", fill = "lightgreen", color = "white") +
  facet_wrap(~year) +
  labs(
    title = "Plant Cover Distribution (Daubenmire)",
    x = "Cover Class (%)",
    y = "# of Instances"
  ) +
  theme_minimal(base_size = 14)



BeltTransect %>%
  separate_longer_delim(HeightHit, ",") %>%
  mutate(hit = as.character(HeightHit)) %>%  
  group_by(year, transect) %>%
  summarise(n = n(), .groups = "drop") %>%
  arrange(year, transect) %>%
  knitr::kable(
    caption = "Counts of Hits",
    align = "c"
  )

missing_BT <- BeltTransect %>%
  summarise(across(everything(), ~sum(is.na(.)), .names = "{col}"))

missing_BT %>%
  kable(caption = "Woody Transect Missing Values")









DF_clean <- Revision_Belttransect %>%
  separate_rows(Range, sep = ",") %>%
  mutate(Range = str_trim(Range))

valid_range_pattern <- "^\\d+-\\d+$"

DF_validated <- DF_clean %>%
  mutate(
    Range_valid_format = str_detect(Range, valid_range_pattern),
    X = as.numeric(str_extract(Range, "^\\d+")),
    Y = as.numeric(str_extract(Range, "(?<=-)\\d+")),
    Range_logical = X < Y,
    Range_valid = Range_valid_format & Range_logical
  )

# Keep only rows with a Range entry that fails the format or the start < end check
DF_invalid_hits <- DF_validated %>%
  filter(!is.na(Range), Range != "", !coalesce(Range_valid, FALSE)) %>%
  select(year, transect, species, HeightHit, Range, X, Y)

# Conditional output
if (nrow(DF_invalid_hits) > 0) {
  kable(DF_invalid_hits, caption = "Invalid Range Hits")
} else {
  cat("No invalid range hits found in the dataset.\n")
}


Revision_Belttransect %>%
  mutate(HeightHit = as.numeric(HeightHit)) %>%
  filter(!is.na(HeightHit), !is.na(year)) %>%
  ggplot(aes(x = HeightHit)) +
  geom_histogram(binwidth = 5, fill = "steelblue", color = "white") +
  facet_wrap(~ year) +
  labs(title = "Distribution of HeightHit by Year", x = "HeightHit", y = "Count") +
  theme_minimal()


# Flag HeightHit values outside the expected 0 to 100 cm window
HeightHit_outliers <- Revision_Belttransect %>%
  mutate(HeightHit = as.numeric(HeightHit)) %>%
  filter(!is.na(HeightHit)) %>%
  filter(HeightHit > 100 | HeightHit < 0) %>%
  select(year, transect, species, HeightHit) %>%
  arrange(desc(HeightHit))

if (nrow(HeightHit_outliers) > 0) {
  kable(HeightHit_outliers, caption = "HeightHit Outliers: Values Above 100 or Below 0")
} else {
  cat("✅ No HeightHit outliers found above 100 or below 0.\n")
}

Range_lengths <- Revision_Belttransect %>%
  mutate(
    Range_parts = str_split(Range, ","),
    Range_clean = map(Range_parts, str_trim),
    Range_hits = map(Range_clean, ~ .x[str_detect(.x, "-")])
  ) %>%
  select(year, transect, species, Range_hits) %>%
  unnest(Range_hits) %>%
  separate(Range_hits, into = c("start", "end"), sep = "-", remove = FALSE) %>%
  mutate(
    start = as.numeric(str_trim(start)),
    end = as.numeric(str_trim(end)),
    length_cm = end - start
  ) %>%
  filter(!is.na(length_cm))  # keep zero and negative lengths so the outlier check below can flag them

# Plot histogram
ggplot(Range_lengths, aes(x = length_cm)) +
  geom_histogram(binwidth = 3, fill = "red", color = "white") +
  labs(
    title = "Canopy Coverage (range length)",
    x = "Length (cm)",
    y = "Frequency"
  ) +
  facet_wrap(~ year)+
  theme_minimal()

Range_lengths %>%
  filter(!is.na(length_cm)) %>%
  filter(length_cm > 20| length_cm < 0) %>%
  select(year, transect, species, length_cm) %>%
  arrange(desc(length_cm)) %>%
  kable(caption = "Coverage (Range Length) Outliers")



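# The two checks below overlap: Species_Map counts rows with any missing value after
# the join, while unmatched keeps only rows whose code has no binomial in SpeciesKey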

Species_Map <- BeltTransect %>%
  left_join(SpeciesKey %>% select(code, binomial), by = "code") %>%
  filter(!complete.cases(.)) %>%
  group_by(transect, code, year) %>%
  summarize(instances = n(), .groups = "drop") %>%
  pivot_wider(names_from = year, values_from = instances)


unmatched <- BeltTransect %>%
  left_join(SpeciesKey, by = "code") %>%
  filter(is.na(binomial))


kable(unmatched, caption = "Unmatched Species Instances")