A. Tridentata QA/QC 2024

Brief Overview

In late season 2023, 25 plots were established at the Forest Service pasture (Fort Keogh). Initial pretreatment sampling that year consisted of plot-level sagebrush (height and count), focal sagebrush, 20 per plot (height, wide and narrow diameters, stem count, gall index), plant composition (transect), and forage clipping (gross mass).

It was later found that 4 of the plots had no focal plant measurements taken or entered. During the 2024 season, these 4 plots were sampled and the data logged. In addition, prefire treatments were applied to 10 of the plots (5 sample clippings per plot), as dictated by the predesignated treatments in the field protocol.

The focal plant data consist of 500 observations of 6 variables, confirming that all 500 intended measurements (20 focal plants x 25 plots) were taken and entered into a datasheet.

The clipping data consist of 50 observations of 3 variables, confirming that all intended samples (5 samples x 10 plots) were taken, weighed, and entered into a datasheet.

Summary and file path

Below are the pretreatment data taken from the TridentataMasterData Excel file, updated for 2024.

File path: Rangeland responses to fire > Sagebrush > A_tridentata > Data > TridentataMasterData

Below is a quick summary of all variables recorded for the samples. The variables to examine are height, diameter, diameter2, and stem count.

##       plot          plantID          height          diameter     
##  Min.   :801.0   Min.   : 17.0   Min.   : 15.00   Min.   : 16.00  
##  1st Qu.:809.0   1st Qu.:144.5   1st Qu.: 60.50   1st Qu.: 73.08  
##  Median :815.0   Median :562.0   Median : 74.00   Median : 98.00  
##  Mean   :814.6   Mean   :426.5   Mean   : 73.45   Mean   :102.78  
##  3rd Qu.:821.0   3rd Qu.:690.5   3rd Qu.: 86.00   3rd Qu.:132.00  
##  Max.   :827.0   Max.   :900.0   Max.   :124.00   Max.   :250.00  
##                  NA's   :5       NA's   :5        NA's   :5       
##     diamter2         stem#         gallindex     
##  Min.   :  9.0   Min.   : 1.00   Min.   :0.0000  
##  1st Qu.: 50.0   1st Qu.: 2.00   1st Qu.:0.0000  
##  Median : 72.0   Median : 3.00   Median :0.0000  
##  Mean   : 75.9   Mean   : 3.36   Mean   :0.0404  
##  3rd Qu.: 95.5   3rd Qu.: 4.00   3rd Qu.:0.0000  
##  Max.   :229.0   Max.   :13.00   Max.   :1.0000  
##  NA's   :5       NA's   :5       NA's   :5

Below is a quick summary of the raw data for the clippings sampled. ‘GrossMass’ (g) is the main variable examined for this portion of the study design (see file path: Rangeland responses to fire > Sagebrush > A_tridentata > Sampling > Fine fuel sampling protocol.docx)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    9.66   14.36   17.09   18.07   19.27   52.59

Data Organization (cleaning):

Rename and filter variables:

Data from the TridentataMasterData sheet have been reorganized for QA/QC purposes.

Misspelling corrections, renaming, and filtering out of unnecessary variables are applied here for this purpose.

Noted changes:

-Variables ‘diameter’ and ‘diameter2’ (spelled ‘diamter2’ in the raw sheet) renamed to ‘diameterW’ (for wide) and ‘diameterN’ (for narrow).

-Variable “stem#” changed to “stem_num”, because ‘#’ starts a comment in R and makes the name awkward to use in scripts.

-Added a new variable “Diameter_avg” via a ‘rowMeans’ mutation so the two diameters could be examined together as an average and any anomalies spotted more easily.

-NAs in all numeric variables were either removed or replaced with 0.

-Created an ‘identify_outliers’ function to flag values that fall outside quantile-based boundaries.
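The renaming and averaging steps listed above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the script used for this report: the data frame `raw` and its three rows are made up for the example (values borrowed from plants 560, 611 and 172), and plain base R assignment stands in for the mutate step used in the original work.

```r
# Hypothetical three-row stand-in for the master sheet (raw column names kept).
raw <- data.frame(
  plantID  = c(560, 611, 172),
  height   = c(15, 110, 92),
  diameter = c(16, 250, 161),
  diamter2 = c(9, 70, 170)
)
raw[["stem#"]] <- c(1, 8, 7)  # '#' is not a syntactic name, so it needs [[ ]]

clean <- raw

# Rename the raw columns to the names used in the QA/QC checks.
names(clean)[names(clean) == "diameter"] <- "diameterW"
names(clean)[names(clean) == "diamter2"] <- "diameterN"
names(clean)[names(clean) == "stem#"]    <- "stem_num"

# Row-wise mean of the wide and narrow diameters.
clean$Diameter_avg <- rowMeans(clean[, c("diameterW", "diameterN")], na.rm = TRUE)

print(clean)
```

For plant 560 this gives (16 + 9) / 2 = 12.5, which matches the Diameter_avg reported for that plant.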

Outlier Checks

Examination of the focal plant height measurements (cm; mean height 73.45) showed one outlier that needed to be investigated: Focal Plant ID 560, 15 cm.

Actions taken: examined the spreadsheet to ensure the data were properly placed and entered. Double-checked the protocols to ensure the proper method of measurement. Checked hard copies and scans of the original sampling.

The investigation found no error. No corrections need to be made.

## # A tibble: 1 × 6
##      ID Height_cm DiameterW_cm DiameterN_cm Stem_Count Diameter_avg
##   <dbl>     <dbl>        <dbl>        <dbl>      <dbl>        <dbl>
## 1   560        15           16            9          1         12.5
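The report does not spell out the boundaries used by ‘identify_outliers’, so the sketch below is one plausible version rather than the author's function. It assumes the common 1.5 x IQR rule; the argument `k` and the height values are hypothetical and chosen only for illustration.

```r
# Flag values outside [Q1 - k * IQR, Q3 + k * IQR].
# Assumption: the 1.5 * IQR rule; the report does not state the rule used.
identify_outliers <- function(x, k = 1.5) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  flagged <- x < q[1] - k * iqr | x > q[2] + k * iqr
  flagged[is.na(flagged)] <- FALSE  # a missing value is never flagged
  flagged
}

# Hypothetical heights (cm), including one missing value.
heights <- c(15, 55, 61, 68, NA, 74, 79, 86, 92, 120)
flags <- identify_outliers(heights)
print(heights[flags])
```

Returning a logical vector keeps the function easy to combine with row filtering, so the flagged rows can be printed alongside their IDs for cross-checking against hard copies.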

Examining the measurements of plant diameter taken at the widest point, as dictated in the protocol, shows 2 evident outliers.

ID 611: 250 cm

ID 711: 242 cm

Actions taken: examined spreadsheet to ensure data was properly placed/entered. Double checked protocols to ensure proper method of measurement. Checked hard copies and scans of original sampling.

Resulting investigation reflected no error. No corrections needed to be made.

## # A tibble: 2 × 6
##      ID Height_cm DiameterW_cm DiameterN_cm Stem_Count Diameter_avg
##   <dbl>     <dbl>        <dbl>        <dbl>      <dbl>        <dbl>
## 1   611       110          250           70          8          160
## 2   711       108          242           76          5          159

However, when the same check was applied to the measurements taken at the narrowest point, perpendicular to the wide measurement, there were unexpected discrepancies.

Notice below that some of the DiameterW values are smaller than the corresponding DiameterN values. Examples: Plant IDs 172, 236, 253. There are others as well. The most likely explanation is that the two entries were recorded in reverse order, so an R correction was applied to swap these values into their proper places.

Actions taken: used an R script to flag rows where the condition was violated (‘Condition_Violated’), then used a mutate step to swap the values in those rows.

## # A tibble: 8 × 6
##      ID Height_cm DiameterW_cm DiameterN_cm Stem_Count Diameter_avg
##   <dbl>     <dbl>        <dbl>        <dbl>      <dbl>        <dbl>
## 1   162        94          172          170          6         171 
## 2   164        91          170          164          6         167 
## 3   172        92          161          170          7         166.
## 4    61        93          193          172          8         182.
## 5   236        85          125          168          6         146.
## 6   253        85           85          229          2         157 
## 7   587        63          134          171          5         152.
## 8   773       122          209          168          2         188.

The rows flagged for correction are shown below.

## [1] "Rows where DiameterW_cm is NOT greater than DiameterN_cm:"
## # A tibble: 43 × 7
##       ID Height_cm DiameterW_cm DiameterN_cm Stem_Count Diameter_avg
##    <dbl>     <dbl>        <dbl>        <dbl>      <dbl>        <dbl>
##  1   141        63           28           32          1         30  
##  2   155        55           85           87          3         86  
##  3   172        92          161          170          7        166. 
##  4   631        40           44           45          3         44.5
##  5   624        84           28           37          4         32.5
##  6   621        63           59           62          3         60.5
##  7   619        39           30           37          3         33.5
##  8   602        41           23           39          1         31  
##  9   600        87          110          125          4        118. 
## 10    70        41           41           41          1         41  
## # ℹ 33 more rows
## # ℹ 1 more variable: Condition_Violated <lgl>

Below are the new resulting outliers after diameter value revision.

These were cross checked with original spreadsheet and hard copies.

No further action required.

## # A tibble: 4 × 6
##      ID Height_cm DiameterW_cm DiameterN_cm Stem_Count Diameter_avg
##   <dbl>     <dbl>        <dbl>        <dbl>      <dbl>        <dbl>
## 1   611       110          250           70          8         160 
## 2   253        85          229          229          2         157 
## 3   711       108          242           76          5         159 
## 4   763       124          220          153          6         186.

## # A tibble: 8 × 6
##      ID Height_cm DiameterW_cm DiameterN_cm Stem_Count Diameter_avg
##   <dbl>     <dbl>        <dbl>        <dbl>      <dbl>        <dbl>
## 1   162        94          172          170          6         171 
## 2   164        91          170          164          6         167 
## 3   172        92          170          170          7         166.
## 4    61        93          193          172          8         182.
## 5   236        85          168          168          6         146.
## 6   253        85          229          229          2         157 
## 7   587        63          171          171          5         152.
## 8   773       122          209          168          2         188.
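In the table above, IDs 172, 236, 253 and 587 list the same value in both diameter columns while Diameter_avg is unchanged. That is the pattern one would expect if the two columns were overwritten one after the other instead of being swapped simultaneously, so it may be worth rechecking the narrow column. A minimal sketch of a swap that avoids this pitfall is given here; it is not the script used in this report, and it uses base R with a four-row excerpt of the values shown earlier.

```r
# Four-row excerpt of the reported wide/narrow diameters (cm).
plants <- data.frame(
  ID           = c(162, 172, 236, 253),
  DiameterW_cm = c(172, 161, 125, 85),
  DiameterN_cm = c(170, 170, 168, 229)
)

# Same condition as in the report: wide is NOT greater than narrow.
plants$Condition_Violated <- !(plants$DiameterW_cm > plants$DiameterN_cm)

# Compute both new columns from the ORIGINAL values before assigning either,
# so the second assignment never sees an already-overwritten column.
wide   <- pmax(plants$DiameterW_cm, plants$DiameterN_cm)
narrow <- pmin(plants$DiameterW_cm, plants$DiameterN_cm)
plants$DiameterW_cm <- wide
plants$DiameterN_cm <- narrow

# The average is unaffected by the swap, which makes it a handy sanity check.
plants$Diameter_avg <- rowMeans(plants[, c("DiameterW_cm", "DiameterN_cm")])

print(plants)
```

With this approach plant 253 ends up with a wide diameter of 229 and a narrow diameter of 85, and its average stays at 157.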

Overall diameter average

Below is a quick analysis of the “PlotLevelSagebrush” sheet. I was initially skeptical of the average number of plants per plot, so the averages shown below were double-checked against the spreadsheet. The figure checks out: a total of 6460 observations were recorded across the 25 plots (6460 / 25 = 258.4). There were also some outliers to be investigated.

Action taken: cross-examined the spreadsheet. Hard copies are missing, unavailable, or not scanned. It is reasonable to assume, however, that the measurements are correct, as I have witnessed sagebrush at this height in the field.

The investigation found no error. No corrections need to be made.

## [1] "Average number of plants per plot: 258.4"

## [1] "Average plant height: 64.5969510813416"

## # A tibble: 6 × 4
##   date                 plot species       height
##   <dttm>              <dbl> <chr>          <dbl>
## 1 2023-06-30 00:00:00     3 A. tridentata    127
## 2 2023-06-30 00:00:00    10 A. tridentata    125
## 3 2023-06-30 00:00:00    23 A. tridentata    129
## 4 2023-06-30 00:00:00    27 A. tridentata    151
## 5 2023-06-30 00:00:00    27 A. tridentata    125
## 6 2023-06-30 00:00:00    27 A. tridentata    127
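The plot-level averages can be reproduced along these lines. This is only a sketch: the data frame `plot_level` is a made-up six-row stand-in for the “PlotLevelSagebrush” sheet (one row per plant), and only the final arithmetic check uses figures from this report.

```r
# Hypothetical stand-in: one row per measured plant.
plot_level <- data.frame(
  plot   = c(3, 3, 3, 10, 10, 23),
  height = c(127, 60, 55, 125, 70, 129)
)

# Number of plants recorded in each plot, then the mean of those counts.
plants_per_plot <- as.vector(table(plot_level$plot))
avg_plants <- mean(plants_per_plot)

# Mean height across all plants, ignoring missing values.
avg_height <- mean(plot_level$height, na.rm = TRUE)

# Arithmetic check with the reported totals: 6460 observations over 25 plots.
reported_avg <- 6460 / 25

print(c(avg_plants = avg_plants, avg_height = avg_height, reported_avg = reported_avg))
```

Averaging the per-plot counts is equivalent to dividing the total row count by the number of plots, which is why 6460 / 25 reproduces the reported 258.4.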

Looking at the stem count, the histogram (see ‘Stem Count Distribution’ below) shows a positive (right) skew: most plants have few stems, with a long tail of higher counts. There are 19 noted outliers (listed below). Because counts are typically low, the occasional higher count stands out, but such counts are not uncommon. For QA/QC purposes, however, these values were each cross-checked against the entered data and the scanned hard copies to make sure no values were entered in error.

Upon examination of clipping mass (grams), there were two noted outliers that needed to be investigated:

Plot 806 Corner A - 42.14g

Plot 813 Corner A - 52.59g

Actions taken: examined the spreadsheet to ensure the data were properly placed. Double-checked the protocol to ensure the clipping method had been followed to standard. Checked hard copies and scans of the original sampling.

Resulting investigation reflects no error. No corrections need to be made.

## # A tibble: 2 × 3
##    plot corner mass_grams
##   <dbl> <chr>       <dbl>
## 1   806 A            42.1
## 2   813 A            52.6
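As a worked check, the quartiles reported in the clipping summary (14.36 and 19.27) can be turned into outlier fences. The 1.5 x IQR rule used here is an assumption, since the report does not state the rule, and `iqr_fences` is a hypothetical helper written for this illustration.

```r
# Lower and upper fences from two quartiles.
# Assumption: the 1.5 * IQR rule; the report does not state the rule used.
iqr_fences <- function(q1, q3, k = 1.5) {
  iqr <- q3 - q1
  c(lower = q1 - k * iqr, upper = q3 + k * iqr)
}

# Quartiles taken from the clipping summary (grams).
clip_fences <- iqr_fences(14.36, 19.27)

# The two flagged clippings and the smallest recorded mass.
flagged_mass <- c(42.14, 52.59)
above_upper <- flagged_mass > unname(clip_fences["upper"])
min_inside  <- 9.66 > unname(clip_fences["lower"])

print(clip_fences)
print(above_upper)
```

Under that assumption the upper fence is 19.27 + 1.5 x 4.91 = 26.635 g, so both 42.14 g and 52.59 g fall above it, while the minimum of 9.66 g stays inside the lower fence.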

Summary statistics on clean data:

The summary statistics show essentially no change after cleaning. The average diameter is computed from the newly created ‘Diameter_avg’ variable.

## # A tibble: 1 × 3
##   Avg_Height Avg_Diameter Avg_Stem_Count
##        <dbl>        <dbl>          <dbl>
## 1       73.5         89.3           3.33
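Because NAs were either removed or replaced with 0 during cleaning, it may help to see how each choice affects an average. The sketch below uses a made-up stem count vector; it illustrates the general effect and does not reproduce the cleaning script.

```r
# Hypothetical stem counts with one missing value.
stem_num <- c(1, 2, 3, 4, NA)

# Option 1: drop the missing value before averaging.
mean_removed <- mean(stem_num, na.rm = TRUE)

# Option 2: replace the missing value with 0, then average over all rows.
stem_zero <- ifelse(is.na(stem_num), 0, stem_num)
mean_zero <- mean(stem_zero)

print(c(removed = mean_removed, zero_filled = mean_zero))
```

Replacing NAs with 0 pulls the mean down, because each filled row adds to the count without adding to the total, so it is worth recording which option was applied to each variable.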

Final Assessment:

Based on the visual and written summaries above, the data have been cleaned and organized to find and correct potential errors. Outliers have been checked and cleared. Diameter data now reflect the proper placement of values.

Sampling mitigation recommended:

-Ensure measurements are properly recorded by using more descriptive column headings, to avoid confusion and unnecessary errors.

-Run the R outlier-filtering script within a week of sampling/data entry, so that any questionable value (e.g., plant height) can be remeasured in the field.

Timothy Littmann

2025-05-08
