PADEP conducted a physical habitat assessment training/calibration during the 2022 Biologists Meeting in Williamsport on October 20, 2022. The results were compiled and examined to identify parameters with weak precision that may deserve attention during future training/calibration events. Another objective was to begin investigating method precision estimates.
Site results are organized in tabs. Click the tabs across the top to examine results from each site and an overall summary.
If you would like to recreate this analysis, or complete additional analysis, the dataset and R code are located on the last tab.
Before interpreting results, keep in mind that Bottle Run is impaired for:
1) FLOW REGIME MODIFICATION – RURAL (RESIDENTIAL AREAS),
2) ORGANIC ENRICHMENT – RURAL (RESIDENTIAL AREAS), and
3) PATHOGENS – SOURCE UNKNOWN.
| SEC_STATION_ID | parameter | mean | sd | cv (points) | n | min | max | rank |
|---|---|---|---|---|---|---|---|---|
| Bottle | total_score | 88.2 | 9.6 | 26.4 | 24 | 69 | 106 | 12 |
| Bottle | instr_coup_GP | 17.1 | 4.1 | 9.6 | 24 | 8 | 26 | 11 |
| Bottle | SED_DEP | 6.0 | 2.7 | 9.0 | 24 | 2 | 15 | 10 |
| Bottle | bank_coup_GP | 26.4 | 4.4 | 6.8 | 24 | 17 | 33 | 9 |
| Bottle | RIP_VEG_ZONE | 7.5 | 2.5 | 6.6 | 24 | 4 | 16 | 8 |
| Bottle | POOL_VAR | 6.0 | 1.8 | 6.0 | 24 | 3 | 10 | 7 |
| Bottle | CHAN_ALT | 8.8 | 2.3 | 5.2 | 24 | 5 | 14 | 6 |
| Bottle | SUB_AVAIL_COVER | 8.0 | 1.9 | 4.8 | 24 | 5 | 11 | 5 |
| Bottle | POOL_SUB_CHAR | 11.1 | 2.5 | 4.6 | 24 | 6 | 16 | 4 |
| Bottle | CHAN_FLOW | 14.5 | 3.0 | 4.2 | 24 | 5 | 19 | 3 |
| Bottle | BANK_VEG | 13.2 | 2.6 | 4.0 | 24 | 9 | 17 | 2 |
| Bottle | COND_BANKS | 13.2 | 2.5 | 3.8 | 24 | 8 | 17 | 1 |
Before interpreting results, keep in mind that Hagermans Run Upstream is attaining Aquatic Life, Recreational, and Potable Water Supply uses.
| SEC_STATION_ID | parameter | mean | sd | cv (points) | n | min | max | rank |
|---|---|---|---|---|---|---|---|---|
| Hagermans UP | total_score | 182.9 | 10.1 | 14.4 | 25 | 165 | 204 | 15 |
| Hagermans UP | bank_coup_RR | 26.6 | 4.1 | 6.0 | 25 | 19 | 34 | 14 |
| Hagermans UP | COND_BANKS | 12.2 | 2.5 | 4.0 | 25 | 7 | 16 | 13 |
| Hagermans UP | instr_coup_RR | 26.6 | 2.6 | 4.0 | 25 | 19 | 31 | 12 |
| Hagermans UP | BANK_VEG | 14.4 | 2.2 | 3.0 | 25 | 11 | 19 | 11 |
| Hagermans UP | SED_DEP | 12.7 | 1.9 | 3.0 | 25 | 7 | 15 | 10 |
| Hagermans UP | VELO_DEPTH | 14.3 | 2.2 | 3.0 | 25 | 10 | 19 | 9 |
| Hagermans UP | CHAN_FLOW | 15.9 | 2.0 | 2.6 | 25 | 13 | 19 | 8 |
| Hagermans UP | CHAN_ALT | 14.8 | 1.8 | 2.4 | 25 | 11 | 18 | 7 |
| Hagermans UP | EMBEDDED | 14.0 | 1.4 | 2.0 | 25 | 11 | 16 | 6 |
| Hagermans UP | EPI_SUB | 17.1 | 1.3 | 1.6 | 25 | 15 | 20 | 5 |
| Hagermans UP | GRAZING | 17.2 | 1.4 | 1.6 | 25 | 15 | 20 | 4 |
| Hagermans UP | RIP_VEG_ZONE | 15.7 | 1.2 | 1.6 | 25 | 13 | 18 | 3 |
| Hagermans UP | INST_COVER | 16.6 | 1.1 | 1.4 | 25 | 15 | 19 | 2 |
| Hagermans UP | FREQ_RIFF | 18.2 | 1.1 | 1.2 | 25 | 16 | 20 | 1 |
Before interpreting results, keep in mind that Hagermans Run Downstream is impaired for:
1) HABITAT ALTERATIONS – CHANNELIZATION,
2) CAUSE UNKNOWN – URBAN RUNOFF/STORM SEWERS, and
3) PATHOGENS – SOURCE UNKNOWN.
| SEC_STATION_ID | parameter | mean | sd | cv (points) | n | min | max | rank |
|---|---|---|---|---|---|---|---|---|
| Hagermans DOWN | total_score | 124.1 | 11.8 | 24.0 | 27 | 94 | 146 | 15 |
| Hagermans DOWN | bank_coup_RR | 14.2 | 4.4 | 12.4 | 27 | 9 | 22 | 14 |
| Hagermans DOWN | COND_BANKS | 6.3 | 2.5 | 8.0 | 27 | 2 | 10 | 13 |
| Hagermans DOWN | GRAZING | 6.4 | 2.2 | 6.8 | 27 | 3 | 12 | 12 |
| Hagermans DOWN | RIP_VEG_ZONE | 4.0 | 1.2 | 6.0 | 27 | 2 | 7 | 11 |
| Hagermans DOWN | CHAN_ALT | 8.7 | 2.4 | 5.6 | 27 | 4 | 12 | 10 |
| Hagermans DOWN | instr_coup_RR | 20.6 | 2.8 | 5.6 | 27 | 15 | 26 | 9 |
| Hagermans DOWN | BANK_VEG | 7.9 | 2.1 | 5.4 | 27 | 5 | 12 | 8 |
| Hagermans DOWN | SED_DEP | 8.7 | 2.2 | 5.0 | 27 | 5 | 12 | 7 |
| Hagermans DOWN | FREQ_RIFF | 15.7 | 3.0 | 3.8 | 27 | 5 | 18 | 6 |
| Hagermans DOWN | CHAN_FLOW | 13.1 | 2.1 | 3.2 | 27 | 9 | 18 | 5 |
| Hagermans DOWN | EMBEDDED | 11.9 | 1.5 | 2.6 | 27 | 10 | 15 | 4 |
| Hagermans DOWN | VELO_DEPTH | 12.4 | 1.6 | 2.6 | 27 | 10 | 16 | 3 |
| Hagermans DOWN | INST_COVER | 13.8 | 1.6 | 2.4 | 27 | 11 | 17 | 2 |
| Hagermans DOWN | EPI_SUB | 15.0 | 1.3 | 1.8 | 27 | 12 | 17 | 1 |
Unlike other indices, the physical habitat assessment does not have an associated precision estimate (PE) that would allow spatial or temporal comparisons to determine whether differences in scores are significant. This training represents the beginnings of a dataset to examine intrasite precision (\(PE_I\)). \(PE_I\) is defined as the precision among multiple samples taken at the same site at the same time.
Most staff used the low gradient form to assess Bottle Run. Very few staff have current audits for the low gradient protocol, so this site was removed from the PE analysis. The riffle/run sites (Hagermans DOWN and UP) were included. CVs were calculated for each parameter at each site for three groups: all staff, staff that have current habitat assessment method audits, and staff that have not been audited.
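As a rough illustration of the CV calculation, the sketch below reproduces the cv column of the Hagermans UP table for three rows, starting from their reported means and standard deviations. It follows the approach in the R code on the last tab: the ratio sd/mean is rounded to two decimals and then multiplied by a maximum score (20 for a single parameter, 40 for a couplet, 240 for the total), so the result is expressed in points rather than percent. The helper name `cv_points` is hypothetical and is not part of the original analysis.

```r
# Hypothetical helper: CV expressed in points of the maximum possible score.
cv_points <- function(mean, sd, max_score) {
  round(sd / mean, 2) * max_score
}

# Means and SDs as reported in the Hagermans UP table
hag_up <- data.frame(
  parameter = c("total_score", "bank_coup_RR", "FREQ_RIFF"),
  mean      = c(182.9, 26.6, 18.2),
  sd        = c(10.1, 4.1, 1.1),
  max_score = c(240, 40, 20)
)

# Agrees with the table's cv values for these rows (14.4, 6.0, 1.2)
hag_up$cv <- cv_points(hag_up$mean, hag_up$sd, hag_up$max_score)
print(hag_up)
```

Because the ratio is rounded before scaling, values in the site tables can differ slightly from the unrounded CVs used in the PE section of the code.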
A number of meaningful observations were made using this dataset:
Variability (CVs) in scores depended on site condition. The Hagermans UP site, with more intact habitat, had lower variability than Hagermans DOWN, which had more impacted habitat, for the total score, both couplets, and every parameter except VELO_DEPTH (3.0 vs. 2.6 CV points in the site tables).
For most parameters, the CV for audited staff (n = 11) was less than the CV for staff that have not been audited (n = 17) and all staff (n = 28).
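The group comparison can be sketched in a few lines of base R. The scores below are hypothetical and are not the workshop data; the function name `cv_pts` and the 20-point maximum are assumptions chosen to mirror the unrounded CV-in-points calculation in the PE section of the code.

```r
# HYPOTHETICAL scores for one 20-point parameter at one site
scores <- data.frame(
  audited = c("Y", "Y", "Y", "Y", "N", "N", "N", "N", "N"),
  score   = c(14, 15, 13, 14, 10, 16, 12, 18, 9)
)

# CV in points: sd / mean scaled by the maximum possible score
cv_pts <- function(x, max_score = 20) sd(x) / mean(x) * max_score

# Same three groups as the analysis: all, non-audited, audited
cv_by_group <- c(
  all         = cv_pts(scores$score),
  non_audited = cv_pts(scores$score[scores$audited == "N"]),
  audited     = cv_pts(scores$score[scores$audited == "Y"])
)
print(round(cv_by_group, 1))
```

In this made-up example the audited group scores cluster tightly, so its CV is the smallest of the three, which is the pattern reported for most parameters.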
The format of the workshop made it possible to observe variability in physical habitat assessment scores directly. It also started a dataset that can be built upon to generate PEs for the physical habitat assessment method. Recommendations for continuing to improve the statistical validity of the physical habitat assessment method include:
Recommendation 1: Increase the number of staff that are trained and audited in the physical habitat assessment method.
Recommendation 2: Increase the size of this dataset to better gauge variability and quantify PEs. More frequent physical habitat data collections at the same site and same time are needed to generate additional data for calculating intrasite precision (\(PE_I\)). When audited staff are in the field with other staff (audited or non-audited), independent habitat assessments should be conducted and the data provided to increase n.
Recommendation 3: Explore temporal precision (\(PE_T\)), defined as precision at the same site through time. A number of confounding factors such as seasonality (e.g. vegetation cover, flow regime) and antecedent scouring flows can drastically affect habitat scores at the same site. Other indices have index periods when data can be collected and considered comparable; the physical habitat assessment has no such index period. We need to understand this variability and its contribution to physical habitat scores.
Recommendation 4: Investigate differential variability between least-disturbed and impacted sites. The data from Hagermans UP and DOWN showed that variability in scoring was generally higher at the impacted (DOWN) site. Do we need to bin sites based on their condition prior to examining CVs?
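One way the binning question in Recommendation 4 might be explored is sketched below. It uses only the total_score means and CVs reported in the two Hagermans tables. The cutoff of 150 and the bin labels are purely illustrative assumptions, not PADEP thresholds, and with two sites the result is trivial; the point is only to show the shape of the calculation once more sites are available.

```r
# Site-level total_score summaries taken from the Hagermans tables
site_summ <- data.frame(
  site       = c("Hagermans UP", "Hagermans DOWN"),
  mean_total = c(182.9, 124.1),
  cv_pts     = c(14.4, 24.0)
)

# ASSUMPTION: a single illustrative cutoff, not an agency threshold
cutoff <- 150
site_summ$condition_bin <- cut(
  site_summ$mean_total,
  breaks = c(-Inf, cutoff, Inf),
  labels = c("lower-scoring", "higher-scoring")
)

# Mean CV (points) within each condition bin
cv_by_bin <- aggregate(cv_pts ~ condition_bin, data = site_summ, FUN = mean)
print(cv_by_bin)
```

Binning on the mean score is only one option; sites could also be grouped by assessed use attainment, which is already known for each workshop site.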
Click here to download a spreadsheet of the data included herein
Below is the code used to load necessary packages, import and clean data, create graphics, and generate statistics.
getwd()
options(scipen = 999)
# rm(list = ls())
# load packages -----------------------------------------------------------
library(readxl)
library(RColorBrewer)
library(plotly)
library(tidyverse)
library(tidylog)
# import data -------------------------------------------------------------
# vector of staff that have been audited for riff/run
audit_vec <-
read_xlsx('20221020_RBP_Workshop_Data.xlsx', sheet = 'audits') %>%
drop_na(username, FreestoneHab) %>%
pull(username)
audit_vec
# RBP data from 2022 Bio Meeting
hab_df_wide <-
read_xlsx('20221020_RBP_Workshop_Data.xlsx', sheet = 'Sheet1') %>%
mutate(total_score = rowSums(across(SUB_AVAIL_COVER:RIP_VEG_ZONE), na.rm = T),
instr_coup_RR = ifelse(BLN_GLIDE == 'N', EMBEDDED + SED_DEP, NA),
bank_coup_RR = ifelse(BLN_GLIDE == 'N', COND_BANKS + BANK_VEG, NA),
instr_coup_GP = ifelse(BLN_GLIDE == 'Y', POOL_SUB_CHAR + SED_DEP, NA),
bank_coup_GP = ifelse(BLN_GLIDE == 'Y', COND_BANKS + BANK_VEG, NA),
username = sub("^([^-]+-){2}", "", STR_STATION_ID), # extract username from STR_STATION_ID: keep the text after the second '-'
riff_run_audit = case_when(
username %in% audit_vec & BLN_GLIDE == 'N' ~ 'Y',
BLN_GLIDE == 'N' ~ 'N',
BLN_GLIDE == 'Y' ~ NA_character_,
TRUE ~ 'panic')) %>%
relocate(c(username, riff_run_audit), .after = STR_STATION_ID)
hab_df_wide
sort(unique(hab_df_wide$SEC_STATION_ID))
sort(unique(hab_df_wide$REGION))
hab_df_wide %>%
filter(SEC_STATION_ID != 'Bottle') %>%
distinct(username, riff_run_audit) %>%
group_by(riff_run_audit) %>%
summarize(n = n())
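# change to long format for plotting
# NOTE: cols = 8:27 selects the columns to pivot by position, so it depends on the column order of hab_df_wide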
hab_df_long <-
hab_df_wide %>%
pivot_longer(cols = 8:27, names_to = 'parameter', values_to = 'score', values_drop_na = T)
hab_df_long
sort(unique(hab_df_long$parameter))
# visualize ---------------------------------------------------------------
# Bottle - raw parameters
bottle_raw_plot <-
hab_df_long %>%
filter(SEC_STATION_ID == 'Bottle') %>% #distinct(parameter)
filter(!(parameter %in% c('total_score', 'instr_coup_GP', 'bank_coup_GP'))) %>%
# ggplot(aes(x = parameter, y = score)) +
ggplot(aes(x = reorder(parameter, score, FUN = sd), y = score)) +
# geom_hline(yintercept = c(1, 5, 10, 15, 20), lty = 'dashed') +
geom_rect(xmin = 0, xmax = Inf, ymin = 1, ymax = 5.5, fill = '#969696') +
annotate('text', x =0.5, y = 3.5, label = 'Poor', angle = 90) +
geom_rect(xmin = 0, xmax = Inf, ymin = 5.5, ymax = 10.5, fill = '#bdbdbd') +
annotate('text', x =0.5, y = 8, label = 'Marginal', angle = 90) +
geom_rect(xmin = 0, xmax = Inf, ymin = 10.5, ymax = 15.5, fill = '#d9d9d9') +
annotate('text', x =0.5, y = 13, label = 'Suboptimal', angle = 90) +
geom_rect(xmin = 0, xmax = Inf, ymin = 15.5, ymax = 20, fill = '#f0f0f0') +
annotate('text', x =0.5, y = 18, label = 'Optimal', angle = 90) +
geom_boxplot(outlier.shape = NA) +
geom_point(aes(fill = REGION, label = username), pch = 21, alpha = 0.8, size = 3,
position = position_jitterdodge(jitter.width =.6, dodge.width = .2)) +
scale_y_continuous(limits = c(1,20)) +
scale_fill_brewer(palette = 'Set1') +
labs(x = 'Parameter - Ordered by Ascending Standard Deviation') +
theme_minimal()
bottle_raw_plot
# Bottle - couplets and total score
bottle_tot_plot <-
hab_df_long %>%
filter(SEC_STATION_ID == 'Bottle') %>% #distinct(parameter)
filter(parameter %in% c('total_score', 'instr_coup_GP', 'bank_coup_GP')) %>%
ggplot(aes(x = reorder(parameter, score, FUN = sd), y = score)) +
geom_boxplot(outlier.shape = NA) +
geom_point(aes(fill = REGION, label = username), pch = 21, alpha = 0.8, size = 3,
position = position_jitterdodge(jitter.width =.6, dodge.width = .2)) +
geom_hline(yintercept = c(20, 105), lty = 'dashed') +
scale_fill_brewer(palette = 'Set1') +
labs(x = 'Parameter - Ordered by Ascending Standard Deviation') +
theme_minimal()
bottle_tot_plot
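# stats: per-parameter summary for Bottle
# cv is reported in points: round(sd / mean, 2) times the maximum score (20 per parameter, 40 per couplet, 240 for total)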
bottle_summ <-
hab_df_long %>%
filter(SEC_STATION_ID == 'Bottle') %>%
group_by(SEC_STATION_ID, parameter) %>%
summarize(mean = round(mean(score, na.rm = T), 1),
sd = round(sd(score, na.rm = T), 1),
n = n(),
min = min(score, na.rm = T),
max = max(score, na.rm = T)) %>%
mutate(cv = round(sd/mean, 2),
total = case_when(
str_detect(parameter, 'coup') ~ 40,
str_detect(parameter, 'total') ~ 240,
TRUE ~ 20),
cv_pts = cv * total) %>%
select(-c(total, cv)) %>%
rename(cv = cv_pts) %>%
arrange(-cv) %>%
relocate(cv, .after = 'sd') %>%
mutate(rank = nrow(.):1)
bottle_summ
# Hagermans UP - raw parameters
hag_up_raw_plot <-
hab_df_long %>%
filter(SEC_STATION_ID == 'Hagermans UP') %>% #distinct(parameter)
filter(!(parameter %in% c('total_score', 'instr_coup_RR', 'bank_coup_RR'))) %>%
# ggplot(aes(x = parameter, y = score)) +
ggplot(aes(x = reorder(parameter, score, FUN = sd), y = score)) +
geom_rect(xmin = 0, xmax = Inf, ymin = 1, ymax = 5.5, fill = '#969696') +
annotate('text', x =0.5, y = 3.5, label = 'Poor', angle = 90) +
geom_rect(xmin = 0, xmax = Inf, ymin = 5.5, ymax = 10.5, fill = '#bdbdbd') +
annotate('text', x =0.5, y = 8, label = 'Marginal', angle = 90) +
geom_rect(xmin = 0, xmax = Inf, ymin = 10.5, ymax = 15.5, fill = '#d9d9d9') +
annotate('text', x =0.5, y = 13, label = 'Suboptimal', angle = 90) +
geom_rect(xmin = 0, xmax = Inf, ymin = 15.5, ymax = 20, fill = '#f0f0f0') +
annotate('text', x =0.5, y = 18, label = 'Optimal', angle = 90) +
geom_boxplot(outlier.shape = NA) +
geom_point(aes(fill = REGION, label = username), pch = 21, alpha = 0.8, size = 3,
position = position_jitterdodge(jitter.width =.6, dodge.width = .2)) +
scale_y_continuous(limits = c(1,20)) +
scale_fill_brewer(palette = 'Set1') +
labs(x = 'Parameter - Ordered by Ascending Standard Deviation') +
theme_minimal()
hag_up_raw_plot
# Hagermans UP - couplets and total score
hag_up_tot_plot <-
hab_df_long %>%
filter(SEC_STATION_ID == 'Hagermans UP') %>% #distinct(parameter)
filter(parameter %in% c('total_score', 'instr_coup_RR', 'bank_coup_RR')) %>%
ggplot(aes(x = reorder(parameter, score, FUN = sd), y = score)) +
geom_boxplot(outlier.shape = NA) +
geom_point(aes(fill = REGION, label = username), pch = 21, alpha = 0.8, size = 3,
position = position_jitterdodge(jitter.width =.6, dodge.width = .2)) +
geom_hline(yintercept = c(24, 140), lty = 'dashed') +
scale_fill_brewer(palette = 'Set1') +
labs(x = 'Parameter - Ordered by Ascending Standard Deviation') +
theme_minimal()
hag_up_tot_plot
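# stats: per-parameter summary for Hagermans UP
# cv is reported in points: round(sd / mean, 2) times the maximum score (20 per parameter, 40 per couplet, 240 for total)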
hag_up_summ <-
hab_df_long %>%
filter(SEC_STATION_ID == 'Hagermans UP') %>%
group_by(SEC_STATION_ID, parameter) %>%
summarize(mean = round(mean(score, na.rm = T), 1),
sd = round(sd(score, na.rm = T), 1),
n = n(),
min = min(score, na.rm = T),
max = max(score, na.rm = T)) %>%
mutate(cv = round(sd/mean, 2),
total = case_when(
str_detect(parameter, 'coup') ~ 40,
str_detect(parameter, 'total') ~ 240,
TRUE ~ 20),
cv_pts = cv * total) %>%
select(-c(total, cv)) %>%
rename(cv = cv_pts) %>%
arrange(-cv) %>%
relocate(cv, .after = 'sd') %>%
mutate(rank = nrow(.):1)
hag_up_summ
# Hagermans DOWN - raw parameters
hag_dn_raw_plot <-
hab_df_long %>%
filter(SEC_STATION_ID == 'Hagermans DOWN') %>% #distinct(parameter)
filter(!(parameter %in% c('total_score', 'instr_coup_RR', 'bank_coup_RR'))) %>%
# ggplot(aes(x = parameter, y = score)) +
ggplot(aes(x = reorder(parameter, score, FUN = sd), y = score)) +
geom_rect(xmin = 0, xmax = Inf, ymin = 1, ymax = 5.5, fill = '#969696') +
annotate('text', x =0.5, y = 3.5, label = 'Poor', angle = 90) +
geom_rect(xmin = 0, xmax = Inf, ymin = 5.5, ymax = 10.5, fill = '#bdbdbd') +
annotate('text', x =0.5, y = 8, label = 'Marginal', angle = 90) +
geom_rect(xmin = 0, xmax = Inf, ymin = 10.5, ymax = 15.5, fill = '#d9d9d9') +
annotate('text', x =0.5, y = 13, label = 'Suboptimal', angle = 90) +
geom_rect(xmin = 0, xmax = Inf, ymin = 15.5, ymax = 20, fill = '#f0f0f0') +
annotate('text', x =0.5, y = 18, label = 'Optimal', angle = 90) +
geom_boxplot(outlier.shape = NA) +
geom_point(aes(fill = REGION, label = username), pch = 21, alpha = 0.8, size = 3,
position = position_jitterdodge(jitter.width =.6, dodge.width = .2)) +
scale_y_continuous(limits = c(1,20)) +
scale_fill_brewer(palette = 'Set1') +
labs(x = 'Parameter - Ordered by Ascending Standard Deviation') +
theme_minimal()
hag_dn_raw_plot
# Hagermans DOWN - couplets and total score
hag_dn_tot_plot <-
hab_df_long %>%
filter(SEC_STATION_ID == 'Hagermans DOWN') %>% #distinct(parameter)
filter(parameter %in% c('total_score', 'instr_coup_RR', 'bank_coup_RR')) %>%
ggplot(aes(x = reorder(parameter, score, FUN = sd), y = score)) +
geom_boxplot(outlier.shape = NA) +
geom_point(aes(fill = REGION, label = username), pch = 21, alpha = 0.8, size = 3,
position = position_jitterdodge(jitter.width =.6, dodge.width = .2)) +
geom_hline(yintercept = c(24, 140), lty = 'dashed') +
scale_fill_brewer(palette = 'Set1') +
labs(x = 'Parameter - Ordered by Ascending Standard Deviation') +
theme_minimal()
hag_dn_tot_plot
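# stats: per-parameter summary for Hagermans DOWN
# cv is reported in points: round(sd / mean, 2) times the maximum score (20 per parameter, 40 per couplet, 240 for total)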
hag_dn_summ <-
hab_df_long %>%
filter(SEC_STATION_ID == 'Hagermans DOWN') %>%
group_by(SEC_STATION_ID, parameter) %>%
summarize(mean = round(mean(score, na.rm = T), 1),
sd = round(sd(score, na.rm = T), 1),
n = n(),
min = min(score, na.rm = T),
max = max(score, na.rm = T)) %>%
mutate(cv = round(sd/mean, 2),
total = case_when(
str_detect(parameter, 'coup') ~ 40,
str_detect(parameter, 'total') ~ 240,
TRUE ~ 20),
cv_pts = cv * total) %>%
select(-c(total, cv)) %>%
rename(cv = cv_pts) %>%
arrange(-cv) %>%
relocate(cv, .after = 'sd') %>%
mutate(rank = nrow(.):1)
hag_dn_summ
# overall summary: mean of the per-site standard deviations for each parameter
all_summ <-
bind_rows(bottle_summ, hag_up_summ, hag_dn_summ) %>%
group_by(parameter) %>%
summarize(mean_sd = round(mean(sd, na.rm = T), 1),
n = n()) %>%
arrange(-mean_sd) %>%
mutate(rank = nrow(.):1)
all_summ
# PEs ---------------------------------------------------------------------
PE_riff_run_all <-
hab_df_long %>%
filter(BLN_GLIDE == 'N') %>%
group_by(SEC_STATION_ID, parameter) %>%
summarize(mean = mean(score),
sd = sd(score),
cv = sd / mean) %>%
mutate(total = case_when(
str_detect(parameter, 'coup') ~ 40,
str_detect(parameter, 'total') ~ 240,
TRUE ~ 20),
cv_pts = cv * total) %>%
arrange(SEC_STATION_ID, -cv_pts) %>%
mutate(category = 'all')
PE_riff_run_all
PE_riff_run_audit <-
hab_df_long %>%
filter(BLN_GLIDE == 'N') %>%
filter(riff_run_audit == 'Y') %>%
group_by(SEC_STATION_ID, parameter) %>%
summarize(mean = mean(score),
sd = sd(score),
cv = sd / mean) %>%
mutate(total = case_when(
str_detect(parameter, 'coup') ~ 40,
str_detect(parameter, 'total') ~ 240,
TRUE ~ 20),
cv_pts = cv * total) %>%
arrange(SEC_STATION_ID, -cv_pts) %>%
mutate(category = 'audited staff')
PE_riff_run_audit
PE_riff_run_non_audit <-
hab_df_long %>%
filter(BLN_GLIDE == 'N') %>%
filter(riff_run_audit == 'N' | is.na(riff_run_audit)) %>%
group_by(SEC_STATION_ID, parameter) %>%
summarize(mean = mean(score),
sd = sd(score),
cv = sd / mean) %>%
mutate(total = case_when(
str_detect(parameter, 'coup') ~ 40,
str_detect(parameter, 'total') ~ 240,
TRUE ~ 20),
cv_pts = cv * total) %>%
arrange(SEC_STATION_ID, -cv_pts) %>%
mutate(category = 'non-audited staff')
PE_riff_run_non_audit
PE_df <-
bind_rows(PE_riff_run_all, PE_riff_run_audit, PE_riff_run_non_audit) %>%
mutate(fcategory = factor(category, levels = c('all', 'non-audited staff', 'audited staff')))
PE_df
levels(PE_df$fcategory)
rm(PE_riff_run_all, PE_riff_run_audit, PE_riff_run_non_audit)
cv_Hagermans_plot <-
ggplot(PE_df, aes(x = cv_pts, y = reorder(parameter, cv_pts, FUN = 'mean'),
fill = fcategory)) +
facet_wrap(~SEC_STATION_ID) +
geom_bar(stat = 'identity', position = 'dodge') +
scale_fill_manual(values = c('#1b9e77', '#d95f02', '#7570b3'),
breaks=c('audited staff', 'non-audited staff', 'all'),
name = '') +
labs(x = 'CV (points)', y = 'Parameter - Ordered by Descending CV') +
theme_minimal() +
theme(legend.position = 'bottom')
cv_Hagermans_plot
# show plot with actual scores overtop CVs
cv_Hagermans_points_plot <-
ggplot() +
facet_wrap(~SEC_STATION_ID) +
geom_linerange(data = filter(PE_df, !parameter %in% c('total_score', 'instr_coup_RR', 'bank_coup_RR')),
aes(xmin = mean - (0.5*cv_pts), xmax = mean + (0.5*cv_pts),
y = reorder(parameter, cv_pts, FUN = 'mean'),
color = category, size = category)) +
# add black triangle for mean
stat_summary(data = filter(PE_df, !parameter %in% c('total_score', 'instr_coup_RR', 'bank_coup_RR')),
aes(x= mean, y = reorder(parameter, cv_pts, FUN = 'mean')),
fun = mean, geom = "point", pch = 17, size = 6, color = "black") +
scale_color_manual(values = c('#1b9e77', '#d95f02', '#7570b3'),
# breaks=c('audited staff', 'non-audited staff', 'all'),
name = '') +
scale_size_manual(values = c(12,8,4),
# breaks=c('audited staff', 'non-audited staff', 'all'),
name = '') +
geom_point(data = filter(hab_df_long, SEC_STATION_ID != 'Bottle' &
!parameter %in% c('total_score', 'instr_coup_RR', 'bank_coup_RR')),
aes(x = score, y = parameter, fill = REGION),
pch = 21, alpha = 0.8, size = 3,
position = position_jitterdodge(jitter.width =.4, dodge.width = .2),
inherit.aes = F) +
# scale_x_continuous(limits = c(1,20)) +
scale_fill_brewer(palette = 'Set1') +
labs(y = 'Parameter - Ordered by Descending CV') +
theme_bw()
cv_Hagermans_points_plot