Standardize all fire responses as proportional to either (a) a pre-burn condition or (b) an unburned control sampled simultaneously.
Use reported means and error terms to (a) calculate an effect size for the response and (b) estimate a response and 95% CI based on simulations parameterized by mean and error.
Test temporal trends with multiple linear mixed-effect models weighted by effect sizes and ranked by model selection.
Subset of responses from 25 papers with sufficient datapoints to illustrate desired graphical outcomes of the review. All trendlines are weighted by the effect size for each observation fit with a second-order polynomial. The grey trendline represents the overall response across all fire severity levels, shaded with the 95% CI.
Generating the meta-analysis data proceeds automatically using a single script in R. Key to this functionality is an exact protocol for entering values reported from primary literature. Steps in this protocol include:
In this way, all data from a single study (one column from the spreadsheet) are imported as a single character vector (string) in a single spreadsheet cell. Such a vector can exceed 1000 characters depending on how many responses/treatments/intervals reported in the study.
The general pattern of RTV format is:
(error type, value) Response1 > Treatment1, mean-error; Treatment2, mean-error | Response2 > Treatment1, mean-error; Treatment2, mean-error
A specific example for RTV format follows:
(se, 3)C_t> UB,18.4-0.47; Lo,15.64-0.59; Hi,13.73-0.53 | N_t> UB,1.27-0.02; Lo,1.15-0.06; Hi,1.08-0.05 | pH> UB,6.8-0.1; Lo, 7.6-0.2; Hi, 8.1-0.1 | P_a> UB,107-7 ;Lo,338-47; Hi,920-49 | Ca_e>UB,1381-40; Lo,1354-21; Hi,2151-280 | K_e>UB,305-15; Lo,372-27; Hi,408-27 | Na_e>UB,29-11; Lo,30-11; Hi,16-5
Specific ASCII characters are used to denote each item of the string. Each of these are parsed specifically by the subroutines:
The R package tidyverse, which includes dplyr and stringr, provide all the workhorse functions for parsing the character strings and arranging them in a manner they can be arranged into a data.frame. This is accomplished within subroutines depending on the comparison type.
The primary data are imported directly from the Google Sheet
SH_key = "1xwVtU-Y9wVUMRS6jkT-8E_jRhuOXhqrXu9n8vlHxXV4"
pools.d <-
gs_key(SH_key) %>%
gs_read(ws="RawPools1") %>%
t() %>%
as.data.frame() %>%
rownames_to_column() %>%
mutate_each(list(as.character)) %>%
as.data.frame() %>%
setNames(.[1,]) %>%
slice(-1) %>%
select(author_year, Severity, CompType,
CompConfig, SampInt, Pools) %>%
filter(str_detect(Pools, "(na)") == "FALSE") %>%
filter(str_detect(Pools, "(NA)") == "FALSE") %>%
mutate(CompType = trimws(CompType),
CompConfig = trimws(CompConfig),
Severity = trimws(Severity))
The first for loop identifies a single study by the author_year citation key column and determines which error term it has. For simplicty here the data are subset by Ando et al. (2004):
for(a in 1:length(unique(pools.d$author_year))) {
pd = filter(pools.d, author_year == author_year[[a]])
tb = c(pd$author_year, pd$CompConfig)
if (str_detect(pd$Pools[[1]], "(sd)") == "FALSE") {
err = "se"
n = as.numeric(str_extract(pd$Pools, "([0-9]+)"))
}
if (str_detect(pd$Pools[[1]], "(sd)") == "TRUE") { err = "sd" }
d1 <- pd %>%
mutate(Pools = str_replace(Pools, "\\([^()]*\\)", "")) %>%
unnest(Pools = str_split(Pools, pattern = "\\#")) %>%
mutate(study = paste(author_year,"-",1:length(author_year), sep=""))
Breaking up the long character string is a series of calls to unnest and separate functions.
First an if statement identifies the comparison configuration.
if(d1$CompConfig[[1]] == "RTV") {
Then the different responses are split into unique rows via the unnest function.
ds <- d1 %>% unnest(Pools = str_split(Pools, pattern = "\\|"))
## author_year Pools
## 1 ando_2014 C_t> UB,18.4-0.47; Lo,15.64-0.59; Hi,13.73-0.53
## 2 ando_2014 N_t> UB,1.27-0.02; Lo,1.15-0.06; Hi,1.08-0.05
## 3 ando_2014 pH> UB,6.8-0.1; Lo, 7.6-0.2; Hi, 8.1-0.1
## 4 ando_2014 P_a> UB,107-7 ;Lo,338-47; Hi,920-49
## 5 ando_2014 Ca_e>UB,1381-40; Lo,1354-21; Hi,2151-280
## 6 ando_2014 K_e>UB,305-15; Lo,372-27; Hi,408-27
## 7 ando_2014 Na_e>UB,29-11; Lo,30-11; Hi,16-5
The separate function takes out the Response and adds it to its own column by splitting on sep=‘>’
ds <- ds %>% separate(Pools, into=c("response", "pools"), sep='>')
## author_year response pools
## 1 ando_2014 C_t UB,18.4-0.47; Lo,15.64-0.59; Hi,13.73-0.53
## 2 ando_2014 N_t UB,1.27-0.02; Lo,1.15-0.06; Hi,1.08-0.05
## 3 ando_2014 pH UB,6.8-0.1; Lo, 7.6-0.2; Hi, 8.1-0.1
## 4 ando_2014 P_a UB,107-7 ;Lo,338-47; Hi,920-49
## 5 ando_2014 Ca_e UB,1381-40; Lo,1354-21; Hi,2151-280
## 6 ando_2014 K_e UB,305-15; Lo,372-27; Hi,408-27
## 7 ando_2014 Na_e UB,29-11; Lo,30-11; Hi,16-5
Another unnest call breaks each Treatment into a unique row by breaking on the ; . First three responses shown.
ds <- ds %>% unnest(pools = str_split(pools, pattern = '\\;'))
## author_year response pools
## 1 ando_2014 C_t UB,18.4-0.47
## 2 ando_2014 C_t Lo,15.64-0.59
## 3 ando_2014 C_t Hi,13.73-0.53
## 4 ando_2014 N_t UB,1.27-0.02
## 5 ando_2014 N_t Lo,1.15-0.06
## 6 ando_2014 N_t Hi,1.08-0.05
## 7 ando_2014 pH UB,6.8-0.1
## 8 ando_2014 pH Lo, 7.6-0.2
## 9 ando_2014 pH Hi, 8.1-0.1
A final separate call creates a column for the Treatments by splitting on the , .
ds <- ds %>% separate(pools, into=c("treatment", "values"), sep=",")
## author_year response treatment values
## 1 ando_2014 C_t UB 18.4-0.47
## 2 ando_2014 C_t Lo 15.64-0.59
## 3 ando_2014 C_t Hi 13.73-0.53
## 4 ando_2014 N_t UB 1.27-0.02
## 5 ando_2014 N_t Lo 1.15-0.06
## 6 ando_2014 N_t Hi 1.08-0.05
## 7 ando_2014 pH UB 6.8-0.1
## 8 ando_2014 pH Lo 7.6-0.2
## 9 ando_2014 pH Hi 8.1-0.1
The above steps get the original character string into a long format suitable for another subroutine to chug out meta-analytical stats.
The first step is a for loop that goes through each response and sorts the data into mean and error columns. The guts of that loop follow. Note how case_when determines whether the error term is sd or se and either saves it or converts it accordingly:
select(treatment, values) %>%
separate(values, into=c("mean", "error"), sep="-") %>%
mutate(mean=as.numeric(mean),
sd = case_when(
err=="sd" ~ as.numeric(error),
err=="se" ~ as.numeric(error) * (sqrt(n)) )) %>%
select(-error)
Before proceeding to the next for loop, the mean and variance parameters of the unburned control treatment.
ub.mu = ub$mean
ub.sig = ub$sd
The next for loop chugs through each treatment and calculates meta-analytical statistics compared to the unburned control.
The first meta-analytical statistic is the effect size. Hedge’s g is used here; to control for potential changes in error as means increase, all comparisons use the SD of the unburned control ub.sig:
EffectSize = (t.mu - ub.mu) / (ub.sig)
With the graphical output in mind, we want to show not only each observation but also the probability of the estimated percentage response with respect to zero (no response). Many papers in the primary literature apply inappropriate statistics to determine fire effect, but here we use reported means and errors to reconstruct the theoretical distribution of treatment and control values by drawing 1000 random observations from the simulated Guassian distributions…
t.mu = td$mean
t.sig = td$sd
sim.dat <- data.frame(ts = rnorm(2000, mean=t.mu, sd=t.sig),
us = rnorm(2000, mean=ub.mu, sd=ub.sig))
sim.dat <- mutate(sim.dat, PrctDif = 100*((ts - us)/us ))
… then we estimate the observed response as the median of the simulated data (Pd_est), and portray its difference from zero (if any) via 95% confidence intervals (Pd_ciL, Pd_ciU):
Pd_ciL = quantile(sim.dat$PrctDif, 0.025, na.rm = TRUE)[[1]]
Pd_est = quantile(sim.dat$PrctDif, 0.500, na.rm = TRUE)[[1]]
Pd_ciU = quantile(sim.dat$PrctDif, 0.975, na.rm = TRUE)[[1]]
The data for a single study comes out as:
## study response level.treatment level.mean level.sd EffectSize
## 1 ando_2014-1 C_t Lo 15.64 1.02190998 -3.3900
## 2 ando_2014-1 C_t Hi 13.73 0.91798693 -5.7400
## 3 ando_2014-1 N_t Lo 1.15 0.10392305 -3.4600
## 4 ando_2014-1 N_t Hi 1.08 0.08660254 -5.4800
## 5 ando_2014-1 pH Lo 7.60 0.34641016 4.6200
## 6 ando_2014-1 pH Hi 8.10 0.17320508 7.5100
## 7 ando_2014-1 P_a Lo 338.00 81.40638796 19.1000
## 8 ando_2014-1 P_a Hi 920.00 84.87048957 67.1000
## 9 ando_2014-1 Ca_e Lo 1354.00 36.37306696 -0.3900
## 10 ando_2014-1 Ca_e Hi 2151.00 484.97422612 11.1000
## 11 ando_2014-1 K_e Lo 372.00 46.76537180 2.5800
## 12 ando_2014-1 K_e Hi 408.00 46.76537180 3.9600
## 13 ando_2014-1 Na_e Lo 30.00 19.05255888 0.0525
## 14 ando_2014-1 Na_e Hi 16.00 8.66025404 -0.6820
## Pd_ciL Pd_est Pd_ciU SampInt
## 1 -27.500 -15.10 -1.260 6_mo
## 2 -36.600 -25.50 -12.400 6_mo
## 3 -26.100 -9.21 7.550 6_mo
## 4 -28.600 -15.40 -0.895 6_mo
## 5 0.405 11.80 23.300 6_mo
## 6 12.100 19.00 27.200 6_mo
## 7 69.800 214.00 405.000 6_mo
## 8 548.000 757.00 1050.000 6_mo
## 9 -12.300 -2.10 10.100 6_mo
## 10 -15.000 57.20 131.000 6_mo
## 11 -9.570 22.20 62.400 6_mo
## 12 -0.211 33.40 76.700 6_mo
## 13 -790.000 -10.10 925.000 6_mo
## 14 -419.000 -49.50 417.000 6_mo
These data are plotted as single observations within each response variable at 6 months since fire for low and high severity. Trendlines and regression models are weighted by the absolute value of the EffectSize.
Two similar subroutines parse the other response configurations into identical data.frames so they can all be combined for graphing and analysis.
The code chunks above leave out a lot of R-specific formatting, etc. The entire script follows:
{ # <-- Sends results through columns-standardizing subroutine + list-flattener
VARscalar = 0.5 # Dial down effect sizes
SDscalar = 0.5 # Dial down confidence intervals
PooledError <- data.frame()
PoolList <- list( RTV.d = data.frame(),
RIV.d = data.frame(),
TRIV.d = data.frame() )
for(a in 1:length(unique(pools.d$author_year))) {
#a = 6
pd = filter(pools.d, author_year == author_year[[a]])
tb = c(pd$author_year, pd$CompConfig)
if (str_detect(pd$Pools[[1]], "(sd)") == "FALSE") {
err = "se"
n = as.numeric(str_extract(pd$Pools, "([0-9]+)"))
}
if (str_detect(pd$Pools[[1]], "(sd)") == "TRUE") { err = "sd" }
d1 <- pd %>%
mutate(Pools = str_replace(Pools, "\\([^()]*\\)", "")) %>%
unnest(Pools = str_split(Pools, pattern = "\\#")) %>%
mutate(study = paste(author_year,"-",1:length(author_year), sep=""))
if(d1$CompConfig[[1]] == "RTV") {
for(s in 1:length(unique(d1$study)) ) {
# j = 1
ddd <- data.frame()
ds <- d1 %>% filter(study == study[[s]])
ds <- ds %>%
unnest(Pools = str_split(Pools, pattern = "\\|")) %>%
separate(Pools, into=c("response", "pools"), sep=">") %>%
unnest(pools = str_split(pools, pattern = "\\;")) %>%
separate(pools, into=c("treatment", "values"), sep=",") %>%
mutate(response = trimws(response),
treatment = trimws(treatment))
Dd <- data.frame()
for(k in 1:length(unique(ds$response)) ) {
dd <- data.frame()
resp = unique(ds$response)[[k]]
dk = filter(ds, response == resp)
study = unique(ds$study)
SampInt = unique(ds$SampInt)
dk2 <- dk %>%
select(treatment, values) %>%
separate(values, into=c("mean", "error"), sep="-") %>%
mutate(mean=as.numeric(mean),
sd = case_when(
err=="sd" ~ as.numeric(error),
err=="se" ~ as.numeric(error) * (sqrt(n)) )) %>%
select(-error)
pe <- data.frame(response = resp,
SD = round( dk2$sd / dk2$mean, 3) )
PooledError <- rbind(PooledError, pe)
ub = dk2 %>% filter(treatment == "UB")
ub.mu <- ub$mean
ub.sig <- ub$sd
trt = filter(dk2, treatment != "UB")
for(t in 1:length(trt$treatment)) {
td = filter(trt, treatment == treatment[[t]])
t.mu = td$mean
t.sig = td$sd
sim.dat <- data.frame(ts = rnorm(2000, mean=t.mu, sd=t.sig*SDscalar),
us = rnorm(2000, mean=ub.mu, sd=ub.sig*SDscalar))
sim.dat <- mutate(sim.dat, PrctDif = 100*((ts - us)/us ))
d <- data.frame(study=study,
response = resp,
level = case_when(
td$treatment=="B" ~ unique(dk$Severity),
TRUE ~ td$treatment),
EffectSize = signif(
(t.mu - ub.mu) / (ub.sig*VARscalar), 3),
Pd_ciL = signif(
quantile(sim.dat$PrctDif, 0.025,
na.rm = TRUE)[[1]], 3),
Pd_est = signif(
quantile(sim.dat$PrctDif, 0.500,
na.rm = TRUE)[[1]], 3),
Pd_ciU = signif(
quantile(sim.dat$PrctDif, 0.975,
na.rm = TRUE)[[1]], 3),
SampInt = SampInt)
dd <- rbind(dd, d)
rm(d,td,t.mu,t.sig,sim.dat)
}
ddd <- rbind(ddd, dd)
rm(dd, resp, dk, study, SampInt, dk2, pe, ub, ub.mu, ub.sig, trt)
}
Dd <- rbind(Dd, ddd)
rm(ddd, ds)
}
PoolList$RTV.d<- rbind(PoolList$RTV.d, Dd)
rm(Dd)
} # Close RTV if
if(d1$CompConfig[[1]] == "RIV") {
for(s in 1:length(unique(d1$study)) ) {
# j = 1
ddd <- data.frame()
ds <- d1 %>% filter(study == study[[s]])
ds <- ds %>%
unnest(Pools = str_split(Pools, pattern = "\\|")) %>%
separate(Pools, into=c("response", "pools"), sep=">") %>%
unnest(pools = str_split(pools, pattern = "\\;")) %>%
separate(pools, into=c("interval", "values"), sep=",") %>%
mutate(response = trimws(response),
interval = trimws(interval)) %>%
select(-author_year, -CompType)
Dd <- data.frame()
for(k in 1:length(unique(ds$response)) ) {
# k = 1
dd <- data.frame()
resp = unique(ds$response)[[k]]
dk = filter(ds, response == resp)
study = unique(ds$study)
SampInt = unique(ds$SampInt)
dk2 <- dk %>%
select(interval, values) %>%
separate(values, into=c("mean", "error"), sep="-") %>%
mutate(mean=as.numeric(mean),
sd = case_when(
err=="sd" ~ as.numeric(error),
err=="se" ~ as.numeric(error) * (sqrt(n)) )) %>%
select(-error)
pe <- data.frame(response = resp,
SD = round( dk2$sd / dk2$mean, 3) )
PooledError <- rbind(PooledError, pe)
c = dk2 %>% filter(interval %in% c("UB", "P", "F"))
c.mu <- c$mean
c.sig <- c$sd
int = dk2 %>% filter(! interval %in% c("UB", "P", "F"))
for(i in 1:length(int$interval)) {
id = filter(int, interval == interval[[i]])
f.mu = id$mean
f.sig = id$sd
sim.dat <- data.frame(fs = rnorm(2000, mean=f.mu, sd=f.sig*SDscalar),
cs = rnorm(2000, mean=c.mu, sd=c.sig*SDscalar))
sim.dat <- mutate(sim.dat, PrctDif = 100*((fs - cs)/cs ))
d <- data.frame(study=study,
response = resp,
level = unique(dk$Severity),
interval = int$interval[[i]],
EffectSize = signif(
(f.mu - c.mu) / (c.sig*VARscalar), 3),
Pd_ciL = signif(
quantile(sim.dat$PrctDif, 0.025,
na.rm = TRUE)[[1]], 3),
Pd_est = signif(
quantile(sim.dat$PrctDif, 0.500,
na.rm = TRUE)[[1]], 3),
Pd_ciU = signif(
quantile(sim.dat$PrctDif, 0.975,
na.rm = TRUE)[[1]], 3),
SampInt = SampInt)
dd <- rbind(dd, d)
rm(d, id, f.mu, f.sig, sim.dat)
}
ddd <- rbind(ddd, dd)
rm(dd, resp, dk, study, SampInt, dk2, pe)
}
Dd <- rbind(Dd, ddd)
rm(ddd, ds)
}
PoolList$RIV.d <- rbind(PoolList$RIV.d, Dd)
rm(Dd)
} # Close RIV if
if(d1$CompConfig[[1]] == "TRIV") {
for(s in 1:length(unique(d1$study)) ) {
# j = 1
ddd <- data.frame()
ds <- d1 %>% filter(study == study[[s]])
ds <- ds %>%
unnest(Pools = str_split(Pools, pattern = "\\ @")) %>%
separate(Pools, into=c("treatment", "pools"), sep="\\^") %>%
unnest(pools = str_split(pools, pattern = "\\|")) %>%
separate(pools, into=c("response", "pools"), sep=">") %>%
unnest(pools = str_split(pools, pattern = "\\;")) %>%
separate(pools, into=c("interval", "values"), sep=",") %>%
mutate( treatment = trimws(treatment),
response = trimws(response),
interval = trimws(interval)) %>%
select(-author_year, -CompConfig)
Dd <- data.frame()
for(k in 1:length(unique(ds$response)) ) {
# k = 1
dd <- data.frame()
resp = unique(ds$response)[[k]]
dk = filter(ds, response == resp)
study = unique(ds$study)
for (i in 1:length(unique(dk$interval))) {
# i = 1
int = unique(dk$interval)[[i]]
di = filter(dk, interval == int)
di2 <- di %>%
select(CompType, treatment, values) %>%
separate(values, into=c("mean", "error"), sep="-") %>%
mutate(mean=as.numeric(mean),
sd = case_when(
err=="sd" ~ as.numeric(error),
err=="se" ~ as.numeric(error) * (sqrt(n)) )) %>%
select(-error)
pe <- data.frame(response = resp,
SD = round( di2$sd / di2$mean, 3) )
PooledError <- rbind(PooledError, pe)
c = di2 %>% filter(treatment %in% c("UB", "P", "F"))
c.mu <- c$mean
c.sig <- c$sd
f = di2 %>% filter(! treatment %in% c("UB", "P", "F"))
f.mu = f$mean
f.sig = f$sd
sim.dat <- data.frame(fs = rnorm(2000, mean=f.mu, sd=f.sig*SDscalar),
cs = rnorm(2000, mean=c.mu, sd=c.sig*SDscalar))
sim.dat <- mutate(sim.dat, PrctDif = 100*((fs - cs)/cs ))
d <- data.frame(study=study,
response = resp,
level = case_when(
di2$CompType[[1]] == "TCI" ~ unique(di$Severity),
di2$CompType[[1]] =="SCI" ~ f$treatment ),
interval = int,
EffectSize = signif(
(f.mu - c.mu) / (c.sig*VARscalar), 3),
Pd_ciL = signif(
quantile(sim.dat$PrctDif, 0.025,
na.rm = TRUE)[[1]], 3),
Pd_est = signif(
quantile(sim.dat$PrctDif, 0.500,
na.rm = TRUE)[[1]], 3),
Pd_ciU = signif(
quantile(sim.dat$PrctDif, 0.975,
na.rm = TRUE)[[1]], 3),
SampInt = unique(di$SampInt) )
dd <- rbind(dd, d)
rm(d, int, di, di2, pe, c.mu, c.sig, f, f.mu, f.sig)
} # close interval loop
ddd <- rbind(ddd, dd)
rm(dd, resp, dk, study)
} # close response loop
Dd <- rbind(Dd, ddd)
rm(ddd, ds)
} # close the study loop
PoolList$TRIV.d <- rbind(PoolList$TRIV.d, Dd)
rm(Dd)
} # Close TRIV if
} # close author for loop
# Standardize columns subroutine
{ # <-- open for list flattener
for(i in 1:length(names(PoolList)) ) {
d = names(PoolList)[i]
if (d == "RTV.d") {
PoolList$RTV.d <-
PoolList$RTV.d %>%
separate(SampInt, c("tsf", "interval"), sep="_") %>%
mutate(tsf = as.numeric(tsf)) %>%
mutate(DaysSinceFire = case_when(
interval %in% c("mo", "mos") ~ tsf*30,
interval %in% c("years", "yr", "yrs") ~ tsf*365,
interval == "weeks" ~ tsf*7,
interval == "days" ~ tsf )) %>%
select(-tsf, -interval)
}
if (d == "RIV.d") {
PoolList$RIV.d <-
PoolList$RIV.d %>%
mutate(interval = as.numeric(as.character(interval)),
SampInt = gsub("_", "", SampInt)) %>%
mutate(DaysSinceFire = case_when(
SampInt %in% c("mo", "mos") ~ interval*30,
SampInt %in% c("years", "yr", "yrs") ~ interval*365,
SampInt == "weeks" ~ interval*7,
SampInt == "days" ~ interval )) %>%
select(-SampInt, -interval)
}
if ( d == "TRIV.d") {
PoolList$TRIV.d <-
PoolList$TRIV.d %>%
mutate(interval = as.numeric(as.character(interval)),
SampInt = gsub("_", "", SampInt)) %>%
mutate(DaysSinceFire = case_when(
SampInt %in% c("mo", "mos") ~ interval*30,
SampInt %in% c("years", "yr", "yrs") ~ interval*365,
SampInt == "weeks" ~ interval*7,
SampInt == "days" ~ interval )) %>%
select(-SampInt, -interval)
}
}
# Flatten list
PoolList %>%
data_frame( ) %>%
unnest() -> MetaPool
}
} # Close whole thing