Uncertiainty Space Sampeled
I followed the protocol we set up for the AGU aerosol experiment.
# uncertainty_sample_range: samples a range and returns a vector that meeets the rolustion return a vector the
# resolution / max samples argument requirements. The sampeling in unifrom centered around the median.
# Args
# input: a data frame contianing a column of scalars that
# resolution: the number of the desired resolution between sampled space
# max_smaples: a number that determines the limit of the maxiumn number of samples that can be taken from a space.
# TODO This function is okay, and could probably be enhanced.
uncertainty_sample_range <- function(input, resolution, max_samples){
# Find the center of the paramter range.
median_value <- median(input$scalar)
median_extreme_diff <- abs(input$scalar[1] - median_value)
if(median_extreme_diff <= resolution){
# Return the input scalars puls the meidan value.
c(median_value, input$scalar)
} else {
# Determine how many sample would have to be taken in order to meet the resolution
# requirements.
sample_length <- ceiling(median_extreme_diff / resolution)
if( 2 * sample_length > max_samples){
# If the calculated sample length is longer than the max sample size
# default to the max sample size and return a warning.
sample_length <- floor((max_samples / 2)) - 1
warning('Calulated sample size based on the resolution exces the max sample size,
using the max sample size instead.')
}
first_half <- seq(from = min(input$scalar), to = median_value, length.out = sample_length)
second_half <- seq(from = median_value, to = max(input$scalar), length.out = sample_length)
if(length(first_half) < 1){
first_half <- min(input$scalar)
}
if(length(second_half) < 1){
second_half <- max(input$scalar)
}
# Return the sampled sequence
sort(c(first_half, median_value, second_half))
}
}
bind_rows(tibble( species = "SO2i", min = -1.2, max = 0),
tibble( species = "SO2d", min = -.60, max = - 0.2),
tibble( species = "OC", min = -.16,max = -.03),
tibble( species = "BC", min = .05, max = .8)) ->
IPCC_range
# Subset the RF pathways to include the RF values in 2011 for the rcp45 runs for the aerosol species we are
# perturbing in the uncertianty anlaysis.
aerosol_RF <- bind_rows(Hector_RF[ grepl(paste(IPCC_range$species, collapse = '|') , names(Hector_RF))]) %>%
filter( grepl('rcp45', name) ) %>%
filter(year == 2011)
# Now add the IPCC 2011 uncertianty ranges to the aerosol RF tibble.
aerosol_RF %>%
full_join(gather(IPCC_range, extreme, IPCC_range, min, max), by = c('agent' = 'species')) %>%
select(year, IPCC_range, value_2011 = value, agent, extreme) ->
IPCC_2011_values
# Divide the min and max IPCC ranges by the 2011 RF values to calculate the min and matx uncertainty scalars.
IPCC_2011_values %>%
mutate(scalar = IPCC_range / value_2011 ) %>%
select(agent, scalar, extreme) %>%
mutate(species = paste0('rcp45_F', agent)) ->
min_max_scalars
# 3. Sample Uncetianty scalars and Run HIRM -----------------------------------------------------------------------
# Now sample the uncetainty scalars using the function we defined in section 0.5. How often we sample the
# scalar space will depend on how large the range is.
min_max_scalars %>%
filter(agent == 'BC') %>%
uncertainty_sample_range(resolution = 0.02, max_samples = 10) ->
BC_scalars
Calulated sample size based on the resolution exces the max sample size,
using the max sample size instead.
min_max_scalars %>%
filter(agent == 'OC') %>%
uncertainty_sample_range(resolution = 0.02, max_samples = 8) ->
OC_scalars
Calulated sample size based on the resolution exces the max sample size,
using the max sample size instead.
min_max_scalars %>%
filter(agent == 'SO2d') %>%
uncertainty_sample_range(resolution = 0.1, max_samples = 15) ->
SO2d_scalars
min_max_scalars %>%
filter(agent == 'SO2i') %>%
uncertainty_sample_range(resolution = 0.1, max_samples = 20) ->
SO2i_scalars
# Save the scalars a a list.
scalars_1 <- list(rcp45_FBC = BC_scalars,
rcp45_FOC = OC_scalars,
rcp45_FSO2d = SO2d_scalars,
rcp45_FSO2i = SO2i_scalars)
Let’s take a look at what values ended up being samppled.
BC RF
length(scalars_1$rcp45_FBC)
[1] 9
range(scalars_1$rcp45_FOC)
[1] 0.06893547 0.36765582
ggplot(data = tibble(value = scalars_1$rcp45_FBC,
scalar = 'BC')) +
geom_point(aes(scalar, value)) +
labs(y = 'BC RF scalar value')

OC RF
length(scalars_1$rcp45_FOC)
[1] 7
range(scalars_1$rcp45_FOC)
[1] 0.06893547 0.36765582
ggplot(data = tibble(value = scalars_1$rcp45_FOC,
scalar = 'OC')) +
geom_point(aes(scalar, value)) +
labs(y = 'OC RF scalar value')

RF SOd
length(scalars_1$rcp45_FSO2d)
[1] 13
range(scalars_1$rcp45_FSO2d)
[1] 0.567414 1.702242
ggplot(data = tibble(value = scalars_1$rcp45_FSO2d,
scalar = 'SO2d')) +
geom_point(aes(scalar, value)) +
labs(y = 'SO2d RF scalar value')

RF SOi
length(scalars_1$rcp45_FSO2i)
[1] 21
range(scalars_1$rcp45_FSO2i)
[1] 0.00000 1.99043
ggplot(data = tibble(value = scalars_1$rcp45_FSO2i,
scalar = 'SO2i')) +
geom_point(aes(scalar, value)) +
labs(y = 'SO2i RF scalar value')

How many perumataions is this??
total_run_count <- prod(lengths(scalars_1))
total_run_count
[1] 17199
Constraints
Global Mean Temperautre Constraint
# The temperature linear trend for the 1880-2010 period (0.65 to 1.1 oC) was then used to screen out
# cases with trends falling outside this range (Hartmann et al. 2013).
# Hartmann et al. 2013 is IPCC AR5 climate chapter WG I chapter 2.
# Pg 194
# Observations: Atmosphere and Surface
# Coordinating Lead Authors:
# Dennis L. Hartmann (USA), Albert M.G. Klein Tank (Netherlands), Matilde Rusticucci (Argentina)
temp_constraints <- tibble(period_name = '1880_2010',
period_start = 1880,
period_end = 2010,
min = 0.65,
max = 1.1)
For each of the 1.719910^{4} runs I calcualted the slope of the temperature change from 1880 to 2020 multiplied by 131 years for the number of years in the period. And then compared the HIRM slopes with the min and max values of the IPCC slope.
How many runs passed through the IPCC temp change?
passing_rslts <- read.csv(list.files(OUTPUT2_DIR, '.csv', full.names = TRUE))
passing_rslts %>%
filter(constraint == 'temp') %>%
pull(pass) %>%
sum ->
passing_temp_runs
passing_temp_runs
[1] 9304
What percentage of the total run count passed through the temperature observations??
100*(passing_temp_runs / total_run_count)
[1] 54.09617
What does the temp change constraint look like?
ggplot(data = passing_rslts %>%
filter(constraint == 'temp')) +
geom_point(aes(year, value), position = 'jitter') +
geom_hline(aes(yintercept = temp_constraints$min), color = 'red') +
geom_hline(aes(yintercept = temp_constraints$max), color = 'red') +
labs(y = expression(degree~'C'),
title = 'Change in HIRM temperature and IPCC historical temperature constraints')

Total Aerosol RF Constraint
I use the same constraints we did from the AGU analysis however I found several effor in my code.
- I did not constrain to runs with aersol RF I looked at total RF.
- I selected runs that fell outside the constraint region (face palm).
# The IPCC AR5 uncertainty in the aerosol uncertainty in 2011 spans a range of 0.14 to -1.66.
rf <- tibble(year = 2011,
min = 0.14,
max = -1.66)
Now when I limit the sum of the BC, OC, SO2d, and SO2i RF in 2011 to fall within this range there are no runs that pass through the RF constriant. I think that this is an issue that we are going to need to revisit.
passing_rslts %>%
filter(constraint == 'forcing') %>%
pull(pass) %>%
sum ->
passing_rf_runs
passing_rf_runs
[1] 0
Let’s compare the HIRM total aerosol RF values with the constraint values.

---
title: "HIRM Documentation Paper - Uncertainty"
output: html_notebook
---
```{r, message = FALSE, warning=FALSE}
# Load the required packages. 
library(dplyr)
library(tidyr)
library(tibble)
library(SLCFimpulse)
library(ggplot2)

BASE_DIR <- "/Users/dorh012/Documents/2019/HIRM-documentation-paper"
OUTPUT2_DIR <- file.path(BASE_DIR, 'output', '2C.uncertainty_rslts')

```


## Objective 

This document highlights the method and resutls from the aerosol uncertinaty case study that we would like to include in the HIRM docuemtnation paper. 

## Uncertiainty Space Sampeled 

I followed the protocol we set up for the AGU aerosol experiment. 

```{r, message = FALSE}

# uncertainty_sample_range: samples a range and returns a vector that meeets the rolustion return a vector the
# resolution / max samples argument requirements. The sampeling in unifrom centered around the median.
# Args
#   input: a data frame contianing a column of scalars that
#   resolution: the number of the desired resolution between sampled space
#   max_smaples: a number that determines the limit of the maxiumn number of samples that can be taken from a space.
# TODO This function is okay, and could probably be enhanced.

uncertainty_sample_range <- function(input, resolution, max_samples){
  
  # Find the center of the paramter range.
  median_value <- median(input$scalar)
  median_extreme_diff <- abs(input$scalar[1] - median_value)
  
  if(median_extreme_diff <= resolution){
    
    # Return the input scalars puls the meidan value.
    c(median_value, input$scalar)
    
  } else {
    
    # Determine how many sample would have to be taken in order to meet the resolution
    # requirements.
    sample_length <- ceiling(median_extreme_diff / resolution)
    
    if( 2 * sample_length > max_samples){
      
      # If the calculated sample length is longer than the max sample size
      # default to the max sample size and return a warning.
      sample_length <- floor((max_samples / 2)) - 1
      
      warning('Calulated sample size based on the resolution exces the max sample size,
              using the max sample size instead.')
      
    }
    
    first_half  <- seq(from = min(input$scalar), to = median_value, length.out = sample_length)
    second_half <- seq(from = median_value, to = max(input$scalar), length.out = sample_length)
    
    if(length(first_half) < 1){
      
      first_half <- min(input$scalar)
      
    }
    
    if(length(second_half) < 1){
      second_half <- max(input$scalar)
    }
    
    # Return the sampled sequence
    sort(c(first_half, median_value, second_half))
    
  }
  
}

bind_rows(tibble( species = "SO2i", min =  -1.2, max = 0),
          tibble( species = "SO2d", min = -.60, max = - 0.2),
          tibble( species = "OC", min = -.16,max =  -.03),
          tibble( species = "BC", min = .05, max = .8)) ->
  IPCC_range

# Subset the RF pathways to include the RF values in 2011 for the rcp45 runs for the aerosol species we are
# perturbing in the uncertianty anlaysis.
aerosol_RF <- bind_rows(Hector_RF[ grepl(paste(IPCC_range$species, collapse = '|') , names(Hector_RF))]) %>%
  filter( grepl('rcp45', name) ) %>%
  filter(year == 2011)

# Now add the IPCC 2011 uncertianty ranges to the aerosol RF tibble.
aerosol_RF %>%
  full_join(gather(IPCC_range, extreme, IPCC_range, min, max), by = c('agent' = 'species')) %>%
  select(year, IPCC_range, value_2011 = value, agent, extreme) ->
  IPCC_2011_values

# Divide the min and max IPCC ranges by the 2011 RF values to calculate the min and matx uncertainty scalars.
IPCC_2011_values %>%
  mutate(scalar = IPCC_range / value_2011 ) %>%
  select(agent, scalar, extreme) %>%
  mutate(species =  paste0('rcp45_F', agent))  ->
  min_max_scalars

# 3. Sample Uncetianty scalars and Run HIRM -----------------------------------------------------------------------
# Now sample the uncetainty scalars using the function we defined in section 0.5. How often we sample the
# scalar space will depend on how large the range is.
min_max_scalars %>%
  filter(agent == 'BC') %>%
  uncertainty_sample_range(resolution = 0.02, max_samples = 10) ->
  BC_scalars

min_max_scalars %>%
  filter(agent == 'OC') %>%
  uncertainty_sample_range(resolution = 0.02, max_samples = 8) ->
  OC_scalars

min_max_scalars %>%
  filter(agent == 'SO2d') %>%
  uncertainty_sample_range(resolution = 0.1, max_samples = 15) ->
  SO2d_scalars

min_max_scalars %>%
  filter(agent == 'SO2i') %>%
  uncertainty_sample_range(resolution = 0.1, max_samples = 20) ->
  SO2i_scalars

# Save the scalars a a list.
scalars_1 <- list(rcp45_FBC = BC_scalars,
                  rcp45_FOC = OC_scalars,
                  rcp45_FSO2d = SO2d_scalars,
                  rcp45_FSO2i = SO2i_scalars)

```

Let's take a look at what values ended up being samppled. 

### BC RF  

```{r}
length(scalars_1$rcp45_FBC)
```

```{r}
range(scalars_1$rcp45_FOC)
```

```{r}
ggplot(data = tibble(value = scalars_1$rcp45_FBC,  
                     scalar = 'BC')) + 
  geom_point(aes(scalar, value)) + 
  labs(y = 'BC RF scalar value')
```

### OC RF  

```{r}
length(scalars_1$rcp45_FOC)
```

```{r}
range(scalars_1$rcp45_FOC)
```

```{r}
ggplot(data = tibble(value = scalars_1$rcp45_FOC,  
                     scalar = 'OC')) + 
  geom_point(aes(scalar, value)) + 
  labs(y = 'OC RF scalar value')
```

### RF SOd 

```{r}
length(scalars_1$rcp45_FSO2d)
```

```{r}
range(scalars_1$rcp45_FSO2d)
```

```{r}
ggplot(data = tibble(value = scalars_1$rcp45_FSO2d,  
                     scalar = 'SO2d')) + 
  geom_point(aes(scalar, value)) + 
  labs(y = 'SO2d RF scalar value')
```

### RF SOi 

```{r}
length(scalars_1$rcp45_FSO2i)
```

```{r}
range(scalars_1$rcp45_FSO2i)
```

```{r}
ggplot(data = tibble(value = scalars_1$rcp45_FSO2i,  
                     scalar = 'SO2i')) + 
  geom_point(aes(scalar, value)) + 
  labs(y = 'SO2i RF scalar value')
```


### How many perumataions is this?? 

```{r}
total_run_count <- prod(lengths(scalars_1))
total_run_count
```

## Constraints 

### Global Mean Temperautre Constraint 

```{r}
# The temperature linear trend for the 1880-2010 period (0.65 to 1.1 oC) was then used to screen out
# cases with trends falling outside this range  (Hartmann et al. 2013).
# Hartmann et al. 2013 is IPCC AR5 climate chapter WG I chapter 2.
# Pg 194
# Observations: Atmosphere and Surface
# Coordinating Lead Authors:
# Dennis L. Hartmann (USA), Albert M.G. Klein Tank (Netherlands), Matilde Rusticucci (Argentina)
temp_constraints <- tibble(period_name = '1880_2010',
                           period_start = 1880,
                           period_end = 2010,
                           min = 0.65, 
                           max = 1.1) 
```


For each of the `r total_run_count` runs I calcualted the slope of the temperature change from 1880 to 2020 multiplied by 131 years for the number of years in the period. And then compared the HIRM slopes with the min and max values of the IPCC slope. 

How many runs passed through the IPCC temp change?  

```{r}
passing_rslts <- read.csv(list.files(OUTPUT2_DIR, '.csv', full.names = TRUE))
```

```{r}
passing_rslts %>%  
  filter(constraint == 'temp') %>% 
  pull(pass) %>% 
  sum -> 
  passing_temp_runs
passing_temp_runs
```

What percentage of the total run count passed through the temperature observations?? 

```{r}
100*(passing_temp_runs / total_run_count)
```

What does the temp change constraint look like? 

```{r}
ggplot(data = passing_rslts %>%  
         filter(constraint == 'temp')) +
  geom_point(aes(year, value), position = 'jitter') + 
  geom_hline(aes(yintercept = temp_constraints$min), color = 'red') + 
  geom_hline(aes(yintercept = temp_constraints$max), color = 'red') + 
  labs(y = expression(degree~'C'), 
       title = 'Change in HIRM temperature and IPCC historical temperature constraints')
```



### Total Aerosol RF Constraint 

I use the same constraints we did from the AGU analysis however I found several effor in my code. 

1. I did not constrain to runs with aersol RF I looked at total RF. 
2. I selected runs that fell outside the constraint region (face palm). 

```{r}
# The IPCC AR5 uncertainty in the aerosol uncertainty in 2011 spans a range of 0.14 to -1.66. 
rf <- tibble(year = 2011, 
             min = 0.14, 
             max = -1.66)
```


Now when I limit the sum of the BC, OC, SO2d, and SO2i RF in 2011 to fall within this range there are no runs that pass through the RF constriant. I think that this is an issue that we are going to need to revisit.


```{r}
passing_rslts %>%  
  filter(constraint == 'forcing') %>% 
  pull(pass) %>% 
  sum -> 
  passing_rf_runs
passing_rf_runs
```

Let's compare the HIRM total aerosol RF values with the constraint values.

```{r}
ggplot(data = passing_rslts %>%  
         filter(constraint == 'forcing')) +
  geom_point(aes(year, value), position = 'jitter') + 
  geom_hline(aes(yintercept = rf$min), color = 'red') + 
  geom_hline(aes(yintercept = rf$max), color = 'red') + 
  labs(y = 'RF W/m2', 
       title = 'Total Aerosol RF vs IPCC constraint')
```



## Conclusions 

1. I think that the global mean temperature change constraint looks good. 
2. I am concerned about what is going on with the aerosol RF contraint. 
3. Do we sample enough BC and OC scalars?

