Resources: i. PACTA for Banks Methodology
ii. PACTA for Banks Packages
iii. Step-by-step Instruction
iv. PACTA Scenario Support Document
v. PACTA for Banks Knowledge Hub

1. Overview

Package Short Description Function
r2dii.data Dataset Templates Provides mock datasets and templates for loan books and asset-level data.
r2dii.match Data Linking Uses fuzzy-matching algorithms to link bank clients to physical assets.
r2dii.analysis Core Calculation Calculates Technology Mix, Production Trajectories, and Emission Intensity.
r2dii.plot Visualization Generates standardized charts to visualize alignment with climate scenarios.

2. Methodology

Step 1: Gather inputs: loanbook, asset-based company data, climate scenarios.
Step 2: Map loans from the bank’s loanbook to physical assets in the real economy and their corresponding production values.
Step 3: Measure & visualize portfolio alignment & company alignment in 3 metrics and 2 approaches, depending on the sector.

3. Demo with mock dataset from r2dii.data

Important note for the demo:
- Assume the data is already clean and requires no further validation. In a real-world scenario, data cleaning steps would be mandatory.
- Only consider portfolio-weighted approach as it would be most relevant to our scope of work.
- To view full analysis procedure, please refer to the Step-by-step Instruction pinned at the top of this document.

3.1. Prepare the tool

Need to install: R Studio, R language & the following packages

library("rmarkdown")
library("pacta.loanbook")
library("r2dii.data")
library("r2dii.analysis")
library("r2dii.match")
library("r2dii.plot")
library("dplyr")
library("ggplot2")
library("scales")
library("tidyr")

3.2. Prepare the input

PACTA is data-agnostic. None of the inputs below are available by PACTA itself and should be prepared by users. All the data points have data dictionary which we can refer to in input preparation.

3.2.1. Loan book

  • Source: Prepared by Bank

  • Required input:

    • id_loan: <character>
    • id_direct_loantaker: <character>
    • name_direct_loantaker: <character>
    • id_ultimate_parent: <character>
    • name_ultimate_parent: <character>
    • loan_size_outstanding: <numeric>
    • loan_size_outstanding_currency: <character>
    • loan_size_credit_limit: <numeric>
    • loan_size_credit_limit_currency: <character>
    • sector_classification_system: <character>
    • sector_classification_direct_loantaker: <character>
    • lei_direct_loantaker: <character>
    • isin_direct_loantaker: <character>
  • For sector classfication: We have previously done VSIC & NAICS mapping, so if we proceed this analysis for Vietnamese banks, we can ultilize that.

  • Below is the demo:

ap_loanbook <- r2dii.data::loanbook_demo %>%
  mutate(sector_classification_direct_loantaker = as.character(sector_classification_direct_loantaker)) %>%
  left_join(sector_classifications, by = c(
    "sector_classification_system" = "code_system",
    "sector_classification_direct_loantaker" = "code"
  )) %>%
  rename(
    sector_matched = sector,
    borderline_matched = borderline
  )

paged_table(ap_loanbook)

3.2.2. Asset-based company data (ABCD)

  • Description: Production profiles and emission intensities for companies in climate critical economic sectors.
  • Source: This typically must be purchase from third-party providers (eg. Asset Impact), it can also be self-prepared or supplemented with additional entries. This is the most challenging input for Vietnamese banks.
  • Required inputs:
    • company_id: <character>
    • name_company: <character>
    • lei: <character>
    • sector: <character>
    • technology: <character>
    • production_unit: <character>
    • year: <integer>
    • production: <numeric>
    • emission_factor: <numeric>
    • plant_location: <character>
    • is_ultimate_owner: <logical>
    • emission_factor_unit: <character>
  • Below is the demo:
ap_abcd <- r2dii.data::abcd_demo
paged_table(ap_abcd)

3.2.3. Scenarios (ABCD)

  • Source: World Energy Outlook by IEA, JRC, ISF…
  • Required inputs
    • Sectors using Market Share Approach:
      • scenario: <character>
      • sector: <character>
      • technology: <character>
      • region: <character>
      • year: <integer>
      • tmsr: <numeric>
      • smsp: <numeric>
      • scenario_source: <character>
    • Sectors using Sectoral Decarbonization Approach:
      • scenario_source: <character>
      • scenario: <character>
      • sector: <character>
      • region: <character>
      • year: <numeric>
      • emission_factor_unit: <character>
      • emission_factor: <numeric>
  • Below is the demo
scenario <- r2dii.data::scenario_demo_2020
co2 <- r2dii.data::co2_intensity_scenario_demo
region <- r2dii.data::region_isos_demo

3.3. Match loans (from loanbook) to physical assets & production values (from ABCD)

3.3.1. Match

  • We can match loanbook & ABCD data by the match_name function based on fuzzy matching algorithms. Advanced arguments including:
    • min_score: a threshold between 0 and 1 that sets the minimum similarity score required for a match to be suggested, where 1 represents a perfect match.
    • method: specifies the string distance calculation algorithm used to score name similarity, with options including “jw” (Jaro-Winkler), “levenshtein”, “cosine”, and “jaccard”.
    • p: prefix factor used with Jaro-Wrinkler (jw) method ranging from 0 to 0.25, giving the highest scores to name pairs with a common prefix.
    • overwrite: an optional data frame used to manually replace or correct the sector or name columns for specific borrowers, allowing the user to bypass the algorithm for known entities or ownership changes
  • After we run the match function, we need to export the file and check manually for each match with score lower than 1 and manually change the score to 1 or 0 depending on the correctness of the match.
  • We then import the file into R again and run funtion prioritize to remove duplicates in case one loan ID is matched with multiple levels of ownership.
  • Below is the demo (the min_score is set to 1 because I cannot check if it’s a correct match or not so only exact match should be kept, no manual check needed)
matches <- ap_loanbook %>%
  match_name(ap_abcd, by_sector = TRUE, min_score = 1, method = "jw", p = 0.1) %>%
  #between the two steps must be manual check
  prioritize()
paged_table(matches)
mismatch_analysis <- matches %>%
  filter(sector_matched != sector) %>%
  select(id_loan, name_direct_loantaker, name_ultimate_parent, sector_matched, sector)

if (nrow(mismatch_analysis) > 0) {
  paged_table(mismatch_analysis)
} else {
  print("No sector mismatch found. Scope matched by Bank is consistent with ABCD.")
}
## [1] "No sector mismatch found. Scope matched by Bank is consistent with ABCD."

3.3.2. Calculate match success rate

#Calculation
loanbook_sector_summary <- ap_loanbook %>%
 group_by(sector_matched) %>%
  summarise(
   total_outstanding = sum(loan_size_outstanding, na.rm = TRUE)
  )
matches_sector_summary <- matches %>%
  group_by(sector) %>%
   summarise(
     matches_outstanding = sum(loan_size_outstanding, na.rm = TRUE)
   )
sector_summary <- loanbook_sector_summary %>%
  left_join(matches_sector_summary, by = c("sector_matched" = "sector")) %>%
  mutate(
    matches_outstanding = ifelse(is.na(matches_outstanding), 0, matches_outstanding),
    match_percentage = (matches_outstanding / total_outstanding) * 100
  )

#Pie Chart
outstanding_total <- sum(sector_summary$total_outstanding)
outstanding_matched <- sum(sector_summary$matches_outstanding)
outstanding_notinscope <- loanbook_sector_summary %>%
  filter(sector_matched == "not in scope") %>%
  pull(total_outstanding)

df_sector_pie <- data.frame(
  status = c("(In Scope) Matched", "(In Scope) Not Matched", "Not in Scope"),
  amount = c(
    outstanding_matched, 
    (outstanding_total - outstanding_notinscope - outstanding_matched), 
    outstanding_notinscope
  )
) %>%
  mutate(
    percent = amount / sum(amount),
    label = paste0(status, "\n", percent(percent, accuracy = 0.01))
  )

ggplot(df_sector_pie, aes(x = "", y = amount, fill = status)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar("y", start = 0) +
  geom_text(aes(label = label), 
            position = position_stack(vjust = 0.5), 
            color = "white", 
            fontface = "bold",
            size = 3) +
  scale_fill_manual(values = c(
    "(In Scope) Matched" = "#14645c",
    "(In Scope) Not Matched" = "#e8594b",
    "Not in Scope" = "#9E9E9E"
  )) +
  labs(
    title = "Portfolio Distribution Breakdown",
    subtitle = paste0("Loanbook Value: ", comma(outstanding_total), " EUR"),
    fill = NULL
  ) +
  theme_void() +
  theme(
    legend.position = "bottom",
    plot.title = element_text(hjust = 0.5, face = "bold", size = 16),
    plot.subtitle = element_text(hjust = 0.5, size = 12)
  )

#Bar Chart
df_sector_bar <- sector_summary %>%
  mutate(not_matched_outstanding = total_outstanding - matches_outstanding) %>%
  pivot_longer(
    cols = c(matches_outstanding, not_matched_outstanding),
    names_to = "type",
    values_to = "amount"
  ) %>%
  mutate(
    status = case_when(
      sector_matched == "not in scope" ~ "Not in Scope",
      type == "matches_outstanding" ~ "(In Scope) Matched",
      TRUE ~ "(In Scope) Not Matched"
    ),
  status = factor(status, levels = c("(In Scope) Matched", "(In Scope) Not Matched", "Not in Scope"))
  )

ggplot(df_sector_bar, aes(x = reorder(sector_matched, amount, sum), y = amount, fill = status)) +
  geom_col(width = 0.7, color = "white", linewidth = 0.1,position=position_stack(reverse=TRUE)) +
  geom_text(data = sector_summary %>% filter(sector_matched != "not in scope"),
            aes(x = sector_matched, y = total_outstanding, 
                label = percent(match_percentage/100, accuracy = 0.01)),
            inherit.aes = FALSE,
            hjust = -0.2,
            size = 3.5) +
  coord_flip() +
  scale_y_continuous(labels = label_comma(), expand = expansion(mult = c(0, 0.2))) +
  scale_fill_manual(values = c(
    "(In Scope) Matched" = "#14645c",
    "(In Scope) Not Matched" = "#e8594b",
    "Not in Scope" = "#9E9E9E"
  )) +
  labs(
    title = "Breakdown of Matching Coverage by Sector",
    x = "Sector", 
    y = "Outstanding Amount (EUR)", 
    fill = NULL
  ) +
  theme_minimal() +
  theme(
    legend.position = "bottom",
    axis.text.y = element_text(face = "bold"),
    plot.title = element_text(hjust = 0.5, face = "bold", size = 16)
  )

3.4. Calculate PACTA alignment metrics

3.4.1. Calculate alignment for Power & Automotive Sectors by Market Share Approach

3.4.1.1. Sector-level: Technology Mix
  • Function: target_market_share
market_share_targets_portfolio <-
  target_market_share(
    data = matches %>%
      select(-c("sector_matched", "borderline_matched")),
    abcd = ap_abcd,
    scenario = scenario,
    region_isos = region
  )
unique(market_share_targets_portfolio$sector)
## [1] "automotive" "power"
head(market_share_targets_portfolio)
## # A tibble: 6 × 10
##   sector     technology  year region scenario_source metric     production
##   <chr>      <chr>      <int> <chr>  <chr>           <chr>           <dbl>
## 1 automotive electric    2020 global demo_2020       projected     145649.
## 2 automotive electric    2020 global demo_2020       target_cps    145649.
## 3 automotive electric    2020 global demo_2020       target_sds    145649.
## 4 automotive electric    2020 global demo_2020       target_sps    145649.
## 5 automotive electric    2021 global demo_2020       projected     147480.
## 6 automotive electric    2021 global demo_2020       target_cps    148314.
## # ℹ 3 more variables: technology_share <dbl>, scope <chr>,
## #   percentage_of_initial_production_by_scope <dbl>
  • Power Sector
    Note: If you take a look at the graph, you can see portfolio in 2025 is empty. This is because the data projected for 2025 for capacity did not add up to 100%.
data_portfolio <- filter(
  market_share_targets_portfolio,
  scenario_source == "demo_2020",
  sector == "power",
  region == "global",
  metric %in% c("projected", "corporate_economy", "target_sds")
)

qplot_techmix(data_portfolio) +
  ggrepel::geom_label_repel(
    aes(label = paste0(round(technology_share, 3) * 100, "%")),
    min.segment.length = 0,
    position = position_stack(vjust = 0.5),
    show.legend = FALSE
  ) +
  theme(text=element_text(family="sans"))
## The `technology_share` values are plotted for extreme years.
## Do you want to plot different years? E.g. filter . with:`subset(., year %in% c(2020, 2030))`.
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_bar()`).
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_label_repel()`).

  • Automotive Sector
data_portfolio <- filter(
  market_share_targets_portfolio,
  scenario_source == "demo_2020",
  sector == "automotive",
  region == "global",
  metric %in% c("projected", "corporate_economy", "target_sds")
)

qplot_techmix(data_portfolio) +
  ggrepel::geom_label_repel(
    aes(label = paste0(round(technology_share, 3) * 100, "%")),
    min.segment.length = 0,
    position = position_stack(vjust = 0.5),
    show.legend = FALSE
  ) +
  theme(text=element_text(family="sans"))
## The `technology_share` values are plotted for extreme years.
## Do you want to plot different years? E.g. filter . with:`subset(., year %in% c(2020, 2030))`.

3.4.1.2. Technology-level Production Volume Trajectory
  • Power Sector
#Renewables
data_portfolio <- filter(
  market_share_targets_portfolio,
  sector == "power",
  technology == "renewablescap",
  region == "global",
  scenario_source == "demo_2020"
)

data_renewablescap <- data_portfolio %>% 
  dplyr::filter(year == min(.data$year) + 5) %>% 
  dplyr::rename(value = "percentage_of_initial_production_by_scope")

qplot_trajectory(data_portfolio) +
  ggrepel::geom_text_repel(
    aes(label = paste0(round(value, 3) * 100, "%")),
    data = data_renewablescap
  ) +
  theme(text=element_text(family="sans"))

#Coal
data_portfolio <- filter(
  market_share_targets_portfolio,
  sector == "power",
  technology == "coalcap",
  region == "global",
  scenario_source == "demo_2020"
)

data_coalcap <- data_portfolio %>% 
  dplyr::filter(year == min(.data$year) + 5) %>% 
  dplyr::rename(value = "percentage_of_initial_production_by_scope")

qplot_trajectory(data_portfolio) +
  ggrepel::geom_text_repel(
    aes(label = paste0(round(value, 3) * 100, "%")),
    data = data_coalcap
  )+
  theme(text=element_text(family="sans"))

  • Automotive Sector
#Electric
data_portfolio <- filter(
  market_share_targets_portfolio,
  sector == "automotive",
  technology == "electric",
  region == "global",
  scenario_source == "demo_2020"
)

data_electric_auto <- data_portfolio %>% 
  dplyr::filter(year == min(.data$year) + 5) %>% 
  dplyr::rename(value = "percentage_of_initial_production_by_scope")

qplot_trajectory(data_portfolio) +
  ggrepel::geom_text_repel(
    aes(label = paste0(round(value, 3) * 100, "%")),
    data = data_electric_auto
  )+
  theme(text=element_text(family="sans"))

#Combustion engine
data_portfolio <- filter(
  market_share_targets_portfolio,
  sector == "automotive",
  technology == "ice",
  region == "global",
  scenario_source == "demo_2020"
)

data_ice_auto <- data_portfolio %>% 
  dplyr::filter(year == min(.data$year) + 5) %>% 
  dplyr::rename(value = "percentage_of_initial_production_by_scope")

qplot_trajectory(data_portfolio) +
  ggrepel::geom_text_repel(
    aes(label = paste0(round(value, 3) * 100, "%")),
    data = data_ice_auto
  )+
  theme(text=element_text(family="sans"))

3.4.2 Calculate alignment for Cement and Steel Sectors by Sectoral Decarbonization Approach

  • Funtion: target_sda
sda_targets_portfolio <-
  target_sda(
    data = matches,
    abcd = ap_abcd,
    co2_intensity_scenario = co2,
    region_isos = region)
## Warning: Removing rows in abcd where `emission_factor` is NA
unique(sda_targets_portfolio$sector)
## [1] "cement" "steel"
head(sda_targets_portfolio)
## # A tibble: 6 × 6
##   sector  year region             scenario_source emission_factor_metric
##   <chr>  <dbl> <chr>              <chr>           <chr>                 
## 1 cement  2020 advanced economies demo_2020       projected             
## 2 cement  2020 developing asia    demo_2020       projected             
## 3 cement  2020 global             demo_2020       projected             
## 4 cement  2021 advanced economies demo_2020       projected             
## 5 cement  2021 developing asia    demo_2020       projected             
## 6 cement  2021 global             demo_2020       projected             
## # ℹ 1 more variable: emission_factor_value <dbl>
  • Cement
data_portfolio <- filter(
  sda_targets_portfolio,
  sector == "cement",
  region == "global"
)

data_cement <- data_portfolio %>% 
  dplyr::filter(year == min(.data$year) + 5) %>% 
  dplyr::mutate(
    year = as.Date(strptime(as.character(year), "%Y")),
    label = pacta.loanbook::to_title(emission_factor_metric)
  )

qplot_emission_intensity(data_portfolio) +
  ggrepel::geom_text_repel(
    aes(label = round(emission_factor_value, 3)),
    data = data_cement,
    show.legend = FALSE
  )+
  theme(text=element_text(family="sans"))

  • Steel
data_portfolio <- filter(
  sda_targets_portfolio,
  sector == "steel",
  region == "global"
)

data_steel <- data_portfolio %>% 
  dplyr::filter(year == min(.data$year) + 5) %>% 
  dplyr::mutate(
    year = as.Date(strptime(as.character(year), "%Y")),
    label = pacta.loanbook::to_title(emission_factor_metric)
  )

qplot_emission_intensity(data_portfolio) +
  ggrepel::geom_text_repel(
    aes(label = round(emission_factor_value, 3)),
    data = data_steel,
    show.legend = FALSE
  )+
  theme(text=element_text(family="sans"))