HW6

4.5.0.1 Homework problem 1:

Your boss knows that they will be asking her about how the implications of the electric vehicle (EV) market and policy choices on the demand for copper and cobalt.

Use the workflow developed in this chapter to import and tidy worksheet 2.3 EV from the dataset.
Use the tidied dataset to come up with 5 compelling data visualizations that illustrate key actionable insights about how policy scenarios, and technological scenarios will impact demand for copper and cobalt.

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(readxl) 
library(here)

here() starts at /Users/zhenglinyi/Desktop/24 spring/sustainable finance/week6

path_to_sheet <- here("data_raw", "CM_Data_Explorer.xlsx")
path_to_sheet

[1] "/Users/zhenglinyi/Desktop/24 spring/sustainable finance/week6/data_raw/CM_Data_Explorer.xlsx"

read_EV_sheet <- partial(
  .f = read_excel,
  path = path_to_sheet,
  sheet = "2.3 EV",
  col_names = FALSE
)

sheet_header <- read_EV_sheet(range = "A4:W5")

New names:
• `` -> `...1`
• `` -> `...2`
• `` -> `...3`
• `` -> `...4`
• `` -> `...5`
• `` -> `...6`
• `` -> `...7`
• `` -> `...8`
• `` -> `...9`
• `` -> `...10`
• `` -> `...11`
• `` -> `...12`
• `` -> `...13`
• `` -> `...14`
• `` -> `...15`
• `` -> `...16`
• `` -> `...17`
• `` -> `...18`
• `` -> `...19`
• `` -> `...20`
• `` -> `...21`
• `` -> `...22`
• `` -> `...23`

sheet_header

# A tibble: 2 × 23
  ...1   ...2 ...3  ...4    ...5  ...6  ...7  ...8  ...9 ...10 ...11 ...12 ...13
  <lgl> <dbl> <lgl> <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <lgl> <chr> <dbl> <dbl>
1 NA       NA NA    State…    NA    NA    NA    NA    NA NA    Anno…    NA    NA
2 NA     2022 NA    2025    2030  2035  2040  2045  2050 NA    2025   2030  2035
# ℹ 10 more variables: ...14 <dbl>, ...15 <dbl>, ...16 <dbl>, ...17 <lgl>,
#   ...18 <chr>, ...19 <dbl>, ...20 <dbl>, ...21 <dbl>, ...22 <dbl>,
#   ...23 <dbl>

sheet_header_processed <- sheet_header |> 
  # transpose the data
  t() |>
  # turn it back into a tibble
  as_tibble() |>
  # make them meaningful
  rename(scenario = V1, year = V2) |>
  # fill scenario down
  fill(scenario) |>
  #insert "Current" at top
  replace_na(list(scenario = "Current Year"))

Warning: The `x` argument of `as_tibble.matrix()` must have unique column names if
`.name_repair` is omitted as of tibble 2.0.0.
ℹ Using compatibility `.name_repair`.

sheet_header_processed

# A tibble: 23 × 2
   scenario                 year 
   <chr>                    <chr>
 1 Current Year             <NA> 
 2 Current Year             2022 
 3 Current Year             <NA> 
 4 Stated policies scenario 2025 
 5 Stated policies scenario 2030 
 6 Stated policies scenario 2035 
 7 Stated policies scenario 2040 
 8 Stated policies scenario 2045 
 9 Stated policies scenario 2050 
10 Stated policies scenario <NA> 
# ℹ 13 more rows

EV_name <- read_EV_sheet(range = "A7") |> 
  pull()

New names:
• `` -> `...1`

EV_name

[1] "Constrained nickel supply"

EV_info <- read_EV_sheet(range = "A8:W19")

New names:
• `` -> `...1`
• `` -> `...2`
• `` -> `...3`
• `` -> `...4`
• `` -> `...5`
• `` -> `...6`
• `` -> `...7`
• `` -> `...8`
• `` -> `...9`
• `` -> `...10`
• `` -> `...11`
• `` -> `...12`
• `` -> `...13`
• `` -> `...14`
• `` -> `...15`
• `` -> `...16`
• `` -> `...17`
• `` -> `...18`
• `` -> `...19`
• `` -> `...20`
• `` -> `...21`
• `` -> `...22`
• `` -> `...23`

EV_info_col_names <- names(EV_info)

sheet_headers_and_col_names <- sheet_header_processed |> 
  add_column(EV_info_col_names = EV_info_col_names)

EV_info_long <- EV_info |> 
  rename(indicator = `...1`) |> 
  pivot_longer(cols = -indicator,
               names_to = "EV_info_col_names") |> 
  add_column(EV_name)

combined_data <- EV_info_long |> 
  left_join(sheet_headers_and_col_names, by = join_by(EV_info_col_names)) |> 
  # filter out what were empty columns (where years are NA) 
  filter(!is.na(year)) |> 
    # convert the year column from character to numeric
    mutate(year = as.integer(year)) |> 
  select(EV_name, indicator, scenario, year, value)

combined_data

# A tibble: 228 × 5
   EV_name                   indicator scenario                    year value
   <chr>                     <chr>     <chr>                      <int> <dbl>
 1 Constrained nickel supply Copper    Current Year                2022  381.
 2 Constrained nickel supply Copper    Stated policies scenario    2025  617.
 3 Constrained nickel supply Copper    Stated policies scenario    2030 1389.
 4 Constrained nickel supply Copper    Stated policies scenario    2035 1736.
 5 Constrained nickel supply Copper    Stated policies scenario    2040 2233.
 6 Constrained nickel supply Copper    Stated policies scenario    2045 2418.
 7 Constrained nickel supply Copper    Stated policies scenario    2050 2313.
 8 Constrained nickel supply Copper    Announced pledges scenario  2025  732.
 9 Constrained nickel supply Copper    Announced pledges scenario  2030 2113.
10 Constrained nickel supply Copper    Announced pledges scenario  2035 3587.
# ℹ 218 more rows

EV_name_range <- "A7"
EV_info_range <- "A8:W19"


EV_name <- read_EV_sheet(range = EV_name_range) |> 
  pull()

New names:
• `` -> `...1`

EV_info <- read_EV_sheet(range = EV_info_range)

New names:
• `` -> `...1`
• `` -> `...2`
• `` -> `...3`
• `` -> `...4`
• `` -> `...5`
• `` -> `...6`
• `` -> `...7`
• `` -> `...8`
• `` -> `...9`
• `` -> `...10`
• `` -> `...11`
• `` -> `...12`
• `` -> `...13`
• `` -> `...14`
• `` -> `...15`
• `` -> `...16`
• `` -> `...17`
• `` -> `...18`
• `` -> `...19`
• `` -> `...20`
• `` -> `...21`
• `` -> `...22`
• `` -> `...23`

EV_info_col_names <- names(EV_info)

EV_info_long <- EV_info |> 
  rename(indicator = `...1`) |> 
  pivot_longer(cols = -indicator,
               names_to = "EV_info_col_names") |> 
  add_column(EV_name)

combined_data <- EV_info_long |> 
  left_join(sheet_headers_and_col_names, by = join_by(EV_info_col_names)) |> 
  # filter out what were empty columns (where years are NA) 
  filter(!is.na(year)) |> 
  # case_when is supercharged if else
  mutate(
    # convert the year column from character to numeric
    year = as.integer(year)
  ) |> 
  select(EV_name, indicator, scenario, year, value)

combined_data

# A tibble: 228 × 5
   EV_name                   indicator scenario                    year value
   <chr>                     <chr>     <chr>                      <int> <dbl>
 1 Constrained nickel supply Copper    Current Year                2022  381.
 2 Constrained nickel supply Copper    Stated policies scenario    2025  617.
 3 Constrained nickel supply Copper    Stated policies scenario    2030 1389.
 4 Constrained nickel supply Copper    Stated policies scenario    2035 1736.
 5 Constrained nickel supply Copper    Stated policies scenario    2040 2233.
 6 Constrained nickel supply Copper    Stated policies scenario    2045 2418.
 7 Constrained nickel supply Copper    Stated policies scenario    2050 2313.
 8 Constrained nickel supply Copper    Announced pledges scenario  2025  732.
 9 Constrained nickel supply Copper    Announced pledges scenario  2030 2113.
10 Constrained nickel supply Copper    Announced pledges scenario  2035 3587.
# ℹ 218 more rows

read_EV_table <-
  function(EV_name_range, EV_info_range) {
    EV_name <-
      read_EV_sheet(range = EV_name_range) |>
      pull()
    
    EV_info <- read_EV_sheet(range = EV_info_range)
    
    EV_info_col_names <- names(EV_info)
    
    EV_info_long <- EV_info |>
      rename(indicator = `...1`) |>
      pivot_longer(cols = -indicator,
                   names_to = "EV_info_col_names") |>
      add_column(EV_name)
    
    combined_data <- EV_info_long |>
      left_join(sheet_headers_and_col_names, by = join_by(EV_info_col_names)) |>
      # filter out what were empty columns (where years are NA)
      filter(!is.na(year)) |>
      # case_when is supercharged if else
      mutate(
        # convert the year column from character to numeric
        year = as.integer(year)
      ) |>
      select(EV_name, indicator, scenario, year, value)
    
    combined_data
  }

nickle_supply <- read_EV_table(
  EV_name_range = "A7",
  EV_info_range = "A8:W19"
)

New names:
New names:
• `` -> `...1`

silicon_rich <- read_EV_table(
  EV_name_range = "A22",
  EV_info_range = "A23:W34"
)

New names:
New names:
• `` -> `...1`

solid_state_batteries <- read_EV_table(
  EV_name_range = "A37",
  EV_info_range = "A38:W49"
)

New names:
New names:
• `` -> `...1`

low_battery_sizes <- read_EV_table(
  EV_name_range = "A52",
  EV_info_range = "A53:W64"
)

New names:
New names:
• `` -> `...1`

limited_battery_size_reduction <- read_EV_table(
  EV_name_range = "A67",
  EV_info_range = "A68:W79"
)

New names:
New names:
• `` -> `...1`

STEPS <- read_EV_table(
  EV_name_range = "A82",
  EV_info_range = "A83:W94"
)

New names:
New names:
• `` -> `...1`

final_EV_table <- nickle_supply |> 
  bind_rows(silicon_rich) |> 
  bind_rows(solid_state_batteries) |>
  bind_rows(low_battery_sizes) |>
  bind_rows(limited_battery_size_reduction)|>
  bind_rows(STEPS)

final_EV_table

# A tibble: 1,368 × 5
   EV_name                   indicator scenario                    year value
   <chr>                     <chr>     <chr>                      <int> <dbl>
 1 Constrained nickel supply Copper    Current Year                2022  381.
 2 Constrained nickel supply Copper    Stated policies scenario    2025  617.
 3 Constrained nickel supply Copper    Stated policies scenario    2030 1389.
 4 Constrained nickel supply Copper    Stated policies scenario    2035 1736.
 5 Constrained nickel supply Copper    Stated policies scenario    2040 2233.
 6 Constrained nickel supply Copper    Stated policies scenario    2045 2418.
 7 Constrained nickel supply Copper    Stated policies scenario    2050 2313.
 8 Constrained nickel supply Copper    Announced pledges scenario  2025  732.
 9 Constrained nickel supply Copper    Announced pledges scenario  2030 2113.
10 Constrained nickel supply Copper    Announced pledges scenario  2035 3587.
# ℹ 1,358 more rows

write_csv(final_EV_table,here("data", "iea_mineral_demand_for_EV.csv"))

cleaned_data <- here("data", "iea_mineral_demand_for_EV.csv") |> 
  read_csv()

Rows: 1368 Columns: 5
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): EV_name, indicator, scenario
dbl (2): year, value

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

cleaned_data

# A tibble: 1,368 × 5
   EV_name                   indicator scenario                    year value
   <chr>                     <chr>     <chr>                      <dbl> <dbl>
 1 Constrained nickel supply Copper    Current Year                2022  381.
 2 Constrained nickel supply Copper    Stated policies scenario    2025  617.
 3 Constrained nickel supply Copper    Stated policies scenario    2030 1389.
 4 Constrained nickel supply Copper    Stated policies scenario    2035 1736.
 5 Constrained nickel supply Copper    Stated policies scenario    2040 2233.
 6 Constrained nickel supply Copper    Stated policies scenario    2045 2418.
 7 Constrained nickel supply Copper    Stated policies scenario    2050 2313.
 8 Constrained nickel supply Copper    Announced pledges scenario  2025  732.
 9 Constrained nickel supply Copper    Announced pledges scenario  2030 2113.
10 Constrained nickel supply Copper    Announced pledges scenario  2035 3587.
# ℹ 1,358 more rows

cobalt_data <- cleaned_data[cleaned_data$indicator == "Cobalt", ]
copper_data <- cleaned_data[cleaned_data$indicator == "Copper", ]
library(ggplot2)

# 1. Line Plot of Copper and Cobalt Demand Over Time by Scenario
metal_demand_line <- ggplot() +
  geom_line(data = cobalt_data, aes(x = year, y = value, color = scenario, linetype = "Cobalt")) +
  geom_line(data = copper_data, aes(x = year, y = value, color = scenario, linetype = "Copper")) +
  labs(title = "Demand Over Time by Scenario", x = "Year", y = "Demand", color = "Scenario", linetype = "Metal") +
  theme_minimal()

metal_demand_line

From the figure we can see that The demand for Copper is higher than Cobalt over all. The most influential scenario is announced pledges. The general trend of demand for cobalt is increasing.

# 2. Bar Chart of Cobalt Demand by Scenario and Year
cobalt_demand_bar <- ggplot() +
  geom_bar(data = cobalt_data, aes(x = year, y = value, fill = scenario), stat = "identity", position = "dodge") +
  labs(title = "Demand for Cobalt by Scenario and Year", x = "Year", y = "Demand", fill = "Scenario") +
  theme_minimal()

cobalt_demand_bar

Warning: Removed 12 rows containing missing values (`geom_bar()`).

From the bar chart we can see demand for cobalt is increasing and the net zero emissions by 2050 scenario has the greatest impact.

# 3. Bar Chart of Copper Demand by Scenario and Year
copper_demand_bar <- ggplot() +
  geom_bar(data = copper_data, aes(x = year, y = value, fill = scenario), stat = "identity", position = "dodge") +
  labs(title = "Demand for Copper by Scenario and Year", x = "Year", y = "Demand", fill = "Scenario") +
  theme_minimal()

copper_demand_bar

Warning: Removed 12 rows containing missing values (`geom_bar()`).

It’s worth noting that the stated policies may cause demand for copper decrease from 2045 to 2050.

library(ggridges)
metal_demand_ridgeline <- ggplot() +
  geom_density_ridges(data = rbind(cobalt_data, copper_data), aes(x = value, y = scenario, fill = indicator), alpha = 0.7) +
  labs(title = "Metal Demand Distribution by Scenario", x = "Density", y = "Scenario", fill = "Metal") +
  theme_minimal()

print(metal_demand_ridgeline)

Picking joint bandwidth of 148

Warning: Removed 24 rows containing non-finite values
(`stat_density_ridges()`).

Desnity of demand for copper is more widely distributed than cobalt.

# 5. Stacked Bar Chart of Policy Impact on Metal Demand
policy_impact_stacked <- ggplot() +
  geom_bar(data = cobalt_data, aes(x = factor(year), y = value, fill = scenario), stat = "identity") +
  geom_bar(data = copper_data, aes(x = factor(year), y = value, fill = scenario), stat = "identity") +
  labs(title = "Policy Impact on Metal Demand", x = "Year", y = "Demand", fill = "Scenario") +
  theme_minimal()

policy_impact_stacked

Warning: Removed 12 rows containing missing values (`position_stack()`).
Removed 12 rows containing missing values (`position_stack()`).

The total demand for copper and cobalt my decrease in 2050 because of stated policies and net zero emissions by 2050 scenario.