Data Analysis Homework: Week 3

Author

Jingning Feng

BlackRock EST ETF Data

library(tidyverse) 
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.0     ✔ purrr   1.0.1
✔ tibble  3.1.8     ✔ dplyr   1.1.0
✔ tidyr   1.3.0     ✔ stringr 1.5.0
✔ readr   2.1.3     ✔ forcats 1.0.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
github_raw_csv_url <- "https://raw.githubusercontent.com/t-emery/sais-susfin_data/main/datasets/blackrock_etf_screener_2022-08-30.csv"

blackrock_etf_data <- read_csv(github_raw_csv_url)
Rows: 393 Columns: 22
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (14): ticker, name, incept_date, net_assets_as_of, asset_class, sub_asse...
dbl  (8): gross_expense_ratio_percent, net_expense_ratio_percent, net_assets...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
blackrock_etf_data |> 
  ggplot(aes(x = net_assets_usd_bn, fill = is_esg)) + 
  geom_histogram(binwidth = 5, alpha = .5, position = "identity") +
  facet_wrap(~region, ncol = 2)

Compared with regular funds, the quantity of ESG funds is much smaller. Most of the ESG funds focus on the North America region.

blackrock_etf_data |> 
  ggplot(aes(y = esg_score)) +
  geom_boxplot()
Warning: Removed 38 rows containing non-finite values (`stat_boxplot()`).

blackrock_etf_data |> 
  ggplot(aes(y = esg_score)) +
  geom_boxplot() +
  facet_wrap(~is_esg)
Warning: Removed 38 rows containing non-finite values (`stat_boxplot()`).

The median MSCI ESG quality score for all BlackRock ETF funds is about 8. The median score for ESG funds is higher than regular bonds. The scores for ESG funds are more concentrated on the upper scale, while the scores for regular funds are more extended.

funds_by_co2 <- blackrock_etf_data |> 
  select(fund_name = name_wo_ishares_etf, is_esg, region, co2_intensity)

funds_by_co2 |> 
  group_by(is_esg) |> 
  calculate_summary_stats(value_var = co2_intensity)
# A tibble: 2 × 8
  is_esg         avg std_dev   min   max pctile_25 median pctile_75
  <chr>        <dbl>   <dbl> <dbl> <dbl>     <dbl>  <dbl>     <dbl>
1 ESG Fund      130.    120. 29.6   738.      73.3   104.      133.
2 Regular Fund  254.    258.  2.55 2119.     110.    187.      320.
funds_by_co2 |> 
  group_by(is_esg, region) |> 
  calculate_summary_stats(value_var = co2_intensity)
`summarise()` has grouped output by 'is_esg'. You can override using the
`.groups` argument.
# A tibble: 9 × 9
  is_esg       region            avg std_dev    min   max pctil…¹ median pctil…²
  <chr>        <chr>           <dbl>   <dbl>  <dbl> <dbl>   <dbl>  <dbl>   <dbl>
1 ESG Fund     Global           145.    65.7  68.7   266.   104.    124.    195.
2 ESG Fund     North America    124.   138.   29.6   738.    71.1   104.    117.
3 Regular Fund Asia Pacific     276.   203.   79.9   837.   116.    214.    333.
4 Regular Fund Europe           185.   191.    3.1   925.   115.    124.    149.
5 Regular Fund Global           289.   332.    9.71 1756.    96.1   180.    343.
6 Regular Fund Kuwait           561.    NA   561.    561.   561.    561.    561.
7 Regular Fund Latin America    394.   142.  247.    658.   331.    352.    411.
8 Regular Fund Middle East an…  506.   374.   76.6   985.   341.    482.    647.
9 Regular Fund North America    230.   228.    2.55 2119.   108.    197.    291.
# … with abbreviated variable names ¹​pctile_25, ²​pctile_75

The average CO2 intensity for ESG funds is 130. The number almost doubles for regular funds, among which Kuwait has the highest average CO2 intensity, and then Middle East and Africa.

Fund Holdings Data

etf_comparison_data_github_url <- "https://raw.githubusercontent.com/t-emery/sais-susfin_data/main/datasets/etf_comparison-2022-10-03.csv"

etf_comparison <- read_csv(etf_comparison_data_github_url)
Rows: 537 Columns: 14
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): ticker, company_name, sector, esg_uw_ow
dbl (7): esg_etf, standard_etf, esg_tilt, esg_tilt_z_score, esg_tilt_rank, e...
lgl (3): in_esg_only, in_standard_only, in_on_index_only

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(etf_comparison)
Rows: 537
Columns: 14
$ ticker              <chr> "PRU", "GIS", "K", "ADP", "ECL", "JCI", "ES", "PEG…
$ company_name        <chr> "PRUDENTIAL FINANCIAL INC", "GENERAL MILLS INC", "…
$ sector              <chr> "Financials", "Consumer Staples", "Consumer Staple…
$ esg_etf             <dbl> 0.5366803, 0.5522180, 0.4534279, 0.6486836, 0.4407…
$ standard_etf        <dbl> 0.10574313, 0.15134370, 0.05920732, 0.31168123, 0.…
$ in_esg_only         <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, F…
$ in_standard_only    <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, F…
$ in_on_index_only    <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, F…
$ esg_tilt            <dbl> 0.4309371, 0.4008743, 0.3942206, 0.3370024, 0.3222…
$ esg_uw_ow           <chr> "Overweight", "Overweight", "Overweight", "Overwei…
$ esg_tilt_z_score    <dbl> -3.396231, -3.159305, -3.106867, -2.655928, -2.539…
$ esg_tilt_rank       <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,…
$ esg_tilt_percentile <dbl> 0.000000000, 0.001865672, 0.003731343, 0.005597015…
$ esg_tilt_quantile_5 <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
etf_comparison |> 
  select(sector) |> 
  distinct()
# A tibble: 11 × 1
   sector                
   <chr>                 
 1 Financials            
 2 Consumer Staples      
 3 Information Technology
 4 Materials             
 5 Industrials           
 6 Utilities             
 7 Energy                
 8 Health Care           
 9 Real Estate           
10 Consumer Discretionary
11 Communication         
etf_comparison |> 
  group_by(in_esg_only, in_standard_only) |> 
  summarise(avg_esg_tilt = mean(esg_tilt, na.rm = TRUE)) 
`summarise()` has grouped output by 'in_esg_only'. You can override using the
`.groups` argument.
# A tibble: 3 × 3
# Groups:   in_esg_only [2]
  in_esg_only in_standard_only avg_esg_tilt
  <lgl>       <lgl>                   <dbl>
1 FALSE       FALSE                  0.0459
2 FALSE       TRUE                  -0.0744
3 TRUE        FALSE                  0.118 

Companies in ESG fund have higher average ESG tilt than companies in standard fund.

etf_comparison |> 
  group_by(sector) |> 
  select(ticker, company_name, sector, esg_tilt) |> 
  add_rank_features(value = esg_tilt, rank_by_lowest = FALSE) |> 
  arrange(sector, esg_tilt_rank)
# A tibble: 537 × 9
# Groups:   sector [11]
   ticker company_name    sector esg_t…¹ esg_t…² esg_t…³ esg_t…⁴ esg_t…⁵ esg_t…⁶
   <chr>  <chr>           <chr>    <dbl>   <dbl>   <dbl>   <dbl>   <int>   <dbl>
 1 ZI     ZOOMINFO TECHN… Commu…  0.114   -1.68        1  0            1 -0.136 
 2 VZ     VERIZON COMMUN… Commu…  0.0867  -1.38        2  0.0417       1 -0.104 
 3 CABO   CABLE ONE INC   Commu…  0.0830  -1.33        3  0.0833       1 -0.0994
 4 EA     ELECTRONIC ART… Commu…  0.0733  -1.22        4  0.125        1 -0.0878
 5 IPG    INTERPUBLIC GR… Commu…  0.0666  -1.15        5  0.167        1 -0.0798
 6 PARA   PARAMOUNT GLOB… Commu…  0.0549  -1.01        6  0.208        2 -0.0658
 7 WBD    WARNER BROS. D… Commu…  0.0303  -0.730       7  0.25         2 -0.0363
 8 DIS    WALT DISNEY     Commu…  0.0124  -0.524       8  0.292        2 -0.0148
 9 DISH   DISH NETWORK C… Commu… -0.0113  -0.254       9  0.333        2  0.0135
10 LUMN   LUMEN TECHNOLO… Commu… -0.0228  -0.121      10  0.375        2  0.0273
# … with 527 more rows, and abbreviated variable names ¹​esg_tilt,
#   ²​esg_tilt_z_score, ³​esg_tilt_rank, ⁴​esg_tilt_percentile,
#   ⁵​esg_tilt_quantile_5, ⁶​esg_tilt_pct_total
etf_comparison |> 
  group_by(sector) |> 
  summarise(avg_esg_tilt = mean(esg_tilt, na.rm = TRUE)) 
# A tibble: 11 × 2
   sector                 avg_esg_tilt
   <chr>                         <dbl>
 1 Communication              -0.0334 
 2 Consumer Discretionary     -0.00748
 3 Consumer Staples            0.00247
 4 Energy                      0.00925
 5 Financials                 -0.00843
 6 Health Care                -0.00606
 7 Industrials                 0.00466
 8 Information Technology      0.0172 
 9 Materials                   0.00526
10 Real Estate                 0.00839
11 Utilities                  -0.0113 

Communication sector has the lowest average ESG tilt score, while information technology sector has the highest.

etf_comparison |> 
  ggplot(aes(x = esg_tilt, fill = in_esg_only)) + 
  geom_histogram(alpha = .7, position = "identity") 
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Most companies have negative ESG tilt. Only a small portion of companies are picked into ESG funds, and they have positive ESG tilt.