First, I count how many companies fall into each rating grade of the ESG composite index and of the E, S and G indices, to get an overview of the ESG scores of China’s A-share listed companies. The figure below shows that the ESG composite index and the S (Social) index are mostly rated B, BB and CCC, while the E (Environmental) index is mainly rated C. It is worth noting that the G (Governance) index is mostly rated BBB or higher. This suggests that many companies perform poorly on environmental protection and need to strengthen their environmental awareness and practices, and that the shortfalls in social responsibility and social impact call for more social investment and improvement. Most companies do relatively well on governance structure and management, although there is still room for improvement.
# hist
library(ggplot2)
library(gridExtra)
# Bar chart of how many companies fall into each grade of one rating column
plot_rating_histogram <- function(data, rating_col, title) {
  ggplot(data, aes(x = !!as.symbol(rating_col), fill = !!as.symbol(rating_col))) +
    geom_bar() +
    labs(title = title, x = rating_col, y = "Frequency") +
    theme_minimal() +
    theme(legend.position = "none")
}

# One chart per rating column
esg_histogram <- plot_rating_histogram(merged_data, "grade", "ESG Composite Rating")
e_histogram <- plot_rating_histogram(merged_data, "E_grade", "Environmental (E) Rating")
s_histogram <- plot_rating_histogram(merged_data, "S_grade", "Social (S) Rating")
g_histogram <- plot_rating_histogram(merged_data, "G_grade", "Governance (G) Rating")

# Arrange the four charts in a 2 x 2 grid
combined_plot <- grid.arrange(
  esg_histogram, e_histogram, s_histogram, g_histogram,
  nrow = 2, ncol = 2, top = "Distribution of ESG Ratings"
)
In the second step, I drew a heat map of the correlation coefficients between the numerical indicators to check for correlation among the variables and so avoid multicollinearity in the later regression analysis. In the following steps, I will use G_score, S_score, E_score and score as independent variables, and year_return, volume and circulation_value as dependent variables, to examine the impact of the ESG rating indices on stock return, trading volume and circulating market value. The heat map shows that, among the four independent variables, score is highly correlated with the other three. This is because score is a weighted combination of the other three scores. The correlations among E_score, S_score and G_score themselves are low, so they can be used together in the regression analysis.
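The correlation check described above could be sketched roughly as follows. This is a minimal illustration rather than the code behind the report's heat map: the data frame `demo` is simulated, and the weights used to build `score` are assumptions made up for the example. Only the column names are taken from the text.

```r
library(ggplot2)

# Simulated stand-in for the real data (values are NOT from the report)
set.seed(1)
n <- 200
demo <- data.frame(
  E_score = rnorm(n, mean = 60, sd = 10),
  S_score = rnorm(n, mean = 70, sd = 8),
  G_score = rnorm(n, mean = 80, sd = 6)
)
# Assumed weights, for illustration only
demo$score <- 0.3 * demo$E_score + 0.3 * demo$S_score + 0.4 * demo$G_score
demo$year_return <- rnorm(n)
demo$volume <- rlnorm(n)
demo$circulation_value <- rlnorm(n)

# Pairwise Pearson correlations between all numeric columns
cor_mat <- cor(demo, use = "pairwise.complete.obs")

# Long format: one row per pair of variables
cor_long <- as.data.frame(as.table(cor_mat))
names(cor_long) <- c("var_x", "var_y", "r")

# Heat map with the coefficient printed in each cell
heatmap_plot <- ggplot(cor_long, aes(x = var_x, y = var_y, fill = r)) +
  geom_tile() +
  geom_text(aes(label = round(r, 2)), size = 3) +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red",
                       midpoint = 0, limits = c(-1, 1)) +
  labs(title = "Correlation heat map", x = NULL, y = NULL) +
  theme_minimal()
```

Printing `heatmap_plot` draws the figure. In the simulated data, as in the report, `score` correlates strongly with its three components while the components are close to uncorrelated with each other.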
Then, I aggregate the data by industry classification, calculate each industry's average rate of return and average value of each index in every year, and plot the yearly trends. The results are shown in the following set of graphs. The five charts differ noticeably: not only do the trends differ, but the industries also rank differently on each index.
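The aggregation step could look roughly like the sketch below. The data are simulated, the year range 2015 to 2021 is an assumption chosen for the example, and the column names `industry` and `year` are hypothetical. Only `year_return` and `score` are named in the text.

```r
library(ggplot2)

# Simulated panel: 5 firms per industry, 6 industries, assumed years 2015-2021
set.seed(42)
demo <- expand.grid(firm = 1:5, industry = 1:6, year = 2015:2021)
demo$year_return <- rnorm(nrow(demo), mean = 0.05, sd = 0.2)
demo$score <- runif(nrow(demo), min = 40, max = 90)

# Average return and average composite score per industry and year
industry_year <- aggregate(
  cbind(year_return, score) ~ industry + year,
  data = demo,
  FUN = mean
)

# One line per industry showing the yearly average return
trend_plot <- ggplot(
  industry_year,
  aes(x = year, y = year_return,
      colour = factor(industry), group = industry)
) +
  geom_line() +
  geom_point() +
  labs(title = "Average yearly return by industry",
       x = "Year", y = "Average return", colour = "Industry") +
  theme_minimal()
```

Swapping `y = year_return` for `y = score` (or any other averaged column) gives the corresponding chart for that index.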
Before analyzing the results, I would like to explain the industry classification. In the table, I used numeric codes to represent the industries for easy classification. Codes 1 to 6 stand for the six major industries: Finance, Utilities, Properties, Conglomerates, Industrials and Commerce.
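As a small illustration of this coding, the numeric codes can be turned into a labelled factor so that tables and plot legends show industry names instead of numbers. The vector `industry_code` below is made-up example input, not data from the report.

```r
# Industry names in code order 1 to 6, as listed in the text
industry_labels <- c("Finance", "Utilities", "Properties",
                     "Conglomerates", "Industrials", "Commerce")

# Hypothetical codes for seven companies
industry_code <- c(1, 3, 6, 2, 5, 4, 1)

# Map each code to its industry name
industry <- factor(industry_code, levels = 1:6, labels = industry_labels)

# Number of companies per industry
industry_counts <- table(industry)
```

Because the factor keeps all six levels, industries with no companies in a subset still appear (with a count of zero) in `industry_counts`.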
The returns of the six industries stayed almost in step with one another for most of the period, and only by the end of 2021 did the industries end up with different results. On the composite index, Finance and Properties have the higher scores, while Conglomerates and Utilities have the lowest overall scores, which is probably related to the nature of their business and their operating environment. Looking at the individual indices, the E index increases year by year, which suggests that listed companies are paying more and more attention to environmental governance. The G (Governance) score, by contrast, decreases year by year until 2019, possibly because it started from a very high level and companies later neglected to maintain and improve this area.
The exploratory analysis above reveals some relationships between the ESG indices and industry, time and the market. In the next part, we will run regression analyses on the indices and market quotations to look for more meaningful results.