What is (are) your main question(s)? What is your story? What does the final graphic show?
For stock investors in South Korea, checking how the U.S. market closed the previous day is almost a daily routine upon waking up. This habit is not limited to those who invest in the U.S. market; even for those focused on the Korean market, the U.S. market serves as a barometer of the global economic flow and an indicator for the next day’s performance in the Korean stock market.
Through the analysis and graphical representation of the correlation between the KOSPI index and the S&P 500 index, we aim to provide investors with valuable insights.
Additionally, given the persistent undervaluation of the Korean stock market, often referred to as the “Korea Discount,” we hope to offer a data-driven resource for reflecting on the future direction of Korea’s stock market.
Explain where the data came from, what agency or company made it, how it is structured, what it shows, etc.
The data used in this analysis comes from Yahoo Finance, a trusted source for historical financial data that includes stock market prices, indices, and various economic indicators. The data is provided through the tidyquant R package, which enables easy access to financial market data from Yahoo Finance via an API.
KOSPI: The KOSPI (Korea Composite Stock Price Index) data is sourced from Yahoo Finance under the ticker symbol ^KS11. This index represents the performance of the Korean stock market, specifically the stock prices of companies listed on the KOSPI market of the Korea Exchange (KRX), formerly the Korea Stock Exchange (KSE).
S&P 500: The S&P 500 data is obtained under the ticker symbol ^GSPC. The S&P 500 Index is a benchmark representing the performance of 500 large publicly traded companies in the United States and is a widely recognized indicator of U.S. stock market performance.
The other two major U.S. stock indices, the Dow Jones Industrial Average and the Nasdaq Composite, are also well known. We chose the S&P 500 because, like the KOSPI, it is a market-capitalization-weighted index, so we expected a comparison between the KOSPI and the S&P 500 to be the most meaningful.
In this study, we focus on the daily adjusted closing prices of these indices. These adjusted values are key for performing return calculations and analyzing market trends.
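As a small illustration of the return calculations mentioned above, the following sketch computes simple and log daily returns from a column of adjusted closes. The prices are made-up numbers rather than real index data, and the column names `simple_return` and `log_return` are hypothetical. Base R is used so that the example runs without any packages.

```r
# Illustrative adjusted closes (made-up numbers, not real index data)
prices <- data.frame(
  date     = as.Date("2024-01-02") + 0:4,
  adjusted = c(100, 102, 101, 103, 106)
)

# Simple return: P_t / P_(t-1) - 1; the first day has no prior price, so it is NA
prices$simple_return <- c(NA, prices$adjusted[-1] / head(prices$adjusted, -1) - 1)

# Log return: log(P_t) - log(P_(t-1))
prices$log_return <- c(NA, diff(log(prices$adjusted)))

print(prices)
```

Log returns add up over time, so their sum over the whole period equals the log of the total price change.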
Describe and show how you cleaned and reshaped the data
library(tidyquant)  # provides tq_get() for downloading Yahoo Finance data
kospi_data <- tq_get("^KS11", from = "1980-01-01", to = "2024-12-06")
sp500_data <- tq_get("^GSPC", from = "1980-01-01", to = "2024-12-06")
Describe and show how you created the first figure. Why did you choose this figure type?
The first figure is a line graph of the adjusted closing prices of the KOSPI and S&P 500 indices over time, with vertical lines marking critical economic events.
A line graph was chosen because it visualizes time series data effectively: it allows easy comparison of the movements of the KOSPI and S&P 500 over time, highlights the relation between global economic events and market performance, and makes it easier to identify trends, fluctuations, and the overall direction of the indices.
# Download the S&P 500 and KOSPI data. The KOSPI base date is January 4, 1980, but Yahoo Finance
# appears to provide data only from December 1996, so the start date was arbitrarily set to January 1995.
kospi_data <- tq_get("^KS11", from = "1995-01-01", to = "2024-12-06")
sp500_data <- tq_get("^GSPC", from = "1995-01-01", to = "2024-12-06")
ggplot() +
  geom_line(data = kospi_data, aes(x = date, y = adjusted, color = "KOSPI"), linewidth = 0.5) +    # KOSPI line
  geom_line(data = sp500_data, aes(x = date, y = adjusted, color = "S&P 500"), linewidth = 0.5) +  # S&P 500 line
  labs(
    title = "Adjusted Closing Prices of KOSPI and S&P 500 with Key Events",
    x = "Date",
    y = "Adjusted Closing Price",
    color = "Index"
  ) +
  theme_minimal() +
  scale_color_manual(values = c("KOSPI" = "blue", "S&P 500" = "red")) +
  # Mark major economic events with dashed vertical lines
  geom_vline(xintercept = as.Date("2000-03-10"), color = "black", linetype = "dashed") +
  geom_vline(xintercept = as.Date("2008-09-15"), color = "black", linetype = "dashed") +
  geom_vline(xintercept = as.Date("2020-03-11"), color = "black", linetype = "dashed") +
  geom_vline(xintercept = as.Date("2024-12-03"), color = "black", linetype = "dashed") +
  geom_vline(xintercept = as.Date("1997-11-01"), color = "black", linetype = "dashed") +
  # Label each event next to its line
  annotate("text", x = as.Date("2000-03-10"), y = 2000,
           label = "Dotcom Bubble", angle = 45, vjust = 0.5, hjust = 1, size = 3) +
  annotate("text", x = as.Date("2008-09-15"), y = 2000,
           label = "2008 Financial Crisis", angle = 0, vjust = 0.5, hjust = 1, size = 3) +
  annotate("text", x = as.Date("2020-03-11"), y = 2000,
           label = "COVID-19 Pandemic", angle = 45, vjust = 0.5, hjust = 1, size = 3) +
  annotate("text", x = as.Date("2024-12-03"), y = 2000,
           label = "2024 Korean Martial Law", angle = 45, vjust = 0.5, hjust = 1, size = 3) +
  annotate("text", x = as.Date("1997-11-01"), y = 2000,
           label = "1997 IMF Crisis", angle = 45, vjust = 0.5, hjust = 1, size = 3) +
  theme(
    legend.position = "top",
    legend.title = element_blank(),
    legend.text = element_text(size = 10)
  )
ggsave("figure_1.png", width = 10, height = 6, dpi = 300)
library(dplyr)
library(ggplot2)
library(lubridate)
# Join the two indices on the trading days common to both markets
daily_data <- kospi_data %>%
  select(date, adjusted) %>%
  rename(kospi_adjusted = adjusted) %>%
  inner_join(
    sp500_data %>%
      select(date, adjusted) %>%
      rename(sp500_adjusted = adjusted),
    by = "date"
  )
# Correlation analysis to provide investment insight
correlation_value <- cor(daily_data$kospi_adjusted, daily_data$sp500_adjusted, use = "complete.obs")
ggplot(daily_data, aes(x = sp500_adjusted, y = kospi_adjusted)) +
  geom_point(alpha = 0.5, color = "skyblue", na.rm = TRUE) +
  geom_smooth(method = "lm", formula = y ~ x, color = "red", se = FALSE, na.rm = TRUE) +
  labs(
    title = paste("Daily Correlation between KOSPI and S&P 500:", round(correlation_value, 2)),
    x = "S&P 500 Adjusted Close",
    y = "KOSPI Adjusted Close"
  ) +
  theme_minimal()
ggsave("figure_2.png", width = 10, height = 6, dpi = 300)
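The correlation above is computed on price levels. One possible complement, which is not part of the original analysis, is to correlate daily log returns as well, because two series that both trend over time can show a level correlation that differs substantially from their day-to-day co-movement. The sketch below uses two synthetic, independently generated random walks (the names `index_a` and `index_b` are hypothetical), so the return correlation should be close to zero by construction, whatever the level correlation turns out to be.

```r
set.seed(42)  # reproducible synthetic data; these are not real index values
n <- 1000

# Two independent random walks with upward drift (hypothetical "indices")
index_a <- 100 * exp(cumsum(rnorm(n, mean = 0.0005, sd = 0.01)))
index_b <- 100 * exp(cumsum(rnorm(n, mean = 0.0005, sd = 0.01)))

# Correlation of price levels (what cor() on adjusted closes measures)
level_cor <- cor(index_a, index_b)

# Correlation of daily log returns (day-to-day co-movement)
return_cor <- cor(diff(log(index_a)), diff(log(index_b)))

cat("Level correlation: ", round(level_cor, 2), "\n")
cat("Return correlation:", round(return_cor, 2), "\n")
```

Applied to the joined data, the same idea would be `cor(diff(log(daily_data$kospi_adjusted)), diff(log(daily_data$sp500_adjusted)), use = "complete.obs")`, assuming `daily_data` is sorted by date.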
# Restrict both indices to the period from the start of 2020 onward and join them by date
pandemic_data <- kospi_data %>%
  filter(date >= as.Date("2020-01-01")) %>%
  select(date, adjusted) %>%
  rename(kospi_index = adjusted) %>%
  inner_join(
    sp500_data %>%
      filter(date >= as.Date("2020-01-01")) %>%
      select(date, adjusted) %>%
      rename(sp500_index = adjusted),
    by = "date"
  )
# The correlation is 0.79, a fairly strong positive correlation. However, Figure 1 shows the two indices
# diverging after the pandemic, so we departed from the original research design to focus on this gap.
# The difference between the two indices is plotted as a new graph.
pandemic_data <- pandemic_data %>%
  mutate(index_difference = sp500_index - kospi_index)
ggplot(pandemic_data, aes(x = date, y = index_difference)) +
  geom_line(color = "purple", linewidth = 0.5) +  # difference over time
  geom_hline(yintercept = 0) +
  labs(
    title = "Difference between S&P 500 and KOSPI (Pandemic Onwards)",
    x = "Date",
    y = "Index Difference (S&P 500 - KOSPI)"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1)
  )
ggsave("figure_3.png", width = 10, height = 6, dpi = 300)
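The raw difference above subtracts index points that are quoted on different scales. A possible alternative, offered here only as a sketch and not as the method used in this report, is to rebase both indices to 100 on a common start date, so that the gap is measured in percentage points of cumulative growth. The numbers below are made up, and `rebase()` and the `*_rebased` columns are hypothetical names.

```r
# Made-up index levels on different scales (not real data)
idx <- data.frame(
  date        = as.Date("2020-01-02") + 0:3,
  kospi_index = c(2000, 2100, 1900, 2200),
  sp500_index = c(3000, 3300, 3150, 3600)
)

# Rebase a series to 100 on its first observation (assumes the first value is not missing)
rebase <- function(x) 100 * x / x[1]

idx$kospi_rebased <- rebase(idx$kospi_index)
idx$sp500_rebased <- rebase(idx$sp500_index)

# Gap in percentage points of cumulative growth since the base date
idx$rebased_gap <- idx$sp500_rebased - idx$kospi_rebased

print(idx)
```

With real data, the rebased columns could be plotted in place of `index_difference`, so that the size of the gap does not depend on the starting level of either index.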
In showing the figures that you created, describe why you designed it the way you did. Why did you choose those colors, fonts, and other design elements? Does it convey truth?
Figure 1: The line colors are blue for the KOSPI and red for the S&P 500, so that the two indices are clearly differentiated and their movements are easy to compare. Dashed vertical lines mark key economic events, and a text annotation names each one: the 1997 IMF Crisis, the Dotcom Bubble, the 2008 Financial Crisis, the COVID-19 Pandemic, and the 2024 Korean Martial Law. These labels give the viewer the context needed to connect each event with the changes in the indices. The design conveys the truth not just by listing the data, but by showing it alongside the events that help explain it.
Figure 2: A scatter plot of the daily adjusted closing prices of the KOSPI against those of the S&P 500. Alpha transparency (alpha = 0.5) makes the points semi-transparent, which keeps overlapping points visible and prevents a large dataset from cluttering the plot. The skyblue points form a calm background, and the red linear regression line stands out against them, making the overall relationship easy to spot. The correlation coefficient is displayed in the title as a quantitative measure of that relationship. The graph conveys the correlation between the two indices, with the regression line emphasizing the overall trend.
Figure 3: A line graph of the difference between the S&P 500 and the KOSPI from January 2020 onward. Purple was chosen to avoid confusion, because blue and red were already used for the KOSPI and the S&P 500 in Figure 1. Using a separate color makes it clear that the line shows the difference rather than either index. The graph conveys how the gap between the two indices has changed since the pandemic began, with the purple line showing its fluctuation over time.
Conclusion: Figure 1 shows the overall movement of the two indices and how they moved around widely known major events. With the broad trends of the two indices established, we moved on to the correlation analysis in Figure 2. The correlation was 0.79, a strong positive correlation. In other words, roughly three decades of data suggest that when the S&P 500 rises, the KOSPI has tended to rise as well, although a correlation of price levels does not guarantee that this pattern will continue or that investing on it is safe. Figure 1 also shows that after the pandemic the S&P 500 continued to rise while the KOSPI was sluggish. Figure 3 examines the difference between the two indices and confirms that it has been widening. This points to a topic for further inquiry: what problems the KOSPI market faces from a macroeconomic point of view. It raises the need to analyze and address the causes of the ever-present “Korea Discount,” the limited size of the domestic market, the export-dependent economic structure, and especially political instability such as the recent declaration of emergency martial law.