This report investigates the dependency between the stock prices of Nvidia and AMD, two leading companies in the GPU market. Using copula techniques, we model the joint dependence of their stock returns while accounting for their individual marginal distributions. This approach captures potential tail dependencies and non-linear relationships that traditional correlation measures might overlook.
library(fitdistrplus)
library(copula)
library(VineCopula)
library(ggplot2)
library(summarytools)
amd_data <- read.csv("/home/dell/R/midterm/MacroTrends_Data_Download_AMD.csv", header = TRUE)
nvda_data <- read.csv("/home/dell/R/midterm/MacroTrends_Data_Download_NVDA.csv", header = TRUE)
amd_data$date <- as.Date(amd_data$date, format = "%Y-%m-%d")
nvda_data$date <- as.Date(nvda_data$date, format = "%Y-%m-%d")
start_date <- as.Date("2020-01-01")
end_date <- as.Date("2023-03-16")
amd <- amd_data[amd_data$date >= start_date & amd_data$date <= end_date, ]
nvda <- nvda_data[nvda_data$date >= start_date & nvda_data$date <= end_date, ]
Daily log returns are used instead of raw prices because return series are typically much closer to stationary than price levels, which makes statistical analysis meaningful. Log returns are scale-independent, which eases comparison between assets, and they tend to have a more stable variance than prices. They are also time-additive (a multi-day log return is the sum of the daily log returns), which makes them better suited to modeling dependence and joint distributions.
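The time-additivity property can be checked on a toy series. This is a minimal sketch; the `prices` vector is made up for illustration and is not taken from the AMD or NVIDIA data.

```r
# Hypothetical closing prices (illustrative only, not from the data set)
prices <- c(100, 104, 101, 108)

# Daily log returns, computed as in the report
log_ret <- diff(log(prices))

# Summing the daily log returns gives the log return over the whole period
total_from_sum <- sum(log_ret)
total_direct   <- log(prices[length(prices)] / prices[1])

# Simple returns, by contrast, do not add up across days
simple_ret   <- diff(prices) / head(prices, -1)
simple_total <- prices[length(prices)] / prices[1] - 1

all.equal(total_from_sum, total_direct)   # log returns are additive
sum(simple_ret) - simple_total            # non-zero gap for simple returns
```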
amd_adj <- amd$close
nvda_adj <- nvda$close
amd_returns <- diff(log(amd_adj))
nvda_returns <- diff(log(nvda_adj))
returns_data <- data.frame(
Date = amd$date[-1],
AMD_Returns = amd_returns,
NVDA_Returns = nvda_returns
)
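The data frame above pairs the two return vectors by position, which is only valid if both files contain exactly the same trading dates. A defensive variant, sketched here on two small made-up frames (`amd_demo` and `nvda_demo` are hypothetical, as are their prices), joins on the date column before computing returns.

```r
# Hypothetical toy inputs; the second frame is missing 2020-01-03
amd_demo <- data.frame(
  date  = as.Date(c("2020-01-02", "2020-01-03", "2020-01-06")),
  close = c(10, 11, 12)
)
nvda_demo <- data.frame(
  date  = as.Date(c("2020-01-02", "2020-01-06", "2020-01-07")),
  close = c(20, 21, 22)
)

# Inner join on date keeps only days present in both series
prices <- merge(amd_demo, nvda_demo, by = "date", suffixes = c("_amd", "_nvda"))
prices <- prices[order(prices$date), ]

# Log returns computed on the aligned prices
returns_demo <- data.frame(
  Date         = prices$date[-1],
  AMD_Returns  = diff(log(prices$close_amd)),
  NVDA_Returns = diff(log(prices$close_nvda))
)
returns_demo
```

If the two files already share identical dates, the join leaves the data unchanged, so the check costs nothing.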
hist(amd_returns, breaks = 30, main = "AMD Daily Returns", xlab = "Log Returns")
hist(nvda_returns, breaks = 30, main = "NVIDIA Daily Returns", xlab = "Log Returns")
The histograms of daily log returns for AMD and NVIDIA each show a
roughly symmetric, bell-shaped distribution centered around zero.
However, compared to a Normal distribution, there are noticeably more
extreme values in both tails.
qqnorm(amd_returns, main = "Q-Q Plot for AMD Returns")
qqline(amd_returns)
qqnorm(nvda_returns, main = "Q-Q Plot for NVIDIA Returns")
qqline(nvda_returns)
In both Q-Q plots, the points in the lower-left and upper-right corners fall away from the reference line. The empirical quantiles are more extreme than the Normal distribution predicts, which confirms that both return series have heavy tails.
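A numeric counterpart to the Q-Q plots is the sample excess kurtosis, which is near zero for Normal data and positive for heavy-tailed data. The sketch below uses a hypothetical helper, `excess_kurtosis`, and synthetic quantile grids rather than the actual return series.

```r
# Hypothetical helper: moment-based sample excess kurtosis
excess_kurtosis <- function(x) {
  z <- x - mean(x)
  mean(z^4) / mean(z^2)^2 - 3
}

# Synthetic, deterministic stand-ins (not the AMD/NVIDIA returns)
p        <- ppoints(2000)
x_normal <- qnorm(p)        # Normal-shaped sample
x_heavy  <- qt(p, df = 5)   # heavy-tailed sample

k_normal <- excess_kurtosis(x_normal)
k_heavy  <- excess_kurtosis(x_heavy)
c(normal = k_normal, heavy = k_heavy)
```

Applied to the report's data, the same helper could be called as `excess_kurtosis(amd_returns)`; a clearly positive value would agree with the Q-Q plots.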
Heavy tails in the data led us to choose the Student’s t-distribution, as it better captures extreme values than the Normal distribution.
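For reference, the location-scale form of the Student's t-distribution can be written as follows, where $f_\nu$ and $F_\nu$ denote the density and distribution function of the standard t with $\nu$ degrees of freedom. The symbols $m$, $s$, and $\nu$ correspond to the `m`, `s`, and `df` arguments in the code.

$$
f(x \mid m, s, \nu) = \frac{1}{s}\, f_\nu\!\left(\frac{x - m}{s}\right), \qquad
F(x \mid m, s, \nu) = F_\nu\!\left(\frac{x - m}{s}\right), \qquad
F^{-1}(p \mid m, s, \nu) = m + s\, F_\nu^{-1}(p)
$$

Note that $s$ is a scale parameter, not the standard deviation: for $\nu > 2$ the variance is $s^2\,\nu/(\nu - 2)$.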
ddt_ls <- function(x, m, s, df) { dt((x - m) / s, df) / s }
pdt_ls <- function(q, m, s, df) { pt((q - m) / s, df) }
qdt_ls <- function(p, m, s, df) { qt(p, df) * s + m }
fit_amd_t <- fitdist(data = amd_returns, distr = "dt_ls", start = list(m = mean(amd_returns), s = sd(amd_returns), df = 5))
fit_nvda_t <- fitdist(data = nvda_returns, distr = "dt_ls", start = list(m = mean(nvda_returns), s = sd(nvda_returns), df = 5))
| Parameter | AMD Estimate | AMD Std. Error | NVIDIA Estimate | NVIDIA Std. Error |
|---|---|---|---|---|
| m | 0.00056906 | 0.00081523 | 0.00253001 | 0.00085504 |
| s | 0.02541353 | 0.00089315 | 0.02684952 | 0.00089060 |
| df | 4.77887335 | 0.66512249 | 5.23219381 | 0.73697972 |
| Statistic | AMD | NVIDIA |
|---|---|---|
| Log-Likelihood | 2655.730 | 2609.796 |
| AIC | -5305.460 | -5213.592 |
| BIC | -5289.938 | -5198.070 |
| | m | s | df |
|---|---|---|---|
| m | 1.000000000 | 0.008037739 | 0.008940574 |
| s | 0.008037739 | 1.000000000 | 0.703536243 |
| df | 0.008940574 | 0.703536243 | 1.000000000 |
| | m | s | df |
|---|---|---|---|
| m | 1.00000000 | -0.03249407 | -0.03744379 |
| s | -0.03249407 | 1.00000000 | 0.67552588 |
| df | -0.03744379 | 0.67552588 | 1.00000000 |
The fitted Student’s t-distribution captures the heavy tails of both assets: the estimated degrees of freedom are low (about 4.8 for AMD and 5.2 for NVIDIA), and the fit achieves better AIC/BIC than the normal distribution. This supports its use as the marginal distribution before applying the copula model.
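A comparison against the normal distribution of the kind described above could be sketched as follows. This version uses only base R and a synthetic heavy-tailed sample, so the numbers it prints are illustrative and are not the report's results. The `negll_t` function and the log-parameterisation of `s` and `df` are choices made for this sketch.

```r
# Synthetic, deterministic heavy-tailed sample (stand-in for a return series)
x <- 0.02 * qt(ppoints(1000), df = 4)
n <- length(x)

# Normal fit: the MLE is available in closed form
mu_hat  <- mean(x)
sd_hat  <- sqrt(mean((x - mu_hat)^2))
ll_norm <- sum(dnorm(x, mu_hat, sd_hat, log = TRUE))

# Location-scale t fit: numerical MLE; s and df are optimised on the log
# scale so that they stay positive
negll_t <- function(par) {
  m  <- par[1]
  s  <- exp(par[2])
  df <- exp(par[3])
  -sum(dt((x - m) / s, df = df, log = TRUE) - log(s))
}
opt  <- optim(c(mu_hat, log(sd_hat), log(5)), negll_t,
              control = list(maxit = 2000))
ll_t <- -opt$value

# Information criteria: 2 parameters for the normal, 3 for the t
aic <- c(normal = 2 * 2 - 2 * ll_norm, t = 2 * 3 - 2 * ll_t)
bic <- c(normal = 2 * log(n) - 2 * ll_norm, t = 3 * log(n) - 2 * ll_t)
rbind(AIC = aic, BIC = bic)
```

With the report's own objects, an equivalent check would fit `"norm"` with `fitdist` and compare its `aic` and `bic` fields with those of `fit_amd_t` and `fit_nvda_t`.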
kendall_tau <- cor(amd_returns, nvda_returns, method = "kendall")
spearman_rho <- cor(amd_returns, nvda_returns, method = "spearman")
kendall_tau
## [1] 0.6434669
spearman_rho
## [1] 0.8282584
Kendall’s tau (0.64) and Spearman’s rho (0.83) both indicate strong positive dependence between AMD and NVIDIA returns, supporting the use of a copula to model their joint behavior.
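For elliptical copulas (Gaussian and t), Kendall's tau and the copula correlation parameter are linked by tau = (2 / pi) * asin(rho). Inverting this relation gives a quick moment-based value for rho, which is useful as a sanity check on a fitted copula. It is not the maximum-likelihood estimate that `BiCopEst` computes, so the two numbers need not match exactly.

```r
# Kendall's tau as reported above
tau <- 0.6434669

# Inversion of tau = (2 / pi) * asin(rho), valid for elliptical copulas
rho_implied <- sin(pi * tau / 2)
rho_implied

# Round trip: mapping rho back to tau recovers the input
tau_back <- (2 / pi) * asin(rho_implied)
all.equal(tau, tau_back)
```

The implied correlation comes out at roughly 0.85.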
u_amd <- pobs(as.matrix(amd_returns))[, 1]
u_nvda <- pobs(as.matrix(nvda_returns))[, 1]
plot(u_amd, u_nvda, main = "Pseudo-Observations: AMD vs NVIDIA", xlab = "u_amd", ylab = "u_nvda", pch = 16, cex = 0.5)
The pseudo-observations show strong positive dependence, with points clustering in both the lower-left and upper-right corners. This pattern suggests roughly symmetric tail dependence, which makes the t-copula a natural candidate for modeling the joint behavior.
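The pseudo-observations used above are rescaled ranks: each value is replaced by its rank divided by n + 1, so the result lies strictly inside (0, 1). The base R sketch below reproduces that transformation on a simulated series; the series `x` is a stand-in, not the actual returns.

```r
set.seed(7)
# Simulated stand-in for a return series (illustrative only)
x <- 0.02 * rt(500, df = 5)

# Rank-based pseudo-observations: rank / (n + 1)
u <- rank(x, ties.method = "average") / (length(x) + 1)

range(u)   # strictly between 0 and 1
```

Because only ranks are used, the copula is fitted semi-parametrically: the fitted t marginals do not enter this step. A fully parametric alternative would instead transform each series with its fitted distribution function, for example with `pdt_ls` evaluated at the estimated `m`, `s`, and `df`.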
copula_gaussian <- BiCopEst(u1 = u_amd, u2 = u_nvda, family = 1)
copula_t <- BiCopEst(u1 = u_amd, u2 = u_nvda, family = 2)
copula_clayton <- BiCopEst(u1 = u_amd, u2 = u_nvda, family = 3)
copula_gumbel <- BiCopEst(u1 = u_amd, u2 = u_nvda, family = 4)
copula_frank <- BiCopEst(u1 = u_amd, u2 = u_nvda, family = 5)
copula_comparison <- data.frame(
Copula = c("Gaussian", "t-Copula", "Clayton", "Gumbel", "Frank"),
LogLikelihood = c(copula_gaussian$logLik, copula_t$logLik, copula_clayton$logLik, copula_gumbel$logLik, copula_frank$logLik),
AIC = c(copula_gaussian$AIC, copula_t$AIC, copula_clayton$AIC, copula_gumbel$AIC, copula_frank$AIC),
BIC = c(copula_gaussian$BIC, copula_t$BIC, copula_clayton$BIC, copula_gumbel$BIC, copula_frank$BIC)
)
copula_comparison
## Copula LogLikelihood AIC BIC
## 1 Gaussian 468.3981 -934.7962 -930.1041
## 2 t-Copula 490.6681 -977.3361 -967.9520
## 3 Clayton 419.4093 -836.8185 -832.1264
## 4 Gumbel 433.6611 -865.3222 -860.6301
## 5 Frank 461.1189 -920.2377 -915.5457
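The AIC and BIC columns follow directly from the log-likelihoods via AIC = 2k - 2 logLik and BIC = k log(n) - 2 logLik, where k is the number of copula parameters. The sketch below recomputes them from the table. The sample size n = 806 is an assumption inferred from the reported BIC values; it is not stated in the report.

```r
# Log-likelihoods copied from the comparison table above
loglik <- c(Gaussian = 468.3981, t = 490.6681, Clayton = 419.4093,
            Gumbel = 433.6611, Frank = 461.1189)

# Number of free parameters: the t-copula has two (correlation and df)
k <- c(Gaussian = 1, t = 2, Clayton = 1, Gumbel = 1, Frank = 1)

# ASSUMPTION: n is inferred from the reported BIC values, not given in the text
n <- 806

aic <- 2 * k - 2 * loglik
bic <- k * log(n) - 2 * loglik
round(cbind(AIC = aic, BIC = bic), 4)
```

The extra parameter costs the t-copula 2 AIC points and about 6.7 BIC points, yet it still ranks first on both criteria.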
Based on the comparison of log-likelihood, AIC, and BIC values, the Student’s t-copula provides the best fit for the joint distribution of AMD and NVIDIA returns. It has the highest log-likelihood (490.67) and the lowest AIC (-977.34) and BIC (-967.95) among all the copulas tested, so the improvement in fit outweighs the penalty for its additional degrees-of-freedom parameter. This indicates that the t-copula captures the dependence structure more accurately, particularly the symmetric tail dependence observed in the pseudo-observations. The Gaussian copula performed second best but has no tail dependence. The Archimedean copulas fit worse, likely because none of them can model upper and lower tail dependence simultaneously: Clayton allows only lower-tail dependence, Gumbel only upper-tail dependence, and Frank neither.
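The strength of the t-copula's tail dependence can be quantified with its tail dependence coefficient, lambda = 2 * T_{nu+1}(-sqrt((nu + 1)(1 - rho) / (1 + rho))), which is the same for the upper and lower tails. The sketch below evaluates this formula with a hypothetical helper, `t_tail_dep`. The values `rho = 0.85` and `nu = 5` are placeholders, since the fitted parameters are not printed in this report; in practice one would pass `copula_t$par` and `copula_t$par2`.

```r
# Hypothetical helper: tail dependence coefficient of a bivariate t-copula
t_tail_dep <- function(rho, nu) {
  2 * pt(-sqrt((nu + 1) * (1 - rho) / (1 + rho)), df = nu + 1)
}

# Placeholder parameters (NOT the fitted values from the report)
lambda_t <- t_tail_dep(rho = 0.85, nu = 5)

# As nu grows the t-copula approaches the Gaussian copula,
# whose tail dependence is zero for rho < 1
lambda_near_gauss <- t_tail_dep(rho = 0.85, nu = 1000)

c(t = lambda_t, near_gaussian = lambda_near_gauss)
```

A coefficient well above zero means joint extremes remain likely even far out in the tails, which is the behavior the Gaussian copula cannot reproduce.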
u <- v <- seq(0.01, 0.99, length.out = 50)
grid <- expand.grid(u = u, v = v)
z <- matrix(
BiCopPDF(u1 = grid$u, u2 = grid$v,
family = copula_t$family,
par = copula_t$par,
par2 = copula_t$par2),
nrow = length(u), ncol = length(v)
)
persp(u, v, z,
theta = 30, phi = 30, expand = 0.6,
col = "lightblue",
xlab = "u (AMD)",
ylab = "v (NVIDIA)",
zlab = "Density",
main = "t-Copula Density: 3D Surface Plot")
copula_df <- data.frame(
u = grid$u,
v = grid$v,
density = as.vector(z)
)
ggplot(copula_df, aes(x = u, y = v, z = density)) +
geom_contour_filled(breaks = seq(0, 26, by = 2)) + # fixed density bands from 0 to 26 in steps of 2
scale_fill_viridis_d(option = "plasma") + # discrete viridis "plasma" palette for the bands
labs(
title = "Contour Plot of t-Copula Density",
x = "u (AMD)",
y = "v (NVIDIA)"
) +
theme_minimal()
The contour plot of the fitted Student’s t-copula shows regions of high joint density along the diagonal from the lower-left to the upper-right, reflecting the strong positive dependence between AMD and NVIDIA returns. The density is also concentrated in both the lower-left (joint extreme losses) and upper-right (joint extreme gains) corners, which is the symmetric tail dependence built into the t-copula. The visualization is consistent with the statistical selection of the t-copula and illustrates how the model assigns substantial probability to simultaneous extreme events in both assets.
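The likelihood of simultaneous extreme events can also be illustrated by simulation. The base R sketch below draws from a bivariate t-copula by scaling correlated normals with a shared chi-square mixing variable, then estimates how often both variables fall below their 5% quantile. The parameters `rho` and `nu` are placeholders, not the fitted values.

```r
set.seed(42)
n   <- 20000
rho <- 0.85   # placeholder correlation (NOT the fitted value)
nu  <- 5      # placeholder degrees of freedom (NOT the fitted value)

# Correlated standard normals
z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)

# A shared mixing variable turns the normals into a bivariate t
w <- sqrt(rchisq(n, df = nu) / nu)

# Applying the t distribution function gives uniform margins (the copula sample)
u <- pt(z1 / w, df = nu)
v <- pt(z2 / w, df = nu)

# Frequency of joint lower-tail events versus the independence benchmark
p_joint_low <- mean(u < 0.05 & v < 0.05)
c(simulated = p_joint_low, independent = 0.05^2)
```

Under independence, both variables would fall in their lowest 5% only 0.25% of the time; with strong dependence the simulated frequency is many times higher.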