Midterm Exam (UTS): Risk Modeling and Theory

Analyze and Visualize

Naftali Brigitta Gunawan

March 21, 2024

Contact	\(\downarrow\)
Email	naftaligunawan@gmail.com
Instagram	https://www.instagram.com/nbrigittag/
RPubs	https://rpubs.com/naftalibrigitta/
Name	Naftali Brigitta Gunawan
Student ID (NIM)	20214920002

Consider a portfolio consisting of five stocks from different sectors in Indonesia and analyze the diversification benefits of combining stocks. Your answer should cover at least the following steps:

a. Use quantmod for downloading stock data.

library(quantmod)

symbols <- c("EXCL.JK", "TLKM.JK", "ISAT.JK", "FREN.JK", "PGAS.JK")
getSymbols(symbols, src = "yahoo", from = "2020-01-01", to = "2024-01-01")

## [1] "EXCL.JK" "TLKM.JK" "ISAT.JK" "FREN.JK" "PGAS.JK"

Notes:
This midterm uses four Indonesian telecommunications stocks and one state-owned gas company stock:
EXCL.JK -> PT XL Axiata Tbk
TLKM.JK -> PT Telekomunikasi Indonesia Tbk
ISAT.JK -> PT Indosat Ooredoo Hutchison Tbk
FREN.JK -> PT Smartfren Telecom Tbk
PGAS.JK -> PT Perusahaan Gas Negara Tbk

b. Define a function calculate_returns() to calculate daily returns from the closing prices of the stocks and apply this function to each stock’s price series

# Define the function to calculate daily returns
calculate_returns <- function(prices) {
  # Calculate the percentage change in prices
  returns <- diff(prices) / lag(prices, 1)
  return(returns)
}

# Assuming you have already downloaded stock price data using quantmod::getSymbols(),
# and stored them in variables like EXCL.JK, TLKM.JK, etc.,
# you can apply the calculate_returns() function to each stock's closing prices.

# Example applying the function to EXCL.JK closing prices
EXCL_returns <- calculate_returns(Cl(EXCL.JK))

# You can repeat the process for each stock in your portfolio
TLKM_returns <- calculate_returns(Cl(TLKM.JK))
ISAT_returns <- calculate_returns(Cl(ISAT.JK))
FREN_returns <- calculate_returns(Cl(FREN.JK))
PGAS_returns <- calculate_returns(Cl(PGAS.JK))

# Note: 'Cl()' is a function from quantmod that extracts the closing prices from the stock object
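
As a quick sanity check on the formula, the same simple-return calculation can be sketched on a toy price vector in base R. The prices and the helper name `simple_returns` are made up for illustration and are not part of the downloaded data.

```r
# Hypothetical closing prices (illustrative only, not real stock data)
toy_prices <- c(100, 110, 99)

# Simple return: (P_t - P_{t-1}) / P_{t-1}
simple_returns <- function(p) diff(p) / head(p, -1)

toy_returns <- simple_returns(toy_prices)
print(toy_returns)
```

On an xts series, `lag(prices, 1)` plays the role of `head(p, -1)`: it aligns each price with the previous observation, so the first return in each series is NA.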

c. Combine the returns of all stocks into a single data frame.

stock_returns <- do.call(merge, lapply(symbols, function(sym) calculate_returns(Cl(get(sym)))))

d. Calculate the correlation matrix of returns using cor().

cor_matrix <- cor(stock_returns, use = "complete.obs")
cor_matrix

##               EXCL.JK.Close TLKM.JK.Close ISAT.JK.Close FREN.JK.Close
## EXCL.JK.Close     1.0000000     0.4658514     0.4997871     0.2942000
## TLKM.JK.Close     0.4658514     1.0000000     0.3245317     0.2536816
## ISAT.JK.Close     0.4997871     0.3245317     1.0000000     0.2470112
## FREN.JK.Close     0.2942000     0.2536816     0.2470112     1.0000000
## PGAS.JK.Close     0.3311006     0.3208621     0.2630150     0.2497122
##               PGAS.JK.Close
## EXCL.JK.Close     0.3311006
## TLKM.JK.Close     0.3208621
## ISAT.JK.Close     0.2630150
## FREN.JK.Close     0.2497122
## PGAS.JK.Close     1.0000000

e. Visualize the correlation matrix using corrplot().

library(corrplot)

corrplot(cor_matrix, method = "number")
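
To summarize the matrix in a single number, one option is the average of the pairwise (off-diagonal) correlations. The sketch below re-enters the printed values by hand so that it runs on its own; the names `cm`, `labels`, `off_diag`, and `avg_corr` are introduced here for illustration only.

```r
# Correlation values copied from the printed matrix above
labels <- c("EXCL", "TLKM", "ISAT", "FREN", "PGAS")
cm <- matrix(c(
  1.0000000, 0.4658514, 0.4997871, 0.2942000, 0.3311006,
  0.4658514, 1.0000000, 0.3245317, 0.2536816, 0.3208621,
  0.4997871, 0.3245317, 1.0000000, 0.2470112, 0.2630150,
  0.2942000, 0.2536816, 0.2470112, 1.0000000, 0.2497122,
  0.3311006, 0.3208621, 0.2630150, 0.2497122, 1.0000000
), nrow = 5, byrow = TRUE, dimnames = list(labels, labels))

# Each pair appears once in the upper triangle
off_diag <- cm[upper.tri(cm)]
avg_corr <- mean(off_diag)
print(avg_corr)
```

In the printed matrix every off-diagonal entry is positive and below 0.5, with the largest value between EXCL.JK and ISAT.JK. Correlations well below 1 are what leave room for a diversification benefit.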

f. Calculate the portfolio’s returns and standard deviation based on equal weights for each stock.

library(PerformanceAnalytics)

equal_weights <- rep(1/length(symbols), length(symbols))
portfolio_returns <- Return.portfolio(stock_returns, weights = equal_weights)
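# Portfolio standard deviation from the full covariance matrix: sqrt(w' Sigma w)
cov_returns <- cov(stock_returns, use = "complete.obs")
portfolio_sd <- sqrt(as.numeric(t(equal_weights) %*% cov_returns %*% equal_weights))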

g. Calculate the diversification ratio, which measures the extent to which the portfolio’s risk is reduced due to diversification.

# Weighted average of individual volatilities divided by portfolio volatility
div_ratio <- sum(equal_weights * apply(stock_returns, 2, sd, na.rm = TRUE)) / portfolio_sd
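
A common definition of the diversification ratio is the weighted average of the individual volatilities divided by the portfolio volatility, where the portfolio volatility is sqrt(w' Sigma w). The following is a minimal sketch on a hypothetical two-asset portfolio; all numbers and names (`w`, `sds`, `rho`, `Sigma`, `port_sd`, `div_ratio_toy`) are assumptions chosen for illustration, not values from the stock data.

```r
# Hypothetical two-asset example (illustrative numbers only)
w   <- c(0.5, 0.5)     # equal weights
sds <- c(0.02, 0.03)   # daily standard deviations of the two assets
rho <- 0.25            # correlation between the two assets

# Covariance matrix: Sigma_ij = rho_ij * sd_i * sd_j
corr  <- matrix(c(1, rho, rho, 1), nrow = 2)
Sigma <- corr * outer(sds, sds)

# Portfolio volatility and diversification ratio
port_sd       <- sqrt(as.numeric(t(w) %*% Sigma %*% w))
div_ratio_toy <- sum(w * sds) / port_sd

print(c(port_sd = port_sd, div_ratio = div_ratio_toy))
```

A ratio of exactly 1 would mean no diversification benefit (for example, perfectly correlated assets); the further the ratio rises above 1, the more risk has been diversified away.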

h. Define the objective function and constraints for portfolio optimization. Here, we aim to maximize the portfolio return subject to the constraint that the sum of portfolio weights equals 1.

objective_function <- function(weights) -sum(weights * colMeans(stock_returns))
constraints <- function(weights) sum(weights) - 1

i. Find the optimal portfolio weights that maximize the portfolio return given the covariance matrix and constraints.

library(quadprog)
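# Inputs for the quadratic program
expected_returns <- colMeans(stock_returns, na.rm = TRUE)  # Mean daily return per asset (used below only for its length)
cov_matrix <- matrix(cor_matrix, nrow=5)  # NOTE: the correlation matrix is used here in place of the covariance matrix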

library(Matrix)

pd_D_mat <- nearPD(cov_matrix)

# Use the positive definite matrix for optimization
D_mat <- as.matrix(pd_D_mat$mat)
d_vec <- rep(0, length(expected_returns))
A_mat <- cbind(rep(1, length(expected_returns)), diag(length(expected_returns)))
b_vec <- c(1, d_vec)

# Solve the quadratic program: with d_vec = 0 this minimizes t(w) %*% D_mat %*% w
# subject to sum(w) = 1 (equality, meq = 1) and w >= 0
output <- solve.QP(Dmat = D_mat, dvec = d_vec, Amat = A_mat, bvec = b_vec, meq = 1)

# Optimal portfolio weights
optimal_weights <- output$solution
print(optimal_weights)

## [1] 0.1000505 0.1936400 0.2112427 0.2609159 0.2341509

Notes:
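The solver returns the weights 0.1000505, 0.1936400, 0.2112427, 0.2609159, 0.2341509, in the order of symbols. Because d_vec is zero, solve.QP() minimizes the quadratic form built from D_mat subject to the weights summing to 1 and being non-negative, so these are minimum-variance weights computed on the correlation matrix rather than return-maximizing weights.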
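
For reference, the problem that `solve.QP()` is given in the code above can be written out as follows. Here \(D\) stands for `D_mat`, \(d\) for `d_vec`, and the constraints come from the columns of `A_mat` together with `b_vec`, the first of which is treated as an equality because `meq = 1`.

```latex
\min_{w}\; \tfrac{1}{2}\, w^{\top} D\, w \;-\; d^{\top} w
\quad \text{subject to} \quad
\mathbf{1}^{\top} w = 1, \qquad w_i \ge 0 \quad (i = 1, \dots, 5)
```

With \(d = 0\), as in the code, the linear term vanishes and only the quadratic term \(\tfrac{1}{2} w^{\top} D w\) is minimized.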

j. Visualize the optimal weights assigned to each stock in the portfolio.

library(plotly)

# Assuming optimal_weights is your vector of weights and symbols is your stock symbols
data_frame <- data.frame(Symbol = symbols, Weight = optimal_weights)

# Define a list of pastel colors
pastel_colors <- c('#fbb4ae', '#b3cde3', '#ccebc5', '#decbe4', '#fed9a6')

# Create the bar plot using plotly with custom pastel colors
plot_ly(data_frame, x = ~Symbol, y = ~Weight, type = 'bar', 
        marker = list(color = pastel_colors)) %>%
  layout(title = "Optimal Weights",
         xaxis = list(title = "Symbol"),
         yaxis = list(title = "Weight"))

Notes:
The optimal portfolio weights for each stock are:
EXCL.JK -> 0.1000505
TLKM.JK -> 0.1936400
ISAT.JK -> 0.2112427
FREN.JK -> 0.2609159
PGAS.JK -> 0.2341509

k. Calculate the expected return and volatility of the optimized portfolio using the optimal weights and covariance matrix.

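# Expected (mean daily) return of the optimized portfolio
optimized_return <- sum(optimal_weights * colMeans(stock_returns, na.rm = TRUE))
# Volatility from the matrix used in the optimization (D_mat, not the nearPD object pd_D_mat)
optimized_sd <- sqrt(as.numeric(t(optimal_weights) %*% D_mat %*% optimal_weights))

# print result
cat("Expected Return of the Portfolio:", optimized_return, "\n")
cat("Portfolio Volatility (Standard Deviation):", optimized_sd, "\n")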

Notes:
* Expected Return of the Portfolio: -0.01694276. Based on the historical data and the optimal weights, the portfolio's expected return is negative. Because the figure is a weighted average of mean daily returns, it corresponds to an average loss of about 1.69% per day. The result depends on the historical window chosen, the specific stocks included, and market conditions during that window.
* Portfolio Volatility (Standard Deviation): 1.732641. This is the standard deviation of the portfolio's returns. It measures how far returns are expected to deviate from their average and therefore expresses the portfolio's risk: the higher the volatility, the higher the risk.

Assume you are working as an Actuary at an Insurance Company. Your job is to define insurance claims data and analyze the risk associated with different scenarios. Explain your answers in detail, following the instructions below:

a. Generate insurance claims data for a hypothetical insurance company with 10000 policies over 12 years.

set.seed(123) # For reproducibility

# Parameters
n_policies <- 10000
n_years <- 12
mean_claims <- 0.2 # Average number of claims per policy per year
mean_claim_amount <- 1002 # Average claim amount
sd_claim_amount <- 2002 # Standard deviation of claim amount

# Initialize an empty data frame for the claims data
claims_data <- data.frame(policy_id = 1:n_policies)

# Generate claims data
for(year in 1:n_years) {
  # Simulate number of claims for each policy for the year
  num_claims <- rpois(n_policies, mean_claims)
  
  # For each policy, simulate claim amounts and sum them
  total_claims_amount <- sapply(num_claims, function(n) sum(rnorm(n, mean_claim_amount, sd_claim_amount)))
  
  # Add the total claims amount for the year to the data frame
  claims_data[paste("Year", year, sep = "_")] <- total_claims_amount
}

# View the first few rows of the claims data
head(claims_data)
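
The yearly total per policy simulated above is a compound Poisson sum, so its mean and variance have closed forms: E[S] = lambda * mu and Var[S] = lambda * (sigma^2 + mu^2). The sketch below compares these with a simulation on hypothetical policy-years. It uses its own seed and its own variable names (`lambda`, `mu`, `sigma`, `totals`), all of which are assumptions made for this illustration.

```r
set.seed(1)  # arbitrary seed for this sketch only

lambda <- 0.2    # mean number of claims per policy-year (as in part a)
mu     <- 1002   # mean claim amount
sigma  <- 2002   # standard deviation of claim amount

# Theoretical moments of S = X_1 + ... + X_N with N ~ Poisson(lambda)
theo_mean <- lambda * mu
theo_var  <- lambda * (sigma^2 + mu^2)

# Simulated check on 100,000 hypothetical policy-years
n_sim  <- 100000
counts <- rpois(n_sim, lambda)
totals <- sapply(counts, function(n) sum(rnorm(n, mu, sigma)))

print(c(theoretical_mean = theo_mean, simulated_mean = mean(totals)))
print(c(theoretical_var  = theo_var,  simulated_var  = var(totals)))
```

The simulated values should land close to the theoretical ones, which gives a simple check that the data-generating loop behaves as intended.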

b. Generate random policy premiums from a normal distribution with a mean of 1002 and a standard deviation of 2002.

# Parameters for the premium distribution
mean_premium <- 1002
sd_premium <- 2002

# Generate random policy premiums
policy_premiums <- rnorm(n_policies, mean_premium, sd_premium)

# Add the premiums to the claims_data data frame
claims_data$premiums <- policy_premiums

# View the first few rows to confirm premiums are added
head(claims_data)
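
Because the standard deviation (2002) is about twice the mean (1002), a normal distribution with these parameters puts substantial probability on negative premiums. A minimal sketch of that probability, with a small simulated check, follows; the seed and the names `p_negative`, `sim_premiums`, and `share_negative_sim` are assumptions for illustration.

```r
mean_premium <- 1002
sd_premium   <- 2002

# Probability that a premium drawn from this normal distribution is below zero
p_negative <- pnorm(0, mean = mean_premium, sd = sd_premium)

# Simulated share of negative premiums (separate seed, illustrative only)
set.seed(1)
sim_premiums       <- rnorm(10000, mean_premium, sd_premium)
share_negative_sim <- mean(sim_premiums < 0)

print(round(c(theoretical = p_negative, simulated = share_negative_sim), 3))
```

Roughly three in ten simulated premiums are negative, which matters later because the premium is the denominator of the loss ratio.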

c. Simulate claim frequencies for each policy from a Poisson distribution with a mean of 0.1 (i.e., on average, each policy has 0.1 claims per year).

# Parameters for the claim frequency distribution
mean_claims_frequency <- 0.1

# Generate claim frequencies for each policy for each year
for(year in 1:n_years) {
  claim_frequencies <- rpois(n_policies, mean_claims_frequency)
  
  # Add claim frequencies to the claims_data data frame
  claims_data[paste("Claim_Freq_Year", year, sep = "_")] <- claim_frequencies
}

# View the first few rows to confirm claim frequencies are added
head(claims_data)
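
Since yearly claim counts are independent Poisson draws, the total count over all years is also Poisson, with the yearly means added together. The sketch below uses that fact to compute the probability that a policy has no claims at all; `lambda_year`, `horizon_years`, and the result names are introduced here for illustration.

```r
lambda_year   <- 0.1   # mean claims per policy per year (part c)
horizon_years <- 12    # number of simulated years

# The sum of independent Poisson counts is Poisson with the summed rate
lambda_total <- lambda_year * horizon_years

p_no_claims_year  <- dpois(0, lambda_year)    # probability of zero claims in one year
p_no_claims_total <- dpois(0, lambda_total)   # probability of zero claims in all 12 years

print(round(c(one_year = p_no_claims_year, all_years = p_no_claims_total), 4))
```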

d. Simulate claim amounts for each policy and each year from a normal distribution with a mean of 5002 and a standard deviation of 2002.

# Parameters for the claim amount distribution
mean_claim_amount <- 5002
sd_claim_amount <- 2002

# Simulate claim amounts for each policy and each year, using the
# claim frequencies already stored in claims_data
for(year in 1:n_years) {
  freq <- claims_data[[paste("Claim_Freq_Year", year, sep = "_")]]
  
  # Sum of simulated claim amounts per policy; the sum is 0 when a policy has no claims
  claims_data[[paste("Claim_Amt_Year", year, sep = "_")]] <-
    sapply(freq, function(n) sum(rnorm(n, mean_claim_amount, sd_claim_amount)))
}

# View the first few rows to confirm claim amounts are added
head(claims_data)

e. Calculate total claim amounts per policy and then calculate loss ratios (total claims divided by premiums) to assess the risk associated with each policy.

# Sum total claim amounts for each policy across all years
claims_data$total_claims <- rowSums(claims_data[, grepl("Claim_Amt_Year", names(claims_data))])

# Calculate loss ratios for each policy (total claims divided by premiums)
claims_data$loss_ratio <- claims_data$total_claims / claims_data$premiums

# View the first few rows to confirm loss ratios are calculated
head(claims_data[, c("policy_id", "premiums", "total_claims", "loss_ratio")])
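
The per-policy loss ratio computed above is not the same thing as the loss ratio of the whole book, which divides total claims by total premiums. A small sketch on a hypothetical four-policy portfolio shows how far the two can differ; the data frame `toy` and its numbers are made up for illustration.

```r
# Hypothetical toy portfolio of four policies (illustrative numbers only)
toy <- data.frame(
  premiums     = c(1000, 2000, 50, 1500),
  total_claims = c(0, 3000, 500, 0)
)

# Per-policy loss ratios, as in the calculation above
toy$loss_ratio <- toy$total_claims / toy$premiums

# Average of the per-policy ratios versus the ratio of the totals
mean_of_ratios  <- mean(toy$loss_ratio)
aggregate_ratio <- sum(toy$total_claims) / sum(toy$premiums)

print(c(mean_of_ratios = mean_of_ratios, aggregate_ratio = aggregate_ratio))
```

The policy with a very small premium dominates the average of the ratios, while it has little effect on the aggregate ratio. This is one reason the mean of per-policy loss ratios can be hard to interpret.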

f. Plot a histogram of the loss ratios to visualize the distribution of risk.

hist(claims_data$loss_ratio, breaks = 50, main = "Histogram of Loss Ratios", xlab = "Loss Ratio")

g. Calculate summary statistics including the mean, median, and 95th percentile of the loss ratios.

# Calculate mean of loss ratios
mean_loss_ratio <- mean(claims_data$loss_ratio, na.rm = TRUE)

# Calculate median of loss ratios
median_loss_ratio <- median(claims_data$loss_ratio, na.rm = TRUE)

# Calculate 95th percentile of loss ratios
percentile_95_loss_ratio <- quantile(claims_data$loss_ratio, 0.95, na.rm = TRUE)

# Print summary statistics
cat("Mean Loss Ratio:", mean_loss_ratio, "\n")

## Mean Loss Ratio: 0.3662153

cat("Median Loss Ratio:", median_loss_ratio, "\n")

## Median Loss Ratio: 0

cat("95th Percentile Loss Ratio:", percentile_95_loss_ratio, "\n")

## 95th Percentile Loss Ratio: 23.70014

Notes:
* Mean Loss Ratio: 0.3662. On average, claims per policy are about 0.37 times the premium. This figure is unstable, however: premiums are drawn from a normal distribution with mean 1002 and standard deviation 2002, so roughly 31% of them are negative and some are close to zero. That produces negative and extremely large individual loss ratios, which makes the mean very sensitive to a few policies.
* Median Loss Ratio: 0. At least half of the policies have a loss ratio at or below zero. These are mainly the policies with no claims over the 12 years, together with the policies whose simulated premium is negative.
* 95th Percentile Loss Ratio: 23.70. About 95% of policies have a loss ratio of 23.70 or less. The remaining 5% have total claims of more than 23.70 times their premium, so a small fraction of policies accounts for the most extreme losses relative to the premium collected.
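
To see where a median of 0 can come from, it helps to split the loss ratios into negative, zero, and positive shares. The sketch below re-simulates a simplified version of the setup, under the assumption that the 12-year claim total can be drawn directly as a compound Poisson with rate 12 * 0.1. It uses its own seed and sample size, so its numbers are not those of the analysis above.

```r
set.seed(1)   # arbitrary seed for this sketch only
n <- 100000   # hypothetical number of simulated policies

# Premiums as in part b; total claims over 12 years as a compound Poisson
# with rate 12 * 0.1 and claim sizes as in part d
premiums <- rnorm(n, 1002, 2002)
counts   <- rpois(n, 12 * 0.1)
claims   <- sapply(counts, function(k) sum(rnorm(k, 5002, 2002)))
ratio    <- claims / premiums

# Shares of negative, zero, and positive loss ratios
share_negative <- mean(ratio < 0)
share_zero     <- mean(ratio == 0)
share_positive <- mean(ratio > 0)

print(round(c(negative = share_negative, zero = share_zero, positive = share_positive), 3))
print(median(ratio))
```

When the negative share is below one half but the negative and zero shares together exceed one half, the middle of the sorted loss ratios falls among the zeros.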