Workshop 2, Financial Programming

Data Structures

Lists

Lists are R objects that can store elements of different classes, unlike vectors, which can store only elements of the same class. Each element of a list can be an object of any class, such as a data frame, a numeric vector, a matrix, or even another list. For example, we can create a list to store information about a financial portfolio.

I will create a list that represents information of a financial portfolio. I first create the elements of the portfolio:

# Create vectors for stock tickers, expected stock returns and expected stock risk:
tickers <- c("AAPL","AMZN","WMT")
expected_stock_returns <- c(0.020,0.015,0.010)
expected_stock_risk <- c(0.04,0.045,0.02)
tickers
expected_stock_returns
expected_stock_risk

Now I will indicate how much (in %) I will allocate for each asset in my portfolio. I will assign 50% to Apple, and 25% to Amazon and Wal-Mart. I create a vector with these weights (in decimal):

stock_weights <- c(0.50,0.25,0.25)

Now I will calculate the expected portfolio return as a weighted average of the expected stock returns:

weighted_returns <- expected_stock_returns*stock_weights
weighted_returns

When you multiply two vectors of the same length, R performs an element-by-element multiplication. This is called “vectorization”. By default, R “vectorizes” arithmetic operations: the first element of the first vector is multiplied by the first element of the second vector, and the same happens for the second and third elements of each vector.

Now I just sum all the elements of this vector that contains the weighted returns to get the expected return of my portfolio:

expected_portfolio_return <- sum(weighted_returns) 

I applied the function sum to the weighted_returns vector, so all the elements of the vector are added to get the expected portfolio return.
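As a quick cross-check of the calculation above, the same weighted average can be obtained with the base R function weighted.mean. This is only a sketch; the variable names by_hand and by_function are illustrative and not part of the workshop code.

```r
# Same inputs as in the portfolio example
expected_stock_returns <- c(0.020, 0.015, 0.010)
stock_weights <- c(0.50, 0.25, 0.25)

# Weighted average computed step by step, as in the text
by_hand <- sum(expected_stock_returns * stock_weights)

# Weighted average computed with the base R function weighted.mean
by_function <- weighted.mean(expected_stock_returns, stock_weights)

by_hand
by_function
```

Both values should agree, because the weights add up to 1.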

I now create a list with all these pieces of information of my portfolio. I use the function list() to create the portfolio object as a list that contains the tickers vector, and the vectors for expected stock return, expected stock risk, stock weights and the expected portfolio return:

portfolio1 <- list(tickers, expected_stock_returns, expected_stock_risk, 
                   stock_weights, expected_portfolio_return)
# I assign names to each element of the portfolio (no spaces, so that
#   every element can be accessed with the $ operator):
names(portfolio1)<-c("Tickers","Stock_returns","Stock_risk","Stock_weights",
                     "Portfolio_expected_return")
# I display the portfolio object:
portfolio1

I can access the elements of the portfolio1 list using either the name of each element or its position number inside double square brackets [[ ]]:

# I display the stock tickers:
portfolio1$Tickers
# I can also display expected stock tickers using the element number and double
#  brackets:
portfolio1[[1]]
# I display the class and length of the first element:
class(portfolio1$Tickers)
length(portfolio1$Tickers)
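The difference between single and double brackets is easy to miss, so here is a small sketch. It uses a hypothetical list called portfolio_demo (not the portfolio1 object above) so that it can be run on its own.

```r
# A small, hypothetical list with named elements
portfolio_demo <- list(Tickers = c("AAPL", "AMZN", "WMT"),
                       Stock_weights = c(0.50, 0.25, 0.25))

# Single brackets return a list that contains the selected element
single <- portfolio_demo[1]
class(single)

# Double brackets return the element itself (here, a character vector)
double <- portfolio_demo[[1]]
class(double)

# Double brackets also accept the element name; then I index inside the vector
second_weight <- portfolio_demo[["Stock_weights"]][2]
second_weight

# I can add a new element to an existing list with the $ operator
portfolio_demo$Manager <- "unassigned"
names(portfolio_demo)
```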

The xts data class

R has different data classes (or R objects). You can even create your own data class, which is a set of one or more data structures. For time series datasets, which are very common in Finance, the most popular data classes are ts, zoo and xts. xts stands for eXtensible Time Series. An xts object contains a set of columns that share a single data type (usually numeric, because the data is stored as a matrix), and it has a time index. xts objects are kept sorted chronologically by that index.

The index of an xts object is like the first column of the object, but it is actually its index; it is not considered a column of the object. An xts object has the advantage that we can apply several functions to that object for data merging, sorting, selecting, etc.
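To make the distinction between the index and the columns concrete, the sketch below builds a tiny xts object by hand. The fund names and return values are made up for illustration, and the xts package is assumed to be installed.

```r
library(xts)

# Hypothetical month-end dates and returns for two made-up funds
dates <- as.Date(c("2024-01-31", "2024-02-29", "2024-03-31"))
returns <- matrix(c(0.010, -0.004, 0.007,
                    0.002,  0.003, -0.001),
                  ncol = 2,
                  dimnames = list(NULL, c("FundA", "FundB")))

# I create the xts object: the data goes in x, the dates go in order.by
funds <- xts(returns, order.by = dates)
funds

# The dates are the index, so they are not counted as a column
ncol(funds)

# index() returns the time index; coredata() returns the data without the index
index(funds)
coredata(funds)
```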

Try the following example. Load a sample dataset called edhec, which comes with the PerformanceAnalytics package. In R, we can use thousands of available packages to do different types of calculations and operations. PerformanceAnalytics is a package with many functions related to financial analytics. We will use some of these functions later in the course.

We need the PerformanceAnalytics package, which you can install from the Packages tab (bottom-right pane of RStudio).

Once the package is installed on my computer, I do not need to install it again. However, each time I need to use the package, I need to load it into memory:

library(PerformanceAnalytics)

I load the sample dataset edhec, which contains monthly returns of different hedge fund indexes created by the EDHEC Risk Institute.

data("edhec")

The edhec dataset will appear in my Global Environment.

edhec is an xts R object that contains historical monthly returns of different hedge fund indexes. We can easily do different calculations with specific functions that apply to xts objects. For now, we will learn how to do subsetting with xts objects.

If you want to know more functions for the xts R objects, you can check the xts CheatSheet at: https://s3.amazonaws.com/assets.datacamp.com/blog_assets/xts_Cheat_Sheet_R.pdf

Subsetting data

R has powerful indexing features for accessing object elements. These features can be used to select and exclude columns and rows (observations). In the next paragraphs we will learn how to extract the information we want from an xts object. Several of these functions also apply to data frames.

I see the structure of the edhec R dataset and the first and last rows:

str(edhec)
head(edhec)
tail(edhec)

From the results above, we see that this xts object has 13 columns, which are returns of different hedge fund indexes.

I can subset rows, columns or both. For example, I can select the months in which the Emerging Markets fund had negative returns:

negative_returns_em <- edhec[edhec$`Emerging Markets`<0,4]
# I display the first rows of this new object, which has only negative returns for Emerging Markets:
head(negative_returns_em)
# I select the same months, but now I keep all the columns:
negative_returns_em_all <- edhec[edhec$`Emerging Markets`<0,]
head(negative_returns_em_all)

As you see, to do subsetting I use square brackets after the object. The first argument inside the brackets refers to the ROWS to be selected, and the second argument refers to the COLUMNS to be selected. In the first case I selected column 4 (Emerging Markets); in the second case I wrote NOTHING after the comma, indicating that I want ALL COLUMNS!

If I want to do a subsetting of specific rows using a range of dates, I can do the following. Since this is an xts object, it is very easy to subset using dates. I select the returns of the 12 months of the year 2008:

edhec2008 <- edhec["2008-01-01/2008-12-31",]
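xts also accepts shorter date strings for the same kind of selection. The sketch below uses a small made-up series called toy (not the edhec dataset) so that the number of selected rows is easy to check by eye; the xts package is assumed to be installed.

```r
library(xts)

# 18 hypothetical monthly observations, from October 2007 to March 2009
month_starts <- seq(as.Date("2007-10-01"), by = "month", length.out = 18)
toy <- xts(seq_along(month_starts) / 100, order.by = month_starts)
colnames(toy) <- "FundA"

# A year alone selects all the observations of that year
year_2008 <- toy["2008"]

# "start/end" selects a range; months can be written without the day
summer_2008 <- toy["2008-06/2008-08"]

# An open range selects everything from that date onwards
from_2009 <- toy["2009/"]

nrow(year_2008)
nrow(summer_2008)
nrow(from_2009)
```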

I can also select those months where the Emerging Markets fund and the Convertible Arbitrage fund had positive returns in 2008:

positive_edhec2008 <- edhec2008[edhec2008$`Convertible Arbitrage`>0 & edhec2008$`Emerging Markets`>0,]
positive_edhec2008

We can see that only 2 months had positive returns in these funds in 2008.

There are several logical operators to be used for the subsetting condition of rows. These operators are:

== equal to
!= different from
< less than
<= less than or equal to
> greater than
>= greater than or equal to
| or
& and
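These operators work the same way on plain vectors, which makes them easy to try out. The following sketch uses a hypothetical named vector of monthly returns; the values are made up for illustration.

```r
# Hypothetical monthly returns, named by month
monthly_returns <- c(Jan = -0.03, Feb = 0.01, Mar = 0.04, Apr = -0.01, May = 0.02)

# One condition: months with positive returns
positive <- monthly_returns[monthly_returns > 0]

# Two conditions joined with & (both must be TRUE)
moderate <- monthly_returns[monthly_returns > 0 & monthly_returns <= 0.02]

# Two conditions joined with | (at least one must be TRUE)
extreme <- monthly_returns[monthly_returns < -0.02 | monthly_returns > 0.03]

# != excludes the elements that match a value
not_march <- monthly_returns[names(monthly_returns) != "Mar"]

positive
moderate
extreme
not_march
```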

Control-flow commands

Control-flow commands or constructs are used to program a series of tasks using conditions and/or loops.

There are 7 main control structures:

if/else: test a condition
for: execute a loop a fixed number of times
repeat: execute a loop until a break statement is reached
while: execute a loop while a condition is true
break: break the execution of a loop
next: skip an iteration of a loop
return: exit a function
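The statements next and break are not used in the examples of this workshop, so here is a short sketch of how they behave inside a for loop. The returns and the stop_loss threshold are hypothetical values chosen for illustration.

```r
# Hypothetical monthly returns; NA marks a month with missing data
monthly_returns <- c(0.02, NA, 0.01, -0.08, 0.03)
# Hypothetical threshold: I stop reviewing at the first return below it
stop_loss <- -0.05

months_reviewed <- 0
for (r in monthly_returns) {
  if (is.na(r)) {
    # next skips the rest of this iteration and continues with the following one
    next
  }
  if (r < stop_loss) {
    # break leaves the loop completely
    break
  }
  months_reviewed <- months_reviewed + 1
}
months_reviewed
```

Here the missing month is skipped, and the loop ends when it reaches the return of -0.08, so the last value of the vector is never reviewed.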

The simplest ones are the conditionals if and if/else, which allow you to implement basic algorithm logic in your analysis. We will start with the conditional statements and then continue with the loop statements.

Conditionals

If and if else

if(<condition1>) {
  ## do stuff
}

Here is a simple example, that is self-explanatory:

grade <- 70

if(grade > 69){cat("You have passed this course. Well done!")}

In this case the condition will be true, so the message will be displayed. You can try different values for grade and re-run this code to see what happens.

ELSE IF

Imagine that you have several conditions, not only one as in the previous example. In this case, you need to use the else if statement. The structure looks like this:

if(<condition1>) {
  ## do something
} else if(<condition2>) {
  ## do something different
}

So our previous example can be upgraded as follows. Try changing the values of the input variables.

gradePartial1 <- 80
gradePartial2 <- 60
gradeFinalExam <- 100

# The final grade weights each partial exam 25% and the final exam 50%
FinalGrade <- (gradePartial1*0.25 + gradePartial2*0.25 + gradeFinalExam*0.5)

if(FinalGrade > 90){
    "You did an Excellent job!"
  } else if(FinalGrade > 69){
    "You have passed this course. Well done."
  } else if(FinalGrade > 0){
    "Sorry you have failed the course."
  }

ELSE

Now imagine that we have several conditions, but we also want to do something with the remaining cases that don’t fulfill any of them. Then we use else. For example:

gradePartial1 <- -70
gradePartial2 <- 60
gradeFinalExam <- -60

# The final grade weights each partial exam 25% and the final exam 50%
FinalGrade <- (gradePartial1*0.25 + gradePartial2*0.25 + gradeFinalExam*0.5)

if(FinalGrade > 90){
    "You did awesome!"
  } else if(FinalGrade > 69){
    "You have passed this course. Well done."
  } else if(FinalGrade > 0){
    "Sorry you have failed the course."
  } else {
    "You have negative grades; it should be an error"
}
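The same chain of conditions can be wrapped in a function, which also shows the return control structure listed earlier. This is a sketch: grade_message is a hypothetical function name, and it uses >= 0 in the third condition so that a grade of exactly zero counts as a fail (an assumption of this sketch, not of the example above).

```r
# Hypothetical helper that returns a message according to the final grade
grade_message <- function(final_grade) {
  if (final_grade > 90) {
    return("You did an excellent job!")
  } else if (final_grade > 69) {
    return("You have passed this course. Well done.")
  } else if (final_grade >= 0) {
    # Assumption of this sketch: a grade of exactly 0 is a fail, not an error
    return("Sorry, you have failed the course.")
  } else {
    return("You have negative grades; it should be an error")
  }
}

grade_message(85)
grade_message(-32.5)
```

Writing the logic as a function makes it easy to try many grades without re-running the whole block.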

%in%

In the next example we use the operator %in%. This operator checks whether each value of a first vector exists inside a second vector. For each element of the first vector, it returns TRUE if the value is contained in the second vector and FALSE otherwise:

# Both elements of the first vector appear in the second vector
c(1,2) %in% c(6,4,8,3,2,1)
# Only the last two elements of the first vector appear in the second
c(6,4,8,3,2,1) %in% c(1,2)
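Because %in% returns a logical vector, it can be used directly for subsetting, and it can be negated with ! to find the values that are NOT in a vector. A small sketch with hypothetical ticker lists:

```r
# Hypothetical ticker lists
portfolio_tickers <- c("AAPL", "AMZN", "WMT")
watchlist <- c("AAPL", "GE", "JPM", "WMT")

# Which watchlist tickers are already in the portfolio?
in_portfolio <- watchlist %in% portfolio_tickers
in_portfolio

# I keep the watchlist tickers that are NOT in the portfolio
missing_tickers <- watchlist[!in_portfolio]
missing_tickers

# sum() counts the TRUE values, since TRUE is treated as 1
n_shared <- sum(in_portfolio)
n_shared
```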

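As you see, the result of using the %in% operator is a vector of Boolean values (TRUE or FALSE). If the first element is TRUE, it means that the first element of the first vector is inside the second vector.

For Loop

A for loop runs a block of code once for each element of a vector. In the following example, the loop runs once for each student name: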

students<-c("Pedro","Laura","Bryan")

for(name in students){
  ## Instead of name you can use any variable name; just change it inside the loop too
  ## The "\n" at the end starts a new line after each message
  cat("Hi, my name is", name, "\n")
}

In the next example we will use the quantmod library to download financial data from online data sources such as Yahoo Finance and FRED. Our initial vector indicates the tickers to download, in this case AAPL, JPM and GE.

You have to install the package from the Packages tab in the bottom-right pane of RStudio.

Once you install the package, you have to load it with the library function. The quantmod package has the getSymbols function, which is used to download data from the web:

library(quantmod)

#Vector of tickers
tickers <-c("AAPL","JPM","GE")

for(i in tickers){
    getSymbols(i)
    cat("The prices of the ticker", i, "have been downloaded\n")
}

In the next example we do a similar process for a different ticker list. We download GE and AAPL from Yahoo, and INTGSTMXM193N from FRED, the economic database of the Federal Reserve Bank of St. Louis. What happens with the remaining tickers?

tickers <- c("AAPL","GM","INTGSTMXM193N","GE","JPM")

for(i in tickers){
  if(i %in% c("AAPL","GE")){
    getSymbols(i, src = "yahoo")
    cat("The prices of the ticker", i, "have been downloaded from Yahoo\n")
  } else if(i == "INTGSTMXM193N"){
    # Note that the source for Federal Reserve data is called FRED
    getSymbols(i, src = "FRED")
    cat("The data of the ticker", i, "have been downloaded from FRED (Federal Reserve Economic Data)\n")
  }
}

We are going to use the previous example, but in this case we also want to add a condition to print the tickers that have NOT been downloaded.

tickers <- c("AAPL","GM","INTGSTMXM193N","GE","JPM")

for(i in tickers){
  if(i %in% c("AAPL","GE")){
    getSymbols(i, src = "yahoo")
    cat("The prices of the ticker", i, "have been downloaded from Yahoo\n")
  } else if(i == "INTGSTMXM193N"){
    getSymbols(i, src = "FRED")
    cat("The data of the ticker", i, "have been downloaded from FRED (Federal Reserve Economic Data)\n")
  } else {
    cat("The ticker", i, "has not been downloaded\n")
  }
}
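To see which branch each ticker falls into without connecting to the internet, the same conditions can be written with the vectorized function ifelse. This sketch only classifies the tickers and does not download anything; data_source and skipped are illustrative names.

```r
tickers <- c("AAPL", "GM", "INTGSTMXM193N", "GE", "JPM")

# For each ticker: "yahoo", "FRED", or NA when no condition applies
data_source <- ifelse(tickers %in% c("AAPL", "GE"), "yahoo",
                      ifelse(tickers == "INTGSTMXM193N", "FRED", NA))
names(data_source) <- tickers
data_source

# Tickers that would reach the final else branch of the loop
skipped <- tickers[is.na(data_source)]
skipped
```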

WHILE AND REPEAT TOOLS

Similar to the for loop, it is possible to program a loop with the while and repeat control-flow statements. The main difference is that in both you need to specify the exit condition, so the number of iterations is not fixed in advance (as it is in the for case).

For while, you need to specify the condition at the beginning. Here is an example. Suppose you want to know how many years you need to keep an investment at a fixed interest rate in order to double your initial investment:

# I define the Annual Percentage Rate to be equal to 10%
APR <- 0.10
# Initial investment equal to $100
INV <- 100
# Multiple = 2 to check when the investment doubles
MULTIPLE <- 2
# I start by assigning the balance equal to the initial investment
BALANCE <- INV
# I start with year equal to zero
year <- 0
# The loop condition means that while the balance is less than the initial
#   investment multiplied by the multiple, the loop continues iterating
while (BALANCE < MULTIPLE*INV) {
  # I increase the value of year by 1
  year <- year + 1
  # I multiply the current balance by the growth factor (1+APR)
  BALANCE <- BALANCE*(1+APR)
}
cat("To multiply your investment by", MULTIPLE, "you need", year, "years.\n")
cat("Your balance after", year, "years will be $", BALANCE, "\n")

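The result of the while loop can be cross-checked with compound interest algebra: the balance after n years is INV*(1+APR)^n, so the investment reaches the multiple when n is at least log(MULTIPLE)/log(1+APR). The sketch below applies that formula; exact_years and years_needed are illustrative names.

```r
APR <- 0.10
MULTIPLE <- 2

# Exact (fractional) number of years needed to reach the multiple
exact_years <- log(MULTIPLE) / log(1 + APR)
exact_years

# Interest is added once per year, so I round up to the next whole year
years_needed <- ceiling(exact_years)
years_needed
```

The rounded value should match the number of years reported by the loop.
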
You can also use the repeat statement to write a loop. Unlike while, repeat has no condition at the top, so you have to specify the exit condition with an if and a break statement somewhere inside the loop. Here is the same loop we did above, but using repeat:

# I define the Annual Percentage Rate to be equal to 10%
APR <- 0.10
# Initial investment equal to $100
INV <- 100
# Multiple = 2 to check when the investment doubles
MULTIPLE <- 2
# I start by assigning the balance equal to the initial investment
BALANCE <- INV
# I start with year equal to zero
year <- 0
repeat {
  # I increase the value of year by 1
  year <- year + 1
  # I multiply the current balance by the growth factor (1+APR)
  BALANCE <- BALANCE*(1+APR)
  # If the balance is greater than or equal to the multiple times the
  #   investment, the break statement is executed and the iterations stop
  if (BALANCE >= MULTIPLE*INV) {
    break
  }
}
cat("To multiply your investment by", MULTIPLE, "you need", year, "years.\n")
cat("Your balance after", year, "years will be $", BALANCE, "\n")

Looping vs R-vectorization

Many R programmers recommend avoiding explicit loops whenever possible. This may sound weird, since programming usually requires doing repetitive tasks. However, in R, unlike traditional computer languages such as C, many repetitive tasks can be done without writing a loop, by operating on whole vectors at once. The following example adds two vectors of annual cash flows, first with a loop and then with vectorization:

# A and B are vectors with the annual free cash flows of each product
A <- c(3,3,4,6,8)
B <- c(-2,-3,-1,3,5)

# Version with a loop:
sumcashflows <- numeric(length(A))
for (i in seq_along(A)) {
  # We sum the element i of both vectors and leave the result
  #   in the position i of the vector sumcashflows
  sumcashflows[i] <- A[i] + B[i]
}
sumcashflows

# Version without a loop (vectorized):
# We just sum both vectors as if they were numbers
sumcashflows <- A + B
sumcashflows

Application: Present value of future cash flows

Now, how can you calculate the present value of these cash flows? Remember the basics of the time value of money. To calculate the present value (PV0) of a sequence of future cash flows, apply the following formula, where CF_i is the cash flow at the end of year i, R is the annual discount rate, and n is the number of cash flows:

$$PV_0 = \sum_{i=1}^{n} \frac{CF_i}{(1+R)^i}$$

With a loop:

# I assign 0 to the variable PV, present value
PV <- 0
# I define the discount rate as R:
R <- 0.15
for (i in seq_along(sumcashflows)) {
  # I calculate the present value of each cash flow i:
  PVCF <- sumcashflows[i] / (1 + R)^i
  # I add the present value of cash flow i to the cumulative variable PV
  PV <- PV + PVCF
}
# The loop iterates 5 times, once for each cash flow, and ends with the sum of all present values
cat("The Present Value of all cash flows is ", PV)

Without a loop:

# I assign a sequence from 1 to 5, for the exponents of the formula of present value
#   for each cash flow
exponents <- seq(1,5)
# The seq function generates a sequence. Here I specified to start the
#   sequence at 1 and finish at 5
# I calculate a vector with the present value of each cash flow. I use vectorization:
PVvector <- sumcashflows / (1 + R)^exponents
# Note that this mathematical expression applies to each element of the vector.
#   Also, both vectors have the same length.
# I finally sum all elements of the vector using the function sum:
PV <- sum(PVvector)
cat("The Present Value of all cash flows is ", PV)
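Since the present value calculation is used again in the challenges, it can be convenient to wrap the vectorized version in a function. The sketch below does that and checks it against a loop; present_value is a hypothetical function name, and it assumes that the first cash flow arrives at the end of period 1.

```r
# Hypothetical helper: present value of cash flows received at the end of
#   periods 1, 2, ..., n, discounted at a constant rate per period
present_value <- function(cash_flows, rate) {
  periods <- seq_along(cash_flows)
  sum(cash_flows / (1 + rate)^periods)
}

# The summed cash flows A + B from the example
cash_flows <- c(3, 3, 4, 6, 8) + c(-2, -3, -1, 3, 5)
pv_vectorized <- present_value(cash_flows, 0.15)

# The same calculation with a loop, for comparison
pv_loop <- 0
for (i in seq_along(cash_flows)) {
  pv_loop <- pv_loop + cash_flows[i] / (1 + 0.15)^i
}

pv_vectorized
all.equal(pv_loop, pv_vectorized)
```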

CHALLENGE 1

You have to write a program to calculate the number of months needed to finish paying a mortgage loan. The information about the loan is the following:

Loan amount = $3,000,000.00 pesos

APR (Annual % rate) = 11% (compounded monthly)

Monthly Fixed Payment = $40,000.00 (includes interest and principal)

Your program has to provide 2 results: the number of months needed to finish paying the loan, and the amount of the last payment if the payment is less than the fixed payment amount. Your program has to be able to run with any change in any of the values of the above variables.

This is a quite challenging exercise!

Hint: if you are familiar with Excel, start solving the problem in Excel, and then try to write your program in R.

FinancialMath::amort.period(Loan = 3000000, pmt = 40000, i = .11, n = NA)
Error in FinancialMath::amort.period(Loan = 3e+06, pmt = 40000, i = 0.11,  : 
  Too small of pmt.
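The error above appears to come from how the rate is passed: if i = .11 is applied as the rate per payment period, the interest on the loan ($330,000) is larger than the $40,000 payment, so the loan could never be paid off. As an alternative, the challenge can be sketched with a while loop in base R. The sketch assumes a monthly rate of APR/12 and payments at the end of each month; the variable names are illustrative.

```r
loan <- 3000000      # loan amount in pesos
apr <- 0.11          # annual percentage rate, compounded monthly
payment <- 40000     # fixed monthly payment

monthly_rate <- apr / 12

# If the first month's interest is not covered, the balance never decreases
if (loan * monthly_rate >= payment) {
  stop("The payment does not cover the monthly interest.")
}

balance <- loan
months <- 0
last_payment <- payment
while (balance > 0) {
  months <- months + 1
  # The balance grows with one month of interest before the payment is made
  balance_with_interest <- balance * (1 + monthly_rate)
  if (balance_with_interest <= payment) {
    # The last payment only needs to cover what is still owed
    last_payment <- balance_with_interest
    balance <- 0
  } else {
    balance <- balance_with_interest - payment
  }
}

cat("Months needed to pay the loan:", months, "\n")
cat("Amount of the last payment: $", round(last_payment, 2), "\n")
```

Changing loan, apr or payment at the top is enough to re-run the calculation with other values.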

CHALLENGE 2

Write a program that calculates the price of the following bond issued by the company ABC. ABC needs to finance an important project to develop a new technological device.

To get the money, ABC issued a bond with the following characteristics:

Principal: $3,000,000

Time to maturity: 20 years

Coupon rate: 11% (annual); coupons are paid every 6 months

Calculate the price of this bond for each of the following annual interest rates: 8%, 11% and 13%. You have to get the price of the bond for each of these 3 interest rates.

Remember that the price of a bond is the present value of its future cash flows.

FinancialMath::bond(t = NA, f = 3000000, r = 20, n = 11, i = .08, c = 1)
         Bond Summary
Price    4.283379e+08
Premium  4.283379e+08
Coupon   6.000000e+07
Eff Rate 8.000000e-02
Years    1.100000e+01
MAC D    5.239503e+00
MOD D    4.851392e+00
MAC C    3.710111e+01
MOD C    3.630025e+01

(I think?)

FinancialMath::bond(t = NA, f = 3000000, r = 20, n = 11, i = .11, c = 1)
         Bond Summary
Price    3.723909e+08
Premium  3.723909e+08
Coupon   6.000000e+07
Eff Rate 1.100000e-01
Years    1.100000e+01
MAC D    4.978808e+00
MOD D    4.485413e+00
MAC C    3.415739e+01
MOD C    3.176382e+01
FinancialMath::bond(t = NA, f = 3000000, r = 20, n = 13, i = .11, c = 1)
         Bond Summary
Price    4.049922e+08
Premium  4.049922e+08
Coupon   6.000000e+07
Eff Rate 1.100000e-01
Years    1.300000e+01
MAC D    5.582155e+00
MOD D    5.028969e+00
MAC C    4.395334e+01
MOD C    4.020412e+01
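
In the outputs above, the Coupon row (6e+07, which is 3,000,000 times 20) and the Years row (11 and 13) suggest that the coupon rate, the years and the interest rate were passed in each other's places, so these prices are probably not the answer to the challenge. As a cross-check, here is a base R sketch that prices the bond as the present value of its cash flows. It assumes that the annual interest rate is compounded every 6 months, like the coupons; bond_price is a hypothetical function name.

```r
# Hypothetical helper: price of a bond as the present value of its coupons
#   plus the present value of the principal paid at maturity
bond_price <- function(face, coupon_rate, years, annual_yield, freq = 2) {
  n_periods <- years * freq                # number of coupon payments
  coupon <- face * coupon_rate / freq      # coupon paid each period
  period_yield <- annual_yield / freq      # assumed: yield compounded freq times a year
  periods <- seq_len(n_periods)
  sum(coupon / (1 + period_yield)^periods) + face / (1 + period_yield)^n_periods
}

yields <- c(0.08, 0.11, 0.13)
prices <- sapply(yields, function(y) bond_price(3000000, 0.11, 20, y))
names(prices) <- c("8%", "11%", "13%")
prices
```

A useful sanity check: when the interest rate equals the coupon rate (11%), the price should equal the principal; a lower rate should give a higher price, and a higher rate a lower one.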
