Financial analytics in R

Author

Beata Sirowy

Importing financial data to R with Quantmod package

An overview

Quantmod - a Quantitative Financial Modelling & Trading Framework - can be used to load data from a variety of sources, including:

  • R binary formats (.RData and .rda)
  • Comma Separated Value files (.csv)

OHLC data functions

Quantmod enables 3 major types of column extraction functions. If you’re looking for the open, the high, or the series low - it’s available:

  • Op, Hi,Lo, Cl,Vo, Ad - extract the columns Open, High, Low, Close, Volume, and Adjusted (Yahoo)

  • is.OHLC, has.OHLC, has.Op, has.Cl, has.Hi, has.Lo, has.Ad, and has.Vo - checks the desired parameters

  • seriesHi and seriesLo - the highest and the lowest point in the series

Charting with Quantmod

Quantmod also provides a tool to visualize financial time series - the function chartSeries - with which we can create line charts, as well as OHLC bar and candle charts.

library(quantmod)
Loading required package: xts
Loading required package: zoo

Attaching package: 'zoo'
The following objects are masked from 'package:base':

    as.Date, as.Date.numeric
Loading required package: TTR
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo 

Examples

Getting Apple data (AAPL)

We want to collect information about Apple. We only need to know their ticker, which is ‘AAPL’ on the ‘NASDAQ’.

We can get the required data with the getSymbols function:

apple_df <- getSymbols('AAPL', src='yahoo', auto.assign=FALSE)
  • The auto.assign=FALSE parameter allows us to control the name of the dataframe and not have automatically put into memory

  • The src='yahoo' parameter allows us to specific the source of the stock information that we would like to use. We can

chartSeries(apple_df, name="AAPL stocks", subset="last 10 years", theme=chartTheme("white"))

barChart(apple_df, name="AAPL stocks", subset="last 5 years")

candleChart(apple_df, name="AAPL stocks", subset="last 3 months", multi.col=TRUE,theme="white") 

Non-OHLC and Volume series are handled automatically

OHLC functions

getSymbols("AAPL")
[1] "AAPL"
is.OHLC(AAPL) # does the data contain at least OHL and C
[1] TRUE
has.Vo(AAPL) # how about volume?
[1] TRUE
Op(AAPL) # select just the Open column 
            AAPL.Open
2007-01-03   3.081786
2007-01-04   3.001786
2007-01-05   3.063214
2007-01-08   3.070000
2007-01-09   3.087500
2007-01-10   3.383929
2007-01-11   3.426429
2007-01-12   3.378214
2007-01-16   3.417143
2007-01-17   3.484286
       ...           
2024-10-07 224.500000
2024-10-08 224.300003
2024-10-09 225.229996
2024-10-10 227.779999
2024-10-11 229.300003
2024-10-14 228.699997
2024-10-15 233.610001
2024-10-16 231.600006
2024-10-17 233.429993
2024-10-18 236.179993
seriesHi(AAPL) # where and what was the high point 
           AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted
2024-10-15    233.61    237.49   232.37     233.85    64751400        233.85
HiCl(AAPL) #the percent change from high to close 
               HiCl.AAPL
2007-01-03 -0.0321091481
2007-01-04 -0.0033739764
2007-01-05 -0.0133409869
2007-01-08 -0.0122500588
2007-01-09 -0.0044095800
2007-01-10 -0.0081798054
2007-01-11 -0.0101260827
2007-01-12 -0.0046285509
2007-01-16 -0.0015423833
2007-01-17 -0.0271516478
       ...              
2024-10-07 -0.0177234257
2024-10-08 -0.0009292480
2024-10-09 -0.0009140662
2024-10-10 -0.0020043866
2024-10-11 -0.0081077572
2024-10-14 -0.0018555762
2024-10-15 -0.0153269582
2024-10-16 -0.0014647439
2024-10-17 -0.0072696693
2024-10-18 -0.0049961585

Subsetting data


AAPL[‘2024’] #returns all Apple’s 2024 OHLC
AAPL[‘2024-01’] #now just January of 2024
AAPL[’2024-06::2024-08-15 #Jun of 24through Aug 15 of 24

AAPL[‘::’] # everything in AAPL
AAPL[‘2022::’] # everything in AAPL, from 2022 onward
non.contiguous <- c(‘2007-01’,‘2007-02’,‘2007-12’)
AAPL[non.contiguous] # different dates

AAPL['2023::']
           AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted
2023-01-03    130.28    130.90   124.17     125.07   112117500      123.9046
2023-01-04    126.89    128.66   125.08     126.36    89113600      125.1826
2023-01-05    127.13    127.77   124.76     125.02    80962700      123.8551
2023-01-06    126.01    130.29   124.89     129.62    87754700      128.4122
2023-01-09    130.47    133.41   129.89     130.15    70790800      128.9373
2023-01-10    130.26    131.26   128.12     130.73    63896200      129.5119
2023-01-11    131.25    133.51   130.46     133.49    69458900      132.2462
2023-01-12    133.88    134.26   131.44     133.41    71379600      132.1669
2023-01-13    132.03    134.92   131.66     134.76    57809700      133.5043
2023-01-17    134.83    137.29   134.13     135.94    63646600      134.6734
       ...                                                                  
2024-10-07    224.50    225.69   221.33     221.69    39505400      221.6900
2024-10-08    224.30    225.98   223.25     225.77    31855700      225.7700
2024-10-09    225.23    229.75   224.83     229.54    33591100      229.5400
2024-10-10    227.78    229.50   227.17     229.04    28183500      229.0400
2024-10-11    229.30    229.41   227.34     227.55    31759200      227.5500
2024-10-14    228.70    231.73   228.60     231.30    39882100      231.3000
2024-10-15    233.61    237.49   232.37     233.85    64751400      233.8500
2024-10-16    231.60    232.12   229.84     231.78    34082200      231.7800
2024-10-17    233.43    233.85   230.52     232.15    32993800      232.1500
2024-10-18    236.18    236.18   234.01     235.00    46416300      235.0000

Converting daily data with functions:

to.weekly()

to.monthly()

to.quarterly()

to.yearly()

to.yearly (apple_df)
           apple_df.Open apple_df.High apple_df.Low apple_df.Close
2007-12-31      3.081786      7.248571     2.925000       7.074286
2008-12-31      7.116786      7.152143     2.826429       3.048214
2009-12-31      3.067143      7.641071     2.792857       7.526071
2010-12-31      7.622500     11.666429     6.794643      11.520000
2011-12-30     11.630000     15.239286    11.089286      14.464286
2012-12-31     14.621429     25.181070    14.607143      19.006071
2013-12-31     19.779285     20.540714    13.753571      20.036428
2014-12-31     19.845715     29.937500    17.626785      27.594999
2015-12-31     27.847500     33.634998    23.000000      26.315001
2016-12-30     25.652500     29.672501    22.367500      28.955000
2017-12-29     28.950001     44.299999    28.690001      42.307499
2018-12-31     42.540001     58.367500    36.647499      39.435001
2019-12-31     38.722500     73.492500    35.500000      73.412498
2020-12-31     74.059998    138.789993    53.152500     132.690002
2021-12-31    133.520004    182.130005   116.209999     177.570007
2022-12-30    177.830002    182.940002   125.870003     129.929993
2023-12-29    130.279999    199.619995   124.169998     192.529999
2024-10-18    187.149994    237.490005   164.080002     235.000000
           apple_df.Volume apple_df.Adjusted
2007-12-31    246995985600          5.974059
2008-12-31    285981206000          2.574141
2009-12-31    143253686800          6.355580
2010-12-31    151024927200          9.728352
2011-12-30    124059339600         12.214731
2012-12-31    131964204400         16.192688
2013-12-31    102421569600         17.499369
2014-12-31     63657952400         24.608042
2015-12-31     52264199600         23.866425
2016-12-30     38729911200         26.845057
2017-12-29     27243106000         39.855339
2018-12-31     34156144800         37.707047
2019-12-31     28254942800         71.250435
2020-12-31     39863855600        129.894363
2021-12-31     22812206100        174.900360
2022-12-30     22065504500        128.719345
2023-12-29     14804257200        191.802170
2024-10-18     12146251000        235.000000

Adding technical analysis

We can add technical analysis using tools from TTR package to the above chart. The library was developed by Josh Ulrich - details are available on CRAN

chartSeries(apple_df, subset="last 3 months")

addMACD()

addBBands() 

Gold prices

We use data from Oanda - please note that Oanda only provides historical data for the past 180 days.

gold_prices_recent<- getSymbols("XAU/USD",src="oanda") 

Gold prices, weekly for the last 3 months, with custom color candles using the quantmod function to.weekly.

chartSeries(to.weekly(XAUUSD),subset="last 3 months", up.col='white',dn.col='blue')

chartSeries(XAUUSD,name="Gold prices (.oz) in $USD")

S&P 500 (^GSPC)

We download S&P 500 data since 1927 until today.

sp500_df <- getSymbols('^GSPC', src='yahoo', from = "1927-01-01", to = "2024-10-24", auto.assign=FALSE)
sp500_df <- getSymbols('^GSPC', src='yahoo', from = "1927-01-01", to = "2024-10-24", auto.assign=FALSE)
chartSeries(sp500_df, name="S&P 500 since 1927", subset="last 98 years", theme=chartTheme("white"))

Adding technical analysis - last 6 months

chartSeries(sp500_df, name="S&P 500", subset="last 6 months", theme=chartTheme("white"))

addMACD()

addBBands() 

Saving data to csv file

write.csv(sp500_df, "sp500_1.csv", row.names = T)

Dataframe preview in RStudio:

The dates in the data frame are not imported correctly with this method - they are turned into ascending numbers (1,2,3,…). This is because quantmod::getSymbols() returns an xts object by default.

xts objects are based on zoo objects, which are a matrix with an ‘index’ attribute. The date we see when xts/zoo objects are printed in R is the index attribute, not row names (or record names).

The easiest way to fix it is ti write xts/zoo objects to a text file is with write.zoo(). It will automatically include the index in the first column of the file.

write.zoo(sp500_df, "sp500zoo.csv", sep = ",")

Descriptive statistics

The function summary() calculates the summary statistics for the financial data:

summary(apple_df)
     Index              AAPL.Open         AAPL.High          AAPL.Low      
 Min.   :2007-01-03   Min.   :  2.835   Min.   :  2.929   Min.   :  2.793  
 1st Qu.:2011-06-13   1st Qu.: 12.624   1st Qu.: 12.682   1st Qu.: 12.481  
 Median :2015-11-23   Median : 28.006   Median : 28.260   Median : 27.668  
 Mean   :2015-11-25   Mean   : 56.915   Mean   : 57.520   Mean   : 56.334  
 3rd Qu.:2020-05-07   3rd Qu.: 79.657   3rd Qu.: 80.438   3rd Qu.: 79.119  
 Max.   :2024-10-18   Max.   :236.480   Max.   :237.490   Max.   :234.010  
   AAPL.Close       AAPL.Volume        AAPL.Adjusted    
 Min.   :  2.793   Min.   :2.405e+07   Min.   :  2.358  
 1st Qu.: 12.585   1st Qu.:9.513e+07   1st Qu.: 10.628  
 Median : 28.017   Median :1.878e+08   Median : 25.457  
 Mean   : 56.953   Mean   :3.441e+08   Mean   : 55.060  
 3rd Qu.: 79.729   3rd Qu.:4.642e+08   3rd Qu.: 77.592  
 Max.   :235.000   Max.   :3.373e+09   Max.   :235.000  

Getting data for multiple stocks

setSymbolLookup(
  AAPL = 'yahoo',
  GOOGL = 'yahoo',
  TSLA = 'yahoo',
  META = 'yahoo',
  AMZN = 'yahoo',
  NVDA = 'yahoo',
  MSFT = 'yahoo',
  "^GSPC" = 'yahoo',
  "GC=F" = 'yahoo',
  "BTC / USD" = 'yahoo'
  )

With the symbol lookup set, we don’t have to specify the source of data in the following operations.