tidyquant pulls data directly from several different financial and economic sources into R and integrates the core financial packages zoo, xts, quantmod, TTR, and PerformanceAnalytics with the tidyverse syntax.
rm(list=ls()) clears all objects from the global environment.
gc() frees up memory by cleaning up unused objects; this is important if you’re working on more memory-intensive analysis.
rm(list=ls())
gc()
## used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
## Ncells 531578 28.4 1184037 63.3 NA 669514 35.8
## Vcells 980710 7.5 8388608 64.0 18432 1851810 14.2
library(tidyquant)
library(tidyverse)
library(DataExplorer)
tq_get() pulls web-based financial data from different sources. Some require an API key.
Yahoo Finance
stock.prices and stock.prices.japan: open, high, low, close, volume, and adjusted stock prices for a stock symbol
dividends: dividends for a stock symbol
splits: split ratio for a stock symbol
FRED
economic.data: economic data from FRED
The remaining sources returned by tq_get_options() require an API key and include crypto and other more specialized and exotic datasets:
tq_get_options()
## [1] "stock.prices" "stock.prices.japan" "dividends"
## [4] "splits" "economic.data" "quandl"
## [7] "quandl.datatable" "tiingo" "tiingo.iex"
## [10] "tiingo.crypto" "alphavantager" "alphavantage"
## [13] "rblpapi"
Pull economic data from FRED using the series code on the top-right of the chart:
tq_get(c("IMPCA", "IMPCH"), get="economic.data")
## # A tibble: 242 × 3
## symbol date price
## <chr> <date> <dbl>
## 1 IMPCA 2015-01-01 25844.
## 2 IMPCA 2015-02-01 23266.
## 3 IMPCA 2015-03-01 25778.
## 4 IMPCA 2015-04-01 24763.
## 5 IMPCA 2015-05-01 24283.
## 6 IMPCA 2015-06-01 27352.
## 7 IMPCA 2015-07-01 24718.
## 8 IMPCA 2015-08-01 24817.
## 9 IMPCA 2015-09-01 25289.
## 10 IMPCA 2015-10-01 23669.
## # ℹ 232 more rows
tq_get(c("IMPCA", "IMPCH"), get="economic.data") %>%
ggplot(aes(x=date, y=price, color=symbol))+
geom_line() +
ggtitle("U.S. Imports of Goods by Customs Basis from China and Canada") +
ylab("Millions of Dollars") +
scale_y_continuous(labels = scales::dollar_format(prefix="$", suffix = "M"))
open, high, low, and close: the opening, highest, lowest, and closing stock prices that day.
[volume:](https://www.investopedia.com/articles/technical/02/010702.asp#:~:text=Volume%20measures%20the%20number%20of,prices%20fall%20on%20increasing%20volume.) the number of shares traded that day.
adjusted stock price: While the closing price simply refers to the cost of shares at the end of the day, the adjusted closing price takes dividends, stock splits, and new stock offerings into account.
Jan. 1, 1980 is entered as the start date, but Apple went public on Dec. 12, 1980, so that is the earliest date pulled.
tq_get("AAPL", get = "stock.prices",
from = "1980-01-01")
## # A tibble: 11,149 × 8
## symbol date open high low close volume adjusted
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AAPL 1980-12-12 0.128 0.129 0.128 0.128 469033600 0.0987
## 2 AAPL 1980-12-15 0.122 0.122 0.122 0.122 175884800 0.0936
## 3 AAPL 1980-12-16 0.113 0.113 0.113 0.113 105728000 0.0867
## 4 AAPL 1980-12-17 0.116 0.116 0.116 0.116 86441600 0.0889
## 5 AAPL 1980-12-18 0.119 0.119 0.119 0.119 73449600 0.0914
## 6 AAPL 1980-12-19 0.126 0.127 0.126 0.126 48630400 0.0970
## 7 AAPL 1980-12-22 0.132 0.133 0.132 0.132 37363200 0.102
## 8 AAPL 1980-12-23 0.138 0.138 0.138 0.138 46950400 0.106
## 9 AAPL 1980-12-24 0.145 0.146 0.145 0.145 48003200 0.112
## 10 AAPL 1980-12-26 0.158 0.159 0.158 0.158 55574400 0.122
## # ℹ 11,139 more rows
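The gap between close and adjusted in the output above comes from this kind of back-adjustment. The following is a minimal sketch of a split adjustment only, using made-up prices and a hypothetical 2-for-1 split; the adjusted series from Yahoo Finance also accounts for dividends, so this is an illustration rather than the provider's actual method.

```r
# Made-up closing prices around a hypothetical 2-for-1 split (not real data)
close       <- c(100, 102, 51, 52)
split_ratio <- c(1, 1, 2, 1)   # the split takes effect on day 3

# Factor for each day = product of all split ratios that occur AFTER that day
future_factor <- rev(cumprod(rev(split_ratio))) / split_ratio

# Back-adjusted prices: earlier prices are scaled down to the post-split basis
adjusted <- close / future_factor

# Raw returns show a spurious -50% drop on the split day; adjusted returns do not
raw_returns      <- diff(close) / head(close, -1)
adjusted_returns <- diff(adjusted) / head(adjusted, -1)

print(adjusted)
print(round(raw_returns, 4))
print(round(adjusted_returns, 4))
```

This is why return calculations over long horizons are usually safer on the adjusted column than on close.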
stocks <- tq_get(c("NVDA", "AMZN", "META", "AAPL"),
get = "stock.prices",
from = "2024-01-01",
to = "2025-03-07")
head(stocks)
## # A tibble: 6 × 8
## symbol date open high low close volume adjusted
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 NVDA 2024-01-02 49.2 49.3 47.6 48.2 411254000 48.2
## 2 NVDA 2024-01-03 47.5 48.2 47.3 47.6 320896000 47.6
## 3 NVDA 2024-01-04 47.8 48.5 47.5 48.0 306535000 48.0
## 4 NVDA 2024-01-05 48.5 49.5 48.3 49.1 415039000 49.1
## 5 NVDA 2024-01-08 49.5 52.3 49.5 52.3 642510000 52.2
## 6 NVDA 2024-01-09 52.4 54.3 51.7 53.1 773100000 53.1
Plotting daily volume shows that Nvidia has a much higher trading volume than the other tech stocks.
stocks %>%
ggplot(aes(x=date, y=volume, color=symbol)) +
geom_line() +
theme_tq() +
#facet_wrap(~symbol, scales="free_y", ncol=2) +
scale_y_continuous(labels = scales::comma_format())
tq_exchange_options()
## [1] "AMEX" "NASDAQ" "NYSE"
tq_fund_source_options()
## [1] "SSGA"
tq_index_options()
## [1] "DOW" "DOWGLOBAL" "SP400" "SP500" "SP600"
nyse <- tq_exchange("NYSE")
## Getting data...
head(nyse)
## # A tibble: 6 × 7
## symbol company last.sale.price market.cap country ipo.year industry
## <chr> <chr> <dbl> <dbl> <chr> <int> <chr>
## 1 A "Agilent Technolo… 122. 3.48e10 "Unite… 1999 "Biotec…
## 2 AA "Alcoa Corporatio… 31.0 8.02e 9 "Unite… 2016 "Alumin…
## 3 AACT "Ares Acquisition… 11.1 0 "" 2023 "Blank …
## 4 AAM "AA Mission Acqui… 10.2 0 "" 2024 ""
## 5 AAMI "Acadian Asset Ma… 23.6 8.87e 8 "Unite… 2014 "Invest…
## 6 AAP "Advance Auto Par… 36.6 2.19e 9 "Unite… NA "Auto &…
tq_mutate adds columns to the existing dataframe.
tq_transmute works exactly like tq_mutate except it only
returns the newly created columns. This is helpful when changing the
periodicity in the data, such as from daily to quarterly returns, where
the new columns would not have the same number of rows.
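The difference between the two functions can be sketched on a small synthetic series, with no download needed. The prices below are made up, and the column names sma_5 and monthly_return are arbitrary choices for this illustration.

```r
library(tidyquant)
library(dplyr)

# Synthetic daily prices (made-up numbers, not real market data)
prices <- tibble(
  date  = seq(as.Date("2024-01-01"), by = "day", length.out = 90),
  close = 100 + cumsum(rep(c(1, -0.5, 0.75), 30))
)

# tq_mutate(): keeps every original row and column, adds a 5-day moving average
with_sma <- prices %>%
  tq_mutate(select = close, mutate_fun = SMA, n = 5, col_rename = "sma_5")

# tq_transmute(): returns only the date and the new column, at monthly periodicity
monthly <- prices %>%
  tq_transmute(select = close, mutate_fun = periodReturn,
               period = "monthly", col_rename = "monthly_return")

nrow(prices)    # daily rows
nrow(with_sma)  # same number of rows as prices
nrow(monthly)   # one row per month
```

Because the monthly result has fewer rows than the daily input, it cannot be bound back onto the original data frame, which is the case tq_transmute is meant for.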
tq_mutate_fun_options()
## $zoo
## [1] "rollapply" "rollapplyr" "rollmax"
## [4] "rollmax.default" "rollmaxr" "rollmean"
## [7] "rollmean.default" "rollmeanr" "rollmedian"
## [10] "rollmedian.default" "rollmedianr" "rollsum"
## [13] "rollsum.default" "rollsumr"
##
## $xts
## [1] "apply.daily" "apply.monthly" "apply.quarterly" "apply.weekly"
## [5] "apply.yearly" "diff.xts" "lag.xts" "period.apply"
## [9] "period.max" "period.min" "period.prod" "period.sum"
## [13] "periodicity" "to.daily" "to.hourly" "to.minutes"
## [17] "to.minutes10" "to.minutes15" "to.minutes3" "to.minutes30"
## [21] "to.minutes5" "to.monthly" "to.period" "to.quarterly"
## [25] "to.weekly" "to.yearly" "to_period"
##
## $quantmod
## [1] "allReturns" "annualReturn" "ClCl" "dailyReturn"
## [5] "Delt" "HiCl" "Lag" "LoCl"
## [9] "LoHi" "monthlyReturn" "Next" "OpCl"
## [13] "OpHi" "OpLo" "OpOp" "periodReturn"
## [17] "quarterlyReturn" "seriesAccel" "seriesDecel" "seriesDecr"
## [21] "seriesHi" "seriesIncr" "seriesLo" "weeklyReturn"
## [25] "yearlyReturn"
##
## $TTR
## [1] "adjRatios" "ADX" "ALMA"
## [4] "aroon" "ATR" "BBands"
## [7] "CCI" "chaikinAD" "chaikinVolatility"
## [10] "CLV" "CMF" "CMO"
## [13] "CTI" "DEMA" "DonchianChannel"
## [16] "DPO" "DVI" "EMA"
## [19] "EMV" "EVWMA" "GMMA"
## [22] "growth" "HMA" "keltnerChannels"
## [25] "KST" "lags" "MACD"
## [28] "MFI" "momentum" "OBV"
## [31] "PBands" "ROC" "rollSFM"
## [34] "RSI" "runCor" "runCov"
## [37] "runMAD" "runMax" "runMean"
## [40] "runMedian" "runMin" "runPercentRank"
## [43] "runSD" "runSum" "runVar"
## [46] "SAR" "SMA" "SMI"
## [49] "SNR" "stoch" "TDI"
## [52] "TRIX" "ultimateOscillator" "VHF"
## [55] "VMA" "volatility" "VWAP"
## [58] "VWMA" "wilderSum" "williamsAD"
## [61] "WMA" "WPR" "ZigZag"
## [64] "ZLEMA"
##
## $PerformanceAnalytics
## [1] "Return.annualized" "Return.annualized.excess"
## [3] "Return.clean" "Return.cumulative"
## [5] "Return.excess" "Return.Geltner"
## [7] "zerofill"
tq_get(c("NVDA", "META", "AAPL", "MSFT"),
get = "stock.prices",
from = "2024-01-01") %>%
group_by(symbol) %>%
tq_transmute(select = close,
mutate_fun = periodReturn,
period = "quarterly",
type = "arithmetic") %>%
ggplot(aes(x=date, y=quarterly.returns, fill=symbol)) +
geom_col(position = "dodge") +
# facet_wrap(~ symbol, ncol = 2) +
theme_tq() +
scale_fill_tq() +
scale_y_continuous(labels = scales::percent)
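The type = "arithmetic" argument used above requests simple returns; periodReturn also accepts type = "log". As a rough sketch of what the two definitions mean, here is the calculation in base R on made-up quarter-end prices (the vector close is hypothetical, not market data).

```r
# Made-up quarter-end closing prices (not real market data)
close <- c(100, 110, 99, 118.8)
n <- length(close)

# Arithmetic (simple) return: P_t / P_{t-1} - 1
arithmetic <- close[-1] / close[-n] - 1

# Log return: log(P_t) - log(P_{t-1})
log_ret <- diff(log(close))

# The two are linked by exp(log return) - 1 = arithmetic return
all.equal(exp(log_ret) - 1, arithmetic)

# Arithmetic returns compound across periods; log returns simply add up
total <- close[n] / close[1] - 1
all.equal(prod(1 + arithmetic) - 1, total)
all.equal(exp(sum(log_ret)) - 1, total)
```

Arithmetic returns are the more natural choice for a bar chart labeled in percent, while log returns are convenient when summing over time.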
geom_barchart() draws an open-high-low-close (OHLC) bar chart: it plots the open, high, low, and close of each day's stock price and color-codes the bars based on whether the day ended with the stock price up (blue) or down (red).
Source: HowToTrade.com
tidyquant also has geom_candlestick() for candlestick charts, geom_ma() for displaying moving averages, and geom_bbands() for the more complex Bollinger Bands.
tq_get("NVDA",
get = "stock.prices",
from = "2025-01-01",
to = "2025-03-07") %>%
ggplot(aes(x = date, y = close)) +
geom_barchart(aes(open = open, high = high, low = low, close = close), na.rm = TRUE) +
labs(title = "Nvidia Daily Prices", y = "Closing Price", x = "") +
theme_tq() +
# geom_ma() + # moving average
scale_y_continuous(labels = scales::dollar_format())
The correlation matrix uses the DataExplorer package.
cor <- tq_get(c("META", "NVDA", # Tech
"GLD", "PPLT", # gold and platinum
"AAL", # American airlines
"CVS", # healthcare
"XOM", "DBO"), # Exxon &
get = "stock.prices",
from = "2020-01-01") %>%
group_by(symbol) %>%
tq_transmute(select = close,
mutate_fun = periodReturn,
period = "monthly",
type = "arithmetic") %>%
select(date, symbol, return = monthly.returns) %>%
spread(symbol, return)
cor
## # A tibble: 63 × 9
## date AAL CVS DBO GLD META NVDA PPLT XOM
## <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2020-01-31 -0.0773 -0.0855 -0.151 0.0374 -0.0375 -0.0145 -0.0233 -0.124
## 2 2020-02-28 -0.290 -0.127 -0.106 -0.00636 -0.0468 0.142 -0.100 -0.172
## 3 2020-03-31 -0.360 0.00253 -0.245 -0.00222 -0.133 -0.0240 -0.163 -0.262
## 4 2020-04-30 -0.0148 0.0374 -0.0962 0.0726 0.227 0.109 0.0901 0.224
## 5 2020-05-29 -0.126 0.0653 0.173 0.0259 0.0996 0.215 0.0681 -0.0215
## 6 2020-06-30 0.245 -0.00915 0.0754 0.0274 0.00880 0.0701 -0.0113 -0.0165
## 7 2020-07-31 -0.149 -0.0312 0.0486 0.108 0.117 0.118 0.0907 -0.0590
## 8 2020-08-31 0.174 -0.0130 0.0491 -0.00324 0.156 0.260 0.0308 -0.0509
## 9 2020-09-30 -0.0582 -0.0599 -0.0650 -0.0417 -0.107 0.0117 -0.0444 -0.140
## 10 2020-10-30 -0.0822 -0.0396 -0.107 -0.00519 0.00462 -0.0736 -0.0521 -0.0498
## # ℹ 53 more rows
plot_correlation(na.omit(cor))
## Warning in dummify(data, maxcat = maxcat): Ignored all discrete features since
## `maxcat` set to 20 categories!
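For readers who want the numbers behind the heatmap, a correlation matrix can also be computed directly with base R. The sketch below uses a made-up wide table of returns with hypothetical symbols A, B, and C; it illustrates the idea and is not a reproduction of DataExplorer's internals.

```r
# Made-up monthly returns in wide format (one column per hypothetical symbol)
returns <- data.frame(
  date = as.Date(c("2024-01-31", "2024-02-29", "2024-03-29", "2024-04-30")),
  A = c(0.02, -0.01, 0.03, 0.00),
  B = c(0.04, -0.02, 0.06, 0.00),   # exactly 2 * A, so perfectly correlated with A
  C = c(-0.02, 0.01, -0.03, 0.00)   # exactly -A, so perfectly anti-correlated with A
)

# Keep only numeric columns; the date column is not part of the correlation
numeric_cols <- returns[, sapply(returns, is.numeric)]

# Pairwise Pearson correlations between the return series
cor_mat <- cor(numeric_cols, use = "pairwise.complete.obs")
round(cor_mat, 2)
```

Values near 1 mean two series tend to move together, values near -1 mean they move in opposite directions, and values near 0 mean little linear relationship.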