There are several R packages used to fetch public data from varied sources, Yahoo Finance, Google Finance, oanda, FRED, WDI, Quandl, datamarket, to name a few. Among them, pdfetch, quantmod, tseries, fImport, Quandl and rdatamarket are all very excellent.
Package pdfetch can retrieve data from Yahoo, FRED, world bank, US Bureau of Labor Statistics, the Bank of England interactive Statistical Database, European Central Bank’s statistical data warehouse, Eurostat, the French National Institute of Statistics and Economic Studies and UK Office of National Statistics.
Package quantmod can fetch data from Yahoo, Google, FRED and oanda.
Package WDI deals with data from the world bank (world development indicators). Package Quandl can fetch data from Quandl. Package rdatamarket is used to fetch data from datamarket
We need only data from Yahoo (shares, indices, financial statemengts etc.), FRED (US and world economic data), oanda (foreign exchages, precious metal prices), Quandl (commodities, futures etc.). We introduce the basic usage of quantmod and Quandl.
packageDescription('quantmod',
fields = c('Package', 'Version', 'Author', 'Description'))
## Package: quantmod
## Version: 0.4-0
## Author: Jeffrey A. Ryan
## Description: Specify, build, trade, and analyse quantitative
## financial trading strategies
##
## -- File: /home/rho/RLib/quantmod/Meta/package.rds
## -- Fields read: Package, Version, Author, Description
packageDescription('Quandl',
fields = c('Package', 'Version', 'Author', 'Description'))
## Package: Quandl
## Version: 2.3.2
## Author: Raymond McTaggart, Gergely Daroczi
## Description: This package interacts directly with the Quandl API
## to offer data in a number of formats usable in R, as well
## as the ability to upload and search.
##
## -- File: /home/rho/RLib/Quandl/Meta/package.rds
## -- Fields read: Package, Version, Author, Description
packageDescription('pdfetch',
fields = c('Package', 'Version', 'Author', 'Description'))
## Package: pdfetch
## Version: 0.1.5
## Author: Abiel Reinhart <abielr@gmail.com>
## Description: A package for downloading economic and financial time
## series from public sources.
##
## -- File: /home/rho/RLib/pdfetch/Meta/package.rds
## -- Fields read: Package, Version, Author, Description
Tabe 1 : Frequently Used ticker symbols in YAHOO
name | ticker symbol | name | ticker Symbol |
---|---|---|---|
上证综指 | 000001.ss or ^SSEC | 恒生指数 | ^HSI |
上证A指 | 000002.ss | Taiwan weighted | ^TWII |
上证B指 | 000003.ss | NIKKEI225 | ^N225 |
深证成指 | 399001.sz | Seoul composite | ^KSI |
深成指A | 399002.sz | straights times | ^STI |
深成指B | 399003.sz | NASDAQ | ^IXIC |
深证综指 | 399106.sz | DJIA | ^DJI |
深综指A | 399107.sz | SP500 | ^GSPC |
深综指B | 399108.sz | RUSSELL 1000 | ^RUI |
中小板指 | 399005.sz | Wilshire Total Market | ^W5000 |
沪深300 | 000300.ss | FTSE 100 | ^FTSE |
We will retrieve the historical dividends of “Hershey Foods” (HSY, 好时) and “Industrial and Commercial Bank of China” (601398.ss). Then we will get the historical trading prices of “Yinxing Energy” (000862.sz).
require('quantmod')
div.HSY <- getDividends('HSY', from = '2000-01-01', to = '2014-08-20')
div.ICBC <- getDividends('601398.ss', from = '2007-01-01', to = '2014-08-20')
# HSY pays its dividends quarterly, ICBC pays per annual
index.HSY <- endpoints(div.HSY, on = 'years')
ydiv.HSY <- period.sum(div.HSY, INDEX = index.HSY) # annual dividends
yinxing <- getSymbols('000862.sz', auto.assign = FALSE, from = '2010-10-10')
# number cannot used as object name, we set auto.assign as FALSE (default is TRUE), # or we can
setSymbolLookup(YINXING = list(name = '000862.sz'))
getSymbols('YINXING', from = '2010-10-10')
## [1] "YINXING"
Function getSplits() download split data from Yahoo, getQuote() fetch current stock quotes, getOptionChain() download option chain data.
getQuote('HSY; MSFT')
As a demonstration, we use pdfetch package to download historical trading data of “Yinxing Energy”.
require('pdfetch')
Yinxing.pdfetch <- pdfetch_YAHOO(identifiers = '000862.sz', fields = 'adjclose',
from = '2010-10-10')
Using Google Finance, we can retrieve financial statements of listed companies. But as you know, we can not link google freely. So you have to use some climbing over wall techniques to use this function.
getFinancials('HSY')
viewFinancials(HSY, type = 'BS', period = 'A')
# 'BS' for balance sheet, 'IS' for income statement, 'CF' for cash flow statement
# 'A' for annual, 'Q' for quarterly
We can retrieve foreign exchanges or precious metal data from oanda. Beware that there is a limit of 500 days per request.
getSymbols('USD/CNY', src = 'oanda', from = '2013-10-10')
## [1] "USDCNY"
# auto.assign = TRUE, or just getFX()
getMetals(Metals = 'gold', base.currency = 'CNY', from = '2013-10-10')
## [1] "XAUCNY"
Metals can be expressed in common name or symbol form.
Table 2 : 3-character symbols of precious metals
name | symbol | name | symbol |
---|---|---|---|
gold | XAU | silver | XAG |
palladium | XPD | platinum | XPT |
getSymbols('GDPC96', src = 'FRED') # us real gdp
## [1] "GDPC96"
getSymbols('AAA', src = 'FRED')
## [1] "AAA"
# monthly Moody's seasoned AAA corporate bond yield
# 'DAAA' for daily, 'WAAA' for weekly'
FRED provides both metadata and time series data for all of its economic data. We can get the metadata in this way.
base <- 'http://research.stlouisfed.org/fred2/series/'
seriesID <- 'GDPC96'
URL <- paste0(base, seriesID, '/downloaddata/', seriesID, '.xls')
tmp <- tempfile()
download.file(url = URL, destfile = tmp)
require('gdata')
gdpmetadata <- read.xls(tmp, nrows = 11, header = FALSE)
unlink(tmp)
require('xtable')
print(xtable(gdpmetadata, caption = 'Table 3: FRED metadata for GDPC96',
align = rep('l', 3)),
type = 'html', caption.placement = 'top', include.rownames = FALSE,
include.colnames = FALSE)
Title: | Real Gross Domestic Product, 3 Decimal |
Series ID: | GDPC96 |
Source: | U.S. Department of Commerce: Bureau of Economic Analysis |
Release: | Gross Domestic Product |
Seasonal Adjustment: | Seasonally Adjusted Annual Rate |
Frequency: | Quarterly |
Units: | Billions of Chained 2009 Dollars |
Date Range: | 1947-01-01 to 2014-04-01 |
Last Updated: | 2014-07-30 10:16 AM CDT |
Notes: | A Guide to the National Income and Product Accounts of the United |
States (NIPA) - (http://www.bea.gov/national/pdf/nipaguid.pdf) |
Package WDI (world development indicators) used to fetch data from WDI. Here we want to fetch annual data of the ratio of listed companies’ market capitalization to GDP from year 2000 to 2012 in China, USA, GB, JP and IN.
require('WDI')
WDIsearch(string = 'market capitalization')
## indicator
## [1,] "CM.MKT.LCAP.CD"
## [2,] "CM.MKT.LCAP.GD.ZS"
## name
## [1,] "Market capitalization of listed companies (current US$)"
## [2,] "Market capitalization of listed companies (% of GDP)"
Now we know the series ID is “CM.MKT.LCAP.GD.ZS”. We can also search series IdD in < http://data.worldbank.org/indicator>.
mc <- WDI(indicator = 'CM.MKT.LCAP.GD.ZS',
country = c('CN', 'US', 'GB', 'JP', 'IN'),
start = 2000, end = 2012)
mc[mc$iso2c == 'CN', ]
## iso2c country CM.MKT.LCAP.GD.ZS year
## 1 CN China 44.93 2012
## 2 CN China 46.29 2011
## 3 CN China 80.31 2010
## 4 CN China 100.35 2009
## 5 CN China 61.78 2008
## 6 CN China 178.20 2007
## 7 CN China 89.43 2006
## 8 CN China 34.59 2005
## 9 CN China 33.12 2004
## 10 CN China 41.51 2003
## 11 CN China 31.85 2002
## 12 CN China 39.55 2001
## 13 CN China 48.48 2000
Table 4: Frequently Used Indicators (WDI)
indicator | name |
---|---|
AG.LND.ARBL.HA.PC | arable land (hectares per person) |
AG.LND.ARBL.ZS | arable land (% of land area) |
AG.LND.FRST.ZS | forest area (% of land area) |
SP.RUR.TOTL.ZS | rural population (% of total population) |
EN.ATM.CO2E.PC | CO2 emissions (metric tons per capita) |
EN.ATM.CO2E.PP.GD | CO2 emissions (kg per PPP $ of GDP) |
NY.GDP.MKTP.KD | GDP (constant 2000 US$) |
NY.GDP.PCAP.KD | GDP per capita (constant 2000 US$) |
NY.GDP.DEFL.KD.ZG | Inflation, GDP deflator (annual %) |
NY.GDP.MKTP.CD.XD | GDP deflator, index (2000=100; US$ series) |
GC.TAX.TOTL.GD.ZS | tax revenue (% of GDP) |
SE.XPD.TOTL.GD.ZS | Public spending on education, total (% of GDP) |
SL.UEM.TOTL.ZS | unemployment, total (% of total labor force) |
CM.MKT.LCAP.GD.ZS | market capitalization of listed companies (% of GDP) |
CM.MKT.TRNR | stocks traded, turnover ratio (%) |
FR.INR.LNDP | interest rate spread (lending rate minus deposit rate) |
SP.POP.65UP.TO.ZS | population ages 65 and above (% of total) |
IT.NET.USER.P3 | internet users (per 1,000 people) |
IS.VEH.NVEH.P3 | motor vehicles (per 1,000 people) |
SI.DST.10TH.10 | income share held by highest 10% |
SI.DST.FRST.10 | Income share held by lowest 10% |
MS.MIL.XPND.GD.ZS | ilitary expenditure (% of GDP) |
GB.XPD.RSDV.GD.ZS | research and development expenditure (% of GDP) |
TX.VAL.TECH.MF.ZS | high-technology exports (% of manufactured exports) |
arable: adj. (of farmland) capable of being farmed productively
hectare: noun, (abbreviated `ha’) equal to 100 ares (or 10000 square meters)
You should get a free account first. Click your account to get the authentication token.
Here we will get historical data of China crude oil consumption.
require('Quandl')
Quandl.auth('yourAuthenticationToken')
Quandl.search(query = 'China crude oil consumption')
# we get code: INDEXMUNDI/ENERGY_CHINA_CRUDEOIL
oilCons <- Quandl('INDEXMUNDI/ENERGY_CHINA_CRUDEOIL', type = 'xts',
start_date = '2005-01-01', end_date = '2013-12-31')
We can also upload our own data to Quandl with Quandl.push() function.