Every day we are bombarded with articles on the data explosion, vividly depicting its possibilities and potential. However, though data on almost any topic are available somewhere on the Web, they are not easy to find and access. Data are ubiquitous, but lost in the wilderness. The data relevant to us may be lying dormant somewhere in the depths of cyberspace without our being aware of them, or they may be stored in a proprietary format. So, despite the abundance of data, the reality seems to be: "data, data everywhere, nor any drop to use!" In this context, the data search engine Quandl, which provides access to millions of online datasets, assumes immense significance.

Quandl, which is being projected as the Google for data, hosts millions of time-series datasets from over 400 sources. The service provides an easy-to-use interface that helps you locate and download the relevant data. You don't need to worry about the format in which the data were originally published; Quandl takes care of that, and the data reach you in a standard format of your choice (such as CSV). Millions of financial and economic datasets from international organisations, central banks, stock exchanges, private-sector sources and academia can be accessed through this service.

To access data from Quandl, you can use either its 'Data Browser' feature or its search box. For instance, if you wish to obtain crude oil prices, just enter the phrase 'crude oil prices' in the search box and Quandl will list a few links relevant to your query. Along with the normal links, Quandl also lists a few links under the label 'Collections'. A collection is a set of data tables curated by real Quandl users. Click on the link relevant to you and Quandl will take you to the dataset's page (e.g., https://www.quandl.com/DOE/RWTC-WTI-Crude-Oil-Spot-Price-Cushing-OK-FOB).

Dataset code

In Quandl, every dataset has a unique identification code called the 'Quandl code' (this code is always visible in the upper-right corner of the dataset's page). The code consists of two components: the data source code and the table code. For instance, the code for the dataset mentioned above is DOE/RWTC. Here, DOE is the code for the 'US Department of Energy' and RWTC is the code for the data table.
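The two-part structure of a Quandl code can be illustrated with a few lines of base R. This is only a sketch: parse_quandl_code is a hypothetical helper written for this article, not a function of the Quandl package, and it assumes the code always has the form SOURCE/TABLE.

```r
# Split a Quandl code of the form "SOURCE/TABLE" into its two components.
# parse_quandl_code is a hypothetical helper, not part of the Quandl package.
parse_quandl_code <- function(code) {
  parts <- strsplit(code, "/", fixed = TRUE)[[1]]
  # Reject anything that is not exactly two non-empty pieces.
  if (length(parts) != 2 || any(parts == "")) {
    stop("Expected a code of the form SOURCE/TABLE, got: ", code)
  }
  list(source = parts[1], table = parts[2])
}

parsed <- parse_quandl_code("DOE/RWTC")
parsed$source  # the data source code, here "DOE"
parsed$table   # the table code, here "RWTC"
```

A check like this can be handy when codes are read from a file or typed by hand, since a malformed code is caught before any request is made.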

Quandl APIs

Besides offering tools for finding and accessing data via its Web interface, Quandl also lets you access all of its data programmatically through its API. For R users, Quandl offers a package for downloading data directly from within R. To get started with the package, install and load it using the following commands:

install.packages("Quandl")  # one-time installation from CRAN
library(Quandl)

If you plan to access more than 50 datasets, register with the service, obtain the (free) auth-token code and authenticate using the command displayed below:

Quandl.auth("yourauthenticationtoken")

Now, to access the data, you just need to know the Quandl code of the dataset. For example, if the code is DOE/RWTC, you can obtain the dataset and assign it to the object 'qdata' using the following command:

qdata=Quandl("DOE/RWTC")

The (default) frequency for the above time series data is ‘daily’ and you may find it quite unwieldy to handle; the option ‘collapse’ lets you change the frequency of the data. Take a look:

qdata=Quandl("DOE/RWTC",collapse="annual")
head(qdata)
##         Date Value
## 1 2015-12-31 46.79
## 2 2014-12-31 53.45
## 3 2013-12-31 98.17
## 4 2012-12-31 91.83
## 5 2011-12-31 98.83
## 6 2010-12-31 91.38
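
Note that the rows arrive with the most recent observation first. As a rough sketch of what one might do next, the snippet below rebuilds by hand the six rows printed above (so that it runs without a network connection), sorts them chronologically and adds a year-over-year percentage change. The column name PctChange is introduced here purely for illustration, and only the six rows shown by head() are used, not the full series.

```r
# Rebuild the six rows printed above by hand, so the sketch runs offline.
qdata <- data.frame(
  Date  = as.Date(c("2015-12-31", "2014-12-31", "2013-12-31",
                    "2012-12-31", "2011-12-31", "2010-12-31")),
  Value = c(46.79, 53.45, 98.17, 91.83, 98.83, 91.38)
)

# The printed rows are newest first; sort them oldest to newest.
qdata <- qdata[order(qdata$Date), ]

# Year-over-year percentage change; the first year has no predecessor.
# PctChange is an illustrative column name, not something Quandl returns.
qdata$PctChange <- c(NA, round(100 * diff(qdata$Value) / head(qdata$Value, -1), 2))

print(qdata)
```

Sorting first matters here: diff() compares each row with the one before it, so the changes are only meaningful once the rows are in chronological order.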