Introduction

Cotton is an important oil seed cultivated in India. With the introduction of seed protection technologies, the area under cultivation and yields have grown significantly. Because of India's diverse agro-climatic conditions, cotton harvests are spread across the year, and prices have historically fluctuated across markets. This calls for accurate price forecasts that minimize speculation and support informed policy decisions, such as setting a Minimum Support Price or maximum export prices. As a first step, let us explore cotton seed prices.

Data sourcing & analysis

Data acquisition: Data is sourced from the Open Government Data (OGD) platform, which collects data from the AGMARKNET Portal. The XML files for 2001 to 2015 were downloaded manually. The data was converted to tables, and aggregations were performed using the R XML package and base functions. The script is available as cotton-Seed prices.R on GitHub.

Visualization: Most of the visualizations were prepared using base R plotting functions and plotly.js. All the plots are interactive. Hover over a plot to see extra information. You can also select and deselect a state, zoom to a particular year, and so on. Click the Home button in the plot window to reset the axes. Have fun exploring historic cotton seed prices. All the code used to generate these plots is available as seedPrices.R

Data exploration

How the average seed price changed since 2002
Across Indian markets, cotton seed prices show an increasing trend between 2002 and 2015. The average seed price ranged from a minimum of 550 Rs, recorded in May 2002, to a maximum of 5033 Rs, recorded in Jan 2014. The yellow line shows a loess smoothing function applied to the series.
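
For readers who want to try loess smoothing on a price series, here is a minimal sketch using base R's loess() function. The series is simulated, and the column names and the span value are my assumptions for illustration; they are not taken from the original script.

```r
# Simulated monthly series: invented values, NOT the AGMARKNET data.
set.seed(1)
months <- seq(as.Date("2002-01-01"), as.Date("2015-12-01"), by = "month")
t_idx  <- seq_along(months)                       # numeric time index for loess
price  <- 600 + 25 * t_idx + rnorm(length(t_idx), sd = 150)
series <- data.frame(month = months, t = t_idx, meanPrice = price)

# Fit a local regression; span (assumed here) controls the amount of smoothing.
fit <- loess(meanPrice ~ t, data = series, span = 0.5)
series$smoothed <- predict(fit)                   # fitted value for each month

# Raw series as a line, smoothed trend drawn on top in yellow.
plot(series$month, series$meanPrice, type = "l",
     xlab = "Month", ylab = "Mean price (Rs)")
lines(series$month, series$smoothed, col = "gold", lwd = 2)
```

A larger span gives a flatter trend line, while a smaller span follows the month-to-month movements more closely.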

The plot above does not tell us anything about differences in mean prices across states over the years. To visualize these, I removed states with less than 5 years of data, aggregated across the markets within each state, and plotted the result. Hereafter, this data is used for all observations. The highest average price (6625 Rs) was realized in Karnataka in 2010, followed by MP and Punjab in 2014.
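
The filtering and aggregation step described above could look roughly like this in base R. This is only a sketch: the data frame, its column names (state, year, market, price) and all values are hypothetical, and the original script may do this differently.

```r
# Hypothetical market-level records; column names and prices are invented.
prices <- expand.grid(state = c("Karnataka", "Punjab"),
                      year = 2009:2014,
                      market = c("A", "B"),
                      stringsAsFactors = FALSE)
prices$price <- 2000 + 300 * (prices$year - 2009) +
  ifelse(prices$market == "A", 0, 100)
# A state with only two years of data, which should be dropped.
prices <- rbind(prices,
                data.frame(state = "Goa", year = 2013:2014,
                           market = "A", price = c(3000, 3100)))

# Count distinct years of data per state and keep states with at least 5.
years_per_state <- tapply(prices$year, prices$state,
                          function(y) length(unique(y)))
keep <- names(years_per_state)[years_per_state >= 5]
filtered <- prices[prices$state %in% keep, ]

# Average across markets within each state and year.
state_year <- aggregate(price ~ state + year, data = filtered, FUN = mean)
names(state_year)[3] <- "meanPrice"
print(head(state_year))
```

Counting distinct years, instead of counting rows, keeps a state with many markets but few years from slipping through the filter.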

The plot above is of little help when I want to know how much the average price varies across years within a state. Box plots come in handy for this, as shown in the plot below. It is clear that, except for Karnataka, there is not much variation in mean prices across years in any state.

However, this raises another question: how much does the average price vary within a month across years? Let us look at the plot below. The lowest variation in average price across states and years is observed in August, followed by June, July and September.

Mean seed price variation across months

Maximum price volatility

Volatility here is the difference between the minimum and maximum price recorded. It depends on several factors, such as the volumes arriving at the market and international market prices. I am not going to discuss them in this post.
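
As a small worked example of this definition, the sketch below computes volatility as maximum price minus minimum price for each record and then averages it by year. The data frame, the column names (minPrice, maxPrice) and the choice of averaging per year are assumptions for illustration, and the numbers are invented.

```r
# Hypothetical records with minimum and maximum prices; invented values.
daily <- data.frame(
  year     = c(2012, 2012, 2013, 2013, 2014, 2014),
  minPrice = c(3000, 3100, 3200, 3150, 3300, 3250),
  maxPrice = c(3300, 3350, 3700, 3600, 4000, 3900)
)

# Volatility as defined in this post: maximum price minus minimum price.
daily$volatility <- daily$maxPrice - daily$minPrice

# Average volatility per year (the per-year averaging is an assumption).
by_year <- aggregate(volatility ~ year, data = daily, FUN = mean)
print(by_year)
```

Replacing year with a month column in the formula would give the within-year view in the same way.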

By year
In the past thirteen years, the maximum seed price volatility (725 Rs) was observed in 2014, with a steady increase since 2010 except for a dip in 2012. Within a year, price volatility is high (~400 Rs) during March to May and low during July through September.

Modal price and deviations

Modal prices are reported for each market and each day. I am not sure how these are determined or by whom, so let us examine how they behave.

Price volatility around the modal price was minimal in the early years. Since 2010, however, deviations from the modal price have risen steadily. This could indicate changing dynamics between traders and producers.

Similarly, if we look at volatility across states, the minimum volatility is observed in NCT of Delhi, followed by Karnataka and Andhra Pradesh.

Conclusion:

It is evident that price volatility is distributed across markets and across years. In some states the variation between mean and maximum is high, whereas in others it is low. One thing I wanted to do, but could not for lack of readily available data, is to examine the correlation between price variation and arrival quantities. This information would be highly valuable because prices are greatly affected by demand-supply dynamics.