In this assignment we are going to analyze time-series data for equities in our portfolio, building on assignment #2. The app relies heavily on analytics, visualizations, and charts. Try to make the data as real-time as possible.
Daily returns (price today / price yesterday - 1) (Hint: Use Numpy for calculating this)
Daily price difference (price today - price yesterday)
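The two formulas above can be sketched with NumPy as follows. This is a minimal illustration: the `prices` array holds made-up closing prices rather than scraped data, and the variable names are assumptions.

```python
import numpy as np

# Illustrative closing prices, oldest first (not real market data).
prices = np.array([100.0, 102.0, 101.0, 105.0])

# Daily price difference: price today - price yesterday.
daily_diff = np.diff(prices)

# Daily return: price today / price yesterday - 1.
daily_return = prices[1:] / prices[:-1] - 1
```

Both results have one element fewer than `prices`, because the first day has no previous day to compare against.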
Use http://www.nasdaq.com/g00/quotes/historical-quotes.aspx for grabbing historical data
Using data from WeatherUnderground, we are going to explore the correlation between any equity in our portfolio and weather in New York City. You may wish to load this data up on start up to enable the charting of the equities graphs to be as fast as possible.
https://www.wunderground.com/history/
Hint - Use DataFrames functions like join() to help you pull the data together into one DataFrame. Remember to clean out non-trading days from the weather data.
The correlation (-1 to 1) can be displayed as a column on the P/L table
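One possible way to follow the join() hint is sketched below. The numbers are invented for illustration, and the column names `close` and `mean_temp_f` are assumptions. An inner join on the date index keeps only the dates present in both frames, which drops weekends and holidays from the weather data.

```python
import pandas as pd

# Closing prices exist only for trading days (illustrative values).
trading_days = pd.to_datetime(
    ["2017-11-20", "2017-11-21", "2017-11-22", "2017-11-24", "2017-11-27"]
)
prices = pd.DataFrame({"close": [1.30, 1.23, 1.25, 1.28, 1.31]}, index=trading_days)

# Weather exists for every calendar day, including non-trading days.
weather = pd.DataFrame(
    {"mean_temp_f": [45, 50, 48, 40, 47, 52, 49, 46]},
    index=pd.date_range("2017-11-20", "2017-11-27"),
)

# The inner join keeps only dates found in both frames.
combined = prices.join(weather, how="inner")

# Pearson correlation between closing price and temperature.
correlation = combined["close"].corr(combined["mean_temp_f"])
```

The single `correlation` value is what could be shown per equity in the P/L table.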
Employ a machine learning algorithm of your choice to predict stock prices for any of the equities in the portfolio. The prediction does not have to be correct (but if it is consistently, give me a call and let’s talk :) ) but should demonstrate how you can apply the algorithm. For example, you might add a column to the P/L table labeled “Predicted price” where you state what you think the next price is.
In your README file please make sure you clearly state what algorithm you use, why you chose that algorithm and what are some of the pros / cons you’ve learned of the algorithm while employing it in this trading system.
The last 100 intraday trades are scraped from the NASDAQ website.
OHLC data for the last 100 days is scraped from the Google Finance website.
The current price is scraped from Bloomberg Markets.
For correlation analysis, weather data is obtained from the Weather Underground website.
https://www.wunderground.com/history/airport/KNYC/2017/1/1/CustomHistory.html?dayend=27&monthend=11&yearend=2017&req_city=&req_state=&req_statename=&reqdb.zip=&reqdb.magic=&reqdb.wmo=
Application is developed using
The application is developed using Python data structures. Most of the web scraping is done using the requests and BeautifulSoup (bs4) packages. For persistence, data is stored in a SQLite database, which is accessed through the SQLAlchemy object-relational mapper (ORM). Web pages use the Bootstrap framework. Tables are exposed as classes.
Graphs on the website are generated with the Matplotlib package.
The application source code is uploaded to GitHub: https://github.com/akulapa/data602-assignment3. Currently, it is set up as a private repository (course requirement).
The Docker image is created with an empty SQLite database, traders.db. The image is uploaded to Docker Cloud and can be downloaded using the following command:
docker pull akulapa/ubuntu1604:data602-assignment3
After downloading, execute the following to create a Docker container:
docker run -v /etc/localtime:/etc/localtime:ro -p 8080:5000 akulapa/ubuntu1604:data602-assignment3
The -v option sets the local time inside the Docker container using the time from the host machine. The application runs on port 5000 inside the container, which is mapped to port 8080 on the host, so the website is available at:
http://localhost:8080
Once the Docker container is started, the complete symbol list is downloaded from the NASDAQ website the first time the website is accessed. The downloaded file is saved to the path below and then loaded into the tickerData table of the SQLite database. Symbols can also be refreshed at any time by clicking the Ticker List Update button at the top right.
/usr/src/data602-assignment3/app/instance/temp/tickerlist.csv
The home page of the application is displayed as follows. It gives information about the current portfolio.
All the metrics for a stock are found on a single page.
Available funds are displayed on each page.
The symbol can be keyed in manually or selected from the searchable drop-down. Pressing the Enter key or clicking the Search button takes the trader to the details page. The symbol is validated against the list downloaded from the NASDAQ website. If the symbol does not exist in the list, the following message is displayed.
Stock information,
The graph is a scatter plot displaying the number of shares sold at each price in the last 100 trades.
The graph shows a Gaussian distribution. It is generated with the hist and plot functions from the matplotlib package, with the linspace function from the numpy package used to smooth the curve. The graph also gives characteristics of the closing price for the last 100 days: the mean (\(\mu\)) and standard deviation (\(\sigma\)). They are calculated using the norm function from the scipy package.
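A rough sketch of how such a curve might be produced is shown below. It assumes `norm.fit` is the call used to estimate \(\mu\) and \(\sigma\) (the text only names the norm function), and it uses randomly generated prices in place of the 100 scraped closing prices. The plotting calls are left as comments so the snippet runs without a display.

```python
import numpy as np
from scipy.stats import norm

# Stand-in for the last 100 daily closing prices (synthetic, not scraped).
rng = np.random.default_rng(1)
closes = rng.normal(loc=25.0, scale=1.5, size=100)

# Maximum-likelihood estimates of the mean and standard deviation.
mu, sigma = norm.fit(closes)

# Evenly spaced x values give a smooth curve for the fitted density.
x = np.linspace(closes.min(), closes.max(), 200)
pdf = norm.pdf(x, mu, sigma)

# With matplotlib, the histogram and curve could be drawn like this:
# import matplotlib.pyplot as plt
# plt.hist(closes, bins=20, density=True)
# plt.plot(x, pdf)
```

Passing `density=True` to `hist` puts the histogram on the same scale as the fitted density, so the two can share one axis.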
The above graph is right-skewed because the daily traded volume is usually between 100,000 and 1,000,000 shares. There were a few instances where the volume was above 1,000,000, reaching close to 8,000,000.
The 100 Day menu has the following options
The OHLC option displays daily closing data. An additional column, Intraday Price Diff., calculated as High - Low, is added to the table.
Above table shows
This table displays closing price movement relative to the 5-day simple moving average (SMA). Volatility explains actual movement.
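As an illustration, a 5-day SMA and the closing price's distance from it could be computed with pandas as below. The prices and column names are made up, and the volatility column is omitted because its definition is not given here.

```python
import pandas as pd

# Illustrative daily closing prices, oldest first.
close = pd.Series([10.0, 10.5, 10.2, 10.8, 11.0, 10.9, 11.3], name="close")

# 5-day simple moving average; the first four rows have no full window.
sma_5 = close.rolling(window=5).mean()

table = pd.DataFrame(
    {
        "close": close,
        "sma_5": sma_5,
        # Positive when the close is above its 5-day average.
        "diff_from_sma": close - sma_5,
    }
)
```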
The Weather menu option shows today's weather conditions in New York City. The data is used to check whether there is any correlation between closing price and weather conditions.
The Pearson method is used to estimate correlation, computed with the corr function from the pandas package. The data displayed shows no strong correlation between weather conditions and daily trading. In the following heat map, \(-1\) (red) indicates an inverse relationship and \(1\) (green) indicates a strong positive correlation.
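For reference, the Pearson coefficient between closing prices \(x_i\) and a weather measurement \(y_i\) over \(n\) trading days is the standard formula below. The symbols \(x_i\), \(y_i\), and \(n\) are notation introduced here for illustration.

\[
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]

Here \(\bar{x}\) and \(\bar{y}\) are the sample means. A value of \(r\) near \(0\) means there is no linear relationship between the two series.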
To predict the closing price, I used the linear, poly, rbf, and sigmoid kernels of the SVM implementation in the sklearn package. I also used the LinearRegression class from the sklearn package.
The method assumes that the closing price on a given day could have been the closing price of a later day. For example, the 11/14/2017 closing price, \(\$1.47\), could have been the closing price of 11/21/2017, \(\$1.23\). I used a forward price of 5 trading days. The estimated closing price is shifted forward by 5 trading days, which leaves the last 5 trading days with no value. Estimating the closing price for those missing 5 days gives the price for the next 5 days. Using those 5 prices, I calculated the mean and standard deviation.
Two years' worth of data is used to predict the price. An 80:20 ratio is used to split the data into training and testing datasets. The last 5 days are used to predict the stock price.
My prediction is that the price should be close to the mean; if not, it should be within \(\pm2\) standard deviations of it.
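The shift-and-predict idea can be sketched as follows. This is a simplified illustration, not the application's code: it uses only LinearRegression (not the SVM kernels), takes the closing price as the single feature, replaces the two years of scraped data with a synthetic random walk, and uses a random shuffled 80:20 split. All of these are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

HORIZON = 5  # forward window in trading days

# Synthetic closing prices stand in for two years of scraped data.
rng = np.random.default_rng(0)
close = pd.Series(50 + np.cumsum(rng.normal(0, 0.5, 500)), name="close")

frame = close.to_frame()
# Target: the closing price HORIZON trading days later.
frame["target"] = frame["close"].shift(-HORIZON)

# The last HORIZON rows have no target; these are the rows to forecast.
to_forecast = frame[["close"]].tail(HORIZON)
labeled = frame.dropna()

# 80:20 split into training and testing data.
X_train, X_test, y_train, y_test = train_test_split(
    labeled[["close"]], labeled["target"], test_size=0.2, random_state=0
)
model = LinearRegression().fit(X_train, y_train)
r2 = model.score(X_test, y_test)  # R^2 on the held-out 20%

# Predicted closes for the next HORIZON trading days, with a +/-2 sigma band.
forecast = model.predict(to_forecast)
mean_price = forecast.mean()
std_price = forecast.std()
band = (mean_price - 2 * std_price, mean_price + 2 * std_price)
```

Swapping `LinearRegression()` for another scikit-learn regressor would leave the rest of the sketch unchanged, which makes it straightforward to compare models on the same split.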
The rest of the screen displays additional metrics.
Based on the analytical information provided on the page, the trader can either buy or sell a stock. To complete a transaction, the trader selects the type of transaction and the quantity, and then clicks the gavel button. Validation is done on the transaction selected and the quantity entered:
- cannot be negative
- has to be numeric
- cannot be greater than the current share balance when selling
- cannot exceed what the available funds can afford (quantity * current price <= available funds)
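The checks listed above could be expressed roughly as follows. The function name, its parameters, and the messages are hypothetical. The sketch also assumes that quantities are whole shares and that the funds check applies only to buy orders.

```python
def validate_order(side, quantity_text, shares_held, price, available_funds):
    """Return an error message, or None if the order passes every check."""
    # The quantity arrives as text from the form, so parse it first.
    try:
        quantity = int(quantity_text)
    except (TypeError, ValueError):
        return "Quantity has to be numeric."
    if quantity < 0:
        return "Quantity cannot be negative."
    # A sell order cannot exceed the shares currently held.
    if side == "sell" and quantity > shares_held:
        return "Quantity cannot be greater than the shares held."
    # A buy order must satisfy quantity * current price <= available funds.
    if side == "buy" and quantity * price > available_funds:
        return "Quantity exceeds what the available funds can afford."
    return None
```

For example, `validate_order("buy", "10", 0, 12.5, 100.0)` returns the funds message, because 10 shares at 12.5 cost more than the 100.0 available.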
If the submitted data fails validation, a message is shown on the screen as displayed below.
The P&L page displays the current position details alongside the correlation with weather conditions and the predicted price.
The Blotter page contains all the transactions.