Conclusions from week of 12th June

We have shown that including Google Trends data can influence the performance of an ARIMA-GARCH forecasting model used to predict the daily return on Apple stock. Including this information not only generates larger equity but also gives a faster run time than the equivalent ARIMA-GARCH model without Google Trends data.

Update from week of June 19th

Now that we have an ARIMA-GARCH model that works well with the Google Trends data and gives us reliable returns, the goals of this week are to adjust the risk tolerance, email the results, and schedule the script.

Adjusting the risk tolerance

The ARIMA-GARCH model predicts the expected return of the next day given a certain number of previous days. To remind ourselves: if the expected return is positive, the price of the asset is expected to increase; if the expected return is negative, the price is expected to decrease.

In the model developed last week, a positive expected return executed a buy/long condition and a negative one executed a sell/short condition; there were no hold conditions. This week I will add a threshold on the condition, such that we buy only if the expected return is greater than a given value and sell only if it is below a certain value. This can remove some of the uncertainty from the model, but may miss out on some profits. Overall it should add a degree of conservatism to the model.
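
The thresholded rule described above can be sketched as a small function. This is a minimal illustration, not the code from the project script: the function name `decide_action` and the parameter name `threshold` are hypothetical, and returns are assumed to be expressed as fractions (so 0.1% is 0.001).

```r
# Map an expected daily return to a trading action.
# `decide_action` and `threshold` are hypothetical names for illustration.
# Returns are assumed to be fractions, e.g. 0.001 means 0.1%.
decide_action <- function(expected_return, threshold = 0) {
  ifelse(expected_return > threshold, "buy",
         ifelse(expected_return < -threshold, "sell", "hold"))
}

# Three example predictions checked against a 0.1% threshold
decide_action(c(0.004, 0.0005, -0.003), threshold = 0.001)
```

With `threshold = 0` the function behaves like last week's rule, except that a prediction of exactly zero gives a hold.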

If we plot the marginal change in stock price we can see how the two versions of the model compare. One has no threshold, equivalent to the model developed last week, and the other has a threshold of 0.1% daily return. We again compare to a buy and hold strategy.

With a 0.1% threshold on the daily expected return, the performance of the two forms of the ARIMA-GARCH model differs. Both do worse at the beginning of May. After that, the less conservative model outperforms the conservative one: where the expected returns predicted by the model are smaller than 0.1%, the conservative model executes a hold action, which misses out on some profits gained by the less conservative model.

Overall the less conservative model outperformed the model with a 0.1% threshold on the expected return. The threshold can be configured depending on the risk tolerance of the user.
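
To make the comparison concrete, here is a rough sketch of how the three strategies could be lined up. The numbers are made up purely for illustration and are not the AAPL results discussed above. The helper names `position` and `equity` are hypothetical, each prediction is assumed to refer to the same day as the matching realized return, and trading costs are ignored.

```r
# Made-up numbers for illustration only; these are NOT the AAPL results.
predicted <- c(0.0040, 0.0005, -0.0030, 0.0008, -0.0002)  # model's expected returns
realized  <- c(0.0100, 0.0020, -0.0050, -0.0040, 0.0030)  # returns that followed

# Position taken each day: +1 long, -1 short, 0 hold (hypothetical helper).
position <- function(pred, threshold = 0) {
  ifelse(pred > threshold, 1, ifelse(pred < -threshold, -1, 0))
}

# Cumulative equity of one unit of capital, ignoring trading costs.
equity <- function(pos, ret) cumprod(1 + pos * ret)

curves <- data.frame(
  buy_and_hold   = equity(rep(1, length(realized)), realized),
  no_threshold   = equity(position(predicted, 0), realized),
  with_threshold = equity(position(predicted, 0.001), realized)
)
print(round(curves, 4))
```

On days where the prediction falls inside the threshold band the `with_threshold` curve stays flat, which is the "hold" behaviour that can miss both gains and losses.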

Emailing the results

Emailing the results to me turned out to be a little harder than I anticipated. The process involved setting up a Gmail API, which can send emails when connected to the R package gmailr, and having the R script read the API client ID and secret. The complete setup process can be found here.

library(gmailr)

# ind[1] is the predicted next-day return and bsh is the buy/sell threshold;
# both are defined earlier in the script.
# In gmailr 1.0 and later these functions are named gm_mime() and gm_send_message().
complete_email <- mime(
  To = "moocarme@gmail.com",
  From = "moocarme@gmail.com",
  Subject = paste(toString(Sys.Date()),
                  "Stock prediction finished"),
  body = paste0('The trading prediction has finished for ', toString(Sys.Date()),
                '. \n', 'The prediction for AAPL was ', toString(ind[1]),
                '. \n', 'You should ', ifelse(ind[1] > 0, 'buy', 'sell'),
                '. \n',
                'A more conservative model at a daily buy/sell rate of ',
                toString(bsh), ' percent would suggest you ',
                ifelse(ind[1] > bsh, 'buy.',
                       ifelse(ind[1] < (-bsh), 'sell.', 'hold.'))))
send_message(complete_email)

A screenshot of the email output from the above code is shown below. The dates, daily returns and actions are generalized so the same email body template can be used every day.

Email screenshot

Scheduling the script

Next, the script is scheduled so that it runs at the end of each day. Some modifications of the script were necessary, such as generalizing the dates so that the stock prices and Google Trends data are always the most recent.
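
One way to generalize the dates is to compute the download window relative to the day the script runs. The sketch below is only an assumption about how this could look: `date_window` and `lookback_days` are hypothetical names, and the post does not state the length of the window actually used.

```r
# Build a date window that always ends on the day the script runs.
# `date_window` and `lookback_days` are hypothetical; the real window
# length used in the project is not stated in this post.
date_window <- function(lookback_days = 365, today = Sys.Date()) {
  list(from = today - lookback_days, to = today)
}

win <- date_window()
# Formatted strings that could be passed to a data download function
format(win$from, "%Y-%m-%d")
format(win$to, "%Y-%m-%d")
```

Passing `today` as an argument keeps the function easy to check against a fixed date.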

The script was scheduled on a Linux machine using crontab. The following crontab entry was used to run the script.

0 19 * * 1-5 /usr/bin/R --vanilla --quiet < /home/Documents/DataScienceBootcamp/Project/project_23Jun16.R

The first two entries are the minute and the hour, in 24-hour format, so the script is scheduled to run at 7pm. The third and fourth entries are the day of the month and the month, where an asterisk means every value. The fifth entry is the day of the week, and ‘1-5’ means Monday through Friday. The last part of the command tells cron to run the file at the given location with R.

The program is set to run at 7pm, after the trading markets have closed.
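
As a cross-check of how the five fields are read, the schedule can be restated as a small R predicate. This is only an illustration of the field meanings, not part of the scheduled script, and `matches_schedule` is a hypothetical name.

```r
# Does a timestamp match the crontab schedule "0 19 * * 1-5"?
# `matches_schedule` is a hypothetical helper for illustration.
matches_schedule <- function(time) {
  lt <- as.POSIXlt(time)
  # Fields 3 and 4 are "*", so any day of the month and any month match.
  lt$min == 0 &&           # field 1: minute 0
    lt$hour == 19 &&       # field 2: hour 19, i.e. 7pm
    lt$wday %in% 1:5       # field 5: Monday (1) through Friday (5)
}

matches_schedule(as.POSIXct("2016-06-24 19:00:00", tz = "UTC"))  # a Friday
matches_schedule(as.POSIXct("2016-06-25 19:00:00", tz = "UTC"))  # a Saturday
```

Note that cron evaluates the schedule in the machine's local time zone, so the 7pm here is whatever the Linux machine considers 7pm.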

Conclusions from week of 19th June

This week has been focused on adjustments to the model and on building a complete data prediction product that needs little supervision. As it stands the model runs, has good predictive results (makes money), and the only action the user needs to take is provided in the email: whether to buy, sell or hold the asset, depending on their risk tolerance. Only two risk tolerances are provided here, but this could easily be expanded.

My next goal is to automate the process further by putting the script onto a Raspberry Pi, a small, cheap computer with 1 GB of RAM that runs on only 5W, so it won’t cost a lot to keep running. As long as the script takes under 24 hours to run it should be no problem. Run time could become an issue with many stocks and many predictive features, which will be another problem for the future.
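
A simple way to keep an eye on the 24-hour limit would be to time each run. The sketch below assumes the prediction is wrapped in a function; `run_prediction` is a hypothetical stand-in (here it just sleeps briefly) and not the real script.

```r
# Time one run and compare it to the 24-hour budget.
# `run_prediction` is a hypothetical stand-in for the real script body.
run_prediction <- function() Sys.sleep(0.1)

elapsed <- system.time(run_prediction())[["elapsed"]]  # wall-clock seconds
budget  <- 24 * 60 * 60                                # seconds in a day

message(sprintf("Run took %.1f s (%.4f%% of the daily budget)",
                elapsed, 100 * elapsed / budget))
```

Logging this figure each day would show how quickly the run time grows as stocks and features are added.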

This process could also be set up with trading software such as E*TRADE or Scottrade, such that the process is completely automated.