In the first part of the seminar you will perform a technical analysis of stock market (time series) data in R. To start, you will need to select several public companies (i.e. companies that trade on the major stock exchanges) to investigate. Yahoo Finance https://uk.finance.yahoo.com/ and Google Finance https://www.google.co.uk/finance may help you find some companies to examine. A few companies and their tickers are listed below:
Trading Ticker | Company Name |
---|---|
GE | General Electric |
PG | Procter & Gamble |
MSFT | Microsoft |
PFE | Pfizer |
AMD | Advanced Micro Devices |
AAPL | Apple |
DELL | Dell Computers |
GRPN | Groupon |
FB | Facebook |
CSCO | Cisco Systems |
INTC | Intel |
EZJ | easyJet |
BP | BP |
HSBA | HSBC |
RR | Rolls-Royce |
MKS | Marks and Spencer |
A trading ticker is a unique symbol used to identify a stock. You will need it to collect data on the company. To collect the data in R we will use the quantmod package https://www.quantmod.com/, which you will first need to install with install.packages("quantmod") and load with library(quantmod).
Now we can download data for the companies you are interested in using the following command:
getSymbols('AAPL', src="yahoo", from=as.Date("2018-01-01"), to=as.Date("2019-01-01"), return.class='ts')
This will download data for Apple (AAPL); you can modify the ticker name and dates above to download data for any company and time period. If you type the name of your stock (here AAPL) at the console, R should print a time series object.
You will see that six columns of information are returned:
Column | Description |
---|---|
AAPL.Open | The opening price of the stock on each day |
AAPL.High | The highest price of the stock on each day |
AAPL.Low | The lowest price of the stock on each day |
AAPL.Close | The closing price of the stock on each day |
AAPL.Volume | The number of shares bought/sold on each day |
AAPL.Adjusted | The closing price after adjustments for all applicable splits and dividend distributions |
You can visualise all this information using a candlestick chart:
candleChart(AAPL, up.col = "black", dn.col = "red", theme = "white")
A black candlestick indicates a day where the closing price was higher than the open (i.e. a gain) and a red candlestick indicates a day where the open was higher than the close (i.e. a loss). The candleChart function allows you to add a simple moving average with: addSMA(n = 10)
However, if we wanted to do this ourselves with the SMA function we could use the following code with the close data:
s10 <- SMA(AAPL[, "AAPL.Close"], n=10)
Then we can plot this with plot(AAPL[, "AAPL.Close"], main = "Apple Stock") and lines(s10, col=2). You should now see your closing stock data with a smoothed simple moving average line in red.
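To see what a simple moving average is doing, it can be reproduced in base R on a toy series. This is only a sketch: the vector price and the helper sma_manual are made up for illustration, and stats::filter is swapped in for the packaged SMA function.

```r
# Toy closing prices (illustrative values, not real market data)
price <- c(10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21)

# Trailing simple moving average: each value is the mean of the
# current observation and the (n - 1) observations before it.
# sma_manual is a hypothetical helper name.
sma_manual <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

s3 <- sma_manual(price, n = 3)
print(s3)  # the first n - 1 entries are NA: no full window yet
```

A larger n averages over more days, so the line becomes smoother but reacts more slowly to recent price changes.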
Similarly we can look at a 10-day weighted moving average with the following code:
w10 <- WMA(AAPL[, "AAPL.Close"], n=10, wts=(1:10)^10)
plot(AAPL[, "AAPL.Close"], main = "Apple Stock")
lines(w10, col=2)
The 10 weights used above are:
## [1] 1 1024 59049 1048576 9765625 60466176
## [7] 282475249 1073741824 3486784401 10000000000
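The raw weights are hard to read at this scale, so it can help to express them as shares of the total. The sketch below does this in base R and compares them with plain linear weights; the toy prices and variable names are illustrative, and it assumes the last weight is applied to the most recent day in the window.

```r
# Weights from the WMA call above, and linear weights for comparison
w_steep  <- (1:10)^10
w_linear <- 1:10

# Share of the total weight carried by each position in the window
share_steep  <- w_steep / sum(w_steep)
share_linear <- w_linear / sum(w_linear)
print(round(share_steep, 4))
print(round(share_linear, 4))

# Weighted mean of one 10-day window (toy prices, oldest first),
# assuming the last weight applies to the most recent day
window_prices <- c(50, 51, 49, 52, 53, 55, 54, 56, 58, 60)
wma_last <- sum(window_prices * w_steep) / sum(w_steep)
print(wma_last)
```

With weights this steep, roughly two thirds of the total sits on the final position, so the weighted average stays very close to the latest prices.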
Next, we are going to use Bollinger Bands. These are composed of three lines which can be used to indicate the direction and strength of a trend. The middle line is, by default, a simple moving average (set at 20 days), and the upper and lower bands lie a set number of standard deviations (two in the code that follows) above and below it. For more information on Bollinger Bands take a look at: https://www.fidelity.com/learning-center/trading-investing/technical-analysis/technical-indicator-guide/bollinger-bands
To start, replot the candlestick chart with:
candleChart(AAPL, up.col = "black", dn.col = "red", theme = "white")
Then we can add the Bollinger Bands with:
addBBands(sd = 2, maType = "SMA", draw = 'bands')
Does your share price ever touch the upper or lower band? If it touches the lower band it usually means selling activity will be high, and if it touches the upper band it usually means purchasing activity will be high. So with the example share price above, it is likely that at the most recent observation purchasing activity will be high.
The addBBands function also allows you to set the type of moving average you wish to use. To use a weighted moving average try: addBBands(sd = 2, maType = "WMA", draw = 'bands')
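If you want to check which days touch a band rather than reading them off the chart, the bands can be rebuilt by hand. The following is a minimal base R sketch on simulated prices, not the charting function's own code: the variable names are made up, and it uses the sample standard deviation from sd(), whereas a packaged implementation may use the population form.

```r
set.seed(42)
# Simulated closing prices (a random walk, not real market data)
close_px <- 100 + cumsum(rnorm(120))

n <- 20  # window length of the moving average
k <- 2   # number of standard deviations for the bands

mid <- rep(NA_real_, length(close_px))
dev <- rep(NA_real_, length(close_px))
for (i in n:length(close_px)) {
  win <- close_px[(i - n + 1):i]
  mid[i] <- mean(win)  # middle line: 20-day simple moving average
  dev[i] <- sd(win)    # sample SD (an assumption, see the text above)
}
upper <- mid + k * dev
lower <- mid - k * dev

# Days on which the close sits on or outside a band
outside <- which(close_px >= upper | close_px <= lower)
print(outside)
```

The bands widen when prices have been volatile over the last 20 days and narrow when they have been calm.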
Before starting this seminar, make sure you have successfully used all of the code provided in the lecture notes to apply Exponential Smoothing, ARIMA and Forecasting Accuracy to the iPhone sales data.
Now you will be working with three new datasets of varying format, duration and number of observations:
Name | Data format | No. of Observations | Duration | Source |
---|---|---|---|---|
Property sales in England and Wales | Monthly | 263 | Jan 1995 – Nov 2016 | Doogal |
Annual global carbon emissions from coal, oil and gas (in million tonnes of carbon per year) | Yearly | 56 | 1959 – 2014 | CDIAC |
Number of licensed (registered/sold) Toyota Prius cars in Great Britain | Quarterly | 29 | 2008 Q3 – 2015 Q3 | GOV.UK |
Download the datasets from Seminar Data on Minerva and load them into the global environment. Create the three time series. The first time series, named psts, will hold Total_Sales from the property sales data. The second, named ets, will hold the emissions from the second dataset. Finally, a time series named pts will hold Num from the third dataset.
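As a rough guide to the ts() calls involved, the sketch below builds three series with the start dates and frequencies given in the table. The vectors total_sales, emissions and num are placeholders invented here; in the seminar they would be the corresponding columns of the data you loaded.

```r
# Placeholder values standing in for the real columns (assumption:
# in the seminar these come from the downloaded datasets)
total_sales <- seq(50000, by = 100, length.out = 263)
emissions   <- seq(2400, by = 130, length.out = 56)
num         <- seq(1000, by = 500, length.out = 29)

psts <- ts(total_sales, start = c(1995, 1), frequency = 12)  # monthly
ets  <- ts(emissions,   start = 1959,       frequency = 1)   # yearly
pts  <- ts(num,         start = c(2008, 3), frequency = 4)   # quarterly

# Check that each series ends where the table says it should
print(end(psts))
print(end(ets))
print(end(pts))
```

Checking end() against the Duration column is a quick way to confirm that the start date and frequency were set correctly.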
Create the time series training and test datasets. We will be using the majority of the datasets as training data and the last 4 observations of each dataset as testing data to evaluate the accuracy of the forecast models you will create.
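One way to hold out the last four observations is with window(), which keeps the time series attributes intact. This is a sketch on a placeholder series; split_last is a hypothetical helper name, not a function from any package.

```r
# Placeholder monthly series with the same shape as psts
psts <- ts(seq_len(263), start = c(1995, 1), frequency = 12)

# Hypothetical helper: hold out the last h observations as test data
split_last <- function(x, h = 4) {
  tt <- time(x)
  n  <- length(x)
  list(
    train = window(x, end = tt[n - h]),
    test  = window(x, start = tt[n - h + 1])
  )
}

parts <- split_last(psts, h = 4)
print(length(parts$train))
print(start(parts$test))
```

The same helper can be applied to pts and ets, since it works from the series' own time index rather than hard-coded dates.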
Decompose psts (property sales) and pts (Prius cars) to help you better understand the datasets. The ets (emissions) dataset will not decompose as it is yearly data (frequency=1), which has no seasonality.
Use the simple forecasting techniques to forecast the psts, pts and ets datasets four periods ahead.
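The behaviour described here can be seen on simulated data with base R's decompose(). The series below are made up for illustration; only the start dates and frequencies mirror the seminar datasets.

```r
set.seed(1)
# Simulated quarterly series: trend + seasonal pattern + noise
quarterly <- ts(
  100 + 2 * (1:29) + rep(c(10, -5, 3, -8), length.out = 29) + rnorm(29),
  start = c(2008, 3), frequency = 4
)
parts <- decompose(quarterly)  # additive decomposition by default
print(parts$figure)            # one estimated seasonal effect per quarter

# The same call on yearly data fails, because frequency = 1
yearly <- ts(cumsum(rnorm(56)), start = 1959)
yearly_result <- tryCatch(decompose(yearly), error = function(e) "error")
print(yearly_result)
```

Calling plot(parts) would show the observed, trend, seasonal and random components in separate panels.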
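The simple techniques are taken here to mean the mean, naive, seasonal naive and drift methods; that reading is an assumption, so follow the lecture notes if they differ. The sketch below computes each one by hand on toy quarterly data so that the arithmetic is visible. The forecast package offers ready-made versions (meanf(), naive(), snaive() and rwf() with drift = TRUE).

```r
# Toy quarterly training data (illustrative values only)
train <- ts(c(10, 12, 14, 16, 11, 13, 15, 17, 12, 14, 16, 18),
            start = c(2010, 1), frequency = 4)
h <- 4                   # forecast horizon
n <- length(train)
m <- frequency(train)    # observations per seasonal cycle
y <- as.numeric(train)

# Mean method: average of all history
fc_mean <- rep(mean(y), h)
# Naive method: repeat the last observed value
fc_naive <- rep(y[n], h)
# Seasonal naive: value from the same season in the last full cycle
fc_snaive <- y[n - m + ((seq_len(h) - 1) %% m) + 1]
# Drift: last value plus the average historical change per period
fc_drift <- y[n] + seq_len(h) * (y[n] - y[1]) / (n - 1)

print(rbind(fc_mean, fc_naive, fc_snaive, fc_drift))
```

For yearly data such as ets the seasonal naive forecast collapses to the naive forecast, because there is only one observation per cycle.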
Use exponential smoothing to forecast the psts, pts and ets datasets.
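As a reminder of the mechanics, here is a minimal sketch of simple exponential smoothing using base R's HoltWinters() on a simulated yearly series. The lecture notes may use different functions, and the data here are not the real emissions figures.

```r
set.seed(7)
# Simulated yearly series (not the real emissions data)
yearly <- ts(2400 + cumsum(rnorm(56, mean = 100, sd = 60)), start = 1959)
train  <- window(yearly, end = 2010)
test   <- window(yearly, start = 2011)

# Simple exponential smoothing: level only, no trend or seasonal term
fit_ses <- HoltWinters(train, beta = FALSE, gamma = FALSE)
fc_ses  <- predict(fit_ses, n.ahead = 4)

print(fit_ses$alpha)  # estimated smoothing parameter
print(fc_ses)         # flat forecast for the four test years
```

Leaving beta at its default lets HoltWinters() estimate a trend term as well, which may suit a trending series better. For seasonal series such as psts and pts, leaving gamma at its default adds a seasonal component too.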
Use ARIMA to forecast the psts, pts and ets datasets. This will involve you selecting what you think is the best ARIMA model (you can also use auto.arima()).
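Selecting a model by hand could look something like the sketch below, which fits two candidate orders with base R's arima() and keeps the one with the lower AIC. The simulated series and the two candidate orders are arbitrary choices for illustration, not recommendations for the seminar datasets.

```r
set.seed(3)
# Simulated yearly series with a stochastic trend (not real data)
yearly <- ts(cumsum(rnorm(56, mean = 1)), start = 1959)
train  <- window(yearly, end = 2010)

# Two hand-picked candidates; both difference the series once
candidates <- list(
  "ARIMA(0,1,1)" = arima(train, order = c(0, 1, 1)),
  "ARIMA(1,1,0)" = arima(train, order = c(1, 1, 0))
)
aics <- sapply(candidates, AIC)
best <- names(which.min(aics))  # lower AIC is preferred

fc <- predict(candidates[[best]], n.ahead = 4)
print(aics)
print(fc$pred)  # point forecasts
print(fc$se)    # forecast standard errors
```

Looking at acf() and pacf() plots of the differenced series is a common way to shortlist candidate orders before comparing them like this.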
Evaluate your forecasts using the forecasting accuracy measures and the testing data. Which of exponential smoothing, ARIMA and the simple forecasting techniques performed best at forecasting each of the three datasets?
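The accuracy measures can also be computed by hand, which makes it clear what each one captures. The sketch below assumes the measures of interest are ME, RMSE, MAE and MAPE; the numbers and the helper name accuracy_measures are invented for illustration. The forecast package's accuracy() reports the same kind of measures.

```r
# Toy test values and forecasts (illustrative numbers only)
actual   <- c(100, 110, 120, 130)
forecast <- c(98, 112, 117, 135)

# Hypothetical helper returning common accuracy measures
accuracy_measures <- function(actual, forecast) {
  e <- actual - forecast  # forecast errors
  c(
    ME   = mean(e),                       # mean error (bias)
    RMSE = sqrt(mean(e^2)),               # root mean squared error
    MAE  = mean(abs(e)),                  # mean absolute error
    MAPE = 100 * mean(abs(e / actual))    # mean absolute percentage error
  )
}

acc <- accuracy_measures(actual, forecast)
print(round(acc, 3))
```

Running the helper once per model on the same four test observations gives a like-for-like comparison; MAPE is the most convenient for comparing across the three datasets because it does not depend on the units of each series.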