A stock market is where buyers and sellers trade shares of companies, and it is one of the most popular ways for individuals and companies to invest money. The total value of the world's stock markets is now estimated to be in the trillions of dollars. The largest stock exchange in the world is the New York Stock Exchange (NYSE), located in New York City, where about 2,800 companies are listed. In this problem, we’ll look at the monthly stock prices of five of these companies: IBM, General Electric (GE), Procter & Gamble, Coca-Cola, and Boeing.
IBM <- read.csv("IBMStock.csv")
GE <- read.csv("GEStock.csv")
ProcterGamble <- read.csv("ProcterGambleStock.csv")
CocaCola <- read.csv("CocaColaStock.csv")
Boeing <- read.csv("BoeingStock.csv")
Each data frame has two variables, Date and StockPrice, as the structure of the IBM data frame shows:
str(IBM)
## 'data.frame': 480 obs. of 2 variables:
## $ Date : Factor w/ 480 levels "1/1/00","1/1/01",..: 11 171 211 251 291 331 371 411 451 51 ...
## $ StockPrice: num 360 347 327 320 270 ...
Before working with these data sets, we need to convert the dates into a format that R can understand.
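Right now, the Date variable is stored as a factor (in R 4.0 and later, read.csv reads it as a character vector by default; the conversion below works either way). We will convert it to a “Date” object with as.Date, using the format "%m/%d/%y" (month/day/two-digit year).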
IBM$Date = as.Date(IBM$Date, "%m/%d/%y")
GE$Date = as.Date(GE$Date, "%m/%d/%y")
CocaCola$Date = as.Date(CocaCola$Date, "%m/%d/%y")
ProcterGamble$Date = as.Date(ProcterGamble$Date, "%m/%d/%y")
Boeing$Date = as.Date(Boeing$Date, "%m/%d/%y")
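As a small aside on the format string: `%y` reads a two-digit year, and R maps 00 to 68 to the 2000s and 69 to 99 to the 1900s. The sketch below uses made-up date strings (they are not read from the CSV files) to show why data starting in 1970 parses as intended, while a two-digit year of 68 or lower would land in the 21st century.

```r
# Hypothetical date strings in the same month/day/two-digit-year layout
raw <- c("1/1/70", "12/1/09", "1/1/68")

# %m = month, %d = day, %y = two-digit year
parsed <- as.Date(raw, format = "%m/%d/%y")

# Two-digit years 69-99 become 19xx; 00-68 become 20xx
years <- format(parsed, "%Y")
print(years)
```

If a data set reached back before 1969, a four-digit year in the source file (parsed with `%Y`) would avoid this ambiguity.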
min(IBM$Date)
## [1] "1970-01-01"
max(IBM$Date)
## [1] "2009-12-01"
mean(IBM$StockPrice)
## [1] 144.375
We can see that the mean value of the IBM StockPrice is 144.38.
min(GE$StockPrice)
## [1] 9.293636
We can see that the minimum value of the GE StockPrice is 9.294.
max(CocaCola$StockPrice)
## [1] 146.5843
We can see that the maximum value of the Coca-Cola StockPrice is 146.58.
median(Boeing$StockPrice)
## [1] 44.8834
sd(ProcterGamble$StockPrice)
## [1] 18.19414
We can see that the standard deviation of the Procter & Gamble StockPrice is 18.19414.
mean(ProcterGamble$StockPrice)
## [1] 77.70452
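The median and the mean both measure central tendency, and for a symmetric distribution they are close together. Note that the median computed above is for Boeing, whereas the mean is for Procter & Gamble, so the two numbers cannot be compared directly. To check whether the Procter & Gamble prices are symmetric, compare mean(ProcterGamble$StockPrice) with median(ProcterGamble$StockPrice): a clear gap between them indicates a skewed distribution.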
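To illustrate how skew separates the mean from the median, here is a minimal sketch on made-up numbers. The vectors `symmetric` and `skewed` are hypothetical and are not taken from any of the five data sets.

```r
# Hypothetical prices, chosen only to illustrate the effect of skew
symmetric <- c(10, 20, 30, 40, 50)
skewed    <- c(10, 20, 30, 40, 200)

# Gap between mean and median for each vector
gap_symmetric <- mean(symmetric) - median(symmetric)  # 0: mean and median coincide
gap_skewed    <- mean(skewed) - median(skewed)        # 30: the large value pulls the mean up

print(c(gap_symmetric, gap_skewed))
```

The same two calls applied to a single StockPrice column give a quick, if rough, check of symmetry.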
plot(CocaCola$Date,CocaCola$StockPrice, type = "l")
plot(ProcterGamble$Date, ProcterGamble$StockPrice, type = "l")
plot(CocaCola$Date,CocaCola$StockPrice, type = "l", col="red")
lines(ProcterGamble$Date, ProcterGamble$StockPrice, col = "blue", lty=2)
abline(v = as.Date("2000-03-01"), lwd = 1)
abline(v = as.Date("1983-01-01"), lwd = 1)
Let’s take a look at how the stock prices changed from 1995-2005 for all five companies.
Note: The first command below plots the CocaCola stock prices from 1995 through 2005, which are the observations numbered 301 to 432. The additional argument, ylim=c(0,210), makes the y-axis range from 0 to 210, so that all of the stock prices remain visible when we add the other companies.
plot(CocaCola$Date[301:432], CocaCola$StockPrice[301:432], type = "l", col="red", ylim = c(0,210), xlab = "Year", ylab="Stock Price")
lines(ProcterGamble$Date[301:432], ProcterGamble$StockPrice[301:432], type = "l", col = "blue", lty=2)
lines(IBM$Date[301:432], IBM$StockPrice[301:432], type = "l", col = "orange", lty=3)
lines(GE$Date[301:432], GE$StockPrice[301:432], type = "l", col = "black", lty=4)
lines(Boeing$Date[301:432], Boeing$StockPrice[301:432], type = "l", col = "purple", lty=5)
abline(v=as.Date("2000-03-31"), lwd=0.5)
abline(v=as.Date("1995-01-01"), lwd=1)
abline(v=as.Date("2005-01-01"), lwd=1)
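The index range 301:432 does not have to be hard-coded; it can be derived from the dates themselves. The sketch below assumes a vector of 480 monthly dates like the Date columns described above. The vector `dates` is a hypothetical stand-in generated with seq, not the data read from the CSV files.

```r
# Hypothetical stand-in for a Date column: monthly dates, Jan 1970 to Dec 2009
dates <- seq(as.Date("1970-01-01"), as.Date("2009-12-01"), by = "month")

# Logical mask for January 1995 through December 2005
in_window <- dates >= as.Date("1995-01-01") & dates <= as.Date("2005-12-01")

# Positions of the selected observations
idx <- which(in_window)
print(range(idx))   # first and last selected position
print(length(idx))  # number of months in the window
```

A mask of this kind can be used directly for subsetting, for example `CocaCola$StockPrice[in_window]`, which keeps working even if rows are added to or removed from the data.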
plot(CocaCola$Date[301:432], CocaCola$StockPrice[301:432], type = "l", col="red", ylim = c(0,210), xlab = "Year", ylab="Stock Price")
lines(ProcterGamble$Date[301:432], ProcterGamble$StockPrice[301:432], type = "l", col = "blue", lty=2)
lines(IBM$Date[301:432], IBM$StockPrice[301:432], type = "l", col = "orange", lty=3)
lines(GE$Date[301:432], GE$StockPrice[301:432], type = "l", col = "black", lty=4)
lines(Boeing$Date[301:432], Boeing$StockPrice[301:432], type = "l", col = "purple", lty=5)
abline(v=as.Date("1997-09-01"), lwd=1)
abline(v=as.Date("1997-11-30"), lwd=1)
abline(v=as.Date("2004-01-01"), lwd=1)
abline(v=as.Date("2005-12-31"), lwd=1)
Lastly, let’s see if stocks tend to be higher or lower during certain months.
tapply(IBM$StockPrice, months(IBM$Date), mean)
##     April    August  December  February   January      July      June
##  152.1168  140.1455  140.7593  152.6940  150.2384  139.0670  139.0907
##     March       May  November   October September
##  152.4327  151.5022  138.0187  137.3466  139.0885
mean(IBM$StockPrice)
## [1] 144.375
The overall average stock price for IBM is 144.375.
Comparing the monthly averages to this, we can see that the price has historically been higher than average January - May, and lower than average during the remaining months.
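The comparison against the overall mean can also be done in code rather than by eye. The sketch below is a minimal illustration on made-up data (`dates` and `prices` are hypothetical, not the IBM series). It groups by month number with format(dates, "%m") instead of months(), because tapply orders groups alphabetically, which is why the output above starts with April.

```r
# Hypothetical toy data: two years of monthly prices (not from the CSV files)
dates  <- seq(as.Date("2000-01-01"), by = "month", length.out = 24)
prices <- rep(c(12, 12, 12, 12, 12, 8, 8, 8, 8, 8, 8, 8), times = 2)

# Group by two-digit month number so the result is in calendar order
monthly <- tapply(prices, format(dates, "%m"), mean)

# Months whose average exceeds the overall average
overall <- mean(prices)
above   <- names(monthly)[monthly > overall]
print(above)
```

In this toy series the months "01" through "05" come out above average, mirroring the January to May pattern described for IBM.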
tapply(GE$StockPrice, months(GE$Date), mean)
##     April    August  December  February   January      July      June
##  64.48009  56.50315  59.10217  62.52080  62.04511  56.73349  56.46844
##     March       May  November   October September
##  63.15055  60.87135  57.28879  56.23897  56.23913
mean(GE$StockPrice)
## [1] 59.3035
tapply(CocaCola$StockPrice, months(CocaCola$Date), mean)
##     April    August  December  February   January      July      June
##  62.68888  58.88014  59.73223  60.73475  60.36849  58.98346  60.81208
##     March       May  November   October September
##  62.07135  61.44358  59.10268  57.93887  57.60024
mean(CocaCola$StockPrice)
## [1] 60.02973
After seeing these trends, we might feel ready to buy stock in certain months and sell it in others! But we should be careful, because one really good or really bad year could skew a monthly average and suggest a trend that does not hold in general.
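One way to probe this caveat is to count, for each month, in how many years the price was above that year's own average, instead of relying on the monthly mean alone. The sketch below is a hedged illustration on made-up data: `dates` and `prices` are hypothetical, with a single artificial spike in January 2000.

```r
# Hypothetical toy data: three flat years with one spike in January 2000
dates  <- seq(as.Date("2000-01-01"), by = "month", length.out = 36)
prices <- rep(10, 36)
prices[1] <- 130

year  <- format(dates, "%Y")
month <- format(dates, "%m")

# Monthly mean across years: January looks far higher than the other months
monthly_mean <- tapply(prices, month, mean)

# Share of years in which each month was above that year's own average
yearly_mean <- ave(prices, year)          # each observation's own-year mean
share_above <- tapply(prices > yearly_mean, month, mean)

print(monthly_mean["01"])
print(share_above["01"])
```

Here the January mean is inflated by a single year, while the share of years with an above-average January is only one in three, so the apparent pattern is not consistent across years.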