Background Information on the Dataset

A stock market is where buyers and sellers trade shares of a company, and it is one of the most popular ways for individuals and companies to invest money. The size of the world stock market is now estimated to be in the trillions of dollars. The largest stock market in the world is the New York Stock Exchange (NYSE), located in New York City. About 2,800 companies are listed on the NYSE. In this problem, we’ll look at the monthly stock prices of five of these companies: IBM, General Electric (GE), Procter & Gamble, Coca-Cola, and Boeing. The data used in this problem comes from Infochimps.

Download and read the following files into R, using the read.csv function: IBMStock.csv, GEStock.csv, ProcterGambleStock.csv, CocaColaStock.csv, and BoeingStock.csv.

In this problem, we’ll take a look at how the stock dynamics of these companies have changed over time.

R Exercises

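After the files are read in, the Date variable is stored as a factor (in R 4.0 and later, as a character string, because read.csv no longer converts strings to factors by default). The commands below read each file and then convert Date to a “Date” object in R, with one as.Date call for each data set: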

IBM = read.csv("IBMStock.csv")
GE = read.csv("GEStock.csv")
ProcterGamble = read.csv("ProcterGambleStock.csv")
CocaCola = read.csv("CocaColaStock.csv")
Boeing = read.csv("BoeingStock.csv")

IBM$Date = as.Date(IBM$Date, "%m/%d/%y")

GE$Date = as.Date(GE$Date, "%m/%d/%y")

CocaCola$Date = as.Date(CocaCola$Date, "%m/%d/%y")

ProcterGamble$Date = as.Date(ProcterGamble$Date, "%m/%d/%y")

Boeing$Date = as.Date(Boeing$Date, "%m/%d/%y")
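
The format string tells as.Date how to read each field: %m is the month, %d is the day, and %y is a two-digit year. A minimal sketch of the conversion, using a few made-up date strings in that layout (they are illustrative and not read from the CSV files):

```r
# Hypothetical date strings in month/day/two-digit-year layout
raw_dates <- c("1/1/70", "2/1/70", "12/1/09")

# %m = month, %d = day, %y = two-digit year
parsed <- as.Date(raw_dates, format = "%m/%d/%y")

class(parsed)    # the result is a Date vector, not character
format(parsed)   # ISO year-month-day strings
```

On input, R reads two-digit years 69 to 99 as 19xx and 00 to 68 as 20xx, which is why 70 becomes 1970 and 09 becomes 2009.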

How many observations are there in each data set?

str(IBM)
## 'data.frame':    480 obs. of  2 variables:
##  $ Date      : Date, format: "1970-01-01" "1970-02-01" "1970-03-01" "1970-04-01" ...
##  $ StockPrice: num  360 347 327 320 270 ...

Explanation: Using the str function, we can see that each data set has 480 observations. We have monthly data for 40 years, so there are 12*40 = 480 observations.
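
The 12*40 arithmetic can be checked without the CSV files. The sketch below builds a synthetic monthly date sequence over the same span and counts its elements; the names months_seq and stocks are made up for illustration.

```r
# Synthetic monthly dates from January 1970 through December 2009
# (a stand-in for the Date column, not the real data)
months_seq <- seq(as.Date("1970-01-01"), as.Date("2009-12-01"), by = "month")

# Number of months in the span
n_obs <- length(months_seq)
n_obs

# nrow() reports the same count for a data frame with one row per month
stocks <- data.frame(Date = months_seq)
nrow(stocks)
```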

What is the earliest year in our datasets?

summary(IBM)
##       Date              StockPrice    
##  Min.   :1970-01-01   Min.   : 43.40  
##  1st Qu.:1979-12-24   1st Qu.: 88.34  
##  Median :1989-12-16   Median :112.11  
##  Mean   :1989-12-15   Mean   :144.38  
##  3rd Qu.:1999-12-08   3rd Qu.:165.41  
##  Max.   :2009-12-01   Max.   :438.90

Explanation: Using the summary function, the minimum value of the Date variable is January 1, 1970 for any dataset, so the earliest year is 1970.

What is the latest year in our datasets?

summary(IBM)
##       Date              StockPrice    
##  Min.   :1970-01-01   Min.   : 43.40  
##  1st Qu.:1979-12-24   1st Qu.: 88.34  
##  Median :1989-12-16   Median :112.11  
##  Mean   :1989-12-15   Mean   :144.38  
##  3rd Qu.:1999-12-08   3rd Qu.:165.41  
##  Max.   :2009-12-01   Max.   :438.90

Explanation: Using the summary function, the maximum value of the Date variable is December 1, 2009 for any dataset, so the latest year is 2009.
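
As an alternative to reading the dates off the summary output, range() returns the earliest and latest dates directly, and format() with "%Y" extracts the year. This is a sketch on a synthetic date vector standing in for a Date column:

```r
# Synthetic monthly dates standing in for a Date column (not the real data)
dates <- seq(as.Date("1970-01-01"), as.Date("2009-12-01"), by = "month")

# Earliest and latest dates in one call
date_range <- range(dates)

# Keep only the year part of each date
years <- as.integer(format(date_range, "%Y"))
years
```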

What is the mean stock price of IBM over this time period?

summary(IBM)
##       Date              StockPrice    
##  Min.   :1970-01-01   Min.   : 43.40  
##  1st Qu.:1979-12-24   1st Qu.: 88.34  
##  Median :1989-12-16   Median :112.11  
##  Mean   :1989-12-15   Mean   :144.38  
##  3rd Qu.:1999-12-08   3rd Qu.:165.41  
##  Max.   :2009-12-01   Max.   :438.90

Explanation: By typing summary(IBM), we can see that the mean value of the IBM StockPrice is 144.38.

What is the minimum stock price of General Electric (GE) over this time period?

summary(GE)
##       Date              StockPrice     
##  Min.   :1970-01-01   Min.   :  9.294  
##  1st Qu.:1979-12-24   1st Qu.: 44.214  
##  Median :1989-12-16   Median : 55.812  
##  Mean   :1989-12-15   Mean   : 59.303  
##  3rd Qu.:1999-12-08   3rd Qu.: 72.226  
##  Max.   :2009-12-01   Max.   :156.844

Explanation: By typing summary(GE), we can see that the minimum value of the GE StockPrice is 9.294.

What is the maximum stock price of Coca-Cola over this time period?

summary(CocaCola)
##       Date              StockPrice    
##  Min.   :1970-01-01   Min.   : 30.06  
##  1st Qu.:1979-12-24   1st Qu.: 42.76  
##  Median :1989-12-16   Median : 51.44  
##  Mean   :1989-12-15   Mean   : 60.03  
##  3rd Qu.:1999-12-08   3rd Qu.: 69.62  
##  Max.   :2009-12-01   Max.   :146.58

Explanation: By typing summary(CocaCola), we can see that the maximum value of the Coca-Cola StockPrice is 146.58.

What is the median stock price of Boeing over this time period?

summary(Boeing)
##       Date              StockPrice    
##  Min.   :1970-01-01   Min.   : 12.74  
##  1st Qu.:1979-12-24   1st Qu.: 34.64  
##  Median :1989-12-16   Median : 44.88  
##  Mean   :1989-12-15   Mean   : 46.59  
##  3rd Qu.:1999-12-08   3rd Qu.: 57.21  
##  Max.   :2009-12-01   Max.   :107.28

Explanation: By typing summary(Boeing), we can see that the median value of the Boeing StockPrice is 44.88.

What is the standard deviation of the stock price of Procter & Gamble over this time period?

sd(ProcterGamble$StockPrice)
## [1] 18.19414

Explanation: By typing sd(ProcterGamble$StockPrice), we can see that the standard deviation of the Procter & Gamble StockPrice is 18.19414.
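
Each statistic reported by summary can also be computed on its own with mean, median, min, and max, in the same way sd is used above. A small sketch on made-up prices (the vector price is hypothetical and not taken from any of the data sets):

```r
# Hypothetical prices, not values from the data sets
price <- c(10, 20, 30, 40)

# One function per statistic, collected into a named vector
stats <- c(
  mean   = mean(price),
  median = median(price),
  min    = min(price),
  max    = max(price),
  sd     = sd(price)   # sample standard deviation (divides by n - 1)
)
stats
```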

Visualizing Stock Dynamics

plot(CocaCola$Date, CocaCola$StockPrice, type="l")

Around what year did Coca-Cola have its highest stock price in this time period?

Around what year did Coca-Cola have its lowest stock price in this time period?
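
A reading taken from the plot can be cross-checked with which.max and which.min, which return the row index of the largest and smallest value. The sketch below uses a tiny made-up data frame (toy) with the same two column names; the same indexing would apply to CocaCola, assuming it has Date and StockPrice columns as shown earlier.

```r
# Hypothetical data frame with the same column names as the stock data
toy <- data.frame(
  Date       = as.Date(c("1970-01-01", "1970-02-01", "1970-03-01", "1970-04-01")),
  StockPrice = c(50, 80, 30, 60)
)

# Row of the highest price, then the date on that row
peak_date <- toy$Date[which.max(toy$StockPrice)]

# Row of the lowest price, then the date on that row
trough_date <- toy$Date[which.min(toy$StockPrice)]

format(peak_date, "%Y-%m")
format(trough_date, "%Y-%m")
```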

In March of 2000, the technology bubble burst, and a stock market crash occurred. According to this plot, which company’s stock dropped more?

plot(CocaCola$Date, CocaCola$StockPrice, type="l", col="red")
lines(ProcterGamble$Date, ProcterGamble$StockPrice, col="blue")
abline(v=as.Date(c("2000-03-01")), lwd=2)
legend("bottomleft",
       legend=c("Coca-Cola", "Procter & Gamble"),
       col=c("red", "blue"), lty=1, cex=0.8)

Procter and Gamble.

Around 1983, the stock for one of these companies (Coca-Cola or Procter and Gamble) was going up, while the other was going down. Which one was going up?

Coca-Cola.

In the time period shown in the plot, which stock generally has lower values?

Coca-Cola.
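
The "generally lower" comparison can also be expressed as the share of months in which one series sits below the other. A rough sketch with two made-up price vectors (stock_a and stock_b are hypothetical and assumed to cover the same dates in the same order):

```r
# Hypothetical monthly prices for two stocks over the same dates
stock_a <- c(40, 45, 50, 55, 60)
stock_b <- c(60, 65, 48, 70, 75)

# Element-wise comparison gives TRUE/FALSE per month;
# the mean of a logical vector is the fraction of TRUE values
share_a_lower <- mean(stock_a < stock_b)
share_a_lower
```

A share well above 0.5 would support the visual impression that the first stock generally has the lower values.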

Visualizing Stock Dynamics 1995-2005

Let’s take a look at how the stock prices changed from 1995-2005 for all five companies. In your R console, start by typing the following plot command:

plot(CocaCola$Date[301:432], CocaCola$StockPrice[301:432], type="l", col="red", ylim=c(0,210), xlab = "Date", ylab = "Stock Price")

lines(ProcterGamble$Date[301:432], ProcterGamble$StockPrice[301:432], col="blue")
lines(IBM$Date[301:432], IBM$StockPrice[301:432], col="green")
lines(GE$Date[301:432], GE$StockPrice[301:432], col="purple")
lines(Boeing$Date[301:432], Boeing$StockPrice[301:432], col="orange")
abline(v=as.Date(c("2000-03-01")), lwd=2)

legend("topleft",
       legend=c("Coca-Cola", "Procter & Gamble", "IBM", "GE", "Boeing"),
       col=c("red", "blue", "green", "purple", "orange"), lty=1, cex=0.8)
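
The index range 301:432 used in these commands corresponds to January 1995 through December 2005, assuming each data set has one row per month starting in January 1970 and sorted by date. The sketch below checks this on a synthetic date sequence and shows a date-based filter that selects the same rows:

```r
# Synthetic monthly dates standing in for a Date column (not the real data)
dates <- seq(as.Date("1970-01-01"), as.Date("2009-12-01"), by = "month")

# Dates at the two ends of the hard-coded index range
dates[c(301, 432)]

# Select the same window by date instead of by position
in_window <- dates >= as.Date("1995-01-01") & dates <= as.Date("2005-12-01")
window_rows <- which(in_window)
range(window_rows)
```

Filtering by date is less fragile than fixed positions if rows are ever added or removed.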

Which stock fell the most right after the technology bubble burst in March 2000?

General Electric (GE).

Which stock reaches the highest value in the time period 1995-2005?

IBM.

In October of 1997, there was a global stock market crash that was caused by an economic crisis in Asia. Comparing September 1997 to November 1997, which companies saw a decreasing trend in their stock price?

plot(CocaCola$Date[301:432], CocaCola$StockPrice[301:432], type="l", col="red", ylim=c(0,210), xlab = "Date", ylab = "Stock Price")

lines(ProcterGamble$Date[301:432], ProcterGamble$StockPrice[301:432], col="blue")
lines(IBM$Date[301:432], IBM$StockPrice[301:432], col="green")
lines(GE$Date[301:432], GE$StockPrice[301:432], col="purple")
lines(Boeing$Date[301:432], Boeing$StockPrice[301:432], col="orange")
abline(v=as.Date(c("1997-09-01")), lwd=2)
abline(v=as.Date(c("1997-11-01")), lwd=2)

legend("topleft",
       legend=c("Coca-Cola", "Procter & Gamble", "IBM", "GE", "Boeing"),
       col=c("red", "blue", "green", "purple", "orange"), lty=1, cex=0.8)

Two companies had a decreasing trend in stock prices from September 1997 to November 1997: Boeing and Procter & Gamble.
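
The comparison between two months can also be made numerically by subtracting the price at the earlier date from the price at the later date. A sketch under stated assumptions: the data frames, the prices, and the helper price_change are all hypothetical, and the names StockA and StockB do not refer to any of the five companies.

```r
# Hypothetical data for two made-up stocks around the dates of interest
dates <- as.Date(c("1997-09-01", "1997-10-01", "1997-11-01"))
stocks <- list(
  StockA = data.frame(Date = dates, StockPrice = c(50, 48, 45)),
  StockB = data.frame(Date = dates, StockPrice = c(30, 29, 33))
)

# Hypothetical helper: price at `to` minus price at `from`
price_change <- function(df, from, to) {
  df$StockPrice[df$Date == as.Date(to)] - df$StockPrice[df$Date == as.Date(from)]
}

# Apply the helper to every stock in the list
changes <- sapply(stocks, price_change, from = "1997-09-01", to = "1997-11-01")

# Names of the stocks whose price fell between the two dates
decreasing <- names(changes)[changes < 0]
decreasing
```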

In the last two years of this time period (2004 and 2005) which stock seems to be performing the best, in terms of increasing stock price?

plot(CocaCola$Date[301:432], CocaCola$StockPrice[301:432], type="l", col="red", ylim=c(0,210), xlab = "Date", ylab = "Stock Price")

lines(ProcterGamble$Date[301:432], ProcterGamble$StockPrice[301:432], col="blue")
lines(IBM$Date[301:432], IBM$StockPrice[301:432], col="green")
lines(GE$Date[301:432], GE$StockPrice[301:432], col="purple")
lines(Boeing$Date[301:432], Boeing$StockPrice[301:432], col="orange")
abline(v=as.Date(c("2004-01-01")), lwd=2)
abline(v=as.Date(c("2006-01-01")), lwd=2)

legend("topleft",
       legend=c("Coca-Cola", "Procter & Gamble", "IBM", "GE", "Boeing"),
       col=c("red", "blue", "green", "purple", "orange"), lty=1, cex=0.8)

Boeing is steadily increasing from 2004 to the beginning of 2006.