A stock market is where buyers and sellers trade shares of a company, and is one of the most popular ways for individuals and companies to invest money. The size of the world stock market is now estimated to be in the trillions. The largest stock market in the world is the New York Stock Exchange (NYSE), located in New York City. About 2,800 companies are listed on the NYSE. In this problem, we’ll look at the monthly stock prices of five of these companies: IBM, General Electric (GE), Procter and Gamble, Coca-Cola, and Boeing. The data used in this problem comes from Infochimps.

Download and read the following files into R, using the read.csv function: IBMStock.csv, GEStock.csv, ProcterGambleStock.csv, CocaColaStock.csv, and BoeingStock.csv. (Do not open these files in any spreadsheet software before completing this problem because it might change the format of the Date field.)

Call the data frames “IBM”, “GE”, “ProcterGamble”, “CocaCola”, and “Boeing”, respectively. Each data frame has two variables, described as follows:

  • Date: the date of the stock price, always given as the first of the month.
  • StockPrice: the average stock price of the company in the given month.

In this problem, we’ll take a look at how the stock dynamics of these companies have changed over time.

Section 1 - Summary Statistics

1.1

Before working with these data sets, we need to convert the dates into a format that R can understand. Take a look at the structure of one of the datasets using the str function. Right now, the date variable is stored as a factor. We can convert this to a “Date” object in R by using the following five commands (one for each data set):

IBM$Date = as.Date(IBM$Date, "%m/%d/%y")

GE$Date = as.Date(GE$Date, "%m/%d/%y")

CocaCola$Date = as.Date(CocaCola$Date, "%m/%d/%y")

ProcterGamble$Date = as.Date(ProcterGamble$Date, "%m/%d/%y")

Boeing$Date = as.Date(Boeing$Date, "%m/%d/%y")

The first argument to the as.Date function is the variable we want to convert, and the second argument is the format of the Date variable. We can just overwrite the original Date variable values with the output of this function. Now, answer the following questions using the str and summary functions.
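To make the format string concrete, here is a minimal sketch on a few made-up date strings (the vector name `raw` is hypothetical and its values are not read from the CSV files). With `%y`, R reads two-digit years 69 to 99 as 19xx and 00 to 68 as 20xx, which is why a year written as 70 is parsed as 1970.

```r
# Hypothetical date strings in month/day/two-digit-year form.
raw <- c("1/1/70", "2/1/70", "12/1/09")

# %m = month, %d = day, %y = two-digit year.
parsed <- as.Date(raw, format = "%m/%d/%y")

print(parsed)
print(class(parsed))
```

If the format string does not match the text, as.Date returns NA for that element, so checking for NA values after the conversion is a cheap sanity test.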

Our five datasets all have the same number of observations. How many observations are there in each data set?

IBM=read.csv("C:/bussiness analytics/data/IBMStock.csv")
GE=read.csv("C:/bussiness analytics/data/GEStock.csv")
CocaCola=read.csv("C:/bussiness analytics/data/CocaColaStock.csv")
Boeing=read.csv("C:/bussiness analytics/data/BoeingStock.csv")
ProcterGamble=read.csv("C:/bussiness analytics/data/ProcterGambleStock.csv")
IBM$Date=as.Date(IBM$Date , "%m/%d/%y")
GE$Date=as.Date(GE$Date , "%m/%d/%y")
CocaCola$Date=as.Date(CocaCola$Date , "%m/%d/%y")
Boeing$Date=as.Date(Boeing$Date , "%m/%d/%y")
ProcterGamble$Date=as.Date(ProcterGamble$Date , "%m/%d/%y")
str(IBM)
'data.frame':   480 obs. of  2 variables:
 $ Date      : Date, format: "1970-01-01" "1970-02-01" ...
 $ StockPrice: num  360 347 327 320 270 ...
nrow(IBM)
[1] 480

1.2

What is the earliest year in our datasets?

min(IBM$Date)
[1] "1970-01-01"

1.3

What is the latest year in our datasets?

max(IBM$Date)
[1] "2009-12-01"

1.4

What is the mean stock price of IBM over this time period?

mean(IBM$StockPrice)
[1] 144.375

1.5

What is the minimum stock price of General Electric (GE) over this time period?

min(GE$StockPrice)
[1] 9.293636

1.6

What is the maximum stock price of Coca-Cola over this time period?

max(CocaCola$StockPrice)
[1] 146.5843

1.7

What is the median stock price of Boeing over this time period?

median(Boeing$StockPrice)
[1] 44.8834

1.8

What is the standard deviation of the stock price of Procter & Gamble over this time period?

sd(ProcterGamble$StockPrice)
[1] 18.19414
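Most of the statistics in this section can also be read from a single summary call, as the problem statement suggests. The sketch below uses a small made-up data frame (`toy` and its prices are assumptions for illustration, not values from the stock files).

```r
# Made-up prices for illustration; not values from the stock files.
toy <- data.frame(
  Date = as.Date(c("1970-01-01", "1970-02-01", "1970-03-01", "1970-04-01")),
  StockPrice = c(10, 20, 30, 40)
)

# summary() on a numeric column reports min, quartiles, median, mean, and max.
s <- summary(toy$StockPrice)
print(s)

# sd() is not part of summary(), so it is computed separately.
print(sd(toy$StockPrice))

# nrow() gives the number of observations.
print(nrow(toy))
```

Calling summary on the whole data frame instead of one column would also report the earliest and latest Date.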

Section 2 - Visualizing Stock Dynamics

Let’s plot the stock prices to see if we can visualize trends in stock prices during this time period. Using the plot function, plot the Date on the x-axis and the StockPrice on the y-axis, for Coca-Cola.

This plots our observations as points, but we would really like to see a line instead, since this is a continuous time period. To do this, add the argument type="l" to your plot command and regenerate the plot (the character in quotes is the lowercase letter l, for line). You should now see a line plot of the Coca-Cola stock price.

2.1

Around what year did Coca-Cola have its highest stock price in this time period?

  • 1973
  • 1980
  • 1985
  • 1995
  • 2008
plot(CocaCola$Date, CocaCola$StockPrice, type='l')  # highest around 1973, lowest around 1980
abline(v=CocaCola$Date[which.max(CocaCola$StockPrice)], col='green')
abline(v=CocaCola$Date[which.min(CocaCola$StockPrice)], col='red')

Note: plot() draws the chart, and abline() adds a straight line. Here it is used as a vertical marker at a point in time: green for the date of the highest price and red for the date of the lowest price.

Around what year did Coca-Cola have its lowest stock price in this time period?

plot(CocaCola$Date, CocaCola$StockPrice, type='l')  
abline(v=CocaCola$Date[which.min(CocaCola$StockPrice)], col='red')
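The same dates can be printed rather than read off the plot. This is a minimal sketch on a made-up series (the `toy` data frame and its prices are assumptions, not the Coca-Cola data); it combines which.max and which.min with format to pull out the year.

```r
# Made-up series; the real CocaCola data frame is not used here.
toy <- data.frame(
  Date = seq(as.Date("1970-01-01"), by = "year", length.out = 5),
  StockPrice = c(80, 140, 95, 30, 60)
)

# Row positions of the highest and lowest price.
hi <- which.max(toy$StockPrice)
lo <- which.min(toy$StockPrice)

# format(..., "%Y") extracts the four-digit year as text.
peak_year <- format(toy$Date[hi], "%Y")
trough_year <- format(toy$Date[lo], "%Y")
print(c(peak = peak_year, trough = trough_year))
```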

2.2

Now, let’s add the line for Procter & Gamble too. You can add a line to a plot in R by using the lines function instead of the plot function. Keeping the plot for Coca-Cola open, type in your R console:

lines(ProcterGamble$Date, ProcterGamble$StockPrice)
plot(ProcterGamble$Date, ProcterGamble$StockPrice, type='l')  # optional: P&G on its own (this replaces the current plot)

Unfortunately, it’s hard to tell which line is which. Let’s fix this by giving each line a color. First, re-run the plot command for Coca-Cola, but add the argument col="red". You should see the plot for Coca-Cola show up again, but this time in red. Now, let’s add the Procter & Gamble line (using the lines function like we did before), adding the argument col="blue". You should now see in your plot the Coca-Cola stock price in red, and the Procter & Gamble stock price in blue.

As an alternative to changing the colors, you could instead change the line type of the Procter & Gamble line by adding the argument lty=2. This will make the Procter & Gamble line dashed.

Using this plot, answer the following questions.

In March of 2000, the technology bubble burst, and a stock market crash occurred. According to this plot, which company’s stock dropped more?

  • Coca-Cola
  • Procter and Gamble
plot(CocaCola$Date, CocaCola$StockPrice, type='l', col='red', lwd=2)
lines(ProcterGamble$Date, ProcterGamble$StockPrice,type='l' ,col="blue", lwd=2)
abline(v = as.Date("2000-03-01"), lty=3, col='orange')
abline(v = as.Date("1983-07-01"), lty=3, col='orange')
legend("topright",legend=c("Coke","P&G"),col=c('red','blue'),lwd=2)

Note: legend() places a key on the plot. The first argument, "topright", puts it in the top-right corner; legend gives the name shown for each entry; col gives the matching colors; lwd sets the line width; and pch, when used, sets the point symbol. The lines() function connects the data points with a line.

To answer this question and the ones that follow, you may find it useful to draw a vertical line at a certain date. To do this, type the command

abline(v=as.Date(c("2000-03-01")), lwd=2)

in your R console, with the plot still open. This generates a vertical line at the date March 1, 2000. The argument lwd=2 makes the line a little thicker. You can change the date in this command to generate the vertical line in different locations.

2.3

Answer these questions using the plot you generated in the previous problem.

Around 1983, the stock for one of these companies (Coca-Cola or Procter and Gamble) was going up, while the other was going down. Which one was going up?

  • Coca-Cola
  • Procter and Gamble
# Coca-Cola

2.4

In the time period shown in the plot, which stock generally has lower values?

  • Coca-Cola
  • Procter and Gamble
# Coca-Cola

Section 3 - Visualizing Stock Dynamics 1995-2005

Let’s take a look at how the stock prices changed from 1995-2005 for all five companies. In your R console, start by typing the following plot command:

plot(CocaCola$Date[301:432], CocaCola$StockPrice[301:432], type="l",    
col="red", ylim=c(0,210))

This will plot the CocaCola stock prices from 1995 through 2005, which are the observations numbered from 301 to 432. The additional argument, ylim=c(0,210), makes the y-axis range from 0 to 210. This will allow us to see all of the stock values when we add in the other companies.
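The positions 301 and 432 do not have to be counted by hand. As a sketch, assuming the dates are 480 consecutive first-of-month values starting in January 1970 (which matches the str output above), the same index range can be derived from a date filter. The names `dates` and `idx` are introduced here for illustration only.

```r
# Monthly dates matching the layout described above: 480 first-of-month
# values starting in January 1970.
dates <- seq(as.Date("1970-01-01"), by = "month", length.out = 480)

# Positions of all dates from January 1995 through December 2005.
idx <- which(dates >= as.Date("1995-01-01") & dates <= as.Date("2005-12-01"))

print(range(idx))
print(length(idx))
```

Filtering by date keeps the selection correct even if a data set had a different starting month, whereas a hard-coded range would silently shift.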

Now, use the lines function to add in the other four companies, remembering to only plot the observations from 1995 to 2005, or [301:432]. You don’t need the “type” or “ylim” arguments for the lines function, but remember to make each company a different color so that you can tell them apart. Some color options are “red”, “blue”, “green”, “purple”, “orange”, and “black”. To see all of the color options in R, type colors() in your R console.

(If you prefer to change the type of the line instead of the color, here are some options for changing the line type: lty=2 will make the line dashed, lty=3 will make the line dotted, lty=4 will make the line alternate between dashes and dots, and lty=5 will make the line long-dashed.)
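When several series share one plot, a loop over a named list avoids repeating the lines call. The sketch below uses made-up prices for three hypothetical companies (A, B, and C are placeholders, not the five stocks in this problem) and varies both color and line type.

```r
# Toy monthly series for three hypothetical companies (made-up numbers).
dates <- seq(as.Date("1995-01-01"), by = "month", length.out = 12)
prices <- list(
  A = seq(40, 62, length.out = 12),
  B = seq(90, 68, length.out = 12),
  C = rep(55, 12)
)
cols <- c(A = "red", B = "blue", C = "green")
ltys <- c(A = 1, B = 2, C = 3)

# Draw an empty frame first so every series is added the same way.
plot(range(dates), c(0, 100), type = "n", xlab = "Date", ylab = "StockPrice")
for (name in names(prices)) {
  lines(dates, prices[[name]], col = cols[[name]], lty = ltys[[name]])
}

# The legend reuses the same colors and line types as the loop.
legend("topright", legend = names(prices), col = cols, lty = ltys)
```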

Use this plot to answer the following four questions.

3.1

Which stock fell the most right after the technology bubble burst in March 2000?

  • Coca-Cola
  • Procter and Gamble
  • IBM
  • General Electric (GE)
  • Boeing
plot(CocaCola$Date[301:432], CocaCola$StockPrice[301:432], 
     type="l", col="red", ylim=c(0,210))
lines(ProcterGamble$Date[301:432],  ProcterGamble$StockPrice[301:432],  col="blue")
lines(IBM$Date[301:432], IBM$StockPrice[301:432], col="green")
lines(GE$Date[301:432],  GE$StockPrice[301:432],  col="purple")
lines(Boeing$Date[301:432],  Boeing$StockPrice[301:432],  col="orange")
abline(v = as.Date("2000-03-01"), lty=3, col='gray')
abline(v = as.Date("1997-09-01"), lty=3, col='gray')
abline(v = as.Date("1997-11-01"), lty=3, col='gray')
legend("topright",legend=c("Coke","P&G","IBM","GE","BE"),col=c('red','blue','green','purple','orange'),lwd=2)

Note: in the plot command above, ylim sets the range of the y-axis; xlim does the same for the x-axis.

3.2

Which stock reaches the highest value in the time period 1995-2005?

  • Coca-Cola
  • Procter and Gamble
  • IBM
  • General Electric (GE)
  • Boeing
#IBM

3.3

In October of 1997, there was a global stock market crash that was caused by an economic crisis in Asia. Comparing September 1997 to November 1997, which companies saw a decreasing trend in their stock price? (Select all that apply.)

  • Coca-Cola
  • Procter and Gamble
  • IBM
  • General Electric (GE)
  • Boeing
# Boeing
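A question like this one can be cross-checked numerically instead of by eye. The following sketch compares the September and November prices across a list of data frames; the two companies and their prices are made up for illustration and are not the real stock data.

```r
# Made-up prices for two hypothetical companies; not the real stock data.
dates <- as.Date(c("1997-09-01", "1997-10-01", "1997-11-01"))
stocks <- list(
  CompanyA = data.frame(Date = dates, StockPrice = c(50, 47, 45)),
  CompanyB = data.frame(Date = dates, StockPrice = c(30, 31, 33))
)

# Price change from September to November for one data frame.
change <- function(df) {
  start <- df$StockPrice[df$Date == as.Date("1997-09-01")]
  end <- df$StockPrice[df$Date == as.Date("1997-11-01")]
  end - start
}

changes <- sapply(stocks, change)
print(changes)

# Names of the companies whose price fell over the window.
print(names(changes)[changes < 0])
```

This only compares the two endpoints, so it is a rough check on the trend rather than a replacement for looking at the plot.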

3.4

In the last two years of this time period (2004 and 2005) which stock seems to be performing the best, in terms of increasing stock price?

  • Coca-Cola
  • Procter and Gamble
  • IBM
  • General Electric (GE)
  • Boeing
# IBM
