Stock Market Data Analysis

A stock market is where buyers and sellers trade shares of a company, and it is one of the most popular ways for individuals and companies to invest money. The size of the world stock market is now estimated to be in the trillions of dollars. The largest stock market in the world is the New York Stock Exchange (NYSE), located in New York City. About 2,800 companies are listed on the NYSE. In this problem, we’ll look at the monthly stock prices of five of these companies: IBM, General Electric (GE), Procter & Gamble, Coca-Cola, and Boeing.

IBM <- read.csv("IBMStock.csv")
GE <- read.csv("GEStock.csv")
ProcterGamble <- read.csv("ProcterGambleStock.csv")
CocaCola <- read.csv("CocaColaStock.csv")
Boeing <- read.csv("BoeingStock.csv")

Each data frame has two variables, described as follows:

  • Date: the date of the stock price, always given as the first of the month.
  • StockPrice: the average stock price of the company in the given month.

    str(IBM)
    ## 'data.frame':    480 obs. of  2 variables:
    ##  $ Date      : Factor w/ 480 levels "1/1/00","1/1/01",..: 11 171 211 251 291 331 371 411 451 51 ...
    ##  $ StockPrice: num  360 347 327 320 270 ...

    Data Findings

  • We have monthly data for 40 years, so there are 12*40 = 480 observations.

  • The minimum value of the Date variable is January 1, 1970 in every dataset.

  • The maximum value of the Date variable is December 1, 2009 in every dataset.

  • Before working with these data sets, we need to convert the dates into a format that R can understand.
    Right now, the date variable is stored as a factor. We will convert this to a “Date” object.

    IBM$Date = as.Date(IBM$Date, "%m/%d/%y")
    GE$Date = as.Date(GE$Date, "%m/%d/%y")
    CocaCola$Date = as.Date(CocaCola$Date, "%m/%d/%y")
    ProcterGamble$Date = as.Date(ProcterGamble$Date, "%m/%d/%y")
    Boeing$Date = as.Date(Boeing$Date, "%m/%d/%y")
    min(IBM$Date)
    ## [1] "1970-01-01"
    max(IBM$Date)
    ## [1] "2009-12-01"
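
The conversion above relies on how R interprets the two-digit year in "%y": on input, values 69 to 99 are read as 19xx and values 00 to 68 as 20xx, which happens to cover the 1970 to 2009 span of this data. The following is a minimal sketch on a few hand-written date strings (not read from the CSV files), together with a check of the 12*40 = 480 count.

```r
# Hand-written sample strings in the same month/day/two-digit-year layout
raw <- c("1/1/70", "12/1/99", "1/1/00", "12/1/09")

# %y maps 69-99 to 19xx and 00-68 to 20xx
parsed <- as.Date(raw, format = "%m/%d/%y")
print(parsed)

# A monthly sequence from January 1970 to December 2009
months <- seq(as.Date("1970-01-01"), as.Date("2009-12-01"), by = "month")
print(length(months))  # 480
```

A dataset reaching back before 1969 would be parsed into the wrong century by this format, so four-digit years are safer when they are available.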

    Mean Stock Price of IBM over this time period

    mean(IBM$StockPrice)
    ## [1] 144.375

    We can see that the mean value of the IBM StockPrice is 144.38.

    What is the minimum stock price of General Electric (GE) over this time period?

    min(GE$StockPrice)
    ## [1] 9.293636

    We can see that the minimum value of the GE StockPrice is 9.294.

    What is the maximum stock price of Coca-Cola over this time period?

    max(CocaCola$StockPrice)
    ## [1] 146.5843

    We can see that the maximum value of the Coca-Cola StockPrice is 146.58.

    What is the median stock price of Boeing over this time period?

    median(Boeing$StockPrice)
    ## [1] 44.8834

    What is the standard deviation of the stock price of Procter & Gamble over this time period?

    sd(ProcterGamble$StockPrice)
    ## [1] 18.19414

    We can see that the standard deviation of the Procter & Gamble StockPrice is 18.19414.

    What is the mean of the stock price of Procter & Gamble over this time period?

    mean(ProcterGamble$StockPrice)
    ## [1] 77.70452

    The median and the mean both measure central tendency. For Procter & Gamble, the mean stock price (77.70) differs from the median (available from median(ProcterGamble$StockPrice); its value is not shown here), which indicates that the distribution of prices is not symmetric.
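
As an illustration of the mean-versus-median comparison, here is a small sketch on made-up numbers (they are not taken from any of the stock files). One large value pulls the mean well above the median, which is the typical signature of a right-skewed distribution.

```r
# Toy values chosen for illustration only
prices <- c(10, 12, 13, 15, 100)

m <- mean(prices)      # 30
med <- median(prices)  # 13

# A mean above the median hints at a long right tail
skew_hint <- if (m > med) "right-skewed" else if (m < med) "left-skewed" else "roughly symmetric"
print(c(mean = m, median = med))
print(skew_hint)
```

The same two calls, mean() and median(), can be applied to ProcterGamble$StockPrice to make the comparison on the real data.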

    Visualizing Stock Dynamics

    Coca-Cola Stock Price trend

    plot(CocaCola$Date,CocaCola$StockPrice, type = "l")
    Findings:
  • Coca-Cola has its highest stock price in 1973.

  • Coca-Cola has its lowest stock price in 1980.
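
Reading the year of a peak or trough off a plot can be cross-checked numerically. The sketch below assumes a data frame with the same two columns (Date, StockPrice) and uses made-up values; which.max() and which.min() return the row position of the extreme, which is then used to look up the date.

```r
# Toy data frame with the same column names as the stock files (values are invented)
toy <- data.frame(
  Date = as.Date(c("1970-01-01", "1970-02-01", "1970-03-01", "1970-04-01")),
  StockPrice = c(50, 80, 20, 40)
)

peak_date <- toy$Date[which.max(toy$StockPrice)]    # date of the highest price
trough_date <- toy$Date[which.min(toy$StockPrice)]  # date of the lowest price

print(peak_date)
print(trough_date)
```

On the real data, the same expressions with CocaCola in place of toy would give the exact months behind the two findings above.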

    Procter & Gamble Stock Price trend

    plot(ProcterGamble$Date, ProcterGamble$StockPrice, type = "l")

    Stock Price Trend comparison of Coca-Cola and Procter & Gamble

    plot(CocaCola$Date,CocaCola$StockPrice, type = "l", col="red")
    lines(ProcterGamble$Date, ProcterGamble$StockPrice, col = "blue", lty=2)
    abline(v=as.Date(c("2000-03-01")), lwd=1)
    abline(v=as.Date(c("1983-01-01")), lwd=1)
  • In March of 2000, the technology bubble burst, and a stock market crash occurred.

  • Looking at the plot, both stocks drop around 2000, but Procter & Gamble’s stock drops more.

  • Around 1983, Coca-Cola’s stock was going up, while Procter & Gamble’s was going down.

  • Over the whole time period, Coca-Cola’s stock price is generally lower than Procter & Gamble’s.
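
A two-line comparison plot is easier to read with a legend and with a y-axis wide enough for both series. The following is a sketch under stated assumptions: the two series are invented stand-ins for the Coca-Cola and Procter & Gamble prices, and the variable names (series_a, series_b, series_labels) are hypothetical.

```r
# Invented monthly series standing in for the two companies
dates <- seq(as.Date("1970-01-01"), by = "month", length.out = 24)
series_a <- seq(40, 63, length.out = 24)
series_b <- seq(70, 47, length.out = 24)

series_labels <- c("Coca-Cola", "Procter & Gamble")
series_cols <- c("red", "blue")
series_lty <- c(1, 2)

# ylim spans both series so the second line is not clipped
plot(dates, series_a, type = "l", col = series_cols[1], lty = series_lty[1],
     ylim = range(series_a, series_b), xlab = "Date", ylab = "Stock Price")
lines(dates, series_b, col = series_cols[2], lty = series_lty[2])

# The legend reuses the same colours and line types as the plotted lines
legend("topright", legend = series_labels, col = series_cols, lty = series_lty)
```

Keeping the colours and line types in vectors means the legend cannot drift out of sync with the lines it describes.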

    Visualizing Stock Dynamics 1995-2005

    Let’s take a look at how the stock prices changed from 1995-2005 for all five companies.

    Note: The code below plots the Coca-Cola stock prices from 1995 through 2005, which are observations 301 to 432. The additional argument, ylim=c(0,210), makes the y-axis range from 0 to 210, so that all of the stock values remain visible when the other companies are added.

    # Plot Coca-Cola for 1995-2005 (rows 301 to 432), then overlay the other four companies
    plot(CocaCola$Date[301:432], CocaCola$StockPrice[301:432], type = "l", col = "red", ylim = c(0, 210), xlab = "Year", ylab = "Stock Price")
    lines(ProcterGamble$Date[301:432], ProcterGamble$StockPrice[301:432], col = "blue", lty = 2)
    lines(IBM$Date[301:432], IBM$StockPrice[301:432], col = "orange", lty = 3)
    lines(GE$Date[301:432], GE$StockPrice[301:432], col = "black", lty = 4)
    lines(Boeing$Date[301:432], Boeing$StockPrice[301:432], col = "purple", lty = 5)
    # Vertical reference lines; lwd (line width) is an argument of abline()
    abline(v = as.Date("2000-03-31"), lwd = 0.5)
    abline(v = as.Date("1995-01-01"), lwd = 1)
    abline(v = as.Date("2005-01-01"), lwd = 1)
  • By looking at this plot, you can see that the stock for General Electric falls significantly more than the other stocks after the technology bubble burst.

  • Looking at the plot, you can see that IBM has the highest value, around 1999.
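
The row range 301:432 used for the 1995-2005 window can be derived from the dates rather than counted by hand. Here is a minimal sketch, assuming the rows are in chronological order with one row per month starting in January 1970; the dates vector is generated here and is not read from the files.

```r
# One entry per month, January 1970 through December 2009
dates <- seq(as.Date("1970-01-01"), as.Date("2009-12-01"), by = "month")

# Logical mask for the 1995-2005 window, then the matching row numbers
in_window <- dates >= as.Date("1995-01-01") & dates <= as.Date("2005-12-01")
idx <- which(in_window)

print(range(idx))   # 301 432
print(length(idx))  # 132 months = 11 years
```

With the real data, the same mask built from CocaCola$Date could be used directly as a row selector, which avoids hard-coding the indices.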

    # Same 1995-2005 plot, now marking autumn 1997 and the 2004-2005 period
    plot(CocaCola$Date[301:432], CocaCola$StockPrice[301:432], type = "l", col = "red", ylim = c(0, 210), xlab = "Year", ylab = "Stock Price")
    lines(ProcterGamble$Date[301:432], ProcterGamble$StockPrice[301:432], col = "blue", lty = 2)
    lines(IBM$Date[301:432], IBM$StockPrice[301:432], col = "orange", lty = 3)
    lines(GE$Date[301:432], GE$StockPrice[301:432], col = "black", lty = 4)
    lines(Boeing$Date[301:432], Boeing$StockPrice[301:432], col = "purple", lty = 5)
    # Vertical reference lines; lwd (line width) is an argument of abline()
    abline(v = as.Date("1997-09-01"), lwd = 1)
    abline(v = as.Date("1997-11-30"), lwd = 1)
    abline(v = as.Date("2004-01-01"), lwd = 1)
    abline(v = as.Date("2005-12-31"), lwd = 1)
  • In October of 1997, there was a global stock market crash that was caused by an economic crisis in Asia.

  • Comparing September 1997 to November 1997, two companies, Boeing and Procter & Gamble, had a decreasing trend in stock prices.

  • In the last two years of this time period (2004 and 2005), Boeing stock seems to be performing the best in terms of increasing stock price.
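
Comparisons between two months, such as September 1997 versus November 1997, can also be made numerically instead of by eye. The sketch below is an illustration only: price_change() is a hypothetical helper, and the two data frames A and B contain invented prices, not values from the five companies.

```r
# Hypothetical helper: price at `to` minus price at `from`
price_change <- function(df, from, to) {
  p_from <- df$StockPrice[df$Date == as.Date(from)]
  p_to <- df$StockPrice[df$Date == as.Date(to)]
  p_to - p_from
}

# Invented data for two placeholder companies, A and B
toy_dates <- as.Date(c("1997-09-01", "1997-10-01", "1997-11-01"))
toy <- list(
  A = data.frame(Date = toy_dates, StockPrice = c(50, 48, 45)),
  B = data.frame(Date = toy_dates, StockPrice = c(30, 29, 33))
)

# Apply the helper to every company; negative values mean the price fell
changes <- sapply(toy, price_change, from = "1997-09-01", to = "1997-11-01")
print(changes)

decliners <- names(changes)[changes < 0]
print(decliners)
```

Placing the five real data frames in a named list would let the same sapply() call report which companies declined over any pair of months.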