Problem Statement

JP Morgan Chase & Co. is one of the oldest, largest and best-known financial institutions in the United States. Because it is one of the most famous investment banks, I am especially interested in its financial performance. How much profit did it earn each year? Is it a valuable stock for investors?
Analysing risk and return can help estimate the expected value of the stock and support investment plans made in advance.

Packages Required

The following packages are required; each is listed with its use.

  • tidyverse <- A set of packages includes readr, ggplot2, dplyr

  • DT <- Interactive HTML display of data

  • forecast <- Forecasting functions for time series data

  • faraway <- Functions and datasets for regression modelling; used here when building the stock price prediction model

  • fpp2 <- Data for "Forecasting: Principles and Practice" (2nd edition)

  • modelr <- Helper functions for modelling; used in this project for financial modelling

  • plotly <- Create interactive plots

  • formattable <- Apply formatting on vectors and data frames

  • ggplot2 <- Create data visualizations using the grammar of graphics

  • gridExtra <- Arrange multiple plots on one page

##Load required packages##
library(tidyverse)
library(DT)
library(forecast)
library(faraway)
library(fpp2)
library(modelr)
library(plotly)
library(formattable)
library(ggplot2)
library(gridExtra)

Data Preparation

Source

First Dataset

  • The dataset used in this project is from Yahoo Finance, a professional financial website that includes the latest financial information for all companies in the S&P 500. I downloaded five years of historical prices for JPMorgan. The dataset shows the daily stock prices: the Open price, the highest and lowest prices within the day, and the Adjusted Close price.

Second Dataset

  • The second dataset is used in the financial modelling, for the regression analysis of the JPMorgan stock price. It was downloaded from the Fama French Data Library. The “Fama French model” is one of the most commonly used models for analysing stock market value and its risk factors. The Fama French model is as follows:
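For reference, the three-factor model is usually written in the form below. The symbols $\alpha$, $\beta_1$, $\beta_2$, $\beta_3$ and $\varepsilon$ are standard regression notation assumed here; they are not names taken from the dataset.

```latex
r - R_f = \alpha + \beta_1 (R_m - R_f) + \beta_2 \, SMB + \beta_3 \, HML + \varepsilon
```

Here $r$ is the return of the stock, $R_m$ the return of the market portfolio and $R_f$ the risk-free return.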



There are three variables in this model:

  • Rm-Rf
  • SMB
  • HML

Third Dataset

  • The third dataset is also used for the regression analysis of the JPMorgan stock price.

The Fama French five-factor model is as follows:
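For reference, the five-factor model is usually written by extending the three-factor form with two more factors, RMW (robust minus weak profitability) and CMA (conservative minus aggressive investment). The coefficient symbols are standard regression notation assumed here.

```latex
r - R_f = \alpha + \beta_1 (R_m - R_f) + \beta_2 \, SMB + \beta_3 \, HML
          + \beta_4 \, RMW + \beta_5 \, CMA + \varepsilon
```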




Import

The data are read in with the read_csv() function. The datasets are as follows:

# Daily prices (the "_" in col_types skips one column)
JPM_Prices <- read_csv("JPM.csv",  col_types = "?????_??") %>%
  as_tibble()
datatable(JPM_Prices)

# Monthly prices
Monthly <- read_csv("JPM_Monthly.csv",  col_types = "?????_?") %>%
  as_tibble()
datatable(Monthly)

# Fama French research factors
FF3 <- read_csv("FF_Research.csv",  col_types = "??????") %>%
  as_tibble()
datatable(FF3)

Data Cleaning

First,

  • drop the duplicated Year column;

  • drop the unused Volume variable;

  • keep the Date, Open price, High price, Low price and Adj Close price (note: Adj Close is the adjusted closing price, which is more accurate).

Then,

  • separate the date into year, month and day;

  • keep the year and unite the month and day;

Finally,

  • group by Year, from 2012 through 2017, for use in the following analysis.
## Clean the raw dataset ##
JPM_out <- JPM_Prices %>% select(-Volume)   # drop the unused Volume column
JPM_byYear <- JPM_out %>%
    separate(Date, into = c("Y", "Month", "Day"), sep = "/") %>%  # split the date
    unite(D, Month, Day, sep = "/") %>%                           # month/day label
    select(-Year) %>%                                             # drop the duplicated Year
    group_by(Y)
# the cleaned data table follows
datatable(JPM_byYear)
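The year and month/day split can also be derived by parsing the date first. The following is a minimal base R sketch on made-up dates; it assumes the "YYYY/M/D" layout implied by the `sep = "/"` argument above, and the name `cleaned` is hypothetical.

```r
# Toy dates in the "YYYY/M/D" layout assumed for JPM.csv (illustrative values only)
raw_dates <- c("2012/8/23", "2013/1/2", "2017/8/22")

# Parse once, then format the pieces that are needed
parsed <- as.Date(raw_dates, format = "%Y/%m/%d")

cleaned <- data.frame(
  Y = format(parsed, "%Y"),      # year, used as the grouping key
  D = format(parsed, "%m/%d"),   # zero-padded month/day label
  stringsAsFactors = FALSE
)
print(cleaned)
```

Parsing the date has the side effect of zero-padding month and day, so the labels sort in calendar order.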






Exploratory Data Analysis

Visualization

  1. Trend of Historical Stock Prices

First, it is necessary to see the big picture of the JPMorgan historical price over the past five years: whether it shows an upward trend, remains stable or shows a downward trend. Let’s look at the distribution of daily prices over the past five years:

# par(mfrow = ...) only affects base graphics, so it is not needed for ggplot
ggplotly(ggplot(data = JPM_Prices, aes(x = Date, y = Open, group = 1)) +
  geom_point(colour = "green") +
  geom_line(colour = "red"))




From the above scatter plot, we can see that the stock price of JP Morgan fluctuated but increased over the past years. In particular, it experienced an obvious decrease in December 2016 and January 2017.

  2. Fluctuation by Year

After observing the overall fluctuations, I also want to explore the fluctuations within specific years. So I divided the dataset into groups by year, showing the stock prices in 2012, 2013, 2014, 2015, 2016 and 2017.

JPM_Prices %>%
    separate(Date, into = c("Y", "Month", "Day"), sep = "/") %>%
    unite(D, Month, Day, sep = "/") %>%
    select(-Year) %>%
    group_by(Y) %>%
    ggplot(aes(D, `Adj Close`, color = Y)) +   # one colour per year
    geom_line(aes(group = Y)) +
    geom_jitter()

The interesting findings are:

  • The fluctuation differs considerably from year to year. In most years the price shows an upward trend, but in 2016 the overall stock price shows a decreasing trend.

  • The stock prices in 2017 are significantly higher than in the past few years, which suggests that investors who choose a buy option could earn profits in the long term.

  • In general, the stock price increases year over year. This indicates that the company's performance is improving, which is also a positive signal for long-term investors.
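The year-by-year comparison can also be put into numbers. The following is a minimal sketch on simulated prices (not the JPM data); the names `toy` and `yearly` are hypothetical, and the column names mirror those used in the plot above.

```r
library(dplyr)

# Simulated prices for illustration only; these are not JPM data
toy <- data.frame(
  Y = rep(c("2015", "2016", "2017"), each = 4),
  `Adj Close` = c(55, 57, 60, 62,  60, 56, 58, 70,  80, 84, 88, 92),
  check.names = FALSE
)

# One row per year: average price and change from first to last observation
yearly <- toy %>%
  group_by(Y) %>%
  summarise(
    mean_price = mean(`Adj Close`),
    change     = last(`Adj Close`) - first(`Adj Close`)
  )
print(yearly)
```

A table like this makes it easier to check statements such as "the price increases year over year" against the data rather than reading them off the plot.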

  3. Time Series Visualization

In the next step, I visualize the data as time series and carry out the following analysis.

(1) Transform the dataset into a time series object (ts). In this step, I first transform my cleaned dataset (JPM_P) into the time series object price.ts. Note: since the data table has over 1000 rows, the transformed dataset is hidden here, but it can be seen in the original code.

JPM_P <- read_csv("JPM.csv",  col_types = "?????_??") %>%
  as_tibble() %>%
  mutate(Date = as.Date(Date))%>%
  select(High,Open, Low)
price.ts <- ts(JPM_P)
str(price.ts)
price.ts



(2) Then, visualize the data with the “autoplot()” function.

# historical prices
ggplotly(autoplot(price.ts, alpha = 0.5) +
  ggtitle("Time Series visualization") +
  scale_y_continuous(labels = scales::dollar))



(3) I visualized three price variables: the Open price, the highest price within the day and the lowest price within the day. If you zoom in, you can observe that:
the lowest price in the period occurred during the price drop and was only $50.12;
the highest price occurred recently and was nearly $96.

This indicates that the stock price is currently increasing.

# faceted time series with a smoothed trend line
autoplot(price.ts, facets = TRUE) +
  geom_smooth() +
  labs(y = "Stock Price",
       x = "Time") +
  ggtitle("Time Series by different variables") +
  theme(panel.background = element_blank())



(4) The visualization of the Open, High and Low prices shows that, although the fluctuations differ, the overall trends are all increasing.

In addition, the slope of the increase is relatively stable over the first 400 days, but the stock prices increase significantly in recent months.


  4. Forecasting the Future Prices by Time Series Analysis

In the final step, I use a forecast model to predict the stock price of JP Morgan. The predicted price range is shown as the blue area.
To make it easier to read, I also add the “ggplotly()” function, so investors can see the exact predicted price for the following days.

# naive() forecasts a single series, so the Open price column is used here
fc_JPM_Prices <- naive(price.ts[, "Open"], h = 25)
ggplotly(autoplot(fc_JPM_Prices) +
  ggtitle("Forecasting by Time Series") +
  scale_y_continuous(labels = scales::dollar))
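A naive forecast carries the last observed value forward and widens its interval as the horizon grows. The following base R sketch shows that idea on simulated prices. It is an approximation for illustration and not the exact computation inside the forecast package; the names `y`, `sigma` and `naive_fc` are hypothetical.

```r
# Simulated daily prices for illustration only; not the JPM series
set.seed(1)
y <- 80 + cumsum(rnorm(250, mean = 0.05, sd = 1))

h      <- 25                      # forecast horizon in days
last_y <- y[length(y)]            # naive point forecast: the last observed value
sigma  <- sqrt(mean(diff(y)^2))   # spread of the one-step changes

naive_fc <- data.frame(
  step  = 1:h,
  point = rep(last_y, h),
  lower = last_y - 1.96 * sigma * sqrt(1:h),  # approximate 95% interval
  upper = last_y + 1.96 * sigma * sqrt(1:h)
)
head(naive_fc)
```

Because the point forecast is flat, the useful information in such a plot is mostly the width of the shaded interval.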

  5. Conclusion

The visualization shows a strong linear relationship between the date and the stock prices. In the next part I therefore establish a financial model between the variables, visualize their relationships and then make a more persuasive prediction of JP Morgan's future stock prices.


Establish Financial Model

FF_Model <-
  read_csv("FF_Research.csv") %>%
  mutate(`Rjpm-Rf` = Open - RF)
## Parsed with column specification:
## cols(
##   X1 = col_integer(),
##   Open = col_double(),
##   `Mkt-RF` = col_double(),
##   SMB = col_double(),
##   HML = col_double(),
##   RF = col_double()
## )
model3 <- lm(FF_Model$`Rjpm-Rf` ~ FF_Model$`Mkt-RF` + FF_Model$SMB + FF_Model$HML,
             data = FF_Model)
  1. Establish the Financial Model: Fama French Three-Factor Model

In this financial model

  • r is the expected return of JP Morgan
  • Rm is the return of the market portfolio
  • Rf stands for the risk-free return
  • SMB stands for “Small [market capitalization] Minus Big”
  • HML stands for “High [book-to-market ratio] Minus Low”

These factors are calculated with combinations of portfolios composed by ranked stocks (BtM ranking, Cap ranking) and available historical market data.

  1. The fitted financial model is:

Rjpm - Rf = 61.1293 - 0.2803*(Rm-Rf) - 0.2528*SMB - 1.2854*HML

where Rjpm - Rf is computed in the code above as the Open price minus RF.

summary(model3)
## 
## Call:
## lm(formula = FF_Model$`Rjpm-Rf` ~ FF_Model$`Mkt-RF` + FF_Model$SMB + 
##     FF_Model$HML, data = FF_Model)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -24.423  -6.463  -1.543   4.762  33.124 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        61.1293     0.3464 176.460   <2e-16 ***
## FF_Model$`Mkt-RF`  -0.2803     0.4474  -0.627   0.5310    
## FF_Model$SMB       -0.2528     0.7321  -0.345   0.7299    
## FF_Model$HML       -1.2854     0.7544  -1.704   0.0887 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.12 on 1227 degrees of freedom
##   (2 observations deleted due to missingness)
## Multiple R-squared:  0.002852,   Adjusted R-squared:  0.0004145 
## F-statistic:  1.17 on 3 and 1227 DF,  p-value: 0.3199
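The regression above uses the Open price minus RF as the response. Since the Fama French factors are returns, a common alternative is to regress the stock's excess return on them. The following is a minimal sketch on simulated data, not a re-estimation of the model above: the numbers are made up, the column names mirror FF_Research.csv, and `toy`, `ret`, `excess` and `fit` are hypothetical names.

```r
# Simulated data for illustration only; column names mirror FF_Research.csv
set.seed(42)
n <- 60
toy <- data.frame(
  Open     = 60 * cumprod(1 + rnorm(n, 0.005, 0.04)),  # toy price path
  `Mkt-RF` = rnorm(n, 0.5, 4),                         # factor returns in percent
  SMB      = rnorm(n, 0, 2),
  HML      = rnorm(n, 0, 2),
  RF       = rep(0.01, n),
  check.names = FALSE
)

# Percentage return from one period to the next; the first period has no return
toy$ret    <- c(NA, 100 * diff(toy$Open) / head(toy$Open, -1))
toy$excess <- toy$ret - toy$RF

# Variables are referenced by name through `data =`, without the `toy$` prefix
fit <- lm(excess ~ `Mkt-RF` + SMB + HML, data = toy)
coef(fit)
```

Writing the formula with bare column names also lets predict() work with new data, which the `FF_Model$...` form does not.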
  1. Coefficient confidence intervals
# coefficient confidence interval
confint(model3)
##                       2.5 %     97.5 %
## (Intercept)       60.449625 61.8089101
## FF_Model$`Mkt-RF` -1.158074  0.5974025
## FF_Model$SMB      -1.689147  1.1835117
## FF_Model$HML      -2.765535  0.1947234
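Each interval above is the estimate plus or minus a t critical value times the standard error. The following short sketch reproduces the HML interval from the rounded values printed in the summary output; the variable names are hypothetical.

```r
# Values copied from the summary(model3) output for the HML coefficient
estimate  <- -1.2854
std_error <- 0.7544
df_resid  <- 1227

t_crit <- qt(0.975, df = df_resid)   # two-sided 95% critical value
ci <- estimate + c(-1, 1) * t_crit * std_error
ci
```

The result agrees with the confint() row for HML to about three decimal places; the small difference comes from using rounded inputs. Because the interval contains zero, the HML coefficient is not significant at the 5% level.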
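  1. Assessing Residuals
  • The residuals-versus-fitted scatter plot looks relatively random, which suggests that no obvious pattern remains in the residuals. This does not by itself make the model significant: the summary above reports an F-test p-value of 0.3199 and a multiple R-squared of 0.002852, so the three factors explain little of the variation.
  • The points in the Normal Q-Q plot lie almost on the line, which suggests that the residuals are approximately normally distributed.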
## fit a simple regression line in each scatter plot

library(gridExtra)

plot1 <-
  ggplot(FF_Model, aes(`Mkt-RF`, Open - RF)) +
  geom_point() +
  geom_smooth(method = "lm") # Mkt-RF

plot2 <-
  ggplot(FF_Model, aes(SMB, Open - RF)) +
  geom_point() +
  geom_smooth(method = "lm") # SMB

plot3 <-
  ggplot(FF_Model, aes(HML, Open - RF)) +
  geom_point() +
  geom_smooth(method = "lm") # HML (p = 0.0887, significant only at the 10% level)

grid.arrange(plot1, plot2, plot3, ncol = 3)

# assessing residual plots****************************************************
par(mfrow=c(2,2)) 
plot(model3, which = 1) # resids vs. fitted
plot(model3, which = 2) # Q-Q plot
plot(model3, which = 3) # Scale-location plot
plot(model3, which = 4) # Cook’s distance  

Prediction

  1. Forecasting the Stock Price through a Linear Regression Model


(1) Prediction of Open Price : 81.43958
(2) Prediction of High Price : 82.04573
(3) Prediction of Low Price : 80.8455
(4) Prediction of Adj Close Price : 80.75351

##combining the four variables diagram together 
par(mfrow=c(2,2)) 

##Establishing the model between Stock Open Prices and the day
day<-c(1:1257)
m1<-lm(JPM_Prices$Open~day,data=JPM_Prices)
plot(JPM_Prices$Open~day)
abline(m1)

##predicting the future price of JPMorgan
predict(m1,newdata = data.frame(day=1258))
##        1 
## 81.43958
##Establishing the model between Stock High Prices and the day
day<-c(1:1257)
m2<-lm(JPM_Prices$High~day,data=JPM_Prices)
plot(JPM_Prices$High~day)
abline(m2)
##predicting the future price of JPMorgan
predict(m2,newdata = data.frame(day=1258))
##        1 
## 82.04573
##Establishing the model between Stock Low Prices and the day
day<-c(1:1257)
m3<-lm(JPM_Prices$Low~day,data=JPM_Prices)
plot(JPM_Prices$Low~day)
abline(m3)
##predicting the future price of JPMorgan
predict(m3,newdata = data.frame(day=1258))
##       1 
## 80.8455
##Establishing the model between Stock Adj Close Prices and the day
day<-c(1:1257)
m4<-lm(JPM_Prices$`Adj Close`~day,data=JPM_Prices)
plot(JPM_Prices$`Adj Close`~day)
abline(m4)

##predicting the future price of JPMorgan
predict(m4,newdata = data.frame(day=1258))
##        1 
## 80.75351
##Here are tables we could see in future 0 to 1200 days. 
  1. Conclusion

The future prices predicted by the model are between about $81 and $82. These prices are a little lower than the current price, which was $91.67 on August 23rd.
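The point predictions above come without a measure of uncertainty. The following is a minimal sketch, on simulated prices rather than the JPM series, of how predict() can also return a prediction interval; the names `toy`, `fit` and `next_day` are hypothetical.

```r
# Simulated prices for illustration only; not the JPM series
set.seed(7)
toy <- data.frame(day = 1:200)
toy$Open <- 50 + 0.15 * toy$day + rnorm(200, sd = 3)

# Linear trend of price on the day index
fit <- lm(Open ~ day, data = toy)

# Point prediction for the next day plus a 95% prediction interval
next_day <- predict(fit, newdata = data.frame(day = 201),
                    interval = "prediction", level = 0.95)
next_day
```

The columns `lwr` and `upr` give a range for a single future observation, which is usually more informative to an investor than the fitted value alone.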





Investment Suggestions

Problem Statement

J.P. Morgan & Co. is one of the largest banks both in the United States and globally offering a full complement of investment banking, commercial banking, retail banking, asset management, private banking and private equity businesses. It is worthwhile for investors to know the financial performance, historical stock prices and further forecast the trend of stock prices in the future.

Data Visualization and Prediction

This project first visualized the historical stock prices with time series methods and then predicted the trend with the Fama French financial model.

Investment Suggestion

As a financial company, JP Morgan has a stock price that is closely related to the entire economy and the finance industry. From the financial model, we can see a linear relationship between historical prices and date. Overall, it shows an upward trend for the coming days. Although the prediction model puts the future price at around $81, the price is still highly likely to increase in an active financial market. Therefore, we recommend a buy option for investors.