Stock Price Project for Bank of America, CitiGroup, Goldman Sachs, JPMorgan Chase, Morgan Stanley, and Wells Fargo 2006-2017

Pei-Yu Lan | July 10, 2018

  • The data is from yahoo finance, I get the stock data of Bank of America, CitiGroup, Goldman Sachs, JPMorgan Chase, Morgan Stanley, and Wells Fargo from 2006 to 2017. I will be analyzing their stock prices through the years and research the financial crisis of 2007–2008.
In [1]:
import pandas as pd
pd.core.common.is_list_like = pd.api.types.is_list_like
from pandas_datareader import data, wb
import fix_yahoo_finance as yf
yf.pdr_override()
import numpy as np
import datetime
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
#Plotly Method Imports
import plotly
import cufflinks as cf
cf.go_offline()
In [2]:
import warnings
warnings.filterwarnings('ignore')
In [39]:
from IPython.display import HTML

HTML('''<script>
code_show=true; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Click here to toggle on/off the raw code."></form>''')
Out[39]:
In [3]:
start = datetime.datetime(2006, 1, 1)
end = datetime.datetime(2018, 1, 1)

The dataset for the banks from 2006 to 2017 were downladed from Yahoo Finance

In [4]:
# Bank of America
BAC = data.get_data_yahoo('BAC', start, end)

# CitiGroup
C = data.get_data_yahoo("C", start, end)

# Goldman Sachs
GS = data.get_data_yahoo("GS", start, end)

# JPMorgan Chase
JPM = data.get_data_yahoo("JPM", start, end)

# Morgan Stanley
MS = data.get_data_yahoo("MS", start, end)

# Wells Fargo
WFC = data.get_data_yahoo("WFC", start, end)
[*********************100%***********************]  1 of 1 downloaded
[*********************100%***********************]  1 of 1 downloaded
[*********************100%***********************]  1 of 1 downloaded
[*********************100%***********************]  1 of 1 downloaded
[*********************100%***********************]  1 of 1 downloaded
[*********************100%***********************]  1 of 1 downloaded
In [5]:
tickers = ['BAC', 'C', 'GS', 'JPM', 'MS', 'WFC']

Combing all the stock price data of the banks from 2006 to 2017 into a dataset.

In [6]:
bank_stocks = pd.concat([BAC, C, GS, JPM, MS, WFC],axis=1,keys=tickers)
In [7]:
bank_stocks.columns.names = ['Bank Ticker','Stock Info']

Check the head of the bank_stocks dataframe.

In [8]:
bank_stocks.head()
Out[8]:
Bank Ticker BAC C ... MS WFC
Stock Info Open High Low Close Adj Close Volume Open High Low Close ... Low Close Adj Close Volume Open High Low Close Adj Close Volume
Date
2006-01-03 46.919998 47.180000 46.150002 47.080002 36.711761 16296700 490.000000 493.799988 481.100006 492.899994 ... 56.740002 58.310001 39.612274 5377000 31.600000 31.975000 31.195000 31.900000 22.228920 11016400
2006-01-04 47.000000 47.240002 46.450001 46.580002 36.321880 17757900 488.600006 491.000000 483.500000 483.799988 ... 58.349998 58.349998 39.639458 7977800 31.799999 31.820000 31.365000 31.530001 21.971096 10870000
2006-01-05 46.580002 46.830002 46.320000 46.639999 36.368652 14970700 484.399994 487.799988 484.000000 486.200012 ... 58.020000 58.509998 39.748142 5778000 31.500000 31.555000 31.309999 31.495001 21.946697 10158000
2006-01-06 46.799999 46.910000 46.349998 46.570000 36.314068 12599800 488.799988 489.000000 482.000000 486.200012 ... 58.049999 58.570000 39.788902 6889800 31.580000 31.775000 31.385000 31.680000 22.075617 8403800
2006-01-09 46.720001 46.970001 46.360001 46.599998 36.337463 15619400 486.000000 487.399994 483.000000 483.899994 ... 58.619999 59.189999 40.210091 4144500 31.674999 31.825001 31.555000 31.674999 22.072132 5619600

5 rows × 36 columns

In [9]:
bank_stocks.xs(key='Close',axis=1,level='Stock Info').max()
Out[9]:
Bank Ticker
BAC     54.900002
C      564.099976
GS     261.010010
JPM    107.830002
MS      89.300003
WFC     61.610001
dtype: float64

Creating the returns for each bank's stock. Returns are typically defined by:

$$r_t = \frac{p_t - p_{t-1}}{p_{t-1}} = \frac{p_t}{p_{t-1}} - 1$$

The first 5 rows of the returns for these banks:

In [10]:
returns = pd.DataFrame()
In [11]:
for tick in tickers:
    returns[tick+' Return'] = bank_stocks[tick]['Close'].pct_change()
returns.head()
Out[11]:
BAC Return C Return GS Return JPM Return MS Return WFC Return
Date
2006-01-03 NaN NaN NaN NaN NaN NaN
2006-01-04 -0.010620 -0.018462 -0.013812 -0.014183 0.000686 -0.011599
2006-01-05 0.001288 0.004961 -0.000393 0.003029 0.002742 -0.001110
2006-01-06 -0.001501 0.000000 0.014169 0.007046 0.001025 0.005874
2006-01-09 0.000644 -0.004731 0.012030 0.016242 0.010586 -0.000158

It's the pairplot of the returns for the 6 banks. From the graph, we can see that all the banks are centured at the 0 retuen rate, but some banks like Morgan Stanley and CitiGroup have higher fluctuation (deviation) for the return rates, which means that they have the higher risk for their stocks.

In [12]:
sns.set(style="ticks", color_codes=True)
sns.pairplot(returns[1:],kind="reg", plot_kws={'line_kws':{'color':'red', 'linestyle': ':'}})
Out[12]:
<seaborn.axisgrid.PairGrid at 0x1050cbe80>

When did the banks have the lowest return rate over the years?

In [13]:
returns.idxmin()
Out[13]:
BAC Return   2009-01-20
C Return     2009-02-27
GS Return    2009-01-20
JPM Return   2009-01-20
MS Return    2008-10-09
WFC Return   2009-01-20
dtype: datetime64[ns]

When did the banks have the highest return rate over the years?

In [14]:
returns.idxmax()
Out[14]:
BAC Return   2009-04-09
C Return     2008-11-24
GS Return    2008-11-24
JPM Return   2009-01-21
MS Return    2008-10-13
WFC Return   2008-07-16
dtype: datetime64[ns]

Looking at the standard deviation of the returns, CitiGroup is the riskiest over the entire time period

In [15]:
returns.std()
Out[15]:
BAC Return    0.034183
C Return      0.035917
GS Return     0.023954
JPM Return    0.025823
MS Return     0.035234
WFC Return    0.028128
dtype: float64

The Stock Price of Bank of America from 2006 to 2011

For these 6 banks, Bank of America, CitiGroup, JPMorgan Chase, Morgan Stanley, and Wells Fargo, their retuen rates went extended during 2018 and 2019. It means that they encountered some financial impact during these times and it caused them to have a huge fluctuation on their return rates. There were higher risks for their stocks. Around 2018, the global financial crisis happened and it undoubtedly impacted the banks and the stock markets dramastically dropped.

Among this 6 banks, we can see that Bank of America and CitiGroup had huger impact on their stock prices; on the other side, Golden Sachs and JPMorgan Chase had the less impact during this time. Golden Sachs and JPMorgan Chase have more stable stock prices through the years.

In [16]:
import warnings
warnings.filterwarnings(action='once')
In [17]:
f, axes = plt.subplots(2, 3, figsize=(12, 8), sharex=True)
sns.distplot(returns.loc['2006-01-01':'2006-12-31']['BAC Return'].dropna(),color='red',bins=50,ax=axes[0,0])
sns.distplot(returns.loc['2007-01-01':'2007-12-31']['BAC Return'].dropna(),color='red',bins=50,ax=axes[0,1])
sns.distplot(returns.loc['2008-01-01':'2008-12-31']['BAC Return'].dropna(),color='red',bins=50,ax=axes[0,2])
sns.distplot(returns.loc['2009-01-01':'2009-12-31']['BAC Return'].dropna(),color='red',bins=50,ax=axes[1,0])
sns.distplot(returns.loc['2010-01-01':'2010-12-31']['BAC Return'].dropna(),color='red',bins=50,ax=axes[1,1])
sns.distplot(returns.loc['2011-01-01':'2011-12-31']['BAC Return'].dropna(),color='red',bins=50,ax=axes[1,2])
axes[0,0].set_title("2006")
axes[0,1].set_title("2007")
axes[0,2].set_title("2008")
axes[1,0].set_title("2009")
axes[1,1].set_title("2010")
axes[1,2].set_title("2011")
axes[0,0].set_xlim([-0.4,0.4])
plt.tight_layout()
/anaconda3/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6462: UserWarning:

The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg.

The Stock Price of CitiGroup from 2006 to 2011

In [18]:
f, axes = plt.subplots(2, 3, figsize=(12, 8), sharex=True)
sns.distplot(returns.loc['2006-01-01':'2006-12-31']['C Return'].dropna(),color='orange',bins=50,ax=axes[0,0])
sns.distplot(returns.loc['2007-01-01':'2007-12-31']['C Return'].dropna(),color='orange',bins=50,ax=axes[0,1])
sns.distplot(returns.loc['2008-01-01':'2008-12-31']['C Return'].dropna(),color='orange',bins=50,ax=axes[0,2])
sns.distplot(returns.loc['2009-01-01':'2009-12-31']['C Return'].dropna(),color='orange',bins=50,ax=axes[1,0])
sns.distplot(returns.loc['2010-01-01':'2010-12-31']['C Return'].dropna(),color='orange',bins=50,ax=axes[1,1])
sns.distplot(returns.loc['2011-01-01':'2011-12-31']['C Return'].dropna(),color='orange',bins=50,ax=axes[1,2])
axes[0,0].set_title("2006")
axes[0,1].set_title("2007")
axes[0,2].set_title("2008")
axes[1,0].set_title("2009")
axes[1,1].set_title("2010")
axes[1,2].set_title("2011")
axes[0,0].set_xlim([-0.4,0.4])
plt.tight_layout()
/anaconda3/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6462: UserWarning:

The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg.

The Stock Price of Goldman Sachs from 2006 to 2011

In [19]:
f, axes = plt.subplots(2, 3, figsize=(12, 8), sharex=True)
sns.distplot(returns.loc['2006-01-01':'2006-12-31']['GS Return'].dropna(),color='brown',bins=50,ax=axes[0,0])
sns.distplot(returns.loc['2007-01-01':'2007-12-31']['GS Return'].dropna(),color='brown',bins=50,ax=axes[0,1])
sns.distplot(returns.loc['2008-01-01':'2008-12-31']['GS Return'].dropna(),color='brown',bins=50,ax=axes[0,2])
sns.distplot(returns.loc['2009-01-01':'2009-12-31']['GS Return'].dropna(),color='brown',bins=50,ax=axes[1,0])
sns.distplot(returns.loc['2010-01-01':'2010-12-31']['GS Return'].dropna(),color='brown',bins=50,ax=axes[1,1])
sns.distplot(returns.loc['2011-01-01':'2011-12-31']['GS Return'].dropna(),color='brown',bins=50,ax=axes[1,2])
axes[0,0].set_title("2006")
axes[0,1].set_title("2007")
axes[0,2].set_title("2008")
axes[1,0].set_title("2009")
axes[1,1].set_title("2010")
axes[1,2].set_title("2011")
axes[0,0].set_xlim([-0.4,0.4])
plt.tight_layout()
/anaconda3/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6462: UserWarning:

The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg.

The Stock Price of JPMorgan Chase from 2006 to 2011

In [20]:
f, axes = plt.subplots(2, 3, figsize=(12, 8), sharex=True)
sns.distplot(returns.loc['2006-01-01':'2006-12-31']['JPM Return'].dropna(),color='green',bins=50,ax=axes[0,0])
sns.distplot(returns.loc['2007-01-01':'2007-12-31']['JPM Return'].dropna(),color='green',bins=50,ax=axes[0,1])
sns.distplot(returns.loc['2008-01-01':'2008-12-31']['JPM Return'].dropna(),color='green',bins=50,ax=axes[0,2])
sns.distplot(returns.loc['2009-01-01':'2009-12-31']['JPM Return'].dropna(),color='green',bins=50,ax=axes[1,0])
sns.distplot(returns.loc['2010-01-01':'2010-12-31']['JPM Return'].dropna(),color='green',bins=50,ax=axes[1,1])
sns.distplot(returns.loc['2011-01-01':'2011-12-31']['JPM Return'].dropna(),color='green',bins=50,ax=axes[1,2])
axes[0,0].set_title("2006")
axes[0,1].set_title("2007")
axes[0,2].set_title("2008")
axes[1,0].set_title("2009")
axes[1,1].set_title("2010")
axes[1,2].set_title("2011")
axes[0,0].set_xlim([-0.4,0.4])
plt.tight_layout()
/anaconda3/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6462: UserWarning:

The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg.

The Stock Price of Morgan Stanley from 2006 to 2011

In [21]:
f, axes = plt.subplots(2, 3, figsize=(12, 8), sharex=True)
sns.distplot(returns.loc['2006-01-01':'2006-12-31']['MS Return'].dropna(),color='blue',bins=50,ax=axes[0,0])
sns.distplot(returns.loc['2007-01-01':'2007-12-31']['MS Return'].dropna(),color='blue',bins=50,ax=axes[0,1])
sns.distplot(returns.loc['2008-01-01':'2008-12-31']['MS Return'].dropna(),color='blue',bins=50,ax=axes[0,2])
sns.distplot(returns.loc['2009-01-01':'2009-12-31']['MS Return'].dropna(),color='blue',bins=50,ax=axes[1,0])
sns.distplot(returns.loc['2010-01-01':'2010-12-31']['MS Return'].dropna(),color='blue',bins=50,ax=axes[1,1])
sns.distplot(returns.loc['2011-01-01':'2011-12-31']['MS Return'].dropna(),color='blue',bins=50,ax=axes[1,2])
axes[0,0].set_title("2006")
axes[0,1].set_title("2007")
axes[0,2].set_title("2008")
axes[1,0].set_title("2009")
axes[1,1].set_title("2010")
axes[1,2].set_title("2011")
axes[0,0].set_xlim([-0.4,0.4])
plt.tight_layout()
/anaconda3/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6462: UserWarning:

The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg.

The Stock Price of Wells Fargo from 2006 to 2011

In [22]:
f, axes = plt.subplots(2, 3, figsize=(12, 8), sharex=True)
sns.distplot(returns.loc['2006-01-01':'2006-12-31']['WFC Return'].dropna(),color='purple',bins=50,ax=axes[0,0])
sns.distplot(returns.loc['2007-01-01':'2007-12-31']['WFC Return'].dropna(),color='purple',bins=50,ax=axes[0,1])
sns.distplot(returns.loc['2008-01-01':'2008-12-31']['WFC Return'].dropna(),color='purple',bins=50,ax=axes[0,2])
sns.distplot(returns.loc['2009-01-01':'2009-12-31']['WFC Return'].dropna(),color='purple',bins=50,ax=axes[1,0])
sns.distplot(returns.loc['2010-01-01':'2010-12-31']['WFC Return'].dropna(),color='purple',bins=50,ax=axes[1,1])
sns.distplot(returns.loc['2011-01-01':'2011-12-31']['WFC Return'].dropna(),color='purple',bins=50,ax=axes[1,2])
axes[0,0].set_title("2006")
axes[0,1].set_title("2007")
axes[0,2].set_title("2008")
axes[1,0].set_title("2009")
axes[1,1].set_title("2010")
axes[1,2].set_title("2011")
axes[0,0].set_xlim([-0.4,0.4])
plt.tight_layout()
/anaconda3/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6462: UserWarning:

The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg.

Close price for each bank for the entire time

In [23]:
for tick in tickers:
    bank_stocks[tick]['Close'].plot(figsize=(15,6),label=tick)
plt.legend()
Out[23]:
<matplotlib.legend.Legend at 0x1a1b0a80b8>

Moving Averages for each bank in 2018

To analyze the moving averages for these stocks in the year 2008. Plot the rolling 30 day average against the Close Price for each bank for the year 2008.

From the below graphs, we can see that for these 6 banks in 2008, their stock prices are generally dropped due to the impact of financial crisis. However, JPMorgan Chase and Wells Fargo had a better performance on thier stocks compared to the other banks who had the dramatic drop during this time.

In [24]:
f, axes = plt.subplots(3, 2, figsize=(20, 15))
axes[0,0].plot(BAC['Close'].loc['2008-01-01':'2009-01-01'].rolling(window=30).mean(), label='30 Day Avg', color='red',ls=':')
axes[0,0].plot(BAC['Close'].loc['2008-01-01':'2009-01-01'], label='BAC CLOSE')
axes[0,0].set_title('Bank of America')

axes[0,1].plot(C['Close'].loc['2008-01-01':'2009-01-01'].rolling(window=30).mean(), label='30 Day Avg', color='red',ls=':')
axes[0,1].plot(C['Close'].loc['2008-01-01':'2009-01-01'], label='BAC CLOSE')
axes[0,1].set_title('CitiGroup')

axes[1,0].plot(GS['Close'].loc['2008-01-01':'2009-01-01'].rolling(window=30).mean(), label='30 Day Avg', color='red',ls=':')
axes[1,0].plot(GS['Close'].loc['2008-01-01':'2009-01-01'], label='BAC CLOSE')
axes[1,0].set_title('Goldman Sachs')

axes[1,1].plot(JPM['Close'].loc['2008-01-01':'2009-01-01'].rolling(window=30).mean(), label='30 Day Avg', color='red',ls=':')
axes[1,1].plot(JPM['Close'].loc['2008-01-01':'2009-01-01'], label='BAC CLOSE')
axes[1,1].set_title('JPMargan Chase')

axes[2,0].plot(MS['Close'].loc['2008-01-01':'2009-01-01'].rolling(window=30).mean(), label='30 Day Avg', color='red',ls=':')
axes[2,0].plot(MS['Close'].loc['2008-01-01':'2009-01-01'], label='BAC CLOSE')
axes[2,0].set_title('Morgan Stanley')

axes[2,1].plot(WFC['Close'].loc['2008-01-01':'2009-01-01'].rolling(window=30).mean(), label='30 Day Avg', color='red',ls=':')
axes[2,1].plot(WFC['Close'].loc['2008-01-01':'2009-01-01'], label='BAC CLOSE')
axes[2,1].set_title('Wells Fargo')
Out[24]:
Text(0.5,1,'Wells Fargo')

Heatmap of the correlation between the stocks close price throgh the years

From the heatmap and clustermap, we can tell that Bank of America, CitiGroup and Morgan Stanley have higher correlation with each other, which means that their stock prices may have certain level of impact among themselves. Another group, JPMorgan Chase, Goldman Sachs and Wells Fargo have higher correlation with each other's performance of the stock prices.

In [25]:
plt.figure(figsize=(9,7.2))
sns.heatmap(bank_stocks.xs(key='Close',axis=1,level='Stock Info').corr(),cmap='RdBu_r',annot=True)
Out[25]:
<matplotlib.axes._subplots.AxesSubplot at 0x1a1b059b00>

Optional: Use seaborn's clustermap to cluster the correlations together:

In [26]:
sns.clustermap(bank_stocks.xs(key='Close', axis=1, level='Stock Info').corr(),cmap='RdBu_r',annot=True)
Out[26]:
<seaborn.matrix.ClusterGrid at 0x1a1b3b9160>

Interactive Plots

You can zoom in these plots and see the details of the time that you are interested in

Candle Plot for Bank of America

From this plor, we can see that the financial crisis around 2018 impacted Bank of America very well. After the economic depression, Bank of America has the slighly increse after year 2012.

In [27]:
BAC[['Open', 'High', 'Low', 'Close']].loc['2006-01-01':'2018-01-01'].iplot(kind='candle')

Candle Plot for CitiGroup

CitiGroup had the dramastically drop on the stock price around 2008. They were hugely impacted by the financial crisis. Their stock price plummet from around 545 to around 10. Their stock price collapsed in these 2 years. After the economic downturn, they went up slightly at around 50.

In [28]:
C[['Open', 'High', 'Low', 'Close']].loc['2006-01-01':'2018-01-01'].iplot(kind='candle')

Candle Plot for Goldman Sachs

For the Goldman Sachs, they also have the sudden drop around 2018 but after the global financial tsunami, they bounced back and were grafually increasing through the years.

In [29]:
GS[['Open', 'High', 'Low', 'Close']].loc['2006-01-01':'2018-01-01'].iplot(kind='candle')

Candle Plot for JPMorgan Chase

JPMorgan Chase had the best performance among these 6 banks. Although they had a drop around 2009 but the impact was not that huge. It went back to the price as the one previouly and it was still growing on the stock price over the years. It has had a clear increase since 2012.

In [30]:
JPM[['Open', 'High', 'Low', 'Close']].loc['2006-01-01':'2018-01-01'].iplot(kind='candle')

Candle Plot for Morgan Stanley

Morgan Stanley had a serious drop during the financial crisis and then they went up to around 30 and then in 2012, it had another slight drop as well as in 2016. However, overall, they have a slight increase after the crisis.

In [31]:
MS[['Open', 'High', 'Low', 'Close']].loc['2006-01-01':'2018-01-01'].iplot(kind='candle')

Candle Plot for Wells Fargo

Wells Fargo was also suffered from the economic downturn, so they had the drop around 2009, and then they recovered quickly and went up until 2016. It dropped in 2016 and 2017. It may because of the sales scandal.

In [32]:
WFC[['Open', 'High', 'Low', 'Close']].loc['2006-01-01':'2018-01-01'].iplot(kind='candle')

The Simple Moving Average for 3 Major Banks, Bank of America, CitiGroup and JPMorgan Chase

Bollinger Band Plot for Bank of America for the year 2017

In [33]:
BAC['Close'].loc['2017-01-01':'2018-01-01'].ta_plot(study='boll')

Bollinger Band Plot for CitiGroup for the year 2017

In [34]:
C['Close'].loc['2017-01-01':'2018-01-01'].ta_plot(study='boll')

Bollinger Band Plot for JPMorgan Chase for the year 2017

In [36]:
JPM['Close'].loc['2017-01-01':'2018-01-01'].ta_plot(study='boll')