In this exercise you will learn to plot data using the ggplot2 package. To answer the questions below, use 4.1 Categorical vs. Categorical from Data Visualization with R.
# Load packages
library(tidyquant)
library(tidyverse)
library(lubridate) #for year()
# Pick stocks
stocks <- c("AAPL", "MSFT", "IBM")
# Import stock prices
stock_prices <- stocks %>%
  tq_get(get = "stock.prices",
         from = "1990-01-01",
         to = "2019-05-31") %>%
  group_by(symbol)
stock_prices
## # A tibble: 22,230 x 8
## # Groups: symbol [3]
##    symbol date        open  high   low close   volume adjusted
##    <chr>  <date>     <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>
##  1 AAPL   1990-01-02  1.26  1.34  1.25  1.33 45799600     1.08
##  2 AAPL   1990-01-03  1.36  1.36  1.34  1.34 51998800     1.09
##  3 AAPL   1990-01-04  1.37  1.38  1.33  1.34 55378400     1.10
##  4 AAPL   1990-01-05  1.35  1.37  1.32  1.35 30828000     1.10
##  5 AAPL   1990-01-08  1.34  1.36  1.32  1.36 25393200     1.11
##  6 AAPL   1990-01-09  1.36  1.36  1.32  1.34 21534800     1.10
##  7 AAPL   1990-01-10  1.34  1.34  1.28  1.29 49929600     1.05
##  8 AAPL   1990-01-11  1.29  1.29  1.23  1.23 52763200     1.00
##  9 AAPL   1990-01-12  1.22  1.24  1.21  1.23 42974400     1.00
## 10 AAPL   1990-01-15  1.23  1.28  1.22  1.22 40434800    0.997
## # … with 22,220 more rows
# Process stock_prices and save it under stock_returns
stock_returns <-
  stock_prices %>%
  # Calculate yearly returns
  tq_transmute(select = adjusted, mutate_fun = periodReturn, period = "yearly") %>%
  # Create a new variable, year
  mutate(year = year(date)) %>%
  # Drop date
  select(-date)
stock_returns
## # A tibble: 90 x 3
## # Groups: symbol [3]
##    symbol yearly.returns  year
##    <chr>           <dbl> <dbl>
##  1 AAPL            0.169  1990
##  2 AAPL            0.323  1991
##  3 AAPL           0.0691  1992
##  4 AAPL           -0.504  1993
##  5 AAPL            0.352  1994
##  6 AAPL           -0.173  1995
##  7 AAPL           -0.345  1996
##  8 AAPL           -0.371  1997
##  9 AAPL             2.12  1998
## 10 AAPL             1.51  1999
## # … with 80 more rows
library(dplyr)
plotdata <- stock_returns %>%
  group_by(symbol) %>%
  summarize(mean_yearly.returns = mean(yearly.returns))
plotdata
## # A tibble: 3 x 2
##   symbol mean_yearly.returns
##   <chr>                <dbl>
## 1 AAPL                 0.366
## 2 IBM                  0.116
## 3 MSFT                 0.283
ggplot(plotdata,
       aes(x = symbol,
           y = mean_yearly.returns)) +
  geom_bar(stat = "identity")
library(scales)
ggplot(plotdata,
       aes(x = factor(symbol,
                      labels = c("AAPL",
                                 "IBM",
                                 "MSFT")),
           y = mean_yearly.returns)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = percent(mean_yearly.returns)),
            vjust = -0.25)
ggplot(stock_returns,
       aes(x = yearly.returns,
           fill = symbol)) +
  geom_density(alpha = 0.4) +
  labs(title = "Yearly returns distribution by stock")
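The stock with the highest chance of losing big when things go wrong appears to be Apple. The yearly returns printed above include AAPL losses of about 50% (1993), 35% (1996), and 37% (1997). The density plot shows how often each level of yearly return occurred, not how prices moved over time, so a peak in a curve marks the most common return for that stock rather than a high point followed by a crash.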
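One way to check a downside claim like this against the numbers is to summarize the losing years directly. The sketch below is only illustrative: it uses the ten AAPL rows printed in the stock_returns output above, and the names returns_sample, downside, worst_year, p10, and share_negative are made up for this example. The same pipeline could be run on the full stock_returns to compare all three stocks.

```r
library(dplyr)

# Ten AAPL rows copied from the stock_returns printout above.
# In the exercise, the full stock_returns object would be used instead.
returns_sample <- data.frame(
  symbol = "AAPL",
  yearly.returns = c(0.169, 0.323, 0.0691, -0.504, 0.352,
                     -0.173, -0.345, -0.371, 2.12, 1.51),
  year = 1990:1999
)

# Downside summary per stock: worst year, 10th percentile of returns,
# and the share of years with a negative return
downside <- returns_sample %>%
  group_by(symbol) %>%
  summarize(worst_year = min(yearly.returns),
            p10 = quantile(yearly.returns, 0.10, names = FALSE),
            share_negative = mean(yearly.returns < 0))
downside
```

A stock with a lower worst_year and a larger share_negative has a heavier left tail, which is the part of the density plot that matters for "losing big".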
ggplot(stock_returns,
       aes(x = symbol,
           y = yearly.returns)) +
  geom_boxplot() +
  labs(title = "Yearly returns distribution by stock")
If I had to choose among the three stocks, I would choose Apple. Its mean yearly return over the period (36.6%) is higher than Microsoft's (28.3%) and IBM's (11.6%), although the large losses it suffered in some years mean it is also a riskier holding. IBM looks like the least attractive of the three on average return. The density plot and box plot describe the spread of yearly returns, not their path over time, so they cannot show a stock reaching an all-time high and then crashing or flat-lining.
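The quantities a box plot draws can also be computed as a table, which makes the comparison between stocks easier to state in numbers. This is a minimal sketch under the same assumption as before: it uses only the ten AAPL rows shown in the stock_returns output, and the names returns_sample and box_summary and the summary column names are hypothetical. Run on the full stock_returns, it would give one row per stock.

```r
library(dplyr)

# Ten AAPL rows copied from the stock_returns printout above (illustrative subset)
returns_sample <- data.frame(
  symbol = "AAPL",
  yearly.returns = c(0.169, 0.323, 0.0691, -0.504, 0.352,
                     -0.173, -0.345, -0.371, 2.12, 1.51),
  year = 1990:1999
)

# Per-stock center and spread: mean, quartiles, median, interquartile range
box_summary <- returns_sample %>%
  group_by(symbol) %>%
  summarize(mean_return = mean(yearly.returns),
            q1 = quantile(yearly.returns, 0.25, names = FALSE),
            median_return = median(yearly.returns),
            q3 = quantile(yearly.returns, 0.75, names = FALSE),
            iqr = IQR(yearly.returns))
box_summary
```

A mean well above the median, as in this subset, suggests that a few very good years pull the average up, so the mean alone can overstate the typical year.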