Goal


The goal of this tutorial is to learn how to manually add extra lines, such as a mean or quantile line, to a ggplot chart.


Preparing the data


We will use the Bitcoin Price Prediction dataset which can be downloaded through this link.

library(ggplot2)
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
## 
##     date
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:lubridate':
## 
##     intersect, setdiff, union
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
bit_data <- read.csv("bitcoin_price_Training - Training.csv", header = TRUE, sep = ",")
str(bit_data)
## 'data.frame':    1556 obs. of  7 variables:
##  $ Date      : Factor w/ 1556 levels "Apr 01, 2014",..: 763 758 753 748 743 738 733 728 723 718 ...
##  $ Open      : num  2763 2724 2807 2680 2539 ...
##  $ High      : num  2890 2759 2809 2897 2693 ...
##  $ Low       : num  2721 2645 2693 2680 2529 ...
##  $ Close     : num  2875 2757 2726 2809 2672 ...
##  $ Volume    : Factor w/ 1314 levels "-","1,064,730,000",..: 1245 1129 1205 28 1187 1286 31 1246 3 13 ...
##  $ Market.Cap: Factor w/ 1552 levels "1,000,070,000",..: 946 942 949 937 920 925 945 944 948 933 ...
# Convert Date to a Date variable. If your time locale is not "C" (or English), the month names will not be recognized and you will get NA, so we set the LC_TIME locale to "C" first.
Sys.setlocale("LC_TIME", "C")
## [1] "C"
bit_data$Date <- as.Date(bit_data$Date, format = "%B %d, %Y")
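A quick sanity check after the conversion can confirm that no date silently turned into NA. The sketch below uses two made-up sample strings in the same layout as the Date column; `%b` is the code for an abbreviated month name, and R's date parser documents that `%B` also accepts abbreviated names on input, which is why the format string used above works.

```r
# Use the C locale so English month abbreviations such as "Jul" are recognized
Sys.setlocale("LC_TIME", "C")

# Hypothetical sample values in the same "Mon dd, yyyy" layout as the Date column
sample_dates <- c("Jul 31, 2017", "Jan 01, 2017")
parsed <- as.Date(sample_dates, format = "%b %d, %Y")

# TRUE here would mean that at least one string failed to parse
anyNA(parsed)
```

On the real data, the equivalent check would be `sum(is.na(bit_data$Date))`, which should return 0.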

For this tutorial we will only use the data from 2017. Because we only need the “Close” variable, we will not convert the “Volume” or “Market.Cap” variables to numeric.

bit_2017 <- filter(bit_data, year(Date) == 2017)
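Should the “Volume” or “Market.Cap” columns be needed later, they could be converted along the following lines. This is only a sketch: `to_number` is a hypothetical helper that is not part of the tutorial, and treating "-" as a missing value is an assumption based on the factor levels shown by str().

```r
# Hypothetical helper: strip thousands separators and convert to numeric.
# Assumption: a lone "-" marks a missing value and should become NA.
to_number <- function(x) {
  x <- gsub(",", "", as.character(x))  # as.character() also handles factors
  x[x == "-"] <- NA
  as.numeric(x)
}

# Two values taken from the factor levels shown by str() above
res <- to_number(factor(c("1,064,730,000", "-")))
res
```

Applied to the data it would look like `bit_data$Volume <- to_number(bit_data$Volume)`.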

Visualization


Now we are going to plot the closing price of Bitcoin from January to July 2017.

ggplot(bit_2017, aes(x = Date, y = Close)) + geom_line(color = "blue")

If we want to add an extra line showing, for example, the mean closing price, we can do so by adding a second geom_line:

ggplot(bit_2017, aes(x = Date, y = Close)) + geom_line(color = "blue") +
  geom_line(aes(y = mean(Close)), color = "red", linetype = "dotted")
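An alternative for a constant reference line is geom_hline(), which takes the y value directly through its yintercept argument. The sketch below uses a made-up data frame called `demo` (not the Bitcoin data) so that it runs without the CSV file; with the real data, `demo` would be replaced by `bit_2017`.

```r
library(ggplot2)

# Made-up stand-in data so the example runs without the CSV file
set.seed(1)
demo <- data.frame(
  Date  = seq(as.Date("2017-01-01"), by = "day", length.out = 100),
  Close = 1000 + cumsum(rnorm(100, mean = 5, sd = 30))
)

# geom_hline() draws one horizontal line at the given y value
p_mean <- ggplot(demo, aes(x = Date, y = Close)) +
  geom_line(color = "blue") +
  geom_hline(yintercept = mean(demo$Close), color = "red", linetype = "dotted")

p_mean
```

One difference worth noting: geom_hline() spans the full width of the panel, while the geom_line() version only covers the range of dates in the data.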

The mean is not the only statistic that can be plotted. Quantile lines can also be added to give a better picture of how the closing price is distributed.

ggplot(bit_2017, aes(x = Date, y = Close)) + geom_line(color = "blue") +
  geom_line(aes(y = mean(Close)), color = "red", linetype = "dotted") +
  geom_line(aes(y = quantile(Close, 0.75)), color = "black", linetype = "dashed") +
  geom_line(aes(y = quantile(Close, 0.25)), color = "black", linetype = "dashed")
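Because quantile() accepts several probabilities at once, both quantile lines can also be drawn by a single geom_hline() layer. This is a sketch on a made-up data frame `demo`, not on the Bitcoin data; computing the values up front also makes it easy to check that they come from the intended column.

```r
library(ggplot2)

# Made-up stand-in data so the example runs without the CSV file
set.seed(1)
demo <- data.frame(
  Date  = seq(as.Date("2017-01-01"), by = "day", length.out = 100),
  Close = 1000 + cumsum(rnorm(100, mean = 5, sd = 30))
)

# Compute both quartiles of the closing price in one call
q <- quantile(demo$Close, probs = c(0.25, 0.75))

# A vector of intercepts gives one horizontal line per value
p_quant <- ggplot(demo, aes(x = Date, y = Close)) +
  geom_line(color = "blue") +
  geom_hline(yintercept = mean(demo$Close), color = "red", linetype = "dotted") +
  geom_hline(yintercept = q, color = "black", linetype = "dashed")

p_quant
```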

Finally, we are going to label the lines so that it is clear what each one represents.

ggplot(bit_2017, aes(x = Date, y = Close)) + geom_line(color = "blue") +
  geom_line(aes(y = mean(Close)), color = "red", linetype = "dotted") +
  geom_line(aes(y = quantile(Close, 0.75)), color = "black", linetype = "dashed") +
  geom_line(aes(y = quantile(Close, 0.25)), color = "black", linetype = "dashed") +
  geom_text(aes(Date[210], mean(Close), label = "Mean"), vjust = -0.3) +
  geom_text(aes(Date[210], quantile(Close, 0.75), label = "75%"), vjust = -0.3) +
  geom_text(aes(Date[210], quantile(Close, 0.25), label = "25%"), vjust = -3)
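When a label sits at a fixed position, annotate() is a commonly used alternative to geom_text(). A geom_text() layer with constant aesthetics is evaluated against every row of the data, so the same label is drawn many times on top of itself, whereas annotate() draws it once. The sketch below again uses a made-up data frame `demo`, and the position `label_x` is an arbitrary choice for illustration.

```r
library(ggplot2)

# Made-up stand-in data so the example runs without the CSV file
set.seed(1)
demo <- data.frame(
  Date  = seq(as.Date("2017-01-01"), by = "day", length.out = 100),
  Close = 1000 + cumsum(rnorm(100, mean = 5, sd = 30))
)

q <- quantile(demo$Close, probs = c(0.25, 0.75))
label_x <- min(demo$Date) + 2  # arbitrary position near the left edge

p_lab <- ggplot(demo, aes(x = Date, y = Close)) +
  geom_line(color = "blue") +
  geom_hline(yintercept = mean(demo$Close), color = "red", linetype = "dotted") +
  geom_hline(yintercept = q, color = "black", linetype = "dashed") +
  # Each annotate() call places a single text label just above its line
  annotate("text", x = label_x, y = mean(demo$Close), label = "Mean", vjust = -0.3) +
  annotate("text", x = label_x, y = q[["75%"]], label = "75%", vjust = -0.3) +
  annotate("text", x = label_x, y = q[["25%"]], label = "25%", vjust = -0.3)

p_lab
```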


Conclusion


In this tutorial we have learnt how to add extra information to a visualization. This is very useful when presenting information to both technical and non-technical audiences.