We’re going to showcase some features of various packages from the Tidyverse!

Here we have Dogecoin/USD exchange-rate data from Yahoo Finance, covering 2014 to roughly the present. Using magrittr, dplyr, lubridate, and ggplot2, this should be a snap. Let’s get started.

Load the data.

rates <- read.csv("https://raw.githubusercontent.com/TheWerefriend/data607/master/tidyverseAssignment/DOGE-USD.csv")

colnames(rates)
## [1] "Date"      "Open"      "High"      "Low"       "Close"     "Adj.Close"
## [7] "Volume"

Check to see if close and adjusted close are the same.

tests <- ifelse(rates$Close==rates$Adj.Close, "Identical", "Distinct")

"Distinct" %in% tests
## [1] FALSE

This result makes sense: a cryptocurrency has no corporate actions, such as stock splits or dividends, that would cause a day's closing price to be adjusted.
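As a side note, the same check can be written more compactly with all(). This is only a sketch on a made-up data frame (toy and its values are hypothetical, not taken from the Yahoo Finance file):

```r
# Toy data frame standing in for `rates`; the numbers are invented
toy <- data.frame(
  Close     = c(0.0003, 0.0004, 0.0005),
  Adj.Close = c(0.0003, 0.0004, 0.0005)
)

# TRUE only if every row has matching Close and Adj.Close values
all_same <- all(toy$Close == toy$Adj.Close)
all_same
```

If the columns held NA values, all() could return NA instead of TRUE or FALSE, so passing na.rm = TRUE may be worth considering in that case.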

Remove Adj.Close.

library(magrittr)
library(dplyr)

rates <- rates %>%
  select(!Adj.Close)

colnames(rates)
## [1] "Date"   "Open"   "High"   "Low"    "Close"  "Volume"

Inside dplyr's select(), the ! operator selects the complement of a set of variables given by name, so select(!Adj.Close) keeps every column except Adj.Close.
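To see the complement selection in isolation, here is a minimal sketch on a made-up data frame (toy and its columns a, b, c are hypothetical). It assumes dplyr 1.0.0 or later, where ! is supported in select():

```r
library(dplyr)

# Hypothetical three-column data frame
toy <- data.frame(a = 1:2, b = 3:4, c = 5:6)

# Drop one column by negating its name
drop_one <- toy %>% select(!b)

# Drop several columns by negating a c() of names
drop_two <- toy %>% select(!c(a, b))

colnames(drop_one)
colnames(drop_two)
```

The first call keeps a and c; the second keeps only c.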

Convert the Date column to a date, and everything else to numeric values.

library(lubridate)

rates$Date <- ymd(rates$Date)
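For a quick look at what ymd() does, here is a small sketch using an invented date string (the date itself is hypothetical). It shows that ymd() returns a Date object and accepts a few common year-month-day layouts:

```r
library(lubridate)

# The same (made-up) day written three ways
d1 <- ymd("2014-09-17")
d2 <- ymd("2014/09/17")
d3 <- ymd("20140917")

class(d1)   # a base R Date object
year(d1)    # extract the year component
d1 == d2 && d2 == d3
```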

Now, all the other values are stored as factors (the read.csv() default before R 4.0.0; from 4.0.0 on, they are read as character strings instead). Something strange happens when we try to convert factors directly to numeric values:

rates$Open[[1]]
## [1] 0.000293
## 1413 Levels: 0.000087 0.000089 0.000090 0.000091 0.000092 0.000093 ... null
as.numeric(rates$Open[[1]])
## [1] 190

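Called on a factor, as.numeric() returns the integer level code rather than the printed value: 0.000293 is apparently the 190th factor level in the column! Converting to character first avoids this. Since there are also some null values in the data (days where major changes happened with the DOGE network, I assume), we must throw those observations out.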

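# Go through character so the printed values, not the factor level
# codes, are parsed; "null" entries become NA (with a coercion warning)
rates <- rates %>%
  mutate(across(!Date, ~ as.numeric(as.character(.x))))
# Drop the rows that contained "null"
rates <- na.omit(rates)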
anyNA(rates)
## [1] FALSE
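The factor pitfall above is easy to reproduce on a tiny vector. This is just an illustrative sketch; the factor f below is made up, though its values mimic the ones in the Open column:

```r
# A small factor with one non-numeric "null" entry
f <- factor(c("0.000293", "0.000087", "null"))

# Direct conversion gives the integer level codes
codes <- as.numeric(f)

# Going through character parses the labels; "null" becomes NA
vals <- suppressWarnings(as.numeric(as.character(f)))

codes
vals
```

Here codes holds 2, 1, 3 (positions in the sorted levels), while vals holds the actual numbers followed by NA.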

transmute() works like mutate(), except that it keeps only the columns it creates and drops all the others; mutate() keeps every existing column and overwrites one when a new column reuses its name.
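A small sketch on a made-up data frame (toy, x, and y are hypothetical names) compares the two verbs side by side:

```r
library(dplyr)

# Hypothetical data: x is stored as text, y is already numeric
toy <- data.frame(x = c("1.5", "2.5"), y = c(10, 20))

# mutate() overwrites x in place and keeps y
kept <- toy %>% mutate(x = as.numeric(x))

# transmute() returns only the column it was asked to create
only_new <- toy %>% transmute(x = as.numeric(x))

colnames(kept)
colnames(only_new)
```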

Graph the prices and volume of Doge.

library(ggplot2)

a <- ggplot(rates, aes(x = Date)) +
  geom_line(aes(y = Open), color = "steelblue") +
  geom_line(aes(y = High), color = "green") +
  geom_line(aes(y = Low), color = "red") +
  geom_line(aes(y = Close), color = "purple")

b <- ggplot(rates, aes(x = Date)) +
  geom_line(aes(y = Volume), color = "black")

a + labs(x = "Time", y = "Price in USD")

b + labs(x = "Time", y = "Volume")
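One possible variation on the price plot, sketched here under assumptions: if the price columns are first reshaped into long format, color can be mapped inside aes() and ggplot2 will draw a legend automatically. The tidyr package and its pivot_longer() function are not used in the walkthrough above, and the toy data frame and the Series and Price names are invented for illustration:

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Made-up stand-in for `rates` with two price columns
toy <- data.frame(
  Date  = as.Date("2021-01-01") + 0:2,
  Open  = c(1, 2, 3),
  Close = c(2, 3, 4)
)

# One row per (Date, Series) pair instead of one column per series
long <- toy %>%
  pivot_longer(cols = !Date, names_to = "Series", values_to = "Price")

# Mapping color to Series yields one line per series plus a legend
p <- ggplot(long, aes(x = Date, y = Price, color = Series)) +
  geom_line() +
  labs(x = "Time", y = "Price in USD")
```

Printing p draws the plot. The trade-off is an extra reshaping step in exchange for not having to write one geom_line() per column.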