We’re going to showcase some features of various packages from the Tidyverse!
Here we have data on the Dogecoin/USD exchange rate, as reported by Yahoo Finance, from 2014 to roughly the present. Using magrittr, dplyr, lubridate, and ggplot2, this should be a snap. Let’s get started.
rates <- read.csv("https://raw.githubusercontent.com/TheWerefriend/data607/master/tidyverseAssignment/DOGE-USD.csv")
colnames(rates)
## [1] "Date" "Open" "High" "Low" "Close" "Adj.Close"
## [7] "Volume"
tests <- ifelse(rates$Close==rates$Adj.Close, "Identical", "Distinct")
"Distinct" %in% tests
## [1] FALSE
No row is flagged as "Distinct", so the Close and Adj.Close columns never differ. This result makes sense: a cryptocurrency has no corporate actions, such as stock splits, that could cause a day’s closing value to be adjusted.
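The same check can be written more compactly with all(), which reduces the element-wise comparison to a single logical value. A minimal sketch, assuming a small made-up data frame (the `toy` values are illustrative and not taken from the CSV):

```r
# Hypothetical stand-in for the Close and Adj.Close columns
toy <- data.frame(
  Close     = c("0.000293", "0.000268", "null"),
  Adj.Close = c("0.000293", "0.000268", "null")
)

# TRUE only if every pair of values matches
same <- all(toy$Close == toy$Adj.Close)
same
```

If the columns held real NA values instead of the string "null", all() would return NA unless na.rm = TRUE is passed.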
library(magrittr)
library(dplyr)
rates <- rates %>%
select(!Adj.Close)
colnames(rates)
## [1] "Date" "Open" "High" "Low" "Close" "Volume"
Inside dplyr’s select(), the ! operator takes the complement of a selection, so select(!Adj.Close) keeps every column except Adj.Close.
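To drop more than one column, the names can be wrapped in c() before negating. A small sketch on a hypothetical three-column data frame, assuming dplyr 1.0 or later (where ! is supported in selections):

```r
library(dplyr)

# Hypothetical data frame used only for illustration
toy <- data.frame(a = 1:2, b = 3:4, c = 5:6)

# Drop a single column by name
one_dropped <- select(toy, !a)
names(one_dropped)

# Drop several columns at once by negating a c() of names
two_dropped <- select(toy, !c(a, b))
names(two_dropped)
```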
library(lubridate)
rates$Date <- ymd(rates$Date)
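As a quick illustration of what ymd() returns, the sketch below parses two example date strings (chosen for illustration) and pulls out their components with lubridate’s accessor functions:

```r
library(lubridate)

# Illustrative strings in year-month-day order
d <- ymd(c("2014-09-17", "2021-01-29"))

# The result is a standard Date vector
class(d)

# Accessors extract individual components
year(d)
month(d)
```

Once the column is a Date, ggplot2 can place it on a continuous time axis instead of treating each string as a separate category.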
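Now, all the other values were read in as character strings rather than numbers. (In R versions before 4.0, read.csv() would have stored them as factors by default, and as.numeric() applied to a factor returns the integer level code instead of the value.) Converting a single string works as expected:
rates$Open[[1]]
## [1] "0.000293"
as.numeric(rates$Open[[1]])
## [1] 0.000293
The columns are not numeric because the data contains some null values (days where major changes happened with the DOGE network, I assume). Converting through as.character() first guards against the factor case, and we must then throw out the observations with missing values.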
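# Convert columns 2 to 6 to numeric; going through as.character() is also safe for factors.
# "null" entries become NA, and R warns about the coercion.
rates[, 2:6] <- rates %>%
  select(!Date) %>%
  sapply(function(col) as.numeric(as.character(col)))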
rates <- na.omit(rates)
anyNA(rates)
## [1] FALSE
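The factor pitfall that motivates the as.character() step can be reproduced on a tiny vector. This is a sketch with made-up values, where "null" mimics the missing-value marker in the CSV:

```r
# Illustrative factor; levels are sorted as "0.25", "0.5", "null"
f <- factor(c("0.5", "0.25", "null"))

# Direct conversion returns the integer level codes, not the values
codes <- as.numeric(f)

# Going through as.character() recovers the numbers;
# "null" becomes NA (the coercion warning is silenced here)
values <- suppressWarnings(as.numeric(as.character(f)))

codes
values
```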
As an aside, dplyr’s transmute() works like mutate(), except that it returns only the variables defined in the call and drops all the others, whereas mutate() keeps the existing variables.
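The difference between the two verbs is easiest to see side by side. A minimal sketch on a hypothetical two-column data frame (the column `Change` is introduced only for illustration):

```r
library(dplyr)

# Hypothetical prices used only for illustration
toy <- data.frame(Open = c(1, 2), Close = c(1.5, 1.8))

# mutate() keeps Open and Close and appends Change
kept <- mutate(toy, Change = Close - Open)
names(kept)

# transmute() returns only the newly defined Change column
only_new <- transmute(toy, Change = Close - Open)
names(only_new)
```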
library(ggplot2)
# Refer to columns by bare name inside aes(); ggplot() already receives the data frame
a <- ggplot(rates, aes(x = Date)) +
  geom_line(aes(y = Open), color = "steelblue") +
  geom_line(aes(y = High), color = "green") +
  geom_line(aes(y = Low), color = "red") +
  geom_line(aes(y = Close), color = "purple")  # closing price
b <- ggplot(rates, aes(x = Date)) +
  geom_line(aes(y = Volume), color = "black")
a + labs(x = "Time", y = "Price in USD")
b + labs(x = "Time", y = "Volume")
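The price plot draws several colored lines but has no legend telling them apart. One possible way to get one, sketched here on a made-up three-row data frame, is to map a label to color inside aes() and assign the actual colors with scale_color_manual(). The legend title "Series" and the `toy` values are assumptions for illustration:

```r
library(ggplot2)

# Hypothetical prices used only for illustration
toy <- data.frame(
  Date  = as.Date("2021-01-01") + 0:2,
  Open  = c(0.005, 0.006, 0.010),
  Close = c(0.006, 0.010, 0.009)
)

# A constant string inside aes() becomes a legend entry
p <- ggplot(toy, aes(x = Date)) +
  geom_line(aes(y = Open, color = "Open")) +
  geom_line(aes(y = Close, color = "Close")) +
  scale_color_manual(
    name = "Series",
    values = c(Open = "steelblue", Close = "purple")
  ) +
  labs(x = "Time", y = "Price in USD")

p
```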
rates2 <- transform(rates, High = as.numeric(High), Volume = as.numeric(Volume))
rates2 <- na.omit(rates2)
## Find the date on which the highest trading volume occurred
# Filter the rows by maximum trading volume
rates2 %>% filter(Volume==max(Volume))
## Date Open High Low Close Volume
## 1 2021-01-29 0.043734 0.077973 0.032341 0.047162 25403310432
## Find the date on which the highest price was reached
# Filter the rows by maximum price
rates2 %>% filter(High==max(High))
## Date Open High Low Close Volume
## 1 2021-02-08 0.078352 0.084945 0.064702 0.078825 12844375210
It appears that the highest trading volume (2021-01-29) and the highest Dogecoin price (2021-02-08) did not occur on the same day. This may be due to the difference in opening price: the heaviest trading happened on a day with a much lower opening price (0.043734 versus 0.078352 USD).
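The filter(x == max(x)) pattern used above can also be expressed with dplyr’s slice_max(), which returns the rows holding the largest values of a column. A sketch on a made-up data frame, assuming dplyr 1.0 or later (the `toy` numbers are illustrative, not the real rates):

```r
library(dplyr)

# Hypothetical rows used only for illustration
toy <- data.frame(
  Date   = as.Date(c("2021-01-28", "2021-01-29", "2021-02-08")),
  High   = c(0.040, 0.078, 0.085),
  Volume = c(5e9, 25e9, 13e9)
)

# Row with the largest trading volume
peak_volume <- slice_max(toy, Volume, n = 1)

# Row with the highest price
peak_price <- slice_max(toy, High, n = 1)

peak_volume$Date
peak_price$Date
```

Like the filter() version, slice_max() keeps tied rows by default, so both approaches return every day that shares the maximum.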