Let’s find the most popular day of the week for enjoying candles.
First, load the libraries.
# load libraries
library(readr)
library(dplyr)
library(tidyverse)
library(here)
Next, let’s load and inspect the data.
# load data
burn_times <- read.csv("burn_times.csv")
head(burn_times)
As you can see, the burn_times data frame has a
start_time column. We can use the information from this
column to find the day of the week.
To get started, we can use the transmute function to
create a new data frame containing only the columns we define. We’ll
keep session_id by setting a new column equal to the
existing one. Then we can create two new columns, date and day, using
the information from the start_time column in the original data frame.
better_dates <- burn_times %>%
  transmute(session_id = session_id,
            date = as.Date(start_time),
            day = weekdays(date))
better_dates
Now we have the day of the week for each listed date. Also note that
the date column is now a Date type, whereas
start_time was imported as a character type. While we don’t
necessarily need date anymore now that we’ve gotten the day
out of it, it’s handy to know how to convert data types like that.
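Here is a minimal sketch of that conversion on its own. It uses a small made-up vector of timestamps rather than the real burn_times data, and it assumes the values look like "YYYY-MM-DD HH:MM:SS". If the real start_time column uses a different layout, as.Date would likely need a format argument. Keep in mind that weekdays returns day names in the current locale.

```r
# hypothetical sample values, assumed to be in "YYYY-MM-DD HH:MM:SS" form
sample_times <- c("2021-03-07 18:30:00", "2021-03-08 07:15:00")

class(sample_times)   # character, like the imported start_time column

# as.Date reads the leading date part and drops the time of day
sample_dates <- as.Date(sample_times)
class(sample_dates)   # Date

# weekdays() returns day names in the current locale
sample_days <- weekdays(sample_dates)
sample_days
```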
That’s a lot of timestamps! Let’s group them by day of the week.
day_of_week <- better_dates %>%
  group_by(day) %>%
  summarize(count = n())
day_of_week
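As a side note, dplyr also offers count as a shorthand for this group-and-tally pattern. The sketch below compares the two on a tiny made-up data frame (toy_dates is a hypothetical stand-in for better_dates, not the real data).

```r
library(dplyr)

# hypothetical stand-in for better_dates, with made-up rows
toy_dates <- data.frame(
  session_id = 1:5,
  day = c("Sunday", "Monday", "Sunday", "Tuesday", "Sunday")
)

# same pattern as above: one row per day with the number of sessions
by_group <- toy_dates %>%
  group_by(day) %>%
  summarize(count = n())

# count() groups and tallies in one step; name = sets the output column name
by_count <- toy_dates %>%
  count(day, name = "count")

by_group
by_count
```

Both results hold one row per day with its session count, so either form would work for the aggregation step.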
Now we’ve got our aggregated data. Looks like a lot of candle burning happened on Sundays and Mondays! Let’s make things pretty now.
viz <- day_of_week %>%
  arrange(count) %>%
  mutate(day = factor(day, levels = day)) %>%
  ggplot(aes(fill = day, x = day, y = count)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = count), nudge_y = 2) +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.y = element_blank(),
        legend.position = "none",
        panel.background = element_blank())
viz + labs(title = "Candle Burning Days",
           subtitle = "Most popular days for burning candles",
           fill = "Day")
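The arrange and factor steps are what keep the bars sorted from least to most popular: ggplot2 draws a discrete axis in factor-level order, and a plain character column would be sorted alphabetically instead. Here is a small sketch of that idea in base R, using made-up counts (toy_counts is hypothetical, not the real results).

```r
# hypothetical counts, not the real results
toy_counts <- data.frame(
  day = c("Sunday", "Tuesday", "Monday"),
  count = c(30, 10, 25)
)

# sort rows by count, smallest first
sorted <- toy_counts[order(toy_counts$count), ]

# a plain factor sorts its levels alphabetically
levels(factor(sorted$day))

# passing levels = day keeps the sorted row order as the level order
days_by_count <- factor(sorted$day, levels = sorted$day)
levels(days_by_count)
```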
Conclusion: I should start burning candles more on Tuesday and Wednesday.
Thanks for reading!