Fa25_Assignment4_Alex

# Load Packages
library(tidyverse)      # also attaches ggplot2, dplyr, and readr
library(trelliscopejs)

# Read the bike trip data
bike <- read_csv("bikes.csv")
#View(bike)

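# Count the rides on each day of the week and attach the total to every row
bike_data <- bike %>%
  group_by(start_dow) %>%
  mutate(ride_count = n()) %>%
  ungroup()

# Wrap the count as a cognostic so it appears in the Trelliscope display
bike_data$ride_count <- cog(bike_data$ride_count,
                            desc = "The total number of rides for each day of the week")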

ggplot(bike_data, aes(x = start_hod)) +
  geom_histogram(binwidth = 1, fill = "lightblue") +
  labs(title = "Hour of Day by Day of Week",
       x = "Start Hour of the Day",
       y = "Count") +
  facet_trelliscope(~ start_dow,
                    name = "Hour of Day by the Day of Week",
                    desc = "Bike trips by the start hour for each day-of-week",
                    nrow = 1, ncol = 2,
                    scales = "same",
                    path = ".",
                    self_contained = TRUE)

The dataset that I chose was “bike” from DataCamp, within the Trelliscope course. This dataset provides information about bike trips, such as ride duration, destinations, and start and end times. Within this set, there is a lot that can be analyzed visually. Using the trelliscopejs package, I created a faceted histogram exploring the start hour of bike rides by the start day of the week. The goal was to analyze not only how the number of bike rides changes throughout the day, but also how it changes depending on the day of the week. I thought that a histogram would be best at capturing that relationship. To build the plot, I placed the start hour of the day on the x-axis and faceted by day of the week. In addition, I created a cognostic measure of the total number of rides. This measure, as the name suggests, tells us the total number of bike rides that occur on each day of the week.

In the graphing window, there are seven different histograms that show bike rides for each day of the week, Sunday through Saturday. For the weekday plots (Monday through Friday), there is a clear bimodal distribution. There is a big increase in bike rides in the early morning (around 7 or 8 am) and another big increase in the evening (around 5 or 6 pm). This distribution makes sense for individuals who work a typical 9-5 schedule and bike before and after work. For Saturday and Sunday, the distribution is definitely closer to normal, with a single peak. Since most people are off on the weekend, they are able to bike in the middle of the day. For the cognostic measure of total rides, we can see that the highest ride totals occur on Tuesdays, Wednesdays, and Thursdays. This did surprise me because I expected the most bike rides to occur over the weekend, when people have more time.
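One way to check the two-peak versus one-peak reading numerically is to count rides per day and hour and then pull out the two busiest hours for each day. The sketch below is only an illustration: `toy_bike` is made-up data, not the real bikes file, and only the column names `start_dow` and `start_hod` are taken from the script above.

```r
library(dplyr)

# Made-up stand-in for the bikes data (assumption: not the real values)
toy_bike <- tibble(
  start_dow = c(rep("Tue", 7), rep("Sun", 6)),
  start_hod = c(8, 8, 8, 17, 17, 12, 7, 13, 13, 13, 12, 12, 14)
)

# Number of rides for every day/hour combination
hourly_counts <- toy_bike %>%
  count(start_dow, start_hod, name = "rides")

# The two busiest start hours on each day
top_hours <- hourly_counts %>%
  group_by(start_dow) %>%
  slice_max(rides, n = 2, with_ties = FALSE) %>%
  ungroup()

print(top_hours)
```

In this toy data, the two busiest hours on Tuesday are far apart (morning and evening), while on Sunday they are adjacent midday hours, which is the pattern described above.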

The biggest issue that I had was adding the cognostic measure to the plot. Since I used the function facet_trelliscope(), I could not define the cognostic inside the plotting call. To get around this, I followed a process similar to the one I learned from the DataCamp course, Visualizing Big Data with Trelliscope. I first created a new variable called ride_count that simply counts the number of rides on each day of the week. Then I added this variable to the bike_data dataset and used the cog() function to create the measure. If I had not watched the DataCamp video, I know I would have struggled even more trying to make this function work. But it is important to point out that there are many different measures I could have computed. Even though I computed only the total number of rides, I could have created measures such as the mean start hour, or even the busiest hour of each day.
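As a rough sketch of those alternative measures, the same group-then-mutate approach could compute the mean start hour and the busiest hour for each day. This is not part of the submitted plot: `toy_bike` is made-up data, and `busiest_value()` is a hypothetical helper written just for this example.

```r
library(dplyr)

# Made-up stand-in for the bikes data (assumption: not the real values)
toy_bike <- tibble(
  start_dow = c("Mon", "Mon", "Mon", "Mon", "Sat", "Sat", "Sat"),
  start_hod = c(8, 8, 17, 9, 12, 13, 13)
)

# Hypothetical helper: the most frequent value in a vector (first one on ties)
busiest_value <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# Attach the per-day measures to every row, as was done for ride_count
toy_cogs <- toy_bike %>%
  group_by(start_dow) %>%
  mutate(
    ride_count     = n(),
    mean_start_hod = mean(start_hod),
    busiest_hod    = busiest_value(start_hod)
  ) %>%
  ungroup()

# One row per day, to inspect the values
day_summary <- distinct(toy_cogs, start_dow, ride_count, mean_start_hod, busiest_hod)
print(day_summary)
```

Each new column could then be wrapped with cog() and a description, in the same way as ride_count, before calling facet_trelliscope().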

RPubs Access: https://rpubs.com/DataVisAK/1370379

Fa25_Assignment4_Alex_Krzan

Alex Krzan

2025-11-18