1. Converting Time Column to Date

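# Load the packages used throughout the analysis
library(dplyr)     # mutate(), group_by(), summarise()
library(tsibble)   # as_tsibble(), index_by(), yearweek(), yearmonth()
library(ggplot2)   # plotting
library(feasts)    # gg_tsdisplay()

# Read the hourly data
bike_sharing_data <- read.csv("C:/Statistics for Data Science/Week 2/bike+sharing+dataset/hour.csv")

# Convert the date column to Date and sum the hourly counts into daily totals
bike_ts <- bike_sharing_data |>
  mutate(date = as.Date(dteday)) |>
  group_by(date) |>
  summarise(total_rides = sum(cnt)) |>
  as_tsibble(index = date)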

# Show the first few rows
print("First few rows of our tsibble:")
## [1] "First few rows of our tsibble:"
print(bike_ts)
## # A tsibble: 731 x 2 [1D]
##    date       total_rides
##    <date>           <int>
##  1 2011-01-01         985
##  2 2011-01-02         801
##  3 2011-01-03        1349
##  4 2011-01-04        1562
##  5 2011-01-05        1600
##  6 2011-01-06        1606
##  7 2011-01-07        1510
##  8 2011-01-08         959
##  9 2011-01-09         822
## 10 2011-01-10        1321
## # ℹ 721 more rows

Insight: The dteday strings were converted to R Date values, the hourly counts were summed into daily totals, and the result is a tsibble of 731 daily observations ready for time series analysis.
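
A tsibble built this way can still contain implicit gaps if a calendar day has no rows in the source file. The sketch below uses a hypothetical five-row toy series (not the bike data) with one day deliberately left out, and shows how tsibble's has_gaps(), count_gaps(), and fill_gaps() could be used to check for that. The object names are illustrative only.

```r
library(tsibble)

# Hypothetical toy data: daily totals with 2011-01-03 deliberately missing
toy <- tsibble(
  date = as.Date(c("2011-01-01", "2011-01-02", "2011-01-04",
                   "2011-01-05", "2011-01-06")),
  total_rides = c(985L, 801L, 1562L, 1600L, 1606L),
  index = date
)

gap_flag   <- has_gaps(toy)$.gaps   # TRUE when at least one day is missing
gap_table  <- count_gaps(toy)       # one row per run of missing days
toy_filled <- fill_gaps(toy)        # adds the missing days with NA totals
```

Running the same three calls on bike_ts would show whether any day is absent before the moving averages and ACF are computed.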

2. Choose Response Variable and Create Time Series Plot

# Plot different time windows
# Full period
p1 <- bike_ts |>
  ggplot(aes(x = date, y = total_rides)) +
  geom_line() +
  labs(title = "Daily Bike Rentals - Full Period",
       x = "Date",
       y = "Total Rentals") +
  theme_minimal()

# Create weekly and monthly views
weekly_ts <- bike_ts |>
  index_by(week = yearweek(date)) |>
  summarise(total_rides = sum(total_rides))

monthly_ts <- bike_ts |>
  index_by(month = yearmonth(date)) |>
  summarise(total_rides = sum(total_rides))

# Plot weekly view
p2 <- weekly_ts |>
  ggplot(aes(x = week, y = total_rides)) +
  geom_line() +
  labs(title = "Weekly Bike Rentals",
       x = "Week",
       y = "Total Rentals") +
  theme_minimal()

# Plot monthly view
p3 <- monthly_ts |>
  ggplot(aes(x = month, y = total_rides)) +
  geom_line() +
  labs(title = "Monthly Bike Rentals",
       x = "Month",
       y = "Total Rentals") +
  theme_minimal()

print(p1)

print(p2)

print(p3)

What stands out immediately:

  • Clear weekly cycling pattern in usage
  • Strong seasonal variation with summer peaks
  • Overall upward trend in ridership
  • Weekend drops in usage visible in daily data
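
The weekend observation can be checked numerically rather than by eye. The following is a minimal sketch on a hypothetical four-week toy series (the values are invented for illustration, not taken from the bike data) that averages rides by ISO weekday and compares weekdays with weekends.

```r
library(dplyr)

# Hypothetical toy series: four weeks of daily totals, lower on weekends.
# 2011-01-03 is a Monday, so each block of seven values runs Mon..Sun.
toy <- tibble(
  date = seq(as.Date("2011-01-03"), by = "day", length.out = 28),
  total_rides = rep(c(1500, 1520, 1480, 1510, 1490, 1000, 950), times = 4)
)

by_weekday <- toy |>
  mutate(iso_wday = as.integer(format(date, "%u"))) |>  # 1 = Monday ... 7 = Sunday
  group_by(iso_wday) |>
  summarise(mean_rides = mean(total_rides), .groups = "drop")

# Compare the average weekday level with the average weekend level
weekday_mean <- mean(by_weekday$mean_rides[by_weekday$iso_wday <= 5])
weekend_mean <- mean(by_weekday$mean_rides[by_weekday$iso_wday >= 6])
```

To apply this to bike_ts, convert it with as_tibble() first, since a tsibble keeps its date index during grouping.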

4. Seasonal Analysis with Smoothing

# Calculate moving averages for smoothing
bike_ts_smooth <- bike_ts |>
  mutate(
    MA7 = slider::slide_dbl(total_rides, mean, .before = 3, .after = 3, .complete = TRUE),
    MA30 = slider::slide_dbl(total_rides, mean, .before = 15, .after = 14, .complete = TRUE)
  )

# Plot with smoothing
ggplot(bike_ts_smooth, aes(x = date)) +
  geom_line(aes(y = total_rides), alpha = 0.3) +
  geom_line(aes(y = MA7, color = "Weekly MA")) +
  geom_line(aes(y = MA30, color = "Monthly MA")) +
  labs(title = "Bike Rentals with Moving Averages",
       x = "Date",
       y = "Number of Rentals",
       color = "Moving Average") +
  theme_minimal()
## Warning: Removed 6 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 29 rows containing missing values or values outside the scale range
## (`geom_line()`).

Smoothing Insights:

  • Clear weekly seasonal pattern
  • Strong trend component
  • Varying seasonal amplitude
  • Some irregular patterns visible
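
To see why a centered 7-day window isolates the trend, consider a toy series built from a linear trend plus a 7-day pattern that sums to zero. This is a sketch on invented numbers, assuming the same slider::slide_dbl() settings as the MA7 column above.

```r
library(slider)

# Hypothetical toy series: linear trend plus a repeating 7-day pattern
# whose values sum to zero over one full week
n      <- 28
trend  <- 1000 + 10 * seq_len(n)
weekly <- rep(c(50, 60, 40, 55, 45, -120, -130), length.out = n)
x      <- trend + weekly

# Centered 7-day moving average; incomplete windows at the edges become NA
ma7 <- slide_dbl(x, mean, .before = 3, .after = 3, .complete = TRUE)

n_missing <- sum(is.na(ma7))  # 3 leading + 3 trailing incomplete windows
```

Every complete window covers exactly one full week, so the weekly pattern averages out and ma7 equals the trend at the window's center. The same edge effect accounts for the two warnings above: 6 rows are dropped for the 7-day average, and 29 (15 leading plus 14 trailing) for the 30-day average.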

5. ACF and PACF Analysis

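# Generate ACF and PACF plots.
# gg_tsdisplay() returns a composite of several panels rather than a single
# ggplot, so a title added with `+ labs()` is not applied; the call is left bare.
bike_ts |>
  gg_tsdisplay(total_rides,
               plot_type = "partial",
               lag_max = 30)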

Seasonality Insights from ACF/PACF:

  • Strong weekly seasonality (lag 7)
  • Significant autocorrelation pattern
  • Clear seasonal dependencies
  • Useful for forecasting model selection
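
The lag-7 reading can be illustrated on a series whose only structure is a weekly cycle. The sketch below uses base R's acf() on a hypothetical, purely periodic toy series (not the bike data) and locates the lag with the largest autocorrelation.

```r
# Hypothetical toy series: 20 repetitions of a 7-day pattern, with no trend or noise
weekly <- c(50, 60, 40, 55, 45, -120, -130)
x <- rep(weekly, times = 20)

# Sample autocorrelations for lags 0..14; drop lag 0, which is always 1
a <- acf(x, lag.max = 14, plot = FALSE)
r <- drop(a$acf)[-1]

peak_lag <- which.max(r)  # lag (in days) with the strongest autocorrelation
```

On this toy series the peak falls at lag 7 with a smaller echo at lag 14, which is the signature to look for in the ACF panel of the daily rentals. Real data add trend and noise, so the peaks there are less clean.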

Further Questions to Investigate:

  1. Weather Impacts
    • How do different weather conditions affect ridership?
    • What’s the temperature sensitivity of usage?
  2. User Patterns
    • How do casual and registered users differ?
    • What causes the weekend usage patterns?
  3. Growth Analysis
    • Is the growth rate sustainable?
    • Are there capacity constraints?
  4. Seasonal Effects
    • What drives the strong seasonal patterns?
    • How can winter ridership be improved?
  5. Operational Questions
    • How should maintenance be scheduled?
    • What’s the optimal bike distribution?