Main Question and Goal
Main Question: How do various weather factors
influence the number of bike rentals in the Capital Bikeshare
system?
Goal: To analyze the relationship between weather
conditions (such as temperature, humidity, and windspeed) and bike
rental counts. Understanding the patterns and factors that drive
bike-sharing usage supports better resource allocation and system
optimization.
Interesting Aspects for Further Investigation
1) Relationship between bike rentals, time of day, and
temperature
Visualization: Heatmap of Bike Rentals by Hour and
Temperature
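# Assumes ggplot2 is installed and 'hr' is the hour of the day (0-23)
library(ggplot2)

# Weight each row by 'cnt' so the fill shows total rentals per cell,
# not the number of records that fall in the cell
ggplot(bike_sharing_data, aes(x = temp, y = factor(hr), weight = cnt)) +
  geom_bin2d(bins = 30) +
  scale_fill_gradient(low = "lightyellow", high = "red") +
  labs(title = "Heatmap of Bike Rentals by Temperature and Hour",
       x = "Normalized Temperature",
       y = "Hour of Day",
       fill = "Total Rentals") +
  theme_minimal()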

Explanation: Identifying peak hours and their
corresponding temperatures can help in optimizing bike distribution and
availability.
Insights: Bike rentals peak during the morning and
evening commute hours (7-9 AM and 5-7 PM) and are highest at moderate
normalized temperatures (0.4 to 0.7), with far fewer rentals at cold or
extremely hot temperatures.
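The peak-hour and temperature pattern described above can also be checked numerically rather than read off the colors. The following is a minimal sketch, assuming the data has the columns `hr`, `temp` (normalized to the 0-1 range), and `cnt`. The helper `summarise_rentals()`, the bin count, and the `toy` data frame are hypothetical and exist only to illustrate the call; they are not part of the original analysis.

```r
library(dplyr)

# Hypothetical helper: mean rentals per (hour, temperature bin) cell.
# Assumes columns `hr` (0-23), `temp` (normalized to [0, 1]) and `cnt`.
summarise_rentals <- function(data, n_bins = 10) {
  data %>%
    mutate(temp_bin = cut(temp,
                          breaks = seq(0, 1, length.out = n_bins + 1),
                          include.lowest = TRUE)) %>%
    group_by(hr, temp_bin) %>%
    summarise(mean_cnt = mean(cnt), n_obs = n(), .groups = "drop") %>%
    arrange(desc(mean_cnt))  # busiest cells first
}

# Toy rows, made up purely to show the call; not real Capital Bikeshare data.
toy <- data.frame(
  hr   = c(8, 8, 17, 17, 3, 3),
  temp = c(0.52, 0.58, 0.62, 0.66, 0.12, 0.18),
  cnt  = c(300, 340, 420, 460, 5, 9)
)

toy_summary <- summarise_rentals(toy, n_bins = 5)
print(toy_summary)
```

On the real data, the call would be `summarise_rentals(bike_sharing_data)`. Using the mean per cell instead of the total keeps sparsely observed temperature ranges comparable with common ones, and `n_obs` shows how many records back each cell.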
2) Correlations among the numeric variables in the
dataset
Visualization: Correlation Heat Matrix
# Assumes dplyr and ggplot2 are installed
library(dplyr)
library(ggplot2)

# Select numeric variables
numeric_vars <- bike_sharing_data %>%
  select(temp, atemp, hum, windspeed, cnt)
# Compute the Pearson correlation matrix
cor_matrix <- cor(numeric_vars)
# Convert to long format (columns Var1, Var2, Freq) for ggplot
cor_long <- as.data.frame(as.table(cor_matrix))
# Heatmap
ggplot(cor_long, aes(Var1, Var2, fill = Freq)) +
  geom_tile(color = "white") +
  scale_fill_gradient2(low = "blue", high = "red", mid = "white",
                       midpoint = 0, limits = c(-1, 1), space = "Lab",
                       name = "Pearson\nCorrelation") +
  theme_minimal() +
  labs(title = "Correlation Matrix Heatmap") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1,
                                   size = 12, hjust = 1))

Explanation: Understanding correlations helps in
feature selection for modeling and identifying potential
multicollinearity issues.
Insights: Temperature has a moderate positive
correlation with bike rentals (~0.63), while humidity and windspeed show
weak or negligible correlations, suggesting that temperature is the
strongest weather-related predictor of rental activity among the
variables examined.
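To make the multicollinearity check concrete, highly correlated pairs can be listed directly from the correlation matrix instead of being read off the heatmap. The following is a minimal sketch using base R only. The helper `high_cor_pairs()`, the 0.8 cutoff, and the `toy` data frame are assumptions chosen for illustration, not part of the original analysis.

```r
# Hypothetical helper: list variable pairs whose absolute Pearson
# correlation exceeds `threshold` (0.8 is an arbitrary illustrative cutoff).
high_cor_pairs <- function(data, threshold = 0.8) {
  cm <- cor(data, use = "complete.obs")
  # Keep only the upper triangle so each pair is reported once
  idx <- which(abs(cm) > threshold & upper.tri(cm), arr.ind = TRUE)
  data.frame(
    var1 = rownames(cm)[idx[, "row"]],
    var2 = colnames(cm)[idx[, "col"]],
    r    = cm[idx],
    stringsAsFactors = FALSE
  )
}

# Toy data, made up for illustration only; not real Capital Bikeshare data.
toy <- data.frame(
  temp      = c(0.20, 0.35, 0.50, 0.65, 0.80),
  atemp     = c(0.22, 0.36, 0.49, 0.63, 0.78),
  windspeed = c(0.30, 0.10, 0.25, 0.05, 0.20)
)

pairs_found <- high_cor_pairs(toy)
print(pairs_found)
```

On the real data, the call would be `high_cor_pairs(numeric_vars)`. Any predictor pair it reports (for example `temp` and `atemp`, if they turn out to be strongly related) is a candidate for dropping one of the two variables before modeling.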