North Star Scooters

Statistical Consulting Project Case Report

Author

Griffin Lessinger

Brief “Elevator” Summary

This report highlights one primary finding: how registered users use their rented scooters.

Average Daily Rentals

Observe that in the plot above, the rental counts for registered users appear to be greatest during weekdays and then decline during weekends. The casual user base shows the inverse trend: rentals are lowest on weekdays and rise on weekends. A more rigorous analysis is performed in section 2.1.

Average Rentals by Hour

Additionally, hourly rentals peak at common (occupational) commuting times.

Average Rentals by Season

Lastly, we can see that registered users account for the vast majority of rentals, and that the proportion of hourly rentals by registered users is greatest in Winter and Fall (the harshest months for scooter operation). Together, these patterns are strong evidence that a large proportion of the registered user base uses scooters as a commuter vehicle, which would explain both the peak times of use and the consistent use across seasons.

These three plots were chosen as they effectively demonstrate, at differing levels of granularity, how a large portion of the registered users use their rented scooters.

Introduction

Outlined in this document are the findings of a quantitative analysis for the North Star Scooters Statistical Consulting Project. Specifically, this is a business summary report detailing consumer class comparisons, user rental activity, statistical testing, and recommendations for the company as it continues its ongoing expansion.

1. Consumer Demographics & Behavior

1.1) User types

Proportions of User Type

The figure above displays the proportions of rentals by user type, “Casual” (non-registered) and “Registered”, cumulative across 2023 and 2024. During these years, there were 65,516 rentals by casual users and 285,724 rentals by registered users, constituting ~19% and ~81% of all rentals, respectively.
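
The percentages above follow directly from the two reported counts. As a minimal sketch of that arithmetic (the variable names are illustrative and not taken from the original analysis):

```python
# Rental counts for 2023-2024, as reported in the text above.
rentals = {"casual": 65_516, "registered": 285_724}

total = sum(rentals.values())
# Share of all rentals attributable to each user type.
shares = {user_type: count / total for user_type, count in rentals.items()}

for user_type, share in shares.items():
    print(f"{user_type}: {share:.1%} of {total:,} rentals")
```

Rounded to whole percentages, these shares match the ~19% and ~81% quoted above.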

1.2) Seasonal Averages

Average Rentals by Season

The pattern of average hourly rentals by season makes intuitive sense, given the service provided. An observed peak of ~180 hourly scooter rentals occurs during summer, likely because the weather is more permissive of using scooters. Intriguingly, while the average number of hourly rentals reaches a minimum of ~80 during the winter, the ratio of hourly registered to casual rentals simultaneously reaches its maximum of about 7 to 1; it’s likely that many of these registered users are dependent on the scooters for transportation. In every season, the average hourly rentals of registered users are greater than those of casual users.

1.3) Monthly Totals

Monthly Rental Totals

As expected, the pattern of monthly rental totals is similar to the seasonal hourly averages. Note that, for every month, the rental totals were higher in 2024 than in 2023; additionally, the 2024 monthly totals are more consistent, clustering somewhat more tightly around the 2024 peak than the 2023 totals do around the 2023 peak. Although this is speculative, it may again be explained by commuting: as the number of total rentals increases, the proportion of registered users may increase as well, as commuting via rental scooter comes to be seen as an increasingly viable option.

1.4) Weekly Averages

Average Daily Rentals

Across all users, the total average hourly rental volume is similar for each day of the week. However, it’s important to notice the distinction between the registered and casual hourly rental averages: the registered users’ hourly rentals are high on weekdays, then they decrease on weekends. The opposite is the case for the casual users’ rental behavior. This likely further reinforces the idea of scooter-based commuting.

1.5) Rentals by Hour

Average hourly rentals by both weekday and weekend

The vertical scales of the plots are different, but the trend is still clear: weekends tend to see high rental volume for both groups between 11:00 A.M. and 8:00 P.M., whereas weekdays typically see high rental volume only for registered users, during “rush-hour” or high-traffic periods.

2. Inferential Statistics & Regression Analysis

2.1) Tests for Difference in Hourly Means

The first statistical tests assessed whether the apparent difference in mean hourly rentals between weekdays and weekends is statistically significant for each group (all, registered, and casual users; data shown in Fig. 8). For each group, let \(m_d\) be the mean hourly rentals during weekdays and \(m_e\) be the mean hourly rentals during weekends:

\[ H_0: m_d = m_e, \qquad H_A: m_d \ne m_e. \]

The observed data suggest that the weekday and weekend variances are unequal within each group, so Welch’s 2-sample \(t\)-test is used. For all users combined, there was not sufficient evidence to conclude that weekday hourly means differ from weekend hourly means (\(p=0.298\)), which is consistent with the opposing trends of the two user types offsetting one another. However, for registered users (\(p=2.50\text{e-12}\)) and casual users (\(p=9.02\text{e-29}\)), there was sufficient evidence to conclude that weekday hourly means differ from weekend hourly means at the 1% significance level.
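
To make the mechanics of Welch's test concrete, here is a minimal sketch that computes the test statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library. The function name `welch_t` and the sample values are hypothetical and for illustration only; they are not the report's data, and this is not necessarily how the original analysis was run.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean; the variances are NOT pooled.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a**2 / (n_a - 1) + se2_b**2 / (n_b - 1))
    return t_stat, df


# Hypothetical hourly rental counts, for illustration only (not report data).
weekday_hours = [210, 185, 240, 198, 225, 260, 190, 215]
weekend_hours = [150, 172, 140, 165, 158, 149]

t_stat, df = welch_t(weekday_hours, weekend_hours)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

The two-sided p-value then comes from a \(t\) distribution with `df` degrees of freedom. In practice, a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the whole test, p-value included.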

2.2) Regression Analysis

A multiple linear regression was also fit to the observed data in an effort to establish relationships between the count of hourly users and several covariates related to weather, seasonality, year, and weekend-ness. The model equation is as follows:

\[ y = \beta_0 + \beta_1(\text{temp}) + \beta_2(\text{windspeed}) + \beta_3(\text{humidity}) + \beta_{4, 5, 6}(\text{season}) + \beta_7(\text{year}) + \beta_8(\text{weekend}) + \beta_{9,10}(\text{weather}) + \epsilon \]

where season, year, weekend, and weather were all modeled using indicator variables, with Winter, 2023, weekday, and clear weather as the respective reference levels. The resulting fitted model was:

              Coefficient Standard.Error 95%CI.Lower 95%CI.Upper
Intercept           39.65          12.63       14.89       64.40
Temperature          3.50           0.22        3.06        3.93
Windspeed            0.18           0.30       -0.40        0.76
Humidity            -1.95           0.14       -2.22       -1.67
Spring               8.39           7.89       -7.07       23.85
Summer             -14.90          10.28      -35.04        5.24
Fall                48.66           6.95       35.03       62.28
Year-2024           58.34           4.55       49.42       67.25
Weekend             -4.87           4.91      -14.49        4.75
Mist                11.05           5.43        0.40       21.69
Precipitation       -1.71           9.16      -19.66       16.24
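
One quick way to read the table above is to check which 95% confidence intervals contain zero; a coefficient whose interval contains zero is not distinguishable from zero at the 5% level. The sketch below simply copies the interval bounds from the table. The variable names are illustrative, and this screen is not a substitute for the formal tests run in the original analysis.

```python
# (lower, upper) 95% CI bounds copied from the coefficient table above.
conf_intervals = {
    "Intercept": (14.89, 64.40),
    "Temperature": (3.06, 3.93),
    "Windspeed": (-0.40, 0.76),
    "Humidity": (-2.22, -1.67),
    "Spring": (-7.07, 23.85),
    "Summer": (-35.04, 5.24),
    "Fall": (35.03, 62.28),
    "Year-2024": (49.42, 67.25),
    "Weekend": (-14.49, 4.75),
    "Mist": (0.40, 21.69),
    "Precipitation": (-19.66, 16.24),
}

# An interval containing zero means the term is not significant at the 5% level.
contains_zero = [name for name, (lo, hi) in conf_intervals.items() if lo <= 0 <= hi]
excludes_zero = [name for name in conf_intervals if name not in contains_zero]

print("Interval excludes zero:", ", ".join(excludes_zero))
print("Interval contains zero:", ", ".join(contains_zero))
```

By this reading, Windspeed, Spring, Summer, Weekend, and Precipitation all have intervals that contain zero.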

The linear model had an \(R^2\) of approximately 0.32, which is rather low (only 32% of observed variance explained).

The model suggests that Fall and Spring are the seasons of peak rental activity: Fall's estimated hourly rentals exceed those of Winter (the reference season) by 48.66 and those of Summer by 63.56. This implication is dubious, since the exploratory analysis in section 1.2 showed average hourly rentals peaking in summer. The model also implies that misty weather increases estimated hourly rentals relative to clear and rainy weather, that greater humidity reduces estimated rentals, and that greater temperature and windspeed both increase estimated rentals.
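
The seasonal differences quoted above are contrasts between indicator coefficients. As a small sketch of that arithmetic, taking Winter as the reference level (which the Fall-vs-Winter difference of 48.66 implies) and using an illustrative helper named `contrast`:

```python
# Season coefficients from the regression table; Winter is the reference level (0).
season_coef = {"Winter": 0.0, "Spring": 8.39, "Summer": -14.90, "Fall": 48.66}


def contrast(season_a, season_b):
    """Estimated hourly-rental difference (season_a minus season_b), other covariates held fixed."""
    return season_coef[season_a] - season_coef[season_b]


print(f"Fall - Winter: {contrast('Fall', 'Winter'):.2f}")
print(f"Fall - Summer: {contrast('Fall', 'Summer'):.2f}")

# Order the seasons from highest to lowest estimated rentals under the model.
ranking = sorted(season_coef, key=season_coef.get, reverse=True)
print("Model's season ranking:", " > ".join(ranking))
```

The model places Summer last, which is the opposite of the summer peak seen in the exploratory plots.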

Given the apparent shortcomings in the linear model coefficients (Wald-type significance failures, not shown but verified), the general lack of fit evidenced by the low \(R^2\), and probable multicollinearity (e.g., between season and temperature), we suggest that this model not be used for predictive or analytical tasks. For future analyses, a generalized linear model designed for count data should be considered; the variance structure of hourly totals against temperature shows evidence of overdispersion, which indicates that a quasi-Poisson or negative binomial model may be suitable. Using fewer covariates or applying dimensionality reduction techniques might also help.
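
For reference, the count models suggested above differ from a plain Poisson model mainly in their variance functions. The following is a sketch using standard textbook forms, where \(\mu\) is the mean hourly rental count, \(\mathbf{x}\) is the covariate vector, and \(\phi\) and \(\theta\) are dispersion parameters that would have to be estimated from the data (none of these symbols appear in the original analysis). With a log link,

\[ \log \mu = \mathbf{x}^\top \boldsymbol{\beta}, \]

the three candidate variance structures are

\[ \text{Poisson: } \operatorname{Var}(Y) = \mu, \qquad \text{quasi-Poisson: } \operatorname{Var}(Y) = \phi\,\mu, \qquad \text{negative binomial: } \operatorname{Var}(Y) = \mu + \frac{\mu^2}{\theta}. \]

Overdispersion corresponds to \(\phi > 1\) in the quasi-Poisson form, or to a finite \(\theta\) in the negative binomial form; the negative binomial variance approaches the Poisson variance as \(\theta \to \infty\).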

Summary

Based on the analytical findings of this report, we offer the following recommendations for the future strategy of North Star Scooters as it expands.

First, it seems clear that many registered users use their scooters as a means of commuting or practical mobility, and the proportion of registered users may increase with time. It may be advisable to pursue further research into the effectiveness of scooters as an economical in-city mode of personal transport, and to market this effectiveness if further analysis confirms it. Simple data collection via user surveys could identify users' use cases cleanly.

Second, knowing why casual (non-registered) users rent scooters may be worthwhile. They may be Minneapolis-area residents who occasionally rent scooters for recreational purposes, renters who want to try out the scooters and personally determine their effectiveness for future registered use, or even tourists who are “exploring” the city. The market motive here is likely important to understand.

Third, a more robust statistical model for use in predicting hourly user counts could prove useful as an advising tool. Recommendations for future research have been discussed in section 2.2.