We want to assess the real impact of a test promotion on ridership in the City of Toronto’s bike-sharing program. Over the past two years, the marketing department randomly designated half of each year’s days as “promotional days,” on which the daily rental fee paid by an individual was discounted by 30% from the normal rate.
To test the impact, we build a regression model that predicts total riders on a given day and isolates, as best we can, the effect of the promotion on total ridership. Candidate predictors include the month, day of the week, temperature, wind speed, and humidity, in addition to whether the promotion is in place that day. The final specification shown below retains temperature (as a cubic polynomial), humidity, wind speed, a holiday indicator, and month.
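The code behind the regression output is not shown in this report. A specification consistent with the term names in that output could be sketched as follows. The data frame `bikes` is synthetic and stands in for the real data, its column names are taken from the output table, and the data-generating numbers (including the simulated promotion effect of 2,000 riders) are made up for illustration only.

```r
# Synthetic stand-in for the daily ridership data (NOT the real data set).
# Column names follow the term names in the regression output.
set.seed(42)
n <- 731
bikes <- data.frame(
  temp      = runif(n, 0, 35),
  Promotion = rbinom(n, 1, 0.5),   # randomly assigned promotional days
  holiday   = rbinom(n, 1, 0.03),
  humidity  = runif(n, 20, 100),
  windspeed = runif(n, 0, 30),
  mnth      = sample(1:12, n, replace = TRUE)
)

# Made-up data-generating process with a known promotion effect of 2000 riders.
bikes$total_riders <- 3000 + 100 * bikes$temp + 2000 * bikes$Promotion -
  500 * bikes$holiday - 30 * bikes$humidity - 60 * bikes$windspeed +
  rnorm(n, sd = 500)

# Same right-hand side as the terms listed in the output table.
fit <- lm(
  total_riders ~ poly(temp, 3, raw = TRUE) + Promotion + holiday +
    humidity + windspeed + as.factor(mnth),
  data = bikes
)

# Estimated promotion effect; it should land near the simulated value of 2000.
promo_effect <- coef(fit)[["Promotion"]]
promo_effect
```

Because promotional days are assigned at random, the `Promotion` coefficient can be read as the average change in riders attributable to the promotion, with the other terms mainly reducing noise.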
##
## ======================================================
##                            Dependent variable:
##                            ---------------------------
##                            total_riders
## ------------------------------------------------------
## poly(temp, 3, raw = TRUE)1 -297.367***
##                            (88.244)
##
## poly(temp, 3, raw = TRUE)2 33.773***
##                            (4.781)
##
## poly(temp, 3, raw = TRUE)3 -0.706***
##                            (0.080)
##
## Promotion                  1,951.248***
##                            (59.158)
##
## holiday                    -597.933***
##                            (174.747)
##
## humidity                   -36.410***
##                            (2.297)
##
## windspeed                  -68.720***
##                            (5.954)
##
## as.factor(mnth)2           4.248
##                            (151.578)
##
## as.factor(mnth)3           378.694**
##                            (164.635)
##
## as.factor(mnth)4           562.584***
##                            (182.344)
##
## as.factor(mnth)5           781.762***
##                            (205.887)
##
## as.factor(mnth)6           751.496***
##                            (231.185)
##
## as.factor(mnth)7           945.152***
##                            (255.372)
##
## as.factor(mnth)8           794.622***
##                            (236.781)
##
## as.factor(mnth)9           1,113.148***
##                            (212.332)
##
## as.factor(mnth)10          1,244.829***
##                            (182.950)
##
## as.factor(mnth)11          1,116.341***
##                            (166.023)
##
## as.factor(mnth)12          772.039***
##                            (156.109)
##
## Constant                   4,521.871***
##                            (495.473)
##
## ------------------------------------------------------
## Observations               731
## R2                         0.842
## Adjusted R2                0.838
## Residual Std. Error        780.784 (df = 712)
## F Statistic                210.101*** (df = 18; 712)
## ======================================================
## Note:                      *p<0.1; **p<0.05; ***p<0.01
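Our fitted regression equation is: total_riders = 4,521.9 - 297.4(temp) + 33.8(temp)^2 - 0.7(temp)^3 + 1,951.2(Promotion) - 597.9(holiday) - 36.4(humidity) - 68.7(windspeed) + (month effect), where the month effect is the coefficient for that day's month and is zero for January, the baseline month.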
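As a worked illustration, the equation can be written as a small function using the coefficients from the table. The function name `predict_riders` and the input values are hypothetical, and the inputs must be in the same units as the original data, which this report does not state.

```r
# Month effects copied from the total_riders regression table.
# January (mnth = 1) is the baseline month, so its effect is 0.
month_effect <- c(0, 4.248, 378.694, 562.584, 781.762, 751.496, 945.152,
                  794.622, 1113.148, 1244.829, 1116.341, 772.039)

# Fitted equation with coefficients copied from the same table.
predict_riders <- function(temp, promotion, holiday, humidity, windspeed, month) {
  4521.871 - 297.367 * temp + 33.773 * temp^2 - 0.706 * temp^3 +
    1951.248 * promotion - 597.933 * holiday -
    36.410 * humidity - 68.720 * windspeed + month_effect[month]
}

# Hypothetical non-holiday October day, with and without the promotion.
with_promo    <- predict_riders(20, 1, 0, 60, 10, 10)
without_promo <- predict_riders(20, 0, 0, 60, 10, 10)

# The gap between the two predictions is the Promotion coefficient.
with_promo - without_promo
```

Whatever values are chosen for the other inputs, the difference between the two predictions is always the `Promotion` coefficient, which is what "holding all else constant" means in this model.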
There were collinearity issues when adding other variables. For example, including weather situation alongside humidity or windspeed caused problems, presumably because these variables capture much of the same information. Humidity and windspeed together produced a higher R-squared and adjusted R-squared than weather situation alone, so we excluded weather situation from the model and used humidity and windspeed instead.
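The report does not say which diagnostic was used to detect the collinearity. One common option is the variance inflation factor (VIF), computed as 1 / (1 - R-squared) from a regression of one predictor on the others. The sketch below uses synthetic data in which a hypothetical `weathersit` score is built directly from humidity, so the overlap is there by construction; the helper name `vif` is also just for illustration.

```r
# Synthetic predictors (NOT the real data): weathersit is constructed from
# humidity so that the two overlap, mimicking the problem described above.
set.seed(1)
n <- 731
humidity   <- runif(n, 20, 100)
windspeed  <- runif(n, 0, 30)
weathersit <- 1 + (humidity > 65) + (humidity > 85)  # hypothetical 1-3 score

# VIF of one predictor: regress it on the others and use 1 / (1 - R^2).
vif <- function(target, others) {
  d  <- cbind(target = target, others)
  r2 <- summary(lm(target ~ ., data = d))$r.squared
  1 / (1 - r2)
}

vif_weathersit <- vif(weathersit, data.frame(humidity, windspeed))
vif_windspeed  <- vif(windspeed, data.frame(humidity))

# A VIF near 1 means little overlap; larger values mean more overlap.
c(weathersit = vif_weathersit, windspeed = vif_windspeed)
```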
Holding all else constant, October has the highest ridership: its coefficient of 1,244.8 (relative to January, the baseline month) is the largest of the month effects. Unseasonably cold or rainy weather within a month should not affect the month coefficients, because those factors are already accounted for by the temperature and humidity variables, respectively.
All else being equal, we can expect about 1,951 more riders on days when the promotion is in place than on days when it is not. The promotion is also the strongest single predictor in our model, with a t-value of 33.0. The results of our analysis support the claims of Toronto’s BikeShare marketing department that the promotion succeeded in increasing bike usage.
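The t-value quoted above is the coefficient divided by its standard error. The check below recomputes it for every term in the total_riders table (intercept excluded). The numbers are copied from the table; the shortened term names such as `temp1` and `mnth10` are labels chosen here for readability.

```r
# Coefficient estimates from the total_riders table.
est <- c(
  temp1 = -297.367, temp2 = 33.773, temp3 = -0.706,
  Promotion = 1951.248, holiday = -597.933,
  humidity = -36.410, windspeed = -68.720,
  mnth2 = 4.248, mnth3 = 378.694, mnth4 = 562.584, mnth5 = 781.762,
  mnth6 = 751.496, mnth7 = 945.152, mnth8 = 794.622, mnth9 = 1113.148,
  mnth10 = 1244.829, mnth11 = 1116.341, mnth12 = 772.039
)

# Matching standard errors, in the same order.
se <- c(
  88.244, 4.781, 0.080, 59.158, 174.747, 2.297, 5.954,
  151.578, 164.635, 182.344, 205.887, 231.185, 255.372, 236.781,
  212.332, 182.950, 166.023, 156.109
)

# t-value = estimate / standard error.
t_values <- est / se
round(t_values, 1)

# Term with the largest absolute t-value.
strongest <- names(which.max(abs(t_values)))
strongest
```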
We are interested in whether the promotion impacted casual and registered riders differently. One way to approach this question is to separately model casual and registered ridership.
##
## ======================================================
##                            Dependent variable:
##                            ---------------------------
##                            casual
## ------------------------------------------------------
## poly(temp, 3, raw = TRUE)1 -116.641**
##                            (58.044)
##
## poly(temp, 3, raw = TRUE)2 10.626***
##                            (3.145)
##
## poly(temp, 3, raw = TRUE)3 -0.211***
##                            (0.052)
##
## Promotion                  279.379***
##                            (38.912)
##
## holiday                    318.800***
##                            (114.942)
##
## humidity                   -9.937***
##                            (1.511)
##
## windspeed                  -19.483***
##                            (3.916)
##
## as.factor(mnth)2           -21.846
##                            (99.703)
##
## as.factor(mnth)3           291.404***
##                            (108.291)
##
## as.factor(mnth)4           421.284***
##                            (119.940)
##
## as.factor(mnth)5           450.141***
##                            (135.425)
##
## as.factor(mnth)6           360.322**
##                            (152.065)
##
## as.factor(mnth)7           484.555***
##                            (167.975)
##
## as.factor(mnth)8           330.779**
##                            (155.746)
##
## as.factor(mnth)9           383.370***
##                            (139.664)
##
## as.factor(mnth)10          361.786***
##                            (120.338)
##
## as.factor(mnth)11          201.390*
##                            (109.204)
##
## as.factor(mnth)12          80.146
##                            (102.683)
##
## Constant                   1,166.362***
##                            (325.904)
##
## ------------------------------------------------------
## Observations               731
## R2                         0.454
## Adjusted R2                0.441
## Residual Std. Error        513.572 (df = 712)
## F Statistic                32.935*** (df = 18; 712)
## ======================================================
## Note:                      *p<0.1; **p<0.05; ***p<0.01
Above is the output of the model for casual ridership. Promotion is still the single strongest predictor when looking at just casual riders, but its t-value of 7.2 is much lower than the 33.0 in the combined model. The casual model also explains far less of the variation in ridership (R2 of 0.454 versus 0.842).
##
## ======================================================
##                            Dependent variable:
##                            ---------------------------
##                            registered
## ------------------------------------------------------
## poly(temp, 3, raw = TRUE)1 -180.727**
##                            (86.889)
##
## poly(temp, 3, raw = TRUE)2 23.147***
##                            (4.707)
##
## poly(temp, 3, raw = TRUE)3 -0.495***
##                            (0.079)
##
## Promotion                  1,671.870***
##                            (58.250)
##
## holiday                    -916.733***
##                            (172.063)
##
## humidity                   -26.473***
##                            (2.261)
##
## windspeed                  -49.237***
##                            (5.863)
##
## as.factor(mnth)2           26.094
##                            (149.250)
##
## as.factor(mnth)3           87.290
##                            (162.106)
##
## as.factor(mnth)4           141.300
##                            (179.544)
##
## as.factor(mnth)5           331.621
##                            (202.725)
##
## as.factor(mnth)6           391.174*
##                            (227.634)
##
## as.factor(mnth)7           460.597*
##                            (251.450)
##
## as.factor(mnth)8           463.843**
##                            (233.144)
##
## as.factor(mnth)9           729.778***
##                            (209.071)
##
## as.factor(mnth)10          883.043***
##                            (180.140)
##
## as.factor(mnth)11          914.951***
##                            (163.473)
##
## as.factor(mnth)12          691.894***
##                            (153.711)
##
## Constant                   3,355.508***
##                            (487.863)
##
## ------------------------------------------------------
## Observations               731
## R2                         0.763
## Adjusted R2                0.757
## Residual Std. Error        768.792 (df = 712)
## F Statistic                127.486*** (df = 18; 712)
## ======================================================
## Note:                      *p<0.1; **p<0.05; ***p<0.01
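Somewhat surprisingly, the promotion seems to have a larger impact on registered riders: about 1,672 additional registered riders per promotional day, compared with about 279 additional casual riders. The t-value of 28.7 for Promotion makes it by far the strongest predictor in the registered model.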
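The two Promotion estimates can be checked against the combined model with a few lines of arithmetic. This assumes that total_riders is simply the sum of casual and registered riders, which the report does not state explicitly. Under that assumption, and because all three models use the same predictors, the two coefficients should add up to the combined coefficient.

```r
# Promotion estimates and standard errors from the casual and registered tables.
promo <- data.frame(
  model    = c("casual", "registered"),
  estimate = c(279.379, 1671.870),
  se       = c(38.912, 58.250)
)

# t-value = estimate / standard error.
promo$t_value <- promo$estimate / promo$se
round(promo$t_value, 1)

# Sum of the two effects, to compare with 1951.248 from the combined model.
combined_check <- sum(promo$estimate)
combined_check
```

The sum differs from the combined coefficient only in the third decimal place, which is consistent with rounding in the printed tables.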
We can also look at the impact without any linear modeling by simply comparing the percentage increase in average casual ridership on days when the promotion was in place with the corresponding increase for registered riders.
## # A tibble: 1 x 2
##   average_casual average_registered
##            <dbl>              <dbl>
## 1           677.              2728.
## # A tibble: 1 x 2
##   average_casual average_registered
##            <dbl>              <dbl>
## 1          1018.              4581.
On average, the promotion increased casual ridership by 50.4% (from 677 to 1,018 riders per day) and registered ridership by 67.9% (from 2,728 to 4,581 riders per day). Again, registered riders seem more price sensitive. On the surface, this seems counterintuitive: we would expect registered riders to be less prone to behavioral changes, since they are likely to be more regular users. However, it does make sense that those who went to the added trouble of setting up a membership to save some money would be more sensitive to pricing in general and especially interested in getting a good deal. This is one way to interpret the greater impact the promotion has had on registered riders.
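The percentage figures follow directly from the two summary tables. The sketch below assumes the first table reports days without the promotion and the second reports days with it; the tables carry no labels, but this reading is the one consistent with an increase.

```r
# Average daily riders from the two summary tables (assumed order:
# non-promotional days first, promotional days second).
no_promo <- c(casual = 677, registered = 2728)
promo    <- c(casual = 1018, registered = 4581)

# Percentage increase relative to non-promotional days.
pct_increase <- (promo - no_promo) / no_promo * 100
round(pct_increase, 1)
```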
We know from our analysis that the promotion has proven successful at increasing ridership. However, we do not have enough information to determine whether the promotion has increased profitability. First, we would need to know the variable costs of increased ridership (especially wear and tear on the bicycles and other maintenance-related costs) to determine how much profit comes from the incremental riders who would not have used the bicycles but for the promotion. Once we know the added profit from those additional riders, we can compare it to the profit lost on riders who would have used the bicycles regardless of whether a promotion was in place. If the fees paid by the added riders (minus any variable costs) are more than the cost of the unnecessary discounts given to riders who are less price sensitive, we can judge the promotion a financial success.
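The comparison described above can be written as a simple break-even calculation. The function below is only a sketch. It assumes every rider pays the same daily fee and that variable cost is a fixed amount per added rider, neither of which is established in this report. The function name `promo_profit_change` and all input values are hypothetical.

```r
# Change in daily profit from running the promotion.
# gain: discounted fee minus variable cost, earned on riders the promotion adds.
# loss: discount given away to riders who would have ridden anyway.
promo_profit_change <- function(fee, variable_cost, baseline_riders,
                                added_riders, discount = 0.30) {
  gain <- added_riders * (fee * (1 - discount) - variable_cost)
  loss <- baseline_riders * fee * discount
  gain - loss
}

# Purely hypothetical inputs: a fee of 10, a variable cost of 2 per ride,
# 100 riders who would ride anyway, and 50 riders attracted by the discount.
example_change <- promo_profit_change(
  fee = 10, variable_cost = 2, baseline_riders = 100, added_riders = 50
)
example_change  # 50 * (7 - 2) - 100 * 3 = -50
```

A positive result would indicate a financially successful promotion under these assumptions. In the hypothetical example the discounts given to existing riders outweigh the margin earned on new ones.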