Introduction

The following is a modeling experiment using bike share data from Toronto.

Building a Model

Starting from temperature as the basis for estimating daily ridership, I added humidity (hot and humid is not an ideal combination), month, weather, and promotion. Month accounts for the time of year: in colder seasons people are less likely to plan outdoor activities, and some will have decided in advance not to ride regardless of the conditions on a given day. Weather was added because few people bike in snow or rain. Promotion was included to see whether ridership increased as a result of a promotion. As the call below shows, temperature enters the model as a cubic polynomial and humidity as a quadratic.

## 
## Call:
## lm(formula = total ~ poly(temp, 3, raw = TRUE) + weathersit + 
##     poly(humidity, 2, raw = TRUE) + mnth + Promotion, data = bikeshare_OD)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3841.0  -351.5    64.5   450.1  2502.8 
## 
## Coefficients:
##                                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                     5.544e+02  6.354e+02   0.873 0.383219    
## poly(temp, 3, raw = TRUE)1     -2.974e+02  8.707e+01  -3.415 0.000674 ***
## poly(temp, 3, raw = TRUE)2      3.270e+01  4.722e+00   6.924 9.83e-12 ***
## poly(temp, 3, raw = TRUE)3     -6.872e-01  7.883e-02  -8.718  < 2e-16 ***
## weathersit2                    -4.156e+02  7.778e+01  -5.343 1.23e-07 ***
## weathersit3                    -1.731e+03  2.172e+02  -7.971 6.26e-15 ***
## poly(humidity, 2, raw = TRUE)1  5.581e+01  1.463e+01   3.816 0.000147 ***
## poly(humidity, 2, raw = TRUE)2 -5.854e-01  1.205e-01  -4.858 1.46e-06 ***
## mnth2                           2.787e+01  1.496e+02   0.186 0.852213    
## mnth3                           5.348e+02  1.626e+02   3.288 0.001057 ** 
## mnth4                           6.588e+02  1.805e+02   3.650 0.000281 ***
## mnth5                           9.264e+02  2.033e+02   4.557 6.11e-06 ***
## mnth6                           1.100e+03  2.291e+02   4.801 1.93e-06 ***
## mnth7                           1.343e+03  2.527e+02   5.313 1.45e-07 ***
## mnth8                           1.102e+03  2.344e+02   4.702 3.10e-06 ***
## mnth9                           1.426e+03  2.094e+02   6.809 2.09e-11 ***
## mnth10                          1.458e+03  1.814e+02   8.036 3.86e-15 ***
## mnth11                          1.185e+03  1.643e+02   7.212 1.41e-12 ***
## mnth12                          8.770e+02  1.539e+02   5.700 1.76e-08 ***
## Promotion1                      1.962e+03  5.874e+01  33.403  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 770.1 on 711 degrees of freedom
## Multiple R-squared:  0.8461, Adjusted R-squared:  0.842 
## F-statistic: 205.7 on 19 and 711 DF,  p-value: < 2.2e-16
##                                    GVIF Df GVIF^(1/(2*Df))
## poly(temp, 3, raw = TRUE)     20.660725  3        1.656498
## weathersit                     2.263781  2        1.226616
## poly(humidity, 2, raw = TRUE)  2.766855  2        1.289723
## mnth                          22.239540 11        1.151418
## Promotion                      1.063149  1        1.031091
## [1] 6.497311
##               temp  humidity
## temp     1.0000000 0.1269629
## humidity 0.1269629 1.0000000
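
For readers who want to experiment with the same specification, the following is a minimal sketch of how a model of this form might be fit. The original data are not reproduced here, so the data frame `sim` and every value in it are simulated stand-ins (an assumption for illustration only); only the formula and column names are taken from the call shown above.

```r
# Simulated stand-in for bikeshare_OD; all values are made up for illustration.
set.seed(1)
n <- 240
sim <- data.frame(
  temp       = runif(n, 0, 35),
  humidity   = runif(n, 20, 100),
  weathersit = factor(sample(1:3, n, replace = TRUE, prob = c(0.6, 0.3, 0.1)),
                      levels = 1:3),
  mnth       = factor(rep(1:12, length.out = n), levels = 1:12),
  Promotion  = factor(sample(0:1, n, replace = TRUE), levels = 0:1)
)

# Arbitrary response so that the model has something to estimate.
sim$total <- 3000 + 100 * sim$temp - 2 * sim$temp^2 - 10 * sim$humidity +
  1500 * (sim$Promotion == "1") + rnorm(n, sd = 500)

# Same formula as the reported model: cubic in temp, quadratic in humidity.
fit <- lm(total ~ poly(temp, 3, raw = TRUE) + weathersit +
            poly(humidity, 2, raw = TRUE) + mnth + Promotion, data = sim)

n_coef <- length(coef(fit))                      # intercept plus 19 terms
temp_humidity_cor <- cor(sim$temp, sim$humidity) # pairwise correlation check
```

Calling `summary(fit)` on the result prints a coefficient table with the same layout as the one above, though the numbers will differ because the data are simulated.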

Multicollinearity

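There is some multicollinearity among the variables, particularly between humidity and weather type. This is unsurprising, since higher humidity means a higher chance of rain. It doesn't concern me, however: it rarely rains or snows for the whole day, yet even some rain or snow would likely affect ridership that day, so both variables carry useful information. In the GVIF table above, the adjusted values, GVIF^(1/(2*Df)), are all below 1.7. The large raw GVIFs for temperature (20.7) and month (22.2) are expected, since temperature tracks the time of year.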
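
The adjusted column of the GVIF table can be recomputed by hand, which makes it easier to see why the raw values for multi-degree-of-freedom terms look alarming while the adjusted ones do not. This sketch simply copies the GVIF and Df values from the output above; the short names in the vectors are labels chosen here for readability.

```r
# GVIF and degrees of freedom copied from the reported table.
gvif <- c(temp = 20.660725, weathersit = 2.263781, humidity = 2.766855,
          mnth = 22.239540, Promotion = 1.063149)
df   <- c(temp = 3, weathersit = 2, humidity = 2, mnth = 11, Promotion = 1)

# The adjustment puts terms with different Df on a comparable scale.
adj_gvif <- gvif^(1 / (2 * df))
round(adj_gvif, 6)
```

The result reproduces the third column of the table. Month has the largest raw GVIF mainly because it spreads across 11 coefficients; after adjustment it is among the smallest.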

Assessing Monthly Ridership

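Interestingly, the month with the largest coefficient was October: holding temperature, humidity, weather, and promotion fixed, an October day is expected to see about 1,458 more riders than a comparable day in the baseline month (mnth 1). An unusually cold October would still see a sharp drop in ridership, however, because cold weather lowers ridership through the temperature terms.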

Promotional Effect

Promotion appears to have had a positive effect on overall ridership: a promotion day is expected to have about 1,962 more riders than a comparable non-promotion day. However, because of the multicollinearity, the apparent success of a promotion might owe more to the conditions on promotion days than to the promotion itself.
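
To give a sense of the uncertainty around that estimate, an approximate 95% confidence interval can be built from the reported coefficient, standard error, and residual degrees of freedom. This is a sketch that assumes the usual linear model assumptions hold; it says nothing about whether the effect is causal.

```r
# Values copied from the Promotion1 row and the residual df of the first model.
est <- 1962    # estimated extra riders on a promotion day
se  <- 58.74   # standard error of that estimate
df  <- 711     # residual degrees of freedom

t_crit <- qt(0.975, df)              # two-sided 95% critical value
ci <- est + c(-1, 1) * t_crit * se   # lower and upper bounds
round(ci)
```

With the model object available, `confint()` on the fitted model returns the same kind of interval directly.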

Casual vs. Registered Promotion

Refitting the model with the log of casual ridership and the log of registered ridership as the response shows that promotion affects both groups, but not equally. Registered riders appear to have been more strongly affected than casual riders (a Promotion1 coefficient of about 0.46 versus 0.34 on the log scale).

## 
## Call:
## lm(formula = log(casual) ~ poly(temp, 3, raw = TRUE) + weathersit + 
##     poly(humidity, 2, raw = TRUE) + mnth + Promotion, data = bikeshare_OD)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.4428 -0.3603 -0.1124  0.4478  1.6668 
## 
## Coefficients:
##                                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                     3.152e+00  4.739e-01   6.650 5.83e-11 ***
## poly(temp, 3, raw = TRUE)1      1.541e-01  6.494e-02   2.373  0.01790 *  
## poly(temp, 3, raw = TRUE)2      1.298e-04  3.522e-03   0.037  0.97061    
## poly(temp, 3, raw = TRUE)3     -7.972e-05  5.880e-05  -1.356  0.17559    
## weathersit2                    -2.559e-01  5.801e-02  -4.412 1.19e-05 ***
## weathersit3                    -1.501e+00  1.620e-01  -9.267  < 2e-16 ***
## poly(humidity, 2, raw = TRUE)1  1.564e-02  1.091e-02   1.434  0.15212    
## poly(humidity, 2, raw = TRUE)2 -1.800e-04  8.988e-05  -2.002  0.04562 *  
## mnth2                           4.934e-02  1.115e-01   0.442  0.65836    
## mnth3                           7.133e-01  1.213e-01   5.880 6.31e-09 ***
## mnth4                           8.202e-01  1.346e-01   6.093 1.81e-09 ***
## mnth5                           8.918e-01  1.516e-01   5.881 6.27e-09 ***
## mnth6                           8.306e-01  1.709e-01   4.860 1.44e-06 ***
## mnth7                           1.013e+00  1.885e-01   5.376 1.04e-07 ***
## mnth8                           8.719e-01  1.748e-01   4.987 7.70e-07 ***
## mnth9                           8.804e-01  1.562e-01   5.637 2.49e-08 ***
## mnth10                          8.043e-01  1.353e-01   5.944 4.35e-09 ***
## mnth11                          6.463e-01  1.225e-01   5.274 1.77e-07 ***
## mnth12                          4.355e-01  1.148e-01   3.795  0.00016 ***
## Promotion1                      3.438e-01  4.381e-02   7.848 1.55e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5744 on 711 degrees of freedom
## Multiple R-squared:  0.6902, Adjusted R-squared:  0.6819 
## F-statistic: 83.36 on 19 and 711 DF,  p-value: < 2.2e-16
## 
## Call:
## lm(formula = log(registered) ~ poly(temp, 3, raw = TRUE) + weathersit + 
##     poly(humidity, 2, raw = TRUE) + mnth + Promotion, data = bikeshare_OD)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.4905 -0.1070  0.0575  0.1621  0.9851 
## 
## Coefficients:
##                                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                     6.444e+00  2.708e-01  23.795  < 2e-16 ***
## poly(temp, 3, raw = TRUE)1      6.022e-03  3.711e-02   0.162 0.871119    
## poly(temp, 3, raw = TRUE)2      4.767e-03  2.012e-03   2.369 0.018106 *  
## poly(temp, 3, raw = TRUE)3     -1.235e-04  3.360e-05  -3.677 0.000254 ***
## weathersit2                    -5.428e-02  3.315e-02  -1.637 0.102005    
## weathersit3                    -7.673e-01  9.255e-02  -8.290 5.63e-16 ***
## poly(humidity, 2, raw = TRUE)1  1.841e-02  6.233e-03   2.954 0.003245 ** 
## poly(humidity, 2, raw = TRUE)2 -1.875e-04  5.136e-05  -3.652 0.000280 ***
## mnth2                           6.371e-02  6.374e-02   1.000 0.317842    
## mnth3                           6.790e-02  6.931e-02   0.980 0.327623    
## mnth4                           6.685e-02  7.692e-02   0.869 0.385044    
## mnth5                           1.354e-01  8.664e-02   1.563 0.118564    
## mnth6                           1.682e-01  9.765e-02   1.722 0.085458 .  
## mnth7                           2.062e-01  1.077e-01   1.915 0.055940 .  
## mnth8                           1.568e-01  9.989e-02   1.570 0.116934    
## mnth9                           2.546e-01  8.924e-02   2.853 0.004456 ** 
## mnth10                          2.466e-01  7.732e-02   3.189 0.001488 ** 
## mnth11                          3.731e-01  7.002e-02   5.328 1.33e-07 ***
## mnth12                          2.598e-01  6.557e-02   3.962 8.18e-05 ***
## Promotion1                      4.605e-01  2.503e-02  18.398  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3282 on 711 degrees of freedom
## Multiple R-squared:  0.6718, Adjusted R-squared:  0.663 
## F-statistic: 76.59 on 19 and 711 DF,  p-value: < 2.2e-16
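
Because both responses are on the log scale, the Promotion1 coefficients are easier to read once converted to percentage changes. The sketch below applies the standard conversion, exp(b) - 1, to the two estimates copied from the output above.

```r
# Promotion1 estimates from the log(casual) and log(registered) models.
promo_log <- c(casual = 0.3438, registered = 0.4605)

# A coefficient b on the log scale corresponds to a (exp(b) - 1) * 100 percent change.
pct_change <- (exp(promo_log) - 1) * 100
round(pct_change, 1)
```

This works out to roughly a 41% increase for casual riders and a 58% increase for registered riders on promotion days, holding the other predictors fixed.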

Financial Success or Failure

To assess whether the promotions were really a success, we would need to determine whether the increased ridership came with increased revenue or with reduced revenue. If the promotions led to more registered riders and more riders over time, they can be considered a success. If revenue fell, however, they would be considered a failure.

The data could also be significantly influenced by how frequent the promotions were: riders may have begun waiting for promotional days to ride. Knowing whether an individual rode because of a promotion would be helpful.