Question 1

The best model identified determined 87% of total ridership variance can be explained by weekday, month of year, promotion flag, holiday flag, temperature, weather, humidity, and windspeed.

## 
## Call:
## lm(formula = total_ridership ~ poly(temp_f, 3, raw = TRUE) + 
##     as.factor(promotion) + as.factor(month) + as.factor(holiday) + 
##     as.factor(weekday) + as.factor(weathersit) + humidity + windspeed, 
##     data = bikeshare)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3221.7  -342.8    53.6   414.3  2273.5 
## 
## Coefficients:
##                                               Estimate Std. Error t value
## (Intercept)                                  2.357e+04  3.640e+03   6.475
## poly(temp_f, 3, raw = TRUE)1                -1.209e+03  1.695e+02  -7.132
## poly(temp_f, 3, raw = TRUE)2                 2.197e+01  2.563e+00   8.573
## poly(temp_f, 3, raw = TRUE)3                -1.208e-01  1.262e-02  -9.569
## as.factor(promotion)1                        1.969e+03  5.446e+01  36.160
## as.factor(month)Feb                          3.580e+01  1.390e+02   0.258
## as.factor(month)Mar                          4.832e+02  1.514e+02   3.192
## as.factor(month)Apr                          7.122e+02  1.681e+02   4.238
## as.factor(month)May                          8.763e+02  1.895e+02   4.623
## as.factor(month)Jun                          1.012e+03  2.143e+02   4.723
## as.factor(month)Jul                          1.260e+03  2.367e+02   5.324
## as.factor(month)Aug                          1.023e+03  2.190e+02   4.670
## as.factor(month)Sep                          1.289e+03  1.961e+02   6.575
## as.factor(month)Oct                          1.412e+03  1.687e+02   8.369
## as.factor(month)Nov                          1.159e+03  1.526e+02   7.594
## as.factor(month)Dec                          8.047e+02  1.433e+02   5.616
## as.factor(holiday)1                         -5.689e+02  1.673e+02  -3.400
## as.factor(weekday)Monday                     1.573e+02  1.021e+02   1.541
## as.factor(weekday)Tuesday                    3.013e+02  9.969e+01   3.022
## as.factor(weekday)Wednesday                  3.712e+02  9.996e+01   3.713
## as.factor(weekday)Thursday                   3.898e+02  1.000e+02   3.896
## as.factor(weekday)Friday                     4.485e+02  9.981e+01   4.493
## as.factor(weekday)Saturday                   4.490e+02  9.914e+01   4.529
## as.factor(weathersit)Mist, cloudy           -4.393e+02  7.178e+01  -6.121
## as.factor(weathersit)Light snow, light rain -1.943e+03  1.828e+02 -10.628
## humidity                                    -2.021e+01  2.740e+00  -7.378
## windspeed                                   -5.416e+01  5.626e+00  -9.627
##                                             Pr(>|t|)    
## (Intercept)                                 1.77e-10 ***
## poly(temp_f, 3, raw = TRUE)1                2.46e-12 ***
## poly(temp_f, 3, raw = TRUE)2                 < 2e-16 ***
## poly(temp_f, 3, raw = TRUE)3                 < 2e-16 ***
## as.factor(promotion)1                        < 2e-16 ***
## as.factor(month)Feb                         0.796851    
## as.factor(month)Mar                         0.001475 ** 
## as.factor(month)Apr                         2.55e-05 ***
## as.factor(month)May                         4.49e-06 ***
## as.factor(month)Jun                         2.80e-06 ***
## as.factor(month)Jul                         1.37e-07 ***
## as.factor(month)Aug                         3.61e-06 ***
## as.factor(month)Sep                         9.47e-11 ***
## as.factor(month)Oct                         3.13e-16 ***
## as.factor(month)Nov                         9.91e-14 ***
## as.factor(month)Dec                         2.82e-08 ***
## as.factor(holiday)1                         0.000712 ***
## as.factor(weekday)Monday                    0.123742    
## as.factor(weekday)Tuesday                   0.002600 ** 
## as.factor(weekday)Wednesday                 0.000221 ***
## as.factor(weekday)Thursday                  0.000107 ***
## as.factor(weekday)Friday                    8.21e-06 ***
## as.factor(weekday)Saturday                  6.95e-06 ***
## as.factor(weathersit)Mist, cloudy           1.55e-09 ***
## as.factor(weathersit)Light snow, light rain  < 2e-16 ***
## humidity                                    4.55e-13 ***
## windspeed                                    < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 715.9 on 704 degrees of freedom
## Multiple R-squared:  0.8683, Adjusted R-squared:  0.8634 
## F-statistic: 178.5 on 26 and 704 DF,  p-value: < 2.2e-16

Question 2

We can use the residuals to study the linearity of the model. The data show normal distribution of data – with slight negative skewness, but as plotting a Normal Q-Q the model shows it is over estimating after two standard deviations while highly under estimating lower than -2 standard deviations.

In evaluating for multicollinearity we can test the variance-inflation of each factor. Month and Temperature have a colinear relationship. This would make sense as each Month has their own characteristic. The other factors are not significant concerns as there values show little multicollinearity in the model. Even thought Month and Temperature were colinear I decided to keep both factors in the equation as removing one reduces R-squared value of the model.

Variance Inflation

##                                  GVIF Df GVIF^(1/(2*Df))
## poly(temp_f, 3, raw = TRUE) 21.278361  3        1.664650
## as.factor(promotion)         1.057480  1        1.028339
## as.factor(month)            22.937281 11        1.153036
## as.factor(holiday)           1.114231  1        1.055571
## as.factor(weekday)           1.162035  6        1.012593
## as.factor(weathersit)        1.885455  2        1.171801
## humidity                     2.169240  1        1.472834
## windspeed                    1.215508  1        1.102501

Question 3

The summary of coefficients as listed below shows October would have the highest total ridership of 1,412. With everything held constant if October became unseasonably cold with light snow/rain total ridership could drop by -7,612 riders according to the model.

The model results that represent these conditions:

poly(temp_f, 3, raw = TRUE)2:as.factor(month)Oct:as.factor(weathersit)Light snow, light rain -7.612083e+03

poly(temp_f, 3, raw = TRUE)3:as.factor(month)Oct:as.factor(weathersit)Light snow, light rain 4.303216e+01

poly(temp_f, 3, raw = TRUE)1:as.factor(month)Nov:as.factor(weathersit)Light snow, light rain -2.849246e+02

Question 4

Promotion can be available or not available on a particual day. When the promotion is available the model estimates the promotion will contribute 1,969 riders to the total ridership for the day. The marketing department can feel confident in the value as promotion was identified as being statistically significant in the model and that the model determined 87% of total ridership variance could be explained.

Question 5

By standardizes all coefficients and looking at the promotion coefficient for each population (casual and registered riders), we see promotion has a stronger contribution to registered riders than casual riders.

## 
## Call:
## lm(formula = registered ~ poly(temp_f, 3, raw = TRUE) + as.factor(promotion) + 
##     as.factor(month) + as.factor(holiday) + as.factor(weekday) + 
##     as.factor(weathersit) + humidity + windspeed, data = bikeshare)
## 
## Standardized Coefficients::
##                                 (Intercept) 
##                                 0.000000000 
##                poly(temp_f, 3, raw = TRUE)1 
##                                -6.494743024 
##                poly(temp_f, 3, raw = TRUE)2 
##                                16.608244378 
##                poly(temp_f, 3, raw = TRUE)3 
##                                -9.861886944 
##                       as.factor(promotion)1 
##                                 0.542799823 
##                         as.factor(month)Feb 
##                                 0.008273578 
##                         as.factor(month)Mar 
##                                 0.036678514 
##                         as.factor(month)Apr 
##                                 0.066199517 
##                         as.factor(month)May 
##                                 0.102567693 
##                         as.factor(month)Jun 
##                                 0.142761270 
##                         as.factor(month)Jul 
##                                 0.173535362 
##                         as.factor(month)Aug 
##                                 0.151283037 
##                         as.factor(month)Sep 
##                                 0.190652825 
##                         as.factor(month)Oct 
##                                 0.203784842 
##                         as.factor(month)Nov 
##                                 0.173020703 
##                         as.factor(month)Dec 
##                                 0.134843724 
##                         as.factor(holiday)1 
##                                -0.120294103 
##                    as.factor(weekday)Monday 
##                                 0.206796308 
##                   as.factor(weekday)Tuesday 
##                                 0.246970570 
##                 as.factor(weekday)Wednesday 
##                                 0.262769900 
##                  as.factor(weekday)Thursday 
##                                 0.264537321 
##                    as.factor(weekday)Friday 
##                                 0.236613680 
##                  as.factor(weekday)Saturday 
##                                 0.066049646 
##           as.factor(weathersit)Mist, cloudy 
##                                -0.104235991 
## as.factor(weathersit)Light snow, light rain 
##                                -0.175362401 
##                                    humidity 
##                                -0.123717922 
##                                   windspeed 
##                                -0.119979067
## 
## Call:
## lm(formula = casual ~ poly(temp_f, 3, raw = TRUE) + as.factor(promotion) + 
##     as.factor(month) + as.factor(holiday) + as.factor(weekday) + 
##     as.factor(weathersit) + humidity + windspeed, data = bikeshare)
## 
## Standardized Coefficients::
##                                 (Intercept) 
##                                 0.000000000 
##                poly(temp_f, 3, raw = TRUE)1 
##                                -9.025038561 
##                poly(temp_f, 3, raw = TRUE)2 
##                                21.476515314 
##                poly(temp_f, 3, raw = TRUE)3 
##                               -12.093962143 
##                       as.factor(promotion)1 
##                                 0.201456935 
##                         as.factor(month)Feb 
##                                -0.004809483 
##                         as.factor(month)Mar 
##                                 0.112844859 
##                         as.factor(month)Apr 
##                                 0.134487118 
##                         as.factor(month)May 
##                                 0.122733069 
##                         as.factor(month)Jun 
##                                 0.080477340 
##                         as.factor(month)Jul 
##                                 0.117260680 
##                         as.factor(month)Aug 
##                                 0.071481646 
##                         as.factor(month)Sep 
##                                 0.082551078 
##                         as.factor(month)Oct 
##                                 0.110061404 
##                         as.factor(month)Nov 
##                                 0.070451277 
##                         as.factor(month)Dec 
##                                 0.020344868 
##                         as.factor(holiday)1 
##                                 0.134868327 
##                    as.factor(weekday)Monday 
##                                -0.389521760 
##                   as.factor(weekday)Tuesday 
##                                -0.407813919 
##                 as.factor(weekday)Wednesday 
##                                -0.408149500 
##                  as.factor(weekday)Thursday 
##                                -0.402693930 
##                    as.factor(weekday)Friday 
##                                -0.309357575 
##                  as.factor(weekday)Saturday 
##                                 0.079424557 
##           as.factor(weathersit)Mist, cloudy 
##                                -0.065984513 
## as.factor(weathersit)Light snow, light rain 
##                                -0.074515228 
##                                    humidity 
##                                -0.138171512 
##                                   windspeed 
##                                -0.136933338

Looking at the coefficients for each population, promotion estimates 1,693 increase of registered riders, while 276 increase of casual riders. With this analysis registered riders have more substantial impact over casual riders when a promotion was active. This could point to promotional days not being as random as assumed.

##                                 (Intercept) 
##                                1.410161e+04 
##                poly(temp_f, 3, raw = TRUE)1 
##                               -7.501178e+02 
##                poly(temp_f, 3, raw = TRUE)2 
##                                1.400450e+01 
##                poly(temp_f, 3, raw = TRUE)3 
##                               -7.843476e-02 
##                       as.factor(promotion)1 
##                                1.692656e+03 
##                         as.factor(month)Feb 
##                                4.811075e+01 
##                         as.factor(month)Mar 
##                                2.052669e+02 
##                         as.factor(month)Apr 
##                                3.760400e+02 
##                         as.factor(month)May 
##                                5.740078e+02 
##                         as.factor(month)Jun 
##                                8.109416e+02 
##                         as.factor(month)Jul 
##                                9.711698e+02 
##                         as.factor(month)Aug 
##                                8.466373e+02 
##                         as.factor(month)Sep 
##                                1.082985e+03 
##                         as.factor(month)Oct 
##                                1.140457e+03 
##                         as.factor(month)Nov 
##                                9.828274e+02 
##                         as.factor(month)Dec 
##                                7.546367e+02 
##                         as.factor(holiday)1 
##                               -1.122850e+03 
##                    as.factor(weekday)Monday 
##                                9.193419e+02 
##                   as.factor(weekday)Tuesday 
##                                1.102328e+03 
##                 as.factor(weekday)Wednesday 
##                                1.172847e+03 
##                  as.factor(weekday)Thursday 
##                                1.180735e+03 
##                    as.factor(weekday)Friday 
##                                1.056101e+03 
##                  as.factor(weekday)Saturday 
##                                2.936329e+02 
##           as.factor(weathersit)Mist, cloudy 
##                               -3.436075e+02 
## as.factor(weathersit)Light snow, light rain 
##                               -1.636868e+03 
##                                    humidity 
##                               -1.355283e+01 
##                                   windspeed 
##                               -3.605262e+01
##                                 (Intercept) 
##                               9466.70823774 
##                poly(temp_f, 3, raw = TRUE)1 
##                               -458.71044700 
##                poly(temp_f, 3, raw = TRUE)2 
##                                  7.96947746 
##                poly(temp_f, 3, raw = TRUE)3 
##                                 -0.04232912 
##                       as.factor(promotion)1 
##                                276.46069158 
##                         as.factor(month)Feb 
##                                -12.30747961 
##                         as.factor(month)Mar 
##                                277.91437770 
##                         as.factor(month)Apr 
##                                336.18779469 
##                         as.factor(month)May 
##                                302.26706638 
##                         as.factor(month)Jun 
##                                201.17539785 
##                         as.factor(month)Jul 
##                                288.78966430 
##                         as.factor(month)Aug 
##                                176.04503533 
##                         as.factor(month)Sep 
##                                206.35928128 
##                         as.factor(month)Oct 
##                                271.05928409 
##                         as.factor(month)Nov 
##                                176.11247620 
##                         as.factor(month)Dec 
##                                 50.10535248 
##                         as.factor(holiday)1 
##                                553.99939456 
##                    as.factor(weekday)Monday 
##                               -762.05804083 
##                   as.factor(weekday)Tuesday 
##                               -801.03177988 
##                 as.factor(weekday)Wednesday 
##                               -801.69093087 
##                  as.factor(weekday)Thursday 
##                               -790.97505169 
##                    as.factor(weekday)Friday 
##                               -607.64294048 
##                  as.factor(weekday)Saturday 
##                                155.38572717 
##           as.factor(weathersit)Mist, cloudy 
##                                -95.72139139 
## as.factor(weathersit)Light snow, light rain 
##                               -306.08662589 
##                                    humidity 
##                                 -6.66097522 
##                                   windspeed 
##                                -18.10767313

Quesiton 6

The best model identified determined 87% of total ridership variance can be explained by weekday, month of year, promotion flag, holiday flag, temperature, weather, humidity, and windspeed. Although this is great additional information is need to come to a financial conclusion. We were provided the discounted rates for casual and registered riders when a promotion was active, but it would be nice to understand the customer cost of each. As these bikes may need general maintenance knowing the overhead costs of the business to evaluate if a profit is being made by CAO. Additionally, I would like to know if there were riders who participated more than once in a day as this would have an impact on the potential revenue.