Sections
- Packages Required
- Questions
Required Packages
The packages required for this markdown are:
Base regression model
Total riders by temperature
Fit of total riders as a function of temperature using a third-degree polynomial.
|   | Estimate | Std. Error | t value | Pr(>\|t\|) |   |
|---|---|---|---|---|---|
| (Intercept) | 519 | 775.3 | 0.6694 | 0.5035 |   |
| poly(temp, 3, raw = TRUE)1 | 63.14 | 134.5 | 0.4693 | 0.639 |   |
| poly(temp, 3, raw = TRUE)2 | 16.63 | 7.217 | 2.305 | 0.02146 | * |
| poly(temp, 3, raw = TRUE)3 | -0.4324 | 0.1208 | -3.58 | 0.0003663 | * * * |
Fitting linear model: total_riders ~ poly(temp, 3, raw = TRUE)

| Observations | Residual Std. Error | R² | Adjusted R² |
|---|---|---|---|
| 731 | 1423 | 0.4627 | 0.4604 |
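To make the fitted cubic easier to read, the sketch below plugs the coefficient estimates from the base-model table into the polynomial and locates the temperature at which predicted ridership peaks. The names `base_coef`, `predict_base`, `peak_temp`, and `peak_riders` are illustrative and not part of the original analysis; the result is in whatever units `temp` is recorded in.

```r
# Coefficient estimates copied from the base-model table
# (intercept, temp, temp^2, temp^3); object names are illustrative.
base_coef <- c(519, 63.14, 16.63, -0.4324)

# Predicted total riders at a given temperature
predict_base <- function(temp) {
  base_coef[1] + base_coef[2] * temp + base_coef[3] * temp^2 + base_coef[4] * temp^3
}

# Stationary points: roots of the derivative b1 + 2*b2*t + 3*b3*t^2.
# polyroot() expects coefficients in increasing order of power.
stationary <- Re(polyroot(c(base_coef[2], 2 * base_coef[3], 3 * base_coef[4])))

# The cubic coefficient is negative, so the larger root is the local maximum
peak_temp   <- max(stationary)
peak_riders <- predict_base(peak_temp)
```

With these estimates the curve peaks at a `temp` of roughly 27.4 and declines beyond it, which is the behaviour the negative cubic term is capturing.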

Residuals

1. Best regression model possible
We tried 9 different models, incrementally adding independent variables (IVs); some improved the fit and others did not. The model below is the best fit among them.
$$\text{total\_riders} = b_0 + b_1\,\text{temp} + b_2\,\text{temp}^2 + b_3\,\text{temp}^3 + b_4\,\text{promotion} + b_5\,\text{mnth} + b_6\,\text{weathersit} + b_7\,\text{humidity} + b_8\,\text{windspeed}$$
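Because `mnth` and `weathersit` enter the model as factors, each of those single terms in the equation stands for a set of dummy variables, with the first level acting as the baseline. A minimal sketch of that expansion, using a throwaway data frame (`toy` is hypothetical and only lists the twelve month codes):

```r
# Toy data: one row per month code (not the report's data set)
toy <- data.frame(mnth = 1:12)

# Design matrix that R builds for a factor term
X <- model.matrix(~ as.factor(mnth), data = toy)

colnames(X)  # "(Intercept)", "as.factor(mnth)2", ..., "as.factor(mnth)12"
ncol(X) - 1  # 11 dummy columns; month 1 is absorbed into the intercept
```

This is why the coefficient table lists months 2 to 12 and weather situations 2 and 3 only: each estimate is a difference from month 1 or weather situation 1.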
Values
|   | Estimate | Std. Error | t value | Pr(>\|t\|) |   |
|---|---|---|---|---|---|
| (Intercept) | 3656 | 476.9 | 7.665 | 5.895e-14 | * * * |
| poly(temp, 3, raw = TRUE)1 | -290.6 | 83.35 | -3.487 | 0.0005192 | * * * |
| poly(temp, 3, raw = TRUE)2 | 32.6 | 4.519 | 7.215 | 1.39e-12 | * * * |
| poly(temp, 3, raw = TRUE)3 | -0.6887 | 0.07543 | -9.13 | 6.967e-19 | * * * |
| as.factor(Promotion)1 | 1962 | 56.05 | 35 | 8.011e-157 | * * * |
| as.factor(mnth)2 | 43.41 | 143.1 | 0.3033 | 0.7618 |   |
| as.factor(mnth)3 | 510.1 | 155.6 | 3.278 | 0.001095 | * * |
| as.factor(mnth)4 | 708 | 172.7 | 4.099 | 4.623e-05 | * * * |
| as.factor(mnth)5 | 885.3 | 194.6 | 4.55 | 6.303e-06 | * * * |
| as.factor(mnth)6 | 1042 | 219.2 | 4.754 | 2.419e-06 | * * * |
| as.factor(mnth)7 | 1255 | 242 | 5.186 | 2.807e-07 | * * * |
| as.factor(mnth)8 | 1054 | 224 | 4.704 | 3.059e-06 | * * * |
| as.factor(mnth)9 | 1302 | 201 | 6.476 | 1.752e-10 | * * * |
| as.factor(mnth)10 | 1409 | 173.3 | 8.128 | 1.93e-15 | * * * |
| as.factor(mnth)11 | 1149 | 157.1 | 7.314 | 7.014e-13 | * * * |
| as.factor(mnth)12 | 812.9 | 147.5 | 5.512 | 4.961e-08 | * * * |
| as.factor(weathersit)2 | -398.7 | 73.44 | -5.429 | 7.801e-08 | * * * |
| as.factor(weathersit)3 | -1836 | 186.8 | -9.825 | 1.89e-21 | * * * |
| humidity | -21.66 | 2.801 | -7.731 | 3.66e-14 | * * * |
| windspeed | -55.11 | 5.786 | -9.525 | 2.537e-20 | * * * |
Fitting linear model: total_riders ~ poly(temp, 3, raw = TRUE) + as.factor(Promotion) + as.factor(mnth) + as.factor(weathersit) + humidity + windspeed

| Observations | Residual Std. Error | R² | Adjusted R² |
|---|---|---|---|
| 731 | 737.1 | 0.859 | 0.8552 |
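The adjusted R² values in the two model summaries can be reproduced from the reported R², the number of observations, and the residual degrees of freedom (727 for the base model and 711 for the extended one, as listed in the analysis of variance table in the next section). A quick sketch, where `adj_r2` is a helper defined only for this illustration:

```r
# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / residual_df
adj_r2 <- function(r2, n, residual_df) {
  1 - (1 - r2) * (n - 1) / residual_df
}

n <- 731
adj_base <- adj_r2(0.4627, n, 727)  # base model: 4 coefficients
adj_full <- adj_r2(0.859,  n, 711)  # extended model: 20 coefficients

round(c(base = adj_base, full = adj_full), 3)
```

Both results agree with the reported 0.4604 and 0.8552 up to rounding of the inputs. The penalty for the 16 extra coefficients is small, so the jump in fit is not an artefact of model size.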
Plot

2. Problems with assumptions
- Multicollinearity between season and month, 83%
- The residuals vs. fitted plot shows that the model overestimates at the lowest and highest fitted values.
- We compared this model with the base model using an F-test (anova); this model fits significantly better than the base model (F = 124.88, p < 2.2e-16).


Analysis of Variance Table
Model 1: total_riders ~ poly(temp, 3, raw = TRUE)
Model 2: total_riders ~ poly(temp, 3, raw = TRUE) + as.factor(Promotion) +
as.factor(mnth) + as.factor(weathersit) + humidity + windspeed
Res.Df RSS Df Sum of Sq F Pr(>F)
1 727 1472082143
2 711 386341671 16 1085740472 124.88 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
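The F statistic in the table can be recomputed by hand from the two residual sums of squares and their degrees of freedom. The sketch below does that with the values copied from the output above; all object names are illustrative.

```r
# Values copied from the analysis of variance table
rss_base <- 1472082143; df_base <- 727
rss_full <- 386341671;  df_full <- 711

df_diff <- df_base - df_full   # 16 additional coefficients
f_stat  <- ((rss_base - rss_full) / df_diff) / (rss_full / df_full)
p_value <- pf(f_stat, df_diff, df_full, lower.tail = FALSE)

# Residual standard errors implied by the same numbers
sigma_base <- sqrt(rss_base / df_base)
sigma_full <- sqrt(rss_full / df_full)

round(c(F = f_stat, sigma_base = sigma_base, sigma_full = sigma_full), 2)
```

The recomputed F matches the reported 124.88, and the two residual standard errors match the 1423 and 737.1 shown in the model summaries, so the extended model roughly halves the typical prediction error.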
3. Month with highest number of riders
- Which month has the highest number of riders, holding everything else constant?
Month = 10 (October), which has the largest month coefficient (1409, relative to month 1).
What if this month became unseasonably cold and rainy?
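- Coefficients: Month 10 = 1409.04 and weathersit 3 = -1835.55 (light snow or light rain)
The bad weather does not change the month coefficient, but the weathersit 3 penalty outweighs it: 1409.04 - 1835.55 = -426.51. A rainy October day is therefore predicted to have about 1836 fewer riders than a clear October day, and about 427 fewer than a clear day in month 1 (the baseline) with the same temperature, humidity, windspeed and promotion status.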
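The arithmetic behind this comparison can be laid out for all three weather situations. This is a small sketch using the coefficients reported for the extended model; the object names are illustrative, and the baseline is month 1 with weathersit 1.

```r
# Coefficients from the extended model (baseline: month 1, weathersit 1)
coef_mnth10      <- 1409.04
coef_weathersit2 <- -398.7
coef_weathersit3 <- -1835.55

# Shift in predicted riders for a month-10 day relative to the baseline,
# by weather situation, holding temp, humidity, windspeed and promotion fixed
october_shift <- c(
  weathersit1 = coef_mnth10,
  weathersit2 = coef_mnth10 + coef_weathersit2,
  weathersit3 = coef_mnth10 + coef_weathersit3
)
october_shift
```

The three shifts are 1409.04, 1010.34 and -426.51. An unseasonably cold spell would also lower `temp`, which changes the prediction further through the cubic temperature terms; that part is not included in this calculation.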
5. Casual vs Registered riders
Casual - Model
Using the best model to predict the number of casual riders
- The effect of promotion on casual riders is an increase of about 287 riders (coefficient 286.57).
- An adjusted R² of 0.446 (R² = 0.461) indicates that the model is not a good fit for casual riders.
Dependent variable: casual

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 937.85 | 290.04 – 1585.65 | 0.005 |
| poly(temp, 3, raw = TRUE)1 | -114.97 | -228.18 – -1.76 | 0.047 |
| poly(temp, 3, raw = TRUE)2 | 10.21 | 4.07 – 16.35 | 0.001 |
| poly(temp, 3, raw = TRUE)3 | -0.20 | -0.31 – -0.10 | <0.001 |
| as.factor(Promotion)1 | 286.57 | 210.44 – 362.70 | <0.001 |
| as.factor(mnth)2 | -17.23 | -211.63 – 177.18 | 0.862 |
| as.factor(mnth)3 | 306.09 | 94.75 – 517.43 | 0.005 |
| as.factor(mnth)4 | 454.03 | 219.44 – 688.62 | <0.001 |
| as.factor(mnth)5 | 463.55 | 199.28 – 727.83 | 0.001 |
| as.factor(mnth)6 | 398.53 | 100.84 – 696.22 | 0.009 |
| as.factor(mnth)7 | 526.90 | 198.21 – 855.58 | 0.002 |
| as.factor(mnth)8 | 357.66 | 53.39 – 661.93 | 0.022 |
| as.factor(mnth)9 | 415.33 | 142.31 – 688.36 | 0.003 |
| as.factor(mnth)10 | 393.41 | 157.96 – 628.86 | 0.001 |
| as.factor(mnth)11 | 212.52 | -0.83 – 425.88 | 0.051 |
| as.factor(mnth)12 | 82.53 | -117.79 – 282.85 | 0.420 |
| as.factor(weathersit)2 | -157.39 | -257.15 – -57.64 | 0.002 |
| as.factor(weathersit)3 | -458.34 | -712.09 – -204.59 | <0.001 |
| humidity | -5.28 | -9.09 – -1.48 | 0.007 |
| windspeed | -15.66 | -23.52 – -7.80 | <0.001 |
| Observations | 731 |   |   |
| R² / adjusted R² | 0.461 / 0.446 |   |   |
Registered riders - Model
- The effect of promotion on registered riders is an increase of about 1675 riders (coefficient 1675.39).
- An adjusted R² of 0.764 (R² = 0.771) indicates that the model is a good fit for registered riders.
Dependent variable: registered

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 2717.71 | 1757.34 – 3678.08 | <0.001 |
| poly(temp, 3, raw = TRUE)1 | -175.64 | -343.48 – -7.81 | 0.041 |
| poly(temp, 3, raw = TRUE)2 | 22.39 | 13.29 – 31.49 | <0.001 |
| poly(temp, 3, raw = TRUE)3 | -0.49 | -0.64 – -0.33 | <0.001 |
| as.factor(Promotion)1 | 1675.39 | 1562.53 – 1788.26 | <0.001 |
| as.factor(mnth)2 | 60.63 | -227.57 – 348.84 | 0.680 |
| as.factor(mnth)3 | 204.02 | -109.29 – 517.34 | 0.202 |
| as.factor(mnth)4 | 253.96 | -93.81 – 601.74 | 0.153 |
| as.factor(mnth)5 | 421.77 | 29.98 – 813.56 | 0.035 |
| as.factor(mnth)6 | 643.34 | 202.01 – 1084.66 | 0.004 |
| as.factor(mnth)7 | 728.03 | 240.75 – 1215.30 | 0.004 |
| as.factor(mnth)8 | 696.23 | 245.15 – 1147.32 | 0.003 |
| as.factor(mnth)9 | 886.49 | 481.74 – 1291.25 | <0.001 |
| as.factor(mnth)10 | 1015.63 | 666.58 – 1364.69 | <0.001 |
| as.factor(mnth)11 | 936.36 | 620.06 – 1252.66 | <0.001 |
| as.factor(mnth)12 | 730.41 | 433.45 – 1027.38 | <0.001 |
| as.factor(weathersit)2 | -241.29 | -389.18 – -93.41 | 0.001 |
| as.factor(weathersit)3 | -1377.21 | -1753.39 – -1001.03 | <0.001 |
| humidity | -16.37 | -22.01 – -10.73 | <0.001 |
| windspeed | -39.45 | -51.11 – -27.80 | <0.001 |
| Observations | 731 |   |   |
| R² / adjusted R² | 0.771 / 0.764 |   |   |
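The casual and registered promotion effects can be checked against the total-riders model. The sketch below adds the two coefficients and splits the combined effect by rider type; object names are illustrative. That the sum reproduces the total-model coefficient is consistent with `total_riders` being the sum of casual and registered riders, which is an inference from the numbers rather than something the report states.

```r
# Promotion coefficients reported in the three model tables
promo_casual     <- 286.57
promo_registered <- 1675.39
promo_total      <- 1962      # shown rounded in the total-riders table

promo_sum <- promo_casual + promo_registered   # 1961.96

# Share of the combined promotion effect by rider type
share <- c(casual = promo_casual, registered = promo_registered) / promo_sum
round(share, 3)
```

About 15% of the promotion effect comes from casual riders and about 85% from registered riders. The same additivity holds for the other rows, for example the intercepts (937.85 + 2717.71 ≈ 3656) and the weathersit 3 coefficients (-458.34 - 1377.21 = -1835.55).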