This data set records the number of people crossing the Queensboro Bridge in March of 2017 under different weather conditions. The response variable is the rate of people crossing the bridge. The predictor variables are precipitation, high temperature, low temperature, and the total population size (Total). In the rows shown below, AveTemp is the average of HighTemp and LowTemp, and NewPrecipitation.cat is 1 when there was any precipitation and 0 otherwise.
| Date | Day | HighTemp | LowTemp | Precipitation | QueensboroBridge | Total | AveTemp | NewPrecipitation.cat |
|---|---|---|---|---|---|---|---|---|
| Date | Saturday | 84.9 | 72.0 | 0.23 | 3216 | 11867 | 78.45 | 1 |
| Date | Sunday | 87.1 | 73.0 | 0.00 | 3579 | 13995 | 80.05 | 0 |
| Date | Monday | 87.1 | 71.1 | 0.45 | 4230 | 16067 | 79.10 | 1 |
| Date | Tuesday | 82.9 | 70.0 | 0.00 | 3861 | 13925 | 76.45 | 0 |
| Date | Wednesday | 84.9 | 71.1 | 0.00 | 5862 | 23110 | 78.00 | 0 |
| Date | Thursday | 75.0 | 71.1 | 0.00 | 5251 | 21861 | 73.05 | 0 |
Since it is reasonable to think that the expected count of people crossing the Queensboro Bridge is proportional to the population size, we would prefer to model the crossing rate under each weather condition. However, for illustration purposes, we will fit Poisson regression models to both the counts and the rates of people crossing the Queensboro Bridge.
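For reference, the two formulations can be written as follows. The notation is ours and is not taken from the fitted output: \(\mu\) denotes the expected number of crossings and \(\mathbf{x}^\top \boldsymbol{\beta}\) the linear predictor. We assume the rate model is fitted with \(\log(\text{Total})\) as an offset, that is, a term whose coefficient is fixed at 1.

\[ \text{count model: } \log \mu = \mathbf{x}^\top \boldsymbol{\beta}, \qquad \text{rate model: } \log\left(\frac{\mu}{\text{Total}}\right) = \mathbf{x}^\top \boldsymbol{\beta} \iff \log \mu = \log(\text{Total}) + \mathbf{x}^\top \boldsymbol{\beta} \]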
We first build a Poisson frequency regression model and ignore the population size.
| | Estimate | Std. Error | z value | Pr(>\|z\|) |
|---|---|---|---|---|
| (Intercept) | 7.9739275 | 0.0431838 | 184.650909 | 0 |
| AveTemp | 0.0060353 | 0.0005483 | 11.006896 | 0 |
| NewPrecipitation.cat1 | -0.3189133 | 0.0072562 | -43.950275 | 0 |
| DayMonday | 0.1586579 | 0.0102399 | 15.494037 | 0 |
| DaySaturday | -0.0989334 | 0.0107978 | -9.162367 | 0 |
| DaySunday | -0.1283273 | 0.0108855 | -11.788804 | 0 |
| DayThursday | 0.1686654 | 0.0106693 | 15.808539 | 0 |
| DayTuesday | 0.0934492 | 0.0109975 | 8.497326 | 0 |
| DayWednesday | 0.1894845 | 0.0107878 | 17.564721 | 0 |
The above inferential table of regression coefficients shows that both average temperature and precipitation are statistically significant. That is, the number of people crossing the Queensboro Bridge differs across weather conditions, and there is statistical evidence to support this discrepancy. However, statistical significance is not equivalent to practical importance. Moreover, the sample size could affect the statistical significance of the variables.
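To make the distinction between statistical significance and practical importance concrete, the sketch below recomputes the Wald z statistic and a two-sided normal p-value for the AveTemp row of the count model table, then converts the coefficient into a percentage change per degree. The function name `wald_test` is our own, and the inputs are the rounded values printed in the table, so the result can differ slightly from the reported z value.

```python
import math

def wald_test(estimate, std_error):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = estimate / std_error
    # P(|Z| > |z|) for a standard normal Z, via the complementary error function
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# AveTemp row of the count model table (rounded values as printed)
ave_temp_estimate = 0.0060353
ave_temp_std_error = 0.0005483

z, p_value = wald_test(ave_temp_estimate, ave_temp_std_error)

# Multiplicative change in the expected count per one-degree increase in AveTemp
ratio_per_degree = math.exp(ave_temp_estimate)

print(f"z = {z:.2f}, p = {p_value:.2e}")
print(f"expected count changes by {100 * (ratio_per_degree - 1):.2f}% per degree")
```

The p-value is extremely small, yet the estimated change per degree is well under one percent, which illustrates why a significant coefficient need not be a large effect.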
The other way to look at the model is its appropriateness. The number of people crossing the bridge depends on the population size, so ignoring the population size means the information in the sample was not used effectively. In the next subsection, we model the bridge crossing rate, which involves the population size.
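The difference between the two specifications can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the report: it fits a log-link Poisson regression by iteratively reweighted least squares (IRLS) using only the six rows shown in the data table, with a reduced set of predictors (centered AveTemp and the rain indicator, no Day effects). The helper names `solve_linear` and `fit_poisson` are our own, and the estimates will not match the reported tables.

```python
import math

def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_poisson(X, y, offset=None, n_iter=25):
    """Fit a log-link Poisson regression by IRLS; returns (coefficients, fitted means)."""
    n, p = len(y), len(X[0])
    offset = offset if offset is not None else [0.0] * n
    eta = [math.log(yi) for yi in y]  # start from mu = y (requires y > 0)
    for _ in range(n_iter):
        mu = [math.exp(e) for e in eta]
        # Working response with the offset removed; the IRLS weights are mu
        z = [eta[i] - offset[i] + (y[i] - mu[i]) / mu[i] for i in range(n)]
        xtwx = [[sum(mu[i] * X[i][j] * X[i][k] for i in range(n)) for k in range(p)]
                for j in range(p)]
        xtwz = [sum(mu[i] * X[i][j] * z[i] for i in range(n)) for j in range(p)]
        beta = solve_linear(xtwx, xtwz)
        eta = [offset[i] + sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    return beta, [math.exp(e) for e in eta]

# The six rows shown in the data table
crossings = [3216, 3579, 4230, 3861, 5862, 5251]      # QueensboroBridge
total = [11867, 13995, 16067, 13925, 23110, 21861]    # Total
ave_temp = [78.45, 80.05, 79.10, 76.45, 78.00, 73.05]  # AveTemp
rain = [1, 0, 1, 0, 0, 0]                              # NewPrecipitation.cat

mean_temp = sum(ave_temp) / len(ave_temp)
# Design matrix: intercept, centered AveTemp, rain indicator
X = [[1.0, t - mean_temp, float(r)] for t, r in zip(ave_temp, rain)]

# Count model: no offset. Rate model: log(Total) as an offset.
count_beta, count_fit = fit_poisson(X, crossings)
rate_beta, rate_fit = fit_poisson(X, crossings, offset=[math.log(t) for t in total])

print("count model coefficients:", count_beta)
print("rate model coefficients: ", rate_beta)
```

The only difference between the two calls is the `offset` argument, which is how the population size enters the rate model without an estimated coefficient.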
The following model assesses the possible relationship between the Queensboro Bridge crossing rate and precipitation. We also adjust this relationship for average temperature and day of the week.
| | Estimate | Std. Error | z value | Pr(>\|z\|) |
|---|---|---|---|---|
| (Intercept) | -1.2569023 | 0.0432727 | -29.046054 | 0.0000000 |
| AveTemp | -0.0016134 | 0.0005475 | -2.946729 | 0.0032115 |
| NewPrecipitation.cat1 | 0.0476215 | 0.0071899 | 6.623369 | 0.0000000 |
| DayMonday | -0.0678714 | 0.0102390 | -6.628717 | 0.0000000 |
| DaySaturday | -0.0357733 | 0.0107891 | -3.315688 | 0.0009142 |
| DaySunday | -0.0817864 | 0.0108370 | -7.546972 | 0.0000000 |
| DayThursday | -0.0350102 | 0.0105830 | -3.308148 | 0.0009392 |
| DayTuesday | -0.0477855 | 0.0109113 | -4.379469 | 0.0000119 |
| DayWednesday | -0.0433977 | 0.0106518 | -4.074197 | 0.0000462 |
The above table indicates that the log of the bridge crossing rate differs only slightly across the days of the week, and that it also differs across weather conditions. For example, the log rate on days with rain (NewPrecipitation.cat1) is slightly higher, by about 0.048, than on days without rain, which is the baseline NewPrecipitation.cat0. The regression coefficients for Day represent the change in log rate between each day of the week and the reference day, Friday. The same interpretation applies to the change in log rate between the precipitation groups.
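Because the coefficients are on the log-rate scale, exponentiating them gives rate ratios relative to the baseline (Friday, no rain), which are often easier to read. The sketch below does this for the rounded coefficients printed in the rate model table; the dictionary name is our own.

```python
import math

# Coefficients copied from the rate model table (log-rate scale)
rate_model_coefficients = {
    "AveTemp": -0.0016134,
    "NewPrecipitation.cat1": 0.0476215,
    "DayMonday": -0.0678714,
    "DaySaturday": -0.0357733,
    "DaySunday": -0.0817864,
    "DayThursday": -0.0350102,
    "DayTuesday": -0.0477855,
    "DayWednesday": -0.0433977,
}

# exp(coefficient) is the rate ratio relative to the baseline level
# (for AveTemp, the ratio per one-degree increase)
rate_ratios = {name: math.exp(beta) for name, beta in rate_model_coefficients.items()}

for name, ratio in rate_ratios.items():
    print(f"{name:<22} rate ratio = {ratio:.4f} ({100 * (ratio - 1):+.2f}%)")
```

A ratio above 1 means a higher crossing rate than the baseline, and a ratio below 1 means a lower one.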
| | Estimate | Std. Error | z value | Pr(>\|z\|) |
|---|---|---|---|---|
| (Intercept) | -1.2569023 | 0.0432727 | -29.046054 | 0.0000000 |
| AveTemp | -0.0016134 | 0.0005475 | -2.946729 | 0.0032115 |
| NewPrecipitation.cat1 | 0.0476215 | 0.0071899 | 6.623369 | 0.0000000 |
| DayMonday | -0.0678714 | 0.0102390 | -6.628717 | 0.0000000 |
| DaySaturday | -0.0357733 | 0.0107891 | -3.315688 | 0.0009142 |
| DaySunday | -0.0817864 | 0.0108370 | -7.546972 | 0.0000000 |
| DayThursday | -0.0350102 | 0.0105830 | -3.308148 | 0.0009392 |
| DayTuesday | -0.0477855 | 0.0109113 | -4.379469 | 0.0000119 |
| DayWednesday | -0.0433977 | 0.0106518 | -4.074197 | 0.0000462 |
The dispersion index extracted from the quasi-Poisson model object is shown below.
| Dispersion |
|---|
| 0 |
The reported dispersion index is 0. Overdispersion is indicated by an index well above 1, so this value gives no evidence of overdispersion, and we stay with the regular Poisson regression model.
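For readers who want to check a dispersion index by hand, a common estimate is the Pearson chi-square statistic divided by the residual degrees of freedom. The sketch below implements that definition. The function name `dispersion_index` is our own, and the example counts and fitted means are hypothetical values chosen for illustration, not values from this data set.

```python
def dispersion_index(observed, fitted, n_params):
    """Pearson chi-square statistic divided by the residual degrees of freedom."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    residual_df = len(observed) - n_params
    if residual_df <= 0:
        raise ValueError("need more observations than parameters")
    # Pearson residuals for a Poisson model: (y - mu) / sqrt(mu)
    pearson = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return pearson / residual_df

# Hypothetical counts and fitted means, for illustration only
example_observed = [12, 18, 33, 37]
example_fitted = [10.0, 20.0, 30.0, 40.0]

example_dispersion = dispersion_index(example_observed, example_fitted, n_params=2)
print(example_dispersion)
```

An index near 1 is consistent with the Poisson assumption that the variance equals the mean, while values well above 1 suggest overdispersion.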
The inferential tables of the Poisson regression models in the previous sections give numerical information about the potential discrepancy across the temperature and precipitation groups. Next, we create a graph to visualize the relationship between the bridge crossing rate and the temperature and precipitation groups.
First, each group has a trend line that reflects the relationship between the bridge crossing rate and the weather conditions. We next find the rates for the combinations of precipitation and temperature groups based on the following working rate model. \[ \text{log-rate} = -1.2569 - 0.0016 \times \text{AveTemp} + 0.0476 \times \text{NewPrecipitation.cat1} - 0.0679 \times \text{Monday} - 0.0358 \times \text{Saturday} - 0.0818 \times \text{Sunday} - 0.0350 \times \text{Thursday} - 0.0478 \times \text{Tuesday} - 0.0434 \times \text{Wednesday} \]
We use NewPrecipitation.cat as the horizontal axis and the estimated bridge crossing rate as the vertical axis to draw the trend lines for each of the average temperature groups.
To make the visual representation, we calculate the crossing rates for the corresponding combinations of average temperature and NewPrecipitation.cat group, based on the regression equation with the coefficients given in the above table.
For example, exp(-1.2569 - 0.0016 × AveTemp) gives the crossing rate for the baseline precipitation group, NewPrecipitation.cat0, and the baseline day of the week, Friday. Similarly, exp(-1.2569 - 0.0679 - 0.0016 × AveTemp) gives the crossing rate for the baseline precipitation group on Monday. Following the same pattern, we can find the bridge crossing rate for each combination of day of the week and precipitation group.
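The calculation described above can be sketched as a small function that evaluates the working rate model for any combination of day, precipitation group, and average temperature. The coefficients are the rounded values from the rate model table. The function name `crossing_rate` and the temperature of 78.0 used in the loop are our own choices for illustration.

```python
import math

# Coefficients from the rate model table (log-rate scale)
INTERCEPT = -1.2569023
AVE_TEMP = -0.0016134
RAIN = 0.0476215  # NewPrecipitation.cat1
DAY_EFFECTS = {
    "Friday": 0.0,  # reference day
    "Monday": -0.0678714,
    "Saturday": -0.0357733,
    "Sunday": -0.0817864,
    "Thursday": -0.0350102,
    "Tuesday": -0.0477855,
    "Wednesday": -0.0433977,
}

def crossing_rate(ave_temp, rain, day):
    """Estimated crossing rate for a given AveTemp, rain indicator (0 or 1), and day."""
    log_rate = INTERCEPT + AVE_TEMP * ave_temp + RAIN * rain + DAY_EFFECTS[day]
    return math.exp(log_rate)

# Rates for every day and precipitation group at an illustrative AveTemp of 78.0
for rain in (0, 1):
    for day in DAY_EFFECTS:
        print(f"rain={rain} {day:<9} rate={crossing_rate(78.0, rain, day):.4f}")
```

Evaluating the function over a range of temperatures instead of a single value gives the points needed to draw each trend line.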
The regression model based only on the number of people crossing the Queensboro Bridge is not appropriate, because the population size is an important variable in the study of the distribution. Including the population size in the regression model changes the estimated effect and the statistical significance of precipitation, as the following table, which includes log(Total) as a covariate, shows.
| | Estimate | Std. Error | z value | Pr(>\|z\|) |
|---|---|---|---|---|
| (Intercept) | -0.0018024 | 0.1806566 | -0.0099771 | 0.9920396 |
| DayMonday | -0.0384168 | 0.0110458 | -3.4779711 | 0.0005052 |
| DaySaturday | -0.0456841 | 0.0108735 | -4.2014379 | 0.0000265 |
| DaySunday | -0.0887382 | 0.0108828 | -8.1539484 | 0.0000000 |
| DayThursday | -0.0074731 | 0.0112700 | -0.6630979 | 0.5072678 |
| DayTuesday | -0.0273199 | 0.0112832 | -2.4212960 | 0.0154653 |
| DayWednesday | -0.0122433 | 0.0115255 | -1.0622770 | 0.2881099 |
| AveTemp | -0.0006360 | 0.0005644 | -1.1269015 | 0.2597841 |
| NewPrecipitation.cat1 | -0.0009709 | 0.0099111 | -0.0979606 | 0.9219636 |
| log(Total) | 0.8646609 | 0.0189278 | 45.6819959 | 0.0000000 |
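One way to read the log(Total) row is to compare its estimate with 1, the value that treating log(Total) as an offset would impose. The sketch below computes a Wald statistic for that comparison from the rounded values in the table. This check is our own addition rather than part of the original analysis, and the variable names are ours.

```python
import math

# log(Total) row of the table (rounded values as printed)
log_total_estimate = 0.8646609
log_total_std_error = 0.0189278

# Wald statistic for H0: coefficient = 1 (the value an offset would impose)
z_offset = (log_total_estimate - 1.0) / log_total_std_error
# Two-sided normal p-value
p_offset = math.erfc(abs(z_offset) / math.sqrt(2))

print(f"z = {z_offset:.2f}, p = {p_offset:.2e}")
```

A large absolute z value would suggest that crossings grow somewhat less than proportionally with Total, so the offset form is a simplification. With a data set this small, the result should be read cautiously.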
When it is not raining, the bridge crossing rate is lowest on Sundays and highest on Fridays. There is a negative, approximately linear relationship between AveTemp and the bridge crossing rate: as the average temperature goes up, the bridge crossing rate decreases. One possible reason is that people prefer not to walk outside when it is hot.
Moreover, when it is raining, the bridge crossing rate is also lowest on Sundays and highest on Fridays. There is again a negative, approximately linear relationship between AveTemp and the bridge crossing rate: as the average temperature goes up, the bridge crossing rate decreases. The reason could again be that people prefer not to walk outside when it is hot.
Also, there appears to be no interaction effect in either graph because all the lines are parallel.
Unfortunately, this is only a small data set with limited information. All conclusions in this report are based only on the given data set.