We develop regression models for analyzing the impact of ride-sharing services on transit demand. Our analysis is motivated by the loss of transit ridership over the past several years. Those losses overlap with the introduction of Transportation Network Companies (TNCs), and the apparent correlation between the two has led to speculation that TNCs have caused the ridership loss. However, many factors other than TNC ridership could be driving it, including changing populations, shifting land use patterns, transit service changes, and macro-economic factors. Our model allows for disaggregate analysis of long-term transit ridership data from Chicago between 2010 and 2020, before and after the introduction of TNC services, along with TNC trip patterns, to determine relationships between TNC ridership and transit ridership for similar areas.
The TNC data provided by the city of Chicago contain the hourly number of trips between each pair of origin and destination community areas. We have summarized this table to daily trips; the resulting data set has 794,179 records and ranges from 2018-11-01 to 2020-02-29. Each record has an origin, a destination, a date, and averages over the attributes of the trips.
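The hourly-to-daily summarization described above can be sketched in base R. This is a minimal illustration on a toy table, not the code used for the analysis: the column names (`origin`, `destination`, `start`, `ntrips`, `cost`) are hypothetical, and weighting the daily average cost by the number of trips is an assumption about how the averages were formed.

```r
# Toy hourly table; column names are hypothetical, not the city's schema.
hourly <- data.frame(
  origin      = c(8, 8, 8, 32),
  destination = c(32, 32, 32, 8),
  start       = as.POSIXct(c("2018-11-01 08:00:00", "2018-11-01 17:00:00",
                             "2018-11-02 08:00:00", "2018-11-01 09:00:00"),
                           tz = "UTC"),
  ntrips      = c(3, 1, 2, 4),
  cost        = c(10, 20, 12, 15)   # mean fare of the trips in that hour
)

# Calendar date of each hourly record, stored as a "YYYY-MM-DD" string
hourly$date <- format(hourly$start, "%Y-%m-%d", tz = "UTC")
# Total fare per record, so the daily mean can be weighted by trip count
hourly$cost_total <- hourly$cost * hourly$ntrips

# Sum trips and fares for every origin/destination/date combination
daily <- aggregate(cbind(ntrips, cost_total) ~ origin + destination + date,
                   data = hourly, FUN = sum)

# Trip-weighted average cost per origin/destination/date (an assumption)
daily$cost <- daily$cost_total / daily$ntrips
daily$cost_total <- NULL
print(daily)
```

Summing the fare totals before dividing keeps the daily average from being dominated by hours with very few trips.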
Statistic | ntrips | cost | ttime
---|---|---|---
Min. | 1.0 | 0.67 | 0.000
1st Qu. | 2.0 | 13.09 | 1.333
Median | 6.0 | 17.48 | 2.333
Mean | 103.1 | 19.19 | 155.038
3rd Qu. | 25.0 | 22.55 | 307.500
Max. | 29556.0 | 198.94 | 999.000
The average number of daily trips is approximately 102.2, and the daily plot is given below.
We started by analyzing origin-destination flows along the Blue Line of the L system. The trip table has 78 community areas (CAs) and 54.5 million trips, and spans 517 days from 2018-11-01 to 2020-03-31. If the origin or destination CA is outside of Chicago, the record was deleted.
barplot(height = tmp$ct,names.arg = tmp$my)
We use a model similar to that proposed by Bernardinelli, Clayton, Pascutto, Montomoli, Ghislandi, and Songini (1995), which represents the spatio-temporal pattern in the mean response with spatially varying linear time trends. We assume the data are Gaussian. The model estimates autocorrelated linear time trends for each community area (areal unit). It is therefore appropriate when the goal of the analysis is to identify which areas exhibit increasing or decreasing (linear) trends in the response over time. The full model specification is given below.
We assume we have a set of \(K\) non-overlapping areal units \(S_1,\ldots,S_K\) and that data are recorded for each areal unit at \(N\) consecutive time steps \(t = 1,\ldots,N\). The general hierarchical model is then \[ \begin{aligned} Y_{kt} \sim & f(y_{kt} \mid \mu_{kt},\nu^2)\\ g(\mu_{kt}) = & x^T_{kt}\beta+O_{kt} + \psi_{kt}\\ \beta \sim & N(\mu_{\beta},\Sigma_{\beta}) \end{aligned} \] Here \(x_{kt}\) is the vector of covariates, \(O_{kt}\) is an offset, and \(\psi_{kt}\) is a latent spatio-temporal component. Given that we assume Normal (continuous) data, we take \(f\) to be the Normal density and the link function \(g\) to be the identity.
The spatio-temporal correlations are modeled using \[ \psi_{kt} = \beta_1 + \phi_k + (\alpha + \delta_k)\left(\dfrac{t - \bar t}{N}\right) \] where \(\bar t = (1/N)\sum_{t=1}^N t\). Thus, the temporal pattern is modeled by a local trend \(\delta_k\) and the spatial correlations are modeled via \(\phi_k\). Both are random effects that are conditionally normal given their values in the other areas and the \(K \times K\) neighbourhood matrix \(W = (w_{kj})\): \[ \begin{aligned} \phi_k \mid \phi_{-k}, W \sim & N\left(\dfrac{\rho_{int}}{c}\sum_{j=1}^Kw_{kj}\phi_j, \dfrac{\tau^2_{int}}{c}\right),~~~c = \rho_{int}\sum_{j=1}^Kw_{kj} + 1 - \rho_{int}\\ \delta_k \mid \delta_{-k},W \sim & N\left(\dfrac{\rho_{slo}}{q}\sum_{j=1}^Kw_{kj}\delta_j, \dfrac{\tau^2_{slo}}{q}\right),~~~q = \rho_{slo}\sum_{j=1}^Kw_{kj} + 1 - \rho_{slo}\\ \tau^2_{int},\tau^2_{slo} \sim & IG(a,b)\\ \rho_{int},\rho_{slo} \sim & Uniform(0,1)\\ \alpha \sim & N(\mu_{\alpha},\sigma^2_{\alpha}) \end{aligned} \] Each community area has its own linear trend with intercept \(\beta_1 + \phi_k\) and slope \(\alpha + \delta_k\). Here \(\rho_{int},\rho_{slo}\) are spatial dependence parameters, with values of one corresponding to strong spatial smoothness that is equivalent to the intrinsic CAR prior proposed by Besag et al. (1991), while values of zero correspond to independence (for example, if \(\rho_{slo} = 0\) then \(\delta_k \sim N(0, \tau^2_{slo})\)).
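The two limiting cases of the spatial dependence parameter can be checked numerically. The sketch below is illustrative only: it assumes the Leroux-type form of the conditional distribution, in which the conditional precision is proportional to \(\rho\sum_j w_{kj} + 1 - \rho\), and it uses a hypothetical four-area adjacency matrix and made-up random-effect values. The function name `car_conditional` is introduced here and is not part of any package.

```r
# Full-conditional mean and variance of one random effect, assuming the
# Leroux-type CAR form: precision scale = rho * sum_j w_kj + 1 - rho.
car_conditional <- function(k, phi, W, rho, tau2) {
  prec_scale <- rho * sum(W[k, ]) + 1 - rho
  list(mean = rho * sum(W[k, ] * phi) / prec_scale,
       var  = tau2 / prec_scale)
}

# Hypothetical example: four areas on a line (1-2-3-4), binary adjacency
W <- matrix(0, 4, 4)
W[cbind(1:3, 2:4)] <- 1
W <- W + t(W)
phi <- c(-1.0, 0.5, 2.0, 0.0)   # illustrative values only

# rho = 1: intrinsic CAR, the mean is the average of the neighbours
car_conditional(2, phi, W, rho = 1, tau2 = 4)
# rho = 0: independence, mean zero and variance tau2
car_conditional(2, phi, W, rho = 0, tau2 = 4)
```

Intermediate values of `rho` shrink the conditional mean from the neighbour average towards zero, which is how the model interpolates between spatial smoothing and independence.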
Now let's look at the results of the models.
## Median 2.5% 97.5%
## (Intercept) 15.5738 8.9403 21.7206
## bus 0.4777 0.3840 0.5779
## alpha -0.8258 -2.6406 0.8807
## tau2.int 15271.9249 10542.0553 21930.1200
## tau2.slo 0.0082 0.0022 0.0810
## nu2 59.8469 54.7327 66.0348
## rho.int 0.7499 0.4120 0.9586
## rho.slo 0.3700 0.0185 0.9095
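- Bus and TNC ridership move in the same direction (\(\beta \approx 0.48\), with a 95% credible interval of 0.38 to 0.58).
- The overall trend for TNC is downward (\(\alpha \approx -0.83\)), although the 95% credible interval (-2.64 to 0.88) includes zero.
- There is a strong time-independent local effect: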
## var1 var2 var3 var4 var5
## 5% -29.36917 -24.27986 39.96184 -17.58459 -16.72675
## 95% -21.49325 -16.25814 47.95857 -9.86926 -9.59751
- There is no evidence of a local effect on the trend: the 5% to 95% intervals below all contain zero.
## var1 var2 var3 var4 var5
## 5% -0.1684918 -0.1504134 -0.1411595 -0.1215328 -0.1194774
## 95% 0.1647867 0.1393985 0.1496100 0.1323126 0.1238643
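The reading of the two quantile outputs above can be made explicit with a small check of whether each interval excludes zero. This sketch copies the printed 5% and 95% values; treating the first output as the area-level intercept effects \(\phi_k\) and the second as the slope effects \(\delta_k\) is an assumption based on the accompanying text, and `excludes_zero` is a helper introduced only for illustration.

```r
# 5% and 95% posterior quantiles copied from the two outputs above
phi_q <- rbind(
  lower = c(-29.36917, -24.27986, 39.96184, -17.58459, -16.72675),
  upper = c(-21.49325, -16.25814, 47.95857,  -9.86926,  -9.59751))
delta_q <- rbind(
  lower = c(-0.1684918, -0.1504134, -0.1411595, -0.1215328, -0.1194774),
  upper = c( 0.1647867,  0.1393985,  0.1496100,  0.1323126,  0.1238643))
colnames(phi_q) <- colnames(delta_q) <- paste0("var", 1:5)

# TRUE when the interval lies entirely on one side of zero
excludes_zero <- function(q) q["lower", ] > 0 | q["upper", ] < 0

excludes_zero(phi_q)    # intercept effects (assumed to be phi_k)
excludes_zero(delta_q)  # slope effects (assumed to be delta_k)
```

For the five areas shown, every intercept interval lies on one side of zero, while every slope interval straddles it.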
## Median 2.5% 97.5%
## (Intercept) 127.5000 89.8817 164.2657
## train 0.4308 0.2895 0.5744
## alpha 11.5450 -7.8647 28.3170
## tau2.int 388636.9237 229171.2010 740640.3867
## tau2.slo 0.0080 0.0021 0.1035
## nu2 3288.6626 2913.6250 3711.6146
## rho.int 0.1601 0.0090 0.5657
## rho.slo 0.3755 0.0191 0.9173
## Median 2.5% 97.5%
## (Intercept) -33.3342 -100.1132 32.4743
## bus 2.4717 1.5956 3.3383
## train 0.2772 0.1223 0.4227
## alpha 0.0080 -17.0117 17.4833
## tau2.int 328802.0112 200068.5778 628969.1932
## tau2.slo 0.0083 0.0021 0.0912
## nu2 3130.1253 2767.8649 3530.2913
## rho.int 0.1176 0.0053 0.4641
## rho.slo 0.3699 0.0162 0.9176
## Median 2.5% 97.5%
## (Intercept) 0.6936 -0.2081 1.5101
## log(bus) 0.7060 0.5045 0.9032
## log(train) 0.0753 -0.0333 0.1950
## alpha -0.0462 -0.0922 -0.0021
## tau2.int 2.8196 1.8534 4.5280
## tau2.slo 0.0379 0.0148 0.0808
## nu2 0.0217 0.0190 0.0247
## rho.int 0.6356 0.2739 0.9211
## rho.slo 0.6134 0.1426 0.9296
## Median 2.5% 97.5%
## (Intercept) 0.2409 -0.6849 1.2140
## log(bus) 0.4650 0.2521 0.6977
## log(train) 0.3644 0.2593 0.4764
## alpha -0.0259 -0.0742 0.0212
## tau2.int 2.5868 1.7078 4.1574
## tau2.slo 0.0259 0.0120 0.0530
## rho.int 0.6294 0.2778 0.9172
## rho.slo 0.7416 0.2451 0.9695