We develop regression models for analyzing the impact of ride-sharing services on transit demand. Our analysis is motivated by the loss of transit ridership over the past several years. Those losses overlap with the introduction of Transportation Network Companies (TNCs), and the apparent correlation between the two has led to speculation that TNCs have caused the loss of transit ridership. However, many factors other than TNC ridership could be driving the loss, including changing populations, shifting land use patterns, transit service changes, and macro-economic factors. Our model allows for disaggregate analysis of long-term transit ridership data from Chicago between 2010 and 2020, before and after the introduction of TNC services, along with TNC trip patterns, to determine relationships between TNC ridership and transit ridership for similar areas.

Data

Transportation Network Companies (TNC)

The TNC data provided by the City of Chicago contain the hourly number of trips between each pair of origin and destination community areas. We have aggregated this table to daily trips; the resulting data set has 794179 records and ranges from 2018-11-01 to 2020-02-29. Each record has an origin, a destination, a date, and averages of the trip attributes.
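The hourly-to-daily aggregation described above can be sketched in base R as follows. This is a minimal illustration on a toy table, not the code used for the report: the column names (`origin`, `destination`, `timestamp`, `ntrips`, `cost`) and the choice of a trip-weighted average for `cost` are assumptions.

```r
# Toy hourly trip table; column names are illustrative assumptions,
# not the schema of the City of Chicago data set.
hourly <- data.frame(
  origin      = c(8, 8, 8, 32, 32),
  destination = c(32, 32, 32, 8, 8),
  timestamp   = as.POSIXct(c("2018-11-01 08:00:00", "2018-11-01 17:00:00",
                             "2018-11-02 08:00:00", "2018-11-01 09:00:00",
                             "2018-11-01 18:00:00"), tz = "UTC"),
  ntrips      = c(10, 20, 5, 4, 6),
  cost        = c(12, 15, 10, 20, 10)
)

# Calendar day of each hourly record
hourly$date <- as.Date(hourly$timestamp)

# Total cost per hourly record, so that the daily average can be trip-weighted
hourly$cost_total <- hourly$cost * hourly$ntrips

# Sum trips and total cost within each origin/destination/date cell
daily <- aggregate(cbind(ntrips, cost_total) ~ origin + destination + date,
                   data = hourly, FUN = sum)

# Trip-weighted average cost per daily record (assumed averaging rule)
daily$cost <- daily$cost_total / daily$ntrips
daily$cost_total <- NULL

print(daily)
```

Weighting by the number of trips keeps busy hours from being under-represented in the daily average; an unweighted mean of the hourly averages would be a different, equally plausible reading of "averages".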

     ntrips             cost            ttime
 Min.   :    1.0   Min.   :  0.67   Min.   :  0.000
 1st Qu.:    2.0   1st Qu.: 13.09   1st Qu.:  1.333
 Median :    6.0   Median : 17.48   Median :  2.333
 Mean   :  103.1   Mean   : 19.19   Mean   :155.038
 3rd Qu.:   25.0   3rd Qu.: 22.55   3rd Qu.:307.500
 Max.   :29556.0   Max.   :198.94   Max.   :999.000

The average number of daily trips is approximately 102.2; the daily plot is given below.

We started by analyzing origin-destination flows along the Blue Line of the L-system. The trip table has 78 community areas (CAs) and 54.5m trips and spans 517 days from 2018-11-01 to 2020-03-31. If the origin or destination CA of a record was outside of Chicago, the record was deleted.
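A minimal base R sketch of the record filter and of the stated time span is given below. It assumes that a trip end outside Chicago is encoded as a missing (`NA`) community area, which may not match the actual encoding, and the column names are illustrative.

```r
# Toy daily trip table; NA is assumed here to mark a trip end outside Chicago.
trips <- data.frame(
  origin      = c(8, NA, 32, 76),
  destination = c(32, 8, NA, 8),
  ntrips      = c(30, 2, 1, 12)
)

# Keep only records whose origin and destination are both inside the city
inside <- !is.na(trips$origin) & !is.na(trips$destination)
chicago_trips <- trips[inside, ]

# Number of calendar days covered, counting both end dates
n_days <- as.numeric(as.Date("2020-03-31") - as.Date("2018-11-01")) + 1

print(chicago_trips)
print(n_days)
```

The inclusive day count reproduces the 517 days quoted in the text.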

barplot(height = tmp$ct,names.arg = tmp$my)

L-System

barplot(height = tmp$ct,names.arg = tmp$my)

Bus

barplot(height = tmp$ct,names.arg = tmp$my)

Accessibility Maps

We use a model similar to that proposed by Bernardinelli, Clayton, Pascutto, Montomoli, Ghislandi, and Songini (1995), which represents the spatio-temporal pattern in the mean response with spatially varying linear time trends. We assume the data are Gaussian. The model estimates autocorrelated linear time trends for each community area (areal unit). It is therefore appropriate when the goal of the analysis is to identify which areas exhibit increasing or decreasing (linear) trends in the response over time. The full model specification is given below.

We assume we have a set of \(K\) non-overlapping areal units \(S_1,\ldots,S_K\) and that data are recorded for each areal unit at \(N\) consecutive time steps \(t = 1,\ldots,N\). The general hierarchical model is then \[ \begin{aligned} Y_{kt} \sim & f(y_{kt} \mid \mu_{kt},\nu^2)\\ g(\mu_{kt}) = & x^T_{kt}\beta+O_{kt} + \psi_{kt}\\ \beta \sim & N(\mu_{\beta},\Sigma_{\beta}) \end{aligned} \] where \(x_{kt}\) is a vector of covariates, \(O_{kt}\) is a known offset, and \(\psi_{kt}\) is a spatio-temporal random effect. Because we assume Normal (continuous) data, we take \(f\) to be the Normal density and the link function \(g\) to be the identity function.
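Under the Gaussian assumption with identity link stated above, the first two lines of the hierarchy reduce to the following form. This is only a restatement under that assumption, not an additional model: \[ Y_{kt} \sim N(\mu_{kt}, \nu^2), \qquad \mu_{kt} = x^T_{kt}\beta + O_{kt} + \psi_{kt} \] so \(\nu^2\) plays the role of the observation-level noise variance.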

The spatio-temporal correlations are modeled using \[ \psi_{kt} = \beta_1 + \phi_k + (\alpha + \delta_k)\left(\dfrac{t - \bar t}{N}\right) \] where \(\bar t = (1/N)\sum_{t=1}^N t\). Thus, the temporal pattern is modeled by a local trend \(\delta_k\) and the spatial correlations are modeled via \(\phi_k\). Both are random effects with conditionally normal priors \[ \begin{aligned} \phi_k \mid \phi_{-k}, W \sim & N\left(\dfrac{\rho_{int}}{c}\sum_{j=1}^Kw_{kj}\phi_j, \dfrac{\tau^2_{int}}{c}\right),~~~c = \rho_{int}\sum_{j=1}^Kw_{kj} + 1 - \rho_{int}\\ \delta_k \mid \delta_{-k},W \sim & N\left(\dfrac{\rho_{slo}}{q}\sum_{j=1}^Kw_{kj}\delta_j, \dfrac{\tau^2_{slo}}{q}\right),~~~q = \rho_{slo}\sum_{j=1}^Kw_{kj} + 1 - \rho_{slo}\\ \tau^2_{int},\tau^2_{slo} \sim & IG(a,b)\\ \rho_{int},\rho_{slo} \sim & Uniform(0,1)\\ \alpha \sim & N(\mu_{\alpha},\sigma^2_{\alpha}) \end{aligned} \] where \(W = (w_{kj})\) is the \(K \times K\) neighborhood matrix of the areal units. Each community area has its own linear trend with intercept \(\beta_1 + \phi_k\) and slope \(\alpha + \delta_k\). Here \(\rho_{int},\rho_{slo}\) are spatial dependence parameters, with values of one corresponding to strong spatial smoothness that is equivalent to the intrinsic CAR prior proposed by Besag et al. (1991), while values of zero correspond to independence (for example, if \(\rho_{slo} = 0\) then \(\delta_k \sim N(0, \tau^2_{slo})\)).
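The two limiting cases of the spatial dependence parameter can be checked numerically with a small base R sketch. It assumes the Leroux-type form of the full conditional, in which the conditional precision is proportional to \(\rho\sum_j w_{kj} + 1 - \rho\); the helper `car_conditional`, the 4-area map, and the values of `phi` are hypothetical and chosen only for illustration.

```r
# Leroux-type CAR full conditional for area k (assumed form):
#   mean = rho * sum_j w_kj * phi_j / (rho * sum_j w_kj + 1 - rho)
#   var  = tau2 / (rho * sum_j w_kj + 1 - rho)
car_conditional <- function(k, phi, W, rho, tau2) {
  w_k   <- W[k, ]
  denom <- rho * sum(w_k) + 1 - rho
  list(mean = rho * sum(w_k * phi) / denom, var = tau2 / denom)
}

# Hypothetical map of 4 areas arranged in a line: 1 - 2 - 3 - 4
W <- matrix(0, 4, 4)
W[cbind(1:3, 2:4)] <- 1
W <- W + t(W)          # symmetric adjacency matrix with zero diagonal
phi <- c(-1, 0.5, 2, 1)

# rho = 1: the conditional mean is the average of the neighbors of area 2
icar <- car_conditional(2, phi, W, rho = 1, tau2 = 1)

# rho = 0: independence, mean 0 and variance tau2
indep <- car_conditional(2, phi, W, rho = 0, tau2 = 1)

# Scaled time covariate (t - tbar) / N that multiplies the slope alpha + delta_k
N <- 5
time_cov <- (seq_len(N) - mean(seq_len(N))) / N

print(icar)
print(indep)
print(time_cov)
```

Because the time covariate is centered, the intercept \(\beta_1 + \phi_k\) describes the level of area \(k\) at the middle of the study period rather than at \(t = 0\).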

Now let's look at the results of the models.

Findings:

With bus

##                 Median       2.5%      97.5%
## (Intercept)    15.5738     8.9403    21.7206
## bus             0.4777     0.3840     0.5779
## alpha          -0.8258    -2.6406     0.8807
## tau2.int    15271.9249 10542.0553 21930.1200
## tau2.slo        0.0082     0.0022     0.0810
## nu2            59.8469    54.7327    66.0348
## rho.int         0.7499     0.4120     0.9586
## rho.slo         0.3700     0.0185     0.9095
##          var1      var2     var3      var4      var5
## 5%  -29.36917 -24.27986 39.96184 -17.58459 -16.72675
## 95% -21.49325 -16.25814 47.95857  -9.86926  -9.59751

- There is no local effect on the trend: each of the 5%-95% intervals reported below contains zero.

##           var1       var2       var3       var4       var5
## 5%  -0.1684918 -0.1504134 -0.1411595 -0.1215328 -0.1194774
## 95%  0.1647867  0.1393985  0.1496100  0.1323126  0.1238643
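The statement that there is no local effect on the trend can be checked directly from the quantiles printed above. The sketch below copies the five pairs of 5% and 95% quantiles and tests whether each interval contains zero; reading `var1` to `var5` as area-specific slope effects is an assumption, since the output itself does not label them.

```r
# 5% and 95% posterior quantiles copied from the output above (var1..var5)
q05 <- c(-0.1684918, -0.1504134, -0.1411595, -0.1215328, -0.1194774)
q95 <- c( 0.1647867,  0.1393985,  0.1496100,  0.1323126,  0.1238643)

# An interval contains zero when its lower end is negative and its upper end positive
contains_zero <- q05 < 0 & q95 > 0

print(contains_zero)
print(all(contains_zero))
```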

With L-System

##                  Median        2.5%       97.5%
## (Intercept)    127.5000     89.8817    164.2657
## train            0.4308      0.2895      0.5744
## alpha           11.5450     -7.8647     28.3170
## tau2.int    388636.9237 229171.2010 740640.3867
## tau2.slo         0.0080      0.0021      0.1035
## nu2           3288.6626   2913.6250   3711.6146
## rho.int          0.1601      0.0090      0.5657
## rho.slo          0.3755      0.0191      0.9173

With L-System and bus

##                  Median        2.5%       97.5%
## (Intercept)    -33.3342   -100.1132     32.4743
## bus              2.4717      1.5956      3.3383
## train            0.2772      0.1223      0.4227
## alpha            0.0080    -17.0117     17.4833
## tau2.int    328802.0112 200068.5778 628969.1932
## tau2.slo         0.0083      0.0021      0.0912
## nu2           3130.1253   2767.8649   3530.2913
## rho.int          0.1176      0.0053      0.4641
## rho.slo          0.3699      0.0162      0.9176

With L-System and bus on log-scale

##              Median    2.5%   97.5%
## (Intercept)  0.6936 -0.2081  1.5101
## log(bus)     0.7060  0.5045  0.9032
## log(train)   0.0753 -0.0333  0.1950
## alpha       -0.0462 -0.0922 -0.0021
## tau2.int     2.8196  1.8534  4.5280
## tau2.slo     0.0379  0.0148  0.0808
## nu2          0.0217  0.0190  0.0247
## rho.int      0.6356  0.2739  0.9211
## rho.slo      0.6134  0.1426  0.9296
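One way to read the log-scale coefficients is as elasticities. The sketch below converts the posterior median and 95% interval of the `log(bus)` coefficient into the percentage change in the response associated with a 10% increase in bus ridership. It assumes that the response is also modeled on the log scale (a log-log specification), which the output does not state explicitly.

```r
# Posterior median and 95% interval for log(bus), copied from the table above
b_bus <- c(median = 0.7060, lower = 0.5045, upper = 0.9032)

# Percentage change in the response for a 10% increase in bus ridership,
# assuming a log-log specification: (1.10^b - 1) * 100
pct_change <- (1.10^b_bus - 1) * 100

print(round(pct_change, 1))
```

For small changes the elasticity reading is close to the coefficient itself, so a coefficient below one indicates a less than proportional association.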

With L-System and bus on log-scale and Poisson

##              Median    2.5%  97.5%
## (Intercept)  0.2409 -0.6849 1.2140
## log(bus)     0.4650  0.2521 0.6977
## log(train)   0.3644  0.2593 0.4764
## alpha       -0.0259 -0.0742 0.0212
## tau2.int     2.5868  1.7078 4.1574
## tau2.slo     0.0259  0.0120 0.0530
## rho.int      0.6294  0.2778 0.9172
## rho.slo      0.7416  0.2451 0.9695