We calculate walkability \(W\) at point \(o\) as:
\(W_{o} = \max\Bigg\{\sum\limits_{i=1}^{g} 100c_i \cdot \min\bigg\{\frac{\sum\limits_{p_{i}=1}^{n_{i}} \alpha(d_{p_{i}}) \cdot \beta(r_{p_{i}})}{\sum\limits_{{p_{i}:r_{p_{i}}<m_{i}}} \beta(r_{p_{i}})}, 1\bigg\} + \sum\limits_{j=1}^{t} f_{j} \cdot w_{j} \cdot 100s_{j} \ \ , \ \ 0\Bigg\}\)
For convenience, we may use the following short form notation:
\(\Omega_{o} = \frac{\sum\limits_{p_{i}=1}^{n_{i}} \alpha(d_{p_{i}}) \cdot \beta(r_{p_{i}})}{\sum\limits_{p_{i}:r_{p_{i}}<m_{i}} \beta(r_{p_{i}})} \ \ \ \text{and} \ \ \ \Psi_{o} = 100 \cdot \sum\limits_{j=1}^{t} f_{j} \cdot w_{j} \cdot s_{j}\)
Such that:
\(W_{o} = \max\bigg\{\sum\limits_{i=1}^{g} 100c_i \cdot \min\Big\{\Omega_{o}, 1\Big\} + \Psi_{o} \ \ , \ \ 0\bigg\}\)
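As a minimal sketch of how the short form combines its pieces, the function below takes the ratio \(\Omega\) for each DUC and the term \(\Psi_o\) as already-computed inputs and applies the factor of 100 from the full expression for \(W_o\). The function name `walk_score`, its argument names, and the sample values are hypothetical and chosen only for illustration. It also assumes one \(\Omega\) value per DUC \(i\), since the ratio depends on the points of each category.

```python
def walk_score(category_weights, omegas, psi):
    """Combine per-category ratios and the city-attribute term into W_o.

    category_weights: the c_i values, expected to sum to 1
    omegas: the ratio Omega for each DUC i, in the same order as category_weights
    psi: the city-attribute term Psi_o (already scaled by 100)
    """
    if len(category_weights) != len(omegas):
        raise ValueError("need one Omega value per category weight")
    # Each category contributes at most 100 * c_i because Omega is capped at 1.
    duc_term = sum(100 * c * min(omega, 1.0)
                   for c, omega in zip(category_weights, omegas))
    # The outer max keeps the final score from going negative.
    return max(duc_term + psi, 0.0)


# Hypothetical inputs: four equally weighted categories and a small penalty.
print(walk_score([0.25, 0.25, 0.25, 0.25], [1.3, 0.8, 0.0, 1.0], -5.0))
```

The inner cap stops any single category from adding more than its share of the score. The outer floor keeps strongly negative city attributes from producing a score below zero.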
We define the following notation for the model:
\(i\): identifier for a DUC (Diverse Use Category) – the infrastructure deemed important for assessing walkability
\(g\): the total number of DUC considered (as a default, we consider four: grocery stores, hospitals, schools, and transit stops)
\(c_{i}\): the category weight associated with DUC \(i\), subject to \(\sum_{i=1}^{g} c_{i} = 1\) (as a default, all DUC have \(c_i = 0.25\); in an interactive setting, users specify this input and are free to set \(c_i = 0\) to indicate that they “do not care” about DUC \(i\))
\(p_{i}\): identifier for a point of DUC \(i\) in the service area (walkable area) of point \(o\)
\(n_{i}\): the total number of points in DUC \(i\) in the service area of point \(o\)
\(y_{i}\): the maximum walkable distance for DUC \(i\) (in minutes) – used to define the extent of a service area (as a default, we consider all \(y_i = 10\). In an interactive setting, users will specify this input)
\(d_{p_{i}}\): the walk-time distance from \(o\) to point \(p_{i}\) (in minutes)
\(\alpha(\cdot)\): the decay function to weight a point \(p_{i}\) based on its walk-time distance from \(o\) (defined below)
\(\alpha(d_{p_{i}}) = 1 - \frac{d_{p_{i}} - 1}{2 \cdot y_{i} - 2}\)
Defines a linear decay of “distance weight” as walk-time distance increases. Points within 1 minute will always have distance weight 1, while points within \(y_{i}-1\) to \(y_{i}\) minutes will always have weight 0.5 (i.e., assume that walking 1 minute is always considered twice as advantageous as walking the maximum walkable distance)
\(r_{p_{i}}\): the closeness rank of point \(p_{i}\) to \(o\) amongst all points of DUC \(i\) in the service area (i.e. the closest point will have rank 1, second closest will have rank 2, and so on)
\(m_{i}\): the number of points of DUC \(i\) desired within a walkable distance of \(o\) (as a default, we consider all DUC to have \(m_i = 1\). In an interactive setting, users will specify this input)
\(\beta(\cdot)\): the decay function to weight a point \(p_{i}\) based on its rank of closeness to point \(o\) amongst all points of DUC \(i\) in the service area (defined below)
\(\beta(r_{p_{i}}) = minmax\big\{\frac{1}{1+e^{k \cdot (r_{p_{i}} - m_i - 0.5)}}\big\} = \frac{\frac{1}{1+e^{k \cdot (r_{p_{i}} - m_i - 0.5)}} - \frac{1}{1+e^{k \cdot (l - m_i - 0.5)}}}{\frac{1}{1+e^{k \cdot (0.5 - m_i)}} - \frac{1}{1+e^{k \cdot (l - m_i - 0.5)}}}\)
Defines a logistic decay of “closeness rank weight” as rank increases. Regardless of walk-time distance, the closest \(p_{i}\) to \(o\) will always have rank weight 1. More generally, all \(p_{i}\) with \(r_{p_{i}} \leq m_i\) will have a “high” rank weight, while all others will have a “low” rank weight. \(l\) is the function length (default is 100, though any large number will do): the upper bound of the x-domain on which the function is constructed, chosen to guarantee that all points will have rank weight > 0
\(k\) – the steepness of the drop-off after the rank \(m_{i}\) point – is defined as the root of the following expression: \(\frac{\frac{1}{1+e^{k \cdot (0.5)}} - \frac{1}{1+e^{k \cdot (l - m_i - 0.5)}}}{b \cdot \big\{\frac{1}{1+e^{k \cdot (0.5 - m_i)}} - \frac{1}{1+e^{k \cdot (l - m_i - 0.5)}}\big\}} - 1\)
\(b\): the desired weight for the rank \(m_i + 1\) point (as a default, we set \(b = 0.2\))
There is no closed form solution for \(k\), so it is solved for numerically
\(j\): identifier for a CA (City Attribute) – city characteristics deemed to have an effect on walkability
\(t\): the total number of CA considered (as a default, we consider four: crashes involving pedestrians, crime, historical sites [designated by the National Register of Historic Places], and street trees)
\(f_{j}\): the sign associated with CA \(j\), indicating whether it is considered to have a negative or positive impact on walkability (for example, crime would have \(f_{j} = -1\), because it is considered to make an area less walkable)
\(w_{j}\): the weight associated with CA \(j\) [currently, this is experimental, and isn’t implemented]
\(s_{j}\): the percentage of CA \(j\) occurring in the service area of point \(o\)
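The definitions of \(\alpha\), \(\beta\), and \(k\) above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the project's implementation. All function names are hypothetical, and the root for \(k\) is found by simple bisection rather than any particular solver. Three assumptions are made where the text leaves room for interpretation: walk times under 1 minute are clamped to 1, the denominator of \(\Omega\) sums over points with rank \(r_{p_i} \leq m_i\) (a strict inequality would leave it empty at the default \(m_i = 1\)), and an empty service area gives a ratio of 0.

```python
import math


def alpha(d, y):
    """Linear distance weight: 1 at d = 1 minute, 0.5 at d = y minutes."""
    d = max(d, 1.0)  # assumption: times under 1 minute are clamped to 1
    return 1.0 - (d - 1.0) / (2.0 * y - 2.0)


def _logistic(r, m, k):
    """Unscaled logistic curve centred half a rank past m."""
    x = k * (r - m - 0.5)
    if x > 700.0:  # math.exp would overflow; the curve is effectively 0 here
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def beta(r, m, k, l=100):
    """Min-max scaled rank weight: 1 at rank 1, 0 at rank l."""
    low = _logistic(l, m, k)
    high = _logistic(1, m, k)
    return (_logistic(r, m, k) - low) / (high - low)


def solve_k(m, b=0.2, l=100, k_lo=1e-6, k_hi=50.0, tol=1e-10):
    """Find k by bisection so that the rank m + 1 point has weight b."""
    def g(k):
        # Zero of g is the same k that zeroes the expression defining k above.
        return beta(m + 1, m, k, l) - b

    if g(k_lo) * g(k_hi) > 0:
        raise ValueError("root not bracketed; widen [k_lo, k_hi]")
    while k_hi - k_lo > tol:
        mid = 0.5 * (k_lo + k_hi)
        if g(k_lo) * g(mid) <= 0:
            k_hi = mid
        else:
            k_lo = mid
    return 0.5 * (k_lo + k_hi)


def omega(walk_times, y, m, k, l=100):
    """Ratio Omega for one DUC from the walk times (minutes) of its points."""
    ranked = sorted(t for t in walk_times if t <= y)  # rank 1 = closest point
    if not ranked:
        return 0.0  # assumption: no points in the service area scores 0
    ranks = range(1, len(ranked) + 1)
    numerator = sum(alpha(d, y) * beta(r, m, k, l)
                    for r, d in zip(ranks, ranked))
    # Assumption: r <= m rather than r < m (see the lead-in).
    denominator = sum(beta(r, m, k, l) for r in ranks if r <= m)
    return numerator / denominator


# Hypothetical example: three grocery stores at 2, 6 and 9.5 minutes.
k = solve_k(m=1)
print(round(k, 4), omega([2.0, 6.0, 9.5], y=10, m=1, k=k))
```

Bisection is used here only because it needs nothing beyond the standard library; any bracketing root finder would serve the same purpose.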
The list below documents previous model features or ideas, and the reasons for ultimately moving past them.
Variable functional form of the decay curve, defined by CA. A “decay parameter” took the form of what is currently defined as \(\Psi_o\). If \(\Psi_o < 0\), we’d have exponential decay with parameter \(\Psi_o\); if \(\Psi_o = 0\), we’d have linear decay (same as currently defined); if \(\Psi_o > 0\) we’d have “flipped exponential” decay (exponential decay flipped across \(y = -x\)) with parameter \(\Psi_o\).
Any reasonably interpretable model in the mathematical sense resulted in \(\alpha(y_i) = 0\)
If a model was built ignoring the issue of interpretation, the curve too quickly approached \(y = 1\) on \(x \in [1, y_i]\) when the decay parameter was > 0, which resulted in “clusters” of walk scores at 25, 50, 75, and 100 (because many category scores were being inaccurately inflated to the maximum category score)
This makes assumptions about both the weight at distance \(y_i\) and potentially three different functional forms. That is more assumptions than we’d like to make.
The CA should not control both relative weights and functional form, so, we’d have to introduce more inputs. This could be overkill [given our minimal background information on what affects walkability, and how the effects manifest]
Putting the CA in the \(\alpha(d_{p_i})\) equation implies that all considered CA affect how far one walks in their service area. Though this may be true for some CA (e.g. crime), it may not be true for others (e.g. street trees). It is probably safer to generalize how CA affect walkability.
Unbounded walk score – \(W_o \in [0, \infty)\)
Fixed steepness \(k\) for the drop off in marginal utility of the closeness rank weight decay curve
Even when a user specifies the maximum walk-time distance they’d allow, not all walk times within that range are equally desirable, and the degree to which they differ is, to some extent, unique to the user.
A linear decay is desirable for modeling a user’s walking habits in relation to distance. Furthermore, this type of decay holds across all DUC and all users.
For a maximum walk time distance \(y_{i}\) minutes, it is reasonable to think that walking between 0 and 1 minutes is twice as preferable as walking between \(y_{i} - 1\) and \(y_{i}\) minutes.
Marginal utility can be applied to having things within a walkable distance of you.
Logistic decay is appropriate for modeling marginal utility of having points in a DUC close to you. This means that, given a desired number \(m_i\), points with ranks \(1, 2, ..., m_i\) will have relatively high weight, followed by a steep drop such that points \(m_i + 1, m_i + 2, ..., n_i\) will have relatively low weight.
The weight for the rank \(m_i + 1\) point can be reasonably preselected. Furthermore, this weight is true across all DUC and all users.
CA affect walkability in a “general” sense – they affect the walkability of an area overall, not the way in which users walk to points in their service areas. In other words, the CA have no (or very limited) spatially unique, individual behavioral effects.