Overview

The proposed template for the functionality can be viewed in this sketch. This is a draft of the functionality the dashboard might include if there is sufficient support for implementation from center staff. All graphical demonstrations are created using mock data.
The functionality is threefold at present:

Center Management Dashboard

Graph 1:

  1. A line graph of the number of students applying for a course
  2. The AUC (area-under-curve) is split by color and binned by a selectable unit of time into stacked bar graphs indicating the ratio of old students/new students per unit time
  3. A prediction for the closing date of a course can be created from the average of two linear regression lines:
    • applications per unit time since first application for retreat
    • applications in the past week
    This will provide the basis for a moving “expected date of close” estimate, giving old students who intend to attend the course a guideline for when their application must be completed to avoid being waitlisted. This could also be paired with an automatic email notification when the projected close date is, for example, 1 week away.
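
The averaged two-regression estimate described above can be sketched roughly as follows, in Python for illustration. This is only a sketch under assumptions not stated in the proposal: applications are tracked as a cumulative count per day since the first application, the course has a known `capacity`, and the course "closes" when a fitted line reaches that capacity. The function names are hypothetical.

```python
import numpy as np

def day_capacity_reached(days, cumulative, capacity):
    """Fit a straight line to cumulative applications and return the day
    (same units as `days`) on which the line reaches `capacity`."""
    if len(days) < 2:
        return None  # not enough points to fit a line
    slope, intercept = np.polyfit(days, cumulative, 1)
    if slope <= 0:
        return None  # no growth, so no meaningful close date
    return (capacity - intercept) / slope

def expected_close_day(days, cumulative, capacity, window=7):
    """Average the close-day estimates from two regression lines:
    one over all applications, one over the most recent `window` days."""
    days = np.asarray(days, dtype=float)
    cumulative = np.asarray(cumulative, dtype=float)
    overall = day_capacity_reached(days, cumulative, capacity)
    recent_mask = days >= days.max() - window
    recent = day_capacity_reached(days[recent_mask], cumulative[recent_mask], capacity)
    estimates = [e for e in (overall, recent) if e is not None]
    return sum(estimates) / len(estimates) if estimates else None

# Mock data: two applications per day for three weeks, capacity of 100.
mock_days = list(range(21))
mock_cumulative = [2 * d for d in mock_days]
estimate = expected_close_day(mock_days, mock_cumulative, capacity=100)
```

Re-running the estimate each day as new applications arrive gives the moving estimate. A notification could be triggered when the estimate minus the current day drops below a chosen threshold.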

Graph 2:

Graph 2 will provide a statistical overview of various metrics per student across a given time period. The default time bins will be seasons, adjustable to quarters, months, weeks, or days. The y-axis will show the number of students admitted and the x-axis the date. Stacked bar graphs will show the number of old students vs. new students per retreat across the duration of the retreat. These counts can be regressed against electricity cost, food cost, donations, etc. to establish the amount of a given resource per student per time period.
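
As a rough illustration of the per-student regression, the sketch below fits a straight line to a resource cost against the number of students in each time bin. The slope is then the estimated cost per student and the intercept is the fixed cost for the period. The figures are mock data, the function name is hypothetical, and a simple linear relationship is an assumption.

```python
import numpy as np

def cost_per_student(students, cost):
    """Least-squares fit of cost = fixed + per_student * students.
    Returns (per_student, fixed)."""
    x = np.asarray(students, dtype=float)
    y = np.asarray(cost, dtype=float)
    per_student, fixed = np.polyfit(x, y, 1)
    return per_student, fixed

# Mock data: students admitted per retreat and the food cost for that retreat.
mock_students = [20, 35, 50, 65]
mock_food_cost = [380, 440, 500, 560]
per_student, fixed = cost_per_student(mock_students, mock_food_cost)
```

The same function could be applied separately to the old student and new student counts if the two groups are expected to differ in resource use.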

Ride Suggestion

A ride suggestion algorithm can be created with kNN (k-nearest neighbors) and the Google Maps API. The application process can be modified to include inputs for:

  • The applicant’s location of departure
  • An optional checkbox to select “offering a ride” or “needing a ride”
  • If offering a ride is selected, the following appear:
    • a date of departure selection dialog
    • # of seats available (taking into account luggage of each traveler)

With this information, the OpenMaps API can then be queried to provide an inline user selection (with PHP) of the latitude and longitude of that location. Because kNN is directionally indifferent, we need a custom algorithm to find the individuals who are closest to the driver’s path to the center. To do so, we will compute the perpendicular distance between each rider’s origin point \(r_{la},r_{lo}\) and the vector from the driver’s origin to the center, \(\vec{d_{oc}}\), as a rough estimate of who will be closest to the path of travel.

We can determine which riders are considered for a given driver by finding the midpoint of \(\vec{d_{oc}}\), at \(distance(\vec{d_{oc}})/2\) from the driver’s origin, and drawing a circle around it with a radius equal to half the distance of the journey. This is combined with an overlapping circle of 10 mile radius around the driver’s point of origin (a driver would likely be willing to pick up someone within 10 miles even if in the opposite direction). For each rider whose origin falls within either circle, the distance to the driver’s path will be calculated using the equation for a perpendicular line, and the k closest riders will be selected, where k is equal to the number of open seats.
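
The candidate filter and path-distance ranking described above can be sketched as follows. This is a minimal sketch under several assumptions: coordinates are projected to a flat plane in miles with an equirectangular approximation (reasonable at regional scale), the driver’s path is treated as a straight segment from origin to center, and all function and variable names are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8

def to_xy(lat, lon, ref_lat):
    """Approximate planar (x, y) position in miles for a latitude/longitude pair."""
    x = math.radians(lon) * math.cos(math.radians(ref_lat)) * EARTH_RADIUS_MILES
    y = math.radians(lat) * EARTH_RADIUS_MILES
    return x, y

def distance_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b (perpendicular
    distance when the foot of the perpendicular lies on the segment)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp to the ends of the segment
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def closest_riders(driver, center, riders, seats, pickup_radius=10.0):
    """driver and center are (lat, lon); riders maps a name to (lat, lon).
    Returns up to `seats` rider names, nearest to the driver's path first."""
    ref_lat = center[0]
    d = to_xy(driver[0], driver[1], ref_lat)
    c = to_xy(center[0], center[1], ref_lat)
    midpoint = ((d[0] + c[0]) / 2, (d[1] + c[1]) / 2)
    half_journey = math.hypot(c[0] - d[0], c[1] - d[1]) / 2
    scored = []
    for name, (lat, lon) in riders.items():
        r = to_xy(lat, lon, ref_lat)
        in_journey_circle = math.hypot(r[0] - midpoint[0], r[1] - midpoint[1]) <= half_journey
        near_origin = math.hypot(r[0] - d[0], r[1] - d[1]) <= pickup_radius
        if in_journey_circle or near_origin:
            scored.append((distance_to_segment(r, d, c), name))
    return [name for _, name in sorted(scored)[:seats]]

# Mock data: a driver travelling due north to the center.
mock_riders = {
    "on_path": (41.0, -74.01),
    "off_path": (41.0, -73.5),
    "behind": (39.95, -74.0),
    "far_east": (41.0, -70.0),
}
picked = closest_riders((40.0, -74.0), (42.0, -74.0), mock_riders, seats=2)
```

Clamping to the segment means a rider slightly behind the driver’s origin is scored by the distance to the origin itself, which matches the intent of the 10 mile pickup circle.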

The k closest riders will then be submitted to the Google Maps API to determine the amount of time added to the driver’s trip by picking up each rider. The added time will be computed for each rider individually and for all riders together (other combinations can be estimated manually from this information). The Google Maps API provides 2500 free queries per 24 hour period, so the queries per driver will be as follows:

  1. Time to the center (1 query)
  2. Time to the center with each of the closest riders (k queries)
  3. Time to the center with all of the closest riders (1 query)

With the pre-computation described above, the number of queries per day is unlikely to exceed the 24 hour quota.
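
The query budget implied by the list above can be checked with a few lines of arithmetic. The sketch assumes the three query types listed (one baseline, k individual, one combined) are the only ones issued per driver. The function names are hypothetical.

```python
def queries_per_driver(k):
    """1 baseline query + k single-rider queries + 1 all-riders query."""
    return 1 + k + 1

def drivers_per_day(k, daily_quota=2500):
    """Number of drivers that can be processed within the daily quota."""
    return daily_quota // queries_per_driver(k)

# With 3 open seats, each driver costs 5 queries, so 500 drivers fit in one day.
budget = drivers_per_day(3)
```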

Exceptions

Exceptions might occur where a person resides just outside the 10 mile radius but in the same large city as the driver, and would be willing to take a Lyft or public transportation to reach the driver. To account for such cases, a regex matching system based on city name would help to identify and match them.
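
One possible shape for the city name matching is sketched below. It assumes the departure location is available as free text that contains the city name. The normalization rules and function names are hypothetical, and the sketch makes no attempt to handle misspellings or alternative city names.

```python
import re

def normalize_city(text):
    """Lowercase, replace anything that is not a letter or space, collapse spaces."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def same_city_riders(driver_city, rider_locations):
    """Return the names of riders whose location text contains the driver's
    city as a whole word or phrase."""
    city = normalize_city(driver_city)
    if not city:
        return []  # nothing to match against
    pattern = re.compile(r"\b" + re.escape(city) + r"\b")
    return [name for name, location in rider_locations.items()
            if pattern.search(normalize_city(location))]

# Mock data: free-text departure locations.
mock_locations = {
    "rider_a": "Brooklyn, New York, NY",
    "rider_b": "Newark, NJ",
    "rider_c": "new  york",
}
matches = same_city_riders("New York", mock_locations)
```

The word boundaries keep "New York" from matching "Newark", and riders matched this way could be added to the candidate list even when they fall outside the 10 mile circle.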

Potentially relevant links:
  1. find number of points within a radius in R using lon and lat coordinates
  2. Geographic / geospatial distance between 2 lists of lat/lon points (coordinates)
  3. Equation of vector that is perpendicular to and intersects a line with unknown variables

Graph 1

Due to the number and diversity of variables involved in this analysis, I believe that trying to develop a demo with dummy data will be a wasted effort. The code for this particular calculation will be heavily dependent on where, how, and in what units the data is stored and retrieved and thus the actual code necessary to complete the analysis will likely entirely invalidate any mock data demo. This section will be skipped until further information is available.

Ride Suggestion

The level of complexity of the algorithm documented above may not be feasible for the purposes of this project, but a simpler mechanism that employs kNN and a directional filter may be implementable in a reasonable amount of time and provide sufficient matching to help individuals without vehicular transportation get to the center with greater ease.

## $nn
##      [,1] [,2] [,3]
## [1,]    2    4    5
## [2,]    4    5    6
## [3,]    1    2    4
## [4,]    5    6    2
## [5,]    4    6    2
## [6,]    5    4    2
## 
## $np
## [1] 6
## 
## $k
## [1] 3
## 
## $dimension
## [1] 2
## 
## $x
##           Lon       Lat
## [1,] 40.69533 -73.92560
## [2,] 41.32225 -72.97030
## [3,] 41.26475 -75.83605
## [4,] 41.82965 -72.63202
## [5,] 42.08225 -72.56694
## [6,] 42.37814 -72.54875
## 
## attr(,"class")
## [1] "knn"
## attr(,"call")
## spdep::knearneigh(x = as.matrix(RS[, c("Lon", "Lat")]), k = 3)

The algorithm will have to be built such that the driver’s shortest path, rather than a single point, is used as the reference from which the nearest locations are calculated.