The Road Home Project

High Level Views

Let’s get a quick overview of the dataset you gave me to see if any immediate properties stick out.

##   GenderDesc                              RaceDesc        Age       
##  Female: 30   American Indian or Alaska Native:  5   Min.   :22.00  
##  Male  :196   Black or African American       : 31   1st Qu.:39.25  
##               Multi-Racial                    :  4   Median :53.00  
##               White                           :186   Mean   :49.96  
##                                                      3rd Qu.:59.00  
##                                                      Max.   :86.00  
##                                                                     
##    EnrollDate            ExitDate           DaysEnrolled   
##  Min.   :2014-07-21   Min.   :2014-07-22   Min.   :  2.00  
##  1st Qu.:2014-11-05   1st Qu.:2015-01-07   1st Qu.: 19.50  
##  Median :2015-01-07   Median :2015-03-31   Median : 61.00  
##  Mean   :2015-01-17   Mean   :2015-03-11   Mean   : 71.12  
##  3rd Qu.:2015-03-29   3rd Qu.:2015-05-31   3rd Qu.:120.25  
##  Max.   :2015-07-17   Max.   :2015-07-31   Max.   :184.00  
##                       NA's   :30                           
##   LenthofStay        Enrolled StillEnrolled       Exited     
##  Min.   :  1.00   Min.   :1   Min.   :0.000   Min.   :0.000  
##  1st Qu.: 19.50   1st Qu.:1   1st Qu.:0.000   1st Qu.:1.000  
##  Median : 68.50   Median :1   Median :0.000   Median :1.000  
##  Mean   : 74.27   Mean   :1   Mean   :0.146   Mean   :0.854  
##  3rd Qu.:133.00   3rd Qu.:1   3rd Qu.:0.000   3rd Qu.:1.000  
##  Max.   :183.00   Max.   :1   Max.   :1.000   Max.   :1.000  
##  NA's   :30                                                  
##     ClientID    
##  Min.   : 1097  
##  1st Qu.:37856  
##  Median :54920  
##  Mean   :47384  
##  3rd Qu.:60429  
##  Max.   :69247  
##                 
##                                                     ExitReason 
##  Completed Program                                       :178  
##  Criminal activity/destruction of property/violence      :  1  
##  Death                                                   :  1  
##  Left for a housing opportunity before completing program:  4  
##  Non-Compliance with Program                             :  4  
##  Other                                                   :  4  
##  NA's                                                    : 34  
##                                                                                                                       ExitDestination
##  Rental by client, no ongoing housing subsidy                                                                                 :99    
##  Rental by client, VASH Subsidy                                                                                               :62    
##  Emergency Shelter, including hotel or motel paid for with shelter voucher                                                    :10    
##  Place not meant for habitation (e.g., a vehicle, an abandoned building, bus/train/subway station/airport or anywhere outside): 4    
##  Rental by client, other (non-VASH) ongoing housing subsidy                                                                   : 4    
##  (Other)                                                                                                                      :17    
##  NA's                                                                                                                         :30    
##   monthExited    
##  Min.   : 0.000  
##  1st Qu.: 2.000  
##  Median : 4.000  
##  Mean   : 4.845  
##  3rd Qu.: 6.000  
##  Max.   :12.000  
## 

Initial Observations:

  1. The median age of clients is 53 years old (the mean is ~50).
  2. The average number of days your clients stay enrolled is 71 days.
  3. 14.6% of your clients are still enrolled.
  4. (obviously) 85.4% have exited the program
  5. 44% exited to a rental without any ongoing subsidy, followed by 27% who received a VASH subsidy, which seems to me like a success rate of at least 71%.
  6. You have 6.5 males to every female client.
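
The percentages in observations 5 and 6 can be re-derived from the counts in the summary output. The snippet below is a minimal sketch in Python (the report's own code is not shown, so the language and the variable names are my assumptions); it uses only numbers that appear in the summary.

```python
# Counts copied from the summary output above; nothing here is new data.
FEMALE, MALE = 30, 196
TOTAL_CLIENTS = FEMALE + MALE   # 226 clients in the dataset
RENTAL_NO_SUBSIDY = 99          # "Rental by client, no ongoing housing subsidy"
RENTAL_VASH = 62                # "Rental by client, VASH Subsidy"


def pct(part, whole):
    """Share of `whole` made up by `part`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


no_subsidy_pct = pct(RENTAL_NO_SUBSIDY, TOTAL_CLIENTS)
vash_pct = pct(RENTAL_VASH, TOTAL_CLIENTS)
combined_pct = pct(RENTAL_NO_SUBSIDY + RENTAL_VASH, TOTAL_CLIENTS)
males_per_female = round(MALE / FEMALE, 1)

print(no_subsidy_pct, vash_pct, combined_pct, males_per_female)
```

Note that the denominator is all 226 clients, including the ones who are still enrolled, which is why 71% reads as a lower bound on the success rate.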

The Main Question

What’s piqued my interest is the percentage of clients who exited the program in a “bad” way, and whether any particular properties stand out among them.

NOTE: I’m going to consider any exit that doesn’t result in renting a “bad” result (hereafter known as nonRental). The majority of this analysis will consist of attempting to understand these outcomes.

A high nonRental percentage is a bad thing, so we will typically be looking for low spots in the graphs, because these mean high rental outcome rates (and, correspondingly, low nonRental outcome rates).
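
To make the nonRental definition concrete, here is a minimal sketch of how an exit destination could be turned into a flag. It is written in Python and rests on two assumptions of mine: that every rental destination starts with "Rental by client" (true of the three rental categories visible in the summary, unverified for the 17 lumped under "(Other)"), and that clients with no recorded exit destination are left unclassified rather than counted as nonRental. The function name `is_non_rental` is hypothetical.

```python
def is_non_rental(exit_destination):
    """Classify one ExitDestination value.

    Returns True for a nonRental ("bad") exit, False for a rental,
    and None when no exit destination is recorded (assumed: still enrolled).
    """
    if exit_destination is None:
        return None
    # Assumption: rental destinations all share this prefix.
    return not exit_destination.startswith("Rental by client")


print(is_non_rental("Rental by client, VASH Subsidy"))
print(is_non_rental("Emergency Shelter, including hotel or motel paid for with shelter voucher"))
print(is_non_rental(None))
```

Averaging this flag over a group of clients gives that group's nonRental rate, which is the quantity the rest of the analysis looks at.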

nonRental Type

It looks like the most common nonRental outcome is an emergency shelter (10 clients), followed by a place not meant for habitation (4 clients).

nonRental Age

It does appear as though nonRental outcomes peak at around age 30 and among 50-60 year olds, but as you can see from the regression line, it isn’t anything too dramatic.

This is explained by the fact that you work with two main age groups, namely a ~30 year old group and a 50-65 year old group.

Females

In our early conversations and in my exploratory analysis, I noticed a strong correlation between being female and not completing the program. Let’s see if this holds true for nonRental outcomes.

##   GenderDesc   nonRentals 
##  Female:30   Min.   :0.0  
##  Male  : 0   1st Qu.:0.0  
##              Median :0.0  
##              Mean   :0.3  
##              3rd Qu.:1.0  
##              Max.   :1.0
##   GenderDesc    nonRentals    
##  Female:  0   Min.   :0.0000  
##  Male  :196   1st Qu.:0.0000  
##               Median :0.0000  
##               Mean   :0.1122  
##               3rd Qu.:0.0000  
##               Max.   :1.0000

Females have a 70% chance of ending up in a rental, while males have an 88.8% chance.
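
Because nonRentals is a 0/1 flag, its mean is the share of nonRental outcomes, so the two summaries above can be converted back into counts. The sketch below (Python, with names of my own choosing) does that arithmetic using only the group sizes and means shown in the output.

```python
# (number of clients, mean of nonRentals) copied from the two summaries above.
GROUPS = {"Female": (30, 0.3), "Male": (196, 0.1122)}


def summarize(n_clients, mean_non_rental):
    """Turn a group size and the mean of a 0/1 flag into a count and a rental rate."""
    non_rentals = round(n_clients * mean_non_rental)  # the mean of a 0/1 flag is a share
    rental_rate = round(100 * (1 - non_rentals / n_clients), 1)
    return non_rentals, rental_rate


results = {name: summarize(n, m) for name, (n, m) in GROUPS.items()}
for name, (non_rentals, rental_rate) in results.items():
    print(f"{name}: {non_rentals} nonRental outcomes, {rental_rate}% rental rate")
```

This works out to 9 nonRental outcomes among females and 22 among males, so the gap between the two groups rests on a fairly small number of female clients.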

Let’s investigate this further. As with clients overall, most female nonRental outcomes are emergency shelters.

##    monthExited nonRentals
## 1            1        0.0
## 2            2        0.0
## 3            3        0.5
## 4            4        0.0
## 5            5        0.0
## 6            7        0.0
## 7            8        1.0
## 8            9        0.0
## 9           11        0.0
## 10          12        1.0

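It looks like females have a 50% chance of ending up in a rental in March and a 0% chance in August and December, while every other month in the table shows a 100% rental rate. With only 30 female clients spread across these months, each monthly rate is based on very few exits.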

Race

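There doesn’t appear to be any correlation between race and rental rate: Black or African American clients (about 85%) and White clients (about 83%) rent at similar rates. Multi-Racial and American Indian or Alaska Native clients have a 100% rental rate, but with only 4 and 5 clients respectively, this is due to lack of data in those racial descriptions.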

##                           RaceDesc nonRentals
## 1 American Indian or Alaska Native  0.0000000
## 2        Black or African American  0.1538462
## 3                     Multi-Racial  0.0000000
## 4                            White  0.1666667

Date

For my own curiosity, I’d like to take a look at how date affects nonRental outcomes, particularly whether the winter months affect rental rates.

Here we are plotting the total number of nonRental outcomes for each month of the year. The blue line marks the average number of nonRental outcomes; as you can see, mid-February to mid-April is above the average.

Here we plot the percentage of nonRental outcomes for each month of the year. Once again, blue signifies the average nonRental rate, and red the rate for each month.

It appears as though you have a much higher nonRental rate in August (25% success rate), and a much higher rental rate in May and June (98% and 95% success).

However, if we look at the total number of outcomes, we’ll see that the reason for the poor success rate in August is a lack of data in that month. People aren’t leaving your program often in August; they are, however, leaving from January through May, which could explain the above-average nonRental counts we saw in the first graph.

If we remove this outlier and check again, we’ll see there is a relatively strong correlation between the rental success rate and the total number of outcomes in the data set (notice that the number of outcomes peaks in May, as does the rental rate). This means that the more people exit your program in a month, the higher your percentage of rental outcomes (the higher the success rate).

Success Rate vs. Overall Exit Number

If we plot success rate against the total number of outcomes, we see this idea much more clearly. As the number of outcomes increases, your success rate increases.
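
One way to put a number on "increases together" is a Pearson correlation between the monthly outcome counts and the monthly success rates. The sketch below is a plain-Python illustration of that calculation, not the method used to draw the plot; the function name `pearson` is my own, and the inputs in the sanity checks are placeholder values chosen only to exercise the function, not figures from your dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / (spread_x * spread_y)


# Sanity checks on placeholder values (NOT data from the report).
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly increasing together
print(pearson([1, 2, 3], [3, 2, 1]))        # perfectly opposed
```

In practice `xs` would be the number of exits in each month and `ys` the share of those exits that were rentals. A value near 1 would support the pattern described here, though with at most twelve monthly points the estimate would be rough.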

Overall

Let’s visualize some broad views of your program, regardless of any sort of “good” or “bad” results.

Enrollment Length

Among lengths of stay that have at least one nonRental outcome (i.e., excluding lengths of stay where every client ended up in a rental), the longer a client stays in your program, the less likely they are to end up in a rental.

Number of Clients

Here we can see that the number of clients peaked in January 2015 and was declining by July 2015. It could be that you see an increase in clients during the fall and winter months; a longer dataset could verify this hypothesis.

Age

It looks like over the course of the timeframe of this dataset, your clients are getting older.

Success Rate

Finally, let’s check your success rate against time, and see if you’re getting better or worse.

NICE! It does look like you guys are getting better over time at getting people into rental units, as you can see from the linear regression line. nonRental outcomes become less likely as time progresses in this data set.

Congrats!

Conclusion

The next step I’d like to take with this data is to start making predictions from it. Essentially, the idea is that you type in the characteristics of your client (“White”, “male”, “22”, etc.) and I return a percentage for each possible outcome (e.g., 20% chance of rental, 10% chance of emergency shelter, etc.).