HW8 Answers

PAF 573

Elaine MacPherson

Question

Include a one or two paragraph description of the question you are asking in your project

#It is a commonly cited argument that cities with warmer weather attract more people experiencing homelessness, since it could be easier to sleep outside. That also means that in assessing the impact of weather (our regressor) on the number of people experiencing homelessness (our outcome) we should distinguish between sheltered and unsheltered homelessness. Sheltered homelessness simply refers to the number of people experiencing homelessness residing in shelters as opposed to being literally outside. I’d like to understand climate as a variable, and possibly also look at migration data as another variable, since that would be indicative of people moving to warmer cities.

Data

Describe the data you are using.

Where did it come from? Include a link to a website if appropriate.

#This data came from the Department of Housing and Urban Development (HUD) Office of Policy Development and Research. It can be found here: https://www.huduser.gov/portal/datasets/hpmd.html?q=datasets%2Fhpmd.html.

Is it longitudinal or cross-sectional?

#This dataset is cross-sectional. It is a point-in-time snapshot of the extent of homelessness in each jurisdiction, aptly called the “Point-in-Time Count”. It is essentially a census with interviews by volunteers. I am interested in this data since one of the functions of my role at the City of Tucson is to oversee this count. This is a HUD requirement so all communities do it, even though most people agree this is a garbage way to conduct a count, given the many reasons why someone experiencing homelessness would not be accessible to an interviewer on one night in January. There is a longitudinal study called the “Longitudinal System Analysis”, which details how people going through the homeless response system use the services. It’s interesting: https://www.hudexchange.info/homelessness-assistance/lsa/

What variable(s) are you using to measure your outcomes?

#The primary and secondary outcomes are, respectively, “pit_tot_unshelt_pit_hud” (total number of people experiencing unsheltered homelessness) and “pit_ind_unshelt_pit_hud” (individuals experiencing unsheltered homelessness).

What variable(s) are your key regressors of interest?

#The key regressors of interest are:
#“dem_pop_mig_census”: net migration from year-1 to year of interest
#“env_wea_avgtemp_summer_noaa”: average summer temperature - June, July, August
#“env_wea_precip_annual_noaa”: total annual precipitation
#“env_wea_avgtemp_noaa”: average January temperature
#“env_wea_precip_noaa”: total January precipitation
#“dem_pop_mig_census_share”: yearly increase in population to total population
#“dem_soc_ed_lesshs_acs5yr_diff”: change in the adult population (ages 20-64) to total population. Though I wonder if I shouldn’t use change and should just use the “dem_pop_adult_census_share”.

# read in your data here and any additional libraries you need
# you can point to your own working directory. I don't need to be able 
# to read in the data myself. 
library(tidyverse)
library(jtools)

# show me the first 20 lines of your data for the important columns
# use something like this to do so:
# yourDatasetName %>% select(importantVariable1,importantVariable2, ...) %>% head(20)
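The preview described in the comments above could look like the following sketch. It uses base R column indexing rather than the `select()` pipe so that it runs without extra packages, and it builds a small placeholder data frame, `HLN_DATA_demo`, in place of the real data. The placeholder values and the `unused_column` name are made up for illustration and are not from the HUD dataset.

```r
# Placeholder data frame standing in for the real dataset.
# All values are made up; only the first four column names come from the write-up.
n <- 25
HLN_DATA_demo <- data.frame(
  pit_tot_unshelt_pit_hud    = seq(100, by = 10, length.out = n),
  pit_ind_unshelt_pit_hud    = seq(80, by = 8, length.out = n),
  env_wea_avgtemp_noaa       = seq(20, by = 1.5, length.out = n),
  env_wea_precip_annual_noaa = seq(10, by = 2, length.out = n),
  unused_column              = letters[1:n]  # hypothetical column that gets dropped
)

# Outcomes and regressors to display.
important_vars <- c(
  "pit_tot_unshelt_pit_hud",
  "pit_ind_unshelt_pit_hud",
  "env_wea_avgtemp_noaa",
  "env_wea_precip_annual_noaa"
)

# Keep only the important columns and show the first 20 rows.
preview <- head(HLN_DATA_demo[, important_vars], 20)
print(preview)
```

With the real data loaded, the same two steps (pick the columns, then take the first 20 rows) apply unchanged.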

Sample Regression

Run a single regression using an outcome variable and one key regressor. If you have any thoughts about where you are going beyond this with your analysis, please share and I will give you feedback.

#Call:
#lm(formula = pit_tot_unshelt_pit_hud ~ env_wea_avgtemp_noaa +
#    env_wea_precip_annual_noaa, data = HLN_DATA)

#Residuals:
#   Min     1Q Median     3Q    Max
# -2023   -467   -181     99  41327

#Coefficients:
#                            Estimate Std. Error t value Pr(>|t|)
#(Intercept)                   25.413    118.058   0.215     0.83
#env_wea_avgtemp_noaa          35.547      2.362  15.050   <2e-16
#env_wea_precip_annual_noaa   -18.417      1.983  -9.289   <2e-16
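One way to read the coefficient table is to plug the reported estimates into the fitted equation and compare two hypothetical jurisdictions. The sketch below does this using only the three estimates printed above. The input values (40 vs. 50 for temperature, 30 vs. 40 for precipitation) are made-up illustration values, and the units are whatever the dataset uses, which the output does not state.

```r
# Coefficient estimates copied from the regression output above.
coefs <- c(intercept = 25.413, temp = 35.547, precip = -18.417)

# Fitted unsheltered count for a given average January temperature
# and total annual precipitation.
predict_unsheltered <- function(temp, precip) {
  coefs[["intercept"]] + coefs[["temp"]] * temp + coefs[["precip"]] * precip
}

# Hypothetical comparison 1: temperature higher by 10 units, same precipitation.
diff_temp <- predict_unsheltered(50, 30) - predict_unsheltered(40, 30)

# Hypothetical comparison 2: precipitation higher by 10 units, same temperature.
diff_precip <- predict_unsheltered(40, 40) - predict_unsheltered(40, 30)

print(diff_temp)
print(diff_precip)
```

Because the model is linear, the first difference equals 10 * 35.547 = 355.47 and the second equals 10 * (-18.417) = -184.17. A positive coefficient means the predicted count rises as the regressor rises, holding the other regressor fixed; a negative coefficient means it falls.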

#If I am reading this correctly, both average temperature and annual precipitation have a strong, statistically significant relationship with unsheltered homelessness: a one-unit change in temperature is associated with a change of about 35.5 people, and a one-unit change in annual precipitation with a change of about 18.4 people. It’s not clear to me whether this direction is up or down, though - the temperature coefficient is positive and the precipitation coefficient is negative, but I’m struggling to understand how that applies to temperature. I’m unsure if temperature can be used as a continuous variable, or if I should make it a dummy variable. One thing to consider in my analysis is that the count always happens in the fourth week of January, for all jurisdictions. So, in a sense, the timing is controlled. However, my regressor is climate, so it seems appropriate to consider the timing of this count. One variable I could consider is the average January temperature, though I’m unsure if that makes the analysis more viable.
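The choice between treating temperature as a continuous variable and as a dummy variable can be explored by fitting both specifications and comparing what each coefficient means. The following is a minimal sketch on simulated data, not the HUD dataset: the variable names `temp`, `unsheltered`, and `warm`, the cutoff of 40, and the true slope of 30 are all assumptions chosen for illustration.

```r
set.seed(573)  # arbitrary seed so the simulated data are reproducible

# Simulated data (NOT the HUD data): the outcome rises by 30 per degree on average.
n <- 200
temp <- runif(n, min = 10, max = 70)
unsheltered <- 100 + 30 * temp + rnorm(n, mean = 0, sd = 100)
sim <- data.frame(temp = temp, unsheltered = unsheltered)

# Specification 1: temperature as a continuous regressor.
fit_continuous <- lm(unsheltered ~ temp, data = sim)

# Specification 2: temperature collapsed into a warm/cold dummy.
# The cutoff of 40 is an assumption made only for this example.
sim$warm <- as.integer(sim$temp > 40)
fit_dummy <- lm(unsheltered ~ warm, data = sim)

# Extract the coefficient of interest from each model.
slope <- coef(fit_continuous)[["temp"]]
warm_gap <- coef(fit_dummy)[["warm"]]

print(slope)
print(warm_gap)
```

In the continuous fit, the slope is the average change in the outcome for each one-degree increase in temperature. In the dummy fit, the coefficient is the difference in mean outcome between the warm and cold groups, which discards the temperature variation within each group.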