Surratt Eviction Analysis

Author

Brian Surratt

Published

July 26, 2023

Data Sources

Household Pulse Survey Phases 3.2, 3.3, and 3.4 from July 21, 2021 – May 9, 2022 (https://www.census.gov/programs-surveys/household-pulse-survey.html)
COVID-19 Housing Policy Scorecard (https://evictionlab.org/covid-policy-scorecard/)

Measures

Dependent variables

From Household Pulse Survey

mnhlth: An index from 0 to 12 consisting of the sum of ‘anxious’, ‘worry’, ‘interest’, and ‘down’.
phq4: A categorical measure of psychological distress derived from ‘anxious’, ‘worry’, ‘interest’, and ‘down’. Value can be None, Mild, Moderate, or Severe. Alternatively, I could code this as a dummy variable.
badmh: A dummy variable for bad mental health. Bad mental health (6 to 12) is 1, good mental health (0 to 5) is 0. The threshold for this can be adjusted.

Independent variables

From Household Pulse Survey

rentcur: Caught up on rent. Dichotomized so that “Caught up on rent” = 1 and “Not caught up on rent” = 0. NAs removed.
evict: Likelihood of eviction in the next two months. 1 is very likely, 2 is somewhat likely, 3 is not very likely, 4 is not likely at all; -88 and -99 are missing-data codes. Only asked if rentcur = 2 (not caught up on rent in the original coding).

State Policy Variable

From COVID-19 Housing Policy Scorecard

score: State score on the COVID-19 Housing Policy Scorecard. Value is on a scale from 0 to 4.63.

Covariates

(from Household Pulse Survey)

income: Total household income (before taxes).
1. Less than $25,000
2. $25,000 - $34,999
3. $35,000 - $49,999
4. $50,000 - $74,999
5. $75,000 - $99,999
6. $100,000 - $149,999
7. $150,000 - $199,999
8. $200,000 and above
-99: Question seen but category not selected
-88: Missing / Did not report
age: Derived from tbirth_year (year of birth)
genid_describe: Current gender identity. 1 is Male, 2 is Female, 3 is Transgender, 4 is none of these; -88 and -99 are missing-data codes.
race_eth: Categorical variable derived from ‘rhispanic’ and ‘rrace’. (Describe coding)
single_adult: Dummy variable for single adult in household vs. multiple adults. Derived from ‘thhld_numadlt’.
thhld_numper: Total number of people in household
child: Dummy variable derived from ‘thhld_numkid’
eeduc: Educational attainment. 1 is less than high school, 2 is some high school, 3 is high school graduate, 4 is some college but no degree, 5 is associate’s degree, 6 is bachelor’s degree, 7 is graduate degree.

Other variables for cleaning

week: week of interview (review how to use this)
anxious, worry, interest, and down: Used to derive mnhlth and phq4. Frequency of each symptom over the previous 2 weeks. 1 is not at all, 2 is several days, 3 is more than half the days, 4 is nearly every day, -88 is missing, -99 is category not selected.
tenure: 3 is rented. Used to select renters.
est_st: FIPS code for state, used to merge
state: Name of state
abbv: Two letter abbreviation of state

Reading in the data

Importing the data for Household Pulse Survey Phases 3.2 to 3.4 (weeks 34 to 45).

Code

hps34 <- read.csv("./data/HPS_Week34_PUF_CSV/pulse2021_puf_34.csv")

hps35 <- read.csv("./data/HPS_Week35_PUF_CSV/pulse2021_puf_35.csv")

hps36 <- read.csv("./data/HPS_Week36_PUF_CSV/pulse2021_puf_36.csv")

hps37 <- read.csv("./data/HPS_Week37_PUF_CSV/pulse2021_puf_37.csv")

hps38 <- read.csv("./data/HPS_Week38_PUF_CSV/pulse2021_puf_38.csv")

hps39 <- read.csv("./data/HPS_Week39_PUF_CSV/pulse2021_puf_39.csv")

hps40 <- read.csv("./data/HPS_Week40_PUF_CSV/pulse2021_puf_40.csv")

hps41 <- read.csv("./data/HPS_Week41_PUF_CSV/pulse2022_puf_41.csv")

hps42 <- read.csv("./data/HPS_Week42_PUF_CSV/pulse2022_puf_42.csv")

hps43 <- read.csv("./data/HPS_Week43_PUF_CSV/pulse2022_puf_43.csv")

hps44 <- read.csv("./data/HPS_Week44_PUF_CSV/pulse2022_puf_44.csv")

hps45 <- read.csv("./data/HPS_Week45_PUF_CSV/pulse2022_puf_45.csv")

Code

names(hps34) <- tolower(names(hps34))

hps34 <- hps34 %>%
  dplyr::select(week, income, tenure, evict, rentcur, anxious, worry, interest, down, tbirth_year, genid_describe, rhispanic, rrace, ms, thhld_numadlt, thhld_numper, thhld_numkid, eeduc, hweight, pweight, est_st)

names(hps35) <- tolower(names(hps35))

hps35 <- hps35 %>%
  dplyr::select(week, income, tenure, evict, rentcur, anxious, worry, interest, down, tbirth_year, genid_describe, rhispanic, rrace, ms, thhld_numadlt, thhld_numper, thhld_numkid, eeduc, hweight, pweight, est_st)

names(hps36) <- tolower(names(hps36))

hps36 <- hps36 %>%
  dplyr::select(week, income, tenure, evict, rentcur, anxious, worry, interest, down, tbirth_year, genid_describe, rhispanic, rrace, ms, thhld_numadlt, thhld_numper, thhld_numkid, eeduc, hweight, pweight, est_st)

names(hps37) <- tolower(names(hps37))

hps37 <- hps37 %>%
  dplyr::select(week, income, tenure, evict, rentcur, anxious, worry, interest, down, tbirth_year, genid_describe, rhispanic, rrace, ms, thhld_numadlt, thhld_numper, thhld_numkid, eeduc, hweight, pweight, est_st)

names(hps38) <- tolower(names(hps38))

hps38 <- hps38 %>%
  dplyr::select(week, income, tenure, evict, rentcur, anxious, worry, interest, down, tbirth_year, genid_describe, rhispanic, rrace, ms, thhld_numadlt, thhld_numper, thhld_numkid, eeduc, hweight, pweight, est_st)

names(hps39) <- tolower(names(hps39))

hps39 <- hps39 %>%
  dplyr::select(week, income, tenure, evict, rentcur, anxious, worry, interest, down, tbirth_year, genid_describe, rhispanic, rrace, ms, thhld_numadlt, thhld_numper, thhld_numkid, eeduc, hweight, pweight, est_st)

names(hps40) <- tolower(names(hps40))

hps40 <- hps40 %>%
  dplyr::select(week, income, tenure, evict, rentcur, anxious, worry, interest, down, tbirth_year, genid_describe, rhispanic, rrace, ms, thhld_numadlt, thhld_numper, thhld_numkid, eeduc, hweight, pweight, est_st)

names(hps41) <- tolower(names(hps41))

hps41 <- hps41 %>%
  dplyr::select(week, income, tenure, evict, rentcur, anxious, worry, interest, down, tbirth_year, genid_describe, rhispanic, rrace, ms, thhld_numadlt, thhld_numper, thhld_numkid, eeduc, hweight, pweight, est_st)

names(hps42) <- tolower(names(hps42))

hps42 <- hps42 %>%
  dplyr::select(week, income, tenure, evict, rentcur, anxious, worry, interest, down, tbirth_year, genid_describe, rhispanic, rrace, ms, thhld_numadlt, thhld_numper, thhld_numkid, eeduc, hweight, pweight, est_st)

names(hps43) <- tolower(names(hps43))

hps43 <- hps43 %>%
  dplyr::select(week, income, tenure, evict, rentcur, anxious, worry, interest, down, tbirth_year, genid_describe, rhispanic, rrace, ms, thhld_numadlt, thhld_numper, thhld_numkid, eeduc, hweight, pweight, est_st)

names(hps44) <- tolower(names(hps44))

hps44 <- hps44 %>%
  dplyr::select(week, income, tenure, evict, rentcur, anxious, worry, interest, down, tbirth_year, genid_describe, rhispanic, rrace, ms, thhld_numadlt, thhld_numper, thhld_numkid, eeduc, hweight, pweight, est_st)

names(hps45) <- tolower(names(hps45))

hps45 <- hps45 %>%
  dplyr::select(week, income, tenure, evict, rentcur, anxious, worry, interest, down, tbirth_year, genid_describe, rhispanic, rrace, ms, thhld_numadlt, thhld_numper, thhld_numkid, eeduc, hweight, pweight, est_st)
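The twelve import-and-select steps above follow one pattern, so they could be collapsed into a loop. The following is a minimal base R sketch, not the code used for this analysis. The helper name `read_pulse_week()` and the shortened `keep` vector are hypothetical, and the demo writes two tiny stand-in CSV files to a temporary folder instead of reading the real Pulse files.

```r
# Hypothetical helper: read one weekly file, lower-case the column names,
# and keep only the requested variables.
read_pulse_week <- function(path, keep) {
  d <- read.csv(path)
  names(d) <- tolower(names(d))
  d[, keep, drop = FALSE]
}

# Stand-in files (made-up values) so the sketch runs without the real data.
tmp <- tempdir()
paths <- file.path(tmp, c("demo_week34.csv", "demo_week35.csv"))
write.csv(data.frame(WEEK = 34, TENURE = c(3, 1), EXTRA = 0), paths[1], row.names = FALSE)
write.csv(data.frame(WEEK = 35, TENURE = c(2, 3), EXTRA = 0), paths[2], row.names = FALSE)

keep <- c("week", "tenure")  # shortened variable list, for illustration only

# Read every file with the same helper, then stack the results.
hps_demo <- do.call(rbind, lapply(paths, read_pulse_week, keep = keep))
print(hps_demo)
```

With the real data, `paths` would hold the twelve weekly file paths and `keep` the full variable list, which keeps the selection in one place if the list changes.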

Combining into one dataframe.

Code

hps <- rbind(hps34, hps35, hps36, hps37, hps38, hps39, hps40, hps41, hps42, hps43, hps44, hps45)

nrow(hps)

[1] 803905

Code

statedat <- read.csv("./data/state_policy.csv")

nrow(statedat)

[1] 51

Joining hps with state.

Code

alldata <- merge(x=hps, y=statedat, by='est_st', all.x=TRUE)

nrow(alldata)

[1] 803905
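Because `all.x=TRUE` keeps every survey row even when no state record matches, an unchanged row count does not by itself confirm that every respondent received a score. A minimal sketch of one extra check follows. The data frames `hps_toy` and `state_toy` are hypothetical stand-ins for `hps` and `statedat`, and their values are made up.

```r
# Made-up survey rows; est_st = 99 has no counterpart in the state table.
hps_toy <- data.frame(est_st = c(1, 1, 2, 99), rentcur = c(1, 2, 1, 1))

# Made-up state table with illustrative scores.
state_toy <- data.frame(est_st = c(1, 2), score = c(0.75, 3.25))

# Left join, as in the merge above.
merged_toy <- merge(x = hps_toy, y = state_toy, by = "est_st", all.x = TRUE)

# Rows that found no state record carry NA in the joined column.
n_unmatched <- sum(is.na(merged_toy$score))
print(n_unmatched)
```

A count of zero on the real data would indicate that every respondent's FIPS code matched a state record.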

Selecting variables from alldata for a working dataframe.

Code

dat <- alldata %>% 
  dplyr::select(week, income, tenure, evict, rentcur, anxious, worry, interest, down, tbirth_year, genid_describe, rhispanic, rrace, ms, thhld_numadlt, thhld_numper, thhld_numkid, eeduc, hweight, pweight, est_st, state, abbv, score)

nrow(dat)

[1] 803905

Cleaning and exploring the data

Filter for only renters

First I will filter for renters. This will exclude homeowners.

Check the distribution of tenure. Category “3” is renters.

Code

tabyl(dat$tenure)

 dat$tenure      n     percent
        -99   6498 0.008083045
        -88 117839 0.146583241
          1 194842 0.242369434
          2 316637 0.393873654
          3 159649 0.198591873
          4   8440 0.010498753

Filter for only renters.

Code

dat <- dat %>%
  filter(.$tenure == 3)

nrow(dat)

[1] 159649

Clean and recode mental health variables

First, I add the four mental health items together to create a mental health index.

Code

dat <- dat %>% 
  mutate(mnhlth = anxious + worry + interest + down)

tabyl(dat$mnhlth)

 dat$mnhlth     n      percent
       -396   116 7.265940e-04
       -296    23 1.440660e-04
       -295    21 1.315386e-04
       -294     1 6.263741e-06
       -293     8 5.010993e-05
       -196    27 1.691210e-04
       -195    10 6.263741e-05
       -194    24 1.503298e-04
       -193     7 4.384619e-05
       -192     5 3.131871e-05
       -191     4 2.505496e-05
       -190     6 3.758245e-05
        -96   196 1.227693e-03
        -95    73 4.572531e-04
        -94    72 4.509894e-04
        -93   106 6.639566e-04
        -92    26 1.628573e-04
        -91    33 2.067035e-04
        -90    36 2.254947e-04
        -89    16 1.002199e-04
        -88    18 1.127473e-04
        -87    35 2.192309e-04
          4 38789 2.429643e-01
          5 11661 7.304148e-02
          6 14092 8.826864e-02
          7 13424 8.408446e-02
          8 21480 1.345452e-01
          9  9370 5.869125e-02
         10  8548 5.354246e-02
         11  6647 4.163509e-02
         12  8038 5.034795e-02
         13  5267 3.299112e-02
         14  5204 3.259651e-02
         15  4284 2.683387e-02
         16 11982 7.505215e-02

Filter for results from 4 to 16. Negative sums arise when at least one of the four items holds a missing-data code (-88 or -99).

Code

dat <- dat %>% 
  filter(.$mnhlth %in% c(4:16))

tabyl(dat$mnhlth)

 dat$mnhlth     n    percent
          4 38789 0.24428476
          5 11661 0.07343846
          6 14092 0.08874838
          7 13424 0.08454146
          8 21480 0.13527641
          9  9370 0.05901024
         10  8548 0.05383346
         11  6647 0.04186137
         12  8038 0.05062159
         13  5267 0.03317043
         14  5204 0.03277367
         15  4284 0.02697971
         16 11982 0.07546005
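The negative sums seen before filtering come from adding the missing-data codes (-88 and -99) as if they were ordinary responses. An alternative, sketched here on made-up responses and not the approach used in this analysis, sets those codes to NA before summing, so that incomplete rows drop out through `complete.cases()` instead of through a range filter.

```r
# Made-up responses; rows 3 and 4 each contain one missing-data code.
items <- data.frame(
  anxious  = c(1, 4, -99, 2),
  worry    = c(1, 4, 3, -88),
  interest = c(1, 4, 2, 2),
  down     = c(1, 4, 1, 2)
)

# Turn the missing-data codes into NA.
items[items == -88 | items == -99] <- NA

# Keep only respondents who answered all four items.
items <- items[complete.cases(items), ]

# Each item runs 1 to 4, so subtracting 4 places the sum on a 0 to 12 scale.
items$mnhlth <- rowSums(items) - 4
print(items)
```

Both routes keep the same respondents, since a sum that includes a -88 or -99 can never land between 4 and 16.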

Convert to the PHQ-4 scale (0 to 12) by subtracting 4 from each score, since each of the four items is coded 1 to 4 rather than 0 to 3.

Code

dat$mnhlth <- dat$mnhlth - 4

tabyl(dat$mnhlth)

 dat$mnhlth     n    percent
          0 38789 0.24428476
          1 11661 0.07343846
          2 14092 0.08874838
          3 13424 0.08454146
          4 21480 0.13527641
          5  9370 0.05901024
          6  8548 0.05383346
          7  6647 0.04186137
          8  8038 0.05062159
          9  5267 0.03317043
         10  5204 0.03277367
         11  4284 0.02697971
         12 11982 0.07546005

Histogram of mental health index for all respondents.

Code

hist(dat$mnhlth)

Coding the mental health index to a variable “phq4”. There are four psychological distress levels: None, Mild, Moderate, and Severe.

Code

dat <- dat %>%
  mutate(phq4 = case_when(.$mnhlth %in% c(0:2) ~"1 none",
                          .$mnhlth %in% c(3:5) ~"2 mild",
                          .$mnhlth %in% c(6:8) ~"3 moderate",
                          .$mnhlth %in% c(9:12) ~"4 severe",
                          )
         )

tabyl(dat$phq4)

   dat$phq4     n   percent
     1 none 64542 0.4064716
     2 mild 44274 0.2788281
 3 moderate 23233 0.1463164
   4 severe 26737 0.1683839
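The same four bands can also be produced with base R's `cut()`. This is a small sketch on made-up index values, with the break points taken from the `case_when()` call above; the names `mnhlth_toy` and `phq4_toy` are hypothetical.

```r
# Made-up index values covering the edges of each band.
mnhlth_toy <- c(0, 2, 3, 5, 6, 8, 9, 12)

# Intervals are closed on the right: (-Inf,2], (2,5], (5,8], (8,12].
phq4_toy <- cut(mnhlth_toy,
                breaks = c(-Inf, 2, 5, 8, 12),
                labels = c("1 none", "2 mild", "3 moderate", "4 severe"))
print(table(phq4_toy))
```

Unlike `case_when()`, `cut()` returns an ordered-by-level factor, which can be convenient for tables and plots.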

Make a dummy variable for bad mental health. Bad mental health (6 to 12) is 1, good mental health (0 to 5) is 0. The threshold for this can be adjusted.

Code

dat <- dat %>%
  mutate(badmh = case_when(.$mnhlth <=5 ~ 0,
                           .$mnhlth >=6 ~ 1,
                           )
         )

tabyl(dat$badmh)

 dat$badmh      n   percent
         0 108816 0.6852997
         1  49970 0.3147003
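Since the cutoff may be adjusted, it could help to wrap the dummy in a small function so that alternative thresholds are easy to compare. A minimal sketch follows; `make_badmh()` is a hypothetical helper, and the scores are made up.

```r
# Hypothetical helper: 1 if the index is at or above the threshold, else 0.
make_badmh <- function(score, threshold = 6) {
  as.integer(score >= threshold)
}

scores_toy <- c(0, 5, 6, 12)

# Default cutoff, matching the coding above (6 to 12 is bad mental health).
badmh_default <- make_badmh(scores_toy)

# A lower cutoff, for a sensitivity check.
badmh_low <- make_badmh(scores_toy, threshold = 3)

print(badmh_default)
print(badmh_low)
```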

Clean and recode ‘rentcur’

Then I will check the distribution of ‘rentcur’ and remove observations where ‘rentcur’ is missing. In the original dataset, 1 is “caught up on rent” and 2 is “Not caught up on rent.”

Then I will create a dummy variable for ‘rentcur’ so that the value of ‘1’ means ‘caught up on rent’ and the value of ‘0’ is ‘not caught up on rent’.

Checking distribution of rentcur.

Code

tabyl(dat$rentcur)

 dat$rentcur      n      percent
         -99    300 0.0018893353
         -88    149 0.0009383699
           1 140233 0.8831572053
           2  18104 0.1140150895

Removing missing values for rentcur.

Code

dat <- dat %>% 
  filter(.$rentcur %in% c(1:2))

Checking distribution of rentcur.

Code

tabyl(dat$rentcur)

 dat$rentcur      n   percent
           1 140233 0.8856616
           2  18104 0.1143384

Code

nrow(dat)

[1] 158337

Dichotomize rentcur. “Caught up on rent” = 1 and “Not caught up on rent” = 0

Code

dat$rentcur <- car::recode(dat$rentcur, "1=1; else=0")
tabyl(dat$rentcur)

 dat$rentcur      n   percent
           0  18104 0.1143384
           1 140233 0.8856616

Clean and recode ‘evict’

Checking distribution of evict.

evict: Likelihood of eviction in the next two months. 1 is very likely, 2 is somewhat likely, 3 is not very likely, 4 is not likely at all; -88 and -99 are missing-data codes. Only asked if rentcur = 2 (not caught up on rent in the original coding).

Code

tabyl(dat$evict)

 dat$evict      n      percent
       -99    133 0.0008399805
       -88 140663 0.8883773218
         1   2489 0.0157196360
         2   4961 0.0313319060
         3   5241 0.0331002861
         4   4850 0.0306308696

Let’s see how ‘evict’ is distributed across states. What if I added more HH pulse phases?

Code

dat %>%
  #filter(.$evict %in% (1:4)) %>%
  tabyl(., state, evict)

                state -99   -88   1   2   3   4
              Alabama   1  1325  50  68  75  51
               Alaska   2  2154  37  84  86  70
              Arizona   3  3518  45 101  93  69
             Arkansas   1  1498  49  83  89  52
           California  14 15042 251 481 495 579
             Colorado   0  3551  37  78  92  61
          Connecticut   4  2234  43  81 110 109
             Delaware   4  1110  22  44  66  51
 District of Columbia   4  3281  25  72  93 122
              Florida   6  4806 118 179 183 199
              Georgia   1  2904 113 140 128  95
               Hawaii   1  1944  16  58  72  76
                Idaho   0  2317  18  58  67  40
             Illinois   3  2771  60 120 134 119
              Indiana   1  2039  31  73  96  74
                 Iowa   3  1717  16  58  82  55
               Kansas   2  2384  35  61  65  84
             Kentucky   2  1582  38  76  63  46
            Louisiana   4  1375  62 108  73  58
                Maine   0  1445  11  37  62  43
             Maryland   3  2829  73 122 139 143
        Massachusetts   8  4454  55 124 149 200
             Michigan   2  2760  53 117 132 120
            Minnesota   3  2441  25  57  67  74
          Mississippi   1   886  38  80  59  43
             Missouri   1  2148  45  65 103  54
              Montana   0  1606  18  40  44  48
             Nebraska   2  2345  31  76  66  65
               Nevada   3  2919  57 116  84  83
        New Hampshire   2  2260  21  52  71  63
           New Jersey   8  2495  43  84 120 118
           New Mexico   4  2469  44 104  90  84
             New York  10  3919  74 144 221 229
       North Carolina   1  2443  49  98  90  79
         North Dakota   2  1284  21  42  31  27
                 Ohio   3  2083  42  79  85  51
             Oklahoma   1  1830  53 108  91  62
               Oregon   0  4491  71 159 141 101
         Pennsylvania   3  3195  70 121 133 141
         Rhode Island   1  1620  25  71  71  75
       South Carolina   0  1378  36  85  62  57
         South Dakota   0  1224  13  31  49  40
            Tennessee   0  2126  51  95  95  71
                Texas  11  7476 166 325 258 257
                 Utah   1  3298  35  53  58  64
              Vermont   3  1411  13  20  44  35
             Virginia   1  3481  44 101 101 129
           Washington   3  6409  82 158 193 158
        West Virginia   0  1140  22  69  52  37
            Wisconsin   0  2214  27  67  74  60
              Wyoming   0  1032  15  38  44  29

The “risk of eviction” question is asked only of respondents who are not caught up on rent (rentcur = 2 in the original coding). Responses: 1) Very likely 2) Somewhat likely 3) Not very likely 4) Not likely at all. Should I recode this?
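One possible recode, offered only as an illustration and not a decision made in this analysis, collapses the four responses into a binary indicator: 1 for "very likely" or "somewhat likely", 0 for "not very likely" or "not likely at all", and NA for the -88 and -99 codes. The name `evict_likely` is hypothetical, and the input values are made up.

```r
# Made-up responses covering every code that appears in the table above.
evict_toy <- c(1, 2, 3, 4, -88, -99)

# 1 = eviction seen as likely, 0 = seen as unlikely, NA = not asked or not answered.
evict_likely <- ifelse(evict_toy %in% c(1, 2), 1,
                ifelse(evict_toy %in% c(3, 4), 0, NA))
print(evict_likely)
```

Keeping the original four-level variable alongside any collapsed version would leave the ordinal information available for later models.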

Check state policy variable

Code

tabyl(dat, score)

 score     n     percent
  0.00 14572 0.092031553
  0.23  3829 0.024182598
  0.30  1407 0.008886110
  0.38  4180 0.026399389
  0.53  2438 0.015397538
  0.60  2585 0.016325938
  0.63  1320 0.008336649
  0.75  1107 0.006991417
  0.78  3857 0.024359436
  0.83  5079 0.032077152
  1.03  2795 0.017652223
  1.15  2314 0.014614398
  1.23  2433 0.015365960
  1.28 11802 0.074537221
  1.33 16862 0.106494376
  1.43  2631 0.016616457
  1.58  7089 0.044771595
  1.84  2760 0.017431175
  1.98  1863 0.011766043
  2.08  2868 0.018113265
  2.29  1756 0.011090269
  2.30  1618 0.010218711
  2.48  1526 0.009637672
  2.65  1807 0.011412367
  2.73  4963 0.031344537
  2.78  1931 0.012195507
  3.00  2442 0.015422801
  3.08  3207 0.020254268
  3.13  3663 0.023134201
  3.15  2167 0.013685999
  3.25  4597 0.029033012
  3.33  3184 0.020109008
  3.38  3819 0.024119441
  3.73  2469 0.015593323
  3.75  7003 0.044228449
  3.78  2581 0.016300675
  3.80  2667 0.016843820
  3.88  1297 0.008191389
  4.03  4990 0.031515060
  4.30  3262 0.020601628
  4.63  3597 0.022717369

Clean up covariates

Clean income

income: Total household income (before taxes).
1. Less than $25,000
2. $25,000 - $34,999
3. $35,000 - $49,999
4. $50,000 - $74,999
5. $75,000 - $99,999
6. $100,000 - $149,999
7. $150,000 - $199,999
8. $200,000 and above
-99: Question seen but category not selected
-88: Missing / Did not report

Check distribution of income.

Code

tabyl(dat$income)

 dat$income     n    percent
        -99  1885 0.01190499
        -88  6001 0.03790017
          1 37223 0.23508719
          2 22912 0.14470402
          3 22682 0.14325142
          4 26647 0.16829294
          5 15730 0.09934507
          6 14209 0.08973897
          7  5618 0.03548128
          8  5430 0.03429394

Filter out rows with missing income data.

Code

dat <- dat %>% 
  filter(.$income %in% c(1:8))

tabyl(dat$income)

 dat$income     n    percent
          1 37223 0.24740946
          2 22912 0.15228879
          3 22682 0.15076005
          4 26647 0.17711414
          5 15730 0.10455231
          6 14209 0.09444271
          7  5618 0.03734106
          8  5430 0.03609148

Create a new variable, ‘inclvl’, holding the midpoint of each income bracket in dollars. The open-ended top bracket ($200,000 and above) is assigned its lower bound.

Code

dat <- dat %>%
  mutate(inclvl = case_when(.$income == '1' ~ 12500,
                            .$income == '2' ~ 30000,
                            .$income == '3' ~ 42500,
                            .$income == '4' ~ 62500,
                            .$income == '5' ~ 87500,
                            .$income == '6' ~ 125000,
                            .$income == '7' ~ 175000,
                            .$income == '8' ~ 200000,
                              )
        )

tabyl(dat$inclvl)

 dat$inclvl     n    percent
      12500 37223 0.24740946
      30000 22912 0.15228879
      42500 22682 0.15076005
      62500 26647 0.17711414
      87500 15730 0.10455231
     125000 14209 0.09444271
     175000  5618 0.03734106
     200000  5430 0.03609148
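Because the income codes run from 1 to 8 once the missing codes are filtered out, the same mapping can be written as a lookup vector indexed by the code. This is a minimal sketch on made-up codes; `midpoints`, `income_toy`, and `inclvl_toy` are hypothetical names, and the approach assumes no -88 or -99 values remain.

```r
# Bracket midpoints in dollars, positioned by income code 1 to 8.
# The open-ended top bracket uses its lower bound, as in the coding above.
midpoints <- c(12500, 30000, 42500, 62500, 87500, 125000, 175000, 200000)

# Made-up income codes.
income_toy <- c(1, 4, 8, 2)

# Indexing the lookup vector by code returns the dollar value for each row.
inclvl_toy <- midpoints[income_toy]
print(inclvl_toy)
```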

Checking sample size.

Code

nrow(dat)

[1] 150451

Histogram of income levels.

Code

hist(dat$inclvl)

Clean age

age: Derived from tbirth_year (year of birth)

Creating a new variable ‘age’ by subtracting birth year from 2021. Weeks 41 to 45 were collected in 2022 (per the file names), so ages for those respondents are understated by one year.

Code

dat$age <- 2021-dat$tbirth_year

tabyl(dat$age)

 dat$age    n      percent
      17   40 0.0002658673
      18  240 0.0015952038
      19  418 0.0027783132
      20  737 0.0048986049
      21 1068 0.0070986567
      22 1541 0.0102425374
      23 1970 0.0130939641
      24 2594 0.0172414939
      25 3060 0.0203388479
      26 3269 0.0217280045
      27 3481 0.0231371011
      28 3702 0.0246060179
      29 3768 0.0250446989
      30 3883 0.0258090674
      31 3770 0.0250579923
      32 3717 0.0247057181
      33 3680 0.0244597909
      34 3502 0.0232766814
      35 3330 0.0221334521
      36 3493 0.0232168613
      37 3404 0.0226253066
      38 3377 0.0224458462
      39 3462 0.0230108142
      40 3381 0.0224724329
      41 3319 0.0220603386
      42 3165 0.0210367495
      43 2947 0.0195877728
      44 2972 0.0197539398
      45 2772 0.0184246034
      46 2694 0.0179061621
      47 2608 0.0173345475
      48 2581 0.0171550870
      49 2623 0.0174342477
      50 2680 0.0178131086
      51 2787 0.0185243036
      52 2549 0.0169423932
      53 2403 0.0159719776
      54 2294 0.0152474892
      55 2388 0.0158722774
      56 2412 0.0160317977
      57 2405 0.0159852710
      58 2458 0.0163375451
      59 2458 0.0163375451
      60 2359 0.0156795236
      61 2403 0.0159719776
      62 2317 0.0154003629
      63 2218 0.0147423414
      64 2315 0.0153870695
      65 2126 0.0141308466
      66 2167 0.0144033606
      67 2084 0.0138516859
      68 1930 0.0128280969
      69 1761 0.0117048075
      70 1578 0.0104884647
      71 1465 0.0097373896
      72 1336 0.0088799676
      73 1286 0.0085476334
      74 1256 0.0083482330
      75  988 0.0065669221
      76  786 0.0052242923
      77  694 0.0046127975
      78  621 0.0041275897
      79  554 0.0036822620
      80  485 0.0032236409
      81  377 0.0025057992
      82  342 0.0022731653
      83  298 0.0019807113
      84  232 0.0015420303
      85  218 0.0014489767
      86  160 0.0010634692
      87  383 0.0025456793
      88  310 0.0020604715

Code

hist(dat$age)
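The weekly file names suggest that weeks 34 to 40 were collected in 2021 and weeks 41 to 45 in 2022. A week-aware age calculation could look like the following sketch on made-up rows. The column `survey_year` is a hypothetical helper, and the week-to-year cutoff is an assumption read off the file names rather than taken from the survey documentation.

```r
# Made-up respondents from early and late weeks.
toy <- data.frame(week = c(34, 40, 41, 45),
                  tbirth_year = c(1990, 1990, 1990, 2003))

# Assumed collection year: 2022 from week 41 onward, 2021 before that.
toy$survey_year <- ifelse(toy$week >= 41, 2022, 2021)

# Age as collection year minus birth year.
toy$age <- toy$survey_year - toy$tbirth_year
print(toy)
```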

Checking sample size.

Code

nrow(dat)

[1] 150451

Clean genid_describe

genid_describe: Current gender identity. 1 is Male, 2 is Female, 3 is Transgender, 4 is none of these. -99 is ‘question seen but category not selected.’

Checking the distribution of gender identity.

Code

tabyl(dat$genid_describe)

 dat$genid_describe     n     percent
                -99   805 0.005350579
                  1 54814 0.364331244
                  2 91606 0.608875980
                  3  1030 0.006846083
                  4  2196 0.014596114

Dichotomizing genid_describe.

Code

dat$male <- car::recode(dat$genid_describe, "1=1; else=0") #gen1 = male
tabyl(dat$male)

 dat$male     n   percent
        0 95637 0.6356688
        1 54814 0.3643312

Code

dat$female <- car::recode(dat$genid_describe, "2=1; else=0") #gen2 = female
tabyl(dat$female)

 dat$female     n  percent
          0 58845 0.391124
          1 91606 0.608876

Code

dat$transgender <- car::recode(dat$genid_describe, "3=1; else=0") #gen3 = transgender
tabyl(dat$transgender)

 dat$transgender      n     percent
               0 149421 0.993153917
               1   1030 0.006846083

Code

dat$gen_none <- car::recode(dat$genid_describe, "4=1; else=0") #gen4 = None of these
tabyl(dat$gen_none)

 dat$gen_none      n    percent
            0 148255 0.98540389
            1   2196 0.01459611

Code

dat$gen_notsel <- car::recode(dat$genid_describe, "-99=1; else=0") #gen5 = not selected
tabyl(dat$gen_notsel)

 dat$gen_notsel      n     percent
              0 149646 0.994649421
              1    805 0.005350579

Checking sample size.

Code

nrow(dat)

[1] 150451

Clean race_eth

race_eth: Categorical variable derived from ‘rhispanic’ and ‘rrace’. (Describe coding)

Combine race/ethnicity into a race_eth variable.

Code

tabyl(dat$rrace)

 dat$rrace      n    percent
         1 113230 0.75260384
         2  19108 0.12700481
         3   7995 0.05314023
         4  10118 0.06725113

Code

tabyl(dat$rhispanic)

 dat$rhispanic      n   percent
             1 131198 0.8720314
             2  19253 0.1279686

Code

dat <- dat %>%
  mutate(race_eth = case_when(.$rhispanic == 1 & .$rrace == 1 ~"nh_white",
                              .$rhispanic == 1 & .$rrace == 2 ~"nh_black",
                              .$rhispanic == 1 & .$rrace == 3 ~"nh_asian",
                              .$rhispanic == 1 & .$rrace == 4 ~"other",
                              .$rhispanic == 2 & .$rrace %in% c(1:4) ~ "hispanic"
                              )
        )

tabyl(dat$race_eth)

 dat$race_eth     n    percent
     hispanic 19253 0.12796857
     nh_asian  7464 0.04961084
     nh_black 17882 0.11885597
     nh_white 98094 0.65199965
        other  7758 0.05156496

Dichotomizing race and ethnicity.

Code

dat$hispanic <- car::recode(dat$race_eth, "'hispanic'=1; else=0")
tabyl(dat$hispanic)

 dat$hispanic      n   percent
            0 131198 0.8720314
            1  19253 0.1279686

Code

dat$nh_white <- car::recode(dat$race_eth, "'nh_white'=1; else=0")
tabyl(dat$nh_white)

 dat$nh_white     n   percent
            0 52357 0.3480003
            1 98094 0.6519997

Code

dat$nh_black <- car::recode(dat$race_eth, "'nh_black'=1; else=0")
tabyl(dat$nh_black)

 dat$nh_black      n  percent
            0 132569 0.881144
            1  17882 0.118856

Code

dat$nh_asian <- car::recode(dat$race_eth, "'nh_asian'=1; else=0")
tabyl(dat$nh_asian)

 dat$nh_asian      n    percent
            0 142987 0.95038916
            1   7464 0.04961084

Code

dat$other <- car::recode(dat$race_eth, "'other'=1; else=0")
tabyl(dat$other)

 dat$other      n    percent
         0 142693 0.94843504
         1   7758 0.05156496
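The five race and ethnicity dummies can also be generated in one step with base R's `model.matrix()`. The sketch below uses a made-up data frame (`race_toy` is a hypothetical name) and is an alternative to the separate recodes above, not the code used in this analysis.

```r
# Made-up categories using the same labels as race_eth.
race_toy <- data.frame(
  race_eth = c("nh_white", "hispanic", "nh_black", "other", "nh_asian", "nh_white")
)

# "- 1" drops the intercept so that every category gets its own 0/1 column.
dummies <- model.matrix(~ race_eth - 1, data = race_toy)

# Strip the "race_eth" prefix that model.matrix adds to column names.
colnames(dummies) <- sub("^race_eth", "", colnames(dummies))
print(dummies)
```

Each row has exactly one column equal to 1, which is a quick way to confirm that the categories are mutually exclusive.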

Checking sample size.

Code

nrow(dat)

[1] 150451

Clean single_adult

single_adult: Dummy variable for single adult derived from ‘thhld_numadlt’.

Dichotomizing number of adults in household. Single adult is coded as “1”, multiple adults is coded as “0”.

Code

tabyl(dat$thhld_numadlt)

 dat$thhld_numadlt     n      percent
                 1 60123 0.3996184804
                 2 65643 0.4363081668
                 3 15711 0.1044260257
                 4  5991 0.0398202737
                 5  1893 0.0125821696
                 6   582 0.0038683691
                 7   206 0.0013692166
                 8    86 0.0005716147
                 9   111 0.0007377817
                10   105 0.0006979016

Code

dat$single_adult <- car::recode(dat$thhld_numadlt, "1=1; else=0")
tabyl(dat$single_adult)

 dat$single_adult     n   percent
                0 90328 0.6003815
                1 60123 0.3996185

Checking sample size.

Code

nrow(dat)

[1] 150451

Clean thhld_numper

thhld_numper: Total number of people in household

Distribution of number of people in household. I’m leaving this as an integer.

Code

tabyl(dat$thhld_numper)

 dat$thhld_numper     n     percent
                1 48901 0.325029412
                2 49312 0.327761198
                3 22757 0.151258549
                4 15847 0.105329975
                5  7651 0.050853766
                6  3345 0.022233152
                7  1303 0.008660627
                8   653 0.004340284
                9   237 0.001575264
               10   445 0.002957774

Checking sample size.

Code

nrow(dat)

[1] 150451

Clean child

child: Dummy variable derived from ‘thhld_numkid’. The code below stores it as ‘children’.

Check distribution of ‘thhld_numkid’.

Code

tabyl(dat$thhld_numkid)

 dat$thhld_numkid      n     percent
                0 106465 0.707639032
                1  21946 0.145868090
                2  13430 0.089264943
                3   5540 0.036822620
                4   1981 0.013167078
                5   1089 0.007238237

Dichotomizing presence of children. Children in household is coded as “1”, no children is “0”.

Code

dat$children <- car::recode(dat$thhld_numkid, "0=0; else=1")
tabyl(dat$children)

 dat$children      n  percent
            0 106465 0.707639
            1  43986 0.292361

Clean eeduc

eeduc: Educational attainment. 1 is less than high school, 2 is some high school, 3 is high school graduate, 4 is some college but no degree, 5 is associate’s degree, 6 is bachelor’s degree, 7 is graduate degree.

For educational attainment, create a dummy variable for each of four categories: “less than high school” (codes 1 and 2), “high school” (code 3), “some college” (codes 4 and 5), and “bachelor’s or higher” (codes 6 and 7).

Code

tabyl(dat$eeduc)

 dat$eeduc     n    percent
         1  1530 0.01016942
         2  3427 0.02277818
         3 21504 0.14293026
         4 37741 0.25085244
         5 16029 0.10653967
         6 40556 0.26956285
         7 29664 0.19716718

Creating the four education dummy variables.

Code

dat$lessthanhs <- car::recode(dat$eeduc, "1=1; 2=1; else=0")
tabyl(dat$lessthanhs)

 dat$lessthanhs      n   percent
              0 145494 0.9670524
              1   4957 0.0329476

Code

dat$highschool <- car::recode(dat$eeduc, "3=1; else=0")
tabyl(dat$highschool)

 dat$highschool      n   percent
              0 128947 0.8570697
              1  21504 0.1429303

Code

dat$somecollege <- car::recode(dat$eeduc, "4=1; 5=1; else=0")
tabyl(dat$somecollege)

 dat$somecollege     n   percent
               0 96681 0.6426079
               1 53770 0.3573921

Code

dat$bachelors <- car::recode(dat$eeduc, "6=1; 7=1; else=0")
tabyl(dat$bachelors)

 dat$bachelors     n percent
             0 80231 0.53327
             1 70220 0.46673
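A single four-level factor carries the same information as the four dummies, and R's regression functions expand factors into dummies automatically. The following is a minimal sketch on made-up codes; `educ_labels`, `eeduc_toy`, and `educ4` are hypothetical names, and the grouping mirrors the recodes above.

```r
# Label for each eeduc code 1 to 7, in code order.
educ_labels <- c("lessthanhs", "lessthanhs", "highschool",
                 "somecollege", "somecollege", "bachelors", "bachelors")

# Made-up education codes.
eeduc_toy <- c(1, 2, 3, 4, 5, 6, 7, 6)

# Look up each code's label, then fix the level order from lowest to highest.
educ4 <- factor(educ_labels[eeduc_toy],
                levels = c("lessthanhs", "highschool", "somecollege", "bachelors"))
print(table(educ4))
```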

Checking sample size.

Code

nrow(dat)

[1] 150451

Visualizations

Table of mental health by rentcur.

Code

tabyl(dat, rentcur, mnhlth)

 rentcur     0     1     2     3     4    5    6    7    8    9   10   11   12
       0  2253   722   967  1051  2161 1047 1051  836 1158  881  895  850 2886
       1 34092 10370 12411 11728 18230 7867 7089 5494 6450 4124 4074 3238 8526

Table of phq4 by rentcur.

Code

tabyl(dat, rentcur, phq4)

 rentcur 1 none 2 mild 3 moderate 4 severe
       0   3942   4259       3045     5512
       1  56873  37825      19033    19962

Table of badmh by rentcur.

Code

tabyl(dat, rentcur, badmh)

 rentcur     0     1
       0  8201  8557
       1 94698 38995
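Raw counts are hard to compare across groups of such different sizes, so row proportions may be easier to read. The sketch below re-enters the counts from the table above by hand and applies base R's `prop.table()`; the object names are hypothetical.

```r
# Counts copied from the badmh by rentcur table above.
counts <- matrix(c(8201, 8557,
                   94698, 38995),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(rentcur = c("0", "1"), badmh = c("0", "1")))

# margin = 1 divides each cell by its row total.
row_props <- prop.table(counts, margin = 1)
print(round(row_props, 3))
```

On these counts, roughly half of renters who are behind on rent fall in the bad mental health group, compared with under 30 percent of renters who are caught up.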

Table of mental health by evict.

Code

tabyl(dat, evict, mnhlth)

 evict     0     1     2     3     4    5    6    7    8    9   10   11   12
   -99    29     8     3     8    10    4    4    4    3    3    0    1   13
   -88 34092 10370 12411 11728 18230 7867 7089 5494 6450 4124 4074 3238 8526
     1   127    56    70    92   210  106  124  107  162  151  188  171  796
     2   294   131   181   233   617  304  346  249  407  314  286  317 1066
     3   523   222   335   360   731  391  338  297  381  267  260  238  674
     4  1280   305   378   358   593  242  239  179  205  146  161  123  337

Table of phq4 by evict.

Code

tabyl(dat, evict, phq4)

 evict 1 none 2 mild 3 moderate 4 severe
   -99     40     22         11       17
   -88  56873  37825      19033    19962
     1    253    408        393     1306
     2    606   1154       1002     1983
     3   1080   1482       1016     1439
     4   1963   1193        623      767

Table of badmh by evict.

Code

tabyl(dat, evict, badmh)

 evict     0     1
   -99    62    28
   -88 94698 38995
     1   661  1699
     2  1760  2985
     3  2562  2455
     4  3156  1390

Creating a dataframe that only includes those who have a value in ‘evict’.

Code

evictdat <- dat %>%
  filter(evict %in% (1:4))

tabyl(evictdat, evict)

 evict    n   percent
     1 2360 0.1415887
     2 4745 0.2846772
     3 5017 0.3009959
     4 4546 0.2727382

Checking sample size.

Code

nrow(evictdat)

[1] 16668

Table of mental health by evict (evictdat only).

Code

tabyl(evictdat, evict, mnhlth)

 evict    0   1   2   3   4   5   6   7   8   9  10  11   12
     1  127  56  70  92 210 106 124 107 162 151 188 171  796
     2  294 131 181 233 617 304 346 249 407 314 286 317 1066
     3  523 222 335 360 731 391 338 297 381 267 260 238  674
     4 1280 305 378 358 593 242 239 179 205 146 161 123  337

Table of phq4 by evict (evictdat only).

Code

tabyl(evictdat, evict, phq4)

 evict 1 none 2 mild 3 moderate 4 severe
     1    253    408        393     1306
     2    606   1154       1002     1983
     3   1080   1482       1016     1439
     4   1963   1193        623      767

Table of badmh by evict (evictdat only).

Code

tabyl(evictdat, evict, badmh)

 evict    0    1
     1  661 1699
     2 1760 2985
     3 2562 2455
     4 3156 1390

Checking the count of observations in each state. Is this a sufficient number in each state for a multilevel model? North Dakota has 114. Vermont has 103. This is with HH Pulse phases 3.2, 3.3, and 3.4.

Code

tabyl(evictdat, state)

                state    n     percent
              Alabama  232 0.013918886
               Alaska  258 0.015478762
              Arizona  288 0.017278618
             Arkansas  262 0.015718743
           California 1723 0.103371730
             Colorado  259 0.015538757
          Connecticut  335 0.020098392
             Delaware  169 0.010139189
 District of Columbia  295 0.017698584
              Florida  637 0.038216943
              Georgia  450 0.026997840
               Hawaii  206 0.012359011
                Idaho  174 0.010439165
             Illinois  412 0.024718023
              Indiana  257 0.015418766
                 Iowa  200 0.011999040
               Kansas  236 0.014158867
             Kentucky  212 0.012718982
            Louisiana  277 0.016618671
                Maine  143 0.008579314
             Maryland  456 0.027357811
        Massachusetts  497 0.029817615
             Michigan  399 0.023938085
            Minnesota  219 0.013138949
          Mississippi  203 0.012179026
             Missouri  259 0.015538757
              Montana  143 0.008579314
             Nebraska  224 0.013438925
               Nevada  319 0.019138469
        New Hampshire  196 0.011759059
           New Jersey  346 0.020758339
           New Mexico  302 0.018118551
             New York  629 0.037736981
       North Carolina  297 0.017818575
         North Dakota  114 0.006839453
                 Ohio  247 0.014818814
             Oklahoma  305 0.018298536
               Oregon  459 0.027537797
         Pennsylvania  448 0.026877850
         Rhode Island  226 0.013558915
       South Carolina  229 0.013738901
         South Dakota  126 0.007559395
            Tennessee  298 0.017878570
                Texas  955 0.057295416
                 Utah  202 0.012119030
              Vermont  103 0.006179506
             Virginia  361 0.021658267
           Washington  567 0.034017279
        West Virginia  176 0.010559155
            Wisconsin  220 0.013198944
              Wyoming  118 0.007079434

Let's see how ‘evict’ is distributed across states. Do I need to add more HH Pulse phases?

Code

evictdat %>%
  tabyl(., state, evict)

                state   1   2   3   4
              Alabama  48  67  69  48
               Alaska  34  79  81  64
              Arizona  40  96  91  61
             Arkansas  47  80  85  50
           California 239 460 477 547
             Colorado  35  76  90  58
          Connecticut  41  81 107 106
             Delaware  20  43  59  47
 District of Columbia  23  69  92 111
              Florida 110 169 174 184
              Georgia 111 137 119  83
               Hawaii  14  55  69  68
                Idaho  18  54  66  36
             Illinois  59 112 125 116
              Indiana  30  65  91  71
                 Iowa  16  57  76  51
               Kansas  34  59  62  81
             Kentucky  37  71  61  43
            Louisiana  56 101  68  52
                Maine  10  35  60  38
             Maryland  71 118 133 134
        Massachusetts  53 112 144 188
             Michigan  51 109 125 114
            Minnesota  25  55  67  72
          Mississippi  34  72  57  40
             Missouri  42  64 102  51
              Montana  17  39  44  43
             Nebraska  28  70  65  61
               Nevada  53 113  74  79
        New Hampshire  20  48  68  60
           New Jersey  39  81 113 113
           New Mexico  41  98  85  78
             New York  71 138 209 211
       North Carolina  47  88  88  74
         North Dakota  20  40  29  25
                 Ohio  42  78  77  50
             Oklahoma  50 105  90  60
               Oregon  69 157 137  96
         Pennsylvania  67 116 130 135
         Rhode Island  22  68  66  70
       South Carolina  34  82  59  54
         South Dakota  13  31  47  35
            Tennessee  47  93  93  65
                Texas 157 310 247 241
                 Utah  34  52  55  61
              Vermont  12  19  41  31
             Virginia  40  98  98 125
           Washington  78 154 188 147
        West Virginia  22  67  51  36
            Wisconsin  26  67  71  56
              Wyoming  13  37  42  26

Variation across states

The key thing for the model is to see how much variation there is across states.
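
One common way to put a number on between-state variation is the intraclass correlation (ICC), the share of total variance in the outcome that lies between states. The sketch below estimates it from a one-way ANOVA on simulated data. The data frame `toy`, its columns, and the helper `icc_oneway()` are hypothetical stand-ins, not part of this analysis; a null random-intercept model would be the usual alternative.

```r
# One-way ANOVA estimate of the ICC, allowing unequal group sizes.
# `y` is a numeric outcome, `g` a grouping variable (e.g. state).
icc_oneway <- function(y, g) {
  g   <- factor(g)
  n_i <- as.numeric(table(g))             # group sizes
  a   <- length(n_i)                      # number of groups
  N   <- sum(n_i)
  ms  <- anova(lm(y ~ g))[["Mean Sq"]]    # c(between-group, within-group)
  n0  <- (N - sum(n_i^2) / N) / (a - 1)   # adjusted average group size
  var_between <- (ms[1] - ms[2]) / n0
  var_between / (var_between + ms[2])
}

# Simulated example (not the Pulse data): 51 groups of 100 with a modest group effect.
set.seed(1)
toy <- data.frame(state = rep(sprintf("S%02d", 1:51), each = 100))
state_effect <- rnorm(51, sd = 1)
toy$y <- 6 + state_effect[as.integer(factor(toy$state))] +
  rnorm(nrow(toy), sd = 4)

icc_est <- icc_oneway(toy$y, toy$state)
icc_est
```

An ICC near zero would suggest that states differ little on the outcome once individual-level noise is accounted for, which bears on whether a multilevel model adds much.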

Scatterplot: aggregate the values by state and plot them, with the state as the unit of analysis. The X axis is the state's score on the COVID-19 Housing Policy Scorecard. The Y axis is either the state mean of the raw HH Pulse mental health score (done first) or the percent with bad mental health in the state.

Here I create a new dataframe that takes the data in evictdat (sample of all renters who are late on rent) and groups them by state. Then I find the mean mental health number for each state. I create a dataframe that contains the state (full name, I should change to abbreviations way up top) and the mean mental health score of renters who are late on rent in each state.

Code

statemh <- evictdat %>%
  group_by(est_st) %>%
  summarize(mean_mh = mean(mnhlth), mh_n=n())

statedat <- merge(x=statemh, y=statedat, by='est_st', all.x=TRUE)

nrow(statedat)

[1] 51

Code

ggplot(statedat, aes(x=score, y=mean_mh, label=abbv)) +
    geom_point() +
    geom_text(hjust=-0.3, vjust=0.5) +
    geom_smooth(method=lm) +
    labs(title = "Mental Health of Tenants Behind on Rent by State",
          x = "State COVID Housing Policy Score",
          y = "Mean Mental Health")

`geom_smooth()` using formula = 'y ~ x'

Warning: The following aesthetics were dropped during statistical transformation: label
ℹ This can happen when ggplot fails to infer the correct grouping structure in
  the data.
ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
  variable into a factor?

Spearman rank correlation test between State COVID-19 Score and the mean mental health among renters who are late on rent.

Code

test <- cor.test(statedat$score, statedat$mean_mh, method = "spearman")

Warning in cor.test.default(statedat$score, statedat$mean_mh, method =
"spearman"): Cannot compute exact p-value with ties

Code

test


    Spearman's rank correlation rho

data:  statedat$score and statedat$mean_mh
S = 31628, p-value = 0.001586
alternative hypothesis: true rho is not equal to 0
sample estimates:
       rho 
-0.4311286
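
Spearman's rho, as computed by the call above, is the Pearson correlation of the ranks, and the printed `S` is tied to rho by S = (n^3 - n)(1 - rho) / 6. A small illustration follows; the vectors `x` and `y` are made-up numbers, while `n` and `S` are taken from the output above.

```r
# Made-up data containing a tie, mirroring the tie warning in the output above.
x <- c(0.5, 1.2, 1.2, 2.0, 3.1, 4.6)
y <- c(7.9, 8.1, 7.0, 6.4, 6.6, 5.2)

rho_spearman <- cor(x, y, method = "spearman")
rho_ranks    <- cor(rank(x), rank(y))   # Pearson correlation on (mid)ranks
all.equal(rho_spearman, rho_ranks)

# Recover rho from the S statistic reported for the 51 states.
n <- 51
S <- 31628
rho_from_S <- 1 - 6 * S / (n^3 - n)
rho_from_S
```

The last value comes out at about -0.431, in line with the printed estimate; any small difference reflects rounding of `S` in the output.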

Here I am using the percent with bad mental health instead of the mean mental health score.

Code

statebmh <- evictdat %>%
  group_by(est_st) %>%
  summarize(mean_bmh = mean(badmh), bmh_n=n())

statedat <- merge(x=statebmh, y=statedat, by='est_st', all.x=TRUE)

nrow(statedat)

[1] 51

Code

ggplot(statedat, aes(x=score, y=mean_bmh, label=abbv)) +
    geom_point() +
    geom_text(hjust=-0.3, vjust=0.5) +
    geom_smooth(method=lm) +
    labs(title = "Percent of Tenants Behind on Rent with Bad Mental Health",
          x = "State COVID Housing Policy Score",
          y = "Percent with Bad Mental Health")

`geom_smooth()` using formula = 'y ~ x'

Warning: The following aesthetics were dropped during statistical transformation: label
ℹ This can happen when ggplot fails to infer the correct grouping structure in
  the data.
ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
  variable into a factor?

Spearman rank correlation test between State COVID-19 Score and the percent with bad mental health among renters who are late on rent.

Code

test2 <- cor.test(statedat$score, statedat$mean_bmh, method = "spearman")

Warning in cor.test.default(statedat$score, statedat$mean_bmh, method =
"spearman"): Cannot compute exact p-value with ties

Code

test2


    Spearman's rank correlation rho

data:  statedat$score and statedat$mean_bmh
S = 30442, p-value = 0.006319
alternative hypothesis: true rho is not equal to 0
sample estimates:
       rho 
-0.3774811

Plot of perceived eviction risk against the mental health score (mnhlth), followed by a Spearman rank correlation between the two.

Code

ggplot(evictdat, aes(evict, mnhlth)) + 
  geom_boxplot(aes(group = evict)) +
  geom_smooth(method=lm) +
    labs(title = "Eviction Risk and Mental Health For Tenants Behind on Rent",
          x = "Risk of Eviction",
          y = "Mental Health Scores")

`geom_smooth()` using formula = 'y ~ x'

Code

evictdat$mnhlth <- as.numeric(evictdat$mnhlth)

evictdat$evict  <- as.numeric(evictdat$evict)

test3 <- cor.test(evictdat$evict, evictdat$mnhlth, method = "spearman")

Warning in cor.test.default(evictdat$evict, evictdat$mnhlth, method =
"spearman"): Cannot compute exact p-value with ties

Code

test3


    Spearman's rank correlation rho

data:  evictdat$evict and evictdat$mnhlth
S = 1.0442e+12, p-value < 2.2e-16
alternative hypothesis: true rho is not equal to 0
sample estimates:
       rho 
-0.3529651

Modelling

Plan for modelling.

Independent variable

evict: Eviction in next two months. 1 is very likely, 2 is somewhat likely, 3 is not very likely, 4 is not likely at all; -88 and -99 are the missing-data codes. Only asked if rentcur = 2.

Dependent variables

mnhlth: An index from 0 to 12 consisting of the sum of ‘anxious’, ‘worry’, ‘interest’, and ‘down’.
badmh: A dummy variable for bad mental health. Bad mental health (6 to 12) is 1, good mental health (0 to 5) is 0.

Covariates

income: Total household income (before taxes).
1. Less than $25,000
1. $25,000 - $34,999
1. $35,000 - $49,999
1. $50,000 - $74,999
1. $75,000 - $99,999
1. $100,000 - $149,999
1. $150,000 - $199,999
1. $200,000 and above
1. Question seen but category not selected
1. Missing / Did not report
age: Derived from tbirth_year (year of birth)
genid_describe: Current gender identity. 1 is Male, 2 is Female, 3 is Transgender, 4 is none of these, 88, 99.
race_eth: Categorical variable derived from ‘rhispanic’ and ‘rrace’. (Describe coding)
single_adult: Dummy variable for single adult in household vs. multiple adults. Derived from ‘thhld_numadlt’.
thhld_numper: Total number of people in household
children: Dummy variable derived from ‘thhld_numkid’
eeduc: Educational attainment. 1 is less than high school, 2 is some high school, 3 is high school graduate, 4 is some college but no degree, 5 is associate's degree, 6 is bachelor’s degree, 7 is graduate degree.

When the outcome variable is categorical but ordered, an ordered logit model is appropriate. (https://libguides.princeton.edu/R-logit)

Code

# Ordered logit model of mental health with risk of eviction.

evictdat$mnhlth <- as.factor(evictdat$mnhlth)

evictdat$evict  <- as.factor(evictdat$evict)

m1 <- polr(mnhlth ~ evict, data = evictdat, Hess = TRUE)

summary(m1)

Call:
polr(formula = mnhlth ~ evict, data = evictdat, Hess = TRUE)

Coefficients:
        Value Std. Error t value
evict2 -0.477    0.04475  -10.66
evict3 -1.033    0.04469  -23.12
evict4 -1.939    0.04716  -41.12

Intercepts:
      Value    Std. Error t value 
0|1    -3.0016   0.0440   -68.1428
1|2    -2.6484   0.0426   -62.2183
2|3    -2.2617   0.0413   -54.7769
3|4    -1.9087   0.0403   -47.3042
4|5    -1.2880   0.0391   -32.9620
5|6    -1.0105   0.0386   -26.1529
6|7    -0.7342   0.0383   -19.1817
7|8    -0.5103   0.0380   -13.4146
8|9    -0.1835   0.0378    -4.8540
9|10    0.0863   0.0378     2.2855
10|11   0.3938   0.0379    10.3855
11|12   0.7361   0.0385    19.1333

Residual Deviance: 79553.66 
AIC: 79583.66
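
As a reading aid for this output: `polr()` parameterizes the model as logit P(Y <= j) = zeta_j - eta, where zeta_j are the intercepts (cut points) and eta is the sum of the coefficients that apply to an observation. The sketch below plugs the printed estimates into that formula to get the implied probability of a score of 6 or higher (the badmh threshold) for evict = 1 and evict = 4. The values are retyped from the output above and the variable names are illustrative.

```r
zeta_5_6    <- -1.0105  # cut point between scores 5 and 6
beta_evict4 <- -1.939   # coefficient for evict = 4 (reference category: evict = 1)

# P(mnhlth >= 6) = 1 - P(mnhlth <= 5) = 1 - plogis(zeta - eta)
p_evict1 <- 1 - plogis(zeta_5_6 - 0)
p_evict4 <- 1 - plogis(zeta_5_6 - beta_evict4)

round(c(evict1 = p_evict1, evict4 = p_evict4), 3)
```

These come out near 0.73 and 0.28, broadly in line with the observed shares with badmh = 1 in the cross-tabulation (1699/2360 and 1390/4546).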

We can store the coefficients and p-values into one new dataframe.

Code

m1.coef <- data.frame(coef(summary(m1)))
m1.coef$pval = round((pnorm(abs(m1.coef$t.value), lower.tail = FALSE) * 2),2)
m1.coef

             Value Std..Error    t.value pval
evict2 -0.47697701 0.04474832 -10.659103 0.00
evict3 -1.03331623 0.04469305 -23.120290 0.00
evict4 -1.93918336 0.04716135 -41.118061 0.00
0|1    -3.00160324 0.04404869 -68.142844 0.00
1|2    -2.64841750 0.04256655 -62.218275 0.00
2|3    -2.26165435 0.04128845 -54.776932 0.00
3|4    -1.90870540 0.04034964 -47.304151 0.00
4|5    -1.28804037 0.03907654 -32.961987 0.00
5|6    -1.01053897 0.03863967 -26.152888 0.00
6|7    -0.73418093 0.03827505 -19.181709 0.00
7|8    -0.51027800 0.03803914 -13.414553 0.00
8|9    -0.18351098 0.03780615  -4.853999 0.00
9|10    0.08628544 0.03775293   2.285530 0.02
10|11   0.39384453 0.03792259  10.385486 0.00
11|12   0.73609595 0.03847198  19.133301 0.00

We can calculate the odds ratios by taking the exponential of these coefficients.

Code

cbind(Estimate=round(coef(m1),4),
      OR=round(exp(coef(m1)),4))

       Estimate     OR
evict2  -0.4770 0.6207
evict3  -1.0333 0.3558
evict4  -1.9392 0.1438
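
An odds ratio is easier to judge with an interval around it. The sketch below builds approximate 95% Wald intervals by exponentiating estimate ± z × standard error, using the values printed for m1 above. Wald intervals are an assumption made here for simplicity; calling `confint()` on the fitted model would give profile-likelihood intervals instead.

```r
# Estimates and standard errors retyped from the m1 coefficient table.
est <- c(evict2 = -0.47697701, evict3 = -1.03331623, evict4 = -1.93918336)
se  <- c(evict2 =  0.04474832, evict3 =  0.04469305, evict4 =  0.04716135)

z <- qnorm(0.975)  # about 1.96 for a 95% interval

or_table <- cbind(
  OR    = exp(est),
  lower = exp(est - z * se),
  upper = exp(est + z * se)
)
round(or_table, 4)
```

All three upper bounds sit below 1, consistent with lower odds of a higher mental health score as perceived eviction risk falls relative to evict = 1.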

Code

# Trying out covariates. Do these need to be numeric (like age) or binary, or can they be categorical (like gender or race/ethnicity)?

m2 <- polr(mnhlth ~ evict + age, data = evictdat, Hess = TRUE)

summary(m2)

Call:
polr(formula = mnhlth ~ evict + age, data = evictdat, Hess = TRUE)

Coefficients:
          Value Std. Error t value
evict2 -0.48330   0.044851  -10.78
evict3 -1.03791   0.044791  -23.17
evict4 -1.93390   0.047237  -40.94
age    -0.01387   0.001045  -13.28

Intercepts:
      Value    Std. Error t value 
0|1    -3.6425   0.0658   -55.3296
1|2    -3.2866   0.0647   -50.8356
2|3    -2.8973   0.0636   -45.5394
3|4    -2.5424   0.0629   -40.4499
4|5    -1.9179   0.0617   -31.0718
5|6    -1.6387   0.0613   -26.7280
6|7    -1.3607   0.0609   -22.3263
7|8    -1.1355   0.0607   -18.7094
8|9    -0.8068   0.0604   -13.3595
9|10   -0.5353   0.0602    -8.8880
10|11  -0.2262   0.0602    -3.7562
11|12   0.1176   0.0604     1.9464

Residual Deviance: 79376.97 
AIC: 79408.97
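
On the question in the code comment above: `polr()` takes a standard R model formula, so factor covariates are expanded into dummy variables against a reference level, just as `evict` already is. A minimal sketch on simulated data (the data frame `toy` and all of its columns are invented for illustration; this is not the Pulse data):

```r
library(MASS)  # provides polr()

set.seed(42)
n <- 600
toy <- data.frame(
  evict  = factor(sample(1:4, n, replace = TRUE)),
  gender = factor(sample(c("Male", "Female", "Transgender", "None of these"),
                         n, replace = TRUE)),
  age    = sample(18:80, n, replace = TRUE)
)

# Simulated ordered outcome with four levels, driven by evict and age.
latent <- -0.5 * as.integer(toy$evict) - 0.01 * toy$age + rlogis(n)
toy$mh <- cut(latent,
              breaks = quantile(latent, c(0, 0.25, 0.5, 0.75, 1)),
              labels = c("none", "mild", "moderate", "severe"),
              include.lowest = TRUE, ordered_result = TRUE)

# Numeric and factor covariates can be mixed in the same formula.
fit <- polr(mh ~ evict + age + gender, data = toy, Hess = TRUE)
names(coef(fit))
```

The coefficient names show one term for `age` and one term per non-reference level of each factor, so a categorical covariate such as gender or race/ethnicity can enter the model directly as a factor.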

Ordered logit for mnhlth.

Use all renters and mental health.

Principal components analysis.

Social diffusion models.

Galeen Samari immigration policy. Look at this for policy change over time, factor analysis. Event history analysis?

Renters vs. non-renters? Effect of policy? SHOULDN’T apply, but what if it does?

Renters who are current vs. renters who are behind, different policy effect? The survey DOES measure mental health of all renters.