Introduction

This data was provided by the State of Connecticut’s Department of Housing. The “Affordable Housing by Town” data set is published yearly, and the information is collected from federal, state, and local programs. The data set measures affordable housing through four types of assisted housing units: government assisted, tenant rental assistance, single family CHFA/USDA mortgages, and deed restricted. (Data Source)

This analysis focuses on Government Assisted Housing Units, Tenant Rental Assisted Housing Units, and the percent affordability measure.

#loading readr, which provides read_csv()
library(readr)
#reading in the CSV file
ct_housing_data <- read_csv("CT_affordable_housing_2011-2023.csv")

Data Exploration (Missing data, number of observations, etc.)

#checking for missing values
colSums(is.na(ct_housing_data)) > 0
##                               Year                          Town Code 
##                              FALSE                              FALSE 
##                               Town                       Census Units 
##                              FALSE                              FALSE 
##                Government Assisted           Tenant Rental Assistance 
##                               TRUE                              FALSE 
## Single Family CHFA/ USDA Mortgages              Deed Restricted Units 
##                              FALSE                              FALSE 
##               Total Assisted Units                 Percent Affordable 
##                              FALSE                              FALSE
paste("There are missing values in the 'Government Assisted' column.")
## [1] "There are missing values in the 'Government Assisted' column."
#number of rows in the initial data set
nrow(ct_housing_data)
## [1] 2194
#number of rows in the "clean" data set
clean_housing_data <- ct_housing_data[complete.cases(ct_housing_data),]
nrow(clean_housing_data)
## [1] 2193
#number of rows omitted = original data set - clean data set
omitted_rows <- nrow(ct_housing_data) - nrow(clean_housing_data)

paste("Number of rows omitted from this data set:", omitted_rows)
## [1] "Number of rows omitted from this data set: 1"
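
To see which observation is dropped, and not only how many, the incomplete rows can be listed directly. The following is a minimal sketch on a small made-up data frame: `toy` and its values are hypothetical and are not taken from the data set. With the real data, `ct_housing_data` would take the place of `toy`.

```r
# Hypothetical stand-in for ct_housing_data; all values are invented
toy <- data.frame(
  Year = c(2011, 2011, 2012),
  Town = c("A", "B", "A"),
  `Government Assisted` = c(10, NA, 12),
  check.names = FALSE
)

# TRUE for every row that contains at least one NA
incomplete <- !complete.cases(toy)

# The rows that filtering on complete.cases() would drop
missing_rows <- toy[incomplete, ]
print(missing_rows)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)
```

Inspecting the dropped row before removing it helps confirm whether the gap is a reporting issue for a single town and year or something more systematic.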

Cumulative Summary Statistics, 2011-2023: State-wide

names(clean_housing_data)
##  [1] "Year"                               "Town Code"                         
##  [3] "Town"                               "Census Units"                      
##  [5] "Government Assisted"                "Tenant Rental Assistance"          
##  [7] "Single Family CHFA/ USDA Mortgages" "Deed Restricted Units"             
##  [9] "Total Assisted Units"               "Percent Affordable"
#Government Assisted Units
paste("Summary Data for Government Assisted Units")
## [1] "Summary Data for Government Assisted Units"
summary(clean_housing_data$`Government Assisted`)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0    24.0   137.0   539.4   377.0 10755.0
paste("IQR for Government Assisted Units")
## [1] "IQR for Government Assisted Units"
IQR(clean_housing_data$`Government Assisted`)
## [1] 353
#Tenant Rental Assistance
paste("Summary Data for Rental Assisted Units")
## [1] "Summary Data for Rental Assisted Units"
summary(clean_housing_data$`Tenant Rental Assistance`)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0       2       7     266      88    9132
paste("IQR for Rental Assisted Units")
## [1] "IQR for Rental Assisted Units"
IQR(clean_housing_data$`Tenant Rental Assistance`)
## [1] 86
#Percent Affordability
paste("Summary Data for Percent Affordable Units")
## [1] "Summary Data for Percent Affordable Units"
summary(clean_housing_data$`Percent Affordable`)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   2.170   4.220   6.196   7.730  40.760
paste("IQR for Percent Affordable Units")
## [1] "IQR for Percent Affordable Units"
IQR(clean_housing_data$`Percent Affordable`)
## [1] 5.56
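
Because the IQR is the third quartile minus the first quartile, it can be computed for several columns in one call, so that each column name is typed only once. The following is a minimal sketch: `toy` is a hypothetical five-row data frame whose values were chosen only so that its quartiles are easy to read off, and it is not the real data.

```r
# Hypothetical stand-in for clean_housing_data; all values are invented
toy <- data.frame(
  `Government Assisted` = c(0, 24, 137, 377, 1000),
  `Tenant Rental Assistance` = c(0, 2, 7, 88, 500),
  `Percent Affordable` = c(0.5, 2.17, 4.22, 7.73, 30),
  check.names = FALSE
)

# Apply IQR() to each column; the result is a named numeric vector
iqr_by_column <- sapply(toy, IQR)
print(iqr_by_column)

# Cross-check one column by hand: third quartile minus first quartile
q <- quantile(toy$`Tenant Rental Assistance`, probs = c(0.25, 0.75))
manual_iqr <- unname(q[2] - q[1])
print(manual_iqr)
```

With the real data, `sapply()` could be applied to the three columns of interest selected from `clean_housing_data`.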

Affordable Housing in 2023:

The 2023 distribution of percent affordable housing is positively skewed (skewness of about 2.30), which is consistent with a mean that lies above the median. Most towns have a low percentage of affordable housing, while a small number of high-affordability towns stretch the right tail of the distribution.

#subset for 2023
housing2023 <- subset(clean_housing_data, Year == 2023)
#skew() is assumed here to come from the psych package
library(psych)
skew(housing2023$`Percent Affordable`)
## [1] 2.301124
hist(housing2023$`Percent Affordable`, breaks="FD", main="Distribution of % CT Affordable Housing in 2023", xlab = "Affordable Housing in CT (%)")
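
A positive skewness value means the right tail is longer than the left, which typically pulls the mean above the median. The sketch below computes a moment-based skewness by hand on a hypothetical sample: `x` and the helper `sample_skewness()` are invented for illustration. Packaged skewness functions may apply a different normalization, so their values need not match this one exactly.

```r
# Hypothetical right-skewed sample; values are invented for illustration
x <- c(1, 2, 2, 3, 3, 4, 5, 8, 15, 40)

# Moment-based skewness: third central moment divided by the
# second central moment raised to the power 1.5
sample_skewness <- function(v) {
  d <- v - mean(v)
  m2 <- mean(d^2)
  m3 <- mean(d^3)
  m3 / m2^1.5
}

g1 <- sample_skewness(x)
print(g1)

# In a right-skewed sample the few large values pull the mean upward
print(c(mean = mean(x), median = median(x)))
```

Comparing the mean and median alongside the skewness value gives a quick check that the direction of the skew matches the histogram.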

Investigating Percent Affordable Housing by Town: 2023

Which town is the most affordable to live in?

#town with highest affordability percentage
highest_affordable_percentage <- max(housing2023$`Percent Affordable`)
subset(housing2023, subset = housing2023$`Percent Affordable` == highest_affordable_percentage)
## # A tibble: 1 × 10
##    Year `Town Code` Town     `Census Units` `Government Assisted`
##   <dbl>       <dbl> <chr>             <dbl>                 <dbl>
## 1  2023          64 Hartford          53259                 10755
## # ℹ 5 more variables: `Tenant Rental Assistance` <dbl>,
## #   `Single Family CHFA/ USDA Mortgages` <dbl>, `Deed Restricted Units` <dbl>,
## #   `Total Assisted Units` <dbl>, `Percent Affordable` <dbl>

Which town is the least affordable to live in?

#town with lowest affordability percentage
lowest_affordable_percentage <- min(housing2023$`Percent Affordable`)
subset(housing2023, subset = housing2023$`Percent Affordable` == lowest_affordable_percentage)
## # A tibble: 1 × 10
##    Year `Town Code` Town        `Census Units` `Government Assisted`
##   <dbl>       <dbl> <chr>                <dbl>                 <dbl>
## 1  2023          16 Bridgewater            863                     0
## # ℹ 5 more variables: `Tenant Rental Assistance` <dbl>,
## #   `Single Family CHFA/ USDA Mortgages` <dbl>, `Deed Restricted Units` <dbl>,
## #   `Total Assisted Units` <dbl>, `Percent Affordable` <dbl>
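
Filtering on the maximum or minimum returns every row that matches, so ties are possible. The following is a minimal sketch of ranking towns and keeping all ties, using a hypothetical data frame: `toy`, its town names, and its values are invented and are not taken from the data set.

```r
# Hypothetical stand-in for housing2023; towns and values are invented
toy <- data.frame(
  Town = c("A", "B", "C", "D"),
  `Percent Affordable` = c(12.5, 0, 3.2, 0),
  check.names = FALSE
)

# Sort from the highest to the lowest share of affordable units
ranked <- toy[order(-toy$`Percent Affordable`), ]
print(ranked)

# Town with the highest share
top_town <- head(ranked, 1)

# Every town tied at the minimum, not only the first one found
lowest <- min(toy$`Percent Affordable`)
bottom_towns <- toy[toy$`Percent Affordable` == lowest, ]
print(bottom_towns)
```

Checking the number of rows returned for the minimum is a quick way to see whether more than one town shares the lowest value.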