This data was provided by the State of Connecticut’s Department of Housing. The “Affordable Housing by Town” data set is published yearly, and the information is collected from federal, state, and local programs. The goal of this data is to measure affordable housing through four types of assisted housing units: government assisted, tenant rental assisted, single family CHFA/USDA mortgages, and deed restricted. (Data Source)
This analysis focuses on Government Assisted Housing Units, Tenant Rental Assisted Housing Units, and the percent affordability measure.
#loading readr, which provides read_csv()
library(readr)
#reading in the CSV file
ct_housing_data <- read_csv("CT_affordable_housing_2011-2023.csv")
#checking for missing values
colSums(is.na(ct_housing_data)) >0
##                               Year                          Town Code
##                              FALSE                              FALSE
##                               Town                       Census Units
##                              FALSE                              FALSE
##                Government Assisted           Tenant Rental Assistance
##                               TRUE                              FALSE
## Single Family CHFA/ USDA Mortgages              Deed Restricted Units
##                              FALSE                              FALSE
##               Total Assisted Units                 Percent Affordable
##                              FALSE                              FALSE
paste("There are missing values in the 'Government Assisted' column.")
## [1] "There are missing values in the 'Government Assisted' column."
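The column name in the message above is typed by hand. As a minimal sketch, the affected columns can instead be derived from the missing-value counts. The `toy` data frame and its values are made up for illustration and only stand in for `ct_housing_data`.

```r
# Toy data frame standing in for ct_housing_data (values are made up).
toy <- data.frame(
  Year = c(2022, 2023, 2023),
  `Government Assisted` = c(10, NA, 25),
  `Tenant Rental Assistance` = c(3, 4, 5),
  check.names = FALSE
)

# Count missing values per column, then keep the names of columns with any.
na_counts <- colSums(is.na(toy))
cols_with_na <- names(na_counts[na_counts > 0])

# Build the message from the result instead of hard-coding the column name.
paste("Columns with missing values:", paste(cols_with_na, collapse = ", "))
```

Building the message from `cols_with_na` keeps the text in step with the data if a later release adds or removes missing values.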
#number of rows in the initial data set
nrow(ct_housing_data)
## [1] 2194
#number of rows in the "clean" data set
clean_housing_data <- ct_housing_data[complete.cases(ct_housing_data),]
nrow(clean_housing_data)
## [1] 2193
#number of rows omitted = original data set - clean data set
omitted_rows <- nrow(ct_housing_data) - nrow(clean_housing_data)
paste("Number of rows omitted from this data set:", omitted_rows)
## [1] "Number of rows omitted from this data set: 1"
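Before dropping incomplete rows, it can help to look at them. The sketch below keeps the dropped rows in their own object so they can be inspected. It uses a made-up `toy` data frame, and the names `incomplete`, `dropped_rows`, and `clean_toy` are illustrative only.

```r
# Toy data frame standing in for ct_housing_data (values are made up).
toy <- data.frame(
  Town = c("A", "B", "C"),
  Units = c(10, NA, 25)
)

# Flag rows that contain at least one missing value.
incomplete <- !complete.cases(toy)

# Keep the dropped rows for inspection, and the complete rows for analysis.
dropped_rows <- toy[incomplete, ]
clean_toy <- toy[!incomplete, ]

dropped_rows
```

The number of dropped rows equals the difference in row counts between the original and the clean data, which matches the subtraction used above.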
names(clean_housing_data)
## [1] "Year" "Town Code"
## [3] "Town" "Census Units"
## [5] "Government Assisted" "Tenant Rental Assistance"
## [7] "Single Family CHFA/ USDA Mortgages" "Deed Restricted Units"
## [9] "Total Assisted Units" "Percent Affordable"
#Government Assisted Units
paste("Summary Data for Government Assisted Units")
## [1] "Summary Data for Government Assisted Units"
summary(clean_housing_data$`Government Assisted`)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0 24.0 137.0 539.4 377.0 10755.0
paste("IQR for Government Assisted Units")
## [1] "IQR for Government Assisted Units"
IQR(clean_housing_data$`Government Assisted`)
## [1] 353
#Tenant Rental Assistance
paste("Summary Data for Rental Assisted Units")
## [1] "Summary Data for Rental Assisted Units"
summary(clean_housing_data$`Tenant Rental Assistance`)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 2 7 266 88 9132
paste("IQR for Rental Assisted Units")
## [1] "IQR for Rental Assisted Units"
IQR(clean_housing_data$`Tenant Rental Assistance`)
## [1] 86
#Percent Affordability
paste("Summary Data for Percent Affordable Units")
## [1] "Summary Data for Percent Affordable Units"
summary(clean_housing_data$`Percent Affordable`)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 2.170 4.220 6.196 7.730 40.760
paste("IQR for Percent Affordable Units")
## [1] "IQR for Percent Affordable Units"
IQR(clean_housing_data$`Percent Affordable`)
## [1] 5.56
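Repeating the same call for each variable makes it easy to pass the wrong column. As a sketch under the assumption that the three columns of interest are numeric, the IQR can be computed for all of them in one step. The `toy` data frame and its values are made up for illustration.

```r
# Toy data frame with the three columns of interest (values are made up).
toy <- data.frame(
  `Government Assisted` = c(0, 10, 20, 30, 40),
  `Tenant Rental Assistance` = c(1, 2, 3, 4, 5),
  `Percent Affordable` = c(1.5, 2.5, 3.5, 4.5, 5.5),
  check.names = FALSE
)

# Columns to summarise; each name is written once.
cols <- c("Government Assisted", "Tenant Rental Assistance", "Percent Affordable")

# Apply IQR() to each selected column; the result is a named numeric vector.
iqr_by_column <- sapply(toy[cols], IQR)
iqr_by_column
```

Because the result is named by column, each IQR is labelled with the variable it was computed from.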
The 2023 data is positively skewed (skewness of about 2.3), which is consistent with the mean % affordable housing being larger than the median. The percent of affordable housing is low for most towns in 2023, and a small number of high-affordability towns, such as Hartford, stretch the right tail of the distribution.
#subset for 2023
housing2023 <- subset(clean_housing_data, Year == 2023)
#skew() is not in base R; it is assumed here to come from the psych package
skew(housing2023$`Percent Affordable`)
## [1] 2.301124
hist(housing2023$`Percent Affordable`, breaks="FD", main="Distribution of % CT Affordable Housing in 2023", xlab = "Affordable Housing in CT (%)")
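To make the link between a long right tail, a positive skewness, and a mean above the median concrete, here is a minimal base R sketch. `moment_skewness()` is a hypothetical helper, not the `skew()` function used above, and it may use a slightly different normalisation, so its values need not match exactly. The `toy_percent` values are made up.

```r
# Hypothetical helper: moment-based skewness, m3 / m2^(3/2).
# This is an illustration, not the skew() function used in the analysis.
moment_skewness <- function(x) {
  x <- x[!is.na(x)]            # ignore missing values
  centered <- x - mean(x)      # deviations from the mean
  m2 <- mean(centered^2)       # second central moment
  m3 <- mean(centered^3)       # third central moment
  m3 / m2^1.5
}

# Made-up percentages: most values are low, one is very high.
toy_percent <- c(1, 2, 2, 3, 3, 4, 5, 40)

moment_skewness(toy_percent)             # positive: long right tail
mean(toy_percent) > median(toy_percent)  # the high value pulls the mean up
```

In this toy vector a single high value pulls the mean well above the median while most observations stay low, which is the same pattern the histogram shows for the 2023 towns.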
#town with highest affordability percentage
highest_affordable_percentage <- max(housing2023$`Percent Affordable`)
subset(housing2023, subset = housing2023$`Percent Affordable` == highest_affordable_percentage)
## # A tibble: 1 × 10
## Year `Town Code` Town `Census Units` `Government Assisted`
## <dbl> <dbl> <chr> <dbl> <dbl>
## 1 2023 64 Hartford 53259 10755
## # ℹ 5 more variables: `Tenant Rental Assistance` <dbl>,
## # `Single Family CHFA/ USDA Mortgages` <dbl>, `Deed Restricted Units` <dbl>,
## # `Total Assisted Units` <dbl>, `Percent Affordable` <dbl>
#town with lowest affordability percentage
lowest_affordable_percentage <- min(housing2023$`Percent Affordable`)
subset(housing2023, subset = housing2023$`Percent Affordable` == lowest_affordable_percentage)
## # A tibble: 1 × 10
## Year `Town Code` Town `Census Units` `Government Assisted`
## <dbl> <dbl> <chr> <dbl> <dbl>
## 1 2023 16 Bridgewater 863 0
## # ℹ 5 more variables: `Tenant Rental Assistance` <dbl>,
## # `Single Family CHFA/ USDA Mortgages` <dbl>, `Deed Restricted Units` <dbl>,
## # `Total Assisted Units` <dbl>, `Percent Affordable` <dbl>
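The two lookups above can also be written with `which.max()` and `which.min()`, which return the row position of the extreme value directly. This is a sketch on a made-up `toy_2023` data frame. Unlike the `==` comparison used above, these functions return only the first match when several towns tie.

```r
# Toy data frame standing in for housing2023 (values are made up).
toy_2023 <- data.frame(
  Town = c("A", "B", "C"),
  `Percent Affordable` = c(4.2, 40.8, 0),
  check.names = FALSE
)

# Row with the highest percentage (first match only if there are ties).
highest <- toy_2023[which.max(toy_2023$`Percent Affordable`), ]

# Row with the lowest percentage (first match only if there are ties).
lowest <- toy_2023[which.min(toy_2023$`Percent Affordable`), ]

highest$Town
lowest$Town
```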