Home ownership remains a key indicator of economic stability and social mobility in the United States, yet access to it is shaped by a complex interplay of socioeconomic factors. In Queens, New York—one of the most diverse urban areas in the country—understanding who owns homes and why is especially important for informing housing policy and equity initiatives. This analysis uses data from the 2023 American Community Survey (ACS) to investigate the impact of income and nativity status (foreign-born vs. U.S.-born) on the likelihood of home ownership. By applying Poisson and negative binomial regression models, the study aims to quantify how significantly earning over $100,000 per year and being foreign-born influence home ownership rates, while also accounting for statistical challenges such as overdispersion. Through a series of regression models, marginal effect estimates, and visualizations, this report offers a clearer picture of the demographic and economic dynamics influencing home ownership in Queens.
options(repos = c(CRAN = "https://cloud.r-project.org/"))
# Set working directory
directory <- "C:/Users/mikem/iCloudDrive/Spring 2025/DATA 712/Data sets"
setwd(directory)
# Set seed for reproducibility
set.seed(123)
# Load required libraries
if (!require("here")) install.packages("here", dependencies = TRUE)
install.packages(c("sf", "ggplot2", "dplyr", "clarify", "AER", "janitor", "pscl"))
library(clarify)
library(AER)
library(tidyverse)
library(pscl)
acs23 <- read.csv("2023 ACS Queens.csv", stringsAsFactors = FALSE)
For ease of coding, I simply created new vectors for the variables I would be using for this exercise.
##
## Call:
## glm(formula = Own ~ Income_over_100k, family = poisson, data = acs23)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 8.181e+00 3.852e-03 2124.1 <2e-16 ***
## Income_over_100k 8.429e-05 5.541e-07 152.1 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 70245 on 62 degrees of freedom
## Residual deviance: 48991 on 61 degrees of freedom
## AIC: 49651
##
## Number of Fisher Scoring iterations: 4
The Poisson regression model examines the relationship between the number of homeowners (Own) and the number of individuals earning over $100,000 annually (Income_over_100k) across the 63 observations in the 2023 Queens data. The results indicate a statistically significant positive association. The coefficient for Income_over_100k is 0.00008429, with a very small standard error and a z-value of 152.1, giving a p-value below 2e-16. Because the model uses a log link, each additional unit of Income_over_100k multiplies the expected count of homeowners by exp(0.00008429), an increase of about 0.008%. The per-unit effect looks small only because the predictor is measured in single units; over a difference of 1,000 units it compounds to roughly 8.8%. Including the income variable reduces the deviance from 70,245 to 48,991, although the residual deviance remains very large relative to its 61 degrees of freedom, an early sign that the Poisson variance assumption may not hold. Overall, the data support the conclusion that a larger number of individuals earning over $100,000 is positively related to home ownership in Queens.
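To make the log-link arithmetic concrete, here is a minimal sketch that converts the reported coefficients into multiplicative changes in the expected count. The coefficient values are copied from the output above; the helper `rate_ratio()` is a hypothetical name introduced only for illustration, and the sketch assumes Income_over_100k is a count per observation, which the size of the coefficient suggests but the report does not state outright.

```r
# Coefficients copied from the Poisson output above
b0 <- 8.181        # intercept
b1 <- 8.429e-05    # Income_over_100k

# Expected count when the predictor equals zero
baseline <- exp(b0)

# Hypothetical helper: multiplicative change in the expected count
# for `delta` additional units of the predictor
rate_ratio <- function(beta, delta = 1) exp(beta * delta)

pct_per_1    <- 100 * (rate_ratio(b1, 1) - 1)
pct_per_1000 <- 100 * (rate_ratio(b1, 1000) - 1)

cat(sprintf("Baseline expected count: %.0f\n", baseline))
cat(sprintf("Change per 1 additional unit: %.4f%%\n", pct_per_1))
cat(sprintf("Change per 1,000 additional units: %.1f%%\n", pct_per_1000))
```

Reporting the effect per 1,000 units rather than per single unit changes nothing in the model; it only makes the size of the association easier to read.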
library(clarify)
# Simulate coefficients
sim_coefs1 <- sim(m1)
# Simulate average marginal effects
sim_est1 <- sim_ame(sim_coefs1, var = "Income_over_100k", contrast = "rd")
# Summarize the results
summary(sim_est1)
## Estimate 2.5 % 97.5 %
## E[dY/d(Income_over_100k)] 0.497 0.491 0.504
The marginal effects analysis gives a more direct reading of how income relates to home ownership in Queens in 2023. Using the clarify package, the average marginal effect of Income_over_100k on Own was estimated at approximately 0.497. Because Own is a count and Income_over_100k enters the model as a continuous variable (the output is labeled E[dY/d(Income_over_100k)]), this is a slope on the count scale rather than a change in probability: on average, one additional individual earning over $100,000 is associated with about 0.5 additional homeowners. This model contains no other covariates, so nothing is being held constant. The 95% confidence interval ranges from 0.491 to 0.504, although that precision rests on the Poisson variance assumption. The result is consistent with the earlier Poisson regression: income is a strong predictor of home ownership in Queens.
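For a continuous predictor in a log-link model, the average marginal effect has a simple closed form: the coefficient times the average predicted count. The sketch below checks that identity against a numerical derivative. It uses synthetic data rather than the ACS file, and the names `toy`, `owners`, and `high_income` are assumptions made for illustration; it is not the simulation-based procedure that clarify runs.

```r
set.seed(123)

# Synthetic stand-in data (NOT the ACS file): 63 rows of counts
n <- 63
high_income <- round(runif(n, 500, 12000))
owners <- rpois(n, lambda = exp(8 + 8e-05 * high_income))
toy <- data.frame(owners, high_income)

fit <- glm(owners ~ high_income, family = poisson, data = toy)

# Analytic AME under a log link: dE[Y]/dx = beta * mu, averaged over rows
ame_analytic <- unname(coef(fit)["high_income"]) * mean(fitted(fit))

# Numerical check: nudge the predictor by a small step and
# average the resulting change in predicted counts
h <- 1e-3
bumped <- transform(toy, high_income = high_income + h)
ame_numeric <- mean(predict(fit, bumped, type = "response") -
                    predict(fit, toy, type = "response")) / h

c(analytic = ame_analytic, numeric = ame_numeric)
```

The same identity offers a rough consistency check on the reported figures: 0.497 / 8.429e-05 is about 5,896, which would be the implied average predicted count per observation. That is a back-of-envelope inference, not a number reported in the output.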
##
## Call:
## glm(formula = Own ~ Income_over_100k + Foreign_born, family = poisson,
## data = acs23)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 8.159e+00 3.917e-03 2083.12 <2e-16 ***
## Income_over_100k 6.806e-05 7.069e-07 96.28 <2e-16 ***
## Foreign_born 6.368e-06 1.668e-07 38.17 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 70245 on 62 degrees of freedom
## Residual deviance: 47562 on 60 degrees of freedom
## AIC: 48223
##
## Number of Fisher Scoring iterations: 4
In this updated Poisson regression model, the variable Foreign_born was added to assess whether the size of the foreign-born population is related to home ownership in Queens in 2023, alongside income. Both predictors are statistically significant. Income_over_100k continues to show a strong positive association with home ownership, with a coefficient of 0.00006806 and a highly significant p-value (<2e-16). Foreign_born is also positively associated with home ownership, with a coefficient of 0.000006368 and a similarly significant p-value. The per-unit coefficient for Foreign_born is roughly one-tenth of the income coefficient, and it indicates that, holding income constant, observations with more foreign-born individuals have higher expected counts of homeowners. The residual deviance decreased from 48,991 to 47,562 and the AIC dropped from 49,651 to 48,223, indicating an improved fit with the inclusion of nativity status. Overall, this model suggests that both higher income and a larger foreign-born population are associated with more home ownership in Queens.
library(clarify)
# Simulate coefficients
sim_coefs2 <- sim(m2)
# Simulate average marginal effects
sim_est2 <- sim_ame(sim_coefs2, var = "Income_over_100k", contrast = "rd")
# Summarize the results
summary(sim_est2)
## Estimate 2.5 % 97.5 %
## E[dY/d(Income_over_100k)] 0.402 0.393 0.410
After incorporating nativity status (Foreign_born) into the regression model, the average marginal effect of Income_over_100k on home ownership in Queens was re-estimated using simulated coefficients. The new marginal effect is approximately 0.402, with a 95% confidence interval ranging from 0.393 to 0.410. Holding the foreign-born count constant, one additional individual earning over $100,000 is associated with about 0.4 additional homeowners on average. Compared with the previous model without nativity status (an effect of about 0.497), the income effect has decreased by roughly a fifth, suggesting that part of the income effect was confounded with nativity status. Nonetheless, income remains a strong and statistically significant predictor of home ownership. Including nativity gives a more complete picture of the socioeconomic factors associated with home ownership in Queens.
## Estimate 2.5 % 97.5 %
## E[dY/d(Foreign_born)] 0.0376 0.0356 0.0396
The marginal effect of Foreign_born on home ownership in Queens was also estimated using simulated coefficients from the adjusted Poisson regression model. The average marginal effect is approximately 0.0376, with a 95% confidence interval ranging from 0.0356 to 0.0396. Holding income constant, one additional foreign-born individual is associated with about 0.038 additional homeowners on average, or roughly 3.8 homeowners per 100 foreign-born individuals. While the effect is modest in size, it is statistically significant under the Poisson standard errors. This finding adds nuance to the broader picture, indicating that nativity, though a weaker predictor than income, is still associated with patterns of home ownership in Queens.
The plot generated from the sim_adrf() function illustrates the average
dose-response function (ADRF) for the Foreign_born variable, showing how
the expected number of homeowners changes as the number of foreign-born
individuals increases. The graph reveals a clear and nearly linear
positive relationship: as the number of foreign-born individuals rises,
the expected count of homeowners also increases steadily. The shaded
region around the line represents the 95% confidence interval,
indicating the model’s uncertainty, which remains relatively
narrow—suggesting strong precision in the estimates. This visual
reinforces the earlier marginal effects finding by showing that
foreign-born individuals contribute positively and consistently to home
ownership levels in Queens, emphasizing their important role in the
borough’s housing landscape.
This plot represents the average marginal effect function (AMEF) for the
Foreign_born variable, showing how the marginal effect of being
foreign-born on home ownership varies across different values of the
foreign-born population. The y-axis reflects the estimated marginal
change in home ownership for each additional foreign-born individual,
while the x-axis shows the number of foreign-born individuals in the
population. The plot reveals a positive and slightly increasing trend.
Under a log-link model this shape is expected: the marginal effect
equals the coefficient multiplied by the predicted count, so it rises
whenever the predicted count rises. The shaded area indicates the 95%
confidence interval, which remains relatively tight under the Poisson
standard errors. The upward slope should therefore be read as a
property of the model's functional form rather than as evidence that
the influence of nativity status strengthens as the foreign-born
population grows.
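The link between the two plots can be written out directly. The derivation below is a sketch under the assumption that m2 has the usual log-link form; the symbols $z_i$ (the observed value of Income_over_100k for observation $i$), $x$ (the value at which Foreign_born is set), and $n$ (the number of observations) are notation introduced here, not taken from the output.

```latex
\begin{align*}
\mu_i(x) &= \exp\left(\beta_0 + \beta_1 z_i + \beta_2 x\right) \\
\mathrm{ADRF}(x) &= \frac{1}{n}\sum_{i=1}^{n} \mu_i(x) \\
\mathrm{AMEF}(x) &= \frac{d}{dx}\,\mathrm{ADRF}(x)
  = \frac{1}{n}\sum_{i=1}^{n} \beta_2\,\mu_i(x)
  = \beta_2\,\mathrm{ADRF}(x)
\end{align*}
```

With $\beta_2 > 0$, the ADRF is increasing and convex in $x$, so the AMEF must also increase in $x$. When $\beta_2$ is very small, as it is here, both curves look nearly linear over the observed range.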
##
## Overdispersion test
##
## data: m2
## z = 6.5765, p-value = 2.408e-11
## alternative hypothesis: true dispersion is greater than 1
## sample estimates:
## dispersion
## 679.9466
The results of the overdispersion test for the Poisson regression model (m2) reveal strong evidence of overdispersion in the data. The test produced a z-score of 6.5765 with a highly significant p-value of 2.408e-11, indicating that the variance in the outcome variable (home ownership) is significantly greater than what the Poisson model assumes. The estimated dispersion parameter is approximately 680, far exceeding the ideal value of 1 expected under a true Poisson distribution. This suggests that the model’s assumption of equal mean and variance does not hold, and the standard errors produced by the Poisson model are likely underestimated. As a result, statistical inferences based on this model—such as p-values and confidence intervals—may be unreliable. Given this finding, it would be more appropriate to use a negative binomial regression model, which adjusts for overdispersion and yields more robust and trustworthy results.
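A quick way to see what a dispersion estimate of this size implies is to compute a Pearson-based dispersion by hand and rescale the Poisson standard errors. The sketch below does this on synthetic negative binomial counts, not the ACS data. The Pearson statistic is a related but different estimator from the one reported by the overdispersion test above, so the two numbers should not be expected to match exactly.

```r
set.seed(123)

# Synthetic overdispersed counts (NOT the ACS data): negative binomial draws
n <- 63
x <- round(runif(n, 500, 12000))
mu <- exp(8 + 8e-05 * x)
y <- rnbinom(n, size = 6, mu = mu)   # variance = mu + mu^2 / 6

fit_pois <- glm(y ~ x, family = poisson)

# Pearson dispersion: sum of squared Pearson residuals over residual df
pearson_disp <- sum(residuals(fit_pois, type = "pearson")^2) /
  df.residual(fit_pois)

# Quasi-Poisson style correction: Poisson SEs multiplied by sqrt(dispersion)
se_pois   <- sqrt(diag(vcov(fit_pois)))
se_scaled <- se_pois * sqrt(pearson_disp)

pearson_disp
rbind(poisson = se_pois, scaled = se_scaled)
```

By the same logic, a dispersion near 680 would inflate the Poisson standard errors by a factor of about 26 (the square root of 680), which is enough to change which coefficients look significant.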
##
## Call:
## MASS::glm.nb(formula = Own ~ Income_over_100k + Foreign_born,
## data = acs23, init.theta = 6.195295689, link = log)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 8.066e+00 1.221e-01 66.060 < 2e-16 ***
## Income_over_100k 8.508e-05 2.471e-05 3.443 0.000576 ***
## Foreign_born 5.949e-06 5.663e-06 1.051 0.293459
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for Negative Binomial(6.1953) family taken to be 1)
##
## Null deviance: 90.030 on 62 degrees of freedom
## Residual deviance: 64.716 on 60 degrees of freedom
## AIC: 1154.9
##
## Number of Fisher Scoring iterations: 1
##
##
## Theta: 6.20
## Std. Err.: 1.08
##
## 2 x log-likelihood: -1146.878
This output presents the results of a negative binomial regression model (m3) used to address the overdispersion found in the earlier Poisson model predicting home ownership. The model includes the same two predictors: Income_over_100k and Foreign_born. The coefficient for Income_over_100k remains statistically significant (p = 0.000576), indicating that a larger number of individuals earning over $100,000 is still associated with more homeowners after accounting for overdispersion. The coefficient for Foreign_born, however, is no longer statistically significant (p = 0.293459). The point estimates barely move (5.949e-06 versus 6.368e-06 for Foreign_born), but the standard errors are roughly 34 to 35 times larger, so the earlier significance of nativity status reflected understated Poisson standard errors rather than a robust effect. Theta is estimated at 6.20 (standard error 1.08); in this parameterization the variance is mu + mu^2/theta, so a theta this small relative to the size of the counts implies variance far above the Poisson level. The AIC falls from 48,223 to 1,154.9, indicating a far better fit. Overall, the negative binomial model provides a more reliable estimation framework, and the results reinforce the importance of income in determining home ownership while casting doubt on the independent effect of nativity status.
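The pattern described here, where the coefficients stay similar but the standard errors grow and the AIC drops, can be reproduced on simulated data. The following sketch fits both models to synthetic overdispersed counts; the data frame `toy` and its columns are assumptions for illustration and do not come from the ACS file.

```r
library(MASS)  # provides glm.nb(); a recommended package bundled with most R installs

set.seed(123)

# Synthetic overdispersed counts (NOT the ACS data)
n <- 63
x <- round(runif(n, 500, 12000))
y <- rnbinom(n, size = 6, mu = exp(8 + 8e-05 * x))
toy <- data.frame(y, x)

fit_pois <- glm(y ~ x, family = poisson, data = toy)
fit_nb   <- glm.nb(y ~ x, data = toy)

# Slope, slope standard error, and AIC for each model, side by side
comparison <- data.frame(
  model    = c("poisson", "negbin"),
  slope    = unname(c(coef(fit_pois)["x"], coef(fit_nb)["x"])),
  slope_se = unname(c(sqrt(diag(vcov(fit_pois)))["x"],
                      sqrt(diag(vcov(fit_nb)))["x"])),
  aic      = c(AIC(fit_pois), AIC(fit_nb))
)

comparison
fit_nb$theta   # estimated theta; smaller values mean more overdispersion
```

When the counts are overdispersed, the Poisson fit reports much smaller standard errors than the data can support, which is why significance can disappear after switching models.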
sim_coefs3 <- sim(m3)
sim_est3 <- sim_ame(sim_coefs3, var = "Income_over_100k",
contrast = "rd")
summary(sim_est3)
## Estimate 2.5 % 97.5 %
## E[dY/d(Income_over_100k)] 0.506 0.223 0.827
The marginal effects output from the negative binomial regression model provides a more robust estimate of the relationship between income and home ownership after correcting for overdispersion. The average marginal effect of Income_over_100k is approximately 0.506, with a 95% confidence interval ranging from 0.223 to 0.827. On average, one additional individual earning over $100,000 is associated with about 0.5 additional homeowners, holding the foreign-born count constant. The confidence interval is far wider than in the Poisson models (a width of about 0.60 versus 0.017), reflecting the larger and more realistic standard errors, but it excludes zero, so the effect remains substantively large and statistically significant. These results reinforce the conclusion that income is a powerful predictor of home ownership in Queens, and they support using a negative binomial model for inference in the presence of overdispersion.
## Estimate 2.5 % 97.5 %
## E[dY/d(Foreign_born)] 0.0354 -0.0260 0.1054
The marginal effect of Foreign_born on home ownership, as estimated from the negative binomial regression model, is approximately 0.0354, or about 3.5 additional homeowners per 100 additional foreign-born individuals. However, the 95% confidence interval ranges from -0.0260 to 0.1054, which includes zero. The effect is therefore not statistically significant, and we cannot confidently conclude that the size of the foreign-born population has a meaningful association with home ownership once overdispersion is accounted for. These findings suggest that the association between nativity status and home ownership in the earlier models was overstated, and that income remains the more consistent and significant predictor.
This plot illustrates the average dose-response function (ADRF) for the
Foreign_born variable based on the negative binomial model. The graph
depicts the expected number of homeowners (E[Y(Foreign_born)]) as the
number of foreign-born individuals increases. While the central trend
line shows a slight upward trajectory, indicating a potential positive
relationship between the foreign-born population and home ownership, the
wide and expanding confidence intervals (shaded area) reflect
substantial uncertainty in the estimates—especially at higher levels of
the foreign-born population. This visual reinforces the earlier marginal
effect findings: although there may be a weak positive trend, the
relationship between being foreign-born and home ownership is not
statistically robust once overdispersion is addressed. The growing
uncertainty at higher population levels also suggests that predictions
become less reliable as the number of foreign-born individuals
increases.
This plot shows the average marginal effect function (AMEF) for the
Foreign_born variable using the negative binomial regression model. The
y-axis represents the marginal effect of being foreign-born on home
ownership, while the x-axis reflects the number of foreign-born
individuals. The central line shows a slight upward trend, suggesting
that the marginal effect of being foreign-born may increase slightly as
the foreign-born population grows. However, the shaded confidence band
is wide and overlaps with zero across much of the range, indicating a
high degree of uncertainty and lack of statistical significance. This
reinforces the earlier findings that, although there may be a minor
positive trend, the effect of being foreign-born on home ownership is
not robust and should be interpreted with caution. In essence, the
influence of nativity status remains weak and uncertain after accounting
for overdispersion in the data.
This study explored the relationship between income, nativity status, and home ownership in Queens using American Community Survey data from 2023. Initial Poisson regression models suggested that the number of individuals earning over $100,000 per year was a strong and statistically significant predictor of home ownership, with marginal effects indicating about 0.5 additional homeowners for each additional high earner. Adding nativity status to the model showed that a larger foreign-born population was also associated with a modest increase in home ownership. However, further testing revealed significant overdispersion in the data, indicating that the Poisson model assumptions were violated. This led to the use of a more appropriate negative binomial regression model to obtain more reliable estimates.
Once overdispersion was accounted for, income remained a significant and meaningful predictor of home ownership, with a marginal effect of similar size but a much wider confidence interval. In contrast, the effect of the foreign-born population was no longer statistically significant, and its marginal effect estimate had a confidence interval that included zero. Visualization of the dose-response and marginal effect functions reinforced this uncertainty: foreign-born status shows a slight positive trend, but it lacks statistical robustness when the variance is properly modeled. Overall, the findings underscore the central role of income in shaping home ownership in Queens, while suggesting that nativity status has, at best, a small and uncertain influence once the model's variance assumptions are properly addressed.