Principal Components Analysis of New Mexico County-Level Poverty and Housing Characteristics

Author

Dennis Baidoo

Published

June 20, 2025

Maps: Labels are easier to read on the left, but road features on the right make the counties easier to place.

Abstract

This study uses Principal Components Analysis (PCA) to explore poverty and housing characteristics across counties in New Mexico. The original dataset includes 13 variables related to vacancy rates, housing conditions, infrastructure access, and poverty levels, of which 10 entered the PCA. PCA was applied to reduce the dimensionality of the dataset while preserving its key patterns. The first three principal components explained about 65% of the total variance; the 75% target is reached only with a fourth (74.5%) or fifth (81.8%) component. The three retained components are interpreted as overall deprivation, housing structure and utilities, and rental market dynamics. The results offer a clearer understanding of how living conditions vary across counties and provide insights that can help guide policy decisions and resource allocation.

Introduction

Socioeconomic disparities across geographic regions are complex and multidimensional, often involving interrelated factors such as housing conditions, infrastructure access, and poverty levels. In New Mexico, understanding these disparities at the county level is essential for policymakers, researchers, and community organizations seeking to design effective interventions and allocate resources equitably.

This study uses Principal Components Analysis (PCA) to explore patterns in county-level housing and poverty data across the state. The dataset includes 13 variables capturing aspects such as homeowner and rental vacancy rates, occupancy status, primary heating sources, infrastructure deficits (e.g., lack of plumbing or phone service), housing cost burdens, and multiple poverty indicators.

Given the high dimensionality and potential multicollinearity among these variables, PCA provides a powerful technique for reducing complexity while preserving the underlying structure of the data. By transforming correlated variables into a smaller set of uncorrelated components, we aim to identify the dominant patterns that explain most of the variation across counties. This dimensionality reduction not only simplifies visualization and interpretation but also highlights the most influential socioeconomic factors affecting living conditions in New Mexico.

Studies have applied PCA to create asset indices and assess socioeconomic status in Albania (Xhafaj & Nurja, 2015) and to develop a Social Vulnerability Index at the municipal level in Mexico (Avila-Vera et al., 2020). In Honduras, PCA revealed that poverty is characterized not only by economic factors but also by human capital formation, demographic characteristics, labor market conditions, and household living conditions (Ruano, 2015).

Beyond poverty analysis, PCA has been employed to examine factors influencing child welfare service dispositions in New Mexico, where both case-level and county-level characteristics, including transportation and housing, were found to affect outcomes (Barboza-Salerno et al., 2025). These studies demonstrate the versatility of PCA in identifying key factors contributing to poverty and social vulnerability, potentially informing targeted public policy interventions. Ultimately, this analysis serves as a foundation for more targeted research and policy development aimed at addressing structural inequalities and improving quality of life across the state’s diverse counties.

Here is a description of the codebook for this data.

NM county-level poverty data from S16 student:
Nathan Dobie, Student Technical Specialist, Bureau of Business Economic Research, UNM
Thanks, Nathan!

Data combined from:
http://bber.unm.edu/county-profiles                                        (poverty)
http://factfinder.census.gov/bkmk/table/1.0/en/ACS/14_5YR/DP04/0400000US35 (other values)
http://www2.census.gov/geo/docs/reference/codes/files/national_county.txt  (county names)

DATA COLUMNS:
 1 area
 2 county
 3 periodyear (2014)
   -Vacancy Status %
 4   Homeowner vacancy rate
 5   Rental vacancy rate
   -Occupancy Status %
 6   Owner-occupied
 7   Renter-occupied
   -Main source of heating (% of homes)
 8   Utility gas
 9   Electricity
10   Wood
11 Lacking complete plumbing facilities %
12 No telephone service available %
13 rentover35        (gross rent as a percentage of household income (grapi))
   -Poverty
14   est_percent     (Estimated percent of people of all ages in poverty)
15   child_percent   (Estimate of people age 0-17 in poverty)
16   fam_percent     (Estimated percent of related children age 5-17 in families in poverty)

Objectives of this study:

  • Analyze county-level poverty and housing data in New Mexico to understand variations in living conditions across counties.

  • Identify how multiple socioeconomic variables (e.g., vacancy rates, poverty levels, and infrastructure) co-vary across counties.

  • Apply Principal Components Analysis (PCA) to reduce the dataset (13 measured variables, 10 of which enter the analysis) to a smaller set of uncorrelated components.

  • Retain components that explain approximately 75% of the total variability to simplify interpretation while preserving meaningful patterns.

Methods

Data Sources

The data used in this study were compiled from publicly available sources, including:

  • The Bureau of Business and Economic Research (BBER) County Profiles at the University of New Mexico, which provided poverty statistics.
  • The U.S. Census Bureau’s American Community Survey (ACS) 2014 5-Year Estimates, which contributed housing and infrastructure data.
  • National county code listings from the U.S. Census for geographic identification.

These sources were merged to create a comprehensive dataset of county-level indicators for all 33 counties in New Mexico.

Variables

The dataset includes 13 key variables grouped into the following categories:

  • Vacancy Status (%): Homeowner vacancy rate, rental vacancy rate
  • Occupancy Status (%): Percent owner-occupied, percent renter-occupied
  • Primary Heating Source (%): Utility gas, electricity, wood
  • Infrastructure Deficits (%): Homes lacking complete plumbing, homes without telephone service
  • Housing Cost Burden (%): Households paying more than 35% of income on rent
  • Poverty Rates (%): Overall poverty rate, child poverty rate, family poverty rate

Data Cleaning and Preparation

The raw dataset was cleaned by:

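  • Filtering out statewide averages to focus exclusively on county-level observations.
  • Renaming columns for clarity and consistency.
  • Excluding two counties, Catron and McKinley, before the PCA (McKinley as an extreme observation on plumbing and telephone deficits), leaving 31 counties.
  • Selecting the 10 numerical variables used in the PCA (percent renter-occupied and the child and family poverty rates were left out).
  • Scaling each variable to have a mean of 0 and standard deviation of 1 to ensure comparability (done here by running the PCA on the correlation matrix).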
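
As a minimal sketch of the scaling step, the following uses R's built-in `USArrests` data as a stand-in (an assumption; it is not the New Mexico dataset). It shows that `scale()` gives each column mean 0 and standard deviation 1, and that the covariance matrix of the standardized data equals the correlation matrix of the raw data. This is why calling `princomp()` with `cor = TRUE` amounts to a PCA on standardized variables.

```r
# Stand-in data (assumption): built-in USArrests, not the NM county data
X <- as.matrix(USArrests)

# Standardize each column: subtract its mean, divide by its standard deviation
Z <- scale(X)

col_means <- colMeans(Z)       # approximately 0 for every column
col_sds   <- apply(Z, 2, sd)   # 1 for every column (up to rounding)

# Covariance of the standardized data equals correlation of the raw data
same_matrix <- isTRUE(all.equal(cov(Z), cor(X), check.attributes = FALSE))

print(round(col_means, 10))
print(col_sds)
print(same_matrix)
```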

Statistical Analysis

Principal Components Analysis (PCA) was applied to the standardized dataset to identify latent patterns in the data. PCA reduces the dimensionality of the dataset by transforming the original correlated variables into a smaller set of uncorrelated components (principal components), ordered by the amount of variance they explain.
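
The transformation described above can be sketched directly, again on the built-in `USArrests` data as a stand-in (an assumption, not the county data). With `cor = TRUE`, the component standard deviations reported by `princomp()` are the square roots of the eigenvalues of the correlation matrix, and the loadings are its eigenvectors, which are determined only up to sign.

```r
# Stand-in data (assumption): built-in USArrests, not the NM county data
X <- USArrests

# PCA on the correlation matrix, as in this report
pca <- princomp(X, cor = TRUE)

# The same quantities from an eigen-decomposition of the correlation matrix
eig <- eigen(cor(X))
sdev_from_eigen <- sqrt(eig$values)            # component standard deviations
prop_var <- eig$values / sum(eig$values)       # proportion of variance

# Loadings agree with the eigenvectors up to the sign of each column
load_pca   <- unclass(loadings(pca))
load_match <- isTRUE(all.equal(abs(unname(load_pca)), abs(unname(eig$vectors))))

print(round(sdev_from_eigen, 4))
print(round(cumsum(prop_var), 4))
print(load_match)
```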

The number of components retained was based on two criteria:

  1. Cumulative variance explained (targeting at least 75%)
  2. The scree plot to assess the point of diminishing returns (elbow method)

All data cleaning, transformation, and analysis were conducted in R using the tidyverse and stats packages. Visualizations such as scree plots and biplots were generated to support interpretation of the PCA results.

Results

Principal Components Overview

The Principal Components Analysis (PCA) revealed that a substantial proportion of the variability in New Mexico county-level poverty and housing characteristics can be explained by a small number of components:

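  • PC1 (General Deprivation Index): This component explains the largest share of the variance (29.8%). It loads positively on wood heating (0.53), lack of plumbing (0.34), rent burden (0.31), lack of phone service (0.31), owner-occupancy (0.28), and overall poverty (0.27), and negatively on electric heating (-0.35) and utility gas (-0.34). Counties with high PC1 scores tend to exhibit broad socioeconomic disadvantage together with reliance on wood heat.

  • PC2 (Housing Structure and Utilities): This component (18.7% of the variance) loads positively on rental vacancy (0.50), homeowner vacancy (0.46), owner-occupancy (0.39), and utility gas (0.28), and negatively on rent burden (-0.36), electric heating (-0.33), and overall poverty (-0.23). It contrasts counties with high vacancy, owner-occupancy, and gas heating against counties with heavier rent burden and electric heating.

  • PC3 (Rental Market Dynamics): This component (16.2% of the variance) loads positively on homeowner vacancy (0.45), lack of phone service (0.43), electric heating (0.35), and overall poverty (0.33), and negatively on utility gas (-0.37), rent burden (-0.34), and lack of plumbing (-0.33). Rental vacancy loads mainly on PC2 rather than PC3, so the label fits only loosely.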

Scree Plot

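The scree plot indicates an “elbow” after the third component, supporting the decision to retain three components for further interpretation. The first three components have variances of 2.98, 1.87, and 1.62, while the fourth falls to 0.98. Together the first three explain 64.7% of the total variance, short of the 75% target, which would require a fourth (74.5%) or fifth (81.8%) component. Retaining three components reduces dimensionality from 10 to 3.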
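
As a hedged check on the retention decision, the sketch below recomputes the variance shares from the component standard deviations printed by `summary(pca_nmcensus)`. It counts how many components are needed to reach the 75% target and how many have variance above 1. The variance-above-1 rule is a common rule of thumb added here for comparison; it is not one of the criteria stated in the Methods.

```r
# Component standard deviations copied from the summary(pca_nmcensus) output
sdev <- c(1.7274560, 1.3666504, 1.2723272, 0.99169005, 0.85407536,
          0.78465001, 0.70884967, 0.69589030, 0.37982217, 0.264159780)

variance <- sdev^2                       # variance of each component
prop_var <- variance / sum(variance)     # proportion of total variance
cum_var  <- cumsum(prop_var)             # cumulative proportion

n_for_75  <- which(cum_var >= 0.75)[1]   # components needed to reach 75%
n_var_gt1 <- sum(variance > 1)           # components with variance above 1

print(round(cum_var, 3))
print(n_for_75)
print(n_var_gt1)
```

On these values, three components have variance above 1, while five are needed to pass 75% (four reach 74.5%).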

Biplot Interpretation

The biplot of PC1 and PC2 shows strong clustering of counties based on their deprivation and housing characteristics. Counties with high poverty rates and limited infrastructure cluster together along the positive axis of PC1. Meanwhile, counties with more stable housing and utility infrastructure are positioned on the opposite side.

The direction and magnitude of variable loadings in the biplot confirm:

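  • Positive loadings for poverty, rent burden, and infrastructure deficits on PC1, with wood heating loading most strongly
  • Positive loadings for owner occupancy and utility gas on PC2, alongside both vacancy rates
  • Homeowner vacancy rate and lack of phone service contribute most to PC3 (not shown on 2D biplot); rental vacancy rate loads mainly on PC2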

County-Level Patterns

The PCA scores allow for ranking and grouping of counties based on their socio-economic profiles:

  • Counties like Luna exhibit high scores on PC1, indicating higher poverty and infrastructural deficits. McKinley, which has the highest combined plumbing and telephone deficits, was excluded before the PCA and therefore has no score.
  • Counties such as Los Alamos and Sandoval have lower PC1 scores, reflecting more favorable living conditions.
  • Vacancy-related variation (PC2 and PC3) distinguishes counties experiencing housing turnover or instability.

These patterns suggest that while poverty is a dominant dimension across counties, housing structure and infrastructure access also play crucial, orthogonal roles in understanding regional disparities.
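
To make the link between loadings and county scores concrete, here is a minimal sketch on the built-in `USArrests` data (a stand-in, by assumption, for the county data). Each score is the standardized observation multiplied by the loading vector. Note that `princomp()` standardizes with standard deviations computed using divisor n rather than n - 1, which it stores in `$scale`.

```r
# Stand-in data (assumption): built-in USArrests, not the NM county data
X   <- as.matrix(USArrests)
pca <- princomp(X, cor = TRUE)

# Standardize using the centers and scales stored by princomp()
# (its standard deviations use divisor n, not n - 1)
Z <- scale(X, center = pca$center, scale = pca$scale)

# Scores are the standardized data projected onto the loadings
scores_manual <- Z %*% unclass(loadings(pca))

scores_agree <- isTRUE(all.equal(unname(scores_manual), unname(pca$scores)))
print(scores_agree)

# Rank observations by their first-component score
top_pc1 <- head(sort(pca$scores[, "Comp.1"], decreasing = TRUE), 3)
print(names(top_pc1))
```

The same ranking applied to `pca_nmcensus$scores` would list the counties at each end of PC1.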

library(erikmisc)
library(tidyverse)
ggplot2::theme_set(ggplot2::theme_bw())  # set theme_bw for all plots
# First, download the data to your computer,
#   save in the same folder as this qmd file.

# read the data
dat_nmcensus <-
  read_csv(
    "ADA2_CL_20_PCA_NMCensusPovertyHousingCharacteristics_DP04.csv"
  , skip = 1
  ) |> mutate(id = 1:n()) |>
  rename(
    # Shorter column names
    "Area"     = "area"
  , "County"   = "county"
  , "Year"     = "periodyear"
  , "VacantH"  = "Homeowner vacancy rate"
  , "VacantR"  = "Rental vacancy rate"
  , "Owner"    = "Owner-occupied"
  , "Renter"   = "Renter-occupied"
  , "HeatG"    = "Utility gas"
  , "HeatE"    = "Electricity"
  , "HeatW"    = "Wood"
  , "NoPlumb"  = "Lacking complete plumbing facilities"
  , "NoPhone"  = "No telephone service available"
  , "Rent35"   = "rentover35"
  , "PovAll"   = "est_percent"
  , "PovChild" = "child_percent"
  , "PovFam"   = "fam_percent"
  ) |>
  filter(
    # remove state average, use county-level
    Area != 0
  )
Rows: 34 Columns: 16
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (1): county
dbl (15): area, periodyear, Homeowner vacancy rate, Rental vacancy rate, Own...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# remove column attributes from read_csv()
attr(dat_nmcensus, "spec") <- NULL

# columns to use for analysis: drops Renter (col 7) and
#   PovChild, PovFam (cols 15-16); id (col 17) is not a variable
use_col_ind <- c(4:6, 8:14)
use_col_names <- names(dat_nmcensus)[use_col_ind]
use_col_names
 [1] "VacantH" "VacantR" "Owner"   "HeatG"   "HeatE"   "HeatW"   "NoPlumb"
 [8] "NoPhone" "Rent35"  "PovAll" 
str(dat_nmcensus)
tibble [33 × 17] (S3: tbl_df/tbl/data.frame)
 $ Area    : num [1:33] 1 3 5 6 7 9 11 13 15 17 ...
 $ County  : chr [1:33] "Bernalillo" "Catron" "Chaves" "Cibola" ...
 $ Year    : num [1:33] 2014 2014 2014 2014 2014 ...
 $ VacantH : num [1:33] 1.7 14.8 2.3 1.6 7.4 3.9 11.4 2.1 0.4 3.2 ...
 $ VacantR : num [1:33] 6.9 7.5 7.8 6.8 20.4 7 8.5 7.4 7.5 8.1 ...
 $ Owner   : num [1:33] 62.4 87.2 65.4 74.8 67.6 59.4 82.7 64.7 73.5 75.6 ...
 $ Renter  : num [1:33] 37.6 12.8 34.6 25.2 32.4 40.6 17.3 35.3 26.5 24.4 ...
 $ HeatG   : num [1:33] 81.7 3.8 50.1 49.1 50.2 47 46.6 70.4 53.5 51.4 ...
 $ HeatE   : num [1:33] 13 2.8 42.6 10 15.4 46.4 19.1 15.6 38.7 18.6 ...
 $ HeatW   : num [1:33] 2 51.2 1.9 21.7 13.7 1.1 9 1.6 1 10.6 ...
 $ NoPlumb : num [1:33] 0.5 0.9 0.5 5.2 0.1 0.1 0 0.7 0.7 1.2 ...
 $ NoPhone : num [1:33] 3 2.4 2.8 3.5 4.4 3.2 4.5 3.1 2.2 3 ...
 $ Rent35  : num [1:33] 43.8 51.7 36.7 45.1 38 42 0 46.9 31.4 41.9 ...
 $ PovAll  : num [1:33] 18.7 22.2 23.4 28.8 20.5 19.2 20.6 27.9 14.1 19.1 ...
 $ PovChild: num [1:33] 24.5 42.8 32.4 37.6 30.6 27.3 32.1 39.4 18.5 27.8 ...
 $ PovFam  : num [1:33] 22.6 40.1 28.7 35.9 27.2 26.7 31.6 36 17.3 25.3 ...
 $ id      : int [1:33] 2 3 4 5 6 7 8 9 10 11 ...
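
The column selection above leaves out `Renter`. One plausible reason (an assumption; the source does not state it) is that `Renter` is the exact complement of `Owner`, so including both would make the correlation matrix singular. A small check using the first ten values of each column as printed by `str()` above:

```r
# First ten county values copied from the str(dat_nmcensus) output
owner  <- c(62.4, 87.2, 65.4, 74.8, 67.6, 59.4, 82.7, 64.7, 73.5, 75.6)
renter <- c(37.6, 12.8, 34.6, 25.2, 32.4, 40.6, 17.3, 35.3, 26.5, 24.4)

# Owner and Renter percentages sum to 100 in every row shown
sums <- owner + renter
print(sums)

# so the two columns are perfectly negatively correlated
r <- cor(owner, renter)
print(r)
```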
head(dat_nmcensus, 5)
# A tibble: 5 × 17
   Area County      Year VacantH VacantR Owner Renter HeatG HeatE HeatW NoPlumb
  <dbl> <chr>      <dbl>   <dbl>   <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
1     1 Bernalillo  2014     1.7     6.9  62.4   37.6  81.7  13     2       0.5
2     3 Catron      2014    14.8     7.5  87.2   12.8   3.8   2.8  51.2     0.9
3     5 Chaves      2014     2.3     7.8  65.4   34.6  50.1  42.6   1.9     0.5
4     6 Cibola      2014     1.6     6.8  74.8   25.2  49.1  10    21.7     5.2
5     7 Colfax      2014     7.4    20.4  67.6   32.4  50.2  15.4  13.7     0.1
# ℹ 6 more variables: NoPhone <dbl>, Rent35 <dbl>, PovAll <dbl>,
#   PovChild <dbl>, PovFam <dbl>, id <int>
# View the counties with the highest combined NoPlumb + NoPhone
dat_nmcensus |>
  arrange(desc(NoPlumb + NoPhone)) |>
  select(County, NoPlumb, NoPhone) |> 
  head()
# A tibble: 6 × 3
  County     NoPlumb NoPhone
  <chr>        <dbl>   <dbl>
1 McKinley      12.1    18.2
2 San Juan       3.3     9.1
3 Mora           1.8     8.3
4 Quay           0.2     8.9
5 Cibola         5.2     3.5
6 San Miguel     1.7     6.9
# Exclude Catron County
dat_nmcensus <- dat_nmcensus |>
  filter(County != "Catron")

# Exclude observation id 19 (McKinley County), an extreme outlier
#   on NoPlumb and NoPhone (top row of the table above)
dat_nmcensus <-
  dat_nmcensus |>
  filter(
    !(id %in% c(19))
  )
# Scatterplot matrix
library(ggplot2)
library(GGally)
Registered S3 method overwritten by 'GGally':
  method from   
  +.gg   ggplot2
p <-
  ggpairs(
    dat_nmcensus |> select(all_of(use_col_names))
  )
print(p)

pca_nmcensus <-
  princomp(
    dat_nmcensus[, use_col_ind]
  , cor = TRUE
  )
summary(pca_nmcensus)
Importance of components:
                          Comp.1    Comp.2    Comp.3     Comp.4     Comp.5
Standard deviation     1.7274560 1.3666504 1.2723272 0.99169005 0.85407536
Proportion of Variance 0.2984104 0.1867733 0.1618816 0.09834492 0.07294447
Cumulative Proportion  0.2984104 0.4851838 0.6470654 0.74541032 0.81835479
                           Comp.6     Comp.7     Comp.8     Comp.9     Comp.10
Standard deviation     0.78465001 0.70884967 0.69589030 0.37982217 0.264159780
Proportion of Variance 0.06156756 0.05024679 0.04842633 0.01442649 0.006978039
Cumulative Proportion  0.87992236 0.93016914 0.97859547 0.99302196 1.000000000
pca_nmcensus |> loadings() |> print(cutoff = 0.2) # cutoff = 0 to show all values

Loadings:
        Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9 Comp.10
VacantH         0.459  0.452  0.251                0.575         0.289  0.279 
VacantR         0.498         0.324  0.442        -0.622                      
Owner    0.281  0.386        -0.528         0.267        -0.473  0.391        
HeatG   -0.336  0.278 -0.367  0.364 -0.344 -0.269                0.425 -0.412 
HeatE   -0.347 -0.328  0.346         0.267  0.481                0.440 -0.356 
HeatW    0.528                       0.205         0.360               -0.693 
NoPlumb  0.340        -0.334  0.256 -0.325  0.615         0.434               
NoPhone  0.310         0.425        -0.378 -0.418 -0.335  0.372  0.319        
Rent35   0.312 -0.359 -0.335  0.232  0.453 -0.222                0.500  0.281 
PovAll   0.265 -0.232  0.331  0.513 -0.312               -0.594               

               Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9
SS loadings       1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0
Proportion Var    0.1    0.1    0.1    0.1    0.1    0.1    0.1    0.1    0.1
Cumulative Var    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
               Comp.10
SS loadings        1.0
Proportion Var     0.1
Cumulative Var     1.0
par(mfrow=c(1,2))
screeplot(pca_nmcensus)
biplot(pca_nmcensus)

par(mfrow=c(1,1))
library(ggplot2)
p1 <- ggplot(as.data.frame(pca_nmcensus$scores), aes(x = Comp.1, y = Comp.2, colour = dat_nmcensus$PovAll))
p1 <- p1 + scale_colour_gradientn(colours=c("red", "blue"))
p1 <- p1 + geom_text(aes(label = dat_nmcensus$County), vjust = -0.5, alpha = 0.25)
p1 <- p1 + geom_point(size = 3)
p1 <- p1 + theme(legend.position="bottom")
p2 <- ggplot(as.data.frame(pca_nmcensus$scores), aes(x = Comp.1, y = Comp.3, colour = dat_nmcensus$PovAll))
p2 <- p2 + scale_colour_gradientn(colours=c("red", "blue"))
p2 <- p2 + geom_text(aes(label = dat_nmcensus$County), vjust = -0.5, alpha = 0.25)
p2 <- p2 + geom_point(size = 3)
p2 <- p2 + theme(legend.position="none")
p3 <- ggplot(as.data.frame(pca_nmcensus$scores), aes(x = Comp.2, y = Comp.3, colour = dat_nmcensus$PovAll))
p3 <- p3 + scale_colour_gradientn(colours=c("red", "blue"))
p3 <- p3 + geom_text(aes(label = dat_nmcensus$County), vjust = -0.5, alpha = 0.25)
p3 <- p3 + geom_point(size = 3)
p3 <- p3 + theme(legend.position="none")

print(p1)

library(gridExtra)

Attaching package: 'gridExtra'
The following object is masked from 'package:dplyr':

    combine
grid.arrange(grobs = list(p2, p3), nrow=1, top = "Scatterplots of first three PCs")

#### For a rotatable 3D plot, use plot3d() from the rgl library
# ##   This uses the R version of the OpenGL (Open Graphics Library)
# library(rgl)
# plot3d(x = pca_nmcensus$scores[,"Comp.1"]
#      , y = pca_nmcensus$scores[,"Comp.2"]
#      , z = pca_nmcensus$scores[,"Comp.3"])
dat_nmcensus |> filter(County == "Los Alamos") |> print(n = Inf, width = Inf)
# A tibble: 1 × 17
   Area County      Year VacantH VacantR Owner Renter HeatG HeatE HeatW NoPlumb
  <dbl> <chr>      <dbl>   <dbl>   <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
1    28 Los Alamos  2014     0.9    14.5    75     25    88   8.3   2.5       0
  NoPhone Rent35 PovAll PovChild PovFam    id
    <dbl>  <dbl>  <dbl>    <dbl>  <dbl> <int>
1     1.9   22.6    4.2      4.6    3.8    17
# check after you describe it using the PCs
dat_nmcensus |> filter(County == "Bernalillo") |> print(n = Inf, width = Inf)
# A tibble: 1 × 17
   Area County      Year VacantH VacantR Owner Renter HeatG HeatE HeatW NoPlumb
  <dbl> <chr>      <dbl>   <dbl>   <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
1     1 Bernalillo  2014     1.7     6.9  62.4   37.6  81.7    13     2     0.5
  NoPhone Rent35 PovAll PovChild PovFam    id
    <dbl>  <dbl>  <dbl>    <dbl>  <dbl> <int>
1       3   43.8   18.7     24.5   22.6     2
# check after you describe it using the PCs
dat_nmcensus |> filter(County == "Mora") |> print(n = Inf, width = Inf)
# A tibble: 1 × 17
   Area County  Year VacantH VacantR Owner Renter HeatG HeatE HeatW NoPlumb
  <dbl> <chr>  <dbl>   <dbl>   <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
1    33 Mora    2014     4.6       8  78.1   21.9   7.9   3.9  58.2     1.8
  NoPhone Rent35 PovAll PovChild PovFam    id
    <dbl>  <dbl>  <dbl>    <dbl>  <dbl> <int>
1     8.3   68.7   24.2     35.3   31.8    20
# check after you describe it using the PCs
dat_nmcensus |> filter(County == "Roosevelt") |> print(n = Inf, width = Inf)
# A tibble: 1 × 17
   Area County     Year VacantH VacantR Owner Renter HeatG HeatE HeatW NoPlumb
  <dbl> <chr>     <dbl>   <dbl>   <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
1    41 Roosevelt  2014     2.5       6  59.2   40.8  32.3  52.8   3.2     0.8
  NoPhone Rent35 PovAll PovChild PovFam    id
    <dbl>  <dbl>  <dbl>    <dbl>  <dbl> <int>
1     4.2   52.3     24     28.5   28.1    24

Discussion

The use of Principal Components Analysis (PCA) in this study provided a meaningful reduction of a complex, multidimensional dataset into a smaller number of interpretable components. Three components capture about 65% of the total variance, short of the 75% target but enough to reveal latent patterns in county-level socioeconomic data.

PC1, which emerged as the dominant axis of variation, represents a broad deprivation construct. Its positive loadings on the poverty rate, housing cost burden, lack of basic infrastructure, and wood heating highlight the interconnected nature of economic hardship, inadequate housing, and service access. This aligns with existing literature showing that poverty often coexists with infrastructural deficiencies, compounding the challenges faced by vulnerable populations. Counties with high PC1 scores, such as Luna and Hidalgo, likely experience systemic barriers that impact not only economic outcomes but also public health, education, and mobility. McKinley was excluded from the PCA as an extreme observation, so it has no PC1 score, although it has the highest combined plumbing and telephone deficits in the data.

PC2 differentiates counties by their housing structure and utility sources. Counties with higher owner-occupancy, higher vacancy rates, and more utility gas heating tend to have higher PC2 scores. In contrast, counties with more electric heating, heavier rent burden, and higher poverty score lower on this component. This axis may reflect both urban-rural divides and the influence of local infrastructure investment and housing development patterns.

PC3, associated most strongly with homeowner vacancy and lack of phone service, offers some insight into housing market conditions. High homeowner vacancy may signal population decline, economic stagnation, or seasonal housing dynamics. While this component contributes less variance than the first two (16.2%), it is still important for understanding short-term economic shifts and housing fluidity.

The PCA framework also highlights how some variables commonly viewed as separate, such as utility access and poverty, are deeply interrelated in practice. Importantly, this dimensionality reduction facilitates targeted analysis and intervention. For instance, policy responses in high-PC1 counties may prioritize infrastructure investment and poverty alleviation, while high-PC3 counties may benefit from economic development or housing revitalization programs.

Conclusion

This study applied Principal Components Analysis (PCA) to New Mexico county-level data on poverty and housing characteristics to uncover key patterns and reduce dimensionality in a complex dataset. The analysis showed that much of the variation across counties can be summarized using three principal components, which together explain about 65% of the total variance.

The first component reflects a broad deprivation index, capturing poverty, housing cost burden, lack of infrastructure, and reliance on wood heating. The second component represents housing structure and utility access, distinguishing counties by their vacancy rates, owner-occupancy, and heating sources. The third component, labeled rental market dynamics, is informed mainly by homeowner vacancy and telephone access.

These findings provide a clearer understanding of how living conditions co-vary across New Mexico counties and offer a data-driven basis for identifying regions with the greatest socioeconomic challenges. The reduced component structure can guide policymakers and stakeholders in targeting resources more efficiently and monitoring progress over time.

Limitations

This analysis is cross-sectional and limited to one point in time (2014 data), which restricts the ability to infer temporal trends. Additionally, PCA is a linear method and assumes that relationships between variables are best captured through variance maximization; it may not capture more complex, nonlinear interactions. Finally, the analysis is exploratory in nature and should be complemented with additional statistical modeling or ground-level data for policy-making decisions.

Implications

Despite these limitations, the study offers a valuable framework for summarizing and interpreting county-level poverty and housing conditions. The components identified can support more nuanced regional profiling, guiding resource allocation, monitoring interventions, and identifying outliers for deeper qualitative inquiry.

Future work may map these components geographically, integrate time-series data to assess trends, apply clustering techniques, or link these insights with health and education outcomes, offering a more dynamic and comprehensive understanding of socioeconomic inequality in New Mexico and beyond.

References

Koo, T. K., & Li, M. Y. (2016). A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of Chiropractic Medicine, 15(2), 155-163.

Avila-Vera, M., Rangel-Blanco, L., & Picazzo-Palencia, E. (2020). Application of principal component analysis as a technique to obtain a social vulnerability index for the design of public policies in Mexico. Open Journal of Social Sciences, 8(9), 130-145.

Ruano, M. M. (2015). Aplicación de la técnica de componentes principales para el análisis de la pobreza en Honduras. Revista Ciencia y Tecnología, 82-96.

Barboza-Salerno, G. E., Steinke, H., Meshelemiah, J. C., Stanek, C., Duhany, S., & Cash, S. (2025). A multilevel analysis of individual and community factors associated with case dispositions following child maltreatment investigations. Child Maltreatment, 30(1), 108-122.

Erhardt, E. B., Bedrick, E. J., & Schrader, R. M. (2020). Lecture notes for Advanced Data Analysis 2 (ADA2) (Stat 428/528). University of New Mexico.