Explanation of this document and variables
The purpose of this document is to show how the Austin census tracts
are shaping up in my research and to explore some of the preliminary
PCA work I’ve been doing.
I’ve pulled the following variables from the Logan Thomas
Database (XX is the two-digit year, e.g. 00, 12, 19):
pnhwXX: percent non-Hispanic white (NHW)
peduXX: number of people with 4 years of college / number of people
aged 25 or older
Map 1: Austin tracts with an increasing number of building permits
Mapping the tracts (note that the missing tracts are those in the
top 40% of mhhinc):
Map distinguishing tracts by increase in number of building permits
I’m still working on the legend, but “0” means no increase in building
permits, “12” means an increase from 2006-2010 to 2011-2015, “19” means
an increase from 2011-2015 to 2016-2020, and “both” means an increase
in both periods.
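The legend coding above can be written out as a small rule. This is a minimal sketch in plain Python, separate from the R workflow used for the maps. It assumes that y06_10, y11_15, and y16_20 hold each tract's permit counts for the three periods and that "increase" means a strictly larger count than in the previous period; the function name permit_category is hypothetical.

```python
def permit_category(y06_10, y11_15, y16_20):
    """Return the legend code for one tract's building-permit trend."""
    up_12 = y11_15 > y06_10  # increase from 2006-2010 to 2011-2015
    up_19 = y16_20 > y11_15  # increase from 2011-2015 to 2016-2020
    if up_12 and up_19:
        return "both"
    if up_12:
        return "12"
    if up_19:
        return "19"
    return "0"


# Made-up counts for illustration only, not values from the Austin data.
examples = [(10, 10, 10), (5, 9, 7), (9, 5, 7), (3, 6, 12)]
codes = [permit_category(*counts) for counts in examples]
print(codes)
```

If the maps were built with a different rule (for example, a percent-change threshold), the comparisons in the function would need to change accordingly.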
Map 2: Facet by category, exploring change in percent NHW from 2000
to 2019

This was just a brief look at what the Austin data look like.
Obviously there is more to do here.
Latent Profile Analysis
Jumping into LPA at this point leads to the following output. Note
that the first graph shows the results of the LPA without
MplusAutomation (which seems to have been created to help LPA run
better and more smoothly), and the second graph shows the results
using it.
As a reminder, I’m trying to separate ascending neighborhoods into
two distinct classes: those ascending due to gentrification and those
ascending but not due to gentrification.
Just tidyLPA

Using tidyLPA with MplusAutomation

The MplusAutomation version seems a bit cleaner, but overall the
results here seem murky, so I’ve turned to PCA to see if I can
eliminate some of the existing collinearity.
Principal Component Analysis
I tried several combinations of variables, but the one that seems to
be the cleanest and easiest to understand uses the raw data without
any manipulation.


The following are the eigenvalues and how each variable relates to
the dimensions (the categories):
##        eigenvalue percentage of variance
## comp 1  4.2910741              35.758951
## comp 2  3.0360447              25.300373
## comp 3  1.6380112              13.650093
## comp 4  1.2792640              10.660533
## comp 5  0.6686684               5.572237
## comp 6  0.2960917               2.467431
##                 Dim.1      Dim.2       Dim.3        Dim.4        Dim.5
## y06_10   -0.140132886  0.8006512 -0.39754643  0.216503678  0.114568318
## y11_15   -0.005910495  0.8945147 -0.26319145  0.047828745  0.088241002
## y16_20   -0.282247129  0.7797818 -0.38603114  0.172230888  0.066799789
## pnhw00    0.695676090  0.1628920 -0.30325341 -0.555629829 -0.146355529
## pnhw12    0.899775563  0.1348092 -0.06783231 -0.211967886 -0.059843759
## pnhw19    0.309011263  0.2439938  0.57763409 -0.215069297  0.674875483
## pedu00    0.943599526 -0.0360922 -0.18414486 -0.100493058  0.007778948
## pedu12    0.924760054  0.1087139 -0.14744685 -0.082110701  0.032578370
## pedu19    0.130893594  0.6387956  0.65101019  0.045492959 -0.217257446
## mhhinc00  0.683888586 -0.1935121  0.01146400  0.643254800  0.003113459
## mhhinc12  0.736274125 -0.1368964  0.05669803  0.606451629  0.040426305
## mhhinc19  0.173757721  0.6365353  0.59045999  0.007644548 -0.335959897
The scree plot:

From the eigenvalues and the scree plot, there appear to be four
components worth keeping (the first four have eigenvalues above 1),
with the two largest together representing over 60% of the variance.
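The arithmetic behind that reading can be checked directly from the eigenvalue table. This is a minimal sketch in plain Python, separate from the R workflow. It assumes the PCA was run on 12 standardized variables (which matches the printed percentages, since each is the eigenvalue divided by 12), and it uses the "eigenvalue above 1" rule of thumb as one possible cutoff, which may not be the only criterion behind the choice of four.

```python
# Eigenvalues copied from the table above (first six components).
eigenvalues = [4.2910741, 3.0360447, 1.6380112, 1.2792640, 0.6686684, 0.2960917]

# Assumption: 12 standardized input variables, so the eigenvalues sum to 12.
n_variables = 12

# Percentage of variance explained by each component.
pct = [100 * ev / n_variables for ev in eigenvalues]

# Running total of explained variance.
cumulative = []
running = 0.0
for p in pct:
    running += p
    cumulative.append(running)

# Rule of thumb: keep components whose eigenvalue exceeds 1.
n_keep = sum(1 for ev in eigenvalues if ev > 1)

print("components kept:", n_keep)
print("first two, cumulative %:", round(cumulative[1], 2))
print("kept components, cumulative %:", round(cumulative[n_keep - 1], 2))
```

Under these assumptions the rule keeps four components; the first two account for about 61% of the variance and the first four for about 85%.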
##               Dim.1      Dim.2       Dim.3       Dim.4       Dim.5
## y06_10 -0.140132886  0.8006512 -0.39754643  0.21650368  0.11456832
## y11_15 -0.005910495  0.8945147 -0.26319145  0.04782875  0.08824100
## y16_20 -0.282247129  0.7797818 -0.38603114  0.17223089  0.06679979
## pnhw00  0.695676090  0.1628920 -0.30325341 -0.55562983 -0.14635553
## pnhw12  0.899775563  0.1348092 -0.06783231 -0.21196789 -0.05984376
## pnhw19  0.309011263  0.2439938  0.57763409 -0.21506930  0.67487548
Since I prefer pictures, I thought this helped with understanding
the categories of the dimensions:
This one makes much more sense. Bigger means a higher level of
association; blue means a positive association and red a negative one:

Using the four principal components from PCA in LPA
Trying the number of classes recommended earlier, 2
tidyLPA
##         Dim.1      Dim.2      Dim.3      Dim.4      Dim.5
## 1  3.30057217  1.7414419  1.3628310  0.7031743 -1.1985835
## 2  0.65361441  1.0380128 -1.9558061 -1.9123510 -0.7221169
## 3  0.04070208  1.1473277  0.4249807 -1.7744552 -0.2330693
## 4 -2.00508960  2.6922397 -1.5245914  0.4344906  1.0486206
## 5 -1.08756407 -0.8885329  0.8587591 -0.7473335 -0.7110226
## 6 -3.43059088  0.9649757  1.8040117  0.2781315  1.3700463

tidyLPA with MplusAutomation

Playing around with the LPA models
Comparing models
## Compare tidyLPA solutions:
##
##  Model Classes AIC      BIC      Warnings
##  1     1       1159.352 1178.206
##  1     2       1160.960 1191.597
##  1     3       1128.420 1170.841
##  6     1       1171.352 1204.346
##  6     2       1116.402 1184.747
##  6     3       1110.288 1213.983 Warning
##
## Best model according to AIC is Model 6 with 3 classes.
## Best model according to BIC is Model 1 with 3 classes.
##
## An analytic hierarchy process, based on the fit indices AIC, AWE, BIC, CLC, and KIC (Akogul & Erisoglu, 2017), suggests the best solution is Model 1 with 3 classes.
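The AIC and BIC picks in that output can be confirmed by hand, since lower values indicate a better fit on both indices. This is a minimal sketch in plain Python using only the values printed in the table. It does not reproduce the analytic hierarchy process, which also draws on AWE, CLC, and KIC, and those are not shown in the output.

```python
# (model, classes, AIC, BIC) copied from the comparison table above.
fits = [
    (1, 1, 1159.352, 1178.206),
    (1, 2, 1160.960, 1191.597),
    (1, 3, 1128.420, 1170.841),
    (6, 1, 1171.352, 1204.346),
    (6, 2, 1116.402, 1184.747),
    (6, 3, 1110.288, 1213.983),  # this row carried a warning in the output
]

# Lower is better for both information criteria.
best_aic = min(fits, key=lambda row: row[2])
best_bic = min(fits, key=lambda row: row[3])

print("Best by AIC: model", best_aic[0], "with", best_aic[1], "classes")
print("Best by BIC: model", best_bic[0], "with", best_bic[1], "classes")
```

The two indices disagree here because BIC penalizes additional parameters more heavily than AIC, which works against the more flexible Model 6. The AIC winner is also the row flagged with a warning.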
Looking at the recommended model

Trying with just the first two dimensions
Two categories
