My goal for the Christmas break was to get some writing done on the results section of my project. How naive I was to think I could just look at the output and decipher the results. As a reminder, here is the table of means for every variable across the seven neighborhood classifications:

variable class1 class2 class3 class4 class5 class6 class7
y00_04 3.810 204.300 559.980 405.400 139.340 1419.740 133.880
y05_09 5.240 235.880 783.070 490.250 128.670 1735.130 120.770
y10_14 6.910 229.870 1019.980 597.140 95.200 1583.870 121.120
y15_19 8.440 248.970 1091.210 722.670 133.080 2055.260 138.020
pnhw00 0.753 0.196 0.830 0.516 0.562 0.612 0.277
pnhw12 0.657 0.123 0.801 0.428 0.442 0.533 0.226
pnhw19 0.603 0.108 0.762 0.421 0.393 0.490 0.218
pnhb00 0.048 0.390 0.026 0.117 0.063 0.128 0.058
pnhb12 0.064 0.364 0.022 0.120 0.065 0.118 0.052
pnhb19 0.072 0.343 0.027 0.122 0.075 0.115 0.058
phisp00 0.156 0.388 0.090 0.306 0.338 0.208 0.633
phisp12 0.002 0.005 0.001 0.004 0.004 0.003 0.007
phisp19 0.002 0.005 0.001 0.004 0.005 0.003 0.007
mhhinc00 97241.100 55834.590 127372.950 77783.050 96469.870 114547.950 57017.440
mhhinc12 90430.270 45932.350 127001.880 66855.030 89176.820 104961.390 48726.600
mhhinc19 96186.780 49207.810 143166.640 76046.210 92032.980 116905.870 53634.450
pedu00 0.293 0.121 0.631 0.346 0.306 0.414 0.171
pedu12 0.320 0.114 0.688 0.365 0.331 0.483 0.176
pedu19 0.350 0.134 0.735 0.437 0.350 0.529 0.204
pop00 8.162 8.302 8.029 8.242 7.805 6.966 8.363
pop12 8.515 8.413 8.137 8.350 8.419 7.492 8.317
pop19 8.672 8.526 8.234 8.444 8.608 7.662 8.385
dev01 0.765 0.778 0.955 0.871 0.510 0.470 0.995
dev11 0.832 0.816 0.959 0.903 0.595 0.548 0.997
dev19 0.849 0.831 0.960 0.914 0.629 0.581 0.997
la00 0.192 0.271 0.352 0.359 0.027 0.128 0.414
la10 0.192 0.271 0.352 0.359 0.027 0.128 0.414
la19 0.201 0.305 0.403 0.423 0.071 0.151 0.463
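
For plotting, a table like this is often easier to work with in long form, with one row per variable, year, and class. The sketch below is only an illustration in Python, not the code that produced the table. `to_long` is a hypothetical helper, and it assumes the two-digit suffix on names such as pedu19 is a year in the 2000s. Names that do not fit that pattern, such as y00_04, are skipped.

```python
import re

# Three rows copied from the table above; the rest would be pasted the same way.
RAW = """\
pedu00 0.293 0.121 0.631 0.346 0.306 0.414 0.171
pedu12 0.320 0.114 0.688 0.365 0.331 0.483 0.176
pedu19 0.350 0.134 0.735 0.437 0.350 0.529 0.204
"""


def to_long(raw):
    """Turn wide rows (one column per class) into one record per value."""
    records = []
    for line in raw.strip().splitlines():
        name, *values = line.split()
        # Assumption: names look like <measure><two-digit year>, e.g. "pedu19".
        match = re.fullmatch(r"([a-z]+)(\d{2})", name)
        if match is None:
            continue  # skip rows such as "class" or "y00_04"
        measure, suffix = match.groups()
        for cls, value in enumerate(values, start=1):
            records.append(
                {
                    "measure": measure,
                    "year": 2000 + int(suffix),  # assumed to be a 2000s year
                    "class": cls,
                    "value": float(value),
                }
            )
    return records


rows = to_long(RAW)
for row in rows[:3]:
    print(row)
```

In this shape, each record can be grouped by measure, class, or year without touching the column layout, which is the form most plotting tools expect for one panel per class.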

You may be saying to yourself, “I thought there were only five classifications the last time we spoke,” and you’d be right. I found two discrepancies that sent me back to rewrite all of my data cleaning and tidying code from scratch. With the newly cleaned data, everything ran smoothly and fairly quickly (***note to self: I found something here about the interaction between PCA and LPA to talk about), but the new recommendation was for seven classes. The LPA showed good spread between the classes. I didn’t want to distract from our discussion today, so I left out the PCA and LPA graphs that I was busy with last semester.

I then made some graphs to help me visualize the data. I stopped midway through because I’m not sure this is the best way forward. Here are the graphs I made for race/ethnicity:

Race/Ethnicity for each class of neighborhood

Then I looked at percent education, which is the percentage of people who have four or more years of college education:

Percent education for each class by time

Percent Education by City

Median Household Income for each class over time

And median household income by class, and then by class by city:

Median Household income by city

I cleaned up the code to generate graphs for my other variables, but when I looked at these graphs they seemed too busy, so I paused. I’m having a tough time analyzing the classes across different variables, especially when looking at the differences between cities. This is where I think I need some help. Do you know of any better techniques for data visualization?

Also, I think these graphs work fairly well for percentages, but what type of graph would you recommend for plotting mhhinc (median household income) alongside a percentage?
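
One possible way to put dollars and percentages on the same axis, sketched here in Python as an assumption rather than a settled choice, is to index each variable to its 2000 value (2000 = 100). The numbers are the mhhinc and pedu rows copied from the table above, and `index_to_baseline` is a hypothetical helper name.

```python
# Class means copied from the table above (classes 1 through 7).
mhhinc00 = [97241.100, 55834.590, 127372.950, 77783.050, 96469.870, 114547.950, 57017.440]
mhhinc19 = [96186.780, 49207.810, 143166.640, 76046.210, 92032.980, 116905.870, 53634.450]
pedu00 = [0.293, 0.121, 0.631, 0.346, 0.306, 0.414, 0.171]
pedu19 = [0.350, 0.134, 0.735, 0.437, 0.350, 0.529, 0.204]


def index_to_baseline(baseline, later):
    """Express each later value as a percentage of its baseline value."""
    return [100.0 * new / old for old, new in zip(baseline, later)]


income_index = index_to_baseline(mhhinc00, mhhinc19)
edu_index = index_to_baseline(pedu00, pedu19)

# Values above 100 mean the class mean rose relative to 2000; below 100, it fell.
for cls, (inc, edu) in enumerate(zip(income_index, edu_index), start=1):
    print(f"class {cls}: income index {inc:6.1f}   education index {edu:6.1f}")
```

Both indexed series are unitless, so they could share one line chart or one bar chart per class. The trade-off is that indexing hides the levels, so a class with low income and a class with high income both start at 100.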

Thanks in advance for your help!!!