My goal for the Christmas break was to get some writing done on the results section of my project. How naive I was to think I could just look at the output and decipher the results. As a reminder, here is the table of means for every variable across the seven neighborhood classes:
| Variable | V1 | V2 | V3 | V4 | V5 | V6 | V7 |
|---|---|---|---|---|---|---|---|
| class | 1.000 | 2.000 | 3.000 | 4.000 | 5.000 | 6.000 | 7.000 |
| y00_04 | 3.810 | 204.300 | 559.980 | 405.400 | 139.340 | 1419.740 | 133.880 |
| y05_09 | 5.240 | 235.880 | 783.070 | 490.250 | 128.670 | 1735.130 | 120.770 |
| y10_14 | 6.910 | 229.870 | 1019.980 | 597.140 | 95.200 | 1583.870 | 121.120 |
| y15_19 | 8.440 | 248.970 | 1091.210 | 722.670 | 133.080 | 2055.260 | 138.020 |
| pnhw00 | 0.753 | 0.196 | 0.830 | 0.516 | 0.562 | 0.612 | 0.277 |
| pnhw12 | 0.657 | 0.123 | 0.801 | 0.428 | 0.442 | 0.533 | 0.226 |
| pnhw19 | 0.603 | 0.108 | 0.762 | 0.421 | 0.393 | 0.490 | 0.218 |
| pnhb00 | 0.048 | 0.390 | 0.026 | 0.117 | 0.063 | 0.128 | 0.058 |
| pnhb12 | 0.064 | 0.364 | 0.022 | 0.120 | 0.065 | 0.118 | 0.052 |
| pnhb19 | 0.072 | 0.343 | 0.027 | 0.122 | 0.075 | 0.115 | 0.058 |
| phisp00 | 0.156 | 0.388 | 0.090 | 0.306 | 0.338 | 0.208 | 0.633 |
| phisp12 | 0.002 | 0.005 | 0.001 | 0.004 | 0.004 | 0.003 | 0.007 |
| phisp19 | 0.002 | 0.005 | 0.001 | 0.004 | 0.005 | 0.003 | 0.007 |
| mhhinc00 | 97241.100 | 55834.590 | 127372.950 | 77783.050 | 96469.870 | 114547.950 | 57017.440 |
| mhhinc12 | 90430.270 | 45932.350 | 127001.880 | 66855.030 | 89176.820 | 104961.390 | 48726.600 |
| mhhinc19 | 96186.780 | 49207.810 | 143166.640 | 76046.210 | 92032.980 | 116905.870 | 53634.450 |
| pedu00 | 0.293 | 0.121 | 0.631 | 0.346 | 0.306 | 0.414 | 0.171 |
| pedu12 | 0.320 | 0.114 | 0.688 | 0.365 | 0.331 | 0.483 | 0.176 |
| pedu19 | 0.350 | 0.134 | 0.735 | 0.437 | 0.350 | 0.529 | 0.204 |
| pop00 | 8.162 | 8.302 | 8.029 | 8.242 | 7.805 | 6.966 | 8.363 |
| pop12 | 8.515 | 8.413 | 8.137 | 8.350 | 8.419 | 7.492 | 8.317 |
| pop19 | 8.672 | 8.526 | 8.234 | 8.444 | 8.608 | 7.662 | 8.385 |
| dev01 | 0.765 | 0.778 | 0.955 | 0.871 | 0.510 | 0.470 | 0.995 |
| dev11 | 0.832 | 0.816 | 0.959 | 0.903 | 0.595 | 0.548 | 0.997 |
| dev19 | 0.849 | 0.831 | 0.960 | 0.914 | 0.629 | 0.581 | 0.997 |
| la00 | 0.192 | 0.271 | 0.352 | 0.359 | 0.027 | 0.128 | 0.414 |
| la10 | 0.192 | 0.271 | 0.352 | 0.359 | 0.027 | 0.128 | 0.414 |
| la19 | 0.201 | 0.305 | 0.403 | 0.423 | 0.071 | 0.151 | 0.463 |
You may be saying to yourself, "I thought there were only five classes the last time we spoke," and you’d be right. I found two discrepancies that made me go back and rewrite all of my data cleaning/tidying code from scratch. With the newly cleaned data, everything ran smoothly and fairly quickly (***note to self, found something here about the interaction between PCA and LPA to talk about), but the new recommendation was for seven classes. The LPA showed good spread between the classes. I didn’t want to distract from our discussion today, so I left out the PCA and LPA graphs I was busy with last semester.
I then made some graphs to help me visualize the data better. I stopped midway through because I’m not sure this is the best way forward. Here are the graphs I did for race/ethnicity:
Then I looked at percent education, which shows the percentage of people that have four or more years of college education:
And median household income by class, and then by class and city:
I cleaned up the code to generate graphs for my other variables, but when I looked at these graphs I thought they were too busy, so I paused. I’m having a tough time comparing the classes across variables, especially when looking at the differences between cities. This is where I think I need some help. Do you know of some better techniques for data visualization?
Also, I think these graphs work fairly well for percentages, but what type of graph would you recommend for plotting mhhinc alongside a percentage, given that the two are on very different scales?
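One possible way to put mhhinc and a percentage on the same axis is to standardize each variable across the classes first. This is only a rough sketch, not something I have run on the full data. The values are the 2019 class means copied from the table above, the `zscores` helper is a made-up name, and standardizing the seven class means (rather than the underlying tract-level values) is an assumption for illustration.

```python
from statistics import mean, pstdev

# 2019 class means copied from the table above (classes 1..7).
class_means = {
    "mhhinc19": [96186.78, 49207.81, 143166.64, 76046.21, 92032.98, 116905.87, 53634.45],
    "pedu19": [0.350, 0.134, 0.735, 0.437, 0.350, 0.529, 0.204],
    "pnhw19": [0.603, 0.108, 0.762, 0.421, 0.393, 0.490, 0.218],
}


def zscores(values):
    """Rescale values to mean 0 and standard deviation 1 across the classes.

    Hypothetical helper: it uses the population standard deviation of the
    seven class means, which is an assumption made for this sketch.
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


# After standardizing, dollars and proportions are both unitless.
standardized = {name: zscores(vals) for name, vals in class_means.items()}

# Print one row per variable and one column per class.
print("variable  " + "".join(f"{c:>7d}" for c in range(1, 8)))
for name, zs in standardized.items():
    print(f"{name:<10}" + "".join(f"{z:>7.2f}" for z in zs))
```

Once everything is in standard-deviation units, income and the percentage variables could share a single heatmap or dot plot, with one row per variable and one column per class. The trade-off is that the original units disappear, so the raw table would still need to sit alongside the graph.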
Thanks in advance for your help!!!