acs <- read.csv("HW3data.csv")
state <- acs$State
data <- select(acs, -State)
California, Florida, New York, and Texas are the most popular states (must be by a lot), and since the data isn’t scaled, TotalPop is dominating the euclidean distance. Thats why we have one group with those 4 states, and another with the rest.
km <- kmeans(data, centers = 2, nstart = 20)
acs$State[km$cluster == 1]
[1] "Alabama" "Alaska" "Arizona"
[4] "Arkansas" "Colorado" "Connecticut"
[7] "Delaware" "District of Columbia" "Georgia"
[10] "Hawaii" "Idaho" "Illinois"
[13] "Indiana" "Iowa" "Kansas"
[16] "Kentucky" "Louisiana" "Maine"
[19] "Maryland" "Massachusetts" "Michigan"
[22] "Minnesota" "Mississippi" "Missouri"
[25] "Montana" "Nebraska" "Nevada"
[28] "New Hampshire" "New Jersey" "New Mexico"
[31] "North Carolina" "North Dakota" "Ohio"
[34] "Oklahoma" "Oregon" "Pennsylvania"
[37] "Rhode Island" "South Carolina" "South Dakota"
[40] "Tennessee" "Utah" "Vermont"
[43] "Virginia" "Washington" "West Virginia"
[46] "Wisconsin" "Wyoming"
There are variables, like PublicWork and PrivateWork, that are almost perfectly correlated with each other. This adds uncessary noise to kmeans.
cor_mat <- cor(data)
kable(round(cor_mat, 2), format = "html") %>%
kable_styling(full_width = F, font_size = 10)
| TotalPop | Men | Women | VotingAgeCitizen | MeanCommute | Employed | Hispanic_ | White_ | Black_ | Native_ | Asian_ | Pacific_ | Poverty_ | ChildPoverty_ | Professional_ | Service_ | Office_ | Construction_ | Production_ | Drive_ | Carpool_ | Transit_ | Walk_ | OtherTransp_ | WorkAtHome_ | PrivateWork_ | PublicWork_ | SelfEmployed_ | FamilyWork_ | Unemployment_ | income_weighted | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TotalPop | 1.00 | -0.18 | 0.18 | -0.55 | 0.47 | -0.19 | 0.53 | -0.41 | 0.11 | -0.26 | 0.21 | -0.10 | 0.14 | 0.17 | -0.04 | 0.09 | 0.23 | -0.18 | 0.01 | -0.08 | -0.02 | 0.17 | -0.20 | -0.04 | -0.02 | 0.29 | -0.31 | 0.00 | -0.21 | 0.28 | 0.03 |
| Men | -0.18 | 1.00 | -1.00 | -0.17 | -0.64 | 0.18 | 0.06 | 0.21 | -0.69 | 0.61 | 0.11 | 0.24 | -0.39 | -0.40 | -0.31 | 0.02 | 0.05 | 0.67 | -0.01 | 0.13 | 0.59 | -0.41 | 0.10 | 0.26 | 0.30 | -0.37 | 0.25 | 0.43 | 0.59 | -0.48 | -0.16 |
| Women | 0.18 | -1.00 | 1.00 | 0.17 | 0.64 | -0.18 | -0.06 | -0.21 | 0.69 | -0.61 | -0.11 | -0.24 | 0.39 | 0.40 | 0.31 | -0.02 | -0.05 | -0.67 | 0.01 | -0.13 | -0.59 | 0.41 | -0.10 | -0.26 | -0.30 | 0.37 | -0.25 | -0.43 | -0.59 | 0.48 | 0.16 |
| VotingAgeCitizen | -0.55 | -0.17 | 0.17 | 1.00 | -0.24 | 0.14 | -0.75 | 0.61 | 0.02 | -0.03 | -0.34 | -0.13 | 0.04 | 0.02 | 0.01 | -0.19 | -0.36 | 0.07 | 0.21 | 0.16 | -0.32 | -0.16 | 0.19 | -0.18 | -0.07 | -0.04 | 0.01 | 0.10 | 0.12 | -0.15 | -0.05 |
| MeanCommute | 0.47 | -0.64 | 0.64 | -0.24 | 1.00 | -0.08 | 0.26 | -0.54 | 0.48 | -0.51 | 0.39 | 0.08 | 0.02 | 0.04 | 0.44 | 0.18 | 0.06 | -0.65 | -0.38 | -0.46 | -0.29 | 0.62 | 0.07 | 0.09 | -0.08 | 0.19 | -0.04 | -0.43 | -0.66 | 0.51 | 0.48 |
| Employed | -0.19 | 0.18 | -0.18 | 0.14 | -0.08 | 1.00 | -0.20 | 0.28 | -0.26 | -0.02 | 0.06 | -0.03 | -0.77 | -0.84 | 0.62 | -0.32 | -0.50 | -0.28 | -0.35 | -0.35 | -0.40 | 0.28 | 0.50 | 0.10 | 0.48 | 0.00 | -0.06 | 0.20 | 0.12 | -0.61 | 0.71 |
| Hispanic_ | 0.53 | 0.06 | -0.06 | -0.75 | 0.26 | -0.20 | 1.00 | -0.63 | -0.13 | 0.11 | 0.22 | 0.01 | 0.15 | 0.14 | 0.00 | 0.44 | 0.29 | -0.04 | -0.39 | -0.16 | 0.19 | 0.14 | -0.13 | 0.17 | 0.19 | -0.03 | 0.01 | 0.05 | -0.14 | 0.29 | 0.05 |
| White_ | -0.41 | 0.21 | -0.21 | 0.61 | -0.54 | 0.28 | -0.63 | 1.00 | -0.45 | -0.02 | -0.60 | -0.43 | -0.25 | -0.24 | -0.15 | -0.48 | -0.16 | 0.25 | 0.44 | 0.41 | -0.22 | -0.42 | -0.06 | -0.44 | 0.13 | 0.27 | -0.37 | 0.22 | 0.27 | -0.56 | -0.22 |
| Black_ | 0.11 | -0.69 | 0.69 | 0.02 | 0.48 | -0.26 | -0.13 | -0.45 | 1.00 | -0.31 | -0.10 | -0.17 | 0.48 | 0.48 | 0.23 | -0.05 | -0.15 | -0.40 | 0.01 | -0.20 | -0.35 | 0.39 | 0.00 | 0.09 | -0.45 | 0.00 | 0.18 | -0.52 | -0.37 | 0.60 | 0.14 |
| Native_ | -0.26 | 0.61 | -0.61 | -0.03 | -0.51 | -0.02 | 0.11 | -0.02 | -0.31 | 1.00 | -0.11 | 0.01 | 0.00 | -0.01 | -0.12 | 0.04 | -0.04 | 0.49 | -0.15 | -0.03 | 0.32 | -0.19 | 0.28 | 0.40 | 0.15 | -0.60 | 0.52 | 0.35 | 0.53 | -0.06 | -0.10 |
| Asian_ | 0.21 | 0.11 | -0.11 | -0.34 | 0.39 | 0.06 | 0.22 | -0.60 | -0.10 | -0.11 | 1.00 | 0.89 | -0.31 | -0.30 | 0.09 | 0.41 | 0.15 | -0.18 | -0.35 | -0.36 | 0.41 | 0.27 | 0.16 | 0.34 | 0.05 | -0.19 | 0.20 | 0.02 | -0.12 | 0.03 | 0.28 |
| Pacific_ | -0.10 | 0.24 | -0.24 | -0.13 | 0.08 | -0.03 | 0.01 | -0.43 | -0.17 | 0.01 | 0.89 | 1.00 | -0.22 | -0.22 | -0.09 | 0.37 | 0.20 | 0.04 | -0.21 | -0.20 | 0.57 | 0.05 | 0.13 | 0.33 | 0.03 | -0.30 | 0.29 | 0.09 | 0.04 | -0.10 | 0.04 |
| Poverty_ | 0.14 | -0.39 | 0.39 | 0.04 | 0.02 | -0.77 | 0.15 | -0.25 | 0.48 | 0.00 | -0.31 | -0.22 | 1.00 | 0.98 | -0.39 | 0.22 | 0.13 | 0.11 | 0.33 | 0.14 | 0.01 | -0.02 | -0.24 | -0.04 | -0.46 | -0.02 | 0.07 | -0.12 | -0.06 | 0.61 | -0.56 |
| ChildPoverty_ | 0.17 | -0.40 | 0.40 | 0.02 | 0.04 | -0.84 | 0.14 | -0.24 | 0.48 | -0.01 | -0.30 | -0.22 | 0.98 | 1.00 | -0.48 | 0.23 | 0.25 | 0.13 | 0.40 | 0.26 | 0.06 | -0.13 | -0.37 | -0.12 | -0.52 | 0.05 | 0.00 | -0.15 | -0.13 | 0.62 | -0.63 |
| Professional_ | -0.04 | -0.31 | 0.31 | 0.01 | 0.44 | 0.62 | 0.00 | -0.15 | 0.23 | -0.12 | 0.09 | -0.09 | -0.39 | -0.48 | 1.00 | -0.32 | -0.60 | -0.61 | -0.65 | -0.71 | -0.52 | 0.70 | 0.64 | 0.37 | 0.43 | -0.19 | 0.25 | -0.13 | -0.17 | -0.02 | 0.90 |
| Service_ | 0.09 | 0.02 | -0.02 | -0.19 | 0.18 | -0.32 | 0.44 | -0.48 | -0.05 | 0.04 | 0.41 | 0.37 | 0.22 | 0.23 | -0.32 | 1.00 | 0.37 | 0.04 | -0.33 | -0.13 | 0.27 | 0.10 | -0.02 | 0.17 | -0.12 | -0.05 | 0.05 | 0.02 | -0.14 | 0.37 | -0.13 |
| Office_ | 0.23 | 0.05 | -0.05 | -0.36 | 0.06 | -0.50 | 0.29 | -0.16 | -0.15 | -0.04 | 0.15 | 0.20 | 0.13 | 0.25 | -0.60 | 0.37 | 1.00 | 0.10 | 0.17 | 0.42 | 0.43 | -0.39 | -0.61 | -0.23 | -0.10 | 0.38 | -0.38 | -0.07 | -0.22 | 0.12 | -0.53 |
| Construction_ | -0.18 | 0.67 | -0.67 | 0.07 | -0.65 | -0.28 | -0.04 | 0.25 | -0.40 | 0.49 | -0.18 | 0.04 | 0.11 | 0.13 | -0.61 | 0.04 | 0.10 | 1.00 | 0.21 | 0.48 | 0.59 | -0.64 | -0.25 | -0.09 | 0.00 | -0.43 | 0.28 | 0.52 | 0.63 | -0.32 | -0.59 |
| Production_ | 0.01 | -0.01 | 0.01 | 0.21 | -0.38 | -0.35 | -0.39 | 0.44 | 0.01 | -0.15 | -0.35 | -0.21 | 0.33 | 0.40 | -0.65 | -0.33 | 0.17 | 0.21 | 1.00 | 0.67 | 0.03 | -0.53 | -0.54 | -0.51 | -0.53 | 0.44 | -0.43 | -0.14 | 0.02 | -0.05 | -0.67 |
| Drive_ | -0.08 | 0.13 | -0.13 | 0.16 | -0.46 | -0.35 | -0.16 | 0.41 | -0.20 | -0.03 | -0.36 | -0.20 | 0.14 | 0.26 | -0.71 | -0.13 | 0.42 | 0.48 | 0.67 | 1.00 | 0.26 | -0.92 | -0.84 | -0.71 | -0.29 | 0.36 | -0.41 | 0.08 | 0.08 | -0.23 | -0.73 |
| Carpool_ | -0.02 | 0.59 | -0.59 | -0.32 | -0.29 | -0.40 | 0.19 | -0.22 | -0.35 | 0.32 | 0.41 | 0.57 | 0.01 | 0.06 | -0.52 | 0.27 | 0.43 | 0.59 | 0.03 | 0.26 | 1.00 | -0.51 | -0.25 | 0.23 | 0.06 | -0.37 | 0.29 | 0.30 | 0.23 | -0.05 | -0.46 |
| Transit_ | 0.17 | -0.41 | 0.41 | -0.16 | 0.62 | 0.28 | 0.14 | -0.42 | 0.39 | -0.19 | 0.27 | 0.05 | -0.02 | -0.13 | 0.70 | 0.10 | -0.39 | -0.64 | -0.53 | -0.92 | -0.51 | 1.00 | 0.68 | 0.45 | 0.06 | -0.13 | 0.24 | -0.29 | -0.25 | 0.31 | 0.71 |
| Walk_ | -0.20 | 0.10 | -0.10 | 0.19 | 0.07 | 0.50 | -0.13 | -0.06 | 0.00 | 0.28 | 0.16 | 0.13 | -0.24 | -0.37 | 0.64 | -0.02 | -0.61 | -0.25 | -0.54 | -0.84 | -0.25 | 0.68 | 1.00 | 0.72 | 0.30 | -0.47 | 0.47 | 0.11 | 0.16 | 0.03 | 0.66 |
| OtherTransp_ | -0.04 | 0.26 | -0.26 | -0.18 | 0.09 | 0.10 | 0.17 | -0.44 | 0.09 | 0.40 | 0.34 | 0.33 | -0.04 | -0.12 | 0.37 | 0.17 | -0.23 | -0.09 | -0.51 | -0.71 | 0.23 | 0.45 | 0.72 | 1.00 | 0.27 | -0.57 | 0.60 | 0.05 | 0.08 | 0.32 | 0.41 |
| WorkAtHome_ | -0.02 | 0.30 | -0.30 | -0.07 | -0.08 | 0.48 | 0.19 | 0.13 | -0.45 | 0.15 | 0.05 | 0.03 | -0.46 | -0.52 | 0.43 | -0.12 | -0.10 | 0.00 | -0.53 | -0.29 | 0.06 | 0.06 | 0.30 | 0.27 | 1.00 | -0.17 | 0.00 | 0.54 | 0.32 | -0.40 | 0.34 |
| PrivateWork_ | 0.29 | -0.37 | 0.37 | -0.04 | 0.19 | 0.00 | -0.03 | 0.27 | 0.00 | -0.60 | -0.19 | -0.30 | -0.02 | 0.05 | -0.19 | -0.05 | 0.38 | -0.43 | 0.44 | 0.36 | -0.37 | -0.13 | -0.47 | -0.57 | -0.17 | 1.00 | -0.95 | -0.36 | -0.46 | 0.01 | -0.14 |
| PublicWork_ | -0.31 | 0.25 | -0.25 | 0.01 | -0.04 | -0.06 | 0.01 | -0.37 | 0.18 | 0.52 | 0.20 | 0.29 | 0.07 | 0.00 | 0.25 | 0.05 | -0.38 | 0.28 | -0.43 | -0.41 | 0.29 | 0.24 | 0.47 | 0.60 | 0.00 | -0.95 | 1.00 | 0.05 | 0.27 | 0.15 | 0.20 |
| SelfEmployed_ | 0.00 | 0.43 | -0.43 | 0.10 | -0.43 | 0.20 | 0.05 | 0.22 | -0.52 | 0.35 | 0.02 | 0.09 | -0.12 | -0.15 | -0.13 | 0.02 | -0.07 | 0.52 | -0.14 | 0.08 | 0.30 | -0.29 | 0.11 | 0.05 | 0.54 | -0.36 | 0.05 | 1.00 | 0.63 | -0.43 | -0.13 |
| FamilyWork_ | -0.21 | 0.59 | -0.59 | 0.12 | -0.66 | 0.12 | -0.14 | 0.27 | -0.37 | 0.53 | -0.12 | 0.04 | -0.06 | -0.13 | -0.17 | -0.14 | -0.22 | 0.63 | 0.02 | 0.08 | 0.23 | -0.25 | 0.16 | 0.08 | 0.32 | -0.46 | 0.27 | 0.63 | 1.00 | -0.44 | -0.18 |
| Unemployment_ | 0.28 | -0.48 | 0.48 | -0.15 | 0.51 | -0.61 | 0.29 | -0.56 | 0.60 | -0.06 | 0.03 | -0.10 | 0.61 | 0.62 | -0.02 | 0.37 | 0.12 | -0.32 | -0.05 | -0.23 | -0.05 | 0.31 | 0.03 | 0.32 | -0.40 | 0.01 | 0.15 | -0.43 | -0.44 | 1.00 | 0.00 |
| income_weighted | 0.03 | -0.16 | 0.16 | -0.05 | 0.48 | 0.71 | 0.05 | -0.22 | 0.14 | -0.10 | 0.28 | 0.04 | -0.56 | -0.63 | 0.90 | -0.13 | -0.53 | -0.59 | -0.67 | -0.73 | -0.46 | 0.71 | 0.66 | 0.41 | 0.34 | -0.14 | 0.20 | -0.13 | -0.18 | 0.00 | 1.00 |
PC1 explains 24.05% of the varaince in the original data set. PC1 and PC2 combined explain 46.04%
pca.sc <- prcomp(data, scale = TRUE)
summary(pca.sc)
Importance of components:
PC1 PC2 PC3 PC4 PC5 PC6 PC7
Standard deviation 2.7305 2.6108 2.2165 1.8331 1.39835 1.0981 1.02592
Proportion of Variance 0.2405 0.2199 0.1585 0.1084 0.06308 0.0389 0.03395
Cumulative Proportion 0.2405 0.4604 0.6189 0.7273 0.79034 0.8292 0.86319
PC8 PC9 PC10 PC11 PC12 PC13 PC14
Standard deviation 0.95189 0.83542 0.71417 0.64627 0.62422 0.59567 0.50603
Proportion of Variance 0.02923 0.02251 0.01645 0.01347 0.01257 0.01145 0.00826
Cumulative Proportion 0.89242 0.91493 0.93139 0.94486 0.95743 0.96887 0.97713
PC15 PC16 PC17 PC18 PC19 PC20 PC21
Standard deviation 0.41380 0.3431 0.32286 0.31120 0.25988 0.21717 0.18538
Proportion of Variance 0.00552 0.0038 0.00336 0.00312 0.00218 0.00152 0.00111
Cumulative Proportion 0.98266 0.9865 0.98982 0.99294 0.99512 0.99664 0.99775
PC22 PC23 PC24 PC25 PC26 PC27 PC28
Standard deviation 0.16042 0.1369 0.10672 0.08802 0.07514 0.02153 0.0003769
Proportion of Variance 0.00083 0.0006 0.00037 0.00025 0.00018 0.00001 0.0000000
Cumulative Proportion 0.99858 0.9992 0.99955 0.99980 0.99999 1.00000 1.0000000
PC29 PC30 PC31
Standard deviation 0.0002105 0.0001032 5.562e-15
Proportion of Variance 0.0000000 0.0000000 0.000e+00
Cumulative Proportion 1.0000000 1.0000000 1.000e+00
Looking at the Scree plot, the amount of new variation explained by PCs elbows around the fith or sixth PC. An eigenvalue greater than 1 means that PC accounts for more variation than at least one of the original features, so you can use this method on top of the scree plot to decide the number of PCs. The first 7 PCs have an eigenvalue greater than 1. Since the Scree plot shows a very slight increase from PC6 to PC7, in my opinion, the best bet is to use the first 6 PCS only.
fviz_eig(pca.sc, addlabels = TRUE)
round(pca.sc$sdev^2, 3)
[1] 7.456 6.816 4.913 3.360 1.955 1.206 1.053 0.906 0.698 0.510 0.418 0.390
[13] 0.355 0.256 0.171 0.118 0.104 0.097 0.068 0.047 0.034 0.026 0.019 0.011
[25] 0.008 0.006 0.000 0.000 0.000 0.000 0.000
District of Columbia is very different than the other states in PC1. You could possibly say this about Hawaii in PC3 and Alaska in PC2 as well, but the difference isn’t as large. Everything else appears normal.
dat_pca <- pca.sc$x[ , 1:7]
ggplot(dat_pca, aes(x = PC1, y = PC2, color = PC3)) +
geom_text(label = state) +
scale_color_gradient2(midpoint = mean(dat_pca[,3]),
low = "#00AFBB", mid = "#E7B800",
high = "#FC4E07")
20 states in cluster1 and 31 in cluster2.
set.seed(1)
km2 <- kmeans(dat_pca, centers = 2, nstart = 20)
kable(table(km2$cluster))
| Var1 | Freq |
|---|---|
| 1 | 20 |
| 2 | 31 |
They match since centers tells you the average PC value for each cluster. The centroid is the average.
km2$centers
PC1 PC2 PC3 PC4 PC5 PC6 PC7
1 1.015126 -2.513504 0.5435660 0.1841649 0.15079954 -0.04767570 -0.05434532
2 -0.654920 1.621615 -0.3506877 -0.1188161 -0.09729003 0.03075852 0.03506150
m <- as.data.frame(pca.sc$x)
mean(m$PC1[km2$cluster == 1])
[1] 1.015126
mean(m$PC2[km2$cluster == 1])
[1] -2.513504
mean(m$PC3[km2$cluster == 1])
[1] 0.543566
mean(m$PC4[km2$cluster == 1])
[1] 0.1841649
mean(m$PC1[km2$cluster == 2])
[1] -0.65492
mean(m$PC2[km2$cluster == 2])
[1] 1.621615
mean(m$PC3[km2$cluster == 2])
[1] -0.3506877
mean(m$PC4[km2$cluster == 2])
[1] -0.1188161
It elbows around k = 6, so after 6 clusters the decreases in total within cluster sum of squares is much smaller compared than the earlier drops. I would choose 6 clusters.
fviz_nbclust(dat_pca, kmeans, method = "wss") +
labs(subtitle = "Elbow Method")
8 clusters gives the highest silhouette width, but its important to note the improvement over nearby values of K is pretty small.
fviz_nbclust(dat_pca, kmeans, method = "silhouette") +
labs(subtitle = "Elbow Method")
Cluster 1 has 1 observation. Cluster 2 has 10 observations. Cluster 3 has 6 observations. Cluster 4 has 17 observations. Cluster 5 has 15 observations. Cluster 6 has 2 observations.
set.seed(1)
km2 <- kmeans(dat_pca, centers = 6, nstart = 20)
kable(table(km2$cluster))
| Var1 | Freq |
|---|---|
| 1 | 1 |
| 2 | 10 |
| 3 | 6 |
| 4 | 17 |
| 5 | 15 |
| 6 | 2 |
The state in its own cluster, District of Columbia, was the state I said was weirdly different than the other states in question (f). The two states in their own cluster, Alaska and Hawaii, were the two other states that I said may be different than the rest. This is most likely because they’re the three weird states that aren’t really inside the United States. I don’t know exactly how to describe them.
acs$State[km2$cluster == 1]
[1] "District of Columbia"
acs$State[km2$cluster == 6]
[1] "Alaska" "Hawaii"
My geography isn’t great, but it seems states located next/near each other typically fell in the same cluster.
acs$State[km2$cluster == 2]
[1] "Connecticut" "Delaware" "Illinois" "Maryland"
[5] "Massachusetts" "New Jersey" "New York" "Pennsylvania"
[9] "Rhode Island" "Virginia"
acs$State[km2$cluster == 3]
[1] "Arizona" "California" "Florida" "Nevada" "New Mexico"
[6] "Texas"
acs$State[km2$cluster == 4]
[1] "Colorado" "Idaho" "Iowa" "Kansas"
[5] "Maine" "Minnesota" "Montana" "Nebraska"
[9] "New Hampshire" "North Dakota" "Oregon" "South Dakota"
[13] "Utah" "Vermont" "Washington" "Wisconsin"
[17] "Wyoming"
acs$State[km2$cluster == 5]
[1] "Alabama" "Arkansas" "Georgia" "Indiana"
[5] "Kentucky" "Louisiana" "Michigan" "Mississippi"
[9] "Missouri" "North Carolina" "Ohio" "Oklahoma"
[13] "South Carolina" "Tennessee" "West Virginia"
scat <- as.data.frame(dat_pca)
scat <- mutate(scat, State = state, Cluster = km2$cluster)
ggplot(scat, aes(x = PC1, y = PC2, label = State, size = PC3)) +
geom_text(aes(color = Cluster))
scat <- scat %>% filter(!State %in% c("District of Columbia", "Alaska", "Hawaii"))
ggplot(scat, aes(x = PC1, y = PC2, label = State, size = PC3)) +
geom_text(aes(color = Cluster))
The top 3 variables in PC1 are Transit, Drive, and Income_weighted. The top 3 variables in PC2 are Men, Women, and Child Poverty. The top 3 variables in PC3 are White, Carpool, and Service. PC1 variables make sense, since states like New York and Massachusetts are on the far left and are heavy in transit and income, and lower in drive. Alabama, Wyoming, etc. being on the far right then also makes sense. PC2 variables make sense. States like Mississippi and Alabama which are probably high in child poverty are high up on the plot. Montana, Wyoming, North Dakota must be low on child poverty. PC3 variables make sense, larger states are typically more white and smaller ones are more diverse (California).
pca.sc$rotation[order(abs(pca.sc$rotation[,1]), decreasing = TRUE)[1:3], 1]
Transit_ Drive_ income_weighted
-0.3317263 0.3109031 -0.3106181
pca.sc$rotation[order(abs(pca.sc$rotation[,2]), decreasing = TRUE)[1:3], 2]
Men Women ChildPoverty_
-0.2821294 0.2821294 0.2681130
pca.sc$rotation[order(abs(pca.sc$rotation[,3]), decreasing = TRUE)[1:3], 3]
White_ Carpool_ Service_
0.347715 -0.302438 -0.284190