library(readxl)
library(dplyr)
Junkins <- read_excel("C:/Users/jacob/OneDrive - University of Texas at San Antonio/Courses/Stats for Demographic Data 2/Homework 1/Junkins Data.xlsx") %>% as.data.frame()
Junkins$STATE_ABBR <- NULL
# Look up each state name in the built-in state.name vector; names not found there (e.g. DC) get NA
Junkins$statabb <- state.abb[match(Junkins$statename, state.name)]
#Junkins <- filter(Junkins, STATEAB != "DC")
# prcomp() has no rownames argument, so the one passed originally was discarded with a warning
Policy.pc <- prcomp(Junkins[, c("cohabit", "relcons", "abortion", "famplnpw", "perBA")], center = TRUE, scale. = TRUE, retx = TRUE)
screeplot(Policy.pc, type = "l", main = "Scree Plot")
abline(h = 1)
Interpretation
Principal Component Analysis (PCA) was applied to five state-level variables: cohabit, relcons, abortion, famplnpw (family planning expenditures), and perBA (bachelor’s degree attainment). The scree plot suggests that two components are sufficient to explain the variation. PC1 and PC2 together explain 62.7% of the variance in these variables.
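The cumulative share of variance and the eigenvalue cutoff drawn at 1 on the scree plot can be checked numerically. The sketch below is only illustrative: it uses the built-in USArrests data as a stand-in (an assumption, since the Junkins file is not reproduced here), and the names demo.pc, eigenvalues, prop.var, cum.var, and n.keep are hypothetical. In the report itself, Policy.pc would take the place of demo.pc.

```r
# Stand-in data: USArrests ships with base R; substitute Policy.pc in the report
demo.pc <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# With scaled inputs, the squared component standard deviations are the
# eigenvalues of the correlation matrix (the values drawn on the scree plot)
eigenvalues <- demo.pc$sdev^2

# Proportion of variance per component and the running total
prop.var <- eigenvalues / sum(eigenvalues)
cum.var <- cumsum(prop.var)

# Kaiser criterion: count components whose eigenvalue exceeds 1
n.keep <- sum(eigenvalues > 1)

print(round(rbind(eigenvalue = eigenvalues,
                  proportion = prop.var,
                  cumulative = cum.var), 3))
print(n.keep)
```

The same figures are available from summary(demo.pc)$importance; computing them by hand makes it explicit that the horizontal line at 1 is an eigenvalue threshold.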
The density plot and ellipse plots suggest that the strongest divergence occurs between the Northeast and the South.
PC1
The loadings on PC1 are similar in size (all above 0.4), with the exception of famplnpw (family planning expenditures), which loads at only 0.05. PC1 depends most heavily on educational attainment (0.57) and cohabitation behaviors (0.47). Given these loading values, PC1 may be capturing a measure of a progressive lifestyle. It follows that non-coastal states like Missouri and Kentucky score highly on PC1.
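Loadings like the ones quoted above are stored in the rotation matrix of a prcomp object. As a minimal sketch, again on the stand-in USArrests data (an assumption) and with a hypothetical helper named rank.loadings, the variables can be ordered by the absolute size of their loading on a chosen component:

```r
# Stand-in data: USArrests ships with base R; substitute Policy.pc in the report
demo.pc <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Columns of the rotation matrix are the loading vectors, one per component
loadings <- demo.pc$rotation

# Hypothetical helper: order variables by absolute loading on one component
rank.loadings <- function(loadings, component) {
  v <- loadings[, component]
  v[order(abs(v), decreasing = TRUE)]
}

ranked.pc1 <- rank.loadings(loadings, "PC1")
print(round(ranked.pc1, 2))
```

Ranking by absolute value keeps the sign visible while putting the most influential variables first, which is the order in which loadings are usually discussed.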
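# col = Species removed: it looks like a leftover from an iris example, and both layers set col = "black" anyway
ggplot(Policy, aes(PC1, PC2, fill = region)) +
  stat_ellipse(geom = "polygon", col = "black", alpha = 0.5) +
  geom_point(shape = 21, col = "black")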
ggplot(Policy, aes(PC1, PC2, col = region, fill = region)) +
  stat_density_2d(geom = "contour") +
  geom_point()
library(usmap)
library(ggplot2)
plot_usmap(data = Policy, values = "PC1", labels = TRUE) +
  scale_fill_continuous(low = "blue", high = "orange") +
  ggtitle("PC1 at State-Level")
plot_usmap(data = Policy, values = "PC2", labels = TRUE) +
  scale_fill_continuous(low = "blue", high = "orange") +
  ggtitle("PC2 at State-Level")
PC2
The loadings on PC2 are more varied than those on PC1. The variable famplnpw loads most heavily on PC2 (-0.93), followed by perBA (-0.29) and relcons (-0.21). PC2 appears to be capturing a conservative family orientation (associated with a lack of family planning and abortion). It follows that states like California would score low on this component.
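Claims about which states score high or low on a component can be checked by sorting the component scores. The sketch below is illustrative only: it uses the stand-in USArrests data (an assumption), whose row names happen to be state names, and the names scores and scores.by.pc2 are hypothetical. Note that the sign of a principal component is arbitrary, so "high" and "low" are only meaningful relative to the signs of the loadings.

```r
# Stand-in data: USArrests ships with base R; substitute Policy.pc in the report
demo.pc <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Pair each observation's label with its scores on the first two components
scores <- data.frame(state = rownames(USArrests),
                     demo.pc$x[, 1:2],
                     row.names = NULL)

# Sort ascending by PC2 so the lowest-scoring states come first
scores.by.pc2 <- scores[order(scores$PC2), ]

print(head(scores.by.pc2, 5))  # lowest PC2 scores
print(tail(scores.by.pc2, 5))  # highest PC2 scores
```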