library(DAAG)
library(magrittr)

Question 9.1:

Using the same crime data set uscrime.txt as in Question 8.2, apply Principal Component Analysis and then create a regression model using the first few principal components. Specify your new model in terms of the original variables (not the principal components), and compare its quality to that of your solution to Question 8.2. You can use the R function prcomp for PCA. (Note that to first scale the data, you can include scale. = TRUE to scale as part of the PCA function. Don’t forget that, to make a prediction for the new city, you’ll need to unscale the coefficients (i.e., do the scaling calculation in reverse)!)

#set the working directory so we can correctly pull in the data
setwd("C:/Users/tarra/Documents/Intro to Analytics Modeling/Fall2020hw6/data 9.1")

uscrime_data <- read.table("uscrime.txt",stringsAsFactors = FALSE, header = TRUE)

uscrime_prc <- prcomp(uscrime_data[,-16], center=TRUE, scale=TRUE)

summary(uscrime_prc)
## Importance of components:
##                           PC1    PC2    PC3     PC4     PC5     PC6     PC7
## Standard deviation     2.4534 1.6739 1.4160 1.07806 0.97893 0.74377 0.56729
## Proportion of Variance 0.4013 0.1868 0.1337 0.07748 0.06389 0.03688 0.02145
## Cumulative Proportion  0.4013 0.5880 0.7217 0.79920 0.86308 0.89996 0.92142
##                            PC8     PC9    PC10    PC11    PC12    PC13   PC14
## Standard deviation     0.55444 0.48493 0.44708 0.41915 0.35804 0.26333 0.2418
## Proportion of Variance 0.02049 0.01568 0.01333 0.01171 0.00855 0.00462 0.0039
## Cumulative Proportion  0.94191 0.95759 0.97091 0.98263 0.99117 0.99579 0.9997
##                           PC15
## Standard deviation     0.06793
## Proportion of Variance 0.00031
## Cumulative Proportion  1.00000
screeplot(uscrime_prc, main = "Scree Plot", type = "line")

looking at the scree plot, we can see that we can choose the first five Pc’s in our model because they’re all above the STD. Next we want to focus on the model.

#we set the number of PC's we want to test, and then combine them with our original crime data. 
pc = 5

uscrime_matrix <- cbind(uscrime_prc$x[,1:pc],uscrime_data[,16])

#now that its combines, we can creat the LR model. 

uscrime_model <- lm(V6~., data = as.data.frame(uscrime_matrix))

summary(uscrime_model)
## 
## Call:
## lm(formula = V6 ~ ., data = as.data.frame(uscrime_matrix))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -420.79 -185.01   12.21  146.24  447.86 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   905.09      35.59  25.428  < 2e-16 ***
## PC1            65.22      14.67   4.447 6.51e-05 ***
## PC2           -70.08      21.49  -3.261  0.00224 ** 
## PC3            25.19      25.41   0.992  0.32725    
## PC4            69.45      33.37   2.081  0.04374 *  
## PC5          -229.04      36.75  -6.232 2.02e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 244 on 41 degrees of freedom
## Multiple R-squared:  0.6452, Adjusted R-squared:  0.6019 
## F-statistic: 14.91 on 5 and 41 DF,  p-value: 2.446e-08

Now we want to focus on our transformation so that we can run our new model on our variables from 8.2 and compare the quality.

#what we want to do here is to preform the transformation as explain in this week's lessons. 
#to start, we need our intercept 
intercept <- uscrime_model$coefficients[1]

#now we need our other coefficients to make the beta vector 
beta_vector <- uscrime_model$coefficients[2:(pc+1)]

#next we take the rotated matrix and multiply it be our coefficients to creat the alpha vector. 
alpha_vector <- uscrime_prc$rotation[,1:pc] %*% beta_vector

#we can get our original alpha values by doing alpha_vector/signma, and our original beta by subtracting the intercept from the sum of (alpha*mu)/sigma 
#for this we need to get mu and sigma. 
mu <- sapply(uscrime_data[,1:15], mean)
sigma <- sapply(uscrime_data[,1:15], sd)

original_alpha <- alpha_vector / sigma
original_beta <- intercept - sum(alpha_vector*mu / sigma)

#with this we can now make our estimates based on the model y = aX +b 
estimates <- as.matrix(uscrime_data[,1:15]) %*% original_alpha + original_beta

#We can use this to calculate R^2 and adjusted R^2 to compare to our 8.2 answer
SSE = sum((estimates - uscrime_data[,16])^2)
SStot = sum((uscrime_data[,16] - mean(uscrime_data[,16]))^2)
R2 <- 1 - SSE/SStot
R2
## [1] 0.6451941
R2_adjust <- R2 - (1-R2)*pc/(nrow(uscrime_data)-pc-1)
R2_adjust
## [1] 0.601925
#we can now move on to using last weeks data on our new model to compare. 
new_city <- data.frame(M= 14.0, So = 0, Ed = 10.0, Po1 = 12.0, Po2 = 15.5,
                    LF = 0.640, M.F = 94.0, Pop = 150, NW = 1.1, U1 = 0.120, 
                    U2 = 3.6, Wealth = 3200, Ineq = 20.1, Prob = 0.040, 
                    Time = 39.0)

pred_newcity <- data.frame(predict(uscrime_prc, new_city)) 

pred_newcity_model <- predict(uscrime_model, pred_newcity)

pred_newcity_model
##        1 
## 1388.926

Comparing this weeks predictions to last weeks predictions, we can see that PCA model returned pretty close results. With the prediction of 1389 compared to last weeks 1304 and an r squared of 0.645 to 0.671, it seems that this model is a little less satisfactory overall. Granted the data we used was a small amount, a bigger data set might prove to be the same if not better than last week’s. Overall we’ve shown that a PCA model can give just about the same accuracy when observing less predictors.