Question 11.1

library(stats)
library(caret)
## Loading required package: lattice
## Loading required package: ggplot2
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
crime <- read.table("uscrime.txt", header = T)
head(crime)
##      M So   Ed  Po1  Po2    LF   M.F Pop   NW    U1  U2 Wealth Ineq     Prob
## 1 15.1  1  9.1  5.8  5.6 0.510  95.0  33 30.1 0.108 4.1   3940 26.1 0.084602
## 2 14.3  0 11.3 10.3  9.5 0.583 101.2  13 10.2 0.096 3.6   5570 19.4 0.029599
## 3 14.2  1  8.9  4.5  4.4 0.533  96.9  18 21.9 0.094 3.3   3180 25.0 0.083401
## 4 13.6  0 12.1 14.9 14.1 0.577  99.4 157  8.0 0.102 3.9   6730 16.7 0.015801
## 5 14.1  0 12.1 10.9 10.1 0.591  98.5  18  3.0 0.091 2.0   5780 17.4 0.041399
## 6 12.1  0 11.0 11.8 11.5 0.547  96.4  25  4.4 0.084 2.9   6890 12.6 0.034201
##      Time Crime
## 1 26.2011   791
## 2 25.2999  1635
## 3 24.3006   578
## 4 29.9012  1969
## 5 21.2998  1234
## 6 20.9995   682

Using caret’s preProcess() function, which can transform our data (centering, scaling, etc.), we’ll start off with the stepwise regression analysis.

set.seed(1)

training.sample <- createDataPartition(crime$Crime, p=0.75, list=FALSE)
x_train <- crime[ training.sample, ]
x_test <- crime[-training.sample, ]

# Predictor variables (all columns except the response, Crime, in column 16)
x <- x_train[,-16]
# Response variable
y <- x_train$Crime

# Center and scale the predictors with caret's preProcess.
# Note: the train() call below is fit on the raw x_train, so this scaled
# copy is not passed to the stepwise model (caret reports "No pre-processing").
preprocessdata <- preProcess(x, method = c("center","scale"))
x <- predict(preprocessdata, x)

# Build the stepwise model (lmStepAIC = stepwise selection by AIC) with 10-fold CV
stepwise_crime <- train(Crime ~ .,
                        data = x_train,
                        method = "lmStepAIC",
                        trControl = trainControl(method = "cv", number = 10),
                        trace = 0)

summary(stepwise_crime)
## 
## Call:
## lm(formula = .outcome ~ Ed + Po1 + NW + Wealth + Ineq + Prob + 
##     Time, data = dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -322.28 -108.06   -2.80   86.57  449.48 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -4176.7137  1046.6581  -3.991 0.000431 ***
## Ed            141.3940    54.2985   2.604 0.014579 *  
## Po1            72.9545    22.2565   3.278 0.002794 ** 
## NW             10.8542     5.8286   1.862 0.073097 .  
## Wealth          0.2546     0.1053   2.417 0.022395 *  
## Ineq          108.7116    20.9647   5.185 1.67e-05 ***
## Prob        -5801.7963  2230.5267  -2.601 0.014679 *  
## Time          -12.2066     6.7594  -1.806 0.081700 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 201.2 on 28 degrees of freedom
## Multiple R-squared:  0.7696, Adjusted R-squared:  0.712 
## F-statistic: 13.36 on 7 and 28 DF,  p-value: 1.915e-07
stepwise_crime
## Linear Regression with Stepwise Selection 
## 
## 36 samples
## 15 predictors
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 32, 33, 32, 32, 32, 33, ... 
## Resampling results:
## 
##   RMSE      Rsquared   MAE     
##   300.4982  0.6244653  262.5811

From the summary of our stepwise model, after splitting the data and cross-validating with 10 folds, we obtain a cross-validated R-squared of 0.6245. The stepwise procedure also removed many of the insignificant predictors, keeping only Ed, Po1, NW, Wealth, Ineq, Prob, and Time.
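For a like-for-like comparison with the penalized models below, we could also score the stepwise model on the 25% hold-out set. A minimal sketch (not run in the original, so no output is shown):

# Score the fitted stepwise model on the held-out test set
stepwise_prediction <- predict(stepwise_crime, x_test)
R2(stepwise_prediction, x_test$Crime)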

Let’s see how the lasso model performs.

In caret’s train() function, the “glmnet” method has an alpha argument that determines the type of penalty: alpha = 1 fits a lasso (pure L1 penalty), alpha = 0 fits a ridge regression (pure L2), and values in between mix the two as an elastic net.
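As an aside (a sketch, not run in the original analysis), setting alpha = 0 in the same call would instead fit the ridge counterpart:

# Hypothetical ridge version: alpha = 0 gives a pure L2 penalty
ridge_model <- train(Crime ~ .,
                     data = x_train,
                     method = "glmnet",
                     tuneGrid = expand.grid(alpha = 0, lambda = 1),
                     trControl = trainControl(method = "cv", number = 10))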

set.seed(1)

lasso_model <- train(Crime ~ .,
                     data = x_train,
                     method = "glmnet",
                     tuneGrid = expand.grid(alpha = 1, lambda = 1),
                     trControl = trainControl(method = "cv", number = 10))

lasso_prediction <- lasso_model %>% predict(x_test) 

R2(lasso_prediction, x_test$Crime)
## [1] 0.6405498
data.frame(as.data.frame.matrix(coef(lasso_model$finalModel, lasso_model$bestTune$lambda)))
##                        X1
## (Intercept) -5702.9324365
## M              39.7677945
## So             83.3703444
## Ed            146.1566028
## Po1            92.9306898
## Po2           -17.4098861
## LF             69.6172765
## M.F            13.5272904
## Pop             0.4729637
## NW              5.6698477
## U1          -1737.9216348
## U2             63.1382814
## Wealth          0.1996086
## Ineq           91.8008485
## Prob        -5418.9663779
## Time           -8.8869386

Based on our lasso model, after 10-fold cross-validation on the training set with alpha = 1 and lambda fixed at 1, the R-squared on the test set was 0.6405498, a slight improvement over our stepwise model.
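Since lambda was fixed at 1 rather than tuned, a natural follow-up (a sketch; the grid values are illustrative assumptions, not from the original) is to let caret search over a lambda grid:

# Tune lambda over a log-spaced grid instead of fixing it at 1
lasso_grid <- expand.grid(alpha = 1, lambda = 10^seq(-2, 2, length.out = 25))
lasso_tuned <- train(Crime ~ .,
                     data = x_train,
                     method = "glmnet",
                     tuneGrid = lasso_grid,
                     trControl = trainControl(method = "cv", number = 10))
lasso_tuned$bestTune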

Lastly, let’s perform an analysis using an elastic net model.

set.seed(1)

elastic_crime <- train(Crime ~ .,
                       data = x_train,
                       method = "glmnet",
                       tuneLength = 25,
                       trControl = trainControl(method = "cv", number = 10))
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
elastic_crime$bestTune
##     alpha   lambda
## 614     1 6.735005
coef(elastic_crime$finalModel, elastic_crime$bestTune$lambda)
## 16 x 1 sparse Matrix of class "dgCMatrix"
##                         1
## (Intercept) -5077.3025530
## M              33.0176924
## So            109.8721805
## Ed            112.9135864
## Po1            91.2513827
## Po2             .        
## LF            261.0881093
## M.F            13.2242283
## Pop             .        
## NW              2.5889007
## U1              .        
## U2             28.7855848
## Wealth          0.1388428
## Ineq           78.9026879
## Prob        -4641.7659780
## Time           -3.6955244
elastic_prediction <- elastic_crime %>% predict(x_test)

R2(elastic_prediction,x_test$Crime)
## [1] 0.7020074

For our elastic net model, the optimal tuning parameters were alpha = 1 and lambda = 6.74, meaning the elastic net effectively collapsed to a lasso with a better-chosen lambda. This resulted in a test-set R-squared of 0.7020074, noticeably higher than both our stepwise and fixed-lambda lasso models.
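To summarize, a quick side-by-side of the hold-out R-squared values (a sketch; stepwise_prediction comes from the earlier stepwise hold-out sketch and was not computed in the original):

# Collect the test-set R^2 for all three models in one table
data.frame(model = c("stepwise", "lasso", "elastic net"),
           test_R2 = c(R2(stepwise_prediction, x_test$Crime),
                       R2(lasso_prediction, x_test$Crime),
                       R2(elastic_prediction, x_test$Crime)))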

Question 12.1

Coming from a chemical engineering background, I have seen design of experiments used widely in the manufacturing industry. An example is a piece of manufacturing equipment controlled by an operator: the operator can manipulate what goes into the machine, and the output is whatever the machine produces. Since we want to maximize that output given the inputs, screening experiments can be employed to identify which inputs impact the output the most. Once we know which inputs matter, we can begin to optimize the process, as sketched below.
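As a concrete illustration (a minimal sketch with hypothetical factor names, not part of the original answer), a two-level fractional factorial screening design for a machine with five inputs could be generated with FrF2:

library(FrF2)
# 2^(5-2) design: 8 runs to screen five hypothetical machine inputs
machine_screen <- FrF2(nruns = 8, nfactors = 5,
                       factor.names = c("Temp", "Pressure", "Speed", "FeedRate", "Time"))
machine_screen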

Question 12.2

set.seed(1)
library(FrF2)
## Loading required package: DoE.base
## Loading required package: grid
## Loading required package: conf.design
## Registered S3 method overwritten by 'partitions':
##   method            from
##   print.equivalence lava
## Registered S3 method overwritten by 'DoE.base':
##   method           from       
##   factorize.factor conf.design
## 
## Attaching package: 'DoE.base'
## The following objects are masked from 'package:stats':
## 
##     aov, lm
## The following object is masked from 'package:graphics':
## 
##     plot.design
## The following object is masked from 'package:base':
## 
##     lengths
house <- FrF2(nruns = 16, nfactors =10, default.levels = c("Yes","No"))
data.frame(house)
##      A   B   C   D   E   F   G   H   J   K
## 1  Yes Yes Yes  No  No  No  No Yes  No Yes
## 2   No  No Yes Yes  No Yes Yes Yes  No  No
## 3  Yes  No  No Yes Yes Yes  No  No Yes  No
## 4  Yes Yes Yes Yes  No  No  No  No Yes  No
## 5   No Yes Yes Yes Yes Yes  No Yes Yes Yes
## 6   No Yes  No  No Yes  No Yes  No Yes Yes
## 7   No  No Yes  No  No Yes Yes  No Yes Yes
## 8  Yes  No Yes Yes Yes  No Yes  No  No Yes
## 9  Yes Yes  No  No  No Yes Yes Yes Yes  No
## 10 Yes Yes  No Yes  No Yes Yes  No  No Yes
## 11 Yes  No Yes  No Yes  No Yes Yes Yes  No
## 12  No Yes Yes  No Yes Yes  No  No  No  No
## 13  No Yes  No Yes Yes  No Yes Yes  No  No
## 14 Yes  No  No  No Yes Yes  No Yes  No Yes
## 15  No  No  No  No  No  No  No  No  No  No
## 16  No  No  No Yes  No  No  No Yes Yes Yes

Based on this fractional factorial design, with 10 features across 16 show houses, we obtain a matrix telling the survey company which houses should include which features (labeled A-K; the letter I is skipped by convention).
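Once the survey company has collected a response for each of the 16 show houses (e.g., a measure of buyer interest), the main effect of each feature could then be estimated with a linear model. A sketch with purely hypothetical placeholder data:

# Attach a hypothetical response and estimate each feature's main effect
house_df <- data.frame(house)
house_df$interest <- rnorm(16, mean = 50, sd = 10)  # placeholder, not real survey data
summary(lm(interest ~ ., data = house_df))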

Question 13.1

Binomial: An example of a binomial distribution could be the number of times you roll doubles when rolling a pair of dice n times.

Geometric: An example of a geometric distribution could be the number of times your dog successfully obeys you before the first time he disobeys.

Poisson: An example of a Poisson distribution could be the number of shooting stars observed over the next month.

Exponential: An example of an exponential distribution could be the time between phone calls in a call center.

Weibull: An example of a Weibull distribution could be the lifetime of a trampoline, i.e., how long it holds up under repeated jumping before it breaks.
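Each of these can be simulated directly in R with the corresponding r* function (a sketch; the parameter values are illustrative assumptions):

# One random draw from each example distribution
rbinom(1, size = 100, prob = 1/6)       # Binomial: doubles in 100 rolls of a pair of dice
rgeom(1, prob = 0.1)                    # Geometric: obediences before the first disobedience
rpois(1, lambda = 30)                   # Poisson: shooting stars seen in a month
rexp(1, rate = 2)                       # Exponential: hours between call-center calls
rweibull(1, shape = 1.5, scale = 500)   # Weibull: trampoline lifetime (e.g., hours of use)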