## Cement BlastFurnaceSlag FlyAsh Water
## Min. :102.0 Min. : 0.0 Min. : 0.00 Min. :121.8
## 1st Qu.:192.4 1st Qu.: 0.0 1st Qu.: 0.00 1st Qu.:164.9
## Median :272.9 Median : 22.0 Median : 0.00 Median :185.0
## Mean :281.2 Mean : 73.9 Mean : 54.19 Mean :181.6
## 3rd Qu.:350.0 3rd Qu.:142.9 3rd Qu.:118.30 3rd Qu.:192.0
## Max. :540.0 Max. :359.4 Max. :200.10 Max. :247.0
## Superplasticizer CoarseAggregate FineAggregate Age
## Min. : 0.000 Min. : 801.0 Min. :594.0 Min. : 1.00
## 1st Qu.: 0.000 1st Qu.: 932.0 1st Qu.:731.0 1st Qu.: 7.00
## Median : 6.400 Median : 968.0 Median :779.5 Median : 28.00
## Mean : 6.205 Mean : 972.9 Mean :773.6 Mean : 45.66
## 3rd Qu.:10.200 3rd Qu.:1029.4 3rd Qu.:824.0 3rd Qu.: 56.00
## Max. :32.200 Max. :1145.0 Max. :992.6 Max. :365.00
## ConcreteStrength
## Min. : 2.33
## 1st Qu.:23.71
## Median :34.45
## Mean :35.82
## 3rd Qu.:46.13
## Max. :82.60
- Cement is the predictor most strongly correlated with concrete strength.
- FlyAsh has the weakest correlation with concrete strength, but at -0.106 it may still not be negligible.
- The strongest correlation between two predictors is -0.66, between Superplasticizer and Water.

When fitting our models, we can take the predictive value of these interrelated variables into account, for example through interaction terms.
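As a rough sketch of how correlation findings like these can be extracted, the snippet below builds a correlation matrix with base R's `cor()` and picks out the strongest predictor pair. The data frame `concrete` here is a synthetic stand-in with random values (only the column names follow the summary above), so the numbers it produces are illustrative and not the results reported in this document.

```r
# Synthetic stand-in data: column names mirror the summary above,
# but the values are random and NOT the real concrete dataset.
set.seed(1)
n <- 200
concrete <- data.frame(
  Cement = runif(n, 102, 540),
  Water  = runif(n, 122, 247),
  Age    = sample(c(1, 3, 7, 14, 28, 56, 90, 180, 365), n, replace = TRUE)
)
# Assumption for illustration only: Superplasticizer falls as Water rises
concrete$Superplasticizer <- pmax(0, 30 - 0.12 * concrete$Water + rnorm(n, sd = 3))
concrete$ConcreteStrength <- 0.08 * concrete$Cement - 0.1 * concrete$Water +
  8 * log(concrete$Age) + rnorm(n, sd = 5)

cor_mat <- cor(concrete)

# Correlation of each predictor with the outcome, sorted by absolute size
outcome_cor <- cor_mat[, "ConcreteStrength"]
outcome_cor <- outcome_cor[names(outcome_cor) != "ConcreteStrength"]
print(outcome_cor[order(abs(outcome_cor), decreasing = TRUE)])

# Strongest correlation between two predictors
pred_cor <- cor_mat[names(outcome_cor), names(outcome_cor)]
pred_cor[upper.tri(pred_cor, diag = TRUE)] <- NA  # keep each pair only once
idx <- which(abs(pred_cor) == max(abs(pred_cor), na.rm = TRUE), arr.ind = TRUE)
strongest_pair <- c(rownames(pred_cor)[idx[1, "row"]],
                    colnames(pred_cor)[idx[1, "col"]])
print(strongest_pair)
```

A strongly correlated predictor pair found this way is a natural candidate for an interaction term in the model formula.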
We used the caret package to perform cross-validated training and testing of various models. We set up a training control object configured for 5-fold cross-validation repeated 3 times.
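A minimal sketch of such a setup is shown below, using caret's `trainControl()` and `train()`. The synthetic `concrete` data frame, the object names `ctrl` and `fit_lm`, and the exact model formula are assumptions made for illustration; the original training code is not shown in this document.

```r
library(caret)

set.seed(42)
# Synthetic stand-in data: random values, NOT the real concrete dataset.
n <- 150
concrete <- data.frame(
  Cement = runif(n, 102, 540),
  Water  = runif(n, 122, 247),
  Age    = sample(c(1, 3, 7, 14, 28, 56, 90), n, replace = TRUE)
)
concrete$ConcreteStrength <- 0.08 * concrete$Cement - 0.1 * concrete$Water +
  0.2 * concrete$Age + rnorm(n, sd = 5)

# 5-fold cross-validation, repeated 3 times (15 resamples in total)
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 3)

# Linear model with one interaction term (formula chosen for illustration)
fit_lm <- train(
  ConcreteStrength ~ Cement + Water + Age + Water:Age,
  data      = concrete,
  method    = "lm",
  trControl = ctrl
)

# Cross-validated performance and scaled variable importance
print(fit_lm$results[, c("RMSE", "Rsquared")])
print(varImp(fit_lm))
```

Reusing the same `ctrl` object for every model keeps the resampling scheme identical across models, so their cross-validated RMSE and R-squared values are comparable.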
## RMSE: 9.79
## R-squared: 0.66
## lm variable importance
##
## Overall
## Age 100.000
## Cement 84.877
## `Water:Age` 80.779
## BlastFurnaceSlag 60.830
## `Water:Superplasticizer` 34.919
## Superplasticizer 29.981
## `Water:FineAggregate` 24.496
## FineAggregate 22.504
## Water 15.397
## `Cement:FlyAsh` 11.817
## FlyAsh 7.422
## CoarseAggregate 0.000
## GLM Model
## RMSE: 9.88
## R-squared: 0.65
## glm variable importance
##
## Overall
## Cement 100.0000
## Age 94.5583
## `Water:Age` 75.0802
## BlastFurnaceSlag 65.3983
## FlyAsh 32.0394
## `Water:Superplasticizer` 25.0380
## Water 23.2030
## Superplasticizer 19.3032
## CoarseAggregate 0.1565
## FineAggregate 0.0000
## RMSE: 5.67
## R-squared: 0.89
## RMSE: 4.89
## R-squared: 0.92
## ranger variable importance
##
## Overall
## Age 100.0000
## Cement 81.2197
## Water 28.4932
## BlastFurnaceSlag 12.7374
## Superplasticizer 12.3864
## FineAggregate 4.0920
## CoarseAggregate 0.6761
## FlyAsh 0.0000
## RMSE: 4.1
## R-squared: 0.94
#### Variable importances
## gbm variable importance
##
## Overall
## Water:Age 100.000
## Cement 74.993
## Water 43.239
## Age 24.298
## BlastFurnaceSlag 24.178
## Cement:CoarseAggregate 22.545
## CoarseAggregate:FineAggregate 13.579
## Superplasticizer 11.620
## Water:Superplasticizer 8.878
## FineAggregate 6.223
## CoarseAggregate 4.373
## FlyAsh 2.590
## Cement:Water 0.000
## RMSE: 4.19
## R-squared: 0.94
## xgbTree variable importance
##
## Overall
## Water:Age 100.000
## Cement 85.066
## Age 72.045
## Water 45.502
## Cement:CoarseAggregate 39.804
## Superplasticizer 17.583
## CoarseAggregate:FineAggregate 14.463
## BlastFurnaceSlag 13.721
## Water:Superplasticizer 7.250
## FineAggregate 4.301
## CoarseAggregate 1.452
## Cement:Water 1.166
## FlyAsh 0.000
| Model | Label | R-squared | RMSE |
|---|---|---|---|
| Linear Model | lm | 0.6571239 | 9.793557 |
| Generalized Linear Model (GLM) | glm | 0.6515166 | 9.880447 |
| Generalized Additive Model (GAM) | gam | 0.8859791 | 5.673155 |
| Gradient Boosted Machine | gbm | 0.9399769 | 4.128572 |
| Random Forest | rf | 0.9180005 | 5.068819 |
| Extreme Gradient Boosting (XGBoost) | xgb | 0.9370400 | 4.246084 |
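
For reference, the two metrics in the table can be computed from observed and predicted values roughly as sketched below. This assumes R-squared is the squared correlation between observed and predicted values, which we understand to be caret's default definition; the helper names `rmse` and `r_squared` and the four example values are made up for illustration.

```r
# Root mean squared error: typical size of a prediction error,
# in the same units as the outcome
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

# R-squared as squared correlation (assumed to match caret's default)
r_squared <- function(obs, pred) cor(obs, pred)^2

# Made-up values, for illustration only
obs  <- c(20, 35, 50, 65)
pred <- c(22, 33, 54, 61)

print(rmse(obs, pred))
print(r_squared(obs, pred))
```

A lower RMSE and a higher R-squared both indicate a better fit, which is why the boosted models rank ahead of the linear ones in the table.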
- Age
- Water content
- The interaction between Water and Age
- Cement content

These factors are by far the most important for predicting concrete strength.
Age has strong predictive power, yet our data suggest that age measurements were sometimes taken at wide time intervals. Given this, and the unreasonable effectiveness of data, we see collecting more frequent age measurements as the single biggest improvement that could be made to the predictive accuracy of our models.