The original dataset contains 1030 observations of the compressive strength of concrete, together with 8 numeric features describing the mixture: the amounts of cement, slag, ash, water, superplasticizer, coarse aggregate, and fine aggregate, plus the age of the sample.
Using a neural network, I built a model that predicts the strength of the concrete from these variables, then validated the model on test data partitioned from the original dataset.
concrete <- read.csv('concrete.csv')
str(concrete)
## 'data.frame': 1030 obs. of 9 variables:
## $ cement : num 141 169 250 266 155 ...
## $ slag : num 212 42.2 0 114 183.4 ...
## $ ash : num 0 124.3 95.7 0 0 ...
## $ water : num 204 158 187 228 193 ...
## $ superplastic: num 0 10.8 5.5 0 9.1 0 0 6.4 0 9 ...
## $ coarseagg : num 972 1081 957 932 1047 ...
## $ fineagg : num 748 796 861 670 697 ...
## $ age : int 28 14 28 28 28 90 7 56 28 28 ...
## $ strength : num 29.9 23.5 29.2 45.9 18.3 ...
Neural networks work best when the inputs are scaled to a narrow range around 0, so we apply min-max normalization to rescale every variable to the range 0 to 1.
normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}
concrete_norm <- as.data.frame(lapply(concrete,normalize))
summary(concrete_norm$strength)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.2664 0.4001 0.4172 0.5457 1.0000
summary(concrete$strength)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.33 23.71 34.45 35.82 46.13 82.60
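Because the model is trained on the normalized scale, its predictions also come out between 0 and 1. The following is a minimal sketch of how the scaling could be inverted to recover values in the original units. The `denormalize` helper and the toy `strength` vector are assumptions introduced for illustration; they are not part of the original analysis.

```r
# Min-max scaling, as defined above
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Hypothetical helper (not in the original analysis): invert the scaling
# using the minimum and maximum of the original, unscaled vector
denormalize <- function(x_norm, original) {
  x_norm * (max(original) - min(original)) + min(original)
}

# Toy values standing in for concrete$strength
strength <- c(2.33, 23.71, 34.45, 46.13, 82.60)
scaled <- normalize(strength)
restored <- denormalize(scaled, strength)

range(scaled)                  # smallest value maps to 0, largest to 1
all.equal(restored, strength)  # round trip recovers the original values
```

In the analysis itself, the second argument would be `concrete$strength`, so that predictions can be reported on the same scale as the measured strengths.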
Partition the data into training and test sets, using the first 773 observations (75%) for training and the remaining 257 for testing.
concrete_train <- concrete_norm[1:773, ]
concrete_test <- concrete_norm[774:1030, ]
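Splitting by row position only gives a fair test set if the rows are not ordered in any meaningful way. If that cannot be assumed, one alternative is to draw the training rows at random. The sketch below shows the idea on a toy data frame; the seed value, the `toy` data, and the variable names are assumptions for illustration, and this is not the split used for the results reported here.

```r
# Toy data frame standing in for concrete_norm (assumption for illustration)
set.seed(123)  # arbitrary seed so the split is reproducible
toy <- data.frame(x = runif(1030), strength = runif(1030))

# Draw 75% of the row indices at random, without replacement
n_train <- floor(0.75 * nrow(toy))
train_idx <- sample(nrow(toy), n_train)

# Rows drawn go to training; all remaining rows go to testing
toy_train <- toy[train_idx, ]
toy_test  <- toy[-train_idx, ]

nrow(toy_train)  # 772
nrow(toy_test)   # 258
```

Note that `floor()` yields 772 training rows here rather than the 773 used above; either choice is close to a 75/25 split.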
library(neuralnet)
## Warning: package 'neuralnet' was built under R version 3.5.3
model <- neuralnet(strength ~ ., data = concrete_train)
plot(model)
Neural network model
We then evaluate the model on the test data, using the 8 predictor variables to predict concrete strength.
The compute() function in the neuralnet package generates the predicted strength values.
Because this is a numeric prediction problem, we measure the correlation between the predicted strength and the actual strength in the held-out test data.
results <- compute(model, concrete_test[1:8])
predicted_strength <- results$net.result
cor(predicted_strength, concrete_test$strength)
## [,1]
## [1,] 0.8052466
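Correlation shows how closely the predictions track the actual values, but it does not say how large the errors are. A complementary check is an error measure such as mean absolute error (MAE) or root mean squared error (RMSE). The sketch below uses toy vectors on the 0 to 1 scale; the numbers are illustrative only and are not results from the model above.

```r
# Toy actual and predicted values on the 0-1 scale (illustrative only,
# not output from the model above)
actual    <- c(0.20, 0.40, 0.60, 0.80)
predicted <- c(0.25, 0.35, 0.70, 0.70)

errors <- predicted - actual

mae  <- mean(abs(errors))     # mean absolute error
rmse <- sqrt(mean(errors^2))  # root mean squared error

mae
rmse
```

For the model in this post, the same calculation could be applied to `as.vector(predicted_strength)` and `concrete_test$strength`. Multiplying the result by the range of the original strength values would express the error in the original units.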