Modeling Strength of Concrete

Using Artificial Neural Networks

Steven Ferguson

2022-10-28

Background:

“In the field of engineering, it is crucial to have accurate estimates of the performance of building materials. These estimates are required in order to develop safety guidelines governing the materials used in the construction of buildings, bridges, and roadways. Estimating the strength of concrete is a challenge of particular interest. Although it is used in nearly every construction project, concrete performance varies greatly due to a wide variety of ingredients that interact in complex ways. As a result, it is difficult to accurately predict the strength of the final product.”

“A model that could reliably predict concrete strength given a listing of the composition of the input materials could result in safer construction practices.”

Step 1 – Data Collection

Data on the compressive strength of concrete was collected by I-Cheng Yeh and donated to the UCI Machine Learning Data Repository.

http://archive.ics.uci.edu/ml

Features include:

  • Amount of cement (kg/m\(^3\))

  • Amount of slag (kg/m\(^3\))

  • Amount of fly ash (kg/m\(^3\))

  • Water (kg/m\(^3\))

  • Superplasticizer (kg/m\(^3\))

  • Coarse Aggregate (kg/m\(^3\))

  • Fine Aggregate (kg/m\(^3\))

  • Aging time (days)

  • Compressive strength (MPa)

Step 2 – Exploring and Preparing the Data

# read in data
getwd()
## [1] "C:/Users/Sferg/Desktop/JABSOM Grad School/Machine Learning/ANN"
concrete <- read.csv("concrete.csv")

# Examine Structure
str(concrete)
## 'data.frame':    1030 obs. of  9 variables:
##  $ cement      : num  141 169 250 266 155 ...
##  $ slag        : num  212 42.2 0 114 183.4 ...
##  $ ash         : num  0 124.3 95.7 0 0 ...
##  $ water       : num  204 158 187 228 193 ...
##  $ superplastic: num  0 10.8 5.5 0 9.1 0 0 6.4 0 9 ...
##  $ coarseagg   : num  972 1081 957 932 1047 ...
##  $ fineagg     : num  748 796 861 670 697 ...
##  $ age         : int  28 14 28 28 28 90 7 56 28 28 ...
##  $ strength    : num  29.9 23.5 29.2 45.9 18.3 ...

Normalize the data by re-scaling each feature to the range from zero to one.

#custom normalization function
normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}

# apply normalization to entire data frame
concrete_norm <- as.data.frame(lapply(concrete,normalize))

# confirm that the range is now between zero and one
print("Normalized Concrete Strength:")
## [1] "Normalized Concrete Strength:"
summary(concrete_norm$strength)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.2664  0.4001  0.4172  0.5457  1.0000
print("Normalized amount of Slag:")
## [1] "Normalized amount of Slag:"
summary(concrete_norm$slag)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.00000 0.00000 0.06121 0.20561 0.39775 1.00000
print("")
## [1] ""
# compared to the original minimum and maximum
print("Concrete Strength data Before Normalization")
## [1] "Concrete Strength data Before Normalization"
summary(concrete$strength)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    2.33   23.71   34.45   35.82   46.13   82.60
print("Slag amount data Before Normalization")
## [1] "Slag amount data Before Normalization"
summary(concrete$slag)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0     0.0    22.0    73.9   142.9   359.4
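
One caveat about normalize(): it divides by max(x) - min(x), so a zero-variance (constant) column would produce NaN. This dataset has none, but a defensive variant is easy to sketch (illustrative only):

# guard against constant columns by mapping them to zero instead of NaN
normalize_safe <- function(x) {
  rng <- max(x) - min(x)
  if (rng == 0) return(rep(0, length(x)))
  (x - min(x)) / rng
}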

Split the data into training (75%) and test (25%) sets.

The data were already randomized, so they only need to be split into two contiguous portions.
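
Had the rows not been pre-shuffled, a random split could be drawn first. A minimal sketch, using hypothetical rand_train/rand_test names so the ordered split below is unaffected:

# illustrative only: a random 75/25 split
set.seed(12345)
train_idx <- sample(nrow(concrete_norm), size = round(0.75 * nrow(concrete_norm)))
rand_train <- concrete_norm[train_idx, ]
rand_test  <- concrete_norm[-train_idx, ]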

# create training and test data
concrete_train <- concrete_norm[1:773,]
concrete_test <- concrete_norm[774:1030,]

# verify correct structure
str(concrete_train)
## 'data.frame':    773 obs. of  9 variables:
##  $ cement      : num  0.0897 0.1527 0.3379 0.3744 0.1205 ...
##  $ slag        : num  0.59 0.117 0 0.317 0.51 ...
##  $ ash         : num  0 0.621 0.478 0 0 ...
##  $ water       : num  0.653 0.292 0.524 0.848 0.571 ...
##  $ superplastic: num  0 0.335 0.171 0 0.283 ...
##  $ coarseagg   : num  0.497 0.813 0.453 0.381 0.716 ...
##  $ fineagg     : num  0.388 0.507 0.67 0.191 0.258 ...
##  $ age         : num  0.0742 0.0357 0.0742 0.0742 0.0742 ...
##  $ strength    : num  0.343 0.264 0.335 0.542 0.199 ...
str(concrete_test)
## 'data.frame':    257 obs. of  9 variables:
##  $ cement      : num  0.68 0.253 0.395 1 0.626 ...
##  $ slag        : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ash         : num  0 0.502 0 0 0 ...
##  $ water       : num  0.521 0.3 0.489 0.409 0.741 ...
##  $ superplastic: num  0 0.323 0 0 0 ...
##  $ coarseagg   : num  0.651 0.59 0.834 0.942 0.589 ...
##  $ fineagg     : num  0.3788 0.7772 0.5369 0.0477 0.4225 ...
##  $ age         : num  0.0165 0.1511 0.0742 0.4918 0.1511 ...
##  $ strength    : num  0.346 0.524 0.276 0.863 0.423 ...

Step 3 – Training a Model on the Data

A multilayer feedforward neural network is used to model the relationship between the ingredients and the strength of the finished concrete, using the R package “neuralnet” by Stefan Fritsch and Frauke Guenther.

  • Provides a standard/easy-to-use implementation of networks.

  • Function to plot network topology

  • Powerful tool, good choice for learning more about neural networks

  • Building the model:

    • m <- neuralnet(target ~ predictors, data = mydata, hidden = 1)

    • where target = the outcome in the “mydata” data frame to be modeled

    • predictors = the features in the mydata data frame to use for prediction, separated by + in the R formula

    • data = data frame in which target and predictors variables are found

    • hidden = # of neurons in the hidden layer

  • Making predictions:

    • p <- compute(m, test)

    • m is a model trained by neuralnet() function

    • test is a data frame containing test data with same features as the training data used to build the classifier

Train a model on the data

# Train the neuralnet model
# install.packages("neuralnet")
library(neuralnet)

# Simple ANN with only a single hidden neuron
RNGversion("3.5.2") # Random Number Generator
## Warning in RNGkind("Mersenne-Twister", "Inversion", "Rounding"): non-uniform
## 'Rounding' sampler used
set.seed(12345) # to guarantee repeatable results
concrete_model <- neuralnet(formula = strength ~ cement + slag + 
                              ash + water + superplastic + 
                              coarseagg + fineagg + age,
                              data = concrete_train)

# Visualize the Network Topology
plot(concrete_model)

Step 4 – Evaluating Model Performance

The network topology diagram does not tell us much about how well the model will fit future data. Therefore, to generate predictions on the test dataset, we’ll use “compute()”. It works slightly differently from the “predict()” function, returning a list with two components:

  • $neurons: stores the neurons for each layer in the network

  • $net.result: stores the predicted values

Because this is a numeric prediction problem (not a classification problem), a confusion matrix cannot be used to evaluate model accuracy. Instead, we examine the correlation between the predicted concrete strength and the true concrete strength (concrete_test$strength). Correlation measures the strength of the linear relationship between two variables.

# obtain model results
model_results <- compute(concrete_model, concrete_test[1:8])

# obtain predicted strength values
predicted_strength <- model_results$net.result

# Examine the Correlation between predicted and actual values
print ("Correlation between Predicted Strength and Actual Strength")
## [1] "Correlation between Predicted Strength and Actual Strength"
cor(predicted_strength, concrete_test$strength)
##           [,1]
## [1,] 0.8064656

An 80.65% correlation between predicted and actual strength: not great, but not bad for a first try. It is a fairly strong relationship, considering the model uses only a single hidden node (see the plotted network topology).

Perhaps we can improve the performance of the model (obtain a correlation closer to 1.0) by adding more hidden nodes, and with them more complexity.

Step 5 – Improving Model Performance

Networks with more complex topologies are capable of learning more difficult concepts. Our last model had only one hidden node, so let’s see how it performs with five.

# Create a more complex NN topology: 5 hidden nodes!
RNGversion("3.5.2")
## Warning in RNGkind("Mersenne-Twister", "Inversion", "Rounding"): non-uniform
## 'Rounding' sampler used
set.seed(12345) # to guarantee repeatable results
concrete_model2 <- neuralnet(formula = strength ~ cement + slag + 
                              ash + water + superplastic + 
                              coarseagg + fineagg + age,
                              data = concrete_train,
                              hidden = 5)
# Plot the network
plot(concrete_model2)

The sum of squared errors (SSE) dropped from 5.08 to 1.63. However, the number of training steps rose from 4,882 to 86,849. As more complexity is introduced to the model (more hidden nodes), more iterations must be completed to assign optimal weights.
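
Those figures come from the network plots; they can also be read directly from the fitted objects (a small sketch, assuming neuralnet’s result.matrix layout):

# SSE ("error") and training steps for each model
concrete_model$result.matrix[c("error", "steps"), ]
concrete_model2$result.matrix[c("error", "steps"), ]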

Again, we will evaluate model performance by examining the correlation between the predicted strength (predicted_strength2) and the actual strength (concrete_test$strength).

model_results2 <- compute(concrete_model2, concrete_test[1:8])
predicted_strength2 <- model_results2$net.result
cor(predicted_strength2, concrete_test$strength)
##           [,1]
## [1,] 0.9244533

A 92.4% correlation (using 5 hidden nodes) is a substantial improvement over 80.65% (using 1 node). Given this improvement and the reduction in SSE, the five-node model is clearly superior.

But we can do even BETTER

We can add a custom activation function (softplus) and evaluate the results.

Here is a neural network with two hidden layers and a softplus activation function:

softplus <- function(x) { log(1 + exp(x)) } # smooth approximation of the ReLU activation
RNGversion("3.5.2")
## Warning in RNGkind("Mersenne-Twister", "Inversion", "Rounding"): non-uniform
## 'Rounding' sampler used
set.seed(12345)
concrete_model3 <-neuralnet(strength ~ cement + slag + ash + water
                            + superplastic + coarseagg + fineagg + age,
                            data = concrete_train, 
                            hidden = c(5,5), 
                            act.fct = softplus)
#plot the network
plot(concrete_model3)

Evaluate the results just as we did for the other two models.

SSE actually increased slightly, by 0.039, to 1.666. Additionally, training took much, much longer and used more computational power. Was it worth the extra time? Perhaps, if it produces a correlation much closer to 1.0.

Find correlation:

model_results3 <- compute(concrete_model3, concrete_test[1:8])
predicted_strength3 <- model_results3$net.result
cor(predicted_strength3, concrete_test$strength)
##           [,1]
## [1,] 0.9348395

Well, the correlation did improve, by about 1 percentage point. But I would still choose the second NN model: although it was slightly less accurate than the third, it returned results almost instantly. It’s a trade-off of a little accuracy for far less computational power.
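
To put a rough number on that trade-off, training time can be measured directly. A minimal sketch (times vary by machine, and this repeats the training above):

# compare wall-clock training time of the two topologies
t_model2 <- system.time(
  neuralnet(strength ~ cement + slag + ash + water + superplastic +
              coarseagg + fineagg + age,
            data = concrete_train, hidden = 5)
)
t_model3 <- system.time(
  neuralnet(strength ~ cement + slag + ash + water + superplastic +
              coarseagg + fineagg + age,
            data = concrete_train, hidden = c(5, 5), act.fct = softplus)
)
rbind(five_nodes = t_model2, two_layers = t_model3)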

—Moving on—

Predicted and actual values are still on different scales. This is because we normalized the data for our NN.

strengths <- data.frame(
  actual = concrete$strength[774:1030],
  pred = predicted_strength3
)
head(strengths, n=3)
##     actual      pred
## 774  30.14 0.2860639
## 775  44.40 0.4777305
## 776  24.50 0.2840964

This doesn’t change the correlation (though it would affect measures of absolute error):

cor(strengths$pred, strengths$actual)
## [1] 0.9348395

Create an un-normalize function to reverse the normalization

unnormalize <- function(x) {
  return(x * (max(concrete$strength) - min(concrete$strength)) +
           min(concrete$strength))
}
strengths$pred_new <- unnormalize(strengths$pred)
strengths$error <- strengths$pred_new - strengths$actual
head(strengths, n = 3)
##     actual      pred pred_new     error
## 774  30.14 0.2860639 25.29235 -4.847651
## 775  44.40 0.4777305 40.67743 -3.722573
## 776  24.50 0.2840964 25.13442  0.634418
cor(strengths$pred_new, strengths$actual)
## [1] 0.9348395
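
Correlation alone doesn’t show the size of a typical miss. With predictions back on the MPa scale, absolute-error summaries are straightforward to add (a small sketch, no new packages needed):

# summarize prediction errors on the original MPa scale
mean(abs(strengths$error))    # mean absolute error (MAE)
sqrt(mean(strengths$error^2)) # root mean squared error (RMSE)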

We now have a model whose predictions correlate with actual concrete strength at roughly 0.92 to 0.93, given the following data:

Amount of cement (kg/m\(^3\)), Slag (kg/m\(^3\)), Fly Ash (kg/m\(^3\)), Water (kg/m\(^3\)), Superplasticizer (kg/m\(^3\)), Coarse Aggregate (kg/m\(^3\)), Fine Aggregate (kg/m\(^3\)), and Aging time (days).
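
As a closing illustration, here is how the full pipeline could score a brand-new mix. The ingredient amounts below are hypothetical, invented for demonstration only:

# hypothetical mix (made-up amounts; kg/m^3 except age in days)
new_mix <- data.frame(cement = 300, slag = 100, ash = 0, water = 180,
                      superplastic = 6, coarseagg = 950, fineagg = 780,
                      age = 28)

# rescale each feature with the min/max of the ORIGINAL data,
# so the new row is on the same 0-1 scale the network was trained on
new_mix_norm <- as.data.frame(Map(
  function(x, ref) (x - min(ref)) / (max(ref) - min(ref)),
  new_mix, concrete[1:8]
))

# predict on the normalized scale, then convert back to MPa
pred_norm <- compute(concrete_model3, new_mix_norm)$net.result
unnormalize(pred_norm)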