The objective of this document is to compare various search methods to find a concrete mixture with the highest predicted compressive strength.
The compressive strength was predicted with neural networks using model averaging (see avNNet). The dataset used to train the predictive model comes from the research paper Modeling of strength of high performance concrete using artificial neural networks by I-Cheng Yeh published in Cement and Concrete Research, Vol. 28, No. 12, pp. 1797-1808 (1998). The compressive strength of concrete was predicted using its age and composition.
The dataset can be downloaded through the UCI Machine learning Repository. The final model was tuned using the caret package.
The data contains 1030 examples and the following features:
Seven search methods are tested:
Grid and random search will be used as benchmarks for comparing the other search methods.
The pacman package is used to install and load all necessary packages.
# Install and load packages
if (!require(pacman)) {install.packages("pacman", verbose = F, quiet = T)} else require(pacman, quietly = T)
suppressWarnings(pacman::p_load(plyr, caret, tidyverse, tidyselect, readr, readxl, parallel, doParallel, gridExtra, pso, GA, GenSA, DEoptim, GGally, ggfortify, broom, knitr, kableExtra, install = T))The dataset is downloaded from the data the UCI Machine Learning Data Repository:
# Load library
# download.file(url = "http://archive.ics.uci.edu/ml/machine-learning-databases/concrete/compressive/Concrete_Data.xls", destfile = "Concrete_Data.xls", method = "curl", quiet = TRUE)
# Import Data
concrete_data <- read_xls(path = "Concrete_Data.xls", sheet = 1)Check the structure of the dataset:
# Check structure of the dataset
glimpse(concrete_data)Observations: 1,030
Variables: 9
$ `Cement (component 1)(kg in a m^3 mixture)` <dbl> 540.0,...
$ `Blast Furnace Slag (component 2)(kg in a m^3 mixture)` <dbl> 0.0, 0...
$ `Fly Ash (component 3)(kg in a m^3 mixture)` <dbl> 0, 0, ...
$ `Water (component 4)(kg in a m^3 mixture)` <dbl> 162, 1...
$ `Superplasticizer (component 5)(kg in a m^3 mixture)` <dbl> 2.5, 2...
$ `Coarse Aggregate (component 6)(kg in a m^3 mixture)` <dbl> 1040.0...
$ `Fine Aggregate (component 7)(kg in a m^3 mixture)` <dbl> 676.0,...
$ `Age (day)` <dbl> 28, 28...
$ `Concrete compressive strength(MPa, megapascals)` <dbl> 79.986...
# Rename variables
colnames(concrete_data) <- c("Cement", "Slag", "Ash", "Water", "Superplasticizer", "Coarse_Aggregate", "Fine_Aggregate", "Age", "Strength")
ingredients <- c("Cement", "Slag", "Ash", "Water", "Superplasticizer", "Coarse_Aggregate", "Fine_Aggregate")The values of the components of the concrete were recalculated so their values range from 0 to 1.
# Recalculate composition as proportions
concrete_data[, ingredients] <- t(apply(X = concrete_data[, ingredients], MARGIN = 1, FUN = function(x) {x/sum(x)}))
# Print summary statistics
glimpse(concrete_data)Observations: 1,030
Variables: 9
$ Cement <dbl> 0.223094, 0.221720, 0.149170, 0.149170, 0.085...
$ Slag <dbl> 0.000000, 0.000000, 0.063930, 0.063930, 0.056...
$ Ash <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
$ Water <dbl> 0.066928, 0.066516, 0.102288, 0.102288, 0.082...
$ Superplasticizer <dbl> 0.0010328, 0.0010265, 0.0000000, 0.0000000, 0...
$ Coarse_Aggregate <dbl> 0.42966, 0.43318, 0.41812, 0.41812, 0.42047, ...
$ Fine_Aggregate <dbl> 0.27928, 0.27756, 0.26649, 0.26649, 0.35476, ...
$ Age <dbl> 28, 28, 270, 365, 360, 90, 365, 28, 28, 28, 9...
$ Strength <dbl> 79.9861, 61.8874, 40.2695, 41.0528, 44.2961, ...
Print statistics for each variable and check for missing (NA) values:
concrete_data %>%
signif(5) %>%
gather(key = "Feature", value = "Quantity") %>%
dplyr::group_by(Feature) %>%
summarise(
`NA` = sum(is.array(Quantity), na.rm = T),
Min = min(Quantity, na.rm = T),
`1st Quartile` = quantile(Quantity, probs = 0.25, na.rm = T),
Median = median(Quantity, na.rm = T),
Mean = mean(Quantity, na.rm = T),
`3rd Quartile` = quantile(Quantity, probs = 0.75),
Max = max(Quantity, na.rm = T)
) %>%
kable(digits = 4, align = "c") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = F, position = "center")| Feature | NA | Min | 1st Quartile | Median | Mean | 3rd Quartile | Max |
|---|---|---|---|---|---|---|---|
| Age | 0 | 1.0000 | 7.0000 | 28.0000 | 45.6621 | 56.0000 | 365.0000 |
| Ash | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0232 | 0.0503 | 0.0888 |
| Cement | 0 | 0.0448 | 0.0821 | 0.1153 | 0.1196 | 0.1492 | 0.2254 |
| Coarse_Aggregate | 0 | 0.3459 | 0.3923 | 0.4205 | 0.4152 | 0.4376 | 0.4798 |
| Fine_Aggregate | 0 | 0.2480 | 0.3112 | 0.3305 | 0.3301 | 0.3541 | 0.4142 |
| Slag | 0 | 0.0000 | 0.0000 | 0.0095 | 0.0316 | 0.0620 | 0.1503 |
| Strength | 0 | 2.3318 | 23.7075 | 34.4430 | 35.8178 | 46.1365 | 82.5990 |
| Superplasticizer | 0 | 0.0000 | 0.0000 | 0.0027 | 0.0026 | 0.0043 | 0.0131 |
| Water | 0 | 0.0514 | 0.0695 | 0.0786 | 0.0777 | 0.0839 | 0.1122 |
There are no NA values in the dataset however input variables have different ranges of values.
# Plots histograms of features
concrete_data %>%
gather(key = Variable, value = Value) %>%
ggplot() +
geom_histogram(aes(x = Value), bins = 20, fill = "#08519c") +
facet_wrap(~Variable, scales='free') +
theme_bw() +
theme(aspect.ratio = 0.5, axis.title = element_blank(), panel.grid = element_blank())The aim is to find a concrete mixture with the highest compressive strength possible. Several search methods will be tested in order to obtain a concrete mixture with the highest predicted compressive strength.
The compressive strength will be obtained from a predictive model previously obtained from training data. This will be done with neural networks using model averaging that were obtained using the caret package.
Since the compressive strength generally increases with ageing and we are not interested in optimizing for this variable, predictions will be made using always using the same number of days of aging. All predictions will be made with the concrete age fixed at 28 days.
# Import predictive model
avNNet_model_final<- readRDS(file = "Models/avNNet_model.rds")
# Define minimum and maximum values for each input
margin <- 0.05
Cement_min_max <- c(min(concrete_data$Cement)*(1-margin),
max(concrete_data$Cement)*(1+margin)) %>% round(4)
Slag_min_max <- c(min(concrete_data$Slag)*(1-margin),
max(concrete_data$Slag)*(1+margin)) %>% round(4)
Ash_min_max <- c(min(concrete_data$Ash)*(1-margin),
max(concrete_data$Ash)*(1+margin)) %>% round(4)
Superplasticizer_min_max <- c(min(concrete_data$Superplasticizer)*(1-margin),
max(concrete_data$Superplasticizer)*(1+margin)) %>% round(4)
Coarse_Aggregate_min_max <- c(min(concrete_data$Coarse_Aggregate)*(1-margin),
max(concrete_data$Coarse_Aggregate)*(1+margin)) %>% round(4)
Fine_Aggregate_min_max <- c(min(concrete_data$Fine_Aggregate)*(1-margin),
max(concrete_data$Fine_Aggregate)*(1+margin)) %>% round(4)
lower_limits <- c(Cement_min_max[1], Slag_min_max[1], Ash_min_max[1], Superplasticizer_min_max[1], Coarse_Aggregate_min_max[1], Fine_Aggregate_min_max[1])
upper_limits <- c(Cement_min_max[2], Slag_min_max[2], Ash_min_max[2], Superplasticizer_min_max[2], Coarse_Aggregate_min_max[2], Fine_Aggregate_min_max[2])
# Set fixed value for aging
days_aging <- 28
# Set minimum and maxium acceptable amount of water in each mixture
maximum_water <- (max(concrete_data$Water)*1.05) %>% round(4)
minimum_water <- (min(concrete_data$Water)*0.95) %>% round(4)
n_best_solutions <- 5 # Number of best solutions to keep (for selected methods)
# Optional: Create a starting point for the genetic algorithm
starting_point <- sapply(X = concrete_data, FUN = mean) %>% t() %>% data.frame() %>% select("Cement", "Slag", "Ash", "Superplasticizer", "Coarse_Aggregate", "Fine_Aggregate") %>% as.matrix()Grid search is a blind search method that reduces the space of solutions by implementing a regular hyper dimensional search space. The main downside of this method is the high computational demands of problems with several dimensions, this can be balanced by increasing the grid search step.
To perform a grid search, a search grid with a wide range of values for each ingredient was created. To exclude unfeasible solutions, combinations with too much or too little water were excluded from the grid.
# Create search grid
Search_Grid <- expand.grid(
Cement = seq(Cement_min_max[1], Cement_min_max[2], length.out = 18),
Slag = seq(Slag_min_max[1], Slag_min_max[2], length.out = 18),
Ash = seq(Ash_min_max[1], Ash_min_max[2], length.out = 18),
Superplasticizer = seq(Superplasticizer_min_max[1], Superplasticizer_min_max[2], length.out = 18),
Coarse_Aggregate = seq(Coarse_Aggregate_min_max[1], Coarse_Aggregate_min_max[2], length.out = 18),
Fine_Aggregate = seq(Fine_Aggregate_min_max[1], Fine_Aggregate_min_max[2], length.out = 18)
)
# Set threshold for minimum and maximum solids: remove solutions with too little or too much water
minimum_solids <- 1-maximum_water
maximum_solids <- 1-minimum_water
# Remove solutions with solids content below minimum threshold (too much water) or above maximum threshold (not enough water)
Search_Grid <- Search_Grid[rowSums(Search_Grid) >= minimum_solids &
rowSums(Search_Grid) <= maximum_solids, ]
# Add column for water content
Search_Grid$Water <- 1-rowSums(Search_Grid)
Search_Grid$Age <- 28
# Tabulate summary statistics of search grid
Search_Grid %>%
signif(5) %>%
gather(key = "Feature", value = "Quantity") %>%
dplyr::group_by(Feature) %>%
summarise(
Min = min(Quantity, na.rm = T),
`1st Quartile` = quantile(Quantity, probs = 0.25, na.rm = T),
Median = median(Quantity, na.rm = T),
Mean = mean(Quantity, na.rm = T),
`3rd Quartile` = quantile(Quantity, probs = 0.75),
Max = max(Quantity, na.rm = T)
) %>%
kable(digits = 4, align = "c") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = F, position = "center")| Feature | Min | 1st Quartile | Median | Mean | 3rd Quartile | Max |
|---|---|---|---|---|---|---|
| Age | 28.0000 | 28.0000 | 28.0000 | 28.0000 | 28.0000 | 28.0000 |
| Ash | 0.0000 | 0.0165 | 0.0384 | 0.0408 | 0.0659 | 0.0933 |
| Cement | 0.0426 | 0.0654 | 0.0997 | 0.1115 | 0.1454 | 0.2367 |
| Coarse_Aggregate | 0.3286 | 0.3492 | 0.3904 | 0.3939 | 0.4317 | 0.5038 |
| Fine_Aggregate | 0.2356 | 0.2590 | 0.2942 | 0.3053 | 0.3411 | 0.4349 |
| Slag | 0.0000 | 0.0279 | 0.0557 | 0.0612 | 0.0929 | 0.1579 |
| Superplasticizer | 0.0000 | 0.0032 | 0.0065 | 0.0068 | 0.0106 | 0.0138 |
| Water | 0.0488 | 0.0633 | 0.0791 | 0.0805 | 0.0969 | 0.1178 |
The search grid has a 5.35 million different solutions.
Use the model to generate predictions for each concrete mixture in the grid:
# Make predictions on search grid
GS_T0 <- Sys.time() # record start time
Search_Grid$Strength <- predict(avNNet_model_final, Search_Grid)
GS_T1 <- Sys.time() # record end time
(GS_Time <- GS_T1 - GS_T0)Time difference of 1.6414 mins
# Save the n best solutions of each model into a single dataframe
avNNet_GS <- Search_Grid %>%
arrange(desc(Strength)) %>%
.[1:n_best_solutions, ] %>% # select n best solutions
mutate(Method = "Grid Search") %>%
select(Method, Strength, everything())
avNNet_GS %>%
arrange(desc(Strength)) %>%
knitr::kable(caption = "Summary of the best solutions", align = "c", digits = 4, col.names = gsub("_", " ", colnames(avNNet_GS))) %>%
kableExtra::kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = F, position = "center") | Method | Strength | Cement | Slag | Ash | Superplasticizer | Coarse Aggregate | Fine Aggregate | Water | Age |
|---|---|---|---|---|---|---|---|---|---|
| Grid Search | 93.807 | 0.2253 | 0.0464 | 0.0055 | 0.0024 | 0.3595 | 0.3059 | 0.0549 | 28 |
| Grid Search | 93.615 | 0.2253 | 0.0464 | 0.0110 | 0.0024 | 0.3595 | 0.3059 | 0.0494 | 28 |
| Grid Search | 92.516 | 0.2367 | 0.0557 | 0.0000 | 0.0024 | 0.3492 | 0.3059 | 0.0500 | 28 |
| Grid Search | 92.454 | 0.2367 | 0.0464 | 0.0055 | 0.0024 | 0.3595 | 0.2942 | 0.0552 | 28 |
| Grid Search | 92.360 | 0.2253 | 0.0557 | 0.0000 | 0.0024 | 0.3595 | 0.3059 | 0.0511 | 28 |
The grid search obtained a concrete mixture with a maximal predicted compressive strength of 93.807 MPa.
For performing a random search, a grid of solutions with randomly generated values for each variable is created. Values for each variable are randomly generated using the runif() function. Only one cycle of random search was executed.
Create random solutions of a wide range of ingredient combinations to perform a random search. To exclude unfeasible solutions, combinations with too much or too little water content were excluded.
set.seed(1)
Random_Solutions <- expand.grid(
Cement = runif(18, Cement_min_max[1], Cement_min_max[2]),
Slag = runif(18, Slag_min_max[1], Slag_min_max[2]),
Ash = runif(18, Ash_min_max[1], Ash_min_max[2]),
Superplasticizer = runif(18, Superplasticizer_min_max[1], Superplasticizer_min_max[2]),
Coarse_Aggregate = runif(18, Coarse_Aggregate_min_max[1], Coarse_Aggregate_min_max[2]),
Fine_Aggregate = runif(18, Fine_Aggregate_min_max[1], Fine_Aggregate_min_max[2])
)
# Set threshold for minimum and maximum solids: remove solutions with too little or too much water
minimum_solids <- 1-maximum_water
maximum_solids <- 1-minimum_water
# Remove solutions with solids content below minimum threshold (too much water) or above maximum threshold (not enough water)
Random_Solutions <- Random_Solutions[rowSums(Random_Solutions) >= minimum_solids &
rowSums(Random_Solutions) <= maximum_solids, ]
# Add column for water content
Random_Solutions$Water <- 1-rowSums(Random_Solutions)
Random_Solutions$Age <- 28
# Tabulate summary statistics
Random_Solutions %>%
signif(5) %>%
gather(key = "Feature", value = "Quantity") %>%
dplyr::group_by(Feature) %>%
summarise(
Min = min(Quantity, na.rm = T),
`1st Quartile` = quantile(Quantity, probs = 0.25, na.rm = T),
Median = median(Quantity, na.rm = T),
Mean = mean(Quantity, na.rm = T),
`3rd Quartile` = quantile(Quantity, probs = 0.75),
Max = max(Quantity, na.rm = T)
) %>%
kable(digits = 4, align = "c") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = F, position = "center")| Feature | Min | 1st Quartile | Median | Mean | 3rd Quartile | Max |
|---|---|---|---|---|---|---|
| Age | 28.0000 | 28.0000 | 28.0000 | 28.0000 | 28.0000 | 28.0000 |
| Ash | 0.0022 | 0.0384 | 0.0494 | 0.0469 | 0.0683 | 0.0804 |
| Cement | 0.0546 | 0.0817 | 0.0941 | 0.1121 | 0.1392 | 0.2351 |
| Coarse_Aggregate | 0.3499 | 0.3641 | 0.3871 | 0.3930 | 0.4048 | 0.4969 |
| Fine_Aggregate | 0.2474 | 0.2781 | 0.2895 | 0.3065 | 0.3309 | 0.4334 |
| Slag | 0.0021 | 0.0294 | 0.0537 | 0.0564 | 0.0761 | 0.1476 |
| Superplasticizer | 0.0010 | 0.0041 | 0.0056 | 0.0063 | 0.0091 | 0.0126 |
| Water | 0.0488 | 0.0620 | 0.0769 | 0.0789 | 0.0948 | 0.1178 |
There are 4.59 million different random solutions.
Use the model to generate predictions for each concrete mixture:
# Make predictions
RS_T0 <- Sys.time() # record start time
Random_Solutions$Strength <- predict(avNNet_model_final, Random_Solutions)
RS_T1 <- Sys.time() # record end time
(RS_Time <- RS_T1 - RS_T0)Time difference of 1.4248 mins
# Save the n best solutions of each model into a single dataframe
avNNet_RS <- Random_Solutions %>%
arrange(desc(Strength)) %>%
.[1:n_best_solutions, ] %>% # select n best solutions
mutate(Method = "Random Search") %>%
select(Method, Strength, everything())
# Tabulate best solutions
avNNet_RS %>%
arrange(desc(Strength)) %>%
knitr::kable(caption = "Summary of the best solutions", align = "c", digits = 4, col.names = gsub("_", " ", colnames(avNNet_RS))) %>%
kableExtra::kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = F, position = "center") | Method | Strength | Cement | Slag | Ash | Superplasticizer | Coarse Aggregate | Fine Aggregate | Water | Age |
|---|---|---|---|---|---|---|---|---|---|
| Random Search | 90.205 | 0.2260 | 0.0537 | 0.0101 | 0.0014 | 0.3537 | 0.306 | 0.0492 | 28 |
| Random Search | 89.937 | 0.2260 | 0.0422 | 0.0101 | 0.0014 | 0.3641 | 0.306 | 0.0503 | 28 |
| Random Search | 89.775 | 0.2260 | 0.0537 | 0.0101 | 0.0012 | 0.3537 | 0.306 | 0.0494 | 28 |
| Random Search | 89.535 | 0.2260 | 0.0537 | 0.0101 | 0.0010 | 0.3537 | 0.306 | 0.0496 | 28 |
| Random Search | 89.013 | 0.2189 | 0.0422 | 0.0101 | 0.0014 | 0.3641 | 0.306 | 0.0574 | 28 |
The random search obtained a concrete mixture with a maximal predicted compressive strength of 90.205 MPa.
Simulated annealing (SA) is a local search method, i.e. a method that focuses its attention within the local neighbourhood of an initial starting point. This method is inspired on the heat treatment of metals where a material is heated above its recrystallization temperature to increase its ductility and then cooled under controlled conditions.
The SA algorithm starts with a starting point (randomly generated or provided) and a specified initial temperature that controls the probability of accepting inferior solutions. At each successive iteration the temperature decreases making the exploration of new and more different solutions less probable.
The GenSA package will be used to optimize the concrete mixture using simulated annealing. The main search parameters for this implementation of SA are:
temperature)maxit)These parameters can be increased to improve the performance of complex optimization problems.
# Set parameter settings for search algorithm
max_iter <- 500 # maximum number of iterations
pop_size <- 10 # population sizeWe will use the dataset to generate a set of 10 starting points. These will be selected by creating a subset of the 10 most dissimilar solutions from a randomly selected solution.
# Randomly select one sample from the data as a starting point
set.seed(1)
start_index <- sample(x = 1:nrow(concrete_data), size = 1)
start_point <- concrete_data[start_index, ]
# Select the n most dissimilar observations from the starting point
n_observations <- pop_size
index_starting_observations <- caret::maxDissim(a = start_point, b = concrete_data, n = n_observations)
starting_observations <- concrete_data[index_starting_observations, ]
# Remove water from subset
starting_observations <- starting_observations %>% dplyr::select(-Water)Because we require all the solutions to add to one, the search procedure will be run without water and the proportion of water will be determined by subtracting the sum of the proportion of all ingredients minus 1.
In order to assess each solution, a custom function to predict the compressive strength of each solution is created.
# Create custom function for assessing solutions
eval_function <- function(x, model, min_water, max_water, age = 28) {
x1 <- x[1]; x2 <- x[2]; x3 <- x[3]; x4 <- x[4]; x5 <- x[5]; x6 <- x[6]
# Create dataframe with proportion of each solid component
solution_df <- data.frame(Cement = x1,
Slag = x2,
Ash = x3,
Superplasticizer = x4,
Coarse_Aggregate = x5,
Fine_Aggregate = x6)
# Calculate proportion of water
solution_df$Water <- 1-rowSums(solution_df) # Water = 1-sum(solids)
# Create death-penalty score for solutions with water content outside acceptable range
if(solution_df$Water >= min_water & solution_df$Water <= max_water & rowSums(solution_df) == 1) {
# Add pre-defined age to temporary solution
solution_df$Age <- age
return(-predict(model, solution_df)) # maximize strength
} else {
return(0)
}
}set.seed(1)
SA_output <- starting_observations
SA_output$Water <- NA
SA_output$Strength <- NA
i_max <- NA
trace_max_SA <- NA
SA_T0 <- Sys.time() # record start time
# Repeat the process of finding the optimal solution for each starting observation
for(i in 1:nrow(SA_output)) {
results <- GenSA::GenSA(
par = SA_output[i, 1:6] %>% as.matrix(),
fn = eval_function,
lower = lower_limits,
upper = upper_limits,
control = list(
maxit = max_iter/pop_size,
verbose = F),
model = avNNet_model_final,
min_water = minimum_water,
max_water = maximum_water
)
# Save the predictions
SA_output$Strength[i] <- abs(results$value)
# Save the input variables
SA_output[i, 1:6] <- results$par
if (SA_output$Strength[i] == max(SA_output$Strength, na.rm = T)) {
i_max <- i
trace_SA_max <- results$trace.mat
}
}
SA_T1 <- Sys.time() # record end time
(SA_Time <- SA_T1 - SA_T0)Time difference of 48.935 secs
# Save parameters of n best solutions
avNNet_SA <- SA_output %>%
arrange(desc(Strength)) %>%
.[1:n_best_solutions, ] %>%
mutate(Method = "Simulated Annealling",
Water = 1 - (Cement + Slag + Ash + Superplasticizer + Coarse_Aggregate + Fine_Aggregate), Age = days_aging) %>%
select(Method, Strength, everything())
# Tabulate best solutions
avNNet_SA %>%
arrange(desc(Strength)) %>%
knitr::kable(caption = "Summary of the best solutions", align = "c", digits = 4, col.names = gsub("_", " ", colnames(avNNet_SA))) %>%
kableExtra::kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = F, position = "center")| Method | Strength | Cement | Slag | Ash | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Water |
|---|---|---|---|---|---|---|---|---|---|
| Simulated Annealling | 90.599 | 0.2035 | 0.0677 | 0.0000 | 0.0027 | 0.3705 | 0.2975 | 28 | 0.0581 |
| Simulated Annealling | 89.706 | 0.2003 | 0.0883 | 0.0185 | 0.0050 | 0.3289 | 0.3074 | 28 | 0.0516 |
| Simulated Annealling | 89.706 | 0.2003 | 0.0883 | 0.0185 | 0.0050 | 0.3289 | 0.3074 | 28 | 0.0516 |
| Simulated Annealling | 87.805 | 0.1905 | 0.0676 | 0.0134 | 0.0047 | 0.3757 | 0.2885 | 28 | 0.0596 |
| Simulated Annealling | 87.805 | 0.1905 | 0.0676 | 0.0134 | 0.0047 | 0.3757 | 0.2885 | 28 | 0.0596 |
Simulated annealing obtained a concrete mixture with a maximal predicted compressive strength of 90.599 MPa.
Genetic algorithms (GA) are stochastic, population-based search methods inspired by Darwin’s theory of evolution. Genetic algorithms mimic the evolution of living organisms by creating and using a pool of candidate solutions to search for the global optimum. Although population-based methods tend to require more computations than single-state methods (e.g. hill climbing and simulated annealing), they work better as global optimisation methods as they tend to explore a wider range of regions in the search space.
The algorithm starts by creating a random population of feasible solutions. These are created within minimum and maximum values set by the user for each variable (gene). At each iteration (also known as generation) the individuals are assessed using a fitness function. A pre-defined number or percentage of the best individuals are then selected for the next generation (defined by elitism). This selected cohort can be recombined and modified. The probability of these two events is determined by the crossover and mutation rate, respectively.
Each generation of individuals goes through the steps of assessment, selection, crossover and mutation for a predefined number of iterations or until another stopping criteria is met.
The main parameters of a genetic algorithm are:
A random population seed may also be used for reproducibility.
The GA package developed by Luca Scrucca will be used to optimize the concrete mixture using the genetic algorithm. The package provides the ga() function whose main arguments are:
fitness: function to assess the fitness of the solutionslower and upper: lower and upper limits for each parameterpopSize: population sizemaxiter: maximum number of iterationspmutation: probability of mutationpcrossover: probability of crossoverelitism: fraction of best solutions to survive each generation# Set parameter settings for search algorithm
max_iter <- 500 # maximum number of iterations
pop_size <- 100 # population size# Create custom function for assessing solutions
eval_function <- function(x, model) {
x1 <- x[1]; x2 <- x[2]; x3 <- x[3]; x4 <- x[4]; x5 <- x[5]; x6 <- x[6]
# Create dataframe with proportion of each solid component
solution_df <- data.frame(Cement = x1,
Slag = x2,
Ash = x3,
Superplasticizer = x4,
Coarse_Aggregate = x5,
Fine_Aggregate = x6)
# Calculate proportion of water
solution_df$Water <- 1-rowSums(solution_df) # Water = 1-sum(solids)
# Create death-penalty score for solutions with water content outside acceptable range
if(solution_df$Water >= minimum_water & solution_df$Water <= maximum_water & rowSums(solution_df) == 1) {
# Add pre-defined age to temporary solution
solution_df$Age <- days_aging
return(predict(model, solution_df))
} else {
return(0)
}
}A Local search algorithm is incorporated with the GA by setting optim = TRUE.
set.seed(1)
n_cores <- detectCores()-1
GA_T0 <- Sys.time() # record start time
# Run Genetic Algorithm
GA_output <- GA::ga(
type = "real-valued",
fitness = function(x) eval_function(x, model = avNNet_model_final),
lower = lower_limits,
upper = upper_limits,
popSize = pop_size, # population size
maxiter = max_iter, # maximum nuber of iteriation
pmutation = 0.3, # probability of mutation
elitism = 0.3, # percentage elitism
suggestions = starting_point, # Optional: starting point for genetic algorithm
parallel = n_cores,
monitor = FALSE,
optim = TRUE, # incorporate local search
optimArgs = list(method = "L-BFGS-B",
poptim = 0.20,
pressel = 0.5,
control = list(fnscale = -1, maxit = 100)), # parameters for local search
seed = 1
)
GA_T1 <- Sys.time() # record end time
(GA_Time <- GA_T1 - GA_T0)Time difference of 6.0873 mins
# Print summary of GA
summary(GA_output)-- Genetic Algorithm -------------------
GA settings:
Type = real-valued
Population size = 100
Number of generations = 500
Elitism = 0.3
Crossover probability = 0.8
Mutation probability = 0.3
Search domain =
x1 x2 x3 x4 x5 x6
lower 0.0426 0.0000 0.0000 0.0000 0.3286 0.2356
upper 0.2367 0.1579 0.0933 0.0138 0.5038 0.4349
Suggestions =
x1 x2 x3 x4 x5 x6
1 0.11955 0.031643 0.023173 0.0026203 0.41517 0.33012
GA results:
Iterations = 500
Fitness function value = 88.7
Solution =
x1 x2 x3 x4 x5 x6
[1,] 0.18084 0.067048 0.01013 0.004546 0.37448 0.30375
GA_summary <- GA_output@summary %>%
.[,c("max", "mean", "median")] %>%
data.frame() %>%
select(Best = max, Mean = mean, Median = median) %>%
mutate(Iteration = 1:max_iter)
# Plot results
GA_summary %>%
gather(key = "Parameter", value = "Value", -Iteration) %>%
ggplot(mapping = aes(x = Iteration, y = Value, col = Parameter)) +
geom_line() +
theme_bw() +
theme(aspect.ratio = 0.7) +
scale_color_brewer(type = "qual", palette = "Set1") +
labs(x = "Iteration", y = "Compressive Strength (Predicted)", title = "Best predicted compressive strength at each iteration", subtitle = "Results using Genetic Algorithm")# Save best solution
avNNet_GA <- data.frame(Method = "Genetic Algorithm",
Strength = GA_output@fitnessValue,
Cement = GA_output@solution[1],
Slag = GA_output@solution[2],
Ash = GA_output@solution[3],
Superplasticizer = GA_output@solution[4],
Coarse_Aggregate = GA_output@solution[5],
Fine_Aggregate = GA_output@solution[6]) %>%
mutate(Water = 1 - (Cement + Slag + Ash + Superplasticizer + Coarse_Aggregate + Fine_Aggregate),
Age = days_aging)
# Tabulate best solutions
avNNet_GA %>%
arrange(desc(Strength)) %>%
knitr::kable(caption = "Summary of the best solutions", align = "c", digits = 4, col.names = gsub("_", " ", colnames(avNNet_GA))) %>%
kableExtra::kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = F, position = "center") | Method | Strength | Cement | Slag | Ash | Superplasticizer | Coarse Aggregate | Fine Aggregate | Water | Age |
|---|---|---|---|---|---|---|---|---|---|
| Genetic Algorithm | 88.7 | 0.1808 | 0.067 | 0.0101 | 0.0045 | 0.3745 | 0.3037 | 0.0592 | 28 |
The genetic algorithm obtained a concrete mixture with a maximal predicted compressive strength of 88.7 MPa.
Islands Genetic Algorithms (GAISL) are a variant of Genetic Algorithms where the population of solutions is partitioned in a set of subpopulations. Each island performs it runs its own GA with occasional individuals being transfered from one island to another.
The GAISL is run using the gaisl() function from the GA package. In addition to the arguments of the ga() function, there are additional argument to control the number of islands and migrate rate:
numIslands: number of islandsmigrationRate: proportion of individuals that migrate between islands in each exchangemigrationInterval: number of iterations at which exchanges are performed.# Set parameter settings for search algorithm
max_iter <- 500 # maximum number of iterations
pop_size <- 100 # population sizeA Local search algorithm is incorporated with the GAISL algorithm by setting optim = TRUE. Four islands will be used (numIslands = 4) with a migration rate (migrationRate) of 0.1 done at each 10 interations (migrationInterval = 10).
set.seed(1)
n_cores <- detectCores()-1
GAISL_T0 <- Sys.time() # record start time
# run GAISL algorithm
GAISL_output <- GA::gaisl(
type = "real-valued",
fitness = function(x) eval_function(x, model = avNNet_model_final),
lower = lower_limits,
upper = upper_limits,
popSize = pop_size, # population size
maxiter = max_iter, # maximum nuber of iteriation
pmutation = 0.3, # probability of mutation
elitism = 0.3, # percentage elitism
numIslands = 4,
migrationRate = 0.1,
migrationInterval = 10,
monitor = F,
# suggestions = starting_point, # Optional: starting point for genetic algorithm
parallel = n_cores,
optim = T, # perform local search
optimArgs = list(method = "L-BFGS-B",
poptim = 0.20,
pressel = 0.5,
control = list(fnscale = -1, maxit = 100)), # paramaters for local search
seed = 1)
GAISL_T1 <- Sys.time() # record end time
(GAISL_Time <- GAISL_T1 - GAISL_T0)Time difference of 5.2167 mins
summary(GAISL_output)-- Islands Genetic Algorithm -----------
GA settings:
Type = real-valued
Number of islands = 4
Islands pop. size = 25
Migration rate = 0.1
Migration interval = 10
Elitism = 0.3
Crossover probability = 0.8
Mutation probability = 0.3
Search domain =
x1 x2 x3 x4 x5 x6
lower 0.0426 0.0000 0.0000 0.0000 0.3286 0.2356
upper 0.2367 0.1579 0.0933 0.0138 0.5038 0.4349
GA results:
Iterations = 500
Epochs = 50
Fitness function values = 89.602 89.602 89.602 89.602
Solutions =
x1 x2 x3 x4 x5 x6
[1,] 0.207 0.080108 0.0094798 0.0033201 0.38825 0.26303
[2,] 0.207 0.080108 0.0094798 0.0033201 0.38825 0.26303
[3,] 0.207 0.080108 0.0094798 0.0033201 0.38825 0.26303
[4,] 0.207 0.080108 0.0094798 0.0033201 0.38825 0.26303
# Save trace data from each island
GAISL_summary <- data.frame(Iteration = 1:max_iter,
Island_1 = GAISL_output@summary[[1]][,"max"],
Island_2 = GAISL_output@summary[[2]][,"max"],
Island_3 = GAISL_output@summary[[3]][,"max"],
Island_4 = GAISL_output@summary[[4]][,"max"])
# Plot trace data for each island
GAISL_summary %>%
gather(key = "Island", value = "Average", -Iteration) %>%
ggplot(mapping = aes(x = Iteration, y = Average, col = Island)) +
geom_line(size = 0.75, alpha = 0.8) +
theme_bw() +
theme(aspect.ratio = 0.5) +
scale_color_brewer(type = "qual", palette = "Set1") +
labs(x = "Iteration", y = "Compressive Strength (Predicted)", title = "Best predicted compressive strength at each iteration", subtitle = "Results using Islands Genetic Algorithm (GAISL)")# Save best solution
avNNet_GAISL <- data.frame(Method = "Islands Genetic Algorithm",
Strength = GAISL_output@fitnessValue,
Cement = GAISL_output@solution[1],
Slag = GAISL_output@solution[2],
Ash = GAISL_output@solution[3],
Superplasticizer = GAISL_output@solution[4],
Coarse_Aggregate = GAISL_output@solution[5],
Fine_Aggregate = GAISL_output@solution[6]) %>%
mutate(Water = 1 - (Cement + Slag + Ash + Superplasticizer + Coarse_Aggregate + Fine_Aggregate),
Age = days_aging)
# Tabulate best solution(s)
avNNet_GAISL %>%
arrange(desc(Strength)) %>%
knitr::kable(caption = "Summary of the best solution", align = "c", digits = 4, col.names = gsub("_", " ", colnames(avNNet_GAISL))) %>%
kableExtra::kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = F, position = "center") | Method | Strength | Cement | Slag | Ash | Superplasticizer | Coarse Aggregate | Fine Aggregate | Water | Age |
|---|---|---|---|---|---|---|---|---|---|
| Islands Genetic Algorithm | 89.602 | 0.207 | 0.0801 | 0.0095 | 0.0033 | 0.3882 | 0.263 | 0.0488 | 28 |
The islands genetic algorithm obtained a concrete mixture with a maximal predicted compressive strength of 89.602 MPa.
Differential Evolution (DE) is a population-based search method for multidimensional real-valued functions. Similar to genetic algorithms, DE uses a population of solutions and creates new candidates solutions from parent solutions.
The main difference between genetic algorithms and differential evolution is regarding how new candidate solutions are created. In the latter method, new solutions are created by differential mutation of the population members. Three candidate solutions (e.g. a, b and c) are randomly selected and a mutant parameter vector is created using a simple arithmetic formula, y = a + F(b-c), where F is a positive scaling factor that typically ranges from 0 to 1. This process is repeated for each dimension. The final mutant vector is then used for recombination.
Search for the global minimum of the 2D Ackley function with Differential Evolution. Source: Wikipedia
The DEoptim package will be used for searching optimal concrete mixtures with differential evolution (DE). The package provides the DEoptim() function whose main arguments are:
fn: function to assess the fitness of the solutionslower and upper: lower and upper limits for each parametercontrol: list of control parameters for search method (population size, propability of crossover, maximum number of iterations, etc.)# Set parameter settings for search algorithm
max_iter <- 500 # maximum number of iterations
pop_size <- 100 # population sizePrior to run the search method, a custom objective function is created.
# Create custom function for assessing solutions
eval_function <- function(x, model, min_water, max_water, age = 28) {
x1 <- x[1]; x2 <- x[2]; x3 <- x[3]; x4 <- x[4]; x5 <- x[5]; x6 <- x[6]
# Create dataframe with proportion of each solid component
solution_df <- data.frame(Cement = x1,
Slag = x2,
Ash = x3,
Superplasticizer = x4,
Coarse_Aggregate = x5,
Fine_Aggregate = x6)
# Calculate proportion of water
solution_df$Water <- 1-rowSums(solution_df) # Water = 1-sum(solids)
# Create death-penalty score for solutions with water content outside acceptable range
if(solution_df$Water >= min_water & solution_df$Water <= max_water & rowSums(solution_df) == 1) {
# Add pre-defined age to temporary solution
solution_df$Age <- age
return(-predict(model, solution_df))
} else {
return(0)
}
}set.seed(1)
n_cores <- detectCores()-1
DE_T0 <- Sys.time() # record start time
# Run differential evolution algorithm
DE_output <- DEoptim::DEoptim(
fn = eval_function,
lower = lower_limits,
upper = upper_limits,
control = DEoptim.control(
NP = pop_size, # population size
itermax = max_iter, # maximum number of iterations
CR = 0.5, # probability of crossover
F = 0.8, # differential weighting factor
storepopfreq = 1 , # store every population
parallelType = 1, # run parallel processing
trace = F
),
model = avNNet_model_final,
min_water = minimum_water,
max_water = maximum_water
)
DE_T1 <- Sys.time() # record end time
(DE_Time <- DE_T1-DE_T0)Time difference of 2.6333 mins
# Print search results
summary(DE_output)
***** summary of DEoptim object *****
best member : 0.23665 0.04695 0 0.00223 0.35646 0.30419
best value : -95.67
after : 500 generations
fn evaluated : 1002 times
*************************************
DE_solution <- DE_output$optim$bestmem
# Save parameters of best solution
avNNet_DE <- data.frame(Method = "Differential Evolution",
Strength = abs(DE_output$optim$bestval),
Cement = DE_solution[1],
Slag = DE_solution[2],
Ash = DE_solution[3],
Superplasticizer = DE_solution[4],
Coarse_Aggregate = DE_solution[5],
Fine_Aggregate = DE_solution[6]) %>%
mutate(Water = 1 - (Cement + Slag + Ash + Superplasticizer + Coarse_Aggregate + Fine_Aggregate),
Age = days_aging)
# Plot results
ggplot(mapping = aes(x = 1:length(DE_output$member$bestvalit), y = abs(DE_output$member$bestvalit))) +
geom_line(col = "#08519c", size = 1) +
theme_bw() +
theme(aspect.ratio = 0.7) +
labs(x = "Iteration", y = "Compressive Strength (Predicted)", title = "Best predicted compressive strength at each iteration", subtitle = "Results using Differential Evolution")# Tabulate best solution(s)
avNNet_DE %>%
arrange(desc(Strength)) %>%
knitr::kable(caption = "Summary of the best solution", align = "c", digits = 4, col.names = gsub("_", " ", colnames(avNNet_DE))) %>%
kableExtra::kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = F, position = "center") | Method | Strength | Cement | Slag | Ash | Superplasticizer | Coarse Aggregate | Fine Aggregate | Water | Age |
|---|---|---|---|---|---|---|---|---|---|
| Differential Evolution | 95.67 | 0.2366 | 0.047 | 0 | 0.0022 | 0.3565 | 0.3042 | 0.0535 | 28 |
The search was completed and found the best value to be 95.67 after 500 iterations and 1002 function evaluations.
Particle swarm optimization (PSO) is a population-based search method that belongs to the swarm intelligence family of algorithms. Proposed by Kenned and Eberhart (1995), this method is inspired by the swarm behavior of several animals such as bird flocks, fish schools and bee swarms.
The PSO method searches for the optimal solution by using a population of candidate solutions (also known as particles) that iteratively move across the search space. The movement of a specific particle across the search space is determined by its local best known position but also by the best known position in the search space found by other particles. This results in the whole swarm moving in a self-organized behaviour.
Each particle is defined by its:
The position of a particle on the next iteration depends on its current position and velocity, while the velocity depends on all of the parameters that define the particle.
A swarm of particles searching for the global minimum. Source: Wikipedia
The PSO package will be used to optimize the concrete mixture using particle swarm optimization through the psoptim() function. Prior to run the search method, a custom objective function is created.
# Set parameter settings for search algorithm
max_iter <- 500 # maximum number of iterations
pop_size <- 100 # population size# Create custom function for assessing solutions
eval_function <- function(x, model, min_water, max_water, age = 28) {
x1 <- x[1]; x2 <- x[2]; x3 <- x[3]; x4 <- x[4]; x5 <- x[5]; x6 <- x[6]
# Create dataframe with proportion of each solid component
solution_df <- data.frame(Cement = x1,
Slag = x2,
Ash = x3,
Superplasticizer = x4,
Coarse_Aggregate = x5,
Fine_Aggregate = x6)
# Calculate proportion of water
solution_df$Water <- 1-rowSums(solution_df) # Water = 1-sum(solids)
# Create death-penalty score for solutions with water content outside acceptable range
if(solution_df$Water >= min_water & solution_df$Water <= max_water & rowSums(solution_df) == 1) {
# Add pre-defined age to temporary solution
solution_df$Age <- age
return(-predict(model, solution_df))
} else {
return(0)
}
}The psoptim() function from the pso package is used to run the search algorithm. The SPSO2011 method will be used for this search.
set.seed(1)
n_cores <- detectCores()-1
PSO_T0 <- Sys.time() # record start time
# Run differential evolution algorithm
PSO_output <- pso::psoptim(
par = rep(NA, 6),
fn = eval_function,
lower = lower_limits,
upper = upper_limits,
control = list(
trace = 1, # produce tracing information on the progress of the optimization
maxit = max_iter, # maximum number of iterations
REPORT = 1, # frequency for reports
trace.stats = T,
s = pop_size, # Swarm Size,
maxit.stagnate = round(0.75*max_iter), # maximum number of iterations without improvement
vectorize = T,
type = "SPSO2011" # method used
),
model = avNNet_model_final,
min_water = minimum_water,
max_water = maximum_water
)
PSO_T1 <- Sys.time() # record end time
(PSO_Time <- PSO_T1-PSO_T0)Time difference of 1.8913 mins
PSO_solution <- PSO_output$par
avNNet_PSO <- data.frame(Method = "Particle Swarm Optimization",
Strength = abs(PSO_output$value),
Cement = PSO_solution[1],
Slag = PSO_solution[2],
Ash = PSO_solution[3],
Superplasticizer = PSO_solution[4],
Coarse_Aggregate = PSO_solution[5],
Fine_Aggregate = PSO_solution[6]) %>%
mutate(Water = 1 - (Cement + Slag + Ash + Superplasticizer + Coarse_Aggregate + Fine_Aggregate),
Age = days_aging)
# Plot results
PSO_summary <- data.frame(
Iteration = PSO_output$stats$it,
Mean = PSO_output$stats$f %>% sapply(FUN = mean) %>% abs(),
Median = PSO_output$stats$f %>% sapply(FUN = median) %>% abs(),
Best = PSO_output$stats$error %>% sapply(FUN = min) %>% abs()
)
PSO_summary %>%
gather(key = "Parameter", value = "Value", -Iteration) %>%
ggplot(mapping = aes(x = Iteration, y = abs(Value), col = Parameter)) +
geom_line(size = 0.7) +
theme_bw() +
theme(aspect.ratio = 0.9) +
labs(x = "Iteration", y = "Compressive Strength (Predicted)", title = "Best predicted compressive strength at each iteration", subtitle = "Results using Particle Swarm Optimization") +
scale_color_brewer(type = "qual", palette = "Set1")# Tabulate best solution(s)
avNNet_PSO %>%
arrange(desc(Strength)) %>%
knitr::kable(caption = "Summary of the best solution", align = "c", digits = 4, col.names = gsub("_", " ", colnames(avNNet_PSO))) %>%
kableExtra::kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = F, position = "center")| Method | Strength | Cement | Slag | Ash | Superplasticizer | Coarse Aggregate | Fine Aggregate | Water | Age |
|---|---|---|---|---|---|---|---|---|---|
| Particle Swarm Optimization | 88.695 | 0.1814 | 0.0671 | 0.01 | 0.0045 | 0.3743 | 0.3036 | 0.0591 | 28 |
The search was completed a found the best predicted compressive strength to be 88.695 after 500 iterations and 50000 function evaluations.
The table below summarises the values of predicted compressive strength for each method used and their respective concrete mixtures.
predictor_ID <- c("Cement", "Slag", "Ash", "Superplasticizer", "Coarse_Aggregate", "Fine_Aggregate", "Water")
summary_table <- bind_rows(avNNet_GS, avNNet_RS, avNNet_SA, avNNet_GA, avNNet_GAISL, avNNet_DE, avNNet_PSO)
summary_table %>%
arrange(desc(Strength)) %>%
knitr::kable(caption = "Summary of the best solutions.", align = "c", digits = 3, col.names = gsub("_", " ", colnames(summary_table))) %>%
kableExtra::kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = F, position = "center") | Method | Strength | Cement | Slag | Ash | Superplasticizer | Coarse Aggregate | Fine Aggregate | Water | Age |
|---|---|---|---|---|---|---|---|---|---|
| Differential Evolution | 95.670 | 0.237 | 0.047 | 0.000 | 0.002 | 0.356 | 0.304 | 0.054 | 28 |
| Grid Search | 93.807 | 0.225 | 0.046 | 0.005 | 0.002 | 0.360 | 0.306 | 0.055 | 28 |
| Grid Search | 93.615 | 0.225 | 0.046 | 0.011 | 0.002 | 0.360 | 0.306 | 0.049 | 28 |
| Grid Search | 92.516 | 0.237 | 0.056 | 0.000 | 0.002 | 0.349 | 0.306 | 0.050 | 28 |
| Grid Search | 92.454 | 0.237 | 0.046 | 0.005 | 0.002 | 0.360 | 0.294 | 0.055 | 28 |
| Grid Search | 92.360 | 0.225 | 0.056 | 0.000 | 0.002 | 0.360 | 0.306 | 0.051 | 28 |
| Simulated Annealling | 90.599 | 0.204 | 0.068 | 0.000 | 0.003 | 0.371 | 0.298 | 0.058 | 28 |
| Random Search | 90.205 | 0.226 | 0.054 | 0.010 | 0.001 | 0.354 | 0.306 | 0.049 | 28 |
| Random Search | 89.937 | 0.226 | 0.042 | 0.010 | 0.001 | 0.364 | 0.306 | 0.050 | 28 |
| Random Search | 89.775 | 0.226 | 0.054 | 0.010 | 0.001 | 0.354 | 0.306 | 0.049 | 28 |
| Simulated Annealling | 89.706 | 0.200 | 0.088 | 0.018 | 0.005 | 0.329 | 0.307 | 0.052 | 28 |
| Simulated Annealling | 89.706 | 0.200 | 0.088 | 0.018 | 0.005 | 0.329 | 0.307 | 0.052 | 28 |
| Islands Genetic Algorithm | 89.602 | 0.207 | 0.080 | 0.009 | 0.003 | 0.388 | 0.263 | 0.049 | 28 |
| Random Search | 89.535 | 0.226 | 0.054 | 0.010 | 0.001 | 0.354 | 0.306 | 0.050 | 28 |
| Random Search | 89.013 | 0.219 | 0.042 | 0.010 | 0.001 | 0.364 | 0.306 | 0.057 | 28 |
| Genetic Algorithm | 88.700 | 0.181 | 0.067 | 0.010 | 0.005 | 0.374 | 0.304 | 0.059 | 28 |
| Particle Swarm Optimization | 88.695 | 0.181 | 0.067 | 0.010 | 0.005 | 0.374 | 0.304 | 0.059 | 28 |
| Simulated Annealling | 87.805 | 0.191 | 0.068 | 0.013 | 0.005 | 0.376 | 0.288 | 0.060 | 28 |
| Simulated Annealling | 87.805 | 0.191 | 0.068 | 0.013 | 0.005 | 0.376 | 0.288 | 0.060 | 28 |
The PCA plot below shows how the best solutions of each model differ and cluster.
PCA <- prcomp(summary_table[, predictor_ID], center = TRUE, scale = TRUE) # Age needs to be removed from the PCA matrix (zero-variance)
autoplot(PCA,
data = summary_table,
size = "Strength",
colour = "Method", alpha = 0.7,
loadings = TRUE, loadings.label = TRUE, loadings.colour = "grey10", loadings.label.colour = "grey10", loadings.label.size = 3,
Loadings.label.label = c("Cement", "Slag", "Ash", "Superplasticizer", "Coarse Aggregate", "Fine Aggregate", "Water")) +
scale_color_brewer(type = "qual", palette = "Set1") +
labs(title = "PCA plot of best concrete mixtures for each search method", subtitle = "Strength values at 28 days.", size = "Predicted Strength\n(MPa)", caption = NULL) +
theme_bw() +
theme(aspect.ratio = 0.9, legend.text = element_text(size = 8.5), legend.title = element_text(face = "bold", size = 9), axis.title = element_text(size = 9))The PCA plot above shows that there are some differences between solutions provided by each search method. Solutions with higher strength leverage the use of cement for obtaining higher predicted compressive strengths.
The solution obtained with differential evolution had the highest predicted strength and based on the PCA plot above it composition is similar to the composition of solutions obtained grid and random search.
sessionInfo()R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] kableExtra_1.0.1 knitr_1.22 broom_0.5.1
[4] ggfortify_0.4.5 GGally_1.4.0 DEoptim_2.2-4
[7] GenSA_1.1.7 GA_3.2 pso_1.0.3
[10] gridExtra_2.3 doParallel_1.0.14 iterators_1.0.10
[13] foreach_1.4.4 readxl_1.3.0 tidyselect_0.2.5
[16] forcats_0.4.0 stringr_1.4.0 dplyr_0.8.0.1
[19] purrr_0.3.0 readr_1.3.1 tidyr_0.8.3
[22] tibble_2.0.1 tidyverse_1.2.1 caret_6.0-81
[25] ggplot2_3.1.0 lattice_0.20-38 plyr_1.8.4
[28] pacman_0.5.1
loaded via a namespace (and not attached):
[1] httr_1.4.0 viridisLite_0.3.0 jsonlite_1.6
[4] splines_3.5.1 prodlim_2018.04.18 modelr_0.1.4
[7] assertthat_0.2.0 highr_0.7 stats4_3.5.1
[10] cellranger_1.1.0 yaml_2.2.0 ipred_0.9-8
[13] pillar_1.3.1 backports_1.1.3 glue_1.3.0
[16] digest_0.6.18 RColorBrewer_1.1-2 rvest_0.3.2
[19] colorspace_1.4-0 recipes_0.1.4 htmltools_0.3.6
[22] Matrix_1.2-15 timeDate_3043.102 pkgconfig_2.0.2
[25] haven_2.1.0 webshot_0.5.1 scales_1.0.0
[28] gower_0.1.2 lava_1.6.5 proxy_0.4-22
[31] generics_0.0.2 withr_2.1.2 nnet_7.3-12
[34] lazyeval_0.2.1 cli_1.0.1 survival_2.43-3
[37] magrittr_1.5 crayon_1.3.4 evaluate_0.13
[40] fansi_0.4.0 nlme_3.1-137 MASS_7.3-51.1
[43] xml2_1.2.0 class_7.3-15 tools_3.5.1
[46] data.table_1.12.0 hms_0.4.2 munsell_0.5.0
[49] compiler_3.5.1 rlang_0.3.1 grid_3.5.1
[52] rstudioapi_0.9.0 labeling_0.3 rmarkdown_1.12
[55] gtable_0.2.0 ModelMetrics_1.2.2 codetools_0.2-16
[58] reshape_0.8.8 reshape2_1.4.3 R6_2.4.0
[61] lubridate_1.7.4 utf8_1.1.4 stringi_1.3.1
[64] Rcpp_1.0.0 rpart_4.1-13 xfun_0.5