Optimization is important in many fields, including data science. In manufacturing, where every decision affects the process and the profit of the organization, optimization is employed everywhere: deciding how many units of each product to produce, scheduling units for production, finding the best process parameters, and determining routes, as in the traveling salesman problem. In data science, we are familiar with model tuning, where we adjust hyperparameters to improve model performance. Optimization algorithms can help us find a better model.
Some problems have conflicting objectives, or trade-offs, such as maximizing return while minimizing risk, or maximizing profit while minimizing production cost. These kinds of problems belong to the class of Multi-Objective Optimization Problems (MOOP). Solving a MOOP is more difficult and requires a different approach than single-objective optimization. One family of algorithms for this kind of problem is the Multi-Objective Evolutionary Algorithm (MOEA), an extension of the Genetic Algorithm to MOOP. If you are unfamiliar with the Genetic Algorithm, you can check my previous post before reading on. One of the most popular MOEAs is NSGA-II [1], from which many newer algorithms are derived. We will learn how to deploy it to solve MOOPs in business problems.
This post is dedicated to learning how to solve a Multi-Objective Optimization Problem and build the optimization workflow with R.
Optimization is the process of making something better. The goal is to find a solution that maximizes or minimizes an objective function subject to constraints. For example, we may want to maximize profit while at the same time minimizing the cost of production. If a problem has more than a single objective, it can be reduced to a single-objective problem, for example by aggregating the objectives into a weighted sum. However, this approach has some disadvantages: the weights must be chosen before the trade-offs are known, and a single aggregated run returns only one solution, hiding the trade-off between objectives.
Meanwhile, in a multi-objective optimization problem (MOOP), the trade-off can be identified because a MOOP algorithm returns multiple solutions across the Pareto front. Multiple solutions also give the decision maker alternatives to choose from instead of a single answer.
Since a MOOP has more than one objective function, we can't directly say that solution A is better than solution B. Instead, we use the notion of Pareto dominance.
Given a decision space \(\Omega\), a Multi-Objective Optimization Problem is concerned with finding a vector \(\vec{x} \in \Omega\) that optimizes the vector function

\[\vec{f}(\vec{x}) = [f_1(\vec{x}), f_2(\vec{x}), \dots, f_k(\vec{x})]\]
In a minimization problem, a vector \(\vec{x}\) is said to dominate \(\vec{y}\) (denoted \(\vec{x} \prec \vec{y}\)) if:

\[f_i(\vec{x}) \leq f_i(\vec{y}) \ \ \forall i \in \{1, \dots, k\} \quad and \quad \exists \ i : f_i(\vec{x}) < f_i(\vec{y})\]
In simple words: solution x dominates (is better than) solution y if every objective value of x is less than or equal to the corresponding objective value of y, and x is strictly better (has a lower value, in minimization) on at least one objective.
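To make the definition concrete, here is a tiny R helper (a hypothetical function for illustration, not part of any package) that checks whether one objective vector dominates another in a minimization problem:

# TRUE if x dominates y: no worse on every objective, strictly better on one
dominates <- function(x, y) {
  all(x <= y) && any(x < y)
}

dominates(c(1, 2), c(2, 2))  # TRUE: better on f1, equal on f2
dominates(c(1, 3), c(2, 2))  # FALSE: a trade-off, neither solution wins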
A solution is called dominant if it is better on at least one objective while being at least as good on all the others. Look at the graph below.
Suppose we have four solutions (call them A, B, C, and D), with the value of the first objective function on the x-axis (f1) and the second objective function (f2) on the y-axis. Both objectives are minimized, so the closer a point is to the bottom left, the better the solution. From the figure above, we can infer several things about which solutions dominate which.
Multi-Objective Evolutionary Algorithms (MOEA) apply evolutionary algorithms, such as the genetic algorithm, to MOOPs. There is already a massive amount of research in this area, resulting in a wide variety of algorithms. NSGA-II is one of the most popular choices and has inspired much of the later research.
Before we proceed, let's review how the Genetic Algorithm works:

1. Initialize a random population of candidate solutions.
2. Evaluate the fitness of each individual.
3. Select parents based on fitness.
4. Create offspring through crossover and mutation.
5. Form the next generation and repeat from step 2 until a stopping criterion is met.
For a multi-objective problem, NSGA-II follows the same workflow as the process above. The difference is that, at the end of each iteration, the population for the next iteration is chosen using non-dominated sorting: the solutions are divided into several fronts based on their domination status against each other. If the initial population size is 10, the top 10 individuals from the combined parent and offspring populations, taken front by front starting from the best front, are chosen for the next iteration. To control the diversity of solutions, the crowding distance of each solution is also calculated [1]; for two objectives it corresponds to the Manhattan distance between the two neighboring solutions. Within the last front that fits, the solutions with the largest crowding distance are selected.
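For intuition, here is a minimal sketch of the crowding distance calculation (my own illustration, not the internals of nsga2R; obj is assumed to be a matrix with one row per solution and one column per objective):

# Crowding distance: boundary solutions get Inf so they are always kept;
# interior solutions accumulate the normalized gap between their neighbors
crowding_distance <- function(obj) {
  n <- nrow(obj)
  dist <- rep(0, n)
  for (j in 1:ncol(obj)) {
    ord <- order(obj[, j])          # solutions sorted by objective j
    dist[ord[c(1, n)]] <- Inf       # extremes of each objective
    rng <- diff(range(obj[, j]))
    if (rng > 0 && n > 2) {
      for (i in 2:(n - 1)) {
        dist[ord[i]] <- dist[ord[i]] +
          (obj[ord[i + 1], j] - obj[ord[i - 1], j]) / rng
      }
    }
  }
  dist
}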
The problem is replicated from the study of Anagnostopoulos and Mamanis [2], which involves 3 objective functions. The study employed a MOEA for portfolio selection and optimization in investment management.
Portfolio optimization is concerned with managing a portfolio of assets so as to minimize the risk objectives subject to a constraint guaranteeing a given level of return. One of the fundamental principles of financial investment is diversification: investors spread their money across different types of assets. Diversification minimizes investors' exposure to risk and maximizes portfolio returns. In some cases it is very difficult for the decision maker to know beforehand the optimal number of securities to include in the portfolio without examining all the trade-offs between risk, return, and the cardinality of the portfolio. In any case, an investor might lose important solutions with large trade-offs between the objectives when forced to fix the number of assets in advance.
The objective functions are:
1. Maximizing Returns
The total expected return of the portfolio is maximized:
\[max \ \mu(x) = \sum_{i=1}^{N} x_i \ \mu_i\]
\(\mu(x)\): total mean return of the portfolio

\(x_i\): weight of asset \(i\)

\(\mu_i\): mean return of asset \(i\)
2. Minimizing Risk
The risk of the portfolio is represented by the variance of return.
\[min \ \rho(x) = \sum_{i=1}^{N}\sum_{j=1}^{N} x_i \ x_j \ \sigma_{ij}\]
\(\rho(x)\): total risk of portfolios
\(\sigma_{ij}\): covariance between asset \(i\) and \(j\)
3. Minimizing Number of Assets
The number of assets in the portfolio, i.e., the count of all assets with weight > 0, is minimized.
\[min \ card(x) = \sum_{i=1}^{N} 1_{x_i>0}\]
\(card(x)\): number of assets
Subject To
\[\sum_{i=1}^{N} x_i = 1\]
\[0 \leq x_i \leq 1\]
Data is acquired from the New York Stock Exchange dataset on Kaggle (https://www.kaggle.com/dgawlik/nyse). We will only use data from January to March 2015 for illustration.
# Packages used throughout this post
library(dplyr)
library(lubridate)
library(tidyr)
library(nsga2R)

nyse <- data.table::fread("data_input/prices.csv")

nyse <- nyse %>%
  mutate(date = ymd(date)) %>%
  filter(year(date) == 2015,
         month(date) %in% c(1:3))
head(nyse)
To get a clearer company name, let's import the Ticker Symbol and Security columns.
securities <- data.table::fread("data_input/securities.csv")

securities <- securities %>%
  select(`Ticker symbol`, Security) %>%
  rename(symbol = `Ticker symbol`,
         name = Security)
Join them with the price data.
nyse <- nyse %>%
  left_join(securities, by = c("symbol")) %>%
  select(date, symbol, name, everything())
head(nyse)
Let's limit the number of available stocks to 30 randomly selected ones. Too many choices would lead to a longer runtime, and since this is just an introduction, I will keep the problem small.
set.seed(13)
selected_stock <- sample(nyse$symbol, 30)
nyse <- nyse %>%
  filter(symbol %in% selected_stock)
head(nyse)
Let’s calculate the daily returns.
nyse <- nyse %>%
  rename(price = close) %>%
  select(date, symbol, name, price) %>%
  group_by(symbol, name) %>%
  mutate(price_prev = lag(price),
         returns = (price - price_prev) / price_prev) %>%
  slice(-1) %>%  # drop each stock's first day, which has no previous price
  ungroup()
head(nyse)
Let’s calculate the mean return of each stock.
mean_stock <- nyse %>%
  group_by(symbol) %>%
  summarise(mean = mean(returns)) %>%
  arrange(desc(mean))
head(mean_stock)
The value of \(R_f\) (the risk-free rate) is taken from the interest rate on a three-month U.S. Treasury bill. Since our price data covers early 2015, we use the rate from March 27, 2015, which is 0.04%. The rate is acquired from https://ycharts.com/indicators/3_month_t_bill.
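If you want the rate available in R for later reference (it is not used by the fitness function below), a one-line assignment is enough:

# 3-month T-bill rate on March 27, 2015 (0.04%), expressed as a decimal
rf <- 0.04 / 100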
Now we calculate the covariance matrix between stocks. First, we need to spread the returns of each stock into separate columns.
nyse_wide <- nyse %>%
  pivot_wider(id_cols = date, names_from = symbol, values_from = returns) %>%
  select(-date)
# Convert returns to excess returns by subtracting each stock's mean return.
# Matching columns by symbol avoids relying on column order, since
# mean_stock is sorted by descending mean rather than by column position.
for (s in colnames(nyse_wide)) {
  nyse_wide[[s]] <- nyse_wide[[s]] - mean_stock$mean[mean_stock$symbol == s]
}
head(nyse_wide)
Create the covariance matrix.
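A minimal sketch of this step, assuming the objects created above (mean_aligned is a helper name I introduce here so the mean returns follow the same column order as the covariance matrix; the fitness function below relies on it):

# Covariance matrix of the excess returns; row/column order follows nyse_wide
nyse_cov <- cov(nyse_wide)

# Mean returns reordered to match the covariance matrix columns
mean_aligned <- mean_stock$mean[match(colnames(nyse_wide), mean_stock$symbol)]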
Let’s define the fitness function.
fitness <- function(x){
  # x: weight of each stock, in the same order as the columns of nyse_wide
  weight_stock <- x

  # Penalty for violating the constraint that the weights must sum to 1
  penalty <- 1e9 * (round(sum(weight_stock), 10) - 1)^2

  # Objective 1: total expected return of the portfolio
  mean_return <- sum(weight_stock * mean_aligned) - penalty

  # Objective 2: portfolio risk (variance), w' Sigma w
  risk <- as.numeric(t(weight_stock) %*% nyse_cov %*% weight_stock) + penalty

  # Objective 3: number of assets with non-zero weight
  card <- sum(weight_stock > 0)

  # nsga2R minimizes all objectives, so the return objective is negated
  return(c(-mean_return, risk, card))
}
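Before running the optimization, a quick sanity check helps confirm that the function returns three values, for example with an equally weighted portfolio (a minimal probe, assuming the objects defined above):

# Equal weights across all selected stocks, so sum(w) is exactly 1
w <- rep(1 / n_distinct(nyse$symbol), n_distinct(nyse$symbol))
fitness(w)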
The NSGA-II algorithm can be run with the nsga2R() function from the nsga2R package.
Here are some of the parameters of the function:

- fn: the fitness function to be minimized
- varNo: the number of decision variables
- objDim: the number of objective functions
- lowerBounds: the lower bounds of each decision variable
- upperBounds: the upper bounds of each decision variable
- popSize: the size of the population
- generations: the number of generations

We will run the algorithm for 1000 generations with a population size of 200. The mutation probability (mprob) is 0.2, while the crossover probability (cprob) is 0.8.
set.seed(123)
n_asset <- n_distinct(nyse$symbol)

finance_optim <- nsga2R(fn = fitness, varNo = n_asset, objDim = 3, generations = 1000,
                        mprob = 0.2, popSize = 200, cprob = 0.8,
                        lowerBounds = rep(0, n_asset), upperBounds = rep(1, n_asset))
To check the Pareto front solutions, look at the objectives element of the result and subset only the rows whose paretoFrontRank is 1. Let's see how many distinct optimal solutions are generated.
finance_optim$objectives[finance_optim$paretoFrontRank == 1, ] %>%
  matrix(ncol = 3) %>%
  as.data.frame() %>%
  distinct() %>%
  mutate(V1 = round(-V1, 5),  # flip the sign back: the return was negated
         V2 = round(V2, 5)) %>%
  rename(`Total Return` = V1,
         Risk = V2,
         `Number of Assets` = V3)
It turns out there is only one distinct solution.

Let's check the total weight of the first solution.
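A minimal check (assuming the first row of parameters corresponds to the first solution shown above):

# Total weight of the first solution; the penalty term should force this to 1
sum(finance_optim$parameters[1, ])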
## [1] 1
The sum of the weights is 1, so we can be confident that our solution doesn't violate the constraint.
If you want the weight of each asset, use the parameters element of finance_optim. Let's look at the parameters of the first solution.
data.frame(symbol = colnames(nyse_wide),
           weight = finance_optim$parameters[1, ]) %>%
  left_join(securities, by = "symbol") %>%  # attach the company names
  mutate(return = round(weight * mean_aligned, 5),
         weight = round(weight, 5)) %>%
  select(symbol, name, weight, return)
The vehicle crashworthiness problem [3] is a multi-objective problem in which the crash safety level of a vehicle is optimized. A higher safety level reflects how well a vehicle can protect its occupants from the effects of a frontal accident.
The study considers a full-width frontal crash test and an offset-frontal crash test. The full-width impact usually results in high deceleration, which can cause serious biomechanical injuries to the occupants, while the offset test is demanding for the structural integrity of the vehicle. To improve the overall crashworthiness of a vehicle's frontal structure, these two crash scenarios should be considered simultaneously. There are five decision variables representing the thickness of reinforced members around the car front, as shown in the figure below.
The following are the 3 objectives that need to be minimized for optimal crashworthiness:
\[minimize\ f_1(x) = 1640.2823 + 2.3573285\ x_1 + 2.3220035\ x_2 +4.5688768\ x_3 + 7.7213633\ x_4 + 4.4559504\ x_5\]
\[minimize\ f_2(x) = 6.5856 + 1.15\ x_1 - 1.0427\ x_2 + 0.9738\ x_3 + 0.8364\ x_4 - 0.3695\ x_1x_4 + 0.0861\ x_1x_5 + \\ 0.3628\ x_2x_4 - 0.1106\ x_1^2 - 0.3437\ x_3^2 + 0.1764\ x_4^2\]

\[minimize\ f_3(x) = -0.0551 + 0.0181\ x_1 + 0.1024\ x_2 + 0.0421\ x_3 - 0.0073\ x_1x_2 + 0.024\ x_2x_3 - \\ 0.0118\ x_2x_4 - 0.0204\ x_3x_4 - 0.008\ x_3x_5 - 0.0241\ x_2^2 + 0.0109\ x_4^2\]
Constrained with:
\[1 \leq x_i \leq 3, \quad i = 1, \dots, 5\]
\(f_1\): The mass of the vehicle
\(f_2\): An integration of collision acceleration in the full frontal crash
\(f_3\): The toe-board intrusion in the 40% offset-frontal crash
Let's define the fitness function.
fitness <- function(x){
  # Calculate f1(x): vehicle mass
  f1 <- 1640.2823 + 2.3573285*x[1] + 2.3220035*x[2] + 4.5688768*x[3] +
    7.7213633*x[4] + 4.4559504*x[5]

  # Calculate f2(x): integrated collision acceleration in the full frontal crash
  f2 <- 6.5856 + 1.15*x[1] - 1.0427*x[2] + 0.9738*x[3] + 0.8364*x[4] -
    0.3695*x[1]*x[4] + 0.0861*x[1]*x[5] + 0.3628*x[2]*x[4] -
    0.1106*x[1]^2 - 0.3437*x[3]^2 + 0.1764*x[4]^2

  # Calculate f3(x): toe-board intrusion in the offset-frontal crash
  f3 <- -0.0551 + 0.0181*x[1] + 0.1024*x[2] + 0.0421*x[3] -
    0.0073*x[1]*x[2] + 0.024*x[2]*x[3] - 0.0118*x[2]*x[4] -
    0.0204*x[3]*x[4] - 0.008*x[3]*x[5] - 0.0241*x[2]^2 + 0.0109*x[4]^2

  # All three objectives are minimized
  return(c(f1, f2, f3))
}
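As a quick sanity check, evaluating the fitness function at the midpoint of the decision space should return three finite values (a minimal probe, not part of the original study):

# All five thicknesses set to the middle of their allowed range [1, 3]
fitness(rep(2, 5))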
We will run NSGA-II with 5000 generations and population size of 100.
set.seed(123)

car_optim <- nsga2R(fn = fitness, varNo = 5, objDim = 3, generations = 5000, popSize = 100,
                    lowerBounds = rep(1, 5), upperBounds = rep(3, 5))
Let's look at the optimal solutions: those that belong to the first front.
car_result <- car_optim$objectives %>%
  as.data.frame() %>%
  setNames(c("f1", "f2", "f3")) %>%  # name the objectives before binding
  bind_cols(setNames(as.data.frame(car_optim$parameters), paste0("x_", 1:5))) %>%
  filter(car_optim$paretoFrontRank == 1) %>%
  mutate(across(everything(), ~ round(.x, 3))) %>%
  distinct()

car_result
As we can see, multiple distinct solutions achieve similar objective values; all of them are non-dominated with respect to each other, meaning no solution is better than another in every objective. Let's compare the first 3 solutions.
Is the first solution better than the second? Remember that all objectives are minimized.

- the f1 value of the first solution is higher than the second solution
- the f2 value of the first solution is lower than the second solution
- the f3 value of the first solution is higher than the second solution

The second solution could dominate (be better than) the first solution only if its f2 value were also lower than the first solution's. Since the f2 of the second solution is higher, the first and second solutions do not dominate each other, and thus they are equally good.
How about the first solution and the third one?

- the f1 value of the first solution is lower than the third solution
- the f2 value of the first solution is lower than the third solution
- the f3 value of the first solution is higher than the third solution

The first solution could dominate the third solution only if its f3 value were also lower than the third solution's. Since it is not, the first and third solutions are non-dominated with respect to each other.
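These pairwise checks can also be written in code, reusing the dominates() helper sketched earlier (column names as defined in car_result above):

# Neither of the first three solutions dominates another,
# which is why all of them sit on the first front
s1 <- as.numeric(car_result[1, c("f1", "f2", "f3")])
s2 <- as.numeric(car_result[2, c("f1", "f2", "f3")])
s3 <- as.numeric(car_result[3, c("f1", "f2", "f3")])
dominates(s1, s2)  # FALSE
dominates(s2, s1)  # FALSE
dominates(s1, s3)  # FALSE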
The consequence is that the decision maker can decide which solution to pick by examining the trade-offs between objectives. If the decision maker prioritizes lower mass (f1), they should choose the solution with the lowest value of f1.

If you want the solution with the lowest mass, choose the first row of the table above; if you can accept a higher mass, the second row is an alternative. Based on the table, choosing a lower mass results in a higher toe-board intrusion (f3). This trade-off should be weighed by the decision maker.
We can visualize the objective values.
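One simple option is a base R pairs plot of the three objectives from the car_result table above (a minimal sketch; any scatter-plot tool works):

# Pairwise scatter plots of the Pareto-front objective values
pairs(car_result[, c("f1", "f2", "f3")],
      labels = c("Mass (f1)", "Acceleration (f2)", "Intrusion (f3)"))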
A Multi-Objective Optimization Problem (MOOP) must be solved differently from a single-objective problem. MOEAs, such as NSGA-II, can be used to solve MOOPs. There are many newer algorithms that perform better and improve on aspects of NSGA-II [4]. The lack of package development for MOEAs in R is a concern; we may need to build our own functions if we are not satisfied with NSGA-II.
References:

[1] A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II
[2] A portfolio optimization model with three objectives and discrete variables
[3] Multiobjective optimization for crash safety design of vehicles using stepwise regression model
[4] Multiobjective evolutionary algorithms: A survey of the state of the art