Frankie Cho
15 August 2022
UQ CBCS
Asking a computer to make decisions!
Consider a decision problem of choosing 25% of locations out of \( N \) locations for conservation.
Each location is either “chosen” or “not chosen” (0 or 1).
source('ilp-intro-functions.R')
N <- c(20,80,100,1000)
nchoosek <- function(n) choose(n, n/4)
Combinations <- sapply(N, nchoosek)
knitr::kable(data.frame(N, Combinations))
N | Combinations |
---|---|
20 | 1.550400e+04 |
80 | 3.535316e+18 |
100 | 2.425193e+23 |
1000 | 4.822840e+242 |
How to search through these combinations efficiently?
Compared to simulated annealing based methods ILP offers:
a) Data for the objective function
b) Data for the constraints
c) List of variables we “allow” the computer to choose
a) Raster of the costs of reserve sites
b) Raster of species distribution, spatial connectivity matrix
c) Binary decision variable: whether a reserve site is “selected” in the location
set.seed(123)
nx <- 50
ny <- 50
N <- nx*ny
r1 <- matrix(rpois(nx*ny, correlatedSpatialGrid(.5,nx,ny)), ncol = ny, nrow = nx)
r2 <- matrix(rpois(nx*ny, correlatedSpatialGrid(2,nx,ny)), ncol = ny, nrow = nx)
c <- matrix(exp(rnorm(correlatedSpatialGrid(5,nx,ny), 0)), ncol = ny, nrow = nx)
plotR1 <- plotSpatialGrid(r1, label = "Species 1", colorScheme = 'viridis')
plotR2 <- plotSpatialGrid(r2, label = "Species 2", colorScheme = 'viridis')
plotC <- plotSpatialGrid(c, label = "Cost", colorScheme = 'magma')
ggpubr::ggarrange(plotR1, plotR2, plotC, nrow = 1)
\[ \min_\mathbf{x} \sum^N_{i=1} c_i x_i \: (1) \] \[ s.t. \sum_{i=1}^N r_{ik} x_i \geq T_k, \forall k \in K \: (2) \] \[ x_i \in \{0,1\} \: (3) \]
How to use these formulations in my own problem??
We need to find data variables and decision variables
\( N= 2500 \): 2500 possible locations to choose from
\( K=2 \): two species we are concerned about
\( \mathbf{c} = [c_1, c_2, ..., c_N] \) : vector of length N, costs of setting up a reserve at each location
\( \mathbf{r}_k = [r_{1k}, r_{2k}, ..., r_{Nk}] \) : vector of length N, number of the species \( k \) in each location – we have \( K \) number of these vectors
\( T_k \): a scalar, conservation objective for each species
\( \mathbf{x} = [x_1, x_2, ..., x_N] \) : vector of length N
Gurobi find the vector \( \mathbf{x} \) that optimises the objective function
In an LP/ ILP/ MILP, a decision variable cannot be multiplied with another decision variable
e.g. \( x_ix_j \) is not feasible in the LP – if two decision variables are multiplied together, then the problem will be a quadratic program (QP) – computationally intensive
Beyer et al. (2016) overcomes this problem with 'linearization' – using another integer variable to represent binary variables multiplied together
Data variables can multiply with other data variables. Multiplication occurs before the problem is solved.
Decision variables need to lie in a set. This describes the range of values these decision variables can take.
\( x_i \in \mathbb{R} \): \( x_i \) can be any real number
\( x_i \in [0,1] \) : \( x_i \) is between 0 and 1; can be non-integer; same as: \( 0 \leq x_i \leq 1 \)
\( x_i \in \{0,1\} \) : \( x_i \) is either 0 or 1 (i.e. integer)
\( x_i \in \mathbb{Z} \): \( x_i \) must be an integer
\( x_i \in \mathbb{Z}^+ \): \( x_i \) must be a positive integer
Gurobi accepts problems in the following form:
\[ \min_\mathbf{x} \mathbf{c'x} \\ s.t. \mathbf{A'x} \leq \mathbf{b} \]
Vector-by-vector multiplication \( \mathbf{c'x} \)
\[ \begin{bmatrix} c_1 & c_2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = c_1x_1 + c_2x_2 \]
Matrix-by-vector multiplication \( \mathbf{A'x}\leq b \)
\[ \begin{bmatrix} r_{1,1} & r_{2,1} \\ r_{1,2} & r_{2,2} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} r_{1,1}x_1 + r_{2,1}x_1 \\ r_{1,2}x_2 + r_{2,2}x_2 \end{bmatrix} < \begin{bmatrix} T_1 \\ T_2 \end{bmatrix} \]
result <- gurobi(model, params)
model$obj
: \( \mathbf{c} \)
model$A
: \( \mathbf{A} \)
model$rhs
: \( \mathbf{b} \)
model$modelsense
: \( \min \)
model$sense
: \( < \)
result$x
: \( \mathbf{x} \)
result$objval
: \( \mathbf{c'x} \)
\[ \min_\mathbf{x} \sum^N_{i=1} c_i x_i \: (1) \] \[ s.t. \sum_{i=1}^N r_{ik} x_i \geq T_k, \forall k \in K \: (2) \] \[ x_i \in \{0,1\} \: (3) \]
Equivalent matrix formulation: \[ \min_\mathbf{x} \mathbf{c'x}\: (1) \] \[ s.t. \mathbf{r}_k' \mathbf{x} \geq T_k, \forall k \in K \: (2) \] \[ \mathbf{x} \in \{0,1\} \: (3) \]
model <- list()
model$obj <- as.vector(c)
model$modelsense <- 'min'
model$A <- matrix(c(as.vector(r1), as.vector(r2)), nrow = 2, byrow = T)
model$rhs <- c(0.25*sum(r1),0.25*sum(r2))
model$sense <- '>='
model$vtype <- 'B'
# result <- gurobi(model, list())
result <- rsymphony_solve_gurobi(model, list())
plotX <- plotSpatialGrid(matrix(result$x, ncol = ny, nrow = nx), 'Decision', 'viridis')
ggarrange(plotC, plotR1, plotR2, plotX, nrow = 1)
LP solves much quicker than ILP/ MILP because it takes advantages of more efficient algorithms.
start_time_integer <- Sys.time()
integer_result <- rsymphony_solve_gurobi(model, list())
end_time_integer <- Sys.time()
model_cont <- model
model_cont$vtype <- 'C'
model_cont$lb <- rep(0,nx*ny)
model_cont$ub <- rep(1,nx*ny)
start_time_cont <- Sys.time()
non_integer_result <- rsymphony_solve_gurobi(model_cont, list())
end_time_cont <- Sys.time()
integer_time <- end_time_integer - start_time_integer
cont_time <- end_time_cont - start_time_cont
integer_time
Time difference of 5.022165 secs
cont_time
Time difference of 0.02275205 secs
Some LP will lead to integer results even when integer constraints are not explicitly imposed. Investigate why integer constraints are needed for your problem.
plotXInt <- plotSpatialGrid(matrix(integer_result$x, ncol = ny, nrow = nx), 'Decision (integer)', 'viridis')
plotXCont <- plotSpatialGrid(matrix(non_integer_result$x, ncol = ny, nrow = nx), 'Decision (continuous)', 'viridis')
ggarrange(plotX, plotXCont, nrow=1);
\[ \min_{\mathbf{x}} \sum_{i=1}^N c_ix_i - b \sum_{i=1}^N \sum_{j=1}^N z_{ij} \\ s.t. - x_i + z_{ij} \leq 0 \: \forall (i,j) \in E \\ - x_j + z_{ij} \leq 0 \: \forall (i,j) \in E \\ - x_i - x_j + z_{ij} \geq -1 \: \forall (i,j) \in E \]
This ensured that for neighbours \( i \) and \( j \), if both \( x_i \) and \( x_j \) are equal to \( 1 \), then \( z_{ij} \) must be 1. Otherwise, \( z_{ij} = 0 \). In other words \( z_{ij} \) replaced \( x_ix_j \).
Notice the increase in number of constraints! (increase by number of elements in \( E \) times 3)
“Efficiency frontier”
aka: “Pareto frontier”, “Trade-off curves”
Possible to alter the spatial connectivity in result through a linearised objective
W <- spatialWeightsMatrix(nx, ny, rowStandardise = FALSE)
W[lower.tri(W)] <- 0 # Set it as a upper triangular matrix
E <- which(W!=0, arr.ind = TRUE) # Get set of neighbours E, with two columns, one for index of i and one for index of j
num_neighbours <- nrow(E)
A_r1 <- c(as.vector(r1), rep(0, num_neighbours))
A_r2 <- c(as.vector(r2), rep(0, num_neighbours))
A_E <- matrix(0, nrow = num_neighbours*3, ncol = length(r1)+num_neighbours)
bVector <- c(0.25*sum(r1),0.25*sum(r2))
senseVector <- c('>=','>=')
# Loop through all the elements in the set of neighbours
for (s in 1:nrow(E)) {
# Given three constraints:
# z_ij - x_i < 0, coefficients for z_ij and x_i are 1 and -1
# z_ij - x_j < 0, coefficients for z_ij and x_j are 1 and -1
# z_ij - x_i - x_j < -1, coefficients for z_ij, x_i and x_j are 1, -1 and -1 respectively
row <- ((1+(s-1)*3)):(((s-1)*3)+3)
A_E[row,E[s,1]] <- c(-1, 0, -1) # Coefficients for x_i
A_E[row,E[s,2]] <- c( 0,-1, -1) # Coefficients for x_j
A_E[row,s+(nx*ny)] <- c(1, 1, 1) # Coefficients for z_ij are all 1
senseVector <- c(senseVector, '<=', '<=', '>=') # Inequality constraints
bVector <- c(bVector, 0,0,-1) # RHS of the equation
}
model_connectivity <- list()
model_connectivity$modelsense <- 'min'
model_connectivity$A <- rbind(A_r1, A_r2, A_E)
model_connectivity$rhs <- bVector
model_connectivity$sense <- senseVector
model_connectivity$vtype <- 'B'
b <- seq(0,0.5,0.1)
result_connectivity <- list()
## Loop through all 5 selections of b
for (i in 1:length(b)) {
model_connectivity$obj <- c(as.vector(c), rep(-b[i], num_neighbours))
result_connectivity[[i]] <- rsymphony_solve_gurobi(model_connectivity, list())
}
plot_connectivity = list()
for (i in 1:length(b)) {
plot_connectivity[[i]] <- plotSpatialGrid(matrix(
result_connectivity[[i]]$x[1:(nx*ny)], ncol = ny, nrow = nx),
paste("b = ", as.character(b[i])))
}
ggarrange(plotlist = plot_connectivity, nrow = 1)
efficiencyFrontier <- data.frame(b = b, cost = rep(0,length(b)), connectivity = rep(0,length(b)))
for (i in 1:length(b)) {
x <- result_connectivity[[i]]$x[1:(nx*ny)]
z_ij <- result_connectivity[[i]]$x[(nx*ny+1):length(result_connectivity[[i]]$x)]
efficiencyFrontier[i,2] <- as.vector(c)%*%x
efficiencyFrontier[i,3] <- sum(z_ij)
}
ggplot(efficiencyFrontier,aes(y = -cost, x = connectivity, label = paste('b = ',b))) +
geom_smooth(method = 'glm', se=F, formula = y ~ poly(x,3)) +
geom_point() +
geom_text(vjust = 0, nudge_y = 1) +
scale_y_continuous('Cost (negative)')+
scale_x_continuous('Connectivity')+
theme_bw()
\[ \min_\mathbf{x} \sum_{i=1}^N \sum_{k\in K} r_{ik}x_i \\ s.t. \sum_{i=1}^N c_i x_i \leq B \\ x_i \in \{0,1\} \]
Ensures that if \( x_a \) is selected, \( n \) number of its neighbours are also selected \[ \min_{\mathbf{x}} \sum_{i=1}^N c_ix_i \\ s.t. \sum^{m}_{i=1}x_i \geq nx_a \]
W <- spatialWeightsMatrix(nx, ny, rowStandardise = FALSE)
constraintMatrixNeighbour <- function(W, n) {
A <- matrix(0, nrow = nrow(W), ncol = nrow(W))
for (i in 1:nrow(W)) {
# Initialise vector of zeros corresponding to the length of the decision vector (for that row of constraint)
aj <- W[,i]
# Set the number of neighbours for i
aj[i] <- -n
# Set the matrix back to A
A[,i] <- aj
}
A
}
model_neighbour <- model
model_neighbour$modelsense <- 'min'
model_neighbour$sense <- ">="
model_neighbour$vtype <- 'B'
neighbours <- c(0,1,2,3)
result_neighbour <- list()
## Loop through all 5 selections of b
for (i in 1:length(neighbours)) {
model_neighbour$A <- rbind(model$A, constraintMatrixNeighbour(W, neighbours[i]))
model_neighbour$rhs <- c(model$rhs, rep(0, nx*ny))
result_neighbour[[i]] <- rsymphony_solve_gurobi(model_neighbour, list())
}
plot_neighbour = list()
for (i in 1:length(neighbours)) {
plot_neighbour[[i]] <- plotSpatialGrid(matrix(
result_neighbour[[i]]$x[1:(nx*ny)], ncol = ny, nrow = nx),
paste("n = ", as.character(neighbours[i])))
}
ggarrange(plotlist = plot_neighbour, nrow = 1)