STRAP-III Analysis

STRAP-III Problem Statement

The basic premise of STRAP-III formulation is as follows: if a decision-maker is faced with a mix of humans and digital surveillance resources for human trafficking, what should be the allocation policy for an area such that any type-I or type-II errors are minimized? That is, which nodes and/or edges should be allocated a resource for minimizing detection error and ensuring at least K ‘units’ of trafficking activity is detected (traffickers or victims).

In order to illustrate solutions to this problem, consider the following graph:

## unable to reach CRAN

## Network attributes:
##   vertices = 5
##   directed = TRUE
##   hyper = FALSE
##   loops = FALSE
##   multiple = FALSE
##   bipartite = FALSE
##  total edges = 13 
##    missing edges = 0 
##    non-missing edges = 13 
##  density = 0.65 
## 
## Vertex attributes:
## 
##  city_population:
##    numeric valued attribute
##    attribute summary:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     9.0    10.0    12.0    11.8    14.0    14.0 
##   vertex.names:
##    character valued attribute
##    5 valid vertex names
## 
## Edge attributes:
## 
##  travel_population:
##    numeric valued attribute
##    attribute summary:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   3.000   5.000   4.615   6.000   7.000

## [1] "Population on vertices is "

## [1]  9 14 12 14 10

## [1] "Population on edges is given by the following adjacency matrix "

##      [,1] [,2] [,3] [,4] [,5]
## [1,]    0    7    1    5    4
## [2,]    0    0    5    3    0
## [3,]    7    0    0    0    0
## [4,]    0    0    7    0    3
## [5,]    3    6    6    3    0

Given this graph, as representative of an area, we now start with defining the optimization problem. First, initialize the parameters of the graph:

## [1] "Cost of edge surveillance"

##            [,1]
##  [1,] 0.2985909
##  [2,] 0.3824498
##  [3,] 0.3392299
##  [4,] 0.3043190
##  [5,] 0.2813981
##  [6,] 0.1603479
##  [7,] 0.2160844
##  [8,] 0.2538751
##  [9,] 0.2682912
## [10,] 0.1252195
## [11,] 0.3268904
## [12,] 0.2290141
## [13,] 0.1896407

## [1] "Cost of vertex surveillance"

##           [,1]
## [1,] 0.1648461
## [2,] 0.3378435
## [3,] 0.1480809
## [4,] 0.3732748
## [5,] 0.1168643

## [1] "Estimate (noisy) of human trafficking on edges"

##             [,1]
##  [1,] 0.19652491
##  [2,] 0.40396945
##  [3,] 0.09543817
##  [4,] 0.03125569
##  [5,] 0.40143971
##  [6,] 0.22691506
##  [7,] 0.05852149
##  [8,] 0.04273295
##  [9,] 0.12109762
## [10,] 0.48582426
## [11,] 0.41593361
## [12,] 0.32599641
## [13,] 0.32737021

## [1] "Estimate (noisy) of human trafficking on nodes"

##            [,1]
## [1,] 0.05899827
## [2,] 0.34617981
## [3,] 0.24976303
## [4,] 0.14106362
## [5,] 0.44876007

## [1] "Sensitivity of surveillance devices (Homogeneous)"

## [1] 0.4

## [1] "Specificity of surveillance devices (Homogeneous)"

## [1] 0.4

Now, consider the device level imperfections – it may be corruption in case of humans or technical limitations (battery power, processing capacity) in case of devices. Initialize the following vector for the number of devices and the corresponding imperfection values. Higher values can be construed for devices while lower values can be for humans:

## [1] "Imperfection values are: "

##           [,1]      [,2]      [,3]    [,4]      [,5]       [,6]      [,7]
## [1,] 0.2681594 0.2403329 0.2513642 0.19619 0.1912825 0.02482012 0.1146641
##           [,8]
## [1,] 0.1875591

We now initialize the optimization setup. Using OMPR and GPLK solvers, following is a setup that can be used for solving this allocation problem:

#Optimization using MIP
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(tidyr)
library(ROI)

## ROI: R Optimization Infrastructure

## Registered solver plugins: nlminb, glpk.

## Default solver: auto.

library(ROI.plugin.glpk)
library(ompr)
library(ompr.roi)

m <- MIPModel() %>%
  ##Variables that consider allocation on nodes
  add_variable(occn[i,l], i=1:length(devices), l=1:(num_nodes), type="binary") %>%
  
  ##Variables that consider allocation on edges
  add_variable(occe[i,j], i=1:length(devices), j=1:(num_edges), type="binary") %>%
  
  ##Budget constraint
  add_constraint(sum_expr(CV[l,1]*occn[i,l], i=1:length(devices), l=1:num_nodes)+sum_expr(CE[j,1]*occe[i,j], i=1:length(devices), j=1:num_edges)<= B) %>% 
  
  ##Trafficking activity constraint - at least K should be detected
  add_constraint(sum_expr(H[i,1]*occn[i,l]*S2[l,1]*ytpr, i=1:length(devices), l=1:num_nodes)+sum_expr(H[i,1]*occe[i,j]*W2[j,1]*ytpr,i=1:length(devices), j=1:num_edges)>= K) %>%
  
  ##Allocation constraints
  ##Each resource can be allocated to at most 1 vertex
  add_constraint(sum_expr(occn[i,l], i=1:length(devices))<=1, l=1:(num_nodes)) %>%
  
  ##Each resource can be allocated to at most 1 edge
  add_constraint(sum_expr(occe[i,j], i=1:length(devices))<=1, j=1:(num_edges)) %>%
  
  ##Each resource can be allocated to at most 1 vertex OR 1 edge but NOT BOTH
  add_constraint(occe[i,j]+occn[i,l]<=1, i=1:length(devices), j=1:(num_edges), l=1:(num_nodes)) %>%
  
  ##Each vertex can be allocated to at most 1 resource
  add_constraint(sum_expr(occn[i,l], l=1:(num_nodes))<=1, i=1:length(devices)) %>%
  
  ##Each edge can be allocated to at most 1 resource
  add_constraint(sum_expr(occe[i,j],  j=1:(num_edges))<=1, i=1:length(devices)) %>%
  
  ##Overall allocations should not exceed number of resources available
  add_constraint(sum_expr(occn[i,l],i=1:length(devices), l=1:(num_nodes))+sum_expr(occe[i,j], i=1:length(devices), j=1:(num_edges)) <= length(devices)) %>%
  
  #Objective function setup
  set_objective(sum_expr(H[i,1]*occn[i,l]*S1[l,1]*(1-ytnr), i=1:length(devices), l=1:num_nodes)+sum_expr(H[i,1]*occn[i,l]*S2[l,1]*(1-ytpr), i=1:length(devices), l=1:num_nodes)+sum_expr(H[i,1]*occe[i,j]*W1[j,1]*(1-ytnr), i=1:length(devices), j=1:num_edges)+sum_expr(H[i,1]*occe[i,j]*W2[j,1]*(1-ytpr), i=1:length(devices), j=1:num_edges),"min")  
  
#Solving model using GLPK Solver
result <- solve_model(m, with_ROI(solver = "glpk", verbose = TRUE))

## <SOLVER MSG>  ----
## GLPK Simplex Optimizer, v4.65
## 557 rows, 144 columns, 1760 non-zeros
##       0: obj =   0.000000000e+00 inf =   1.000e+00 (1)
##       7: obj =   3.934501603e+00 inf =   0.000e+00 (0)
## *    31: obj =   3.493956617e+00 inf =   1.539e-15 (0)
## OPTIMAL LP SOLUTION FOUND
## GLPK Integer Optimizer, v4.65
## 557 rows, 144 columns, 1760 non-zeros
## 144 integer variables, all of which are binary
## Integer optimization begins...
## Long-step dual simplex will be used
## +    31: mip =     not found yet >=              -inf        (1; 0)
## Solution found by heuristic: 3.85456108048
## Solution found by heuristic: 3.73979156046
## Solution found by heuristic: 3.64479872829
## Solution found by heuristic: 3.59471115308
## Solution found by heuristic: 3.57472270357
## Solution found by heuristic: 3.5688337794
## Solution found by heuristic: 3.5688337794
## Solution found by heuristic: 3.5657489603
## Solution found by heuristic: 3.5616929605
## Solution found by heuristic: 3.55671606643
## +  1204: mip =   3.556716066e+00 >=     tree is empty   0.0% (0; 481)
## INTEGER OPTIMAL SOLUTION FOUND
## <!SOLVER MSG> ----

#Indices of vertex/edges where resource is allocated
which(result$solution>0)

##  occe[4,2]  occe[8,5] occe[1,10] occe[2,11] occe[5,12]  occn[6,2]  occn[3,5] 
##         12         40         73         82         93        118        139

Reading the output should be as follows: For occe[x,y] or occn[x,y], x => the resource while y => the edge/node it is allocated to. Now here are a few questions:

Does having more variety in options to choose from (resource count) help in achieving the goal of at least K units of trafficking activity detection at budget B?
At what point (resource count) does it become unfeasible to decide on an allocation policy that minimizes detection error while achieving at least K units detected?

Both these questions, from a practice perspective, help in planning recruitment and training of humans and device purchase in case of planning deployment.

STRAP-III Analysis

A Ray

3/23/2021

STRAP-III Problem Statement