Weighted Constrained K-Means

In this demo, the electricity consumption of 12 buildings is collected. The project aims to setup a number of microgrids powered by solar panel generators and provide electricity to these buildings at minimum cost. While wiring cost is considered, the optimal assignment has to satisfy the capacities of varying solar panel generators. In other words, the research question is to group the buildings into a number of clusters such that the aggregate inter-cluster distance is minimized within each cluster while keeping the aggregate electricity consumption (demand) of each cluster at a moderate level defined by upper/lower bound. I develop a weighted constrained k-means approach to solve this problem.

Inspired by Bradley et al. (2000), I formulate the constrained k-means clustering as a linear programming problem: \[min_{C,T} \quad \Sigma _{i=1}^m \Sigma _{h=1}^k wT_{i,h} ||x^i-C^h||^2 + \Sigma _{h=1}^k \Sigma _{g=1}^n \rho _gS_{h,g}+ \Sigma _{t=1}^{\tau}\Sigma _{h=1}^k \theta ^P P_{h,t}+\Sigma _{t=1}^{\tau}\Sigma _{h=1}^k \theta ^Q Q_{h,t}\] \[s.t. \quad \Sigma _{h=1}^k T_{i,h}=1, \quad i=1,...,m\] \[\quad \quad \quad \quad \quad \quad \quad \quad \Sigma _{i=1}^m d _{i,t}T_{i,h}\leq \Sigma _{g=1}^n U _gS_{h,g}, \quad h=1,...,k \quad t=1,...,\tau\] \[\quad \quad \quad \quad \quad \quad \quad \quad \Sigma _{i=1}^m d _{i,t}T_{i,h} \geq \Sigma _{g=1}^n L _gS_{h,g}, \quad h=1,...,k \quad t=1,...,\tau\] \[\quad \quad \quad \quad \Sigma _{g=1}^n S_{h,g}=1, \quad h=1,...,k.\] \[\quad \quad \quad \quad \quad \quad T_{i,h} \geq 0, \quad i=1,...,m, h=1,...,k.\] I adopt the following notations: \(i \in \{1,2,3,...,m\}\): index of buildings

\(h \in \{1,2,3,...,k\}\): index of clusters

\(t \in \{1,2,3,...,\tau\}\): index of time stamp

\(g \in \{1,2,3,...,n\}\): index of generator types

\(C^h\): the centroid of cluster \(h\)

\(x^i\): the position of building \(i\)

\(w\): wiring cost per kilometer

\(\rho _g\): generator set-up cost of type \(g\)

\(L_g\) and \(U_g\): lower and upper limit of type \(g\) generator

\(d_{i,t}\): electricity consumption (demand) of building \(i\) at time \(t\) in Kw

\(S_{h,g}\): decision variable, equals one if cluster \(h\) adopts type \(g\) generator

\(T_{i,h}\): decision variable, equals one if data point \(x^i\) is closest to centroid \(C^h\) and zero otherwise

It’s worth mentioning that centroid \(C^{h,t+1}\) at iteration \(t+1\) is calculated using centroid at iteration \(t\). The algorithm contains two steps: 1) initiate \(C^h\) and run the LP problem to determine the cluster assignment \(T_{ih}^t\), and 2) update \(C^{h,t+1}=\frac{\Sigma _{i=1}^m T_{i,h}^t x^i}{\Sigma _{i=1}^m T_{i,h}^t}\) until convergence.

An Example

I deploy the model on a real-world dataset which consists the demand and location of 12 buildings.

library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.2.3
library(lpSolve)
## Warning: package 'lpSolve' was built under R version 4.2.2
library(geodist) 
## Warning: package 'geodist' was built under R version 4.2.3
coord <- matrix(c(40.555181921935095, -74.27603516627407,    # Create longitude/latitude matrix
                  40.55620901014459, -74.28006920857614,
                  40.555915557978196,-74.27901778265698,
                  40.560415016521716, -74.27509102871402,
                  40.560480223842745, -74.2754128937913,
                  40.55449718770875, -74.26743063987445,
                  40.55607858711832, -74.28137812655713,
                  40.560741052491544, -74.2889312270376,
                  40.559208669626244, -74.27987608952975,
                  40.55720347736548, -74.28328785934906,
                  40.55609489001049, -74.27689347314684,
                  40.55444827785318, -74.28251538316356),
                  ncol = 2,byrow = TRUE)


d1 = c(283, 9,  30, 67, 56, 530,    24, 37, 98,   2,    6,  48) #demand for month 1
d2 = c(310, 9,  34, 60, 65, 491,    26, 32, 118,    5,  7,  43)
d3 = c(336, 8,  31, 50, 57, 465,    22, 34, 100,    5,  7,  41)
d4 = c(323, 8,  26, 49, 31, 472,    23, 33, 130,    4,  8,  45)
d5 = c(329, 7,  30, 28, 26, 385,    18, 40, 112,    6,  7,  40)
d6 = c(351, 10, 41, 16, 27, 436,    19, 67, 325,    12, 11, 47)
d7 = c(363, 12, 88, 17, 31, 422,    18, 132,    31, 15, 13, 54)
d8 = c(351, 12, 61, 21, 23, 371,    7,  31, 174,    14, 13, 52)
d9 = c(329, 12, 36, 26, 27, 404,    11, 121,    311,    14, 12, 57)
d10 = c(309,    10, 32, 20, 21, 297,    24, 62, 168,    4,  9,  47)
d11 = c(301,    9,  33, 32, 33, 352,    17, 7,  36, 12, 8,  44)
d12 = c(322,    8,  34, 46, 47, 347,    21, 11, 48, 2,  6,  40)

d = cbind(d1,d2,d3,d4,d5,d6,d7,d8,d9,d10,d11,d12)

time = dim(d)[2]
building = dim(d)[1]

cost_g = c(10000,10000,10000)
lower = c(50,100,300)
upper = c(100,300,1000)

cost_g = matrix(cost_g,nrow=1)
generator = length(upper)
upper = matrix(upper,nrow=1)
lower = matrix(lower,nrow=1)

tarrif_surplus = 10
tarrif_gap = 10
wiring = 1

demand <- c(363,12,88,67,65,530,26,132,325,15,13,57)  # this is peak demand calculated as max for each building during the 12-month period.

data = cbind(coord,demand)
data = cbind(data,d)
colnames(data)[1]="x"
colnames(data)[2]="y"
colnames(data)[3]="demand"
data = as.data.frame(data)

A scatter plot is created to visualize the dataset.

graph <- ggplot(data, aes(x=x, y=y,label=demand)) +geom_point(size=5)
graph +  geom_text(hjust=-0.36, vjust=0)

Run the weighted constrained K-means algorithm

# constraints on demand capacity using vector of decision variables x1 (mk by 1). Be aware that x2 will be included later to accommodate varying capacities.
#km <- function(num_cluster){
num_cluster =3

W = matrix(demand,nrow=1)
Au = kronecker(diag(num_cluster), W) 
Al = kronecker(diag(num_cluster), W) 
#bu = c(rep(bu[1],time),rep(bu[2],time),rep(bu[3],time))
#bl = c(rep(bl[1],time),rep(bl[2],time),rep(bl[3],time))

# set up constraints to make sure each data point is assigned to one cluster only using vector of decision variables x1 (mk by 1)
zero = rep(0,length(demand))
zero[1] = 1
B = kronecker(diag(num_cluster),matrix(zero,nrow=1))
one = matrix(rep(1,num_cluster),nrow=1)
B = one%*%B
for (x in 2:length(demand)){
  print(x)
  zero = rep(0,length(demand))
  zero[x] = 1
  temp = kronecker(diag(num_cluster),matrix(zero,nrow=1))
  temp = one%*%temp
  B = rbind(B,temp)
}
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
## [1] 11
## [1] 12
#set up constraints for generator-cluster decision variables using vector of decision variables x2 (nk by 1)
#a cluster can only choose a generator of one type
one = matrix(rep(1,generator),nrow=1)
S = kronecker(diag(num_cluster),one)
s = rep(1,num_cluster)

#set up constraints for binary variables I which equals one if there is SURPLUS and zero otherwise using x1 x2 and x3 (n\tao by 1)
library(magic)
## Warning: package 'magic' was built under R version 4.2.2
## Loading required package: abind
M = as.integer(10^7)
zero = rep(0,time)
zero[1] = 1
demand_t = kronecker(diag(num_cluster),t(d%*%zero)) 
for (x in 2:time){
  zero = rep(0,time)
  zero[x] = 1
  tmp = kronecker(diag(num_cluster),t(d%*%zero)) 
  demand_t = adiag(demand_t,tmp)
}



################################################
select1 = adiag(diag(num_cluster*building),diag(num_cluster*generator)*0,diag(num_cluster*time)*0,diag(num_cluster*time)*0,diag(num_cluster*time)*0)
select1 = cbind(diag(num_cluster*building),matrix(0,nrow=num_cluster*building,ncol=num_cluster*(3*time+generator),))%*%select1

select2 = adiag(diag(num_cluster*building)*0,diag(num_cluster*generator),diag(num_cluster*time)*0,diag(num_cluster*time)*0,diag(num_cluster*time)*0)
select2 = cbind(matrix(0,nrow=num_cluster*generator,ncol=num_cluster*building),diag(num_cluster*generator),matrix(0,nrow=num_cluster*generator,ncol=3*num_cluster*time))%*%select2

select3 = adiag(diag(num_cluster*building)*0,diag(num_cluster*generator)*0,diag(num_cluster*time),diag(num_cluster*time)*0,diag(num_cluster*time)*0)
select3 = cbind(matrix(0,nrow=num_cluster*time,ncol=num_cluster*(generator+building)),diag(num_cluster*time),matrix(0,ncol=2*time*num_cluster,nrow=time*num_cluster))%*%select3

select4 = adiag(diag(num_cluster*building)*0,diag(num_cluster*generator)*0,diag(num_cluster*time)*0,diag(num_cluster*time),diag(num_cluster*time)*0)
select4 = cbind(matrix(0,nrow=num_cluster*time,ncol=((building+generator+time)*num_cluster)),diag(time*num_cluster),matrix(0,nrow=time*num_cluster,ncol=time*num_cluster))%*%select4

select5 = adiag(diag(num_cluster*building)*0,diag(num_cluster*generator)*0,diag(num_cluster*time)*0,diag(num_cluster*time)*0,diag(num_cluster*time))
select5 = cbind(matrix(0,nrow=time*num_cluster,ncol=(num_cluster*(building+generator+time+time))),diag(time*num_cluster))%*%select5



one = matrix(1,nrow=1,ncol=1)
left1 = Au%*%select1-kronecker(diag(num_cluster),kronecker(one,upper))%*%select2
right1 = rep(0,num_cluster)
direction1 = rep("<=",length(right1))

one = matrix(1,nrow=1,ncol=1)
left2 = Au%*%select1-kronecker(diag(num_cluster),kronecker(one,lower))%*%select2
right2 =rep(0,num_cluster)
direction2 = rep(">=",length(right2))

left3 = B%*%select1
right3 = rep(1,building)
direction3 = rep("=",length(right3))

left4 = S%*%select2
right4 = rep(1,num_cluster)
direction4 = rep("=",length(right4))


left51 = M*select3  # binary
left52 = demand_t%*%kronecker(matrix(1,nrow=time,ncol=1),diag(building*num_cluster))%*%select1
left5 = left51+left52
right5 = rep(c(upper),time)
direction5 = rep(">=",length(right5))

left6 = left51+left52 # binary
right6 = rep(c(upper),time)+M
direction6 = rep("<=",length(right6))

left7 = left52 + select4                          #surplus
right7 = rep(c(upper),time)
direction7 = rep(">=",length(right7))

left8 = select4-M*select3                                 #surplus
right8 = rep(0,(time*num_cluster))
direction8 = rep("<=",length(right8))

left9 = select4
right9 = rep(0,time*num_cluster)                  #surplus
direction9 = rep(">=",length(right9))

left10 = select4 + left52 + M*select3             #surplus
right10 = rep(c(upper),time) + M
direction10 = rep("<=",length(right10))

left11 = select5                                  #gap
right11 = rep(0,time*num_cluster)
direction11 = rep(">=",length(right11))

left12 = -left52 + select5 -M*select3             #gap
right12 = rep(c(upper),time)*(-1)
direction12 = rep("<=",length(right12))

left13 = select5 - left52                         #gap
right13 = rep(c(upper),time)*(-1)
direction13 = rep(">=",length(right13))

left14 = select5 + M*select3                       #gap
right14 = rep(M,time*num_cluster)
direction14 = rep("<=",length(right14))


left = rbind(left1,left2,left3,left4,left5,left6,left7,left8,left9,left10,left11,left12,left13,left14)
right =c(right1,right2,right3,right4,right5,right6,right7,right8,right9,right10,right11,right12,right13,right14)
direction = c(direction1,direction2,direction3,direction4,direction5,direction6,direction7,direction8,direction9,direction10,direction11,direction12,direction13,direction14)

set.seed(1)
cx <- runif(n = num_cluster, min = min(coord[,1]), max = max(coord[,1]))
cy <- runif(n = num_cluster, min = min(coord[,2]), max = max(coord[,2]))
cluster_init = cbind(cx,cy)


adj = as.matrix(geodist(rbind(cluster_init[1,],coord)))
coef = adj[1,2:(length(demand)+1)]
for (x in 2:num_cluster){
  adj = as.matrix(geodist(rbind(cluster_init[x,],coord)))
  coef = c(coef,adj[1,2:(length(demand)+1)])
}


coef = c(coef*wiring,rep(c(cost_g),num_cluster),rep(0,num_cluster*time),rep((-1)*tarrif_surplus,num_cluster*time),rep((-1)*tarrif_gap,time*num_cluster))

binary = seq(1,(num_cluster*(building+generator+time)),1)
  
optimum <-  lp(direction="min",
               objective.in = coef,
               const.mat = left,
               const.dir = direction,
               const.rhs = right,
               binary.vec = binary)

result = optimum$solution

###################
for (i in 1:10){
  #print(i)
  position = result[1:(building*num_cluster)]
  xnew = coord[,1]*position
  ynew = coord[,2]*position
  cnewx = c()
  cnewy = c()
  for (x in 1:num_cluster){
    tmpx = sum(xnew[((x-1)*length(demand)+1):((x)*length(demand))])/sum(xnew[((x-1)*length(demand)+1):((x)*length(demand))] != 0)
    tmpy = sum(ynew[((x-1)*length(demand)+1):((x)*length(demand))])/sum(ynew[((x-1)*length(demand)+1):((x)*length(demand))] != 0)
    cnewx = c(cnewx,tmpx)
    cnewy = c(cnewy,tmpy)
  }
  cluster = cbind(cnewx,cnewy)
  
  adj = as.matrix(geodist(rbind(cluster[1,],coord)))
  coef = unname(adj[1,2:(length(demand)+1)])
  
  for (j in 2:num_cluster){
    adj = as.matrix(geodist(rbind(cluster[j,],coord)))
    coef = c(coef,unname(adj[1,2:(length(demand)+1)]))
  }
  coef = c(coef*wiring,rep(c(cost_g),num_cluster),rep(0,num_cluster*time),rep((-1)*tarrif_surplus,num_cluster*time),rep((-1)*tarrif_gap,time*num_cluster))
  binary = seq(1,(num_cluster*(building+generator+time)),1)  
  optimum <-  lp(direction="min",
                 objective.in = coef,
                 const.mat = left,
                 const.dir = direction,
                 const.rhs = right,
                 binary.vec = binary)
  result = optimum$solution
  
  position = result[1:(building*num_cluster)]
  xnew = coord[,1]*position
  ynew = coord[,2]*position
  cnewx = c()
  cnewy = c()
  for (x in 1:num_cluster){
    tmpx = sum(xnew[((x-1)*length(demand)+1):((x)*length(demand))])/sum(xnew[((x-1)*length(demand)+1):((x)*length(demand))] != 0)
    tmpy = sum(ynew[((x-1)*length(demand)+1):((x)*length(demand))])/sum(ynew[((x-1)*length(demand)+1):((x)*length(demand))] != 0)
    cnewx = c(cnewx,tmpx)
    cnewy = c(cnewy,tmpy)
  }
  clusternew = cbind(cnewx,cnewy)
  
  if (norm(cluster-clusternew)<0.00001){
    break
  }
  
  print(i)
  print(norm(cluster-clusternew))
  
}
## [1] 1
## [1] 0.01563692
## [1] 2
## [1] 0.02102637
## [1] 3
## [1] 0.01125312
## [1] 4
## [1] 0.001057148
## [1] 5
## [1] 0.006036758
## [1] 6
## [1] 0.008724689
## [1] 7
## [1] 0.001046419
## [1] 8
## [1] 0.002419855
## [1] 9
## [1] 0.002921309
## [1] 10
## [1] 0.004992281
#return (result%*%coef)}

The result indicates 2 clusters in the optimal assignment.

group =rep(0,length(demand))
for (i in 1:num_cluster){
  group = group + result[((i-1)*length(demand)+1):((i)*length(demand))]*i
}

data$k <- kmeans(data[,1:2],num_cluster)$cluster
data$group <- group

generator_assign = result[(1+num_cluster*building):(num_cluster*(building+generator))]*rep(c(1,2,3),num_cluster)
generator_assign = kronecker(diag(num_cluster),matrix(1,nrow=1,ncol=generator))%*%generator_assign
print(generator_assign)
##      [,1]
## [1,]    3
## [2,]    3
## [3,]    2

We also visualize the clustering result using the electricity consumption of each building.

library(gridExtra)
## Warning: package 'gridExtra' was built under R version 4.2.3
plot1 <- ggplot(data, aes(x=x, y=y,color=group,label=demand)) +geom_point(size=5)
plot1 +  geom_text(hjust=-0.33, vjust=0)

#plot2 <- ggplot(data, aes(x=x, y=y,color=group,label=d2)) +geom_point(size=5)
#plot2 +  geom_text(hjust=-0.33, vjust=0)

#plot3 <- ggplot(data, aes(x=x, y=y,color=group,label=d3)) +geom_point(size=5)
#plot3 +  geom_text(hjust=-0.33, vjust=0)

#plot4 <- ggplot(data, aes(x=x, y=y,color=group,label=d4)) +geom_point(size=5)
#plot4 +  geom_text(hjust=-0.33, vjust=0)

#plot5 <- ggplot(data, aes(x=x, y=y,color=group,label=d5)) +geom_point(size=5)
#plot5 +  geom_text(hjust=-0.33, vjust=0)

#plot6 <- ggplot(data, aes(x=x, y=y,color=group,label=d6)) +geom_point(size=5)
#plot6 +  geom_text(hjust=-0.33, vjust=0)

#grid.arrange(plot1, plot2, plot3,plot4,plot5,plot6, ncol=3, nrow = 2)

We can compare the result with original K-means clustering.

graph <- ggplot(data, aes(x=x, y=y,color=k,label=demand)) +geom_point(size=5) +  geom_text(hjust=-0.33, vjust=0)
graph 

Extensions

There are several directions to extend the model:

  1. Introduce heterogeneous solar panel generators with different capacity and setup cost.

  2. Consider the electricity gap/surplus between microgrids and state grids and model the electricity trading.

References

Bradley, P. S., Bennett, K. P., & Demiriz, A. (2000). Constrained k-means clustering. Microsoft Research, Redmond, 20(0), 0.