In the era of the Covid-19 pandemic, slowing the spread of the virus is critical. In the absence of a Covid-19 vaccine, social distancing is the best preventive measure to slow the spread so that the number of sick people doesn't cause the healthcare system to collapse.
Reducing social exposure by 75% through social distancing can reduce the spread of the virus by up to 99%.
Source: visualcapitalist.com
Until a vaccine is available, this could mean some form of social distancing will need to exist for many months, if not years.
- World Economic Forum
This is why social distancing is important.
Of course, enforcing social distancing in public places is not easy, especially in a country as big and diverse as Indonesia. Implementing a social distancing detector on CCTV or live camera feeds can help the authorities or government track and monitor social distancing in public places at a large scale.
The primary goal of this project is to detect multiple persons in a video or live camera feed and determine whether they are social distancing or not.
Source: Landing AI
This goal can be achieved through computer vision, using a deep learning model named YOLO (You Only Look Once). This enables live detection of social distancing through CCTV or a live camera. It also allows the detections to be recorded in a database and displayed on a dashboard in real time.
A lot of people and organizations are trying to solve this problem. One example is Landing AI; you can see their result in the image used above in the goal section. What makes this project different is the implementation: it focuses on public areas in Indonesia, which has diverse environments rarely found in other, developed countries, such as traditional markets, night markets, and crowded bus stations.
If the main goal is achieved, this project can be extended to a few things.
Mask detection
Beyond social distancing detection, this project can be extended to live mask detection as well.
Live heatmap
Real-time reporting is one of the important things from a user's perspective. A live heatmap can help the user monitor social distancing across multiple areas in real time. This can be achieved by plotting the detections on a map, creating a heatmap of where people are social distancing or not; a rough sketch of this idea follows below.
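As a rough illustration (this is not part of the current pipeline), a heatmap like this could be built with the leaflet and leaflet.extras R packages. The camera coordinates and violation counts below are entirely hypothetical placeholders.

library(leaflet)
library(leaflet.extras)

# Hypothetical aggregated detections: one row per monitored location
monitor <- data.frame(
  lng        = c(106.8272, 106.8229),  # hypothetical Jakarta coordinates
  lat        = c(-6.1754, -6.1352),
  violations = c(42, 7)                # hypothetical counts of violations
)

leaflet(monitor) %>%
  addTiles() %>%
  addHeatmap(lng = ~lng, lat = ~lat, intensity = ~violations, radius = 25)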
The dataset used in this RPubs comes from crowdhuman.org, an open-source human dataset for training computer vision models.
The dataset comes with a large number of images and labeled data: the images in .jpg format and the annotations in .odgt format. You may be wondering why the training data are images when the model will be handling video. Since a video consists of frames, which are still images, the model can be trained on images.
The data that we've imported should be annotated based on these images.
odgt is a file format in which each line is a JSON object containing all the annotations for the corresponding image. We prefer this format since it is reader-friendly.
The annotation format should look like this.
{
  "ID": image_filename,
  "gtboxes": [gtbox],
}

gtbox {
  "tag": "person" or "mask",
  "vbox": [x, y, w, h],
  "fbox": [x, y, w, h],
  "hbox": [x, y, w, h],
  "extra": extra,
  "head_attr": head_attr,
}

extra {
  "ignore": 0 or 1,
  "box_id": int,
  "occ": int,
}

head_attr {
  "ignore": 0 or 1,
  "unsure": int,
  "occ": int,
}
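For a concrete illustration, a single annotation line might look like the following. The ID and fbox values are adapted from the first row of the preprocessed output shown later; the vbox, hbox, extra, and head_attr values here are placeholders.

{"ID": "273271,1b86f000bc5b77bf", "gtboxes": [{"tag": "person", "vbox": [672, 99, 308, 1117], "fbox": [672, 99, 308, 1117], "hbox": [672, 99, 308, 1117], "extra": {"box_id": 0, "occ": 0}, "head_attr": {"ignore": 0, "unsure": 0, "occ": 0}}]}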
Since there's no library to convert odgt straight to a dataframe, we need to create a custom function, odgtToList, that converts it to a list; later we can convert that list to a dataframe.
Because an odgt file is basically one JSON object per line, we first split the file into lines, then convert each JSON string to a list with the fromJSON function from the rjson library. A minimal sketch of this helper is shown below.
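The exact original implementation isn't shown here, so treat this as a sketch based on the description above.

library(rjson)
library(dplyr)

# Read an .odgt file and parse each line (a JSON object) into an R list
odgtToList <- function(path) {
  lines <- readLines(path)   # one JSON object per line
  lapply(lines, fromJSON)    # parse each line into a nested list
}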
Then apply it to the annotation file.
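Assuming the annotation file lives at data/annotation.odgt (the exact path is an assumption), applying the helper looks like this:

annot <- odgtToList("data/annotation.odgt")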
After converting to a list, it's much easier and more flexible to convert it to a dataframe.
The first step is to create an empty dataframe with the structure we want. In this case, we want the dataframe to have the filename (img), so that later we can match annotations to images; the label of the annotation; and, most importantly, the coordinates of the annotation: x, y, w (width), and h (height). The coordinates are important because the model needs them to learn where to put the bounding box.
df <- data.frame(
  img=character(0),
  label=character(0),
  x=numeric(0),
  y=numeric(0),
  w=numeric(0),
  h=numeric(0),
  stringsAsFactors = F
)

Okay, now we can convert the list format to a dataframe.
row <- 0
for (i in 1:length(annot)) {
  for (index in 1:length(annot[[i]]$gtboxes)) {
    row <- row + 1  # running row counter, so boxes from earlier images aren't overwritten
    item.img <- paste(annot[[i]]$ID, ".jpg", sep="")
    item.fbox <- annot[[i]]$gtboxes[[index]]$fbox %>% lapply(as.numeric)
    item.label <- annot[[i]]$gtboxes[[index]]$tag
    df[row,] <- c(
      img=item.img,
      label=item.label,
      x=item.fbox[1],
      y=item.fbox[2],
      w=item.fbox[3],
      h=item.fbox[4]
    )
  }
}

After preprocessing, it will look something like this.
## img label x y w h
## 1 273271,1b86f000bc5b77bf.jpg person 672 99 308 1117
## 2 273271,1b86f000bc5b77bf.jpg person 132 209 435 652
## 3 273271,1b86f000bc5b77bf.jpg person 125 130 291 678
## 4 273271,1b86f000bc5b77bf.jpg person 365 225 209 564
## 5 273271,1a0d6000b9e1f5b7.jpg person 527 103 128 318
## 6 273271,1a0d6000b9e1f5b7.jpg person 338 126 131 311
The annotation data has been preprocessed. Now let's see how the images are displayed with their bounding boxes.
library(magick)

list.img <- c()
for (curr.img in unique(df$img)) {
  human_image <- image_read(paste("data/", curr.img, sep=""))
  img <- image_draw(human_image)
  currentDf <- df %>% filter(img == curr.img)
  for (index in 1:nrow(currentDf)) {
    item <- currentDf[index,]
    rect(item$x, item$y, item$x+item$w, item$y+item$h, border="green", lwd=2)
    text(item$x, item$y, item$label, cex = 2, col="green")
  }
  dev.off()  # close the drawing device so the annotations are committed to the image
  list.img <- c(list.img, img)
}

## # A tibble: 1 x 7
## format width height colorspace matte filesize density
## <chr> <int> <int> <chr> <lgl> <int> <chr>
## 1 JPEG 1000 667 sRGB TRUE 0 72x72
## # A tibble: 1 x 7
## format width height colorspace matte filesize density
## <chr> <int> <int> <chr> <lgl> <int> <chr>
## 1 JPEG 1024 629 sRGB TRUE 0 72x72
The model that fits this case is YOLO (You Only Look Once), a state-of-the-art, real-time object detection system and one of the most robust and advanced models for real-time object detection so far. From the bounding boxes created by YOLO, we can then measure distance by calculating the pixel distance between each pair of bounding boxes. The distance measurement can be extended from a simple calculation to a small neural network that determines whether a person is social distancing or not.
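As a minimal sketch of the pixel-based distance check, given a dataframe of bounding boxes like currentDf above: the pixel threshold here is an assumption, and a real deployment would calibrate pixels to real-world distance per camera.

# Flag persons whose bottom-center points are closer than min.dist pixels.
# min.dist is a hypothetical threshold; calibrate it per camera in practice.
socialDistanceFlags <- function(boxes, min.dist = 150) {
  cx <- as.numeric(boxes$x) + as.numeric(boxes$w) / 2  # horizontal center of the box
  cy <- as.numeric(boxes$y) + as.numeric(boxes$h)      # bottom of the box, approximating the feet
  d  <- as.matrix(dist(cbind(cx, cy)))  # pairwise Euclidean pixel distances
  diag(d) <- Inf                        # ignore distance to self
  apply(d, 1, min) < min.dist           # TRUE = too close to someone else
}

# Example: flag violations among the boxes detected in one frame
violations <- socialDistanceFlags(currentDf)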