Deleting the Ground Point data set

Introduction

The process of deleting the point data set coresponf to the ground and trees has several steps. First we load the libraries and the sample data set. Then we are going to use k-mean algorithm to devidie the data set into 30 blocks. Then we apply the local algorithm and density clustering to indetify and delet the ground data set.

Loading Packages

To implement our algorithm, we need to use a variety of R packages.

library(sp)
library(spatstat)
library(raster)
library(rgeos)
library(rgl)
library(rgdal)
library(ggplot2)
library(arules)
library(rLiDAR)

Now we are ready to load our sample data set and take a look at the first 6 rows of the data. The data is going to be loaded as a matrix. We convert it to a data frame.

main=readLAS("main.las")
m=as.data.frame(main)
head(m)
##          X        Y      Z Intensity ReturnNumber
## 1 404960.2 280725.3 147.19         9            2
## 2 404959.3 280725.8 147.24        13            1
## 3 404958.5 280726.3 147.40        24            1
## 4 404957.6 280727.0 147.07        22            1
## 5 404956.7 280727.6 147.01        47            1
## 6 404957.0 280727.6 147.05        36            1

As you can see, we have 5 variables. The coordinates, intesity and return numbers. The dimension of the data set and the range of each coordinates are:

## The Dimension:  1364499 5
## Range of X:  404447.2 405420.7
## Range of Y:  280724.5 281443.3
## Range of Z:  121.94 191.75

The first thing we need to do is to shift all three coordinates such that the minimum vector in the data set becomes \((0,0,0)\). That makes the calculations much more faster and the 3D visualization much more easier.

m$X=m$X-range(m$X)[1]
m$Y=m$Y-range(m$Y)[1]
m$Z=m$Z-range(m$Z)[1]

Now we are going to use k-mean algorithm to cluster the data set with regard to \(X\) and \(Y\) coordinates. That makes the implementation of deleting ground algorithm faster and more effective.

qmain=kmeans(m[,1:2],30,iter.max = 30)
m$color=qmain$cluster
l=list()
for(i in 1:30){
      l[[i]]=m[m$color==i,]
}

Since rendering and plotting the whole data set needs huge memory, we are going to visualize each cluster. Here are the first 2 clusters. You can use your mouse to zoom in and rotate the plots.

mfrow3d(1,2)
for(i in 1:2){
      ql=kmeans(l[[i]][,3],10)
      color=ql$cluster
      l[[i]]$color=color
      plot3d(l[[i]]$X,l[[i]]$Y,l[[i]]$Z,aspect = F,col = l[[i]]$color,xlab = "X",ylab = "Y",zlab = "Z")
      if(i!=2){
            next3d()
      }
}

You must enable Javascript to view this page properly.

Local maximum and regression planes algorithm

The process of this algorithm is based on the local maximums of \(Z\) coordinate. First we devide the range of \(Z\) coordinate into intervals. Then we look at the number of points in each interval and we calaculate the local maximums of those frequencies. The first local maxumims give us the height of the ground. The following function takes a point cloud data set and gives us a data frame containing the intervals,frequencies and the upper bounds for each interval. The following is te code for the function and the result of applying the fucntion on the first cluster.

heightG=function(w,intervals=30){
      start=floor(range(w$Z)[1])
      end=ceiling(range(w$Z)[2])
      shift=round((end-start)/intervals,2)
      res=data.frame()
      for(i in 1:intervals){
            r=start+shift
            m=sum(w$Z>=start&w$Z<r)
            res[i,1]=paste("[",start,",",r,")")
            res[i,2]=m
            res[i,3]=r
            start=r
      }
      colnames(res)=c("Intervals","Frequency","height")
      res
}
w=heightG(l[[1]],intervals = 30)
head(w,10)
##            Intervals Frequency height
## 1     [ 13 , 14.17 )       864  14.17
## 2  [ 14.17 , 15.34 )      1149  15.34
## 3  [ 15.34 , 16.51 )      1317  16.51
## 4  [ 16.51 , 17.68 )      3443  17.68
## 5  [ 17.68 , 18.85 )      3483  18.85
## 6  [ 18.85 , 20.02 )      2391  20.02
## 7  [ 20.02 , 21.19 )      4397  21.19
## 8  [ 21.19 , 22.36 )      2213  22.36
## 9  [ 22.36 , 23.53 )      2037  23.53
## 10  [ 23.53 , 24.7 )      1223  24.70

As you can see in the previous example, the first local maximum occures at the fifth row, when the height is less than \(18.85\). Let’s visualize the cluster again:

ww=l[[1]]
l1=rep(1,sum(ww$Z<18.85))
l2=rep(10,sum(ww$Z>=18.85))
coll=c(l1,l2)
ww=ww[order(ww$Z),]
ww$coll=coll
open3d()
## wgl 
##   1
 plot3d(ww$X,ww$Y,ww$Z,aspect = F,col = ww$coll,xlab = "X",ylab = "Y",zlab = "Z")

You must enable Javascript to view this page properly.

You can rotate and zoom in to see more details. All the points with height less than \(18.85\) (the first local maximum) are shown by black. As you can see, that gives us the ground data set. The second local maximum is \(21.19\) as you see in the previous table. Let’s plot that

ww=l[[1]]
l1=rep(1,sum(ww$Z<21.19))
l2=rep(10,sum(ww$Z>=21.19))
coll=c(l1,l2)
ww=ww[order(ww$Z),]
ww$coll=coll
open3d()
## wgl 
##   3
 plot3d(ww$X,ww$Y,ww$Z,aspect = F,col = ww$coll,xlab = "X",ylab = "Y",zlab = "Z")

You must enable Javascript to view this page properly.

As you can see, we can target more ground data points if we choose the second interval. Now the question is how to choose the best interval. To answer this question, we need to look at regression plane fited in each interval.

loc.max=diff(diff(w$Freq)>=0)<0
ind=which(loc.max==TRUE)+1
c=c()
for(i in 1:length(ind)){
      c[i]=w[ind[i],3]
}
open3d()
## wgl 
##   5
mfrow3d(1,3)
s1=l[[1]][l[[1]]$Z<c[1],]
x=s1$X
y=s1$Y
z=s1$Z
fit1=lm(z~x+y)
coefs1=coef(fit1)
a <- coefs1["x"]
b <- coefs1["y"]
e <- -1
d <- coefs1["(Intercept)"]
plot3d(x,y,z,aspect = F,col="red")
planes3d(a,b,e,d,alpha=0.3)

s2=l[[1]][l[[1]]$Z<c[2],]
x=s2$X
y=s2$Y
z=s2$Z
fit2=lm(z~x+y)
coefs2=coef(fit2)
a <- coefs2["x"]
b <- coefs2["y"]
e <- -1
d <- coefs2["(Intercept)"]
plot3d(x,y,z,aspect = F,col="red")
planes3d(a,b,e,d,alpha=0.3)

s3=l[[1]][l[[1]]$Z<c[3],]
x=s3$X
y=s3$Y
z=s3$Z
fit3=lm(z~x+y)
coefs3=coef(fit3)
a <- coefs3["x"]
b <- coefs3["y"]
e <- -1
d <- coefs3["(Intercept)"]
plot3d(x,y,z,aspect = F,col="red")
planes3d(a,b,e,d,alpha=0.3)

You must enable Javascript to view this page properly.

Let’s look at the maximum distance between the points and the regression planes.

max(abs(fit1$residuals))
## [1] 3.984811
max(abs(fit2$residuals))
## [1] 5.446323
max(abs(fit3$residuals))
## [1] 11.52803

As you can see, the gap between the first two numbers is much less than the second gap. We can define a threshold in the final fucntion that determines which interval should we choose base on the gaps between distances. Here is the code for the final function that deletes the ground by using local maximums and regression planes.

delGh=function(m,threshold=2){
      w=heightG(m,intervals = 30)
      loc.max=diff(diff(w$Freq)>=0)<0
      ind=which(loc.max==TRUE)+1
      c=c()
      for(i in 1:length(ind)){
              c[i]=w[ind[i],3]
      }
      l=1
      s1=m[m$Z<c[1],]
      x=s1$X
      y=s1$Y
      z=s1$Z
      fit1=lm(z~x+y)
      i=2
      while(l!=1&i<=length(c)){
            s2=m[m$Z<c[i],]
            x=s2$X
            y=s2$Y
            z=s2$Z
            fit2=lm(z~x+y)
            i=i+1
            if(max(abs(fit2$residuals))-max(abs(fit1$residuals))>threshold){
                  l=1
            }
            fit1=fit2
      }
      c[i]
}

delGh(l[[1]])
## [1] 21.19

Let’s apply this function on the first cluster.

ww=l[[1]]
m1=ww[ww$Z>=delGh(l[[1]],threshold = 2),]
open3d()
## wgl 
##   7
mfrow3d(1,2)
plot3d(ww$X,ww$Y,ww$Z,aspect = F,col = "red",xlab = "X",ylab = "Y",zlab = "Z")
next3d()
plot3d(m1$X,m1$Y,m1$Z,aspect = F,col = "blue",xlab = "X",ylab = "Y",zlab = "Z")

You must enable Javascript to view this page properly.

The left side plot is the original data set and the left side plot is the data set after deleting process. Let’s see how many points we have deleted:

dim(ww)[1]-dim(m1)[1]
## [1] 17044

Let’s go through the process with the second cluster now.

ww=l[[2]]
m1=ww[ww$Z>=delGh(l[[1]],threshold = 2),]
open3d()
## wgl 
##   9
mfrow3d(1,2)
plot3d(ww$X,ww$Y,ww$Z,aspect = F,col = "red",xlab = "X",ylab = "Y",zlab = "Z")
next3d()
plot3d(m1$X,m1$Y,m1$Z,aspect = F,col = "blue",xlab = "X",ylab = "Y",zlab = "Z")

You must enable Javascript to view this page properly.

Let’s repeat the process with the second cluster now.

ww=l[[3]]
m1=ww[ww$Z>=delGh(l[[3]],threshold = 2),]
open3d()
## wgl 
##  11
mfrow3d(1,2)
plot3d(ww$X,ww$Y,ww$Z,aspect = F,col = "red",xlab = "X",ylab = "Y",zlab = "Z")
next3d()
plot3d(m1$X,m1$Y,m1$Z,aspect = F,col = "blue",xlab = "X",ylab = "Y",zlab = "Z")

You must enable Javascript to view this page properly.

Now that we are sure that the algorithm works properly, we can apply the algorithm to all clusters.

a=list()
for(i in 1:30){
      ww=l[[3]]
      a[[i]]=ww[ww$Z>=delGh(l[[3]],threshold = 2),]
}

res=a[[1]]
for(i in 2:30){
      res=rbind(res,a[[i]])
}

dim(m)[1]-dim(res)[1]
## [1] 318639

As you can see, we can delete \(318639\) ground points by using this algorithm.