First off, I would like to acknowledge the blogs/tutorials of Simon Barthelme and Manuel Gimond, from which this code was adapted.

Links:

https://www.r-bloggers.com/new-version-of-imager-package-for-image-processing/

https://mgimond.github.io/Spatial/point-pattern-analysis-in-r.html

#———————————————–

I’ve been fooling around with some of the image analysis packages available in R and recently came across "imager". As I suspected, these tools can be applied to bird's-eye-view images of seabird colonies and nesting sites.

Here’s an example of how it can be applied. I haven’t manually counted the birds to verify accuracy yet (this was just for fun), but I’d guess the estimate is within about 90% of the true count.

So basically what I’ve done here is: process the image to separate the birds from the background, grow and outline the bird pixels, overlay the outlines on the image for verification, and simply take the length of the resulting list object to get the count of birds (n = 333).

This took about 30 minutes and no machine learning to accomplish, even though machine learning seems to be the current trend in super-colony counting.

library(imager)   # image loading and processing
library(spatstat) # spatial point pattern analysis
library(ggplot2)  # general plotting
library(ggpubr)   # publication-ready / arranged ggplots
library(ecespa)   # provides haz.ppp() for building ppp objects from coordinates

So first, we want to set the working directory to the folder with the images we wish to process, then import one. I got this image from a random Google search for nesting seabirds.

#-----------------------------------------
#setwd("My/Working/directory")
list.files()
##  [1] "100915040_262380021578005_7115576343391109120_n.jpg"
##  [2] "100925892_244526526839177_6184116465101176832_n.jpg"
##  [3] "Bird_counting_image_process.R"                      
##  [4] "card00872_fr.jpg"                                   
##  [5] "heatmaps and density.png"                           
##  [6] "Image analysis"                                     
##  [7] "image_point_analysis.R"                             
##  [8] "image_processing.html"                              
##  [9] "image_processing.Rmd"                               
## [10] "images.jpg"                                         
## [11] "images.png"                                         
## [12] "quadrat counts.png"                                 
## [13] "remove NN.png"                                      
## [14] "RPubs - Bird Nesting Analysis in R.html"            
## [15] "rsconnect"                                          
## [16] "shapes to points.png"                               
## [17] "testplot.png"                                       
## [18] "wildlifetrusts_40661540865.jpg"                     
## [19] "zm5ijHsuTaLdWdWW29zCfG-1200-80.jpg"
#-----------------------------------------

# Import the image file
im <- load.image("wildlifetrusts_40661540865.jpg")
plot(im)

Then we want to check our image to see whether greyscale or its negative would be more appropriate for identifying our species (the negative if the animals are light colored, greyscale if they are dark).

# greyscale
im.g <- grayscale(im)
# since the birds are white, let's turn the image negative
im.g <- -im.g
plot(im.g)

Then we want to check which threshold percentage captures most of our animals and the least of the background.

# determine the best threshold to identify animals
layout(t(1:3))
threshold(im.g,"25%") %>% plot
threshold(im.g,"15%") %>% plot 
threshold(im.g,"10%") %>% plot

We also want to make sure uneven lighting isn’t affecting what gets captured by our image processing, so let’s remove some of it by fitting and subtracting a linear trend.

# create df and summarize
df <- as.data.frame(im.g)
head(df,5)
##   x y      value
## 1 1 1 -0.2503529
## 2 2 1 -0.2431373
## 3 3 1 -0.3292941
## 4 4 1 -0.3385882
## 5 5 1 -0.3229020
m <- lm(value ~ x + y,data=df) #linear trend
summary(m)
## 
## Call:
## lm(formula = value ~ x + y, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.52393 -0.06093  0.02654  0.09531  0.35363 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -4.084e-01  1.257e-03 -324.84   <2e-16 ***
## x           -3.056e-04  4.730e-06  -64.60   <2e-16 ***
## y            7.841e-05  6.313e-06   12.42   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1423 on 90217 degrees of freedom
## Multiple R-squared:  0.04577,    Adjusted R-squared:  0.04575 
## F-statistic:  2164 on 2 and 90217 DF,  p-value: < 2.2e-16
# extract and remove the luminance based on linear trend
layout(t(1:2))
im.f <- im.g-fitted(m)
plot(im.g,main="Before")
plot(im.f,main="After trend removal")

Now we begin our morphological operations. Essentially, we threshold the detrended image down to a binary pixel set that captures only the animals, then grow those pixels so they are large enough to outline. There are a few ways to do this:

# morphological operations
im.t <- threshold(im.f,"3%")
px <- as.pixset(1-im.t) #Convert to pixset
plot(px)

# enlarge the pixset by 2 pixels
grow(px,2) %>% plot(main="Growing by 2 pixels") # this one is better

# shrink the pixset by 2 pixels
shrink(px,2) %>% plot(main="Shrinking by 2 pixels")

# check side by side
layout(t(1:2))
plot(px,main="Original")
shrink(px,2) %>% grow(2) %>% plot(main="Shrink, then grow")

# growing then shrinking (morphological closing) fills in small holes
layout(t(1:2))
plot(px,main="Original")
grow(px,2) %>% shrink(2) %>% plot(main="Grow, then shrink")

Now we’ll check our processed image, lay the outlines over the animals, then split the outlines into separate connected components to generate a count. Although the count for this particular image is likely not exact, it should be within roughly 90% of the true number.

Essentially this method could significantly reduce processing time for animal counts; however, it is likely limited by how much variation there is in the background. For example, if there are scattered objects that are the same color as the animals being counted, they will likely be included in the count. I have not implemented a fix for this yet (one possible starting point, sketched after the counting code below, is to filter the connected components by area), but hopefully someone can come along and make improvements to the code I provide here!

par(mfrow=c(1,1))
# outline the detected birds over the original image (fill holes, remove small specks, then highlight)
plot(im)
fill(px,3) %>% clean(1) %>% highlight

## Split into connected components (individual birds)
pxs <- split_connected(px)
## Compute their respective area
area <- sapply(pxs,sum)
## count the birds
length(pxs)
## [1] 333
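
One possible mitigation for the background-object problem mentioned earlier is to screen the connected components by area before counting. This is only a sketch: the cut-off values below are arbitrary placeholders, not tuned values, and would need adjusting for each image.

# drop components that are implausibly small or large to be a single bird
# (the 5- and 500-pixel limits are arbitrary examples, not tuned values)
keep <- area > 5 & area < 500
pxs.filtered <- pxs[keep]
length(pxs.filtered)  # count after area filtering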

Now we want to convert the outlines to central points for further analysis.
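
The code that generated the point coordinates is not shown; a minimal sketch of one way to do it is to take the centroid (mean x and y of the pixels) of each connected component. The xy_remnn data frame used below is assumed to be derived from something like these centroids, possibly after removing near-neighbor duplicates.

# sketch: centroid of each connected component
# where() returns the x/y coordinates of the TRUE pixels in a pixset
centroids <- do.call(rbind, lapply(pxs, function(p) {
  coords <- where(p)
  data.frame(x = mean(coords$x), y = mean(coords$y))
}))
head(centroids)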

Using these newly generated points, we can start doing some density analyses.

#---------------------------------------------------
#####################################################
############ Count and density statistics############
#####################################################

# make a new dataframe with only x,y columns
df <- xy_remnn[,c(1,2)]

# convert to a ppp object for spatstat analyses
df_ss    <- haz.ppp(df)

# plot to ensure match with new ppp object
par(mfrow=c(1,2))
plot(im);
plot(df_ss, main=NULL, cols="red", pch=20, add =TRUE)
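
For reference, the same conversion can be done with spatstat alone; the sketch below assumes the full image extent as the observation window, which may differ slightly from the window haz.ppp() infers from the coordinates themselves.

# alternative (not used further below): build the ppp object directly,
# using the image dimensions as the observation window
win    <- owin(xrange = c(0, width(im)), yrange = c(0, height(im)))
df_ss2 <- ppp(x = df[,1], y = df[,2], window = win)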


# make quadrats for counts
# nx = number of quadrats in the x direction, ny = number in the y direction
Q <- quadratcount(df_ss, nx= 6, ny=3)
plot(im)
plot(df_ss, pch=20, cols="red", main=NULL, add=TRUE)  # Plot points
plot(Q, add=TRUE, col = "white", size=13, lwd=3, cex = 2)  # Add quadrat grid

par(mfrow=c(2,2))
# Compute the density for each quadrat
Q.d <- intensity(Q)
plot(im)
# Plot the densities as a heatmap
plot(intensity(Q, image=TRUE),las=1, col=rev(heat.colors(25, alpha=0.3)), add=TRUE)  # Plot density raster
plot(df_ss, pch=20, cex=0.6, cols=rgb(0,0,0,0.5), add=TRUE)  # Add points (transparency set via rgb alpha)

library(viridis)  # inferno() color palette for the density maps

# density with default function
K1 <- density(df_ss) # Using the default bandwidth
plot(im)
plot(K1, main=NULL, las=1, col=inferno(25, alpha=0.5), add=TRUE)
contour(K1, add=TRUE, col = "white")

# set the bandwidth sigma
K2 <- density(df_ss, sigma=50) # Using a 50-pixel bandwidth
plot(im)
plot(K2, main=NULL, las=1, col=inferno(25, alpha=0.5), add=TRUE)
contour(K2, add=TRUE, col = "white")

# change and test other density functions
K3 <- density(df_ss, kernel = "disc", sigma=50) # Disc kernel with a 50-pixel bandwidth
plot(im)
plot(K3, main=NULL, las=1, col=inferno(25, alpha=0.5), add=TRUE)
contour(K3, add=TRUE, col = "white")

Now we can check some of the distance statistics as well.
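
The code for this step is not shown; a minimal sketch of the kind of distance-based summaries spatstat provides (nearest-neighbor distances and a distance map) would look something like this:

# sketch: basic distance statistics for the bird point pattern
nnd <- nndist(df_ss)    # nearest-neighbor distance for each bird
summary(nnd)            # distribution of nearest-neighbor distances
plot(distmap(df_ss), main = "Distance to nearest bird")  # distance map over the window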

By using Monte Carlo simulation together with an Average Nearest Neighbor (ANN) analysis, we can also generate a pseudo p-value to test the null hypothesis of a random distribution.

## [1] 10.95054

## [1] 0.003333333
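
The code behind the two values above is not shown; one common way to get an ANN-based pseudo p-value is a Monte Carlo test of the observed average nearest-neighbor distance against randomly generated patterns, roughly as sketched below. The 299 simulations are an assumed number (giving a minimum possible pseudo p-value of 1/300).

# sketch: Monte Carlo test of the observed ANN against complete spatial randomness
obs.ann <- mean(nndist(df_ss))   # observed average nearest-neighbor distance

n.sim   <- 299                   # assumed number of simulations
sim.ann <- numeric(n.sim)
for (i in 1:n.sim) {
  # random pattern with the same number of points in the same window
  rand.p     <- runifpoint(n = df_ss$n, win = Window(df_ss))
  sim.ann[i] <- mean(nndist(rand.p))
}

# pseudo p-value: proportion of simulations at least as clustered (ANN as small) as observed
p.pseudo <- (sum(sim.ann <= obs.ann) + 1) / (n.sim + 1)
p.pseudo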

I’m excited to see what others will do with this code and what other possible applications there are in ecology! I hope this helps someone somewhere!