GESC 258 - Geographical Research Methods
Access the online version of this document at https://rpubs.com/majidhojati/gesc258lab3
This lab has two parts. Your lab demonstrator will teach you the first part on Feb 4th and the second part on Feb 10th. In the first part we review a discrete distribution.
Part I: n-year floods as a binomial distribution
The binomial distribution applies to experiments that have two outcomes (success or failure). Tossing a coin more than once is an example of a binomial experiment. There are many real-life examples of the binomial distribution, such as the number of side effects from a medication, or the number of spam emails/calls you might receive per day. A good example of the binomial distribution in geography is the 100-year flood. A 100-year flood is a flood event that has a 1 in 100 chance (0.01 probability) of being equaled or exceeded in any given year. Similarly, we have the 50-year flood, which is a flood event that has a 1 in 50 chance (0.2 probability) of occurring. Read more about this concept here: http://bcn.boulder.co.us/basin/watershed/flood.html. The binomial distribution can be used to calculate the probability of experiencing a certain number of 50-year floods in a specified number of years. In this lab we are going to calculate the probability of having x 50-year floods in a 50-year period.
To get started we will first define our binomial model. We know the probability of a 50-year flood happening in each year is 0.2, so this is the probability of success, or \(p=0.2\). What is the probability of failure? We can calculate it using this equation:
\[q = 1 - p\]
As you recall, a binomial distribution has two parameters, n and p. As we mentioned in the first paragraph, we are going to calculate the probability of x 50-year floods in a 50-year period, so \(n=50\). We can now define our binomial model as \(X \sim B(50, 0.2)\). Let's start the calculations in R. First we need to define our variables. We can define them as follows; please pay attention to the comments on each line of code.
n <- 50  # number of binomial tests
p <- 0.2 # probability of success
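If you also want to store the probability of failure from the equation above, you could add the following line (an optional extra; the later code does not use q):
q <- 1 - p # probability of failure, 1 - 0.2 = 0.8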
If you want to find out the probability of having 3 50-year floods in 50 years, you need to find \(P(x=3)\). In R you can use the following function:
px_3 <- dbinom(3,n,p) # n and p are already defined in the previous part. We store the output of dbinom as px_3
px_3
## [1] 0.004370946
Now we want to calculate the probability values for all the possible values of X. Recall that the domain of the binomial distribution runs from 0 to n. So we could run dbinom(x,n,p) 50 times and change the x parameter every time. A faster way is to define a vector of numbers from 1 to 50 and pass it as x to dbinom(x,n,p). Let's do it in R:
x <- 1:n # We define a vector of numbers starting from 1 and ending with n=50
x
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
## [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
If you pass the variable x to the dbinom function, it will calculate the binomial probability for each of these values of X.
Px <- dbinom(x,n,p)
Px
## [1] 1.784060e-04 1.092737e-03 4.370946e-03 1.283965e-02 2.953120e-02
## [6] 5.537101e-02 8.701158e-02 1.169218e-01 1.364088e-01 1.398190e-01
## [11] 1.271082e-01 1.032754e-01 7.547049e-02 4.986443e-02 2.991866e-02
## [16] 1.636177e-02 8.180883e-03 3.749571e-03 1.578767e-03 6.117722e-04
## [21] 2.184901e-04 7.200240e-05 2.191378e-05 6.163249e-06 1.602445e-06
## [26] 3.852031e-07 8.560068e-08 1.757871e-08 3.333894e-09 5.834314e-10
## [31] 9.410184e-11 1.396824e-11 1.904760e-12 2.380950e-13 2.721086e-14
## [36] 2.834465e-15 2.681250e-16 2.293175e-17 1.763980e-18 1.212737e-19
## [41] 7.394735e-21 3.961465e-22 1.842542e-23 7.328292e-25 2.442764e-26
## [46] 6.637946e-28 1.412329e-29 2.206764e-31 2.251800e-33 1.125900e-35
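As an optional sanity check (an added aside), the probabilities of all possible outcomes should sum to 1. Here the total falls just short of 1 only because x = 0 was not included in our vector x:
sum(Px) # slightly less than 1, because P(x=0) is missing
sum(Px) + dbinom(0,n,p) # adding P(x=0) brings the total to 1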
In the Px output above, each number ends with something like e-04 (e.g. 1.784060e-04). This is called scientific notation; you can read more about it here: https://en.wikipedia.org/wiki/Scientific_notation
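If you find scientific notation hard to read, R can print the same number in plain decimal form (an optional aside):
format(1.784060e-04, scientific = FALSE) # prints "0.000178406"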
In the Px output, the first value (Px[1]) is the probability of having x=1, i.e. one 50-year flood in n=50 years. We can say \(P(x=1)=0.00017\). To find the maximum of these probabilities we can use max(Px), which is:
max(Px)
## [1] 0.139819
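If you also want to know which value of x has this maximum probability, which.max() can tell you its position in the vector (an optional aside):
which.max(Px) # position of the largest probability; here it is 10, so x = 10 is the most likely count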
Let's start plotting. Here we first set a few parameters for our plot and make a new blank plot using the plot.new() function. Using the plot.window() function we set our plot's x and y ranges. As you see, we set the x axis limits using the xlim parameter and the y axis limits using the ylim parameter. Both xlim and ylim should be given as a range, meaning they should contain the min and max of the values being plotted. On the x axis we want to plot the x values (the vector of numbers between 1 and n=50), so xlim=c(1,50); on the y axis we want to plot the probability values, which run from 0 to max(Px), so ylim=c(0, max(Px)+0.2). We add 0.2 to max(Px) to leave an extra gap at the top of the chart.
In the next line of code we use barplot to plot the values of Px on the chart. The first parameter of the barplot function is the height of the bars, which should be a vector of numbers. We already have the bar heights in Px, so we just need to pass it to the function. Pay attention to the col parameter, which sets the color; you can change it to whatever color you want. The space parameter is the space between bars in the barplot. You can type ?c in the R console to find out what c(1,50) means. Try changing the parameters in the code to learn how they affect your plot.
Read the comments in the following section of code:
par(mar=c(5,5,5,5), las=1) # set a margin of 5 lines around the chart to leave enough room for labels
plot.new() # create a new plot
plot.window(xlim=c(1,50), ylim=c(0, max(Px)+0.2)) # set our plot's xlim and ylim
barplot(Px,space=0, ylim=c(0, max(Px)+0.2),col="orange") # draw the bar plot. col can be a color name such as orange, blue, or red
Now we want to add axes to our plot. We use the axis function to do this. The side parameter of this function determines the side of the axis: 1 is bottom, 2 is left, 3 is top and 4 is right. The at parameter tells R to add a tick mark at each of the given coordinates; at=seq(0.5,n, by=1) means add a tick every 1 unit, starting at 0.5 and ending at n=50. The labels parameter tells R which labels to put on those ticks.
par(mar=c(5,5,5,5), las=1)
plot.new()
plot.window(xlim=range(x), ylim=c(0, max(Px)+0.2))
barplot(Px,space=0, ylim=c(0, max(Px)+0.2),col="orange")
### New Lines of code
axis(side=1, at=seq(0.5,n, by=1), labels=seq(1,n, by=1)) # set ticks at defined positions for x axis
Now we want to draw a box around our plot. We use the box() function.
par(mar=c(5,5,5,5), las=1)
plot.new()
plot.window(xlim=range(x), ylim=c(0, max(Px)+0.2))
barplot(Px,space=0, ylim=c(0, max(Px)+0.2),col="orange")
axis(side=1, at=seq(0.5,n, by=1), labels=seq(1,n, by=1))
### New Lines of code
box()
Now let's imagine we want to find the probability of having less than or equal to 9 events, P(x<=9). If we want to show P(x<=9) on the chart using a line and an arrow, we use two functions called lines and arrows. The lines function takes a vector of x values and a vector of y values and draws a line through the (x,y) points; lwd is the thickness of the line and lty is the line type. The arrows function is similar, but it takes x0 and y0 as the start point of the arrow and x1 and y1 as the end point. Use the ?lines or ?arrows commands to find out more about these functions. We can also use the text function to add text at an (x,y) location on the chart.
par(mar=c(5,5,5,5), las=1)
plot.new()
plot.window(xlim=range(x), ylim=c(0, max(Px)+0.2))
barplot(Px,space=0, ylim=c(0, max(Px)+0.2),col="orange")
axis(side=1, at=seq(0.5,n, by=1), labels=seq(1,n, by=1))
box()
### New Lines of code
lines(x=c(9,9),y=c(0,0.3), col="red",lwd=2,lty="dashed")
arrows(x0=9,y0=0.3,x1=0,y1=0.3, col = "red", lwd=2, lty="dashed")
text(x=4.2,y=0.32,"P(x<=9)")
To calculate P(x<=9) we have several methods. One method is to calculate P(x=1) + P(x=2) + ... + P(x=9). Let's try this method.
dbinom(1,n,p) + dbinom(2,n,p) + dbinom(3,n,p) + dbinom(4,n,p) + dbinom(5,n,p) + dbinom(6,n,p) + dbinom(7,n,p)+ dbinom(8,n,p) +dbinom(9,n,p)
## [1] 0.4437261
You can also use the following to get the same result:
sum(dbinom(1:9,n,p))
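A small caveat (an added note): strictly speaking, P(x<=9) also includes the x = 0 term, which the sum above leaves out. Including it changes the answer only very slightly:
dbinom(0,n,p) + sum(dbinom(1:9,n,p)) # about 0.44374, versus 0.44373 without the x = 0 term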
Another, more advanced method is to calculate cumulative probability values using the pbinom function. Type ?pbinom in the R console to learn what its parameters are. Let's look at its output when we run it with x from 1 to 50, n=50 and p=0.2:
cumulativePx <- pbinom(x,n,p)
cumulativePx
## [1] 0.0001926784 0.0012854150 0.0056563610 0.0184960151 0.0480272194
## [6] 0.1033982275 0.1904098116 0.3073316278 0.4437404133 0.5835594185
## [11] 0.7106676050 0.8139430065 0.8894134923 0.9392779204 0.9691965772
## [16] 0.9855583427 0.9937392254 0.9974887967 0.9990675635 0.9996793357
## [21] 0.9998978257 0.9999698281 0.9999917419 0.9999979051 0.9999995076
## [26] 0.9999998928 0.9999999784 0.9999999960 0.9999999993 0.9999999999
## [31] 1.0000000000 1.0000000000 1.0000000000 1.0000000000 1.0000000000
## [36] 1.0000000000 1.0000000000 1.0000000000 1.0000000000 1.0000000000
## [41] 1.0000000000 1.0000000000 1.0000000000 1.0000000000 1.0000000000
## [46] 1.0000000000 1.0000000000 1.0000000000 1.0000000000 1.0000000000
As you can see, this function returns cumulative probabilities: the first value is P(x<=1), the second is P(x<=2), and so on, up to P(x<=n). Let's plot it on the chart. Notice that here we want to add another y axis on the right side of the plot, so we call plot.window a second time; from that point on, all subsequent plotting commands use this new coordinate system for their y values.
par(mar=c(5,5,5,5), las=1)
plot.new()
plot.window(xlim=range(x), ylim=c(0, max(Px)+0.2))
barplot(Px,space=0, ylim=c(0, max(Px)+0.2),col="orange")
axis(side=1, at=seq(0.5,n, by=1), labels=seq(1,n, by=1))
box()
lines(x=c(9,9),y=c(0,0.3), col="red",lwd=2,lty="dashed")
arrows(x0=9,y0=0.3,x1=0,y1=0.3, col = "red", lwd=2, lty="dashed")
text(x=4.2,y=0.32,"P(x<=9)")
### New Lines of code
cumulativePx <- pbinom(x,n,p)
plot.window(xlim=c(0,n), ylim=c(0,1.5)) # we define a new plot window and set xlim and ylim; xlim is the same as before but ylim runs from 0 to 1.5
lines(x, cumulativePx, col="red", lwd=2) # draw a line with color red and width 2, using x coordinates from x and y coordinates from cumulativePx
axis(side=4,at=seq(0,1.5, by=0.2), labels=seq(0,1.5, by=0.2)) # same as before we add an axis on the right side (side=4) and add ticks from 0 to 1.5 with 0.2 intervals, and label them
Let's add a title and axis labels to our plot.
par(mar=c(5,5,5,5), las=1)
plot.new()
plot.window(xlim=range(x), ylim=c(0, max(Px)+0.2))
barplot(Px,space=0, ylim=c(0, max(Px)+0.2),col="orange")
axis(side=1, at=seq(0.5,n, by=1), labels=seq(1,n, by=1))
box()
lines(x=c(9,9),y=c(0,0.3), col="red",lwd=2,lty="dashed")
arrows(x0=9,y0=0.3,x1=0,y1=0.3, col = "red", lwd=2, lty="dashed")
text(x=4.2,y=0.32,"P(x<=9)")
cumulativePx <- pbinom(x,n,p)
plot.window(xlim=c(0,n), ylim=c(0,1.5))
lines(x, cumulativePx, col="red", lwd=2)
axis(side=4,at=seq(0,1.5, by=0.2), labels=seq(0,1.5, by=0.2))
### New Lines of code
title("Binomial distribution for n=50 and p=0.2")
mtext("Cumulative Probability", side = 4, las=3, line=3 ) # you can change values of las and line to see their effect on your axis's title
mtext("Probability", side = 2, las=3, line=3)
Now we want to draw a polygon on the chart to highlight the areas under the curve that show P(x<=9) and P(x>9).
par(mar=c(5,5,5,5), las=1)
plot.new()
plot.window(xlim=range(x), ylim=c(0, max(Px)+0.2))
barplot(Px,space=0, ylim=c(0, max(Px)+0.2),col="orange")
axis(side=1, at=seq(0.5,n, by=1), labels=seq(1,n, by=1))
box()
lines(x=c(9,9),y=c(0,0.3), col="red",lwd=2,lty="dashed")
arrows(x0=9,y0=0.3,x1=0,y1=0.3, col = "red", lwd=2, lty="dashed")
text(x=4.2,y=0.32,"P(x<=9)")
cumulativePx <- pbinom(x,n,p)
plot.window(xlim=c(0,n), ylim=c(0,1.5))
lines(x, cumulativePx, col="red", lwd=2)
axis(side=4,at=seq(0,1.5, by=0.2), labels=seq(0,1.5, by=0.2))
title("Binomial distribution for n=50 and p=0.2")
mtext("Cumulative Probability", side = 4, las=3, line=3 ) # you can change values of las and line to see their effect on your axis's title
mtext("Probability", side = 2, las=3, line=3)
### New Lines of code
polygon(c(1:9, 9,1), c(cumulativePx[1:9], cumulativePx[1],cumulativePx[1]), col=adjustcolor( "green", alpha.f = 0.2))
text(7, 0.20, "P(x<=9)")
polygon(c(9:n, n,9), c(cumulativePx[9:n], Px[n],cumulativePx[1]), col=adjustcolor( "brown", alpha.f = 0.2))
text(18, 0.20, "P(x>9)")
Let's calculate the mean and standard deviation of our binomial model.
mu <- n*p # the mean of a binomial distribution is equal to n * p
sigma <- sqrt(n*p*(1-p)) # the stdev of a binomial distribution is equal to the square root of n*p*(1-p)
sd1.pos <- mu + sigma # 1 stdev above the mean
sd1.neg <- mu - sigma # 1 stdev below the mean
And finally let's plot mu and the standard deviation values on our plot. We use a function named abline, which draws a horizontal or vertical line on the chart; the v parameter gives the value of x at which a vertical line is drawn.
par(mar=c(5,5,5,5), las=1)
plot.new()
plot.window(xlim=range(x), ylim=c(0, max(Px)+0.2))
barplot(Px,space=0, ylim=c(0, max(Px)+0.2),col="orange")
axis(side=1, at=seq(0.5,n, by=1), labels=seq(1,n, by=1))
box()
lines(x=c(9,9),y=c(0,0.3), col="red",lwd=2,lty="dashed")
arrows(x0=9,y0=0.3,x1=0,y1=0.3, col = "red", lwd=2, lty="dashed")
text(x=4.2,y=0.32,"P(x<=9)")
cumulativePx <- pbinom(x,n,p)
plot.window(xlim=c(0,n), ylim=c(0,1.5))
lines(x, cumulativePx, col="red", lwd=2)
axis(side=4,at=seq(0,1.5, by=0.2), labels=seq(0,1.5, by=0.2))
title("Binomial distribution for n=50 and p=0.2")
mtext("Cumulative Probability", side = 4, las=3, line=3 )
mtext("Probability", side = 2, las=3, line=3)
polygon(c(1:9, 9,1), c(cumulativePx[1:9], Px[1],Px[1]), col=adjustcolor( "green", alpha.f = 0.2))
text(7, 0.20, "P(x<=9)")
polygon(c(9:n, n,9), c(cumulativePx[9:n], Px[n],Px[1]), col=adjustcolor( "brown", alpha.f = 0.2))
text(18, 0.20, "P(x>9)")
### New Lines of code
abline(v=mu,lty = "dashed")
text(mu+2, 1.1, "mean")
abline(v=sd1.pos, lty= "dashed", col="gray")
abline(v=sd1.neg, lty= "dashed", col="gray")
text(sd1.pos+2, 1.2, "+1 sd")
text(sd1.neg-2, 1.2, "-1 sd")
Part I Assignment
1- Add your name and the date to the chart above using the text function and save it as an image on your OneDrive. Submit this image with your Part II report (5 marks).
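One possible way to save the chart as an image (a suggestion only; you can also use the Export button in RStudio's Plots pane) is to wrap your plotting code in png() and dev.off(); the file name below is just an example:
png("binomial_plot.png", width = 800, height = 600) # open an image file
barplot(Px, space=0, col="orange") # re-run your full plotting code from above here
text(25, 0.1, "Your Name - date here") # example of adding your name and date with text()
dev.off() # close the file so the image is written to disk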
Part II - Poisson Distribution
Our lab this week examines a specific process of rock weathering in sandstone found in rocky coastal areas of British Columbia. This weathering process produces characteristic 'honeycomb' patterns in sandstone, which can easily be found in coastal sandstone around the world. These features go by several names, such as honeycomb weathering, cavernous weathering, and tafoni (https://en.wikipedia.org/wiki/Tafoni). In the salt-rich coastal environment we will be exploring, we will investigate how this unique pattern of rock weathering forms and how we can characterize the resulting patterns using probability distributions. But most importantly, let's take a look at what we're talking about:
Figure 2.1: Honeycomb weathering on the coast of Gabriola Island, British Columbia. Image credit: Colin Robertson
These rock features are pretty cool to see and very distinctive. Here are a few more examples taken from the same site: all photos were taken at approximately the same vertical distance so the scale is approximately equal across these images.
Figure 2.2: More examples of honeycomb weathering on the coast of Gabriola Island, British Columbia. Image credit: Colin Robertson
How do honeycomb patterns in sandstone form?
There are a variety of mechanisms that produce these regular weathering patterns in sandstone, but in salt-rich coastal environments such as this one, the primary process is what is called salt crystallization. In short, rock faces exposed to sunlight have salt-rich water drawn up to the surface, where the salt crystallizes as the water evaporates. The salt crystallization exerts pressure on the surrounding rock, creating tiny fractures which become the pits we recognize in the pattern above. As a result of this process, we expect to see these interesting features where sandstone has direct sunlight exposure, such as on south-facing sections of coastline.
We can investigate the density of the weathering process. We will take one image as our study dataset, then count the number of pits in different sections of the image. Here we are taking one small section of coastline and enumerating the density of weathering pits for the entire small section. This is a sample - and our population is the sandstone surfaces undergoing this process in general. Think about what potential issues there may be in making inferences from the data to the population in this case.
Study Image
The image below is the one we will use as our dataset. We have placed a regular 6 x 6 grid over the image, which will serve as our sampling units. We will then count the number of pits in each cell and generate a dataset of counts. Since this will be a dataset of counts, we will treat it as a discrete random variable.
Figure 2.3: Study image dataset featuring honeycomb weathering on the coast of Gabriola Island, British Columbia. Image credit: Colin Robertson
In order to count the number of pits per cell, we have to make some decisions; namely:
- what counts as a pit, and
- how we associate pits with cells, since in many cases a pit crosses cell boundaries.
This is the sort of research decision we have to make carefully, and then later consider how it impacts our analysis. We will use the following criteria: a pit is defined as having visible walls (i.e., we can discern the edges/sides of the pit), and we will link it to a cell based on the centroid of the pit. Since this step requires some interpretation which may vary, I have done it for you, digitizing our dataset of pit centroids as below:
Figure 2.4: Study image with pits digitized. Image credit: Colin Robertson
Your first job is to create a dataset by counting the number of pits in each cell. I have done the first row for you to give us a working dataset as an example, but for the assignment you will have to get the counts for all cells; you should end up with a dataset of 36 observations, each of which is a count.
Figure 2.5: First row of cells with number of pits counted. Image credit: Colin Robertson
So to get started, our dataset would be created in R as follows:
pits <- c(4,7,9,4,5,0)
Recall why we are treating this as a discrete random variable. Based on our criteria and the nature of the dataset, we have counts, which cannot take on fractional values. This was a decision which impacts how we treat the data. An alternative way to analyze the data would be to measure the fraction of each cell covered by a pit, which we could then treat as a continuous random variable.
Analyzing Count Data
The first thing we can do to estimate the parameter of the Poisson distribution from our dataset is to calculate its sample statistic estimate. The Poisson distribution has the following probability mass function: \[P(X = k) = \frac{e^{-\lambda}\lambda^k}{k!}\]
The only important thing for you to know is that there is one unknown quantity required to define it (i.e., one parameter), \(\lambda\), which is the mean number of events per unit of observation; in our case the units are the cells in which we counted pits. The sample estimate of \(\lambda\) is therefore simply:
mean(pits)
## [1] 4.833333
So we know the average number of pits, but we are not sure what the overall distribution would look like. We can simulate some new data using the estimate of \(\lambda\) we calculated from the data to see what the distribution might look like:
simpits = rpois(n=1000, lambda = mean(pits))
hist(simpits, xlab = "Simulated Pit Count Values", main="")
Your result might vary a bit, but it should be a distribution centered around 5 and tailing off around 12 or 14. Let's look at it and notice a few details:
- the lowest value is zero
- the higher values quickly trail off
- the distribution shape is almost normal
The Poisson distribution is useful for what we call rare events because of these properties, particularly being bounded by zero on the lower end. If you have count data with very high counts, the difference between the Poisson and the Normal is negligible and you can just use a Normal distribution (this is called a Normal approximation). We will have a look at the normal distribution in Part II of this lab. Let's look at the Poisson functions in R.
The function dpois gives the probability of observing x events given the expected number specified by \(\lambda\), whereas the function ppois gives a cumulative probability, for which we have to specify the upper or lower tail (i.e., equal or less than vs. equal or greater than). Say we wanted to calculate the probability of observing 7 pits in a cell given our mean of 4.8333333 pits per cell. We could of course work through the formula on a calculator or using R's math functions, and it would look like this:
#compare this to the probability function noted above
exp(-mean(pits))*mean(pits)^7/factorial(7)
## [1] 0.09732103
but it is much easier to use the built in function dpois to get the probability:
dpois(x = 7, lambda = mean(pits))
## [1] 0.09732103
We get the same answer, but using dpois is a lot cleaner and easier. We can also plot the probability for each of the observed counts in our dataset; we just have to supply the whole vector of counts to the same dpois function:
plot(pits, dpois(x = pits, lambda = mean(pits)), pch=20, cex=2, xlab = "Pit Count", ylab = "Probability")
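The ppois function mentioned above works in a similar way but returns cumulative probabilities; here is a minimal sketch (an added example, not part of the original lab code):
ppois(7, lambda = mean(pits)) # P(X <= 7), the lower tail (the default)
ppois(7, lambda = mean(pits), lower.tail = FALSE) # P(X > 7), the upper tail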
So now we have the ability to calculate probabilities from the Poisson distribution in R. How do we answer meaningful questions with these tools? While the process of salt crystallization is the dominant one creating the weathering pits, we may want to know more about how pits form. For example, we could ask how many can form before adjacent pit edges dissolve into one larger pit. What do you think the maximum number of pits per unit area is? Where we see large pits form, are these the gradual growth of single pits or the amalgamation of several into one larger one? The other thing to think about is that where we have large pits, the count in the cell is likely to be lower.
Part II Assignment
1. Calculate the probability of observing a cell with 15 pits based on the sample intensity (λ computed from all cells in the image). Write a sentence interpreting what this means. Include the commands used to generate the answer. (out of 2)
2. What is the probability of observing a cell with between 3 and 5 pits (λ computed from all cells in the image)? Write a sentence interpreting what this means. Include the commands used to generate the answer. (out of 3)
3. There are two important assumptions we should consider when using the Poisson distribution:
- The probability that an event will occur within a given unit must be the same for all units (i.e. the underlying process governing the phenomenon must be invariant).
- The number of events occurring per unit must be independent of the number of events occurring in other units (no interactions/dependencies).
Write a short paragraph (200-300 words) explaining why or why not these assumptions are met in this analysis of the weathering pits dataset. (out of 5)
4. Explore the assumption of independence by calculating the probability of each observed count and noting where on the image any counts with a probability less than 0.10 occur. Comment on whether these are distributed randomly over the image or clumped in specific parts of the image, and what this means for the independence assumption. Include an image showing which cells have an unusual (i.e., p < 0.10) number of pits. (out of 5)
Hint: to answer question 4 you can take a screenshot from the lab and use MS Paint (Windows), Photos (Mac) or another graphics program to identify which cells have unusual counts.
Hand in
Please submit your answers on MLS under Assignment 1. Your final report should be in pdf format. Please also make sure to include clean code and its results. The document formatting of your assignment is worth 5 marks.
Extra information about formatting your reports
An intro to document formatting (https://ohiostate.pressbooks.pub/feptechcomm/chapter/8-formatting/)
Formatting code in your documents (https://owlcation.com/stem/code-snippets)
Two Microsoft Word add-ons for code formatting
Credit
The lab material for Part II is adapted from GESC 258 labs originally developed by Dr. Colin Robertson.