Meet the exponential family

Proportional & Absence Presence: Bernoulli or Binomial Distributions
Continuous: Gaussian or Gamma
Count Data: Poisson or Negative Binomial

Distribution Selection: You should decide a priori based on what you know about your data. When there are two competing options, look at the mean-variance relationship.

sparrows <- read.table("Sparrows.txt", header = T)

names(sparrows)

##  [1] "Speciescode" "Sex"         "SexNew"      "wingcrd"     "flatwing"   
##  [6] "tarsus"      "head"        "culmen"      "nalospi"     "wt"         
## [11] "observer"    "Age"

Normal

Fig. 1B below is re-scaled so that the surface under the histogram adds to 1. This provides a better representative density curve (Fig. 1D). The histograms in Fig. 1 provide evidence that our data are pretty normal and we can use the Gaussian Distribution to create a model of the data. Fig. 1C shows how the data would look if our data were completely normal using simulated data with a sample mean of 18.9 and a sample variance of 3.7 from 1280 sparrows.

The function rnorm() takes random samples from a normal distribution with a specified mean and standard deviation. If you specify, freq=F it scales the histogram so that the area inside is equal to 1.

op <- par(mfrow = c(2, 2))

hist(sparrows$wt, 
     nclass = 15, 
     xlab = "Weight", 
     main = "Observed data",
     col="#e76bf3")

hist(sparrows$wt, 
     nclass = 15, 
     xlab = "Weight", 
     main = "Observed data", 
     freq = FALSE, 
     col="#7cae00")

Y <- rnorm(1281, 
           mean = mean(sparrows$wt), 
           sd = sd(sparrows$wt))

hist(Y, 
     nclass = 15, 
     main = "Simulated data", 
     xlab = "Weight", 
     col="#00bfc4")

X <- seq(from = 0, to = 30, length = 200) 

Y <- dnorm(X, 
           mean = mean(sparrows$wt), 
           sd = sd(sparrows$wt))

plot(X, Y, type = "l", xlab = "Weight", 
     ylab = "Probablities", ylim = c(0, 0.25),
     xlim = c(0, 30), main = "Normal density curve", 
     col="#f8766d")

Figure 1: (A) Histogram of weight of 1281 sparrows. (B) As panel A, but now scaled so that the total area in the histogram is equal to 1. (C) Histogram of simulated data from a normal distribution with mean and variance taken from 1281 sparrows. (D) Normal probability curve with values for the mean and the variance taken from the sample of 1281 sparrows. The surface under the normal density curve addds up to 1.

par(op)

Poisson(Discrete)

Overdispersion: when the variance is larger than the mean. When this happens with Poisson, there is a correction in Chapter 9 you can use. Alternatively, we can use the negative binomial when this happens, if we want.

The smaller the \(\mu\) the more skewed, and the larger the \(\mu\) the more symmetrical.

\(\mu\) can be a non-integer (with halves and decimals), but all ys have to be integers and positive. Also, note that even though Fig. 2 contains distributions that look “normal”, they are not equal to a normal distribution. A Gaussian distribution uses two parameters (\(\mu\) and \(\sigma^2\)); Poisson uses only one parameter \(\mu\) which is both the variance and the mean.

x1 <- 0:10
Y1 <- dpois(x1, lambda=3)
x2 <- 0:10
Y2 <- dpois(x2, lambda=5)
x3 <- 1:40
Y3 <- dpois(x3, lambda = 10)
x4 <- 50:150
Y4 <- dpois(x4, lambda = 100)


XLab <- "Y values"; YLab <- "Probabilities"

op <- par(mfrow = c(2, 2))

plot(x1, 
     Y1, 
     type = "h", 
     xlab = XLab, 
     ylab = YLab,
     main = "Poisson with mean 3",
     col="#e76bf3"
     )

plot(x2, 
     Y2, 
     type = "h", #use this for poisson because of discrete data
     xlab = XLab, 
     ylab = YLab,
     main = "Poisson with mean 5", 
     col="#7cae00")

plot(x3, 
     Y3, 
     type = "h", 
     xlab = XLab, 
     ylab = YLab,
     main = "Poisson with mean 10", 
     col="#00bfc4")

plot(x4, 
     Y4, 
     type = "h", 
     xlab = XLab, 
     ylab = YLab,
     main = "Poisson with mean 100", 
     col="#f8766d")

Figure 2: Poisson probabilities for µ = 3 (A), µ = 5 (B), µ = 10 (C), and µ = 100 (D). Equation (8.3, pg. 196) is used to calculate the probabilities for certain values. Because the outcome variable y is account, vertical lines are used instead of a line connecting all the points.

par(op)

Sea Lice

Penston et al. (2008) analysed the number of sea lice at sites around fish farms in the north-west of Scotland as a function of explanatory variables like time, depth, and station. The response variable was the number of sea lice at various sites i, denoted by Ni. However, samples were taken from a volume of water, denoted by Vi, that differed per site. One option is to use the density Ni/Vi as the response variable and work with a Gaussian distribution, but if the volumes differ considerably per site, then this is a poor approach as it ignores the differences in volumes.

If volume, area, or time differ between each sample (making each observation a rate or density) you still use Poisson Distribution.

Negative Binomial(Discrete)

This distribution is a mixture of Poisson and Gamma. The mean and variance of Y are given by:

\(E(Y)=\)\(\mu\) ; \(var(Y)=\)\(\mu\) + \(\dfrac{\mu^2}{k}\)

We have overdispersion if the variance is larger than th emean. The variance equation can calculate the amount of overdispersion and k is called th edispersion parameter. If k is large (relative to mean squared) the entire term \(\mu\)/k is 0 and the variance is just \(\mu\). In this case the distribution converges to a Poisson Distribution, and you might as well just use Poisson. The smaller the k, the larger the dispersion.

mu1B=1 ; k1B=0.1
 mu1C=1 ; k1C=1
 mu1D=1 ; k1D=100000

 mu2B=10 ; k2B=0.1
 mu2C=10 ; k2C=1
 mu2D=10 ; k2D=100000

 mu3B=100 ; k3B=0.1
 mu3C=100 ; k3C=1
 mu3D=100 ; k3D=100000


x1B<-0:10; Y12<-dnbinom(x1B,mu=mu1B,size=k1B)
x1C<-0:10; Y13<-dnbinom(x1C,mu=mu1C,size=k1C)
x1D<-0:10; Y14<-dnbinom(x1D,mu=mu1D,size=k1D)

x2B<-0:20; Y22<-dnbinom(x2B,mu=mu2B,size=k2B)
x2C<-0:20; Y23<-dnbinom(x2C,mu=mu2C,size=k2C)
x2D<-0:20; Y24<-dnbinom(x2D,mu=mu2D,size=k2D)


x3B<-0:200; Y32<-dnbinom(x3B,mu=mu3B,size=k3B)
x3C<-0:200; Y33<-dnbinom(x3C,mu=mu3C,size=k3C)
x3D<-0:200; Y34<-dnbinom(x3D,mu=mu3D,size=k3D)


par(mfrow=c(3,3))
Xlab="Y values"
Ylab="Probabilities"
plot(x1B,Y12,type="h",xlab=Xlab,ylab=Ylab,main=paste("NB(",mu1B,",",k1B,")"))
plot(x1C,Y13,type="h",xlab=Xlab,ylab=Ylab,main=paste("NB(",mu1C,",",k1C,")"))
plot(x1D,Y14,type="h",xlab=Xlab,ylab=Ylab,main=paste("NB(",mu1D,",",k1D,")"))


plot(x2B,Y22,type="h",xlab=Xlab,ylab=Ylab,main=paste("NB(",mu2B,",",k2B,")"))
plot(x2C,Y23,type="h",xlab=Xlab,ylab=Ylab,main=paste("NB(",mu2C,",",k2C,")"))
plot(x2D,Y24,type="h",xlab=Xlab,ylab=Ylab,main=paste("NB(",mu2D,",",k2D,")"))


plot(x3B,Y32,type="h",xlab=Xlab,ylab=Ylab,main=paste("NB(",mu3B,",",k3B,")"))
plot(x3C,Y33,type="h",xlab=Xlab,ylab=Ylab,main=paste("NB(",mu3C,",",k3C,")"))
plot(x3D,Y34,type="h",xlab=Xlab,ylab=Ylab,main=paste("NB(",mu3D,",",k3D,")"))

Figure 3:Nine density curves from a negative binomial distribution NB(mu, k), where mu is the mean and k–1 is the dispersion parameter. The column of panels on the right have a large k, and these negative binomial curves approximate the Poisson distribution. R code to create this graph is given on the book website. If k = 1, the negative binomial distribution is also called the geometric distribution

Gamma

This is used for continuous response variables with positive values. You cannot use this distribution if your Y values are negative or zero. The mean and variance of Y are below, and the larger the v with respect to \(\mu\), the closer to a bell-shaped distribution you get.

\(E(Y)=\)\(\mu\) ; \(var(Y)=\)\(\mu\) + \(\dfrac{\mu^2}{v}\)

See figure 8.4 on pg. 201 in Zuur for examples of these distributions.

Bernoulli & Binomial

Binomial

The mean and variance of Y are:

\(E(Y)=\)\(N* \pi\) ; \(var(Y)=\)\(N* \pi * (1-\pi)\)

Used for coin tosses when this is generally introduced, but it can be used in ecological questions too. Imagine you are at a deer farm and you sample N deer to determine the presence or absence of a particular disease. In such research, you want to know the probability \(\pi\) that a particular deer is infected with the disease. You can also question if your sample of deer is independent but it is not discussed how to deal with this until Ch. 12

You can see fully worked problems for this in Ch. 20 (Koalas present at sites), Ch. 21 (Badger activity near farms), and Ch. 21 of Zuur et al., 2007 (Presence and absence of flat fish for 62 sites in an estuary).

Xlab="Y values"
Ylab="Probabilities"

n11<-20; x11<-1:n11; p11<-0.2
n12<-20; x12<-1:n12; p12<-0.5
n13<-20; x13<-1:n13; p13<-0.7

n21<-10; x21<-1:n21; p21<-0.2
n22<-10; x22<-1:n22; p22<-0.5
n23<-10; x23<-1:n23; p23<-0.7


n31<-100; x31<-1:n31; p31<-0.2
n32<-100; x32<-1:n32; p32<-0.5
n33<-100; x33<-1:n33; p33<-0.7


prop11<-dbinom(x11, size=n11, prob=p11)
prop12<-dbinom(x12, size=n12, prob=p12)
prop13<-dbinom(x13, size=n13, prob=p13)


prop21<-dbinom(x21, size=n21, prob=p21)
prop22<-dbinom(x22, size=n22, prob=p22)
prop23<-dbinom(x23, size=n23, prob=p23)


prop31<-dbinom(x31, size=n31, prob=p31)
prop32<-dbinom(x32, size=n32, prob=p32)
prop33<-dbinom(x33, size=n33, prob=p33)


par(mfrow=c(3,3))


plot(x21,prop21,type="h",xlab=Xlab,ylab=Ylab,main=paste("B(",p21,",",n21,")"))
plot(x22,prop22,type="h",xlab=Xlab,ylab=Ylab,main=paste("B(",p22,",",n22,")"))
plot(x23,prop23,type="h",xlab=Xlab,ylab=Ylab,main=paste("B(",p23,",",n23,")"))

plot(x11,prop11,type="h",xlab=Xlab,ylab=Ylab,main=paste("B(",p11,",",n11,")"))
plot(x12,prop12,type="h",xlab=Xlab,ylab=Ylab,main=paste("B(",p12,",",n12,")"))
plot(x13,prop13,type="h",xlab=Xlab,ylab=Ylab,main=paste("B(",p13,",",n13,")"))


plot(x31,prop31,type="h",xlab=Xlab,ylab=Ylab,main=paste("B(",p31,",",n31,")"))
plot(x32,prop32,type="h",xlab=Xlab,ylab=Ylab,main=paste("B(",p32,",",n32,")"))
plot(x33,prop33,type="h",xlab=Xlab,ylab=Ylab,main=paste("B(",p33,",",n33,")"))

Figure 4:Binomial density curves B(π, N) for various values of π (namely 0.2, 0.5, and 0.7) and N (namely 10, 20, and 100). R code to create this graph is on the book website.

Bernoulli

This is obtained if N = 1; we only toss once or we only sample one animal on the farm. Note that we only get probabilities at 0 (fail) and 1(success). Four Bernoulli distributions are given below with \(\pi\) = 0.2, 0.5, 0.7, and 1. In general there is no distinction made between a binomial distribution and Bernoulli. The notation B(\(\pi\), N) is used for both and N=1 implies Bernoulli.

par(mfrow=c(2,2))
prop11<-dbinom(0:1, size=1, prob=0.2)
plot(0:1,prop11,type="h",xlab=Xlab,ylab=Ylab,main=paste("B(0.2,1)"),ylim=c(0,1))


prop11<-dbinom(0:1, size=1, prob=0.5)
plot(0:1,prop11,type="h",xlab=Xlab,ylab=Ylab,main=paste("B(0.5,1)"),ylim=c(0,1))


prop11<-dbinom(0:1, size=1, prob=0.7)
plot(0:1,prop11,type="h",xlab=Xlab,ylab=Ylab,main=paste("B(0.7,1)"),ylim=c(0,1))

prop11<-dbinom(0:1, size=1, prob=1)
plot(0:1,prop11,type="h",xlab=Xlab,ylab=Ylab,main=paste("B(1,1)"),ylim=c(0,1))

Figure 5: Four Bernoulli distributions B(π,1) for different values of π. R code to create this graph is on the book website.

Natural Exp. Fam.

There are so many more distributions that what are discussed here.

Multinomial Distribution

When response variable is categorical with more than 2 levels.

Inverse Gaussian

Useful for lifetime distributions (failure time of machines, or lifetime of products)

References

[1] Zuur, A. F., Ieno, E. N., Walker, N., Saveliev, A. A. & Smith, G. M. Mixed Effects Models and Extensions in Ecology with R (Springer, 2010).

Zuur Ch. 8