We need to know whether response category, event duration, the number of photos in each event, and group size measure the same thing or different things for white-tailed deer, caribou, and moose.

First, load the data and check its structure.

# load the data
behav <- read.csv("data/Ungulate_behaviour_event_30min.csv", stringsAsFactors = FALSE, header = TRUE)
str(behav)
## 'data.frame':    2137 obs. of  11 variables:
##  $ Species               : chr  "Odocoileus virginianus" "Odocoileus virginianus" "Odocoileus virginianus" "Odocoileus virginianus" ...
##  $ Deployment.Location.ID: chr  "Algar01" "Algar01" "Algar01" "Algar06" ...
##  $ Date_Time.Captured    : chr  "2015-11-22 02:54:47" "2015-11-28 04:02:54" "2016-09-17 17:23:34" "2019-06-08 07:23:08" ...
##  $ Event.ID              : chr  "E0" "E1" "E10" "E1003" ...
##  $ Event.Observations    : int  2 9 12 3 7 4 11 14 6 2 ...
##  $ Event.Duration        : int  1 49 30 4 10 4 18 23 9 2 ...
##  $ Event.Groupsize       : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ mean                  : num  0 1 1 0 0 0 0 1 0 0 ...
##  $ mode                  : int  0 1 1 0 0 0 0 1 0 0 ...
##  $ Secure                : logi  FALSE TRUE TRUE FALSE FALSE FALSE ...
##  $ Season                : chr  "Winter" "Winter" "Summer" "Summer" ...

Then, take a look at the distribution of each variable.

# Change 'Secure' from TRUE/FALSE to 1/0
behav$Secure <- as.integer(behav$Secure)

# Distribution of variables
par(mfrow=c(2,3))
hist(behav$Event.Observations)
hist(behav$Event.Duration)
hist(behav$Event.Groupsize)

hist(behav$mean)
hist(behav$mode)
hist(behav$Secure)

Event Observations, Event Duration, and group size are all skewed to the right. Mode and Secure are both binary, and their histograms look identical. Mean is not strictly binary but is very close to it, which means ungulates rarely display a second behaviour within one event. So let’s log-transform the skewed variables:

# Distributions on the log scale
par(mfrow=c(1,3))
hist(log(behav$Event.Observations), main="Histogram for log Observation")
hist(log(behav$Event.Duration), main="Histogram for log Duration")
hist(log(behav$Event.Groupsize), main="Histogram for log Group Size")

The distributions look much better for Event Observations and Event Duration, but not for group size, because most events contain only one individual.
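One caveat with the log transform is that a duration of 0 maps to -Inf under log(), which is why log(x + 1) is the safer choice for Event Duration. The sketch below illustrates this on made-up durations (not the real data); `duration` and `log_dur` are hypothetical names.

```r
# Made-up durations in minutes, including a zero-length event
duration <- c(0, 1, 4, 10, 49)

log(duration)               # the first element is -Inf

# log1p(x) computes log(1 + x) and is finite for every x >= 0
log_dur <- log1p(duration)
log_dur

# log1p(x) and log(x + 1) agree
all.equal(log_dur, log(duration + 1))
```

On the real data the same idea would be `log1p(behav$Event.Duration)`, which leaves the original column untouched.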

Correlation between variables:

Event Observations vs. Event Duration

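# Log-transform Event.Observations and Event.Duration
behav$log.Observations <- log(behav$Event.Observations)
# Some durations are 0 minutes, so use log(x + 1) instead of log(x).
# Note: this overwrites Event.Duration with duration + 1, so every later
# model that uses Event.Duration works on the shifted values.
behav$Event.Duration <- behav$Event.Duration + 1
behav$log.Duration <- log(behav$Event.Duration)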

# Run linear regression
summary(lm(log.Duration ~ log.Observations, data = behav))
## 
## Call:
## lm(formula = log.Duration ~ log.Observations, data = behav)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.5177 -0.5447 -0.2155  0.1354  5.6669 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       0.08377    0.04031   2.078   0.0378 *  
## log.Observations  1.50573    0.02033  74.066   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8826 on 2135 degrees of freedom
## Multiple R-squared:  0.7198, Adjusted R-squared:  0.7197 
## F-statistic:  5486 on 1 and 2135 DF,  p-value: < 2.2e-16
visreg::visreg(lm(log.Duration ~ log.Observations, data = behav))

Event Observations and Event Duration are strongly correlated on the log scale (R² = 0.72).
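For a regression with a single predictor, R² is the square of the Pearson correlation between the two variables, so the model summary and a plain correlation tell the same story. A minimal sketch on simulated data (the values and the names `log_obs` and `log_dur` are stand-ins, not the real columns):

```r
set.seed(1)
# Simulated stand-ins for log.Observations and log.Duration (not the real data)
log_obs <- rnorm(200, mean = 1.6, sd = 0.9)
log_dur <- 0.1 + 1.5 * log_obs + rnorm(200, sd = 0.9)

fit <- lm(log_dur ~ log_obs)
r2  <- summary(fit)$r.squared
r   <- cor(log_obs, log_dur)

# For a one-predictor model, R-squared equals the squared Pearson correlation
all.equal(r2, r^2)
```

Applied to the fit reported above, R² = 0.7198 corresponds to a correlation of roughly 0.85 between the two log-transformed variables.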

Next, the relationship between the event variables (observations and duration) and group size.

# Log-transform group size
behav$log.group <- log(behav$Event.Groupsize)

summary(lm(log.Observations ~ log.group, data = behav))
## 
## Call:
## lm(formula = log.Observations ~ log.group, data = behav)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6502 -0.5319 -0.0211  0.4489  3.7401 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.63052    0.02164   75.36   <2e-16 ***
## log.group    0.73557    0.05816   12.65   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9062 on 2135 degrees of freedom
## Multiple R-squared:  0.06969,    Adjusted R-squared:  0.06925 
## F-statistic: 159.9 on 1 and 2135 DF,  p-value: < 2.2e-16
visreg::visreg(lm(log.Observations ~ log.group, data = behav))

summary(lm(Event.Duration ~ log.group, data = behav))
## 
## Call:
## lm(formula = Event.Duration ~ log.group, data = behav)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -209.34  -60.44  -56.44  -36.44 2750.56 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   63.443      5.433  11.677  < 2e-16 ***
## log.group    105.966     14.606   7.255 5.59e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 227.6 on 2135 degrees of freedom
## Multiple R-squared:  0.02406,    Adjusted R-squared:  0.0236 
## F-statistic: 52.64 on 1 and 2135 DF,  p-value: 5.591e-13
visreg::visreg(lm(Event.Duration ~ log.group, data = behav))

Both models are statistically significant (P < 0.001), but both have a very low R² (0.07 and 0.02). This means the event variables and group size are probably not measuring the same thing.

Next: do event duration and event observations reflect behaviour type?

I tried a box plot, as Chris suggested, using Secure and Event Duration, since mean isn’t really categorical.

# Convert Secure to factor
behav$Secure <- as.factor(behav$Secure)
# Box plot
boxplot(behav$Event.Duration ~ behav$Secure)

There appears to be a slight difference in Event Duration between secure and non-secure behaviour.

So let’s fit the linear model:

summary(lm(Event.Duration ~ Secure, data = behav))
## 
## Call:
## lm(formula = Event.Duration ~ Secure, data = behav)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -157.26  -52.42  -48.42  -32.42 2655.74 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   54.422      5.635   9.659   <2e-16 ***
## Secure1      103.834     11.325   9.169   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 225.9 on 2135 degrees of freedom
## Multiple R-squared:  0.03788,    Adjusted R-squared:  0.03743 
## F-statistic: 84.06 on 1 and 2135 DF,  p-value: < 2.2e-16
visreg::visreg(lm(Event.Duration ~ Secure, data = behav))

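The difference is statistically significant (P < 0.001), but Secure explains little of the variation in Event Duration (Estimate = 103.8, R² = 0.04).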
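With a two-level factor as the only predictor, the intercept of the linear model is the mean of the reference group and the slope is the difference between the two group means. A minimal sketch on simulated durations (the data and the names `secure`, `duration`, `mean_0`, and `mean_1` are made up for illustration):

```r
set.seed(2)
# Simulated durations for non-secure (0) and secure (1) events; values are made up
secure   <- factor(rep(c(0, 1), times = c(150, 50)))
duration <- c(rexp(150, rate = 1 / 55), rexp(50, rate = 1 / 160))

fit <- lm(duration ~ secure)

# Group means computed directly
mean_0 <- mean(duration[secure == "0"])
mean_1 <- mean(duration[secure == "1"])

# Intercept = mean of the reference group; slope = difference in group means
coef(fit)
c(mean_0, mean_1 - mean_0)
```

Read this way, the reported model puts the mean for non-secure events at about 54.4 and for secure events at about 54.4 + 103.8 = 158.3. Both figures are on the duration + 1 scale, because Event.Duration was shifted by 1 earlier.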

Since Event Duration and Event Observations are correlated, we can also expect Event Observations to differ between secure and non-secure behaviour.

# Box plot
boxplot(behav$Event.Observations ~ behav$Secure)

summary(lm(Event.Observations ~ Secure, data = behav))
## 
## Call:
## lm(formula = Event.Observations ~ Secure, data = behav)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -20.142  -3.691  -1.691   1.309 193.858 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   5.6909     0.3054   18.63   <2e-16 ***
## Secure1      15.4509     0.6139   25.17   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.25 on 2135 degrees of freedom
## Multiple R-squared:  0.2288, Adjusted R-squared:  0.2285 
## F-statistic: 633.5 on 1 and 2135 DF,  p-value: < 2.2e-16
visreg::visreg(lm(Event.Observations ~ Secure, data = behav))

The relationship with Secure is stronger for Event Observations than for Event Duration (Estimate = 15.45, R² = 0.23).

Are there differences among species?

We can check the histograms visually:

I found three noticeable patterns just by looking at the histograms. Further examination is needed.

Finally, the relationship between the event variables and Secure behaviour was checked for each species:

Table 1. Estimates, P values, and R-squared of the Event vs. behaviour models

Event               Behaviour  Species            Estimate  P  R.squared
Event Observations  Secure     White-tailed deer  12.7      0  0.2
Event Observations  Secure     Moose              27.27     0  0.34
Event Observations  Secure     Caribou            12.34     0  0.25
Event Duration      Secure     White-tailed deer  122.72    0  0.04
Event Duration      Secure     Moose              111       0  0.05
Event Duration      Secure     Caribou            63.56     0  0.02

par(mfrow=c(1,3))
visreg::visreg(z1, main="Deer")
visreg::visreg(z2, main="Moose")
visreg::visreg(z3, main="Caribou")

par(mfrow=c(1,3))
visreg::visreg(z4, main="Deer")
visreg::visreg(z5, main="Moose")
visreg::visreg(z6, main="Caribou")
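The models z1 to z6 plotted above are not defined in this section. As a rough sketch of how such per-species models might be fitted, the example below splits a data frame by species and fits the same formula to each subset. Everything here is an assumption for illustration: `toy` is simulated data, the species labels other than "Odocoileus virginianus" are guessed, and `fits` and `results` are hypothetical names.

```r
set.seed(3)
# Simulated stand-in for behav; species labels other than the deer one are assumed
toy <- data.frame(
  Species = rep(c("Odocoileus virginianus", "Alces alces", "Rangifer tarandus"),
                each = 60),
  Secure  = factor(rbinom(180, size = 1, prob = 0.3)),
  Event.Observations = rpois(180, lambda = 8)
)

# One model per species: split the data frame, then fit the same formula to each piece
fits <- lapply(split(toy, toy$Species),
               function(d) lm(Event.Observations ~ Secure, data = d))

# Collect the Secure estimate and R-squared for each species
results <- t(sapply(fits, function(m) {
  c(Estimate = unname(coef(m)["Secure1"]), R.squared = summary(m)$r.squared)
}))
round(results, 2)
```

Each element of `fits` is an ordinary lm object, so it can be passed to visreg::visreg() in the same way as z1 to z6.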

The take-home messages are: