Squid Pop Analyses

library(VGAM)
## Loading required package: stats4
## Loading required package: splines

Data

Load the data. You may need to change this to match your computer. If the data and this file are in the same folder then you can keep it like this.

dat <- read.csv("squid_pops_data_R_4.11.2019MB.csv", header = TRUE, stringsAsFactors = FALSE)
dat <- dat[1:23, ] # remove rows that just have NAs in them

Some data processing

Lets take care of the date thing now and convert it to Juliuan day.

dat$date_set <- as.Date(dat$date_set, format = "%m/%d/%y")
dat$date_julian <- format(dat$date_set, "%j")

Checking etc.

Lets take a look.

dat
## Look at the structure - this tells you how R is thinking of the different elements ##
str(dat)
## 'data.frame':    23 obs. of  24 variables:
##  $ site            : chr  "Blanquezal" "Point Garcia" "South Wadleigh" "South Fish Egg" ...
##  $ lat             : num  55.6 55.6 55.5 55.5 55.6 ...
##  $ long            : num  -133 -133 -133 -133 -133 ...
##  $ sea_otter_level : chr  "high" "high" "high" "high" ...
##  $ date_set        : Date, format: "2020-06-12" "2020-06-12" ...
##  $ time_set        : int  1010 1101 830 935 1131 1215 822 930 730 1025 ...
##  $ date_pulled     : chr  "6/13/2018" "6/13/2018" "6/13/2018" "6/18/2018" ...
##  $ time_pulled     : int  1009 1045 1151 745 858 1400 830 933 546 602 ...
##  $ total.time      : num  24 23.7 27.4 22.2 21.4 ...
##  $ Avg_temp_set_F  : int  50 50 50 58 58 58 63 63 54 54 ...
##  $ Max_wind_set_mph: int  12 12 12 9 9 9 13 13 12 12 ...
##  $ Precip_set_in   : num  0 0 0 0.04 0.04 0.04 0 0 0 0 ...
##  $ Avg_temp_pull   : int  55 55 55 63 63 63 61 61 53 53 ...
##  $ Max_wind_pull   : int  9 9 9 13 13 13 14 14 13 13 ...
##  $ Precip_pull     : num  0.08 0.08 0.08 0 0 0 0 0 0 0 ...
##  $ top_pop         : int  26 24 25 24 25 25 23 25 25 25 ...
##  $ bottom_pop      : int  26 24 25 24 25 24 23 25 25 25 ...
##  $ total_pops      : int  26 24 25 24 25 25 23 25 25 25 ...
##  $ full_bottoms    : int  18 23 17 17 8 23 13 20 23 25 ...
##  $ partial_bottoms : int  6 1 1 3 1 1 2 2 1 0 ...
##  $ full_tops       : int  1 1 8 0 0 6 3 2 1 1 ...
##  $ partial_tops    : int  0 1 0 0 0 7 0 3 4 7 ...
##  $ n_fallen_over   : int  NA 1 NA 2 NA NA NA NA NA 1 ...
##  $ date_julian     : chr  "164" "164" "164" "169" ...

Prep for analysis

dat$sea_otter_level <- ordered(dat$sea_otter_level, levels = c("high", "low"))

Extract response data

What is our response? What did we measure? We counted the number of top and bottom baits in three different states, present, absent, and partial. Given how are data are structured I think it will be easiest to make or response data matrix in a slightly differnet way. We will build a new data frame from scratch while calculating the proportions at the same time. NOTE I am not totally sure how you have coded things here so I am assuming that “full_tops” means the number of top baits missing.

We will also need to sepatate the responses of top and bottom baits. So we will be looking at the top and bottom baits in isolation/independently. Tops

## Calculate proportion of tops absent ##
top.absent <- dat$full_tops
top.partial <- dat$partial_tops
top.present <- (dat$top_pop - dat$full_tops - dat$partial_tops)

## Make a response data frame then a matrix ##
tops <- data.frame(top.absent, top.partial, top.present)
tops <- as.matrix(tops)
rowSums(tops)
##  [1] 26 24 25 24 25 25 23 25 25 25 25 25 25 25 25 25 24 24 24 24 25 25 25

Bottoms now you try

## Calculate proportion of tops absent ##

## Make a response data frame call is bots ##

Model Building

OK great we have our response data ‘tops’ and ‘bots’. Thats pretty neat! We also have or explanatory data in dat. Things like the date, temp, precip etc. Let to some stats.

Lets fit a model for the top baits. Remember that the output is relative to whatever is designated as the reference behavior. The reference behovior is the last is the response matrix, which is baits being present. This makes interpretaion easier, because everything is relative to nothing happening.

## The Model ##
mod1 <- vglm(tops ~ sea_otter_level, multinomial, data = dat)
summary(mod1)
## 
## Call:
## vglm(formula = tops ~ sea_otter_level, family = multinomial, 
##     data = dat)
## 
## 
## Pearson residuals:
##                       Min     1Q  Median    3Q   Max
## log(mu[,1]/mu[,3]) -2.155 -1.536 -0.9201 0.889 9.447
## log(mu[,2]/mu[,3]) -2.297 -1.775 -0.6216 1.417 6.873
## 
## Coefficients: 
##                     Estimate Std. Error z value Pr(>|z|)    
## (Intercept):1       -1.66891    0.12471 -13.382   <2e-16 ***
## (Intercept):2       -1.58231    0.12030 -13.153   <2e-16 ***
## sea_otter_level.L:1  0.09764    0.17637   0.554    0.580    
## sea_otter_level.L:2  0.10382    0.17013   0.610    0.542    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Number of linear predictors:  2 
## 
## Names of linear predictors: log(mu[,1]/mu[,3]), log(mu[,2]/mu[,3])
## 
## Residual deviance: 254.4214 on 42 degrees of freedom
## 
## Log-likelihood: -177.1813 on 42 degrees of freedom
## 
## Number of iterations: 5 
## 
## No Hauck-Donner effect found in any of the estimates
## 
## Reference group is level  3  of the response
## Coefficents ##
coef(mod1)
##       (Intercept):1       (Intercept):2 sea_otter_level.L:1 
##         -1.66891261         -1.58230767          0.09764317 
## sea_otter_level.L:2 
##          0.10381882
exp(coef(mod1))
##       (Intercept):1       (Intercept):2 sea_otter_level.L:1 
##           0.1884519           0.2055003           1.1025693 
## sea_otter_level.L:2 
##           1.1093994

What do you think this means? Remember that in a multinomial setting everything is thought of as a probability of occurance relative to the reference level. Using the seal example as a reference can you figure it out? You may want to add some code here to help you decide. Hint: just looking at the raw data does it look like there is a sea otter effect?