Entanglement Windows

We started with the windows. I’ll summarise the data a bit better than the piecemeal approach I took yesterday.

Reproductively Active Females

I’ll show the reproductively active females first.

Boxplot

## Warning in loop_apply(n, do.ply): Removed 723 rows containing missing
## values (geom_point).

Data Summary

Here are the summaries of the raw data:

## 
## Attaching package: 'dplyr'
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## Source: local data frame [4 x 4]
## 
##        group     n        m        sd
## 1 unimpacted 28271 75.97280 11.374056
## 2      minor    47 71.00498  8.161709
## 3   moderate    19 70.63849  9.952520
## 4     severe     7 54.86305 17.492908

GLM to compare differences

First we’ll run a glm using unimpacted as the reference level, and then I’ll compare the differences

dfglm <- transform(dfLong, group = factor(group))
dfglm <- within(dfglm, group <- relevel(group, ref = 'unimpacted'))
ft1 <- glm(value ~ group, data = dfglm)
summary(ft1)
## 
## Call:
## glm(formula = value ~ group, data = dfglm)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -75.769   -6.880    4.291    7.750   17.105  
## 
## Coefficients:
##                Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)    75.97280    0.06762 1123.450  < 2e-16 ***
## groupminor     -4.96782    1.65992   -2.993  0.00277 ** 
## groupmoderate  -5.33431    2.60942   -2.044  0.04094 *  
## groupsevere   -21.10975    4.29813   -4.911 9.09e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 129.2854)
## 
##     Null deviance: 3668761  on 28343  degrees of freedom
## Residual deviance: 3663949  on 28340  degrees of freedom
## AIC: 218252
## 
## Number of Fisher Scoring iterations: 2

We’ll run a simple multiple comparisons of the factors:

library(multcomp)
## Loading required package: mvtnorm
## Loading required package: survival
## Loading required package: TH.data
summary(glht(ft1, mcp(group = "Tukey")))
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: glm(formula = value ~ group, data = dfglm)
## 
## Linear Hypotheses:
##                            Estimate Std. Error z value Pr(>|z|)    
## minor - unimpacted == 0     -4.9678     1.6599  -2.993  0.01237 *  
## moderate - unimpacted == 0  -5.3343     2.6094  -2.044  0.15178    
## severe - unimpacted == 0   -21.1097     4.2981  -4.911  < 0.001 ***
## moderate - minor == 0       -0.3665     3.0912  -0.119  0.99931    
## severe - minor == 0        -16.1419     4.6065  -3.504  0.00229 ** 
## severe - moderate == 0     -15.7754     5.0273  -3.138  0.00766 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

Non-Reproductively Active Animals

Ok, let’s repeat that all for the non-reproductively active animals

Boxplot

Data Summary

Here are the summaries of the raw data:

## Source: local data frame [7 x 4]
## 
##       variable     n        m       sd
## 1   Severe...1    24 52.30306 24.00919
## 2   Severe...0    10 58.98385 22.11262
## 3 Moderate...1    14 72.78055 17.34964
## 4 Moderate...0    83 76.65288 11.41027
## 5    Minor...1    16 77.15186 21.04126
## 6    Minor...0   314 77.63647 10.52896
## 7   unimpacted 84899 77.67087 12.07031

GLM to compare differences

First we’ll run a glm using unimpacted as the reference level, and then I’ll compare the differences

dfglm <- transform(dfLong, variable = factor(variable))
dfglm <- within(dfLong, variable <- relevel(variable, ref = 'unimpacted'))

ft1 <- glm(value ~ variable, data = dfglm)
summary(ft1)
## 
## Call:
## glm(formula = value ~ variable, data = dfglm)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -77.601   -4.203    4.679    8.074   32.609  
## 
## Coefficients:
##                       Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)           77.67087    0.04144 1874.405  < 2e-16 ***
## variableSevere...1   -25.36781    2.46491  -10.292  < 2e-16 ***
## variableSevere...0   -18.68702    3.81831   -4.894  9.9e-07 ***
## variableModerate...1  -4.89032    3.22714   -1.515    0.130    
## variableModerate...0  -1.01798    1.32593   -0.768    0.443    
## variableMinor...1     -0.51901    3.01875   -0.172    0.863    
## variableMinor...0     -0.03440    0.68263   -0.050    0.960    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 145.778)
## 
##     Null deviance: 12461939  on 85359  degrees of freedom
## Residual deviance: 12442587  on 85353  degrees of freedom
## AIC: 667521
## 
## Number of Fisher Scoring iterations: 2

We’ll run a simple multiple comparisons of the factors:

library(multcomp)
summary(glht(ft1, mcp(variable = "Tukey")))
## Warning in RET$pfunction("adjusted", ...): Completion with error > abseps
## Warning in RET$pfunction("adjusted", ...): Completion with error > abseps
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: glm(formula = value ~ variable, data = dfglm)
## 
## Linear Hypotheses:
##                                  Estimate Std. Error z value Pr(>|z|)    
## Severe...1 - unimpacted == 0     -25.3678     2.4649 -10.292   <0.001 ***
## Severe...0 - unimpacted == 0     -18.6870     3.8183  -4.894   <0.001 ***
## Moderate...1 - unimpacted == 0    -4.8903     3.2271  -1.515   0.6808    
## Moderate...0 - unimpacted == 0    -1.0180     1.3259  -0.768   0.9830    
## Minor...1 - unimpacted == 0       -0.5190     3.0187  -0.172   1.0000    
## Minor...0 - unimpacted == 0       -0.0344     0.6826  -0.050   1.0000    
## Severe...0 - Severe...1 == 0       6.6808     4.5444   1.470   0.7108    
## Moderate...1 - Severe...1 == 0    20.4775     4.0604   5.043   <0.001 ***
## Moderate...0 - Severe...1 == 0    24.3498     2.7983   8.702   <0.001 ***
## Minor...1 - Severe...1 == 0       24.8488     3.8968   6.377   <0.001 ***
## Minor...0 - Severe...1 == 0       25.3334     2.5570   9.907   <0.001 ***
## Moderate...1 - Severe...0 == 0    13.7967     4.9991   2.760   0.0648 .  
## Moderate...0 - Severe...0 == 0    17.6690     4.0415   4.372   <0.001 ***
## Minor...1 - Severe...0 == 0       18.1680     4.8671   3.733   0.0027 ** 
## Minor...0 - Severe...0 == 0       18.6526     3.8784   4.809   <0.001 ***
## Moderate...0 - Moderate...1 == 0   3.8723     3.4884   1.110   0.9015    
## Minor...1 - Moderate...1 == 0      4.3713     4.4186   0.989   0.9412    
## Minor...0 - Moderate...1 == 0      4.8559     3.2980   1.472   0.7094    
## Minor...1 - Moderate...0 == 0      0.4990     3.2966   0.151   1.0000    
## Minor...0 - Moderate...0 == 0      0.9836     1.4902   0.660   0.9923    
## Minor...0 - Minor...1 == 0         0.4846     3.0944   0.157   1.0000    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

Health of Repro vs. Non-repro Animals

This is what you asked for over email - e.g. to see if the severe repro females are worse than the severe non-repro individuals, etc. Note that these are just for the entanglements, i.e. I’ve removed all of the unimpacted cases.

Ok, so let’s look at a glm, where we have a term for whether it’s a repro female or not:

ft2 <- glm(value ~ variable + fem, data = dfglm)
summary(ft2)
## 
## Call:
## glm(formula = value ~ variable + fem, data = dfglm)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -74.285   -6.924    3.247    9.138   32.179  
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           77.5208     0.6803 113.945  < 2e-16 ***
## variableSevere...1   -24.7881     2.3846 -10.395  < 2e-16 ***
## variableSevere...0   -15.9244     3.6237  -4.394 1.34e-05 ***
## variableModerate...1  -2.5032     2.9144  -0.859 0.390783    
## variableModerate...0  -1.1862     1.4131  -0.839 0.401610    
## variableMinor...1     -0.6820     2.9822  -0.229 0.819194    
## fem                   -5.6073     1.5604  -3.594 0.000357 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 152.4322)
## 
##     Null deviance: 101661  on 533  degrees of freedom
## Residual deviance:  80332  on 527  degrees of freedom
## AIC: 4208.6
## 
## Number of Fisher Scoring iterations: 2

Survival

Ok, this is what I showed you before. First plot is of the animals estimated to die before the end of modeling. Second plot is all animals - both estimated to die and presumed alive.

And then the summaries from the glms. n.b. that I’m not 100% these glms are the best way to present this. Because what they really show is a decrease in survival with reference to the Minor - no gear reference case. This is probably fine, but what is “significant” is only in comparison to that - I haven’t had the time to compare all other animals. That will really be the better reference case.

I think it’s sufficient for the abstract to simply say that survival decreases as entanglement severity increases with severe - gear cases being significantly worse than minor.

dfsub$case <- as.factor(paste(dfsub$Severity, dfsub$gear, sep = ''))
dfsubglm <- within(dfsub, case <- relevel(case, ref = 'minor0'))
df$case <- as.factor(paste(df$Severity, df$gear, sep = ''))
dfglm <- within(df, case <- relevel(case, ref = 'minor0'))


# summary statistics - estimated to die
ft1 <- glm(dfsubglm$m2die ~ dfsubglm$case)
summary(ft1)
## 
## Call:
## glm(formula = dfsubglm$m2die ~ dfsubglm$case)
## 
## Deviance Residuals: 
##    Min      1Q  Median      3Q     Max  
## -59.00  -19.78  -11.12   14.88  143.00  
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              51.118      7.344   6.961 5.72e-09 ***
## dfsubglm$caseminor1     -53.118     43.447  -1.223   0.2270    
## dfsubglm$casemoderate0   12.882     20.510   0.628   0.5327    
## dfsubglm$casemoderate1  -29.118     25.791  -1.129   0.2641    
## dfsubglm$casesevere0    -22.868     22.635  -1.010   0.3170    
## dfsubglm$casesevere1    -36.572     14.854  -2.462   0.0172 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 1833.673)
## 
##     Null deviance: 112752  on 57  degrees of freedom
## Residual deviance:  95351  on 52  degrees of freedom
## AIC: 608.08
## 
## Number of Fisher Scoring iterations: 2
# summary statistics - estimated to die & presumed alive
ft2 <- glm(dfglm$m2die ~ dfglm$case)
summary(ft2)
## 
## Call:
## glm(formula = dfglm$m2die ~ dfglm$case)
## 
## Deviance Residuals: 
##    Min      1Q  Median      3Q     Max  
## -82.56  -39.06  -22.19   34.42  210.81  
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           84.562      3.341  25.307  < 2e-16 ***
## dfglm$caseminor1      -7.962     17.363  -0.459    0.647    
## dfglm$casemoderate0   -3.372      7.099  -0.475    0.635    
## dfglm$casemoderate1  -22.562     14.782  -1.526    0.128    
## dfglm$casesevere0    -16.016     16.585  -0.966    0.335    
## dfglm$casesevere1    -57.339     13.132  -4.366 1.63e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 2902.996)
## 
##     Null deviance: 1167770  on 386  degrees of freedom
## Residual deviance: 1106042  on 381  degrees of freedom
## AIC: 4192
## 
## Number of Fisher Scoring iterations: 2