Multinomial Flight Modeling

Flight Trials Winter 2020 were conducted from 2/17/2020 - 3/10/2020. Soapberry bugs were flight tested twice (T1 vs. T2) for multiple hours in the flight mill and observed from 8 AM to (5-8 PM) each day. I used multicategorical, nominal modeling (multinom) to analyze how sex and host plant as well as changes in mass and egg-laying changed a soapberry bug’s flight response between trials.

Delta Flight Response (T1 vs. T2)

Introduction

Clean the Data

Flight Case ~ Mass Per

Flight Case ~ Mass Diff or Mass Perc + Sex

Flight Case ~ Mass Diff or Mass Perc + Sex + Host Plant

Flight Case ~ Mass Diff or Mass Perc + Sex + Wing2Body

Females Only

Introduction

Graphs I generated below illustrate how changes in mass (more specifically % mass) or egg-laying led to changes in flight response. To generate these graphs, I ran multicategorical regressions to model the probability of different flight delta cases due to sex, host plant, and changes in mass or egg-laying. Changes in speed and distance as a result of changes in mass or egg-laying were not analyzed here but rather in a separate script.

To perform these analyses, a new variable will be created called “flight_case” which calculates the flight response differences between T1 and T2. Here is a formula and key describing each flight case event and encoding:

Formula : flight case = response in \(T_2\) - response in \(T_1\)

Delta Flight Response Key
Event Encoding
flew in both trials 2
flew in T2 only 1
flew in neither trials 0
flew in T1 only -1

Since the outcomes (or response variables) were no longer binomial, I used multicategorical logit models (refer to Chapter 6 of An Introduction to Categorical Data Analysis, Second Edition by Alan Agresti). Below I’ve written out notes on performing these analyses.

What we are interested in is a multicategorical, nominal response variable. When the response variable is nominal, there is no natural order among the response variable categories (unordered categories). When the response variable is multicategorical, its multicategory models assume that the counts in the categories of \(Y\) have a multinomial distribution.

Let J = number of categories for \(Y\).

\(\pi_1,...\pi_J\) = the response probabilities where \(\sum_j \pi_j = 1\).

With \(n\) independent observations, the probability distribution for the number of outcomes of the J types is the multinomial. It specifies the probability for each possible way the \(n\) observations can fall in the J categories. Multicategory logit models simultaneously use all pairs of categories by specifying the odds of outcome in one category instead of another.

Logit models for nominal response variables pair each category with a baseline category. The choice of the baseline category is arbitrary. When the last category (J) is the baseline, the baseline-category logits are

\(\log (\frac{\pi_j}{\pi_J}), j = 1, ..., J - 1\).


Given that the response falls in category j or category J, this is the log odds that the response is j. For J = 3, for instance, the model uses \(\log(\pi_1/\pi_3)\) and \(\log(\pi_2/\pi_3)\). The baseline-category logit model with a predictor x is

\(\log (\frac{\pi_j}{\pi_J}) = \alpha_j + \beta_j x, j = 1, ..., J - 1\)


The model has J - 1 equations, with separate parameters for each. The effects vary according to the category paired with the baseline. When J = 2, this model simplifies to a single equation for \(\log(\pi_1/\pi_2) = logit(\pi_1)\), resulting in ordinary logistic regression for binary responses.

So how do these pair of categories determine equations for all other pairs of categories? Here is an arbitrary pair of categories a and b that follows the general equation above,

\(\log (\frac{\pi_a}{\pi_b}) = \log ( \frac{\pi_a/pi_J}{\pi_b/pi_J}) = \log (\frac{\pi_a}{\pi_J}) - \log (\frac{\pi_b}{\pi_J})\)

\(= (\alpha_a + \beta_a x) - (\alpha_b + \beta_b x)\)

\(= (\alpha_a - \alpha_b) + (\beta_a - \beta_b) x\)


So, the equation for categories a and b has the form \(\alpha + \beta x\) with intercept parameter \(\alpha = (\alpha_a - \alpha_b)\) and with slope parameter \(\beta = (\beta_a - \beta_b)\).

Let’s apply it to our data now:

Cleaning the Data

source_path = "~/Desktop/git_repositories/SBB-dispersal/avbernat_working_on/Rsrc/"

script_names = c("center_flight_data.R", # Re-centers data 
                 "clean_flight_data.R", # Loads and cleans data
                 "unique_flight_data.R",
                 "get_warnings.R", 
                 "compare_models.R",
                 "regression_output.R", # Cleans regression outputs; prints in color or B&W
                 "AICprobabilities.R",
                 "multinom_functions.R")

for (script in script_names) { 
  path = paste0(source_path, script)
  source(path) 
}
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
output_col = FALSE # Change to TRUE if working in Base R or RStudio; FALSE if generating an HTML

data <- read_flight_data("data/all_flight_data-Winter2020.csv")
data_all <- data[[1]]
data_tested <- data[[2]]
d <- create_delta_data(data_tested)
## Warning: `funs()` is deprecated as of dplyr 0.8.0.
## Please use a list of either functions or lambdas: 
## 
##   # Simple named list: 
##   list(mean = mean, median = median)
## 
##   # Auto named with `tibble::lst()`: 
##   tibble::lst(mean, median)
## 
##   # Using lambdas
##   list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
colnames(d)[c(1:2,5,66:68,71:73)]
## [1] "ID"          "sex"         "host_plant"  "egg_diff"    "mass_diff"  
## [6] "flew_diff"   "mass_per"    "flight_case" "egg_case"
# for wing-to-body ratio color creating
d$w2b_col <- 0

Age Effect

Age also plays a role in this dataset, but age was unknown because the soapberry bugs were field collected.

Modeling

Below I used the multinom function from the nnet package to estimate a multinomial logistic regression model. There are other functions in other R packages capable of multinomial regression. I chose the multinom function because it does not require the data to be reshaped (as the mlogit package does) and to mirror the example code found in Hilbe’s Logistic Regression Models.

First, I chose the baseline by specifying the level using relevel function. Then, I ran our model using multinom. The multinom package does not include p-value calculations for the regression coefficients, so I calculated P using Wald tests (i.e. z-tests).

Finally, I considered mass difference as my predictor variable where a (+) sign means the bug gained mass between trials and a (-) sign means the bug lost mass between trials.

Delta Mass Key
Event Sign
gained mass from T1 to T2 pos
no mass change between trails 0
lost mass from T1 to T2 neg

Multinom model: flight case ~ mass_per

Baseline

df <- d %>%
  filter(!is.na(mass_diff), !is.na(flight_case)) 
  
df <- df[with(df, order(mass_per)),]
n_trials = nrow(df)

df$flight_case <- relevel(as.factor(df$flight_case), ref = "0")

Null model

null <- multinom(flight_case ~ 1, data = df) 
## # weights:  8 (3 variable)
## initial  value 385.389832 
## iter  10 value 319.269929
## final  value 319.269680 
## converged
model <- multinom(flight_case ~ mass_per, data = df) 
## # weights:  12 (6 variable)
## initial  value 385.389832 
## iter  10 value 307.053645
## final  value 306.984778 
## converged
model_table = calculate_P(model)
## 
##  AIC:  625.9696 
##    (Intercept)  Estimate DF Std. Err.     z   wald P > |z|
## -1      -0.909     0.044  6     0.010 4.492 20.179   0.000
## 1       -2.132     0.001  6     0.020 0.043  0.002   0.965
## 2        0.344     0.021  6     0.008 2.494  6.221   0.013

Model comparing flying only in T2 to the baseline (not flying at all) is not significant (P = 0.97). That is probably because so few bugs (only male) flew in T2 only.

Significant ML equations

run_multinom_model = function(d) {
  m <- multinom(flight_case ~ mass_per, trace=FALSE, data = d) 
  model_table = calculate_P(m, print_table=FALSE)
  return(model_table)
}
ML_eqs = get_significant_models(7)

The multinomial (ML) prediction equations are:

gsub("Mass Change",  "Mass Percent Change", prediction_equations(model_table))
## [1] "log(pi_-1 / pi_1) = 1.22 + 0.04 Mass Percent Change     Flew in T1, rather than T2"   
## [2] "log(pi_2 / pi_-1) = 1.25 + -0.02 Mass Percent Change     Flew in both, rather than T1"
## [3] "log(pi_2 / pi_1) = 2.48 + 0.02 Mass Percent Change     Flew in both, rather than T2"
N Model P
278 Flew in T1 only rather than T2 only = 1.22 + 0.04 Mass % sig
278 Flew in both rather than T1 only = 1.25 - 0.02 Mass % sig
278 Flew in both rather than T2 only = 2.48 + 0.02 Mass % not sig

Estimated Odds

For bugs of mass_perc x + 20%*(0.04) the estimated odds that bug flew in T1 rather than T2 equals

exp(0.04*20)
## [1] 2.225541

times the estimated odds at mass_perc x (%). So for every 20% increase in a bug’s original mass, the bug is 2.23 times more likely to fly only in T1. That makes sense because the odds of flying in T2 only are very low to begin with.

For bugs of mass_perc x + 40%*(-0.02), the estimated odds that the bug flew twice instead of only in T1 equals

exp(40*-0.02) # 2.5 times less likely to fly twice if gain mass
## [1] 0.449329
exp(-40*-0.02) # 2.5 times more likely to fly twice if loose mass
## [1] 2.225541

times the estimated odds at mass_perc x (%). Similar message - for every 40% increase in a bug’s original mass, the bug was 2.23 times more likely to fly only in T1 rather than fly twice. Although, when a bug looses 40% of original mass then becomes more likely to fly twice then only once.

exp(coef(model)*20) # this compares to no fly, the baseline # gain large % mass
##     (Intercept) mass_per
## -1 1.264064e-08 2.415238
## 1  3.047172e-19 1.017407
## 2  9.660792e+02 1.508336
exp(coef(model)*5) # gain small % mass
##     (Intercept) mass_per
## -1 1.060333e-02 1.246637
## 1  2.349493e-05 1.004324
## 2  5.575107e+00 1.108216
exp(coef(model)*-5) # loose small % mass
##     (Intercept)  mass_per
## -1 9.431000e+01 0.8021582
## 1  4.256237e+04 0.9956950
## 2  1.793688e-01 0.9023509
exp(coef(model)*-20) # loose large % mass
##     (Intercept)  mass_per
## -1 7.910994e+07 0.4140378
## 1  3.281732e+18 0.9828910
## 2  1.035112e-03 0.6629821
# 1/0 not significant

1 vs. 0 is not significant as already stated. However, -1 vs. 0 shows that as gain more mass then become increasingly more likely to only fly once. If loose mass then most likely to not fly at all, and more likely to fly twice then fly only once.

Predicted probabilities

You can also use predicted probabilities to help you understand the model. The multicategory logit model has an alternative expression in terms of the response probabilities. This is

\(\pi_j = \frac{e^{\alpha_j + \beta_j x}}{\sum_Je^{\alpha_J + \beta_J x}}, j=1,...,J\)


The denominator is the same for each probability, and the numerators for various j sum to the denominator where \(\sum_j \pi_j = 1\). The parameters (\(\alpha\) and\(\beta\)) equal zero for whichever category is the baseline in the logit expressions. So these would be the equations for the mass diff model (code not show):

(Intercept) Estimate DF Std. Err. z wald P > |z| -1 -0.909 0.044 6 0.010 4.492 20.179 0.000 1 -2.132 0.001 6 0.020 0.043 0.002 0.965 2 0.344 0.021 6 0.008 2.494 6.221 0.013

\(\hat{\pi_{-1}} = \frac{e^{-1.77 + 69.21 x}}{1 + e^{-1.77 + 69.21 x} + e^{-2.13 - 6.70 x} + e^{0.42 + 25.33 x}}, j=1,...,J\)

\(\hat{\pi_1} = \frac{e^{-2.13 - 6.70 x}}{1 + e^{-1.77 + 69.21 x} + e^{-2.13 - 6.70 x} + e^{0.42 + 25.33 x}}, j=1,...,J\)

\(\hat{\pi_2} = \frac{e^{0.42 + 25.33 x}}{1 + e^{-1.77 + 69.21 x} + e^{-2.13 - 6.70 x} + e^{0.42 + 25.33 x}}, j=1,...,J\)

\(\hat{\pi_0} = \frac{e^{0 + 0 x} = 1}{1 + e^{-1.77 + 69.21 x} + e^{-2.13 - 6.70 x} + e^{0.42 + 25.33 x}}, j=1,...,J\)

x = mass_diff, which can give you the estimated probabilities.


You can calculate these predicted probabilities for each of our outcome levels using the fitted function. You can start by generating the predicted probabilities for the observations in our dataset and viewing the first few rows. Then, you can plot them.

head(pp <- fitted(model))
##           0         -1          1         2
## 1 0.5386483 0.04284419 0.06190317 0.3566043
## 2 0.5356362 0.04375052 0.06158899 0.3590243
## 3 0.5290568 0.04577594 0.06090110 0.3642662
## 4 0.5263205 0.04663701 0.06061436 0.3664282
## 5 0.4982068 0.05614792 0.05764729 0.3879980
## 6 0.4949167 0.05734314 0.05729764 0.3904425

For more on multinomial plotting for when have more than 1 predictor variables see the following: https://stats.idre.ucla.edu/r/dae/multinomial-logistic-regression/

Now let’s compare some models

flight_case, mass_per, and sex

df <- df[with(df, order(mass_per)),]
data <- data.frame(R = df$flight_case, 
         A = df$mass_per,
         B = df$sex_c)
source("src/compare_models.R")
model_comparisonsAIC("src/generic multinomial models- multinom 1RF + 2 FF.R")
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
##        [,1]      [,2]      
## AICs   587.5607  593.4209  
## models 3         4         
## probs  0.9492355 0.05068255
## 
## m3   multinom(formula = R ~ A + B, data = data, trace = FALSE)
## m4   multinom(formula = R ~ A * B, data = data, trace = FALSE)
anova(m3, m4, test="Chisq") # Adding A*B does not improve fit
## Likelihood ratio tests of Multinomial Models
## 
## Response: R
##   Model Resid. df Resid. Dev   Test    Df  LR stat.   Pr(Chi)
## 1 A + B       825   569.5607                                 
## 2 A * B       822   569.4209 1 vs 2     3 0.1398496 0.9866598
delta_mass_model <- multinom(flight_case ~ mass_per + sex_c, data = df)
## # weights:  16 (9 variable)
## initial  value 385.389832 
## iter  10 value 286.869825
## iter  20 value 284.809036
## iter  30 value 284.797822
## final  value 284.780360 
## converged
model_table = calculate_P2(delta_mass_model, "mass_per", "sex_c")
## 
##  AIC:  587.5607 
##    (Intercept) mass_per  sex_c DF   SEi   SE1   SE2      zi     z1      z2
## -1      -1.015    0.043 -0.692  9 0.239 0.010 0.203  -4.248  4.390  -3.408
## 1       -6.820   -0.009 -5.626  9 0.183 0.026 0.183 -37.245 -0.348 -30.721
## 2        0.124    0.019 -0.902  9 0.167 0.008 0.159   0.742  2.334  -5.684
##       waldi  wald1   wald2 Pi > |z| P1 > |z| P2 > |z|
## -1   18.049 19.272  11.617    0.000    0.000    0.001
## 1  1387.197  0.121 943.764    0.000    0.728    0.000
## 2     0.551  5.447  32.310    0.458    0.020    0.000
prediction_equations2(model_table, " Mass Percent Change", " Sex ")
## [1] "Where F = 1"
## [1] "log(pi_-1 / pi_1) = 5.81 + 0.05 Mass Percent Change + 4.93 Sex      Flew in T1, rather than T2"    
## [2] "log(pi_2 / pi_-1) = 1.14 + -0.02 Mass Percent Change + -0.21 Sex      Flew in both, rather than T1"
## [3] "log(pi_2 / pi_1) = 6.94 + 0.03 Mass Percent Change + 4.72 Sex      Flew in both, rather than T2"

Significant models

run_multinom_model = function(d) {
  m <- multinom(flight_case ~ mass_per + sex_c, trace=FALSE, data = d) 
  model_table = calculate_P2(m, "mass_per", "sex_c", print_table=FALSE)
  return(model_table)
}
par(mfrow=c(1,2))
MASS_ML = get_significant_models(15) # mass per
SEX_ML = get_significant_models(16) # sex

Table 1. Estimated Parameters, Standard Errors, and Wald Test Statistics for All Main Effects Model
Odds Effect Parameter Estimate SE exp(\(\beta\))* Wald p
\(\underline{P(Yi=flew\,twice)}\) Intercept \(\alpha\) 1.14 0.24 3.13 22.62 <0.001
\(P(Yi=flew\, in\, T1\, only)\) \(\delta\) Mass % \(\beta_1\) -0.02 0.01 0.67 7.64 0.006
Sex \(\beta_2\, (F=1 | M=-1)\) -0.21 0.20 0.81 1.14 0.286
\(\underline{P(Yi=flew\,twice)}\) Intercept \(\alpha\) 7.00 0.19 1096.63 1305.68 <0.001
\(P(Yi=flew\, in\, T2\, only)\) \(\delta\) Mass % \(\beta_1\) 0.03 0.03 1.82 1.23 0.266
Sex \(\beta_2\, (F=1 | M=-1)\) 4.78 0.19 119.10 630.97 <0.001
\(\underline{P(Yi=flew\,twice)}\) Intercept \(\alpha\) 0.12 0.17 1.13 0.55 0.458
\(P(Yi=did\, not\, fly)\) \(\delta\) Mass % \(\beta_1\) 0.02 0.01 1.49 5.45 0.02
Sex \(\beta_2\, (F=1 | M=-1)\) -0.90 0.16 0.41 32.31 <0.001
\(\underline{P(Yi=flew\, in\, T1\, only)}\) Intercept \(\alpha\) -1.02 0.24 0.36 18.05 <0.001
\(P(Yi=did\, not\, fly)\) \(\delta\) Mass % \(\beta_1\) 0.04 0.01 2.23 19.27 <0.001
Sex \(\beta_2\, (F=1 | M=-1)\) -0.69 0.20 0.50 11.62 0.001
\(\underline{P(Yi=flew\, in\, T2\, only)}\) Intercept \(\alpha\) -6.82 0.18 0.00 1387.2 <0.001
\(P(Yi=did\, not\, fly)\) \(\delta\) Mass % \(\beta_1\) -0.01 0.03 0.82 0.12 0.728
Sex \(\beta_2\, (F=1 | M=-1)\) -5.63 0.18 0.00 943.76 <0.001
\(\underline{P(Yi=flew\, in\, T1\, only)}\) Intercept \(\alpha\) 5.86 0.23 350.72 633.18 <0.001
\(P(Yi=flew\, in\, T2\, only)\) \(\delta\) Mass % \(\beta_1\) 0.05 0.03 2.72 3.84 0.05
Sex \(\beta_2\, (F=1 | M=-1)\) 4.99 0.21 146.94 561.96 <0.001
* Instead of a 1% mass increase, which is relatively too small, the mass percent change estimates were multiplied by 20 before calculating the log odds. These changes better represent experimental observation and offers a more realistic odds.
N Model P
278 Flew in T1 only rather than T2 only = 5.81 + 0.05 Mass % + 4.93 Sex mass** | sex**
278 Flew in both rather than T1 only = 1.14 - 0.02 Mass % - 0.21 Sex mass** | not sex
278 Flew in both rather than T2 only = 6.94 + 0.03 Mass % + 4.72 Sex not mass | sex**

Estimated Odds

For males:

For bugs of mass_perc x + 20%*(0.05), the estimated odds that bug flew in T1 rather than T2 equals

exp(0.05*20)
## [1] 2.718282

times the estimated odds at mass_perc x (%). So for every 20% increase in a bug’s original mass, the bug was 2.72 times more likely to fly only in T1. That makes sense because the odds of flying in T2 only are very low to begin with.

For bugs of mass_perc x + 20*(-0.02%), the estimated odds that the bug flew twice instead of only in T1 equals

exp(20*-0.02) # 20 times less likely to fly twice if gain mass
## [1] 0.67032
exp(-20*-0.02) # 20 times more likely to fly twice if loose mass
## [1] 1.491825

times the estimated odds at mass_perc x (%). Similar message - for every 20% increase in a bug’s original mass, the bug was 1.5 times more likely to fly only in T1. So, when a bug starts loosing mass it becomes more likely to fly twice then only once.

For females:

For bugs of mass_perc x + 20%*(0.05), the estimated odds that bug flew in T1 rather than T2 equals

exp(0.05*20 + 4.93)
## [1] 376.1545

times the estimated odds at mass_perc x (%). So for every 20% increase in a bug’s original mass, the bug was 376 times more likely to fly only in T1. Crazy! It doesn’t take much for a female to only fly once.

For bugs of mass_perc x + 20%*(0.05), the estimated odds that bug flew in both trials rather than T1 equals

exp(-0.02*20 - 0.21)
## [1] 0.5433509
exp(-0.02*-40 - 0.21)
## [1] 1.803988

times the estimated odds at mass_perc x (%). So half as likely to fly in both trials if gain mass. Would need to loose a lot of mass (40) to be almost twice as likely (1.8) to fly in both trials.

exp(coef(delta_mass_model)*20) # this compares to no fly, the baseline # gain large % mass
##     (Intercept)  mass_per        sex_c
## -1 1.513930e-09 2.3406655 9.798741e-07
## 1  5.730838e-60 0.8325593 1.367423e-49
## 2  1.185738e+01 1.4736071 1.475858e-08
exp(coef(delta_mass_model)*5) # gain small % mass
##     (Intercept)  mass_per        sex_c
## -1 6.237728e-03 1.2369007 3.146245e-02
## 1  1.547229e-15 0.9552209 6.081010e-13
## 2  1.855655e+00 1.1017814 1.102202e-02
exp(coef(delta_mass_model)*-5) # loose small % mass
##     (Intercept)  mass_per        sex_c
## -1 1.603148e+02 0.8084723 3.178392e+01
## 1  6.463168e+14 1.0468783 1.644464e+12
## 2  5.388934e-01 0.9076211 9.072748e+01
exp(coef(delta_mass_model)*-20) # loose large % mass
##     (Intercept)  mass_per        sex_c
## -1 6.605326e+08 0.4272289 1.020539e+06
## 1  1.744946e+59 1.2011156 7.313027e+48
## 2  8.433568e-02 0.6786069 6.775721e+07
# 1/0 mass not sig but sex**
  • If F and gain a lot of mass then most likely to fly only in T1 rather than none.
  • If F and loose mass then most likely to not fly in either trial.

Predicted Probabilities

head(pp <- fitted(delta_mass_model))
##           0         -1            1         2
## 1 0.7917303 0.03003973 4.362037e-06 0.1782256
## 2 0.7894639 0.03073036 4.325625e-06 0.1798015
## 3 0.7844677 0.03228066 4.247094e-06 0.1832474
## 4 0.7823713 0.03294258 4.214827e-06 0.1846819
## 5 0.7601763 0.04036342 3.895616e-06 0.1994564
## 6 0.7574981 0.04130977 3.859619e-06 0.2011882
df$index = 1:nrow(df)
females = df %>%
  filter(sex=="F")
males = df %>%
  filter(sex=="M")
Frows = females$index
Mrows = males$index

No females flew in T2 only.

Adding Host Plant

df <- df[with(df, order(mass_per)),]
data <- data.frame(R = df$flight_case, 
         A = df$mass_per,
         B = df$sex_c,
         C = df$host_c)
source("src/compare_models.R")
model_comparisonsAIC("src/generic multinomial models- multinom 1RF + 3 FF.R")
##        [,1]      [,2]      [,3]       [,4]      
## AICs   587.5607  591.9016  592.3168   592.4231  
## models 4         7         13         12        
## probs  0.7141852 0.0815063 0.06622882 0.06280119
## 
## m4   multinom(formula = R ~ A + B, data = data, trace = FALSE)
## m7   multinom(formula = R ~ A + B + C, data = data, trace = FALSE)
## m13  multinom(formula = R ~ B * C + A, data = data, trace = FALSE)
## m12  multinom(formula = R ~ A * C + B, data = data, trace = FALSE)
anova(m4, m7, test="Chisq") # Adding C (host plant) does not improve fit
## Likelihood ratio tests of Multinomial Models
## 
## Response: R
##       Model Resid. df Resid. Dev   Test    Df LR stat.   Pr(Chi)
## 1     A + B       825   569.5607                                
## 2 A + B + C       822   567.9016 1 vs 2     3 1.659076 0.6460701

Host plant is not significant.

Wing2Body

df <- df[with(df, order(mass_per)),]
df$wing2body_c = df$wing2body - mean(df$wing2body)
df$wing2body_scaled = df$wing2body_c/sd(df$wing2body)*100 # normalized and then multiplied by 100
data <- data.frame(R = df$flight_case, 
         A = df$mass_per,
         B = df$sex_c,
         C = df$wing2body_c)
model_script = paste0(source_path,"generic multinomial models- multinom 1RF + 3 FF.R")
model_comparisonsAIC(model_script)
##        [,1]      [,2]      [,3]      
## AICs   582.2678  585.1197  587.133   
## models 7         12        13        
## probs  0.6671688 0.1603139 0.05858546
## 
## m7   multinom(formula = R ~ A + B + C, data = data, trace = FALSE)
## m12  multinom(formula = R ~ A * C + B, data = data, trace = FALSE)
## m13  multinom(formula = R ~ B * C + A, data = data, trace = FALSE)
anova(m7, m12, test="Chisq") # adding A*C does not improve fit
## Likelihood ratio tests of Multinomial Models
## 
## Response: R
##       Model Resid. df Resid. Dev   Test    Df LR stat.   Pr(Chi)
## 1 A + B + C       822   558.2678                                
## 2 A * C + B       819   555.1197 1 vs 2     3 3.148182 0.3693379
anova(m7, m13, test="Chisq") # Adding B*C does not improve fit
## Likelihood ratio tests of Multinomial Models
## 
## Response: R
##       Model Resid. df Resid. Dev   Test    Df LR stat.   Pr(Chi)
## 1 A + B + C       822   558.2678                                
## 2 B * C + A       819   557.1330 1 vs 2     3 1.134887 0.7686596
model <- multinom(flight_case ~ mass_per + sex_c + wing2body_c, data = df)
## # weights:  20 (12 variable)
## initial  value 385.389832 
## iter  10 value 286.740091
## iter  20 value 280.436850
## iter  30 value 279.437125
## iter  40 value 279.174660
## iter  50 value 279.134087
## final  value 279.133921 
## converged
model_table = calculate_P3(model)
## 
##  AIC:  582.2678 
##    (Intercept) mass %    sex wing2body DF   SEi   SE1   SE2    SE3      zi
## -1      -0.935  0.041 -0.571    23.739 12 0.243 0.010 0.212 12.059  -3.854
## 1       -8.177 -0.005 -6.954    -6.595 12 0.187 0.025 0.187 18.786 -43.767
## 2        0.201  0.018 -0.760    28.094 12 0.172 0.008 0.166  9.718   1.173
##        z1      z2     z3    waldi  wald1    wald2 wald3 Pi>|z| P1>|z| P2>|z|
## -1  4.254  -2.698  1.969   14.850 18.096    7.278 3.875  0.000  0.000  0.007
## 1  -0.215 -37.102 -0.351 1915.510  0.046 1376.550 0.123  0.000  0.830  0.000
## 2   2.141  -4.590  2.891    1.375  4.585   21.071 8.357  0.241  0.032  0.000
##    P3>|z|
## -1  0.049
## 1   0.726
## 2   0.004

Signal of flying in only T2 is weak, which is probably why the equation in the model was not significant. It does not differ much from not flying at all.

Significant equations

run_multinom_model = function(d) {
  m <- multinom(flight_case ~ mass_per + sex_c + wing2body_c, trace=FALSE, data = d)
  model_table = calculate_P3(m, print_table=FALSE)
  return(model_table)
}

# ML's below are all the same
par(mfrow=c(2,2))
MASS_PER_ML = get_significant_models(19) # mass%
SEX_ML = get_significant_models(20) # sex
WING2BODY_ML = get_significant_models(21) # wing2body

# 2 vs. -1
#Odds = c("$\\frac{P(Yi=flew\\,twice)}{P(Yi=flew\\, in\\, T1\\, only)}$", " ", " ", " ")
Odds = c("$\\underline{P(Yi=flew\\,twice)}$", "$P(Yi=flew\\, in\\, T1\\, only)$", " ", " ")
effect = c("Intercept", "$\\delta$ Mass %", "Sex", "Wing-to-Body")
pars = c("$\\alpha$", "$\\beta_1$", "$\\beta_2\\, (F=1 | M=-1)$", "$\\beta_3\\,(\\mu=0)$ ")
key1 = getting_odds(Odds, effect, pars, MASS_PER_ML[[1]], 3) 

# 2 vs. 1
Odds = c("$\\underline{P(Yi=flew\\,twice)}$", "$P(Yi=flew\\, in\\, T2\\, only)$", " ", " ")
key2 = getting_odds(Odds, effect, pars, MASS_PER_ML[[3]], 3) 

# 2 vs. 0
Odds = c("$\\underline{P(Yi=flew\\,twice)}$", "$P(Yi=did\\, not\\, fly)$", " ", " ")
key3 = getting_odds(Odds, effect, pars, MASS_PER_ML[[2]], 3) 

# -1 vs. 0
Odds = c("$\\underline{P(Yi=flew\\, in\\, T1\\, only)}$", "$P(Yi=did\\, not\\, fly)$", " ", " ")
key4 = getting_odds(Odds, effect, pars, MASS_PER_ML[[2]], 1) 

# 1 vs. 0
Odds = c("$\\underline{P(Yi=flew\\, in\\, T2\\, only)}$", "$P(Yi=did\\, not\\, fly)$", " ", " ")
key5 = getting_odds(Odds, effect, pars, MASS_PER_ML[[2]], 2) 

# -1 vs. 1
Odds = c("$\\underline{P(Yi=flew\\, in\\, T1\\, only)}$", "$P(Yi=flew\\, in\\, T2\\, only)$", " ", " ")
key6 = getting_odds(Odds, effect, pars, MASS_PER_ML[[3]], 2)

key = rbind(key1, key2, key3, key4, key5, key6)
rownames(key)<-NULL
colnames(key) = c("Odds","Effect", "Parameter", "Estimate", "SE", "exp($\\beta$)*", "Wald", "p")
#  "exp($\\sum_{n=1}^{3}{\\beta_n}$)*"

See here for more table parameters: https://haozhu233.github.io/kableExtra/awesome_table_in_html.html

kable(key, caption="Table 1. Estimated Parameters, Standard Errors, and Wald Test Statistics for All Main Effects Model") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))  %>%
  column_spec(1, width="6cm", background="white") %>%
  kable_classic(html_font = "Cambria") %>%
  row_spec(c(4,8,12,16,20), extra_css = "border-bottom: 1px solid") %>%
  #column_spec(1, color = "black", background = "white") %>%
  footnote(general = " * Instead of a 1% mass increase, which is relatively too small, or a 1 unit increase in the ratio, which is 2 magnitudes higher than approximately one standard deviation from the mean, mass and wing-to-body ratio estimates were multiplied by 20 and divided by 100, respectively, before calculating the log odds. These changes better represent experimental observation and offers a more realistic impact on the odds. ",
           general_title = " ", number_title = "* "
           )
Table 1. Estimated Parameters, Standard Errors, and Wald Test Statistics for All Main Effects Model
Odds Effect Parameter Estimate SE exp(\(\beta\))* Wald p
\(\underline{P(Yi=flew\,twice)}\) Intercept \(\alpha\) 1.14 0.24 3.13 22.48 <0.001
\(P(Yi=flew\, in\, T1\, only)\) \(\delta\) Mass % \(\beta_1\) -0.02 0.01 0.67 7.71 0.005
Sex \(\beta_2\, (F=1 | M=-1)\) -0.19 0.20 0.83 0.88 0.348
Wing-to-Body \(\beta_3\,(\mu=0)\) 4.22 10.81 1.04 0.15 0.696
\(\underline{P(Yi=flew\,twice)}\) Intercept \(\alpha\) 7.18 0.20 1312.91 1340.92 <0.001
\(P(Yi=flew\, in\, T2\, only)\) \(\delta\) Mass % \(\beta_1\) 0.02 0.02 1.49 0.91 0.34
Sex \(\beta_2\, (F=1 | M=-1)\) 5.00 0.20 148.41 635.31 <0.001
Wing-to-Body \(\beta_3\,(\mu=0)\) 32.85 18.76 1.39 3.07 0.08
\(\underline{P(Yi=flew\,twice)}\) Intercept \(\alpha\) 0.20 0.17 1.22 1.38 0.241
\(P(Yi=did\, not\, fly)\) \(\delta\) Mass % \(\beta_1\) 0.02 0.01 1.49 4.59 0.032
Sex \(\beta_2\, (F=1 | M=-1)\) -0.76 0.17 0.47 21.07 <0.001
Wing-to-Body \(\beta_3\,(\mu=0)\) 28.09 9.72 1.32 8.36 0.004
\(\underline{P(Yi=flew\, in\, T1\, only)}\) Intercept \(\alpha\) -0.94 0.24 0.39 14.85 <0.001
\(P(Yi=did\, not\, fly)\) \(\delta\) Mass % \(\beta_1\) 0.04 0.01 2.23 18.1 <0.001
Sex \(\beta_2\, (F=1 | M=-1)\) -0.57 0.21 0.57 7.28 0.007
Wing-to-Body \(\beta_3\,(\mu=0)\) 23.74 12.06 1.27 3.88 0.049
\(\underline{P(Yi=flew\, in\, T2\, only)}\) Intercept \(\alpha\) -8.18 0.19 0.00 1915.51 <0.001
\(P(Yi=did\, not\, fly)\) \(\delta\) Mass % \(\beta_1\) -0.01 0.03 0.82 0.05 0.83
Sex \(\beta_2\, (F=1 | M=-1)\) -6.95 0.19 0.00 1376.55 <0.001
Wing-to-Body \(\beta_3\,(\mu=0)\) -6.60 18.79 0.93 0.12 0.726
\(\underline{P(Yi=flew\, in\, T1\, only)}\) Intercept \(\alpha\) 6.05 0.23 424.11 663.4 <0.001
\(P(Yi=flew\, in\, T2\, only)\) \(\delta\) Mass % \(\beta_1\) 0.05 0.03 2.72 3.38 0.066
Sex \(\beta_2\, (F=1 | M=-1)\) 5.19 0.22 179.47 558.52 <0.001
Wing-to-Body \(\beta_3\,(\mu=0)\) 28.78 20.22 1.34 2.02 0.155
* Instead of a 1% mass increase, which is relatively too small, or a 1 unit increase in the ratio, which is 2 magnitudes higher than approximately one standard deviation from the mean, mass and wing-to-body ratio estimates were multiplied by 20 and divided by 100, respectively, before calculating the log odds. These changes better represent experimental observation and offers a more realistic impact on the odds.

Odds

# t1 vs. did not fly
#exp(-0.01 - 0.76 + 28.09/100) # 20% increase in mass, female, and 1/20 increase in the wing2body ratio length
1/40
## [1] 0.025
exp(23.74/40-.57)
## [1] 1.023778
exp(23.74/40+.57)
## [1] 3.201118
exp(0.02*0 + 0.76 + 28.09/100)
## [1] 2.831764
min(d$wing2body) - max(d$wing2body)
## [1] -0.125717
exp(0.02*20 - 0.76 + 28.09/100) # 20% increase in mass, female, and 1/20 increase in the wing2body ratio length
## [1] 0.9239475
exp(0.02*0 + 0.76 + 28.09/100)
## [1] 2.831764
# confidence interval for wing2body ratio
exp((28.09 + 1.96*9.72)/100) 
## [1] 1.602255
exp((28.09 - 1.96*9.72)/100)
## [1] 1.094599

Predicted Probabilities

head(pp <- fitted(model))
##           0         -1            1         2
## 1 0.7470322 0.03581826 2.459149e-07 0.2171493
## 2 0.8116776 0.02845344 2.925412e-07 0.1598686
## 3 0.6983200 0.04316854 2.166785e-07 0.2585113
## 4 0.8013507 0.03101678 2.841098e-07 0.1676322
## 5 0.7663040 0.04010556 2.581516e-07 0.1935902
## 6 0.7890842 0.03734506 2.743930e-07 0.1735705
# getting those small subset plots and scales in the top right hand corner.
plot_histograms = function() {
  x = 0.48
  v <- c(x-0.13,x, 0.85, 0.90)
  par( fig=v, new=TRUE, mar=c(0,0,0,0) )
  hist(df$wing2body[Frows], col="white", main="", cex.axis=0.9) 
  text(0.64,45, labels="w2b", cex=0.9)

  x = 0.88
  v <- c(x-0.13, x, 0.85, 0.90)
  par( fig=v, new=TRUE, mar=c(0,0,0,0) )
  hist(df$wing2body[Mrows], col="white", main="", cex.axis=0.9, xlim=c(0.64,0.78)) 
  text(0.66,40, labels="w2b", cex=0.9)
}
plot_color_scale = function() {
  x = 0.98
  v <- c(x-0.025, x, 0.75, 0.86)
  par( fig=v, new=TRUE, mar=c(0,0,0,0) )
  color.bar(colorRampPalette(c("black", "grey"))(1000), 63)
}

Amazing to see what a clear impact wing2body ratio has for whether bugs are more likely to fly twice or not fly at all. Those with wing2body ratios above the mean wing2body ratio appear to be more likely to fly twice and less likely to not fly. This is true across females and males.

Females only

Baseline

df <- d %>%
  filter(!is.na(egg_diff), !is.na(mass_diff), !is.na(flew_diff), sex_c == 1)

df <- df[with(df, order(mass_diff)),]

n_tfemales = nrow(df)

df$flight_case <- relevel(as.factor(df$flight_case), ref = "0")

Barplot

Egg Case

Delta Egg Response Key
Event Encoding
laid eggs in both trials 2
laid eggs in T2 only 1
laid eggs in neither trials 0
laid eggs in T1 only -1

Baseline

df <- d %>%
  filter(!is.na(egg_case), !is.na(mass_diff), !is.na(flew_diff), sex_c == 1)

df <- df[with(df, order(mass_diff)),]

n_tfemales = nrow(df)

df$flight_case <- relevel(as.factor(df$flight_case), ref = "0")

Null model

df$flight_case = droplevels(df$flight_case) # no female bug only flew in T2
null <- multinom(flight_case ~ 1, data = df) 
## # weights:  6 (2 variable)
## initial  value 102.170943 
## final  value 93.055466 
## converged

Multinom model: flew_diff ~ egg_case

model <- multinom(flight_case ~ egg_case, data = df) 
## # weights:  9 (4 variable)
## initial  value 102.170943 
## final  value 84.114906 
## converged
model_table = calculate_P(model)
## Warning in cbind(s$coefficients, c(s$edf, s$edf, s$edf), s$standard.errors[, :
## number of rows of result is not a multiple of vector length (arg 2)
## 
##  AIC:  176.2298 
##    (Intercept)  Estimate DF Std. Err.      z   wald P > |z|
## -1      -0.194    -0.656  4     0.345 -1.902  3.616   0.057
## 2        0.697    -1.189  4     0.314 -3.782 14.306   0.000
anova(null, model, test="Chisq") # adding egg_case improves fit
## Likelihood ratio tests of Multinomial Models
## 
## Response: flight_case
##      Model Resid. df Resid. Dev   Test    Df LR stat.      Pr(Chi)
## 1        1       184   186.1109                                   
## 2 egg_case       182   168.2298 1 vs 2     2 17.88112 0.0001309677

Significant models

## Warning in cbind(s$coefficients, c(s$edf, s$edf, s$edf), s$standard.errors[, :
## number of rows of result is not a multiple of vector length (arg 2)

## Warning in cbind(s$coefficients, c(s$edf, s$edf, s$edf), s$standard.errors[, :
## number of rows of result is not a multiple of vector length (arg 2)

## Warning in cbind(s$coefficients, c(s$edf, s$edf, s$edf), s$standard.errors[, :
## number of rows of result is not a multiple of vector length (arg 2)

Comparing models: adding host

Adding host with egg_diff and mass_diff

Host Plant Key
Host Encoding
Golden Rain Tree (GRT) 1
Balloon Vine (BV) -1
data <- data.frame(R = df$flight_case, 
         A = df$egg_case,
         B = df$mass_diff,
         C = df$host_c)
source("src/compare_models.R")
model_comparisonsAIC("src/generic multinomial models- multinom 1RF + 3 FF.R")
##        [,1]      [,2]      [,3]      [,4]       [,5]      
## AICs   165.1826  166.6029  167.0497  167.9211   167.9513  
## models 7         13        4         11         16        
## probs  0.3572256 0.1756085 0.1404487 0.09084268 0.08948043
## 
## m7   multinom(formula = R ~ A + B + C, data = data, trace = FALSE)
## m13  multinom(formula = R ~ B * C + A, data = data, trace = FALSE)
## m4   multinom(formula = R ~ A + B, data = data, trace = FALSE)
## m11  multinom(formula = R ~ A * B + C, data = data, trace = FALSE)
## m16  multinom(formula = R ~ B * C + A * B, data = data, trace = FALSE)
anova(m4, m7, test="Chisq") # Adding C does not improve fit
## Likelihood ratio tests of Multinomial Models
## 
## Response: R
##       Model Resid. df Resid. Dev   Test    Df LR stat.    Pr(Chi)
## 1     A + B       180   155.0497                                 
## 2 A + B + C       178   149.1826 1 vs 2     2  5.86705 0.05320915
anova(m7, m13, test="Chisq") # Adding  mass_diff*host does not improve fit
## Likelihood ratio tests of Multinomial Models
## 
## Response: R
##       Model Resid. df Resid. Dev   Test    Df LR stat.   Pr(Chi)
## 1 A + B + C       178   149.1826                                
## 2 B * C + A       176   146.6029 1 vs 2     2 2.579779 0.2753012
#model <- multinom(flight_case ~ mass_diff + egg_case, data = df) 
df <- df[with(df, order(mass_per)),]
model <- multinom(flight_case ~ mass_per + egg_case, data = df) 
## # weights:  12 (6 variable)
## initial  value 102.170943 
## iter  10 value 76.802714
## final  value 76.802689 
## converged
model_table = calculate_P2(model, "mass_per", "egg_case")
## Warning in cbind(s$coefficients, c(s$edf, s$edf, s$edf), s$standard.errors[, :
## number of rows of result is not a multiple of vector length (arg 2)
## 
##  AIC:  165.6054 
##    (Intercept) mass_per egg_case DF   SEi   SE1   SE2     zi    z1     z2 waldi
## -1      -0.950    0.041   -0.533  6 0.617 0.012 0.380 -1.539 3.389 -1.402 2.370
## 2        0.406    0.022   -1.098  6 0.424 0.011 0.297  0.957 2.038 -3.700 0.917
##     wald1  wald2 Pi > |z| P1 > |z| P2 > |z|
## -1 11.488  1.966    0.124    0.001    0.161
## 2   4.154 13.693    0.338    0.042    0.000

Odds

exp(0.02*20 + 1.10) # egg case = -1
## [1] 4.481689
exp(0.02*20 - 1.10) # egg case = 1
## [1] 0.4965853
exp(0.02*20) # egg case = 0
## [1] 1.491825
exp(0.02*20 - 1.10*2) # egg case = 2
## [1] 0.1652989

Significant eqs

run_multinom_model = function(d) {
  m <- multinom(flight_case ~ mass_per + egg_case, trace=FALSE, data = d) 
  model_table = calculate_P2(m, "mass_per", "egg_case", print_table=FALSE)
  return(model_table)
}
par(mfrow=c(1,2)) 
ML_eqs = get_significant_modelsf(15) # mass_per 
## Warning in cbind(s$coefficients, c(s$edf, s$edf, s$edf), s$standard.errors[, :
## number of rows of result is not a multiple of vector length (arg 2)

## Warning in cbind(s$coefficients, c(s$edf, s$edf, s$edf), s$standard.errors[, :
## number of rows of result is not a multiple of vector length (arg 2)

## Warning in cbind(s$coefficients, c(s$edf, s$edf, s$edf), s$standard.errors[, :
## number of rows of result is not a multiple of vector length (arg 2)
ML_eqs = get_significant_modelsf(16) # egg_case
## Warning in cbind(s$coefficients, c(s$edf, s$edf, s$edf), s$standard.errors[, :
## number of rows of result is not a multiple of vector length (arg 2)

## Warning in cbind(s$coefficients, c(s$edf, s$edf, s$edf), s$standard.errors[, :
## number of rows of result is not a multiple of vector length (arg 2)

## Warning in cbind(s$coefficients, c(s$edf, s$edf, s$edf), s$standard.errors[, :
## number of rows of result is not a multiple of vector length (arg 2)

# 2 vs. -1
Odds = c("$\\underline{P(Yi=flew\\,twice)}$", "$P(Yi=flew\\, in\\, T1\\, only)$", " ")
effect = c("Intercept", "$\\delta$ Mass %", "Egg Response")
pars = c("$\\alpha$", "$\\beta_1$", "$\\beta_2$")
key1 = getting_oddsf(Odds, effect, pars, ML_eqs[[1]], 2) 

# 2 vs. 0
Odds = c("$\\underline{P(Yi=flew\\,twice)}$", "$P(Yi=did\\, not\\, fly)$", " ")
key2 = getting_oddsf(Odds, effect, pars, ML_eqs[[2]], 2) 

# -1 vs. 0
Odds = c("$\\underline{P(Yi=flew\\, in\\, T1\\, only)}$", "$P(Yi=did\\, not\\, fly)$", " ")
key3 = getting_oddsf(Odds, effect, pars, ML_eqs[[2]], 1) 

key = rbind(key1, key2, key3)
rownames(key)<-NULL
colnames(key) = c("Odds","Effect", "Parameter", "Estimate", "SE", "exp($\\beta$)*", "Wald", "p")
kable(key, caption="Table 2. Estimated Parameters, Standard Errors, and Wald Test Statistics for All Main Effects Model") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))  %>%
  column_spec(1, width="6cm", background="white") %>%
  kable_classic(html_font = "Cambria") %>%
  row_spec(c(3,6,9), extra_css = "border-bottom: 1px solid")  %>%
  footnote(general = " * Instead of a 1% mass increase, which is relatively too small, mass percent estimates were multiplied by 20 before calculating the log odds. These changes better represent experimental observations and offers a more realistic odds.",
           general_title = " ", number_title = "* "
           )
Table 2. Estimated Parameters, Standard Errors, and Wald Test Statistics for All Main Effects Model
Odds Effect Parameter Estimate SE exp(\(\beta\))* Wald p
\(\underline{P(Yi=flew\,twice)}\) Intercept \(\alpha\) 1.36 0.57 3.90 5.64 0.018
\(P(Yi=flew\, in\, T1\, only)\) \(\delta\) Mass % \(\beta_1\) -0.02 0.01 0.67 2.81 0.094
Egg Response \(\beta_2\) -0.56 0.38 0.57 2.18 0.14
\(\underline{P(Yi=flew\,twice)}\) Intercept \(\alpha\) 0.41 0.42 1.51 0.92 0.338
\(P(Yi=did\, not\, fly)\) \(\delta\) Mass % \(\beta_1\) 0.02 0.01 1.49 4.15 0.042
Egg Response \(\beta_2\) -1.10 0.30 0.33 13.69 <0.001
\(\underline{P(Yi=flew\, in\, T1\, only)}\) Intercept \(\alpha\) -0.95 0.62 0.39 2.37 0.124
\(P(Yi=did\, not\, fly)\) \(\delta\) Mass % \(\beta_1\) 0.04 0.01 2.23 11.49 0.001
Egg Response \(\beta_2\) -0.53 0.38 0.59 1.97 0.161
* Instead of a 1% mass increase, which is relatively too small, mass percent estimates were multiplied by 20 before calculating the log odds. These changes better represent experimental observations and offers a more realistic odds.
tb = model_table
# -1 / 2 | Flew in T1, not T2
I = (tb[1,1] - tb[2,1])
M = (tb[1,2] - tb[2,2])
E = (tb[1,3] - tb[2,3])
EQ1 = paste0("log(pi_-1 / pi_2) = ", round(I, 2), " + ", round(M,2), " Mass %", " + ", round(E, 2), " Egg Case", "     Flew in T1, rather than both" )
EQ1  # mass_dff** | sex not
## [1] "log(pi_-1 / pi_2) = -1.36 + 0.02 Mass % + 0.56 Egg Case     Flew in T1, rather than both"

Predicted Probabilities

head(pp <- fitted(model))
##           0         -1          2
## 1 0.3182776 0.04652361 0.63519881
## 2 0.7833277 0.04039434 0.17627792
## 3 0.9015654 0.02877502 0.06965959
## 4 0.7751740 0.04311454 0.18171149
## 5 0.7496332 0.05212861 0.19823820
## 6 0.2764594 0.05730902 0.66623157

df <- df[with(df, order(mass_diff)),]
matrix = rbind(cbind(df$mass_diff[eggT1_rows], 
            pp[eggT1_rows,1],pp[eggT1_rows,2], pp[eggT1_rows,3]),
      cbind(df$mass_diff[egg_0rows], pp[egg_0rows,1], pp[egg_0rows,2],
            pp[egg_0rows,3]), 
      cbind(df$mass_diff[egg_2rows], pp[egg_2rows,1], pp[egg_2rows,2],
            pp[egg_2rows,3]),
      cbind(df$mass_diff[eggT2_rows], pp[eggT2_rows,1], pp[eggT2_rows,2], 
            pp[eggT2_rows,3]))
colnames(matrix) = c("mass_diff", "No", "T1", "Twice")
#write.csv(matrix, "data/animate_eggs.csv") # for interactive graphs

df <- df[with(df, order(mass_per)),]
matrix = rbind(cbind(df$mass_per[eggT1_rows], 
            pp[eggT1_rows,1],pp[eggT1_rows,2], pp[eggT1_rows,3]),
      cbind(df$mass_per[egg_0rows], pp[egg_0rows,1], pp[egg_0rows,2],
            pp[egg_0rows,3]), 
      cbind(df$mass_per[egg_2rows], pp[egg_2rows,1], pp[egg_2rows,2],
            pp[egg_2rows,3]),
      cbind(df$mass_per[eggT2_rows], pp[eggT2_rows,1], pp[eggT2_rows,2], 
            pp[eggT2_rows,3]))
colnames(matrix) = c("mass_per", "No", "T1", "Twice")
#write.csv(matrix, "data/animate_eggs-perc.csv") # for interactive graphs
  • Eggs laid twice: most likely to not fly unless gain more than about 0.025 g mass and then fly only in T1 (largest mass change)
  • Eggs laid in T2: most likely to not fly (2nd widest mass change)
  • Eggs laid only in T1: most likely to fly twice (mostly only lost mass)
  • Eggs not laid: most likely to fly twice always (smallest mass change and mostly only gained about 0.030 g)

wing2body

Baseline

df <- d %>%
  filter(!is.na(egg_case), !is.na(mass_diff), !is.na(flew_diff), !is.na(wing2body), sex_c == 1)

df <- df[with(df, order(mass_diff)),]

n_tfemales = nrow(df)

df$flight_case <- relevel(as.factor(df$flight_case), ref = "0")
data <- data.frame(R = df$flight_case, 
         A = df$egg_case,
         B = df$mass_diff,
         C = df$wing2body)
source("src/compare_models.R")
model_comparisonsAIC("src/generic multinomial models- multinom 1RF + 3 FF.R") # strange error...
##        [,1]      [,2]      [,3]      [,4]      [,5]       [,6]      
## AICs   165.9791  167.0497  168.793   169.6042  169.9652   169.9733  
## models 7         4         12        11        13         8         
## probs  0.4170701 0.2441983 0.1021376 0.0680834 0.05683876 0.05661042
## 
## m7   multinom(formula = R ~ A + B + C, data = data, trace = FALSE)
## m4   multinom(formula = R ~ A + B, data = data, trace = FALSE)
## m12  multinom(formula = R ~ A * C + B, data = data, trace = FALSE)
## m11  multinom(formula = R ~ A * B + C, data = data, trace = FALSE)
## m13  multinom(formula = R ~ B * C + A, data = data, trace = FALSE)
## m8   multinom(formula = R ~ A * B, data = data, trace = FALSE)
anova(m4, m7, test="Chisq") # adding wing2body does not include fit
## Likelihood ratio tests of Multinomial Models
## 
## Response: R
##       Model Resid. df Resid. Dev   Test    Df LR stat.    Pr(Chi)
## 1     A + B       180   155.0497                                 
## 2 A + B + C       178   149.9791 1 vs 2     2 5.070548 0.07924002
anova(m7, m12, test="Chisq") # Adding A*C does not improve fit
## Likelihood ratio tests of Multinomial Models
## 
## Response: R
##       Model Resid. df Resid. Dev   Test    Df LR stat.   Pr(Chi)
## 1 A + B + C       178   149.9791                                
## 2 A * C + B       176   148.7930 1 vs 2     2 1.186134 0.5526299
anova(m7, m11, test="Chisq") 
## Likelihood ratio tests of Multinomial Models
## 
## Response: R
##       Model Resid. df Resid. Dev   Test    Df  LR stat.   Pr(Chi)
## 1 A + B + C       178   149.9791                                 
## 2 A * B + C       176   149.6042 1 vs 2     2 0.3749583 0.8290464
anova(m7, m13, test="Chisq") 
## Likelihood ratio tests of Multinomial Models
## 
## Response: R
##       Model Resid. df Resid. Dev   Test    Df   LR stat.   Pr(Chi)
## 1 A + B + C       178   149.9791                                  
## 2 B * C + A       176   149.9652 1 vs 2     2 0.01392852 0.9930599
anova(m4, m8, test="Chisq") # Adding A*B does not improve fit
## Likelihood ratio tests of Multinomial Models
## 
## Response: R
##   Model Resid. df Resid. Dev   Test    Df LR stat.   Pr(Chi)
## 1 A + B       180   155.0497                                
## 2 A * B       178   153.9733 1 vs 2     2 1.076425 0.5837908

wing2body is not in the top model for females

Encodings

Delta Flight Response Key
Event Encoding
flew in both trials 2
flew in T2 only 1
flew in neither trials 0
flew in T1 only -1
Delta Mass Key
Event Sign
gained mass from T1 to T2 pos
no mass change between trails 0
lost mass from T1 to T2 neg
Host Plant Key
Host Encoding
Golden Rain Tree (GRT) 1
Balloon Vine (BV) -1

Females Only

Delta Flight Response Key
Event Encoding
flew in both trials 2
flew in neither trial 0
flew in T1 only -1
Delta Egg Response Key
Event Encoding
laid eggs in both trials 2
laid eggs in T2 only 1
laid eggs in neither trials 0
laid eggs in T1 only -1

Summary

flight case ~ mass per (all bugs)

flight case ~ mass per and sex

wing2body

females only

flight case ~ mass diff and egg case

Questions

  • Egg case and egg diff not factors. Not sure how to work around this problem.