TO DO

Review analysis google doc
Finish mean comparisons for Relative Feed Quality
Finish adding in Jake’s code
find sources for Jake
Multidimensionsal scaling for nutrient content and then impose treatments as environmental effects
Do we need to alter the nutrient values by the percent dry matter of the samples?

Overview of missing and problematic data

Here we analyze data from the Forage Timing Experiment.

missing data
The data for both fall harvests of at the R70 site in 2018 are missing from the dataset
The data from some plots where the spring harvest was at the grain stage is missing from the dataset

problematic data
The IWG at the I2 site was 6 years old and did not grow back after the first spring harvest. Of the data that was collected for the spring harvest in 2017 (environment=I2.2017), the data had a negative relationship between accumulated GDD and yield, unlike all other environments. Considering the I2.2017 IWG did not grow back after the spring harvest and the negative relationship between GDD and yield in 2017, its very likely the old age of this site caused it to respond very differently than the two other sites. It was decided to remove the limited data from I2 from the dataset. In some instances, the I2 data is presented just to illustrate the difference of its performance.

outliers

Overview of key analysis questions

We need to discuss and decide on a consistent methodology for how to classify and model the factors in this experiment.

From my analysis it seems like this experiment is stuck in-between (1) not being consistent enough to treat site-years as fixed effects without interactions slicing everything up and (2) not having enough data points to treat environments as a random effect while at the same time feeling comfortable that all the variance is accounted for in our small sample size (ideally we would want n=20 for environments; https://www.muscardinus.be/2018/09/number-random-effect-levels/).

I was taught to treat factors with less than 4 levels like site, year, and field.year as fixed effects because they are not likely generalizable to all sites, all years, all field.years. As fixed effects, they must be included in analyses as a factorial and if interactions are detected, the data must be sliced until interactions are removed in order to discuss main effects. The only random effect is typically block nested within site which helps account for the spatial variation within a field by letting the y intercept vary depending on which block the plots were located. In this approach, we also need to account for the repeated measures of plots within the same site across years and sometimes within year, by making plot the subject and a repeated measure statement. Another option within this approach is to combine site-year into an environment and treat environment that as a fixed effect. I’ve seen this in other papers and it would reduce the amount of interaction testing on the dataset which reduces our liklihood of a type I error. A factorial combination of environment with field.year can be used as well if there is a strong reason to believe the response would differ between field.years.

However, this approach of treating factors as fixed effects may not be appropriate for the data and objectives of this study. I think the objective of this study is to provide a starter or preliminary dataset with respect to how forage yield, forage quality and net $ returns can be maximized by optimizing when the first cut is taken and whether to take a follow-up cut. Such decisions are likely going to vary a lot depending on the weather of the given year and the site conditions if IWG behaves like any other forage (which it probably does). Therefore, for our study to robustly answer this question, it would require a lot of different environments and a variety of management approaches to be generalizable to a given IWG grower. We did not have the resources to conduct such a robust study, but we did get data for two sites over 3 years = 4 solid environment datasets. I tried the first option of large factorials of fixed effects and unsuprisingly there were lots of interactions, creating a complicated story for the reader and conclusions that likely aren’t true.

One quick fix could be to decrease the alpha level, thereby reducing our liklihood for Type I errrors. One drawback is that this could confuse the reader because it’s not very common.

Another approach could be to try to generalize our 4 environments to all environments by making them a random effect. Thinking back on our objectives for this study, we’re not interested in trying to go through and explain each interaction and effect due to the long list of possibile explanations with no real good way to parse out which. We’re trying to generalize and this would let us do that, but with the caveat that we need about double the amount of data to feel confident. If we construct a model where each environment can be allowed to have (1) a different starting point because of random effects from that environment and (2) allow the effect of our treatment of interest (timing.1cut for example) vary in its intensity within each environment due to the random effects from that environment; THEN we could estimate whether there is a difference due to a singular fixed effect despite that effect receiving random effects from the environment which could both have impacted the intercept and slope of the line within each environment across levels of our fixed effect factor variable of interest. In this scenario, we’d be operating under the statistical assumption that our environments captured the expected variation we’d expect to see if we replicated the study numerous times, which we’d want to explicitely discuss as a limitation. IF we still found a difference, we could suggest there’s sufficient reason to believe that an actual difference does exist and the addition of more environments would be critical to further clarify this effect. Ultimately, I think this would provide a clearer presentation of the data to the reader.

case study: yield.1cut vs. timing.1cut

Below is some code that would show a sample mean comparison analysis between the spring forage yield and the timing of the cut for the spring forage yield. In this example, we are comparing the approach where we use analyze factorials of fixed effects and must slice the data and the approach where we employ random effects to allow for generalization. Note that I am switching from the nlme package to the lmer package due to the latter’s ability to handle multiple random effects

fixeff <- lmer(yield.1cut~timing.1cut*site*field.year +
                (1|site/block), 
              dat=dat1,
              REML=FALSE)
isSingular(fixeff)

## [1] TRUE

summary(fixeff)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: yield.1cut ~ timing.1cut * site * field.year + (1 | site/block)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
##   2359.7   2404.1  -1164.8   2329.7      128 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.9763 -0.4022 -0.0133  0.4581  4.2277 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  block:site (Intercept) 115185   339.4   
##  site       (Intercept)      0     0.0   
##  Residual               642505   801.6   
## Number of obs: 143, groups:  block:site, 8; site, 2
## 
## Fixed effects:
##                                               Estimate Std. Error t value
## (Intercept)                                     3688.3      286.9  12.853
## timing.1cutanthesis                            -1039.7      327.2  -3.177
## timing.1cutdough                                -221.9      327.2  -0.678
## siteR100                                       -1341.2      405.8  -3.305
## field.yearsecond                               -2960.2      327.2  -9.046
## timing.1cutanthesis:siteR100                    2336.6      462.8   5.049
## timing.1cutdough:siteR100                       2794.8      462.8   6.039
## timing.1cutanthesis:field.yearsecond            1745.1      462.8   3.771
## timing.1cutdough:field.yearsecond               1324.6      462.8   2.862
## siteR100:field.yearsecond                       3358.8      462.8   7.258
## timing.1cutanthesis:siteR100:field.yearsecond   1304.2      654.5   1.993
## timing.1cutdough:siteR100:field.yearsecond     -2177.9      658.3  -3.308
## 
## Correlation of Fixed Effects:
##                  (Intr) tmng.1ctn tmng.1ctd stR100 fld.yr tmng.1ctn:R100
## tmng.1ctnth      -0.570                                                 
## tmng.1ctdgh      -0.570  0.500                                          
## siteR100         -0.707  0.403     0.403                                
## fild.yrscnd      -0.570  0.500     0.500     0.403                      
## tmng.1ctn:R100    0.403 -0.707    -0.354    -0.570 -0.354               
## tmng.1ctd:R100    0.403 -0.354    -0.707    -0.570 -0.354  0.500        
## tmng.1ctn:.       0.403 -0.707    -0.354    -0.285 -0.707  0.500        
## tmng.1ctd:.       0.403 -0.354    -0.707    -0.285 -0.707  0.250        
## stR100:fld.       0.403 -0.354    -0.354    -0.570 -0.707  0.500        
## tmng.1ctn:R100:. -0.285  0.500     0.250     0.403  0.500 -0.707        
## tmng.1ctd:R100:. -0.283  0.249     0.497     0.401  0.497 -0.351        
##                  tmng.1ctd:R100 tmng.1ctn:. tmng.1ctd:. sR100: tmng.1ctn:R100:.
## tmng.1ctnth                                                                    
## tmng.1ctdgh                                                                    
## siteR100                                                                       
## fild.yrscnd                                                                    
## tmng.1ctn:R100                                                                 
## tmng.1ctd:R100                                                                 
## tmng.1ctn:.       0.250                                                        
## tmng.1ctd:.       0.500          0.500                                         
## stR100:fld.       0.500          0.500       0.500                             
## tmng.1ctn:R100:. -0.354         -0.707      -0.354      -0.707                 
## tmng.1ctd:R100:. -0.703         -0.351      -0.703      -0.703  0.497          
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular

fixeff.2 <- lmer(yield.1cut~timing.1cut*site*field.year +
                (1|site:block),  #using interaction rather than nesting syntax eliminates warning
              dat=dat1,
              REML=FALSE)
summary(fixeff.2)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: yield.1cut ~ timing.1cut * site * field.year + (1 | site:block)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
##   2357.7   2399.1  -1164.8   2329.7      129 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.9763 -0.4022 -0.0133  0.4581  4.2277 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  site:block (Intercept) 115185   339.4   
##  Residual               642505   801.6   
## Number of obs: 143, groups:  site:block, 8
## 
## Fixed effects:
##                                               Estimate Std. Error t value
## (Intercept)                                     3688.3      286.9  12.853
## timing.1cutanthesis                            -1039.7      327.2  -3.177
## timing.1cutdough                                -221.9      327.2  -0.678
## siteR100                                       -1341.2      405.8  -3.305
## field.yearsecond                               -2960.2      327.2  -9.046
## timing.1cutanthesis:siteR100                    2336.6      462.8   5.049
## timing.1cutdough:siteR100                       2794.8      462.8   6.039
## timing.1cutanthesis:field.yearsecond            1745.1      462.8   3.771
## timing.1cutdough:field.yearsecond               1324.6      462.8   2.862
## siteR100:field.yearsecond                       3358.8      462.8   7.258
## timing.1cutanthesis:siteR100:field.yearsecond   1304.2      654.5   1.993
## timing.1cutdough:siteR100:field.yearsecond     -2177.9      658.3  -3.308
## 
## Correlation of Fixed Effects:
##                  (Intr) tmng.1ctn tmng.1ctd stR100 fld.yr tmng.1ctn:R100
## tmng.1ctnth      -0.570                                                 
## tmng.1ctdgh      -0.570  0.500                                          
## siteR100         -0.707  0.403     0.403                                
## fild.yrscnd      -0.570  0.500     0.500     0.403                      
## tmng.1ctn:R100    0.403 -0.707    -0.354    -0.570 -0.354               
## tmng.1ctd:R100    0.403 -0.354    -0.707    -0.570 -0.354  0.500        
## tmng.1ctn:.       0.403 -0.707    -0.354    -0.285 -0.707  0.500        
## tmng.1ctd:.       0.403 -0.354    -0.707    -0.285 -0.707  0.250        
## stR100:fld.       0.403 -0.354    -0.354    -0.570 -0.707  0.500        
## tmng.1ctn:R100:. -0.285  0.500     0.250     0.403  0.500 -0.707        
## tmng.1ctd:R100:. -0.283  0.249     0.497     0.401  0.497 -0.351        
##                  tmng.1ctd:R100 tmng.1ctn:. tmng.1ctd:. sR100: tmng.1ctn:R100:.
## tmng.1ctnth                                                                    
## tmng.1ctdgh                                                                    
## siteR100                                                                       
## fild.yrscnd                                                                    
## tmng.1ctn:R100                                                                 
## tmng.1ctd:R100                                                                 
## tmng.1ctn:.       0.250                                                        
## tmng.1ctd:.       0.500          0.500                                         
## stR100:fld.       0.500          0.500       0.500                             
## tmng.1ctn:R100:. -0.354         -0.707      -0.354      -0.707                 
## tmng.1ctd:R100:. -0.703         -0.351      -0.703      -0.703  0.497

anova(fixeff, fixeff.2) #both models are similar

## Data: dat1
## Models:
## fixeff.2: yield.1cut ~ timing.1cut * site * field.year + (1 | site:block)
## fixeff: yield.1cut ~ timing.1cut * site * field.year + (1 | site/block)
##          npar    AIC    BIC  logLik deviance Chisq Df Pr(>Chisq)
## fixeff.2   14 2357.7 2399.1 -1164.8   2329.7                    
## fixeff     15 2359.7 2404.1 -1164.8   2329.7     0  1          1

car::Anova(fixeff.2) #interactions everywhere

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: yield.1cut
##                                Chisq Df Pr(>Chisq)    
## timing.1cut                  86.2665  2  < 2.2e-16 ***
## site                         47.6429  1  5.114e-12 ***
## field.year                    9.0218  1   0.002668 ** 
## timing.1cut:site             84.0794  2  < 2.2e-16 ***
## timing.1cut:field.year       64.5220  2  9.755e-15 ***
## site:field.year             132.0423  1  < 2.2e-16 ***
## timing.1cut:site:field.year  28.5163  2  6.423e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

fixeff.3 <- lm(yield.1cut~timing.1cut*site*field.year, 
              dat=dat1)
anova(fixeff, fixeff.3)

## Data: dat1
## Models:
## fixeff.3: yield.1cut ~ timing.1cut * site * field.year
## fixeff: yield.1cut ~ timing.1cut * site * field.year + (1 | site/block)
##          npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)   
## fixeff.3   13 2367.7 2406.2 -1170.8   2341.7                        
## fixeff     15 2359.7 2404.1 -1164.8   2329.7 12.046  2   0.002423 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

car::Anova(fixeff) #interactions everywhere

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: yield.1cut
##                                Chisq Df Pr(>Chisq)    
## timing.1cut                  86.2665  2  < 2.2e-16 ***
## site                         47.6429  1  5.114e-12 ***
## field.year                    9.0218  1   0.002668 ** 
## timing.1cut:site             84.0794  2  < 2.2e-16 ***
## timing.1cut:field.year       64.5220  2  9.755e-15 ***
## site:field.year             132.0423  1  < 2.2e-16 ***
## timing.1cut:site:field.year  28.5163  2  6.423e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

#I replicated this code where data=dat to include I2 and grain cuts and got the same conclusions

Now let’s try the generalizable approach with random effects

## Data: dat1
## Models:
## raneff.0: yield.1cut ~ timing.1cut + (1 | site/block)
## raneff.1: yield.1cut ~ timing.1cut + (1 | site/block) + (1 | env)
##          npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## raneff.0    6 2513.4 2531.2 -1250.7   2501.4                         
## raneff.1    7 2473.7 2494.4 -1229.8   2459.7 41.751  1  1.037e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## Linear mixed model fit by REML ['lmerMod']
## Formula: yield.1cut ~ timing.1cut + (1 | site/block) + (1 | env)
##    Data: dat1
## 
## REML criterion at convergence: 2419
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.5931 -0.6302 -0.1960  0.2959  4.3822 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  block:site (Intercept)  109833   331.4  
##  env        (Intercept) 1228371  1108.3  
##  site       (Intercept) 1170471  1081.9  
##  Residual               1532354  1237.9  
## Number of obs: 143, groups:  block:site, 8; env, 4; site, 2
## 
## Fixed effects:
##                     Estimate Std. Error t value
## (Intercept)           2377.3      968.5   2.455
## timing.1cutanthesis   1327.1      252.7   5.252
## timing.1cutdough      1310.7      254.1   5.158
## 
## Correlation of Fixed Effects:
##             (Intr) tmng.1ctn
## tmng.1ctnth -0.130          
## tmng.1ctdgh -0.130  0.497

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: yield.1cut
##              Chisq Df Pr(>Chisq)    
## timing.1cut 36.202  2  1.377e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##  timing.1cut emmean  SE   df lower.CL upper.CL .group
##  boot          2377 968 1.05    -8702    13457  1    
##  dough         3688 969 1.05    -7358    14734   2   
##  anthesis      3704 968 1.05    -7375    14784   2   
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 3 estimates 
## significance level used: alpha = 0.05

These simple random effect models conclude that yield.1cut differs among timing.1cut treatments with boot cuts lower than dough and anthesis. Plausible conclusion given the data. But are these models the best fitted?

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## unable to evaluate scaled gradient

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge: degenerate Hessian with 1 negative eigenvalues

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## unable to evaluate scaled gradient

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge: degenerate Hessian with 1 negative eigenvalues

## Data: dat1
## Models:
## raneff.0: yield.1cut ~ timing.1cut + (1 | site/block)
## raneff.1: yield.1cut ~ timing.1cut + (1 | site/block) + (1 | env)
## raneff.4: yield.1cut ~ timing.1cut + (1 | site:block) + (1 + timing.1cut | 
## raneff.4:     env)
## raneff.2: yield.1cut ~ timing.1cut + (1 | site/block) + (1 | env) + (timing.1cut | 
## raneff.2:     env)
## raneff.3: yield.1cut ~ timing.1cut + (1 | site/block) + (1 | env) + (1 + 
## raneff.3:     timing.1cut | env)
##          npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## raneff.0    6 2513.4 2531.2 -1250.7   2501.4                         
## raneff.1    7 2473.7 2494.4 -1229.8   2459.7 41.751  1  1.037e-10 ***
## raneff.4   11 2398.7 2431.2 -1188.3   2376.7 83.030  4  < 2.2e-16 ***
## raneff.2   13 2402.7 2441.2 -1188.3   2376.7  0.000  2          1    
## raneff.3   13 2402.7 2441.2 -1188.3   2376.7  0.000  0               
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## Data: dat1
## Models:
## raneff.1: yield.1cut ~ timing.1cut + (1 | site/block) + (1 | env)
## raneff.4: yield.1cut ~ timing.1cut + (1 | site:block) + (1 + timing.1cut | 
## raneff.4:     env)
##          npar    AIC    BIC  logLik deviance Chisq Df Pr(>Chisq)    
## raneff.1    7 2473.7 2494.4 -1229.8   2459.7                        
## raneff.4   11 2398.7 2431.2 -1188.3   2376.7 83.03  4  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: yield.1cut ~ timing.1cut + (1 | site:block) + (1 + timing.1cut |  
##     env)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
##   2398.7   2431.3  -1188.3   2376.7      132 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.7313 -0.4312 -0.0479  0.4326  4.1061 
## 
## Random effects:
##  Groups     Name                Variance Std.Dev. Corr       
##  site:block (Intercept)          158796   398.5              
##  env        (Intercept)         1111924  1054.5              
##             timing.1cutanthesis 3660026  1913.1   -0.12      
##             timing.1cutdough     923590   961.0   -0.36  0.64
##  Residual                        694046   833.1              
## Number of obs: 143, groups:  site:block, 8; env, 4
## 
## Fixed effects:
##                     Estimate Std. Error t value
## (Intercept)           2377.3      558.8   4.254
## timing.1cutanthesis   1327.1      971.6   1.366
## timing.1cutdough      1293.9      510.1   2.537
## 
## Correlation of Fixed Effects:
##             (Intr) tmng.1ctn
## tmng.1ctnth -0.135          
## tmng.1ctdgh -0.367  0.625

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: yield.1cut ~ timing.1cut + (1 | site/block) + (1 | env)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
##   2473.7   2494.4  -1229.8   2459.7      136 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.5940 -0.6360 -0.2082  0.2910  4.4203 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  block:site (Intercept)  111203   333.5  
##  env        (Intercept) 1228844  1108.5  
##  site       (Intercept)  253234   503.2  
##  Residual               1509289  1228.5  
## Number of obs: 143, groups:  block:site, 8; env, 4; site, 2
## 
## Fixed effects:
##                     Estimate Std. Error t value
## (Intercept)           2377.3      692.2   3.434
## timing.1cutanthesis   1327.1      250.8   5.292
## timing.1cutdough      1310.4      252.2   5.196
## 
## Correlation of Fixed Effects:
##             (Intr) tmng.1ctn
## tmng.1ctnth -0.181          
## tmng.1ctdgh -0.180  0.497

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: yield.1cut
##              Chisq Df Pr(>Chisq)  
## timing.1cut 4.8862  2    0.08689 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: yield.1cut
##              Chisq Df Pr(>Chisq)  
## timing.1cut 6.5149  2    0.03849 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##  timing.1cut emmean   SE   df lower.CL upper.CL .group
##  boot          2377  636 5.41      778     3977  1    
##  dough         3671  691 5.49     1942     5401  1    
##  anthesis      3704 1215 5.42      651     6757  1    
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 3 estimates 
## significance level used: alpha = 0.05

## $lsmeans
##  timing.1cut lsmean   SE   df lower.CL upper.CL
##  boot          2377  636 5.41      778     3977
##  anthesis      3704 1215 5.42      651     6757
##  dough         3671  691 5.49     1942     5401
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast         estimate   SE   df t.ratio p.value
##  boot - anthesis   -1327.1 1122 5.34 -1.183  0.5088 
##  boot - dough      -1293.9  589 5.33 -2.198  0.1593 
##  anthesis - dough     33.3  883 5.34  0.038  0.9992 
## 
## Degrees-of-freedom method: kenward-roger 
## P value adjustment: tukey method for comparing a family of 3 estimates

## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: lmer(formula = yield.1cut ~ timing.1cut + (1 | site:block) + 
##     (1 + timing.1cut | env), data = dat1, REML = FALSE)
## 
## Linear Hypotheses:
##                       Estimate Std. Error z value Pr(>|z|)  
## anthesis - boot == 0   1327.14     971.56   1.366   0.3346  
## dough - boot == 0      1293.89     510.05   2.537   0.0271 *
## dough - anthesis == 0   -33.26     764.39  -0.044   0.9989  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

These more complex random effect models have approximately half the amount of residual variance as the simple random effect model and we can conclude that are better due to a significant difference in the model fit statistics. We select the best of the more compelx random effect models and then switch back to using REML because I’ve read it’s more robust which is why its the default for the lmer package. Our model concludes there are no differences between the the timing.1cut treatments.

Interestingly, if we use the maximum liklihood criterion, then there is a difference detected between the boot cut and the dough cuts. I suspect this difference is also plausible.

So which analysis approach is the most appropaite for this experiment?

I’d argue that the more complex random effect model that failed to reject the null hypothesis is the most conservative and will present the clearest story. Providing additional discussions about the sensitivity our data to having differences among treatments being able to be concluded through variations of the testing would be important to present as well.

Using this generalizable approach, we’d conclude that we can’t detect a difference in the spring forage yield among different timing cuts. Looking at the boxplot below, I’d say this agrees with my assessment of the responses of the environments. While it looks like there’s a nice strong trend in R100.2018 and R70.2018, we’ll see later that 2018 has below average rainfall before the boot harvest which likely depressed those yields, giving the impression that yield was due to increasing physiological maturity when its more likely that it was catching up after water stress.

We need to agree on a consistent approach for mean comparison before I continue on other response

Site conditions

All three sites were located in Rosemount MN, so the environmental conditions imposed by site was primarily driven by differences in soil and management. The differing environmental conditions among years and their interaction with the age of the kernza, the soil conditions and the applied treatments should be considered.

# Maximizing yield: only spring forage cut (yield.1cut)

What is the best time to harvest IWG forage in order to maximize your yield?

Here we look at the forage yield (dry biomass on a kg ha-1 basis) of the first cut of the year

distribution

First, let’s look at the distribution of the spring forage yield response.

The distribution of the first year looks normal, but the second year appears more Gamma/poisson.

variance

Variance seems homogenous except for the anthesis cuts in the second year, we’ll need to be aware of this limitation of our data.

Growing Degree Days vs. yield

How do GDD accumulation explain the observed forage yields?

Since we are exploring the data and we expect that the treatments from one field.year may confound the data from the second field year, we are going to see if the patterns in GDD accumulation and yield are similar within each field year.

yield.1cut vs. GDD.1cut where field.year=“first”

Now, we look only at data from the first year of the experiment, thereby removing any effects of the follow-on harvest treatments (follow.cut)

It appears as more GDD accumulated prior to the first forage cut, the forage yields were greater at R100 and didn’t really change or had a negative slope at R70.

Since we are only looking at the data for the spring cut in the first year, this is the only instance where the I2 data could be used.

When I2 is included, the positive relationship between GDD accumulation and yield is even less clear. While overall there is a positive correlation, this is driven primarily by the R100, and both I2 and R70 show slightly negative relationships.

A negative relationship between forage yield and growing degree days is not expected and may indicate that another environmental factor like precipitation may be influencing the response. Let’s look at the data by environment rather than site.

It’s strange that the GDD of the first cut is identical for both environments in 2017, indicating they were harvested on the same day. Investigating the notes, it appears that in 2017, I2 and R70 were harvested on the same day for all three spring harvest timings, so that checks out.

It’s noteworthy that harvests in R100.2018 occurred after less GDD accumulated, which indicates the plants were maturing faster in 2018 as compared to 2017, and since both R70 and R100 had the same age, difference in rate of maturity could be due to site factors that are hard to quantify (i.e. soil texture, soil health) or they could be simply due to precipitation

It’s clear that the accumulated precipitation between spring forage harvests differed between 2017 and 2018. Only about 250 mm of rain accumulated before the cut at the boot stage in 2018, whereas almost twice as much has accumulated before the first cut at the boot stage in 2017. There’s a couple ways this can be interpreted, were forage yields reduced for the boot cut in R100.2018 because of a lack of rain and that’s why we see a positive slope or was a lack of rain after the boot cut in 2017 that caused the lack of a positive slope. There may be a better way to visualize this.

At this point, it makes sense that the I2 data shows a negative slope because the IWG stand, as previously explained, was old and starting to die and did die after the grain harvest in 2017, so let’s just remove I2 and look at it again

So this might be something to be aware of in the discussion. Need to tease this out a bit more.

yield.1cut vs. GDD.1cut where field.year=“second”

Next we look at forage yields in the second year based on what treatment was applied in the first field year.

So here we are looking at the GDD that accumulated in year 2 on the x axis prior to the first cut and the forage yield of that first cut.
Regression lines suggest that in the second year, as GDD accumulated the forage yield was greater.
Regression lines also indicate very little difference among the follow.cut treatments. This is even more obvious if you allow se=T in the geom_smooth equation.
But remember that the kernza plants with follow.cut=none got about 2 more months of GDD in the first field year than kernza plants where follow.cut=october.

Now we will add in the additional GDD that accumulated in first field.year after the last cut in a given plot.

To do this, we need to need to create a new dataframe that is just field.year=second
then create a new dataframe that is just field.year=first
match these dataframes by row and mutate a new column that’s gdd.1cut of the dataframe for the second year plus the gdd.extra of that same plot the previous year

So here we are looking at the GDD that accumulated for the IWG plants after they were last cut in year 1 and before they were cut again in year 2.

Its remarkable that the plots with no follow-up cut received almost twice the average GDD than plots that were cut again in september and almost 4 times that amount of GDD than plots cut in October prior to the first cut in the spring the following year.

Did this translate to differences in yield?

##             numDF denDF   F-value p-value
## (Intercept)     1    61 3.1381474  0.0815
## follow.cut      2    61 0.1201979  0.8870

This is interesting! I would expect that plants that did not receive a follow-up cut (and therefore would accumulate more GDD compared to the plots with follow-up harvests), would have a greater forage yield for the yield.1cut the following year. However, despite follow.cut=none plots receiving upwards of 4x GDD compared with plots where follow.cut=october, the yield the following year was similar. When we do not take into account the GDD accumulation of the previous year and only look at the GDD accumulation in the field.year when the harvest occurred, then the GDD accumulation and yield allign.

This suggests that the GDD that accumulated after the last cut in year1 did not really have that great of an effect on the yield the following year. It also suggests that whether the plot was harvested once, twice or three times following harvest the first year did not have an impact on the first yield of the second year.
This may be worth further analyzing as one of our hypothesis is that multiple harvests on kernza plants will reduce their forage yield the following year.

yield.1cut vs. gdd.1cut

Since there doesn’t seem to be an effect of the harvest intensity (follow.cut) on the following field.year, we can combine across field.years!

let’s look at the relationship between yield and GDD among years…

Ok, when looking at year (which helps incorporate precip), it seems there’s a positive relationship between yield and GDD across all years except maybe 2017.
Since the data isn’t super even, let’s seperate by environment to get a better sense of things

There seems to be a consistent positive association between forage yield and GDD except for in 2017.
Did 2017 differ in precipitation or something? That could explain the non-positive association between GDD and forage yield

Let’s look at rain for 2017 vs the other years

weather for context

Now let’s add dates in 2017 and 2018 when the boot, anthesis and dough cut occurred.

It’s hard to tell without the another line depicting the 30 year average which year was weird, but it looks like 2018 happened to be drier than other years, especially in late May. Looking at both environments in 2018, the yield was low at the boot cut, making the slope more positive. The precipiation somewhat explains why the relationship between GDD and yield.1cut was so positive in 2018 (because plants were water-stressed at the boot stage harvest and then got water causing forage yields to increase), but there isn’t really that much of a pattern here since in 2019 there was a strong positive relationship between GDD and yield despite no water stress.

Trying to absorb the weirdness of R70.2017 by separating by site and keeping years together was a smart move by Jake

yield.1cut vs GDD, seperated by site

Much better, nice consistent trend

regression comparison of yield.cut vs. GDD.1cut

Now let’s compare a linear regression with a quadratic regression

Now let’s statistically compare the linear and quadratic model for goodness of fit.

##             numDF denDF  F-value p-value
## (Intercept)     1   134 22.21362  <.0001
## gdd.1cut        1   134 31.50397  <.0001

## Linear mixed-effects model fit by maximum likelihood
##  Data: dat1 
##        AIC      BIC    logLik
##   2506.082 2520.896 -1248.041
## 
## Random effects:
##  Formula: ~1 | site
##         (Intercept)
## StdDev:    942.6386
## 
##  Formula: ~1 | block %in% site
##         (Intercept) Residual
## StdDev:    282.3626 1441.402
## 
## Fixed effects: yield.1cut ~ gdd.1cut 
##                Value Std.Error  DF  t-value p-value
## (Intercept) 951.3149  801.9483 134 1.186255  0.2376
## gdd.1cut      1.8102    0.3225 134 5.612840  0.0000
##  Correlation: 
##          (Intr)
## gdd.1cut -0.511
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -1.4620552 -0.5899343 -0.2642517  0.2075620  4.4722016 
## 
## Number of Observations: 143
## Number of Groups: 
##            site block %in% site 
##               2               8

##               numDF denDF  F-value p-value
## (Intercept)       1   133 19.87100  <.0001
## gdd.1cut          1   133 37.66029  <.0001
## I(gdd.1cut^2)     1   133 26.77571  <.0001

##      Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## mod1     1  5 2506.082 2520.896 -1248.041                        
## mod2     2  6 2483.244 2501.021 -1235.622 1 vs 2 24.83836  <.0001

We conclude model 2 (the quadratic) fits the data better and differs from model 1.

summary stats of regression; yield vs gdd

Now we determine at what level of GDD accumulation our maximimum forage yield.1cut occurs by calculate at what point our quadratic function we fitted onto the yield.1cut vs. GDD data has a slope of zero, and the corresponding GDD, this value is our “agronomically optimal growing degree day accumulation” or AOGDD

## estAOGDD 
## 1484.834

## estAOGDD 
## 4118.921

conclusion of GDD for yield.1cut

We conclude that our maximum yield for the first forage cut is 4,103 kg ha dry forage biomass and that this is achievable when 1,468 GDD have accumulated, which appears to be after anthesis and before dough stage.

Timing.1cut vs. yield

Did the forage yield of the first cut differ among timing.1cut treatments?

Interaction testing, model building

The forage yield response is normally distributed and variance is homogeneous, but when seperated by year the second year will require a poisson glm.

The main effect of interest is the timing.1cut. This fixed effect was based on the physiological stage of the IWG and therefore somewhat takes into account the random effects of year conditions (GDD and precip accumulation), site conditions (texture fertility differences) and even differences in the age of the kernza stand. We can model these random effects as a singular factor, environment. The additional random effect of block within environment can be nested within environment.

The repeated measuring of an IWG stand across two field.years, however, violates the assumption of independent sampling. Since each environment is treated as a random effect, this should solve that problem because they are able to have different y intercepts.

##             numDF denDF  F-value p-value
## (Intercept)     1   133 22.72727  <.0001
## timing.1cut     2   133 12.46666  <.0001

##  timing.1cut emmean  SE df lower.CL upper.CL .group
##  boot          2377 704  1    -6564    11319  1    
##  dough         3677 704  1    -5274    12628   2   
##  anthesis      3704 704  1    -5237    12646   2   
## 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 3 estimates 
## significance level used: alpha = 0.05

Let’s compare my model with Jake’s approach where an interaction between timing.1cut and field.year is included and site, rather than environment, is used as the random effect.

First test to see if interaction with the year of the experiment

##                        numDF denDF   F-value p-value
## (Intercept)                1   130 22.436628  <.0001
## timing.1cut                2   130 14.591185  <.0001
## field.year                 1   130  3.073607  0.0819
## timing.1cut:field.year     2   130 11.138958  <.0001

There is an interaction between timing.1cut and field.year. This seems to be driven by high variability at the anthesis harvest of the second year. There are no outliers, but that anthesis harvest looks weirdly variable and it seems to be driving the interaction

Since there’s an interaction between field.years and yield, we must separate by field.years.

Now we compare models based on AIC criterion and plot residuals

## integer(0)

## integer(0)

## integer(0)

## integer(0)

I think model selection is a judgement call here.

I’d argue that either model jpb.1 or splitting by year and doing sy1 and sy2 are good.
The AIC criterion and log liklihood will always reduce the more that we split up the data. The question is whether we have to separate by field years.

conclusion of mean comparison

If we do not seperate by field.years, our graph would look like this

##  timing.1cut emmean  SE df lower.CL upper.CL .group
##  boot          2377 704  1    -6564    11319  1    
##  dough         3677 704  1    -5274    12628   2   
##  anthesis      3704 704  1    -5237    12646   2   
## 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 3 estimates 
## significance level used: alpha = 0.05

##   timing.1cut  n     mean       se tukey
## 1        boot 48 2377.276 196.8551     b
## 2    anthesis 48 3704.417 346.1988     a
## 3       dough 47 3660.779 199.1394     a

If we seperate by field.years, we would get this.

##   field.year timing.1cut  n     mean       se tukey
## 1      first        boot 24 3017.664 268.9939     b
## 2      first    anthesis 24 3146.219 165.6068     b
## 3      first       dough 24 4193.135 220.8997     a
## 4     second        boot 24 1736.888 224.4086     c
## 5     second    anthesis 24 4262.615 659.7852     a
## 6     second       dough 23 3105.277 297.8929     b

Maximizing yield: all cuts summed together

What combination of the first cut timing and follow-up cuts yields the highest result?

We need to remember that R70.2018 did not have follow-up cuts, so we really have 3 environments with follow-up cuts: R70.2017, R100.2018 and R100.2019. We shouldn’t include the yield data from R70.2018 if there weren’t any followup cuts!

distribution

First, let’s look at the distribution of the spring forage yield response.

All sites appear normally distributed except the R70.2017 might benefet from a GLM.

variance

Variance of total yield appears to be homogenous except for the high yielding anthesis treatments in R100.2019

Now, we expect the total yield to be greater with follow-up cuts (It can’t be lower unless it’s the second year and there’s some carryover effect). While we see what we expect in the R100 site, in R70.2017 there appears to be a decrease in total yield in the plots with follow-up cuts. This cannot be explained by precipitation (which was very similar to 2018) nor treatments from a previous year (this was the first year), I think it’s just random luck and very low yields in the follow-up cuts for the first year.

Treatment for total yield

##             numDF denDF  F-value p-value
## (Intercept)     1    91 42.06339  <.0001
## treatment       8    91  2.55057  0.0149

##  treatment emmean  SE df lower.CL upper.CL .group
##  BN          3116 792  1    -6951    13183  1    
##  BO          3804 792  1    -6263    13871  12   
##  BS          3959 792  1    -6108    14026  12   
##  AN          4067 792  1    -5999    14134  12   
##  DN          4068 792  1    -5999    14135  12   
##  DS          4623 792  1    -5444    14690  12   
##  DO          4929 804  1    -5281    15139  12   
##  AS          5204 792  1    -4863    15271   2   
##  AO          5301 792  1    -4766    15368   2   
## 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 9 estimates 
## significance level used: alpha = 0.05

Unsurprisingly, the plots with the greatest total yield combined across both years total yield across both years favors the cuts in september and october.

##  treatment emmean  SE df lower.CL upper.CL .group
##  BN          3116 792  1    -6951    13183  1    
##  BO          3804 792  1    -6263    13871  12   
##  BS          3959 792  1    -6108    14026  12   
##  AN          4067 792  1    -5999    14134  12   
##  DN          4068 792  1    -5999    14135  12   
##  DS          4623 792  1    -5444    14690  12   
##  DO          4929 804  1    -5281    15139  12   
##  AS          5204 792  1    -4863    15271   2   
##  AO          5301 792  1    -4766    15368   2   
## 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 9 estimates 
## significance level used: alpha = 0.05

##   treatment  n     mean       se tukey
## 1        BN 12 3400.492 388.0992     b
## 2        BS 12 4243.389 283.1127    ab
## 3        BO 12 4088.314 418.1572    ab
## 4        AN 12 4352.060 546.4632    ab
## 5        AS 12 5488.623 846.4588     a
## 6        AO 12 5585.631 808.9571     a
## 7        DN 12 4352.326 282.2937    ab
## 8        DS 12 4907.874 347.1665    ab
## 9        DO 11 5186.363 455.3763    ab

conclusion

maximum total yields were lower when just harvested at the boot stage compared with when harvested at anthesis and harvested again in september and september and october.

Plots that recieved follow-up harvests did not differ from plots that were only havested once, assuming the timing of the first harvest was the same.

Relative Forage Quality (RFQ)

What are the best harvest times for maximizing quality?

So we have more data for the RFQ and RFV, which could allow us to look at grain cuts, but for now we won’t because we’ve been filtering out the I2 and grain cuts up until this point so it might be confusing for the reader.

–>

RFQ.1cut

distribution

First, let’s look at the distribution of the spring forage yield response.

yikes, this is a mess of distributions, not very normally distributed.

Looks like the RFQ went down in the second field.year. I wonder if the plots whose RFQ didn’t go down were the one’s without a followup harvest?

variance

As we expect, as the plant matures the RFQ decreases.

I noticed a blimp of high RFQ values in the second field year and thought they might the plots that didn’t recieve follow-up cuts, but it appears there isn’t a clear impact on RFQ in the second year if the IWG was harvested multiple times the year prior.

The variances seem pretty homogenous

First we’ll look at treatments, then we can mess around with GDD

interaction testing

There seems to be a consistent enough trend that we might be able to combine across sites and years

Nevermind, there are significant interactions between timing.1cut and the site,year,environment, field year etc.

Let’s try the approach used for yield.1cut

generalizable model

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.012832 (tol = 0.002, component 1)

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Gamma  ( log )
## Formula: 
## RFQ.1cut ~ timing.1cut + (1 | site/block) + (1 | env) + (timing.1cut |  
##     env)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
##   1050.7   1089.2   -512.3   1024.7      130 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.44224 -0.56770  0.01424  0.42390  2.80180 
## 
## Random effects:
##  Groups     Name                Variance  Std.Dev.  Corr       
##  block.site (Intercept)         1.771e-03 0.0420848            
##  env        (Intercept)         5.498e-03 0.0741474            
##             timing.1cutanthesis 5.438e-03 0.0737422 -0.48      
##             timing.1cutdough    4.200e-03 0.0648071 -0.60  0.99
##  env.1      (Intercept)         4.803e-07 0.0006930            
##  site       (Intercept)         1.195e-07 0.0003457            
##  Residual                       1.031e-02 0.1015429            
## Number of obs: 143, groups:  block:site, 8; env, 4; site, 2
## 
## Fixed effects:
##                     Estimate Std. Error t value Pr(>|z|)    
## (Intercept)          4.68100    0.12464  37.557  < 2e-16 ***
## timing.1cutanthesis -0.25446    0.07982  -3.188  0.00143 ** 
## timing.1cutdough    -0.33533    0.06895  -4.863 1.15e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) tmng.1ctn
## tmng.1ctnth -0.122          
## tmng.1ctdgh -0.324  0.947   
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## Model failed to converge with max|grad| = 0.012832 (tol = 0.002, component 1)

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Gamma  ( log )
## Formula: RFQ.1cut ~ timing.1cut + (1 | site:block) + (timing.1cut | env)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
##   1046.7   1079.2   -512.3   1024.7      132 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.44192 -0.56795  0.01421  0.42377  2.80112 
## 
## Random effects:
##  Groups     Name                Variance Std.Dev. Corr       
##  site:block (Intercept)         0.001770 0.04207             
##  env        (Intercept)         0.005500 0.07416             
##             timing.1cutanthesis 0.005441 0.07376  -0.48      
##             timing.1cutdough    0.004198 0.06479  -0.60  0.99
##  Residual                       0.010314 0.10156             
## Number of obs: 143, groups:  site:block, 8; env, 4
## 
## Fixed effects:
##                     Estimate Std. Error t value Pr(>|z|)    
## (Intercept)          4.68110    0.12457  37.577  < 2e-16 ***
## timing.1cutanthesis -0.25456    0.07981  -3.190  0.00143 ** 
## timing.1cutdough    -0.33542    0.06893  -4.866 1.14e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) tmng.1ctn
## tmng.1ctnth -0.124          
## tmng.1ctdgh -0.326  0.947   
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Gamma  ( log )
## Formula: RFQ.1cut ~ timing.1cut + (timing.1cut | env)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
##   1077.4   1107.0   -528.7   1057.4      133 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.82786 -0.52095 -0.04847  0.46533  3.15240 
## 
## Random effects:
##  Groups   Name                Variance Std.Dev. Corr       
##  env      (Intercept)         0.005968 0.07725             
##           timing.1cutanthesis 0.005470 0.07396  -0.45      
##           timing.1cutdough    0.004175 0.06462  -0.59  0.99
##  Residual                     0.013040 0.11419             
## Number of obs: 143, groups:  env, 4
## 
## Fixed effects:
##                     Estimate Std. Error t value Pr(>|z|)    
## (Intercept)          4.68409    0.10848  43.178  < 2e-16 ***
## timing.1cutanthesis -0.25146    0.07763  -3.239   0.0012 ** 
## timing.1cutdough    -0.33041    0.06688  -4.940 7.79e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) tmng.1ctn
## tmng.1ctnth -0.160          
## tmng.1ctdgh -0.385  0.929   
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular

## Data: dat1
## Models:
## q1cut.3.glm: RFQ.1cut ~ timing.1cut + (timing.1cut | env)
## q1cut.2.glm: RFQ.1cut ~ timing.1cut + (1 | site:block) + (timing.1cut | env)
## q1cut.1.glm: RFQ.1cut ~ timing.1cut + (1 | site/block) + (1 | env) + (timing.1cut | 
## q1cut.1.glm:     env)
##             npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## q1cut.3.glm   10 1077.4 1107.0 -528.71   1057.4                         
## q1cut.2.glm   11 1046.7 1079.2 -512.33   1024.7 32.769  1  1.038e-08 ***
## q1cut.1.glm   13 1050.7 1089.2 -512.33   1024.7  0.000  2          1    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: RFQ.1cut
##              Chisq Df Pr(>Chisq)    
## timing.1cut 43.234  2  4.092e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##  timing.1cut emmean    SE  df asymp.LCL asymp.UCL .group
##  dough         4.35 0.121 Inf      4.11      4.58  1    
##  anthesis      4.43 0.139 Inf      4.15      4.70   2   
##  boot          4.68 0.125 Inf      4.44      4.93    3  
## 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 3 estimates 
## significance level used: alpha = 0.05

conclusion

But what about the interaction between timing.1cut and the follow.cut treatment?

## Linear mixed model fit by REML ['lmerMod']
## Formula: RFQ.1cut ~ follow.cut + (treatment | env)
##    Data: dat1
## 
## REML criterion at convergence: 1075.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.1198 -0.5149 -0.0081  0.4385  3.5166 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr                               
##  env      (Intercept) 1950.79  44.168                                      
##           treatmentBS   15.84   3.980   -0.29                              
##           treatmentBO   20.12   4.485    0.93 -0.34                        
##           treatmentAN  522.48  22.858   -0.85 -0.22 -0.70                  
##           treatmentAS  642.36  25.345   -0.89 -0.15 -0.76  1.00            
##           treatmentAO  673.09  25.944   -0.92 -0.08 -0.79  0.99  1.00      
##           treatmentDN  965.77  31.077   -0.98  0.09 -0.88  0.95  0.97  0.98
##           treatmentDS  932.99  30.545   -0.96  0.02 -0.85  0.97  0.98  0.99
##           treatmentDO  983.88  31.367   -0.93 -0.08 -0.81  0.99  1.00  1.00
##  Residual               92.69   9.627                                      
##             
##             
##             
##             
##             
##             
##             
##             
##   1.00      
##   0.99  0.99
##             
## Number of obs: 143, groups:  env, 4
## 
## Fixed effects:
##                     Estimate Std. Error t value
## (Intercept)          67.0199     4.5093  14.863
## follow.cutseptember   0.9357     1.9902   0.470
## follow.cutoctober     1.7211     2.0101   0.856
## 
## Correlation of Fixed Effects:
##             (Intr) fllw.cts
## fllw.ctsptm -0.355         
## fllw.ctctbr -0.054  0.460  
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: RFQ.1cut
##             Chisq Df Pr(>Chisq)
## follow.cut 0.3341  2     0.8462

## Linear mixed model fit by REML ['lmerMod']
## Formula: RFQ.1cut ~ treatment + (treatment | env)
##    Data: dat1
## 
## REML criterion at convergence: 1037
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.1224 -0.4544 -0.0059  0.4350  3.4483 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr                               
##  env      (Intercept) 387.272  19.679                                      
##           treatmentBS  23.462   4.844   -0.73                              
##           treatmentBO   8.456   2.908    0.26 -0.36                        
##           treatmentAN 153.493  12.389   -0.22 -0.39  0.66                  
##           treatmentAS 142.334  11.930   -0.33 -0.32  0.56  0.99            
##           treatmentAO 129.030  11.359   -0.51 -0.12  0.49  0.95  0.98      
##           treatmentDN 131.046  11.448   -0.85  0.32  0.16  0.70  0.78  0.88
##           treatmentDS 120.856  10.993   -0.72  0.12  0.32  0.84  0.90  0.96
##           treatmentDO 120.379  10.972   -0.42 -0.25  0.48  0.97  0.99  0.99
##  Residual              94.884   9.741                                      
##             
##             
##             
##             
##             
##             
##             
##             
##   0.98      
##   0.83  0.93
##             
## Number of obs: 143, groups:  env, 4
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 107.5702    10.1365  10.612
## treatmentBS   0.9351     4.2102   0.222
## treatmentBO   5.9183     3.7382   1.583
## treatmentAN -20.3637     7.0876  -2.873
## treatmentAS -22.3050     6.8880  -3.238
## treatmentAO -22.3945     6.6421  -3.372
## treatmentDN -29.5628     6.6800  -4.426
## treatmentDS -28.1751     6.4865  -4.344
## treatmentDO -28.3867     6.5090  -4.361
## 
## Correlation of Fixed Effects:
##             (Intr) trtmBS trtmBO trtmAN trtmAS trtmAO trtmDN trtmDS
## treatmentBS -0.544                                                 
## treatmentBO -0.059  0.296                                          
## treatmentAN -0.272  0.002  0.448                                   
## treatmentAS -0.366  0.047  0.420  0.872                            
## treatmentAO -0.510  0.151  0.403  0.838  0.856                     
## treatmentDN -0.797  0.369  0.290  0.648  0.705  0.780              
## treatmentDS -0.680  0.275  0.349  0.750  0.790  0.837  0.845       
## treatmentDO -0.434  0.095  0.401  0.844  0.858  0.851  0.735  0.805
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: RFQ.1cut
##            Chisq Df Pr(>Chisq)    
## treatment 207.51  8  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##  treatment emmean   SE   df lower.CL upper.CL .group
##  DN          78.0 8.75 3.45     52.1      104  1    
##  DO          78.9 9.49 3.41     50.7      107  1    
##  DS          79.4 8.23 3.51     55.3      104  1    
##  AO          85.2 9.46 3.37     56.9      113  1    
##  AS          85.3 8.23 3.51     61.1      109  1    
##  AN          87.2 8.75 3.45     61.3      113  1    
##  BN         107.6 8.75 3.45     81.6      133   2   
##  BS         108.5 8.23 3.51     84.4      133   2   
##  BO         113.5 9.46 3.37     85.2      142   2   
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 9 estimates 
## significance level used: alpha = 0.05

Only significant effects on RFQ.1cut were from the timing of the first cut, the followup cut did not impact the RFQ.1cut at all.

All cuts

October cut

Here we use data=dat, which includes I2 and grain cut, though since we are only doing the october cut, there is no data for I2. Unsure if there is no data for timing.cut=“grain”

## # A tibble: 12 x 1
##    `unique(dry.matter.3cut)`
##                        <dbl>
##  1                      NA  
##  2                      93.8
##  3                      94.0
##  4                      94.3
##  5                      94.0
##  6                      94.8
##  7                      97.6
##  8                      97.0
##  9                      96.8
## 10                      96.0
## 11                      96.5
## 12                      97.4

FIG Data Analysis

Jesse Puka-Beals

01 27 2021

TO DO

Overview of missing and problematic data

outliers

Overview of key analysis questions

case study: yield.1cut vs. timing.1cut

Site conditions

distribution

variance

Growing Degree Days vs. yield

yield.1cut vs. GDD.1cut where field.year=“first”

yield.1cut vs. GDD.1cut where field.year=“second”

yield.1cut vs. gdd.1cut

weather for context

yield.1cut vs GDD, seperated by site

regression comparison of yield.cut vs. GDD.1cut

summary stats of regression; yield vs gdd

conclusion of GDD for yield.1cut

Timing.1cut vs. yield

Interaction testing, model building

conclusion of mean comparison

Maximizing yield: all cuts summed together

distribution

variance

Treatment for total yield

conclusion

Relative Forage Quality (RFQ)

RFQ.1cut

distribution

variance

interaction testing

generalizable model

conclusion

All cuts

October cut