TO DO

Review analysis google doc
find sources for Jake
Multidimensionsal scaling for nutrient content and then impose treatments as environmental effects

Overview of missing and problematic data

Here we analyze data from the Forage Timing Experiment.

missing data
The data for both fall harvests of at the R70 site in 2018 are missing from the dataset
The data from some plots where the spring harvest was at the grain stage is missing from the dataset

problematic data
The IWG at the I2 site was 6 years old and did not grow back after the first spring harvest. Of the data that was collected for the spring harvest in 2017 (environment=I2.2017), the data had a negative relationship between accumulated GDD and yield, unlike all other environments. Considering the I2.2017 IWG did not grow back after the spring harvest and the negative relationship between GDD and yield in 2017, its very likely the old age of this site caused it to respond very differently than the two other sites. It was decided to remove the limited data from I2 from the dataset. In some instances, the I2 data is presented just to illustrate the difference of its performance.

outliers

Overview of key analysis questions

We need to discuss and decide on a consistent methodology for how to classify and model the factors in this experiment.

From my analysis it seems like this experiment is stuck in-between (1) not being consistent enough to treat site-years as fixed effects without interactions slicing everything up and (2) not having enough data points to treat environments as a random effect while at the same time feeling comfortable that all the variance is accounted for in our small sample size (ideally we would want n=20 for environments; https://www.muscardinus.be/2018/09/number-random-effect-levels/).

I was taught to treat factors with less than 4 levels like site, year, and field.year as fixed effects because they are not likely generalizable to all sites, all years, all field.years. As fixed effects, they must be included in analyses as a factorial and if interactions are detected, the data must be sliced until interactions are removed in order to discuss main effects. The only random effect is typically block nested within site which helps account for the spatial variation within a field by letting the y intercept vary depending on which block the plots were located. In this approach, we also need to account for the repeated measures of plots within the same site across years and sometimes within year, by making plot the subject and a repeated measure statement. Another option within this approach is to combine site-year into an environment and treat environment that as a fixed effect. I’ve seen this in other papers and it would reduce the amount of interaction testing on the dataset which reduces our liklihood of a type I error. A factorial combination of environment with field.year can be used as well if there is a strong reason to believe the response would differ between field.years.

However, this approach of treating factors as fixed effects may not be appropriate for the data and objectives of this study. I think the objective of this study is to provide a starter or preliminary dataset with respect to how forage yield, forage quality and net $ returns can be maximized by optimizing when the first cut is taken and whether to take a follow-up cut. Such decisions are likely going to vary a lot depending on the weather of the given year and the site conditions if IWG behaves like any other forage (which it probably does). Therefore, for our study to robustly answer this question, it would require a lot of different environments and a variety of management approaches to be generalizable to a given IWG grower. We did not have the resources to conduct such a robust study, but we did get data for two sites over 3 years = 4 solid environment datasets. I tried the first option of large factorials of fixed effects and unsuprisingly there were lots of interactions, creating a complicated story for the reader and conclusions that likely aren’t true.

One quick fix could be to decrease the alpha level, thereby reducing our liklihood for Type I errrors. One drawback is that this could confuse the reader because it’s not very common.

Another approach could be to try to generalize our 4 environments to all environments by making them a random effect. Thinking back on our objectives for this study, we’re not interested in trying to go through and explain each interaction and effect due to the long list of possibile explanations with no real good way to parse out which. We’re trying to generalize and this would let us do that, but with the caveat that we need about double the amount of data to feel confident. If we construct a model where each environment can be allowed to have (1) a different starting point because of random effects from that environment and (2) allow the effect of our treatment of interest (timing.1cut for example) vary in its intensity within each environment due to the random effects from that environment; THEN we could estimate whether there is a difference due to a singular fixed effect despite that effect receiving random effects from the environment which could both have impacted the intercept and slope of the line within each environment across levels of our fixed effect factor variable of interest. In this scenario, we’d be operating under the statistical assumption that our environments captured the expected variation we’d expect to see if we replicated the study numerous times, which we’d want to explicitely discuss as a limitation. IF we still found a difference, we could suggest there’s sufficient reason to believe that an actual difference does exist and the addition of more environments would be critical to further clarify this effect. Ultimately, I think this would provide a clearer presentation of the data to the reader.

case study: yield.1cut vs. timing.1cut

Below is some code that would show a sample mean comparison analysis between the spring forage yield and the timing of the cut for the spring forage yield. In this example, we are comparing the approach where we use analyze factorials of fixed effects and must slice the data and the approach where we employ random effects to allow for generalization. Note that I am switching from the nlme package to the lmer package due to the latter’s ability to handle multiple random effects

There is not a difference in the yield.1cut in the second year due to additional cuts the prior year

So the trend we’re expecting is the one we see in 2018, boot<anthesis<dough. In 2018, we see that lower rainfall and lower gdd accumulation may have contributed to this average

fixeff <- lmer(yield.1cut~timing.1cut*site*field.year +
                (1|site/block), 
              dat=dat1,
              REML=FALSE)
isSingular(fixeff)

## [1] TRUE

summary(fixeff)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: yield.1cut ~ timing.1cut * site * field.year + (1 | site/block)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
##   2359.7   2404.1  -1164.8   2329.7      128 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.9763 -0.4022 -0.0133  0.4581  4.2277 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  block:site (Intercept) 115185   339.4   
##  site       (Intercept)      0     0.0   
##  Residual               642505   801.6   
## Number of obs: 143, groups:  block:site, 8; site, 2
## 
## Fixed effects:
##                                               Estimate Std. Error t value
## (Intercept)                                     3688.3      286.9  12.853
## timing.1cutanthesis                            -1039.7      327.2  -3.177
## timing.1cutdough                                -221.9      327.2  -0.678
## siteR100                                       -1341.2      405.8  -3.305
## field.yearsecond                               -2960.2      327.2  -9.046
## timing.1cutanthesis:siteR100                    2336.6      462.8   5.049
## timing.1cutdough:siteR100                       2794.8      462.8   6.039
## timing.1cutanthesis:field.yearsecond            1745.1      462.8   3.771
## timing.1cutdough:field.yearsecond               1324.6      462.8   2.862
## siteR100:field.yearsecond                       3358.8      462.8   7.258
## timing.1cutanthesis:siteR100:field.yearsecond   1304.2      654.5   1.993
## timing.1cutdough:siteR100:field.yearsecond     -2177.9      658.3  -3.308
## 
## Correlation of Fixed Effects:
##                  (Intr) tmng.1ctn tmng.1ctd stR100 fld.yr tmng.1ctn:R100
## tmng.1ctnth      -0.570                                                 
## tmng.1ctdgh      -0.570  0.500                                          
## siteR100         -0.707  0.403     0.403                                
## fild.yrscnd      -0.570  0.500     0.500     0.403                      
## tmng.1ctn:R100    0.403 -0.707    -0.354    -0.570 -0.354               
## tmng.1ctd:R100    0.403 -0.354    -0.707    -0.570 -0.354  0.500        
## tmng.1ctn:.       0.403 -0.707    -0.354    -0.285 -0.707  0.500        
## tmng.1ctd:.       0.403 -0.354    -0.707    -0.285 -0.707  0.250        
## stR100:fld.       0.403 -0.354    -0.354    -0.570 -0.707  0.500        
## tmng.1ctn:R100:. -0.285  0.500     0.250     0.403  0.500 -0.707        
## tmng.1ctd:R100:. -0.283  0.249     0.497     0.401  0.497 -0.351        
##                  tmng.1ctd:R100 tmng.1ctn:. tmng.1ctd:. sR100: tmng.1ctn:R100:.
## tmng.1ctnth                                                                    
## tmng.1ctdgh                                                                    
## siteR100                                                                       
## fild.yrscnd                                                                    
## tmng.1ctn:R100                                                                 
## tmng.1ctd:R100                                                                 
## tmng.1ctn:.       0.250                                                        
## tmng.1ctd:.       0.500          0.500                                         
## stR100:fld.       0.500          0.500       0.500                             
## tmng.1ctn:R100:. -0.354         -0.707      -0.354      -0.707                 
## tmng.1ctd:R100:. -0.703         -0.351      -0.703      -0.703  0.497          
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular

fixeff.2 <- lmer(yield.1cut~timing.1cut*site*field.year +
                (1|site:block),  #using interaction rather than nesting syntax eliminates warning
              dat=dat1,
              REML=FALSE)
summary(fixeff.2)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: yield.1cut ~ timing.1cut * site * field.year + (1 | site:block)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
##   2357.7   2399.1  -1164.8   2329.7      129 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.9763 -0.4022 -0.0133  0.4581  4.2277 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  site:block (Intercept) 115185   339.4   
##  Residual               642505   801.6   
## Number of obs: 143, groups:  site:block, 8
## 
## Fixed effects:
##                                               Estimate Std. Error t value
## (Intercept)                                     3688.3      286.9  12.853
## timing.1cutanthesis                            -1039.7      327.2  -3.177
## timing.1cutdough                                -221.9      327.2  -0.678
## siteR100                                       -1341.2      405.8  -3.305
## field.yearsecond                               -2960.2      327.2  -9.046
## timing.1cutanthesis:siteR100                    2336.6      462.8   5.049
## timing.1cutdough:siteR100                       2794.8      462.8   6.039
## timing.1cutanthesis:field.yearsecond            1745.1      462.8   3.771
## timing.1cutdough:field.yearsecond               1324.6      462.8   2.862
## siteR100:field.yearsecond                       3358.8      462.8   7.258
## timing.1cutanthesis:siteR100:field.yearsecond   1304.2      654.5   1.993
## timing.1cutdough:siteR100:field.yearsecond     -2177.9      658.3  -3.308
## 
## Correlation of Fixed Effects:
##                  (Intr) tmng.1ctn tmng.1ctd stR100 fld.yr tmng.1ctn:R100
## tmng.1ctnth      -0.570                                                 
## tmng.1ctdgh      -0.570  0.500                                          
## siteR100         -0.707  0.403     0.403                                
## fild.yrscnd      -0.570  0.500     0.500     0.403                      
## tmng.1ctn:R100    0.403 -0.707    -0.354    -0.570 -0.354               
## tmng.1ctd:R100    0.403 -0.354    -0.707    -0.570 -0.354  0.500        
## tmng.1ctn:.       0.403 -0.707    -0.354    -0.285 -0.707  0.500        
## tmng.1ctd:.       0.403 -0.354    -0.707    -0.285 -0.707  0.250        
## stR100:fld.       0.403 -0.354    -0.354    -0.570 -0.707  0.500        
## tmng.1ctn:R100:. -0.285  0.500     0.250     0.403  0.500 -0.707        
## tmng.1ctd:R100:. -0.283  0.249     0.497     0.401  0.497 -0.351        
##                  tmng.1ctd:R100 tmng.1ctn:. tmng.1ctd:. sR100: tmng.1ctn:R100:.
## tmng.1ctnth                                                                    
## tmng.1ctdgh                                                                    
## siteR100                                                                       
## fild.yrscnd                                                                    
## tmng.1ctn:R100                                                                 
## tmng.1ctd:R100                                                                 
## tmng.1ctn:.       0.250                                                        
## tmng.1ctd:.       0.500          0.500                                         
## stR100:fld.       0.500          0.500       0.500                             
## tmng.1ctn:R100:. -0.354         -0.707      -0.354      -0.707                 
## tmng.1ctd:R100:. -0.703         -0.351      -0.703      -0.703  0.497

anova(fixeff, fixeff.2) #both models are similar

## Data: dat1
## Models:
## fixeff.2: yield.1cut ~ timing.1cut * site * field.year + (1 | site:block)
## fixeff: yield.1cut ~ timing.1cut * site * field.year + (1 | site/block)
##          npar    AIC    BIC  logLik deviance Chisq Df Pr(>Chisq)
## fixeff.2   14 2357.7 2399.1 -1164.8   2329.7                    
## fixeff     15 2359.7 2404.1 -1164.8   2329.7     0  1          1

car::Anova(fixeff.2) #interactions everywhere

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: yield.1cut
##                                Chisq Df Pr(>Chisq)    
## timing.1cut                  86.2665  2  < 2.2e-16 ***
## site                         47.6429  1  5.114e-12 ***
## field.year                    9.0218  1   0.002668 ** 
## timing.1cut:site             84.0794  2  < 2.2e-16 ***
## timing.1cut:field.year       64.5220  2  9.755e-15 ***
## site:field.year             132.0423  1  < 2.2e-16 ***
## timing.1cut:site:field.year  28.5163  2  6.423e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

fixeff.3 <- lm(yield.1cut~timing.1cut*site*field.year, 
              dat=dat1)
anova(fixeff, fixeff.3)

## Data: dat1
## Models:
## fixeff.3: yield.1cut ~ timing.1cut * site * field.year
## fixeff: yield.1cut ~ timing.1cut * site * field.year + (1 | site/block)
##          npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)   
## fixeff.3   13 2367.7 2406.2 -1170.8   2341.7                        
## fixeff     15 2359.7 2404.1 -1164.8   2329.7 12.046  2   0.002423 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

car::Anova(fixeff) #interactions everywhere

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: yield.1cut
##                                Chisq Df Pr(>Chisq)    
## timing.1cut                  86.2665  2  < 2.2e-16 ***
## site                         47.6429  1  5.114e-12 ***
## field.year                    9.0218  1   0.002668 ** 
## timing.1cut:site             84.0794  2  < 2.2e-16 ***
## timing.1cut:field.year       64.5220  2  9.755e-15 ***
## site:field.year             132.0423  1  < 2.2e-16 ***
## timing.1cut:site:field.year  28.5163  2  6.423e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

#I replicated this code where data=dat to include I2 and grain cuts and got the same conclusions

Now let’s try the generalizable approach with random effects

## Data: dat1
## Models:
## raneff.0: yield.1cut ~ timing.1cut + (1 | site/block)
## raneff.1: yield.1cut ~ timing.1cut + (1 | site/block) + (1 | env)
##          npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## raneff.0    6 2513.4 2531.2 -1250.7   2501.4                         
## raneff.1    7 2473.7 2494.4 -1229.8   2459.7 41.751  1  1.037e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## Linear mixed model fit by REML ['lmerMod']
## Formula: yield.1cut ~ timing.1cut + (1 | site/block) + (1 | env)
##    Data: dat1
## 
## REML criterion at convergence: 2419
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.5931 -0.6302 -0.1960  0.2959  4.3822 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  block:site (Intercept)  109833   331.4  
##  env        (Intercept) 1228371  1108.3  
##  site       (Intercept) 1170471  1081.9  
##  Residual               1532354  1237.9  
## Number of obs: 143, groups:  block:site, 8; env, 4; site, 2
## 
## Fixed effects:
##                     Estimate Std. Error t value
## (Intercept)           2377.3      968.5   2.455
## timing.1cutanthesis   1327.1      252.7   5.252
## timing.1cutdough      1310.7      254.1   5.158
## 
## Correlation of Fixed Effects:
##             (Intr) tmng.1ctn
## tmng.1ctnth -0.130          
## tmng.1ctdgh -0.130  0.497

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: yield.1cut
##              Chisq Df Pr(>Chisq)    
## timing.1cut 36.202  2  1.377e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##  timing.1cut emmean  SE   df lower.CL upper.CL .group
##  boot          2377 968 1.05    -8702    13457  1    
##  dough         3688 969 1.05    -7358    14734   2   
##  anthesis      3704 968 1.05    -7375    14784   2   
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 3 estimates 
## significance level used: alpha = 0.05

These simple random effect models conclude that yield.1cut differs among timing.1cut treatments with boot cuts lower than dough and anthesis. Plausible conclusion given the data. But are these models the best fitted?

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## unable to evaluate scaled gradient

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge: degenerate Hessian with 1 negative eigenvalues

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## unable to evaluate scaled gradient

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge: degenerate Hessian with 1 negative eigenvalues

## Data: dat1
## Models:
## raneff.0: yield.1cut ~ timing.1cut + (1 | site/block)
## raneff.1: yield.1cut ~ timing.1cut + (1 | site/block) + (1 | env)
## raneff.4: yield.1cut ~ timing.1cut + (1 | site:block) + (1 + timing.1cut | 
## raneff.4:     env)
## raneff.2: yield.1cut ~ timing.1cut + (1 | site/block) + (1 | env) + (timing.1cut | 
## raneff.2:     env)
## raneff.3: yield.1cut ~ timing.1cut + (1 | site/block) + (1 | env) + (1 + 
## raneff.3:     timing.1cut | env)
##          npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## raneff.0    6 2513.4 2531.2 -1250.7   2501.4                         
## raneff.1    7 2473.7 2494.4 -1229.8   2459.7 41.751  1  1.037e-10 ***
## raneff.4   11 2398.7 2431.2 -1188.3   2376.7 83.030  4  < 2.2e-16 ***
## raneff.2   13 2402.7 2441.2 -1188.3   2376.7  0.000  2          1    
## raneff.3   13 2402.7 2441.2 -1188.3   2376.7  0.000  0               
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## Data: dat1
## Models:
## raneff.1: yield.1cut ~ timing.1cut + (1 | site/block) + (1 | env)
## raneff.4: yield.1cut ~ timing.1cut + (1 | site:block) + (1 + timing.1cut | 
## raneff.4:     env)
##          npar    AIC    BIC  logLik deviance Chisq Df Pr(>Chisq)    
## raneff.1    7 2473.7 2494.4 -1229.8   2459.7                        
## raneff.4   11 2398.7 2431.2 -1188.3   2376.7 83.03  4  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: yield.1cut ~ timing.1cut + (1 | site:block) + (1 + timing.1cut |  
##     env)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
##   2398.7   2431.3  -1188.3   2376.7      132 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.7313 -0.4312 -0.0479  0.4326  4.1061 
## 
## Random effects:
##  Groups     Name                Variance Std.Dev. Corr       
##  site:block (Intercept)          158796   398.5              
##  env        (Intercept)         1111924  1054.5              
##             timing.1cutanthesis 3660026  1913.1   -0.12      
##             timing.1cutdough     923590   961.0   -0.36  0.64
##  Residual                        694046   833.1              
## Number of obs: 143, groups:  site:block, 8; env, 4
## 
## Fixed effects:
##                     Estimate Std. Error t value
## (Intercept)           2377.3      558.8   4.254
## timing.1cutanthesis   1327.1      971.6   1.366
## timing.1cutdough      1293.9      510.1   2.537
## 
## Correlation of Fixed Effects:
##             (Intr) tmng.1ctn
## tmng.1ctnth -0.135          
## tmng.1ctdgh -0.367  0.625

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: yield.1cut ~ timing.1cut + (1 | site/block) + (1 | env)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
##   2473.7   2494.4  -1229.8   2459.7      136 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.5940 -0.6360 -0.2082  0.2910  4.4203 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  block:site (Intercept)  111203   333.5  
##  env        (Intercept) 1228844  1108.5  
##  site       (Intercept)  253234   503.2  
##  Residual               1509289  1228.5  
## Number of obs: 143, groups:  block:site, 8; env, 4; site, 2
## 
## Fixed effects:
##                     Estimate Std. Error t value
## (Intercept)           2377.3      692.2   3.434
## timing.1cutanthesis   1327.1      250.8   5.292
## timing.1cutdough      1310.4      252.2   5.196
## 
## Correlation of Fixed Effects:
##             (Intr) tmng.1ctn
## tmng.1ctnth -0.181          
## tmng.1ctdgh -0.180  0.497

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: yield.1cut
##              Chisq Df Pr(>Chisq)  
## timing.1cut 4.8862  2    0.08689 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: yield.1cut
##              Chisq Df Pr(>Chisq)  
## timing.1cut 6.5149  2    0.03849 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##  timing.1cut emmean   SE   df lower.CL upper.CL .group
##  boot          2377  636 5.41      778     3977  1    
##  dough         3671  691 5.49     1942     5401  1    
##  anthesis      3704 1215 5.42      651     6757  1    
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 3 estimates 
## significance level used: alpha = 0.05

## $lsmeans
##  timing.1cut lsmean   SE   df lower.CL upper.CL
##  boot          2377  636 5.41      778     3977
##  anthesis      3704 1215 5.42      651     6757
##  dough         3671  691 5.49     1942     5401
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast         estimate   SE   df t.ratio p.value
##  boot - anthesis   -1327.1 1122 5.34 -1.183  0.5088 
##  boot - dough      -1293.9  589 5.33 -2.198  0.1593 
##  anthesis - dough     33.3  883 5.34  0.038  0.9992 
## 
## Degrees-of-freedom method: kenward-roger 
## P value adjustment: tukey method for comparing a family of 3 estimates

## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: lmer(formula = yield.1cut ~ timing.1cut + (1 | site:block) + 
##     (1 + timing.1cut | env), data = dat1, REML = FALSE)
## 
## Linear Hypotheses:
##                       Estimate Std. Error z value Pr(>|z|)  
## anthesis - boot == 0   1327.14     971.56   1.366   0.3346  
## dough - boot == 0      1293.89     510.05   2.537   0.0266 *
## dough - anthesis == 0   -33.26     764.39  -0.044   0.9989  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

These more complex random effect models have approximately half the amount of residual variance as the simple random effect model and we can conclude that are better due to a significant difference in the model fit statistics. We select the best of the more compelx random effect models and then switch back to using REML because I’ve read it’s more robust which is why its the default for the lmer package. Our model concludes there are no differences between the the timing.1cut treatments.

Interestingly, if we use the maximum liklihood criterion, then there is a difference detected between the boot cut and the dough cuts. I suspect this difference is also plausible.

So which analysis approach is the most appropaite for this experiment?

I’d argue that the more complex random effect model that failed to reject the null hypothesis is the most conservative and will present the clearest story. Providing additional discussions about the sensitivity our data to having differences among treatments being able to be concluded through variations of the testing would be important to present as well.

Using this generalizable approach, we’d conclude that we can’t detect a difference in the spring forage yield among different timing cuts. Looking at the boxplot below, I’d say this agrees with my assessment of the responses of the environments. While it looks like there’s a nice strong trend in R100.2018 and R70.2018, we’ll see later that 2018 has below average rainfall before the boot harvest which likely depressed those yields, giving the impression that yield was due to increasing physiological maturity when its more likely that it was catching up after water stress.

We need to agree on a consistent approach for mean comparison before I continue on other response

Site conditions

All three sites were located in Rosemount MN, so the environmental conditions imposed by site was primarily driven by differences in soil and management. The differing environmental conditions among years and their interaction with the age of the kernza, the soil conditions and the applied treatments should be considered.

# Effect of the follow-up cuts on field.year=2 When the follow-up cut treatments were applied to plots in the first field.year, did this result in differing yields the following year?

yield.1cut~follow.cut

Does the yield.1cut in the second field year differ between plots that received follow-up cuts the previous year?

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.00310246 (tol = 0.002, component 1)

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Gamma  ( log )
## Formula: yield.1cut ~ follow.cut * year + (1 | site/block)
##    Data: subset(dat1, field.year == "second")
## 
##      AIC      BIC   logLik deviance df.resid 
##   1206.5   1226.9   -594.2   1188.5       62 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.5959 -0.8751 -0.1058  0.6023  2.6263 
## 
## Random effects:
##  Groups     Name        Variance  Std.Dev. 
##  block:site (Intercept) 4.867e-03 0.0697622
##  site       (Intercept) 1.576e-08 0.0001256
##  Residual               1.893e-01 0.4351172
## Number of obs: 71, groups:  block:site, 8; site, 2
## 
## Fixed effects:
##                              Estimate Std. Error t value Pr(>|z|)    
## (Intercept)                   7.22712    0.13536  53.391  < 2e-16 ***
## follow.cutseptember          -0.00427    0.17793  -0.024    0.981    
## follow.cutoctober            -0.12081    0.17798  -0.679    0.497    
## year2019                      1.21505    0.19108   6.359 2.03e-10 ***
## follow.cutseptember:year2019  0.07371    0.25165   0.293    0.770    
## follow.cutoctober:year2019    0.12934    0.25516   0.507    0.612    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##               (Intr) fllw.cts fllw.ctc yr2019 fllw.cts:2019
## fllw.ctsptm   -0.657                                       
## fllw.ctctbr   -0.657  0.499                                
## year2019      -0.708  0.465    0.466                       
## fllw.cts:2019  0.466 -0.707   -0.353   -0.659              
## fllw.ctc:2019  0.460 -0.348   -0.698   -0.651  0.494       
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## Model failed to converge with max|grad| = 0.00310246 (tol = 0.002, component 1)

## Linear mixed model fit by REML ['lmerMod']
## Formula: yield.1cut ~ follow.cut * year + (1 | block)
##    Data: subset(dat1, field.year == "second")
## 
## REML criterion at convergence: 1162.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.8157 -0.4599 -0.0569  0.2981  3.6184 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  block    (Intercept)   27900   167    
##  Residual             2698108  1643    
## Number of obs: 71, groups:  block, 4
## 
## Fixed effects:
##                              Estimate Std. Error t value
## (Intercept)                  1385.621    481.474   2.878
## follow.cutseptember            -6.059    670.585  -0.009
## follow.cutoctober            -158.494    670.585  -0.236
## year2019                     3260.412    670.585   4.862
## follow.cutseptember:year2019  355.316    948.351   0.375
## follow.cutoctober:year2019    218.293    959.144   0.228
## 
## Correlation of Fixed Effects:
##               (Intr) fllw.cts fllw.ctc yr2019 fllw.cts:2019
## fllw.ctsptm   -0.696                                       
## fllw.ctctbr   -0.696  0.500                                
## year2019      -0.696  0.500    0.500                       
## fllw.cts:2019  0.492 -0.707   -0.354   -0.707              
## fllw.ctc:2019  0.487 -0.350   -0.699   -0.699  0.494

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## unable to evaluate scaled gradient

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge: degenerate Hessian with 1 negative eigenvalues

## Warning in vcov.merMod(object, use.hessian = use.hessian): variance-covariance matrix computed from finite-difference Hessian is
## not positive definite or contains NA values: falling back to var-cov estimated from RX

## Warning in vcov.merMod(object, correlation = correlation, sigm = sig): variance-covariance matrix computed from finite-difference Hessian is
## not positive definite or contains NA values: falling back to var-cov estimated from RX

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Gamma  ( log )
## Formula: yield.1cut ~ follow.cut * year + (1 | block)
##    Data: subset(dat1, field.year == "second")
## 
##      AIC      BIC   logLik deviance df.resid 
##   1203.3   1221.4   -593.6   1187.3       63 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.59221 -0.88242 -0.06851  0.59596  2.56704 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  block    (Intercept) 0.007383 0.08593 
##  Residual             0.186911 0.43233 
## Number of obs: 71, groups:  block, 4
## 
## Fixed effects:
##                               Estimate Std. Error t value Pr(>|z|)    
## (Intercept)                   7.223357   0.131991  54.726  < 2e-16 ***
## follow.cutseptember          -0.002824   0.176499  -0.016    0.987    
## follow.cutoctober            -0.120711   0.176499  -0.684    0.494    
## year2019                      1.218296   0.176499   6.903 5.11e-12 ***
## follow.cutseptember:year2019  0.069440   0.249607   0.278    0.781    
## follow.cutoctober:year2019    0.119791   0.252481   0.474    0.635    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##               (Intr) fllw.cts fllw.ctc yr2019 fllw.cts:2019
## fllw.ctsptm   -0.669                                       
## fllw.ctctbr   -0.669  0.500                                
## year2019      -0.669  0.500    0.500                       
## fllw.cts:2019  0.473 -0.707   -0.354   -0.707              
## fllw.ctc:2019  0.467 -0.350   -0.699   -0.699  0.494       
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## unable to evaluate scaled gradient
## Model failed to converge: degenerate  Hessian with 1 negative eigenvalues

## 
## Call:
## glm(formula = yield.1cut ~ follow.cut * year, family = Gamma(link = "log"), 
##     data = subset(dat1, field.year == "second"), na.action = na.omit)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.99655  -0.43888  -0.06314   0.25967   0.91289  
## 
## Coefficients:
##                               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                   7.233904   0.132219  54.712  < 2e-16 ***
## follow.cutseptember          -0.004382   0.186986  -0.023    0.981    
## follow.cutoctober            -0.121473   0.186986  -0.650    0.518    
## year2019                      1.209866   0.186986   6.470 1.48e-08 ***
## follow.cutseptember:year2019  0.076864   0.264437   0.291    0.772    
## follow.cutoctober:year2019    0.135067   0.267426   0.505    0.615    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for Gamma family taken to be 0.2097815)
## 
##     Null deviance: 41.915  on 70  degrees of freedom
## Residual deviance: 14.337  on 65  degrees of freedom
## AIC: 1200.9
## 
## Number of Fisher Scoring iterations: 4

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: yield.1cut ~ follow.cut * year + (1 | block)
##    Data: subset(dat1, field.year == "second")
## 
##      AIC      BIC   logLik deviance df.resid 
##   1263.2   1281.3   -623.6   1247.2       63 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.9185 -0.5021 -0.0551  0.3421  3.8278 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  block    (Intercept)       0     0    
##  Residual             2491020  1578    
## Number of obs: 71, groups:  block, 4
## 
## Fixed effects:
##                              Estimate Std. Error t value
## (Intercept)                  1385.621    455.615   3.041
## follow.cutseptember            -6.059    644.337  -0.009
## follow.cutoctober            -158.494    644.337  -0.246
## year2019                     3260.412    644.337   5.060
## follow.cutseptember:year2019  355.316    911.230   0.390
## follow.cutoctober:year2019    222.084    921.527   0.241
## 
## Correlation of Fixed Effects:
##               (Intr) fllw.cts fllw.ctc yr2019 fllw.cts:2019
## fllw.ctsptm   -0.707                                       
## fllw.ctctbr   -0.707  0.500                                
## year2019      -0.707  0.500    0.500                       
## fllw.cts:2019  0.500 -0.707   -0.354   -0.707              
## fllw.ctc:2019  0.494 -0.350   -0.699   -0.699  0.494       
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular

## 
## Call:
## lm(formula = yield.1cut ~ follow.cut * year, data = subset(dat1, 
##     field.year == "second"), na.action = na.omit)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3027.9  -792.4   -87.0   539.9  6041.3 
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  1385.621    476.179   2.910  0.00495 ** 
## follow.cutseptember            -6.059    673.419  -0.009  0.99285    
## follow.cutoctober            -158.494    673.419  -0.235  0.81467    
## year2019                     3260.412    673.419   4.842  8.3e-06 ***
## follow.cutseptember:year2019  355.316    952.358   0.373  0.71030    
## follow.cutoctober:year2019    222.084    963.120   0.231  0.81836    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1650 on 65 degrees of freedom
## Multiple R-squared:  0.5462, Adjusted R-squared:  0.5113 
## F-statistic: 15.65 on 5 and 65 DF,  p-value: 4.36e-10

## [1] 1261.192

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## unable to evaluate scaled gradient

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge: degenerate Hessian with 1 negative eigenvalues

## Warning in vcov.merMod(mod, complete = FALSE): variance-covariance matrix computed from finite-difference Hessian is
## not positive definite or contains NA values: falling back to var-cov estimated from RX

## Warning in vcov.merMod(mod, complete = FALSE): variance-covariance matrix computed from finite-difference Hessian is
## not positive definite or contains NA values: falling back to var-cov estimated from RX

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: yield.1cut
##                    Chisq Df Pr(>Chisq)    
## follow.cut        0.5715  2     0.7515    
## year            155.5806  1     <2e-16 ***
## follow.cut:year   0.2276  2     0.8924    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## Analysis of Deviance Table (Type II tests)
## 
## Response: yield.1cut
##                 LR Chisq Df Pr(>Chisq)    
## follow.cut         0.448  2     0.7993    
## year             130.673  1     <2e-16 ***
## follow.cut:year    0.257  2     0.8792    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We conclude that forage yields in the spring in the second field year did not differ between plots that recieved follow-up cuts and plots that did not.

yield.2cut~follow.cut

We only have follow-up forage yield data in the second year for R100.2019

## Warning: Removed 48 rows containing non-finite values (stat_density).

## Warning: Removed 48 rows containing non-finite values (stat_smooth).

## Warning: Removed 48 rows containing missing values (geom_point).

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Gamma  ( log )
## Formula: yield.2cut ~ follow.cut + (1 | block)
##    Data: subset(dat1, field.year == "second")
## 
##      AIC      BIC   logLik deviance df.resid 
##    370.1    374.7   -181.1    362.1       19 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.3238 -0.7732 -0.3476  0.6162  2.5365 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  block    (Intercept) 0.0000   0.0000  
##  Residual             0.2517   0.5017  
## Number of obs: 23, groups:  block, 4
## 
## Fixed effects:
##                   Estimate Std. Error t value Pr(>|z|)    
## (Intercept)        7.19405    0.14019  51.318   <2e-16 ***
## follow.cutoctober  0.04286    0.20271   0.211    0.833    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr)
## fllw.ctctbr -0.692
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular

## 
## Call:
## glm(formula = yield.2cut ~ follow.cut, family = Gamma(link = "log"), 
##     data = subset(dat1, field.year == "second"), na.action = na.omit)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.9240  -0.4539  -0.1857   0.2810   0.9504  
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        7.19405    0.15156  47.467   <2e-16 ***
## follow.cutoctober  0.04286    0.21915   0.196    0.847    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for Gamma family taken to be 0.2756368)
## 
##     Null deviance: 5.6551  on 22  degrees of freedom
## Residual deviance: 5.6446  on 21  degrees of freedom
##   (48 observations deleted due to missingness)
## AIC: 366.12
## 
## Number of Fisher Scoring iterations: 4

## Analysis of Deviance Table (Type II tests)
## 
## Response: yield.2cut
##            LR Chisq Df Pr(>Chisq)
## follow.cut 0.038265  1     0.8449

There was no difference among the yield.2cut the following year based on whether there was a follow-up cut previously

yield.total~follow.cut

When we look at the total yield, we are basically just looking at R100.2019 because there is no difference between yield.1cut and yield.total in R70 due to a lack of follow-up cuts in 2018.

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: yield.total ~ follow.cut + (1 | block)
##    Data: subset(dat1, env == "R100.2019")
## 
##      AIC      BIC   logLik deviance df.resid 
##    640.7    648.5   -315.3    630.7       30 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.3762 -0.6881 -0.3625  0.4400  2.8546 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  block    (Intercept)       0     0    
##  Residual             3920355  1980    
## Number of obs: 35, groups:  block, 4
## 
## Fixed effects:
##                     Estimate Std. Error t value
## (Intercept)           4646.0      571.6   8.128
## follow.cutseptember   1680.7      808.3   2.079
## follow.cutoctober     2098.9      826.5   2.540
## 
## Correlation of Fixed Effects:
##             (Intr) fllw.cts
## fllw.ctsptm -0.707         
## fllw.ctctbr -0.692  0.489  
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular

## 
## Call:
## lm(formula = yield.total ~ follow.cut, data = subset(dat1, env == 
##     "R100.2019"), na.action = na.omit)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2724.8 -1362.5  -717.8   871.2  5652.1 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           4646.0      597.8   7.772 7.29e-09 ***
## follow.cutseptember   1680.7      845.4   1.988   0.0554 .  
## follow.cutoctober     2098.9      864.4   2.428   0.0210 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2071 on 32 degrees of freedom
## Multiple R-squared:  0.174,  Adjusted R-squared:  0.1223 
## F-statistic:  3.37 on 2 and 32 DF,  p-value: 0.04699

## Analysis of Variance Table
## 
## Response: yield.total
##            Df    Sum Sq  Mean Sq F value  Pr(>F)  
## follow.cut  2  28896381 14448191  3.3695 0.04699 *
## Residuals  32 137212414  4287888                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

conclusion

In the second field year, no differences in forage yield were detected among the different follow-up cut treatments. This finding is surprising since plots without follow-up cuts recieved a lot more GDD and precip prior to their first cut the following year than plots that recieved a follow-up cut in October. Despite a plot being cut multiple times in the first year of the experiment, there was no observable differnce in its forage yield the following year at the first cut of forage yield, the second cut, and the total yield for that plot. Though it should be acknowledged that since R70.2018 did not recieve follow-up cuts, we are relying on two environments for our conclusion that no differences occured at yield.1cut, but only one environment for differences occured at yield.2cut and yield.total

Moving forward in the analysis, there is little reason to believe that the follow.cut treatments will have main effects, simple effects or interactions with timing.1cut for forage yield. There is also less reason to believe that field.year will have an appreciable effect on forage yields, though there may still be impacts on RFQ.

RFQ.1cut~follow.cut

Does the RFQ in the second field year differ between plots that received follow-up cuts the previous year?

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: RFQ.1cut ~ follow.cut * year + (1 | block)
##    Data: subset(dat1, field.year == "second")
## 
##      AIC      BIC   logLik deviance df.resid 
##    581.5    599.6   -282.7    565.5       63 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.8626 -0.7730 -0.1776  0.7856  2.3650 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  block    (Intercept)   0.0     0.00   
##  Residual             168.5    12.98   
## Number of obs: 71, groups:  block, 4
## 
## Fixed effects:
##                              Estimate Std. Error t value
## (Intercept)                   85.4517     3.7469  22.806
## follow.cutseptember            1.7453     5.2989   0.329
## follow.cutoctober              2.3523     5.2989   0.444
## year2019                     -11.5649     5.2989  -2.183
## follow.cutseptember:year2019  -1.2442     7.4938  -0.166
## follow.cutoctober:year2019    -0.9358     7.5785  -0.123
## 
## Correlation of Fixed Effects:
##               (Intr) fllw.cts fllw.ctc yr2019 fllw.cts:2019
## fllw.ctsptm   -0.707                                       
## fllw.ctctbr   -0.707  0.500                                
## year2019      -0.707  0.500    0.500                       
## fllw.cts:2019  0.500 -0.707   -0.354   -0.707              
## fllw.ctc:2019  0.494 -0.350   -0.699   -0.699  0.494       
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular

## 
## Call:
## lm(formula = RFQ.1cut ~ follow.cut * year, data = subset(dat1, 
##     field.year == "second"), na.action = na.omit)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -24.176 -10.033  -2.306  10.197  30.697 
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                   85.4517     3.9160  21.821   <2e-16 ***
## follow.cutseptember            1.7453     5.5381   0.315   0.7537    
## follow.cutoctober              2.3523     5.5381   0.425   0.6724    
## year2019                     -11.5649     5.5381  -2.088   0.0407 *  
## follow.cutseptember:year2019  -1.2442     7.8320  -0.159   0.8743    
## follow.cutoctober:year2019    -0.9358     7.9205  -0.118   0.9063    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 13.57 on 65 degrees of freedom
## Multiple R-squared:  0.1863, Adjusted R-squared:  0.1237 
## F-statistic: 2.976 on 5 and 65 DF,  p-value: 0.01762

## Analysis of Variance Table
## 
## Response: RFQ.1cut
##                 Df  Sum Sq Mean Sq F value    Pr(>F)    
## follow.cut       2    54.7   27.34  0.1486 0.8622397    
## year             1  2678.7 2678.73 14.5565 0.0003055 ***
## follow.cut:year  2     5.0    2.51  0.0137 0.9864393    
## Residuals       65 11961.4  184.02                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We conclude that the RFQ.1cut of the second year did not differ depending on whether the previous year the plot recieved follow-up cuts.

RFQ.2cut~follow.cut

## Warning: Removed 48 rows containing non-finite values (stat_density).

## Warning: Removed 48 rows containing non-finite values (stat_smooth).

## Warning: Removed 48 rows containing missing values (geom_point).

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Gamma  ( log )
## Formula: RFQ.2cut ~ follow.cut + (1 | block)
##    Data: subset(dat1, field.year == "second")
## 
##      AIC      BIC   logLik deviance df.resid 
##    156.5    161.0    -74.2    148.5       19 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.3428 -0.6038  0.2199  0.6743  1.3114 
## 
## Random effects:
##  Groups   Name        Variance  Std.Dev.
##  block    (Intercept) 0.0008213 0.02866 
##  Residual             0.0044862 0.06698 
## Number of obs: 23, groups:  block, 4
## 
## Fixed effects:
##                   Estimate Std. Error t value Pr(>|z|)    
## (Intercept)        4.43686    0.03077  144.20   <2e-16 ***
## follow.cutoctober -0.01402    0.02752   -0.51     0.61    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr)
## fllw.ctctbr -0.417

## 
## Call:
## glm(formula = RFQ.2cut ~ follow.cut, family = Gamma(link = "log"), 
##     data = subset(dat1, field.year == "second"), na.action = na.omit)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.20028  -0.03492   0.02188   0.04739   0.09218  
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        4.437528   0.021455 206.825   <2e-16 ***
## follow.cutoctober -0.007796   0.031025  -0.251    0.804    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for Gamma family taken to be 0.005524039)
## 
##     Null deviance: 0.12373  on 22  degrees of freedom
## Residual deviance: 0.12338  on 21  degrees of freedom
##   (48 observations deleted due to missingness)
## AIC: 154.88
## 
## Number of Fisher Scoring iterations: 3

## Analysis of Deviance Table (Type II tests)
## 
## Response: RFQ.2cut
##            LR Chisq Df Pr(>Chisq)
## follow.cut 0.063139  1     0.8016

There was no difference among the RFQ.2cut the following year based on whether there was a follow-up cut previously

RFQ.total~follow.cut

When we look at the total yield, we are basically just looking at R100.2019 because there is no difference between RFQ.1cut and RFQ.total in R70 due to a lack of follow-up cuts in 2018.

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: RFQ.total ~ follow.cut + (1 | block)
##    Data: subset(dat1, env == "R100.2019")
## 
##      AIC      BIC   logLik deviance df.resid 
##    247.6    255.4   -118.8    237.6       30 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.6364 -0.7268 -0.1787  0.9453  1.5905 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  block    (Intercept)  0       0.000   
##  Residual             52       7.211   
## Number of obs: 35, groups:  block, 4
## 
## Fixed effects:
##                     Estimate Std. Error t value
## (Intercept)           73.887      2.082  35.494
## follow.cutseptember    1.481      2.944   0.503
## follow.cutoctober      2.481      3.010   0.824
## 
## Correlation of Fixed Effects:
##             (Intr) fllw.cts
## fllw.ctsptm -0.707         
## fllw.ctctbr -0.692  0.489  
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular

## 
## Call:
## lm(formula = RFQ.total ~ follow.cut, data = subset(dat1, env == 
##     "R100.2019"), na.action = na.omit)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -11.801  -5.241  -1.288   6.816  11.469 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           73.887      2.177  33.939   <2e-16 ***
## follow.cutseptember    1.481      3.079   0.481    0.634    
## follow.cutoctober      2.481      3.148   0.788    0.436    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.542 on 32 degrees of freedom
## Multiple R-squared:  0.01941,    Adjusted R-squared:  -0.04188 
## F-statistic: 0.3167 on 2 and 32 DF,  p-value: 0.7308

## [1] 245.6199

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: RFQ.total
##             Chisq Df Pr(>Chisq)
## follow.cut 0.6927  2     0.7073

## Analysis of Variance Table
## 
## Response: RFQ.total
##            Df  Sum Sq Mean Sq F value Pr(>F)
## follow.cut  2   36.02  18.010  0.3167 0.7308
## Residuals  32 1820.03  56.876

No differences are detected in the RFQ we expect for the final total forage yield of a plot in the second year of production among the follow.cut treatments.

conclusion

We expect the RFQ to increase with follow-up cuts, but were unsure if the impact of cutting plots in year one different amount of times would impact the RFQ in the following year. With a limited dataset of only one site, we did not detect any differences in RFQ in the second year based on the frequency of follow-up harvests in year one.

We can feel confident in combining across field.years in order to observe the proportional contributions of the spring and follow-up cuts towards the year’s quantity and quality of forage

#Effect of the follow-up cuts on both field.years

proportion graphs

yield

Of the total yield, what is the proportional contribution of the follow-up cuts to the final yield?

## Warning: Removed 181 rows containing non-finite values (stat_summary).

The spring cut provides the bulk of the total forage yield

is the total forage yield differing by the follow.cut?

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Gamma  ( log )
## Formula: yield.total ~ follow.cut + (1 | block)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
##   2592.7   2607.5  -1291.4   2582.7      138 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.6225 -0.7967 -0.0410  0.6572  3.3354 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  block    (Intercept) 0.003539 0.05949 
##  Residual             0.311785 0.55838 
## Number of obs: 143, groups:  block, 4
## 
## Fixed effects:
##                     Estimate Std. Error t value Pr(>|z|)    
## (Intercept)          8.11978    0.09723  83.509   <2e-16 ***
## follow.cutseptember  0.17184    0.12324   1.394    0.163    
## follow.cutoctober    0.16720    0.12404   1.348    0.178    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) fllw.cts
## fllw.ctsptm -0.634         
## fllw.ctctbr -0.629  0.497

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: yield.total
##             Chisq Df Pr(>Chisq)
## follow.cut 2.5136  2     0.2846

Ok, so aparently there’s too much variation in data to reject null hypothesis that there is a difference in yield.total among the follow.cut treatments

RFQ

If all the forage from a plot over the course of a growing season was mixed together before selling, what would be its RFQ and to what extent did each harvest contribute to that RFQ?

## Warning: Removed 182 rows containing non-finite values (stat_summary).

Doesn’t appear to be a difference among follow.cuts, but let’s look

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: Gamma  ( log )
## Formula: RFQ.total ~ follow.cut + (1 | block)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
##   1278.1   1293.0   -634.1   1268.1      138 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.3679 -0.8255 -0.2906  0.8096  3.4120 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  block    (Intercept) 0.001062 0.03259 
##  Residual             0.054084 0.23256 
## Number of obs: 143, groups:  block, 4
## 
## Fixed effects:
##                     Estimate Std. Error t value Pr(>|z|)    
## (Intercept)         4.508921   0.040848 110.383   <2e-16 ***
## follow.cutseptember 0.002741   0.045604   0.060    0.952    
## follow.cutoctober   0.016923   0.045850   0.369    0.712    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) fllw.cts
## fllw.ctsptm -0.558         
## fllw.ctctbr -0.555  0.497

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: RFQ.total
##             Chisq Df Pr(>Chisq)
## follow.cut 0.1565  2     0.9247

We estimate the RFQ does not differ for plots with or without follow-up cuts.

Maximizing yield: only spring forage cut (yield.1cut)

What is the best time to harvest IWG forage in order to maximize your yield?

Here we look at the forage yield (dry biomass on a kg ha-1 basis) of the first cut of the year

distribution

First, let’s look at the distribution of the spring forage yield response.

The distribution of the first year looks normal, but the second year appears more Gamma/poisson.

variance

Variance seems homogenous except for the anthesis cuts in the second year, we’ll need to be aware of this limitation of our data.

Growing Degree Days vs. yield.1cut

How do GDD accumulation explain the observed forage yields?

Since we are exploring the data and we expect that the treatments from one field.year may confound the data from the second field year, we are going to see if the patterns in GDD accumulation and yield are similar within each field year.

yield.1cut vs. GDD.1cut where field.year=“first”

Now, we look only at data from the first year of the experiment, thereby removing any effects of the follow-on harvest treatments (follow.cut)

It appears as more GDD accumulated prior to the first forage cut, the forage yields were greater at R100 and didn’t really change or had a negative slope at R70.

Since we are only looking at the data for the spring cut in the first year, this is the only instance where the I2 data could be used.

When I2 is included, the positive relationship between GDD accumulation and yield is even less clear. While overall there is a positive correlation, this is driven primarily by the R100, and both I2 and R70 show slightly negative relationships.

A negative relationship between forage yield and growing degree days is not expected and may indicate that another environmental factor like precipitation may be influencing the response. Let’s look at the data by environment rather than site.

It’s strange that the GDD of the first cut is identical for both environments in 2017, indicating they were harvested on the same day. Investigating the notes, it appears that in 2017, I2 and R70 were harvested on the same day for all three spring harvest timings, so that checks out.

It’s noteworthy that harvests in R100.2018 occurred after less GDD accumulated, which indicates the plants were maturing faster in 2018 as compared to 2017, and since both R70 and R100 had the same age, difference in rate of maturity could be due to site factors that are hard to quantify (i.e. soil texture, soil health) or they could be simply due to precipitation

It’s clear that the accumulated precipitation between spring forage harvests differed between 2017 and 2018. Only about 250 mm of rain accumulated before the cut at the boot stage in 2018, whereas almost twice as much has accumulated before the first cut at the boot stage in 2017. There’s a couple ways this can be interpreted, were forage yields reduced for the boot cut in R100.2018 because of a lack of rain and that’s why we see a positive slope or was a lack of rain after the boot cut in 2017 that caused the lack of a positive slope. There may be a better way to visualize this.

At this point, it makes sense that the I2 data shows a negative slope because the IWG stand, as previously explained, was old and starting to die and did die after the grain harvest in 2017, so let’s just remove I2 and look at it again

So this might be something to be aware of in the discussion. Need to tease this out a bit more.

yield.1cut vs. GDD.1cut where field.year=“second”

Next we look at forage yields in the second year based on what treatment was applied in the first field year.

So here we are looking at the GDD that accumulated in year 2 on the x axis prior to the first cut and the forage yield of that first cut.
Regression lines suggest that in the second year, as GDD accumulated the forage yield was greater.
Regression lines also indicate very little difference among the follow.cut treatments. This is even more obvious if you allow se=T in the geom_smooth equation.
But remember that the kernza plants with follow.cut=none got about 2 more months of GDD in the first field year than kernza plants where follow.cut=october.

Now we will add in the additional GDD that accumulated in first field.year after the last cut in a given plot.

To do this, we need to need to create a new dataframe that is just field.year=second
then create a new dataframe that is just field.year=first
match these dataframes by row and mutate a new column that’s gdd.1cut of the dataframe for the second year plus the gdd.extra of that same plot the previous year

So here we are looking at the GDD that accumulated for the IWG plants after they were last cut in year 1 and before they were cut again in year 2.

Its remarkable that the plots with no follow-up cut received almost twice the average GDD than plots that were cut again in september and almost 4 times that amount of GDD than plots cut in October prior to the first cut in the spring the following year.

Did this translate to differences in yield?

##             numDF denDF   F-value p-value
## (Intercept)     1    61 3.1381474  0.0815
## follow.cut      2    61 0.1201979  0.8870

This is interesting! I would expect that plants that did not receive a follow-up cut (and therefore would accumulate more GDD compared to the plots with follow-up harvests), would have a greater forage yield for the yield.1cut the following year. However, despite follow.cut=none plots receiving upwards of 4x GDD compared with plots where follow.cut=october, the yield the following year was similar. When we do not take into account the GDD accumulation of the previous year and only look at the GDD accumulation in the field.year when the harvest occurred, then the GDD accumulation and yield allign.

This suggests that the GDD that accumulated after the last cut in year1 did not really have that great of an effect on the yield the following year. It also suggests that whether the plot was harvested once, twice or three times following harvest the first year did not have an impact on the first yield of the second year.
This may be worth further analyzing as one of our hypothesis is that multiple harvests on kernza plants will reduce their forage yield the following year.

yield.1cut vs. gdd.1cut

Since there doesn’t seem to be an effect of the harvest intensity (follow.cut) on the following field.year, we can combine across field.years!

let’s look at the relationship between yield and GDD among years…

Ok, when looking at year (which helps incorporate precip), it seems there’s a positive relationship between yield and GDD across all years except maybe 2017.
Since the data isn’t super even, let’s seperate by environment to get a better sense of things

There seems to be a consistent positive association between forage yield and GDD except for in 2017.
Did 2017 differ in precipitation or something? That could explain the non-positive association between GDD and forage yield

Let’s look at rain for 2017 vs the other years

weather for context

Now let’s add dates in 2017 and 2018 when the boot, anthesis and dough cut occurred.

It’s hard to tell without the another line depicting the 30 year average which year was weird, but it looks like 2018 happened to be drier than other years, especially in late May. Looking at both environments in 2018, the yield was low at the boot cut, making the slope more positive. The precipiation somewhat explains why the relationship between GDD and yield.1cut was so positive in 2018 (because plants were water-stressed at the boot stage harvest and then got water causing forage yields to increase), but there isn’t really that much of a pattern here since in 2019 there was a strong positive relationship between GDD and yield despite no water stress.

Trying to absorb the weirdness of R70.2017 by separating by site and keeping years together was a smart move by Jake

yield.1cut vs GDD, seperated by site

Much better, nice consistent trend

regression comparison of yield.cut vs. GDD.1cut

Now let’s compare a linear regression with a quadratic regression

Now let’s statistically compare the linear and quadratic model for goodness of fit.

##             numDF denDF  F-value p-value
## (Intercept)     1   134 22.21362  <.0001
## gdd.1cut        1   134 31.50397  <.0001

## Linear mixed-effects model fit by maximum likelihood
##  Data: dat1 
##        AIC      BIC    logLik
##   2506.082 2520.896 -1248.041
## 
## Random effects:
##  Formula: ~1 | site
##         (Intercept)
## StdDev:    942.6386
## 
##  Formula: ~1 | block %in% site
##         (Intercept) Residual
## StdDev:    282.3626 1441.402
## 
## Fixed effects: yield.1cut ~ gdd.1cut 
##                Value Std.Error  DF  t-value p-value
## (Intercept) 951.3149  801.9483 134 1.186255  0.2376
## gdd.1cut      1.8102    0.3225 134 5.612840  0.0000
##  Correlation: 
##          (Intr)
## gdd.1cut -0.511
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -1.4620552 -0.5899343 -0.2642517  0.2075620  4.4722016 
## 
## Number of Observations: 143
## Number of Groups: 
##            site block %in% site 
##               2               8

##               numDF denDF  F-value p-value
## (Intercept)       1   133 19.87100  <.0001
## gdd.1cut          1   133 37.66029  <.0001
## I(gdd.1cut^2)     1   133 26.77571  <.0001

##      Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## mod1     1  5 2506.082 2520.896 -1248.041                        
## mod2     2  6 2483.244 2501.021 -1235.622 1 vs 2 24.83836  <.0001

We conclude model 2 (the quadratic) fits the data better and differs from model 1.

summary stats of regression; yield vs gdd

Now we determine at what level of GDD accumulation our maximimum forage yield.1cut occurs by calculate at what point our quadratic function we fitted onto the yield.1cut vs. GDD data has a slope of zero, and the corresponding GDD, this value is our “agronomically optimal growing degree day accumulation” or AOGDD

## estAOGDD 
## 1484.834

## estAOGDD 
## 4118.921

conclusion of GDD for yield.1cut

We conclude that our maximum yield for the first forage cut is 4,103 kg ha dry forage biomass and that this is achievable when 1,468 GDD have accumulated, which appears to be after anthesis and before dough stage.

Timing.1cut vs. yield.1cut

Did the forage yield of the first cut differ among timing.1cut treatments?

Interaction testing, model building

The forage yield response is normally distributed and variance is homogeneous, but if sliced by year the second year will require a glm with link=“log”.

The main effect of interest is the timing.1cut. This fixed effect was based on the physiological stage of the IWG and therefore somewhat takes into account the random effects of year conditions (GDD and precip accumulation), site conditions (fertility differences) and even differences in the age of the kernza stand. We can model these random effects as a singular factor, environment. The additional random effect of block within environment can be nested within environment.

The repeated measuring of an IWG stand across two field.years, however, violates the assumption of independent sampling. Since each environment is treated as a random effect, this should solve that problem because each environment is able to have different y intercept. In general, with fewer than 4 repeated measures it does not make sense to estimate the serial correlation structure and is easier to just make it a random effect.

##             numDF denDF  F-value p-value
## (Intercept)     1   137 26.64464  <.0001
## timing.1cut     2   137 17.08650  <.0001

##  timing.1cut emmean  SE df lower.CL upper.CL .group
##  boot          2377 648  3      314     4440  1    
##  dough         3693 649  3     1628     5757   2   
##  anthesis      3704 648  3     1642     5767   2   
## 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 3 estimates 
## significance level used: alpha = 0.05

This model concludes that the boot cut is typically lower in forage yield for the first cut than the anthesis and dough cuts. This makes sense and agrees with trends in our data, though R70.2017 likely would result in interactions if included

Let’s compare my model with Jake’s approach where an interaction between timing.1cut and field.year is included and site, rather than environment, is used as the random effect.

First test to see if interaction with the year of the experiment

##                        numDF denDF   F-value p-value
## (Intercept)                1   130 22.436628  <.0001
## timing.1cut                2   130 14.591185  <.0001
## field.year                 1   130  3.073607  0.0819
## timing.1cut:field.year     2   130 11.138958  <.0001

There is an interaction between timing.1cut and field.year. This seems to be driven by high variability at the anthesis harvest of the second year. There are no outliers, but that anthesis harvest looks weirdly variable and it seems to be driving the interaction

Since there’s an interaction between field.years and yield, we must separate by field.years.

Now we compare models based on AIC criterion and plot residuals

## integer(0)

## integer(0)

## integer(0)

## integer(0)

It’s hard to judge these residual plots.

We could compare models, but the ones that are sliced by year will obviously have better fits and its pointless looking at the sya model because of the interaction

I’d argue that either model jpb.1 or splitting by year and doing sy1 and sy2 are good.

The AIC criterion and log liklihood will always reduce the more that we split up the data. The question is whether its necessary to include field.year as a fixed effect. According to my previous analysis of the follow.cut, there isn’t a reason to think field.year=1 had a lasting effect on field.year=2.

conclusion of mean comparison

If we do not seperate by field.years, our graph would look like this

##  timing.1cut emmean  SE df lower.CL upper.CL .group
##  boot          2377 648  3      314     4440  1    
##  dough         3693 649  3     1628     5757   2   
##  anthesis      3704 648  3     1642     5767   2   
## 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 3 estimates 
## significance level used: alpha = 0.05

##   timing.1cut  n     mean       se tukey
## 1        boot 48 2377.276 196.8551     b
## 2    anthesis 48 3704.417 346.1988     a
## 3       dough 47 3660.779 199.1394     a

Maximizing yield: all cuts summed together

What combination of the first cut timing and follow-up cuts yields the highest result?

We need to remember that R70.2018 did not have follow-up cuts, so we really have 3 environments with follow-up cuts: R70.2017, R100.2018 and R100.2019. We shouldn’t include the yield data from R70.2018 if there weren’t any followup cuts!

distribution

First, let’s look at the distribution of the spring forage yield response.

All sites appear normally distributed except the R70.2017 might benefit from a GLM.

variance

Variance of total yield appears to be homogenous except for the high yielding anthesis treatments in R100.2019

There does not appear to be any obvious trends at all.

Now, we expect the total yield to be greater with follow-up cuts (It can’t be lower unless it’s the second year and there’s some carryover effect, which we already tested and concluded there is no carryover effect). While we see what we expect in the R100 site, in R70.2017 there appears to be a decrease in total yield in the plots with follow-up cuts. This cannot be explained by precipitation (which was very similar to 2019) nor treatments from a previous year (this was the first year), I think it’s just random luck and very low yields in the follow-up cuts for the first year. More weirdness.

Treatment for total yield

##             numDF denDF  F-value p-value
## (Intercept)     1    97 63.58618  <.0001
## treatment       8    97  2.34653  0.0238

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: yield.total
##            Chisq Df Pr(>Chisq)   
## treatment 21.954  8   0.005001 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##  treatment emmean  SE df lower.CL upper.CL .group
##  BN          3400 717  2      314     6487  1    
##  BO          4088 717  2     1002     7174  12   
##  BS          4243 717  2     1157     7329  12   
##  AN          4352 717  2     1266     7438  12   
##  DN          4352 717  2     1266     7438  12   
##  DO          4866 717  2     1780     7952  12   
##  DS          4908 717  2     1822     7994  12   
##  AS          5489 717  2     2403     8575   2   
##  AO          5586 717  2     2500     8672   2   
## 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 9 estimates 
## significance level used: alpha = 0.05

##  treatment emmean  SE   df lower.CL upper.CL .group
##  BN          3381 794 10.2     1616     5146  1    
##  BO          4069 794 10.2     2303     5834  12   
##  BS          4224 794 10.2     2458     5989  12   
##  AN          4332 794 10.2     2567     6097  12   
##  DN          4333 794 10.2     2567     6098  12   
##  DO          4846 794 10.2     3081     6611  12   
##  DS          4888 794 10.2     3123     6653  12   
##  AS          5469 794 10.2     3704     7234   2   
##  AO          5566 794 10.2     3801     7331   2   
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 9 estimates 
## significance level used: alpha = 0.05

Unsurprisingly, the plots with the greatest total yield combined across the three environments with total yield data favors the cuts at anthesis and with september and october followup cuts.

now we make a nice graph

##  treatment emmean  SE df lower.CL upper.CL .group
##  BN          3400 717  2      314     6487  1    
##  BO          4088 717  2     1002     7174  12   
##  BS          4243 717  2     1157     7329  12   
##  AN          4352 717  2     1266     7438  12   
##  DN          4352 717  2     1266     7438  12   
##  DO          4866 717  2     1780     7952  12   
##  DS          4908 717  2     1822     7994  12   
##  AS          5489 717  2     2403     8575   2   
##  AO          5586 717  2     2500     8672   2   
## 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 9 estimates 
## significance level used: alpha = 0.05

##   treatment  n     mean       se tukey
## 1        BN 12 3400.492 388.0992     b
## 2        BS 12 4243.389 283.1127    ab
## 3        BO 12 4088.314 418.1572    ab
## 4        AN 12 4352.060 546.4632    ab
## 5        AS 12 5488.623 846.4588     a
## 6        AO 12 5585.631 808.9571     a
## 7        DN 12 4352.326 282.2937    ab
## 8        DS 12 4907.874 347.1665    ab
## 9        DO 12 4865.630 525.0484    ab

conclusion

maximum total yields were lower when just harvested at the boot stage compared with when harvested at anthesis and harvested again in september and september + october.

Plots that received follow-up harvests did not differ from plots that were only harvested once, assuming the timing of the first harvest was the same.

Relative Forage Quality (RFQ)

What are the best harvest times for maximizing quality?

So we have more data for the RFQ and RFV, which could allow us to look at grain cuts, but for now we won’t because we’ve been filtering out the I2 and grain cuts up until this point so it might be confusing for the reader.

–>

RFQ.1cut

distribution

First, let’s look at the distribution of the spring forage yield response.

yikes, this is a mess of distributions, not very normally distributed.

Looks like the RFQ went down in the second field.year. I wonder if the plots whose RFQ didn’t go down were the one’s without a followup harvest?

variance

As we expect, as the plant matures the RFQ decreases.

I noticed a blimp of high RFQ values in the second field year and thought they might the plots that didn’t recieve follow-up cuts, but it appears there isn’t a clear impact on RFQ in the second year if the IWG was harvested multiple times the year prior.

The variances seem pretty homogenous

First we’ll look at treatments, then we can mess around with GDD

interaction testing

There seems to be a consistent enough trend that we might be able to treat site and year as a fixed effect, but R100.2018 poses enough weirdness that it’s best to keep environment as random.

We conclude that a glm with environment and block nested within site as random effects provides the best fit and that a difference in RFQ.1cut occurred and that earlier cuts have higher RFQ. I tried even more approaches and all came to the same conclusion. The best fit I achieved actually allowed the intensity of timing.1cut to vary within environments (below), though conclusions were all the same

conclusion

The RFQ differs depending on whether the first cut occurs at the boot, anthesis or dough stage. RFQ is greatest at the boot stage, less at the anthesis stage and least at the dough stage.

And remember, we already tested for any effects from the follow.cut and there were none, but I ran some more code just to check.

Only significant effects on RFQ.1cut were from the timing of the first cut, the followup cut did not impact the RFQ.1cut at all.

RFQ.2cut

RFQ.3cut

RFQ.avgallcut

Since we’re looking at all spring and follow-up cuts, we want to use “dat3” which subsets the 3 environments that had both a spring cut and follow-up cut.

There does not seem to be a consistent pattern to the average RFQ of the total year forage yield of a plot.

Our analysis should probably keep environment as a random effect and will likely find that there is no difference in RFQ among treatments.

## Data: dat3
## Models:
## RFQt.glmer.3: RFQ.total ~ treatment + (1 | env)
## RFQt.glmer.2: RFQ.total ~ treatment + (1 | env) + (1 | site:block)
##              npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## RFQt.glmer.3   11 829.62 859.13 -403.81   807.62                         
## RFQt.glmer.2   12 808.53 840.71 -392.26   784.53 23.096  1  1.541e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: RFQ.total
##            Chisq Df Pr(>Chisq)    
## treatment 147.52  8  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##  treatment emmean    SE  df asymp.LCL asymp.UCL .group
##  DN          4.37 0.149 Inf      4.08      4.66  1    
##  DS          4.43 0.149 Inf      4.14      4.72  12   
##  DO          4.45 0.149 Inf      4.16      4.75  12   
##  AN          4.47 0.149 Inf      4.18      4.76  12   
##  AO          4.47 0.149 Inf      4.18      4.77  12   
##  AS          4.49 0.149 Inf      4.20      4.79   2   
##  BS          4.64 0.149 Inf      4.34      4.93    3  
##  BO          4.65 0.149 Inf      4.36      4.95    3  
##  BN          4.71 0.149 Inf      4.41      5.00    3  
## 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 9 estimates 
## significance level used: alpha = 0.05

Ok, nevermind, there does appear to be a difference among the plots that were first harvested at the boot cut. These plots tend to have higher RFQ.

##   treatment  n      mean       se tukey
## 1        BN 12 110.49888 6.190170     a
## 2        BS 12 103.14755 5.733120     a
## 3        BO 12 106.30427 8.147707     a
## 4        AN 12  88.68458 6.553009    bc
## 5        AS 12  90.37690 6.338576     b
## 6        AO 12  88.63966 6.499657    bc
## 7        DN 12  79.07718 4.665204     c
## 8        DS 12  83.94587 5.000748    bc
## 9        DO 12  86.26620 5.411569    bc

$ net returns

Here we combine our estimates for the year’s total yield and estimated RFV among treatments to determine if any treatments are expected to provide a better net return.

The net returns ($) seems to decrease when there are follow-up harvests, likely due to the cost of the extra harvests.

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: return.total ~ treatment + (1 + treatment | env) + (1 | site:block)
##    Data: dat3
## 
##      AIC      BIC   logLik deviance df.resid 
##   1485.0   1635.2   -686.5   1373.0       52 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.66324 -0.62046 -0.03185  0.65261  2.47206 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev. Corr                               
##  site:block (Intercept)   6896    83.04                                      
##  env        (Intercept)  28021   167.39                                      
##             treatmentBS  25560   159.88   -0.97                              
##             treatmentBO  61695   248.38   -0.98  1.00                        
##             treatmentAN  81000   284.60   -0.93  0.99  0.99                  
##             treatmentAS 144915   380.68   -0.90  0.98  0.97  1.00            
##             treatmentAO 133489   365.36   -0.86  0.96  0.95  0.99  1.00      
##             treatmentDN  44190   210.21   -0.97  0.89  0.91  0.82  0.78  0.72
##             treatmentDS  62094   249.19   -0.98  0.90  0.92  0.84  0.80  0.74
##             treatmentDO  56228   237.13   -0.95  0.85  0.86  0.76  0.72  0.66
##  Residual                14195   119.14                                      
##             
##             
##             
##             
##             
##             
##             
##             
##             
##   1.00      
##   1.00  0.99
##             
## Number of obs: 108, groups:  site:block, 8; env, 3
## 
## Fixed effects:
##               Estimate Std. Error t value
## (Intercept)  368.42118  106.78579   3.450
## treatmentBS  -42.81360  104.33533  -0.410
## treatmentBO -220.95159  151.42919  -1.459
## treatmentAN   -4.46416  171.36410  -0.026
## treatmentAS   -0.02601  225.10160   0.000
## treatmentAO -125.69045  216.47641  -0.581
## treatmentDN   -9.80779  130.75079  -0.075
## treatmentDS  -62.39606  151.86769  -0.411
## treatmentDO -189.96553  145.28804  -1.308
## 
## Correlation of Fixed Effects:
##             (Intr) trtmBS trtmBO trtmAN trtmAS trtmAO trtmDN trtmDS
## treatmentBS -0.884                                                 
## treatmentBO -0.912  0.912                                          
## treatmentAN -0.871  0.906  0.940                                   
## treatmentAS -0.846  0.896  0.933  0.965                            
## treatmentAO -0.810  0.878  0.910  0.955  0.972                     
## treatmentDN -0.903  0.819  0.856  0.784  0.748  0.696              
## treatmentDS -0.913  0.833  0.875  0.806  0.773  0.722  0.939       
## treatmentDO -0.885  0.783  0.824  0.739  0.698  0.640  0.933  0.940
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular

## Data: dat3
## Models:
## nrt.lmer.3: return.total ~ treatment + (1 | env)
## nrt.lmer.2: return.total ~ treatment + (1 | env) + (1 | site:block)
## nrt.lmer.4: return.total ~ treatment + (1 + treatment | env) + (1 | site:block)
##            npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)   
## nrt.lmer.3   11 1475.0 1504.5 -726.50   1453.0                        
## nrt.lmer.2   12 1467.1 1499.3 -721.56   1443.1  9.881  1   0.001670 **
## nrt.lmer.4   56 1485.0 1635.2 -686.52   1373.0 70.086 44   0.007443 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## Linear mixed model fit by REML ['lmerMod']
## Formula: return.total ~ treatment + (1 | env) + (1 | site:block)
##    Data: dat3
## 
## REML criterion at convergence: 1352.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2263 -0.6764 -0.1954  0.5339  3.2538 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  site:block (Intercept)  8023     89.57  
##  env        (Intercept)  4628     68.03  
##  Residual               35606    188.70  
## Number of obs: 108, groups:  site:block, 8; env, 3
## 
## Fixed effects:
##               Estimate Std. Error t value
## (Intercept)  362.90262   74.75495   4.855
## treatmentBS  -42.81360   77.03462  -0.556
## treatmentBO -220.95159   77.03462  -2.868
## treatmentAN   -4.46416   77.03462  -0.058
## treatmentAS   -0.02601   77.03462   0.000
## treatmentAO -125.69045   77.03462  -1.632
## treatmentDN   -9.80779   77.03462  -0.127
## treatmentDS  -62.39606   77.03462  -0.810
## treatmentDO -189.96553   77.03462  -2.466
## 
## Correlation of Fixed Effects:
##             (Intr) trtmBS trtmBO trtmAN trtmAS trtmAO trtmDN trtmDS
## treatmentBS -0.515                                                 
## treatmentBO -0.515  0.500                                          
## treatmentAN -0.515  0.500  0.500                                   
## treatmentAS -0.515  0.500  0.500  0.500                            
## treatmentAO -0.515  0.500  0.500  0.500  0.500                     
## treatmentDN -0.515  0.500  0.500  0.500  0.500  0.500              
## treatmentDS -0.515  0.500  0.500  0.500  0.500  0.500  0.500       
## treatmentDO -0.515  0.500  0.500  0.500  0.500  0.500  0.500  0.500

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: return.total
##            Chisq Df Pr(>Chisq)  
## treatment 19.788  8    0.01117 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##  treatment emmean   SE   df lower.CL upper.CL .group
##  BO           142 75.2 11.5    -22.7      307  1    
##  DO           173 75.2 11.5      8.3      338  1    
##  AO           237 75.2 11.5     72.6      402  1    
##  DS           301 75.2 11.5    135.9      465  1    
##  BS           320 75.2 11.5    155.5      485  1    
##  DN           353 75.2 11.5    188.5      518  1    
##  AN           358 75.2 11.5    193.8      523  1    
##  AS           363 75.2 11.5    198.2      528  1    
##  BN           363 75.2 11.5    198.3      528  1    
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 9 estimates 
## significance level used: alpha = 0.05

##   treatment  n     mean       se tukey
## 1        BN 12 369.7437 68.24694      
## 2        BS 12 326.9301 45.38144      
## 3        BO 12 148.7921 54.21684      
## 4        AN 12 365.2795 53.53136      
## 5        AS 12 369.7177 86.34685      
## 6        AO 12 244.0532 89.12097      
## 7        DN 12 359.9359 38.86078      
## 8        DS 12 307.3476 46.65293      
## 9        DO 12 179.7782 60.25688

For context, alfalfa averages around $300 per hectare net return.

So this suggests among the three environments we have full datasets for, you want to cut your IWG at sometime between boot and dough (doesn’t matter) and then HOLD and you might get net returns similar to alfalfa. https://hayandforage.com/article-1862-crop-prices-may-favor-alfalfa-in-2018.html

year as a fixed effect

we have data from 3 environments and 3 different years (2017,2018,2019). The differences in weather may help explain the response we observed, but first we need to see if there’s a signfiicant interaction between treatment and year.

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: return.total
##                  Chisq Df Pr(>Chisq)    
## treatment       52.752  8  1.204e-08 ***
## year            16.112  2  0.0003171 ***
## treatment:year 142.807 16  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

2017

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: return.total
##           Chisq Df Pr(>Chisq)    
## treatment 65.84  8  3.292e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##  treatment emmean   SE   df lower.CL upper.CL .group
##  AO          16.6 79.8 14.3   -154.2      187  1    
##  BO          32.3 79.8 14.3   -138.6      203  12   
##  DO          96.1 79.8 14.3    -74.7      267  12   
##  AS         101.4 79.8 14.3    -69.4      272  12   
##  DS         194.3 79.8 14.3     23.5      365  12   
##  AN         214.8 79.8 14.3     44.0      386  12   
##  DN         304.2 79.8 14.3    133.4      475  123  
##  BS         338.8 79.8 14.3    167.9      510   23  
##  BN         609.5 79.8 14.3    438.7      780    3  
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 9 estimates 
## significance level used: alpha = 0.05

##   treatment n      mean        se tukey
## 1        BN 4 609.53111 132.11544     a
## 2        BS 4 338.75536 126.42822    ab
## 3        BO 4  32.25074  82.94215    bc
## 4        AN 4 214.79474  50.36329    bc
## 5        AS 4 101.37287  43.00303    bc
## 6        AO 4  16.64983  28.33674     c
## 7        DN 4 304.23440  58.41648   abc
## 8        DS 4 194.30495  76.72982    bc
## 9        DO 4  96.07121  49.73881    bc

In 2017, there was average precipitation and faster GDD accumulation in the early part of the summer.

This may explain why there are greater net returns among the spring cuts and especially at the boot cut.

2018

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: return.total
##            Chisq Df Pr(>Chisq)    
## treatment 33.303  8  5.431e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##  treatment emmean   SE   df lower.CL upper.CL .group
##  AO           135 74.7 9.39    -32.4      303  1    
##  BO           149 74.7 9.39    -18.7      317  1    
##  BN           236 74.7 9.39     68.5      404  12   
##  BS           272 74.7 9.39    104.5      440  12   
##  AS           311 74.7 9.39    143.6      479  12   
##  AN           316 74.7 9.39    148.4      484  12   
##  DO           322 74.7 9.39    154.0      490  12   
##  DS           433 74.7 9.39    265.5      601   2   
##  DN           452 74.7 9.39    283.9      620   2   
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 9 estimates 
## significance level used: alpha = 0.05

##   treatment n     mean        se tukey
## 1        BN 4 236.3270  59.77077    ab
## 2        BS 4 272.3850  56.30424    ab
## 3        BO 4 149.2078  83.14144     b
## 4        AN 4 316.2894  52.36816    ab
## 5        AS 4 311.4348  54.47804    ab
## 6        AO 4 135.4290  64.43356     b
## 7        DN 4 451.7633  88.26447     a
## 8        DS 4 433.3613  90.11991     a
## 9        DO 4 321.8455 103.70178    ab

In 2018 there was slower GDD accumulation and less precipitation during June.

This may explain the lack of large net returns during the spring cuts, and the relatively greater net returns later in the season. This also explains why in 2018 when there was lower than average rainfall, it would make sense to wait later in the spring to take your first cut (dough stage)

2019

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: return.total
##            Chisq Df Pr(>Chisq)    
## treatment 43.344  8  7.566e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##  treatment emmean   SE   df lower.CL upper.CL .group
##  DO           121 91.5 18.8    -70.3      313  1    
##  BN           263 91.5 18.8     71.6      455  12   
##  BO           265 91.5 18.8     73.2      457  12   
##  DS           294 91.5 18.8    102.6      486  12   
##  DN           324 91.5 18.8    132.1      516  123  
##  BS           370 91.5 18.8    177.9      561  123  
##  AN           565 91.5 18.8    373.0      756   23  
##  AO           580 91.5 18.8    388.3      772   23  
##  AS           696 91.5 18.8    504.6      888    3  
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 9 estimates 
## significance level used: alpha = 0.05

##   treatment n     mean        se tukey
## 1        BN 4 263.3730  36.55323    bc
## 2        BS 4 369.6500  43.05000   abc
## 3        BO 4 264.9178  97.56956    bc
## 4        AN 4 564.7545  68.04991    ab
## 5        AS 4 696.3454 128.44958     a
## 6        AO 4 580.0809 153.63807    ab
## 7        DN 4 323.8100  33.59551   abc
## 8        DS 4 294.3767  17.98926    bc
## 9        DO 4 121.4178 128.50356     c

In 2019 there was average GDD accumulation and above average rainfall.

For some unknown reason, the anthesis yields were very high that year and it is unclear why.

Net returns sliced by year

summarize all graph in one place

Jakes code remainder

RFQ.1cut vs. GDD

Review of experiment

Over 3 sites locations and 3 years, we selected two consecutive years of data for two sites. The R70 site was planted in the fall of 2015 and we collected data for 2017 and 2018. The R100 site was planted in 2016 and we collected data in 2018 and 2019. As a result, both sites had data collected when the IWG was in its second year of production and the following year when it was in its third year of production. Both sites were also located about 0.25 kilometers apart and on the same soil type (Tallula silt loam with very good drainage and water storage capacity).

In 2017, there was average precipitation and faster GDD accumulation in the early part of the summer. In 2018 there was slower GDD accumulation and less precipitation during June. In 2019 there was average GDD accumulation and above average rainfall.

We expected to observe a well known trend, as more GDD accumulated, forage yield would increase and as forage yield increased the forage quality would decrease. Our question was how forage yield and quality were impacted based on the timing of our first forage cut and/or whether we performed follow up cuts later in the season. Ultimately, we wanted to combine forage yield and quality data to get an estimate of how much $ per hectare could be expected for different combinations of first cut timing and follow-up cut and identify the most profitable system for IWG as a forage.

Methods for analysis

Key conclusions

## # A tibble: 143 x 2
##    env         age
##    <fct>     <int>
##  1 R100.2018     2
##  2 R100.2018     2
##  3 R100.2018     2
##  4 R100.2018     2
##  5 R100.2018     2
##  6 R100.2018     2
##  7 R100.2018     2
##  8 R100.2018     2
##  9 R100.2018     2
## 10 R100.2018     2
## # ... with 133 more rows

## [1] 2 3

FIG Data Analysis

Jesse Puka-Beals

02 03 2021

TO DO

Overview of missing and problematic data

outliers

Overview of key analysis questions

case study: yield.1cut vs. timing.1cut

Site conditions

yield.1cut~follow.cut

yield.2cut~follow.cut

yield.total~follow.cut

conclusion

RFQ.1cut~follow.cut

RFQ.2cut~follow.cut

RFQ.total~follow.cut

conclusion

proportion graphs

yield

RFQ

Maximizing yield: only spring forage cut (yield.1cut)

distribution

variance

Growing Degree Days vs. yield.1cut

yield.1cut vs. GDD.1cut where field.year=“first”

yield.1cut vs. GDD.1cut where field.year=“second”

yield.1cut vs. gdd.1cut

weather for context

yield.1cut vs GDD, seperated by site

regression comparison of yield.cut vs. GDD.1cut

summary stats of regression; yield vs gdd

conclusion of GDD for yield.1cut

Timing.1cut vs. yield.1cut

Interaction testing, model building

conclusion of mean comparison

Maximizing yield: all cuts summed together

distribution

variance

Treatment for total yield

conclusion

Relative Forage Quality (RFQ)

RFQ.1cut

distribution

variance

interaction testing

conclusion

RFQ.2cut

RFQ.3cut

RFQ.avgallcut

$ net returns

year as a fixed effect

2017

2018

2019

Net returns sliced by year

Jakes code remainder

RFQ.1cut vs. GDD

Review of experiment

Methods for analysis

Key conclusions