TO DO

Need to figure out how to present precip data Add in Jake’s code find sources for Jake

Overview of missing and problematic data

Here we analyze data from the Forage Timing Experiment.

missing data
The data for both fall harvests of at the R70 site in 2018 are missing from the dataset
The data from some plots where the spring harvest was at the grain stage is missing from the dataset

problematic data
The IWG at the I2 site was 6 years old and did not grow back after the first spring harvest. Of the data that was collected for the spring harvest in 2017 (environment=I2.2017), the data had a negative relationship between accumulated GDD and yield, unlike all other environments. Considering the I2.2017 IWG did not grow back after the spring harvest and the negative relationship between GDD and yield in 2017, its very likely the old age of this site caused it to respond very differently than the two other sites. It was decided to remove the limited data from I2 from the dataset. In some instances, the I2 data is presented just to illustrate the difference of its performance.

Spring Forage yield (yield.1cut)

Here we look at the forage yield (dry biomass on a kg ha-1 basis) of the first cut of the year

visualize and explore yield response

distribution

First, let’s look at the distribution of the spring forage yield response.

The distribution of the first year looks normal, but the second year appears more poisson.

variance

More on this later

Seperate by field years?

Is there an interaction between the timing of the first cut and the field year of the experiment that would justify seperating by field years?

##                        numDF denDF   F-value p-value
## (Intercept)                1   131 12.138391  0.0007
## timing.1cut                2   131 13.462861  <.0001
## field.year                 1   131  3.856811  0.0517
## timing.1cut:field.year     2   131 11.511820  <.0001

There is a significant interaction between the timing of when the plots were harvested for the first time (boot vs. anthesis vs. dough) and whether the experiment was in the first or second year. Therefore there is justification to seperate data by the field year.

This interaction will be important for mean comparisons later.

Growing Degree Days and yield

yield.1cut vs. GDD.1cut where field.year=“first”

How is the forage yield of the first cut related to the GDD that accumulated prior to the first cut?

Next, we look only at data from the first year of the experiment, thereby removing any effects of the follow-on harvest treatments (follow.cut)

It appears as more GDD accumulated prior to the first forage cut, the forage yields were greater at R100 and didn’t really change or had a negative slope at R70.

Since we are only looking at the data for the spring cut in the first year, this is the only instance where the I2 data could be used.

When I2 is included, the positive relationship between GDD accumulation and yield is even less clear. While overall there is a positive correlation, this is driven primarily by the R100, and both I2 and R70 show slightly negative relationships.

A negative relationship between forage yield and growing degree days is not expected and may indicate that another environmental factor like precipitation may be influecing the response. Let’s look at the data by environment rather than site.

It’s strange that the GDD of the first cut is identical for both environments in 2017, indicating they were harvested on the same day. Investigating the notes, it appears that in 2017, I2 and R70 were harvested on the same day for all three spring harvest timings, so that checks out.

It’s noteworthy that harvests in R100.2018 occurred after less GDD accumulated, which indicates the plants were maturing faster in 2018 as compared to 2017, and since both R70 and R100 had the same age, difference in rate of maturity could be due to site factors that are hard to quantify (i.e. soil texture, soil health) or they could be simply due to precipitation

It’s clear that the accumulated precipitation between spring forage harvests differed between 2017 and 2018. Only about 250 mm of rain accumulated before the cut at the boot stage in 2018, whereas almost twice as much has accumulated before the first cut at the boot stage in 2017. There’s a couple ways this can be interpreted, were forage yields reduced for the boot cut in R100.2018 because of a lack of rain and that’s why we see a positive slope or was a lack of rain after the boot cut in 2017 that caused the lack of a positive slope. There may be a better way to visualize this.

At this point, it makes sense that the I2 data shows a negative slope because the IWG stand, as previously explained, was old and starting to die and did die after the grain harvest in 2017, so let’s just remove I2 and look at it again

So this might be something to be aware of in the discussion. Need to tease this out a bit more.

yield.1cut vs. GDD.1cut where field.year=“second”

Next we look at forage yields in the second year based on what treatment was applied in the first field year.

So here we are looking at the GDD that accumulated in year 2 on the x axis prior to the first cut and the forage yield of that first cut.
Regression lines suggest that in the second year, as GDD accumulated the forage yield was greater.
Regression lines also indicate very little difference among the follow.cut treatments. This is even more obvious if you allow se=T in the geom_smooth equation.
But remember that the kernza plants with follow.cut=none got about 2 more months of GDD in the first field year than kernza plants where follow.cut=october.

Now we will add in the additional GDD that accumulated in first field.year after the last cut in a given plot.

To do this, we need to need to create a new dataframe that is just field.year=second
then create a new dataframe that is just field.year=first
match these dataframes by row and mutate a new column that’s gdd.1cut of the dataframe for the second year plus the gdd.extra of that same plot the previous year

So here we are looking at the GDD that accumulated for the IWG plants after they were last cut in year 1 and before they were cut again in year 2.

Its remarkable that the plots with no follow-up cut received almost twice the average GDD than plots that were cut again in september and almost 4 times that amount of GDD than plots cut in October prior to the first cut in the spring the following year.

Did this translate to differences in yield?

##             numDF denDF  F-value p-value
## (Intercept)     1    62 3.234570  0.0770
## follow.cut      2    62 0.335089  0.7166

This is interesting! I would expect that plants that did not receive a follow-up cut (and therefore would accumulate more GDD compared to the plots with follow-up harvests), would have a greater forage yield for the yield.1cut the following year. However, despite follow.cut=none plots receiving upwards of 4x GDD compared with plots where follow.cut=october, the yield the following year was similar. When we do not take into account the GDD accumulation of the previous year and only look at the GDD accumulation in the field.year when the harvest occurred, then the GDD accumulation and yield allign.

This suggests that the GDD that accumulated after the last cut in year1 did not really have that great of an effect on the yield the following year. It also suggests that whether the plot was harvested once, twice or three times following harvest the first year did not have an impact on the first yield of the second year.
This may be worth further analyzing as one of our hypothesis is that multiple harvests on kernza plants will reduce their forage yield the following year.

yield.1cut vs. gdd.1cut

Since there doesn’t seem to be an effect of the harvest intensity (follow.cut) on the following field.year, we can combine across field.years!

let’s look at the relationship between yield and GDD among years…

Ok, when looking at year (which helps incorporate precip), it seems there’s a positive relationship between yield and GDD across all years except maybe 2017.
Since the data isn’t super even, let’s seperate by environment to get a better sense of things

There seems to be a consistent positive association between forage yield and GDD except for in 2017.
Did 2017 differ in precipitation or something? That could explain the non-positive association between GDD and forage yield

Let’s look at rain for 2017 vs the other years

weather!

Let’s try importing the weather data and making cumulative precipitation data

Now let’s add dates in 2017 and 2018 when the boot, anthesis and dough cut occurred.

It’s hard to tell without the another line depicting the 30 year average which year was weird, but it looks like 2018 happened to be drier than other years, especially in late May. Looking at both environments in 2018, the yield was low at the boot cut, making the slope more positive, BUT there isn’t really that much of a pattern here.

Trying to absorb the weirdness of R70.2017 by seperating by site and keeping years together was a smart move by Jake

yield.1cut vs GDD, seperated by site

Much better, nice consistent trend

regression comparison of yield.cut vs. GDD.1cut

Now let’s compare a linear regression with a quadratic regression

Now let’s statistically compare the linear and quadratic model for goodness of fit.

##             numDF denDF  F-value p-value
## (Intercept)     1   127 30.11156  <.0001
## gdd.1cut        1   127 23.73650  <.0001
## Linear mixed-effects model fit by maximum likelihood
##  Data: dat1 
##        AIC      BIC    logLik
##   2506.526 2521.375 -1248.263
## 
## Random effects:
##  Formula: ~1 | env
##         (Intercept)
## StdDev:    1137.364
## 
##  Formula: ~1 | block %in% env
##         (Intercept) Residual
## StdDev:     282.524 1324.787
## 
## Fixed effects: yield.1cut ~ gdd.1cut 
##                 Value Std.Error  DF  t-value p-value
## (Intercept) 1380.9027  699.0128 127 1.975504  0.0504
## gdd.1cut       1.4479    0.2972 127 4.872012  0.0000
##  Correlation: 
##          (Intr)
## gdd.1cut -0.541
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -3.5492141 -0.6146865 -0.1353912  0.2870142  4.6046687 
## 
## Number of Observations: 144
## Number of Groups: 
##            env block %in% env 
##              4             16
##               numDF denDF  F-value p-value
## (Intercept)       1   126 29.12935  <.0001
## gdd.1cut          1   126 28.93048  <.0001
## I(gdd.1cut^2)     1   126 28.78035  <.0001
##      Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## mod1     1  5 2506.526 2521.375 -1248.263                        
## mod2     2  6 2482.088 2499.907 -1235.044 1 vs 2 26.43869  <.0001

We conclude model 2 (the quadratic) fits the data better and differs from model 1.
## summary stats of regression; yield vs gdd

Now we determine at what level of GDD accumulation our maximimum forage yield.1cut occurs by calculate at what point our quadratic function we fitted onto the yield.1cut vs. GDD data has a slope of zero, and the corresponding GDD, this value is our “agronomically optimal growing degree day accumulation” or AOGDD

## estAOGDD 
## 1468.264
## estAOGDD 
## 4102.717

We conclude that our maximum yield for the first forage cut is 4,103 kg ha dry forage biomass and that this is achievable when 1,468 GDD have accumulated, which appears to be after anthesis and before dough stage.

mean comparisons

effect of timing.1cut

first test to see if interaction with the year of the experiment

##                        numDF denDF   F-value p-value
## (Intercept)                1   131 12.138391  0.0007
## timing.1cut                2   131 13.462861  <.0001
## field.year                 1   131  3.856811  0.0517
## timing.1cut:field.year     2   131 11.511820  <.0001

next look at the first field year

##             numDF denDF  F-value p-value
## (Intercept)     1    62 349.5791  <.0001
## timing.1cut     2    62   9.3629   3e-04
## Linear mixed-effects model fit by REML
##  Data: subset(dat1, field.year == "first") 
##        AIC      BIC    logLik
##   1180.319 1193.723 -584.1593
## 
## Random effects:
##  Formula: ~1 | site
##         (Intercept)
## StdDev:    75.61527
## 
##  Formula: ~1 | block %in% site
##         (Intercept) Residual
## StdDev:    362.6055 1032.283
## 
## Fixed effects: yield.1cut ~ timing.1cut 
##                      Value Std.Error DF   t-value p-value
## (Intercept)      3146.2187  252.3777 62 12.466312  0.0000
## timing.1cutboot  -128.5546  297.9943 62 -0.431399  0.6677
## timing.1cutdough 1046.9163  297.9943 62  3.513209  0.0008
##  Correlation: 
##                  (Intr) tmng.1ctb
## timing.1cutboot  -0.59           
## timing.1cutdough -0.59   0.50    
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -1.76826790 -0.71062934 -0.09541172  0.72884080  3.31702584 
## 
## Number of Observations: 72
## Number of Groups: 
##            site block %in% site 
##               2               8
##  timing.1cut emmean  SE df lower.CL upper.CL .group
##  boot          3018 252  1   -189.1     6224  1    
##  anthesis      3146 252  1    -60.5     6353  1    
##  dough         4193 252  1    986.4     7400   2   
## 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 3 estimates 
## significance level used: alpha = 0.05

##        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## sy1r       1  5 1216.577 1227.960 -603.2885                        
## sy2r.2     2  6 1217.485 1231.145 -602.7424 1 vs 2 1.092231   0.296
## Linear mixed-effects model fit by maximum likelihood
##  Data: subset(dat1, field.year == "first") 
##        AIC     BIC    logLik
##   1216.577 1227.96 -603.2885
## 
## Random effects:
##  Formula: ~1 | site
##         (Intercept)
## StdDev:    103.6711
## 
##  Formula: ~1 | block %in% site
##         (Intercept) Residual
## StdDev:    370.8025 1005.735
## 
## Fixed effects: yield.1cut ~ gdd.1cut 
##                 Value Std.Error DF  t-value p-value
## (Intercept) 1577.7764  467.5465 63 3.374587  0.0013
## gdd.1cut       1.4652    0.3325 63 4.406841  0.0000
##  Correlation: 
##          (Intr)
## gdd.1cut -0.91 
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -1.5757597 -0.7564275 -0.1350538  0.7422545  3.4693326 
## 
## Number of Observations: 72
## Number of Groups: 
##            site block %in% site 
##               2               8

Linear model is better, suggests yield are maximized at dough stage

Relative Forage Quality (RFQ)

Methods for calculating relative forage quality can be found at this url: https://fyi.extension.wisc.edu/forage/files/2014/01/RFQ-FOF.pdf

We use equations for grass, not the alfalfa equation

October cut

Here we use data=dat, which includes I2 and grain cut, though since we are only doing the october cut, there is no data for I2. Unsure if there is no data for timing.cut=“grain”

Ok, there are still 11 data points for the grain cut, going to change this to remove the grain cut data by making data=“dat1”