This script takes 1) the cleaned daily weather data, and 2) the CDBN experiment data with estimated and known planting dates, and does the following:

  1. Determines the variation in day length by site during the growing season.
  2. Makes cumulative growing degree days (GDD) vs flowering time (DTF) plots and models.

Variation in daylength

Jeff doesn’t think there will be much variation in daylength during the growing season by location.

Here are some summary statistics for daylength by location.

Here are some summary statistics for daylength by location*year. I didn’t expect (and we don’t see) much variation in daylength between years for the same site.

This is a plot of the variation in mean and standard deviation in daylength between locations. Most sites (~61/68) have average daylengths of ~13.4 - 15.4 hours during the growing season, with the exception of Maryland and Arizona. Generally, the northern sites have a larger standard deviation in daylength during the growing season.
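To see why the northern sites should show a larger standard deviation, here is a minimal sketch of daylength as a function of latitude and day of year. It is not necessarily how daylength was computed for this data set: the function name `daylength_hours`, the three example latitudes, and the 15 May to 15 September season are all assumptions for illustration. It uses Cooper's approximation for solar declination and ignores refraction and twilight.

```r
# Approximate daylength (hours) from latitude (degrees) and day of year.
# Solar declination follows Cooper's approximation; this is an illustrative
# formula, not necessarily the one used for the summaries above.
daylength_hours <- function(lat_deg, doy) {
  decl  <- 23.45 * pi / 180 * sin(2 * pi * (284 + doy) / 365)
  cos_h <- -tan(lat_deg * pi / 180) * tan(decl)
  cos_h <- pmin(pmax(cos_h, -1), 1)  # clamp so polar day/night does not give NaN
  24 / pi * acos(cos_h)
}

# Hypothetical sites and an assumed growing season (day of year 135 to 258).
sites  <- data.frame(site = c("south", "mid", "north"), lat = c(32, 42, 52))
season <- 135:258

# Mean and standard deviation of daylength over the season, one row per site.
dl_stats <- do.call(rbind, lapply(seq_len(nrow(sites)), function(i) {
  dl <- daylength_hours(sites$lat[i], season)
  data.frame(site = sites$site[i], lat = sites$lat[i],
             mean_daylength = mean(dl), sd_daylength = sd(dl))
}))
print(dl_stats)
```

In this sketch both the mean and the standard deviation of growing-season daylength increase with latitude, which is the same pattern the plot shows.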

Here’s an alternative visualization on a different map background. Browner circles have a shorter average daylength during the growing season.

Cumulative GDD vs DTF plots

I first found the average number of days to flowering for each site * year combination.

Then, for each site * year combination, I found the calendar day that falls the average number of days to flowering after the planting date. To do this, I made a counter for “Days after planting” (DAP) and joined the growing season weather data to the Locations_by_years data frame. From that join, I kept the cumulative GDD on the day where the DAP counter is equal to the mean days to flowering for that location and year combination.
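The DAP counter and the cumulative GDD lookup can be sketched in base R roughly as follows. This is a toy version under stated assumptions, not the actual script: the `weather` and `site_years` data frames, the `tmax`/`tmin`/`planting_date` column names, the location code "XXAA", and the simple averaging formula for daily GDD are all made up for illustration. It also assumes that GDD starts accumulating on the planting date itself (DAP = 0).

```r
# Hypothetical daily weather for one site-year (column names are assumptions).
weather <- data.frame(
  Location_code = "XXAA",
  Year = 2000,
  Date = as.Date("2000-05-20") + 0:59,
  tmax = rep(c(24, 28, 30), each = 20),
  tmin = rep(c(10, 14, 16), each = 20)
)

# One row per site-year: planting date and mean days to flowering.
site_years <- data.frame(Location_code = "XXAA", Year = 2000,
                         planting_date = as.Date("2000-05-25"),
                         DTF_mean = 42.4)

# Join weather to site-years and keep days on or after planting, in date order.
gs <- merge(weather, site_years, by = c("Location_code", "Year"))
gs <- gs[gs$Date >= gs$planting_date, ]
gs <- gs[order(gs$Location_code, gs$Year, gs$Date), ]

# Days after planting (DAP) counter: 0 on the planting date.
gs$DAP <- as.integer(as.numeric(difftime(gs$Date, gs$planting_date, units = "days")))

# Daily GDD (mean temperature minus Tbase, floored at 0), accumulated per site-year.
gdd <- function(tmax, tmin, tbase) pmax((tmax + tmin) / 2 - tbase, 0)
key <- interaction(gs$Location_code, gs$Year, drop = TRUE)
gs$cum_GDD_8Cbase  <- ave(gdd(gs$tmax, gs$tmin, 8),  key, FUN = cumsum)
gs$cum_GDD_10Cbase <- ave(gdd(gs$tmax, gs$tmin, 10), key, FUN = cumsum)

# Keep the day where DAP equals the (rounded) mean days to flowering.
CumGDD_sketch <- gs[gs$DAP == round(gs$DTF_mean), ]
print(CumGDD_sketch[, c("Location_code", "Year", "DAP",
                        "cum_GDD_8Cbase", "cum_GDD_10Cbase")])
```

Rounding the mean DTF to a whole day is a choice made here so that exactly one weather row matches per site-year.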

Plot cumulative GDD vs Days to Flowering (DTF)

Tbase = 8 C

Jeff suggested calculating cumulative GDD with 8 C as the base temperature above which beans grow. So here’s a plot and a simple model, without location information, predicting DTF from cumulative GDD with 8 C as the base temperature.

DTF8 <- lm(DTF_mean ~ cum_GDD_8Cbase, data = CumGDD)
summary(DTF8)
#> 
#> Call:
#> lm(formula = DTF_mean ~ cum_GDD_8Cbase, data = CumGDD)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -14.909  -4.478  -1.035   4.244  23.451 
#> 
#> Coefficients:
#>                 Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)    39.845437   2.044928  19.485  < 2e-16 ***
#> cum_GDD_8Cbase  0.014704   0.003675   4.001 8.28e-05 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 6.575 on 254 degrees of freedom
#>   (426 observations deleted due to missingness)
#> Multiple R-squared:  0.05928,    Adjusted R-squared:  0.05558 
#> F-statistic: 16.01 on 1 and 254 DF,  p-value: 8.283e-05

Tbase = 10 C

Li suggested using 10 C as the base temperature above which beans grow when calculating cumulative GDD. Her paper estimates Tbase for 141 genotypes and reports an average Tbase of 10 +/- 3 C and an average Toptl of 22.3 +/- 4 C.

DTF10 <- lm(DTF_mean ~ cum_GDD_10Cbase, data = CumGDD)
summary(DTF10)
#> 
#> Call:
#> lm(formula = DTF_mean ~ cum_GDD_10Cbase, data = CumGDD)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -14.025  -4.517  -1.040   4.079  25.428 
#> 
#> Coefficients:
#>                  Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)     44.036154   1.796481  24.512   <2e-16 ***
#> cum_GDD_10Cbase  0.008518   0.003891   2.189   0.0295 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 6.715 on 254 degrees of freedom
#>   (426 observations deleted due to missingness)
#> Multiple R-squared:  0.01852,    Adjusted R-squared:  0.01465 
#> F-statistic: 4.792 on 1 and 254 DF,  p-value: 0.0295

anova(DTF8, DTF10)
#> Analysis of Variance Table
#> 
#> Model 1: DTF_mean ~ cum_GDD_8Cbase
#> Model 2: DTF_mean ~ cum_GDD_10Cbase
#>   Res.Df   RSS Df Sum of Sq F Pr(>F)
#> 1    254 10979                      
#> 2    254 11455  0   -475.77

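Without including any information about location, both sets of cumulative GDD do a pretty bad job of predicting average flowering time, although both slopes are positive and significant (p = 8.3e-05 for a Tbase of 8C, p = 0.03 for a Tbase of 10C). Interestingly, the 8C Tbase seems to do a better job of predicting flowering time than the 10C Tbase: adjusted R^2 of about 5.6% vs 1.5%, a much more significant slope, and a smaller RSS (10979 vs 11455). Note that these two models are not nested and have the same residual degrees of freedom, so anova() only reports the difference in RSS here, with no F test or p-value.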
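Since anova() can't test two non-nested models with the same degrees of freedom, one option is to compare them with AIC and adjusted R^2 instead. Here is a minimal sketch on simulated data (the data frame `sim` and the model names `m8` and `m10` are made up; only the column names mirror the real ones). AIC comparisons like this only make sense when both models are fit to the same rows, which is worth checking with nobs() on the real data because of the observations dropped for missingness.

```r
set.seed(42)
n <- 100

# Simulated stand-in for CumGDD: the 10 C GDD is a noisy shift of the 8 C GDD.
sim <- data.frame(cum_GDD_8Cbase = runif(n, 300, 800))
sim$cum_GDD_10Cbase <- sim$cum_GDD_8Cbase - runif(n, 80, 120)
sim$DTF_mean <- 40 + 0.015 * sim$cum_GDD_8Cbase + rnorm(n, sd = 6)

m8  <- lm(DTF_mean ~ cum_GDD_8Cbase,  data = sim)
m10 <- lm(DTF_mean ~ cum_GDD_10Cbase, data = sim)

# anova() on non-nested models with equal df: Df is 0, so F and Pr(>F) are NA.
cmp <- anova(m8, m10)
print(cmp)

# Both models must use the same observations for AIC to be comparable.
stopifnot(nobs(m8) == nobs(m10))

# Lower AIC and higher adjusted R^2 indicate the better-fitting model.
aic <- AIC(m8, m10)
adj <- c(m8 = summary(m8)$adj.r.squared, m10 = summary(m10)$adj.r.squared)
print(aic)
print(adj)
```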

Estimated vs Known planting dates

Is this a problem with the estimated planting dates?

ggplot(data = CumGDD2, mapping = aes(x = cum_GDD_8Cbase, y = DTF_mean)) +
  geom_point(aes(color = PD_check.x)) +
  labs(x = "Cumulative GDD with T_base of 8 C", y = "Average Days to Flowering")


DTF_PD10 <- lm(DTF_mean ~ cum_GDD_10Cbase:PD_check.x, data = CumGDD)
summary(DTF_PD10)
#> 
#> Call:
#> lm(formula = DTF_mean ~ cum_GDD_10Cbase:PD_check.x, data = CumGDD)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -13.743  -4.565  -0.951   4.114  25.706 
#> 
#> Coefficients:
#>                                      Estimate Std. Error t value Pr(>|t|)
#> (Intercept)                         44.186690   1.810470  24.406   <2e-16
#> cum_GDD_10Cbase:PD_check.xEstimated  0.009416   0.004092   2.301   0.0222
#> cum_GDD_10Cbase:PD_check.xKnown      0.007847   0.004006   1.959   0.0512
#>                                        
#> (Intercept)                         ***
#> cum_GDD_10Cbase:PD_check.xEstimated *  
#> cum_GDD_10Cbase:PD_check.xKnown     .  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 6.722 on 253 degrees of freedom
#>   (426 observations deleted due to missingness)
#> Multiple R-squared:  0.0205, Adjusted R-squared:  0.01276 
#> F-statistic: 2.648 on 2 and 253 DF,  p-value: 0.07278

DTF_PD8 <- lm(DTF_mean ~ cum_GDD_8Cbase:PD_check.x, data = CumGDD)
summary(DTF_PD8)
#> 
#> Call:
#> lm(formula = DTF_mean ~ cum_GDD_8Cbase:PD_check.x, data = CumGDD)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -14.763  -4.626  -1.003   4.307  23.620 
#> 
#> Coefficients:
#>                                     Estimate Std. Error t value Pr(>|t|)
#> (Intercept)                        39.945464   2.062439  19.368  < 2e-16
#> cum_GDD_8Cbase:PD_check.xEstimated  0.015104   0.003806   3.969 9.41e-05
#> cum_GDD_8Cbase:PD_check.xKnown      0.014362   0.003772   3.807 0.000176
#>                                       
#> (Intercept)                        ***
#> cum_GDD_8Cbase:PD_check.xEstimated ***
#> cum_GDD_8Cbase:PD_check.xKnown     ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 6.585 on 253 degrees of freedom
#>   (426 observations deleted due to missingness)
#> Multiple R-squared:  0.05992,    Adjusted R-squared:  0.05249 
#> F-statistic: 8.063 on 2 and 253 DF,  p-value: 0.0004029

anova(DTF_PD10, DTF_PD8)
#> Analysis of Variance Table
#> 
#> Model 1: DTF_mean ~ cum_GDD_10Cbase:PD_check.x
#> Model 2: DTF_mean ~ cum_GDD_8Cbase:PD_check.x
#>   Res.Df   RSS Df Sum of Sq F Pr(>F)
#> 1    253 11432                      
#> 2    253 10972  0    460.08

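The estimated planting dates do not look like the problem. If anything, the slopes for estimated planting dates are slightly steeper than those for known planting dates (0.0151 vs 0.0144 with a Tbase of 8C; 0.0094 vs 0.0078 with a Tbase of 10C), although these differences are small relative to the standard errors of the slopes (~0.004). I assume any difference must have something to do with the types of sites we could actually estimate planting date for… let me check this assumption by doing analyses for each Climate_bin, State, and Location_code. I suspect that will solve a lot of the problems.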
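One way to ask directly whether the Estimated and Known slopes differ is to compare the common-slope model against the separate-slopes model. Unlike the 8C vs 10C comparison, these two models are nested, so anova() gives a proper F test on 1 degree of freedom. This is a sketch on simulated data where the true slope is the same for both groups; `sim`, `PD_check`, `common`, and `separate` are illustrative names, not objects from the real script.

```r
set.seed(7)
n <- 120

# Simulated stand-in: planting date status has no effect on the true slope.
sim <- data.frame(
  cum_GDD_8Cbase = runif(n, 300, 800),
  PD_check = factor(sample(c("Estimated", "Known"), n, replace = TRUE))
)
sim$DTF_mean <- 40 + 0.015 * sim$cum_GDD_8Cbase + rnorm(n, sd = 6)

# Common intercept and one slope vs common intercept and one slope per group.
common   <- lm(DTF_mean ~ cum_GDD_8Cbase, data = sim)
separate <- lm(DTF_mean ~ cum_GDD_8Cbase:PD_check, data = sim)

# F test of the null hypothesis that the two slopes are equal.
slope_test <- anova(common, separate)
print(slope_test)
```

With the real data, the equivalent call would presumably be anova(DTF8, DTF_PD8), since both models are fit to CumGDD.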

Climate region as a factor

Just letting the cumulative GDD slope vary by Climate_bin significantly improves the model’s performance: adjusted R^2 goes from about 0.06 to 0.31.

ggplot(data = CumGDD2, mapping = aes(x = cum_GDD_8Cbase, y = DTF_mean)) +
  geom_point(aes(color = PD_check.x)) +
  facet_wrap(~Climate_bin) +
  labs(x = "Cumulative GDD with T_base of 8 C", y = "Average Days to Flowering")


DTF_cb <- lm(DTF_mean ~ Climate_bin:cum_GDD_8Cbase, data = CumGDD2)
summary(DTF_cb)
#> 
#> Call:
#> lm(formula = DTF_mean ~ Climate_bin:cum_GDD_8Cbase, data = CumGDD2)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -12.0133  -3.8041  -0.5596   3.6010  20.2396 
#> 
#> Coefficients:
#>                                        Estimate Std. Error t value
#> (Intercept)                           40.111457   1.868309  21.469
#> Climate_binArizona:cum_GDD_8Cbase      0.002099   0.004671   0.449
#> Climate_binCalifornia:cum_GDD_8Cbase   0.002277   0.003796   0.600
#> Climate_binGreatLakes:cum_GDD_8Cbase   0.006443   0.003859   1.670
#> Climate_binKS&MO:cum_GDD_8Cbase        0.003168   0.008301   0.382
#> Climate_binND&MN:cum_GDD_8Cbase        0.019754   0.004278   4.618
#> Climate_binRockiesWest:cum_GDD_8Cbase  0.018434   0.003350   5.503
#> Climate_binTexas:cum_GDD_8Cbase        0.012414   0.007460   1.664
#>                                       Pr(>|t|)    
#> (Intercept)                            < 2e-16 ***
#> Climate_binArizona:cum_GDD_8Cbase       0.6536    
#> Climate_binCalifornia:cum_GDD_8Cbase    0.5492    
#> Climate_binGreatLakes:cum_GDD_8Cbase    0.0963 .  
#> Climate_binKS&MO:cum_GDD_8Cbase         0.7030    
#> Climate_binND&MN:cum_GDD_8Cbase       6.24e-06 ***
#> Climate_binRockiesWest:cum_GDD_8Cbase 9.29e-08 ***
#> Climate_binTexas:cum_GDD_8Cbase         0.0974 .  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 5.602 on 248 degrees of freedom
#> Multiple R-squared:  0.3332, Adjusted R-squared:  0.3144 
#> F-statistic: 17.71 on 7 and 248 DF,  p-value: < 2.2e-16

anova(DTF8, DTF_cb)
#> Analysis of Variance Table
#> 
#> Model 1: DTF_mean ~ cum_GDD_8Cbase
#> Model 2: DTF_mean ~ Climate_bin:cum_GDD_8Cbase
#>   Res.Df     RSS Df Sum of Sq      F    Pr(>F)    
#> 1    254 10978.9                                  
#> 2    248  7781.5  6    3197.4 16.984 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

State as a factor

ggplot(data = CumGDD2, mapping = aes(x = cum_GDD_8Cbase, y = DTF_mean)) +
  geom_point(aes(color = PD_check.x)) +
  facet_wrap(~State) +
  labs(x = "Cumulative GDD with T_base of 8 C", y = "Average Days to Flowering")


DTF_st <- lm(DTF_mean ~ State:cum_GDD_8Cbase, data = CumGDD2)
summary(DTF_st)
#> 
#> Call:
#> lm(formula = DTF_mean ~ State:cum_GDD_8Cbase, data = CumGDD2)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -10.7957  -2.5530  -0.5883   2.0420  13.6943 
#> 
#> Coefficients:
#>                         Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)            32.278740   1.678764  19.228  < 2e-16 ***
#> StateAB:cum_GDD_8Cbase  0.050408   0.005014  10.053  < 2e-16 ***
#> StateAZ:cum_GDD_8Cbase  0.011871   0.003714   3.197 0.001579 ** 
#> StateCA:cum_GDD_8Cbase  0.014790   0.003217   4.597 6.97e-06 ***
#> StateCO:cum_GDD_8Cbase  0.023821   0.002973   8.014 5.05e-14 ***
#> StateID:cum_GDD_8Cbase  0.031177   0.003522   8.853  < 2e-16 ***
#> StateMB:cum_GDD_8Cbase  0.040269   0.009234   4.361 1.93e-05 ***
#> StateMI:cum_GDD_8Cbase  0.020873   0.003427   6.091 4.50e-09 ***
#> StateMN:cum_GDD_8Cbase  0.045125   0.004786   9.428  < 2e-16 ***
#> StateMO:cum_GDD_8Cbase  0.017898   0.006514   2.747 0.006467 ** 
#> StateMT:cum_GDD_8Cbase  0.041265   0.003441  11.992  < 2e-16 ***
#> StateND:cum_GDD_8Cbase  0.029486   0.003857   7.645 5.20e-13 ***
#> StateNE:cum_GDD_8Cbase  0.021099   0.002909   7.253 5.75e-12 ***
#> StateNY:cum_GDD_8Cbase  0.016568   0.005312   3.119 0.002039 ** 
#> StateON:cum_GDD_8Cbase  0.025034   0.003925   6.378 9.31e-10 ***
#> StateSK:cum_GDD_8Cbase  0.045782   0.004120  11.113  < 2e-16 ***
#> StateTX:cum_GDD_8Cbase  0.022309   0.005766   3.869 0.000141 ***
#> StateWA:cum_GDD_8Cbase  0.038479   0.003991   9.641  < 2e-16 ***
#> StateWY:cum_GDD_8Cbase  0.036648   0.003164  11.584  < 2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 4.244 on 237 degrees of freedom
#> Multiple R-squared:  0.6342, Adjusted R-squared:  0.6064 
#> F-statistic: 22.83 on 18 and 237 DF,  p-value: < 2.2e-16

anova(DTF8, DTF_cb, DTF_st)
#> Analysis of Variance Table
#> 
#> Model 1: DTF_mean ~ cum_GDD_8Cbase
#> Model 2: DTF_mean ~ Climate_bin:cum_GDD_8Cbase
#> Model 3: DTF_mean ~ State:cum_GDD_8Cbase
#>   Res.Df     RSS Df Sum of Sq      F    Pr(>F)    
#> 1    254 10978.9                                  
#> 2    248  7781.5  6    3197.4 29.585 < 2.2e-16 ***
#> 3    237  4269.0 11    3512.5 17.728 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Location code as a factor

ggplot(data = CumGDD2, mapping = aes(x = cum_GDD_8Cbase, y = DTF_mean)) +
  geom_point(aes(color = PD_check.x)) +
  facet_wrap(~Location_code, drop = TRUE) +
  labs(x = "Cumulative GDD with T_base of 8 C", y = "Average Days to Flowering")


DTF_lc <- lm(DTF_mean ~ Location_code:cum_GDD_8Cbase, data = CumGDD2)
summary(DTF_lc)
#> 
#> Call:
#> lm(formula = DTF_mean ~ Location_code:cum_GDD_8Cbase, data = CumGDD2)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -10.3555  -2.3357  -0.4199   1.6517  12.6765 
#> 
#> Coefficients:
#>                                   Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)                      30.430515   1.840642  16.533  < 2e-16 ***
#> Location_codeABBI:cum_GDD_8Cbase  0.058256   0.007683   7.583 1.02e-12 ***
#> Location_codeABBR:cum_GDD_8Cbase  0.054815   0.006383   8.588 1.84e-15 ***
#> Location_codeABLE:cum_GDD_8Cbase  0.056690   0.007330   7.734 4.04e-13 ***
#> Location_codeABVA:cum_GDD_8Cbase  0.044321   0.008810   5.031 1.03e-06 ***
#> Location_codeAZBO:cum_GDD_8Cbase  0.014177   0.003679   3.854 0.000154 ***
#> Location_codeCACH:cum_GDD_8Cbase  0.012919   0.005504   2.347 0.019833 *  
#> Location_codeCADV:cum_GDD_8Cbase  0.018531   0.003548   5.224 4.15e-07 ***
#> Location_codeCAIR:cum_GDD_8Cbase  0.025221   0.005652   4.462 1.31e-05 ***
#> Location_codeCAOR:cum_GDD_8Cbase  0.011633   0.004913   2.368 0.018790 *  
#> Location_codeCOFC:cum_GDD_8Cbase  0.027203   0.003434   7.921 1.27e-13 ***
#> Location_codeCOFR:cum_GDD_8Cbase  0.025953   0.003433   7.559 1.18e-12 ***
#> Location_codeIDKI:cum_GDD_8Cbase  0.034629   0.004681   7.398 3.11e-12 ***
#> Location_codeIDNA:cum_GDD_8Cbase  0.023873   0.005928   4.027 7.85e-05 ***
#> Location_codeIDPA:cum_GDD_8Cbase  0.038652   0.003834  10.080  < 2e-16 ***
#> Location_codeIDTF:cum_GDD_8Cbase  0.027073   0.005780   4.684 5.00e-06 ***
#> Location_codeMBMO:cum_GDD_8Cbase  0.044009   0.008868   4.963 1.42e-06 ***
#> Location_codeMIEN:cum_GDD_8Cbase  0.020607   0.004165   4.948 1.52e-06 ***
#> Location_codeMISA:cum_GDD_8Cbase  0.025939   0.003742   6.933 4.79e-11 ***
#> Location_codeMNCR:cum_GDD_8Cbase  0.050051   0.005101   9.813  < 2e-16 ***
#> Location_codeMNPR:cum_GDD_8Cbase  0.042451   0.008296   5.117 6.89e-07 ***
#> Location_codeMOCO:cum_GDD_8Cbase  0.021374   0.006364   3.359 0.000927 ***
#> Location_codeMTHU:cum_GDD_8Cbase  0.045038   0.006057   7.436 2.46e-12 ***
#> Location_codeMTSI:cum_GDD_8Cbase  0.043972   0.003473  12.659  < 2e-16 ***
#> Location_codeNDCA:cum_GDD_8Cbase  0.027508   0.008642   3.183 0.001674 ** 
#> Location_codeNDER:cum_GDD_8Cbase  0.030807   0.005240   5.880 1.56e-08 ***
#> Location_codeNDFA:cum_GDD_8Cbase  0.030336   0.005032   6.029 7.16e-09 ***
#> Location_codeNDHA:cum_GDD_8Cbase  0.038478   0.005301   7.258 7.13e-12 ***
#> Location_codeNEMI:cum_GDD_8Cbase  0.026348   0.003541   7.440 2.41e-12 ***
#> Location_codeNES2:cum_GDD_8Cbase  0.019639   0.007353   2.671 0.008147 ** 
#> Location_codeNESB:cum_GDD_8Cbase  0.023341   0.003290   7.095 1.86e-11 ***
#> Location_codeNYFR:cum_GDD_8Cbase  0.020437   0.005365   3.809 0.000182 ***
#> Location_codeONEL:cum_GDD_8Cbase  0.023734   0.005608   4.232 3.43e-05 ***
#> Location_codeONEX:cum_GDD_8Cbase  0.027922   0.004798   5.819 2.14e-08 ***
#> Location_codeONGU:cum_GDD_8Cbase  0.031693   0.004796   6.608 3.04e-10 ***
#> Location_codeSKOU:cum_GDD_8Cbase  0.049514   0.004396  11.264  < 2e-16 ***
#> Location_codeSKSA:cum_GDD_8Cbase  0.049408   0.005607   8.812 4.28e-16 ***
#> Location_codeTXLU:cum_GDD_8Cbase  0.024644   0.005537   4.451 1.38e-05 ***
#> Location_codeWAOT:cum_GDD_8Cbase  0.042185   0.004434   9.515  < 2e-16 ***
#> Location_codeWARO:cum_GDD_8Cbase  0.042892   0.004905   8.745 6.65e-16 ***
#> Location_codeWYPO:cum_GDD_8Cbase  0.044700   0.003677  12.155  < 2e-16 ***
#> Location_codeWYTO:cum_GDD_8Cbase  0.030551   0.003558   8.587 1.86e-15 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 3.978 on 214 degrees of freedom
#> Multiple R-squared:  0.7099, Adjusted R-squared:  0.6543 
#> F-statistic: 12.77 on 41 and 214 DF,  p-value: < 2.2e-16

anova(DTF8, DTF_cb, DTF_st, DTF_lc)
#> Analysis of Variance Table
#> 
#> Model 1: DTF_mean ~ cum_GDD_8Cbase
#> Model 2: DTF_mean ~ Climate_bin:cum_GDD_8Cbase
#> Model 3: DTF_mean ~ State:cum_GDD_8Cbase
#> Model 4: DTF_mean ~ Location_code:cum_GDD_8Cbase
#>   Res.Df     RSS Df Sum of Sq       F    Pr(>F)    
#> 1    254 10978.9                                   
#> 2    248  7781.5  6    3197.4 33.6834 < 2.2e-16 ***
#> 3    237  4269.0 11    3512.5 20.1838 < 2.2e-16 ***
#> 4    214  3385.6 23     883.4  2.4276 0.0005035 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

And indeed the models with specific locations accounted for do MUCH better jobs! Up to 65% of the variation (adjusted R^2) is explained by the Location_code model, and 61% by the State model! Every location has a slope that is significantly different from 0, and many look quite different from one another.
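A slope being significantly different from 0 doesn't by itself say that two locations differ from each other, so a quick way to eyeball that is to pull out the per-location slopes with their confidence intervals. This is a sketch on simulated data with three made-up location codes ("AAAA", "BBBB", "CCCC") and made-up true slopes. Non-overlapping intervals are only a rough guide, not a formal pairwise test.

```r
set.seed(3)

# Simulated stand-in: three hypothetical locations with different true slopes.
locs <- c("AAAA", "BBBB", "CCCC")
true_slope <- c(0.02, 0.035, 0.05)
sim <- data.frame(Location_code = factor(rep(locs, each = 12)),
                  cum_GDD_8Cbase = runif(36, 300, 800))
sim$DTF_mean <- 30 +
  true_slope[as.integer(sim$Location_code)] * sim$cum_GDD_8Cbase +
  rnorm(36, sd = 3)

# Same model form as DTF_lc: common intercept, one slope per location.
fit <- lm(DTF_mean ~ Location_code:cum_GDD_8Cbase, data = sim)

# Table of slopes with 95% confidence intervals, sorted by estimate.
ci <- confint(fit)
slopes <- data.frame(term = names(coef(fit))[-1],
                     estimate = coef(fit)[-1],
                     lower = ci[-1, 1],
                     upper = ci[-1, 2],
                     row.names = NULL)
slopes <- slopes[order(slopes$estimate), ]
print(slopes)
```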

NB: 10C Tbase still does a worse job when location is included.

Just FYI, 10C cumulative GDD does a worse job here too. Models using a Tbase of 8C explain roughly 4-5 percentage points more of the variation (adjusted R^2) than the matching 10C models for all three groupings!
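The size of that gap can be checked directly from the adjusted R^2 values reported in the model summaries in this document. The data frame name `adj_r2` and its column names are just for illustration; the numbers are copied from the summary() outputs for each grouping.

```r
# Adjusted R^2 as reported by summary() for each grouping and base temperature.
adj_r2 <- data.frame(
  grouping  = c("Climate_bin", "State", "Location_code"),
  tbase_8C  = c(0.3144, 0.6064, 0.6543),
  tbase_10C = c(0.2764, 0.5598, 0.6077)
)

# Difference in percentage points (8 C minus 10 C).
adj_r2$difference_pct_points <- 100 * (adj_r2$tbase_8C - adj_r2$tbase_10C)
print(adj_r2)
```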

ggplot(data = CumGDD2, mapping = aes(x = cum_GDD_10Cbase, y = DTF_mean)) +
  geom_point(aes(color = PD_check.x)) +
  facet_wrap(~Climate_bin) +
  labs(x = "Cumulative GDD with T_base of 10 C", y = "Average Days to Flowering")


# Separate name so the 8 C model DTF_cb is not overwritten
DTF_cb10 <- lm(DTF_mean ~ Climate_bin:cum_GDD_10Cbase, data = CumGDD2)
summary(DTF_cb10)
#> 
#> Call:
#> lm(formula = DTF_mean ~ Climate_bin:cum_GDD_10Cbase, data = CumGDD2)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -12.4248  -3.8686  -0.7324   3.4500  22.0155 
#> 
#> Coefficients:
#>                                         Estimate Std. Error t value
#> (Intercept)                            43.888149   1.654637  26.524
#> Climate_binArizona:cum_GDD_10Cbase     -0.002943   0.005200  -0.566
#> Climate_binCalifornia:cum_GDD_10Cbase  -0.004377   0.004122  -1.062
#> Climate_binGreatLakes:cum_GDD_10Cbase  -0.001115   0.004200  -0.265
#> Climate_binKS&MO:cum_GDD_10Cbase       -0.004616   0.009915  -0.466
#> Climate_binND&MN:cum_GDD_10Cbase        0.015675   0.004933   3.178
#> Climate_binRockiesWest:cum_GDD_10Cbase  0.014094   0.003607   3.907
#> Climate_binTexas:cum_GDD_10Cbase        0.008773   0.008684   1.010
#>                                        Pr(>|t|)    
#> (Intercept)                             < 2e-16 ***
#> Climate_binArizona:cum_GDD_10Cbase     0.571940    
#> Climate_binCalifornia:cum_GDD_10Cbase  0.289351    
#> Climate_binGreatLakes:cum_GDD_10Cbase  0.790940    
#> Climate_binKS&MO:cum_GDD_10Cbase       0.641940    
#> Climate_binND&MN:cum_GDD_10Cbase       0.001672 ** 
#> Climate_binRockiesWest:cum_GDD_10Cbase 0.000121 ***
#> Climate_binTexas:cum_GDD_10Cbase       0.313350    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 5.755 on 248 degrees of freedom
#> Multiple R-squared:  0.2963, Adjusted R-squared:  0.2764 
#> F-statistic: 14.92 on 7 and 248 DF,  p-value: 3.193e-16

ggplot(data = CumGDD2, mapping = aes(x = cum_GDD_10Cbase, y = DTF_mean)) +
  geom_point(aes(color = PD_check.x)) +
  facet_wrap(~State) +
  labs(x = "Cumulative GDD with T_base of 10 C", y = "Average Days to Flowering")


# Separate name so the 8 C model DTF_st is not overwritten
DTF_st10 <- lm(DTF_mean ~ State:cum_GDD_10Cbase, data = CumGDD2)
summary(DTF_st10)
#> 
#> Call:
#> lm(formula = DTF_mean ~ State:cum_GDD_10Cbase, data = CumGDD2)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -11.1475  -2.6472  -0.4164   2.1435  15.5541 
#> 
#> Coefficients:
#>                          Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)             36.323700   1.572377  23.101  < 2e-16 ***
#> StateAB:cum_GDD_10Cbase  0.053899   0.006496   8.297 8.09e-15 ***
#> StateAZ:cum_GDD_10Cbase  0.007629   0.004246   1.797 0.073613 .  
#> StateCA:cum_GDD_10Cbase  0.009521   0.003614   2.634 0.008984 ** 
#> StateCO:cum_GDD_10Cbase  0.020631   0.003376   6.111 4.03e-09 ***
#> StateID:cum_GDD_10Cbase  0.028846   0.004127   6.990 2.78e-11 ***
#> StateMB:cum_GDD_10Cbase  0.039895   0.011965   3.334 0.000992 ***
#> StateMI:cum_GDD_10Cbase  0.015740   0.003918   4.017 7.92e-05 ***
#> StateMN:cum_GDD_10Cbase  0.047647   0.006080   7.836 1.56e-13 ***
#> StateMO:cum_GDD_10Cbase  0.012273   0.007989   1.536 0.125818    
#> StateMT:cum_GDD_10Cbase  0.042657   0.004150  10.279  < 2e-16 ***
#> StateND:cum_GDD_10Cbase  0.026747   0.004601   5.813 1.97e-08 ***
#> StateNE:cum_GDD_10Cbase  0.017406   0.003268   5.327 2.32e-07 ***
#> StateNY:cum_GDD_10Cbase  0.009552   0.006437   1.484 0.139155    
#> StateON:cum_GDD_10Cbase  0.020629   0.004644   4.442 1.37e-05 ***
#> StateSK:cum_GDD_10Cbase  0.047810   0.005137   9.307  < 2e-16 ***
#> StateTX:cum_GDD_10Cbase  0.019743   0.006897   2.862 0.004580 ** 
#> StateWA:cum_GDD_10Cbase  0.037905   0.004892   7.748 2.72e-13 ***
#> StateWY:cum_GDD_10Cbase  0.036059   0.003702   9.740  < 2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 4.489 on 237 degrees of freedom
#> Multiple R-squared:  0.5909, Adjusted R-squared:  0.5598 
#> F-statistic: 19.02 on 18 and 237 DF,  p-value: < 2.2e-16

ggplot(data = CumGDD2, mapping = aes(x = cum_GDD_10Cbase, y = DTF_mean)) +
  geom_point(aes(color = PD_check.x)) +
  facet_wrap(~Location_code, drop = TRUE) +
  labs(x = "Cumulative GDD with T_base of 10 C", y = "Average Days to Flowering")


# Separate name so the 8 C model DTF_lc is not overwritten
DTF_lc10 <- lm(DTF_mean ~ Location_code:cum_GDD_10Cbase, data = CumGDD2)
summary(DTF_lc10)
#> 
#> Call:
#> lm(formula = DTF_mean ~ Location_code:cum_GDD_10Cbase, data = CumGDD2)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -11.5004  -2.4203  -0.3724   1.7793  13.5780 
#> 
#> Coefficients:
#>                                    Estimate Std. Error t value Pr(>|t|)
#> (Intercept)                       34.846500   1.750962  19.901  < 2e-16
#> Location_codeABBI:cum_GDD_10Cbase  0.062295   0.010428   5.974 9.55e-09
#> Location_codeABBR:cum_GDD_10Cbase  0.059168   0.008555   6.916 5.28e-11
#> Location_codeABLE:cum_GDD_10Cbase  0.061370   0.010002   6.136 4.05e-09
#> Location_codeABVA:cum_GDD_10Cbase  0.045041   0.011711   3.846 0.000158
#> Location_codeAZBO:cum_GDD_10Cbase  0.009694   0.004213   2.301 0.022360
#> Location_codeCACH:cum_GDD_10Cbase  0.006534   0.006515   1.003 0.317055
#> Location_codeCADV:cum_GDD_10Cbase  0.013718   0.004065   3.375 0.000876
#> Location_codeCAIR:cum_GDD_10Cbase  0.019403   0.007002   2.771 0.006083
#> Location_codeCAOR:cum_GDD_10Cbase  0.005855   0.005731   1.022 0.308125
#> Location_codeCOFC:cum_GDD_10Cbase  0.023800   0.003987   5.969 9.83e-09
#> Location_codeCOFR:cum_GDD_10Cbase  0.022655   0.003989   5.679 4.38e-08
#> Location_codeIDKI:cum_GDD_10Cbase  0.031235   0.005748   5.434 1.49e-07
#> Location_codeIDNA:cum_GDD_10Cbase  0.019038   0.007272   2.618 0.009474
#> Location_codeIDPA:cum_GDD_10Cbase  0.038117   0.004610   8.267 1.44e-14
#> Location_codeIDTF:cum_GDD_10Cbase  0.023158   0.007091   3.266 0.001271
#> Location_codeMBMO:cum_GDD_10Cbase  0.043611   0.011535   3.781 0.000203
#> Location_codeMIEN:cum_GDD_10Cbase  0.014340   0.004883   2.937 0.003679
#> Location_codeMISA:cum_GDD_10Cbase  0.021094   0.004353   4.846 2.42e-06
#> Location_codeMNCR:cum_GDD_10Cbase  0.053177   0.006538   8.134 3.36e-14
#> Location_codeMNPR:cum_GDD_10Cbase  0.042821   0.010901   3.928 0.000115
#> Location_codeMOCO:cum_GDD_10Cbase  0.015571   0.007822   1.991 0.047788
#> Location_codeMTHU:cum_GDD_10Cbase  0.045377   0.007928   5.724 3.50e-08
#> Location_codeMTSI:cum_GDD_10Cbase  0.045543   0.004216  10.802  < 2e-16
#> Location_codeNDCA:cum_GDD_10Cbase  0.022856   0.010991   2.079 0.038764
#> Location_codeNDER:cum_GDD_10Cbase  0.027818   0.006458   4.308 2.51e-05
#> Location_codeNDFA:cum_GDD_10Cbase  0.027152   0.006170   4.401 1.70e-05
#> Location_codeNDHA:cum_GDD_10Cbase  0.036906   0.006665   5.537 8.94e-08
#> Location_codeNEMI:cum_GDD_10Cbase  0.023775   0.004138   5.745 3.13e-08
#> Location_codeNES2:cum_GDD_10Cbase  0.014292   0.008989   1.590 0.113325
#> Location_codeNESB:cum_GDD_10Cbase  0.019051   0.003755   5.074 8.45e-07
#> Location_codeNYFR:cum_GDD_10Cbase  0.013270   0.006511   2.038 0.042766
#> Location_codeONEL:cum_GDD_10Cbase  0.017615   0.006867   2.565 0.011000
#> Location_codeONEX:cum_GDD_10Cbase  0.024157   0.005841   4.136 5.08e-05
#> Location_codeONGU:cum_GDD_10Cbase  0.027642   0.005890   4.693 4.80e-06
#> Location_codeSKOU:cum_GDD_10Cbase  0.051721   0.005528   9.357  < 2e-16
#> Location_codeSKSA:cum_GDD_10Cbase  0.051253   0.007349   6.974 3.78e-11
#> Location_codeTXLU:cum_GDD_10Cbase  0.021885   0.006649   3.291 0.001166
#> Location_codeWAOT:cum_GDD_10Cbase  0.041433   0.005511   7.518 1.50e-12
#> Location_codeWARO:cum_GDD_10Cbase  0.042962   0.006237   6.888 6.19e-11
#> Location_codeWYPO:cum_GDD_10Cbase  0.045489   0.004444  10.236  < 2e-16
#> Location_codeWYTO:cum_GDD_10Cbase  0.028250   0.004201   6.725 1.58e-10
#>                                      
#> (Intercept)                       ***
#> Location_codeABBI:cum_GDD_10Cbase ***
#> Location_codeABBR:cum_GDD_10Cbase ***
#> Location_codeABLE:cum_GDD_10Cbase ***
#> Location_codeABVA:cum_GDD_10Cbase ***
#> Location_codeAZBO:cum_GDD_10Cbase *  
#> Location_codeCACH:cum_GDD_10Cbase    
#> Location_codeCADV:cum_GDD_10Cbase ***
#> Location_codeCAIR:cum_GDD_10Cbase ** 
#> Location_codeCAOR:cum_GDD_10Cbase    
#> Location_codeCOFC:cum_GDD_10Cbase ***
#> Location_codeCOFR:cum_GDD_10Cbase ***
#> Location_codeIDKI:cum_GDD_10Cbase ***
#> Location_codeIDNA:cum_GDD_10Cbase ** 
#> Location_codeIDPA:cum_GDD_10Cbase ***
#> Location_codeIDTF:cum_GDD_10Cbase ** 
#> Location_codeMBMO:cum_GDD_10Cbase ***
#> Location_codeMIEN:cum_GDD_10Cbase ** 
#> Location_codeMISA:cum_GDD_10Cbase ***
#> Location_codeMNCR:cum_GDD_10Cbase ***
#> Location_codeMNPR:cum_GDD_10Cbase ***
#> Location_codeMOCO:cum_GDD_10Cbase *  
#> Location_codeMTHU:cum_GDD_10Cbase ***
#> Location_codeMTSI:cum_GDD_10Cbase ***
#> Location_codeNDCA:cum_GDD_10Cbase *  
#> Location_codeNDER:cum_GDD_10Cbase ***
#> Location_codeNDFA:cum_GDD_10Cbase ***
#> Location_codeNDHA:cum_GDD_10Cbase ***
#> Location_codeNEMI:cum_GDD_10Cbase ***
#> Location_codeNES2:cum_GDD_10Cbase    
#> Location_codeNESB:cum_GDD_10Cbase ***
#> Location_codeNYFR:cum_GDD_10Cbase *  
#> Location_codeONEL:cum_GDD_10Cbase *  
#> Location_codeONEX:cum_GDD_10Cbase ***
#> Location_codeONGU:cum_GDD_10Cbase ***
#> Location_codeSKOU:cum_GDD_10Cbase ***
#> Location_codeSKSA:cum_GDD_10Cbase ***
#> Location_codeTXLU:cum_GDD_10Cbase ** 
#> Location_codeWAOT:cum_GDD_10Cbase ***
#> Location_codeWARO:cum_GDD_10Cbase ***
#> Location_codeWYPO:cum_GDD_10Cbase ***
#> Location_codeWYTO:cum_GDD_10Cbase ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 4.237 on 214 degrees of freedom
#> Multiple R-squared:  0.6707, Adjusted R-squared:  0.6077 
#> F-statistic: 10.63 on 41 and 214 DF,  p-value: < 2.2e-16