1) Map of Buzzards Bay

map

map

2) Figure of rising temps/preicps

For temperature, I used temperature data from the CRUTEM4 dataset (1850-2016; https://crudata.uea.ac.uk/cru/data/crutem/ge/crutem4-2018-06/N42.5W072.5/720223_data.txt) supplemented by Wunderground monthly data (for 2017 and 2018) from New Bedford Regional Airport.

Precipitation data from the US Historical Climate Network is available for most of the same period. Missing years are 2002-2016 and must be supplemented by precip data from the New Bedford Regional Airport station.

However, because of the difference in stations, there’s difference in precipitation. We can see this for the years 1997-2002 (where the two records overlap): the aiport consistently had ~200 mm less / year (17 mm less / month) than the network station.

So we might “correct” the data by adding 17 mm / month to the New Bedford Regional Airport data, or just use the raw data. Below, I’ve used the corrected data (yrs 2002-2016).

3) Composite pic of flowers

(Jan will add)

4) Phenology records

There are 1280 taxon names in the Hervey data, with phenological periods indicated.

There are 764 species in the contemporary data, with 3531 observations.

Here’s a summary of the # of species which occur (have an “X” marked) in various Hervey-defined periods.

Table 1: Contemporary and Hervey date ranges. Hervey date ranges, and the number of species recorded wihin each. For our purposes, I’ve still used the period names from Hervey, but the actual periods end one day sooner to avoid overlap: e.g. the “Apr 1-Apr 15” goes from day 91 to day 104). The contemporary data is any species for which an observation is recorded in any of the contemporary years.

##                date N taxa 1860-1911 N taxa 2014-2018
## 1    Jan 1 - Mar 15               NA               23
## 2       Mar 15-Apr1                9               21
## 3      Apr 1-Apr 15               36               44
## 4      Apr 15-May 1               32              109
## 5      May 1-May 10              141              117
## 6     May 10-May 20               79              139
## 7     May 20-June 1              145              158
## 8    June 1-June 10              122              107
## 9   June 10-June 20              138              101
## 10   June 20-July 1              157               97
## 11   July 1-July 10              145               54
## 12  July 10-July 20              141              212
## 13    July 20-Aug 1              159              117
## 14   Aug. 1-Aug. 10              196              175
## 15  Aug. 10-Aug. 20              163              133
## 16  Aug. 20-Sept. 1              144              150
## 17 Sept. 1-Sept. 15              123              173
## 18   Sept. 15-Oct 1               NA              128
## 19    Oct. 1-Oct 31               92               99
## 20     Nov 1-Dec 31                4               36

There are 545 species of 341 genera that were recorded by Hervey and in at least one contemporary year. Of these,74 species of 66 genera were recorded by Hervey and in every contemporary year.

There are 735 species of 403 genera that were recorded by Hervey but not in the contemporary data; and 219 species of 154 genera that were recorded in the contemporary data but not by Hervey. Within this, 241 genera recorded by Hervey were never recorded in the contemporary data and 49 genera were newly recorded in the contemporary data but not by Hervey.

Figure: Comparing Hervey and contemporary observations, there is not a simple sign of earlier contemporary observations

Figure: Comparing Hervey and contemporary observations, there is not a simple sign of earlier contemporary observations

Comparing as a whole the total 1499 across both Hervey and contemporary studies, there is not a simple earlier record of contemporary observations versus the mean (=midpoint, in this case) of Hervey’s species ranges.

Rather, it looks like 2014 and 2015 are signifcantly earlier, followed by Hervey, followed by 2016 and then 2017-2018, which are significantly later. Here are the year-to-year statistics:

##               Df   Sum Sq Mean Sq F value Pr(>F)    
## year           5   248437   49687   17.51 <2e-16 ***
## Residuals   5553 15755970    2837                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: aov(formula = day_of_year ~ year, data = joined)
## 
## Linear Hypotheses:
##                  Estimate Std. Error t value Pr(>|t|)    
## 2014 - 1860 == 0  -11.634      3.032  -3.837  0.00165 ** 
## 2015 - 1860 == 0  -10.324      2.756  -3.746  0.00247 ** 
## 2016 - 1860 == 0    5.153      2.118   2.433  0.13857    
## 2017 - 1860 == 0    6.616      2.140   3.092  0.02329 *  
## 2018 - 1860 == 0   11.139      2.135   5.217  < 0.001 ***
## 2015 - 2014 == 0    1.310      3.740   0.350  0.99926    
## 2016 - 2014 == 0   16.787      3.299   5.089  < 0.001 ***
## 2017 - 2014 == 0   18.250      3.313   5.509  < 0.001 ***
## 2018 - 2014 == 0   22.774      3.310   6.881  < 0.001 ***
## 2016 - 2015 == 0   15.477      3.047   5.080  < 0.001 ***
## 2017 - 2015 == 0   16.940      3.062   5.532  < 0.001 ***
## 2018 - 2015 == 0   21.464      3.059   7.018  < 0.001 ***
## 2017 - 2016 == 0    1.463      2.504   0.584  0.99160    
## 2018 - 2016 == 0    5.986      2.499   2.395  0.15099    
## 2018 - 2017 == 0    4.523      2.518   1.796  0.45580    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)
Figure: Comparing Hervey records to records in each of the 5 Contemporary years

Figure: Comparing Hervey records to records in each of the 5 Contemporary years

Could part of the difference come from the different taxa being observed between Hervey and modern periods? Below, we do the same for just the 545 shared taxa.

Below, can look at these 545 shared species among different years. Again, 2014 and 2015 are significantly earlier (the post-hoc test puts them in the “a” group), followed by Hervey, which - for this species set - is not significantly different from the other years (the post-hoc test puts them in the “b” group).

##               Df   Sum Sq Mean Sq F value   Pr(>F)    
## year           5   156362   31272   10.84 2.26e-10 ***
## Residuals   3804 10974138    2885                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: aov(formula = day_of_year ~ year, data = joined_ret)
## 
## Linear Hypotheses:
##                  Estimate Std. Error t value Pr(>|t|)    
## 2014 - 1860 == 0 -14.2127     3.6657  -3.877  0.00141 ** 
## 2015 - 1860 == 0 -13.1884     3.3088  -3.986  < 0.001 ***
## 2016 - 1860 == 0   3.1727     2.5893   1.225  0.81914    
## 2017 - 1860 == 0   2.3349     2.6163   0.892  0.94641    
## 2018 - 1860 == 0   6.4583     2.6163   2.469  0.12886    
## 2015 - 2014 == 0   1.0243     4.3452   0.236  0.99990    
## 2016 - 2014 == 0  17.3854     3.8258   4.544  < 0.001 ***
## 2017 - 2014 == 0  16.5477     3.8441   4.305  < 0.001 ***
## 2018 - 2014 == 0  20.6711     3.8441   5.377  < 0.001 ***
## 2016 - 2015 == 0  16.3611     3.4853   4.694  < 0.001 ***
## 2017 - 2015 == 0  15.5233     3.5054   4.428  < 0.001 ***
## 2018 - 2015 == 0  19.6467     3.5054   5.605  < 0.001 ***
## 2017 - 2016 == 0  -0.8378     2.8362  -0.295  0.99969    
## 2018 - 2016 == 0   3.2856     2.8362   1.158  0.85179    
## 2018 - 2017 == 0   4.1234     2.8608   1.441  0.69401    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

5) Seasonal differences imply that things are not (simply) getting earlier

We still want to look for differnces between the different seasons in phenological change. I worked with Ivan to address my concerns about this analysis, and (to cut a long story short) what we were able to come up with is that for this data, regressing the mean shift of a species (difference between the species in contemporary and Hervey) on the mean flowering of a species (based on Hervey day of year) is not biased, whereas the other methods we considered (regressing on mean flowering of a species from contemporary observations or on an average of contemporary and Hervey) both are biased.

To demonstrate this, we did a simulation where our actual species mean shift values were randomly shuffled among species 100 times, and were able to show that the intercepts, slopes, and significances of these randomized simiulations looked very different than our actual data.

I’m not sure that we need to include any of this except for the model of our acutal data, but for your information, here’s one example simulation, our acutal data, and the slope and intercepts of the 100 simulations compared to the acutal data.

The outcome here is: yes: we may say that early-flowering species show a greater phenological delay, while late-flowering species show less or no change.

Full model for actual data:

## 
## Call:
## lm(formula = before_after ~ t1_mean, data = test_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -121.476  -13.392   -4.318   10.149  118.569 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 26.70264    4.66838   5.720 1.76e-08 ***
## t1_mean     -0.09148    0.02472  -3.701 0.000237 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 25.57 on 543 degrees of freedom
## Multiple R-squared:  0.0246, Adjusted R-squared:  0.0228 
## F-statistic:  13.7 on 1 and 543 DF,  p-value: 0.0002369

6) What are environmental variables that explain this?

For the modern years, there is a small but significant pattern of earlier flowering in warmer years (A, solid blue line). However, our temperature and phenology data for the Hervey dataset do not follow this pattern, to an extent that there is no significant relationship when it is added (A, dotted black line).

In contrast, the small but significant pattern of LATER flower in modern wetter years (B, solid blue line) is in line with the Hervey data, and more significant when the Hervey period data added (B dotted black line).

## 
## Call:
## lm(formula = species_deviation ~ mean_temp, data = both[both$year != 
##     1860, ])
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -66.590 -10.041   0.101   8.501  93.848 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)   51.333     17.317   2.964  0.00323 **
## mean_temp     -4.519      1.559  -2.899  0.00397 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 17.15 on 368 degrees of freedom
## Multiple R-squared:  0.02232,    Adjusted R-squared:  0.01967 
## F-statistic: 8.403 on 1 and 368 DF,  p-value: 0.00397
## 
## Call:
## lm(formula = species_deviation ~ mean_temp, data = both)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -62.712 -10.368  -0.016   8.536  94.398 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -1.1309    12.9869  -0.087    0.931
## mean_temp     0.1039     1.1911   0.087    0.931
## 
## Residual standard error: 17.78 on 442 degrees of freedom
## Multiple R-squared:  1.723e-05,  Adjusted R-squared:  -0.002245 
## F-statistic: 0.007615 on 1 and 442 DF,  p-value: 0.9305
## 
## Call:
## lm(formula = species_deviation ~ sum_precip, data = both[both$year != 
##     1860, ])
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -63.544 -10.378  -0.454   8.037  95.506 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)   
## (Intercept) -16.754469   6.500049  -2.578  0.01034 * 
## sum_precip    0.013224   0.004742   2.789  0.00557 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 17.17 on 368 degrees of freedom
## Multiple R-squared:  0.0207, Adjusted R-squared:  0.01804 
## F-statistic: 7.777 on 1 and 368 DF,  p-value: 0.005566
## 
## Call:
## lm(formula = species_deviation ~ sum_precip, data = both)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -63.113 -10.105  -0.194   9.109  96.364 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -21.210036   5.236805  -4.050 6.04e-05 ***
## sum_precip    0.016257   0.003963   4.102 4.88e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 17.45 on 442 degrees of freedom
## Multiple R-squared:  0.03667,    Adjusted R-squared:  0.03449 
## F-statistic: 16.82 on 1 and 442 DF,  p-value: 4.881e-05

6a) Divided by spring / fall

Spring-flowering species (those with species mean flowering date <183) are significantly advanced (that is, flower before the long-term mean date for their species) following warmer six-month periods (A). Responses of these species to precipitation are non-significant. In contrast, fall-flowering species (those with species mean flowering date >182) do not have a significant response to temperature but are significaly delayed following wetter six-month periods (B).

## 
## Call:
## lm(formula = species_deviation ~ p6mo, data = both_with_window[both_with_window$species_mean < 
##     183, ])
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -61.182  -8.525  -0.887   7.430  60.342 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)   0.9804     1.1763   0.833  0.40556   
## p6mo         -2.5507     0.9562  -2.668  0.00824 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 16.19 on 208 degrees of freedom
## Multiple R-squared:  0.03308,    Adjusted R-squared:  0.02843 
## F-statistic: 7.117 on 1 and 208 DF,  p-value: 0.008239

## 
## Call:
## lm(formula = species_deviation ~ p6mo_pr, data = both_with_window[both_with_window$species_mean > 
##     182, ])
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -57.591 -10.076   1.239   8.699  94.258 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -3.27390    1.40520  -2.330   0.0207 *  
## p6mo_pr      0.05238    0.01195   4.382 1.78e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 18.21 on 232 degrees of freedom
## Multiple R-squared:  0.07644,    Adjusted R-squared:  0.07246 
## F-statistic:  19.2 on 1 and 232 DF,  p-value: 1.782e-05

In other words, more precipitation drives later fall flowering, and the fact that it’s much wetter now (in the months experienced by all but a few species in 2015) explains why fall flowering is later.

7) Duration / variability of flowering

We can look at variability comparing latest flowering date minus earliest flowering date for each species for contemporary observations, and for Hervey data, and then taking the difference of these two durations. Positive values indicate duration of the species has lengthened, negative that it has shortened.

If we compare the temporal range of each of the 545 species from the Hervey data that show up in at least one of the contemporary years to their temporal range across the contemporary years, it’s clear that the modern period (2014-2018) shows a greater variation/duration than the 1860-1911 period.

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  rjoined$t1_duration and rjoined$t2_duration
## W = 126270, p-value = 1.702e-05
## alternative hypothesis: true location shift is not equal to 0

And this isn’t just greater variability among contemporary years: there is greater duration in most contemorary years individually.

Contemporary years do differ in observed within-year duration.

## 
## Call:
## lm(formula = duration ~ year, data = both_with_window)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -57.07 -27.57 -20.07  19.93 289.93 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   27.568      5.811   4.744 2.84e-06 ***
## year2014      -7.081      8.217  -0.862 0.389313    
## year2015      -1.608      8.217  -0.196 0.844939    
## year2016      29.500      8.217   3.590 0.000368 ***
## year2017      28.554      8.217   3.475 0.000562 ***
## year2018      18.054      8.217   2.197 0.028539 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 49.98 on 438 degrees of freedom
## Multiple R-squared:  0.08119,    Adjusted R-squared:  0.0707 
## F-statistic: 7.741 on 5 and 438 DF,  p-value: 5.428e-07

Years with warmer temperatures are associated with greater within-year duration.

## 
## Call:
## lm(formula = duration ~ p6mo, data = both_with_window)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -53.82 -33.73 -21.33  22.28 300.10 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   36.298      2.656  13.667   <2e-16 ***
## p6mo           5.805      2.390   2.429   0.0155 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 51.57 on 442 degrees of freedom
## Multiple R-squared:  0.01317,    Adjusted R-squared:  0.01094 
## F-statistic: 5.899 on 1 and 442 DF,  p-value: 0.01555

8) Species origin and other traits

In the 1860-1911 time period, there are 1266 native species, 737 introduced species (not including those classifed as invasive) and 24 invasive species.

In this data, there is no significant difference in the midpoint of flowering between native and introduced species, but invasive species are significantly earlier.

##                 Df  Sum Sq Mean Sq F value Pr(>F)  
## is_introduced    2   18494    9247   4.519  0.011 *
## Residuals     2024 4141599    2046                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 1 observation deleted due to missingness
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: aov(formula = midpoints ~ is_introduced, data = dhervey_tidy)
## 
## Linear Hypotheses:
##                            Estimate Std. Error t value Pr(>|t|)   
## introduced - native == 0    -0.8763     2.0959  -0.418  0.89982   
## invasive - native == 0     -27.9759     9.3208  -3.001  0.00639 **
## invasive - introduced == 0 -27.0996     9.3828  -2.888  0.00889 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

In the 2014-2018 time period, there are 2184 native species, 1128 introduced species (not including those classifed as invasive) and 182 invasive species.

In this data, both introduced and invasive species have significantly earlier midpoints than native.

##                 Df   Sum Sq Mean Sq F value Pr(>F)    
## is_introduced    2   555415  277707   86.54 <2e-16 ***
## Residuals     3491 11202168    3209                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 37 observations deleted due to missingness
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: aov(formula = day_of_year ~ is_introduced, data = contemporary_dates)
## 
## Linear Hypotheses:
##                            Estimate Std. Error t value Pr(>|t|)    
## introduced - native == 0    -26.951      2.077 -12.976   <0.001 ***
## invasive - native == 0      -18.538      4.370  -4.242   <0.001 ***
## invasive - introduced == 0    8.413      4.525   1.859    0.141    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

This effect is unchanged if we look just at the 545 shared species monitored in both periods, which suggests that the introduced species have shifted more over time.

##                        Df   Sum Sq Mean Sq F value   Pr(>F)    
## is_introduced           2   371891  185946  68.663  < 2e-16 ***
## source                  1    14789   14789   5.461   0.0195 *  
## type                    2   182515   91258  33.698 2.84e-15 ***
## is_introduced:source    2   186302   93151  34.397 1.43e-15 ***
## is_introduced:type      4   258062   64515  23.823  < 2e-16 ***
## Residuals            5509 14918978    2708                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 38 observations deleted due to missingness

Taking a look at other species traits from the data Emily assembled for 280 species, which together account for 2990 observations (4-49 observations each). Annual/perennial/biennial, color, cultivation, growth habit, reproduction, and uses all show differences in day of year, but NOT in day of year * source interaction (that is, now difference in shift). In contrast, native range shows a significant interaction, but the species that seem to be driving this are few.

##               Df   Sum Sq Mean Sq F value Pr(>F)    
## value          2   261772  130886   46.23 <2e-16 ***
## Residuals   3647 10325250    2831                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##               Df  Sum Sq Mean Sq F value Pr(>F)    
## value          9  447444   49716   17.47 <2e-16 ***
## Residuals   2900 8253863    2846                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##               Df   Sum Sq Mean Sq F value Pr(>F)  
## value          1    14309   14309   4.921 0.0266 *
## Residuals   4463 12977982    2908                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##               Df   Sum Sq Mean Sq F value Pr(>F)    
## value          5  1352285  270457   99.33 <2e-16 ***
## Residuals   5118 13935378    2723                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##                               Df  Sum Sq Mean Sq F value Pr(>F)  
## num_range_categories           1    6778    6778   2.757 0.0971 .
## source                         1   15529   15529   6.316 0.0121 *
## num_range_categories:source    1    2182    2182   0.887 0.3464  
## Residuals                   1268 3117653    2459                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##                Df   Sum Sq Mean Sq F value Pr(>F)    
## value           2   320846  160423  54.208 <2e-16 ***
## source          1      691     691   0.234  0.629    
## value:source    2     8602    4301   1.453  0.234    
## Residuals    3616 10701127    2959                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##               Df   Sum Sq Mean Sq F value Pr(>F)    
## value          9   485329   53925  18.250 <2e-16 ***
## source         1     3050    3050   1.032   0.31    
## Residuals   5162 15252464    2955                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1