
1 R Packages Required

Make sure you have the latest versions of R and RStudio installed before starting. The following R packages are required to complete the data cleaning and documentation in RStudio.

  • knitr - for rendering HTML reports
  • tidyverse - for data manipulation

Note: The above packages do not come with the RStudio installation; they need to be installed explicitly. Use the Packages tab or type install.packages("package_name").

Next, load the R packages:
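The chunk that loads the packages is not shown in the rendered report. A minimal sketch of what it might look like follows. The names `pkgs` and `installed` are illustrative, and the check for missing packages is an added convenience, not necessarily part of the original workflow.

```r
# Packages named in the list above
pkgs <- c("knitr", "tidyverse")

# TRUE for each package whose namespace can be loaded on this machine
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report anything that still needs install.packages()
if (!all(installed)) {
  message("Not installed yet: ", paste(pkgs[!installed], collapse = ", "))
}

# Attach the packages that are available
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}
```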

2 Avian Monitoring

2.1 Data Preparation

Reshaped Avian Monitoring data

Monitoring_Route  Date        eBird  eachCount  totalCount  H_S    variable  value  Weeks     WeekNoYear  Year  Month  MonthWithYear  season
Blue Trail        2018-06-27  N      14         10          Seen   GBHE      2      2018-W26  W26         2018  6      2018-06        summer
Blue Trail        2016-07-13  N      28         14          Heard  MAWR      1      2016-W28  W28         2016  7      2016-07        summer
Purple Trail      2017-08-28  N      11         6                                   2017-W35  W35         2017  8      2017-08        summer
Blue Trail        2016-07-30  N      15         5           Seen   GBHE      4      2016-W30  W30         2016  7      2016-07        summer
Blue Trail        2017-03-13  N      12         4           Seen   CAGO      8      2017-W11  W11         2017  3      2017-03        spring
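The week, month and season columns in the table above can all be derived from Date. The sketch below shows one way to do it in base R. It assumes ISO 8601 week numbers (`%V`) and meteorological seasons keyed on the month; the report does not state either rule, but both reproduce the sample rows. The names `obs` and `season_of_month` are illustrative.

```r
# Dates taken from the sample rows above
obs <- data.frame(
  Date = as.Date(c("2018-06-27", "2016-07-13", "2017-08-28",
                   "2016-07-30", "2017-03-13"))
)

# Assumption: ISO 8601 week numbers (%V); the week rule is not stated
# in the report, but this one matches the sample rows
obs$WeekNoYear    <- paste0("W", format(obs$Date, "%V"))
obs$Year          <- as.integer(format(obs$Date, "%Y"))
obs$Weeks         <- paste0(obs$Year, "-", obs$WeekNoYear)
obs$Month         <- as.integer(format(obs$Date, "%m"))
obs$MonthWithYear <- format(obs$Date, "%Y-%m")

# Assumption: meteorological seasons, looked up by month number
season_of_month <- c("winter", "winter", "spring", "spring", "spring",
                     "summer", "summer", "summer", "autumn", "autumn",
                     "autumn", "winter")
obs$season <- season_of_month[obs$Month]

print(obs)
```

Near the turn of the year an ISO week can belong to the neighbouring calendar year, so combining `%Y` with `%V` may mislabel a few late-December or early-January dates; `%G` gives the ISO year if that matters.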

Aggregated by species and week of year (year ignored)

WeekNoYear variable value
W14 GBHE 1
W22 WODU 4
W11 CAGO 11
W26 GBHE 7
W22 WOTH 3
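A weekly aggregate like the one above can be sketched with `aggregate()`. The example assumes the counts are summed, which the report does not state explicitly. The first four rows come from the reshaped sample in this section; the last row is hypothetical and is only there so that one group holds more than one record.

```r
# Four sample rows from the reshaped table plus one hypothetical row
long <- data.frame(
  WeekNoYear = c("W26", "W28", "W30", "W11", "W26"),
  variable   = c("GBHE", "MAWR", "GBHE", "CAGO", "GBHE"),
  value      = c(2, 1, 4, 8, 3)   # the final 3 is made up
)

# Assumption: counts are summed per week of year and species,
# pooling all years together
weekly <- aggregate(value ~ WeekNoYear + variable, data = long, FUN = sum)
print(weekly)
```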

Data from eBird

Week_starting_on species Frequency Total_checklists_submitted Abundance Birds_Per_Party_Hour checklists_reporting_species High_Count checklists_reporting_species__1 Totals checklists_reporting_species__2 Average_Count checklists_reporting_species__3 variable
01-07 Canada Goose 37.442681 5670 50.3238095 385.820322 2137 4500 2293 304769 2187 139.354824 2187 CAGO
10-21 Great Blue Heron 23.639091 4666 0.5816545 4.145696 1128 74 1144 2820 1137 2.480211 1137 GBHE
07-21 Wood Thrush 9.742288 5122 0.1805935 2.445683 511 12 539 971 527 1.842505 527 WOTH
07-31 Wood Duck 16.662159 3697 1.6418718 10.848080 618 86 624 6171 618 9.985437 618 WODU
02-07 Wood Thrush 0.000000 7679 0.0000000 0.000000 0 0 0 0 0 0.000000 0 WOTH

2.2 Descriptive Analysis

2.2.1 Species Number in Sample

Data Summary:

This plot shows the total number of each species in the CSV file. CAGO has the highest count and WOTH the lowest, so CAGO is the most frequently recorded species, with GBHE second. MAWR and WODU have similar counts.
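The ranking described here can be checked against the Year cross table in section 2.2.3. The sketch below copies those counts and adds them up per species, assuming the plot is based on the same records as that table. The names `by_year` and `species_total` are illustrative.

```r
# Counts copied from the "With Year" cross table in section 2.2.3
by_year <- matrix(
  c(24, 72, 21,
    11,  1,  8,
    30, 33, 41,
    13,  9,  2,
     7,  2,  5),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("CAGO", "MAWR", "GBHE", "WODU", "WOTH"),
                  c("2016", "2017", "2018"))
)

# Total per species, largest first
species_total <- sort(rowSums(by_year), decreasing = TRUE)
print(species_total)

# Bar chart of the totals, tallest bar first
barplot(species_total, xlab = "Species", ylab = "Number of records")
```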

2.2.2 Alluvial Diagram

Data Summary:

a. CAGO (Canada Goose) and GBHE (Great Blue Heron) have sample sizes of the same magnitude; MAWR (Marsh Wren), WOTH (Wood Thrush) and WODU (Wood Duck) form a second, much smaller group.

b. 2017 has the largest sample (117 records in the Year cross table of section 2.2.3), while 2016 (85) and 2018 (77) are somewhat smaller. The split across years differs from species to species.

c. Observations of the five species are mostly Seen. MAWR and WOTH are the exceptions, with more records Heard than Seen. MAWR has no H&S records.

d. Most of the sample comes from the Blue Trail, but most WOTH records come from the Red Trail.

e. Each species is observed in its own characteristic seasons.

  • CAGO appears in winter, spring and autumn.
  • MAWR appears almost only in summer.
  • GBHE is mostly observed in summer, with some records in spring.
  • Most WODU records are in summer, with a few in autumn and spring.
  • WOTH appears only in summer.

2.2.3 Cross Table

With season

variable autumn spring summer winter
CAGO 22 63 1 31
MAWR 0 1 19 0
GBHE 1 18 85 0
WODU 5 3 16 0
WOTH 0 0 14 0

With Monitoring Route

variable  Blue Trail  Green Trail  Purple Trail  Red Trail
CAGO      105         6            2             4
MAWR      20          0            0             0
GBHE      103         1            0             0
WODU      24          0            0             0
WOTH      0           1            1             12

With H_S

variable H&S Heard Seen
CAGO 2 12 103
MAWR 0 14 6
GBHE 10 10 84
WODU 6 0 18
WOTH 1 10 3

With Year

variable 2016 2017 2018
CAGO 24 72 21
MAWR 11 1 8
GBHE 30 33 41
WODU 13 9 2
WOTH 7 2 5
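Cross tables of this kind can be produced with `xtabs()`. The sketch below assumes each cell counts records (rows of the reshaped data), not summed `value`; the report does not say which. It uses only the four complete sample rows from section 2.1, so the counts are far smaller than in the tables above.

```r
# Four sample rows from section 2.1 (illustrative subset, not the full data)
long <- data.frame(
  variable = c("GBHE", "MAWR", "GBHE", "CAGO"),
  season   = c("summer", "summer", "summer", "spring"),
  H_S      = c("Seen", "Heard", "Seen", "Seen")
)

# Count records for every species-by-season combination
by_season <- xtabs(~ variable + season, data = long)
print(by_season)

# The same pattern works for any other grouping column
by_method <- xtabs(~ variable + H_S, data = long)
print(by_method)
```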

2.3 ANOVA Analysis

The descriptive analysis shows that, for every species, the records are not evenly distributed across the levels of each variable. Each of the five species has its own pattern on each variable, so we explore the relationships among the variables further.

The number of observations for MAWR, WOTH and WODU is too small to estimate differences between groups reliably. We therefore only attempt an ANOVA test of group differences for two species (CAGO and GBHE).

2.3.1 Canada Goose (CAGO)

## Analysis of Variance Table
## 
## Response: value
##                  Df Sum Sq Mean Sq F value Pr(>F)
## H_S               2 565.79 282.893  2.0096 0.2488
## Year              1   7.49   7.494  0.0532 0.8288
## Monitoring_Route  3 301.97 100.657  0.7150 0.5924
## season            3  86.87  28.956  0.2057 0.8876
## Residuals         4 563.10 140.774

As the result shows, we cannot reject the null hypothesis at the 5% level: CAGO's observation numbers do not differ significantly across any of the four variables.
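A table with this layout is what `anova()` prints for a fitted `lm()` model. The sketch below fits an additive model with the same four terms on made-up data, since the CAGO subset itself is not reproduced in the report. The data frame `toy` and its values are hypothetical; only the formula mirrors the table above.

```r
# Toy data: every combination of the three categorical predictors
toy <- expand.grid(
  H_S              = c("H&S", "Heard", "Seen"),
  Monitoring_Route = c("Blue Trail", "Green Trail", "Purple Trail", "Red Trail"),
  season           = c("autumn", "spring", "summer", "winter")
)
toy$Year <- rep(2016:2018, each = 16)   # numeric, so it uses 1 df
set.seed(1)                             # reproducible made-up counts
toy$value <- rpois(nrow(toy), lambda = 5)

# Additive model with the terms in the same order as the table above
fit <- lm(value ~ H_S + Year + Monitoring_Route + season, data = toy)

# anova() on a single lm gives sequential (Type I) sums of squares,
# so each term is tested after the terms listed before it
tab <- anova(fit)
print(tab)
```

Because the sums of squares are sequential, changing the order of the terms in the formula can change the F values when the design is unbalanced.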

2.3.2 Great Blue Heron (GBHE)

## Analysis of Variance Table
## 
## Response: value
##                  Df Sum Sq Mean Sq F value  Pr(>F)  
## H_S               2 414.00 207.000  3.7619 0.08733 .
## season            2 356.36 178.182  3.2382 0.11122  
## Year              1  15.99  15.988  0.2906 0.60926  
## Monitoring_Route  1  61.50  61.499  1.1177 0.33112  
## Residuals         6 330.15  55.025                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Again, no variable has a p-value below 5% for GBHE. However, the fit and the alluvial diagram suggest that the independent variables may interact.

## Analysis of Variance Table
## 
## Response: value
##            Df Sum Sq Mean Sq F value  Pr(>F)  
## season      2 265.94 132.971  3.2609 0.11002  
## H_S         2 504.42 252.210  6.1850 0.03484 *
## season:H_S  2 162.97  81.485  1.9983 0.21622  
## Residuals   6 244.67  40.778                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

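With the interaction term included, the H_S main effect becomes significant at the 5% level (p = 0.035), although the season:H_S interaction itself is not significant (p = 0.216). This suggests that the observation number differs by observation method once season is accounted for.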
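An interaction model of this form can be written with the `*` operator in the formula. The sketch below uses made-up, fully crossed data, so its interaction term has 4 degrees of freedom; the table above shows 2, presumably because not every season and method combination occurs in the GBHE data. All names and values in `toy` are hypothetical.

```r
# Toy data: two made-up replicates of each season-by-method combination
toy <- expand.grid(
  season       = c("autumn", "spring", "summer"),
  H_S          = c("H&S", "Heard", "Seen"),
  replicate_id = 1:2
)
set.seed(2)                             # reproducible made-up counts
toy$value <- rpois(nrow(toy), lambda = 5)

# season * H_S expands to season + H_S + season:H_S
fit_int <- lm(value ~ season * H_S, data = toy)
tab_int <- anova(fit_int)
print(tab_int)

# p-value of the interaction row
p_interaction <- tab_int["season:H_S", "Pr(>F)"]
print(p_interaction)
```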

2.4 Comparing with eBird

In this part, we compare the OWC bird data against eBird data. We keep the same five indicator species as in our sample: CAGO, MAWR, GBHE, WOTH and WODU. The comparison focuses on how the weekly observation number of each species differs between the OWC data and the eBird data taken from the eBird website. Because our access to eBird data is limited, we only found some plots on the eBird website for these five indicator species. We then make plots of the same species from the OWC bird data.

To make the comparison side by side and on the same terms, we use week as the x-axis and observation number as the y-axis, matching the eBird plots. We can then analyze the total number of each species by week and compare the OWC data with the eBird data.
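One way to put weekly counts on a common week axis is to expand them to every week of the year before plotting. The sketch below does this for the two GBHE rows of the weekly aggregate in section 2.1. Drawing weeks without a record as zero is an assumption made for the plot; a missing week may simply mean that no survey took place. The names `gbhe`, `all_weeks` and `series` are illustrative.

```r
# GBHE rows from the weekly aggregate shown in section 2.1
gbhe <- data.frame(WeekNoYear = c("W14", "W26"), value = c(1, 7))

# All possible week labels, W01 to W53, so gaps stay visible on the x-axis
all_weeks <- data.frame(WeekNoYear = sprintf("W%02d", 1:53))

# Left join: weeks without a record get NA, which is then set to 0
series <- merge(all_weeks, gbhe, by = "WeekNoYear", all.x = TRUE)
series$value[is.na(series$value)] <- 0
series <- series[order(series$WeekNoYear), ]

barplot(series$value, names.arg = series$WeekNoYear,
        xlab = "Week of year", ylab = "Observation number",
        main = "GBHE (two sample rows only)")
```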

2.4.1 Canada Goose (CAGO)

2.4.2 Marsh Wren (MAWR)

2.4.3 Great Blue Heron (GBHE)

2.4.4 Wood Thrush (WOTH)

2.4.5 Wood Duck (WODU)

Based on the above plots, the eBird dataset holds far more data on the total number of each species. The AvianMonitoring OWC dataset, by contrast, lacks observations for many weeks. Even when we pool the three years (2016, 2017 and 2018), the data still do not cover every week. Thus, we cannot make a clear comparison between the OWC and eBird data.

3 Bald Eagle Nesting

Setting the working directory and reading the cleaned data:

‘eagle_raw_data’ stores the whole dataset from the CSV file. All further data manipulation is done on the variable ‘eagle_raw_data’.
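The chunk that reads the file is not shown in the rendered report. A minimal sketch of the step follows. The file name, the column names (`Date`, `Nest`, `Adults_Seen`) and the two data rows are all hypothetical stand-ins; the sketch writes them to a temporary file only so that the example runs on its own.

```r
# Stand-in for the cleaned CSV; the real file name and columns are not
# shown in the report, so everything written here is hypothetical
csv_path <- file.path(tempdir(), "eagle_cleaned_example.csv")
writeLines(c("Date,Nest,Adults_Seen",
             "2018-03-01,A,2",
             "2018-03-08,A,1"), csv_path)

# setwd("path/to/project")  # uncomment and point at the project folder
eagle_raw_data <- read.csv(csv_path, stringsAsFactors = FALSE)

# Convert the date column before any further manipulation
eagle_raw_data$Date <- as.Date(eagle_raw_data$Date)
str(eagle_raw_data)
```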

4 Contributorship

Indra - worked on Bald Eagle Nesting and GitHub.

Sun - worked on eBird data and GitHub.

Kalpana - worked on indicator species, GitHub and proofreading.