11/28/2017

The Ordway Birds

Background

OrdwayBirds is a table of records of birds captured and released at the Katharine Ordway Natural History Study Area.

The data contain entry mistakes. The variable SpeciesName, which identifies the species of each bird, needs fixing: the spelling of some names varies, so the same species can appear under several names and be misclassified. The Month and Day variables have issues as well.

Continuation

The data table OrdwaySpeciesNames collects all the different spellings of the names together with a cleaned version of each. The goal of the assignment is a guide that tells birders the best time of year to visit Ordway to see a particular species.

Cleaning up data

Getting the data table:

OrdwayBirds <-
  OrdwayBirds %>%
  select(SpeciesName, Month, Day) %>%
  mutate(Month= as.numeric(as.character(Month)),
         Day= as.numeric(as.character(Day)))

The mutate() call converts Month and Day to numeric variables.
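The reason for wrapping as.character() inside as.numeric() can be sketched with a small example. It assumes the columns might be stored as factors (if they are plain character strings, the extra as.character() is harmless). The values below are made up for illustration and are not taken from OrdwayBirds.

```r
# Made-up values for illustration only; not rows from OrdwayBirds.
month_factor <- factor(c("10", "2", "7"))

# as.numeric() on a factor returns the internal level codes,
# not the numbers shown in the labels.
codes <- as.numeric(month_factor)

# Converting to character first recovers the values as written.
values <- as.numeric(as.character(month_factor))

print(codes)
print(values)
```

Entries that are not valid numbers become NA in this conversion, which is one way the problem values in Month and Day can be dropped later.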

Task 1

Part 1

Including misspellings, how many different species are there in the OrdwayBirds data?

Make a data table that gives the number of distinct species in the SpeciesNameCleaned variable in OrdwaySpeciesNames. n_distinct() is helpful here: it counts the number of unique values in a variable.

Solution

OrdwayBirds %>%
  summarise(count = n_distinct(SpeciesName))
##   count
## 1   275

New data table:

OrdwaySpecNameCount <-
  OrdwaySpeciesNames %>%
  summarise(count = n_distinct(SpeciesNameCleaned))
OrdwaySpecNameCount 
##   count
## 1   109
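A minimal sketch of why the raw count is so much larger than the cleaned count: n_distinct() treats every spelling variant as its own value. The names below are hypothetical examples of such variants, not values copied from the data.

```r
library(dplyr)

# Hypothetical raw names with spelling variants (not real OrdwayBirds rows).
raw_names <- c("Tree Swallow", "Tree swallow", "Tree Swallow",
               "Field Sparrow", "Field Sparow")

# Every distinct spelling is counted separately.
n_raw <- n_distinct(raw_names)

# Base R equivalent, shown for comparison.
n_base <- length(unique(raw_names))

print(n_raw)
print(n_base)
```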

Task 2

Part 1

Use the OrdwaySpeciesNames table to create a new data table that corrects the misspellings in SpeciesName. This is easily done with the inner_join() data verb.

Corrected <-
  OrdwayBirds %>%
  inner_join(OrdwaySpeciesNames, by = "SpeciesName") %>%  # match on the raw name
  select(Species = SpeciesNameCleaned, Month, Day) %>%
  na.omit() # Cleans up the missing ones
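To see what the join does, here is a small sketch on hypothetical stand-ins for the two tables (the rows are invented). It also uses anti_join(), which is not part of the original solution, as one possible way to inspect the rows that inner_join() drops.

```r
library(dplyr)

# Hypothetical stand-ins for OrdwayBirds and OrdwaySpeciesNames.
birds <- data.frame(
  SpeciesName = c("Tree Swallow", "Tree swallow", "Unknwn bird"),
  Month = c(5, 6, 7),
  Day = c(1, 12, 30),
  stringsAsFactors = FALSE
)
lookup <- data.frame(
  SpeciesName = c("Tree Swallow", "Tree swallow"),
  SpeciesNameCleaned = c("Tree Swallow", "Tree Swallow"),
  stringsAsFactors = FALSE
)

# inner_join() keeps only rows whose SpeciesName appears in both tables
# and adds the SpeciesNameCleaned column from the lookup table.
joined <- inner_join(birds, lookup, by = "SpeciesName")

# anti_join() lists the rows that found no match and were dropped.
unmatched <- anti_join(birds, lookup, by = "SpeciesName")

print(joined)
print(unmatched)
```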

Output

Data table


Questions

Look at the names of the variables in OrdwaySpeciesNames and OrdwayBirds:

Which variable was used for matching cases?

  • SpeciesName

Which variables were added?

  • SpeciesNameCleaned, the corrected bird name, which select() renames to Species. Month and Day were already in OrdwayBirds and are carried over.

Task 3

Part 1

Count how many bird captures there are of each corrected species. Call the data table that contains the counts CountCorrect. Arrange it in descending order, starting with the species with the most captures, and look through the list.

CountCorrect <- 
  Corrected %>% 
  group_by(Species) %>% 
  summarise(count=n()) %>%
  arrange(desc(count))
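As a side note, the group_by(), summarise(), arrange() chain can also be written with dplyr's count() shortcut. The sketch below compares the two on made-up rows; the toy table is hypothetical and only mimics the shape of Corrected.

```r
library(dplyr)

# Hypothetical corrected captures (invented rows).
toy <- data.frame(
  Species = c("Tree Swallow", "Field Sparrow", "Tree Swallow",
              "Tree Swallow", "Field Sparrow", "American Goldfinch"),
  stringsAsFactors = FALSE
)

# Same steps as in the solution above.
long_form <- toy %>%
  group_by(Species) %>%
  summarise(count = n()) %>%
  arrange(desc(count))

# count() groups, tallies, and sorts in one call.
short_form <- toy %>%
  count(Species, sort = TRUE, name = "count")

print(long_form)
print(short_form)
```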

Output

Part 2

Define for yourself a "major species" as a species with more than a particular threshold count. Set your threshold so that there are 5 or 6 species designated as major.

Filter to produce a data table with only the birds that belong to a major species. Save the output in a table called Majors.

topSixSpec <-
  CountCorrect %>% 
  head(n = 6) %>% 
  .$Species
topSixSpec
## [1] "Slate-colored Junco"    "Tree Swallow"          
## [3] "Black-capped Chickadee" "American Goldfinch"    
## [5] "Field Sparrow"          "Lincoln's Sparrow"

Then filter Corrected to keep only those species:

Majors <-
  Corrected %>% 
  filter(Species %in% topSixSpec)
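The solution above picks the six largest species with head() rather than an explicit cutoff. A threshold-based variant, closer to the wording of the task, might look like the sketch below. The species names, counts, and threshold are all invented for illustration.

```r
library(dplyr)

# Hypothetical species counts; names and numbers are invented.
toy_counts <- data.frame(
  Species = c("A", "B", "C", "D"),
  count = c(900, 750, 40, 12),
  stringsAsFactors = FALSE
)

# Illustrative cutoff, not the one implied by the Majors table above.
threshold <- 500

# Keep the names of species whose count exceeds the threshold.
major_names <- toy_counts %>%
  filter(count > threshold) %>%
  pull(Species)

print(major_names)
```

With a threshold, the number of major species can change if the data change, whereas head(n = 6) always returns six.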

Output

Task 4

Part 1

Write a command that produces the month-by-month count of each of the major species. Call this table ByMonth.

ByMonth <- Majors %>%
  group_by(Species, Month) %>%
  summarise(count = n()) %>%
  arrange(Month)

Output

Part 2

Display this month-by-month count with a bar chart arranged in a way that will tell the story of what time of year the various species appear.

  • First: change the Month variable from numbers to month names
# Map month numbers 1-12 to abbreviations ("Jan", ..., "Dec")
CrazyMonth <- with(Majors,
                  plyr::mapvalues(Month, from = 1:12,
                                  to = month.abb))
# Fix the factor levels so that months plot in calendar order
CrazyMonth <- factor(CrazyMonth, levels = month.abb)
Majors$Month <- CrazyMonth
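The same conversion can be sketched in base R without plyr, under the assumption that every month value is a whole number from 1 to 12. The numbers below are made up.

```r
# Made-up month numbers for illustration.
month_num <- c(3, 11, 5, 3)

# Indexing month.abb by number gives the abbreviation; fixing the levels
# keeps calendar order on a plot axis instead of alphabetical order.
month_fac <- factor(month.abb[month_num], levels = month.abb)

print(as.character(month_fac))
print(levels(month_fac))
```

Without the explicit levels, a plot would sort the months alphabetically (Apr, Aug, Dec, ...), which hides the seasonal pattern.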

Part 2 Cont.

  • Last: draw the bar plot, with one panel per species
Majors %>% 
  ggplot(aes(x = Month)) +
  geom_bar() +
  facet_wrap(~ Species) +
  theme(axis.text.x=element_text(angle=45,hjust=1))

Here is the graph

What time of year the various Species appear.


Questions From bar plot

  • Which species are present year-round?
      • American Goldfinch
      • Black-capped Chickadee
  • Which species are migratory, that is, primarily present in one or two seasons?
      • Lincoln's Sparrow
      • Slate-colored Junco

Questions cont.

  • What is the peak month for each major species?
      • American Goldfinch: October
      • Black-capped Chickadee: November
      • Field Sparrow: May
      • Lincoln's Sparrow: October
      • Slate-colored Junco: October
      • Tree Swallow: March
  • Which major species are seen in good numbers for at least 6 months of the year?
      • American Goldfinch
      • Black-capped Chickadee
      • Tree Swallow
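The answers above were read off the bar plot. They could also be checked from a table shaped like ByMonth, as in the sketch below. The toy counts are invented, and min_count is a hypothetical definition of "good numbers", so this is one possible approach rather than the method used here.

```r
library(dplyr)

# Hypothetical month-by-month counts shaped like ByMonth (invented numbers).
toy_by_month <- data.frame(
  Species = c("A", "A", "A", "B", "B"),
  Month = c(4, 5, 10, 5, 6),
  count = c(10, 3, 25, 7, 2),
  stringsAsFactors = FALSE
)

# Peak month: the row with the largest count within each species.
peak <- toy_by_month %>%
  group_by(Species) %>%
  slice_max(count, n = 1, with_ties = FALSE) %>%
  ungroup()

# Number of months in which a species reaches an illustrative minimum count.
min_count <- 5  # hypothetical cutoff for "good numbers"
months_seen <- toy_by_month %>%
  filter(count >= min_count) %>%
  count(Species, name = "n_months")

print(peak)
print(months_seen)
```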

Conclusion

Overall, correcting the data lets people know which months are best for visiting Ordway to see particular species, especially the major species. A future step would be to define minor species with their own threshold and then compare the months of the major and minor species.