I would like to hunt next year and would like to know which season will provide me the best chance of success for a certain Unit (Unit 77). How did things go in past years?
setwd("~/_code/colorado-dow/Phase I - Descriptive Analytics")
Load required libraries for wrangling data and charting
library(dplyr,quietly = T)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2, quietly = T)
Prettier chart theme
prettytheme <- theme(
axis.text=element_text(colour="#606060",family="Muli-Regular"),
plot.title=element_text(hjust = 0.5,colour="#333333", family="Muli-Bold"),
panel.grid.major = element_line(colour = "#d8d8d8"),
panel.background = element_rect(fill="#ffffff"),
plot.background = element_rect(fill = "#ffffff"),
axis.title=element_text(colour="#707070", family="Muli-Regular"),
axis.title.x=element_text(vjust=-.3),
legend.text=element_text(color="#333333",family="Muli-Regular"),
legend.background = element_rect(fill='#ffffff'),
legend.direction = "horizontal",
legend.position = "top",
legend.key = element_rect(fill='#ffffff',colour='#ffffff'),
panel.grid.minor= element_blank(),
strip.text = element_text(family="Muli-Regular", colour="#333333"),
strip.background=element_rect(fill="#ffffff", colour="#ffffff")
)
Palette from highcharts
hcpalette <- c('#7cb5ec', '#434348', '#90ed7d', '#f7a35c', '#8085e9', '#f15c80', '#e4d354', '#8085e8', '#8d4653', '#91e8e1')
Run script to get hunt tables
source('~/_code/colorado-dow/datasets/read colorado dow pdf.R', echo=F)
COElkRifleInspect <- COElkRifleAll
First, let's look at the entire state as a whole.
COElkRifleSuccess <- summarise(group_by(COElkRifleInspect,Year),
Success = mean(Success),
Harvest_Effort = mean(Harvest_Effort,na.rm = T))
COElkRifleSuccess
## # A tibble: 12 x 3
## Year Success Harvest_Effort
## <chr> <dbl> <dbl>
## 1 2006 24.8 24.4
## 2 2007 21.5 31.2
## 3 2008 21.4 32.1
## 4 2009 23.3 26.9
## 5 2010 23.3 28.0
## 6 2011 22.1 26.0
## 7 2012 21.9 28.2
## 8 2013 20.7 27.7
## 9 2014 20.6 32.4
## 10 2015 23.1 27.6
## 11 2016 18.9 39.4
## 12 2017 20.1 33.0
ggplot(COElkRifleSuccess, aes(Year,Success)) +
geom_bar(stat="identity") +
prettytheme +
ggtitle("Statewide Rifle Elk Hunting Success")
Overall I should expect a success rate of ~20%, and besides 2016 things appear to be consistent.
FUTURE question, what happened in 2016 that caused success rates to drop statewide?
How about statewide per season?
COElkRifleSuccess1 <- summarise(group_by(COElkRifleInspect,Season),
Success = mean(Success),
Harvest_Effort = mean(Harvest_Effort,na.rm = T))
COElkRifleSuccess1
## # A tibble: 4 x 3
## Season Success Harvest_Effort
## <chr> <dbl> <dbl>
## 1 1 28.9 15.9
## 2 2 19.4 35.5
## 3 3 18.0 39.0
## 4 4 22.0 26.6
ggplot(COElkRifleSuccess1, aes(Season,Success)) +
geom_bar(stat="identity") +
prettytheme +
ggtitle("Statewide Rifle Elk Hunting Success")
Looks like the first season has the best success over all of the years, and success drops through the main seasons before improving a bit for the last season.
Now to our question. What about Unit 77?
COElkRifleSuccess77 <- filter(COElkRifleInspect, Unit == 77) # keep only the Unit 77 rows
ggplot(COElkRifleSuccess77, aes(Season,Success)) +
geom_bar(stat="identity",position = 'dodge') +
scale_fill_manual(values = hcpalette) +
prettytheme +
ggtitle("Unit 77 Rifle Elk Hunting Success\n2006-2017")
Unit 77 follows the same trend from the First through the Third season, but the Fourth season is the best.
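The Unit 77 data holds one row per season per year, and the chart above maps only Season to the x axis. With no grouping aesthetic, the bars for different years can sit on top of one another, so the chart may show something closer to the best year than the typical year. A minimal sketch of averaging per season first, using made-up numbers in a toy data frame (toy77 is a stand-in, not the real hunt table):

```r
library(dplyr)

# Toy data standing in for the Unit 77 rows; the values are invented for illustration
toy77 <- data.frame(
  Year    = rep(c("2016", "2017"), each = 4),
  Season  = rep(c("1", "2", "3", "4"), times = 2),
  Success = c(30, 20, 10, 40, 20, 10, 20, 20)
)

# Collapse to one row per season: mean success across all years
toy77_by_season <- summarise(group_by(toy77, Season),
                             Success = mean(Success))
toy77_by_season
```

Plotting the summarised table instead of the raw rows would give one unambiguous bar per season.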
Let's see if this is always the case.
ggplot(COElkRifleSuccess77, aes(Year,Success,group=Season,fill=Season)) +
geom_bar(stat="identity",position = 'dodge') +
scale_fill_manual(values = hcpalette) +
prettytheme +
ggtitle("Unit 77 Rifle Elk Hunting Success by Year")
Recently, the first and fourth seasons have had the highest success. Last year the Fourth Season had the best success rate.
Conversely, the Third season has had the highest success only once, in 2009.
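Reading "which season won each year" off a dodged bar chart is error-prone, so the same comparison can be tabulated. A rough sketch, again on invented toy data rather than the real Unit 77 table:

```r
library(dplyr)

# Toy data; the values are invented for illustration only
toy <- data.frame(
  Year    = rep(c("2015", "2016", "2017"), each = 4),
  Season  = rep(c("1", "2", "3", "4"), times = 3),
  Success = c(30, 20, 10, 25,  15, 10, 12, 28,  22, 18, 9, 35)
)

# Keep the best season(s) within each year (ties keep every tied season)
best_by_year <- filter(group_by(toy, Year), Success == max(Success))

# Count how many years each season came out on top
wins <- count(ungroup(best_by_year), Season, name = "Years_Best")
wins
```

Seasons that never had the top success rate in any year simply do not appear in the result.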
At this point I believe my choice is between First and Fourth season. One point to consider is the type of weather I enjoy hunting in. Subjectively, early October is generally much nicer than late November.
FUTURE question, what weather data can we attach to the hunt units in past years?
The hunt tables provide more data as well. Maybe I want to avoid the busy seasons with lots of hunters. Maybe I want to be sure I have access to this unit for all seasons regulated by preference points from CPW.
FUTURE question, do I have access to preference points required for hunting in certain units per season?
Let's look at how many hunters are in each of these seasons.
ggplot(COElkRifleSuccess77, aes(Season,Hunters)) +
geom_bar(stat="identity",position = 'dodge') +
scale_fill_manual(values = hcpalette) +
prettytheme +
ggtitle("Unit 77 Number of Rifle Hunters")
The fourth and first seasons definitely have fewer hunters, but they are also shorter seasons, so the raw count is not a true measure of density (how busy the unit is). I will need to populate the table with Season Duration somehow.
FUTURE add Season Durations
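Once season durations are available, one simple normalisation is hunters per season day. The sketch below assumes a small lookup table joined on Season. The Days values and hunter counts are placeholders that I made up, not CPW figures, and season_days, toy_hunters and Hunters_Per_Day are hypothetical names:

```r
library(dplyr)

# Placeholder season lengths in days; replace with the real CPW season dates
season_days <- data.frame(
  Season = c("1", "2", "3", "4"),
  Days   = c(5, 9, 9, 5)
)

# Toy hunter counts; invented for illustration
toy_hunters <- data.frame(
  Season  = c("1", "2", "3", "4"),
  Hunters = c(100, 450, 360, 50)
)

# Attach the duration to each season, then normalise the hunter count by it
hunter_density <- mutate(left_join(toy_hunters, season_days, by = "Season"),
                         Hunters_Per_Day = Hunters / Days)
hunter_density
```

This is only a rough proxy for crowding, since it assumes hunters are spread evenly across the season.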
Success from the CPW tables is merely Harvest / Hunters. But a truer measure might have to do with effort. Recreation Days is an estimate from CPW on how many hunter days were put in the field…regardless of how long they were out there.
Maybe a good measure would be how much effort it takes to have a successful result, or Rec Days / Harvest… how much effort to harvest
# Adding this field to the data acquisition script
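As a sketch of the calculation described above, effort per harvest is recreation days divided by harvest. The column names Rec_Days and Harvest and all values here are assumptions for illustration, since the real names live in the data acquisition script. Treating a zero harvest as missing is also my assumption, though it would be consistent with the na.rm = T used in the summaries:

```r
library(dplyr)

# Toy rows; the column names and values are assumptions, not the real hunt table
toy_hunt <- data.frame(
  Season   = c("1", "2", "3", "4"),
  Harvest  = c(20, 30, 0, 10),
  Rec_Days = c(300, 1200, 500, 250)
)

# Days in the field per animal harvested; NA where nothing was harvested
toy_hunt <- mutate(toy_hunt,
                   Harvest_Effort = ifelse(Harvest > 0, Rec_Days / Harvest, NA_real_))
toy_hunt
```

A lower Harvest_Effort means fewer hunter days were spent per elk taken.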
Again, let's start with a statewide summary to view expected results.
ggplot(COElkRifleSuccess, aes(Year,Harvest_Effort)) +
geom_bar(stat="identity") +
prettytheme +
ggtitle("Statewide Rifle Elk Hunting Effort Required")
Not sure if this provides any important info towards my question. I do note that 2016 required an unusually large amount of extra effort.
Unit 77 by season for all years can tell me overall the differences in seasonal effort
ggplot(COElkRifleSuccess77, aes(Season,Harvest_Effort)) +
geom_bar(stat="identity",position = 'dodge') +
scale_fill_manual(values = hcpalette) +
prettytheme +
ggtitle("Unit 77 Rifle Elk Hunting Effort Required\n2006-2017")
## Warning: Removed 1 rows containing missing values (geom_bar).
In Unit 77, it is clear that the first season usually takes the fewest hunting days per harvest.
ggplot(COElkRifleSuccess77, aes(Year,Harvest_Effort,group=Season,fill=Season)) +
geom_bar(stat="identity",position = 'dodge') +
scale_fill_manual(values = hcpalette) +
prettytheme +
ggtitle("Unit 77 Rifle Elk Hunting Effort Required")
## Warning: Removed 1 rows containing missing values (geom_bar).
In recent years the trend is similar. The first season has been the easiest, followed by the fourth. Last year the fourth season required the least effort.
Using the data from these tables it appears that the first season should be my first preference for my 2018 hunt in Unit 77.
However, I have thought of additional questions or pieces to investigate.
Why does the first season have the highest success rate? Why is the required effort to be successful lower in the first season?