I first became interested in studying animal shelters last year, when my community learned of some devastating news: after receiving some anonymous tips about sub-par conditions, the Guilford County Sheriff’s office decided to visit the Guilford County Animal Shelter to see if the accusations had merit. Unfortunately, they did. Dozens of cases of animal abuse, neglect, and inhumane euthanasia were found at the shelter, which was run by a private company on behalf of the county. The private company was disbanded, criminal charges were filed, and the county has begun to self-manage the shelter. The abrupt change in management has been challenging for the county, and has and brought some very real concerns about animal safety and adoption to light. Although the private company may have kept records such as Austin’s in the past that could be used by Guilford County to improve decision making going forward, the data would be untrustworthy. The claims of animal neglect stem from allegations that records were mishandled and suffering animals were not euthanized when necessary so the private company could boast a low euthanization and high adoption rates. The general problem the county faces right now is that they just don’t have any reliable information to guide their decision making.
I plan to study Austin’s data to glean insights about general conceptions people have regarding animal shelters, described below:
My interest has been peaked and I want to learn what I can from the Austin data to see how I can help my community better understand animal intake and adoption rates and trends. Finding answers to these questions will help me better understand the issues encountered by my local animal shelter. I hope to pass on this information to the Chairman of the County Commissioners, who is currently overseeing the efforts of the county to take over the management of the shelter. Hopefully, information I discover from this data set will help me make a difference in my community.
The data sets were both found on the Data.gov website.
#Packages necessary for data analyses#
library(tidyverse)
library(lubridate)
library(DT)
library(ggplot2)
library(knitr)
library(plotly)
Before merging my two data sets, many variables needed to be given distinctive names, as they share the same name within both data sets, though they held different meaning. For example, the variable “DateTime” was in both of the data sets, but referred to the Intake Date in the intake data while the “DateTime” variable in the outcomes data referred to the Outcome Date. I arranged the data by Animal ID to prepare for the merger.
#Reading in "intakes" data#
intakes <- read_csv("Austin_Animal_Center_Intakes.csv") %>%
rename(`Intake Date` = DateTime) %>%
rename(`Intake Breed` = Breed) %>%
rename(`Intake Color` = Color) %>%
rename(`Intake Animal Type` = `Animal Type`) %>%
mutate(`Intake Date` = as.Date(`Intake Date`, format = "%m/%d/%Y")) %>%
select(-`Found Location`, -MonthYear) %>%
group_by(`Animal ID`) %>%
arrange(`Intake Date`) %>%
mutate(Visit = 1:n()) %>%
arrange(`Animal ID`)
datatable(head(intakes), options = list(dom = 't'))
For the outcomes data, my cleaning steps were very similar to the intakes data except for one key piece of information: Outcome Type. My researched showed there were 9 animals in the system that did not have an Outcome Type listed. I found that these 9 animals had Outcome Dates of within a day or two of the date of the data set. I made the assumption that these animals, although they left the shelter in some way, had not had their outcome recorded. For analysis purposes, I filtered out these 9 cases.
#Reading in "outcomes" data#
outcomes <- read_csv("Austin_Animal_Center_Outcomes.csv") %>%
rename(`Outcome Date` = DateTime) %>%
rename(`Outcome Animal Type` = `Animal Type`) %>%
rename(`Outcome Breed` = `Breed`) %>%
rename(`Outcome Color` = `Color`) %>%
mutate(`Outcome Date` = as.Date(`Outcome Date`, format = "%m/%d/%Y")) %>%
mutate(`Date of Birth` = as.Date(`Date of Birth`, format = "%m/%d/%Y")) %>%
select( -MonthYear, -Name) %>%
group_by(`Animal ID`) %>%
arrange(`Outcome Date`) %>%
mutate(Visit = 1:n()) %>%
arrange(`Animal ID`) %>%
filter(!is.na(`Outcome Type`))
datatable(head(outcomes), options = list(dom = 't'))
When I first merged my data into allanimals, I created a new variable, Days at Shelter by subtracting Intake Date from Outcome Date. I ran a sum() function to see if there were any Days at Shelter values that were less than zero. The result came back that I had 121 negative Days at Shelter ranging from -1 to -1337. After reviewing my code, I made the assumption that these dates were inputted incorrectly in the system at the Austin Animal Shelter (since an animal can’t leave the shelter before they have arrived at the shelter), and I filtered out these rows. That left me with 68,239 animal visits to analyze in order to answer my six questions.
#Merging intakes and outcomes#
allanimals <- merge(intakes,outcomes, by=c("Animal ID","Visit")) %>%
mutate(`Days at Shelter` = as.numeric(`Outcome Date` - `Intake Date`)) %>%
filter(`Days at Shelter` >= 0)
allanimals$`Days at Shelter`[allanimals$`Days at Shelter` == 0] <- 1
datatable(head(allanimals), options = list(dom = 't'))
| Variable | Character Type | Variable Description |
|---|---|---|
Animal ID |
Character | Unique identification number for each animal |
Vist |
Integer | Visit order within the time period of the data |
Name |
Character | Animal name, if known |
Intake Date |
Date | Date animal arrived at shelter |
Intake Type |
Character | Why animal is at the shelter |
Intake Condition |
Character | Health of animal at time of arrival |
Intake Animal Type |
Character | General animal type |
Sex Upon Intake |
Character | Gender of animal at time of arrival |
Age Upon Intake |
Character | Estimated age of animal at time of arrival |
Intake Breed |
Character | Specific breed of animal type |
Intake Color |
Character | Identifiable coloring/markings of the animal |
Outcome Date |
Date | Date animal left the shelter |
Date of Birth |
Date | The known or estimated date of birth |
Outcome Type |
Character | Why animal left the shelter |
Outcome Subtype |
Character | Reasoning behind Outcome Type |
Outcome Animal Type |
Character | General animal type |
Sex Upon Outcome |
Character | Gender of animal at time of departure |
Age Upon Outcome |
Character | Estimated age of animal at time of departure |
Outcome Breed |
Character | Specific breed of animal type |
Outcome Color |
Character | Identifiable coloring/markings of the animal |
Days at Shelter |
Numeric | How long the animal stayed at the shelter between intake and outcome |
To get a general idea of the counts within each Outcome Type, I did a simple table function to check quantities.
#Table of `Outcome Type` counts#
table(allanimals$`Outcome Type`)
##
## Adoption Died Disposal Euthanasia
## 28099 607 279 5579
## Missing Relocate Return to Owner Rto-Adopt
## 42 14 12639 74
## Transfer
## 20906
This table gives us a good overview of the Outcome Types. We see that the majority of animals are in fact adopted, and the amount of adoptions is over five times the amount of euthanizations. When exploring the data on Data.gov,, I found that Austin considers itself to be one of the best “No Kill” shelters in the country, boasting that over 90% of all animals are adopted, transferred, etc. to prevent them from being euthanized. A quick mathematical calculation proves that for this data set, less than 10% of animals are euthanized.
#Euthanizations dividied by all animal outcomes#
5579/68239
## [1] 0.08175677
To take a deeper look at this data, I wanted to see the variation in Outcome Types across the different Outcome Animal Types, so I created the two graphs below. Note: Hover cursor over data points to get more information.
#Stacked bar chart with `Outcome Animal Type` in the x-axis#
Q1_G1 <- allanimals %>%
ggplot() +
geom_bar(mapping = aes(x = `Outcome Animal Type`, fill = `Outcome Type`), position = "fill") +
labs(title = "Proportion of Outcome Type by Animal Type") +
scale_fill_discrete(name = "Outcomes")
ggplotly(Q1_G1)
In this first graph, we see that nearly 45% of all dogs are adopted, and only about 3.68% are euthanized. For cats, we see that over 41% of them are adopted and 5.2% are euthanized. The largest portion of euthanizations comes from the bird and “Other” categories, which we know are mostly composed of wild animals, but we will research that further in Question 4.
#Stacked bar chart with `Outcome Type` in the x-axis#
Q1_G2 <- allanimals %>%
ggplot() +
geom_bar(mapping = aes(x = `Outcome Type`, fill = `Outcome Animal Type`), position = "fill") +
scale_x_discrete(labels = c("Adoption" = "Adpt", "Died" = "Died", "Disposal" = "Dispo", "Euthanasia" = "Euthn", "Missing" = "Miss", "Relocate" = "Relo", "Return to Owner" = "RtOwn", "Rto-Adopt" = "RtO-Adpt", "Transfer" = "Trnsfr")) +
labs(title = "Propotion of Animal Type by Outcome Type") +
scale_fill_discrete(name = "Animal")
ggplotly(Q1_G2)
The second graph shows the same data from a different perspective. We can see that cats and dogs make up nearly 99% of adoptions, which mean they will be the key revenue drivers for the shelter, and revenue made from adoption fees can be used to offset the expenses they incur while at the shelter. Nearly half of all euthanizations are from animals in the “Other” category, which we will study more in Question 4. A surprising discovery in this graph is the transfer data, which shows that over 57% of animals that are transferred are cats, which we will research in Question 3.
An interesting variable in this set is titled “Missing”. The count that I ran earlier in this question showed that there are 42 animals that have been logged as “Missing” nearly for years of data collection. I have been unable to determine the meaning of “Missing” in the information provided by the Austin Animal Center. I can infer that these animals went missing while in the care of the center, or they were logged out of the shelter incorrectly and thus were labeled “Missing”.
While merging my data, I created the variable Days at Shelter to track the duration of an animal’s stay. I wanted to find out if dogs typically stay at the shelter longer than all animal types, so I conducted the hypothesis test below.
## [1] 15.93719
##
## One Sample t-test
##
## data: dog_days$`Days at Shelter`
## t = -4.0843, df = 38617, p-value = 2.215e-05
## alternative hypothesis: true mean is less than 15.7779
## 95 percent confidence interval:
## -Inf 15.32614
## sample estimates:
## mean of x
## 15.02152
##
## One Sample t-test
##
## data: log2(as.numeric(dog_days$`Days at Shelter`))
## t = -159.9, df = 38617, p-value < 2.2e-16
## alternative hypothesis: true mean is less than 3.979833
## 95 percent confidence interval:
## -Inf 2.474228
## sample estimates:
## mean of x
## 2.458579
##
## Wilcoxon signed rank test with continuity correction
##
## data: dog_days$`Days at Shelter`
## V = 179410000, p-value < 2.2e-16
## alternative hypothesis: true location is less than 15.7779
Because the p value for this test is 0.00000148, which is less than the standard significance level of alpha = 0.05, we can reject the null hypothesis and conclude that on average, dogs typically do stay at the shelter for more days than the population of all animal types. Below I created a violin graph to show the duration of stay based on animal type. Note: Hover cursor over data points to get more information.
#Graphically displaying `Days at Shelter`#
violin <- ggplot(allanimals, aes(`Outcome Animal Type`, `Days at Shelter`)) +
geom_violin() +
labs(title="Days at Shelter from Intake to Outcome",
subtitle="Grouped by Animal Type",
x= "Animal Type",
y= "Days at Shelter") +
coord_flip()
ggplotly(violin)
This graph is clearly right skewed for every animal type. We see that each animal type has the highest concentration of stays that last less than 60 days. Animals that are at the shelter longer than 60 days are generally considered to be outliers.
To compare the age of an animal and the time they spend at the shelter, I did a correlation analysis by creating a new variable, Age Upon Intake (years). I then compared that variable to the Days at Shelter, and grouped the data by Outcome Type to see if an animal’s age has an impact on how long they stay at the shelter before they are adopted.
#Correlation#
animals_age_cor <- allanimals %>%
mutate(`Age Upon Intake (years)` = round((`Intake Date` - `Date of Birth`)/365),
`Age Upon Intake (years)` = as.numeric(`Age Upon Intake (years)`),
`Days at Shelter` = as.numeric(`Days at Shelter`)) %>%
group_by(`Outcome Type`) %>%
summarize(r = round(cor(`Days at Shelter`, `Age Upon Intake (years)`), 2),
p_value = cor.test(`Days at Shelter`, `Age Upon Intake (years)`) $p.value)
datatable(animals_age_cor, caption = "Correlation Between `Days at Shelter` and `Age Upon Intake`")
From this table we can see that the weakest correlation between Age Upon Intake (years) and Days at Shelter is in the Adoption category, which debunks the common myth that when older animals come to the shelter they are likely to be there longer, as adoptive pet parents will prefer a younger animal. This information proves that when it comes to Adoptions, the age of the animal and the amount of time that they been at the shelter are random variables that have very little affect on the adoption. The strongest correlation can be found in the Transfer category. A likely cause of this could be that when older animals have been at the shelter for a long period of time, they are transferred to another shelter or rescue group that specializes in the care of older animals.
#Chart to track age of animal type, days at shelter, and outcome type#
both_age <- filter(allanimals, `Outcome Animal Type` %in% c("Dog", "Cat")) %>%
mutate(`Age Upon Intake (years)` = round((`Intake Date` - `Date of Birth`)/365)) %>%
group_by(`Outcome Type`, `Outcome Animal Type`) %>%
summarize(
count = n(),
`Shelter Days` = round(mean(`Days at Shelter`, na.rm = TRUE)),
`Age (years)` = round(mean(`Age Upon Intake (years)`, na.rm = TRUE))
) %>%
arrange(`Outcome Type`, `Outcome Animal Type`)
ggplot(data = both_age) +
geom_point(mapping = aes(x = `Age (years)`, y = `Shelter Days`, color = `Outcome Type`, shape = `Outcome Animal Type`, size = 1)) +
guides(size = FALSE) +
labs(title = "Comparison of Cat and Dog Age, Shelter Days, and Outcome Type")
In the graph above, we can see that cats and dogs have different ages and days at shelter based on outcome types. For example, if we look at the position of the two red points, we see that cats are typically a year younger than dogs when they are adopted. However, cats typically have to wait at the shelter for 36 days before they are adopted, whereas dogs typically only wait 24 days before they are adopted. This information can be vital to the shelter as they seek to budget expenses for each animal based on average length of stay. It is also helpful to know that animals that are ultimately returned to their owners stay at the shelter for about 4 days on average. Another item to note is that the average days that a dog is at the shelter is a very low value, about 7 days, and they were 3 years old on average. We can deduce that these animals were euthanized for reasons other than their age and their length of stay, which supports the Austin Animal Center’s claim that they are the best “No Kill” shelter in the country.
To answer this question I decided to study the “Other” variable category for Outcome Animal Type to see what kind of animals were included in this set. I filtered out all other Outcome Animal Types so that only the “Other” value was listed in the data. Then, I counted the unique values in the Outcome Breed variable, which gives more detail on the animal type.
#How many types of breed are there?#
animal_type2 <- allanimals %>%
filter(`Outcome Animal Type` == "Other") %>%
group_by(`Outcome Breed`) %>%
arrange(`Outcome Breed`)
unique(animal_type2$`Outcome Breed`)
## [1] "American Mix" "American Sable" "American Sable Mix"
## [4] "Angora-English Mix" "Angora-French Mix" "Armadillo"
## [7] "Armadillo Mix" "Bat" "Bat Mix"
## [10] "Bobcat Mix" "Californian" "Californian Mix"
## [13] "Checkered Giant Mix" "Chinchilla-Amer Mix" "Chinchilla-Stnd Mix"
## [16] "Chinchilla Mix" "Cinnamon" "Cinnamon Mix"
## [19] "Cold Water" "Cottontail Mix" "Coyote"
## [22] "Coyote Mix" "Dutch" "Dutch Mix"
## [25] "Dutch/Rabbit Sh" "English Spot" "English Spot Mix"
## [28] "Ferret" "Ferret Mix" "Fox"
## [31] "Fox Mix" "Frog" "Gerbil Mix"
## [34] "Guinea Pig" "Guinea Pig Mix" "Hamster"
## [37] "Hamster Mix" "Harlequin Mix" "Havana Mix"
## [40] "Hedgehog" "Hotot Mix" "Jersey Wooly Mix"
## [43] "Lionhead" "Lionhead Mix" "Lizard"
## [46] "Lizard Mix" "Lop-Amer Fuzzy Mix" "Lop-English Mix"
## [49] "Lop-Holland" "Lop-Holland Mix" "Lop-Mini"
## [52] "Lop-Mini Mix" "Mouse Mix" "Netherlnd Dwarf"
## [55] "Netherlnd Dwarf Mix" "New Zealand Wht Mix" "Opossum"
## [58] "Opossum Mix" "Otter Mix" "Polish Mix"
## [61] "Rabbit Lh" "Rabbit Lh Mix" "Rabbit Sh"
## [64] "Rabbit Sh Mix" "Rabbit Sh/Lop-Mini" "Raccoon"
## [67] "Raccoon Mix" "Rat" "Rat Mix"
## [70] "Rex" "Rex-Mini" "Rex-Mini/Lop-English"
## [73] "Rex Mix" "Ringtail Mix" "Silver Mix"
## [76] "Skunk" "Skunk Mix" "Snake"
## [79] "Snake Mix" "Snake/Snake" "Squirrel"
## [82] "Squirrel Mix" "Sugar Glider" "Tarantula"
## [85] "Tortoise" "Tortoise Mix" "Turtle"
## [88] "Turtle Mix"
88 categorical variables seemed a bit too broad. After reviewing the breed names and researching their animal type, I was able to group the 88 breeds into a new variable called Species. Note: Hover cursor over data points to get more information.
#Subsetting "Breed" into "Species"#
animal_species <- list()
animal_species[["Rabt"]] <- c("American Mix", "American Sable", "American Sable Mix",
"Angora-English Mix", "Angora-French Mix", "Californian",
"Californian Mix", "Checkered Giant Mix", "Cinnamon", "Cinnamon Mix",
"Cottontail Mix", "Dutch", "Dutch Mix", "Dutch/Rabbit Sh",
"English Spot", "English Spot Mix", "Harlequin Mix",
"Havana Mix", "Hotot Mix", "Jersey Wooly Mix", "Lionhead",
"Lionhead Mix", "Lop-Amer Fuzzy Mix", "Lop-English Mix",
"Lop-Holland", "Lop-Holland Mix", "Lop-Mini", "Lop-Mini Mix",
"Mouse Mix", "Netherlnd Dwarf", "Netherlnd Dwarf Mix",
"New Zealand Wht Mix", "Polish Mix", "Rabbit Lh", "Rabbit Lh Mix",
"Rabbit Sh", "Rabbit Sh/Lop-Mini", "Rabbit Sh Mix", "Rex", "Rex-Mini",
"Rex-Mini/Lop-English", "Rex Mix", "Silver Mix")
animal_species[["Bat"]] <- c("Bat", "Bat Mix")
animal_species[["Ferret"]] <- c("Ferret", "Ferret Mix")
animal_species[["Fox"]] <- c("Fox", "Fox Mix")
animal_species[["GinP"]] <- c("Guinea Pig", "Guinea Pig Mix")
animal_species[["Hmst"]] <- c("Hamster", "Hamster Mix")
animal_species[["Opsm"]] <- c("Opossum", "Opossum Mix")
animal_species[["Racn"]] <- c("Raccoon", "Raccoon Mix")
animal_species[["Rat"]] <- c("Rat", "Rat Mix")
animal_species[["Sknk"]] <- c("Skunk", "Skunk Mix")
animal_species[["Snke"]] <- c("Snake", "Snake Mix", "Snake/Snake")
animal_species[["Squirl"]] <- c("Squirrel", "Squirrel Mix")
animal_species[["Trtl"]] <- c("Tortoise", "Tortoise Mix", "Turtle", "Turtle Mix")
animal_species[["Othr"]] <- c("Armadillo", "Armadillo Mix","Bobcat Mix","Chinchilla-Amer Mix", "Chinchilla-Stnd Mix",
"Chinchilla Mix", "Cold Water", "Coyote", "Coyote Mix", "Frog", "Gerbil Mix", "Hedgehog",
"Lizard", "Lizard Mix", "Otter Mix", "Ringtail Mix", "Sugar Glider", "Tarantula")
animal_type2$Species <- NA
for(ob in 1:nrow(animal_type2)) {
for(ob2 in names(animal_species)) {
#print(animal_species[[ob2]])
#print(animal_type2[ob,]$`Outcome Breed`)
if(animal_type2[ob,]$`Outcome Breed` %in% animal_species[[ob2]]) {
animal_type2[ob,]$Species <- ob2
}
}
}
species_chart <- ggplot(data = animal_type2) +
geom_bar(mapping = aes(x = `Species`, fill = `Outcome Type`)) +
labs(title = "Outcome Type by Species")
ggplotly(species_chart)
In the graphs created for Question 1 I found that the “Other” animal type category had the highest instance of euthanization. In the graph above, we can see that the main cause of the high euthanasia rates in the category can be attributed to more wild species, such as bats, raccoon, opossums, skunks, squirrels, and foxes. The high level of bats that come to the Austin Animal Center can be expected due to the large bat population (estimated at around 750,000) that the city is known for. We can assume that the Guilford County Animal Shelter will not have nearly as many bats coming through their doors each day.
The highest adoption counts in this category can be attributed to the different species of rabbit, but there are only a total of 109 rabbits recorded as being adopted in the nearly four year span of this data. Additionally, the table below shows that rabbits typically stay at the shelter for three weeks before their outcome.
#How many days are these other animals at the shelter before they leave?#
species_days <- animal_type2 %>%
group_by(Species) %>%
summarize(
count = n(),
`Days Before Outcome` = round(mean(`Days at Shelter`, na.rm = TRUE))) %>%
arrange(desc(`Days Before Outcome`))
datatable(species_days, options = list(pageLength = 14), caption = "Days at Shelter")
The costs associated with housing and feeding such a small number of rabbits for three weeks seems like it would be a constraint on money and resources for the shelter, with little return. I would recommend that any potentially adoptable animals be transferred to a specialty adoption organization for that specific species, be it a rabbit, a snake, or a ferret, so that the shelter does not have to use resources for these small quantities of animals. The the specialty adoption organizations will be more equipped to provide food and shelter to the animals while they use their specific resources to locate homes for the animals.
I wanted to explore whether the often-heard stigma about black colored animals was actually true. Do potential adoptive parents steer clear of black animals because of the their so-called “bad luck”? To get an initial view of the data, I pulled out only observations where the animal was either a cat or dog, since these are the most popular animals and more likely to carry this stigma, and then also filtered for any solid color animals of Black, Gray, Brown, Tan, Yellow, and White. I decided that these solid colors would be the best for me to use to compare to Black. I then created the table below to display the percentage dispersion of Outcome Type given an animal’s coloring.
#Color data for table analysis#
animals_color <- allanimals %>%
filter(`Outcome Color` %in% c("Black", "Gray", "Brown", "Tan", "Yellow", "White")) %>%
filter(`Outcome Animal Type` %in% c("Dog", "Cat")) %>%
arrange(`Outcome Color`) %>%
group_by(`Outcome Color`)
ccc <- animals_color %>%
count(`Outcome Type`, `Outcome Color`) %>%
mutate(perc = round(prop.table(n)*100, 2)) %>%
select(-n) %>%
spread(`Outcome Type`, perc)
datatable(ccc, caption = "Outcome Type by Animal Color (%)")
The first two cell values of this table disprove the black animals theory. We see that over 40% of black cats and dogs that left the shelter were adopted, nearly the exact same proportion as brown cats and dogs. In fact, black cats and dogs are adopted 40.41% of the time, which barely makes the color black the second most popular color for cat and dog adoptions, with tan being the most popular color at 45.40%. Looking at euthanasia, we see that once again, the two values are nearly identical, with brown having slightly higher values. The highest rate for euthanasia is Gray, which is the same color that has the lowest proportional adoption rate.
To analyze the counts of adopted animals by color category, I created the heat chart below. The lighter the shade of blue, the higher the concentration of animals had that outcome and coloring. Blank cells denote outcome and coloring combinations where values were zero or close to zero. Note: Hover cursor over data points to get more information.
#Heat map#
color <- animals_color %>%
group_by(`Outcome Color`, `Outcome Type`) %>%
summarize(count = n())
heat <- ggplot(color, aes(`Outcome Type`, `Outcome Color`)) +
geom_raster(aes(fill = count)) +
labs(title = "Heat Map of Animal Outcomes by Color") +
scale_fill_continuous(name = "Count") +
scale_x_discrete(labels = c("Adoption", "Died", "Disposal", "Euthanasia", "Missing", "Rtn to Own", "RtO Adpt", "Transfer"))
ggplotly(heat)
This graph reiterates the information in the table above, with black colored cats and dogs having the highest count of adoptions across all color combinations. Over 2,200 black cats and dogs have been adopted from the Austin Animal Center between October 2013 and July 2017.
To get a sequential view of the data, I filtered out the dates in late 2013 and early 2017 so that I would have three full years of data: 2014, 2015, 2016. Using the cleaned data, I created the animated chart below. Note: Click “Play” at the bottom of the graph to see the year-by-year analysis.
#Data and graph for Adoption analysis#
startdate <- as.Date("2014-01-01")
enddate <- as.Date("2016-12-31")
animal_time <- filter(allanimals, `Outcome Date`> startdate & `Outcome Date`< enddate & `Outcome Type` == "Adoption") %>%
mutate(month = format(`Outcome Date`, "%m")) %>%
mutate(year = format(`Outcome Date`, "%Y")) %>%
filter(`Outcome Animal Type`%in% c("Dog", "Cat")) %>%
group_by(year, month, `Outcome Animal Type`) %>%
summarize(count = n())
warning <- getOption("warn")
options(warn = -1)
month_chart <- ggplot(animal_time, aes(month, count, group = `Outcome Animal Type`, color = `Outcome Animal Type`)) +
geom_line(aes(frame = year)) +
geom_point(aes(frame = year)) +
labs(title = "Adoptions Per Month by Animal Type",
x = "Month",
y = "Number of Adoptions")
ggplotly(month_chart)
invisible(options(warn = warning))
To get a tabular view of this information, I created the table below. Contrary to the Guilford County Animal Shelter’s comments, it appears that cats have the highest average adoptions in the months of June, July, and August. As one might assume, dogs have a high average adoption count in December, but also May, July, and August. With both cats and dogs, the month of August appears to have average adoptions in the top quartile of months. This means that either a) the Guilford County Animal Shelter does not have sufficient data to make the claim that August is a low month for adoptions, or b) There is an external factor causing Austin to have higher adoption rates in these months, such as additional marketing campaigns or successful adoption fairs. Asking the questions about these external factors could also help clear up the year-by-year changes in the graph, such as the leap in adoptions of cats in July between 2014 and 2015, or the drop in dog adoptions between 2014 and 2015 followed by an increase between 2015 and 2016. I recommend that further study take place to find out if August truly is a low adoption rate month for the Guilford County Animal Shelter, and if so, what efforts does Austin undertake to make the month successful?
#table#
time_table <- animal_time %>%
group_by(month, `Outcome Animal Type`) %>%
summarize(`Average Adoptions` = round(mean(count, na.rm = TRUE))) %>%
spread(key = `Outcome Animal Type`, value = `Average Adoptions`)
#melt(month = c("Outcome Animal Type"))
#cast(month ~ `Outcome Animal Type`)
datatable(time_table, options = list(pageLength = 12), rownames = FALSE, colnames = c('Month', 'Avg Cat Adoptions', 'Avg Dog Adoptions'), caption = "Average Adoptions Per Month 2014-2016")
The data showed that nearly 99% of all adoptions are cats and dogs, so other animal types like birds and livestock make up about 1% of adoptions. Almost half of all euthanizations are animals from the “Other” category. Research on the Austin Animal Center showed that it is the top “No kill” shelter in the country. Guilford Country should examine their records to see what percentage of animals that are not cats and dogs are treated at the shelter, and should determine if the resources necessary to house and feed these animals could be spent elsewhere if the animals were transferred to specialty organizations.
A hypothesis test determined that on average, dogs do in fact stay at the shelter longer than than the general population average of all animal types. However, the violin graph showed that there are actually some more extreme outliers in the cat category, with a few cats staying in the shelter for over 1,000 days. My recommendation to the Guilford Country Animal Shelter would be to make sure no animal stays in the shelter for that length of time. Animals that have been at the shelter longer than a year should be tracked so they can be given special marketing promotion or transferred to an organization that can give them specialized attention so they can be adopted at a faster rate.
To test the common conception that older animals are less desirable than younger animals to potential adoptive pet parents, I ran a correlation analysis and found that the weakest correlation between the age of the animal and the days they spend at the shelter is in the adoption category. A dot graph proved that there is a vast amount of data to be studied that may surprise readers, such as the fact that cats are adopted at an average age younger than dogs, but cats tend to stay at the shelter for more days than dogs on average. Any information gleaned from this graph could be useful to the Guilford County Animal Shelter as they prepare their annual budget and attempt to estimate the average time each animal will be at the shelter to extrapolate their annual expenditures.
Building on the information I learned in Question 1, I dove deeper into the “Other” animal category and found that a large proportion of this category can be attributed to bats that are very common in the Austin area. The bats make up a large proportion of the euthanizations at the shelter, so they are actually inflating the shelter’s overall euthanization percentage. Because bats are not nearly as populous in Guilford Country as they are in Austin, the County does not need to worry about this inflated number. However, they should still research the costs associated with housing the other species with more rare occurrences, such as gerbils, snakes, rabbits, etc. that are considered adoptable animals that may stay at the shelter longer waiting for the perfect adoptive parent to walk through the door. A specialty organization would be better equipped to handle the housing of these animals, as well as the marketing efforts needed to find them a forever home.
I found that the commonly held myth that black animals are less desirable than other colors is actually proved false in the Austin Animal Center Data. Gray animals are in fact the least desirable in terms of adoption percentage, and they are euthanized the most out of the other four color categories. The Guilford County Animal Shelter should study it’s own records to track the coloring of animals that are adopted to confirm this analysis. If they confirm that gray animals are less desirable than black animals, perhaps their adoption fee waiving offer for black cats and dogs on a Friday the 13th should instead be switched to a full moon gray animal adoption fee waiving offer.
Lastly, I wanted to study if adoptions were seasonal after recently discovering that Guilford County is waiving adoption fees in the month of August, saying that it is a slow month for the shelter. My analysis proved that in Austin, the summer months are historically the most popular months for adoptions, especially July. My recommendation to Guilford County would be to study their adoption data to ensure that August is a slow month for adoptions, and if so, find out if their are any external factors that drive Austin’s adoption rate upwards in the summer months, such as a large adoption fair or additional spending on marketing.
In conclusion, if this report helps even one animal find a new forever home, then a month of working on this report will be worth every second. I have poured over this data with constant thoughts my love for my community, and I know that this research can be used as a spring board for better policy making, strategic decision making, and marketing expenditures. I plan to share this information with as many government officials as possible to see how this information can help the Guilford County Animal Shelter as it navigates this new management transition.