Analysis of the Austin Animal Center

Introduction

The Problem and Why I Care

I first became interested in studying animal shelters last year, when my community learned of some devastating news: after receiving some anonymous tips about sub-par conditions, the Guilford County Sheriff’s office decided to visit the Guilford County Animal Shelter to see if the accusations had merit. Unfortunately, they did. Dozens of cases of animal abuse, neglect, and inhumane euthanasia were found at the shelter, which was run by a private company on behalf of the county. The private company was disbanded, criminal charges were filed, and the county has begun to self-manage the shelter. The abrupt change in management has been challenging for the county, and has and brought some very real concerns about animal safety and adoption to light. Although the private company may have kept records such as Austin’s in the past that could be used by Guilford County to improve decision making going forward, the data would be untrustworthy. The claims of animal neglect stem from allegations that records were mishandled and suffering animals were not euthanized when necessary so the private company could boast a low euthanization and high adoption rates. The general problem the county faces right now is that they just don’t have any reliable information to guide their decision making.


Solving The Problem

I plan to study Austin’s data to glean insights about general conceptions people have regarding animal shelters, described below:

  • Question 1: How many animals that come through the shelter get adopted? Get transferred? Get euthanized?
    • Methodology: Study the counts of the different outcomes in the Output data.
    • Why it’s important: Knowing this statistic would be beneficial for the county to know so they can project their expenses. If the survival rate (the rate of any outcome besides euthanasia) is high, that should be published for public relations purposes.
    • Packages Required: tidyverse to read in the data, ggplot to create a histograms.
  • Question 2: How long do animals stay in the shelter?
    • Methodology: This would require tracking the Animal ID variable to compare Intake Date in the Intake data with the Outcome Date and Outcome Type from the Outcome data. I could then use my results to sort by Outcome Type and compare my results to what I calculate in my first bullet.
    • Why it’s important: If If the average time an animal stays at the shelter is high, perhaps the shelter needs to push their marketing efforts. Dog customers know where they are located? Are the adoption fees comparable to other counties? Should more pictures of animals be put on social media?
    • Packages Required: tidyverse to read in the data, rename() and mutate() to prepare for merging, lubridate to change the Date columns from character vectors to dates. Do an inner mutating join of the data sets so Outcome Date can be subtracted from Intake Date to create the new variable Days at Shelter. ggplot to chart the findings, and plotly to make the charts interactive.
  • Question 3: Are older animals adopted at a slower rate than younger animals?
    • Methodology: I would use the same methodology as the question above, but would include the Date of Birth variable from the Outcome data. Then, I would sort for those animals that were adopted and create a new variable for how many days they were in the shelter before adoption.
    • Why it’s important: Keeping animals in a shelter for long periods of time racks up added expenses for the shelter, and increases the chance of euthanasia.
    • Packages Required: tidyverse to read in the data, lubridate to change the Date columns from character vectors to dates. DT for the table, ggplot to compare Date of Birth to new variable, Days at Shelter.
  • Question 4: Are there any other animals at the shelter besides dogs and cats?
    • Methodology: Study the counts of the different types of animals, group them into “Dog or Cat” and “Other” to check dispersion.
    • Why it’s important: This is another point that is important for budget and PR reasons. If it costs the shelter a significant amount of money to care for the specialty animals, perhaps they should be transferred to another organization. If cost is low, the shelter should consider better advertising these animals so people know adoption is for more than just dogs and cats.
    • Packages Required: tidyverse to read in the data, DT for the table, ggplot to graph Intake Animal Type.
  • Question 5: Does the color of an animal dictate if they get adopted or euthanized?
    • Methodology: Compare the color of the animal with the outcome.
    • Why it’s important: The shelter sometimes holds events where you can adopt an animal of specific color at a discounted rate, such as their black dog and cat Black Friday discount.
    • Packages Required: tidyverse to read in the data, filter() to only look at the animals that are a single color, not mixed. DT for the table, ggplot to great sideways bar chart with color of animal on y axis and average days in shelter on x axis.
  • Question 6: Are adtoptions seasonal? Are there certain months where animal adoptions are more/less frequent?
    • Methodology: Only study animal adoptions by month for the period of January 1, 2014 to December 31, 2016 so that data will not be skewed. Study counts and their changes by month across a three year period.
    • Why it’s important: On August 7, 2017, the Guilford County animal shelter released a statement that all animal adoption fees would be waived through the end of August. Their reasoning for this decision was that adoptions are typically low in this month. Studying the historic data over several years will show if adoptions do in fact have seasonality, which could help the shelter adjust their budgetary and marketing needs around these low demand times.
    • Packages Required: tidyverse to read in the data, lubridate to recognize dates, DT for the table, ggplot2 and plotly to create animated graph.

How I’ll Help

My interest has been peaked and I want to learn what I can from the Austin data to see how I can help my community better understand animal intake and adoption rates and trends. Finding answers to these questions will help me better understand the issues encountered by my local animal shelter. I hope to pass on this information to the Chairman of the County Commissioners, who is currently overseeing the efforts of the county to take over the management of the shelter. Hopefully, information I discover from this data set will help me make a difference in my community.


Packages and Data

Data Sets To Be Analyzed

The data sets were both found on the Data.gov website.


Packages

#Packages necessary for data analyses#
library(tidyverse)
library(lubridate)
library(DT)
library(ggplot2)
library(knitr)
library(plotly)

Intakes

Before merging my two data sets, many variables needed to be given distinctive names, as they share the same name within both data sets, though they held different meaning. For example, the variable “DateTime” was in both of the data sets, but referred to the Intake Date in the intake data while the “DateTime” variable in the outcomes data referred to the Outcome Date. I arranged the data by Animal ID to prepare for the merger.

#Reading in "intakes" data#
intakes <- read_csv("Austin_Animal_Center_Intakes.csv") %>%
  rename(`Intake Date` = DateTime) %>%
  rename(`Intake Breed` = Breed) %>%
  rename(`Intake Color` = Color) %>%
  rename(`Intake Animal Type` = `Animal Type`) %>%
  mutate(`Intake Date` = as.Date(`Intake Date`, format = "%m/%d/%Y")) %>%
  select(-`Found Location`, -MonthYear) %>%
  group_by(`Animal ID`) %>%
  arrange(`Intake Date`) %>%
  mutate(Visit = 1:n()) %>%
  arrange(`Animal ID`)

datatable(head(intakes), options = list(dom = 't'))

Outcomes

For the outcomes data, my cleaning steps were very similar to the intakes data except for one key piece of information: Outcome Type. My researched showed there were 9 animals in the system that did not have an Outcome Type listed. I found that these 9 animals had Outcome Dates of within a day or two of the date of the data set. I made the assumption that these animals, although they left the shelter in some way, had not had their outcome recorded. For analysis purposes, I filtered out these 9 cases.

#Reading in "outcomes" data#
outcomes <- read_csv("Austin_Animal_Center_Outcomes.csv") %>%
  rename(`Outcome Date` = DateTime) %>%
  rename(`Outcome Animal Type` = `Animal Type`) %>%
  rename(`Outcome Breed` = `Breed`) %>%
  rename(`Outcome Color` = `Color`) %>%
  mutate(`Outcome Date` = as.Date(`Outcome Date`, format = "%m/%d/%Y")) %>%
  mutate(`Date of Birth` = as.Date(`Date of Birth`, format = "%m/%d/%Y")) %>%
  select( -MonthYear, -Name) %>%
  group_by(`Animal ID`) %>%
  arrange(`Outcome Date`) %>%
  mutate(Visit = 1:n()) %>%
  arrange(`Animal ID`) %>%
  filter(!is.na(`Outcome Type`))

datatable(head(outcomes), options = list(dom = 't'))

Merged Data

When I first merged my data into allanimals, I created a new variable, Days at Shelter by subtracting Intake Date from Outcome Date. I ran a sum() function to see if there were any Days at Shelter values that were less than zero. The result came back that I had 121 negative Days at Shelter ranging from -1 to -1337. After reviewing my code, I made the assumption that these dates were inputted incorrectly in the system at the Austin Animal Shelter (since an animal can’t leave the shelter before they have arrived at the shelter), and I filtered out these rows. That left me with 68,239 animal visits to analyze in order to answer my six questions.

#Merging intakes and outcomes#
allanimals <- merge(intakes,outcomes, by=c("Animal ID","Visit")) %>%
  mutate(`Days at Shelter` = as.numeric(`Outcome Date` - `Intake Date`)) %>%
  filter(`Days at Shelter` >= 0)

allanimals$`Days at Shelter`[allanimals$`Days at Shelter` == 0] <- 1

datatable(head(allanimals), options = list(dom = 't'))

Data Dictionary

Variable Character Type Variable Description
Animal ID Character Unique identification number for each animal
Vist Integer Visit order within the time period of the data
Name Character Animal name, if known
Intake Date Date Date animal arrived at shelter
Intake Type Character Why animal is at the shelter
Intake Condition Character Health of animal at time of arrival
Intake Animal Type Character General animal type
Sex Upon Intake Character Gender of animal at time of arrival
Age Upon Intake Character Estimated age of animal at time of arrival
Intake Breed Character Specific breed of animal type
Intake Color Character Identifiable coloring/markings of the animal
Outcome Date Date Date animal left the shelter
Date of Birth Date The known or estimated date of birth
Outcome Type Character Why animal left the shelter
Outcome Subtype Character Reasoning behind Outcome Type
Outcome Animal Type Character General animal type
Sex Upon Outcome Character Gender of animal at time of departure
Age Upon Outcome Character Estimated age of animal at time of departure
Outcome Breed Character Specific breed of animal type
Outcome Color Character Identifiable coloring/markings of the animal
Days at Shelter Numeric How long the animal stayed at the shelter between intake and outcome

Question 1

How many animals that come through the shelter get adopted? Get euthanized? Get transfered?

To get a general idea of the counts within each Outcome Type, I did a simple table function to check quantities.

#Table of `Outcome Type` counts#
table(allanimals$`Outcome Type`)
## 
##        Adoption            Died        Disposal      Euthanasia 
##           28099             607             279            5579 
##         Missing        Relocate Return to Owner       Rto-Adopt 
##              42              14           12639              74 
##        Transfer 
##           20906

This table gives us a good overview of the Outcome Types. We see that the majority of animals are in fact adopted, and the amount of adoptions is over five times the amount of euthanizations. When exploring the data on Data.gov,, I found that Austin considers itself to be one of the best “No Kill” shelters in the country, boasting that over 90% of all animals are adopted, transferred, etc. to prevent them from being euthanized. A quick mathematical calculation proves that for this data set, less than 10% of animals are euthanized.

#Euthanizations dividied by all animal outcomes#
5579/68239
## [1] 0.08175677

To take a deeper look at this data, I wanted to see the variation in Outcome Types across the different Outcome Animal Types, so I created the two graphs below. Note: Hover cursor over data points to get more information.

#Stacked bar chart with `Outcome Animal Type` in the x-axis#
Q1_G1 <- allanimals %>%  
  ggplot() +
  geom_bar(mapping = aes(x = `Outcome Animal Type`, fill = `Outcome Type`), position = "fill") +
  labs(title = "Proportion of Outcome Type by Animal Type") +
  scale_fill_discrete(name = "Outcomes")

ggplotly(Q1_G1)

In this first graph, we see that nearly 45% of all dogs are adopted, and only about 3.68% are euthanized. For cats, we see that over 41% of them are adopted and 5.2% are euthanized. The largest portion of euthanizations comes from the bird and “Other” categories, which we know are mostly composed of wild animals, but we will research that further in Question 4.

#Stacked bar chart with `Outcome Type` in the x-axis#
Q1_G2 <- allanimals %>%  
  ggplot() +
  geom_bar(mapping = aes(x = `Outcome Type`, fill = `Outcome Animal Type`), position = "fill") +
  scale_x_discrete(labels = c("Adoption" = "Adpt", "Died" = "Died", "Disposal" = "Dispo", "Euthanasia" = "Euthn", "Missing" = "Miss", "Relocate" = "Relo", "Return to Owner" = "RtOwn", "Rto-Adopt" = "RtO-Adpt", "Transfer" = "Trnsfr")) +
  labs(title = "Propotion of Animal Type by Outcome Type") +
  scale_fill_discrete(name = "Animal")
      

ggplotly(Q1_G2)

The second graph shows the same data from a different perspective. We can see that cats and dogs make up nearly 99% of adoptions, which mean they will be the key revenue drivers for the shelter, and revenue made from adoption fees can be used to offset the expenses they incur while at the shelter. Nearly half of all euthanizations are from animals in the “Other” category, which we will study more in Question 4. A surprising discovery in this graph is the transfer data, which shows that over 57% of animals that are transferred are cats, which we will research in Question 3.

An interesting variable in this set is titled “Missing”. The count that I ran earlier in this question showed that there are 42 animals that have been logged as “Missing” nearly for years of data collection. I have been unable to determine the meaning of “Missing” in the information provided by the Austin Animal Center. I can infer that these animals went missing while in the care of the center, or they were logged out of the shelter incorrectly and thus were labeled “Missing”.

Question 2

How long do animals stay in the shelter?

While merging my data, I created the variable Days at Shelter to track the duration of an animal’s stay. I wanted to find out if dogs typically stay at the shelter longer than all animal types, so I conducted the hypothesis test below.

## [1] 15.93719
## 
##  One Sample t-test
## 
## data:  dog_days$`Days at Shelter`
## t = -4.0843, df = 38617, p-value = 2.215e-05
## alternative hypothesis: true mean is less than 15.7779
## 95 percent confidence interval:
##      -Inf 15.32614
## sample estimates:
## mean of x 
##  15.02152
## 
##  One Sample t-test
## 
## data:  log2(as.numeric(dog_days$`Days at Shelter`))
## t = -159.9, df = 38617, p-value < 2.2e-16
## alternative hypothesis: true mean is less than 3.979833
## 95 percent confidence interval:
##      -Inf 2.474228
## sample estimates:
## mean of x 
##  2.458579
## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  dog_days$`Days at Shelter`
## V = 179410000, p-value < 2.2e-16
## alternative hypothesis: true location is less than 15.7779

Because the p value for this test is 0.00000148, which is less than the standard significance level of alpha = 0.05, we can reject the null hypothesis and conclude that on average, dogs typically do stay at the shelter for more days than the population of all animal types. Below I created a violin graph to show the duration of stay based on animal type. Note: Hover cursor over data points to get more information.

#Graphically displaying `Days at Shelter`#
violin <- ggplot(allanimals, aes(`Outcome Animal Type`, `Days at Shelter`)) +
  geom_violin() +
  labs(title="Days at Shelter from Intake to Outcome",
       subtitle="Grouped by Animal Type",
       x= "Animal Type",
       y= "Days at Shelter") +
  coord_flip()

ggplotly(violin)

This graph is clearly right skewed for every animal type. We see that each animal type has the highest concentration of stays that last less than 60 days. Animals that are at the shelter longer than 60 days are generally considered to be outliers.

Question 3

Are older animals adoped at a slower rate than younger animals?

To compare the age of an animal and the time they spend at the shelter, I did a correlation analysis by creating a new variable, Age Upon Intake (years). I then compared that variable to the Days at Shelter, and grouped the data by Outcome Type to see if an animal’s age has an impact on how long they stay at the shelter before they are adopted.

#Correlation#
animals_age_cor <- allanimals %>%
  mutate(`Age Upon Intake (years)` = round((`Intake Date` - `Date of Birth`)/365),
         `Age Upon Intake (years)` = as.numeric(`Age Upon Intake (years)`),
         `Days at Shelter` = as.numeric(`Days at Shelter`)) %>%
  group_by(`Outcome Type`) %>%
  summarize(r = round(cor(`Days at Shelter`, `Age Upon Intake (years)`), 2),
            p_value = cor.test(`Days at Shelter`, `Age Upon Intake (years)`) $p.value)

datatable(animals_age_cor, caption = "Correlation Between `Days at Shelter` and `Age Upon Intake`")

From this table we can see that the weakest correlation between Age Upon Intake (years) and Days at Shelter is in the Adoption category, which debunks the common myth that when older animals come to the shelter they are likely to be there longer, as adoptive pet parents will prefer a younger animal. This information proves that when it comes to Adoptions, the age of the animal and the amount of time that they been at the shelter are random variables that have very little affect on the adoption. The strongest correlation can be found in the Transfer category. A likely cause of this could be that when older animals have been at the shelter for a long period of time, they are transferred to another shelter or rescue group that specializes in the care of older animals.

#Chart to track age of animal type, days at shelter, and outcome type#
both_age <- filter(allanimals, `Outcome Animal Type` %in% c("Dog", "Cat")) %>%
  mutate(`Age Upon Intake (years)` = round((`Intake Date` - `Date of Birth`)/365)) %>%
  group_by(`Outcome Type`, `Outcome Animal Type`) %>%
  summarize(
    count = n(),
    `Shelter Days` = round(mean(`Days at Shelter`, na.rm = TRUE)),
    `Age (years)` = round(mean(`Age Upon Intake (years)`, na.rm = TRUE))
  ) %>%
  arrange(`Outcome Type`, `Outcome Animal Type`)

ggplot(data = both_age) +
  geom_point(mapping = aes(x = `Age (years)`, y = `Shelter Days`, color = `Outcome Type`, shape = `Outcome Animal Type`, size = 1)) +
  guides(size = FALSE) +
  labs(title = "Comparison of Cat and Dog Age, Shelter Days, and Outcome Type")

In the graph above, we can see that cats and dogs have different ages and days at shelter based on outcome types. For example, if we look at the position of the two red points, we see that cats are typically a year younger than dogs when they are adopted. However, cats typically have to wait at the shelter for 36 days before they are adopted, whereas dogs typically only wait 24 days before they are adopted. This information can be vital to the shelter as they seek to budget expenses for each animal based on average length of stay. It is also helpful to know that animals that are ultimately returned to their owners stay at the shelter for about 4 days on average. Another item to note is that the average days that a dog is at the shelter is a very low value, about 7 days, and they were 3 years old on average. We can deduce that these animals were euthanized for reasons other than their age and their length of stay, which supports the Austin Animal Center’s claim that they are the best “No Kill” shelter in the country.

Question 4

Are there any other animals at the shelter besides dogs and cats?

To answer this question I decided to study the “Other” variable category for Outcome Animal Type to see what kind of animals were included in this set. I filtered out all other Outcome Animal Types so that only the “Other” value was listed in the data. Then, I counted the unique values in the Outcome Breed variable, which gives more detail on the animal type.

#How many types of breed are there?#
animal_type2 <- allanimals %>%
  filter(`Outcome Animal Type` == "Other") %>%
  group_by(`Outcome Breed`) %>%
  arrange(`Outcome Breed`)

unique(animal_type2$`Outcome Breed`)
##  [1] "American Mix"         "American Sable"       "American Sable Mix"  
##  [4] "Angora-English Mix"   "Angora-French Mix"    "Armadillo"           
##  [7] "Armadillo Mix"        "Bat"                  "Bat Mix"             
## [10] "Bobcat Mix"           "Californian"          "Californian Mix"     
## [13] "Checkered Giant Mix"  "Chinchilla-Amer Mix"  "Chinchilla-Stnd Mix" 
## [16] "Chinchilla Mix"       "Cinnamon"             "Cinnamon Mix"        
## [19] "Cold Water"           "Cottontail Mix"       "Coyote"              
## [22] "Coyote Mix"           "Dutch"                "Dutch Mix"           
## [25] "Dutch/Rabbit Sh"      "English Spot"         "English Spot Mix"    
## [28] "Ferret"               "Ferret Mix"           "Fox"                 
## [31] "Fox Mix"              "Frog"                 "Gerbil Mix"          
## [34] "Guinea Pig"           "Guinea Pig Mix"       "Hamster"             
## [37] "Hamster Mix"          "Harlequin Mix"        "Havana Mix"          
## [40] "Hedgehog"             "Hotot Mix"            "Jersey Wooly Mix"    
## [43] "Lionhead"             "Lionhead Mix"         "Lizard"              
## [46] "Lizard Mix"           "Lop-Amer Fuzzy Mix"   "Lop-English Mix"     
## [49] "Lop-Holland"          "Lop-Holland Mix"      "Lop-Mini"            
## [52] "Lop-Mini Mix"         "Mouse Mix"            "Netherlnd Dwarf"     
## [55] "Netherlnd Dwarf Mix"  "New Zealand Wht Mix"  "Opossum"             
## [58] "Opossum Mix"          "Otter Mix"            "Polish Mix"          
## [61] "Rabbit Lh"            "Rabbit Lh Mix"        "Rabbit Sh"           
## [64] "Rabbit Sh Mix"        "Rabbit Sh/Lop-Mini"   "Raccoon"             
## [67] "Raccoon Mix"          "Rat"                  "Rat Mix"             
## [70] "Rex"                  "Rex-Mini"             "Rex-Mini/Lop-English"
## [73] "Rex Mix"              "Ringtail Mix"         "Silver Mix"          
## [76] "Skunk"                "Skunk Mix"            "Snake"               
## [79] "Snake Mix"            "Snake/Snake"          "Squirrel"            
## [82] "Squirrel Mix"         "Sugar Glider"         "Tarantula"           
## [85] "Tortoise"             "Tortoise Mix"         "Turtle"              
## [88] "Turtle Mix"

88 categorical variables seemed a bit too broad. After reviewing the breed names and researching their animal type, I was able to group the 88 breeds into a new variable called Species. Note: Hover cursor over data points to get more information.

#Subsetting "Breed" into "Species"#
animal_species <- list()
  animal_species[["Rabt"]] <- c("American Mix", "American Sable", "American Sable Mix", 
                                  "Angora-English Mix", "Angora-French Mix", "Californian", 
                                  "Californian Mix", "Checkered Giant Mix", "Cinnamon", "Cinnamon Mix", 
                                  "Cottontail Mix", "Dutch", "Dutch Mix", "Dutch/Rabbit Sh", 
                                  "English Spot", "English Spot Mix", "Harlequin Mix", 
                                  "Havana Mix", "Hotot Mix", "Jersey Wooly Mix", "Lionhead", 
                                  "Lionhead Mix", "Lop-Amer Fuzzy Mix", "Lop-English Mix", 
                                  "Lop-Holland", "Lop-Holland Mix", "Lop-Mini", "Lop-Mini Mix", 
                                  "Mouse Mix", "Netherlnd Dwarf", "Netherlnd Dwarf Mix", 
                                  "New Zealand Wht Mix", "Polish Mix", "Rabbit Lh", "Rabbit Lh Mix", 
                                  "Rabbit Sh", "Rabbit Sh/Lop-Mini", "Rabbit Sh Mix", "Rex", "Rex-Mini", 
                                  "Rex-Mini/Lop-English", "Rex Mix", "Silver Mix")
  animal_species[["Bat"]] <- c("Bat", "Bat Mix")
  animal_species[["Ferret"]] <- c("Ferret", "Ferret Mix")
  animal_species[["Fox"]] <- c("Fox", "Fox Mix")
  animal_species[["GinP"]] <- c("Guinea Pig", "Guinea Pig Mix")
  animal_species[["Hmst"]] <- c("Hamster", "Hamster Mix")
  animal_species[["Opsm"]] <- c("Opossum", "Opossum Mix")
  animal_species[["Racn"]] <- c("Raccoon", "Raccoon Mix")
  animal_species[["Rat"]] <- c("Rat", "Rat Mix")
  animal_species[["Sknk"]] <- c("Skunk", "Skunk Mix")
  animal_species[["Snke"]] <- c("Snake", "Snake Mix", "Snake/Snake")
  animal_species[["Squirl"]] <- c("Squirrel", "Squirrel Mix")
  animal_species[["Trtl"]] <- c("Tortoise", "Tortoise Mix", "Turtle", "Turtle Mix")
  animal_species[["Othr"]] <- c("Armadillo", "Armadillo Mix","Bobcat Mix","Chinchilla-Amer Mix", "Chinchilla-Stnd Mix", 
                                 "Chinchilla Mix", "Cold Water", "Coyote", "Coyote Mix", "Frog", "Gerbil Mix", "Hedgehog", 
                                 "Lizard", "Lizard Mix", "Otter Mix", "Ringtail Mix", "Sugar Glider", "Tarantula")
  
animal_type2$Species <- NA
for(ob in 1:nrow(animal_type2)) {
  for(ob2 in names(animal_species)) {
    #print(animal_species[[ob2]])
    #print(animal_type2[ob,]$`Outcome Breed`)
    if(animal_type2[ob,]$`Outcome Breed` %in% animal_species[[ob2]]) {
      animal_type2[ob,]$Species <- ob2
    }
  }
}

species_chart <- ggplot(data = animal_type2) +
  geom_bar(mapping = aes(x = `Species`, fill = `Outcome Type`)) +
  labs(title = "Outcome Type by Species")

ggplotly(species_chart)

In the graphs created for Question 1 I found that the “Other” animal type category had the highest instance of euthanization. In the graph above, we can see that the main cause of the high euthanasia rates in the category can be attributed to more wild species, such as bats, raccoon, opossums, skunks, squirrels, and foxes. The high level of bats that come to the Austin Animal Center can be expected due to the large bat population (estimated at around 750,000) that the city is known for. We can assume that the Guilford County Animal Shelter will not have nearly as many bats coming through their doors each day.

The highest adoption counts in this category can be attributed to the different species of rabbit, but there are only a total of 109 rabbits recorded as being adopted in the nearly four year span of this data. Additionally, the table below shows that rabbits typically stay at the shelter for three weeks before their outcome.

#How many days are these other animals at the shelter before they leave?#
species_days <- animal_type2 %>%
  group_by(Species) %>%
  summarize(
    count = n(),
    `Days Before Outcome` = round(mean(`Days at Shelter`, na.rm = TRUE))) %>%
  arrange(desc(`Days Before Outcome`))

datatable(species_days, options = list(pageLength = 14), caption = "Days at Shelter")

The costs associated with housing and feeding such a small number of rabbits for three weeks seems like it would be a constraint on money and resources for the shelter, with little return. I would recommend that any potentially adoptable animals be transferred to a specialty adoption organization for that specific species, be it a rabbit, a snake, or a ferret, so that the shelter does not have to use resources for these small quantities of animals. The the specialty adoption organizations will be more equipped to provide food and shelter to the animals while they use their specific resources to locate homes for the animals.

Question 5

Does the color of an animal dictate if they get adopted or euthanized?

I wanted to explore whether the often-heard stigma about black colored animals was actually true. Do potential adoptive parents steer clear of black animals because of the their so-called “bad luck”? To get an initial view of the data, I pulled out only observations where the animal was either a cat or dog, since these are the most popular animals and more likely to carry this stigma, and then also filtered for any solid color animals of Black, Gray, Brown, Tan, Yellow, and White. I decided that these solid colors would be the best for me to use to compare to Black. I then created the table below to display the percentage dispersion of Outcome Type given an animal’s coloring.

#Color data for table analysis#
animals_color <- allanimals %>%
  filter(`Outcome Color` %in% c("Black", "Gray", "Brown", "Tan", "Yellow", "White")) %>%
  filter(`Outcome Animal Type` %in% c("Dog", "Cat")) %>%
  arrange(`Outcome Color`) %>%
  group_by(`Outcome Color`)

ccc <- animals_color %>%
  count(`Outcome Type`, `Outcome Color`) %>%
  mutate(perc = round(prop.table(n)*100, 2)) %>%
  select(-n) %>%
  spread(`Outcome Type`, perc)

datatable(ccc, caption = "Outcome Type by Animal Color (%)")

The first two cell values of this table disprove the black animals theory. We see that over 40% of black cats and dogs that left the shelter were adopted, nearly the exact same proportion as brown cats and dogs. In fact, black cats and dogs are adopted 40.41% of the time, which barely makes the color black the second most popular color for cat and dog adoptions, with tan being the most popular color at 45.40%. Looking at euthanasia, we see that once again, the two values are nearly identical, with brown having slightly higher values. The highest rate for euthanasia is Gray, which is the same color that has the lowest proportional adoption rate.

To analyze the counts of adopted animals by color category, I created the heat chart below. The lighter the shade of blue, the higher the concentration of animals had that outcome and coloring. Blank cells denote outcome and coloring combinations where values were zero or close to zero. Note: Hover cursor over data points to get more information.

#Heat map#
color <- animals_color %>%
  group_by(`Outcome Color`, `Outcome Type`) %>%
  summarize(count = n())

heat <- ggplot(color, aes(`Outcome Type`, `Outcome Color`)) +
  geom_raster(aes(fill = count)) +
  labs(title = "Heat Map of Animal Outcomes by Color") +
  scale_fill_continuous(name = "Count") +
  scale_x_discrete(labels = c("Adoption", "Died", "Disposal", "Euthanasia", "Missing", "Rtn to Own", "RtO Adpt", "Transfer"))

ggplotly(heat)

This graph reiterates the information in the table above, with black colored cats and dogs having the highest count of adoptions across all color combinations. Over 2,200 black cats and dogs have been adopted from the Austin Animal Center between October 2013 and July 2017.

Question 6

Are adoptions seasonal?

To get a sequential view of the data, I filtered out the dates in late 2013 and early 2017 so that I would have three full years of data: 2014, 2015, 2016. Using the cleaned data, I created the animated chart below. Note: Click “Play” at the bottom of the graph to see the year-by-year analysis.

#Data and graph for Adoption analysis#
startdate <- as.Date("2014-01-01")
enddate <- as.Date("2016-12-31")

animal_time <- filter(allanimals, `Outcome Date`> startdate & `Outcome Date`< enddate & `Outcome Type` == "Adoption") %>%
  mutate(month = format(`Outcome Date`, "%m")) %>%
  mutate(year = format(`Outcome Date`, "%Y")) %>%
  filter(`Outcome Animal Type`%in% c("Dog", "Cat")) %>%
  group_by(year, month, `Outcome Animal Type`) %>%
  summarize(count = n())

warning <- getOption("warn")
options(warn = -1)

month_chart <- ggplot(animal_time, aes(month, count, group = `Outcome Animal Type`, color = `Outcome Animal Type`)) +
  geom_line(aes(frame = year)) +
  geom_point(aes(frame = year)) +
  labs(title = "Adoptions Per Month by Animal Type",
       x = "Month",
       y = "Number of Adoptions")

ggplotly(month_chart)
invisible(options(warn = warning))

To get a tabular view of this information, I created the table below. Contrary to the Guilford County Animal Shelter’s comments, it appears that cats have the highest average adoptions in the months of June, July, and August. As one might assume, dogs have a high average adoption count in December, but also May, July, and August. With both cats and dogs, the month of August appears to have average adoptions in the top quartile of months. This means that either a) the Guilford County Animal Shelter does not have sufficient data to make the claim that August is a low month for adoptions, or b) There is an external factor causing Austin to have higher adoption rates in these months, such as additional marketing campaigns or successful adoption fairs. Asking the questions about these external factors could also help clear up the year-by-year changes in the graph, such as the leap in adoptions of cats in July between 2014 and 2015, or the drop in dog adoptions between 2014 and 2015 followed by an increase between 2015 and 2016. I recommend that further study take place to find out if August truly is a low adoption rate month for the Guilford County Animal Shelter, and if so, what efforts does Austin undertake to make the month successful?

#table#
time_table <- animal_time %>%
  group_by(month, `Outcome Animal Type`) %>%
  summarize(`Average Adoptions` = round(mean(count, na.rm = TRUE))) %>%
  spread(key = `Outcome Animal Type`, value = `Average Adoptions`)
  
  #melt(month = c("Outcome Animal Type"))
  #cast(month ~ `Outcome Animal Type`)

datatable(time_table, options = list(pageLength = 12), rownames = FALSE, colnames = c('Month', 'Avg Cat Adoptions', 'Avg Dog Adoptions'), caption = "Average Adoptions Per Month 2014-2016")

Conclusion

Summary

Question 1. How many animals that come through the shelter get adopted? Get euthanized? Get transfered?

The data showed that nearly 99% of all adoptions are cats and dogs, so other animal types like birds and livestock make up about 1% of adoptions. Almost half of all euthanizations are animals from the “Other” category. Research on the Austin Animal Center showed that it is the top “No kill” shelter in the country. Guilford Country should examine their records to see what percentage of animals that are not cats and dogs are treated at the shelter, and should determine if the resources necessary to house and feed these animals could be spent elsewhere if the animals were transferred to specialty organizations.

Question 2. How long do animals stay in the shelter?

A hypothesis test determined that on average, dogs do in fact stay at the shelter longer than than the general population average of all animal types. However, the violin graph showed that there are actually some more extreme outliers in the cat category, with a few cats staying in the shelter for over 1,000 days. My recommendation to the Guilford Country Animal Shelter would be to make sure no animal stays in the shelter for that length of time. Animals that have been at the shelter longer than a year should be tracked so they can be given special marketing promotion or transferred to an organization that can give them specialized attention so they can be adopted at a faster rate.

Question 3. Are older animals adoped at a slower rate than younger animals?

To test the common conception that older animals are less desirable than younger animals to potential adoptive pet parents, I ran a correlation analysis and found that the weakest correlation between the age of the animal and the days they spend at the shelter is in the adoption category. A dot graph proved that there is a vast amount of data to be studied that may surprise readers, such as the fact that cats are adopted at an average age younger than dogs, but cats tend to stay at the shelter for more days than dogs on average. Any information gleaned from this graph could be useful to the Guilford County Animal Shelter as they prepare their annual budget and attempt to estimate the average time each animal will be at the shelter to extrapolate their annual expenditures.

Question 4. Are there any other animals at the shelter besides dogs and cats?

Building on the information I learned in Question 1, I dove deeper into the “Other” animal category and found that a large proportion of this category can be attributed to bats that are very common in the Austin area. The bats make up a large proportion of the euthanizations at the shelter, so they are actually inflating the shelter’s overall euthanization percentage. Because bats are not nearly as populous in Guilford Country as they are in Austin, the County does not need to worry about this inflated number. However, they should still research the costs associated with housing the other species with more rare occurrences, such as gerbils, snakes, rabbits, etc. that are considered adoptable animals that may stay at the shelter longer waiting for the perfect adoptive parent to walk through the door. A specialty organization would be better equipped to handle the housing of these animals, as well as the marketing efforts needed to find them a forever home.

Question 5. Does the color of an animal dictate if they get adopted or euthanized?

I found that the commonly held myth that black animals are less desirable than other colors is actually proved false in the Austin Animal Center Data. Gray animals are in fact the least desirable in terms of adoption percentage, and they are euthanized the most out of the other four color categories. The Guilford County Animal Shelter should study it’s own records to track the coloring of animals that are adopted to confirm this analysis. If they confirm that gray animals are less desirable than black animals, perhaps their adoption fee waiving offer for black cats and dogs on a Friday the 13th should instead be switched to a full moon gray animal adoption fee waiving offer.

Question 6. Are adoptions seasonal?

Lastly, I wanted to study if adoptions were seasonal after recently discovering that Guilford County is waiving adoption fees in the month of August, saying that it is a slow month for the shelter. My analysis proved that in Austin, the summer months are historically the most popular months for adoptions, especially July. My recommendation to Guilford County would be to study their adoption data to ensure that August is a slow month for adoptions, and if so, find out if their are any external factors that drive Austin’s adoption rate upwards in the summer months, such as a large adoption fair or additional spending on marketing.

Conclusion

In conclusion, if this report helps even one animal find a new forever home, then a month of working on this report will be worth every second. I have poured over this data with constant thoughts my love for my community, and I know that this research can be used as a spring board for better policy making, strategic decision making, and marketing expenditures. I plan to share this information with as many government officials as possible to see how this information can help the Guilford County Animal Shelter as it navigates this new management transition.