I first became interested in studying animal shelters last year, when my community learned of some devastating news: after receiving anonymous tips about sub-par conditions, the Guilford County Sheriff’s office decided to visit the Guilford County Animal Shelter to see if the accusations had merit. Unfortunately, they did. Dozens of cases of animal abuse, neglect, and inhumane euthanasia were found at the shelter, which was run by a private company on behalf of the county. The private company was disbanded, criminal charges were filed, and the county has begun to manage the shelter itself. The abrupt change in management has been challenging for the county, and has brought some very real concerns about animal safety and adoption to light. Although the private company may have kept records like Austin’s that Guilford County could use to improve decision making going forward, that data would be untrustworthy. The claims of animal neglect stem from allegations that records were mishandled and that suffering animals were not euthanized when necessary, so that the private company could boast low euthanasia rates and high adoption rates. The general problem the county faces right now is that it has no reliable information to guide its decision making.
I plan to study Austin’s data to glean insights about general conceptions people have regarding animal shelters, described below:
If the data leads me to do so, I may explore the parameters below. Similar to many of the questions above, finding answers to the questions below could help guide the county to make decisions about PR efforts and budgetary spending.
My interest has been piqued, and I want to learn what I can from the Austin data to help my community better understand animal intake and adoption rates and trends. Finding answers to these questions will help me better understand the issues encountered by my local animal shelter. I hope to pass this information on to the Chairman of the County Commissioners, who is currently overseeing the county’s efforts to take over the management of the shelter. Hopefully, what I discover from this data set will help me make a difference in my community.
The data sets were both found on the Data.gov website.

* Austin Animal Center Outcomes - 10/01/2013 to 07/21/2017
    * Collected to “check out” animals that were leaving the shelter
    * 12 variables
    * Used nrow() to discover there are 69,293 rows
    * Differences between “Intake” data: this data contains a Date of Birth variable
* Austin Animal Center Intakes - 10/01/2013 to 07/21/2017
    * Collected to “check in” animals that were arriving at the shelter
    * 12 variables
    * Used nrow() to discover there are 69,513 rows
    * Differences between “Outcome” data: this data contains a Found Location variable
I started by reading in the data and labeling the two sets “intakes” and “outcomes.” I prepared the data for merging by converting character vectors to dates, renaming all variables that had identical names across the sets (except for Animal ID), and dropping the Found Location, MonthYear, and Name variables.
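One way to find which variables need renaming is to compare the column names of the two sets directly. The sketch below is only an illustration: the column names are reconstructed from the code and printed output in this write-up, their order in the real CSV files is an assumption, and `intake_cols`, `outcome_cols`, `shared`, and `to_rename` are names made up for the example.

```r
# Column names reconstructed from the code and output in this write-up;
# their order in the real CSV files is an assumption.
intake_cols <- c("Animal ID", "Name", "DateTime", "MonthYear", "Found Location",
                 "Intake Type", "Intake Condition", "Animal Type",
                 "Sex upon Intake", "Age upon Intake", "Breed", "Color")
outcome_cols <- c("Animal ID", "Name", "DateTime", "MonthYear", "Date of Birth",
                  "Outcome Type", "Outcome Subtype", "Animal Type",
                  "Sex upon Outcome", "Age upon Outcome", "Breed", "Color")

# Names that appear in both sets
shared <- intersect(intake_cols, outcome_cols)

# Animal ID is the merge key, and Name and MonthYear are dropped,
# so only the remaining shared names need a prefix
to_rename <- setdiff(shared, c("Animal ID", "Name", "MonthYear"))
print(to_rename)
```

With the real data, the same comparison could be run on `names()` of each freshly read table before any renaming.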
The main variables I am concerned with are the Animal ID, Intake and Outcome Dates, Date of Birth, Animal Type, and Animal Color. Many questions can be answered and inferences can be made by combining the date of a certain event and a qualitative variable.
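As a rough illustration of combining a date with a qualitative variable, the sketch below counts outcomes per month and outcome type. It uses a small made-up table rather than the Austin data, so the rows, the "Adoption" label, and the names `toy_outcomes`, `monthly`, and `Outcome Month` are assumptions for the example only.

```r
library(dplyr)
library(lubridate)

# Made-up rows for illustration; not taken from the Austin data
toy_outcomes <- tibble(
  `Outcome Date` = as.Date(c("2014-01-05", "2014-01-20", "2014-02-03", "2014-02-14")),
  `Outcome Type` = c("Adoption", "Transfer", "Adoption", "Adoption")
)

monthly <- toy_outcomes %>%
  # Round each date down to the first day of its month
  mutate(`Outcome Month` = floor_date(`Outcome Date`, unit = "month")) %>%
  # One row per month and outcome type, with the number of records in n
  count(`Outcome Month`, `Outcome Type`)

print(monthly)
```

The same pattern should work with Animal Type or Color in place of Outcome Type, which is the kind of question the county might ask when looking for seasonal trends.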
#Reading in "intakes" data#
library(tidyverse)
library(lubridate)
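intakes <- read_csv("Austin_Animal_Center_Intakes.csv") %>%
  # Prefix names that also appear in the outcomes data so they stay distinct after a merge
  rename(`Intake Date` = DateTime) %>%
  rename(`Intake Breed` = `Breed`) %>%
  rename(`Intake Color` = `Color`) %>%
  rename(`Intake Animal Type` = `Animal Type`) %>%
  # Convert the character timestamp to a Date (the time of day is discarded)
  mutate(`Intake Date` = as.Date(`Intake Date`, format = "%m/%d/%Y")) %>%
  select(-`Found Location`, -MonthYear, -Name) %>%
  arrange(`Animal ID`)
# Print one row per Animal ID, keeping the last row listed for each animal (the result is not saved)
intakes[!duplicated(intakes$`Animal ID`, fromLast = TRUE), ]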
## # A tibble: 63,231 x 9
## `Animal ID` `Intake Date` `Intake Type` `Intake Condition`
## <chr> <date> <chr> <chr>
## 1 A006100 2014-03-07 Public Assist Normal
## 2 A047759 2014-04-02 Owner Surrender Normal
## 3 A134067 2013-11-16 Public Assist Injured
## 4 A141142 2013-11-16 Stray Aged
## 5 A163459 2014-11-14 Stray Normal
## 6 A165752 2014-09-15 Stray Normal
## 7 A178569 2014-03-17 Public Assist Normal
## 8 A189592 2015-09-18 Stray Normal
## 9 A191351 2015-11-13 Stray Normal
## 10 A197810 2014-12-08 Stray Normal
## # ... with 63,221 more rows, and 5 more variables: `Intake Animal
## # Type` <chr>, `Sex upon Intake` <chr>, `Age upon Intake` <chr>, `Intake
## # Breed` <chr>, `Intake Color` <chr>
#Reading in "outcomes" data#
library(tidyverse)
library(lubridate)
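outcomes <- read_csv("Austin_Animal_Center_Outcomes.csv") %>%
  # Prefix names that also appear in the intakes data so they stay distinct after a merge
  rename(`Outcome Date` = DateTime) %>%
  rename(`Outcome Animal Type` = `Animal Type`) %>%
  rename(`Outcome Breed` = `Breed`) %>%
  rename(`Outcome Color` = `Color`) %>%
  # Convert the character columns to Dates (the time of day is discarded)
  mutate(`Outcome Date` = as.Date(`Outcome Date`, format = "%m/%d/%Y")) %>%
  mutate(`Date of Birth` = as.Date(`Date of Birth`, format = "%m/%d/%Y")) %>%
  select(-MonthYear, -Name) %>%
  arrange(`Animal ID`)
# Print one row per Animal ID, keeping the last row listed for each animal (the result is not saved)
outcomes[!duplicated(outcomes$`Animal ID`, fromLast = TRUE), ]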
## # A tibble: 63,040 x 10
## `Animal ID` `Outcome Date` `Date of Birth` `Outcome Type`
## <chr> <date> <date> <chr>
## 1 A006100 2014-12-20 2007-07-09 Return to Owner
## 2 A047759 2014-04-07 2004-04-02 Transfer
## 3 A134067 2013-11-16 1997-10-16 Return to Owner
## 4 A141142 2013-11-17 1998-06-01 Return to Owner
## 5 A163459 2014-11-14 1999-10-19 Return to Owner
## 6 A165752 2014-09-15 1999-08-18 Return to Owner
## 7 A178569 2014-03-23 1999-03-17 Return to Owner
## 8 A189592 2015-09-18 1997-08-01 Return to Owner
## 9 A191351 2015-11-17 1999-08-21 Return to Owner
## 10 A197810 2014-12-22 2000-01-21 Transfer
## # ... with 63,030 more rows, and 6 more variables: `Outcome
## # Subtype` <chr>, `Outcome Animal Type` <chr>, `Sex upon Outcome` <chr>,
## # `Age upon Outcome` <chr>, `Outcome Breed` <chr>, `Outcome Color` <chr>
I have been unsuccessful in my attempts to merge my data. Some animals have been at the shelter on more than one occasion, and they are tracked by their Animal ID. As you can see from the first entry in the table below, Scamp has been at the shelter on two separate occasions. I attempted to use merge() and inner_join() along with an if() statement to merge the two lines only if the Intake Date was earlier than the Outcome Date, but was unsuccessful.
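One possible way around this, offered as a sketch rather than a tested solution for the full data, is to join every intake to every outcome for the same animal, keep only the pairs where the outcome falls on or after the intake, and then keep the earliest such outcome for each intake. The tables below are made up for illustration, `intakes_toy`, `outcomes_toy`, `visits`, and `Days in Shelter` are hypothetical names, and the `relationship` argument assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Made-up rows for illustration; animal A1 visits the shelter twice
intakes_toy <- tibble(
  `Animal ID`   = c("A1", "A1", "A2"),
  `Intake Date` = as.Date(c("2014-03-07", "2014-12-19", "2014-04-02"))
)
outcomes_toy <- tibble(
  `Animal ID`    = c("A1", "A1", "A2"),
  `Outcome Date` = as.Date(c("2014-03-08", "2014-12-20", "2014-04-07")),
  `Outcome Type` = c("Return to Owner", "Return to Owner", "Transfer")
)

visits <- intakes_toy %>%
  # Pair every intake with every outcome for the same animal
  # (relationship = "many-to-many" assumes dplyr >= 1.1.0)
  inner_join(outcomes_toy, by = "Animal ID", relationship = "many-to-many") %>%
  # Discard outcomes that happened before the intake
  filter(`Outcome Date` >= `Intake Date`) %>%
  # For each intake, keep only the earliest remaining outcome
  group_by(`Animal ID`, `Intake Date`) %>%
  slice_min(`Outcome Date`, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  mutate(`Days in Shelter` = as.integer(`Outcome Date` - `Intake Date`))

print(visits)
```

Because dates carry no time of day here, two intakes recorded before a single outcome would both be matched to that outcome, so the result would still need a sanity check against the raw records.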
#Bar chart with color fill to show breakdown of Outcome Type#
library(tidyverse)
ggplot(data = outcomes) +
  geom_bar(mapping = aes(x = `Outcome Type`, fill = `Outcome Type`))
More animals have the outcome of “Adopted” than any other outcome. Euthanasia is surprisingly low.
I wanted to see the breakdown of the outcomes by Animal Type, so I created the two graphs below.
#Bar chart by Animal Type#
library(tidyverse)
ggplot(data = outcomes) +
  geom_bar(mapping = aes(x = `Outcome Animal Type`, fill = `Outcome Type`))
#Proportional bar chart by Animal Type#
library(tidyverse)
ggplot(data = outcomes) +
  geom_bar(mapping = aes(x = `Outcome Type`, fill = `Outcome Animal Type`), position = "fill")
This graph supports my initial theory that most of the adoptions at the shelter are dogs and cats. Looking through a few of the individual data points, I found that most of the animals listed in the “Other” Animal Type are wild, such as opossums, foxes, and non-domesticated birds. That explains why this animal type has high percentages in Died, Disposal, and Relocated.
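The proportions that a filled bar chart displays can also be computed as a table, which makes it easier to quote exact figures. The sketch below uses a small made-up table, so the rows and the names `toy_types` and `shares` are assumptions for illustration only.

```r
library(dplyr)

# Made-up rows for illustration; not taken from the Austin data
toy_types <- tibble(
  `Outcome Type`        = c("Adoption", "Adoption", "Adoption", "Died", "Died"),
  `Outcome Animal Type` = c("Dog", "Dog", "Cat", "Other", "Cat")
)

shares <- toy_types %>%
  # Count records for each combination of outcome and animal type
  count(`Outcome Type`, `Outcome Animal Type`) %>%
  # Within each outcome type, convert counts to proportions that sum to 1
  group_by(`Outcome Type`) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

print(shares)
```

Grouping by Outcome Type mirrors a chart with Outcome Type on the x axis; grouping by animal type instead would answer the reverse question of what happens to each kind of animal.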
count(outcomes, `Outcome Animal Type`)
## # A tibble: 5 x 2
## `Outcome Animal Type` n
## <chr> <int>
## 1 Bird 294
## 2 Cat 26020
## 3 Dog 39150
## 4 Livestock 9
## 5 Other 3820
Here we can see that there have been only 9 cases of livestock at the shelter over the past few years, so feeding and housing these animals probably requires a special contingency fund to cover such odd, infrequent expenses.
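To put these counts in perspective, they can be converted to percentages of all outcome records. The sketch below re-enters the counts printed above by hand; `type_counts`, `type_shares`, and `percent` are names chosen for the example.

```r
library(dplyr)

# Counts copied from the count() output above
type_counts <- tibble(
  `Outcome Animal Type` = c("Bird", "Cat", "Dog", "Livestock", "Other"),
  n = c(294L, 26020L, 39150L, 9L, 3820L)
)

type_shares <- type_counts %>%
  # Percentage of all outcome records, rounded to two decimals
  mutate(percent = round(100 * n / sum(n), 2))

print(type_shares)
```

Livestock comes out at roughly 0.01% of all outcome records, which supports treating those costs as an occasional exception rather than a regular budget line.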