Pet Adoption Analysis

When looking to add a new furry friend to the family, adopting instead of buying is encouraged. However, to find a good fit for the home, the new pet owner needs to consider a few things about the animal before starting the adoption process.

Where is the animal coming from? Is it vaccinated? Has it been in a special-needs home before? Is it house trained?

Problem Statement

I would like to analyze the information available to future pet owners who opt for adoption instead of purchase. In addition, I would like to determine in which states adoptable pets are most abundant and how the breeds compare to one another in number.

Implementation

The data was scraped and cleaned for the analysis. It was then reviewed graphically to determine which breed is most abundant, where these breeds are located, and what information is provided to the future pet owner.

Results

Synopsis

Pet Adoption

Packages

#Packages used

library(tidytext)
library(DT)
library(tm)
## Loading required package: NLP
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.3     v purrr   0.3.4
## v tibble  3.0.4     v dplyr   1.0.2
## v tidyr   1.1.2     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.0
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x ggplot2::annotate() masks NLP::annotate()
## x dplyr::filter()     masks stats::filter()
## x dplyr::lag()        masks stats::lag()
library(stringr)
library(magrittr)
## 
## Attaching package: 'magrittr'
## The following object is masked from 'package:purrr':
## 
##     set_names
## The following object is masked from 'package:tidyr':
## 
##     extract
library(leaflet)
library(ggplot2)
library(dplyr)
library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
library(forcats)

1.- Data Exploration

1.1 Summary
1.2 Counting Qualitative Data

Breed

Sex

Male dogs are more common than female dogs among the dogs in shelters for adoption (122 vs. 99).
myfile %>%
  count(sex) %>%
  arrange(desc(n))
##      sex   n
## 1   Male 122
## 2 Female  99
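
Raw counts are easier to compare as shares of the total. The sketch below is a minimal example that rebuilds the two counts printed above as a small table (the name `sex_counts` is introduced here for illustration only) and adds a `share` column. On the real data, the same `mutate()` step could presumably be appended directly after `count(sex)`.

```r
library(dplyr)

# Counts copied from the count(sex) output above (illustrative table)
sex_counts <- tibble(sex = c("Male", "Female"), n = c(122, 99))

# Add each group's share of the total number of dogs
sex_share <- sex_counts %>%
  mutate(share = n / sum(n))

sex_share
```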

Age

Adult dogs are the most abundant group for adoption, ahead of young and baby dogs; senior dogs are the least common.
myfile %>%
  count(age) %>%
  arrange(desc(n))
##      age  n
## 1  Adult 87
## 2  Young 71
## 3   Baby 43
## 4 Senior 20

Size

Extra large dogs are the rarest in shelters (a single dog), whereas small and large dogs are the most abundant.
myfile %>%
  count(size) %>%
  arrange(desc(n))
##          size  n
## 1       Small 94
## 2       Large 72
## 3      Medium 54
## 4 Extra Large  1

2.- Data Wrangling

2.1 Subsets

Subset1

We subset the data to make it easier to understand. subset_1 is a new data frame containing the variables:

  • age
  • sex
  • size
subset_1<- subset(myfile, select = age:size)
head(subset_1)
##      age    sex   size
## 1 Senior   Male Medium
## 2  Adult   Male  Large
## 3  Adult   Male  Large
## 4   Baby Female  Large
## 5  Young   Male  Small
## 6   Baby   Male Medium

Subset2

subset_2<- subset(myfile, select = fixed:declawed)
head(subset_2)
##   fixed house_trained declawed
## 1  TRUE          TRUE       NA
## 2  TRUE          TRUE       NA
## 3  TRUE         FALSE       NA
## 4 FALSE         FALSE       NA
## 5  TRUE         FALSE       NA
## 6  TRUE         FALSE       NA
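
Every `declawed` value shown above is `NA`, so it may be worth checking which columns carry no information before analyzing them. The sketch below is one possible check in base R; it rebuilds only the six rows printed by `head(subset_2)`, and the names `subset_2_head`, `na_counts` and `empty_cols` are introduced here for illustration.

```r
# First six rows of subset_2, copied from the head() output above
subset_2_head <- data.frame(
  fixed = c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE),
  house_trained = c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE),
  declawed = NA
)

# Count missing values per column, then flag columns that are entirely NA
na_counts <- colSums(is.na(subset_2_head))
empty_cols <- names(na_counts)[na_counts == nrow(subset_2_head)]

empty_cols
```

Applied to the full `subset_2`, the same two lines would show whether `declawed` is empty for every dog and can be dropped from later steps.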
2.2 Filter to subsets

Sub1_senior

To understand how many senior dogs are up for adoption, we can filter the subset. It shows 20 senior dogs, about 9% of the 221 dogs in the data.

Sub1_senior <- subset_1 %>%
  filter(age == "Senior")
Sub1_senior
##       age    sex   size
## 1  Senior   Male Medium
## 2  Senior   Male Medium
## 3  Senior Female  Small
## 4  Senior Female  Small
## 5  Senior   Male Medium
## 6  Senior   Male  Large
## 7  Senior Female Medium
## 8  Senior   Male  Small
## 9  Senior   Male  Small
## 10 Senior Female  Small
## 11 Senior Female  Small
## 12 Senior Female  Large
## 13 Senior   Male Medium
## 14 Senior   Male  Large
## 15 Senior   Male  Small
## 16 Senior   Male  Small
## 17 Senior   Male  Small
## 18 Senior   Male  Small
## 19 Senior   Male  Small
## 20 Senior Female  Small
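
Printing all 20 rows makes patterns hard to spot, so a count by sex and size may be a more compact view. The sketch below rebuilds the senior rows from the output above as an illustrative table (`seniors` is a name introduced here); on the real data the same `count()` call could presumably be applied to `Sub1_senior` directly.

```r
library(dplyr)

# Sex and size of the 20 senior dogs, copied row by row from the output above
seniors <- tibble(
  sex = c("Male", "Male", "Female", "Female", "Male",
          "Male", "Female", "Male", "Male", "Female",
          "Female", "Female", "Male", "Male", "Male",
          "Male", "Male", "Male", "Male", "Female"),
  size = c("Medium", "Medium", "Small", "Small", "Medium",
           "Large", "Medium", "Small", "Small", "Small",
           "Small", "Large", "Medium", "Large", "Small",
           "Small", "Small", "Small", "Small", "Small")
)

# One row per sex/size combination instead of one row per dog
senior_counts <- seniors %>%
  count(sex, size)

senior_counts
```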
Factors & Levels

Subset_1

To work with categorical data we need factors and their levels. The package used for this task is forcats.

summary(subset_1)
##      age                sex                size          
##  Length:221         Length:221         Length:221        
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character
subset_1 %>%
  mutate(age = fct_lump(age, n = 50)) %>%
  count(age)
##      age  n
## 1  Adult 87
## 2   Baby 43
## 3 Senior 20
## 4  Young 71
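
Because `age` has only four levels, `fct_lump(age, n = 50)` leaves them unchanged, and the result is listed in alphabetical order. If a life-stage order is more useful, one option is to set the levels explicitly with `fct_relevel()`. The sketch below assumes the order Baby, Young, Adult, Senior and uses an illustrative table (`ages`) rebuilt from the counts printed above.

```r
library(dplyr)
library(forcats)

# Illustrative data rebuilt from the age counts printed above
ages <- tibble(
  age = rep(c("Adult", "Baby", "Senior", "Young"), times = c(87, 43, 20, 71))
)

# Put the levels in an assumed life-stage order, then count
age_ordered <- ages %>%
  mutate(age = fct_relevel(age, "Baby", "Young", "Adult", "Senior")) %>%
  count(age)

age_ordered
```

With the levels set this way, later bar charts of `age` would follow the same order instead of the alphabetical default.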

Subset_2

summary(subset_2)
##    fixed         house_trained   declawed      
##  Mode :logical   Mode :logical   Mode:logical  
##  FALSE:41        FALSE:171       NA's:221      
##  TRUE :180       TRUE :50
subset_2 %>%
  count(fixed, house_trained)
##   fixed house_trained   n
## 1 FALSE         FALSE  39
## 2 FALSE          TRUE   2
## 3  TRUE         FALSE 132
## 4  TRUE          TRUE  48
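
The joint counts can be turned into shares within each `fixed` group, which shows how often fixed and non-fixed dogs are listed as house trained. The sketch below is a minimal example using the four counts printed above; the names `joint` and `share_within_fixed` are introduced here for illustration.

```r
library(dplyr)

# Joint counts copied from the count(fixed, house_trained) output above
joint <- tibble(
  fixed = c(FALSE, FALSE, TRUE, TRUE),
  house_trained = c(FALSE, TRUE, FALSE, TRUE),
  n = c(39, 2, 132, 48)
)

# Share of each house_trained value within the fixed / not fixed groups
house_trained_share <- joint %>%
  group_by(fixed) %>%
  mutate(share_within_fixed = n / sum(n)) %>%
  ungroup()

house_trained_share
```

Read this way, 48 of the 180 fixed dogs are house trained, compared with 2 of the 41 dogs that are not fixed.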

3.- Graphs

Subset_1
ggplot(subset_1, aes(x = age)) + 
  geom_bar() +
  coord_flip()

subset_1 %>%
  mutate(age = fct_infreq(age)) %>%
  ggplot(aes(x = age)) + 
  geom_bar()

ggplot(subset_1, aes(x = age, fill = sex)) + 
  geom_density(col = NA, alpha = 0.55)

Subset_2
ggplot(subset_2, aes(x = fixed)) + 
  geom_bar() +
  coord_flip()

ggplot(subset_2, aes(x = house_trained, fill = fixed)) + 
  geom_density(col = NA, alpha = 0.55)
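
Since `fixed` and `house_trained` are logical rather than continuous, a grouped bar chart may be easier to read than a density curve. The sketch below is one possible alternative, not the plot used above; it draws the joint counts from the earlier `count(fixed, house_trained)` output with `geom_col()`, and the names `counts` and `p` are introduced here for illustration.

```r
library(ggplot2)

# Joint counts copied from the count(fixed, house_trained) output
counts <- data.frame(
  fixed = c(FALSE, FALSE, TRUE, TRUE),
  house_trained = c(FALSE, TRUE, FALSE, TRUE),
  n = c(39, 2, 132, 48)
)

# Side-by-side bars: one pair of bars per house_trained value, split by fixed
p <- ggplot(counts, aes(x = house_trained, y = n, fill = fixed)) +
  geom_col(position = "dodge") +
  labs(x = "House trained", y = "Number of dogs", fill = "Fixed")

p
```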

4.- Conclusion

Pet Adoption

AGE - SEX - SIZE

Thanks to the data wrangling and graphs, we can now see that adult dogs are the largest group available for adoption (87 of 221), followed by young dogs (71). Senior dogs are the least abundant age group (20).

SEX

Male dogs outnumber female dogs (122 vs. 99).

ggplot(subset_2, aes(x = house_trained, fill = fixed)) + 
  geom_density(col = NA, alpha = 0.55)

Subset2

FIXED

Fixed dogs are far more abundant than dogs that are not fixed (180 vs. 41), so adopters should expect a fixed dog by default.

HOUSE TRAINED vs FIXED

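Almost all house-trained dogs are also fixed (48 of 50), but most fixed dogs are not listed as house trained (132 of 180). A dog that is both fixed and house trained means less emotional stress for the animal and the new owner.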