Business is Barking

Dogs come in all shapes, sizes, and personalities, and adoptable dogs are no exception. These dogs need goods and services just as humans do. Nearly every canine bone, leash, bowl, or bed comes to market through an extensive supply chain that is forecasted to meet demand. For example, bigger dogs need bigger beds, bones, collars, and cages. Smaller dogs need smaller products, which take up less retail shelf space and have lower freight costs.

With this data analysis, manufacturers can optimize their supply chain, and pet retailers can become more profitable through more accurate forecasts for canine-specific products. Ultimately, the data analysis is intended to save freight costs, prevent stockouts, and improve customer satisfaction.

____________________________________________________________________________________________________________________________

Introduction

Objective

Our goal is to minimize shortages and reduce the cost of goods sold. This data analysis provides insights on adoptable canines across the country to improve purchasing and logistics, so that manufacturers and retailers get the right products to the right location at the right time.

Focus

This data analysis focuses on descriptive insights from a dataset on adoptable dogs. The data was collected from PetFinder on September 20, 2019, and is distributed through GitHub. This analysis is intended to help canine manufacturers and retailers optimize merchandise in regions with the most demand for specific products and services. Specifically, we focus on three areas: the breed, size, and age of the dog.

  • Breed: Breed can be a great indicator of a dog’s temperament, health concerns, and energy levels. We will evaluate which breeds are most popular and which geographic regions may need to carry more specialty items for specific breeds.
  • Size: Size is an important metric to consider when larger dogs require heavier products that cost more to ship and require more shelf space. We will evaluate which geographic regions require the most inventory for specific dog sizes in order to optimize freight costs.
  • Age: Knowing a dog’s age can help retailers target the best types of health products, toys, and accessories to recommend to a customer. We will evaluate which geographic regions will need to stock for younger and older dogs.

Requirements & Prep

Packages Required
list.of.packages <- c("tidyverse", "scales", "readr", "maps", "DT", "knitr", "rmarkdown", "ggthemes", "plotly", "kableExtra", "tinytex", "ggalluvial", "ggbump")
library(tidyverse)     # core data wrangling and plotting packages (dplyr, ggplot2, tidyr)
library(scales)        # scale functions for visualization
library(readr)         # import delimited data
library(maps)          # geographical data
library(DT)            # interactive tables in HTML
library(knitr)         # dynamic report generation
library(rmarkdown)     # render R Markdown documents into a variety of formats
library(ggthemes)      # consistent theme across the report
library(plotly)        # interactive plotting
library(kableExtra)    # styling tables
library(tinytex)       # compile LaTeX output to PDF
library(ggalluvial)    # alluvial plots of categorical frequency tables
library(ggbump)        # bump charts for plotting rankings

Prep and Import

The project contains data used in The Pudding essay Finding Forever Homes written by Amber Thomas and published in October 2019.

A dataset, dog_descriptions.csv, labeled Adoptable Dogs, was downloaded from GitHub.

Our project uses the Finding Forever Homes data, initially collected from Petfinder.com on all adoptable dogs in the U.S. on a single day: September 20, 2019.

  • The data was originally collected for the Finding Forever Homes essay, which highlights where each state’s adoptable dogs are imported from and why they were relocated. The essay draws conclusions about the benefits and risks of transporting dogs for adoption.

  • The data comes in 3 csv files labeled dog_descriptions.csv, dog_moves.csv, and dog_travel.csv. However, we use only one of these datasets for this project.

    • dog_descriptions.csv has 58,180 entries with 36 variables. Each row represents an individual adoptable dog in the U.S. on September 20, 2019. Each dog has a unique ID number. Unless otherwise noted, all the data is exactly as reported by the shelter or rescue that posted the individual animal for adoption on PetFinder.
  • Missing values are recorded using “NA” in original data sets.

Data is imported from csv as shown below:

dog_descriptions <- readr::read_csv(url("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-12-17/dog_descriptions.csv"))


Data Cleaning

Initial Cleaning

Some values in the original dataset were shifted into the wrong column, so the first cleaning step realigns them. The data was also cleaned by correcting misspellings and misinformation. Finally, we organized the data to make it easier to work with.

After reviewing the variables in the dataset, we removed those that would not contribute to the story of our analysis. Only some of the original 36 variables in dog_descriptions.csv were kept; the columns that were not needed were dropped.
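A minimal sketch of how the column selection could look with dplyr. The toy `raw` table stands in for the imported data, and the source column names (`breed_primary`, `contact_city`, `contact_state`, `contact_zip`) are assumptions based on the public file, not confirmed by this report.

```r
library(dplyr)

# Toy stand-in for the raw import. The breed_primary and contact_* names
# are assumed to match the public file; they are not taken from this report.
raw <- tibble(
  id            = c(1, 2),
  breed_primary = c("Chihuahua", "Boxer"),
  age           = c("Adult", "Baby"),
  size          = c("Small", "Large"),
  contact_city  = c("Las Vegas", "Atlanta"),
  contact_state = c("NV", "GA"),
  contact_zip   = c("89103", "30318"),
  description   = c("Friendly", NA)  # example of a column that is dropped
)

# Keep only the columns needed for the analysis, renaming as we go
dogs <- raw %>%
  select(
    id,
    breed = breed_primary,
    age,
    size,
    city  = contact_city,
    state = contact_state,
    zip   = contact_zip
  )

names(dogs)
```

Renaming inside `select()` keeps the selection and the shorter names in one step, which matches the variable names used in the data dictionary.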

After evaluating the column specifications, it is important to address missing values. To do this, we use the is.na() function to determine which columns contain missing values.
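As an illustration, the missing-value check could be sketched as follows in base R. The small `dogs` table is made-up example data, not the real dataset.

```r
# Made-up example data with a few missing values
dogs <- data.frame(
  id    = c(1, 2, 3),
  breed = c("Boxer", NA, "Hound"),
  size  = c("Large", "Small", NA),
  zip   = c("30318", NA, NA),
  stringsAsFactors = FALSE
)

# is.na() returns a logical matrix; colSums() counts the TRUE values per column
na_counts <- colSums(is.na(dogs))

# Names of the columns that contain at least one missing value
cols_with_na <- names(na_counts)[na_counts > 0]

na_counts
cols_with_na
```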

Data Dictionary

Summary of the Original Variables Kept

| Variable | Type | Description |
|---|---|---|
| id | numeric | Observation identifier |
| breed | character | Type of dog breed |
| age | character | Age of dog |
| size | character | Size of dog |
| city | character | City where dog is located |
| state | character | State where dog is located |
| zip | character | Zip code of where dog is located |
| region | character | United States region where dog is located |

Subsets

For the purposes of our visualizations and analysis, we have divided the United States into eight regions, Region-1 through Region-8, as noted on the map below.

Illustration: Map of the 8 Regions
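One way a region column could be attached to each dog is with a lookup table joined on state. This is only a sketch: the state-to-region assignments below are placeholders for illustration and are not taken from the authors' map.

```r
library(dplyr)

# Hypothetical lookup table: these state-to-region assignments are
# placeholders, not the mapping used in the report.
region_lookup <- tibble(
  state  = c("GA", "NV", "OH"),
  region = c("Region-1", "Region-3", "Region-7")
)

# Made-up example rows
dogs <- tibble(
  id    = 1:4,
  state = c("GA", "OH", "NV", "GA")
)

# left_join() keeps every dog and adds its region; states missing from the
# lookup would get NA, which makes gaps in the mapping easy to spot
dogs_by_region <- dogs %>%
  left_join(region_lookup, by = "state")

dogs_by_region
```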

The dataset is based directly on PetFinder data for a single day, and almost all of its variables are character variables. After reviewing the data, we found no apparent outliers.

Data Analysis

By Breed

Dog Breed
In our analysis, we took a look at which breeds were most popular in the US. Understanding the volume of dogs of a particular breed will allow for more products to be on shelves that are specific to customer needs. For example, hunting dogs may require more toys to run and play with while toy dogs may need more grooming products.

We started by finding the zip code and city with the highest concentration of the most popular breed in the US, the Pit Bull Terrier. This was zip code 30318 in Atlanta, GA, with 242 Pit Bull Terriers.

Table 1: Top Zip Code for Pit Bull Dogs

| breed | city | zip | n |
|---|---|---|---|
| Pit Bull Terrier | Atlanta | 30318 | 242 |

Next, we looked for the zip code and city with the highest concentration of another popular breed in the US, the Chihuahua. We found this breed interesting because of its size. Chihuahuas are too small to use standard dog toys and clothes, so it is important to stock special inventory in places with a high volume of this breed. Las Vegas has the highest number of Chihuahuas, with 54 dogs in zip code 89103.

Table 2: Top Zip Code for Chihuahua Dogs

| breed | city | zip | n |
|---|---|---|---|
| Chihuahua | Las Vegas | 89103 | 54 |
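The lookups behind Tables 1 and 2 could be sketched along these lines. The helper `top_zip_for_breed()` is a hypothetical name introduced here, and the rows are made-up example data.

```r
library(dplyr)

# Made-up example rows using the column names from the data dictionary
dogs <- tibble(
  breed = c("Pit Bull Terrier", "Pit Bull Terrier", "Pit Bull Terrier",
            "Chihuahua", "Chihuahua"),
  city  = c("Atlanta", "Atlanta", "Dayton", "Las Vegas", "Las Vegas"),
  zip   = c("30318", "30318", "45402", "89103", "89103")
)

# Hypothetical helper: the zip code with the most dogs of one breed
top_zip_for_breed <- function(data, target_breed) {
  data %>%
    filter(breed == target_breed) %>%
    count(breed, city, zip, sort = TRUE) %>%  # n = dogs per zip code
    slice_head(n = 1)                         # keep the top row only
}

top_pit_bull <- top_zip_for_breed(dogs, "Pit Bull Terrier")
top_pit_bull
```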

Ranking dogs by breed shows which popular breeds to stock products for and whose needs to research further. In the US, the most popular breed is the Pit Bull Terrier, closely followed by the Labrador Retriever. There were over 7,000 dogs from each of these breeds on PetFinder.

Table 3: Top 10 Most Popular Dog Breeds

| breed | breed_rank | n |
|---|---|---|
| Pit Bull Terrier | 1 | 7890 |
| Labrador Retriever | 2 | 7198 |
| Chihuahua | 3 | 3766 |
| Mixed Breed | 4 | 3242 |
| Terrier | 5 | 2641 |
| Hound | 6 | 2282 |
| German Shepherd Dog | 7 | 2122 |
| Boxer | 8 | 2050 |
| Shepherd | 9 | 1972 |
| American Staffordshire Terrier | 10 | 1862 |
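A ranking like Table 3 could be produced roughly as follows. This is a sketch on made-up rows, and it assumes ties are broken by row order, which may differ from the authors' approach.

```r
library(dplyr)

# Made-up example rows: one row per adoptable dog
dogs <- tibble(
  breed = c(rep("Pit Bull Terrier", 3),
            rep("Labrador Retriever", 2),
            "Chihuahua")
)

breed_ranks <- dogs %>%
  count(breed, sort = TRUE) %>%         # dogs per breed, largest first
  mutate(breed_rank = row_number()) %>% # rank 1 = most common breed
  select(breed, breed_rank, n) %>%
  slice_head(n = 10)                    # keep at most the top 10

breed_ranks
```

If tied breeds should share a rank, `min_rank(desc(n))` could replace `row_number()`.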

By Size

Dog Size
In our analysis, we looked at where dogs of certain sizes were most popular in the US. Manufacturers and retailers could benefit from this by saving money on shipments of heavier, larger goods and distributing only what is needed to the right locations. For example, extra large dogs eat more food and need big crates and toys, while small dogs need small treats and collars.

First, we evaluated which dog sizes were most popular in different regions of the US. We predicted there would be a big difference between regions due to rural vs. urban areas, climate differences, and the need for working dogs. However, we were surprised that, while there were some differences in preference, medium-sized dogs are the most common size in nearly every region. We were also interested to see that extra large dogs are far rarer than other sizes, at under 3% of dogs in every region.

The count of each dog size per region, and its percentage share within the region (prop), is shown here:

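Table 4: Size Popularity by Region

| region name | region | size | n | prop (%) |
|---|---|---|---|---|
| Southeast | Region-1 | Extra Large | 154 | 1.09 |
| Southeast | Region-1 | Large | 3887 | 27.58 |
| Southeast | Region-1 | Medium | 7631 | 54.15 |
| Southeast | Region-1 | Small | 2420 | 17.17 |
| Mid South | Region-2 | Extra Large | 49 | 1.29 |
| Mid South | Region-2 | Large | 1017 | 26.69 |
| Mid South | Region-2 | Medium | 2113 | 55.46 |
| Mid South | Region-2 | Small | 631 | 16.56 |
| Southwest | Region-3 | Extra Large | 151 | 2.94 |
| Southwest | Region-3 | Large | 1329 | 25.84 |
| Southwest | Region-3 | Medium | 2298 | 44.67 |
| Southwest | Region-3 | Small | 1366 | 26.56 |
| West Coast | Region-4 | Extra Large | 25 | 0.96 |
| West Coast | Region-4 | Large | 829 | 32.00 |
| West Coast | Region-4 | Medium | 856 | 33.04 |
| West Coast | Region-4 | Small | 881 | 34.00 |
| Pacific Northwest | Region-5 | Extra Large | 26 | 1.72 |
| Pacific Northwest | Region-5 | Large | 471 | 31.17 |
| Pacific Northwest | Region-5 | Medium | 673 | 44.54 |
| Pacific Northwest | Region-5 | Small | 341 | 22.57 |
| Great Plains | Region-6 | Extra Large | 56 | 1.84 |
| Great Plains | Region-6 | Large | 900 | 29.60 |
| Great Plains | Region-6 | Medium | 1375 | 45.22 |
| Great Plains | Region-6 | Small | 710 | 23.35 |
| Midwest | Region-7 | Extra Large | 164 | 2.05 |
| Midwest | Region-7 | Large | 2402 | 30.00 |
| Midwest | Region-7 | Medium | 4030 | 50.34 |
| Midwest | Region-7 | Small | 1410 | 17.61 |
| Northeast | Region-8 | Extra Large | 306 | 1.53 |
| Northeast | Region-8 | Large | 4926 | 24.65 |
| Northeast | Region-8 | Medium | 10932 | 54.70 |
| Northeast | Region-8 | Small | 3821 | 19.12 |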
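The prop column is each size's share of its region's total. As a sketch, it could be recomputed from the counts like this, using the Region-4 and Region-5 counts from Table 4 as input; the variable names are chosen here for illustration.

```r
library(dplyr)

# Counts copied from Table 4 for two regions
size_counts <- tibble(
  region = rep(c("Region-4", "Region-5"), each = 4),
  size   = rep(c("Extra Large", "Large", "Medium", "Small"), times = 2),
  n      = c(25, 829, 856, 881, 26, 471, 673, 341)
)

size_props <- size_counts %>%
  group_by(region) %>%
  mutate(prop = round(100 * n / sum(n), 2)) %>%  # share within each region
  ungroup()

size_props
```

Grouping by region before dividing means each region's four shares sum to roughly 100, so regions of very different sizes can be compared directly.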

After counting the dogs of each size, we compared the percentage of each dog size within each region. The West Coast shows the least preference for any one size: large, medium, and small dogs each make up roughly a third of its dogs (32.00%, 33.04%, and 34.00%).

From here, we looked at our own state of Ohio to understand how many dogs there were of each size. In Ohio, most dogs are medium-sized.

Table 5: Ohio Dog Size Preference

| size | count_by_size |
|---|---|
| Extra Large | 67 |
| Large | 763 |
| Medium | 1321 |
| Small | 522 |

We also wanted to see which state preferred small dogs most. This was Utah, where 300 dogs, or 61.86% of the state's dogs, were small. This may be due to a warmer climate in some parts of the state. It would be important for a pet store in Utah to carry more small dog merchandise than stores in other states.

Table 6: State that Prefers Small Dogs the Most

| state | size | n | percent_small |
|---|---|---|---|
| UT | Small | 300 | 61.86 |

We then looked at which state preferred large dogs most. This was South Dakota, where 15 dogs, or 62.50% of the state's dogs, were large. This may be due to the need for working dogs or to rural areas with plenty of land for large dogs. Note that these figures imply only 24 dogs in South Dakota in total (15 / 0.625 = 24), so this percentage rests on a small sample. With that caveat, a pet store in South Dakota may need to carry more large dog merchandise than stores in other states.

Table 7: State that Prefers Large Dogs the Most

| state | size | n | percent_large |
|---|---|---|---|
| SD | Large | 15 | 62.5 |
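The state-level comparison in Tables 6 and 7 could be sketched as below. The helper `top_state_for_size()` is a hypothetical name, and the rows are made-up example data rather than the real counts.

```r
library(dplyr)

# Made-up example rows: one row per adoptable dog
dogs <- tibble(
  state = c("UT", "UT", "UT", "OH", "OH", "OH", "OH"),
  size  = c("Small", "Small", "Large", "Small", "Medium", "Medium", "Large")
)

# Hypothetical helper: the state where one size makes up the largest share
top_state_for_size <- function(data, target_size) {
  data %>%
    count(state, size) %>%                            # dogs per state and size
    group_by(state) %>%
    mutate(percent = round(100 * n / sum(n), 2)) %>%  # share within the state
    ungroup() %>%
    filter(size == target_size) %>%
    slice_max(percent, n = 1)                         # highest share wins
}

top_small <- top_state_for_size(dogs, "Small")
top_small
```

Because the ranking uses a percentage rather than a count, a state with very few dogs can come out on top, so it may be worth also reporting each state's total.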

By Age

Dog Age
We looked at where dogs of certain ages were most common in the US. Dogs have evolving needs throughout their lives that owners must prepare for. Therefore, manufacturers and retailers earn more customer trust when they are able to provide products for dogs of all ages. For example, puppies need training tools like clickers and harnesses, while elderly dogs need medical products.

We analyzed the zip codes where puppies were most commonly found in the US. These concentrations could be due to breeders or puppy mills in the area. Either way, they will translate to more demand for puppy supplies.

Table 8: Zip Codes with the Most Puppies

| zip | city | state | age | n |
|---|---|---|---|---|
| 80126 | Littleton | CO | Baby | 87 |
| 11558 | Island Park | NY | Baby | 84 |
| 35051 | Columbiana | AL | Baby | 78 |
| 01810 | Andover | MA | Baby | 77 |
| 29607 | Greenville | SC | Baby | 77 |
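A table like this could be built by filtering on the age category and counting per zip code. The following is a sketch on made-up rows; it assumes puppies are recorded as "Baby" in the age column, as the table above suggests.

```r
library(dplyr)

# Made-up example rows: one row per adoptable dog
dogs <- tibble(
  zip   = c("80126", "80126", "80126", "11558", "11558", "35051", "80126"),
  city  = c("Littleton", "Littleton", "Littleton", "Island Park",
            "Island Park", "Columbiana", "Littleton"),
  state = c("CO", "CO", "CO", "NY", "NY", "AL", "CO"),
  age   = c("Baby", "Baby", "Baby", "Baby", "Baby", "Baby", "Adult")
)

puppy_zips <- dogs %>%
  filter(age == "Baby") %>%                        # puppies only
  count(zip, city, state, age, sort = TRUE) %>%    # puppies per zip code
  slice_head(n = 5)                                # top 5 zip codes

puppy_zips
```

Keeping zip codes as character strings preserves leading zeros, such as the one in 01810.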

We analyzed the cities where aging dogs were most commonly found in the US. These concentrations could be because there are more families or elderly people in these areas. Either way, they will translate to more demand for elderly dog supplies.

Table 9: Cities with the Most Elderly Dogs

| city | state | age | n |
|---|---|---|---|
| Kanab | UT | Senior | 86 |
| Las Vegas | NV | Senior | 75 |
| Phoenix | AZ | Senior | 74 |
| Chamblee | GA | Senior | 59 |
| New York | NY | Senior | 55 |

Finally, we evaluated which dog age was most common by percentage in each region of the US. As expected, we found adult dogs to be the most common age in all regions. This makes sense as the range of adult years spans most of a dog’s life. Items related to adult dogs will have the most need in all pet stores.

Conclusions

Summary of Our Findings

After completing our data analysis, we believe market research is crucial for companies to be able to source the right products for their customers. In the US, canine manufacturers and retailers should focus mainly on items catered to medium-sized, adult dogs. They should also keep specialty items for Pit Bull Terriers and Labrador Retrievers. Pet stores should take into account that many factors can affect the outcomes of a data analysis based on just one day of data, so we recommend they continue the research with similar PetFinder data over time. It is also important to note that there are many underlying reasons for the results of our data analysis that may not be explicitly shown in the research we have presented. Finding other related data sources in the future can help answer questions about the outcomes of our analysis and give more context. Overall, canine manufacturers and suppliers can gain a better understanding of the products needed most from our research, and can reduce cost while keeping the best selection for their customers.
