Import data

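# Load readr, dplyr, and ggplot2 (all attached by the tidyverse)
library(tidyverse)

# Read the data set; the path is relative to this document
data <- read_csv("../00_data/mydata.csv")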
## New names:
## • `` -> `...1`
## Rows: 29787 Columns: 23
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (18): animal_id, animal_name, animal_type, primary_color, secondary_colo...
## dbl  (3): ...1, latitude, longitude
## lgl  (2): outcome_is_dead, was_outcome_alive
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Introduction

Questions

Variation

Visualizing distributions

ggplot(data = data) +
    geom_bar(mapping = aes(x = animal_type))

data %>% count(animal_type)
## # A tibble: 10 × 2
##    animal_type     n
##    <chr>       <int>
##  1 amphibian       3
##  2 bird         2075
##  3 cat         14145
##  4 dog          9768
##  5 guinea pig    172
##  6 livestock      10
##  7 other        1332
##  8 rabbit        526
##  9 reptile       344
## 10 wild         1412

data %>%
    count(animal_type) %>%
    ggplot(aes(x = animal_type, y = n)) +
    geom_col()
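
The bar charts above list animal types alphabetically, which makes the ranking hard to read when the counts span from 3 to over 14,000. The sketch below is one possible variation, not part of the original analysis: it re-enters the counts printed above by hand (so it runs without the CSV file) and reorders the bars by count with `fct_reorder()`. The names `animal_counts`, `ordered_counts`, and `p` are illustrative.

```r
library(tidyverse)

# Counts copied by hand from the count(animal_type) output above
animal_counts <- tribble(
  ~animal_type, ~n,
  "amphibian",      3,
  "bird",        2075,
  "cat",        14145,
  "dog",         9768,
  "guinea pig",   172,
  "livestock",     10,
  "other",       1332,
  "rabbit",       526,
  "reptile",      344,
  "wild",        1412
)

# Turn animal_type into a factor whose levels are sorted by count (ascending)
ordered_counts <- animal_counts %>%
  mutate(animal_type = fct_reorder(animal_type, n))

# Horizontal bars: the most common type ends up at the top of the plot
p <- ggplot(ordered_counts, aes(x = n, y = animal_type)) +
  geom_col()
```

Printing `p` draws the chart. With the full data set loaded, the same reordering could be applied to `data %>% count(animal_type)` in place of the hand-entered table.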

Typical values

Unusual values

Missing values

Covariation

A categorical and continuous variable

Two categorical variables

Two continuous variables

Patterns and models