Import data
library(tidyverse) # provides read_csv(), ggplot(), count(), and %>%

data <- read_csv("../00_data/mydata.csv")
## New names:
## • `` -> `...1`
## Rows: 29787 Columns: 23
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (18): animal_id, animal_name, animal_type, primary_color, secondary_colo...
## dbl  (3): ...1, latitude, longitude
## lgl  (2): outcome_is_dead, was_outcome_alive
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Introduction
Questions
Variation
Visualizing distributions
ggplot(data = data) +
  geom_bar(mapping = aes(x = animal_type))

data %>% count(animal_type)
## # A tibble: 10 × 2
##    animal_type     n
##    <chr>       <int>
##  1 amphibian       3
##  2 bird         2075
##  3 cat         14145
##  4 dog          9768
##  5 guinea pig    172
##  6 livestock      10
##  7 other        1332
##  8 rabbit        526
##  9 reptile       344
## 10 wild         1412
data %>%
  count(animal_type) %>%
  ggplot(aes(x = animal_type, y = n)) +
  geom_col()
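
With ten categories of very different sizes, the bars are easier to compare when sorted by count and accompanied by each type's share of the total. The following is a minimal sketch, not part of the original analysis: it rebuilds the counts from the `count(animal_type)` output shown above so that it runs without the CSV file, and the names `animal_counts` and `animal_shares` are illustrative assumptions.

```r
library(tidyverse)

# Counts copied from the count(animal_type) output above, so this sketch
# runs without the original CSV file.
animal_counts <- tibble(
  animal_type = c("amphibian", "bird", "cat", "dog", "guinea pig",
                  "livestock", "other", "rabbit", "reptile", "wild"),
  n = c(3L, 2075L, 14145L, 9768L, 172L, 10L, 1332L, 526L, 344L, 1412L)
)

# Add each type's share of all records and order the factor levels by count,
# so the bars run from least to most common.
animal_shares <- animal_counts %>%
  mutate(
    share = n / sum(n),
    animal_type = fct_reorder(animal_type, n)
  )

# Horizontal bars keep the category labels readable.
ggplot(animal_shares, aes(x = n, y = animal_type)) +
  geom_col() +
  labs(x = "Number of records", y = "Animal type")
```

With the real data, the same ordering could be applied directly with `data %>% count(animal_type) %>% mutate(animal_type = fct_reorder(animal_type, n))` before plotting. In this table, cats and dogs together account for roughly 80% of the records.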

Typical values
Unusual values
Missing values
Covariation
A categorical and continuous variable
Two categorical variables
Two continuous variables
Patterns and models