Import Data

library(tidyverse)
plants <- read_csv("../00_data/plants.csv")
## Rows: 500 Columns: 24
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (6): binomial_name, country, continent, group, year_last_seen, red_list...
## dbl (18): threat_AA, threat_BRU, threat_RCD, threat_ISGD, threat_EPM, threat...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
plants
## # A tibble: 500 × 24
##    binomial_name     country continent group year_last_seen threat_AA threat_BRU
##    <chr>             <chr>   <chr>     <chr> <chr>              <dbl>      <dbl>
##  1 Abutilon pitcair… Pitcai… Oceania   Flow… 2000-2020              0          0
##  2 Acaena exigua     United… North Am… Flow… 1980-1999              0          0
##  3 Acalypha dikuluw… Congo   Africa    Flow… 1940-1959              0          0
##  4 Acalypha rubrine… Saint … Africa    Flow… Before 1900            1          1
##  5 Acalypha wilderi  Cook I… Oceania   Flow… 1920-1939              1          0
##  6 Acer hilaense     China   Asia      Flow… 1920-1939              0          0
##  7 Achyranthes atol… United… North Am… Flow… 1960-1979              0          0
##  8 Adenophorus peri… United… North Am… Fern… 2000-2020              0          0
##  9 Adiantum lianxia… China   Asia      Fern… 2000-2020              0          0
## 10 Aechmea cymosopa… Venezu… South Am… Flow… Before 1900            1          1
## # ℹ 490 more rows
## # ℹ 17 more variables: threat_RCD <dbl>, threat_ISGD <dbl>, threat_EPM <dbl>,
## #   threat_CC <dbl>, threat_HID <dbl>, threat_P <dbl>, threat_TS <dbl>,
## #   threat_NSM <dbl>, threat_GE <dbl>, threat_NA <dbl>, action_LWP <dbl>,
## #   action_SM <dbl>, action_LP <dbl>, action_RM <dbl>, action_EA <dbl>,
## #   action_NA <dbl>, red_list_category <chr>
threats <- read_csv("../00_data/threats.csv")
## Rows: 6000 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (7): binomial_name, country, continent, group, year_last_seen, red_list_...
## dbl (1): threatened
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
threats
## # A tibble: 6,000 × 8
##    binomial_name        country continent group year_last_seen red_list_category
##    <chr>                <chr>   <chr>     <chr> <chr>          <chr>            
##  1 Abutilon pitcairnen… Pitcai… Oceania   Flow… 2000-2020      Extinct in the W…
##  2 Abutilon pitcairnen… Pitcai… Oceania   Flow… 2000-2020      Extinct in the W…
##  3 Abutilon pitcairnen… Pitcai… Oceania   Flow… 2000-2020      Extinct in the W…
##  4 Abutilon pitcairnen… Pitcai… Oceania   Flow… 2000-2020      Extinct in the W…
##  5 Abutilon pitcairnen… Pitcai… Oceania   Flow… 2000-2020      Extinct in the W…
##  6 Abutilon pitcairnen… Pitcai… Oceania   Flow… 2000-2020      Extinct in the W…
##  7 Abutilon pitcairnen… Pitcai… Oceania   Flow… 2000-2020      Extinct in the W…
##  8 Abutilon pitcairnen… Pitcai… Oceania   Flow… 2000-2020      Extinct in the W…
##  9 Abutilon pitcairnen… Pitcai… Oceania   Flow… 2000-2020      Extinct in the W…
## 10 Abutilon pitcairnen… Pitcai… Oceania   Flow… 2000-2020      Extinct in the W…
## # ℹ 5,990 more rows
## # ℹ 2 more variables: threat_type <chr>, threatened <dbl>

Introduction

Questions

Variation

Visualizing distributions

plants %>%
    ggplot(aes(x = continent, fill = group)) +
    geom_bar()

Typical values

plants %>%
    filter(continent == "Africa") %>%
    ggplot(aes(x = group)) +
    geom_bar()

Unusual values

plants %>%
    filter(group %in% c("Algae", "Conifer", "Cycad", "Ferns and Allies", "Mosses")) %>%
    ggplot(aes(x = continent, fill = group)) +
    geom_bar()

Missing Values

plants %>%
    mutate(group = ifelse(group == "Conifer", NA, group)) %>%
    ggplot(aes(x = continent, fill = group)) + 
    geom_bar()
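
To see how many values the recoding above turns into `NA`, a count can sit alongside the plot. The following is only a minimal sketch: `plants_toy` is a hypothetical stand-in for `plants` with made-up rows, and `plants_na` and `na_summary` are illustrative names that do not appear elsewhere in this report.

```r
library(dplyr)
library(tibble)

# Hypothetical toy rows standing in for `plants`; values are illustrative only
plants_toy <- tibble(
    continent = c("Africa", "Asia", "Asia", "Oceania"),
    group     = c("Cycad", "Conifer", "Mosses", "Conifer")
)

# Recode "Conifer" as missing, mirroring the chunk above
plants_na <- plants_toy %>%
    mutate(group = if_else(group == "Conifer", NA_character_, group))

# Count missing and non-missing group values per continent
na_summary <- plants_na %>%
    group_by(continent) %>%
    summarise(
        n_missing = sum(is.na(group)),
        n_present = sum(!is.na(group))
    )

na_summary
```

Swapping `plants_toy` for `plants` should give the per-continent number of rows that the bar chart draws in the `NA` fill category.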

Covariation

A categorical and continuous variable

N/A

Two categorical variables

ggplot(data = plants) +
  geom_count(mapping = aes(x = continent, y = group))
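
An alternative view of the same two categorical variables is to count each combination explicitly and draw the counts as tiles. This is a sketch under stated assumptions, not part of the original analysis: `plants_toy` is a hypothetical stand-in for `plants` with made-up rows, and `group_counts` is an illustrative name.

```r
library(dplyr)
library(ggplot2)
library(tibble)

# Hypothetical toy rows standing in for `plants`; values are illustrative only
plants_toy <- tibble(
    continent = c("Africa", "Africa", "Asia", "Oceania", "Oceania", "Oceania"),
    group     = c("Cycad", "Cycad", "Conifer", "Mosses", "Mosses", "Conifer")
)

# Count the rows in each continent/group combination
group_counts <- plants_toy %>%
    count(continent, group)

# Draw one tile per combination, shaded by its count
tile_plot <- group_counts %>%
    ggplot(aes(x = continent, y = group, fill = n)) +
    geom_tile()

tile_plot
```

The intermediate `group_counts` table gives the exact numbers that `geom_count()` encodes as point size, which can be easier to read when some combinations are rare.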

Two continuous variables

N/A

Patterns and models

I do not see any clear patterns in the plots above. If you have ideas for further comparisons, I am happy to add more.