Import data

# csv file
mydata <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2025/2025-03-04/longbeach.csv')
mydata

## # A tibble: 29,787 × 22
##    animal_id animal_name animal_type primary_color secondary_color sex     
##    <chr>     <chr>       <chr>       <chr>         <chr>           <chr>   
##  1 A693708   *charlien   dog         white         <NA>            Female  
##  2 A708149   <NA>        reptile     brown         green           Unknown 
##  3 A638068   <NA>        bird        green         red             Unknown 
##  4 A639310   <NA>        bird        white         gray            Unknown 
##  5 A618968   *morgan     cat         black         white           Female  
##  6 A730385   *brandon    rabbit      black         white           Neutered
##  7 A646202   <NA>        bird        black         <NA>            Unknown 
##  8 A628138   <NA>        other       gray          black           Unknown 
##  9 A597464   <NA>        cat         black         <NA>            Unknown 
## 10 A734321   sophie      dog         cream         <NA>            Spayed  
## # ℹ 29,777 more rows
## # ℹ 16 more variables: dob <date>, intake_date <date>, intake_condition <chr>,
## #   intake_type <chr>, intake_subtype <chr>, reason_for_intake <chr>,
## #   outcome_date <date>, crossing <chr>, jurisdiction <chr>,
## #   outcome_type <chr>, outcome_subtype <chr>, latitude <dbl>, longitude <dbl>,
## #   outcome_is_dead <lgl>, was_outcome_alive <lgl>, geopoint <chr>

State one question

Has the number of adoptions increased over the years?

Plot data

library(tidyverse)  # attaches dplyr (%>%, count, filter) and ggplot2

mydata %>%
    # Count the frequency of combinations of 'outcome_date' and 'outcome_type'
    # and arrange the results in descending order by the count (n).
    count(outcome_date, outcome_type, sort = TRUE) %>% 
    # Filter the data to keep only rows where the 'outcome_type' is exactly "adoption".
    filter(outcome_type == "adoption") %>% 
    
    # Group the daily counts into yearly totals for time series analysis.
    # - '.date_var = outcome_date': Specifies the column containing the date information.
    # - '.by = "year"': Aggregates the data to a yearly frequency.
    # - 'n = sum(n)': Calculates the total number of adoptions (the original 'n' column)
    #   within each year and assigns it to the new 'n' column.
    timetk::summarise_by_time(.date_var = outcome_date, .by = "year", n = sum(n)) %>%

    # Plot
    ggplot(aes(outcome_date, n)) + 
    geom_line()
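
The plot above shows yearly adoption counts, which can rise simply because the shelter handled more animals. A minimal sketch of computing the adoption share of all outcomes per year is given below. It runs on a small made-up data frame (`toy`, a hypothetical stand-in for `mydata`) so it is self-contained; the column names `outcome_date` and `outcome_type` match the imported data, but the values are invented for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for `mydata`; values are made up.
toy <- data.frame(
    outcome_date = as.Date(c("2020-03-01", "2020-07-15", "2020-11-02",
                             "2021-01-20", "2021-05-05", "2021-09-30")),
    outcome_type = c("adoption", "euthanasia", "transfer",
                     "adoption", "adoption", "return to owner")
)

adoption_share <- toy %>%
    # Drop rows with no recorded outcome so they do not distort the share.
    filter(!is.na(outcome_date), !is.na(outcome_type)) %>%
    # Extract the calendar year from the outcome date.
    mutate(year = as.integer(format(outcome_date, "%Y"))) %>%
    group_by(year) %>%
    summarise(
        adoptions = sum(outcome_type == "adoption"),  # adoptions in that year
        outcomes  = n(),                              # all outcomes in that year
        share     = adoptions / outcomes              # adoption share
    )

adoption_share
```

Replacing `toy` with `mydata` should give the yearly share for the real data, assuming the outcome labels are spelled as in the filter above.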

Interpret

An upward-sloping line in the plot above would be preliminary evidence that the yearly number of adoptions has increased over time; a flat or falling line would not support that claim.
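
One way to go beyond reading a plot by eye is to fit a simple linear trend to yearly adoption counts and look at the sign of the slope. The sketch below assumes a data frame with one row per year; `yearly` and its numbers are hypothetical and made up for illustration, not taken from the shelter data.

```r
# Hypothetical yearly adoption counts; made up for illustration only.
yearly <- data.frame(
    year = 2017:2022,
    n    = c(100, 120, 130, 125, 150, 170)
)

# Fit a straight line: adoptions as a function of year.
fit <- lm(n ~ year, data = yearly)

# The slope estimates the average change in adoptions per year.
slope <- unname(coef(fit)["year"])
slope
```

A positive slope is consistent with an increase, but with only a handful of yearly points this is a rough check rather than a formal test.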

Module 5: Apply 4

Lilli Warnock
