We are going to read in the military active status and FAOSTAT datasets. The FAOSTAT dataset contains information such as regions and group codes, while the SNL actors
## Rows: 46 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (5): sid, year, first_epid, last_epid, n_episodes
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## Rows: 614 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): aid
## dbl (5): sid, first_epid, last_epid, n_episodes, season_fraction
## lgl (2): featured, update_anchor
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
The seasons and casts datasets contain data about SNL seasons and cast members, respectively, over the years. There isn’t much to mutate or tidy in these two datasets.
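One tidying step that may still be worth considering: the `first_epid` and `last_epid` values look like dates encoded as `YYYYMMDD` numbers (an assumption based on values such as 19751011), so subtracting them directly does not give a number of days. A minimal sketch of converting them to real dates, using a small hypothetical `toy_seasons` tibble rather than the actual data:

```r
library(dplyr)

# Hypothetical stand-in for snl_seasons; the two rows reuse values shown in this report.
toy_seasons <- tibble(
  sid        = c(1, 2),
  first_epid = c(19751011, 19760918),
  last_epid  = c(19760731, 19770521)
)

# Assumption: epid values encode dates as YYYYMMDD.
toy_seasons <- toy_seasons %>%
  mutate(
    first_date  = as.Date(as.character(first_epid), format = "%Y%m%d"),
    last_date   = as.Date(as.character(last_epid), format = "%Y%m%d"),
    # Length of the season in calendar days
    days_on_air = as.numeric(difftime(last_date, first_date, units = "days"))
  )

toy_seasons
```

If the encoding assumption holds, `days_on_air` is a meaningful duration, whereas `last_epid - first_epid` on the raw numbers is not.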
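# Average of (last_epid - first_epid) over cast rows where both values are present.
# If no rows survive the filter, mean() of an empty vector returns NaN.
valid_tenure <- snl_casts %>%
  filter(!is.na(first_epid), !is.na(last_epid)) %>%
  summarize(avg_tenure = mean(last_epid - first_epid, na.rm = TRUE))
# Attach the single-row summary to every season row
seasons_tenure <- cross_join(snl_seasons, valid_tenure)
seasons_tenure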
## # A tibble: 46 × 6
##      sid  year first_epid last_epid n_episodes avg_tenure
##    <dbl> <dbl>      <dbl>     <dbl>      <dbl>      <dbl>
##  1     1  1975   19751011  19760731         24        NaN
##  2     2  1976   19760918  19770521         22        NaN
##  3     3  1977   19770924  19780520         20        NaN
##  4     4  1978   19781007  19790526         20        NaN
##  5     5  1979   19791013  19800524         20        NaN
##  6     6  1980   19801115  19810411         13        NaN
##  7     7  1981   19811003  19820522         20        NaN
##  8     8  1982   19820925  19830514         20        NaN
##  9     9  1983   19831008  19840512         19        NaN
## 10    10  1984   19841006  19850413         17        NaN
## # ℹ 36 more rows
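The `NaN` in `avg_tenure` most likely means that no cast row has both `first_epid` and `last_epid` filled in, so the filter leaves zero rows and `mean()` is taken over an empty vector. The sketch below reproduces that behavior on a hypothetical `toy_casts` tibble (made-up values, not the real data) and shows one possible alternative: counting the distinct seasons each cast member appears in.

```r
library(dplyr)

# Hypothetical stand-in for snl_casts: every row is missing
# at least one of first_epid / last_epid.
toy_casts <- tibble(
  aid        = c("A", "A", "B", "C"),
  sid        = c(1, 2, 1, 2),
  first_epid = c(NA, NA, 19751011, NA),
  last_epid  = c(NA, NA, NA, 19770521)
)

# Same pattern as the pipeline above: the filter drops every row,
# so the mean is computed over an empty vector.
toy_tenure <- toy_casts %>%
  filter(!is.na(first_epid), !is.na(last_epid)) %>%
  summarize(n_rows = n(), avg_tenure = mean(last_epid - first_epid))

# Alternative sketch: tenure measured as seasons per cast member.
toy_seasons_per_actor <- toy_casts %>%
  group_by(aid) %>%
  summarize(n_seasons = n_distinct(sid))

toy_tenure
toy_seasons_per_actor
```

Adding `n_rows = n()` to the summary is a cheap way to check how many rows the average is actually based on before joining it onto the seasons table.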