## Synopsis
As part of #TidyTuesday, in this RPubs post I try to depict my subjective view of the relationship between the resilience of the cast and the ratings performance of the show.
The data is in .csv format and can be read directly from the links below. I only need the “episodes” and “survivalists” data sets. The required libraries are ggplot2 and dplyr (readr is called through its namespace to read the files).
episodes <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-01-24/episodes.csv')
survivalists <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-01-24/survivalists.csv')
library(ggplot2); library(dplyr)
The survivalists dataset gives detailed information about each competitor. What we need here is:
survivalists %>%
  select(season, name, gender, days_lasted)
## # A tibble: 94 × 4
## season name gender days_lasted
## <dbl> <chr> <chr> <dbl>
## 1 1 Alan Kay Male 56
## 2 1 Sam Larson Male 55
## 3 1 Mitch Mitchell Male 43
## 4 1 Lucas Miller Male 39
## 5 1 Dustin Feher Male 8
## 6 1 Brant McGee Male 6
## 7 1 Wayne Russell Male 4
## 8 1 Joe Robinet Male 4
## 9 1 Chris Weatherman Male 1
## 10 1 Josh Chavez Male 0
## # ℹ 84 more rows
The boxplot below shows the distribution of the competitors' resilience (i.e. days lasted in the season) for each season, split by gender.
survivalists %>%
  ggplot(aes(x = days_lasted, y = season, fill = gender)) +
  # one horizontal box per season; the group aesthetic makes group_by() unnecessary
  geom_boxplot(aes(group = season), orientation = "y") +
  facet_wrap(~gender) +
  labs(title = "Casting resilience of each season") +
  scale_y_continuous(breaks = seq(min(survivalists$season), max(survivalists$season), by = 1)) +
  theme(text = element_text(size = 8))
Who says women can't survive alone in the wilderness! Here is what I saw in this graph:
The first season: there were only male competitors, and their
resilience was very poor: the majority of them did not last more than a
week or so. This means that within the first or second episode most of the
competitors were gone, and the show carried on with a handful of
competitors.
2nd season: casting decided to include women in the show this time,
yet their selection of women was like their selection of men in the 1st
season, as many women competitors got eliminated in the first week.
Meanwhile casting did a good job of selecting much fitter male competitors,
as they lasted much longer than those in the first season.
3rd season: This season is where casting decided to raise the
resilience bar for both male and female competitors, as their
resilience was among the highest in the history of the show. I believe
casting was on a learning curve in the selection of competitors and, by
the start of the 3rd season, they knew whom to cast.
4th season: there is something very, very wrong with this season, as if they
had sacked the casting people entirely and done the job with a new team. There
were very few women competitors this time, and the average resilience
of the male competitors was not comparable to that of the previous two
seasons.
In the 5th and 6th seasons, the number of women participants increased,
though males tended to last longer. The 7th season was when casting
picked one of the most resilient casts in the history of the Alone TV series,
and females on average were close to overtaking males. Yet the
male outlier with 100 days of survival broke the record and won the
season.
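The season-by-season reading above comes from eyeballing the boxplots. A rough way to put numbers behind it is to tabulate, per season and gender, how many contestants there were and how long they lasted. The sketch below is only an illustration: `summarise_resilience()` is a hypothetical helper that is not part of the original analysis, and `toy_survivalists` is made-up data used purely to show the shape of the output. It assumes the input has the `season`, `gender` and `days_lasted` columns shown earlier.

```r
library(dplyr)

# Hypothetical helper (not from the original post): contestant count,
# median and maximum days lasted for each season/gender combination.
summarise_resilience <- function(data) {
  data %>%
    group_by(season, gender) %>%
    summarise(
      contestants = n(),
      median_days = median(days_lasted, na.rm = TRUE),
      max_days    = max(days_lasted, na.rm = TRUE),
      .groups     = "drop"
    ) %>%
    arrange(season, gender)
}

# Toy rows, invented only to demonstrate the output; NOT the real data.
toy_survivalists <- tibble(
  season      = c(1, 1, 1, 2, 2, 2),
  gender      = c("Male", "Male", "Male", "Female", "Male", "Male"),
  days_lasted = c(56, 8, 0, 6, 66, 10)
)

resilience_summary <- summarise_resilience(toy_survivalists)
print(resilience_summary)
```

Calling `summarise_resilience(survivalists)` on the data loaded at the top should give the real per-season figures. The median is used rather than the mean so that a single long-lasting outlier does not dominate a season.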
The episodes dataset contains audience figures for all seasons except the 9th. I picked only the season, episode and viewers columns from it. The viewers column gives the number of people, in millions, who watched each episode.
episodes %>% select(season, episode, viewers)
## # A tibble: 98 × 3
## season episode viewers
## <dbl> <dbl> <dbl>
## 1 1 1 1.58
## 2 1 2 1.70
## 3 1 3 1.86
## 4 1 4 2.08
## 5 1 5 2.08
## 6 1 6 2.18
## 7 1 7 2.09
## 8 1 8 NA
## 9 1 9 1.80
## 10 1 10 1.94
## # ℹ 88 more rows
Below is the ratings performance of each season.
episodes %>%
  ggplot(aes(x = episode, y = viewers)) +
  geom_line() +
  geom_smooth(method = "lm") +
  facet_wrap(~season) +
  scale_x_continuous(breaks = seq(1, 13, by = 1)) +
  scale_y_continuous(breaks = seq(1, 2.5, by = 0.5)) +
  theme(text = element_text(size = 8)) +
  labs(title = "Audience performance of each season", y = "viewers (million)")
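The straight line that `geom_smooth()` draws in each panel is a simple linear fit of viewers on episode number. To compare seasons without reading slopes off the plot, the same fit can be computed per season. This is a minimal sketch under assumptions: `season_trends()` is a hypothetical helper, not something from the original analysis, and `toy_episodes` is invented data. Episodes with a missing `viewers` value are dropped before fitting.

```r
library(dplyr)

# Hypothetical helper (not from the original post): slope of a linear fit
# of viewers on episode number, one row per season.
season_trends <- function(data) {
  data %>%
    filter(!is.na(viewers)) %>%               # skip episodes without a rating
    group_by(season) %>%
    summarise(
      episodes_rated = n(),
      slope = coef(lm(viewers ~ episode))[["episode"]],
      .groups = "drop"
    )
}

# Toy rows, invented only to demonstrate the output; NOT the real ratings.
toy_episodes <- tibble(
  season  = c(1, 1, 1, 1, 2, 2, 2),
  episode = c(1, 2, 3, 4, 1, 2, 3),
  viewers = c(1.0, 1.2, NA, 1.6, 2.0, 1.8, 1.6)
)

trends <- season_trends(toy_episodes)
print(trends)
```

A positive slope means the audience grew over the season and a negative one means it shrank, in millions of viewers per episode. On the real data, `season_trends(episodes)` would give one slope per season to set against the panels above.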
If we read these time series alongside the resilience performance of each season, there are some interesting deductions:
1st season: the first cut is the deepest, so, just like many shows, the show opened with huge interest, and its popularity kept growing until the finale.
2nd season: neither the inclusion of weak female competitors nor the better selection of male competitors helped the ratings of the show.
3rd season: this is where casting learned to pick the best ingredients for the show: a handful of super-competitive females who lasted until the finale, and one of the best male casts, with the best average resilience in the show’s history.
4th season: This is where the casting people got lost in the woods: just a couple of female competitors and a terrible group of male competitors. With the resilience bar having risen over the first 3 seasons, this was not what the audience was expecting from the show. The ratings, and hence the popularity of the show, started to erode.
5th season and onwards: from here on the casting people started to raise the quality of the competitors, yet thanks to the blow in the 4th season and the natural aging of a TV series, the show failed to draw an average audience above 1.5 million.
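The deductions above pair two season-level readings: how long the cast lasted and how many people watched. One way to line them up is to reduce each dataset to one row per season and join on `season`. The sketch below is an illustration only: `compare_resilience_and_ratings()` is a hypothetical helper, and both toy tables are invented. Seasons with no usable viewer figures are dropped.

```r
library(dplyr)

# Hypothetical helper (not from the original post): one row per season with
# the median days lasted and the mean viewers (in millions).
compare_resilience_and_ratings <- function(survivalists, episodes) {
  resilience <- survivalists %>%
    group_by(season) %>%
    summarise(median_days = median(days_lasted, na.rm = TRUE), .groups = "drop")

  ratings <- episodes %>%
    group_by(season) %>%
    summarise(mean_viewers = mean(viewers, na.rm = TRUE), .groups = "drop")

  inner_join(resilience, ratings, by = "season") %>%
    filter(!is.na(mean_viewers))   # drops seasons whose viewers are all missing
}

# Toy tables, invented only to demonstrate the output; NOT the real data.
toy_survivalists <- tibble(
  season      = c(1, 1, 1, 2, 2, 3, 3),
  days_lasted = c(10, 20, 30, 40, 60, 5, 15)
)
toy_episodes <- tibble(
  season  = c(1, 1, 2, 2, 3, 3),
  viewers = c(1.5, 2.5, 1.0, NA, NA, NA)
)

season_view <- compare_resilience_and_ratings(toy_survivalists, toy_episodes)
print(season_view)
```

With the real `survivalists` and `episodes` tables this yields only a handful of rows, so any pattern in it (for example a rank correlation via `cor(..., method = "spearman")`) should be read as a hint that supports or questions the subjective reading, not as evidence that casting caused the ratings.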