The Office is a popular American sitcom that aired from 2005 to 2013, running for nine seasons. Based on data visualizations, how did the popularity and appeal of The Office change over the course of the series?
https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-03-17/readme.md https://en.wikipedia.org/wiki/List_of_The_Office_(American_TV_series)_episodes
7 Variables
186 Observations
library(DT)  # provides datatable()
datatable(office_ratings, options = list(scrollX = TRUE))
Season: season during which an episode aired
Episode: number of the episode within the season
Title: title of the episode
Viewers: number of viewers for an episode in millions
imdb_rating: rating of an episode on IMDb
total_votes: total number of ratings on IMDb for the episode
air_date: date the episode aired
library(tidyverse)
Packages within the tidyverse library were used to create the various data visualizations throughout the report.
ggplot(data = office_ratings, mapping = aes(x = viewers)) +
  geom_histogram() +
  labs(x = "Viewers (millions)",
       y = "Count",
       title = "Distribution of Episode Viewership")
Viewership for episodes of The Office ranged from 3.25 million to 22.91 million, although most episodes fall between 3.25 million and 10.50 million. “Stress Relief,” the episode that received 22.91 million viewers, is an outlier, most likely because it aired directly after Super Bowl XLIII on NBC. The pilot also sits slightly above the rest of the episodes at 11.20 million viewers, likely because the show was brand new. The histogram peaks between 7.5 and 8.5 million viewers: the tallest bin, around 7.5 million, holds 37 episodes, and the neighboring bin, around 8.0 million, holds 36. More episodes drew an audience in that range than in any other.
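The bin counts quoted above depend on how the histogram bins are drawn. As a minimal sketch of how such counts could be tabulated explicitly, the base R snippet below bins a small vector into half-million-wide intervals. The `viewers` values here are made up for illustration and are not taken from `office_ratings`, and the 0.5 million bin width is an assumption rather than the width used in the plot above.

```r
# Hypothetical viewer counts in millions; NOT values from office_ratings.
viewers <- c(3.4, 5.1, 7.6, 7.8, 7.9, 8.1, 8.3, 9.2, 11.2, 22.9)

# Bin edges 0.5 million apart that cover the full range of the data.
breaks <- seq(floor(min(viewers)), ceiling(max(viewers)), by = 0.5)

# Assign each episode to a left-closed bin [lo, hi).
bin <- cut(viewers, breaks = breaks, right = FALSE, include.lowest = TRUE)

bin_counts   <- table(bin)          # episodes per bin
viewer_range <- range(viewers)      # smallest and largest audience
modal_bin    <- names(bin_counts)[which.max(bin_counts)]  # fullest bin
```

Applied to the real data, the same steps with `office_ratings$viewers` in place of the toy vector would give the range and the most populated bin directly instead of reading them off the plot.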
ggplot(data = office_ratings, mapping = aes(x = imdb_rating)) +
  geom_histogram() +
  labs(x = "IMDb Rating",
       y = "Count",
       title = "Distribution of IMDb Ratings per Episode")
IMDb ratings for episodes of The Office range from 6.7 to 9.7 out of 10. The histogram is fairly symmetric and bell shaped, with its main peak around 8.3, roughly the middle of the curve, and a second, slight peak around 8.7. There are no extreme values, but a few episodes fall slightly outside the bell curve. Three episodes rate higher than would be expected. “Goodbye Michael,” the episode in which Michael, a central character, leaves the show, and “Finale,” which ends the series, both received ratings of 9.7, likely because of their significance to viewers. Close behind at 9.6, still 0.3 higher than the next highest rated episode, is “Stress Relief,” which as mentioned earlier drew an outstanding number of viewers. At the low end of the distribution, “The Banker,” a compilation of previously shown events, received a rating of only 6.8, perhaps because of its lack of new content, and “Get the Girl” received a rating of 6.7, likely because its story centered on characters that a significant proportion of viewers disliked.
ggplot(data = office_ratings, mapping = aes(x = total_votes)) +
  geom_histogram() +
  labs(x = "Number of IMDb Votes",
       y = "Count",
       title = "Distribution of Total Number of IMDb Votes per Episode")
The number of IMDb votes per episode ranges from 1393 to 7934. The distribution is skewed to the right, with its peak around 1700 votes, indicating that most episodes sit in the lower half of the range. Three episodes stand out as outliers with a particularly high number of votes. “Stress Relief” received 5948 votes, likely because of the outstanding number of viewers it drew. “Goodbye Michael” and “Finale” received 5749 and 7934 votes, likely because of their emotional significance to the show and its viewers.
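Right skew and outliers can also be checked numerically rather than only by eye. The sketch below compares the mean with the median and flags high values with the common 1.5 × IQR rule. That rule is swapped in here as one conventional choice, not the criterion the report used to pick its three outliers, and the `total_votes` values are made up for illustration.

```r
# Hypothetical vote counts; NOT values from office_ratings.
total_votes <- c(1400, 1500, 1600, 1700, 1700, 1800, 2000, 2500, 5900, 7900)

# In a right-skewed distribution the long upper tail pulls the mean
# above the median.
mean_votes   <- mean(total_votes)
median_votes <- median(total_votes)
right_skewed <- mean_votes > median_votes

# Flag unusually high values: anything above Q3 + 1.5 * IQR.
q           <- quantile(total_votes, c(0.25, 0.75), names = FALSE)
upper_fence <- q[2] + 1.5 * (q[2] - q[1])
high_outliers <- total_votes[total_votes > upper_fence]
```

With this toy vector the two largest values are flagged. On the real data, the set of flagged episodes may differ from the three named above, since the threshold is a rule of thumb.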
ggplot(data = office_ratings, mapping = aes(x = viewers, y = imdb_rating)) +
  geom_point() +
  geom_smooth(se = FALSE) +
  labs(x = "Viewers (millions)",
       y = "IMDb Rating",
       title = "Correlation Between Viewership and IMDb Rating")
Viewership and IMDb rating are positively correlated: as one increases, so does the other. “Stress Relief” stands out as an outlier because of its remarkably high number of viewers, a result of its air date and time slot.
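A correlation coefficient would put a number on the relationship seen in the scatterplot. The sketch below uses base R's `cor()` on made-up values (not the real `office_ratings` columns) and computes the Pearson coefficient with and without a single high-viewership point, plus the rank-based Spearman coefficient, which is less sensitive to one extreme episode.

```r
# Hypothetical data; the last point mimics one very high-viewership episode.
viewers     <- c(4, 5, 6, 7, 8, 9, 22.9)
imdb_rating <- c(7.5, 7.8, 8.0, 8.2, 8.4, 8.6, 9.6)

# Pearson correlation on all points.
r_all <- cor(viewers, imdb_rating)

# Pearson correlation after dropping the extreme point, to see how much
# a single episode influences the estimate.
keep   <- viewers < 20
r_trim <- cor(viewers[keep], imdb_rating[keep])

# Spearman correlation uses ranks only, so the size of the outlier's
# lead does not matter.
rho <- cor(viewers, imdb_rating, method = "spearman")
```

On the real data, `cor(office_ratings$viewers, office_ratings$imdb_rating)` would be the direct equivalent of `r_all`.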
ggplot(data = office_ratings, mapping = aes(x = viewers, y = imdb_rating, color = season)) +
  geom_point() +
  labs(x = "Viewers (millions)",
       y = "IMDb Rating",
       color = "Season",
       title = "Correlation Between Viewership and IMDb Rating for Each Season")
Coloring each episode by the season in which it aired highlights exceptions to this trend between viewership and ratings. Episodes that aired after Michael left late in season 7 generally received fewer viewers than episodes from previous seasons. Despite their lower viewership, the final three episodes of the series, “Livin’ the Dream,” “A.A.R.M.,” and “Finale,” received ratings in the upper half of the distribution. “Finale” in particular tied with “Goodbye Michael” as the highest rated episode of the entire series. These episodes wrap up the plots and characters viewers had gotten to know over the course of the series, which perhaps earned them higher ratings than earlier episodes. In contrast, the pilot, while having the second highest number of viewers of any episode in the series, received rather low ratings, likely because of its viewers’ unfamiliarity with the show’s characters and structure.
ggplot(data = office_ratings, mapping = aes(x = viewers, y = total_votes, color = season)) +
  geom_point() +
  labs(x = "Viewers (millions)",
       y = "Number of Votes on IMDb",
       color = "Season",
       title = "Correlation Between Viewership and Number of IMDb Votes")
Typically, the more viewers an episode had, the more IMDb votes it received. However, episodes from season 1 consistently have slightly more votes than the trend would suggest. At that time the show was new, and viewers were perhaps more likely to leave ratings than they were in later seasons. It should also be noted that “Finale,” the last episode of the series, received the most votes of any episode despite its mid-to-low number of viewers. “Goodbye Michael,” which as mentioned earlier is an important turning point in the series, also received more votes than the trend would suggest. These two outlying episodes suggest that viewers are more likely to rate an episode when its story is outstanding or when they are highly invested emotionally.
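One way to make "more votes than the trend would suggest" concrete is to fit a straight line of votes against viewers and look at the residuals. The sketch below does this with `lm()` on made-up data. The episode labels and values are hypothetical, and a linear trend is an assumption of the sketch, not something the report establishes.

```r
# Hypothetical episodes; NOT values from office_ratings.
episodes <- data.frame(
  title       = paste("Episode", LETTERS[1:7]),
  viewers     = c(4, 5, 6, 7, 8, 9, 5.5),
  total_votes = c(1400, 1600, 1800, 2000, 2200, 2400, 7900)
)

# Fit votes as a linear function of viewers.
fit <- lm(total_votes ~ viewers, data = episodes)

# A positive residual means more votes than the fitted line predicts.
episodes$resid <- residuals(fit)

# The episode sitting farthest above the trend.
most_above_trend <- episodes$title[which.max(episodes$resid)]
```

Here the last row, with modest viewership but a very high vote count, has the largest positive residual, which mirrors the situation described for “Finale.”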
ggplot(data = office_ratings) +
  geom_point(mapping = aes(x = air_date, y = viewers, color = season)) +
  geom_smooth(mapping = aes(x = air_date, y = viewers), se = FALSE) +
  labs(x = "Air Date",
       y = "Viewers (millions)",
       color = "Season",
       title = "Trends in Viewership Over Time")
The show grew substantially in popularity between season 1 and season 2 and kept growing until around 2008, during season 4. Viewership then declined steadily until the finale. Once again, the outlying point in the center of the graph is the episode “Stress Relief.” The decline becomes slightly steeper around 2012, likely due to Michael’s absence from the show.
ggplot(data = office_ratings) +
  geom_point(mapping = aes(x = air_date, y = imdb_rating, color = season)) +
  geom_smooth(mapping = aes(x = air_date, y = imdb_rating), se = FALSE) +
  labs(x = "Air Date",
       y = "IMDb Rating",
       color = "Season",
       title = "Trends in IMDb Rating Over Time")
The show’s appeal peaks slightly before its popularity does. Ratings rise until shortly before season 4, then decrease until season 8, where they plateau before rising slightly in season 9. Most episodes in both season 8 and season 9 fall below the trend line, indicating that the plateau and slight increase are strongly influenced by the final three episodes of season 9, all of which received very high ratings.
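The influence of a few late episodes on a season's average can be illustrated by computing the mean with and without them. The ratings below are made up for a hypothetical final season in air order. They only show the mechanics and are not the real season 9 values.

```r
# Hypothetical ratings for one season, in air order; the last three
# values stand in for a highly rated run of closing episodes.
ratings <- c(7.6, 7.8, 7.7, 7.9, 8.0, 9.0, 9.3, 9.7)

# Season average including every episode.
mean_all <- mean(ratings)

# Season average after dropping the final three episodes.
mean_without_last3 <- mean(head(ratings, -3))

# How much the closing episodes lift the season average.
lift <- mean_all - mean_without_last3
```

A clearly positive `lift` on the real season 9 ratings would support the reading that the uptick comes from the closing episodes rather than from the season as a whole.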
ggplot(data = office_ratings, mapping = aes(x = air_date, y = viewers, color = season)) +
  geom_line()
While The Office lost popularity steadily until its end, its appeal saw a slight uptick in its last season. Michael’s departure seems to have dramatically decreased the show’s popularity, which never recovered. However, for those who stayed to watch the final seasons, the show’s appeal seems to have been “saved” by well-written episodes that received high ratings, perhaps through the return of loved characters, happy endings, and wrap-ups to plots that had carried on throughout the series.
Within individual seasons, viewership generally decreases as the season continues. In every season but the last, the first episode has more viewers than the final episode of that same season. Season 1 shows a sharp decrease in viewership from the pilot to its finale, and season 8, while not quite as steep, declines continually from beginning to end. Seasons 2 through 5 reach their peak about 50-60% of the way into the season and then decrease in viewership to below where they started. It should be noted that in season 5 this peak may be dictated by the episode “Stress Relief,” which has previously been noted to have an outlying number of viewers. Seasons 6 and 7 appear to alternate between higher and lower viewership but ultimately end slightly lower than they started. The exception to this rule is season 9. It is the only season to have a noticeable increase in viewership at its end, once again likely due to the finale and the other closing episodes, which may have brought viewers back to find closure for a series they had watched previously.
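The first-versus-last comparison described above can be computed per season instead of read off the line plot. The following base R sketch works on a small made-up data frame whose column names mirror `office_ratings` (`season`, `episode`, `viewers`). The values are hypothetical, chosen so that two seasons decline and one ends higher than it began.

```r
# Hypothetical episodes for three seasons; NOT values from office_ratings.
episodes <- data.frame(
  season  = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  episode = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  viewers = c(11.2, 6.0, 4.8, 9.0, 9.5, 7.7, 4.3, 4.0, 5.7)
)

# For each season, take the viewers of the first and last episode.
by_season  <- split(episodes, episodes$season)
first_last <- do.call(rbind, lapply(by_season, function(s) {
  s <- s[order(s$episode), ]            # make sure episodes are in order
  data.frame(season = s$season[1],
             first  = s$viewers[1],
             last   = s$viewers[nrow(s)])
}))

# TRUE when the season ended with fewer viewers than it started with.
first_last$dropped <- first_last$last < first_last$first
```

Seasons where `dropped` is `FALSE` are the exceptions to the pattern. On the real data, the text above suggests that only season 9 would fall into that group.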
A large portion of episodes of The Office drew between 7.3 and 8.7 million viewers
IMDb ratings of episodes of The Office followed a roughly bell-shaped distribution, peaking around 8.3
The number of IMDb votes per episode was skewed to the right, peaking around 1800 votes
The more viewers an episode had, the higher it tended to be rated on IMDb
Seasons 8 and 9 had lower viewership than the other seasons
The more viewers an episode had, the more votes it tended to receive on IMDb
The popularity of The Office reached its peak in 2008 and then declined until the end
The show’s appeal declined after peaking around the 4th season, until the final 3 episodes of the series in season 9
The first episode of a season generally has higher viewership than the final episode of that season