# A tibble: 2 × 5
type total_titles average_duration earliest_release latest_release
<chr> <int> <dbl> <dbl> <dbl>
1 Movie 6132 99.6 1942 2021
2 TV Show 2677 1.76 1925 2024
TV Shows & Movies in the Streaming Era
Part 1: Introduction & Data Overview
Introduction
Hello everyone! My name is Madison Clore. I am a graduating senior at Xavier University studying Finance and Business Analytics. After graduation, I will be working at Pension Corporation of America (PCA) as an Investment Advisor Representative.
I have always enjoyed watching different TV shows and movies. It has been a part of how I spend some of my free time with family and friends, whether it is watching a new series together or rewatching familiar favorites. With that said, I have noticed how much streaming platforms have changed the way people discover and consume content because of the wide array of options across genres and formats. This shift has made me curious about what kinds of movies and TV shows are actually included in these platforms and what patterns exist within their libraries. I wanted to explore whether certain characteristics, such as release year, duration, or content type, tend to be more common than others and what that might reveal about modern viewing habits. At the same time, I am also interested in how these broader content trends relate to highly rated or widely recognized shows, since certain titles consistently appear in rankings while others do not receive the same level of recognition.
For my Programming in Analytics course, I explored this area of interest by analyzing a Netflix Movies and TV Shows dataset, which contains detailed information on the movies and TV shows available on the Netflix streaming platform. Each observation represents a single title, and 12 variables describe the key characteristics of each title. Because it captures the same descriptive characteristics for both movies and TV shows, the dataset is well suited for analyzing patterns in streaming content. It can be accessed here: https://myxaviermy.sharepoint.com/:x:/g/personal/clorem_xavier_edu/IQCe_iBNpJ7HQLetGoc6Gt9ZAaCwr_h1hni_dClieuJM1N8?e=KM5CvN.
Data Dictionary
The Netflix Movies and TV Shows dataset has 8,809 observations and 12 variables. The variables include:
show_id: unique identifier assigned to each movie or TV show
type: indicates whether the entry is a movie or a TV show
title: title of the movie or TV show
director: director(s) of the movie or TV show
cast: main actors featured in the movie or TV show
country: country where the movie or TV show was produced
date_added: date the title was added to Netflix
release_year: year the movie or TV show was originally released
rating: content rating (e.g. TV-MA, TV-PG)
duration: length of the movie in minutes, or number of seasons for a TV show
listed_in: genre(s) associated with the title
description: brief summary of the movie or TV show
Summary Statistics
The summary statistics reveal several differences between the movies and TV shows available on Netflix. Movies make up the majority of the dataset with 6,132 titles (roughly 70% of the 8,809 total), while TV shows account for 2,677. On average, movies run about 100 minutes, whereas TV shows average around 1.8 seasons, which suggests that shorter series formats are common within Netflix’s catalog. The release years show that Netflix includes both older and newer content, with movies ranging from 1942 to 2021 and TV shows spanning 1925 to 2024. Overall, these statistics suggest that Netflix’s library is dominated by movies while still maintaining a substantial collection of television content.
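A grouped summary like the one described above could be produced along the following lines. This is a minimal sketch rather than the exact code behind the table: the toy rows are made up for illustration, and it assumes that duration is stored as text such as "90 min" or "2 Seasons", which the data dictionary does not confirm.

```r
library(dplyr)

# Toy rows for illustration only; these are not taken from the real dataset.
netflix <- tibble(
  type = c("Movie", "Movie", "TV Show", "TV Show"),
  duration = c("90 min", "110 min", "1 Season", "3 Seasons"),
  release_year = c(2010, 2020, 2015, 2021)
)

summary_stats <- netflix %>%
  # Assumption: duration is text, so keep only the digits and convert to a number.
  mutate(duration_num = as.numeric(gsub("[^0-9]", "", duration))) %>%
  group_by(type) %>%
  summarise(
    total_titles = n(),
    average_duration = mean(duration_num, na.rm = TRUE),
    earliest_release = min(release_year),
    latest_release = max(release_year)
  )

print(summary_stats)
```

Because movies are measured in minutes and TV shows in seasons, the average_duration column is only meaningful within each type, not across the two rows.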
Part 2: Descriptive Analysis
To explore these ideas further, I used a variety of visuals and tables to identify trends and patterns within Netflix’s content library.
1. Distribution of Titles on Netflix by Release Year
The distribution of titles on Netflix by release year is negatively skewed (left-skewed), with the majority of movies and TV shows released after 2015. Very little older content is available on the platform, especially titles released before 2000. The sharp increase in recent releases reflects the rapid expansion of Netflix’s library over time, and it may also suggest that audiences are more interested in newer content.
2. Delay Between Content Release & Netflix Availability
The distribution of the delay between content release and Netflix availability is positively skewed (right-skewed), with nearly 5,000 titles landing on the platform in the same year they were released. The number of titles decreases sharply after that, with very few taking more than 10 years to appear on Netflix. As the previous visual showed, Netflix’s catalog is heavily concentrated in newer content, so it makes sense that most titles are added relatively quickly after release. This pattern highlights how the streaming industry has accelerated content distribution, allowing movies and TV shows to move from initial release to streaming platforms within a short period of time.
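The delay discussed above is the year a title was added minus its release year. The sketch below shows one way to compute it. It assumes date_added is text ending in a four-digit year (for example "September 25, 2021"), which is an assumption about the file format, and the rows are made up for illustration.

```r
library(dplyr)

# Toy rows for illustration only; these are not taken from the real dataset.
netflix <- tibble(
  title = c("Title A", "Title B", "Title C"),
  date_added = c("September 25, 2021", "January 1, 2020", "March 3, 2019"),
  release_year = c(2021, 2015, 2019)
)

delays <- netflix %>%
  mutate(
    # Assumption: date_added ends with a four-digit year, so extract it with a regex.
    year_added = as.integer(sub(".*([0-9]{4})[[:space:]]*$", "\\1", date_added)),
    delay_years = year_added - release_year
  )

# Number of titles at each delay; a delay of 0 means added in the release year.
delay_counts <- delays %>%
  count(delay_years, name = "number_of_titles")

print(delay_counts)
```

Extracting the year with a regular expression avoids locale-dependent month-name parsing, at the cost of ignoring the month and day.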
3. Content Ratings by Frequency
# A tibble: 10 × 2
rating `number of titles`
<chr> <int>
1 TV-MA 3208
2 TV-14 2160
3 TV-PG 863
4 R 799
5 PG-13 490
6 TV-Y7 334
7 TV-Y 307
8 PG 287
9 TV-G 220
10 NR 80
Netflix’s content ratings are dominated by TV-MA and TV-14 titles, which together account for about 61% of the 8,809 titles in the platform’s catalog. This strong concentration of mature and teen-focused content is not particularly surprising, as adult audiences likely make up a large portion of Netflix’s subscriber base. In comparison, family-oriented ratings such as TV-G, TV-Y, and PG represent a much smaller share of the catalog, which suggests that children’s and family content serves more as a supplemental offering than as Netflix’s primary focus.
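The share held by the two most common ratings can be checked directly from the counts in the table above and the 8,809 observations reported in the data dictionary. The snippet below is a small worked check; the object names are only illustrative.

```r
library(dplyr)

# Counts copied from the ratings table above.
rating_counts <- tibble(
  rating = c("TV-MA", "TV-14", "TV-PG", "R", "PG-13",
             "TV-Y7", "TV-Y", "PG", "TV-G", "NR"),
  number_of_titles = c(3208, 2160, 863, 799, 490, 334, 307, 287, 220, 80)
)

total_titles <- 8809  # total observations reported in the data dictionary

rating_share <- rating_counts %>%
  mutate(share = number_of_titles / total_titles)

# Combined share of TV-MA and TV-14: (3208 + 2160) / 8809
top_two_share <- rating_share %>%
  filter(rating %in% c("TV-MA", "TV-14")) %>%
  summarise(share = sum(share)) %>%
  pull(share)

print(round(top_two_share, 3))
```

The same approach applied to TV-G, TV-Y, and PG gives (220 + 307 + 287) / 8809, or roughly 9% of the catalog.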
4. Relationship Between TV Show Release Year & Number of Seasons
The scatterplot shows that the vast majority of TV shows on Netflix have fewer than five seasons, and this pattern is consistent across all release years. The plot becomes noticeably denser after 2010, which aligns with Netflix’s broader expansion of its content library during that period. TV shows with 10 or more seasons are relatively rare, which suggests that the platform tends to feature shorter, more recent series rather than long-running traditional TV shows.
5. Average Time for Netflix to Add Title by Content Rating Group
Kids’ content appears on Netflix the fastest, averaging just under three years after release. Teen/Young Adult and Family content take the longest to arrive, at around six years on average, while Mature content falls in between at roughly 3.5 years. The faster turnaround for kids’ titles is plausible, as children’s programming tends to have a shorter relevance window and is frequently refreshed to keep younger audiences engaged with new content.
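Grouping ratings and averaging the delay per group could look like the sketch below. The mapping from individual ratings to the four groups is an assumption made for illustration, since the report does not list which ratings fall into which group, and the delay values are made up.

```r
library(dplyr)

# Toy rows for illustration only; delay values are invented.
rated <- tibble(
  rating = c("TV-Y", "TV-Y7", "PG", "TV-14", "TV-MA", "R"),
  delay_years = c(2, 4, 6, 5, 3, 4)
)

avg_delay_by_group <- rated %>%
  mutate(
    # Assumed mapping; the actual grouping used in the analysis may differ.
    rating_group = case_when(
      rating %in% c("TV-Y", "TV-Y7") ~ "Kids",
      rating %in% c("G", "TV-G", "PG", "TV-PG") ~ "Family",
      rating %in% c("PG-13", "TV-14") ~ "Teen/Young Adult",
      rating %in% c("R", "NC-17", "TV-MA") ~ "Mature",
      TRUE ~ "Other"
    )
  ) %>%
  group_by(rating_group) %>%
  summarise(avg_delay = mean(delay_years), .groups = "drop") %>%
  arrange(avg_delay)

print(avg_delay_by_group)
```

Keeping an explicit "Other" bucket makes unrated or unexpected values visible instead of silently dropping them from the averages.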
Part 3: Secondary Data Source
This dataset contains 250 distinct TV shows from IMDb’s Top 250 TV Shows list, which represents some of the highest-rated series according to user reviews and ratings. Each row corresponds to a single TV show, and the original scraped variables include:
tvshow_title: name of the TV show
rating_of_show: content rating (e.g., TV-MA, TV-14)
series_type: type of show (TV series or mini-series)
stars_of_tvshow: IMDb user rating (out of 10)
start_year: year the show first aired
end_year: year the show ended (if applicable)
Additional variables were created during the data wrangling process to support the analysis. These include:
show_age: number of years since release
show_length: number of years the show ran
rating_group: grouped content rating category (Kids, Family, Teen, Mature)
This dataset is used as a secondary source to compare highly rated television content against Netflix’s overall catalog.
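The derived variables show_age and show_length described above could be created roughly as follows. This is a sketch under stated assumptions: reference_year is a hypothetical stand-in for "now", the rows are made up, and treating a missing end_year as a show that is still running is an assumption about how ongoing series were handled.

```r
library(dplyr)

reference_year <- 2024  # hypothetical reference year, not taken from the report

# Toy rows for illustration only; these are not real IMDb entries.
imdb <- tibble(
  tvshow_title = c("Show A", "Show B"),
  start_year = c(2008, 2019),
  end_year = c(2013, NA)
)

imdb_derived <- imdb %>%
  mutate(
    # Years since the show first aired.
    show_age = reference_year - start_year,
    # Years the show ran; a missing end_year is assumed to mean "still running".
    show_length = coalesce(end_year, reference_year) - start_year
  )

print(imdb_derived)
```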
1. Availability of IMDb Top-Ranked TV Shows on Netflix
Of the IMDb Top 250 TV Shows, approximately 70 are available on Netflix, while around 180 are not. This means Netflix carries just over one-quarter (about 28%) of the highest-rated shows, which is a notable gap given the overall size of its catalog. For viewers who prioritize highly rated content, this suggests that Netflix alone may not fully capture the breadth of top-performing TV series.
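One way to flag which top-ranked shows appear in the Netflix catalog is to compare titles after light normalization. The sketch below assumes matching on lowercase, whitespace-trimmed titles. The helper normalize_title is hypothetical, the titles are made up, and the matching method actually used in the analysis is not stated in the report.

```r
library(dplyr)

# Toy titles for illustration only.
netflix_titles <- c("Show A", "show b ", "Movie C")
imdb_top <- tibble(tvshow_title = c("Show A", "Show B", "Show D", "Show E"))

# Hypothetical helper: lowercase and trim so small formatting differences still match.
normalize_title <- function(x) tolower(trimws(x))

availability <- imdb_top %>%
  mutate(on_netflix = normalize_title(tvshow_title) %in% normalize_title(netflix_titles)) %>%
  count(on_netflix) %>%
  mutate(share = n / sum(n))

print(availability)
```

Exact title matching can miss shows listed under alternate names and can falsely match unrelated titles that share a name, so counts produced this way are best read as approximate.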
2. Average TV Show Length Comparison
The difference is notable: IMDb’s top-rated shows average nearly six seasons, while Netflix shows average fewer than two. This suggests that IMDb ratings may favor longevity, as longer-running series have more opportunities to build large, engaged audiences that actively rate and support them over time. In contrast, Netflix’s lower average aligns with its tendency toward shorter series runs and frequent cancellations.
Conclusion
Overall, this analysis explored which characteristics are more common within Netflix’s catalog and what this reveals about modern viewing habits. The results show a clear emphasis on recent content, with most titles released in recent years and added to the platform relatively quickly. Netflix’s library is also dominated by shorter TV series and mature or teen-focused content, which reflects both audience preferences and platform strategy. When compared to IMDb’s top-rated shows, a gap emerges, as Netflix includes only a portion of highly ranked series. Ultimately, these findings suggest that Netflix prioritizes newer, shorter-form content, while highly rated TV shows are more often associated with longevity and sustained audience engagement.