Introduction

Launched on November 12, 2019, Disney Plus marked The Walt Disney Company’s strategic venture into the streaming market, directly challenging established players like Netflix and Amazon Prime Video. This platform was grounded in Disney’s rich library of animated classics and television shows, enriched by content from acquisitions such as Pixar, Marvel, Star Wars, and National Geographic. Alongside these, Disney Plus introduced a range of original content, rapidly becoming a cultural sensation. This launch signified Disney’s shift towards direct-to-consumer services, leveraging its renowned brand and legacy while heavily investing in new content. The strategy proved successful, as evidenced by the platform’s swift growth to over 116 million subscribers globally by mid-2021. This impressive subscriber count, buoyed by an extensive library of around 1,300 movies and TV shows, highlights Disney Plus’s widespread appeal and the enduring allure of its diverse content offerings.

Insights on the Dataset

The dataset analyzed is a detailed catalog of 1,450 titles available on Disney Plus, spanning from its launch in 2019 to mid-2021. It contains essential details such as ID, content type, title, director, cast, country of origin, addition date, release year, rating, duration, genre, and description. The collection covers a wide time range, from vintage classics of 1928 to the contemporary releases of 2021, with a median release year of 2011. The range of content ratings, from TV-G to PG, mirrors Disney Plus’s appeal to various age groups. This diverse compilation, which includes everything from animated movies to documentaries, offers a valuable foundation for examining trends in content distribution, genre popularity, and the strategic direction of the platform.

# Set the working directory
setwd("/Users/janellnapper/Desktop/DS 736/R Project")

# Load libraries 
library(ggplot2)
library(data.table)
library(dplyr)

# Read in the data
data <- fread("disney_plus_titles.csv")

Findings

Within the following tabs, you will find a series of five meticulously crafted charts, each offering deeper insight into Disney Plus’s content dynamics. The Line Chart traces the trajectory of titles added over time, prompting questions about changes in Disney Plus’s content strategy and any significant shifts in content volume since its inception. The Pie Chart quantifies the proportion of TV shows versus movies, raising queries about how this balance reflects the platform’s target audience or content priorities. In the Violin Chart, the distribution of release years for movies and TV shows is mapped, inviting an analysis of the platform’s blend of classic and contemporary offerings. The Scatterplot correlates the number of seasons with release years, suggesting a deeper look into whether Disney Plus favors long-standing series or newer shows. Finally, the Bar Chart ranks the top ten TV shows, encouraging exploration into viewer preferences, popular genres, and how these shows align with the overall content strategy of Disney Plus. Together, these visual representations offer a multifaceted and comprehensive analysis of the diverse content landscape within Disney Plus.

# Data cleaning and transformation using dplyr
data <- data %>%
  # Replace empty country values with "Unknown"
  mutate(country = if_else(country == "", "Unknown", country)) %>%
  
  # Standardize text to lowercase
  mutate(across(c(director, title, cast), tolower)) %>%
  
  # Convert 'date_added' to Date format
  mutate(date_added = as.Date(date_added, format = "%B %d, %Y")) %>%
  
  # Filter out unlikely durations
  filter(duration != "0 min") %>%
  
  # Convert duration to numeric for movies and TV shows
  # (as.numeric() runs on every row, so the expected coercion warnings are suppressed)
  mutate(duration_min = if_else(type == "Movie", suppressWarnings(as.numeric(gsub(" min", "", duration))), NA_real_),
         num_seasons = if_else(type == "TV Show", suppressWarnings(as.numeric(gsub(" Seasons?", "", duration))), NA_real_)) %>%
  
  # Remove duplicate rows
  distinct() %>%
  
  # Additional columns
  mutate(is_series = type == "TV Show",
         year = as.numeric(format(date_added, "%Y")))
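
The duration-parsing step in the pipeline above can be tried in isolation. The following is a minimal base R sketch on made-up values, assuming the `duration` column uses the "90 min" and "2 Seasons" formats implied by the cleaning code; the `toy_` variables and the `to_number()` helper are hypothetical names introduced only for this illustration.

```r
# Toy inputs mirroring the assumed 'type' and 'duration' formats (not real rows)
toy_type     <- c("Movie", "TV Show", "TV Show")
toy_duration <- c("90 min", "1 Season", "32 Seasons")

# Hypothetical helper: strip a unit suffix, then convert to numeric.
# Values that still contain text become NA; suppressWarnings() hides the
# expected "NAs introduced by coercion" message.
to_number <- function(x, suffix_pattern) {
  suppressWarnings(as.numeric(sub(suffix_pattern, "", x)))
}

# Minutes apply only to movies, season counts only to TV shows
toy_duration_min <- ifelse(toy_type == "Movie",
                           to_number(toy_duration, " min$"), NA_real_)
toy_num_seasons  <- ifelse(toy_type == "TV Show",
                           to_number(toy_duration, " Seasons?$"), NA_real_)

print(data.frame(toy_type, toy_duration, toy_duration_min, toy_num_seasons))
```

The pattern " Seasons?$" matches both "Season" and "Seasons", so a single alternative is enough for the singular and plural forms.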

Number of Titles Added to Disney Plus Per Year

The line chart illustrates a surge in titles during Disney Plus’s 2019 debut, likely reflecting an aggressive launch strategy that capitalized on Disney’s vast content reserves to attract a broad subscriber base. A notable decline in new titles in 2020 could suggest a shift towards prioritizing content quality, a strategic pacing of releases, or production delays possibly due to the COVID-19 pandemic. By 2021, title additions stabilized, a trend partially influenced by the dataset’s cutoff at mid-year. The initial influx of titles may have been a market saturation tactic to secure a diverse audience. Further analysis correlating subscriber trends with title additions could shed light on the launch strategy’s impact. Additionally, comparing the production timelines of Disney Plus originals to the procurement of reruns might clarify the observed patterns, as originals often have longer production times. In sum, the data indicates a bold market entry followed by a more deliberate content strategy, likely shaped by both internal policy and external events.

# Count titles per year, dropping rows whose 'date_added' could not be parsed
# ('year' was already derived from 'date_added' in the cleaning step)
titles_per_year <- data %>%
  filter(!is.na(year)) %>%
  count(year) %>%
  arrange(year)

ggplot(titles_per_year, aes(x = year, y = n)) +
  geom_line(color = "tomato") +
  geom_point(color = "tomato") +
  labs(title = "Number of Titles Added to Disney Plus Per Year",
       x = "Year",
       y = "Number of Titles") +
  scale_x_continuous(breaks = seq(min(titles_per_year$year), max(titles_per_year$year), by = 1)) +
  theme_minimal()
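
To make the counting step behind the line chart concrete, here is a small base R sketch of the same idea: extract the year from each addition date, drop missing values, and tally titles per year. The dates are hypothetical values chosen for illustration, not rows from the dataset, and the `toy_` names are assumptions of this sketch.

```r
# Hypothetical addition dates, for illustration only (not from the dataset)
toy_dates <- as.Date(c("2019-11-12", "2019-11-12", "2019-12-06",
                       "2020-04-03", "2021-02-19", NA))

# Extract the calendar year; a missing date yields a missing year
toy_years <- as.numeric(format(toy_dates, "%Y"))

# Drop missing years, then tally titles per year
toy_years  <- toy_years[!is.na(toy_years)]
toy_counts <- as.data.frame(table(year = toy_years), stringsAsFactors = FALSE)
names(toy_counts) <- c("year", "n")
toy_counts$year <- as.numeric(toy_counts$year)

print(toy_counts)
```

Because the real dataset stops at mid-2021, the last year's count covers fewer months than a full year, which is worth keeping in mind when reading the chart.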

Proportion of Movies vs TV Shows

The pie chart provides a clear visual breakdown of Disney Plus content, highlighting that of the platform’s offerings, 1,052 are movies and 398 are TV shows. With movies constituting over two-thirds of the catalog, it’s evident that Disney Plus leans more towards films in its content strategy, possibly echoing Disney’s historical emphasis on movie production. In contrast, TV shows account for less than a third, underscoring the company’s cinematic heritage. This distribution offers a simple yet effective comparison of the volume of movies to TV shows available on Disney Plus.

# Count of movies and TV shows
content_count <- table(data$type)

# Pie chart
pie(content_count, 
    labels = paste(names(content_count), "\n", content_count),
    main = "Proportion of Movies vs TV Shows on Disney Plus",
    col = c("lightblue", "lightgreen"))
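
The "over two-thirds" and "less than a third" statements can be checked directly from the two counts quoted in the text (1,052 movies and 398 TV shows). This short base R sketch only redoes that arithmetic; the variable names are illustrative.

```r
# Counts quoted in the text above
type_counts <- c("Movie" = 1052, "TV Show" = 398)

# Share of each type as a percentage of all titles, rounded to one decimal
type_shares <- round(100 * type_counts / sum(type_counts), 1)

print(sum(type_counts))  # total number of titles
print(type_shares)       # percentage of movies and of TV shows
```

Movies come out at roughly 73% of the catalog and TV shows at roughly 27%, which matches the reading of the pie chart.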

Distribution of Release Years for Movies and TV Shows

The violin chart illustrates the release year distribution of Disney Plus’s movies and TV shows. Movies span a wide timeline, from 1928 to as recent as 2021, representing nearly a century of film offerings. The chart’s bulge around 2019 to 2021 signals a concentration of recent movie releases, aligning with the platform’s inception period. TV show releases, on the other hand, are more tightly concentrated around the launch years of Disney Plus, with a tail stretching back to the years following 1950. This suggests a modest selection of classic series amid a focus on contemporary content.

# Violin plot
ggplot(data, aes(x = type, y = release_year, fill = type)) +
  geom_violin() +
  labs(title = "Distribution of Release Years for Movies and TV Shows",
       x = "Type",
       y = "Release Year") +
  theme_minimal() +
  theme(legend.position = "none")

Number of Seasons vs. Release Year

The scatter plot analysis suggests that Disney Plus’s catalog includes a strategic mix of shows, balancing classic reruns with extensive seasons and newer series with a limited number. Since its 2019 launch, the platform has incorporated both beloved reruns and Disney Plus originals or recent acquisitions, evident from the data points pre- and post-launch year. Newer shows typically have fewer seasons, reflecting their recent introduction. An outlier with over 30 seasons underscores the inclusion of a notably long-running series, a draw for subscribers seeking familiar content. This strategy highlights the platform’s dual focus: offering a rich backlog of established shows for extensive viewership and promoting fresh content to keep the library updated. In essence, Disney Plus has curated its offerings to appeal to a wide audience base, from those who indulge in nostalgia to those seeking new experiences, potentially fostering longer subscription tenures.

# Filter to include only TV shows
tv_shows <- subset(data, type == "TV Show" & !is.na(num_seasons))

# Calculate the maximum number of seasons for setting the upper limit of the breaks
max_seasons <- max(tv_shows$num_seasons, na.rm = TRUE)

# Scatter plot with adjusted color scheme
ggplot(tv_shows, aes(x = release_year, y = num_seasons)) +
  geom_point(aes(color = num_seasons), alpha = 0.8) +
  scale_color_gradient(name = "Number of Seasons", low = "green", high = "black",
                       breaks = seq(0, max_seasons, by = 5)) +
  labs(title = "Number of Seasons vs. Release Year for TV Shows on Disney Plus",
       x = "Release Year",
       y = "Number of Seasons") +
  theme_minimal() +
  theme(legend.position = "right")

Top TV Shows

This bar chart ranks the top ten TV shows on Disney Plus by the number of seasons available. Each show’s bar is color-coded according to its content rating. With ggplot2’s default palette and three rating levels, the colors are assigned in alphabetical order of the rating: TV-14 (red) for ages 14 and up, TV-G (green) for all ages, and TV-PG (blue) for content that may require parental guidance. This rating system indicates the platform’s target audience and content strategy. ‘The Simpsons’ leads the chart with 32 seasons, while ‘When Sharks Attack’ is at the lower end with 7 seasons. The chart showcases Disney Plus’s commitment to featuring both enduring series with dedicated followings and a mixture of genres, from ‘The Simpsons’ to docu-series like ‘Wicked Tuna’ and ‘Life Below Zero’, as well as educational programs like ‘Brain Games’. The prevalence of TV-PG rated shows suggests a family-centric content approach. Overall, the variety in the number of seasons, from 7 to 32, highlights Disney Plus’s aim to cater to diverse viewer preferences, accommodating fans of both long-standing franchises and shorter series.

# Create a bar chart to show the Top 10 shows with the most seasons
tv_shows <- data %>% 
  filter(type == "TV Show" & !is.na(num_seasons)) %>% 
  arrange(desc(num_seasons)) %>%
  slice_head(n = 10)

ggplot(tv_shows, aes(x = reorder(title, num_seasons), y = num_seasons, fill = rating)) +
  geom_col() +
  coord_flip() +
  geom_text(aes(label = num_seasons), hjust = -0.1, size = 3) +
  labs(title = "Top TV Shows on Disney Plus by Number of Seasons",
       x = "TV Show",
       y = "Number of Seasons",
       fill = "Rating") + # Label for the legend
  theme_minimal() +
  theme(legend.position = "right") # Show legend

Conclusion

In conclusion, the analysis of Disney Plus’s content from its launch in 2019 to mid-2021 reveals a strategic blend of classic and contemporary offerings, catering to a diverse audience range. The Line Chart indicates an aggressive initial launch strategy followed by a more measured approach in content addition, reflecting strategic shifts or external influences like the pandemic. The Pie Chart shows a catalog weighted toward movies, which make up roughly 73% of titles against about 27% for TV shows, consistent with Disney’s cinematic heritage. The Violin Chart’s distribution of release years illustrates the platform’s commitment to offering a range of content from timeless classics to modern hits. The Scatterplot reveals a strategic inclusion of both long-standing series and newer shows, suggesting a focus on both nostalgia and fresh content. Finally, the Bar Chart of top TV shows highlights the long-running series in the catalog, in line with the goal of attracting and retaining a broad subscriber base. Overall, this analysis underscores Disney Plus’s dynamic approach in shaping a versatile and engaging streaming service.