Introduction

This analysis explores the Netflix dataset to identify which actors appear most frequently in TV shows. We’ll transform the data to separate actors listed in the cast column and count their appearances.

Loading Required Libraries

library(tidyverse)
library(knitr)

Loading the Dataset

First, we’ll load the Netflix dataset that was downloaded from Kaggle.

# Check current working directory
cat("Current working directory:", getwd(), "\n\n")
## Current working directory: C:/Users/William/OneDrive/Desktop/New folder
# List files in the current directory to verify Netflix.csv exists
cat("Files in current directory:\n")
## Files in current directory:
csv_files <- list.files(pattern = "\\.csv$")  # pattern is a regular expression, not a glob
print(csv_files)
## [1] "Netflix.csv"
# Try to find and load the Netflix file
# Common file names from Kaggle dataset
possible_names <- c("Netflix.csv", "netflix.csv", "netflix_titles.csv", 
                    "Netflix_titles.csv", "netflix_data.csv")

file_found <- FALSE
for (filename in possible_names) {
  if (file.exists(filename)) {
    cat("\nFound file:", filename, "\n")
    Netflix <- read.csv(filename, stringsAsFactors = FALSE)
    file_found <- TRUE
    break
  }
}
## 
## Found file: Netflix.csv
if (!file_found) {
  stop("Netflix CSV file not found! Please ensure the file is in the working directory. ",
       "Available CSV files: ", paste(csv_files, collapse = ", "))
}

# Display the structure of the dataset
cat("\nDataset loaded successfully!\n")
## 
## Dataset loaded successfully!
str(Netflix)
## 'data.frame':    6234 obs. of  12 variables:
##  $ show_id     : int  81145628 80117401 70234439 80058654 80125979 80163890 70304989 80164077 80117902 70304990 ...
##  $ type        : chr  "Movie" "Movie" "TV Show" "TV Show" ...
##  $ title       : chr  "Norm of the North: King Sized Adventure" "Jandino: Whatever it Takes" "Transformers Prime" "Transformers: Robots in Disguise" ...
##  $ director    : chr  "Richard Finn, Tim Maltby" "" "" "" ...
##  $ cast        : chr  "Alan Marriott, Andrew Toth, Brian Dobson, Cole Howard, Jennifer Cameron, Jonathan Holmes, Lee Tockar, Lisa Duru"| __truncated__ "Jandino Asporaat" "Peter Cullen, Sumalee Montano, Frank Welker, Jeffrey Combs, Kevin Michael Richardson, Tania Gunadi, Josh Keaton"| __truncated__ "Will Friedle, Darren Criss, Constance Zimmer, Khary Payton, Mitchell Whitfield, Stuart Allan, Ted McGinley, Peter Cullen" ...
##  $ country     : chr  "United States, India, South Korea, China" "United Kingdom" "United States" "United States" ...
##  $ date_added  : chr  "September 9, 2019" "September 9, 2016" "September 8, 2018" "September 8, 2018" ...
##  $ release_year: int  2019 2016 2013 2016 2017 2016 2014 2017 2017 2014 ...
##  $ rating      : chr  "TV-PG" "TV-MA" "TV-Y7-FV" "TV-Y7" ...
##  $ duration    : chr  "90 min" "94 min" "1 Season" "1 Season" ...
##  $ listed_in   : chr  "Children & Family Movies, Comedies" "Stand-Up Comedy" "Kids' TV" "Kids' TV" ...
##  $ description : chr  "Before planning an awesome wedding for his grandfather, a polar bear king must take back a stolen artifact from"| __truncated__ "Jandino Asporaat riffs on the challenges of raising kids and serenades the audience with a rousing rendition of"| __truncated__ "With the help of three human allies, the Autobots once again protect Earth from the onslaught of the Decepticon"| __truncated__ "When a prison ship crash unleashes hundreds of Decepticons on Earth, Bumblebee leads a new Autobot force to protect humankind." ...
# Display the first few rows
head(Netflix)
##    show_id    type                                   title
## 1 81145628   Movie Norm of the North: King Sized Adventure
## 2 80117401   Movie              Jandino: Whatever it Takes
## 3 70234439 TV Show                      Transformers Prime
## 4 80058654 TV Show        Transformers: Robots in Disguise
## 5 80125979   Movie                            #realityhigh
## 6 80163890 TV Show                                 Apaches
##                   director
## 1 Richard Finn, Tim Maltby
## 2                         
## 3                         
## 4                         
## 5         Fernando Lebrija
## 6                         
##                                                                                                                                                                                 cast
## 1                                        Alan Marriott, Andrew Toth, Brian Dobson, Cole Howard, Jennifer Cameron, Jonathan Holmes, Lee Tockar, Lisa Durupt, Maya Kay, Michael Dobson
## 2                                                                                                                                                                   Jandino Asporaat
## 3 Peter Cullen, Sumalee Montano, Frank Welker, Jeffrey Combs, Kevin Michael Richardson, Tania Gunadi, Josh Keaton, Steve Blum, Andy Pessoa, Ernie Hudson, Daran Norris, Will Friedle
## 4                                                           Will Friedle, Darren Criss, Constance Zimmer, Khary Payton, Mitchell Whitfield, Stuart Allan, Ted McGinley, Peter Cullen
## 5           Nesta Cooper, Kate Walsh, John Michael Higgins, Keith Powers, Alicia Sanz, Jake Borelli, Kid Ink, Yousef Erakat, Rebekah Graf, Anne Winters, Peter Gilroy, Patrick Davis
## 6                                                                                                      Alberto Ammann, Eloy Azorín, Verónica Echegui, Lucía Jiménez, Claudia Traisac
##                                    country        date_added release_year
## 1 United States, India, South Korea, China September 9, 2019         2019
## 2                           United Kingdom September 9, 2016         2016
## 3                            United States September 8, 2018         2013
## 4                            United States September 8, 2018         2016
## 5                            United States September 8, 2017         2017
## 6                                    Spain September 8, 2017         2016
##     rating duration
## 1    TV-PG   90 min
## 2    TV-MA   94 min
## 3 TV-Y7-FV 1 Season
## 4    TV-Y7 1 Season
## 5    TV-14   99 min
## 6    TV-MA 1 Season
##                                                           listed_in
## 1                                Children & Family Movies, Comedies
## 2                                                   Stand-Up Comedy
## 3                                                          Kids' TV
## 4                                                          Kids' TV
## 5                                                          Comedies
## 6 Crime TV Shows, International TV Shows, Spanish-Language TV Shows
##                                                                                                                                            description
## 1         Before planning an awesome wedding for his grandfather, a polar bear king must take back a stolen artifact from an evil archaeologist first.
## 2    Jandino Asporaat riffs on the challenges of raising kids and serenades the audience with a rousing rendition of "Sex on Fire" in his comedy show.
## 3         With the help of three human allies, the Autobots once again protect Earth from the onslaught of the Decepticons and their leader, Megatron.
## 4                       When a prison ship crash unleashes hundreds of Decepticons on Earth, Bumblebee leads a new Autobot force to protect humankind.
## 5 When nerdy high schooler Dani finally attracts the interest of her longtime crush, she lands in the cross hairs of his ex, a social media celebrity.
## 6             A young journalist is forced into a life of crime to save his father and family in this series based on the novel by Miguel Sáez Carral.
# Check if 'cast' column exists
if (!"cast" %in% colnames(Netflix)) {
  stop("'cast' column not found in dataset. Available columns: ", 
       paste(colnames(Netflix), collapse = ", "))
}
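
The empty director values in the str() output above show that read.csv() keeps blank cells in character columns as empty strings rather than NA. The sketch below writes a small made-up CSV to a temporary file (the file and its contents are illustrative assumptions, not part of the Netflix data) and shows how the na.strings argument of read.csv() changes that behaviour:

```r
# Hypothetical two-row file standing in for Netflix.csv
tmp <- tempfile(fileext = ".csv")
writeLines(c("title,cast",
             "Show A,\"Ann Lee, Bob Ray\"",
             "Show B,"), tmp)

# Default behaviour: the blank cast cell is read as ""
default_read <- read.csv(tmp, stringsAsFactors = FALSE)

# With na.strings = "", the blank cast cell is read as NA instead
na_read <- read.csv(tmp, stringsAsFactors = FALSE, na.strings = "")

is.na(default_read$cast)
is.na(na_read$cast)
```

Reading blanks as NA at load time would let a later drop_na() call remove titles with no listed cast.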

Data Transformation

We need to separate the actors in the cast column since multiple actors are listed together, separated by commas.

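# Separate actors in the cast column (one row per actor) and rename the column.
# Note: drop_na() removes only NA values. read.csv() stores blank cast cells
# as "", so those rows stay in the data with an empty actor name.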
Netflix_Actor <- Netflix %>% 
  separate_rows(cast, sep = ", ") %>% 
  drop_na(cast) %>% 
  rename(actor = cast)

# Display the transformed data
head(Netflix_Actor)
## # A tibble: 6 × 12
##    show_id type  title     director actor country date_added release_year rating
##      <int> <chr> <chr>     <chr>    <chr> <chr>   <chr>             <int> <chr> 
## 1 81145628 Movie Norm of … Richard… Alan… United… September…         2019 TV-PG 
## 2 81145628 Movie Norm of … Richard… Andr… United… September…         2019 TV-PG 
## 3 81145628 Movie Norm of … Richard… Bria… United… September…         2019 TV-PG 
## 4 81145628 Movie Norm of … Richard… Cole… United… September…         2019 TV-PG 
## 5 81145628 Movie Norm of … Richard… Jenn… United… September…         2019 TV-PG 
## 6 81145628 Movie Norm of … Richard… Jona… United… September…         2019 TV-PG 
## # ℹ 3 more variables: duration <chr>, listed_in <chr>, description <chr>
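
To make the reshaping step concrete, here is a minimal sketch on a made-up three-row table (the titles and names are invented for illustration). Each comma-separated name becomes its own row, and an extra filter() step, which is not part of the original pipeline, removes names that are empty strings:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, not taken from the Netflix file
toy <- tibble(
  title = c("Show A", "Show B", "Show C"),
  cast  = c("Ann Lee, Bob Ray", "", NA)
)

# Same steps as the pipeline above: split, drop NA, rename
toy_actor <- toy %>%
  separate_rows(cast, sep = ", ") %>%
  drop_na(cast) %>%
  rename(actor = cast)

# Additional step (an assumption about what the analysis wants):
# drop rows whose actor name is an empty string
toy_clean <- toy_actor %>%
  filter(actor != "")

toy_actor
toy_clean
```

Comparing the two printed tibbles shows whether a blank name survives the original steps in the installed tidyr version and how the extra filter changes the result.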

Finding Top 6 Actors in TV Shows

Now we’ll filter for TV shows only and count which actors appear most frequently.

# Find the 6 actors with the most appearances in TV shows
top_actors <- Netflix_Actor %>%
  filter(type == "TV Show") %>% 
  count(actor, sort = TRUE) %>% 
  head(6)

# Display the results in a nice table
kable(top_actors, 
      col.names = c("Actor", "Number of TV Shows"),
      caption = "Top 6 Actors with Most TV Show Appearances on Netflix")
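Table: Top 6 Actors with Most TV Show Appearances on Netflix

|Actor              | Number of TV Shows|
|:------------------|------------------:|
|                   |                210|
|Takahiro Sakurai   |                 18|
|Yuki Kaji          |                 16|
|Daisuke Ono        |                 14|
|David Attenborough |                 14|
|Ashleigh Ball      |                 12|

The first row has an empty actor name. It counts rows whose cast value is an empty string, which read.csv() does not convert to NA and drop_na() therefore does not remove.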
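
Because head() cuts the sorted counts at a fixed number of rows, actors tied with the last listed count can be left out. A small sketch with made-up counts (the names and numbers are illustrative only) compares head() with dplyr's slice_max(), which keeps ties by default:

```r
library(dplyr)

# Made-up counts for illustration only
toy_counts <- tibble(
  actor = c("A", "B", "C", "D"),
  n     = c(5L, 3L, 3L, 1L)
)

# head() keeps exactly the first two rows, so one of the tied actors is cut
top_head <- head(toy_counts, 2)

# slice_max() keeps every row tied for the last place (with_ties = TRUE)
top_ties <- slice_max(toy_counts, order_by = n, n = 2)

nrow(top_head)
nrow(top_ties)
```

Whether ties should be kept depends on how strictly "top 6" is meant. With ties kept, the result may contain more than six rows.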

Visualization

Let’s create a bar chart to visualize these results.

# Create a bar plot
ggplot(top_actors, aes(x = reorder(actor, n), y = n, fill = actor)) +
  geom_col(show.legend = FALSE) +
  geom_text(aes(label = n), hjust = -0.2, size = 4) +
  coord_flip() +
  labs(title = "Top 6 Actors with Most TV Show Appearances on Netflix",
       x = "Actor",
       y = "Number of TV Shows") +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5, face = "bold", size = 14),
        axis.text = element_text(size = 11))

Summary

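Among named actors, Takahiro Sakurai appears most often in Netflix TV shows with 18 appearances, followed by Yuki Kaji (16), Daisuke Ono and David Attenborough (14 each), and Ashleigh Ball (12). The largest count in the table, 210, belongs to a blank actor name, which comes from titles with an empty cast field rather than a real performer, so the "top 6" effectively contains five actors.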

Session Information

sessionInfo()
## R version 4.5.1 (2025-06-13 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 26100)
## 
## Matrix products: default
##   LAPACK version 3.12.1
## 
## locale:
## [1] LC_COLLATE=English_American Samoa.utf8 
## [2] LC_CTYPE=English_American Samoa.utf8   
## [3] LC_MONETARY=English_American Samoa.utf8
## [4] LC_NUMERIC=C                           
## [5] LC_TIME=English_American Samoa.utf8    
## 
## time zone: Asia/Taipei
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] knitr_1.50      lubridate_1.9.4 forcats_1.0.1   stringr_1.5.2  
##  [5] dplyr_1.1.4     purrr_1.1.0     readr_2.1.5     tidyr_1.3.1    
##  [9] tibble_3.3.0    ggplot2_4.0.0   tidyverse_2.0.0
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.6       jsonlite_2.0.0     compiler_4.5.1     tidyselect_1.2.1  
##  [5] jquerylib_0.1.4    scales_1.4.0       yaml_2.3.10        fastmap_1.2.0     
##  [9] R6_2.6.1           labeling_0.4.3     generics_0.1.4     bslib_0.9.0       
## [13] pillar_1.11.1      RColorBrewer_1.1-3 tzdb_0.5.0         rlang_1.1.6       
## [17] utf8_1.2.6         stringi_1.8.7      cachem_1.1.0       xfun_0.53         
## [21] sass_0.4.10        S7_0.2.0           timechange_0.3.0   cli_3.6.5         
## [25] withr_3.0.2        magrittr_2.0.4     digest_0.6.37      grid_4.5.1        
## [29] rstudioapi_0.17.1  hms_1.1.3          lifecycle_1.0.4    vctrs_0.6.5       
## [33] evaluate_1.0.5     glue_1.8.0         farver_2.1.2       rmarkdown_2.30    
## [37] tools_4.5.1        pkgconfig_2.0.3    htmltools_0.5.8.1
