This analysis explores the Netflix dataset to identify which actors appear most frequently in TV shows. We’ll transform the data to separate actors listed in the cast column and count their appearances.
library(tidyverse)
library(knitr)
First, we’ll load the Netflix dataset that was downloaded from Kaggle.
# Check current working directory
cat("Current working directory:", getwd(), "\n\n")
## Current working directory: C:/Users/William/OneDrive/Desktop/New folder
# List files in the current directory to verify Netflix.csv exists
cat("Files in current directory:\n")
## Files in current directory:
csv_files <- list.files(pattern = "\\.csv$")  # pattern is a regular expression, not a glob
print(csv_files)
## [1] "Netflix.csv"
# Try to find and load the Netflix file
# Common file names from Kaggle dataset
possible_names <- c("Netflix.csv", "netflix.csv", "netflix_titles.csv",
                    "Netflix_titles.csv", "netflix_data.csv")
file_found <- FALSE
for (filename in possible_names) {
  if (file.exists(filename)) {
    cat("\nFound file:", filename, "\n")
    Netflix <- read.csv(filename, stringsAsFactors = FALSE)
    file_found <- TRUE
    break
  }
}
##
## Found file: Netflix.csv
if (!file_found) {
  stop("Netflix CSV file not found! Please ensure the file is in the working directory.\n",
       "Available CSV files: ", paste(csv_files, collapse = ", "))
}
# Display the structure of the dataset
cat("\nDataset loaded successfully!\n")
##
## Dataset loaded successfully!
str(Netflix)
## 'data.frame': 6234 obs. of 12 variables:
## $ show_id : int 81145628 80117401 70234439 80058654 80125979 80163890 70304989 80164077 80117902 70304990 ...
## $ type : chr "Movie" "Movie" "TV Show" "TV Show" ...
## $ title : chr "Norm of the North: King Sized Adventure" "Jandino: Whatever it Takes" "Transformers Prime" "Transformers: Robots in Disguise" ...
## $ director : chr "Richard Finn, Tim Maltby" "" "" "" ...
## $ cast : chr "Alan Marriott, Andrew Toth, Brian Dobson, Cole Howard, Jennifer Cameron, Jonathan Holmes, Lee Tockar, Lisa Duru"| __truncated__ "Jandino Asporaat" "Peter Cullen, Sumalee Montano, Frank Welker, Jeffrey Combs, Kevin Michael Richardson, Tania Gunadi, Josh Keaton"| __truncated__ "Will Friedle, Darren Criss, Constance Zimmer, Khary Payton, Mitchell Whitfield, Stuart Allan, Ted McGinley, Peter Cullen" ...
## $ country : chr "United States, India, South Korea, China" "United Kingdom" "United States" "United States" ...
## $ date_added : chr "September 9, 2019" "September 9, 2016" "September 8, 2018" "September 8, 2018" ...
## $ release_year: int 2019 2016 2013 2016 2017 2016 2014 2017 2017 2014 ...
## $ rating : chr "TV-PG" "TV-MA" "TV-Y7-FV" "TV-Y7" ...
## $ duration : chr "90 min" "94 min" "1 Season" "1 Season" ...
## $ listed_in : chr "Children & Family Movies, Comedies" "Stand-Up Comedy" "Kids' TV" "Kids' TV" ...
## $ description : chr "Before planning an awesome wedding for his grandfather, a polar bear king must take back a stolen artifact from"| __truncated__ "Jandino Asporaat riffs on the challenges of raising kids and serenades the audience with a rousing rendition of"| __truncated__ "With the help of three human allies, the Autobots once again protect Earth from the onslaught of the Decepticon"| __truncated__ "When a prison ship crash unleashes hundreds of Decepticons on Earth, Bumblebee leads a new Autobot force to protect humankind." ...
# Display the first few rows
head(Netflix)
## show_id type title
## 1 81145628 Movie Norm of the North: King Sized Adventure
## 2 80117401 Movie Jandino: Whatever it Takes
## 3 70234439 TV Show Transformers Prime
## 4 80058654 TV Show Transformers: Robots in Disguise
## 5 80125979 Movie #realityhigh
## 6 80163890 TV Show Apaches
## director
## 1 Richard Finn, Tim Maltby
## 2
## 3
## 4
## 5 Fernando Lebrija
## 6
## cast
## 1 Alan Marriott, Andrew Toth, Brian Dobson, Cole Howard, Jennifer Cameron, Jonathan Holmes, Lee Tockar, Lisa Durupt, Maya Kay, Michael Dobson
## 2 Jandino Asporaat
## 3 Peter Cullen, Sumalee Montano, Frank Welker, Jeffrey Combs, Kevin Michael Richardson, Tania Gunadi, Josh Keaton, Steve Blum, Andy Pessoa, Ernie Hudson, Daran Norris, Will Friedle
## 4 Will Friedle, Darren Criss, Constance Zimmer, Khary Payton, Mitchell Whitfield, Stuart Allan, Ted McGinley, Peter Cullen
## 5 Nesta Cooper, Kate Walsh, John Michael Higgins, Keith Powers, Alicia Sanz, Jake Borelli, Kid Ink, Yousef Erakat, Rebekah Graf, Anne Winters, Peter Gilroy, Patrick Davis
## 6 Alberto Ammann, Eloy Azorín, Verónica Echegui, Lucía Jiménez, Claudia Traisac
## country date_added release_year
## 1 United States, India, South Korea, China September 9, 2019 2019
## 2 United Kingdom September 9, 2016 2016
## 3 United States September 8, 2018 2013
## 4 United States September 8, 2018 2016
## 5 United States September 8, 2017 2017
## 6 Spain September 8, 2017 2016
## rating duration
## 1 TV-PG 90 min
## 2 TV-MA 94 min
## 3 TV-Y7-FV 1 Season
## 4 TV-Y7 1 Season
## 5 TV-14 99 min
## 6 TV-MA 1 Season
## listed_in
## 1 Children & Family Movies, Comedies
## 2 Stand-Up Comedy
## 3 Kids' TV
## 4 Kids' TV
## 5 Comedies
## 6 Crime TV Shows, International TV Shows, Spanish-Language TV Shows
## description
## 1 Before planning an awesome wedding for his grandfather, a polar bear king must take back a stolen artifact from an evil archaeologist first.
## 2 Jandino Asporaat riffs on the challenges of raising kids and serenades the audience with a rousing rendition of "Sex on Fire" in his comedy show.
## 3 With the help of three human allies, the Autobots once again protect Earth from the onslaught of the Decepticons and their leader, Megatron.
## 4 When a prison ship crash unleashes hundreds of Decepticons on Earth, Bumblebee leads a new Autobot force to protect humankind.
## 5 When nerdy high schooler Dani finally attracts the interest of her longtime crush, she lands in the cross hairs of his ex, a social media celebrity.
## 6 A young journalist is forced into a life of crime to save his father and family in this series based on the novel by Miguel Sáez Carral.
# Check if 'cast' column exists
if (!"cast" %in% colnames(Netflix)) {
  stop("'cast' column not found in dataset. Available columns: ",
       paste(colnames(Netflix), collapse = ", "))
}
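The `str()` output above shows blank fields such as `director` stored as `""` rather than `NA`, which is how `read.csv()` treats empty character fields by default. The following is a minimal sketch, using a made-up three-row CSV in place of the Kaggle file, of how the `na.strings` argument could turn those blanks into `NA` at load time. The toy values and the names `csv_text`, `default_read`, and `na_read` are assumptions for illustration only.

```r
# Toy CSV standing in for Netflix.csv (values are made up for illustration)
csv_text <- 'show_id,type,cast
1,Movie,"Actor A, Actor B"
2,TV Show,
3,TV Show,"Actor C"'

# Default behaviour: the empty cast field is read as "" (not NA)
default_read <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# With na.strings, empty fields are read as NA instead
na_read <- read.csv(text = csv_text, stringsAsFactors = FALSE,
                    na.strings = c("", "NA"))

is.na(default_read$cast)
is.na(na_read$cast)
```

Reading blanks as `NA` matters later in the analysis, because functions such as `drop_na()` act on `NA` values and leave empty strings untouched.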
We need to separate the actors in the cast column since multiple actors are listed together, separated by commas.
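# Separate actors in the cast column and rename the column.
# Note: drop_na() removes only NA values; titles whose cast is an empty
# string ("") stay in the data as a blank actor.
Netflix_Actor <- Netflix %>%
  separate_rows(cast, sep = ", ") %>%
  drop_na(cast) %>%
  rename(actor = cast)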
# Display the transformed data
head(Netflix_Actor)
## # A tibble: 6 × 12
## show_id type title director actor country date_added release_year rating
## <int> <chr> <chr> <chr> <chr> <chr> <chr> <int> <chr>
## 1 81145628 Movie Norm of … Richard… Alan… United… September… 2019 TV-PG
## 2 81145628 Movie Norm of … Richard… Andr… United… September… 2019 TV-PG
## 3 81145628 Movie Norm of … Richard… Bria… United… September… 2019 TV-PG
## 4 81145628 Movie Norm of … Richard… Cole… United… September… 2019 TV-PG
## 5 81145628 Movie Norm of … Richard… Jenn… United… September… 2019 TV-PG
## 6 81145628 Movie Norm of … Richard… Jona… United… September… 2019 TV-PG
## # ℹ 3 more variables: duration <chr>, listed_in <chr>, description <chr>
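Because the dataset was read with blank cast fields stored as `""`, `drop_na(cast)` in the step above has no `NA` values to remove for those titles. One possible variation, sketched here on a made-up three-row tibble rather than the real data, converts blanks to `NA` with `dplyr::na_if()` before splitting. The names `toy` and `toy_actor` and the placeholder actor names are hypothetical.

```r
library(dplyr)
library(tidyr)

# Made-up data: "Show 2" has no cast listed
toy <- tibble(
  title = c("Show 1", "Show 2", "Show 3"),
  cast  = c("Actor A, Actor B", "", "Actor A")
)

toy_actor <- toy %>%
  mutate(cast = na_if(cast, "")) %>%   # blank cast becomes NA
  separate_rows(cast, sep = ", ") %>%  # one row per actor
  drop_na(cast) %>%                    # now removes the title with no cast
  rename(actor = cast)

toy_actor
```

In this sketch the result has three rows (two for "Show 1", one for "Show 3"), and "Show 2" is dropped instead of surviving as a blank actor.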
Now we’ll filter for TV shows only and count which actors appear most frequently.
# Find the 6 actors with the most appearances in TV shows
top_actors <- Netflix_Actor %>%
  filter(type == "TV Show") %>%
  count(actor, sort = TRUE) %>%
  head()
# Display the results in a nice table
kable(top_actors,
      col.names = c("Actor", "Number of TV Shows"),
      caption = "Top 6 Actors with Most TV Show Appearances on Netflix")
| Actor | Number of TV Shows |
|---|---|
| | 210 |
| Takahiro Sakurai | 18 |
| Yuki Kaji | 16 |
| Daisuke Ono | 14 |
| David Attenborough | 14 |
| Ashleigh Ball | 12 |
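The 210-count row with no actor name in the table above is a blank entry in the actor column, not a person. As a minimal sketch on made-up data (the tibble `toy_actor`, the result `toy_top`, and the actor names are hypothetical), blank names could be excluded in the same `filter()` call that selects TV shows, before counting:

```r
library(dplyr)

# Made-up long-format data: one row per title/actor pair, with two blank actors
toy_actor <- tibble(
  type  = c("TV Show", "TV Show", "TV Show", "Movie", "TV Show", "TV Show"),
  actor = c("Actor A", "", "Actor A", "Actor A", "Actor B", "")
)

toy_top <- toy_actor %>%
  filter(type == "TV Show", actor != "") %>%  # keep TV shows with a named actor
  count(actor, sort = TRUE) %>%               # appearances per actor, descending
  slice_head(n = 6)                           # at most the first six rows

toy_top
```

Here the movie row and the two blank rows are excluded, so "Actor A" is counted twice and "Actor B" once.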
Let’s create a bar chart to visualize these results.
# Create a bar plot
ggplot(top_actors, aes(x = reorder(actor, n), y = n, fill = actor)) +
  geom_col(show.legend = FALSE) +
  geom_text(aes(label = n), hjust = -0.2, size = 4) +
  coord_flip() +
  labs(title = "Top 6 Actors with Most TV Show Appearances on Netflix",
       x = "Actor",
       y = "Number of TV Shows") +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5, face = "bold", size = 14),
        axis.text = element_text(size = 11))
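Since `hjust = -0.2` places each count label just past the end of its bar, the label on the longest bar can sit close to the panel edge. The sketch below, on made-up counts rather than the Netflix results, shows one way to add headroom with `scale_y_continuous()` and `expansion()`. The data frame `toy_top` and the 10% figure are assumptions chosen for illustration.

```r
library(ggplot2)

# Made-up counts for illustration only (not taken from the Netflix data)
toy_top <- data.frame(
  actor = c("Actor A", "Actor B"),
  n     = c(5L, 3L)
)

p <- ggplot(toy_top, aes(x = reorder(actor, n), y = n)) +
  geom_col() +
  geom_text(aes(label = n), hjust = -0.2, size = 4) +
  coord_flip() +
  # No padding at zero, 10% extra space beyond the longest bar for the labels
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  labs(x = "Actor", y = "Number of TV Shows") +
  theme_minimal()

p
```

The scale is still named `y` here because `coord_flip()` swaps the axes only when drawing. The counts remain mapped to the `y` aesthetic.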
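The table lists the six most frequent entries in the actor column for Netflix TV shows. The top entry, with 210 appearances, has no name: it most likely comes from TV shows with no cast listed, because drop_na() removes NA values but not the empty strings that read.csv() produces for blank fields. Among named actors, Takahiro Sakurai appears most often (18 TV shows), followed by Yuki Kaji (16), Daisuke Ono and David Attenborough (14 each), and Ashleigh Ball (12).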
sessionInfo()
## R version 4.5.1 (2025-06-13 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 26100)
##
## Matrix products: default
## LAPACK version 3.12.1
##
## locale:
## [1] LC_COLLATE=English_American Samoa.utf8
## [2] LC_CTYPE=English_American Samoa.utf8
## [3] LC_MONETARY=English_American Samoa.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_American Samoa.utf8
##
## time zone: Asia/Taipei
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] knitr_1.50 lubridate_1.9.4 forcats_1.0.1 stringr_1.5.2
## [5] dplyr_1.1.4 purrr_1.1.0 readr_2.1.5 tidyr_1.3.1
## [9] tibble_3.3.0 ggplot2_4.0.0 tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.6 jsonlite_2.0.0 compiler_4.5.1 tidyselect_1.2.1
## [5] jquerylib_0.1.4 scales_1.4.0 yaml_2.3.10 fastmap_1.2.0
## [9] R6_2.6.1 labeling_0.4.3 generics_0.1.4 bslib_0.9.0
## [13] pillar_1.11.1 RColorBrewer_1.1-3 tzdb_0.5.0 rlang_1.1.6
## [17] utf8_1.2.6 stringi_1.8.7 cachem_1.1.0 xfun_0.53
## [21] sass_0.4.10 S7_0.2.0 timechange_0.3.0 cli_3.6.5
## [25] withr_3.0.2 magrittr_2.0.4 digest_0.6.37 grid_4.5.1
## [29] rstudioapi_0.17.1 hms_1.1.3 lifecycle_1.0.4 vctrs_0.6.5
## [33] evaluate_1.0.5 glue_1.8.0 farver_2.1.2 rmarkdown_2.30
## [37] tools_4.5.1 pkgconfig_2.0.3 htmltools_0.5.8.1