Data Visualization - Disney

1 Introduction

In this project, I analyze the Disney+ dataset from Kaggle. Disney Plus, also known as Disney+, is a video-on-demand streaming platform owned and operated by The Walt Disney Company. Launched in November 2019, it has become one of the world's leading streaming platforms, with a broad and varied content catalog. The sections below clean the dataset, analyze it, and visualize the results.

1.1 Input Data

First, we load the dataset.

disney <- read.csv("datainput/disney_plus_titles.csv")
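
As a minimal sketch of an alternative loading step, `read.csv()` can mark empty fields as `NA` while reading, via its `na.strings` argument. The inline text below is a made-up two-row stand-in for the real file, and `csv_text` and `demo` are hypothetical names used only for illustration.

```r
# Hypothetical two-row excerpt standing in for disney_plus_titles.csv
csv_text <- "show_id,type,title,director
s1,Movie,Ernest Saves Christmas,John Cherry
s5,TV Show,The Beatles: Get Back,"

# na.strings = "" turns empty fields into NA while reading
demo <- read.csv(text = csv_text, na.strings = "", stringsAsFactors = FALSE)

print(dim(demo))             # number of rows and columns
print(colSums(is.na(demo)))  # NA count per column
```

Loading this way would make empty text fields visible to `is.na()` from the start, instead of leaving them as empty strings.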

Column descriptions for the dataset

  • show_id : Unique ID

  • type : Is it a Movie or a TV Show

  • title : Name of the movie / show

  • director : Directors of the movie / tv show

  • cast : Main cast of the movie / show

  • country : Country of production

  • date_added : Date added on Disney+

  • release_year : Original Release Year of the movie / tv show

  • rating : Rating of the movie / tv show

  • duration : Total duration of the movie / tv show

  • listed_in : Categories of the movie / show

  • description : Description of the movie / show

Next, we look at the first 10 rows of the data frame.

head(disney,10)
#>    show_id    type                                            title
#> 1       s1   Movie Duck the Halls: A Mickey Mouse Christmas Special
#> 2       s2   Movie                           Ernest Saves Christmas
#> 3       s3   Movie                     Ice Age: A Mammoth Christmas
#> 4       s4   Movie                       The Queen Family Singalong
#> 5       s5 TV Show                            The Beatles: Get Back
#> 6       s6   Movie                                Becoming Cousteau
#> 7       s7 TV Show                                          Hawkeye
#> 8       s8 TV Show                           Port Protection Alaska
#> 9       s9 TV Show                        Secrets of the Zoo: Tampa
#> 10     s10   Movie            A Muppets Christmas: Letters To Santa
#>                             director
#> 1  Alonso Ramirez Ramos, Dave Wasson
#> 2                        John Cherry
#> 3                       Karen Disher
#> 4                    Hamish Hamilton
#> 5                                   
#> 6                         Liz Garbus
#> 7                                   
#> 8                                   
#> 9                                   
#> 10                  Kirk R. Thatcher
#>                                                                                            cast
#> 1  Chris Diamantopoulos, Tony Anselmo, Tress MacNeille, Bill Farmer, Russi Taylor, Corey Burton
#> 2                                                      Jim Varney, Noelle Parker, Douglas Seale
#> 3                             Raymond Albert Romano, John Leguizamo, Denis Leary, Queen Latifah
#> 4           Darren Criss, Adam Lambert, Derek Hough, Alexander Jean, Fall Out Boy, Jimmie Allen
#> 5                                     John Lennon, Paul McCartney, George Harrison, Ringo Starr
#> 6                                                         Jacques Yves Cousteau, Vincent Cassel
#> 7           Jeremy Renner, Hailee Steinfeld, Vera Farmiga, Fra Fee, Tony Dalton, Zahn McClarnon
#> 8         Gary Muehlberger, Mary Miller, Curly Leach, Sam Carlson, Stuart Andrews, David Squibb
#> 9  Dr. Ray Ball, Dr. Lauren Smith, Chris Massaro, Tiffany Burns, Mike Burns, Melinda Mendolusky
#> 10                                     Steve Whitmire, Dave Goelz, Bill Barretta, Eric Jacobson
#>          country        date_added release_year rating  duration
#> 1                November 26, 2021         2016   TV-G    23 min
#> 2                November 26, 2021         1988     PG    91 min
#> 3  United States November 26, 2021         2011   TV-G    23 min
#> 4                November 26, 2021         2021  TV-PG    41 min
#> 5                November 25, 2021         2021         1 Season
#> 6  United States November 24, 2021         2021  PG-13    94 min
#> 7                November 24, 2021         2021  TV-14  1 Season
#> 8  United States November 24, 2021         2015  TV-14 2 Seasons
#> 9  United States November 24, 2021         2019  TV-PG 2 Seasons
#> 10 United States November 19, 2021         2008      G    45 min
#>                               listed_in
#> 1                     Animation, Family
#> 2                                Comedy
#> 3             Animation, Comedy, Family
#> 4                               Musical
#> 5         Docuseries, Historical, Music
#> 6             Biographical, Documentary
#> 7           Action-Adventure, Superhero
#> 8         Docuseries, Reality, Survival
#> 9  Animals & Nature, Docuseries, Family
#> 10              Comedy, Family, Musical
#>                                                                                            description
#> 1                                                     Join Mickey and the gang as they duck the halls!
#> 2                                                   Santa Claus passes his magic bag to a new St. Nic.
#> 3                                                            Sid the Sloth is on Santa's naughty list.
#> 4                                                                 This is real life, not just fantasy!
#> 5    A three-part documentary from Peter Jackson capturing a moment in music history with The Beatles.
#> 6                            An inside look at the legendary life of adventurer Jacques-Yves Cousteau.
#> 7  Clint Barton/Hawkeye must team up with skilled archer Kate Bishop to unravel a criminal conspiracy.
#> 8        Residents of Port Protection must combat volatile conditions to survive and thrive in Alaska.
#> 9                          A day in the life at ZooTampa is anything but ordinary. It's extraordinary!
#> 10                                        Celebrate the holiday season with all your favorite Muppets.

We then check the dimensions of the data frame.

dim(disney)
#> [1] 1450   12

2 Data Cleansing

The first step in any data analysis is making sure the data is clean.

2.1 Load Libraries

First, we load the required libraries.

library(dplyr) # data transformation
library(plotly) # interactive plots
library(glue) # custom text for interactive plot tooltips
library(scales) # custom axis labels and other scales
library(tidyr) # reshaping data, e.g. separate()
library(ggpubr) # exporting plots
library(tidyverse) # also attaches dplyr, tidyr, and ggplot2
library(lubridate) # date parsing and date components
library(ggplot2) # plotting

2.2 Explicit Coercion

Next, we check whether each column has the right data type.

str(disney)
#> 'data.frame':    1450 obs. of  12 variables:
#>  $ show_id     : chr  "s1" "s2" "s3" "s4" ...
#>  $ type        : chr  "Movie" "Movie" "Movie" "Movie" ...
#>  $ title       : chr  "Duck the Halls: A Mickey Mouse Christmas Special" "Ernest Saves Christmas" "Ice Age: A Mammoth Christmas" "The Queen Family Singalong" ...
#>  $ director    : chr  "Alonso Ramirez Ramos, Dave Wasson" "John Cherry" "Karen Disher" "Hamish Hamilton" ...
#>  $ cast        : chr  "Chris Diamantopoulos, Tony Anselmo, Tress MacNeille, Bill Farmer, Russi Taylor, Corey Burton" "Jim Varney, Noelle Parker, Douglas Seale" "Raymond Albert Romano, John Leguizamo, Denis Leary, Queen Latifah" "Darren Criss, Adam Lambert, Derek Hough, Alexander Jean, Fall Out Boy, Jimmie Allen" ...
#>  $ country     : chr  "" "" "United States" "" ...
#>  $ date_added  : chr  "November 26, 2021" "November 26, 2021" "November 26, 2021" "November 26, 2021" ...
#>  $ release_year: int  2016 1988 2011 2021 2021 2021 2021 2015 2019 2008 ...
#>  $ rating      : chr  "TV-G" "PG" "TV-G" "TV-PG" ...
#>  $ duration    : chr  "23 min" "91 min" "23 min" "41 min" ...
#>  $ listed_in   : chr  "Animation, Family" "Comedy" "Animation, Comedy, Family" "Musical" ...
#>  $ description : chr  "Join Mickey and the gang as they duck the halls!" "Santa Claus passes his magic bag to a new St. Nic." "Sid the Sloth is on Santa's naughty list." "This is real life, not just fantasy!" ...

Some of the column types above are not suitable: type is categorical but stored as text, date_added is a date stored as text, and release_year is stored as an integer. We therefore convert them.

disney$type <- as.factor(disney$type)
disney$date_added <- mdy(disney$date_added)
disney$release_year <- parse_date_time(disney$release_year,'y')
str(disney)
#> 'data.frame':    1450 obs. of  12 variables:
#>  $ show_id     : chr  "s1" "s2" "s3" "s4" ...
#>  $ type        : Factor w/ 2 levels "Movie","TV Show": 1 1 1 1 2 1 2 2 2 1 ...
#>  $ title       : chr  "Duck the Halls: A Mickey Mouse Christmas Special" "Ernest Saves Christmas" "Ice Age: A Mammoth Christmas" "The Queen Family Singalong" ...
#>  $ director    : chr  "Alonso Ramirez Ramos, Dave Wasson" "John Cherry" "Karen Disher" "Hamish Hamilton" ...
#>  $ cast        : chr  "Chris Diamantopoulos, Tony Anselmo, Tress MacNeille, Bill Farmer, Russi Taylor, Corey Burton" "Jim Varney, Noelle Parker, Douglas Seale" "Raymond Albert Romano, John Leguizamo, Denis Leary, Queen Latifah" "Darren Criss, Adam Lambert, Derek Hough, Alexander Jean, Fall Out Boy, Jimmie Allen" ...
#>  $ country     : chr  "" "" "United States" "" ...
#>  $ date_added  : Date, format: "2021-11-26" "2021-11-26" ...
#>  $ release_year: POSIXct, format: "2016-01-01" "1988-01-01" ...
#>  $ rating      : chr  "TV-G" "PG" "TV-G" "TV-PG" ...
#>  $ duration    : chr  "23 min" "91 min" "23 min" "41 min" ...
#>  $ listed_in   : chr  "Animation, Family" "Comedy" "Animation, Comedy, Family" "Musical" ...
#>  $ description : chr  "Join Mickey and the gang as they duck the halls!" "Santa Claus passes his magic bag to a new St. Nic." "Sid the Sloth is on Santa's naughty list." "This is real life, not just fantasy!" ...
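
As a hedged alternative sketch, the year column can also be turned into a plain `Date` with base R, which avoids the time-of-day component that a `POSIXct` value carries. This is not the conversion used above; the vectors below are small illustrative stand-ins, and `type_f` and `release_date` are hypothetical names.

```r
# Illustrative stand-ins for the type and release_year columns
type <- c("Movie", "Movie", "TV Show")
release_year <- c(2016L, 1988L, 2011L)

# Categorical text becomes a factor
type_f <- factor(type)

# Build an ISO date string (January 1 of each year) and parse it as a Date
release_date <- as.Date(paste0(release_year, "-01-01"))

str(type_f)
str(release_date)
print(format(release_date, "%Y"))  # the year can be recovered as text
```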

2.3 Check Missing Values

colSums(is.na(disney))
#>      show_id         type        title     director         cast      country 
#>            0            0            0            0            0            0 
#>   date_added release_year       rating     duration    listed_in  description 
#>            3            0            0            0            0            0

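Only date_added reports missing values (3). The text columns director, cast, country, and rating use empty strings rather than NA, so is.na() does not count them. I first convert those empty strings to NA, then replace every NA in the text columns with the string “Missing Values”. Because date_added is a Date column, its NA values get a placeholder date instead.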

disney$director[disney$director==""] <- NA
disney$cast[disney$cast==""] <- NA
disney$country[disney$country==""] <- NA
disney$rating[disney$rating==""] <- NA
disney$director[which(is.na(disney$director))] <- "Missing Values"
disney$cast[which(is.na(disney$cast))] <- "Missing Values"
disney$country[which(is.na(disney$country))] <- "Missing Values"
disney$date_added[which(is.na(disney$date_added))] <- "01-01-01" # placeholder: date_added is a Date column, so this is stored as 0001-01-01
disney$rating[which(is.na(disney$rating))] <- "Missing Values"
colSums(is.na(disney))
#>      show_id         type        title     director         cast      country 
#>            0            0            0            0            0            0 
#>   date_added release_year       rating     duration    listed_in  description 
#>            0            0            0            0            0            0

Good! No missing values remain.
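
The same replacement can be sketched as a loop over the text columns, which avoids repeating one line per column. This is a minimal base R sketch on a made-up three-row data frame; `demo` and `text_cols` are hypothetical names.

```r
# Made-up data frame with empty strings standing in for missing text
demo <- data.frame(
  director = c("John Cherry", "", "Liz Garbus"),
  rating   = c("PG", "", "PG-13"),
  stringsAsFactors = FALSE
)

text_cols <- c("director", "rating")
for (col in text_cols) {
  # Treat both NA and empty strings as missing
  is_blank <- is.na(demo[[col]]) | demo[[col]] == ""
  demo[[col]][is_blank] <- "Missing Values"
}

print(demo)
print(colSums(demo == "Missing Values"))  # replaced cells per column
```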

2.4 Finishing Data Cleansing

First, I check the data again.

head(disney,10)
#>    show_id    type                                            title
#> 1       s1   Movie Duck the Halls: A Mickey Mouse Christmas Special
#> 2       s2   Movie                           Ernest Saves Christmas
#> 3       s3   Movie                     Ice Age: A Mammoth Christmas
#> 4       s4   Movie                       The Queen Family Singalong
#> 5       s5 TV Show                            The Beatles: Get Back
#> 6       s6   Movie                                Becoming Cousteau
#> 7       s7 TV Show                                          Hawkeye
#> 8       s8 TV Show                           Port Protection Alaska
#> 9       s9 TV Show                        Secrets of the Zoo: Tampa
#> 10     s10   Movie            A Muppets Christmas: Letters To Santa
#>                             director
#> 1  Alonso Ramirez Ramos, Dave Wasson
#> 2                        John Cherry
#> 3                       Karen Disher
#> 4                    Hamish Hamilton
#> 5                     Missing Values
#> 6                         Liz Garbus
#> 7                     Missing Values
#> 8                     Missing Values
#> 9                     Missing Values
#> 10                  Kirk R. Thatcher
#>                                                                                            cast
#> 1  Chris Diamantopoulos, Tony Anselmo, Tress MacNeille, Bill Farmer, Russi Taylor, Corey Burton
#> 2                                                      Jim Varney, Noelle Parker, Douglas Seale
#> 3                             Raymond Albert Romano, John Leguizamo, Denis Leary, Queen Latifah
#> 4           Darren Criss, Adam Lambert, Derek Hough, Alexander Jean, Fall Out Boy, Jimmie Allen
#> 5                                     John Lennon, Paul McCartney, George Harrison, Ringo Starr
#> 6                                                         Jacques Yves Cousteau, Vincent Cassel
#> 7           Jeremy Renner, Hailee Steinfeld, Vera Farmiga, Fra Fee, Tony Dalton, Zahn McClarnon
#> 8         Gary Muehlberger, Mary Miller, Curly Leach, Sam Carlson, Stuart Andrews, David Squibb
#> 9  Dr. Ray Ball, Dr. Lauren Smith, Chris Massaro, Tiffany Burns, Mike Burns, Melinda Mendolusky
#> 10                                     Steve Whitmire, Dave Goelz, Bill Barretta, Eric Jacobson
#>           country date_added release_year         rating  duration
#> 1  Missing Values 2021-11-26   2016-01-01           TV-G    23 min
#> 2  Missing Values 2021-11-26   1988-01-01             PG    91 min
#> 3   United States 2021-11-26   2011-01-01           TV-G    23 min
#> 4  Missing Values 2021-11-26   2021-01-01          TV-PG    41 min
#> 5  Missing Values 2021-11-25   2021-01-01 Missing Values  1 Season
#> 6   United States 2021-11-24   2021-01-01          PG-13    94 min
#> 7  Missing Values 2021-11-24   2021-01-01          TV-14  1 Season
#> 8   United States 2021-11-24   2015-01-01          TV-14 2 Seasons
#> 9   United States 2021-11-24   2019-01-01          TV-PG 2 Seasons
#> 10  United States 2021-11-19   2008-01-01              G    45 min
#>                               listed_in
#> 1                     Animation, Family
#> 2                                Comedy
#> 3             Animation, Comedy, Family
#> 4                               Musical
#> 5         Docuseries, Historical, Music
#> 6             Biographical, Documentary
#> 7           Action-Adventure, Superhero
#> 8         Docuseries, Reality, Survival
#> 9  Animals & Nature, Docuseries, Family
#> 10              Comedy, Family, Musical
#>                                                                                            description
#> 1                                                     Join Mickey and the gang as they duck the halls!
#> 2                                                   Santa Claus passes his magic bag to a new St. Nic.
#> 3                                                            Sid the Sloth is on Santa's naughty list.
#> 4                                                                 This is real life, not just fantasy!
#> 5    A three-part documentary from Peter Jackson capturing a moment in music history with The Beatles.
#> 6                            An inside look at the legendary life of adventurer Jacques-Yves Cousteau.
#> 7  Clint Barton/Hawkeye must team up with skilled archer Kate Bishop to unravel a criminal conspiracy.
#> 8        Residents of Port Protection must combat volatile conditions to survive and thrive in Alaska.
#> 9                          A day in the life at ZooTampa is anything but ordinary. It's extraordinary!
#> 10                                        Celebrate the holiday season with all your favorite Muppets.

Next, I keep only the columns needed for the analysis, dropping “show_id”, “director”, “cast”, “country”, and “description”.

disney <- disney %>% select(-c(show_id, director, cast, country, description))
head(disney,10)
#>       type                                            title date_added
#> 1    Movie Duck the Halls: A Mickey Mouse Christmas Special 2021-11-26
#> 2    Movie                           Ernest Saves Christmas 2021-11-26
#> 3    Movie                     Ice Age: A Mammoth Christmas 2021-11-26
#> 4    Movie                       The Queen Family Singalong 2021-11-26
#> 5  TV Show                            The Beatles: Get Back 2021-11-25
#> 6    Movie                                Becoming Cousteau 2021-11-24
#> 7  TV Show                                          Hawkeye 2021-11-24
#> 8  TV Show                           Port Protection Alaska 2021-11-24
#> 9  TV Show                        Secrets of the Zoo: Tampa 2021-11-24
#> 10   Movie            A Muppets Christmas: Letters To Santa 2021-11-19
#>    release_year         rating  duration                            listed_in
#> 1    2016-01-01           TV-G    23 min                    Animation, Family
#> 2    1988-01-01             PG    91 min                               Comedy
#> 3    2011-01-01           TV-G    23 min            Animation, Comedy, Family
#> 4    2021-01-01          TV-PG    41 min                              Musical
#> 5    2021-01-01 Missing Values  1 Season        Docuseries, Historical, Music
#> 6    2021-01-01          PG-13    94 min            Biographical, Documentary
#> 7    2021-01-01          TV-14  1 Season          Action-Adventure, Superhero
#> 8    2015-01-01          TV-14 2 Seasons        Docuseries, Reality, Survival
#> 9    2019-01-01          TV-PG 2 Seasons Animals & Nature, Docuseries, Family
#> 10   2008-01-01              G    45 min              Comedy, Family, Musical

We keep only the first category listed in the listed_in column.

disney <- disney %>% separate(listed_in,c("category","category2","category3"), sep = ",")
disney <- disney %>% select(c(-"category2", -"category3"))
head(disney,10)
#>       type                                            title date_added
#> 1    Movie Duck the Halls: A Mickey Mouse Christmas Special 2021-11-26
#> 2    Movie                           Ernest Saves Christmas 2021-11-26
#> 3    Movie                     Ice Age: A Mammoth Christmas 2021-11-26
#> 4    Movie                       The Queen Family Singalong 2021-11-26
#> 5  TV Show                            The Beatles: Get Back 2021-11-25
#> 6    Movie                                Becoming Cousteau 2021-11-24
#> 7  TV Show                                          Hawkeye 2021-11-24
#> 8  TV Show                           Port Protection Alaska 2021-11-24
#> 9  TV Show                        Secrets of the Zoo: Tampa 2021-11-24
#> 10   Movie            A Muppets Christmas: Letters To Santa 2021-11-19
#>    release_year         rating  duration         category
#> 1    2016-01-01           TV-G    23 min        Animation
#> 2    1988-01-01             PG    91 min           Comedy
#> 3    2011-01-01           TV-G    23 min        Animation
#> 4    2021-01-01          TV-PG    41 min          Musical
#> 5    2021-01-01 Missing Values  1 Season       Docuseries
#> 6    2021-01-01          PG-13    94 min     Biographical
#> 7    2021-01-01          TV-14  1 Season Action-Adventure
#> 8    2015-01-01          TV-14 2 Seasons       Docuseries
#> 9    2019-01-01          TV-PG 2 Seasons Animals & Nature
#> 10   2008-01-01              G    45 min           Comedy
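
A hedged alternative sketch: the first category can also be taken directly with a regular expression, which does not depend on how many categories a title lists. The values below are copied from the rows shown above; `category` here is an illustrative vector, not the column created by `separate()`.

```r
# listed_in values taken from the first rows of the dataset
listed_in <- c(
  "Animation, Family",
  "Comedy",
  "Docuseries, Historical, Music",
  "Animals & Nature, Docuseries, Family"
)

# Drop everything from the first comma onward, then trim stray spaces
category <- trimws(sub(",.*$", "", listed_in))
print(category)
```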

Finally, we create two new columns, year_added and month_added, from the date_added column.

disney$year_added <- year(disney$date_added)
disney$month_added <- month(disney$date_added, label = TRUE)
head(disney,10)
#>       type                                            title date_added
#> 1    Movie Duck the Halls: A Mickey Mouse Christmas Special 2021-11-26
#> 2    Movie                           Ernest Saves Christmas 2021-11-26
#> 3    Movie                     Ice Age: A Mammoth Christmas 2021-11-26
#> 4    Movie                       The Queen Family Singalong 2021-11-26
#> 5  TV Show                            The Beatles: Get Back 2021-11-25
#> 6    Movie                                Becoming Cousteau 2021-11-24
#> 7  TV Show                                          Hawkeye 2021-11-24
#> 8  TV Show                           Port Protection Alaska 2021-11-24
#> 9  TV Show                        Secrets of the Zoo: Tampa 2021-11-24
#> 10   Movie            A Muppets Christmas: Letters To Santa 2021-11-19
#>    release_year         rating  duration         category year_added
#> 1    2016-01-01           TV-G    23 min        Animation       2021
#> 2    1988-01-01             PG    91 min           Comedy       2021
#> 3    2011-01-01           TV-G    23 min        Animation       2021
#> 4    2021-01-01          TV-PG    41 min          Musical       2021
#> 5    2021-01-01 Missing Values  1 Season       Docuseries       2021
#> 6    2021-01-01          PG-13    94 min     Biographical       2021
#> 7    2021-01-01          TV-14  1 Season Action-Adventure       2021
#> 8    2015-01-01          TV-14 2 Seasons       Docuseries       2021
#> 9    2019-01-01          TV-PG 2 Seasons Animals & Nature       2021
#> 10   2008-01-01              G    45 min           Comedy       2021
#>    month_added
#> 1          Nov
#> 2          Nov
#> 3          Nov
#> 4          Nov
#> 5          Nov
#> 6          Nov
#> 7          Nov
#> 8          Nov
#> 9          Nov
#> 10         Nov

Data cleansing is complete, and the data is ready for analysis and visualization.

3 Data Explanation

summary(disney)
#>       type         title             date_added        
#>  Movie  :1052   Length:1450        Min.   :0001-01-01  
#>  TV Show: 398   Class :character   1st Qu.:2019-11-12  
#>                 Mode  :character   Median :2019-11-12  
#>                                    Mean   :2016-03-18  
#>                                    3rd Qu.:2020-11-23  
#>                                    Max.   :2021-11-26  
#>                                                        
#>   release_year                       rating            duration        
#>  Min.   :1928-01-01 00:00:00.00   Length:1450        Length:1450       
#>  1st Qu.:1999-01-01 00:00:00.00   Class :character   Class :character  
#>  Median :2011-01-01 00:00:00.00   Mode  :character   Mode  :character  
#>  Mean   :2003-02-03 15:03:43.45                                        
#>  3rd Qu.:2018-01-01 00:00:00.00                                        
#>  Max.   :2021-01-01 00:00:00.00                                        
#>                                                                        
#>    category           year_added    month_added 
#>  Length:1450        Min.   :   1   Nov    :809  
#>  Class :character   1st Qu.:2019   Apr    : 86  
#>  Mode  :character   Median :2019   Jul    : 85  
#>                     Mean   :2016   Jan    : 64  
#>                     3rd Qu.:2020   Okt    : 63  
#>                     Max.   :2021   Mei    : 62  
#>                                    (Other):281

Insight:

  1. The type column contains 1052 Movie titles and 398 TV Show titles.
  2. The most recent date_added in the data is 2021-11-26.
  3. Release years of the Movies/TV Shows on the Disney platform range from 1928 to 2021.
  4. The latest release year for a Movie/TV Show on Disney is 2021.
  5. Disney added the most Movies/TV Shows in Nov, Apr, and Jul.
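
One caveat when reading the summary: the placeholder date 0001-01-01 assigned in section 2.3 is why the minimum of date_added is in year 1, and it pulls the mean of date_added and year_added downward. A minimal sketch of excluding it before summarising, using a small made-up vector (`date_added`, `placeholder`, and `real_dates` are illustrative names):

```r
# Three plausible dates plus the placeholder used for missing values
date_added <- as.Date(c("2019-11-12", "2020-11-23", "2021-11-26", "0001-01-01"))
placeholder <- as.Date("0001-01-01")

# Keep only genuine dates before computing summaries
real_dates <- date_added[date_added != placeholder]

print(mean(date_added))   # dragged far into the past by the placeholder
print(mean(real_dates))   # mean of the genuine dates only
print(range(real_dates))
```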

4 Study Case

  1. Compare the number of Movies and TV Shows by release year
disney %>% ggplot(mapping = aes(x=release_year, fill=type)) +
  geom_histogram() +
  labs(title = "Disney Films Released by Year", x="Release Year", y="Total Film") +
  scale_fill_manual(values = c("Movie" = "Black", "TV Show" = "Red")) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", hjust = 1))

Both Movies and TV Shows tend to trend upward over the years. Movies outnumber TV Shows in every year, but TV Shows also grow substantially between 2000 and 2020.
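
A claim such as "Movies outnumber TV Shows in every year" can be checked numerically rather than read off the histogram. The sketch below shows the idea on a made-up five-row data frame; `demo` and `counts` are hypothetical names, and the real check would use the cleaned dataset.

```r
# Made-up sample: one row per title
demo <- data.frame(
  type = c("Movie", "Movie", "TV Show", "Movie", "TV Show"),
  release_year = c(2019, 2019, 2019, 2021, 2021)
)

# Cross-tabulate titles by release year and type
counts <- table(demo$release_year, demo$type)
print(counts)

# TRUE only if Movies are at least as numerous as TV Shows in every year
print(all(counts[, "Movie"] >= counts[, "TV Show"]))
```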

  2. Which categories are among Disney's top 10?
top_categories <- disney %>% 
  group_by(category) %>% 
  count(name = "Freq") %>% 
  arrange(desc(Freq)) 
top10_categories <- head(top_categories,10)
top10_categories
#> # A tibble: 10 × 2
#> # Groups:   category [10]
#>    category          Freq
#>    <chr>            <int>
#>  1 Action-Adventure   452
#>  2 Animation          320
#>  3 Comedy             193
#>  4 Animals & Nature   173
#>  5 Documentary         65
#>  6 Coming of Age       56
#>  7 Biographical        35
#>  8 Docuseries          33
#>  9 Drama               27
#> 10 Buddy               20
ggplot(top10_categories,mapping=aes(x=Freq, y=reorder(category,Freq)))+
  geom_col(aes(fill=Freq),color = "maroon",show.legend = F)+
  scale_fill_gradient(low="pink",high="#cf2e2e")+
  labs(title = "Disney's Top 10 Categories", x = "Total Film", y = NULL)+
  theme_minimal()+
  theme(plot.title=element_text(face="bold", hjust = 1))+
  geom_label(data=top10_categories[1:4,], mapping=aes(label=Freq))+
  geom_vline(xintercept = mean(top10_categories$Freq), col="yellow",linetype=2,lwd=1)

The plot above shows Disney's top 10 categories. The yellow dashed line marks the mean title count across these ten categories. Four categories exceed the mean: Action-Adventure, Animation, Comedy, and Animals & Nature.
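
The mean line and the categories above it can be reproduced from the counts printed in the table. A small sketch using those values (`freq` and `avg` are illustrative names):

```r
# Counts copied from the top 10 categories table above
freq <- c(
  "Action-Adventure" = 452, "Animation" = 320, "Comedy" = 193,
  "Animals & Nature" = 173, "Documentary" = 65, "Coming of Age" = 56,
  "Biographical" = 35, "Docuseries" = 33, "Drama" = 27, "Buddy" = 20
)

avg <- mean(freq)
print(avg)                     # 137.4, the position of the dashed line
print(names(freq)[freq > avg]) # categories above the mean
```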

  3. Which ratings are among Disney's top 10?
top_ratings <- disney %>% group_by(rating) %>% count(name = "Freq") %>% arrange(desc(Freq))
top10_ratings <- head(top_ratings,10)

top10_ratings
#> # A tibble: 10 × 2
#> # Groups:   rating [10]
#>    rating          Freq
#>    <chr>          <int>
#>  1 TV-G             318
#>  2 TV-PG            301
#>  3 G                253
#>  4 PG               236
#>  5 TV-Y7            131
#>  6 TV-14             79
#>  7 PG-13             66
#>  8 TV-Y              50
#>  9 TV-Y7-FV          13
#> 10 Missing Values     3
ggplot(data = top10_ratings, mapping=aes(x=Freq,y=reorder(rating,Freq)))+
  geom_col(aes(fill=Freq), color="black", show.legend = F)+
  scale_fill_gradient(low="#79DAE8",high="#0AA1DD")+
  labs(title="Disney's Top 10 Ratings", x = "Total Film", y= NULL)+
  theme_minimal()+
  theme(plot.title = element_text(face="bold",hjust=1))+
  geom_label(data = top10_ratings[1:4,], mapping=aes(label=Freq))+
  geom_vline(xintercept = mean(top10_ratings$Freq), col = "#FCF69C",linetype=2,lwd=1)

The plot above shows Disney's top 10 ratings. The yellow dashed line marks the mean title count across these ratings. Four ratings exceed the mean: TV-G, TV-PG, G, and PG. TV-G marks a TV Show for General Audiences, suitable for all ages. TV-PG marks a TV Show for which Parental Guidance is suggested. G marks a Movie for General Audiences, and PG marks a Movie for which Parental Guidance is suggested.
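
The counts in the ratings table also show how concentrated the catalog is. The ten values sum to 1450, the number of rows in the dataset, so the "top 10" covers every rating value present, including "Missing Values". A short sketch computing the share held by the four largest ratings (`freq` and `top4_share` are illustrative names):

```r
# Counts copied from the top 10 ratings table above
freq <- c(
  "TV-G" = 318, "TV-PG" = 301, "G" = 253, "PG" = 236, "TV-Y7" = 131,
  "TV-14" = 79, "PG-13" = 66, "TV-Y" = 50, "TV-Y7-FV" = 13,
  "Missing Values" = 3
)

print(sum(freq))  # 1450, equal to the number of rows

# Percentage of all titles carrying one of the four most common ratings
top4_share <- round(100 * sum(freq[1:4]) / sum(freq), 1)
print(top4_share)
```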

  4. In which months does Disney add the most titles?
disney %>% group_by(month_added,type)%>% 
  count(name = "Freq") %>% 
  ggplot(aes(x=month_added,y=Freq,fill=type))+
  geom_col(aes(fill=type))+
  labs(title="Disney Films Added by Month", x="Month", y="Total Film")+
  theme_minimal()+
  theme(plot.title=element_text(face="bold",hjust=1))

The plot shows that November is the month in which Disney added the most titles, both Movies and TV Shows.
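
Month labels produced with `label = TRUE` follow the session locale, which is why the summary in section 3 shows names such as "Okt" and "Mei". As a hedged sketch, counting by month number sidesteps the locale; the dates below are illustrative, and `month_num` and `month_counts` are hypothetical names.

```r
# Illustrative dates; the real column is disney$date_added
date_added <- as.Date(c("2021-11-26", "2021-11-24", "2020-04-03", "2019-11-12"))

# Month as a number (1-12), independent of the locale
month_num <- as.integer(format(date_added, "%m"))

# Count per month, keeping months with zero titles
month_counts <- table(factor(month_num, levels = 1:12))
print(month_counts)

# Month number with the most additions
print(names(month_counts)[which.max(month_counts)])
```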

  5. Show the trend in movie duration from 2000 to 2020
disney %>% 
  filter(type=='Movie' & release_year>="2000-01-01" & release_year<="2020-01-02") %>% 
  mutate(movie_duration=substr(duration,1,nchar(as.character(duration))-4)) %>% 
  mutate(movie_duration = as.integer(movie_duration)) %>% 
  group_by(release_year) %>% 
  summarise(avg_duration = mean(movie_duration)) %>% 
  ggplot(aes(x=release_year, y= avg_duration))+
  geom_point() + geom_smooth()+
  labs(title = "Disney Movie Duration From 2000 - 2020", x = "Year", y = "Duration(Minutes)")+
  theme_minimal()+
  theme(plot.title=element_text(face="bold",hjust=1))

The plot above shows that movie duration trended downward from 2000 to 2020.
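
The duration step relies on every movie value ending in " min". A minimal alternative sketch that matches the suffix explicitly, instead of cutting a fixed number of characters, is shown below on a few values from the dataset (`is_movie` and `minutes` are illustrative names):

```r
# Mixed duration values as they appear in the duration column
duration <- c("23 min", "91 min", "1 Season", "94 min")

# Movies are the entries measured in minutes
is_movie <- grepl(" min$", duration)

# Strip the suffix and convert to integer minutes
minutes <- as.integer(sub(" min$", "", duration[is_movie]))
print(minutes)
print(mean(minutes))
```

Matching on the suffix means a value such as "1 Season" is skipped rather than silently turned into `NA`.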

5 Final Conclusion

From the analysis and visualization above, we can conclude that the number of Movies and TV Shows increases over the years, with Movies outnumbering TV Shows in every year. TV Shows nevertheless show a significant upward trend from 2000 to 2021. The most popular categories on Disney are Action-Adventure, Animation, and Comedy.

Four ratings exceed the mean count: TV-G, TV-PG, G, and PG. More Movies than TV Shows are added, and November is the month with the most additions. Finally, movie duration trended downward from 2000 to 2020.