The Tidyverse is a collection of R packages. When the tidyverse package is loaded, it attaches ggplot2, dplyr, tidyr, readr, purrr, tibble, stringr, and forcats.

forcats and ggplot2:

To demonstrate the Tidyverse, I have selected the forcats and ggplot2 packages from this collection. dplyr was used as well. The data is the Disney movies gross income dataset, covering 1937-2016, from Kaggle.

The purpose of this blog is to categorize Disney movies according to their genre. The gross income of those movies is also analyzed.

https://www.kaggle.com/rashikrahmanpritom/disney-movies-19372016-total-gross

# Packages used in this post
library(dplyr)
library(forcats)
library(ggplot2)
library(kableExtra)
library(data.table)
library(tidyverse)

# Read the Disney gross income data from the GitHub copy of the file
disney_movies_total_gross <- read.csv("https://raw.githubusercontent.com/maliat-hossain/FileProcessing/main/disney_movies_total_gross.csv")

# Show the first rows as a striped, scrollable table
head(disney_movies_total_gross) %>%
  kable() %>%
  kable_styling(bootstrap_options = "striped", font_size = 10) %>%
  scroll_box(height = "500px", width = "100%")


Only the necessary rows and columns have been selected using the Tidyverse package dplyr. For this assignment I am focusing on the Disney movies released from 1937 to 1961.

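```r
# Keep only the first column (movie_title)
DisneyMovies <-
  disney_movies_total_gross %>%
    dplyr::select(1)
# Take the first 10 rows; indexing a one-column data frame this way returns a plain vector
DisneyMovies1 <-
  DisneyMovies[1:10, ]
```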

The movie titles have been converted to a factor so that they can be grouped into categories. The movies have been categorized as musical, adventure, comedy, and drama. forcats from the Tidyverse works really well for manipulating categorical variables.

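# Convert the titles to a factor so forcats can recode the levels
DisneyMovies2 <-
  factor(DisneyMovies1)
# Display the factor as a striped, scrollable table
DisneyMovies2 %>%
  kable() %>%
  kable_styling(bootstrap_options = "striped",
                font_size = 10) %>%
  scroll_box(height = "500px", width = "100%")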
x
Snow White and the Seven Dwarfs
Pinocchio
Fantasia
Song of the South
Cinderella
20,000 Leagues Under the Sea
Lady and the Tramp
Sleeping Beauty
101 Dalmatians
The Absent Minded Professor
DisneyMovies2<-
  fct_recode(DisneyMovies2,
             Musical="Snow White and the Seven Dwarfs",
             Adventure="Pinocchio",
             Musical="Fantasia",
             Adventure="Song of the South",
             Drama="Cinderella",
             Adventure="20,000 Leagues Under the Sea",
             Drama="Lady and the Tramp",
             Drama="Sleeping Beauty",
             Comedy="101 Dalmatians",
             Comedy="The Absent Minded Professor")
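
The recoding step can be checked on its own. The following is a minimal sketch, assuming only the ten titles and the genre assignments listed above; the names `titles`, `genres`, and `genre_counts` are hypothetical and not part of the original analysis. It collapses the titles into the four genres with `fct_recode()` and then counts the movies per genre with `fct_count()`.

```r
library(forcats)

# The ten titles from the post, stored as a factor (hypothetical variable name)
titles <- factor(c(
  "Snow White and the Seven Dwarfs", "Pinocchio", "Fantasia",
  "Song of the South", "Cinderella", "20,000 Leagues Under the Sea",
  "Lady and the Tramp", "Sleeping Beauty", "101 Dalmatians",
  "The Absent Minded Professor"
))

# Map each title (old level) to a genre (new level);
# several old levels may share one new level, which merges them
genres <- fct_recode(titles,
  Musical   = "Snow White and the Seven Dwarfs",
  Adventure = "Pinocchio",
  Musical   = "Fantasia",
  Adventure = "Song of the South",
  Drama     = "Cinderella",
  Adventure = "20,000 Leagues Under the Sea",
  Drama     = "Lady and the Tramp",
  Drama     = "Sleeping Beauty",
  Comedy    = "101 Dalmatians",
  Comedy    = "The Absent Minded Professor"
)

# Count how many movies fall in each genre
genre_counts <- fct_count(genres)
print(genre_counts)
```

Because the recoded factor keeps one entry per movie, it can be attached to a data frame of the same ten rows as a genre column.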

The total gross income column for these movies has been added.

DisneyMovies3<-
  disney_movies_total_gross %>%
    dplyr::select(1,5)
DisneyMovies3<-
  DisneyMovies3[1:10,]

Summary statistics for the total gross revenue of these Disney movies have been calculated.

summary(DisneyMovies3)
##  movie_title         total_gross       
##  Length:10          Min.   :  9464608  
##  Class :character   1st Qu.: 37400000  
##  Mode  :character   Median : 83810000  
##                     Mean   : 81219150  
##                     3rd Qu.: 91450000  
##                     Max.   :184925485

case_when from dplyr is used to bin the movies' gross income. A variable named comparison_movies has been created, which shows whether the gross income of each selected movie is “Below Average”, “Around Average”, or “Above Average”. The cutoffs are the mean (81219150) and the median (83810000) taken from the summary statistics.

DisneyMovies4 <-
  DisneyMovies3 %>%
    mutate(comparison_movies = case_when(
      total_gross < 81219150 ~ "Below Average",    # below the mean
      total_gross <= 83810000 ~ "Around Average",  # from the mean up to the median
      TRUE ~ "Above Average")) %>%                 # above the median
    select(movie_title, total_gross, comparison_movies)
# Display the binned data as a striped, scrollable table
DisneyMovies4 %>%
  kable() %>%
  kable_styling(bootstrap_options = "striped",
                font_size = 10) %>%
  scroll_box(height = "500px",
             width = "100%")
| movie_title | total_gross | comparison_movies |
|---|---|---|
| Snow White and the Seven Dwarfs | 184925485 | Above Average |
| Pinocchio | 84300000 | Above Average |
| Fantasia | 83320000 | Around Average |
| Song of the South | 65000000 | Below Average |
| Cinderella | 85000000 | Above Average |
| 20,000 Leagues Under the Sea | 28200000 | Below Average |
| Lady and the Tramp | 93600000 | Above Average |
| Sleeping Beauty | 9464608 | Below Average |
| 101 Dalmatians | 153000000 | Above Average |
| The Absent Minded Professor | 25381407 | Below Average |
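
The cutoffs do not have to be typed in by hand. The following is a minimal sketch, assuming the ten gross values shown in the table; the names `movies`, `lower`, `upper`, and `binned` are hypothetical, and using the smaller and larger of the mean and median as the band edges is an assumption made for illustration, not the original method.

```r
library(dplyr)

# The ten movies and their gross values, copied from the table in the post
movies <- data.frame(
  movie_title = c(
    "Snow White and the Seven Dwarfs", "Pinocchio", "Fantasia",
    "Song of the South", "Cinderella", "20,000 Leagues Under the Sea",
    "Lady and the Tramp", "Sleeping Beauty", "101 Dalmatians",
    "The Absent Minded Professor"
  ),
  total_gross = c(
    184925485, 84300000, 83320000, 65000000, 85000000,
    28200000, 93600000, 9464608, 153000000, 25381407
  )
)

# Compute the cutoffs from the data instead of hard-coding them
avg <- mean(movies$total_gross)
med <- median(movies$total_gross)
lower <- min(avg, med)  # lower edge of the "Around Average" band
upper <- max(avg, med)  # upper edge of the "Around Average" band

# case_when() checks the conditions in order, so the second test
# only sees values that are already >= lower
binned <- movies %>%
  mutate(comparison_movies = case_when(
    total_gross < lower  ~ "Below Average",
    total_gross <= upper ~ "Around Average",
    TRUE                 ~ "Above Average"
  ))

print(count(binned, comparison_movies))
```

With the cutoffs computed this way, the bins follow the data if a different set of rows is selected.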

The income status of the selected movies has been visualized with a bar plot. Each color represents a different income status.

ggplot(data = DisneyMovies4,aes(x = movie_title,fill = comparison_movies))+
  geom_bar(position = "dodge")+
  coord_flip()
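
Since geom_bar() counts rows, every movie in the plot above gets a bar of the same length. As an alternative sketch (not the original plot), the gross income itself can be mapped to the bar length with geom_col(), and forcats can order the titles by income with fct_reorder(). The data frame `movies` below is a hypothetical stand-in rebuilt from the table in this post.

```r
library(ggplot2)
library(forcats)

# Titles, gross values, and income status copied from the table in the post
movies <- data.frame(
  movie_title = c(
    "Snow White and the Seven Dwarfs", "Pinocchio", "Fantasia",
    "Song of the South", "Cinderella", "20,000 Leagues Under the Sea",
    "Lady and the Tramp", "Sleeping Beauty", "101 Dalmatians",
    "The Absent Minded Professor"
  ),
  total_gross = c(
    184925485, 84300000, 83320000, 65000000, 85000000,
    28200000, 93600000, 9464608, 153000000, 25381407
  ),
  comparison_movies = c(
    "Above Average", "Above Average", "Around Average", "Below Average",
    "Above Average", "Below Average", "Above Average", "Below Average",
    "Above Average", "Below Average"
  )
)

# Order the titles by gross income, then draw one bar per movie
p <- ggplot(movies,
            aes(x = fct_reorder(movie_title, total_gross),
                y = total_gross,
                fill = comparison_movies)) +
  geom_col() +
  coord_flip() +
  labs(x = "Movie", y = "Total gross", fill = "Income status")

print(p)
```

Ordering the bars by income groups the colors together, which makes the three income bands easier to compare.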

Conclusion

The plot shows that five of the ten selected Disney movies, released from 1937 to 1961, fall in the Above Average group, which is more than in either of the other two groups.