Statistical Analysis of TMDB's Kaggle Movie Data

Vineet Jaiswal
28th October 2017

This application analyzes movie data. It currently takes input from front-end filters, processes it in server-side code, and produces a ggplot graph and a table as output.

The application can be extended considerably, since the input data supports many more views than the ones shown here.

Where the data comes from

I got the data from Kaggle: https://www.kaggle.com/tmdb/tmdb-movie-metadata

Data specification

  • Columns
rawData <- read.csv("raw.csv", stringsAsFactors = FALSE)
colnames(rawData)
 [1] "budget"               "genres"               "homepage"            
 [4] "id"                   "keywords"             "original_language"   
 [7] "original_title"       "overview"             "popularity"          
[10] "production_companies" "production_countries" "release_date"        
[13] "revenue"              "runtime"              "spoken_languages"    
[16] "status"               "tagline"              "title"               
[19] "vote_average"         "vote_count"          
  • Notable feature of the data: several columns hold JSON values
rawData$genres[1]
[1] "[{\"id\": 28, \"name\": \"Action\"}, {\"id\": 12, \"name\": \"Adventure\"}, {\"id\": 14, \"name\": \"Fantasy\"}, {\"id\": 878, \"name\": \"Science Fiction\"}]"
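One way to unpack such a value is with the jsonlite package. The following is only a minimal sketch, assuming jsonlite is installed. The helper name `genre_names` is hypothetical and is not taken from the application code. The input string is the first `genres` value printed above.

```r
library(jsonlite)

# First value of rawData$genres, copied from the output above
genres_json <- '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}, {"id": 14, "name": "Fantasy"}, {"id": 878, "name": "Science Fiction"}]'

# Hypothetical helper: parse one JSON array string and return the genre names
genre_names <- function(json_string) {
  parsed <- fromJSON(json_string)   # an array of objects becomes a data frame
  if (length(parsed) == 0) {
    return(character(0))            # an empty array "[]" parses to an empty list
  }
  parsed$name
}

genre_names(genres_json)
```

The same helper could be applied to the whole column with `lapply(rawData$genres, genre_names)`, which gives one character vector of genre names per movie.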

server.R

GitHub code path: https://github.com/jaiswalvineet/Final9/blob/master/server.R

  • Load, clean and format the data
  • Convert the JSON columns (genres, keywords, etc.) to data frames
  • Take the slider input from the front end and pass it to the server side
  • Create a ggplot graph (I initially built the graph with plotly, but it did not work on the free version of ShinyApp)
  • Show all the filtered data in a table below the graph
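The steps above could look roughly like the following sketch. It is not the code from the repository. The helper `filter_by_rating`, the wrapper `make_server`, the input id `rating` and the output ids `moviePlot` and `movieTable` are all hypothetical. Filtering on `vote_average` and plotting `budget` against `revenue` are also assumptions, chosen only because those columns exist in the data. The shiny, ggplot2 and DT packages are assumed to be installed.

```r
library(shiny)
library(ggplot2)

# Hypothetical helper: keep rows whose vote_average lies inside the slider range.
# which() drops rows where vote_average is NA instead of returning NA rows.
filter_by_rating <- function(movies, rating_range) {
  keep <- which(movies$vote_average >= rating_range[1] &
                  movies$vote_average <= rating_range[2])
  movies[keep, , drop = FALSE]
}

# Sketch of a server function; `movies` stands for the cleaned data frame
make_server <- function(movies) {
  function(input, output, session) {
    # Re-filter whenever the (assumed) slider input `rating` changes
    filtered <- reactive({
      filter_by_rating(movies, input$rating)
    })

    # ggplot graph of the filtered rows
    output$moviePlot <- renderPlot({
      ggplot(filtered(), aes(x = budget, y = revenue)) +
        geom_point()
    })

    # Table of the filtered rows, shown below the graph
    output$movieTable <- DT::renderDT(filtered())
  }
}
```

Keeping the filter in a plain function, separate from the reactive code, makes it possible to check the filtering logic without starting the app.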

ui.R

GitHub code path: https://github.com/jaiswalvineet/Final9/blob/master/ui.R

  • Define all the controls
  • Bind the controls to values
  • Plot the graph
  • Show the table
  • Search the data inside the table
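A UI along these lines might be laid out as in the sketch below. It is an illustration under stated assumptions, not the contents of ui.R. The input id `rating`, the output ids `moviePlot` and `movieTable`, the title text and the slider settings are all hypothetical. The shiny and DT packages are assumed to be installed.

```r
library(shiny)

ui <- fluidPage(
  titlePanel("TMDB movie data"),
  sidebarLayout(
    sidebarPanel(
      # Range slider; its two-element value is available on the server as input$rating
      sliderInput("rating", "Vote average:",
                  min = 0, max = 10, value = c(5, 8), step = 0.1)
    ),
    mainPanel(
      # Placeholder for the graph rendered on the server
      plotOutput("moviePlot"),
      # Placeholder for the table; a DT table comes with a built-in search box
      DT::DTOutput("movieTable")
    )
  )
)
```

Using a DT table is one way to get the search feature without extra code, because the search box is part of the table widget itself.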

Public site: ShinyApp