The dataset that I have chosen to analyze provides IMDB data for both movie and television titles. It includes variables such as year, country, budget, rating, and language, among others.
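A minimal sketch of how the column types can be inspected with pandas. The helper name `load_titles` and the three-row inline sample are assumptions made for illustration. The real file name is not given in this report, so the sample stands in for it.

```python
import io

import pandas as pd


def load_titles(source):
    """Read the IMDB table from a CSV path or file-like object (hypothetical helper)."""
    return pd.read_csv(source)


# Made-up rows standing in for the real CSV, whose path is not stated in the report.
sample_csv = io.StringIO(
    "movie_title,title_year,imdb_score\n"
    "Title A,2009,7.9\n"
    "Title B,2015,6.8\n"
    "Title C,,5.1\n"
)

df = load_titles(sample_csv)

# One dtype per column; a missing year forces title_year to float64.
print(df.dtypes)
```

With the real file, passing its path to `load_titles` in place of `sample_csv` should give a listing like the one in this report.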
## color object
## director_name object
## num_critic_for_reviews float64
## duration float64
## director_facebook_likes float64
## actor_3_facebook_likes float64
## actor_2_name object
## actor_1_facebook_likes float64
## gross object
## genres object
## actor_1_name object
## movie_title object
## num_voted_users int64
## cast_total_facebook_likes int64
## actor_3_name object
## facenumber_in_poster float64
## plot_keywords object
## movie_imdb_link object
## num_user_for_reviews float64
## language object
## country object
## content_rating object
## budget float64
## title_year float64
## actor_2_facebook_likes float64
## imdb_score float64
## aspect_ratio float64
## movie_facebook_likes int64
## dtype: object
The first graph that I created using this dataset illustrates the overall title count by content rating. The most common rating by far is ‘R’, followed by ‘PG-13’ and ‘PG’. Because these are film ratings rather than television ratings, the dataset appears to contain more movies than television titles.
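A count-by-category bar chart like this one can be sketched as follows. The helper names `top_counts` and `plot_counts`, the cutoff `n`, and the demo rows are assumptions made for illustration, not the code originally used for the figure. Only the column name `content_rating` comes from the dataset.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def top_counts(df, column, n=10):
    """Return the n most frequent values in `column`; missing entries are ignored."""
    return df[column].value_counts().head(n)


def plot_counts(counts, xlabel):
    """Draw one bar per category and return the Axes for further styling."""
    fig, ax = plt.subplots()
    ax.bar(counts.index.astype(str), counts.values)
    ax.set_xlabel(xlabel)
    ax.set_ylabel("Number of titles")
    return ax


# Made-up rows standing in for the real table.
demo = pd.DataFrame(
    {"content_rating": ["R", "R", "PG-13", "R", "PG", "PG-13", None]}
)

rating_counts = top_counts(demo, "content_rating", n=10)
ax = plot_counts(rating_counts, "Content rating")
```

The same two helpers would cover the language and country charts by changing the column name and `n`.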
The second graph that I created using this dataset illustrates the overall title count by language. The most common language is English, with French and Spanish following far behind.
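The third graph that I created using this dataset is a histogram of the overall title count by year. Title counts climb steeply from the mid-1990s. The busiest single bin covers roughly 2006 to 2009, and the 2010s remain heavily represented even though the data ends in 2016. The dataset contains far more titles from the early 21st century than from the entire 20th century.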
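A histogram of this kind can be sketched as below. Because `title_year` is stored as float64, the column likely contains missing values, so the sketch drops them before binning. The function name `plot_year_histogram` and the demo years are assumptions made for illustration. The 30 bins match the histogram output in this report.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_year_histogram(years, bins=30):
    """Histogram of title years; returns the bin counts and bin edges."""
    clean = years.dropna()  # NaN years would otherwise break the bin range
    fig, ax = plt.subplots()
    counts, edges, _ = ax.hist(clean, bins=bins)
    ax.set_xlabel("Title year")
    ax.set_ylabel("Number of titles")
    return counts, edges


# Made-up years standing in for the real title_year column.
demo_years = pd.Series([1916.0, 1994.0, 2006.0, 2009.0, np.nan, 2016.0])

counts, edges = plot_year_histogram(demo_years, bins=30)
```

The returned `edges` span the earliest to the latest year, so each bin covers a little over three years when the data runs from 1916 to 2016.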
## (array([ 1., 1., 1., 3., 2., 4., 9., 8., 6., 11., 8.,
## 11., 9., 16., 26., 31., 32., 24., 58., 87., 82., 122.,
## 95., 172., 519., 568., 604., 928., 676., 821.]), array([1916. , 1919.33333333, 1922.66666667, 1926. ,
## 1929.33333333, 1932.66666667, 1936. , 1939.33333333,
## 1942.66666667, 1946. , 1949.33333333, 1952.66666667,
## 1956. , 1959.33333333, 1962.66666667, 1966. ,
## 1969.33333333, 1972.66666667, 1976. , 1979.33333333,
## 1982.66666667, 1986. , 1989.33333333, 1992.66666667,
## 1996. , 1999.33333333, 2002.66666667, 2006. ,
## 2009.33333333, 2012.66666667, 2016. ]), <a list of 30 Patch objects>)
The fourth graph that I created using this dataset illustrates the overall title count by country. The United States accounts for by far the most titles in the dataset, followed by the UK.