The file “IMDB Movies” contains a sample of movies from the 250 highest rated movies of all time. The Box_Office variable is measured in millions of dollars. Use this file to answer the following questions. Include your written responses with any charts and tables in this document. Upload your Excel file to go with your quiz.
1. What are the elements in the data set, and are the data from a population or a sample?
The elements in the data set are the individual movies. The data are a sample because the file contains only some of the 250 highest rated movies, not all of them.
Avg_Rating | Median_Rating | SD_Rating | CV |
---|---|---|---|
8.323 | 8.3 | 0.243 | 0.029 |
The average and median ratings are both close to 8.3, suggesting that the distribution of ratings is roughly symmetric. The data represent some of the highest rated and highest grossing movies in history. The standard deviation of 0.243 shows very little variability in the ratings. The coefficient of variation confirms this: the standard deviation is only about 3% of the mean.
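As a quick check on the coefficient of variation, the calculation can be reproduced from the two summary values reported in the table. This is only a sketch in Python rather than Excel; the function name `coefficient_of_variation` is an illustrative choice, not something from the quiz file.

```python
# Summary statistics as reported in the rating table above.
avg_rating = 8.323
sd_rating = 0.243


def coefficient_of_variation(sd, mean):
    """Return the standard deviation expressed as a fraction of the mean."""
    return sd / mean


cv_rating = coefficient_of_variation(sd_rating, avg_rating)
print(f"CV = {cv_rating:.3f} ({cv_rating:.1%} of the mean)")
```

The result rounds to 0.029, matching the CV column in the table.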
Min | Q1 | Q2 | Q3 | Max |
---|---|---|---|---|
8 | 8.1 | 8.3 | 8.5 | 9.3 |
Avg_Box | Median_Box | SD_Box |
---|---|---|
282.154 | 120.073 | 401.591 |
The average box office revenue is about $282 million, while the median box office revenue is about $120 million. The large gap between these two values suggests strong right skew in the data. The median is a better measure of the center of the distribution because the mean is pulled upward by extreme values.
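One way to put a number on the gap between the mean and the median is Pearson's median skewness coefficient, 3 × (mean − median) / SD. This measure is not part of the original answer; it is used here only as a rough illustration, and the function name is an assumption. The inputs are the summary values reported in the tables.

```python
def pearson_median_skewness(mean, median, sd):
    """Pearson's second skewness coefficient: 3 * (mean - median) / sd."""
    return 3 * (mean - median) / sd


# Reported summary statistics (box office in millions of dollars).
skew_box = pearson_median_skewness(mean=282.154, median=120.073, sd=401.591)
skew_rating = pearson_median_skewness(mean=8.323, median=8.3, sd=0.243)

print(f"Box office skewness: {skew_box:.2f}")
print(f"Rating skewness:     {skew_rating:.2f}")
```

A positive value indicates right skew. The box office value is well above the rating value, which is consistent with the conclusion that box office revenue is strongly right skewed while ratings are closer to symmetric.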
The standard deviation is around $400 million, which shows a lot of variation in box office revenue between movies.
A film with box office revenue of $775.09 million sits at the 90th percentile, meaning it is at the threshold of the top 10% of films in box office.
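The 90th percentile calculation can be sketched as follows. The list `box_office` holds hypothetical values made up for illustration, not the contents of the quiz file, so the result will not match 775.09. The `"inclusive"` method interpolates between the minimum and maximum of the data, which should correspond to Excel's PERCENTILE.INC.

```python
import statistics

# HYPOTHETICAL box office values in millions of dollars (not the real file).
box_office = [12, 35, 48, 60, 95, 120, 180, 260, 410, 775, 2800]

# n=10 returns the nine decile cut points; the last one is the 90th percentile.
deciles = statistics.quantiles(box_office, n=10, method="inclusive")
p90 = deciles[-1]

# Films above the cut point fall in the top 10% of this hypothetical sample.
top_films = [x for x in box_office if x > p90]

print(f"90th percentile: {p90}")
print(f"Values above it: {top_films}")
```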
The z-score for The Dark Knight, a very popular movie, is 1.8, meaning its box office revenue is 1.8 standard deviations above the mean.
Avengers: Endgame has a z-score of 6.27. The box office revenue for this film is 6.27 standard deviations above the mean, well beyond the usual cutoff of 3 standard deviations, so it would be considered an outlier.
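The z-score reasoning above can be sketched in Python using the reported mean and standard deviation. The function names are illustrative assumptions. Because the quiz answer reports only the z-scores, the revenue figures printed here are back-calculated from rounded values and are approximate.

```python
# Reported box office summary statistics, in millions of dollars.
MEAN_BOX = 282.154
SD_BOX = 401.591


def z_score(x, mean=MEAN_BOX, sd=SD_BOX):
    """Number of standard deviations that x lies above (or below) the mean."""
    return (x - mean) / sd


def implied_revenue(z, mean=MEAN_BOX, sd=SD_BOX):
    """Invert the z-score formula to recover the approximate revenue."""
    return mean + z * sd


def is_outlier(z, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    return abs(z) > threshold


for title, z in [("The Dark Knight", 1.8), ("Avengers: Endgame", 6.27)]:
    revenue = implied_revenue(z)
    print(f"{title}: z = {z}, about ${revenue:,.0f} million, outlier: {is_outlier(z)}")
```

Only the second film is flagged, since 6.27 exceeds the 3 standard deviation threshold while 1.8 does not.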