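# Load readxl for read_excel() and ggplot2 for plotting
library(readxl)
library(ggplot2)
# Import the Excel file
Summer_Movies <- read_excel("../00_data/Summer Movies.xlsx")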
Summer_Movies
## # A tibble: 905 × 10
## tconst title_type primary_title original_title year runtime_minutes genres
## <chr> <chr> <chr> <chr> <dbl> <chr> <chr>
## 1 tt00114… movie Midsummer Ma… Midsummer Mad… 1920 60 Drama
## 2 tt00267… movie A Midsummer … A Midsummer N… 1935 133 Comed…
## 3 tt00338… movie The Teachers… Magistrarna p… 1941 86 Comedy
## 4 tt00373… movie Summer Storm Summer Storm 1944 106 Crime…
## 5 tt00384… movie Centennial S… Centennial Su… 1946 102 Histo…
## 6 tt00387… tvMovie A Midsummer … A Midsummer N… 1946 150 Drama…
## 7 tt00393… movie One Swallow … En fluga gör … 1947 88 Comedy
## 8 tt00408… movie Summer Holid… Summer Holiday 1948 93 Music…
## 9 tt00415… movie In the Good … In the Good O… 1949 102 Comed…
## 10 tt00429… movie Bountiful Su… Shchedroe leto 1951 87 Comed…
## # ℹ 895 more rows
## # ℹ 3 more variables: simple_title <chr>, average_rating <dbl>, num_votes <dbl>
Does the release year correlate with the average rating?
# Scatter plot of average rating against release year
ggplot(data = Summer_Movies) +
  geom_point(mapping = aes(x = year, y = average_rating))
Looking at the graph, it appears that a majority of the movies fall within the 6-7 rating range across the years. The year appears to change only the number of movies within that range, which grows over time, rather than the ratings themselves.
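
One way to put a number on this visual impression is a correlation coefficient. The sketch below is only an illustration: `year_rating_cor` is a hypothetical helper name, the toy data frame is made up and not taken from `Summer_Movies`, and Pearson correlation is assumed to be an acceptable measure here. Rows with a missing year or rating are dropped before computing.

```r
# Hypothetical helper: Pearson correlation between release year and
# average rating, using only rows where both values are present.
year_rating_cor <- function(movies) {
  cor(movies$year, movies$average_rating, use = "complete.obs")
}

# Toy data for illustration only (not taken from Summer_Movies).
toy <- data.frame(
  year           = c(1950, 1960, 1970, 1980, NA),
  average_rating = c(6.0, 6.5, 7.0, 7.5, 6.2)
)
toy_r <- year_rating_cor(toy)

# With the real data, assuming Summer_Movies is loaded as above:
# year_rating_cor(Summer_Movies)
```

A value near 0 would be consistent with the reading above that the year has little relationship to the rating, while a value near 1 or -1 would point to a strong linear trend.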