Task 1: Reflection

I chose a Kaggle dataset, the IMDb Top 1000 Worst Rated Titles, partly because I recognized a couple of the movies in it. The variables relate to each other in different ways, which leaves plenty of room for comparison. Not every variable shows a trend or a direct correlation, but I was still curious about what factors could contribute to such a low rating (besides the movie simply being bad). The dataset also seemed like a good opportunity to use interactivity. A static version of the Task 2 visualization would be crowded and overwhelming, since it would have to label every point with its rating, genre, AND title. Using ggplotly kept the plot organized while still showing which point corresponds to which movie.

Task 2: Interactive plots

Data sourced from: https://www.kaggle.com/datasets/octopusteam/imdb-top-1000-worst-rated-titles

library(tidyverse) # also attaches readr, dplyr, and ggplot2
library(plotly)

# Load data here
worst_rated_titles <- read_csv("data/worstratedtitles.csv")

Do the following:

  1. Make a plot. Any kind of plot will do (though it might be easiest to work with geom_point()).

  2. Make the plot interactive with ggplotly().

  3. Make sure the hovering tooltip is more informative than the default.

Good luck and have fun!

# keep the first 50 rows (this assumes the file is already sorted from the lowest rating up)
worst_50 <- worst_rated_titles[1:50, ]

# make scatter plot
distribution <- ggplot(worst_50, aes(x = genres, y = averageRating, text = title)) + # text feeds the hovering tooltip
  geom_point(color = "red", position = position_jitter(width = 0.2, height = 0), alpha = 0.7) + # jitter sideways only so the plotted ratings stay exact
  labs(title = "The 50 Worst IMDb Ratings by Genre",
       x = "Genre", y = "Rating") +
  theme(axis.text.x = element_text(size = 5, angle = 45, hjust = 1))

# make interactive
ggplotly(distribution, tooltip = "text")

This visualization uses the Kaggle dataset IMDb Top 1000 Worst Rated Titles. The plot shows the 50 worst titles from the dataset and plots each movie's rating against its genre. The interactivity comes from the hovering tooltip, which displays the movie title for each point. Based on the visualization, there doesn't seem to be a clear relationship between genre and rating, but it is still interesting to see where movies in these genres fall in the rankings (especially the ones I have heard of or seen).
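
The tooltip could also carry the genre and rating alongside the title, which is what a static plot would struggle to show. Below is a minimal sketch of one way to do that. The three rows in `sample_titles` are made-up stand-ins that only reuse the column names from the plot above (`title`, `genres`, `averageRating`), and the `tooltip` column is a name introduced here for illustration.

```r
library(ggplot2)
library(plotly)

# Hypothetical stand-in rows; the real data would be worst_50
sample_titles <- data.frame(
  title = c("Movie A", "Movie B", "Movie C"),
  genres = c("Comedy", "Horror", "Comedy"),
  averageRating = c(1.9, 2.1, 2.3)
)

# Build one tooltip string per row; <br> starts a new line in a plotly tooltip
sample_titles$tooltip <- paste0(
  "Title: ", sample_titles$title,
  "<br>Genre: ", sample_titles$genres,
  "<br>Rating: ", sample_titles$averageRating
)

# Same kind of scatter plot, with the combined string mapped to text
rich_plot <- ggplot(sample_titles,
                    aes(x = genres, y = averageRating, text = tooltip)) +
  geom_point(color = "red", alpha = 0.7,
             position = position_jitter(width = 0.2, height = 0)) +
  labs(x = "Genre", y = "Rating")

# Show only the custom text when hovering
ggplotly(rich_plot, tooltip = "text")
```

Swapping `sample_titles` for `worst_50` should give the same three-line tooltip for every real movie, assuming the column names match.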

Task 3:

Install the {flexdashboard} package and create a new R Markdown file in your project by going to File > New File… > R Markdown… > From Template > Flexdashboard.

Using the documentation for {flexdashboard} online, create a basic dashboard that shows a plot (static or interactive) in at least three chart areas. Play with the layout if you’re feeling brave.
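
The install and template steps can also be run from the console instead of the menus. This is a rough sketch, not the required method; the file name `dashboard.Rmd` is a hypothetical choice, and the install step assumes an internet connection.

```r
# Hypothetical file name for the new dashboard; change it to suit the project
dashboard_file <- "dashboard.Rmd"

# Install {flexdashboard} only if it is not already available
if (!requireNamespace("flexdashboard", quietly = TRUE)) {
  install.packages("flexdashboard")
}

# Create the file from the package's template without overwriting an existing one
if (!file.exists(dashboard_file)) {
  rmarkdown::draft(dashboard_file,
                   template = "flex_dashboard",
                   package = "flexdashboard",
                   edit = FALSE)
}
```

The drafted file should start with a few placeholder chart sections, which can then be filled with the plots for each chart area.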