Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
The objective of the original data visualisation is to show how often movie genres co-occur and how strongly different genres are related, using the 5000 movies from The Movie Database (TMDb). The target audience includes, but is not limited to, movie producers, writers, executives, critics and bloggers.
The visualisation chosen had the following three main issues:
Reference
The following code was used to fix the issues identified in the original.
library(readr)
library(dplyr)
library(ggplot2)
library(RColorBrewer)
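# Load the genre co-occurrence data
Movie_Genres <- read_csv("Movie Genres.csv")
# Extend the 9-colour Set1 palette so every sub-genre gets its own colour
colourCount <- length(unique(Movie_Genres$genre_sub_1))
getPalette <- colorRampPalette(brewer.pal(9, "Set1"))
# Stacked bar chart: one bar per main genre, filled by co-occurring sub-genre
# (the title is set in labs(); `main` is not a ggplot2 aesthetic, so it is dropped from aes())
rs1 <- ggplot(Movie_Genres, aes(fill = genre_sub_1, y = genre_main, x = ration)) +
  geom_bar(position = "stack", stat = "identity") +
  scale_fill_manual(values = getPalette(colourCount)) +
  labs(x = "Co-occurrence percentage", y = "Genre Co-occurrences", title = "Movie Genre Co-occurrence")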
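The palette step in the code above may be easier to follow in isolation. The sketch below shows how colorRampPalette() stretches the nine "Set1" colours to an arbitrary number of fill colours. The names base_colours, get_palette, n_genres and genre_colours are illustrative only, and the count of 20 is an assumed value; in the code above the real count comes from the data.

```r
library(RColorBrewer)

# Base palette: the 9 qualitative colours of "Set1"
base_colours <- brewer.pal(9, "Set1")

# colorRampPalette() returns a function that interpolates any number of colours
get_palette <- colorRampPalette(base_colours)

# Assumed count for illustration; the real value is
# length(unique(Movie_Genres$genre_sub_1))
n_genres <- 20
genre_colours <- get_palette(n_genres)

length(genre_colours)   # one colour per sub-genre
head(genre_colours, 3)  # colours are returned as hex strings
```

Interpolating a qualitative palette keeps the chart from running out of colours, but neighbouring interpolated colours can look similar, so the legend may still be hard to read when there are many sub-genres.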
Data Reference
The following plot fixes the main issues in the original.