Goodreads is a go-to platform for book lovers to track what they’ve read, share reviews, and discover new favorites. One of its most popular features is the “Best Books Ever” list, which ranks books based on user ratings and votes. It’s a mix of classics, modern hits, and everything in between, giving a good snapshot of what readers around the world consider the best of the best.
Analysis
In this analysis, I looked at the books on the “Best Books Ever” list to explore trends in how readers rate and engage with them. Using R to scrape and visualize the data, I compared each book's average rating with its number of ratings and identified which books were the most popular. The goal was to understand what makes a book highly rated or widely read, and whether the two always go hand in hand.
Data Source:
The data is scraped from Goodreads’ “Best Books Ever” ranking, which is publicly accessible and served as structured HTML. For each book, the analysis uses two fields: the average rating and the number of ratings.
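As a rough illustration of the cleaning step that follows scraping, here is a minimal base R sketch that turns a rating string into two numeric columns. The string format, the sample values, and the names `raw`, `parse_rating_text`, `avg_rating`, and `num_ratings` are all assumptions made for this example, not something taken from the live page.

```r
# Hypothetical rating strings; the exact format is an assumption,
# not verified against the live Goodreads page.
raw <- c("4.27 avg rating - 1,234,567 ratings",
         "3.95 avg rating - 85,210 ratings")

parse_rating_text <- function(x) {
  # Pull every numeric token (digits with optional "." or ","), in order.
  tokens <- regmatches(x, gregexpr("[0-9][0-9.,]*", x))
  data.frame(
    # First token: the average rating.
    avg_rating = vapply(tokens, function(t) as.numeric(t[1]), numeric(1)),
    # Second token: the rating count, with thousands separators removed.
    num_ratings = vapply(tokens, function(t) as.numeric(gsub(",", "", t[2])),
                         numeric(1))
  )
}

books <- parse_rating_text(raw)
print(books)
```

The parser does not rely on whatever separator sits between the two parts, only on the order of the numbers, so it should tolerate small differences in punctuation.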
Visualization 1. Average Rating Distribution: Most vs. Least Rated Books
Explanation: This density plot compares the distribution of average ratings between books with more than 2 million ratings and those with fewer. It shows whether widely rated books also tend to be highly rated, and whether average ratings are more tightly concentrated at one popularity level than the other.
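A comparison like this could be sketched in base R as follows. The data frame `books` and its columns are placeholders filled with random values purely so the example runs; they are not the scraped data, and base graphics are used here only for self-containment, not because the original plots were made this way.

```r
set.seed(1)  # placeholder data only, not real Goodreads values
books <- data.frame(
  avg_rating  = round(runif(200, 3.4, 4.7), 2),
  num_ratings = round(10^runif(200, 4, 7))
)

# Split books at the 2 million ratings threshold described above.
threshold <- 2e6
high_label <- "More than 2M ratings"
low_label  <- "2M ratings or fewer"
books$group <- ifelse(books$num_ratings > threshold, high_label, low_label)

# Estimate one density curve per group.
d_high <- density(books$avg_rating[books$group == high_label])
d_low  <- density(books$avg_rating[books$group == low_label])

# Draw both curves on shared axes so their shapes can be compared.
plot(d_low,
     main = "Average rating by popularity group",
     xlab = "Average rating",
     xlim = range(d_low$x, d_high$x),
     ylim = c(0, max(d_low$y, d_high$y)))
lines(d_high, lty = 2)
legend("topleft", legend = c(low_label, high_label), lty = c(1, 2))
```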
Visualization 2. Ratings Distribution by Popularity (10 Most-Rated vs. 10 Least-Rated Books)
Explanation: This boxplot contrasts the spread of average ratings for the 10 books with the most ratings and the 10 books with the fewest. It highlights how rating behavior may differ at the extremes of popularity.
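Selecting the two extremes is mostly a matter of sorting by rating count and taking the first and last ten rows. The sketch below assumes a hypothetical `books` data frame with `avg_rating` and `num_ratings` columns and uses random placeholder values.

```r
set.seed(2)  # placeholder data only, not real Goodreads values
books <- data.frame(
  avg_rating  = round(runif(50, 3.4, 4.7), 2),
  num_ratings = round(10^runif(50, 4, 7))
)

# Sort from most-rated to least-rated.
ranked <- books[order(books$num_ratings, decreasing = TRUE), ]

# Keep only the 10 rows at each end of the ranking.
group_levels <- c("10 most-rated", "10 least-rated")
extremes <- rbind(
  data.frame(group = group_levels[1], avg_rating = head(ranked$avg_rating, 10)),
  data.frame(group = group_levels[2], avg_rating = tail(ranked$avg_rating, 10))
)
extremes$group <- factor(extremes$group, levels = group_levels)

# One box per group, comparing the spread of average ratings.
boxplot(avg_rating ~ group, data = extremes,
        xlab = "", ylab = "Average rating",
        main = "Average ratings at the extremes of popularity")
```

With only ten books per group, each box summarizes very few points, so differences between the two are best read as suggestive rather than conclusive.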
Visualization 3. Distribution of Rating Counts (Log Scale)
Explanation: This histogram shows the distribution of how many ratings books have received, using a log-scaled x-axis to accommodate the wide range. It helps reveal patterns among books with low, moderate, and extremely high numbers of ratings.
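One way to get a log-scaled histogram in base R is to bin the log10 of the counts and then relabel the axis in the original units. This is a sketch under the assumption of a `num_ratings` column holding positive counts; the values below are random placeholders.

```r
set.seed(3)  # placeholder data only, not real Goodreads values
books <- data.frame(num_ratings = round(10^runif(300, 4, 7)))

# Bin on the log10 scale so that 10K, 100K, 1M and 10M are evenly spaced.
log_counts <- log10(books$num_ratings)
h <- hist(log_counts, breaks = 20, xaxt = "n",
          main = "Distribution of rating counts",
          xlab = "Number of ratings (log scale)")

# Relabel the axis in the original units rather than in powers of ten.
axis(1, at = 4:7, labels = c("10K", "100K", "1M", "10M"))
```

Without the log transform, a handful of books with millions of ratings would stretch the axis and squeeze every other book into the first bin or two.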
Visualization 4. Distribution of Average Ratings
Explanation: This histogram provides an overview of how average ratings are distributed across all books. It helps identify common rating scores and shows whether most books are rated similarly or vary widely.
Visualization 5. Top 10 Books by Ratings
Explanation: This bar chart highlights the 10 books with the most total ratings. Using rating count as a proxy for readership, it identifies the most widely read and engaged-with books on the list.
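A chart like this can be sketched by sorting on rating count and plotting the first ten rows as horizontal bars. The titles and counts below are invented placeholders, and the column names `title` and `num_ratings` are assumptions for the example.

```r
set.seed(4)  # placeholder data only, not real Goodreads values
books <- data.frame(
  title       = paste("Book", 1:30),
  num_ratings = round(10^runif(30, 4, 7))
)

# Take the 10 books with the highest rating counts.
top10 <- head(books[order(books$num_ratings, decreasing = TRUE), ], 10)

# Horizontal bars keep the titles readable; rev() puts the largest bar on top.
op <- par(mar = c(4, 8, 2, 1))
barplot(rev(top10$num_ratings) / 1e6,
        names.arg = rev(top10$title),
        horiz = TRUE, las = 1,
        xlab = "Number of ratings (millions)",
        main = "Top 10 books by number of ratings")
par(op)
```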