This report is my attempt at creating some GoodReads-style analytics out of my personal reading log. I use a small Perl script to extract my reading log from a Markdown file, then process the CSV data here.
library(data.table)
library(ggplot2)
knitr::opts_chunk$set(tidy = FALSE)
reading.log = data.table(read.csv(pipe("perl extract-reading-log.pl reading-log.md")))
names(reading.log)
## [1] "Title" "Year" "Pages"
My reading log contains 169 books.
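For anyone following along without the Perl script, here is a minimal sketch of the same loading step. It assumes the script writes a CSV with a header row of `Title`, `Year`, `Pages`, which matches the column names above. The three rows are made-up placeholders, not entries from my log.

```r
library(data.table)

# Hypothetical stand-in for the output of extract-reading-log.pl:
# a header row followed by one line per book.
csv.text = "Title,Year,Pages
Book A,2012,320
Book B,2012,
Book C,2013,150"

# read.csv turns the blank Pages field into NA.
sample.log = data.table(read.csv(text = csv.text))
names(sample.log)   # column names
nrow(sample.log)    # number of books in the log
```

The book count reported above is simply the number of rows in the table.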
First, I aggregate the data by year. Books without a recorded year are dropped, and books without a page count are left out of the page totals but still counted as books read. Then I define a function to plot a given measure by year.
# Drop books with no year, then count titles and sum pages per year.
yearly.log = reading.log[!is.na(Year),
                         list(Count=length(Title),
                              Pages=sum(Pages, na.rm=TRUE)),
                         keyby=list(Year=Year)]

# Horizontal bar chart of one column of yearly.log, named by `measure`.
plot.year = function(measure) {
  ggplot(yearly.log) +
    aes(x=Year) + aes_string(y=measure) +
    geom_bar(stat="identity", fill="#3465A4") +
    geom_text(aes_string(y=measure, ymax=measure, label=measure,
                         hjust=1),
              color="#EEEEEC",
              position=position_dodge(width=1)) +
    coord_flip() +
    theme(panel.background=element_rect(fill="#EEEEEC"))
}
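To check how the yearly aggregation treats missing values, here is a small sketch that runs the same `data.table` expression on made-up rows. The table `toy.log` and its contents are hypothetical and exist only for this illustration.

```r
library(data.table)

# Made-up rows: Book B lacks a page count, Book D lacks a year.
toy.log = data.table(
  Title = c("Book A", "Book B", "Book C", "Book D"),
  Year  = c(2012, 2012, 2013, NA),
  Pages = c(320, NA, 150, 200)
)

# Same aggregation as for the real log, applied to the toy data.
toy.yearly = toy.log[!is.na(Year),
                     list(Count = length(Title),
                          Pages = sum(Pages, na.rm = TRUE)),
                     keyby = list(Year = Year)]
print(toy.yearly)
```

Book D disappears because it has no year. Book B still counts toward the 2012 total of two books, but adds nothing to the 320 pages for that year.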
plot.year("Count") +
  ggtitle("Books Read per Year")
plot.year("Pages") +
  ggtitle("Pages Read per Year")
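A note for newer ggplot2 releases: `aes_string()` has been deprecated since ggplot2 3.0.0. The sketch below is one possible variant of the plotting helper that uses the `.data` pronoun instead. The name `plot.year.tidy`, the extra `data` argument, and the toy data frame are my own assumptions for illustration, and I have not compared its output pixel for pixel with the plots above.

```r
library(ggplot2)

# Hypothetical variant of plot.year: takes the data explicitly and looks up
# the measure column by name through the .data pronoun.
plot.year.tidy = function(data, measure) {
  ggplot(data, aes(x = Year, y = .data[[measure]])) +
    geom_col(fill = "#3465A4") +
    geom_text(aes(label = .data[[measure]]), hjust = 1, color = "#EEEEEC") +
    coord_flip() +
    labs(y = measure) +   # keep the plain column name as the axis title
    theme(panel.background = element_rect(fill = "#EEEEEC"))
}

# Made-up yearly totals, only to show the call.
toy.yearly = data.frame(Year = c(2012, 2013), Count = c(2, 1))
p = plot.year.tidy(toy.yearly, "Count") + ggtitle("Books Read per Year")
print(p)
```

Passing the data as an argument keeps the helper independent of the global `yearly.log`, which makes it easier to try on a subset of years.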