Here I show how to track the page views of an article about an R package called phyloseq, published at the end of April in the journal PLoS ONE. Fittingly, I use a different R package, rplos, to do all the hard work of interfacing with the Public Library of Science's API and generating a ggplot2-based plot of the results. You can read more about the rplos package on the rplos GitHub front page.
We noticed that views of the phyloseq article rose sharply starting about three days after the article “went live”. It did not take long to realize that this coincided with Jonathan Eisen's blog post about the phyloseq article on April 25. In the following code snippets, I show how to plot and modify a graphic of the page views of an article in a Public Library of Science (PLoS) journal (PLoS publishes many journals covering varied scientific topics). I also annotate the date of J. Eisen's blog post and let you decide whether you think it helped drive readers to the article. (Hint: it obviously did.)
First, load the rplos and ggplot2 packages.
library("rplos")
library("ggplot2")
theme_set(theme_bw())
Here is just one example of how you can access details about an article using the PLoS API through the rplos package. In this case I already knew which article I was investigating, so I copied its DOI from the website. However, rplos also includes some search functionality.
phyloseqdoi = "10.1371/journal.pone.0061217"
phyloseqalm = alm(doi = phyloseqdoi, info = "detail")
## Using default key: Please get your own API key at http://api.plos.org/
Note that while you don't have to get your own PLoS API key, it is apparently recommended. If you're going to do a lot of these types of queries, then you probably should.
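If you do get your own key, one way to keep it out of your scripts is to store it as an R option. This is only a sketch: the option name PlosApiKey and the key argument to alm are assumptions on my part (check ?alm for your installed version of rplos), and "YOUR_KEY_HERE" is a placeholder, not a real key.

```r
# Assumption: rplos reads the key from an option named "PlosApiKey".
# Verify the option name against ?alm before relying on it.
options(PlosApiKey = "YOUR_KEY_HERE")  # placeholder value, not a real key

# Base R lookup of the stored option.
plos_key = getOption("PlosApiKey")

# Hypothetical usage, left commented out because the `key` argument is an assumption:
# phyloseqalm = alm(doi = phyloseqdoi, info = "detail", key = plos_key)
```

Putting the options() line in your .Rprofile would make the key available in every session without it appearing in shared code.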
Here is how to create the default plot using the almplot function in the rplos package, with a few aesthetic tweaks from ggplot2.
p = almplot(phyloseqalm, type = "history") + geom_line(size = 3)
p + ggtitle("Totals over time")
p + ylim(0, 25) + ggtitle("Social Shares over time")
And here's how to show the total article views over time, emphasizing before and after J. Eisen's blog post.
viewsdat = p$data[p$data[, ".id"] == "counter", ]
eisenblogdate = as.Date("2013-04-25")
viewsdat$bump = "post-blog"
viewsdat$bump[viewsdat$dates <= eisenblogdate] <- "pre-blog"
ggplot(viewsdat, aes(dates, totals, color = bump)) + geom_point(size = 7) +
    geom_path(aes(group = 1), size = 3, alpha = 0.5) + ylab("Total Views") +
    ggtitle("phyloseq's article views over time") + theme(plot.title = element_text(size = 28),
    axis.title.x = element_text(size = 22), axis.title.y = element_text(size = 22)) +
    annotate("text", label = "\"Eisen Bump\"", x = eisenblogdate, y = 125, size = 8,
        colour = "red", hjust = -0.15)
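If you would rather put a rough number on the bump than eyeball it, here is a minimal base R sketch of the same pre-blog/post-blog labelling, followed by the average views gained per observation in each period. The toy data frame and its numbers are made up purely for illustration, and the sketch assumes that totals is a cumulative count, as the column name suggests. With the real data you would substitute viewsdat for toy.

```r
# Hypothetical cumulative view counts; these numbers are invented for illustration.
toy = data.frame(
    dates = as.Date("2013-04-22") + 0:5,
    totals = c(10, 25, 40, 60, 150, 260)
)
blogdate = as.Date("2013-04-25")

# Same labelling rule as above: the blog day itself counts as "pre-blog".
toy$bump = ifelse(toy$dates <= blogdate, "pre-blog", "post-blog")

# Views gained since the previous observation (NA for the first row).
toy$gained = c(NA, diff(toy$totals))

# Mean gain per observation within each period, ignoring the leading NA.
gain_by_period = tapply(toy$gained, toy$bump, mean, na.rm = TRUE)
print(gain_by_period)
```

Comparing the two means gives a crude effect size. It says nothing about causation, of course, but it is a handy companion to the plot.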
What do you think?