The US National Library of Medicine created and employs the Medical Subject Headings (MeSH) vocabulary system to catalog and organize publications in MEDLINE. Its methodology and organization are well documented and explained on the NLM website. In general, it is a hierarchical, branching system.
MeSH Structure:
- Headings - At the topmost branch are 26,000 Headings that represent major descriptors in the biomedical literature.
- Subheadings - these qualify a heading with a specific aspect (e.g. ‘adverse effects’, ‘therapy’, ‘physiology’, ‘epidemiology’…)
- Supplementary Concept Records
- Publication Characteristics
For this small, exploratory analysis I first searched on the MeSH heading “Sepsis” alone, then repeated the search with “Sepsis” as the heading paired with each of the different subheadings.
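The repeated searches can be written down as query strings. The following is a minimal sketch, assuming PubMed's `"Heading/subheading"[Mesh]` field syntax; the two subheadings shown are illustrative only and are not the full set used in this analysis. Note that PubMed does not allow every subheading to be combined with every heading.

```r
# Illustrative subset of subheadings; the full list used here is not reproduced.
heading <- "Sepsis"
subheadings <- c("therapy", "epidemiology")

# Build one query for the bare heading and one per heading/subheading pair
queries <- c(
  sprintf('"%s"[Mesh]', heading),
  sprintf('"%s/%s"[Mesh]', heading, subheadings)
)
print(queries)
```

Each string can be pasted into the PubMed search box as is.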
After each search, I downloaded the publication counts by year, which are available on the right side of the results screen. Each download was saved as a CSV file in Microsoft Excel. Then I manually copied and pasted all of the columns together into one single file.
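The manual copy-and-paste step could also be done in R. This is only a sketch, not what was done here: `combine_counts` is a hypothetical helper, and it assumes each downloaded file has a `Year` and a `Count` column (use `skip` if an export carries extra leading lines). The numbers in the demonstration are placeholders, not real PubMed counts.

```r
# Hypothetical helper: merge several per-search CSV exports into one wide table.
# `files` is a named character vector: names are search terms, values are paths.
combine_counts <- function(files, skip = 0) {
  tables <- lapply(names(files), function(term) {
    one <- read.csv(files[[term]], skip = skip)
    names(one)[names(one) == "Count"] <- term  # one count column per search
    one[, c("Year", term)]
  })
  # Full outer join on Year, so years missing from one search become NA
  Reduce(function(a, b) merge(a, b, by = "Year", all = TRUE), tables)
}

# Toy demonstration with placeholder numbers (not real PubMed counts)
f1 <- file.path(tempdir(), "sepsis.csv")
f2 <- file.path(tempdir(), "sepsis_therapy.csv")
write.csv(data.frame(Year = c(2000, 2001), Count = c(10, 12)), f1, row.names = FALSE)
write.csv(data.frame(Year = c(2001, 2002), Count = c(3, 4)), f2, row.names = FALSE)

wide <- combine_counts(c(Sepsis = f1, Therapy = f2))
print(wide)
```

Joining on `Year` rather than pasting columns side by side guards against searches whose results start or end in different years.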
library(ggplot2)
library(dplyr)
library(reshape2)
The only major step was to reshape the data from wide to long format, using the handy R package reshape2.
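# Read the combined file: one Year column plus one count column per search
sepsis.pubs <- read.csv("sepsis Mesh publications.csv")
# Drop rows with an empty Year
sepsis.pubs <- filter(sepsis.pubs, Year != "")
# Reshape from wide to long: one row per Year and search term
sepsis.pubs <- melt(sepsis.pubs, id.vars = c("Year"))
colnames(sepsis.pubs) <- c("Year", "MeSH.Term", "Count")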
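To make the wide-to-long reshape concrete, here is a minimal sketch on a toy table. The column names mirror the ones used above, but the numbers are placeholders, not real PubMed counts.

```r
library(reshape2)

# Toy wide table: one row per year, one column per search term
wide <- data.frame(
  Year    = c(2000, 2001),
  Sepsis  = c(10, 12),
  Therapy = c(3, 4)
)

# melt() keeps Year as the identifier and stacks the remaining columns
long <- melt(wide, id.vars = "Year")
colnames(long) <- c("Year", "MeSH.Term", "Count")
print(long)
```

The long table has one row per year and term, which is the shape ggplot2 expects when mapping a column to color.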
ggplot(sepsis.pubs, aes(x = Year, y = Count, color = MeSH.Term)) +
  geom_line() +
  theme_bw() +
  ggtitle("Publication Count in MEDLINE by Sepsis MeSH Term + Secondary Terms") +
  labs(x = "Year", y = "Publication Count")