#R: Simple #bibliometric with Microsoft Academic

Following my previous post on simple bibliometrics with Google Scholar (GS), this time I try the same steps with Microsoft Academic (MSA). The advantage of MSA is that it sorts scientific entries into categories (fields), which GS does not offer. In this post I tabulated the number of entries in each category for several keywords and compared the results. I used the following keywords:

  1. West Java
  2. Bandung
  3. Citarum
  4. Cikapundung
  5. Groundwater Bandung
  6. Groundwater Citarum
  7. Groundwater Cikapundung
  8. Health Bandung

The following list contains the categories that MSA builds automatically, with the short codes I use in the data in parentheses:

  1. Agriculture Science (agsci)
  2. Arts & Humanities (arthum)
  3. Biology (bio)
  4. Chemistry (chem)
  5. Computer Science (comsci)
  6. Economics & Business (ecobus)
  7. Engineering (eng)
  8. Environmental Sciences (envsci)
  9. Geosciences (geosci)
  10. Mathematics (math)
  11. Material Science (matsci)
  12. Medicine (med)
  13. Multidisciplinary (muldis)
  14. Physics (phy)
  15. Social Science (socsci)
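
The short codes appear in the fields column of the data and on the plot axes, so a lookup from code to full name can be handy when reading the figures. This is only a sketch: the vector name field_names is my own choice for illustration, and the pairs are copied from the list above.

```r
# Short code -> full category name, copied from the category list
field_names = c(
  agsci  = "Agriculture Science",
  arthum = "Arts & Humanities",
  bio    = "Biology",
  chem   = "Chemistry",
  comsci = "Computer Science",
  ecobus = "Economics & Business",
  eng    = "Engineering",
  envsci = "Environmental Sciences",
  geosci = "Geosciences",
  math   = "Mathematics",
  matsci = "Material Science",
  med    = "Medicine",
  muldis = "Multidisciplinary",
  phy    = "Physics",
  socsci = "Social Science"
)

# Look up one or several codes by name
field_names[["geosci"]]
unname(field_names[c("med", "eng")])
```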

I worked through this with the following code.

# load library
library("lattice")
library("gridExtra")
## Loading required package: grid

I used LibreOffice to prepare the data. Each keyword has 15 observations, one per category (see the output of head(bib)).

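# load data (stringsAsFactors = TRUE keeps fields as a factor, which was the default in R 3.1.0)
bib = read.csv("20140523b-summary references.csv", header = TRUE, stringsAsFactors = TRUE)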
head(bib)
##   no               fields2 fields     key dbase sum
## 1  1  Agriculture Science   agsci Bandung msacd  16
## 2  2    Arts & Humanities  arthum Bandung msacd  44
## 3  3              Biology     bio Bandung msacd 129
## 4  4            Chemistry    chem Bandung msacd 153
## 5  5     Computer Science  comsci Bandung msacd 406
## 6  6 Economics & Business  ecobus Bandung msacd  44
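
Before going further, it may be worth checking that every keyword really has one row per category. The sketch below does not use my CSV file. It builds a small made-up data frame, bib_demo, with the same column names as bib but only two keywords and three fields, and its sum values are placeholders rather than real MSA counts. With the real data you would run the same two calls on bib.

```r
# Toy stand-in for `bib`: same column names, placeholder counts (not real data)
bib_demo = data.frame(
  fields = rep(c("agsci", "bio", "chem"), times = 2),
  key    = rep(c("Bandung", "Citarum"), each = 3),
  sum    = c(10, 20, 30, 1, 2, 3),
  stringsAsFactors = TRUE
)

# Rows per keyword: every keyword should have the same number of fields
rows_per_key = table(bib_demo$key)
print(rows_per_key)

# Wide view: one row per field, one column per keyword
wide = xtabs(sum ~ fields + key, data = bib_demo)
print(wide)
```

The wide table from xtabs() is a quick way to compare categories across keywords before any plotting.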

I then made one subset of the data for each keyword.

# subsetting data
bib.wj = subset(bib, key == "West Java")
bib.bdg = subset(bib, key == "Bandung")
bib.ctr = subset(bib, key == "Citarum")
bib.ckp = subset(bib, key == "Cikapundung")
bib.gwbdg = subset(bib, key == "Groundwater Bandung")
bib.gwctr = subset(bib, key == "Groundwater Citarum")
bib.gwckp = subset(bib, key == "Groundwater Cikapundung")
bib.healthbdg = subset(bib, key == "Health Bandung")
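
An alternative to writing one subset() call per keyword is base R's split(), which returns a named list with one data frame per keyword. This is a sketch on a made-up data frame, bib_demo (placeholder counts, two keywords only), not the code I actually ran.

```r
# Toy stand-in for `bib` (placeholder counts, not real data)
bib_demo = data.frame(
  fields = rep(c("agsci", "bio", "chem"), times = 2),
  key    = rep(c("Bandung", "Citarum"), each = 3),
  sum    = c(10, 20, 30, 1, 2, 3)
)

# One data frame per keyword, stored in a named list
by_key = split(bib_demo, bib_demo$key)

names(by_key)        # the keywords found in the data
by_key[["Citarum"]]  # same rows as subset(bib_demo, key == "Citarum")
```

The list form scales better when keywords are added later, since nothing has to be renamed or retyped.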

I used the lattice and gridExtra packages for plotting. You may use other packages, but you will have to change the code accordingly.

# plotting
plot1 = xyplot(bib.wj$fields ~ bib.wj$sum, pch = 21, fill = "red", xlim = c(0, 
    8000), main = "key: West Java")
plot2 = xyplot(bib.bdg$fields ~ bib.bdg$sum, pch = 21, fill = "red", xlim = c(0, 
    8000), main = "key: Bandung")
plot3 = xyplot(bib.ctr$fields ~ bib.ctr$sum, pch = 21, fill = "red", xlim = c(0, 
    8000), main = "key: Citarum")
grid.arrange(plot1, plot2, plot3, ncol = 3)

[Figure: number of entries per field for the keywords West Java, Bandung, and Citarum]

plot4 = xyplot(bib.gwbdg$fields ~ bib.gwbdg$sum, pch = 21, fill = "red", xlim = c(0, 
    50), main = "key: Groundwater Bandung")
plot5 = xyplot(bib.gwctr$fields ~ bib.gwctr$sum, pch = 21, fill = "red", xlim = c(0, 
    50), main = "key: Groundwater Citarum")
plot6 = xyplot(bib.healthbdg$fields ~ bib.healthbdg$sum, pch = 21, fill = "red", 
    xlim = c(0, 50), main = "key: Health Bandung")
grid.arrange(plot4, plot5, plot6, ncol = 3)

[Figure: number of entries per field for the keywords Groundwater Bandung, Groundwater Citarum, and Health Bandung]

plot7 = xyplot(bib.ckp$fields ~ bib.ckp$sum, pch = 21, fill = "red", xlim = c(0, 
    10), main = "key: Cikapundung")
plot8 = xyplot(bib.gwckp$fields ~ bib.gwckp$sum, pch = 21, fill = "red", xlim = c(0, 
    10), main = "key: Groundwater Cikapundung")
grid.arrange(plot7, plot8, ncol = 3)

[Figure: number of entries per field for the keywords Cikapundung and Groundwater Cikapundung]
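
The plots in this post were built one keyword at a time and arranged with grid.arrange(). As a possible shortcut, lattice can also draw one panel per keyword from a single formula with a conditioning variable. The sketch below again uses a made-up data frame, bib_demo, with placeholder counts. The free x-axis scale is my assumption about what would suit keywords with very different totals, in place of the hand-set xlim values.

```r
library("lattice")

# Toy stand-in for `bib` (placeholder counts, not real data)
bib_demo = data.frame(
  fields = factor(rep(c("agsci", "bio", "chem"), times = 2)),
  key    = factor(rep(c("Bandung", "Citarum"), each = 3)),
  sum    = c(10, 20, 30, 1, 2, 3)
)

# One panel per keyword; each panel gets its own x-axis range
p = xyplot(fields ~ sum | key, data = bib_demo,
           pch = 21, fill = "red",
           scales = list(x = list(relation = "free")),
           layout = c(2, 1),
           xlab = "number of entries", ylab = "field")
print(p)
```

With this form there is no need for the separate subsets or for gridExtra, at the cost of less control over each panel's axis limits.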

Note: OS: Ubuntu 13.10; RStudio version: 0.98.507; R base version: 3.1.0 (2014-04-10)