#install.packages("dplyr")
#install.packages("Matrix")
#install.packages("stringr")
#install.packages("igraph")
#install.packages("bibliometrix", dependencies=TRUE)
library("dplyr")
library("Matrix")
library("stringr")
library("igraph")
library("FactoMineR")
library("factoextra")
library("ggplot2")
library("bibliometrix") ### load bibliometrix package
D <- readFiles("ccspcautocorrelated.bib")
#D
# base ISI WoK
# M <- convert2df(D, dbsource = "isi", format = "bibtex")
# base SCOPUS
M <- convert2df(D, dbsource = "scopus", format = "bibtex")
Articles extracted 100
Articles extracted 124
#M
results <- biblioAnalysis(M, sep = ";")
S <- summary(object = results, k = 10, pause = FALSE)
Main Information about data
Articles 124
Sources (Journals, Books, etc.) 60
Keywords Plus (ID) 589
Author's Keywords (DE) 278
Period 1992 - 2017
Average citations per article 19.56
Authors 226
Author Appearances 303
Authors of single authored articles 9
Authors of multi authored articles 217
Articles per Author 0.549
Authors per Article 1.82
Co-Authors per Articles 2.44
Collaboration Index 2.07
Annual Scientific Production
Year Articles
1992 1
1994 2
1995 3
1996 5
1997 9
1998 2
1999 2
2000 5
2001 2
2002 8
2003 8
2004 3
2005 3
2006 3
2007 6
2008 7
2009 6
2010 7
2011 7
2012 4
2013 8
2014 6
2015 7
2016 9
2017 1
Annual Percentage Growth Rate 0
Most Productive Authors
Authors Articles Authors Articles Fractionalized
1 TSUNG,F 6 ZHANG,NF 3.00
2 ADAMS,BM 4 TSUNG,F 2.50
3 APLEY,DW 4 RUNGER,GC 2.33
4 RUNGER,GC 4 WOODALL,WH 2.33
5 TESTIK,MC 4 APLEY,DW 2.00
6 WOODALL,WH 4 SUN,J 2.00
7 AMIRI,A 3 PERRY,MB 1.83
8 ANG,BW 3 WEI,CH 1.75
9 ATIENZA,OO 3 ADAMS,BM 1.50
10 CONERLY,MD 3 LU,C-W 1.50
Top manuscripts per citations
Paper TC TCperYear
1 WARDELL DG ;MOSKOWITZ H ;PLANTE RD,(1994),TECHNOMETRICS 220 9.57
2 LU C-W ;REYNOLDSJR MR,(1999),J QUAL TECHNOL 172 9.56
3 LU C-W ;REYNOLDSJR MR,(1999),J QUAL TECHNOL 130 7.22
4 ZHANG NF,(1998),TECHNOMETRICS 117 6.16
5 JIANG W ;WOODALL WH ;TSUI K-L,(2000),TECHNOMETRICS 94 5.53
6 LU C-W ;REYNOLDSJR MR,(2001),J QUAL TECHNOL 76 4.75
7 MARAGAH HD,(1992),J. STAT. COMPUT. SIMUL. 74 2.96
8 RUNGER GC ;WILLEMAIN TR ;PRABHU S,(1995),COMMUN STAT THEORY METHODS 63 2.86
9 MONTGOMERY DOUGLASC ;WOODALL WILLIAMH,(1997),J QUAL TECHNOL 62 3.10
10 NOOROSSANA R ;AMIRI A ;SOLEIMANI P,(2008),COMMUN STAT THEORY METHODS 53 5.89
Most Productive Countries
Country Articles Freq
1 USA 51 0.4113
2 CHINA 9 0.0726
3 TAIWAN 9 0.0726
4 TURKEY 6 0.0484
5 BRAZIL 5 0.0403
6 INDIA 5 0.0403
7 IRAN 5 0.0403
8 GERMANY 4 0.0323
9 HONG KONG 4 0.0323
10 UNITED KINGDOM 4 0.0323
Total Citations per Country
Country Total Citations Average Article Citations
1 USA 1331 26.10
2 TAIWAN 461 51.22
3 HONG KONG 150 37.50
4 IRAN 78 15.60
5 UNITED KINGDOM 78 19.50
6 GERMANY 64 16.00
7 INDIA 47 9.40
8 TUNISIA 34 11.33
9 CHINA 32 3.56
10 SINGAPORE 30 30.00
Most Relevant Sources
Sources Articles
1 JOURNAL OF QUALITY TECHNOLOGY 19
2 QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL 11
3 JOURNAL OF APPLIED STATISTICS 7
4 IIE TRANSACTIONS (INSTITUTE OF INDUSTRIAL ENGINEERS) 6
5 COMMUNICATIONS IN STATISTICS: SIMULATION AND COMPUTATION 5
6 JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION 5
7 INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH 4
8 QUALITY ENGINEERING 4
9 TECHNOMETRICS 4
10 QINGHUA DAXUE XUEBAO/JOURNAL OF TSINGHUA UNIVERSITY 3
Most Relevant Keywords
Author Keywords (DE) Articles Keywords-Plus (ID) Articles
1 STATISTICAL PROCESS CONTROL 69 CONTROL CHARTS 28
2 AUTOCORRELATION 35 FLOWCHARTING 27
3 CONTROL CHARTS 21 QUALITY CONTROL 22
4 AVERAGE RUN LENGTH 18 STATISTICAL PROCESS CONTROL 22
5 SPC 13 CORRELATION METHODS 19
6 AUTOCORRELATED DATA 9 MATHEMATICAL MODELS 19
7 AUTOCORRELATED PROCESSES 9 AUTOCORRELATION 17
8 TIME SERIES ANALYSIS 8 COMPUTER SIMULATION 15
9 AUTOCORRELATED PROCESS 6 PARAMETER ESTIMATION 11
10 CONTROL CHART 6 REGRESSION ANALYSIS 11
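The ratio indicators in the summary above can be checked by hand from the reported counts. The sketch below recomputes three of them. It also assumes that the growth rate is a compound annual rate between the first and the last year of the table; this is an assumption about how the summary derives it, not something the output states. Under that assumption the value 0 is expected, because the collection has one article in 1992 and one in 2017. All variable names are illustrative.

```r
# Counts copied from the "Main Information about data" output above
articles    <- 124
authors     <- 226
appearances <- 303

articles_per_author    <- round(articles / authors, 3)      # 0.549
authors_per_article    <- round(authors / articles, 2)      # 1.82
coauthors_per_article  <- round(appearances / articles, 2)  # 2.44

# Assumed definition: compound annual growth between first and last year
first_year_articles <- 1   # 1992
last_year_articles  <- 1   # 2017
n_years <- 2017 - 1992
growth_rate <- ((last_year_articles / first_year_articles)^(1 / n_years) - 1) * 100  # 0

print(c(articles_per_author, authors_per_article, coauthors_per_article, growth_rate))
```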
plot(x = results, k = 10, pause = FALSE)
#M$CR[1]
CR <- citations(M, field = "article", sep = ";")
CR$Cited[1:10]
CR
MONTGOMERY, DC, MASTRANGELO, CM, SOME STATISTICAL PROCESS CONTROL METHODS FOR AUTOCORRELATED DATA (1991) JOURNAL OF QUALITY TECHNOLOGY, 23, PP 179-193
15
HARRIS, TJ, ROSS, WH, STATISTICAL PROCESS CONTROL PROCEDURES FOR CORRELATED OBSERVATIONS (1991) CANADIAN JOURNAL OF CHEMICAL ENGINEERING, 69, PP 48-57
14
LUCAS, JM, SACCUCCI, MS, EXPONENTIALLY WEIGHTED MOVING AVERAGE CONTROL SCHEMES: PROPERTIES AND ENHANCEMENTS (1990) TECHNOMETRICS, 32, PP 1-12
13
WARDELL, DG, MOSKOWITZ, H, PLANTE, RD, CONTROL CHARTS IN THE PRESENCE OF DATA CORRELATION (1992) MANAGEMENT SCIENCE, 38, PP 1084-1105
12
ZHANG, NF, A STATISTICAL CONTROL CHART FOR STATIONARY PROCESS DATA (1998) TECHNOMETRICS, 40, PP 24-38
12
JOHNSON, RA, BAGSHAW, M, THE EFFECT OF SERIAL CORRELATION ON THE PERFORMANCE OF CUSUM TESTS (1974) TECHNOMETRICS, 16, PP 103-112
10
ALWAN, LC, ROBERTS, HV, TIME-SERIES MODELING FOR STATISTICAL PROCESS CONTROL (1988) JOURNAL OF BUSINESS AND ECONOMIC STATISTICS, 6, PP 87-95
9
ALWAN, LC, ROBERTS, HV, TIME-SERIES MODELING FOR STATISTICAL PROCESS CONTROL (1988) JOURNAL OF BUSINESS & ECONOMIC STATISTICS, 6, PP 87-95
8
BAGSHAW, M, JOHNSON, RA, THE EFFECT OF SERIAL CORRELATION ON THE PERFORMANCE OF CUSUM TESTS II (1975) TECHNOMETRICS, 17, PP 73-80
8
HARRIS, TJ, ROSS, WH, STATISTICAL PROCESS CONTROL PROCEDURES FOR CORRELATED OBSERVATIONS (1991) THE CANADIAN JOURNAL OF CHEMICAL ENGINEERING, 69, PP 48-57
8
The function lotka estimates the coefficients of Lotka’s law for scientific productivity (Lotka A.J., 1926). Lotka’s law describes the frequency of publication by authors in a given field as an inverse square law: the number of authors publishing a certain number of articles is a fixed ratio to the number of authors publishing a single article. Under this assumption, the theoretical Beta coefficient of Lotka’s law is equal to 2. Using the lotka function, it is possible to estimate the Beta coefficient of our bibliographic collection and to assess, through a statistical test, how similar the empirical distribution is to the theoretical one.
L <- lotka(results)
# Author Productivity. Empirical Distribution
L$AuthorProd
# Beta coefficient estimate
L$Beta
[1] 2.792959
# Constant
L$C
[1] 0.9175485
# Goodness of fit
L$R2
[1] 0.9802325
# P-value of K-S two sample test
L$p.value
[1] 0.8186212
# Observed distribution
Observed=L$AuthorProd[,3]
# Theoretical distribution with Beta = 2
Theoretical=10^(log10(L$C)-2*log10(L$AuthorProd[,1]))
plot(L$AuthorProd[,1],Theoretical,type="l",col="red",ylim=c(0, 1), xlab="Articles",ylab="Freq. of Authors",main="Scientific Productivity")
lines(L$AuthorProd[,1],Observed,col="blue")
legend(x="topright",c("Theoretical (B=2)","Observed"),col=c("red","blue"),lty = c(1,1),cex=0.6,bty="n")
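The Beta and C values reported above come from fitting the relation freq = C / n^Beta. A minimal sketch of such a fit follows, using a log-log linear regression on a small hypothetical productivity table. The data are invented for illustration, and the regression is one common way to estimate the coefficients, not necessarily what lotka does internally.

```r
# Hypothetical productivity table (NOT taken from this collection):
# n_authors[i] authors wrote exactly n_articles[i] articles
n_articles <- c(1, 2, 3, 4, 5)
n_authors  <- c(100, 25, 11, 6, 4)
freq <- n_authors / sum(n_authors)

# Lotka's law: freq = C / n^beta  =>  log10(freq) = log10(C) - beta * log10(n)
fit  <- lm(log10(freq) ~ log10(n_articles))
beta <- -unname(coef(fit)[2])      # slope with the sign flipped
C    <- 10^unname(coef(fit)[1])    # back-transformed intercept
r2   <- summary(fit)$r.squared     # goodness of fit

# Theoretical curve with beta fixed at 2, same form as in the code above
theoretical <- 10^(log10(C) - 2 * log10(n_articles))

print(c(beta = beta, C = C, R2 = r2))
```

Because the toy counts were chosen to follow roughly 100 / n^2, the estimated beta comes out close to 2.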
A manuscript’s attributes are connected to each other through the manuscript itself: author(s) to journal, keywords to publication date, etc. These connections between different attributes generate bipartite networks that can be represented as rectangular matrices (Manuscripts x Attributes). Furthermore, scientific publications regularly contain references to other scientific works. This generates a further network, namely a co-citation or coupling network. These networks are analysed in order to capture meaningful properties of the underlying research system, and in particular to determine the influence of bibliometric units such as scholars and journals.
cocMatrix is a general function that computes a bipartite network from one of the metadata attributes. For a complete list of field tags see https://images.webofknowledge.com/WOKRS410B4/help/WOS/h_fieldtags.html
For example, to create a Manuscript x Publication Source network you have to use the field tag “SO”:
A <- cocMatrix(M, Field = "SO", sep = ";")
A is a rectangular binary matrix representing a bipartite network in which rows are manuscripts and columns are sources. The generic element \(a_{ij}\) is 1 if manuscript \(i\) has been published in source \(j\), and 0 otherwise. The \(j\)-th column sum is the number of manuscripts published in source \(j\). Sorting the column sums of A in decreasing order shows the most relevant publication sources:
sort(Matrix::colSums(A), decreasing = TRUE)[1:15]
JOURNAL OF QUALITY TECHNOLOGY
19
QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL
11
JOURNAL OF APPLIED STATISTICS
7
IIE TRANSACTIONS (INSTITUTE OF INDUSTRIAL ENGINEERS)
6
JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION
5
COMMUNICATIONS IN STATISTICS: SIMULATION AND COMPUTATION
5
INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH
4
QUALITY ENGINEERING
4
TECHNOMETRICS
4
QINGHUA DAXUE XUEBAO/JOURNAL OF TSINGHUA UNIVERSITY
3
COMPUTERS AND INDUSTRIAL ENGINEERING
2
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH
2
INTERNATIONAL JOURNAL OF INDUSTRIAL AND SYSTEMS ENGINEERING
2
INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING, INFORMATION AND CONTROL
2
CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS
2
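To make the structure of such a matrix concrete, the following sketch builds a small Manuscripts x Attributes binary matrix in base R from a separator-delimited field. The helper toy_cocmatrix and the sample author strings are hypothetical; the sketch only mimics the idea behind cocMatrix and is not its implementation.

```r
# Hypothetical helper: one row per manuscript, one column per distinct item
toy_cocmatrix <- function(field, sep = ";") {
  # Split each manuscript's field on the separator and trim whitespace
  items  <- lapply(strsplit(field, sep, fixed = TRUE), trimws)
  labels <- sort(unique(unlist(items)))
  A <- matrix(0L, nrow = length(field), ncol = length(labels),
              dimnames = list(names(field), labels))
  # Set a_ij = 1 when manuscript i carries item j
  for (i in seq_along(items)) A[i, items[[i]]] <- 1L
  A
}

# Invented author field for three manuscripts
au <- c(m1 = "SMITH J;LEE K", m2 = "LEE K", m3 = "SMITH J; DOE A")
A_toy <- toy_cocmatrix(au, sep = ";")
print(A_toy)

# Column sums give the number of manuscripts per author
print(sort(colSums(A_toy), decreasing = TRUE))
```

The sep argument plays the same role as in the calls above: if it does not match the delimiter actually used in the field, the columns will not correspond to meaningful items.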
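Following this approach, you can compute several bipartite networks:
Citation network
# NOTE: the column labels printed below are truncated single words, which suggests that sep = ". " does not match the reference separator of this Scopus export (citations() above parsed the same CR field with sep = ";")
A <- cocMatrix(M, Field = "CR", sep = ". ")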
sort(Matrix::colSums(A), decreasing = TRUE)[1:15]
TECHNOMETRICS AUTOCORRELATE ENGINEERING OBSERVATION AUTOCORRELATIO
97 87 63 62 56
MULTIVARIAT COMMUNICATION INTRODUCTIO TRANSACTIONS EXPONENTIALL
55 55 55 51 51
MASTRANGELO DISTRIBUTION INTERNATIONAL INTERNATIONA MANUFACTURIN
48 47 46 41 39
Author network
A <- cocMatrix(M, Field = "AU", sep = ";")
sort(Matrix::colSums(A), decreasing = TRUE)[1:15]
APLEY DW TSUNG F DYER JN
4 4 3
RUNGER GC ATIENZA OO TANG LC
3 3 3
ANG BW LU C-W REYNOLDSJR MR
3 3 3
ZHANG NF TRIANTAFYLLOPOULOS K TSUNG F
3 2 2
TESTIK MC CAPIZZI G MASAROTTO G
2 2 2
Country network
Authors’ countries are not a standard attribute of the bibliographic data frame. You need to extract this information from the affiliation attribute using the function metaTagExtraction.
M <- metaTagExtraction(M, Field = "AU_CO", sep = ";")
A <- cocMatrix(M, Field = "AU_CO", sep = ";")
sort(Matrix::colSums(A), decreasing = TRUE)[1:15]
UNITED STATES CHINA TAIWAN HONG KONG TURKEY
57 9 9 7 7
GERMANY KOREA INDIA IRAN BRAZIL
6 5 5 5 5
UNITED KINGDOM TUNISIA SINGAPORE ITALY PORTUGAL
4 4 4 3 3
metaTagExtraction allows you to extract the following additional field tags: Authors’ countries (Field = “AU_CO”); First author of each cited reference (Field = “CR_AU”); Publication source of each cited reference (Field = “CR_SO”); and Authors’ affiliations (Field = “AU_UN”).
Author keyword network
A <- cocMatrix(M, Field = "DE", sep = ";")
sort(Matrix::colSums(A), decreasing = TRUE)[1:15]
STATISTICAL PROCESS CONTROL AUTOCORRELATION
69 35
CONTROL CHARTS AVERAGE RUN LENGTH
21 18
SPC AUTOCORRELATED DATA
13 9
AUTOCORRELATED PROCESSES TIME SERIES ANALYSIS
9 8
QUALITY CONTROL CONTROL CHART
6 6
AUTOCORRELATED PROCESS MULTIVARIATE STATISTICAL PROCESS CONTROL
6 6
STATISTICAL PROCESS CONTROL (SPC) EWMA
5 4
CUSUM
4
Keyword Plus network
A <- cocMatrix(M, Field = "ID", sep = ";")
sort(Matrix::colSums(A), decreasing = TRUE)[1:15]
CONTROL CHARTS FLOWCHARTING STATISTICAL PROCESS CONTROL
28 26 22
QUALITY CONTROL MATHEMATICAL MODELS CORRELATION METHODS
21 19 19
AUTOCORRELATION COMPUTER SIMULATION PARAMETER ESTIMATION
15 15 11
AUTOCORRELATED PROCESS REGRESSION ANALYSIS MONITORING
10 10 8
PROCESS CONTROL MONTE CARLO METHODS AVERAGE RUN LENGTHS
7 7 7
Bibliographic coupling
Two articles are said to be bibliographically coupled if at least one cited source appears in the bibliographies or reference lists of both articles (Kessler, 1963). A coupling network can be obtained using the general formulation \(B = A \times A^T\), where A is a bipartite network.
The function biblioNetwork calculates, starting from a bibliographic data frame, the most frequently used coupling networks: Authors, Sources, and Countries.
biblioNetwork uses two arguments, analysis and network, to define the network to compute:
NetMatrix <- biblioNetwork(M, analysis = "coupling", network = "references", sep = ". ")
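The coupling formulation can be verified on a toy example. The sketch below uses an invented Manuscripts x Cited references matrix (names m1..m3 and r1..r4 are hypothetical) and computes the product of A with its transpose in base R, so that each off-diagonal entry counts the references two manuscripts share.

```r
# Invented bipartite matrix: rows = manuscripts, columns = cited references
A <- matrix(c(1, 1, 0, 0,
              1, 1, 1, 0,
              0, 0, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("m1", "m2", "m3"), c("r1", "r2", "r3", "r4")))

# Coupling matrix B = A %*% t(A)
B <- A %*% t(A)
print(B)
# m1 and m2 share r1 and r2 -> B["m1", "m2"] is 2
# m2 and m3 share r3        -> B["m2", "m3"] is 1
# m1 and m3 share nothing   -> B["m1", "m3"] is 0
# The diagonal holds each manuscript's own number of references
```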
Articles with only a few references would therefore tend to be more weakly bibliographically coupled if coupling strength is measured simply by the number of references that articles have in common. This suggests that it might be more practicable to switch to a relative measure of bibliographic coupling. The couplingSimilarity function calculates the Jaccard or Salton similarity coefficient among the vertices of a coupling network.
NetMatrix <- biblioNetwork(M, analysis = "coupling", network = "authors", sep = ";")
# plot authors' coupling (first 20 authors)
net=networkPlot(NetMatrix, n = 20, Title = "Authors' Coupling", type = "fruchterman", size=FALSE, remove.multiple=TRUE, vos.path="c:/Users/DE/Desktop/bibliometRics/VOSviewer")
# calculate jaccard similarity coefficient
S <- couplingSimilarity(NetMatrix, type="jaccard")
# plot authors' similarity (first 20 authors)
net=networkPlot(S, n = 20, Title = "Authors' Coupling", remove.multiple=TRUE, vos.path="c:/Users/DE/Desktop/bibliometRics/VOSviewer")
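As an illustration of a relative coupling measure, the sketch below normalizes a toy coupling matrix with the usual textbook definitions: Jaccard as shared / (own_i + own_j - shared) and Salton as shared / sqrt(own_i * own_j). The matrix is invented, and these formulas are assumed to correspond to what couplingSimilarity computes; that has not been checked against the package source.

```r
# Invented Manuscripts x Cited references matrix
A <- matrix(c(1, 1, 0, 0,
              1, 1, 1, 0,
              0, 0, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("m1", "m2", "m3"), c("r1", "r2", "r3", "r4")))
B <- A %*% t(A)   # raw coupling counts
d <- diag(B)      # number of references per manuscript

# Jaccard: shared references over the size of the union of both lists
jaccard <- B / (outer(d, d, "+") - B)

# Salton (cosine): shared references over the geometric mean of list sizes
salton <- B / sqrt(outer(d, d))

print(round(jaccard, 3))
print(round(salton, 3))
```

In this example m1 and m2 share 2 references out of a union of 3, so their Jaccard coefficient is 2/3 even though m2 cites more references than m1.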
Bibliographic co-citation
We talk about co-citation of two articles when both are cited in a third article. Thus, co-citation can be seen as the counterpart of bibliographic coupling. A co-citation network can be obtained using the general formulation \(C = A^T \times A\), where A is a bipartite network.
NetMatrix <- biblioNetwork(M, analysis = "co-citation", network = "references", sep = ". ")
net=networkPlot(NetMatrix, n = 20, Title = "Bibliographic cocitation", type = "fruchterman", size=FALSE, remove.multiple=TRUE, vos.path="c:/Users/DE/Desktop/bibliometRics/VOSviewer")
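The co-citation product can be checked the same way as coupling, with the transpose on the other side. The sketch below uses an invented Manuscripts x Cited references matrix and base R's crossprod, which computes t(A) %*% A; the result is a References x References matrix counting how many manuscripts cite each pair together.

```r
# Invented bipartite matrix: rows = citing manuscripts, columns = cited references
A <- matrix(c(1, 1, 0, 0,
              1, 1, 1, 0,
              0, 0, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("m1", "m2", "m3"), c("r1", "r2", "r3", "r4")))

# Co-citation matrix: crossprod(A) is t(A) %*% A
C <- crossprod(A)
print(C)
# r1 and r2 are cited together by m1 and m2 -> C["r1", "r2"] is 2
# r3 and r4 are cited together by m3 only   -> C["r3", "r4"] is 1
# r1 and r4 never appear together           -> C["r1", "r4"] is 0
# The diagonal holds the number of times each reference is cited
```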
Bibliographic collaboration
A scientific collaboration network is a network where nodes are authors and links are co-authorships, as the latter is one of the most well-documented forms of scientific collaboration (Glanzel, 2004). An author collaboration network can be obtained using the general formulation \(AC = A^T \times A\), where A is a bipartite network Manuscripts x Authors.
NetMatrix <- biblioNetwork(M, analysis = "collaboration", network = "authors", sep = ";")
net=networkPlot(NetMatrix, n = 20, Title = "Bibliographic collaboration", type = "fruchterman", size=FALSE, remove.multiple=TRUE, vos.path="c:/Users/DE/Desktop/bibliometRics/VOSviewer")
Visualizing bibliographic networks
Country Scientific Collaboration
# Create a country collaboration network
M <- metaTagExtraction(M, Field = "AU_CO", sep = ";")
NetMatrix <- biblioNetwork(M, analysis = "collaboration", network = "countries", sep = ";")
# Plot the network
net=networkPlot(NetMatrix, n = 20, Title = "Country Collaboration", type = "circle", size=TRUE,remove.multiple=FALSE)
net=networkPlot(NetMatrix, n = 20, Title = "Country Collaboration", type = "fruchterman", size=FALSE, remove.multiple=TRUE, vos.path="c:/Users/DE/Desktop/bibliometRics/VOSviewer")
CoCitation Network
# Create a co-citation network
NetMatrix <- biblioNetwork(M, analysis = "co-citation", network = "references", sep = ". ")
# Plot the network
net=networkPlot(NetMatrix, n = 5, Title = "Co-Citation Network", type = "fruchterman", size=TRUE, remove.multiple=FALSE)
Keyword cooccurrences
# Create keyword co-occurrences network
NetMatrix <- biblioNetwork(M, analysis = "co-occurrences", network = "keywords", sep = ";")
# Plot the network
net=networkPlot(NetMatrix, n = 20, Title = "Keyword Co-occurrences", type = "kamada", size=TRUE)
Co-Word Analysis: Conceptual structure of a field
The aim of co-word analysis is to map the conceptual structure of a framework using the word co-occurrences in a bibliographic collection. The analysis can be performed through dimensionality reduction techniques such as Multidimensional Scaling (MDS) or Multiple Correspondence Analysis (MCA). Here, we show an example using the function conceptualStructure, which performs an MCA to draw a conceptual structure of the field and K-means clustering to identify clusters of documents that express common concepts. Results are plotted on a two-dimensional map. conceptualStructure includes natural language processing (NLP) routines (see the function termExtraction) to extract terms from titles and abstracts. In addition, it implements Porter’s stemming algorithm to reduce inflected (or sometimes derived) words to their word stem, base or root form.
# Conceptual Structure using keywords
CS <- conceptualStructure(M,field="ID", minDegree=4, k.max=5, stemming=FALSE)
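The general idea of reducing co-occurrence data to a two-dimensional map and then clustering it can be sketched with base R alone. The example below swaps in classical MDS (cmdscale) on Jaccard distances between terms, followed by K-means, rather than the MCA that conceptualStructure performs. The document-term matrix and the term labels are invented for illustration.

```r
# Invented Documents x Terms binary matrix: two groups of co-occurring terms
X <- matrix(c(1, 1, 1, 0, 0, 0,
              1, 1, 0, 0, 0, 0,
              0, 1, 1, 0, 0, 0,
              0, 0, 0, 1, 1, 1,
              0, 0, 0, 1, 1, 0,
              0, 0, 0, 0, 1, 1),
            nrow = 6, byrow = TRUE,
            dimnames = list(paste0("doc", 1:6),
                            c("CHART", "EWMA", "CUSUM", "ARIMA", "RESIDUAL", "FORECAST")))

# Jaccard distance between terms, based on the documents they appear in
d <- dist(t(X), method = "binary")

# Classical MDS: project the terms onto a two-dimensional map
coords <- cmdscale(d, k = 2)

# K-means on the map coordinates; several random starts for stability
set.seed(1)
km <- kmeans(coords, centers = 2, nstart = 10)
cl <- km$cluster
print(cl)

# Plot the map, colouring terms by cluster
plot(coords, type = "n", xlab = "Dim 1", ylab = "Dim 2", main = "Toy conceptual map")
text(coords, labels = rownames(coords), col = cl)
```

Terms that never share a document are at distance 1, so the two term groups end up on opposite sides of the first dimension and in separate clusters.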
Historical Co-Citation Network
The historiographic map is a graph proposed by E. Garfield to represent a chronological network map of the most relevant co-citations resulting from a bibliographic collection. The function histNetwork generates a chronological co-citation network matrix which can be plotted using histPlot:
# Create a historical co-citation network
histResults <- histNetwork(M, n = 10, sep = ". ")
Error in `$<-.data.frame`(`*tmp*`, "LCS", value = numeric(0)) :
replacement has 0 rows, data has 10