#install.packages("dplyr")
#install.packages("Matrix")
#install.packages("stringr")
#install.packages("igraph")
#install.packages("bibliometrix", dependencies=TRUE)
library("dplyr")
library("Matrix")
library("stringr")
library("igraph")
library("FactoMineR")
library("factoextra")
library("ggplot2")
library("bibliometrix") ### load bibliometrix package
D <- readFiles("thales.bib")
#D
# base ISI WoK
# M <- convert2df(D, dbsource = "isi", format = "bibtex")
# base SCOPUS
M <- convert2df(D, dbsource = "scopus", format = "bibtex")
Articles extracted 100
Articles extracted 151
#M
results <- biblioAnalysis(M, sep = ";")
S=summary(object = results, k = 10, pause = FALSE)
Main Information about data
Articles 151
Sources (Journals, Books, etc.) 81
Keywords Plus (ID) 774
Author's Keywords (DE) 374
Period 1975 - 2017
Average citations per article 11.46
Authors 244
Author Appearances 352
Authors of single authored articles 21
Authors of multi authored articles 223
Articles per Author 0.619
Authors per Article 1.62
Co-Authors per Articles 2.33
Collaboration Index 1.78
Annual Scientific Production
Year Articles
1975 2
1981 1
1982 1
1983 3
1987 1
1990 3
1991 1
1992 3
1993 1
1995 3
1996 1
1997 5
1999 5
2000 3
2001 1
2002 1
2003 2
2004 1
2005 3
2006 4
2007 6
2008 14
2009 9
2010 7
2011 9
2012 14
2013 6
2014 8
2015 12
2016 19
2017 2
Annual Percentage Growth Rate 0
Most Productive Authors
Authors Articles Authors Articles Fractionalized
1 NIAKI,STA 8 NIAKI,STA 4.00
2 ASLAM,M 7 KAYA,I 3.08
3 JUN,C-H 7 GADRE,MP 2.67
4 GADRE,MP 6 RATTIHALLI,RN 2.67
5 RATTIHALLI,RN 6 ABBASI,B 2.50
6 ABBASI,B 5 SCHWERTMAN,NC 2.50
7 KAHRAMAN,C 5 TESTIK,MC 2.50
8 KAYA,I 5 ASLAM,M 2.42
9 WU,Z 5 JUN,C-H 2.42
10 CASTAGLIOLA,P 4 WU,Z 1.75
Top manuscripts per citations
Paper TC TCperYear
1 CALVIN TW,(1983),IEEE TRANS. COMP., HYBRIDS, MANUFACT. TECHNOL. 116 3.41
2 MARIONJR RR ;STOUMBOS ZG,(1999),J QUAL TECHNOL 101 5.61
3 WANG J-H ;RAZ T,(1990),INT J PROD RES 87 3.22
4 CHIU WK,(1975),TECHNOMETRICS 79 1.88
5 KANAGAWA A ;TAMAKI F ;OHTA H,(1993),INT J PROD RES 78 3.25
6 LAGASSE RS ;STEINBERG ES ;KATZ RI ;SAUBERMANN AJ,(1995),ANESTHESIOLOGY 67 3.05
7 RYAN TP ;SCHWERTMAN NC,(1997),J QUAL TECHNOL 61 3.05
8 CALABRESE JOELM,(1995),MANAGE SCI 48 2.18
9 TOPALIDOU E ;PSARAKIS S,(2009),QUAL RELIAB ENG INT 42 5.25
10 KAYA I,(2009),INF SCI 34 4.25
Most Productive Countries
Country Articles Freq
1 USA 30 0.2069
2 IRAN 20 0.1379
3 TAIWAN 13 0.0897
4 TURKEY 11 0.0759
5 INDIA 9 0.0621
6 CHINA 8 0.0552
7 FRANCE 6 0.0414
8 SINGAPORE 6 0.0414
9 SAUDI ARABIA 5 0.0345
10 BRAZIL 4 0.0276
Total Citations per Country
Country Total Citations Average Article Citations
1 USA 693 23.10
2 IRAN 174 8.70
3 TURKEY 157 14.27
4 JAPAN 104 52.00
5 CHINA 97 12.12
6 SINGAPORE 81 13.50
7 TAIWAN 70 5.38
8 INDIA 55 6.11
9 CANADA 49 12.25
10 GREECE 49 24.50
Most Relevant Sources
Sources Articles
1 QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL 17
2 COMMUNICATIONS IN STATISTICS - THEORY AND METHODS 8
3 JOURNAL OF QUALITY TECHNOLOGY 8
4 INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH 7
5 JOURNAL OF APPLIED STATISTICS 7
6 QUALITY ENGINEERING 7
7 COMPUTERS AND INDUSTRIAL ENGINEERING 4
8 INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY 4
9 EXPERT SYSTEMS WITH APPLICATIONS 3
10 IIE TRANSACTIONS (INSTITUTE OF INDUSTRIAL ENGINEERS) 3
Most Relevant Keywords
Author Keywords (DE) Articles Keywords-Plus (ID) Articles
1 ATTRIBUTE CONTROL CHART 25 CONTROL CHARTS 51
2 STATISTICAL PROCESS CONTROL 24 FLOWCHARTING 45
3 ATTRIBUTE CONTROL CHARTS 21 ATTRIBUTE CONTROL CHARTS 25
4 AVERAGE RUN LENGTH 19 GRAPHIC METHODS 22
5 CONTROL CHART 10 AVERAGE RUN LENGTHS 18
6 MARKOV CHAIN 9 QUALITY CONTROL 17
7 SPC 9 MARKOV PROCESSES 13
8 ATTRIBUTES CONTROL CHART 7 IN-CONTROL 11
9 ATTRIBUTES CONTROL CHARTS 7 CONTROL LIMITS 10
10 PROCESS CONTROL 7 PARAMETER ESTIMATION 10
plot(x = results, k = 10, pause = FALSE)
#M$CR[1]
CR <- citations(M, field = "article", sep = ";")
CR$Cited[1:10]
CR
PATEL, HI, QUALITY CONTROL METHODS FOR MULTIVARIATE BINOMIAL AND POISSON DISTRIBUTIONS (1973) TECHNOMETRICS, 15, PP 103-112
11
MARCUCCI, M, MONITORING MULTINOMIAL PROCESSES (1985) J QUAL TECHNOL, 17, PP 86-91
7
RYAN, TP, SCHWERTMAN, NC, OPTIMAL LIMITS FOR ATTRIBUTES CONTROL CHARTS (1997) JOURNAL OF QUALITY TECHNOLOGY, 29 (1), PP 86-98
7
WOODALL, WH, CONTROL CHARTS BASED ON ATTRIBUTE DATA: BIBLIOGRAPHY AND REVIEW (1997) JOURNAL OF QUALITY TECHNOLOGY, 29 (2), PP 172-183
7
BOURKE, PD, DETECTING A SHIFT IN FRACTION NONCONFORMING USING RUN-LENGTH CONTROL CHARTS WITH 100% INSPECTION (1991) JOURNAL OF QUALITY TECHNOLOGY, 23, PP 225-238
6
TOPALIDOU, E, PSARAKIS, S, REVIEW OF MULTINOMIAL AND MULTIATTRIBUTE QUALITY CONTROL CHARTS (2009) QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, 25 (7), PP 773-804
6
BORROR, CM, CHAMP, CW, RIGDON, SE, POISSON EWMA CONTROL CHARTS (1998) JOURNAL OF QUALITY TECHNOLOGY, 30 (4), PP 352-361
5
BERSIMIS, S, PSARAKIS, S, PANARETOS, J, MULTIVARIATE STATISTICAL PROCESS CONTROL CHARTS: AN OVERVIEW (2007) QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, 23 (5), PP 517-543
4
BOURKE, PD, SAMPLE SIZE AND THE BINOMIAL CUSUM CONTROL CHART: THE CASE OF 100% INSPECTION (2001) METRIKA, 53, PP 51-70
4
BROOK, D, EVANS, DA, AN APPROACH TO THE PROBABILITY DISTRIBUTION OF CUSUM RUN LENGTH (1972) BIOMETRIKA, 59 (3), PP 539-549
4
The function lotka estimates Lotka’s law coefficients for scientific productivity (Lotka A.J., 1926). Lotka’s law describes the frequency of publication by authors in any given field as an inverse square law, where the number of authors publishing a certain number of articles is a fixed ratio to the number of authors publishing a single article. This assumption implies that the theoretical beta coefficient of Lotka’s law is equal to 2. Using the lotka function, it is possible to estimate the beta coefficient of our bibliographic collection and to assess, through a statistical test, how similar this empirical distribution is to the theoretical one.
L <- lotka(results)
# Author Productivity. Empirical Distribution
L$AuthorProd
# Beta coefficient estimate
L$Beta
[1] 2.414755
# Constant
L$C
[1] 0.673307
# Goodness of fit
L$R2
[1] 0.983198
# P-value of K-S two sample test
L$p.value
[1] 0.6271671
# Observed distribution
Observed=L$AuthorProd[,3]
# Theoretical distribution with Beta = 2
Theoretical=10^(log10(L$C)-2*log10(L$AuthorProd[,1]))
plot(L$AuthorProd[,1],Theoretical,type="l",col="red",ylim=c(0, 1), xlab="Articles",ylab="Freq. of Authors",main="Scientific Productivity")
lines(L$AuthorProd[,1],Observed,col="blue")
legend(x="topright",c("Theoretical (B=2)","Observed"),col=c("red","blue"),lty = c(1,1),cex=0.6,bty="n")
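To make the estimation step concrete, the following is a minimal base R sketch of how a Lotka-type beta coefficient can be obtained from a log-log least squares fit and compared with the theoretical inverse square law. The productivity table is made-up toy data, and the assumption that lotka works along these lines has not been checked against the package source.

```r
# Toy author-productivity table (hypothetical numbers, not taken from thales.bib):
# n_articles = number of articles written, n_authors = authors with that many articles.
prod <- data.frame(
  n_articles = c(1, 2, 3, 4, 5),
  n_authors  = c(200, 50, 22, 12, 8)
)
prod$freq <- prod$n_authors / sum(prod$n_authors)

# Lotka's law: freq = C / n^beta, i.e. log10(freq) = log10(C) - beta * log10(n)
fit      <- lm(log10(freq) ~ log10(n_articles), data = prod)
beta_hat <- -unname(coef(fit)[2])      # estimated beta coefficient
C_hat    <- 10^unname(coef(fit)[1])    # estimated constant
r2       <- summary(fit)$r.squared     # goodness of fit

# Theoretical distribution with beta fixed at 2, compared with the observed one
# through a two-sample Kolmogorov-Smirnov test
theoretical <- C_hat / prod$n_articles^2
ks <- suppressWarnings(ks.test(prod$freq, theoretical))

beta_hat
r2
ks$p.value
```

A large K-S p-value means the test finds no significant difference between the observed and the theoretical distribution.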
The attributes of a manuscript are connected to each other through the manuscript itself: author(s) to journal, keywords to publication date, etc. These connections between attributes generate bipartite networks that can be represented as rectangular matrices (Manuscripts x Attributes). Furthermore, scientific publications regularly contain references to other scientific works. This generates a further network, namely a co-citation or coupling network. These networks are analysed in order to capture meaningful properties of the underlying research system, and in particular to determine the influence of bibliometric units such as scholars and journals.
cocMatrix is a general function that computes a bipartite network from one of the metadata attributes. For a complete list of field tags see https://images.webofknowledge.com/WOKRS410B4/help/WOS/h_fieldtags.html
For example, to create a Manuscript x Publication Source network, use the field tag “SO”:
A <- cocMatrix(M, Field = "SO", sep = ";")
A is a rectangular binary matrix representing a bipartite network where rows and columns are manuscripts and sources respectively. The generic element \(a_{ij}\) is 1 if manuscript \(i\) has been published in source \(j\), 0 otherwise. The sum of column \(j\) is the number of manuscripts published in source \(j\). Sorting the column sums of A in decreasing order, you can see the most relevant publication sources:
sort(Matrix::colSums(A), decreasing = TRUE)[1:15]
QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL
17
COMMUNICATIONS IN STATISTICS - THEORY AND METHODS
8
JOURNAL OF QUALITY TECHNOLOGY
8
JOURNAL OF APPLIED STATISTICS
7
INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH
7
QUALITY ENGINEERING
7
INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY
4
COMPUTERS AND INDUSTRIAL ENGINEERING
4
STUDIES IN FUZZINESS AND SOFT COMPUTING
3
INTERNATIONAL JOURNAL OF RELIABILITY, QUALITY AND SAFETY ENGINEERING
3
IIE TRANSACTIONS (INSTITUTE OF INDUSTRIAL ENGINEERS)
3
EXPERT SYSTEMS WITH APPLICATIONS
3
INTERNATIONAL JOURNAL OF PRODUCTION ECONOMICS
2
JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION
2
SEQUENTIAL ANALYSIS
2
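As an illustration of what such a bipartite matrix looks like, here is a small base R sketch that builds a Manuscripts x Attributes binary matrix from a delimited field and ranks the columns by their sums. The helper toy_bipartite and the journal names are hypothetical and only mimic the idea; this is not the cocMatrix implementation.

```r
# Hypothetical helper: build a binary Manuscripts x Attributes matrix
# from a character vector whose entries hold one or more items separated by `sep`.
toy_bipartite <- function(x, sep = ";") {
  items  <- lapply(strsplit(x, sep, fixed = TRUE), trimws)
  labels <- sort(unique(unlist(items)))
  A <- matrix(0L, nrow = length(x), ncol = length(labels),
              dimnames = list(paste0("paper", seq_along(x)), labels))
  for (i in seq_along(items)) {
    A[i, items[[i]]] <- 1L   # mark the attributes attached to manuscript i
  }
  A
}

# Toy "SO"-like field: one publication source per manuscript (made-up data)
so <- c("JOURNAL A", "JOURNAL B", "JOURNAL A", "JOURNAL C", "JOURNAL A")
A_toy <- toy_bipartite(so)

# Column sums = number of manuscripts per source, most relevant first
col_totals <- sort(colSums(A_toy), decreasing = TRUE)
col_totals
```

The same helper applies to multi-valued fields such as keywords or authors, where a manuscript row contains several 1s.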
Following this approach, you can compute several bipartite networks:
Citation network
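# Note: citations() above used sep = ";" on the same data frame and returned complete
# references; the truncated labels in the output below suggest that ". " may not be
# the right reference separator for this Scopus export.
A <- cocMatrix(M, Field = "CR", sep = ". ")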
sort(Matrix::colSums(A), decreasing = TRUE)[1:15]
INTRODUCTIO TECHNOMETRICS INTERNATIONA DISTRIBUTIO INTERNATIONAL
82 71 69 64 51
DISTRIBUTION ENGINEERING TRANSACTIONS MULTIVARIAT EXPONENTIALL
51 48 47 46 40
MANUFACTURIN APPLICATION NONCONFORMIN COMMUNICATION TRANSACTION
37 35 33 31 30
Author network
A <- cocMatrix(M, Field = "AU", sep = ";")
sort(Matrix::colSums(A), decreasing = TRUE)[1:15]
NIAKI STA ASLAM M JUN C-H GADRE MP RATTIHALLI RN
8 7 6 6 6
ABBASI B CASTAGLIOLA P TESTIK MC DOKOUHAKI P WU Z
5 4 4 4 4
KAHRAMAN C GLBAY M KOOLI I LIMAM M WEI CH
3 3 3 3 3
Country network
Authors’ Countries is not a standard attribute of the bibliographic data frame. You need to extract this information from the affiliation attribute using the function metaTagExtraction.
M <- metaTagExtraction(M, Field = "AU_CO", sep = ";")
A <- cocMatrix(M, Field = "AU_CO", sep = ";")
sort(Matrix::colSums(A), decreasing = TRUE)[1:15]
UNITED STATES IRAN TAIWAN TURKEY CHINA
37 21 14 14 11
INDIA KOREA BRAZIL FRANCE SINGAPORE
10 7 6 6 6
SAUDI ARABIA CANADA SPAIN HONG KONG PAKISTAN
5 5 4 4 4
metaTagExtraction allows you to extract the following additional field tags: Authors’ countries (Field = “AU_CO”); First author of each cited reference (Field = “CR_AU”); Publication source of each cited reference (Field = “CR_SO”); and Authors’ affiliations (Field = “AU_UN”).
Author keyword network
A <- cocMatrix(M, Field = "DE", sep = ";")
sort(Matrix::colSums(A), decreasing = TRUE)[1:15]
ATTRIBUTE CONTROL CHART STATISTICAL PROCESS CONTROL
25 24
ATTRIBUTE CONTROL CHARTS AVERAGE RUN LENGTH
21 19
CONTROL CHART MARKOV CHAIN
10 9
SPC ATTRIBUTES CONTROL CHARTS
9 7
ATTRIBUTES CONTROL CHART PROCESS CONTROL
7 7
ARL POISSON DISTRIBUTION
6 6
MULTI-ATTRIBUTE CONTROL CHART GENETIC ALGORITHMS
6 5
AVERAGE TIME TO SIGNAL
5
Keyword Plus network
A <- cocMatrix(M, Field = "ID", sep = ";")
sort(Matrix::colSums(A), decreasing = TRUE)[1:15]
CONTROL CHARTS FLOWCHARTING ATTRIBUTE CONTROL CHARTS
50 45 25
GRAPHIC METHODS AVERAGE RUN LENGTHS QUALITY CONTROL
22 18 17
MARKOV PROCESSES IN-CONTROL PARAMETER ESTIMATION
13 11 10
CONTROL LIMITS PROBABILITY SAMPLE SIZES
10 9 8
POISSON DISTRIBUTION QUALITY CHARACTERISTIC PROCESS MONITORING
8 8 7
Bibliographic coupling
Two articles are said to be bibliographically coupled if at least one cited source appears in the bibliographies or reference lists of both articles (Kessler, 1963). A coupling network can be obtained using the general formulation \(B = A \times A^T\), where \(A\) is a bipartite network Manuscripts x Cited references.
The function biblioNetwork calculates, starting from a bibliographic data frame, the most frequently used coupling networks: Authors, Sources, and Countries.
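biblioNetwork uses two arguments to define the network to compute: analysis (for example "coupling", "co-citation", "collaboration" or "co-occurrences") and network (for example "references", "authors", "countries" or "keywords"), as in the calls used in this document: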
NetMatrix <- biblioNetwork(M, analysis = "coupling", network = "references", sep = ". ")
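The coupling formulation can be checked by hand on a tiny example. The sketch below uses a made-up Manuscripts x Cited references matrix and base R only; it illustrates the matrix product, not the internals of biblioNetwork.

```r
# Rows = manuscripts, columns = cited references (hypothetical toy data)
A_toy <- matrix(c(1, 1, 0, 0,
                  1, 1, 1, 0,
                  0, 0, 1, 1),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("P1", "P2", "P3"),
                                c("R1", "R2", "R3", "R4")))

# Coupling matrix B = A x A^T: element (i, j) counts the references
# shared by manuscripts i and j; the diagonal holds each reference list length
B_toy <- A_toy %*% t(A_toy)   # equivalent to tcrossprod(A_toy)
B_toy
```

Here P1 and P2 share two references, P2 and P3 share one, and P1 and P3 are not coupled at all.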
If coupling strength is measured simply by the number of references that two articles have in common, articles with only a few references tend to be more weakly coupled. This suggests that it might be more practical to switch to a relative measure of bibliographic coupling. The couplingSimilarity function calculates the Jaccard or Salton similarity coefficient among the vertices of a coupling network.
NetMatrix <- biblioNetwork(M, analysis = "coupling", network = "authors", sep = ";")
# plot authors' coupling network (first 20 authors)
net=networkPlot(NetMatrix, n = 20, Title = "Authors' Coupling", type = "fruchterman", size=FALSE, remove.multiple=TRUE, vos.path="c:/Users/DE/Desktop/bibliometRics/VOSviewer")
# calculate jaccard similarity coefficient
S <- couplingSimilarity(NetMatrix, type="jaccard")
# plot authors' similarity (first 20 authors)
net=networkPlot(S, n = 20, Title = "Authors' Coupling", remove.multiple=TRUE, vos.path="c:/Users/DE/Desktop/bibliometRics/VOSviewer")
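For reference, the two relative measures can be computed by hand from a coupling matrix. The sketch below applies the usual textbook definitions of the Jaccard and Salton coefficients to a made-up matrix; that couplingSimilarity uses exactly these formulas is an assumption.

```r
# Toy coupling matrix (hypothetical): off-diagonal = shared references,
# diagonal = number of references of each manuscript
B_toy <- matrix(c(2, 2, 0,
                  2, 3, 1,
                  0, 1, 2),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("P1", "P2", "P3"), c("P1", "P2", "P3")))
d <- diag(B_toy)

# Jaccard: shared references / references cited by at least one of the two
jaccard <- B_toy / (outer(d, d, "+") - B_toy)

# Salton (cosine): shared references / geometric mean of the two list lengths
salton <- B_toy / sqrt(outer(d, d))

round(jaccard, 3)
round(salton, 3)
```

Both coefficients lie between 0 and 1, so a pair of short reference lists with full overlap scores as high as a pair of long ones.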
Bibliographic co-citation
Two articles are co-cited when both are cited in a third article. Thus, co-citation can be seen as the counterpart of bibliographic coupling. A co-citation network can be obtained using the general formulation \(C = A^T \times A\), where \(A\) is a bipartite network Manuscripts x Cited references.
NetMatrix <- biblioNetwork(M, analysis = "co-citation", network = "references", sep = ". ")
net=networkPlot(NetMatrix, n = 20, Title = "Bibliographic cocitation", type = "fruchterman", size=FALSE, remove.multiple=TRUE, vos.path="c:/Users/DE/Desktop/bibliometRics/VOSviewer")
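The co-citation product can be illustrated in the same way as coupling. The following base R sketch uses a made-up Manuscripts x Cited references matrix and is not meant to reproduce what biblioNetwork does internally.

```r
# Rows = citing manuscripts, columns = cited references (hypothetical toy data)
A_toy <- matrix(c(1, 1, 0, 0,
                  1, 1, 1, 0,
                  0, 0, 1, 1),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("P1", "P2", "P3"),
                                c("R1", "R2", "R3", "R4")))

# Co-citation matrix C = A^T x A: element (i, j) counts the manuscripts
# citing both reference i and reference j; the diagonal is the citation count
C_toy <- t(A_toy) %*% A_toy   # equivalent to crossprod(A_toy)
C_toy
```

R1 and R2 are co-cited twice (by P1 and P2), while R1 and R4 never appear in the same reference list.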
Bibliographic collaboration
A scientific collaboration network is a network where nodes are authors and links are co-authorships, as the latter is one of the most well documented forms of scientific collaboration (Glanzel, 2004). An author collaboration network can be obtained using the general formulation \(AC = A^T \times A\), where \(A\) is a bipartite network Manuscripts x Authors.
NetMatrix <- biblioNetwork(M, analysis = "collaboration", network = "authors", sep = ";")
net=networkPlot(NetMatrix, n = 20, Title = "Bibliographic collaboration", type = "fruchterman", size=FALSE, remove.multiple=TRUE, vos.path="c:/Users/DE/Desktop/bibliometRics/VOSviewer")
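To see how co-authorship links arise from the Manuscripts x Authors matrix, here is a short base R sketch on made-up data. It computes the collaboration matrix and lists the weighted links; the author labels and variable names are hypothetical.

```r
# Rows = manuscripts, columns = authors (hypothetical toy data)
A_toy <- matrix(c(1, 1, 0,
                  1, 0, 1,
                  1, 1, 0,
                  0, 0, 1),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("P", 1:4),
                                c("AUTHOR A", "AUTHOR B", "AUTHOR C")))

# Collaboration matrix AC = A^T x A
AC_toy <- crossprod(A_toy)
articles_per_author <- diag(AC_toy)   # diagonal = articles written by each author

# Keep only the links between different authors and list them once
links <- AC_toy
diag(links) <- 0
edges <- which(upper.tri(links) & links > 0, arr.ind = TRUE)
edge_list <- data.frame(
  from   = rownames(links)[edges[, "row"]],
  to     = colnames(links)[edges[, "col"]],
  weight = links[edges]               # number of co-authored manuscripts
)
edge_list
```

An edge list of this kind is a convenient input for graph tools when the full matrix is too large to inspect.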
Visualizing bibliographic networks
Country Scientific Collaboration
# Create a country collaboration network
M <- metaTagExtraction(M, Field = "AU_CO", sep = ";")
NetMatrix <- biblioNetwork(M, analysis = "collaboration", network = "countries", sep = ";")
# Plot the network
net=networkPlot(NetMatrix, n = 20, Title = "Country Collaboration", type = "circle", size=TRUE,remove.multiple=FALSE)
net=networkPlot(NetMatrix, n = 20, Title = "Country Collaboration", type = "fruchterman", size=FALSE, remove.multiple=TRUE, vos.path="c:/Users/DE/Desktop/bibliometRics/VOSviewer")
Co-Citation Network
# Create a co-citation network
NetMatrix <- biblioNetwork(M, analysis = "co-citation", network = "references", sep = ". ")
# Plot the network
net=networkPlot(NetMatrix, n = 5, Title = "Co-Citation Network", type = "fruchterman", size=T, remove.multiple=FALSE)
Keyword co-occurrences
# Create keyword co-occurrences network
NetMatrix <- biblioNetwork(M, analysis = "co-occurrences", network = "keywords", sep = ";")
# Plot the network
net=networkPlot(NetMatrix, n = 20, Title = "Keyword Co-occurrences", type = "kamada", size=T)
Co-Word Analysis: Conceptual structure of a field
The aim of co-word analysis is to map the conceptual structure of a framework using the word co-occurrences in a bibliographic collection. The analysis can be performed through dimensionality reduction techniques such as Multidimensional Scaling (MDS) or Multiple Correspondence Analysis (MCA). Here, we show an example using the function conceptualStructure, which performs an MCA to draw the conceptual structure of the field and K-means clustering to identify clusters of documents that express common concepts. Results are plotted on a two-dimensional map. conceptualStructure includes natural language processing (NLP) routines (see the function termExtraction) to extract terms from titles and abstracts. In addition, it implements Porter’s stemming algorithm to reduce inflected (or sometimes derived) words to their word stem, base or root form.
# Conceptual Structure using keywords
CS <- conceptualStructure(M,field="ID", minDegree=4, k.max=5, stemming=FALSE)
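The general idea of mapping terms in two dimensions and then clustering them can be sketched with base R alone. The example below deliberately swaps MCA for classical MDS (cmdscale) on a made-up document x keyword matrix, so it illustrates the MDS alternative mentioned above rather than what conceptualStructure computes.

```r
set.seed(1)  # kmeans uses random starting centers

# Rows = documents, columns = keywords (hypothetical toy data)
DT <- matrix(c(1, 1, 1, 0, 0, 0,
               1, 1, 0, 0, 0, 0,
               0, 1, 1, 0, 0, 0,
               0, 0, 0, 1, 1, 1,
               0, 0, 0, 1, 1, 0,
               0, 0, 0, 0, 1, 1),
             nrow = 6, byrow = TRUE,
             dimnames = list(paste0("D", 1:6),
                             c("CONTROL CHART", "SPC", "RUN LENGTH",
                               "FUZZY", "GENETIC", "MARKOV")))

# Jaccard-type distance between keywords, based on the documents they share
d <- dist(t(DT), method = "binary")

# Classical MDS: place the keywords on a two-dimensional map
coords <- cmdscale(d, k = 2)

# K-means on the map coordinates to group keywords into two clusters
km <- kmeans(coords, centers = 2, nstart = 10)
split(rownames(coords), km$cluster)
```

In this toy collection the first three keywords never co-occur with the last three, so the two clusters recover the two groups of terms.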
Historical Co-Citation Network
The historiographic map is a graph proposed by E. Garfield to represent a chronological network map of the most relevant co-citations resulting from a bibliographic collection. The function histNetwork generates a chronological co-citation network matrix, which can be plotted using histPlot:
# Create a historical co-citation network
?histNetwork
histResults <- histNetwork(M, n = 10, sep = ";")
Error in regexpr(df$Year[i], df$Paper[i]) : invalid 'pattern' argument