Description

This document describes the data analysis steps carried out for the bibliometric analysis of Community Dentistry and Oral Epidemiology (CDOE). The journal is celebrating its 50th anniversary, and a scientometric analysis was conducted of all its publications from 1973 to 2022. The data set comprised 3438 documents (research articles and reviews). The R package “Bibliometrix” was used for the analysis. Bibliometrix is open-source software that automates the stages of data analysis and data visualization. After bibliographic data are converted and loaded into R, Bibliometrix performs a descriptive analysis and several research-structure analyses.

Bibliographic Collection

Data source: SCOPUS

Data format: CSV (the export loaded below is scopus.csv, read with format = "csv")

Query: SO = “Community Dentistry and Oral Epidemiology”

Timespan: 1973-2022

Document types: Articles, reviews

Query date: Jan 07, 2023

Install and load bibliometrix R-package

# Stable version from CRAN (Comprehensive R Archive Network)
# To run the installation, remove the leading # from the next line
# install.packages("bibliometrix")
### OR ###
# Most recent version from GitHub
# To run the installation, remove the leading # from the next lines
# install.packages("remotes")
# remotes::install_github("massimoaria/bibliometrix")
# remotes::install_github("massimoaria/bibliometrixData")

library(bibliometrix)

Data Loading and Converting

setwd("~/CDOE scientometric")
# Convert the exported Scopus file into an R bibliographic data frame
M <- convert2df("scopus.csv", dbsource = "scopus", format = "csv")

Converting your scopus collection into a bibliographic dataframe

Done!


Generating affiliation field tag AU_UN from C1:  Done!

Section 1: Descriptive Analysis

Descriptive analysis provides snapshots of the annual research output, the top “k” most productive authors, papers and countries, and the most relevant keywords.

A. Main findings about the collection

# options(width=160)
results <- biblioAnalysis(M)
summary(results, k = 25, pause = F, width = 130)


MAIN INFORMATION ABOUT DATA

 Timespan                              1973 : 2022 
 Sources (Journals, Books, etc)        1 
 Documents                             3438 
 Annual Growth Rate %                  4.31 
 Document Average Age                  24.8 
 Average citations per doc             30.9 
 Average citations per year per doc    1.547 
 References                            115354 
 
DOCUMENT TYPES                     
 article      3320 
 review       118 
 
DOCUMENT CONTENTS
 Keywords Plus (ID)                    3999 
 Author's Keywords (DE)                4042 
 
AUTHORS
 Authors                               6885 
 Author Appearances                    12456 
 Authors of single-authored docs       304 
 
AUTHORS COLLABORATION
 Single-authored docs                  440 
 Documents per Author                  0.499 
 Co-Authors per Doc                    3.62 
 International co-authorships %        19.05 
 

Annual Scientific Production

 Year    Articles
    1973       18
    1974       47
    1975       51
    1976       48
    1977       56
    1978       63
    1979       68
    1980       68
    1981       57
    1982       66
    1983       74
    1984       78
    1985       85
    1986       91
    1987       83
    1988       88
    1989       83
    1990       74
    1991       88
    1992       86
    1993       84
    1994       85
    1995       69
    1996       87
    1997       71
    1998       74
    1999       59
    2000       60
    2001       57
    2002       57
    2003       61
    2004       64
    2005       49
    2006       50
    2007       61
    2008       63
    2009       62
    2010       61
    2011       63
    2012       98
    2013       50
    2014       65
    2015       63
    2016       65
    2017       62
    2018       78
    2019       63
    2020       68
    2021       75
    2022      142

Annual Percentage Growth Rate 4.31 


Most Productive Authors

   Authors        Articles Authors        Articles Fractionalized
1   LOCKER D            62  LOCKER D                        27.73
2   SPENCER AJ          56  SPENCER AJ                      23.60
3   SHEIHAM A           47  SHEIHAM A                       16.15
4   THOMSON WM          38  HOLST D                         12.62
5   POULSEN S           30  ANAISE JZ                       12.08
6   HOOGSTRATEN J       28  GRYTTEN J                       12.00
7   TSAKOS G            28  PETERSEN PE                     11.90
8   HOLST D             26  RISE J                          11.62
9   SLADE GD            26  THOMSON WM                      10.96
10  HAUSEN H            25  HAUGEJORDEN O                   10.73
11  PERES MA            25  POULSEN S                       10.02
12  GRYTTEN J           24  RIORDAN PJ                       9.74
13  MURTOMAA H          24  SLADE GD                         9.56
14  LO ECM              23  MURTOMAA H                       9.07
15  PETERSEN PE         23  HOOGSTRATEN J                    9.05
16  RIORDAN PJ          23  SCHWARZ E                        9.04
17  WATT RG             23  HELOE LA                         8.67
18  BRENNAN DS          22  LO ECM                           7.93
19  DO LG               21  HAUSEN H                         7.85
20  GILBERT GH          21  HOROWITZ HS                      7.78
21  HAUGEJORDEN O       21  BRENNAN DS                       7.16
22  RISE J              21  WATT RG                          7.15
23  VAN'T HOF MA        21  PITTS NB                         7.06
24  FREEMAN R           20  FREEMAN R                        6.87
25  ISMAIL AI           20  ISMAIL AI                        6.66


Top manuscripts per citations

                                            Paper                                           DOI   TC TCperYear   NTC
1  PETERSEN PE, 2003, COMMUNITY DENT ORAL EPIDEMIOL        10.1046/j..2003.com122.x             1596     76.00 20.63
2  SLADE GD, 1997, COMMUNITY DENT ORAL EPIDEMIOL           10.1111/j.1600-0528.1997.tb00941.x   1471     54.48 21.70
3  ISMAIL AI, 2007, COMMUNITY DENT ORAL EPIDEMIOL          10.1111/j.1600-0528.2007.00347.x      828     48.71 10.44
4  PETERSEN PE, 2005, COMMUNITY DENT ORAL EPIDEMIOL        10.1111/j.1600-0528.2004.00219.x      707     37.21 11.07
5  FEATHERSTONE JDB, 1999, COMMUNITY DENT ORAL EPIDEMIOL   10.1111/j.1600-0528.1999.tb01989.x    634     25.36 13.16
6  SHEIHAM A, 2000, COMMUNITY DENT ORAL EPIDEMIOL          10.1034/j.1600-0528.2000.028006399.x  605     25.21  9.70
7  POLDER BJ, 2004, COMMUNITY DENT ORAL EPIDEMIOL          10.1111/j.1600-0528.2004.00158.x      586     29.30  8.95
8  THYLSTRUP A, 1978, COMMUNITY DENT ORAL EPIDEMIOL        10.1111/j.1600-0528.1978.tb01173.x    469     10.20 20.00
9  LOCKER D, 2007, COMMUNITY DENT ORAL EPIDEMIOL           10.1111/j.1600-0528.2007.00418.x      416     24.47  5.25
10 TSAI C, 2002, COMMUNITY DENT ORAL EPIDEMIOL             10.1034/j.1600-0528.2002.300304.x     373     16.95  5.53
11 GUPTA PC, 1980, COMMUNITY DENT ORAL EPIDEMIOL           10.1111/j.1600-0528.1980.tb01302.x    361      8.20 13.53
12 STEELE JG, 2004, COMMUNITY DENT ORAL EPIDEMIOL          10.1111/j.0301-5661.2004.00131.x      340     17.00  5.19
13 WATT RG, 2007, COMMUNITY DENT ORAL EPIDEMIOL            10.1111/j.1600-0528.2007.00348.x      335     19.71  4.22
14 KAY EJ, 1996, COMMUNITY DENT ORAL EPIDEMIOL             10.1111/j.1600-0528.1996.tb00850.x    300     10.71  8.84
15 KRAMER IRH, 1980, COMMUNITY DENT ORAL EPIDEMIOL         10.1111/j.1600-0528.1980.tb01249.x    296      6.73 11.10
16 DE SOUZA CORTES MI, 2002, COMMUNITY DENT ORAL EPIDEMIOL 10.1034/j.1600-0528.2002.300305.x     290     13.18  4.30
17 DE OLIVEIRA BH, 2005, COMMUNITY DENT ORAL EPIDEMIOL     10.1111/j.1600-0528.2005.00225.x      285     15.00  4.46
18 PETERSEN PE, 2004, COMMUNITY DENT ORAL EPIDEMIOL        10.1111/j.1600-0528.2004.00175.x      281     14.05  4.29
19 MURTI PR, 1985, COMMUNITY DENT ORAL EPIDEMIOL           10.1111/j.1600-0528.1985.tb00468.x    280      7.18 12.25
20 HEYDECKE G, 2003, COMMUNITY DENT ORAL EPIDEMIOL         10.1034/j.1600-0528.2003.00029.x      266     12.67  3.44
21 PETERSEN PE, 2009, COMMUNITY DENT ORAL EPIDEMIOL        10.1111/j.1600-0528.2008.00448.x      259     17.27  6.07
22 BERGSTRÖM J, 1989, COMMUNITY DENT ORAL EPIDEMIOL        10.1111/j.1600-0528.1989.tb00626.x    255      7.29 10.17
23 MONSE B, 2010, COMMUNITY DENT ORAL EPIDEMIOL            10.1111/j.1600-0528.2009.00514.x      250     17.86  6.91
24 LOCKER D, 2000, COMMUNITY DENT ORAL EPIDEMIOL           10.1034/j.1600-0528.2000.280301.x     248     10.33  3.98
25 THOMSON WM, 2004, COMMUNITY DENT ORAL EPIDEMIOL         10.1111/j.1600-0528.2004.00173.x      243     12.15  3.71


Corresponding Author's Countries

          Country Articles    Freq SCP MCP MCP_Ratio
1  USA                 248 0.17139 206  42     0.169
2  UNITED KINGDOM      178 0.12301 125  53     0.298
3  BRAZIL              152 0.10504 100  52     0.342
4  AUSTRALIA           135 0.09330  93  42     0.311
5  CANADA               80 0.05529  57  23     0.287
6  NETHERLANDS          73 0.05045  50  23     0.315
7  NORWAY               73 0.05045  56  17     0.233
8  FINLAND              59 0.04077  48  11     0.186
9  SWEDEN               51 0.03525  38  13     0.255
10 GERMANY              41 0.02833  25  16     0.390
11 JAPAN                37 0.02557  28   9     0.243
12 DENMARK              35 0.02419  22  13     0.371
13 NEW ZEALAND          30 0.02073  16  14     0.467
14 CHINA                29 0.02004  20   9     0.310
15 HONG KONG            29 0.02004  17  12     0.414
16 KOREA                19 0.01313  14   5     0.263
17 FRANCE               16 0.01106   9   7     0.438
18 SPAIN                14 0.00968  10   4     0.286
19 BELGIUM              12 0.00829  10   2     0.167
20 CHILE                12 0.00829   6   6     0.500
21 SWITZERLAND          12 0.00829   6   6     0.500
22 IRELAND              11 0.00760   6   5     0.455
23 MALAYSIA             11 0.00760   9   2     0.182
24 THAILAND              9 0.00622   4   5     0.556
25 SOUTH AFRICA          7 0.00484   6   1     0.143


SCP: Single Country Publications

MCP: Multiple Country Publications


Total Citations per Country

           Country      Total Citations Average Article Citations
1  USA                            10225                     41.23
2  UNITED KINGDOM                  7754                     43.56
3  CANADA                          4522                     56.52
4  BRAZIL                          3695                     24.31
5  SWITZERLAND                     3279                    273.25
6  AUSTRALIA                       3213                     23.80
7  NETHERLANDS                     3177                     43.52
8  SWEDEN                          1938                     38.00
9  NORWAY                          1904                     26.08
10 GERMANY                         1625                     39.63
11 HONG KONG                       1413                     48.72
12 DENMARK                         1287                     36.77
13 FINLAND                         1275                     21.61
14 JAPAN                           1109                     29.97
15 NEW ZEALAND                     1051                     35.03
16 FRANCE                           588                     36.75
17 BELGIUM                          575                     47.92
18 CHINA                            471                     16.24
19 SPAIN                            403                     28.79
20 THAILAND                         295                     32.78
21 PHILIPPINES                      250                    250.00
22 KOREA                            234                     12.32
23 MALAYSIA                         222                     20.18
24 IRELAND                          220                     20.00
25 SOUTH AFRICA                     210                     30.00


Most Relevant Sources

                             Sources        Articles
1 COMMUNITY DENTISTRY AND ORAL EPIDEMIOLOGY     3438


Most Relevant Keywords

   Author Keywords (DE)      Articles Keywords-Plus (ID)     Articles
1     DENTAL CARIES               519   HUMAN                    3891
2     EPIDEMIOLOGY                354   FEMALE                   3600
3     ORAL HEALTH                 331   MALE                     3588
4     CARIES                      153   DENTAL CARIES            2522
5     CHILDREN                    136   ADULT                    2336
6     QUALITY OF LIFE             105   ARTICLE                  2234
7     EPIDEMIOLOGY  ORAL           97   CHILD                    2086
8     ORAL HYGIENE                 87   ADOLESCENT               2073
9     DENTAL CARE                  85   HUMANS                   1926
10    ADULTS                       78   AGED                     1301
11    FLUORIDE                     77   DENTAL CARE              1074
12    PERIODONTAL DISEASE          77   MIDDLE AGED              1029
13    TOOTH LOSS                   72   PREVALENCE                970
14    PREVENTION                   68   DMF INDEX                 772
15    EARLY CHILDHOOD CARIES       66   HEALTH SURVEY             748
16    PUBLIC HEALTH                66   HEALTH                    674
17    DENTAL ANXIETY               65   COMPARATIVE STUDY         636
18    ADOLESCENTS                  56   ORAL HEALTH               603
19    DENTAL FLUOROSIS             55   QUESTIONNAIRE             554
20    DISPARITIES                  52   PRESCHOOL CHILD           536
21    PREVALENCE                   52   CHILD  PRESCHOOL          511
22    GINGIVITIS                   51   AGE FACTORS               444
23    FLUORIDATION                 50   AGE                       441
24    DENTAL HEALTH                45   STATISTICS                441
25    FLUORIDES                    45   PSYCHOLOGICAL ASPECT      417

plot(x = results, k = 25, pause = FALSE)
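
Several figures in the summary above can be checked by hand from the counts it reports. The sketch below is only a plausibility check under stated assumptions: that the growth rate is a compound annual growth rate between the first and last year, and that citations per year are counted from the publication year to the query year (2023) inclusive. The variable names are illustrative and nothing here is taken from the package source.

```r
# Counts copied from the summary output above
first_year_docs <- 18                   # documents published in 1973
last_year_docs  <- 142                  # documents published in 2022
n_years         <- 2022 - 1973 + 1      # 50 publication years

# Assumed formula: compound annual growth over the 49 year-to-year intervals
growth_rate <- ((last_year_docs / first_year_docs)^(1 / (n_years - 1)) - 1) * 100
round(growth_rate, 2)                   # 4.31

n_docs        <- 3438
n_authors     <- 6885
n_appearances <- 12456

docs_per_author   <- n_docs / n_authors        # "Documents per Author"
coauthors_per_doc <- n_appearances / n_docs    # "Co-Authors per Doc"
round(c(docs_per_author, coauthors_per_doc), 3)

# Citations per year for the most cited paper (PETERSEN PE, 2003),
# assuming the years 2003 to 2023 are all counted
tc_per_year <- 1596 / (2023 - 2003 + 1)
tc_per_year                             # 76

# Share of multiple-country publications for the USA: MCP / Articles
mcp_ratio_usa <- 42 / 248
round(mcp_ratio_usa, 3)                 # 0.169
```

Under these assumptions the results agree with the reported values (4.31, 0.499, 3.62, 76.00 and 0.169).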

B. Authors dominance ranking

The dominance factor (DF) is the proportion of an author’s multi-authored publications on which that author is listed as first author. It indicates how often the author takes the lead in publishing research articles.

DF <- dominance(results, k=25)
DF
write.table(DF, file = "dominance factor.tsv", 
            sep="\t", quote = FALSE, col.names=TRUE, row.names=FALSE)
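
To make the definition concrete, here is a minimal base R sketch of the dominance factor on a toy collection. The author strings and the helper `dominance_factor()` are hypothetical and are not part of bibliometrix; the sketch only assumes that authors are stored in publication order, separated by ";".

```r
# Toy data (hypothetical): one ';'-separated author string per document
au <- c("LOCKER D;SLADE GD",
        "SLADE GD;LOCKER D;SPENCER AJ",
        "LOCKER D",
        "LOCKER D;SPENCER AJ")

author_list <- strsplit(au, ";")
multi <- author_list[lengths(author_list) > 1]   # multi-authored documents only

# DF = multi-authored documents as first author / all multi-authored documents
dominance_factor <- function(author) {
  in_doc   <- vapply(multi, function(a) author %in% a, logical(1))
  is_first <- vapply(multi, function(a) a[1] == author, logical(1))
  sum(is_first) / sum(in_doc)
}

dominance_factor("LOCKER D")    # first author on 2 of 3 multi-authored documents
dominance_factor("SPENCER AJ")  # never first author, so 0
```

The single-authored document is excluded before counting, which is why it does not raise the first author's score.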

C. To calculate the h-index of the 25 most productive authors

An author has an index of “h” if “h” of their papers have each been cited at least “h” times in other publications. This hybrid statistic reflects both an author’s productivity and the influence of their citations. The h-index gives no additional weight to publications that have received very many citations, and it tends to increase over time, favouring authors with longer careers [1]. Conversely, the g-index gives highly cited publications more weight [3]. The m-index is calculated by dividing the h-index by the duration of an author’s active period. As a result, it is less time-dependent because it accounts for the duration of an author’s career.

authors <- gsub(",", " ", names(results$Authors)[1:25])
indices <- Hindex(M, elements = authors, field = "author", sep = ";")
indices$H
write.table(indices$H, file = "authors h index.tsv", 
            sep="\t", quote = FALSE, col.names=TRUE, row.names=FALSE)
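
The three indices described above can be sketched in a few lines of base R. The citation counts and years below are hypothetical, the function names are illustrative, and the m-index here assumes the active period is counted in whole years from the first publication year to the current year inclusive.

```r
# h-index: largest h such that h papers have at least h citations each
h_index <- function(cites) {
  cites <- sort(cites, decreasing = TRUE)
  sum(cites >= seq_along(cites))
}

# g-index: largest g such that the top g papers together have at least g^2 citations
# (capped at the number of papers in this simple variant)
g_index <- function(cites) {
  cites <- sort(cites, decreasing = TRUE)
  sum(cumsum(cites) >= seq_along(cites)^2)
}

# m-index: h-index divided by the length of the active period in years (assumed definition)
m_index <- function(cites, first_year, current_year) {
  h_index(cites) / (current_year - first_year + 1)
}

cites <- c(10, 8, 5, 4, 3, 0, 0)   # hypothetical citation counts for one author
h_index(cites)                     # 4
g_index(cites)                     # 5
m_index(cites, first_year = 2003, current_year = 2022)   # 4 / 20 = 0.2
```

In this example the g-index exceeds the h-index because the two most cited papers contribute their surplus citations to the cumulative total.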

D. To calculate author’s production over time

The plot shows each author’s publication frequency over time. The size of each circle is proportional to the number of articles the author published in that year, and the colour of the circle reflects the impact as measured by yearly citations (the darker the colour, the higher the article impact).

res <- authorProdOverTime(M, k=25) 
print(res$dfAU) 
plot(res$graph)

E. Top keywords

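The most frequently used keywords were identified and tracked over time to determine the main trending themes of the journal. The call below uses Keywords Plus (Tag = "ID"); setting Tag = "DE" would analyse the authors’ keywords instead.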

topKW=KeywordGrowth(M, Tag = "ID", sep = ";", top=10, cdf=TRUE) 
topKW
#install.packages("reshape2") 
library(reshape2) 
library(ggplot2) 
DF=melt(topKW, id='Year')
ggplot(DF,aes(Year,value, group=variable, color=variable))+geom_line()
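
As a rough illustration of what a cumulative keyword count per year looks like (the kind of table requested with cdf = TRUE), the base R sketch below builds one from a toy collection. The records are hypothetical and the code does not reproduce the package internals.

```r
# Hypothetical toy records: publication year and ';'-separated keywords
py <- c(2001, 2001, 2002, 2003)
kw <- c("DENTAL CARIES;FLUORIDE",
        "DENTAL CARIES",
        "FLUORIDE;ORAL HEALTH",
        "DENTAL CARIES")

terms <- strsplit(kw, ";")
long  <- data.frame(Year    = rep(py, lengths(terms)),
                    Keyword = unlist(terms))

yearly     <- table(long$Year, long$Keyword)   # occurrences per year and keyword
cumulative <- apply(yearly, 2, cumsum)         # running total down the years
cumulative
```

Each column of `cumulative` is non-decreasing, so plotted lines can only rise or stay flat; plotting `yearly` instead would show the year-by-year counts.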

F. Three-field plot of the relationship between keywords, authors and authors’ countries

A Sankey plot, also known as a three-field plot, uses rectangles of different heights and colours to depict the relevant elements and the flows between them. It was plotted for the top 20 items in each field (keywords, authors and authors’ countries).

threeFieldsPlot(M, fields = c("DE", "AU", "AU_CO"), n = c(20, 20, 20))

G. Lotka’s Law

Lotka’s law, which describes author productivity through the frequency distribution of the number of publications per author, was fitted to the authors publishing in CDOE.

results <- biblioAnalysis(M) 
L=lotka(results) 
L
write.table(L, file = "Lotka's law.tsv", sep="\t", quote = FALSE, col.names=TRUE, row.names=FALSE)
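
Lotka’s law states that the number of authors with n publications is roughly C / n^beta, with beta close to 2 in the classic formulation. One common way to estimate beta is a linear fit on a log-log scale, sketched below on a hypothetical productivity table. The data and variable names are made up for illustration, and this is not claimed to be the exact procedure used by `lotka()`.

```r
# Hypothetical productivity table:
# n_docs = papers per author, n_authors = authors with exactly that many papers
prod <- data.frame(n_docs    = 1:5,
                   n_authors = c(1000, 250, 111, 62, 40))

# n_authors = C / n_docs^beta becomes a straight line after taking logs
fit <- lm(log10(n_authors) ~ log10(n_docs), data = prod)

beta  <- -unname(coef(fit)[2])      # estimated exponent
const <- 10^unname(coef(fit)[1])    # estimated constant C
r2    <- summary(fit)$r.squared     # goodness of fit

round(c(beta = beta, C = const, R2 = r2), 3)
```

Because the toy counts were chosen to follow an inverse-square pattern, the estimated exponent comes out close to 2; real collections usually deviate from this.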

Section 2: Scientific Maps

In a network map, each item (for example an author, institution, country or keyword) is represented by a circular node, and related nodes are connected by a line. The size of the nodes and the width of the line connecting two nodes represent the strength of the relationship. The relative positions of the nodes represent how closely they are related, and different colours mark the groups formed by clusters of related nodes.

A. Collaboration networks: authors, countries, institutions. Collaboration networks show how authors, institutions (e.g. universities or departments) and countries relate to others in a specific field of research.

B. Keyword co-occurrence network

C. Authors’ co-citation network

A. Collaboration Maps

1. Author collaboration network

The co-author network reveals regular study groups, hidden groups of scholars, and pivotal authors.

NetMatrix <- biblioNetwork(M, analysis = "collaboration",  network = "authors", sep = ";")
net=networkPlot(NetMatrix,  n = 50, Title = "Author collaboration",type = "auto", 
                size=10,size.cex=T,edgesize = 5,labelsize=1, remove.isolates = T, cluster = "walktrap")

Descriptive analysis of author collaboration network characteristics

netstat <- networkStat(NetMatrix)
summary(netstat,k=15)
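
A collaboration network can be viewed as the product of a document-by-author incidence matrix with itself. The base R sketch below shows the idea on hypothetical author strings; it illustrates the matrix algebra only and makes no claim about how `biblioNetwork()` stores its result.

```r
# Hypothetical author strings, one per document
au <- c("A;B", "A;B;C", "C", "A;C")

authors_per_doc <- strsplit(au, ";")
all_authors     <- sort(unique(unlist(authors_per_doc)))

# Document x author incidence matrix (1 = author appears on the document)
A <- t(vapply(authors_per_doc,
              function(a) as.integer(all_authors %in% a),
              integer(length(all_authors))))
colnames(A) <- all_authors

# Author x author matrix: off-diagonal = joint papers, diagonal = papers per author
collab <- crossprod(A)
collab
```

Here `collab["A", "B"]` is 2 because A and B share two documents, and the diagonal entry for A is 3 because A appears on three documents.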

2. Institutional collaboration network

Uncovers relevant institutions and shows which educational institutes commonly collaborate.

NetMatrix <- biblioNetwork(M, analysis = "collaboration",  network = "universities", sep = ";")
net=networkPlot(NetMatrix,  n = 50, Title = "Institutional collaboration",type = "auto", size=4,size.cex=F,edgesize = 3,labelsize=1, remove.isolates = T, cluster = "walktrap")

Descriptive analysis of institutional collaboration network

netstat <- networkStat(NetMatrix)
summary(netstat,k=15)

3. Country collaboration network

Countries that commonly work and collaborate together in the journal.

M <- metaTagExtraction(M, Field = "AU_CO", sep = ";")
NetMatrix <- biblioNetwork(M, analysis = "collaboration",  network = "countries", sep = ";")
net=networkPlot(NetMatrix,  n = dim(NetMatrix)[1], Title = "Country collaboration",
type = "auto", size=10,size.cex=T,edgesize = 5,labelsize=1, cluster="walktrap", remove.isolates = T)

Descriptive analysis of country collaboration network characteristics

netstat <- networkStat(NetMatrix)
summary(netstat,k=15)

B. Co-word Analysis through Keyword co-occurrences

Plot options:
- normalize = “association” (the vertex similarities are normalized using association strength)
- n = 50 (the function plots the top 50 keywords)
- type = “auto” (auto layout is selected)
- size.cex = TRUE (the size of the vertices is proportional to their degree)
- size = 20 (the max size of the vertices)
- label.cex = TRUE (the vertex label sizes are proportional to their degree)
- remove.multiple = TRUE (multiple edges are removed)
- remove.isolates = TRUE (isolated vertices are removed)
- edgesize = 10 (the thickness of the edges is proportional to their strength; edgesize defines the max value of the thickness)
- labelsize = 3 (defines the max size of vertex labels)
- label.n = 50 (labels are plotted only for the main 50 vertices)
- edges.min = 2 (plots only edges with a strength greater than or equal to 2)
- all other arguments assume the default values

NetMatrix <- biblioNetwork(M, analysis = "co-occurrences", 
network = "author_keywords", sep = ";")

net=networkPlot(NetMatrix, normalize="association", 
n = 50, Title = "Keyword Co-occurrences",
type = "auto", size.cex=TRUE, size=20, label.cex= T,
remove.multiple=T, remove.isolates = TRUE, edgesize = 10, labelsize=3,edges.min=2)

Descriptive analysis of keyword co-occurrences network

netstat <- networkStat(NetMatrix)
summary(netstat,k=10)
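
The association strength normalization mentioned in the plot options divides each co-occurrence count by the product of the two keywords’ total occurrences. A minimal sketch on a hypothetical co-occurrence matrix follows; the counts are invented, and the assumption that the diagonal holds each keyword’s total occurrences is stated in the comments.

```r
kw <- c("CARIES", "FLUORIDE", "ORAL HEALTH")

# Hypothetical co-occurrence matrix; the diagonal is assumed to hold
# the total number of documents using each keyword
cooc <- matrix(c(10, 4, 2,
                  4, 8, 1,
                  2, 1, 5),
               nrow = 3, byrow = TRUE, dimnames = list(kw, kw))

occ   <- diag(cooc)
assoc <- cooc / outer(occ, occ)   # s_ij = c_ij / (c_i * c_j)
diag(assoc) <- 0                  # self-links carry no information

round(assoc, 3)
```

For example, CARIES and FLUORIDE co-occur 4 times against 10 and 8 total occurrences, giving 4 / 80 = 0.05. The normalization keeps very frequent keywords from dominating the map simply because they appear everywhere.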

C. The Intellectual Structure of the Field - Co-citation Analysis

Citation analysis is one of the classic techniques in bibliometrics. It shows the structure of a specific field through the linkages between nodes (e.g. authors, papers, journals), while the edges can be interpreted differently depending on the network type, namely co-citation, direct citation or bibliographic coupling. Please see Aria and Cuccurullo (2017).

Article (References) co-citation analysis

Plot options:
- n = 50 (the function plots the main 50 cited references)
- type = “fruchterman” (the network layout is generated using the Fruchterman-Reingold algorithm)
- size.cex = TRUE (the size of the vertices is proportional to their degree)
- size = 20 (the max size of vertices)
- remove.multiple = FALSE (multiple edges are not removed)
- labelsize = 1 (defines the size of vertex labels)
- edgesize = 10 (the thickness of the edges is proportional to their strength; edgesize defines the max value of the thickness)
- edges.min = 5 (plots only edges with a strength greater than or equal to 5)
- all other arguments assume the default values

#NetMatrix <- biblioNetwork(M, analysis = "co-citation", network = "references", sep = ";")
#net=networkPlot(NetMatrix, n = 50, Title = "Co-Citation Network", type = "auto", 
                #size.cex=F, normalize = "association", weighted = T,
                #remove.multiple=FALSE, labelsize=1,edgesize = 10, label.color = T, 
                #cluster = "walktrap", halo= T)
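
The difference between co-citation and bibliographic coupling can be seen from a small document-by-reference matrix. The sketch below uses hypothetical documents and references: two references are co-cited when the same document cites both, while two documents are coupled when they share a cited reference.

```r
# Hypothetical citing documents (rows) and cited references (columns);
# 1 means the document cites the reference
A <- matrix(c(1, 1, 0,
              1, 1, 1,
              0, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("doc1", "doc2", "doc3"),
                            c("refA", "refB", "refC")))

cocitation <- crossprod(A)    # reference x reference: documents citing both
coupling   <- tcrossprod(A)   # document x document: references shared by both

cocitation["refA", "refB"]    # 2 (doc1 and doc2 cite both)
coupling["doc1", "doc3"]      # 1 (only refB is shared)
```

Co-citation counts can keep growing as new documents cite older references, whereas coupling between two documents is fixed once both are published.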

Section 3: Factorial Analysis

A. Co-word analysis draws clusters of keywords. These clusters are treated as themes, whose density and centrality can be used to classify them and map them in a two-dimensional diagram.

B. A thematic map is an intuitive plot in which themes are analysed according to the quadrant in which they are placed: (1) upper-right quadrant: motor themes; (2) lower-right quadrant: basic themes; (3) lower-left quadrant: emerging or disappearing themes; (4) upper-left quadrant: very specialized/niche themes.

Citations

Aria, M., Cuccurullo, C., D’Aniello, L., Misuraca, M., & Spano, M. (2022). Thematic Analysis as a New Culturomic Tool: The Social Media Coverage on COVID-19 Pandemic in Italy. Sustainability, 14(6), 3643. https://doi.org/10.3390/su14063643

Aria, M., Misuraca, M., & Spano, M. (2020). Mapping the evolution of social research and data science on 30 years of Social Indicators Research. Social Indicators Research. https://doi.org/10.1007/s11205-020-02281-3

Cobo, M. J., Lopez-Herrera, A. G., Herrera-Viedma, E., & Herrera, F. (2011). An approach for detecting, quantifying, and visualizing the evolution of a research field: A practical application to the fuzzy sets theory field. Journal of Informetrics, 5(1), 146-166.

A. Co-word Analysis through Correspondence Analysis

suppressWarnings(
CS <- conceptualStructure(M, method="MCA", field="DE", 
stemming=FALSE, minDegree= 5, documents=5, clust=6, labelsize=15)
)

B. Thematic Maps

Map=thematicMap(M, field = "DE", n = 250, minfreq = 3,
                stemming = FALSE, size = 0.7, n.labels=10, repel = TRUE)
plot(Map$map)
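
The quadrant reading of a thematic map can be expressed as a simple rule on centrality (horizontal axis) and density (vertical axis). The sketch below uses hypothetical themes and values and assumes the quadrants are split at the median of each axis; the cut points actually drawn by `thematicMap()` may differ.

```r
# Hypothetical themes with made-up centrality and density scores
themes <- data.frame(
  theme      = c("theme A", "theme B", "theme C", "theme D"),
  centrality = c(0.9, 0.8, 0.2, 0.3),
  density    = c(0.7, 0.2, 0.1, 0.8)
)

# Assumed cut points: the median of each axis
cx <- median(themes$centrality)
dy <- median(themes$density)

themes$quadrant <- ifelse(
  themes$centrality >= cx,
  ifelse(themes$density >= dy, "motor", "basic"),
  ifelse(themes$density >= dy, "niche", "emerging or disappearing")
)
themes
```

With these values, theme A is a motor theme (high centrality, high density), theme B is basic, theme C is emerging or disappearing, and theme D is niche.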

Cluster description

Clusters=Map$words[order(Map$words$Cluster,-Map$words$Occurrences),]
library(dplyr)
CL <- Clusters %>% group_by(.data$Cluster_Label) %>% top_n(5, .data$Occurrences)
CL
write.table(CL, file = "cluster description.tsv", 
            sep="\t", quote = FALSE, col.names=TRUE, row.names=FALSE)