Bibliometric analysis of the field of soft robotics

Author

Julius Fenn

1 Notes

I retrieved 12,363 documents from https://www.webofscience.com/ with the following search terms:

(“soft robot” OR “soft robotics”)

Date of retrieval: 24th of July 2025 (references were sorted by “Date: newest first”)

2 global variables

Define your global variables (e.g., to run specific processes only once to save computation time):

recompileData = FALSE

3 load packages, additional functions, .bib files

### load packages
require(pacman)
p_load('bibliometrix', 'tidyverse')


### load .bib files
rds_file <- file.path("outputs", "M_out.rds") # one cache location for both branches
if(recompileData){
  # List all .bib files in the "data" folder
  bib_files <- list.files("data", pattern = "\\.bib$", full.names = TRUE)
  print(bib_files)
  # bib_files <- "data/savedrecs12001_12363.bib" # for testing
  # Convert all of them in one go
  M <- convert2df(file = bib_files, dbsource = "wos", format = "bibtex", remove.duplicates = TRUE)
  
  rm(bib_files)
  
  # Save via a full path instead of setwd(), so later relative paths keep working
  dir.create("outputs", showWarnings = FALSE)
  saveRDS(object = M, file = rds_file)
}else{
  # Read from the same location the other branch writes to
  M <- readRDS(file = rds_file)
}
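The caching pattern used here (rebuild only when requested, otherwise read a stored .rds file) can be sketched as a small helper. This is a minimal sketch, not part of bibliometrix: `load_or_build()`, its arguments, and the temporary file are hypothetical names chosen for illustration.

```r
# Hypothetical helper: return the cached object if the .rds file exists and
# no rebuild is requested; otherwise build the object and cache it.
load_or_build <- function(cache_file, build_fun, rebuild = FALSE) {
  if (!rebuild && file.exists(cache_file)) {
    return(readRDS(cache_file))
  }
  obj <- build_fun()
  dir.create(dirname(cache_file), recursive = TRUE, showWarnings = FALSE)
  saveRDS(obj, cache_file)
  obj
}

# Toy usage with a temporary file; in this report, build_fun would wrap convert2df().
cache_file <- file.path(tempdir(), "cache_demo", "toy.rds")
unlink(cache_file)
first  <- load_or_build(cache_file, function() data.frame(id = 1:3))
second <- load_or_build(cache_file, function() stop("not called: cache is used"))
all.equal(first, second)
```

Because the file path is passed explicitly, the working directory never changes, which avoids side effects on later chunks.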

3.1 additional functions

## APA7 style for plots
theme_apa7 <- function() {
  theme_minimal(base_size = 12, base_family = "sans") +
    theme(
      plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
      plot.subtitle = element_text(size = 12, hjust = 0.5),
      axis.title = element_text(face = "bold"),
      axis.text = element_text(size = 11),
      panel.grid.major = element_line(color = "gray85"),
      panel.grid.minor = element_blank(),
      panel.border = element_blank(),
      axis.line = element_line(color = "black")
    )
}

## Extract information of identified clusters
get_top_articles_by_cluster <- function(cluster_df, min_cluster_size = 6, top_n = 3) {
  # Count clusters and articles per cluster
  cluster_sizes <- cluster_df %>%
    count(cluster, name = "ClusterSize")

  # Filter clusters with at least `min_cluster_size` articles
  eligible_clusters <- cluster_sizes %>%
    filter(ClusterSize >= min_cluster_size) %>%
    pull(cluster)

  # For each eligible cluster, get top N articles by btw_centrality
  top_articles <- cluster_df %>%
    filter(cluster %in% eligible_clusters) %>%
    group_by(cluster) %>%
    arrange(desc(btw_centrality)) %>%
    slice_head(n = top_n) %>%
    ungroup()

  # Return both total clusters and result table
  list(
    total_clusters = n_distinct(cluster_df$cluster),
    eligible_clusters = length(eligible_clusters),
    top_articles = top_articles
  )
}

3.2 remove all entries without DOIs

Remark: although the search returned 12,363 documents, not all of them end up in the data set (e.g., because of missing entries, and because duplicates are removed during import).

Number of entries before removal, percentage of entries without a DOI, and number of entries after removal:

nrow(M)

[1] 11352

round(x = sum(is.na(M$DI)) / length(M$DI) * 100, digits = 2)

[1] 8.78

M <- M[!is.na(M$DI), ]
nrow(M)

[1] 10355
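The filter can be checked on a toy data frame. This is a minimal sketch: the data frame `toy` and its values are invented, and treating empty strings as missing is an additional assumption (the report itself checks only for `NA`).

```r
# Toy stand-in for M; DI is the DOI column used in the report.
toy <- data.frame(
  TI = c("paper a", "paper b", "paper c", "paper d"),
  DI = c("10.1000/a", NA, "10.1000/c", ""),
  stringsAsFactors = FALSE
)

# Assumption: empty strings count as missing DOIs as well.
missing_doi <- is.na(toy$DI) | trimws(toy$DI) == ""

pct_removed <- round(mean(missing_doi) * 100, 2)  # share of rows without a DOI
toy_clean   <- toy[!missing_doi, ]

# Cross-check of the figures reported above: 11352 rows before, 10355 after.
pct_report <- round((11352 - 10355) / 11352 * 100, 2)
pct_report
```

The cross-check reproduces the reported 8.78 % from the two row counts alone.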

4 📊 Descriptive bibliometric analysis

results <- biblioAnalysis(M, sep = ";")
S <- summary(results, k=10, pause=FALSE)



MAIN INFORMATION ABOUT DATA

 Timespan                              1997 : 2025 
 Sources (Journals, Books, etc)        1363 
 Documents                             10355 
 Annual Growth Rate %                  29.04 
 Document Average Age                  3.35 
 Average citations per doc             32.26 
 Average citations per year per doc    5.579 
 References                            225387 
 
DOCUMENT TYPES                     
 article                             7591 
 article article                     1 
 article; book chapter               21 
 article; data paper                 1 
 article; early access               256 
 article; proceedings paper          40 
 article; retracted publication      1 
 correction                          11 
 editorial material                  119 
 letter                              5 
 meeting abstract                    2 
 news item                           2 
 proceedings paper                   1548 
 review                              727 
 review; book chapter                3 
 review; early access                27 
 
DOCUMENT CONTENTS
 Keywords Plus (ID)                    6821 
 Author's Keywords (DE)                17686 
 
AUTHORS
 Authors                               21950 
 Author Appearances                    59078 
 Authors of single-authored docs       161 
 
AUTHORS COLLABORATION
 Single-authored docs                  198 
 Documents per Author                  0.472 
 Co-Authors per Doc                    5.71 
 International co-authorships %        26.41 
 

Annual Scientific Production

 Year    Articles
    1997        1
    2003        1
    2005        2
    2006        6
    2007        2
    2008        4
    2009        5
    2010        4
    2011       12
    2012       22
    2013       26
    2014       91
    2015      127
    2016      175
    2017      320
    2018      509
    2019      933
    2020     1105
    2021     1032
    2022      913
    2023     1708
    2024     2097
    2025     1260

Annual Percentage Growth Rate 29.04 


Most Productive Authors

   Authors        Articles Authors        Articles Fractionalized
1         WANG Y       269        WANG Y                     46.8
2         LI Y         256        LI Y                       39.4
3         ZHANG Y      231        WANG Z                     37.2
4         WANG Z       220        ZHANG Y                    31.7
5         LIU Y        207        WANG X                     31.6
6         WANG X       207        LIU Y                      30.3
7         CHEN Y       185        LIU J                      28.9
8         WANG H       171        CHEN Y                     28.3
9         LIU J        168        WANG J                     26.9
10        WANG J       162        WANG H                     26.7


Top manuscripts per citations

                                Paper                                    DOI   TC TCperYear   NTC
1  RUS D, 2015, NATURE                         10.1038/nature14543           4343     394.8 35.97
2  AMJADI M, 2016, ADV FUNCT MATER             10.1002/adfm.201504755        2616     261.6 23.51
3  WANG S, 2018, NATURE                        10.1038/nature25494           1806     225.8 20.10
4  SHEPHERD RF, 2011, PROC NATL ACAD SCI U S A 10.1073/pnas.1116564108       1739     115.9  4.97
5  KIM Y, 2018, NATURE                         10.1038/s41586-018-0185-0     1690     211.2 18.81
6  KIM S, 2013, TRENDS BIOTECHNOL              10.1016/j.tibtech.2013.03.002 1585     121.9  6.44
7  DICKEY MD, 2017, ADV MATER                  10.1002/adma.201606425        1409     156.6 16.18
8  ILIEVSKI F, 2011, ANGEW CHEM-INT EDIT       10.1002/anie.201006464        1402      93.5  4.00
9  SHINTAKE J, 2018, ADV MATER                 10.1002/adma.201707035        1352     169.0 15.05
10 TEE BCK, 2012, NAT NANOTECHNOL              10.1038/NNANO.2012.192        1250      89.3  5.78


Corresponding Author's Countries

          Country Articles   Freq  SCP MCP MCP_Ratio
1  CHINA              3287 0.3204 2604 683     0.208
2  USA                1997 0.1946 1621 376     0.188
3  KOREA               654 0.0637  525 129     0.197
4  JAPAN               600 0.0585  521  79     0.132
5  UNITED KINGDOM      517 0.0504  304 213     0.412
6  ITALY               515 0.0502  319 196     0.381
7  GERMANY             339 0.0330  197 142     0.419
8  SINGAPORE           261 0.0254  155 106     0.406
9  SWITZERLAND         261 0.0254  148 113     0.433
10 AUSTRALIA           187 0.0182  119  68     0.364


SCP: Single Country Publications

MCP: Multiple Country Publications


Total Citations per Country

      Country      Total Citations Average Article Citations
1  USA                      110293                     55.23
2  CHINA                     89020                     27.08
3  ITALY                     17964                     34.88
4  KOREA                     17837                     27.27
5  GERMANY                   13883                     40.95
6  SINGAPORE                 10880                     41.69
7  UNITED KINGDOM            10127                     19.59
8  SWITZERLAND                9652                     36.98
9  JAPAN                      9238                     15.40
10 AUSTRALIA                  6831                     36.53


Most Relevant Sources

                                                     Sources        Articles
1  SOFT ROBOTICS                                                         690
2  IEEE ROBOTICS AND AUTOMATION LETTERS                                  629
3  ACS APPLIED MATERIALS \\& INTERFACES                                  298
4  ADVANCED FUNCTIONAL MATERIALS                                         257
5  ADVANCED MATERIALS                                                    245
6  ADVANCED INTELLIGENT SYSTEMS                                          228
7  ADVANCED MATERIALS TECHNOLOGIES                                       175
8  SMART MATERIALS AND STRUCTURES                                        163
9  2023 IEEE INTERNATIONAL CONFERENCE ON SOFT ROBOTICS ROBOSOFT          143
10 2024 IEEE 7TH INTERNATIONAL CONFERENCE ON SOFT ROBOTICS ROBOSOFT      143


Most Relevant Keywords

   Author Keywords (DE)      Articles Keywords-Plus (ID)     Articles
1              SOFT ROBOTICS     2315            DESIGN          2103
2              SOFT ROBOT         692            FABRICATION      877
3              SOFT               525            ACTUATORS        494
4              ACTUATORS          443            SOFT             488
5              ROBOTS             370            DRIVEN           461
6              ROBOTICS           355            BEHAVIOR         321
7              3D PRINTING        346            LOCOMOTION       312
8              SOFT ROBOTS        296            MODEL            291
9              CONTROL            237            COMPOSITES       280
10             SOFT ACTUATOR      235            PERFORMANCE      279
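The annual growth rate of 29.04 % in the summary above is consistent with a compound annual growth rate computed from the first and last year of the production table. The formula below is an assumption about how the value is derived; it is shown because it reproduces the reported number, not because it is confirmed to be the package's internal code.

```r
# Counts taken from the "Annual Scientific Production" table above.
first_year <- 1997
n_first    <- 1
last_year  <- 2025
n_last     <- 1260

# Compound annual growth rate in percent (assumed formula).
n_years <- last_year - first_year
cagr <- ((n_last / n_first)^(1 / n_years) - 1) * 100
round(cagr, 2)

# 2025 is incomplete (retrieval on 24 July 2025); the same formula
# for the last complete year uses the 2024 count instead.
cagr_2024 <- ((2097 / n_first)^(1 / (2024 - first_year)) - 1) * 100
round(cagr_2024, 2)
```

Since the 2025 count covers only part of the year, the rate based on 2024 may be the more informative of the two.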

Plot for Annual Number of Published Articles:

# Step 1: Aggregate number of articles per year
annual_articles <- data.frame(Year = results$Years) %>%
  group_by(Year) %>%
  summarise(ArticleCount = n()) %>%
  arrange(Year)

# Step 2: Get first and last year
first_year <- min(annual_articles$Year)
last_year <- max(annual_articles$Year)

# Step 3: Generate x-axis breaks including first and last year
x_breaks <- unique(c(first_year,
                     pretty(annual_articles$Year, n = 8),
                     last_year))

# Step 4: Plot with x-axis including first and last year
ggplot(annual_articles, aes(x = Year, y = ArticleCount)) +
  geom_line(linewidth = 1.1, color = "black") +
  geom_point(size = 2.5, color = "black") +
  geom_text(aes(label = ArticleCount), 
            vjust = -0.7, size = 3.2, color = "black") +
  labs(
    title = "Annual Scientific Production",
    x = "Year",
    y = "Number of Articles"
  ) +
  scale_x_continuous(breaks = sort(unique(x_breaks))) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  theme_apa7()

Plot for Annual Citations:

# Load required packages
library(dplyr)
library(ggplot2)

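# Step 1: Sum the TCperYear values of all documents by publication year
# Note: TCperYear is each document's average number of citations per year,
# not its raw citation count (results$TotalCitation holds the raw counts)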
annual_citations <- data.frame(Year = results$Years, Citations = results$TCperYear) %>%
  group_by(Year) %>%
  summarise(TotalCitations = sum(Citations, na.rm = TRUE)) %>%
  arrange(Year)

# Step 2: Extract first and last year
first_year <- min(annual_citations$Year)
last_year  <- max(annual_citations$Year)

# Step 3: Generate x-axis breaks including first and last year
x_breaks <- sort(unique(c(first_year,
                          pretty(annual_citations$Year, n = 8),
                          last_year)))

# Step 4: Plot with explicit x-axis breaks
ggplot(annual_citations, aes(x = Year, y = TotalCitations)) +
  geom_line(linewidth = 1.1, color = "black") +
  geom_point(size = 2.5, color = "black") +
  geom_text(aes(label = round(TotalCitations, 1)), 
            vjust = -0.7, size = 3.2, color = "black") +
  labs(
    title = "Annual Citations",
    x = "Year",
    y = "Sum of Citations per Year (TCperYear)"
  ) +
  scale_x_continuous(breaks = x_breaks) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  theme_apa7()
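The difference between raw citation counts and citations per year can be illustrated on toy data. This is a minimal sketch: the values are invented, and the rate `TC / (current year - publication year + 1)` is an assumption, although it matches the table of top manuscripts above (e.g., 4343 / 11 = 394.8 for RUS D, 2015).

```r
# Toy documents: publication year and total citations (invented values).
toy <- data.frame(
  Year = c(2020, 2020, 2023),
  TC   = c(50, 10, 6)
)
current_year <- 2025

# Assumed yearly rate: total citations divided by the document's age in years.
toy$TCperYear <- toy$TC / (current_year - toy$Year + 1)

# Aggregate both measures by publication year.
by_year <- aggregate(cbind(TC, TCperYear) ~ Year, data = toy, FUN = sum)
by_year
```

Summing `TC` favours older publication years, which have had more time to collect citations, whereas summing `TCperYear` corrects for document age.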

4.1 number of authors

# Step 1: Combine authors and years into a dataframe
df_authors <- data.frame(
  Year = results$Years,
  FirstAuthor = results$FirstAuthors
)

# Step 2: Count unique first authors per year
unique_authors_per_year <- df_authors %>%
  group_by(Year) %>%
  summarise(UniqueAuthors = n_distinct(FirstAuthor)) %>%
  arrange(Year)

# Step 3: Get axis breaks that include first and last year
x_breaks <- sort(unique(c(
  min(unique_authors_per_year$Year),
  pretty(unique_authors_per_year$Year, n = 8),
  max(unique_authors_per_year$Year)
)))

# Step 4: Plot as a line chart
ggplot(unique_authors_per_year, aes(x = Year, y = UniqueAuthors)) +
  geom_line(linewidth = 1.1, color = "black") +
  geom_point(size = 2.5, color = "black") +
  geom_text(aes(label = UniqueAuthors), 
            vjust = -0.7, size = 3.2, color = "black") +
  labs(
    title = "Annual Number of Unique First Authors",
    x = "Year",
    y = "Unique First Authors"
  ) +
  scale_x_continuous(breaks = x_breaks) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  theme_apa7()

Total number of authors in the data set (including co-authors):

length(unique(names(results$Authors)))

[1] 21950

4.2 most important affiliations

Remark: only the top 50 are shown

DT::datatable(data = results$Aff_frac[1:50,])

4.3 which country is leading the field:

Remark: only the country of the first author is considered

# Step 1: Prepare data
df <- data.frame(
  Year = results$Years,
  Country = str_trim(results$CO),
  stringsAsFactors = FALSE
)

# Step 2: Get top 6 countries
top_countries <- df %>%
  count(Country, sort = TRUE) %>%
  slice_head(n = 6) %>%
  pull(Country)

# Step 3: Aggregate by year and country
top_country_year <- df %>%
  filter(Country %in% top_countries) %>%
  count(Year, Country, name = "ArticleCount")

# Step 4: Plot with legend
ggplot(top_country_year, aes(x = Year, y = ArticleCount, color = Country, group = Country)) +
  geom_line(linewidth = 1.1) +
  geom_point(size = 2) +
  geom_text(aes(label = ArticleCount), vjust = -0.6, size = 3.1, show.legend = FALSE) +
  labs(
    title = "Annual Scientific Production – Top 6 Countries",
    x = "Year",
    y = "Number of Articles",
    color = "Country"  # Legend title
  ) +
  scale_x_continuous(breaks = pretty(unique(df$Year), n = 6)) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  theme_apa7()

4.4 look out for Shelly and Falk:

Although Shelly has 55 articles listed in Web of Science (see: https://www.webofscience.com/wos/woscc/summary/35dc76f0-d712-4ae6-8cae-b563137dbca2-017054c94a/date-descending/1), none of their titles or abstracts contains the search terms

(“soft robot” OR “soft robotics”)

so none of them appears in the data set:

# Shelly:
sum(str_detect(M$AU, regex("Levy-Tzedek", ignore_case = TRUE)))

[1] 0

# Authors:
M$AU[str_detect(M$AU, regex("Levy-Tzedek", ignore_case = TRUE))]

character(0)

# DOIs:
M$DI[str_detect(M$AU, regex("Levy-Tzedek", ignore_case = TRUE))]

character(0)

# Falk:
sum(str_detect(M$AU, regex("Tauber F", ignore_case = TRUE)))

[1] 4

# Authors:
M$AU[str_detect(M$AU, regex("Tauber F", ignore_case = TRUE))]

[1] "TAUBER F;DESMULLIEZ M;PICCIN O;STOKES A;A. A"                                 
[2] "TAUBER FJ;SLESARENKO V"                                                       
[3] "MEDER F;BAYTEKIN B;DEL DOTTORE E;MEROZ;YASMINE Y;TAUBER F;WALKER I;MAZZOLAI B"
[4] "CONRAD S;SPECK T;TAUBER F"

# DOIs:
M$DI[str_detect(M$AU, regex("Tauber F", ignore_case = TRUE))]

[1] "10.1088/1748-3190/acbb48"      "10.3389/frobt.2023.1129827"    "10.1088/1748-3190/aca198"     
[4] "10.1007/978-3-030-64313-3\\_6"
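A plain substring search such as "Tauber F" also matches longer surnames that merely contain the pattern. A stricter variant anchors the name at the start of an entry in the ";"-separated author field. This is a minimal sketch on toy strings: "STAUBER F;DOE J" is an invented entry, and the pattern is only one possible choice.

```r
# Toy author strings in the style of M$AU (the third entry is invented).
au <- c(
  "TAUBER F;DESMULLIEZ M",
  "TAUBER FJ;SLESARENKO V",
  "STAUBER F;DOE J",
  "CONRAD S;SPECK T;TAUBER F"
)

# Substring match, as in the report: also hits "STAUBER F".
loose <- grepl("Tauber F", au, ignore.case = TRUE)

# Anchored match: the surname must start an entry; further initials are allowed.
strict <- grepl("(^|;)TAUBER F[A-Z]*(;|$)", au, ignore.case = TRUE)

data.frame(au, loose, strict)
```

In the data set above all four hits are genuine, so the stricter pattern would only matter if similar surnames were present.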

5 📘 Bibliographic Coupling vs. Co-Citation

These methods both assess the similarity between documents, but from opposite directions in the citation network:

Metric	Bibliographic Coupling	Co-Citation
Perspective	Retrospective (shared references in the documents' own reference lists)	Prospective (cited together by later documents)
Definition	Two documents cite the same third document	Two documents are both cited by a later document
Stabilizes Over Time?	✅ Yes – reference list is fixed when published	❌ No – co-citation grows over time as more papers cite both
Used For	Measuring current similarity of documents	Mapping historic/latent structure of a field
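Both measures can be derived from the same document-by-reference matrix, which makes the difference concrete. This is a minimal sketch on an invented 3 x 3 example, not the output of `biblioNetwork()`.

```r
# Rows = citing documents, columns = cited references (1 = document cites reference).
A <- matrix(
  c(1, 1, 0,
    1, 1, 1,
    0, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"), c("refA", "refB", "refC"))
)

# Bibliographic coupling: document x document, number of shared references.
coupling <- A %*% t(A)

# Co-citation: reference x reference, number of documents citing both.
cocitation <- t(A) %*% A

coupling
cocitation
```

Here doc1 and doc2 share two references (refA and refB), so their coupling strength is 2; refA and refB are cited together by two documents, so their co-citation count is also 2. Adding a new citing document would change `cocitation` but leave the coupling among the existing documents untouched.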

Bibliographic coupling between documents (via shared references):

Purpose: Creates a document-to-document network.
Meaning: Two papers are linked if they cite the same reference(s).
Use case: Reveal thematic similarity or topical clusters among papers.

Note: my computer cannot run this analysis on the complete data set, so only a subset is used (the first 2,000 documents):

bc_matrix <- biblioNetwork(M[1:2000,], analysis = "coupling", network = "references", sep = ";")
bc_network <- networkPlot(bc_matrix, n=30, Title="Bibliographic Coupling", type="fruchterman", labelsize=0.7, cluster = "fast_greedy")

rm(bc_matrix) # remove object to free RAM

# to understand what a single cluster is about, it is recommended to read the 2-3 most central articles of that cluster:
bc_network_summary <- get_top_articles_by_cluster(cluster_df = bc_network$cluster_res, min_cluster_size = 6, top_n = 3)
bc_network_summary$top_articles

# A tibble: 6 × 5
  vertex            cluster btw_centrality clos_centrality pagerank_centrality
  <chr>               <dbl>          <dbl>           <dbl>               <dbl>
1 alu a, 2025             1          19.1           0.0208              0.0269
2 long y, 2025            1          19.1           0.0208              0.0269
3 sun w, 2025             1          19.1           0.0208              0.0269
4 zhang c, 2025-3         2           3.54          0.0179              0.0369
5 xue w, 2025-1           2           3.54          0.0179              0.0369
6 narayanan p, 2025       2           3.54          0.0179              0.0369

Co-citation between references:

Purpose: Creates a network of references that are frequently cited together.
Meaning: Two references are linked if they are cited together by a third paper.
Use case: Map intellectual foundations or classic works frequently grouped by others.

Note: my computer cannot run this analysis on the complete data set, so only a subset is used (rows 9000 to 10355, i.e., the last 1,356 documents):

cc_matrix <- biblioNetwork(M[9000:nrow(M),], analysis = "co-citation", network = "references", sep = ";")
cc_network <- networkPlot(cc_matrix, n=30, Title="co-citation", type="fruchterman", labelsize=0.7, cluster = "fast_greedy")

rm(cc_matrix) # remove object to free RAM

# to understand what a single cluster is about, it is recommended to read the 2-3 most central articles of that cluster:
cc_network_summary <- get_top_articles_by_cluster(cluster_df = cc_network$cluster_res, min_cluster_size = 6, top_n = 3)
cc_network_summary$top_articles

# A tibble: 6 × 5
  vertex                cluster btw_centrality clos_centrality pagerank_centrality
  <chr>                   <dbl>          <dbl>           <dbl>               <dbl>
1 zolfagharian a 2016-1       1          24.6           0.0238              0.0301
2 kim s 2013-1                1          23.4           0.0238              0.0301
3 majidi c 2014               1          22.6           0.0238              0.0301
4 rus d 2015                  2           7.02          0.0217              0.0360
5 wehner m 2016               2           7.02          0.0217              0.0360
6 ilievski f 2011-1           2           7.02          0.0217              0.0360

6 🌐 Collaboration networks

between countries:

# Country collaboration
M <- metaTagExtraction(M, Field = "AU_CO", sep = ";")
net_cty <- biblioNetwork(M, analysis="collaboration", network="countries", sep=";")
tmp <- networkPlot(net_cty, n=50, Title="Country Collaboration", type="fruchterman", size=TRUE, labelsize=0.7)

## the top 10 countries by betweenness centrality in the collaboration network:
tmp$cluster_res[order(tmp$cluster_res$btw_centrality, decreasing = TRUE),][1:10,]

           vertex cluster btw_centrality clos_centrality pagerank_centrality
5  united kingdom       1      133.03583      0.01515152          0.03021562
1           china       5       78.47044      0.01492537          0.02920869
2             usa       5       77.93218      0.01492537          0.02938908
7         germany       5       77.53385      0.01515152          0.02987237
4           italy       5       61.02593      0.01449275          0.02907677
10      australia       2       60.56359      0.01408451          0.03056287
13         canada       1       50.96416      0.01315789          0.02732532
11    netherlands       5       45.09749      0.01408451          0.02681110
9     switzerland       1       43.87449      0.01315789          0.02616385
29         poland       2       34.86812      0.01369863          0.02281280
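What a country collaboration matrix contains can be shown on a few toy records. This is a minimal sketch in base R: the `au_co` strings are invented, and the construction via an incidence matrix is an illustration of the idea rather than the package's internal code.

```r
# Toy AU_CO field: one string per document, countries separated by ";".
au_co <- c("CHINA;USA", "USA", "CHINA;USA;ITALY", "ITALY;CHINA")

# Each country counts once per document.
country_sets <- lapply(strsplit(au_co, ";", fixed = TRUE), unique)
countries <- sort(unique(unlist(country_sets)))

# Document x country incidence matrix (1 = country appears on the document).
inc <- t(sapply(country_sets, function(x) as.integer(countries %in% x)))
colnames(inc) <- countries

# Country x country matrix: off-diagonal = joint documents, diagonal = documents per country.
collab <- crossprod(inc)
collab
```

Centrality measures such as betweenness are then computed on the graph defined by the off-diagonal entries.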

between authors (research groups):

# Author collaboration
net_aut <- biblioNetwork(M, analysis="collaboration", network="authors", sep=";")
tmp <- networkPlot(net_aut, n=50, Title="Author Collaboration", type="fruchterman", labelsize=0.5, edges.min=3)

## the top 10 authors by betweenness centrality in the collaboration network:
tmp$cluster_res[order(tmp$cluster_res$btw_centrality, decreasing = TRUE),][1:10,]

   vertex cluster btw_centrality clos_centrality pagerank_centrality
12 chen x       2      211.48615     0.015873016          0.02158052
7  chen y       3      159.96875     0.014285714          0.02418322
22 wang l       3       79.81346     0.013333333          0.01596452
20  liu z       2       65.02344     0.012820513          0.01559453
16 yang y       3       64.01719     0.011904762          0.01237944
48 wang m       2       50.42152     0.009259259          0.01521823
37 yang h       2       48.11193     0.012820513          0.01328925
2    li y       1       39.76918     0.011764706          0.04466039
31   li h       3       38.32660     0.012820513          0.01157194
35   wu y       3       31.64626     0.012048193          0.01414629

7 📌 Conceptual structure (keywords co-occurrence)

Remark: the same analysis is also possible for terms extracted from abstracts

net_key <- biblioNetwork(M, analysis="co-occurrences", network="keywords", sep=";")
tmp <- networkPlot(net_key, n=30, normalize="association", weighted=TRUE,
            Title="Keyword Co-occurrence", type="fruchterman", size=TRUE, labelsize=0.7)
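The option normalize="association" rescales raw co-occurrence counts so that very frequent keywords do not dominate the network. A common definition of association strength divides each joint count by the product of the two keyword frequencies. The sketch below applies that definition to an invented matrix; whether it matches the package's implementation exactly is an assumption.

```r
# Toy co-occurrence matrix: diagonal = keyword frequency, off-diagonal = joint count.
kw <- c("design", "actuators", "locomotion")
C <- matrix(
  c(10, 4, 1,
     4, 8, 2,
     1, 2, 5),
  nrow = 3, byrow = TRUE,
  dimnames = list(kw, kw)
)

occ <- diag(C)                  # how often each keyword occurs
assoc <- C / outer(occ, occ)    # association strength: c_ij / (c_i * c_j)
diag(assoc) <- 0                # self-links dropped (a choice made for this sketch)
round(assoc, 3)
```

In this example "actuators" and "locomotion" co-occur only twice, yet their normalized link (2 / 40 = 0.05) is as strong as the link between "design" and "actuators" (4 / 80 = 0.05), because both keywords are rarer.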