Introduction

Literature and evidence review is essential in public health practice, but the volume of published literature is growing exponentially.

The first steps of a review are usually:
- Developing a search strategy
- Reviewing and filtering abstracts
- Obtaining full text (if possible)
- Data extraction

This can be a manual, protracted and iterative process which may involve using specialised searching services, downloading abstracts, reading and filtering, secondary searching and so on, and may mean sifting many thousands of abstracts.

Often we may just want a rapid overview of the literature to help focus further reviewing.

In this vignette we demonstrate the use of R packages for large scale extraction of abstracts, and analytical techniques for identifying topics or themes in the abstracts.

The vignette is based on a number of R packages:

1. europepmc - a sophisticated tool which interacts with the Europe PMC API and provides access to additional fields.
2. adjutant - a fully fledged package with retrieval and clustering functions.
3. tidytext - a package for text mining using tidy data principles.
4. Rtsne - uses the t-SNE algorithm for dimensionality reduction and cluster visualisation.
5. dbscan - applies the HDBSCAN algorithm for data clustering.
6. myScrapers - wraps some functions built on other packages to automate the search, extraction, and filtering process.

We have “hacked” some of the functions in these packages and written additional functions to develop a workflow from searching and retrieval through to analysis.

A simple example using `europepmc`

Searching Europe PubMed Central (epmc)

This is a package which allows searching of EuropePMC via the API.

It can be downloaded from CRAN.


if(!require("europepmc")) install.packages("europepmc")
library(europepmc)

The main function is epmc_search which allows us to search the site and retrieve abstracts, metadata and citation counts.

We’ll use it with the search term “deep learning” AND “public health”.
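
The code in this vignette reads the query and the download limit from `params`, which R Markdown fills in from the document header. To run the chunks outside knitr, a stand-in list can be defined by hand. This is a minimal sketch: the query string follows the search term given above, while the `limit` of 1000 is an assumption rather than a value confirmed by the source.

```r
# Stand-in for the R Markdown `params` object (assumed values, see note above)
params <- list(
  search = '"deep learning" AND "public health"',  # Europe PMC query string
  limit  = 1000                                    # assumed maximum number of records
)

# Check the query before sending it to the API
cat(params$search, "\n")
```

Quoting each phrase keeps the words together in the query, and `AND` requires both phrases to be present.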


head(epmc_search(params$search, limit = 10))
#> # A tibble: 6 x 28
#>   id    source pmid  doi   title authorString journalTitle journalVolume
#>   <chr> <chr>  <chr> <chr> <chr> <chr>        <chr>        <chr>        
#> 1 3143~ MED    3143~ 10.3~ A De~ Zhang S, Po~ Stud Health~ 264          
#> 2 3145~ MED    3145~ 10.1~ Arti~ Patel UK, A~ J Neurol     <NA>         
#> 3 3092~ MED    3092~ 10.6~ [Art~ Lin SH, Che~ Hu Li Za Zhi 66           
#> 4 3116~ MED    3116~ 10.1~ "[Ap~ Uchida M, N~ Sangyo Eise~ <NA>         
#> 5 3118~ MED    3118~ 10.1~ Comp~ Soliman M, ~ Epidemics    28           
#> 6 PPR9~ PPR    <NA>  10.1~ Atro~ Ratul MAR, ~ <NA>         <NA>         
#> # ... with 20 more variables: pubYear <chr>, journalIssn <chr>,
#> #   pageInfo <chr>, pubType <chr>, isOpenAccess <chr>, inEPMC <chr>,
#> #   inPMC <chr>, hasPDF <chr>, hasBook <chr>, citedByCount <int>,
#> #   hasReferences <chr>, hasTextMinedTerms <chr>,
#> #   hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> #   hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> #   firstPublicationDate <chr>, issue <chr>, pmcid <chr>, hasSuppl <chr>

This doesn’t extract the abstract text or MeSH headings (keywords). To facilitate this we have wrapped the search function into get_full_search in myScrapers.

library(myScrapers)  # provides get_full_search()
library(tictoc)      # provides tic() and toc() for timing

tic()
search1 <- get_full_search(search = params$search, limit = params$limit)
toc()
#> 254.51 sec elapsed

head(search1, 20)
#> # A tibble: 20 x 32
#>    id    source pmid  doi   title authorString journalTitle journalVolume
#>    <chr> <chr>  <chr> <chr> <chr> <chr>        <chr>        <chr>        
#>  1 3143~ MED    3143~ 10.3~ A De~ Zhang S, Po~ Stud Health~ 264          
#>  2 3145~ MED    3145~ 10.1~ Arti~ Patel UK, A~ J Neurol     <NA>         
#>  3 3092~ MED    3092~ 10.6~ [Art~ Lin SH, Che~ Hu Li Za Zhi 66           
#>  4 3116~ MED    3116~ 10.1~ "[Ap~ Uchida M, N~ Sangyo Eise~ <NA>         
#>  5 3118~ MED    3118~ 10.1~ Comp~ Soliman M, ~ Epidemics    28           
#>  6 PPR9~ PPR    <NA>  10.1~ Atro~ Ratul MAR, ~ <NA>         <NA>         
#>  7 3114~ MED    3114~ 10.3~ The ~ Cheon S, Ki~ Int J Envir~ 16           
#>  8 3141~ MED    3141~ 10.1~ Sate~ Bruzelius E~ J Am Med In~ 26           
#>  9 3112~ MED    3112~ 10.1~ Deep~ Khalighifar~ J Med Entom~ <NA>         
#> 10 3114~ MED    3114~ 10.2~ Prom~ Balyen L, P~ Asia Pac J ~ 8            
#> 11 3142~ MED    3142~ 10.1~ Auto~ Obeid JS, W~ BMC Med Inf~ 19           
#> 12 3127~ MED    3127~ 10.1~ Mach~ Doupe P, Fa~ Value Health 22           
#> 13 3119~ MED    3119~ 10.1~ Deep~ Graffy PM, ~ Br J Radiol  92           
#> 14 3121~ MED    3121~ 10.3~ Dire~ Qian F, Che~ Int J Envir~ 16           
#> 15 3097~ MED    3097~ 10.1~ Auto~ Graffy PM, ~ Abdom Radio~ <NA>         
#> 16 3097~ MED    3097~ 10.3~ A De~ Lim J, Kim ~ Int J Envir~ 16           
#> 17 3134~ MED    3134~ 10.1~ Erra~ Ruamviboons~ NPJ Digit M~ 2            
#> 18 PPR9~ PPR    <NA>  10.1~ Deve~ Xu J, Xu K,~ <NA>         <NA>         
#> 19 3140~ MED    3140~ 10.1~ Stra~ Wong TY, Sa~ Ophthalmolo~ <NA>         
#> 20 3080~ MED    3080~ 10.1~ Deep~ Lee SM, Seo~ J Thorac Im~ 34           
#> # ... with 24 more variables: pubYear <chr>, journalIssn <chr>,
#> #   pageInfo <chr>, pubType <chr>, isOpenAccess <chr>, inEPMC <chr>,
#> #   inPMC <chr>, hasPDF <chr>, hasBook <chr>, citedByCount <int>,
#> #   hasReferences <chr>, hasTextMinedTerms <chr>,
#> #   hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> #   hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> #   firstPublicationDate <chr>, issue <chr>, pmcid <chr>, hasSuppl <chr>,
#> #   name <int>, absText <list>, mesh <list>, keywords <chr>

We can see that get_full_search returns the same metadata as epmc_search, such as citation counts, whether the article is open access and whether a PDF is available, and adds the abstract text (absText), MeSH headings (mesh) and keywords. By default, 1000 article descriptions are downloaded.

We can see how many articles are available altogether by running epmc_profile.


profile <- epmc_profile(query = params$search)

Running epmc_profile allows us to see that there are 704 articles of which 638 are full text articles, and 489 are open access.

Analysing abstracts

Abstracts per year

We can easily look at annual abstract frequency - we can readily see the growth in publication frequency in the last 3 years.


library(dplyr)    # count(), %>%
library(ggplot2)  # ggplot(), geom_col()

search1 %>%
  count(pubYear) %>%
  ggplot(aes(pubYear, n)) +
  geom_col(fill = "blue") +
  labs(title = "Abstracts per year", 
       subtitle = paste("Search: ", params$search)) +
  phecharts::theme_phe() +
  theme(axis.text.x = element_text(angle = 45 ,hjust = 1))

Journal frequency

Similarly we can identify the most frequent journals


journal_count <- search1 %>%
  count(journalTitle) %>%
  top_n(20, n) %>%   # rank explicitly by the count column
  arrange(-n)

journal_count %>%
  ggplot(aes(reorder(journalTitle, n), n)) +
  geom_col(fill = "blue") +
  coord_flip() +
  labs(title = "Journal frequency") +
  phecharts::theme_phe()

Int J Environ Res Public Health and PLoS One are the most frequent journals publishing articles on “deep learning” AND “public health”.

Topic identification

Once we have a data frame of 704 records with abstract text, we can prepare the data for analysis. The create_corpus function is designed for this.


out1 <- search1 %>%
  select(pmid, pmcid ,doi, title, pubYear, citedByCount, absText, journalTitle) %>%
  filter(absText != "NULL") %>%
  mutate(text = paste(title, absText))

Text mining

We will use a method exemplified in the adjutant package which uses unsupervised machine learning to try and cluster similar articles and attach themes.

In this approach we undertake some natural language processing. We will:

1. Split each abstract into single words
2. Remove numbers and common (stop) words
3. Stem each word, i.e. reduce it to its root form (for example “learning” becomes “learn”)
4. Calculate the tf-idf score for each word in each abstract - this gives more weight to words which are more “typical” of the abstracts
5. Create a document feature matrix
6. Undertake dimensionality reduction using t-SNE to simplify
7. Run HDBSCAN to identify clusters
8. Name the clusters
9. QA the result
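
The first steps can be sketched in base R on a toy example. This is not the code behind create_corpus: the three abstracts and the short stop-word list are invented for illustration, stemming is left out because base R has no stemmer, and tf-idf is computed with the usual definitions (tf as a word's share of its document, idf as the natural log of the number of documents divided by the number containing the word).

```r
# Toy abstracts (invented for illustration)
docs <- c(
  d1 = "Deep learning for retinal image screening",
  d2 = "Deep learning models for air pollution forecasting",
  d3 = "Air pollution and public health"
)

# A tiny stop-word list (assumption; real lists are much longer)
stop_words <- c("for", "and", "the", "of")

# Split into lower-case words, dropping numbers, punctuation and stop words
tokens <- lapply(docs, function(x) {
  words <- unlist(strsplit(tolower(x), "[^a-z]+"))
  words[nzchar(words) & !words %in% stop_words]
})

# One row per document-word pair, then counts per document
long <- data.frame(
  doc  = rep(names(tokens), lengths(tokens)),
  word = unlist(tokens, use.names = FALSE)
)
counts <- as.data.frame(table(long), responseName = "n",
                        stringsAsFactors = FALSE)
counts <- counts[counts$n > 0, ]

# tf-idf for each word in each document
doc_total     <- tapply(counts$n, counts$doc, sum)
doc_freq      <- table(counts$word)   # number of documents containing each word
counts$tf     <- counts$n / as.numeric(doc_total[counts$doc])
counts$idf    <- log(length(docs) / as.numeric(doc_freq[counts$word]))
counts$tf_idf <- counts$tf * counts$idf

# Document feature matrix: one row per abstract, one column per word
dfm <- xtabs(tf_idf ~ doc + word, data = counts)
round(dfm, 3)
```

Words that appear in only one abstract, such as "retinal", receive the highest weights, while words shared across abstracts, such as "deep", are down-weighted.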

The ultimate output of this analysis is a visualisation of clustered and labelled abstracts and an interactive table.


library(tidytext)

corp <- create_corpus(df = search1)

head(corp$corpus)
#> # A tibble: 6 x 6
#>   pmid     word       n      tf   idf tf_idf
#>   <chr>    <chr>  <int>   <dbl> <dbl>  <dbl>
#> 1 10463892 achiev     1 0.00671  1.72 0.0116
#> 2 10463892 admiss     1 0.00671  4.41 0.0296
#> 3 10463892 applic     5 0.0336   1.44 0.0482
#> 4 10463892 assess     1 0.00671  1.52 0.0102
#> 5 10463892 autumn     1 0.00671  6.49 0.0436
#> 6 10463892 bsc        1 0.00671  6.49 0.0436
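
The printed columns can be checked by hand, assuming create_corpus follows the tf-idf definitions used in tidytext (tf as the word's share of the abstract, idf as the natural log of the number of abstracts divided by the number containing the word). The sketch below reproduces the "autumn" row under three assumptions inferred from the output rather than stated in the source: that abstract 10463892 has 149 retained words, that the corpus holds 661 abstracts, and that only one of them contains this stem.

```r
n_in_doc   <- 1    # occurrences of "autumn" in the abstract (column n)
doc_length <- 149  # assumed: retained words in abstract 10463892
n_docs     <- 661  # assumed: abstracts in the corpus
docs_with  <- 1    # assumed: abstracts containing "autumn"

tf     <- n_in_doc / doc_length
idf    <- log(n_docs / docs_with)   # natural logarithm
tf_idf <- tf * idf

signif(c(tf = tf, idf = idf, tf_idf = tf_idf), 3)
```

Under these assumptions the three values round to 0.00671, 6.49 and 0.0436, matching the row shown above.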


clust <- create_cluster(corpus = corp$corpus, minPts = 10)
#> 19.33 sec elapsed


clust$cluster_size
#> # A tibble: 14 x 2
#>    cluster     n
#>      <dbl> <int>
#>  1       0   212
#>  2       1    10
#>  3       2    15
#>  4       3    21
#>  5       4    39
#>  6       5    26
#>  7       6    16
#>  8       7    26
#>  9       8    19
#> 10       9    19
#> 11      10   105
#> 12      11    19
#> 13      12    65
#> 14      13    69
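
In the dbscan package, HDBSCAN assigns the label 0 to points it treats as noise. Assuming create_cluster passes those labels through unchanged, cluster 0 here is the set of unclustered abstracts rather than a topic. A quick base R sketch, using the sizes printed above, shows how much of the corpus was left unclustered:

```r
# Cluster sizes copied from the output above (cluster 0 first)
cluster_size <- data.frame(
  cluster = 0:13,
  n = c(212, 10, 15, 21, 39, 26, 16, 26, 19, 19, 105, 19, 65, 69)
)

total     <- sum(cluster_size$n)                        # all abstracts in the corpus
noise     <- cluster_size$n[cluster_size$cluster == 0]  # assumed noise points
noise_pct <- round(100 * noise / total, 1)

c(total = total, noise = noise, noise_pct = noise_pct)
```

About a third of the 661 abstracts are unassigned in this run. Changing `minPts` alters the balance between the number of clusters and the number of unassigned abstracts.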

Labelling clusters


labels <- label_clusters(corp$corpus, clustering = clust$clustering, top_n = 4)
#> 0.63 sec elapsed

labels$labels
#> # A tibble: 14 x 2
#> # Groups:   cluster [14]
#>    cluster clus_names                                                     
#>      <dbl> <chr>                                                          
#>  1       0 data-learn-base-studi                                          
#>  2       1 pollut-qualiti-network-data                                    
#>  3       2 resist-antibiot-antimicrobi-health                             
#>  4       3 segment-imag-convolut-neural-deep-network-method-perform-result
#>  5       4 genom-identifi-data-studi                                      
#>  6       5 social-health-data-base                                        
#>  7       6 drug-advers-safeti-model-base-studi                            
#>  8       7 clinic-model-learn-method                                      
#>  9       8 breast-cancer-imag-base-studi                                  
#> 10       9 diabet-retinopathi-screen-imag-patient-learn-base              
#> 11      10 model-learn-data-studi                                         
#> 12      11 ai-intellig-artifici-health-data                               
#> 13      12 data-health-research-develop                                   
#> 14      13 student-educ-learn-studi
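
The labels are word stems joined by hyphens. The internals of label_clusters are not shown in this vignette, so the following is only one plausible way to build such labels, not the package's method: sum the tf-idf weight of each stem within a cluster and join the heaviest stems. The data and the helper `label_one` are hypothetical.

```r
# Toy corpus with a cluster column (all values invented for illustration)
toy <- data.frame(
  cluster = c(1, 1, 1, 1, 2, 2, 2),
  word    = c("pollut", "qualiti", "data", "pollut", "breast", "cancer", "imag"),
  tf_idf  = c(0.30, 0.20, 0.05, 0.25, 0.40, 0.35, 0.10)
)

# Total tf-idf weight of each stem within each cluster
weights <- aggregate(tf_idf ~ cluster + word, data = toy, FUN = sum)

# Hypothetical helper: keep the heaviest stems and join them with hyphens
label_one <- function(d, top = 2) {
  d <- d[order(-d$tf_idf), ]
  paste(head(d$word, top), collapse = "-")
}

clus_names <- sapply(split(weights, weights$cluster), label_one)
clus_names
```

Because stems such as "data" or "studi" are common across many abstracts, they can appear in several labels, which is one reason to check labels against the articles in each cluster.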

Visualise


p <- labels$results %>%
  left_join(search1, by = c("pmid.value" = "pmid")) %>%
  ggplot(aes(X1, X2)) +
  geom_point(aes(colour = clustered, size = citedByCount) ) +
  ggrepel::geom_text_repel(data = labels$plot, aes(medX, medY, label = clus_names), size = 3, colour = "#006d2c", alpha = 0.9)

p + scale_alpha_manual(values=c(1,0)) +
  viridis::scale_color_viridis(discrete = TRUE, option = "cividis", alpha = .6) +
  phecharts::theme_phe() +
  theme(panel.background = element_rect(fill = "#f0f0f0")) +
  labs(subtitle = paste("Clustering: ", nrow(labels$plot), " topics" ), 
       title = paste("Search ", "= ", params$search ))

Understanding the labels

Most cited articles


most_cited <- labels$results %>%
  left_join(search1, by = c("pmid.value" = "pmid")) %>%
  filter(cluster !=0) %>%
  group_by(clus_names) %>%
  top_n(n = 3, citedByCount) %>%
  select(clus_names, title, pubYear, citedByCount) %>%
  ungroup() %>%
  arrange(clus_names, -citedByCount)

most_cited %>%
  formattable::formattable()

clus_names	title	pubYear	citedByCount
ai-intellig-artifici-health-data	Artificial intelligence in cancer imaging: Clinical challenges and applications.	2019	4
ai-intellig-artifici-health-data	Global Evolution of Research in Artificial Intelligence in Health and Medicine: A Bibliometric Study.	2019	3
ai-intellig-artifici-health-data	Cognitive computing and eScience in health and life science research: artificial intelligence and obesity intervention programs.	2017	2
breast-cancer-imag-base-studi	Deep learning based tissue analysis predicts outcome in colorectal cancer.	2018	21
breast-cancer-imag-base-studi	Antibody-supervised deep learning for quantification of tumor-infiltrating immune cells in hematoxylin and eosin stained breast cancer samples.	2016	12
breast-cancer-imag-base-studi	Mammographic density and structural features can individually and jointly contribute to breast cancer risk assessment in mammography screening: a case-control study.	2016	7
clinic-model-learn-method	Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications.	2016	11
clinic-model-learn-method	Comparing deep learning and concept extraction based methods for patient phenotyping from clinical narratives.	2018	8
clinic-model-learn-method	EliIE: An open-source information extraction system for clinical trial eligibility criteria.	2017	7
data-health-research-develop	Quality collaboratives: lessons from research.	2002	231
data-health-research-develop	Building better biomarkers: brain models in translational neuroimaging.	2017	72
data-health-research-develop	Making sense of big data in health research: Towards an EU action plan.	2016	44
diabet-retinopathi-screen-imag-patient-learn-base	Improved Automated Detection of Diabetic Retinopathy on a Publicly Available Dataset Through Integration of Deep Learning.	2016	48
diabet-retinopathi-screen-imag-patient-learn-base	Retinal Imaging Techniques for Diabetic Retinopathy Screening.	2016	9
diabet-retinopathi-screen-imag-patient-learn-base	Multi-categorical deep learning neural network to classify retinal images: A pilot study employing small database.	2017	7
drug-advers-safeti-model-base-studi	Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features.	2015	63
drug-advers-safeti-model-base-studi	Drug drug interaction extraction from biomedical literature using syntax convolutional neural network.	2016	17
drug-advers-safeti-model-base-studi	Natural Products for Drug Discovery in the 21st Century: Innovations for Novel Drug Discovery.	2018	13
genom-identifi-data-studi	Comprehensive functional genomic resource and integrative model for the human brain.	2018	12
genom-identifi-data-studi	Pleiotropic Mechanisms Indicated for Sex Differences in Autism.	2016	9
genom-identifi-data-studi	Transcriptome-wide isoform-level dysregulation in ASD, schizophrenia, and bipolar disorder.	2018	9
model-learn-data-studi	Deep learning for neuroimaging: a validation study.	2014	53
model-learn-data-studi	Forecasting influenza in Hong Kong with Google search queries and statistical model fusion.	2017	11
model-learn-data-studi	Predicting clinical outcomes from large scale cancer genomic profiles with deep survival models.	2017	10
pollut-qualiti-network-data	Design of a Mobile Low-Cost Sensor Network Using Urban Buses for Real-Time Ubiquitous Noise Monitoring.	2016	6
pollut-qualiti-network-data	A systematic review of data mining and machine learning for air pollution epidemiology.	2017	6
pollut-qualiti-network-data	Long short-term memory neural network for air pollutant concentration predictions: Method development and evaluation.	2017	5
pollut-qualiti-network-data	Towards Personal Exposures: How Technology Is Changing Air Pollution and Health Research.	2017	5
resist-antibiot-antimicrobi-health	DeepARG: a deep learning approach for predicting antibiotic resistance genes from metagenomic data.	2018	14
resist-antibiot-antimicrobi-health	Developing an in silico minimum inhibitory concentration panel test for Klebsiella pneumoniae.	2018	6
resist-antibiot-antimicrobi-health	Myxinidin2 and myxinidin3 suppress inflammatory responses through STAT3 and MAPKs to promote wound healing.	2017	4
resist-antibiot-antimicrobi-health	Using Machine Learning To Predict Antimicrobial MICs and Associated Genomic Features for Nontyphoidal Salmonella.	2019	4
segment-imag-convolut-neural-deep-network-method-perform-result	Urinary bladder segmentation in CT urography using deep-learning convolutional neural network and level sets.	2016	29
segment-imag-convolut-neural-deep-network-method-perform-result	ISLES 2015 - A public evaluation benchmark for ischemic stroke lesion segmentation from multispectral MRI.	2017	24
segment-imag-convolut-neural-deep-network-method-perform-result	Deep convolutional neural network and 3D deformable approach for tissue segmentation in musculoskeletal magnetic resonance imaging.	2018	12
social-health-data-base	The use of social networking platforms for sexual health promotion: identifying key strategies for successful user engagement.	2015	16
social-health-data-base	Researching Mental Health Disorders in the Era of Social Media: Systematic Review.	2017	14
social-health-data-base	Characterizing the Discussion of Antibiotics in the Twittersphere: What is the Bigger Picture?	2015	13
student-educ-learn-studi	Clinical experience, performance in final examinations, and learning style in medical students: prospective study.	1998	93
student-educ-learn-studi	Intercalated degrees, learning styles, and career preferences: prospective longitudinal study of UK medical students.	1999	66
student-educ-learn-studi	Randomised controlled trial of clinical decision support tools to improve learning of evidence based medicine in medical students.	2003	62

Use of keywords

We can review the most common MeSH headings associated with each cluster tag.


labels$results %>%
  left_join(search1, by = c("pmid.value" = "pmid")) %>%
  select(clus_names, mesh) %>%
  filter(mesh != "NULL") %>%
  unnest(mesh) %>%
  count(clus_names, mesh,sort = TRUE) %>%
  filter(n < 30) %>%
  ungroup() %>%
  group_by(clus_names) %>%
  top_n(10)  %>%
  mutate(summary = paste(mesh, collapse = "; " )) %>%
  select(-c(mesh, n)) %>%
  distinct() %>%
  arrange(clus_names) %>%
  knitr::kable()

clus_names	summary
ai-intellig-artifici-health-data	Artificial Intelligence; Big Data; Public Health
breast-cancer-imag-base-studi	Humans; Breast Neoplasms; Female; Middle Aged; Aged; Breast; Machine Learning; Mammography; Retrospective Studies; Adult; Aged, 80 and over; Algorithms; Breast Density; Deep Learning; Early Detection of Cancer; Image Interpretation, Computer-Assisted; Image Processing, Computer-Assisted; Magnetic Resonance Imaging; Male; Neoplasms; Risk Assessment; ROC Curve; Sensitivity and Specificity; Ultrasonography, Mammary
clinic-model-learn-method	Humans; Algorithms; Electronic Health Records; Natural Language Processing; Machine Learning; Neural Networks (Computer); Datasets as Topic; International Classification of Diseases; Artificial Intelligence; Bayes Theorem; Phenotype
data-health-research-develop	Public Health; Data Mining; Databases, Factual; Delivery of Health Care; Medical Informatics; Artificial Intelligence; Biomedical Research; Electronic Health Records; Machine Learning; Translational Medical Research
data-learn-base-studi	Female; Machine Learning; Male; Algorithms; Deep Learning; Middle Aged; Neural Networks (Computer); Aged; Image Processing, Computer-Assisted; Adult; Tomography, X-Ray Computed
diabet-retinopathi-screen-imag-patient-learn-base	Humans; Diabetic Retinopathy; Female; Male; Aged; Aged, 80 and over; Middle Aged; Retina; Adult; Cross-Sectional Studies; Diagnosis, Computer-Assisted; Diagnostic Techniques, Ophthalmological; Image Processing, Computer-Assisted; Neural Networks (Computer); Reproducibility of Results; ROC Curve; Young Adult
drug-advers-safeti-model-base-studi	Humans; Artificial Intelligence; Data Mining; Drug-Related Side Effects and Adverse Reactions; Neural Networks (Computer); Social Media; Area Under Curve; Automation, Laboratory; Back Pain; Biological Products; Computational Biology; Computer Simulation; Databases as Topic; Deep Learning; Drug Design; Drug Discovery; Drug Industry; Drug Interactions; Information Storage and Retrieval; Models, Chemical; Models, Theoretical; Natural Language Processing; Necrosis; Pharmacovigilance; Phytotherapy; Plants, Medicinal; Programming Languages; Publications; Robotics; Semantics; Software; Supervised Machine Learning
genom-identifi-data-studi	Humans; Genome-Wide Association Study; Computational Biology; Genetic Predisposition to Disease; Algorithms; Databases, Genetic; Deep Learning; Female; Genome, Human; Genomics; Polymorphism, Single Nucleotide
model-learn-data-studi	Machine Learning; Female; Neural Networks (Computer); Male; Algorithms; Deep Learning; Adult; Middle Aged; Aged; China; Prognosis
pollut-qualiti-network-data	Air Pollution; Air Pollutants; Environmental Monitoring; Humans; Neural Networks (Computer); Cities; Forecasting; Algorithms; Automation; Beijing; Data Mining; Deep Learning; Electroencephalography; Electrooculography; Environmental Exposure; Epidemiologic Studies; Hong Kong; Inventions; Machine Learning; Models, Statistical; Models, Theoretical; Particulate Matter; Polysomnography; Sleep Stages; Sleep Wake Disorders; Smartphone
resist-antibiot-antimicrobi-health	Humans; Anti-Bacterial Agents; Drug Resistance, Multiple, Bacterial; Machine Learning; Microbial Sensitivity Tests; Antimicrobial Cationic Peptides; Biofilms; Cell Membrane; DNA, Bacterial; Genome, Bacterial; High-Throughput Nucleotide Sequencing; Lipopolysaccharides; Sequence Analysis, DNA; Whole Genome Sequencing
segment-imag-convolut-neural-deep-network-method-perform-result	Humans; Female; Male; Algorithms; Image Processing, Computer-Assisted; Middle Aged; Neural Networks (Computer); Adult; Magnetic Resonance Imaging; Aged; Deep Learning; Image Interpretation, Computer-Assisted; Young Adult
social-health-data-base	Humans; Social Media; Neural Networks (Computer); Machine Learning; Adolescent; Adult; Algorithms; Analgesics, Opioid; Deep Learning; Female; Internet; Male; Middle Aged; Public Opinion; Young Adult
student-educ-learn-studi	Female; Male; Curriculum; Students, Medical; Educational Measurement; Learning; Adult; Education, Medical, Undergraduate; Problem-Based Learning; Young Adult

Investigating individual themes to identify full-text articles

Let’s explore articles for which Public Health is a MeSH heading.


ph <- labels$results %>%
  left_join(search1, by = c("pmid.value" = "pmid")) %>%
  filter(str_detect(keywords, "Public Health"))

ph %>%
  count(clus_names, sort = TRUE)
#> # A tibble: 4 x 2
#>   clus_names                           n
#>   <chr>                            <int>
#> 1 data-health-research-develop         7
#> 2 model-learn-data-studi               2
#> 3 student-educ-learn-studi             2
#> 4 ai-intellig-artifici-health-data     1

There is one article tagged with ai-intellig-artifici-health-data which has Public Health as a MeSH heading. We can use epmc_ftxt to extract the full text of an article.

library(rvest)



get_pmcids <- ph %>%
  filter(clus_names == "data-health-research-develop") %>%  # label as printed in labels$labels
  select(id, pmcid) %>%
  filter(!is.na(pmcid))


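# `ids` was undefined in the original; assumed here to be the Europe PMC ids selected above
ids <- enframe(get_pmcids$id)  # columns: name, value
details <- mutate(ids, details = map(value, epmc_details))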

full_text <- details %>%
    mutate(full_text = map(details, "ftx")) %>%
    unnest(full_text) %>%
  filter(availability == "Free") %>%
  left_join(get_pmcids, by = c("value" = "id")) %>%
  distinct()


full_text <- europepmc::epmc_ftxt("PMC5171550")

ft <- full_text %>%
  html_text()

ft %>%
  str_split(., "\\. ") %>%
  enframe() %>%
  formattable::formattable()
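
The split pattern `"\\. "` breaks the text wherever a full stop is followed by a space, which only approximates sentence boundaries. A base R sketch of the same idea on an invented string shows the behaviour, including that the full stops are consumed and that abbreviations are split as well:

```r
# Invented text for illustration
txt <- "Deep learning is widely used. See e.g. image screening. It is evolving."

# Same pattern as above: a literal full stop followed by a space
sentences <- strsplit(txt, "\\. ")[[1]]
sentences
```

The abbreviation "e.g." is cut in two, so the resulting table is best read as rough text fragments rather than exact sentences.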

Finally we can gather all the abstracts into a single interactive table which can be searched, filtered and shared.


labels$results %>%
  left_join(search1, by = c("pmid.value" = "pmid"))  %>%
  select(cluster, clus_names, doi, title, journalTitle, pubYear, citedByCount, absText) %>%
  mutate(doi = paste0("<a href='https://doi.org/", doi, "'>doi</a>")) %>%  # link via the DOI resolver
  DT::datatable(escape = FALSE, extensions = c('Responsive','Buttons', 'FixedHeader'), 
                filter = "top", 
  options = list(
    autoWidth = TRUE,
    columnDefs = list(list(width = '450px')),
    dom = 'Bfrtip',
    buttons = c('csv', 'excel'),
    fixedHeader=TRUE) 
  )