Identify themes in the public policy literature related to FAPESP funding, using topic modeling and tokenization (LDA, via the topicmodels package).

* Data obtained from [Overton](https://www.overton.io/)
## [1] "Language" "cited_topics" "cited_classification"
## [4] "Cited_url" "Title_menc" "Cited_doi_menc"
## [7] "languages_cited_menc" "Source_ID_menc" "Source_title_menc"
## [10] "Source_country_menc" "Source_type_menc" "Source_region_menc"
## [13] "Published_on_menc" "Document_URL_menc" "Document_type_menc"
## [16] "snippet_menc"
##
## en pt
## 3041 1
The following were identified:

* 1083 duplicate titles;
* 1 title in Portuguese (pt).

Final file: 2060 observations and 16 variables.
## Rows: 2,060
## Columns: 16
## $ Language <chr> "en", "en", "en", "en", "en", "en", "en", "en", "…
## $ cited_topics <chr> "Food | Sugar | Hepatitis E", "Cost of living | T…
## $ cited_classification <chr> "health | health>diseases and conditions | lifest…
## $ Cited_url <chr> "https://www.overton.io/document.php?policy_docum…
## $ Title_menc <chr> "seismo info 05 / 2023", "tilapia health: quo vad…
## $ Cited_doi_menc <chr> "10.1007/s00003-022-01412-x", "10.1111/tbed.14295…
## $ languages_cited_menc <chr> "fre", "eng", "eng", "eng", "por", "eng", "eng", …
## $ Source_ID_menc <chr> "adminch", "fao", "europa", "efsaeu", "iucn", "go…
## $ Source_title_menc <chr> "Government of Switzerland", "Food and Agricultur…
## $ Source_country_menc <chr> "Switzerland", "IGO", "EU", "EU", "France", "Sing…
## $ Source_type_menc <chr> "government", "igo", "government", "government", …
## $ Source_region_menc <chr> "Europe", "International Organizations", "Europe"…
## $ Published_on_menc <chr> "2023-06-30", "2023-02-14", "2022-08-16", "2021-1…
## $ Document_URL_menc <chr> "https://www.blv.admin.ch/dam/blv/fr/dokumente/le…
## $ Document_type_menc <chr> "Publication", "Publication", "Publication", "Pub…
## $ snippet_menc <chr> NA, NA, "The Amazon forest is the largest tropica…
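The cleaning step summarized above (dropping the non-English record and the duplicate titles) could look roughly like the sketch below. It is an illustration in base R on hypothetical toy data. Only the column names `Language` and `Title_menc` come from the report. The title normalization and the keep-first rule are assumptions, not the confirmed procedure.

```r
# Hypothetical toy data that mimics two of the report's columns.
docs <- data.frame(
  Language   = c("en", "en", "pt", "en"),
  Title_menc = c("Tilapia Health", "tilapia health ",
                 "impactos ambientais", "annual report"),
  stringsAsFactors = FALSE
)

# Assumption: titles are compared after lower-casing and trimming spaces.
docs$Title_menc <- tolower(trimws(docs$Title_menc))

# Keep English records only, then keep the first occurrence of each title.
docs_en    <- docs[docs$Language == "en", ]
docs_clean <- docs_en[!duplicated(docs_en$Title_menc), ]

n_removed <- nrow(docs) - nrow(docs_clean)
print(docs_clean)
```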
The following variables of interest were defined:
##
## 2000 2010 2020
## 15 1019 1009
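A table by decade like the one above can be derived from the publication date. The sketch below assumes that `Published_on_menc` holds ISO dates (as the data preview suggests) and that `ano` and `decada` are computed by truncating the year. The sample dates are hypothetical.

```r
# Hypothetical sample of values in the Published_on_menc column.
published <- c("2023-06-30", "2023-02-14", "2009-11-05", "2015-01-20")

# Extract the year, then truncate it to the start of its decade.
ano    <- as.integer(format(as.Date(published), "%Y"))
decada <- (ano %/% 10L) * 10L

# Count documents per decade.
print(table(decada))
```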
## # A tibble: 6 × 2
## token n
## <chr> <int>
## 1 medical specialties 138
## 2 clinical medicine 137
## 3 health sciences 112
## 4 <NA> 100
## 5 health care 83
## 6 natural environment 78
## # A tibble: 20 × 2
## token n
## <chr> <int>
## 1 health 424
## 2 medicine 295
## 3 sciences 212
## 4 medical 168
## 5 clinical 151
## 6 specialties 138
## 7 natural 101
## 8 na 100
## 9 care 97
## 10 environment 87
## 11 biology 85
## 12 food 85
## 13 science 84
## 14 human 78
## 15 economy 74
## 16 chemistry 71
## 17 climate 68
## 18 physical 67
## 19 nature 65
## 20 de 61
## # A tibble: 20 × 2
## token n
## <chr> <int>
## 1 medical specialties 138
## 2 clinical medicine 137
## 3 health sciences 112
## 4 <NA> 100
## 5 health care 83
## 6 natural environment 78
## 7 medicine health 57
## 8 human activities 53
## 9 climate change 49
## 10 earth sciences 47
## 11 controlled trial 33
## 12 randomized controlled 33
## 13 systematic review 32
## 14 diseases disorders 30
## 15 physical sciences 28
## 16 branches science 27
## 17 specialties health 27
## 18 medicine medical 26
## 19 sciences health 25
## 20 sustainable development 25
## # A tibble: 20 × 2
## token n
## <chr> <int>
## 1 <NA> 100
## 2 randomized controlled trial 33
## 3 clinical medicine health 27
## 4 medical specialties health 27
## 5 health sciences health 24
## 6 medicine medical specialties 24
## 7 clinical medicine medicine 23
## 8 medical specialties clinical 22
## 9 specialties clinical medicine 22
## 10 agence nationale de 18
## 11 clinical medicine medical 18
## 12 de lalimentation de 18
## 13 de lenvironnement et 18
## 14 de sécurité sanitaire 18
## 15 et du travail 18
## 16 health medical specialties 18
## 17 lalimentation de lenvironnement 18
## 18 lenvironnement et du 18
## 19 nationale de sécurité 18
## 20 sanitaire de lalimentation 18
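The unigram, bigram and trigram counts above come from tokenizing the text into runs of `n` adjacent words and counting each run. The sketch below shows the idea in base R. The function `ngram_counts` and the sample texts are hypothetical, and the sketch leaves out stop-word removal, which the report's pipeline appears to apply.

```r
# Count n-grams (runs of n adjacent words) across a character vector.
ngram_counts <- function(texts, n = 2) {
  grams <- unlist(lapply(texts, function(txt) {
    # Lower-case and split on anything that is not a letter or digit.
    words <- strsplit(tolower(txt), "[^[:alnum:]]+")[[1]]
    words <- words[nzchar(words)]
    if (length(words) < n) return(character(0))
    starts <- seq_len(length(words) - n + 1)
    vapply(starts,
           function(i) paste(words[i:(i + n - 1)], collapse = " "),
           character(1))
  }))
  counts <- sort(table(grams), decreasing = TRUE)
  data.frame(token = names(counts), n = as.integer(counts),
             stringsAsFactors = FALSE)
}

# Hypothetical input resembling the topic labels in the data.
texts <- c("Medical specialties | Clinical medicine",
           "clinical medicine and health care")

bigram_tbl <- ngram_counts(texts, n = 2)
print(head(bigram_tbl))
```

Calling `ngram_counts(texts, n = 1)` or `n = 3` gives the unigram and trigram variants of the same table.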
Publication types were defined based on the synonyms listed in [Thesaurus](https://www.thesaurus.com/browse/Proposition).
| termo_preferido | variacoes |
|---|---|
| Report | Report, Statement, Description, summary, Record |
| Plan | Plan, arrangement, deal, idea, intention, method, policy, procedure, program, project, proposal, strategy, system |
| Survey | Survey, Poll, Questionnaire, analysis, audit, check, inquiry, inspection, sample |
| Assessment | Assessment, Appraisal, Evaluation, Estimation, estimate, Analysis, Valuation |
| Review | Review, Examination, Critique, revision |
| Overview | Overview, Summary, Synopsis, Rundown, Outline, Recap, sketch |
| Study | Study, Exploration, Research |
| Guides | Guides, Handbook, Manual, Directory, Tutorial, Primer |
| Guidelines | Guidelines, Principles, Standards, Protocols, Criteria, Rules, advisement, assignment |
| Briefing | Briefing, Synopsis, Rundown, Summary, Debrief, Recap |
| Summary | Summary, Abstract, Recapitulation, Digest |
| Policy | Policy, Rule, Regulation, Principle, Procedure, policies |
| Proposition | Proposition, Suggestion, Proposal, Submission, Offer, Motion, hypothesis, invitation, motion, premise |
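One way to apply a dictionary like this is to search each title for any variation and return the matching preferred term. The sketch below is only an illustration. The function `find_preferred_term`, the two-row dictionary excerpt and the first-match-wins rule are assumptions, and the report does not show how the `termo_encontrado` column is actually filled.

```r
# Two-row excerpt of the dictionary, using the table's column names.
dicionario <- data.frame(
  termo_preferido = c("Report", "Guidelines"),
  variacoes = c("Report, Statement, Description, summary, Record",
                "Guidelines, Principles, Standards, Protocols, Criteria, Rules"),
  stringsAsFactors = FALSE
)

# Return the first preferred term whose variations occur in the title
# as whole words (case-insensitive), or NA when nothing matches.
find_preferred_term <- function(title, dict) {
  title <- tolower(title)
  for (i in seq_len(nrow(dict))) {
    variants <- tolower(trimws(strsplit(dict$variacoes[i], ",")[[1]]))
    pattern  <- paste0("\\b(", paste(variants, collapse = "|"), ")\\b")
    if (grepl(pattern, title, perl = TRUE)) return(dict$termo_preferido[i])
  }
  NA_character_
}

print(find_preferred_term("IUCN guidelines for reintroductions", dicionario))
print(find_preferred_term("tilapia health: quo vadis?", dicionario))
```

Because several preferred terms share variations (for example, Summary appears under Report, Overview and Briefing), the order of the dictionary rows decides which term wins under this rule.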
## # A tibble: 6 × 21
## Language cited_topics cited_classification Cited_url Title_menc Cited_doi_menc
## <chr> <chr> <chr> <chr> <chr> <chr>
## 1 en Food | Suga… health | health>dis… https://… seismo in… 10.1007/s0000…
## 2 en Cost of liv… economy, business a… https://… tilapia h… 10.1111/tbed.…
## 3 en Nature | Ph… environment | envir… https://… deforesta… 10.1016/j.sci…
## 4 en Botany | Pl… science and technol… https://… plant hea… 10.1007/s1374…
## 5 en Mariana dam… environment>nature … https://… impactos … 10.1016/j.jha…
## 6 en CapitaLand … environment>nature … https://… annual re… 10.1007/s4277…
## # ℹ 15 more variables: languages_cited_menc <chr>, Source_ID_menc <chr>,
## # Source_title_menc <chr>, Source_country_menc <chr>, Source_type_menc <chr>,
## # Source_region_menc <chr>, Published_on_menc <chr>, Document_URL_menc <chr>,
## # Document_type_menc <chr>, snippet_menc <chr>, topicos <chr>, ano <dbl>,
## # decada <dbl>, tit_resumo <chr>, termo_encontrado <chr>
In this step, 15 topics are defined and a visualization is produced showing the 10 most frequent tokens in each topic.
## <<DocumentTermMatrix (documents: 1960, terms: 2668)>>
## Non-/sparse entries: 9778/5219502
## Sparsity : 100%
## Maximal term length: 30
## Weighting : term frequency (tf)
##
## 1 2 3 4 5 6
## 444 389 346 273 258 250
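A minimal sketch of fitting an LDA model with topicmodels is shown below, assuming the document-term matrix is built with tm. The toy texts, the choice of `k = 2` (the report states 15 topics for the real data) and the seed value are all assumptions made so that the example runs quickly. The report's own preprocessing is not shown and may differ.

```r
library(tm)
library(topicmodels)

# Hypothetical toy corpus; every document must contain at least one term.
texts <- c("climate change and sustainable development",
           "randomized controlled trial in clinical medicine",
           "deforestation and climate change in the natural environment",
           "systematic review of clinical medicine and health care")

# Build a term-frequency document-term matrix.
corpus <- VCorpus(VectorSource(texts))
dtm    <- DocumentTermMatrix(corpus)

# Fit LDA; k is the number of topics, the seed makes the run reproducible.
lda_model <- LDA(dtm, k = 2, control = list(seed = 1234))

# Top 3 terms per topic, and number of documents assigned to each topic.
print(terms(lda_model, 3))
print(table(topics(lda_model)))
```

`topics()` returns the most likely topic for each document, which is the kind of assignment tabulated above.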
# Word cloud by topic
The main bigrams are distributed across topics in order to make the team's analysis easier.
## # A tibble: 12 × 22
## Language cited_topics cited_classification Cited_url Title_menc
## <chr> <chr> <chr> <chr> <chr>
## 1 en Psychological trauma | Po… education | science… https://… cie0057 -…
## 2 en Biology | Branches of gen… science and technol… https://… adequacy …
## 3 en Human anatomy | Medical s… health>diseases and… https://… american …
## 4 en Fibromyalgia | Placebo-co… science and technol… https://… behandlin…
## 5 en Economy | Human activitie… environment | scien… https://… carbon co…
## 6 en Nature-based solutions | … environment | envir… https://… addressin…
## 7 en Medicine | Diseases and d… health>diseases and… https://… diagnosti…
## 8 en Women's empowerment | Soc… lifestyle and leisu… https://… gender eq…
## 9 en Digital elevation model |… economy, business a… https://… potential…
## 10 en Species reintroduction | … science and technol… https://… iucn guid…
## 11 en Soil | Panicum virgatum |… economy, business a… https://… assessing…
## 12 en Sodium | Tropical climate… economy, business a… https://… geographi…
## # ℹ 17 more variables: Cited_doi_menc <chr>, languages_cited_menc <chr>,
## # Source_ID_menc <chr>, Source_title_menc <chr>, Source_country_menc <chr>,
## # Source_type_menc <chr>, Source_region_menc <chr>, Published_on_menc <chr>,
## # Document_URL_menc <chr>, Document_type_menc <chr>, snippet_menc <chr>,
## # topicos <chr>, ano <dbl>, decada <dbl>, tit_resumo <chr>,
## # termo_encontrado <chr>, topico_modelagem <int>
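Listing the main bigrams per topic amounts to grouping the texts by their assigned topic and counting word pairs inside each group. The base R sketch below illustrates this on hypothetical rows. The column names `topico_modelagem` and `tit_resumo` come from the output above, while the helper `bigrams` and the cutoff of five bigrams per topic are assumptions.

```r
# Hypothetical rows with an assigned topic and a text field.
docs <- data.frame(
  topico_modelagem = c(1L, 1L, 2L),
  tit_resumo = c("climate change policy", "climate change report",
                 "health care review"),
  stringsAsFactors = FALSE
)

# Split one text into adjacent word pairs.
bigrams <- function(txt) {
  w <- strsplit(tolower(txt), "[^[:alnum:]]+")[[1]]
  w <- w[nzchar(w)]
  if (length(w) < 2) return(character(0))
  paste(head(w, -1), tail(w, -1))
}

# For each topic, count bigrams and keep the five most frequent.
top_bigrams <- lapply(
  split(docs$tit_resumo, docs$topico_modelagem),
  function(texts) {
    counts <- sort(table(unlist(lapply(texts, bigrams))), decreasing = TRUE)
    head(counts, 5)
  }
)

print(top_bigrams)
```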