The sciences are often divided into two distinct subgroups: the social sciences and the natural sciences. The social sciences are usually thought of as more involved with qualitative research, social topics, and policy-related endeavors. The natural sciences, on the other hand, usually deal with more quantitative questions: mathematical relationships, natural phenomena such as those studied in physics, and mathematical processes and modeling.
These two subgroups of science tend to create divisions between researchers. To help bridge the gap between the two, we use the aRxiv package along with text analytics techniques to analyze word frequency in the abstracts of 1000 social science and 1000 natural science papers. We divide each subgroup into two specific fields (500 abstracts each): Math and Physics for the natural sciences, Sociology and Economics for the social sciences.
By looking at word usage through word clouds, we hope to show that research in the natural and social sciences is more alike than different.
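# Packages: aRxiv (search), tm (cleaning), wordcloud + RColorBrewer (plots), magrittr (%>%)
library(aRxiv)
library(tm)
library(wordcloud)
library(RColorBrewer)
library(magrittr)
# Pull up to 500 arXiv records matching "Math" and build a corpus from their abstracts
math <- arxiv_search(query = '"Math"', limit = 500)
mathCorpus <- with(math, VCorpus(VectorSource(abstract))) %>%
  tm_map(removeNumbers) %>%
  tm_map(removePunctuation) %>%
  tm_map(content_transformer(tolower)) %>%
  tm_map(removeWords, stopwords("english")) %>%
  tm_map(stripWhitespace)  # last, so gaps left by removed words are collapsed
# Plot the 50 most frequent words
wordcloud(mathCorpus, max.words = 50, scale = c(8, 1),
          colors = brewer.pal(3, "Set1"), random.order = FALSE)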
# Same steps for Physics: search, clean the abstracts, plot the 50 most frequent words
physics <- arxiv_search(query = '"Physics"', limit = 500)
physicsCorpus <- with(physics, VCorpus(VectorSource(abstract))) %>%
  tm_map(removeNumbers) %>%
  tm_map(removePunctuation) %>%
  tm_map(content_transformer(tolower)) %>%
  tm_map(removeWords, stopwords("english")) %>%
  tm_map(stripWhitespace)  # last, so gaps left by removed words are collapsed
wordcloud(physicsCorpus, max.words = 50, scale = c(8, 1),
          colors = brewer.pal(3, "Dark2"), random.order = FALSE)
# Same steps for Sociology: search, clean the abstracts, plot the 50 most frequent words
sociology <- arxiv_search(query = '"Sociology"', limit = 500)
sociologyCorpus <- with(sociology, VCorpus(VectorSource(abstract))) %>%
  tm_map(removeNumbers) %>%
  tm_map(removePunctuation) %>%
  tm_map(content_transformer(tolower)) %>%
  tm_map(removeWords, stopwords("english")) %>%
  tm_map(stripWhitespace)  # last, so gaps left by removed words are collapsed
wordcloud(sociologyCorpus, max.words = 50, scale = c(8, 1),
          colors = brewer.pal(3, "Accent"), random.order = FALSE)
# Same steps for Economics: search, clean the abstracts, plot the 50 most frequent words
economics <- arxiv_search(query = '"Economics"', limit = 500)
economicsCorpus <- with(economics, VCorpus(VectorSource(abstract))) %>%
  tm_map(removeNumbers) %>%
  tm_map(removePunctuation) %>%
  tm_map(content_transformer(tolower)) %>%
  tm_map(removeWords, stopwords("english")) %>%
  tm_map(stripWhitespace)  # last, so gaps left by removed words are collapsed
wordcloud(economicsCorpus, max.words = 50, scale = c(8, 1),
          colors = brewer.pal(3, "Set2"), random.order = FALSE)
Within each discipline, the most frequently used words are clearly the ones most relevant to it and, in my opinion, the ones most closely associated with it.
Math abstracts use "algebra" and "group" heavily. This likely reflects how much current math research involves algebra and group theory, both of which are very important in encryption and computing.
In physics, we see words like "string", "theory", and "quantum", as physics involves many theories about quantum physics, string theory, and other concentrations.
Sociology, not surprisingly, has "social" and "network" as its two most frequently used words. Sociology studies social questions and often uses network analysis to do so.
Lastly, economics has "economic" and "model" as its two most frequently used words. This is probably because of the many economic concentrations that are cited, along with the large number of modeling techniques used in concentrations like econometrics.
Through a broader lens, we see clear word usage patterns that show similarities among concentrations and between the natural and social sciences.
Words like "theories", "models", and "equations" appear across all four disciplines. Research across the sciences deals with the same fundamentals when it comes down to the nitty-gritty of it all. Whether it's sociology or math, quantitative or qualitative, scientists employ the same underlying methods to conduct research. It is about building upon previous research and using theories and equations to test hypotheses developed through analysis and review of that research.
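One way to check this claim beyond eyeballing the word clouds is to intersect the most frequent words of each field. The following is a minimal sketch in base R, not the tm pipeline used above: `top_words`, its `drop` argument, and the toy abstracts are all made up for illustration, and the short stopword list only stands in for `stopwords("english")`.

```r
# Hypothetical helper: the n most frequent words in a character vector of abstracts.
# Lower-cases the text, replaces anything that is not a letter with a space,
# splits on spaces, and drops the words listed in `drop`.
top_words <- function(abstracts, n = 50, drop = character(0)) {
  text  <- tolower(paste(abstracts, collapse = " "))
  text  <- gsub("[^a-z]+", " ", text)
  words <- strsplit(trimws(text), " +")[[1]]
  words <- words[!words %in% drop]
  freq  <- sort(table(words), decreasing = TRUE)
  names(freq)[seq_len(min(n, length(freq)))]
}

# Toy abstracts, invented for this example (not real arXiv data).
toy <- list(
  math      = c("We study a group model and a theory of algebra.",
                "The model extends group theory."),
  physics   = c("A quantum theory of strings.",
                "This string model tests the theory."),
  sociology = c("A social network model.",
                "Network theory of social ties."),
  economics = c("An economic model of markets.",
                "Economic theory and the model.")
)

# Tiny illustrative stopword list.
stop_list <- c("a", "an", "and", "of", "the", "we", "this")

# Words that appear among the top words of every field.
shared <- Reduce(intersect, lapply(toy, top_words, n = 50, drop = stop_list))
print(shared)
```

On the real data, the same intersection could be run on the `abstract` columns of the four search results, with a fuller stopword list in place of `stop_list`.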
In my opinion, we have lost sight of this basic fact: science is united, no matter what discipline or field you fall into.