Dashboard panels: Comparison Wordcloud, Bar Graph, Grouped Bar Graph, Deviation Bar Graph, Dotchart, Deviation Dot Chart

About this Flexdashboard

For this flexdashboard I used library(ggpubr) together with library(wordcloud) to visualize words from a Danish corpus.

Text and Lexicon used

The Danish corpus is dan_mixed_2014_30K-sentences.txt, taken from the Danish language section (Mixed, 2014, 30K) of the Leipzig Corpora Collection download page at http://wortschatz.uni-leipzig.de/en/download/. Citation: D. Goldhahn, T. Eckart & U. Quasthoff: Building Large Monolingual Dictionaries at the Leipzig Corpora Collection: From 100 to 200 Languages. In: Proceedings of the 8th International Language Resources and Evaluation (LREC’12), 2012. The paper can be found at http://www.lrec-conf.org/proceedings/lrec2012/pdf/327_Paper.pdf .

I used the corpus “as is”, apart from standard text cleaning: removing punctuation, numbers, symbols, extra white space, etc.
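As an illustration of that kind of cleaning, here is a minimal base R sketch. It is not the code behind this dashboard: the helper name clean_text and the example sentences are made up for illustration, and the exact steps and their order are assumptions.

```r
# Hypothetical helper: a guess at the cleaning steps, not the dashboard's actual code.
clean_text <- function(x) {
  x <- gsub("[[:punct:]]+", " ", x)  # replace punctuation with a space
  x <- gsub("[[:digit:]]+", " ", x)  # replace numbers with a space
  x <- gsub("[[:space:]]+", " ", x)  # collapse runs of white space
  trimws(x)                          # drop leading/trailing white space
}

# Made-up example sentences, not taken from the corpus.
sentences <- c("Det koster 25 kr.,  ikke sandt?", "Vi ses  i morgen!")
cleaned <- clean_text(sentences)
print(cleaned)
```

Replacing with a space rather than deleting keeps words apart when they were separated only by punctuation. Whether [[:punct:]] also catches symbols outside plain ASCII may depend on the locale, so a real pipeline might need an extra pattern for those.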

The lexicon is AFINN-da-32.txt, which I found at https://github.com/fnielsen/afinn/blob/master/afinn/data/AFINN-da-32.txt. Citation: Finn Årup Nielsen, “A new ANEW: evaluation of a word list for sentiment analysis in microblogs”, Proceedings of the ESWC2011 Workshop on ‘Making Sense of Microposts’: Big things come in small packages. Volume 718 in CEUR Workshop Proceedings: 93-98. 2011 May. Matthew Rowe, Milan Stankovic, Aba-Sah Dadzie, Mariann Hardey (editors). The paper can be found at http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/6006/pdf/imm6006.pdf .
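A lexicon like this can be attached to word counts with a simple join. The base R sketch below assumes the file is a two-column, tab-separated list of word and score. The tiny lexicon and the token string are illustrative stand-ins rather than real AFINN entries or corpus text, and the join is one plausible approach, not necessarily the one used for this dashboard.

```r
# Illustrative stand-in for the lexicon: same assumed layout (word <TAB> score),
# but the entries and scores are made up, not copied from AFINN-da-32.txt.
lexicon_text <- "god\t3\nglad\t3\ntrist\t-2"
afinn <- read.delim(text = lexicon_text, header = FALSE,
                    col.names = c("word", "score"),
                    stringsAsFactors = FALSE)
# With the real file this would be something like:
# afinn <- read.delim("AFINN-da-32.txt", header = FALSE,
#                     col.names = c("word", "score"), encoding = "UTF-8")

# Made-up cleaned text, split into words on single spaces.
tokens <- strsplit("en god dag men en trist aften", " ", fixed = TRUE)[[1]]

# Count each word, then keep only the words that appear in the lexicon.
freq <- as.data.frame(table(word = tokens), stringsAsFactors = FALSE)
scored <- merge(freq, afinn, by = "word")
print(scored)
```

The resulting data frame has one row per matched word with its frequency (Freq) and sentiment score, which is the shape that bar graphs and comparison wordclouds of positive versus negative words typically start from.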

References

If you are interested in Danish-language text analysis resources, see the PDF “Danish resources” by Finn Årup Nielsen, available at http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/6956/pdf/imm6956.pdf .

Libraries used in the analysis:

Websites and Books: