Data Science for Health Management

RP x UNISBA VSOT 6 to 8 September 2022

Theme : Jaga My Mental Health, Developing Initiatives for Us

Setting Up Your R Studio Cloud Account

Please follow the steps outlined below to set up your R Studio Cloud Account, which you will use for hands-on data science practice during this workshop.

Congratulations, your R Studio Cloud Account is now ready !

The R Studio IDE Workspace Components

The R Studio IDE Workspace consists of the following key components

Data Types and Assignment of Variables in R
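
As a minimal sketch of this topic, values are assigned with the <- operator and class() reports the resulting data type. The variable names below are illustrative only and are not part of the workshop data.

```r
# Illustrative variables only; the names are assumptions, not survey fields
team_number <- 3L        # integer (the L suffix forces an integer)
grit_score  <- 3.125     # numeric (double)
institution <- "RP"      # character
completed   <- TRUE      # logical

class(team_number)   # "integer"
class(grit_score)    # "numeric"
class(institution)   # "character"
class(completed)     # "logical"
```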

Creating and Managing Vectors in R
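
A short sketch of creating and managing a vector with c(), assuming a few illustrative scores. The element labels are hypothetical.

```r
# Illustrative scores; the labels a, b, c, d are hypothetical
scores <- c(a = 3.125, b = 3.375, c = 2.875)

length(scores)        # number of elements
scores[2]             # select the second element by position
scores["c"]           # select an element by name
scores[scores > 3]    # keep only the elements greater than 3

scores <- c(scores, d = 3.5)   # append a new element
mean(scores)                   # average of all four values
```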

Loading the Packages Required in R for This Lesson

This code installs pacman, a package management tool that we will use to install and load all the packages required in this lesson.

install.packages("pacman",repos = "http://cran.us.r-project.org")
## package 'pacman' successfully unpacked and MD5 sums checked
## 
## The downloaded binary packages are in
##  C:\Users\aaron_chen_angus\AppData\Local\Temp\Rtmp48Hv4b\downloaded_packages

We will then proceed to load the packages required for this section

pacman::p_load(pacman, psych, rio, tidyverse, ggplot2, ggridges, devtools, vioplot, dplyr, ggwordcloud, ggforce, gridExtra, grid, wordcloud2, tm, SnowballC, wordcloud, RColorBrewer, kableExtra)
## Error in get(genname, envir = envir) : object 'testthat_print' not found
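
If p_load() stops with an error like the one above, it can help to check which of the requested packages are actually installed. The sketch below is one possible base R check using system.file(). It is an assumption about how you might troubleshoot, not part of pacman itself.

```r
# Package names copied from the p_load() call above
pkgs <- c("psych", "rio", "tidyverse", "ggplot2", "ggridges", "devtools",
          "vioplot", "dplyr", "ggwordcloud", "ggforce", "gridExtra", "grid",
          "wordcloud2", "tm", "SnowballC", "wordcloud", "RColorBrewer",
          "kableExtra")

# system.file() returns "" when a package is not installed
is_installed <- vapply(pkgs,
                       function(p) nzchar(system.file(package = p)),
                       logical(1))

# Names of any packages that still need to be installed
names(is_installed)[!is_installed]
```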

Quick Analysis of the VSOT Pre-Programme Grit Survey Results

For this demonstration, we will use the data derived from your VSOT Pre-Programme Grit Survey Results.

The data has been extracted and stored at the following GitHub link : https://raw.githubusercontent.com/aaron-chen-angus/community2campus/main/VSOT_GRIT.csv

vsot <- read.csv(file = "https://raw.githubusercontent.com/aaron-chen-angus/community2campus/main/VSOT_GRIT.csv", header = TRUE, sep = ",")

Check on the output and hence the integrity of the loaded data by reading the column names

vsot %>% colnames()
##  [1] "Your.name"     "Team"          "Institution"   "Q2"           
##  [5] "Q4"            "Q7"            "Q8"            "Q1"           
##  [9] "Q3"            "Q5"            "Q6"            "Grit_Score"   
## [13] "Q9"            "Q10"           "Vision_Mental"

You can also use the following command to view a sample of the table entries

kbl(vsot[1:28, 2:12]) %>%
    kable_styling(fixed_thead = TRUE)
Team  Institution  Q2  Q4  Q7  Q8  Q1  Q3  Q5  Q6  Grit_Score
1     RP            5   4   2   4   2   3   3   2       3.125
1     RP            3   4   3   3   3   4   5   2       3.375
1     RP            2   3   2   3   4   3   4   2       2.875
1     UNISBA        5   4   2   4   4   5   2   2       3.500
1     UNISBA        4   4   3   4   3   5   3   2       3.500
1     UNISBA        5   5   3   4   4   4   5   2       4.000
2     RP            2   4   3   2   2   2   4   2       2.625
2     RP            4   4   1   5   4   5   4   2       3.625
2     RP            4   4   4   4   3   2   4   2       3.375
2     UNISBA        4   2   5   2   3   1   1   4       2.750
2     UNISBA        4   4   4   4   2   4   4   2       3.500
3     RP            4   3   3   4   3   2   4   1       3.000
3     RP            3   5   2   4   4   4   4   2       3.500
3     RP            5   5   4   5   3   3   4   2       3.875
3     UNISBA        4   4   2   3   4   4   4   2       3.375
3     UNISBA        3   4   4   3   3   4   4   2       3.375
4     RP            5   3   4   5   5   3   5   1       3.875
4     RP            3   3   2   3   4   3   2   2       2.750
4     RP            5   4   4   5   5   2   3   2       3.750
4     UNISBA        3   4   2   5   1   3   4   2       3.000
4     UNISBA        4   5   3   4   4   4   4   2       3.750
4     UNISBA        4   5   2   4   2   4   2   2       3.125
5     RP            2   4   5   3   3   3   2   3       3.125
5     RP            4   4   3   4   4   3   2   2       3.250
5     RP            4   4   3   4   3   4   3   1       3.250
5     UNISBA        4   4   3   4   4   4   3   2       3.500
5     UNISBA        3   4   4   4   2   4   4   1       3.250
5     UNISBA        4   3   4   4   3   2   3   2       3.125
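
In the rows shown, Grit_Score appears to equal the mean of the eight question columns. This is an observation from the table, not a documented scoring rule. A minimal sketch that checks it for the first three rows, with the values typed in by hand from the table:

```r
# First three rows of the table, copied by hand
items <- data.frame(
  Q2 = c(5, 3, 2), Q4 = c(4, 4, 3), Q7 = c(2, 3, 2), Q8 = c(4, 3, 3),
  Q1 = c(2, 3, 4), Q3 = c(3, 4, 3), Q5 = c(3, 5, 4), Q6 = c(2, 2, 2)
)
reported <- c(3.125, 3.375, 2.875)   # Grit_Score values from the table

# Mean of the eight items in each row
recomputed <- rowMeans(items)

# TRUE when the recomputed means match the reported scores
all.equal(unname(recomputed), reported)
```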

In the next sections we will explore how to make use of some standard data visualisation tools to display your data

Lastly, we will introduce word clouds as a means of analysing and visualising qualitative data.

Enjoy!

Histograms

Here is a histogram drawn to provide details on the distribution of Grit score across all respondents

vsot%>%
ggplot(aes(x=Grit_Score)) + 
geom_histogram(binwidth=0.1, fill="green", color="black", alpha=0.9) +
ggtitle("Distribution of Grit Score Across All Respondents") + ylim(0,5)

You can play around with different colours for the histogram fill and outline, as well as the transparency (alpha).
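
The height of each histogram bar is a count of respondents. As a quick cross-check in base R, the sketch below counts how often each distinct score occurs, assuming the 28 Grit_Score values shown in the table earlier (typed in by hand).

```r
# Grit_Score values copied by hand from the 28 rows shown in the table
grit_scores <- c(3.125, 3.375, 2.875, 3.500, 3.500, 4.000,
                 2.625, 3.625, 3.375, 2.750, 3.500,
                 3.000, 3.500, 3.875, 3.375, 3.375,
                 3.875, 2.750, 3.750, 3.000, 3.750, 3.125,
                 3.125, 3.250, 3.250, 3.500, 3.250, 3.125)

# Number of respondents at each distinct score
score_counts <- table(grit_scores)
score_counts

# Largest count, useful when choosing the y-axis limit
max(score_counts)
```

Because the distinct scores are 0.125 apart and the bin width is 0.1, no two distinct scores should share a bin, so these counts should correspond to the bar heights for these rows.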

Box Plot

Here is a box plot drawn to show the difference in Grit scores between RP and UNISBA respondents

vsot%>%
ggplot(aes(x=Institution, y=Grit_Score)) + 
geom_boxplot(color="red", fill="green", alpha=0.2) + 
ggtitle("Grit Score for RP and UNISBA") + ylim(2.5,4.0)
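
A box plot summarises each group by its median and quartiles. The sketch below computes those numbers directly in base R. It assumes only the six Team 1 rows from the table shown earlier, so the results describe that subset rather than the full data set.

```r
# Team 1 rows only, copied by hand from the table
team1 <- data.frame(
  Institution = c("RP", "RP", "RP", "UNISBA", "UNISBA", "UNISBA"),
  Grit_Score  = c(3.125, 3.375, 2.875, 3.500, 3.500, 4.000)
)

# Median Grit score for each institution (the line inside each box)
medians <- tapply(team1$Grit_Score, team1$Institution, median)
medians

# First quartile, median and third quartile for each institution
quartiles <- tapply(team1$Grit_Score, team1$Institution,
                    quantile, probs = c(0.25, 0.5, 0.75))
quartiles
```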

We can also flip the boxplot axes by adding coord_flip() to the plot

vsot%>%
ggplot(aes(x=Institution, y=Grit_Score)) + 
geom_boxplot(color="orange", fill="red", alpha=0.2) + 
ggtitle("Grit Score for RP and UNISBA") + ylim(2.5,4.0) + coord_flip()

We can also differentiate the colours of the 2 boxplots by adding fill=Institution inside aes(). Column names are case sensitive, so Institution must be capitalised.

vsot%>%
ggplot(aes(x=Institution, y=Grit_Score, fill=Institution)) + 
geom_boxplot() + 
ggtitle("Grit Score for RP and UNISBA") + ylim(2.5,4.0) + coord_flip()

Density Plots

The following density plots are constructed as an alternative means of differentiating the grit scores for the two subgroups (RP & UNISBA)

vsot%>%
ggplot(aes(x=Grit_Score, color=Institution, fill=Institution)) +
geom_density(alpha=0.3,size=1)+ 
labs(x= "Distribution of Grit Scores",
subtitle="",
caption="Kruskal-Wallis chi-squared = 148.25, df = 70, p-value = 1.495e-07")
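
The caption above quotes a Kruskal-Wallis result. A comparison of two groups has 1 degree of freedom, so the quoted df = 70 may have been carried over from a different analysis. As a hedged sketch, the test can be rerun with kruskal.test() from base R, here assuming the 28 rows shown in the table earlier (typed in by hand).

```r
# Institution and Grit_Score copied by hand from the 28 rows in the table
grit <- data.frame(
  Institution = rep(rep(c("RP", "UNISBA"), 5),
                    times = c(3, 3, 3, 2, 3, 2, 3, 3, 3, 3)),
  Grit_Score  = c(3.125, 3.375, 2.875, 3.500, 3.500, 4.000,
                  2.625, 3.625, 3.375, 2.750, 3.500,
                  3.000, 3.500, 3.875, 3.375, 3.375,
                  3.875, 2.750, 3.750, 3.000, 3.750, 3.125,
                  3.125, 3.250, 3.250, 3.500, 3.250, 3.125)
)

# Kruskal-Wallis rank sum test of Grit_Score across the two institutions
kw <- kruskal.test(Grit_Score ~ Institution, data = grit)
kw

# Degrees of freedom = number of groups minus one
kw$parameter
```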

Ridgeline Plots

The following ridgeline plots are constructed to also show the difference between institutions in terms of the grit score.

Ridgeline Plot for Grit Score and Institution

vsot%>%
ggplot(aes(x = Grit_Score, y = Institution, fill = Institution)) +
geom_density_ridges() +
theme_ridges() + 
theme(legend.position = "none")
## Picking joint bandwidth of 0.173

Violin Plots

The following violin plots are constructed to ascertain the relationship between Grit Scores and Question 8 : “I am diligent. I never give up”.

First, we will need to ensure the Q8 field is a factor.

vsot$Q8 <- as.factor(vsot$Q8)

Violin Plot for Grit Score and Q8

vsot%>%
ggplot(aes(x=Q8, 
           y=Grit_Score, 
           fill=Q8)) + 
  geom_violin()
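
To see what the violin plot is grouping, the sketch below converts Q8 to a factor and computes the mean Grit score at each Q8 level. It assumes only the six Team 1 rows from the table shown earlier, so it illustrates the idea rather than the full result.

```r
# Team 1 rows only, copied by hand from the table
team1 <- data.frame(
  Q8         = c(4, 3, 3, 4, 4, 4),
  Grit_Score = c(3.125, 3.375, 2.875, 3.500, 3.500, 4.000)
)

# Treat the Q8 answers as categories rather than numbers
team1$Q8 <- as.factor(team1$Q8)
levels(team1$Q8)

# Mean Grit score within each Q8 category
mean_by_q8 <- tapply(team1$Grit_Score, team1$Q8, mean)
mean_by_q8
```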

Word Clouds

3 reasons why Word Clouds are a great way of presenting your qualitative data

  1. Word clouds add simplicity and clarity. The most frequently used keywords stand out better in a word cloud.
  2. Word clouds are a potent communication tool. They are easy to understand, easy to share and impactful.
  3. Word clouds are more visually engaging than a table of data.

Who uses Word Clouds ?

There are 5 main steps to create Word Clouds in R

Step 1 : Create a Text File

In the following example, we will process the Lyrics to the Song, “Dynamite” by BTS. The raw text file is stored in GitHub : https://raw.githubusercontent.com/aaron-chen-angus/community2campus/main/Dynamite.txt

Step 2 : Install and Load the Required Packages

The following packages will need to be loaded

# Load
library("tm")
library("SnowballC")
library("wordcloud")
library("RColorBrewer")

Step 3 : Text Mining

The text is first read with readLines() and then loaded as a corpus using the Corpus() function from the text mining (tm) package.

# Read the text file from internet
filePath <- "https://raw.githubusercontent.com/aaron-chen-angus/community2campus/main/Dynamite.txt"
text <- readLines(filePath)
## Warning in readLines(filePath): incomplete final line found on 'https://
## raw.githubusercontent.com/aaron-chen-angus/community2campus/main/Dynamite.txt'

Load the data as a corpus

# Load the data as a corpus
docs <- Corpus(VectorSource(text))

Inspect the content of the document

inspect(docs)
## <<SimpleCorpus>>
## Metadata:  corpus specific: 1, document level (indexed): 0
## Content:  documents: 126
## 
##   [1] Cause in the stars tonight                                 
##   [2] So watch me bring the fire and set the night alight        
##   [3] Shoes on get up in the morn                                
##   [4] Cup of milk lets rock and roll                             
##   [5] King Kong kick the drum rolling on like a RollingStone     
##   [6] Sing song when Im walking home                             
##   [7] Jump up to the top LeBron                                  
##   [8] Ding dong call me on my phone                              
##   [9] Ice tea and a game of ping pong huh                        
##  [10] This is getting heavy                                      
##  [11] Can you hear the bass boom Im ready woohoo                 
##  [12] Life is sweet as honey                                     
##  [13] Yeah this beat chaching like money huh                     
##  [14] Disco overload Im into that Im good to go                  
##  [15] Im diamond you know I glow up                              
##  [16] Hey so lets go                                             
##  [17] Cause in the stars tonight                                 
##  [18] So watch me bring the fire and set the night alight hey    
##  [19] Shining through the city with a little funk and soul       
##  [20] So light it up like dynamite whoa oh oh                    
##  [21] Bring a friend join the crowd                              
##  [22] Whoever wanna come along                                   
##  [23] Word up talk the talk                                      
##  [24] Just move like we off the wall                             
##  [25] Day or night the skys alight                               
##  [26] So we dance to the break of dawn                           
##  [27] Ladies and gentlemen I got the medicine                    
##  [28] So you should keep ya eyes on the ball huh                 
##  [29] This is getting heavy                                      
##  [30] Can you hear the bass boom Im ready woohoo                 
##  [31] Life is sweet as honey                                     
##  [32] Yeah this beat changing like money                         
##  [33] Disco overload Im into that Im good to go                  
##  [34] Im diamond you know I glow up                              
##  [35] Lets go                                                    
##  [36] Cause in the stars tonight                                 
##  [37] So watch me bring the fire and set the night alight hey    
##  [38] Shining through the city with a little funk and soul       
##  [39] So light it up like dynamite whoa oh oh                    
##  [40] Dynamitelife is dynamite                                   
##  [41] Dynamitelife is dynamite                                   
##  [42] Shining through the city with a little funk and soul       
##  [43] So light it up like dynamite whoa oh oh                    
##  [44] Dynamite                                                   
##  [45] Dynamite                                                   
##  [46] Dynamite                                                   
##  [47] Light it up like dynamite                                  
##  [48] Dynamite                                                   
##  [49] Dynamite                                                   
##  [50] Dynamite                                                   
##  [51] Light it up like dynamite                                  
##  [52] Cause in the stars tonight                                 
##  [53] So watch me bring the fire and set the night alight        
##  [54] Shining through the city with a little funk and soul       
##  [55] So light it up like dynamite this is ah                    
##  [56] Cause in the stars tonight                                 
##  [57] So watch me bring the fire and set the night alight alright
##  [58] Shining through the city with a little funk and soul       
##  [59] So light it up like dynamite whoa light it up like dynamite
##  [60] Dynamite life is dynamite                                  
##  [61] Dynamite life is dynamite                                  
##  [62] Shining through the city with a little funk and soul       
##  [63] So light it up like dynamite whoa oh oh                    
##  [64] Cause in the stars tonight                                 
##  [65] So watch me bring the fire and set the night alight        
##  [66] Shoes on get up in the morn                                
##  [67] Cup of milk lets rock and roll                             
##  [68] King Kong kick the drum rolling on like a RollingStone     
##  [69] Sing song when Im walking home                             
##  [70] Jump up to the top LeBron                                  
##  [71] Ding dong call me on my phone                              
##  [72] Ice tea and a game of ping pong huh                        
##  [73] This is getting heavy                                      
##  [74] Can you hear the bass boom Im ready woohoo                 
##  [75] Life is sweet as honey                                     
##  [76] Yeah this beat chaching like money huh                     
##  [77] Disco overload Im into that Im good to go                  
##  [78] Im diamond you know I glow up                              
##  [79] Hey so lets go                                             
##  [80] Cause in the stars tonight                                 
##  [81] So watch me bring the fire and set the night alight hey    
##  [82] Shining through the city with a little funk and soul       
##  [83] So light it up like dynamite whoa oh oh                    
##  [84] Bring a friend join the crowd                              
##  [85] Whoever wanna come along                                   
##  [86] Word up talk the talk                                      
##  [87] Just move like we off the wall                             
##  [88] Day or night the skys alight                               
##  [89] So we dance to the break of dawn                           
##  [90] Ladies and gentlemen I got the medicine                    
##  [91] So you should keep ya eyes on the ball huh                 
##  [92] This is getting heavy                                      
##  [93] Can you hear the bass boom Im ready woohoo                 
##  [94] Life is sweet as honey                                     
##  [95] Yeah this beat changing like money                         
##  [96] Disco overload Im into that Im good to go                  
##  [97] Im diamond you know I glow up                              
##  [98] Lets go                                                    
##  [99] Cause in the stars tonight                                 
## [100] So watch me bring the fire and set the night alight hey    
## [101] Shining through the city with a little funk and soul       
## [102] So light it up like dynamite whoa oh oh                    
## [103] Dynamitelife is dynamite                                   
## [104] Dynamitelife is dynamite                                   
## [105] Shining through the city with a little funk and soul       
## [106] So light it up like dynamite whoa oh oh                    
## [107] Dynamite                                                   
## [108] Dynamite                                                   
## [109] Dynamite                                                   
## [110] Light it up like dynamite                                  
## [111] Dynamite                                                   
## [112] Dynamite                                                   
## [113] Dynamite                                                   
## [114] Light it up like dynamite                                  
## [115] Cause in the stars tonight                                 
## [116] So watch me bring the fire and set the night alight        
## [117] Shining through the city with a little funk and soul       
## [118] So light it up like dynamite this is ah                    
## [119] Cause in the stars tonight                                 
## [120] So watch me bring the fire and set the night alight alright
## [121] Shining through the city with a little funk and soul       
## [122] So light it up like dynamite whoa light it up like dynamite
## [123] Dynamite life is dynamite                                  
## [124] Dynamite life is dynamite                                  
## [125] Shining through the city with a little funk and soul       
## [126] So light it up like dynamite whoa oh oh

Text Transformation

Transformation is performed using the tm_map() function, for example to replace special characters in the text.

Replacing “/”, “@” and “|” with space:

toSpace <- content_transformer(function (x , pattern ) gsub(pattern, " ", x))
docs <- tm_map(docs, toSpace, "/")
## Warning in tm_map.SimpleCorpus(docs, toSpace, "/"): transformation drops
## documents
docs <- tm_map(docs, toSpace, "@")
## Warning in tm_map.SimpleCorpus(docs, toSpace, "@"): transformation drops
## documents
docs <- tm_map(docs, toSpace, "\\|")
## Warning in tm_map.SimpleCorpus(docs, toSpace, "\\|"): transformation drops
## documents
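
The toSpace transformer above wraps gsub(). The sketch below shows the same replacement on a plain character string in base R. The sample line is made up for illustration and is not from the lyrics.

```r
# Same idea as toSpace, applied to an ordinary string
to_space <- function(x, pattern) gsub(pattern, " ", x)

sample_line <- "rock/roll @home left|right"   # hypothetical example text

step1 <- to_space(sample_line, "/")
step2 <- to_space(step1, "@")
# "|" means "or" in a regular expression, so it is escaped as "\\|"
cleaned <- to_space(step2, "\\|")
cleaned
```

Each special character becomes a single space, so doubled spaces can remain. These are removed later by the stripWhitespace step.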

Cleaning the text

The tm_map() function is used to convert the text to lower case, to remove numbers, punctuation and unnecessary white space, to remove common stopwords such as “the” and “we”, and to stem the remaining words.

The information value of stopwords is near zero because they are so common in a language. Removing these words is useful before further analysis. For stopwords, the supported languages are Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Portuguese, Russian, Spanish and Swedish. Language names are case sensitive.

The R code below can be used to clean your text :

# Convert the text to lower case
docs <- tm_map(docs, content_transformer(tolower))
## Warning in tm_map.SimpleCorpus(docs, content_transformer(tolower)):
## transformation drops documents
# Remove numbers
docs <- tm_map(docs, removeNumbers)
## Warning in tm_map.SimpleCorpus(docs, removeNumbers): transformation drops
## documents
# Remove common English stopwords
docs <- tm_map(docs, removeWords, stopwords("english"))
## Warning in tm_map.SimpleCorpus(docs, removeWords, stopwords("english")):
## transformation drops documents
# Remove your own stop word
# specify your stopwords as a character vector
docs <- tm_map(docs, removeWords, c("blabla1", "blabla2")) 
## Warning in tm_map.SimpleCorpus(docs, removeWords, c("blabla1", "blabla2")):
## transformation drops documents
# Remove punctuation
docs <- tm_map(docs, removePunctuation)
## Warning in tm_map.SimpleCorpus(docs, removePunctuation): transformation drops
## documents
# Eliminate extra white spaces
docs <- tm_map(docs, stripWhitespace)
## Warning in tm_map.SimpleCorpus(docs, stripWhitespace): transformation drops
## documents
# Text stemming
docs <- tm_map(docs, stemDocument)
## Warning in tm_map.SimpleCorpus(docs, stemDocument): transformation drops
## documents
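
To make the cleaning steps concrete, the sketch below applies rough base R equivalents to a single lyric line. It is only an approximation. The stopword list is a short hypothetical one rather than the list used by stopwords("english"), and stemming is left out.

```r
# One line taken from the lyrics shown above
lyric <- "So watch me bring the fire and set the night alight"

lowered   <- tolower(lyric)                       # lower case
no_digits <- gsub("[[:digit:]]+", "", lowered)    # remove numbers
no_punct  <- gsub("[[:punct:]]+", "", no_digits)  # remove punctuation

# Split into words on white space
words <- strsplit(no_punct, "[[:space:]]+")[[1]]

# Hypothetical short stopword list, for illustration only
my_stopwords <- c("so", "me", "the", "and")
kept <- words[!words %in% my_stopwords]
kept
```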

Step 4 : Build a Term-Document Matrix

A term-document matrix is a table containing the frequency of the words. Row names are words (terms) and column names are documents, which is why rowSums() is used to total the frequency of each word. The TermDocumentMatrix() function from the text mining (tm) package can be used as follows :

dtm <- TermDocumentMatrix(docs)
m <- as.matrix(dtm)
v <- sort(rowSums(m),decreasing=TRUE)
d <- data.frame(word = names(v),freq=v)
head(d, 20)
##            word freq
## dynamit dynamit   42
## like       like   26
## light     light   18
## alight   alight   12
## bring     bring   12
## night     night   12
## citi       citi   12
## funk       funk   12
## littl     littl   12
## shine     shine   12
## soul       soul   12
## caus       caus   10
## star       star   10
## tonight tonight   10
## fire       fire   10
## set         set   10
## watch     watch   10
## whoa       whoa   10
## life       life    8
## get         get    6
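
The layout of a term-document matrix can be illustrated without the tm package. The sketch below builds a tiny one by hand in base R from three short lines that are already lower case. It is not how TermDocumentMatrix() is implemented, only a picture of the same structure.

```r
# Three short documents, already cleaned and lower case
docs_small <- c("light it up like dynamite", "dynamite", "life is dynamite")

tokens <- strsplit(docs_small, " ", fixed = TRUE)
terms  <- sort(unique(unlist(tokens)))

# One column per document, one row per term
tdm_small <- vapply(
  tokens,
  function(tok) tabulate(match(tok, terms), nbins = length(terms)),
  integer(length(terms))
)
dimnames(tdm_small) <- list(terms, paste0("doc", seq_along(docs_small)))
tdm_small

# Total frequency of each term across all documents
v_small <- sort(rowSums(tdm_small), decreasing = TRUE)
d_small <- data.frame(word = names(v_small), freq = v_small)
head(d_small, 3)
```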

Step 5 : Generate the Word Cloud

Standard Word Cloud

A standard word cloud can be generated based on the code below.

# install.packages("wordcloud2")
library(wordcloud2) 

# Basic plot
wordcloud2(data=d, size=1.6)

It is possible to change the word color using the color argument. You can provide a vector of colors, or use 'random-dark' or 'random-light'. You can also customize the background color with backgroundColor.

# library
library(wordcloud2) 
 
# Gives a proposed palette
wordcloud2(d, size=1.6, color='random-dark')

# library
library(wordcloud2) 

# or a vector of colors; the vector must be the same length as the input data
wordcloud2(d, size=1.6, color=rep_len( c("pink","blue","orange","teal"), nrow(d) ) )
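
rep_len() repeats a short palette until it reaches the requested length, which is how four colors can cover every row of d. A small sketch, assuming a hypothetical data set of six words:

```r
palette4 <- c("pink", "blue", "orange", "teal")
n_words  <- 6   # hypothetical number of rows in the word table

# The palette is recycled from the start once it runs out
word_colors <- rep_len(palette4, n_words)
word_colors
# "pink" "blue" "orange" "teal" "pink" "blue"
```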

# library
library(wordcloud2)
 
# Change the background color
wordcloud2(d, size=1.6, color='random-dark', backgroundColor="orange")

Word Orientation

You can also rotate the words in your WordCloud with 3 arguments: minRotation, maxRotation and rotateRatio.

library(wordcloud2) 
 
# wordcloud
wordcloud2(d, size = 2.3, minRotation = -pi/6, maxRotation = -pi/6, rotateRatio = 1)

Change the Shape of the Word Cloud

You can customise the word cloud shape using the shape argument. Available shapes are circle (the default), cardioid, diamond, triangle-forward, triangle, pentagon and star.

# library
library(wordcloud2) 
 
# Change the shape to a star:
wordcloud2(d, size = 0.7, shape = 'star')

# library
library(wordcloud2) 
 
# Change the shape to a cardioid:
wordcloud2(d, size = 0.7, shape = 'cardioid')

Another Word Cloud Example

We will now use the lyrics to the song, “Pink Venom” by BLɅϽKPIИK, which are stored at the following GitHub link : https://raw.githubusercontent.com/aaron-chen-angus/community2campus/main/PinkVenom.txt

Load Packages

The text will contain some Korean in addition to English words, so we will omit the text cleaning step.

# Load Packages
library("tm")
library("SnowballC")
library("wordcloud")
library("RColorBrewer")

Text Mining

# Read the text file from GitHub
filePath <- "https://raw.githubusercontent.com/aaron-chen-angus/community2campus/main/PinkVenom.txt"
text <- readLines(filePath)
## Warning in readLines(filePath): incomplete final line found on 'https://
## raw.githubusercontent.com/aaron-chen-angus/community2campus/main/PinkVenom.txt'
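
On some systems, Korean text read by readLines() shows up as unreadable symbols unless the encoding is declared. The sketch below assumes the file is saved as UTF-8. It uses a temporary file in place of the GitHub link so that it runs without internet access, and the Korean word is the one reported as <U+B77C><U+D0C0><U+D0C0><U+D0C0> in the term frequencies for this song.

```r
# Write two lines to a temporary UTF-8 file (stand-in for the GitHub file)
tmp <- tempfile(fileext = ".txt")
sample_lines <- c("This that pink venom", "\uB77C\uD0C0\uD0C0\uD0C0")
writeLines(enc2utf8(sample_lines), tmp, useBytes = TRUE)

# Declare the encoding when reading so the Korean characters are kept intact
lyrics <- readLines(tmp, encoding = "UTF-8")

Encoding(lyrics)                  # the Korean line is marked as UTF-8
nchar(lyrics[2], type = "chars")  # four Korean characters
```

The same encoding argument can be passed when reading the file from GitHub, for example readLines(filePath, encoding = "UTF-8").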

Load the data as a corpus

# Load the data as a corpus
docs <- Corpus(VectorSource(text))

Inspect the content of the document

inspect(docs)
## <<SimpleCorpus>>
## Metadata:  corpus specific: 1, document level (indexed): 0
## Content:  documents: 59
## 
##  [1] BLACKPINK                                                   
##  [2] BLACKPINK                                                   
##  [3] BLACKPINK                                                   
##  [4] BLACKPINK                                                   
##  [5] Kick in the door waving the coco                            
##  [6] íŒ\235ì½\230ì\235´ë‚\230 ì±\231겨 껴들 ìƒ\235ê°\201 ë§\220ê³                     
##  [7] I talk that talk runways I walk walk                        
##  [8] ë\210\210 ê°\220ê³  pop pop ì•\210 ë´\220ë\217„ ì²\231                           
##  [9] One by one then two by two                                  
## [10] ë‚´ ì†\220ë\201\235 í\210­ í•\230ë‚\230ì—\220 다 무ë„\210ì§\200는 중               
## [11] ê°\200ì§œ 쇼 ì¹\230곤 í\231”ë ¤í–\210ì§\200                              
## [12] Makes no sense you couldn't get a dollar outta me           
## [13] ìž\220 ì\230¤ëŠ\230 ë°¤ì\235´ì•¼ 난 ë\217…ì\235„ í’\210ì\235\200 꽃                  
## [14] 네 í\230¼ì\235„ 빼앗ì\235\200 다ì\235Œ look what you made us do        
## [15] 천천íž\210 ë„\220 잠재울 fire fire                           
## [16] ìž”ì\235¸í•  ë§Œí\201¼ 아름다워 I bring the pain like         
## [17] This that pink venom this that pink venom                   
## [18] This that pink venom get 'em get 'em get 'em                
## [19] Straight to ya dome like whoa whoa whoa                     
## [20] Straight to ya dome like ah ah ah                           
## [21] Taste that pink venom taste that pink venom                 
## [22] Taste that pink venom get 'em get 'em get 'em               
## [23] Straight to ya dome like whoa whoa whoa                     
## [24] Straight to ya dome like ah ah ah                           
## [25] Black paint and ammo got bodies like Rambo                  
## [26] Rest in peace please light up a candle                      
## [27] This the life of a vandal masked up and I'm still in CELINE 
## [28] Designer crimes or it wouldn't be me ooh                    
## [29] Diamonds shining drive in silence I don't mind it I'm riding
## [30] Flying private side by side with the pilot up in the sky    
## [31] And I'm wilding styling on them and there's no chance       
## [32] 'Cause we got bodies on bodies like this a slow dance       
## [33] ìž\220 ì\230¤ëŠ\230 ë°¤ì\235´ì•¼ 난 ë\217…ì\235„ í’\210ì\235\200 꽃                  
## [34] 네 í\230¼ì\235„ 빼앗ì\235\200 다ì\235Œ look what you made us do        
## [35] 천천íž\210 ë„\220 잠재울 fire fire                           
## [36] ìž”ì\235¸í•  ë§Œí\201¼ 아름다워 I bring the pain like         
## [37] This that pink venom this that pink venom                   
## [38] This that pink venom get 'em get 'em get 'em                
## [39] Straight to ya dome like whoa whoa whoa                     
## [40] Straight to ya dome like ah ah ah                           
## [41] Taste that pink venom taste that pink venom                 
## [42] Taste that pink venom get 'em get 'em get 'em               
## [43] Straight to ya dome like whoa whoa whoa                     
## [44] Straight to ya dome like ah ah ah                           
## [45] ì›\220한다면 provoke us                                     
## [46] ê°\220당 못해 and you know this                             
## [47] ì\235´ë¯¸ í\215¼ì ¸ë²„린 shot that potion                        
## [48] 네 ë\210\210앞ì\235\200 í•‘í\201¬ë¹› ocean                               
## [49] Come and give me all the smoke                              
## [50] ë\217„ ì•„ë‹\210ë©´ 모 like I'm so rock and roll                 
## [51] Come and give me all the smoke                              
## [52] 다 줄 세워 ë´\220 ìž\220 stop drop I bring the pain like      
## [53] ë\235¼íƒ\200íƒ\200íƒ\200 트ë\235¼íƒ\200íƒ\200íƒ\200                                
## [54] ë\235¼íƒ\200íƒ\200íƒ\200 트ë\235¼íƒ\200íƒ\200íƒ\200                                
## [55] ë\235¼íƒ\200íƒ\200íƒ\200 트ë\235¼íƒ\200íƒ\200íƒ\200                                
## [56] Straight to ya straight to ya straight to ya dome like      
## [57] ë\235¼íƒ\200íƒ\200íƒ\200 트ë\235¼íƒ\200íƒ\200íƒ\200 BLACKPINK                      
## [58] ë\235¼íƒ\200íƒ\200íƒ\200 트ë\235¼íƒ\200íƒ\200íƒ\200 BLACKPINK                      
## [59] ë\235¼íƒ\200íƒ\200íƒ\200 트ë\235¼íƒ\200íƒ\200íƒ\200 BLACKPINK

Build a Term-Document Matrix

A term-document matrix is a table containing the frequency of the words. Row names are words (terms) and column names are documents, which is why rowSums() is used to total the frequency of each word. The TermDocumentMatrix() function from the text mining (tm) package can be used as follows :

dtm <- TermDocumentMatrix(docs)
m <- as.matrix(dtm)
v <- sort(rowSums(m),decreasing=TRUE)
p <- data.frame(word = names(v),freq=v)
head(p, 20)
##                                                word freq
## like                                           like   15
## that                                           that   14
## get                                             get   13
## pink                                           pink   12
## venom                                         venom   12
## 'em                                             'em   12
## whoa                                           whoa   12
## straight                                   straight   11
## the                                             the   10
## this                                           this    9
## dome                                           dome    9
## and                                             and    8
## blackpink                                 blackpink    7
## taste                                         taste    6
## <U+B77C><U+D0C0><U+D0C0><U+D0C0>           <U+B77C><U+D0C0><U+D0C0><U+D0C0>    6
## <U+D6B8><U+B77C><U+D0C0><U+D0C0><U+D0C0> <U+D6B8><U+B77C><U+D0C0><U+D0C0><U+D0C0>    6
## you                                             you    4
## fire                                           fire    4
## i'm                                             i'm    4
## <U+C790>                                         <U+C790>    3

Generate the Word Cloud

Standard Word Cloud

A standard word cloud can be generated based on the code below.

# library
library(wordcloud2) 

# Basic plot
wordcloud2(data=p, size=1.6)

It is possible to change the word color using the color argument. You can provide a vector of colors, or use 'random-dark' or 'random-light'. You can also customize the background color with backgroundColor.

# library
library(wordcloud2) 
 
# Gives a proposed palette
wordcloud2(p, size=1.6, color='random-dark')

# library
library(wordcloud2) 
 
# or a vector of colors; the vector must be the same length as the input data
wordcloud2(p, size=1.6, color=rep_len( c("black","pink"), nrow(p) ) )

# library
library(wordcloud2) 
 
# Change the background color
wordcloud2(p, size=1.6, color='random-light', backgroundColor="black")

Word Orientation

#library
library(wordcloud2) 
 
# wordcloud
wordcloud2(p, size = 2.3, minRotation = -pi/6, maxRotation = -pi/6, rotateRatio = 1, color = 'pink', backgroundColor = 'black')

Change the Shape of the Word Cloud

# library
library(wordcloud2) 
 
# Change the shape to a diamond:
wordcloud2(p, size = 0.7, shape = 'diamond')

# library
library(wordcloud2) 
 
# Change the shape to a triangle-forward:
wordcloud2(p, size = 0.7, shape = 'triangle-forward', color=rep_len( c("black","pink"), nrow(p) ))

Using a Mask to Customise Your Word Cloud Shape

It is also possible to use a mask to customise the shape of your word cloud, but this takes some time and we have less than an hour for this session. Please feel free to check out this video to see how it is done :

Here’s a sample of what you can achieve with Word Cloud Masking

Try It Out

In your groups, please create word clouds based on your team’s vision for mental health submitted during the pre-event survey.

The raw txt files for your respective teams are provided via the links to GitHub below :

Team 1 : https://raw.githubusercontent.com/aaron-chen-angus/community2campus/main/VSOT_Team01.txt

Team 2 : https://raw.githubusercontent.com/aaron-chen-angus/community2campus/main/VSOT_Team02.txt

Team 3 : https://raw.githubusercontent.com/aaron-chen-angus/community2campus/main/VSOT_Team03.txt

Team 4 : https://raw.githubusercontent.com/aaron-chen-angus/community2campus/main/VSOT_Team04.txt

Team 5 : https://raw.githubusercontent.com/aaron-chen-angus/community2campus/main/VSOT_Team05.txt
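
The five team links differ only in the team number, so they can be generated rather than typed out. A minimal sketch using sprintf(), where the variable names are illustrative:

```r
base_url <- "https://raw.githubusercontent.com/aaron-chen-angus/community2campus/main/"

# %02d pads the team number with a leading zero (01, 02, ..., 05)
team_urls <- sprintf("%sVSOT_Team%02d.txt", base_url, 1:5)
team_urls

# Pick your own team, for example Team 3
my_team_url <- team_urls[3]

# Reading the file requires internet access, so it is left commented out here
# text <- readLines(my_team_url)
```

From there, the word cloud steps shown earlier (corpus, cleaning, term-document matrix, wordcloud2) can be repeated with your team's text.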