CUED 7540: Learning Analytics IV
In this final module, we’ll explore Wordclouds, Social Network Analysis (SNA), and Heatmaps—three advanced visualization techniques. These tools are highly customizable, and today, we’ll focus on the foundational steps to get you started.
By the end of this module, you’ll be able to create these visualizations and uncover patterns in your data. Remember, all the code is provided, but I encourage you to revise it and try using your own dataset for a deeper understanding.
We’ll begin by creating a basic word cloud from students’ text data, such as survey responses or discussion board posts.
1. Setup
Step 1: Loading data
We will use a different command, read.delim(), to load a text file. In this example, we’ll use the Handbook of Learning Analytics (2022), which was converted to a txt file (“Handbook of LA.txt”) for our analysis. Check the ‘data’ folder.
# Load the packages used in this section
library(tm)
library(wordcloud)
library(RColorBrewer)
# Load the data
sh <- read.delim("data/Handbook of LA.txt")
# Create a text corpus
doc <- Corpus(VectorSource(sh))
head(doc)
<<SimpleCorpus>>
Metadata: corpus specific: 1, document level (indexed): 0
Content: documents: 1
Step 2: Text Preprocessing
Clean the text data to improve the quality of the word cloud.
# Define a function to replace specific patterns with a space
toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
# Apply text transformations
doc <- tm_map(doc, toSpace, "/")
Warning in tm_map.SimpleCorpus(doc, toSpace, "/"): transformation drops documents
doc <- tm_map(doc, toSpace, "@")
Warning in tm_map.SimpleCorpus(doc, toSpace, "@"): transformation drops documents
doc <- tm_map(doc, toSpace, "\\|")
Warning in tm_map.SimpleCorpus(doc, toSpace, "\\|"): transformation drops documents
doc <- tm_map(doc, content_transformer(tolower)) # Convert to lowercase
Warning in tm_map.SimpleCorpus(doc, content_transformer(tolower)): transformation drops documents
doc <- tm_map(doc, removeWords, c(stopwords("english"), "https", "can", "doi")) # Remove common stopwords
Warning in tm_map.SimpleCorpus(doc, removeWords, c(stopwords("english"), : transformation drops documents
doc <- tm_map(doc, removeNumbers) # Remove numbers
Warning in tm_map.SimpleCorpus(doc, removeNumbers): transformation drops documents
doc <- tm_map(doc, removePunctuation) # Remove punctuation
Warning in tm_map.SimpleCorpus(doc, removePunctuation): transformation drops documents
doc <- tm_map(doc, stripWhitespace) # Remove extra whitespace
Warning in tm_map.SimpleCorpus(doc, stripWhitespace): transformation drops documents
# You can add more words to remove by updating the stopwords list.
When you run the code section above, you might get a lengthy list of warning messages. As long as they are warnings and not errors, we can continue.
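To see what each cleaning step does, it can help to run the same kinds of operations on a single string. The following is a minimal sketch in base R, not the tm pipeline itself. The sample sentence and variable names are made up for illustration, and the two-word stopword pattern is a tiny hypothetical stand-in for stopwords("english").

```r
# Hypothetical sample text (not from the course data)
raw <- "Learning/Analytics @ 2022 | The DATA, and the   Students!"

# Replace "/", "@" and "|" with a space, as toSpace does above
txt <- gsub("/", " ", raw)
txt <- gsub("@", " ", txt)
txt <- gsub("\\|", " ", txt)

txt <- tolower(txt)                                      # convert to lowercase
txt <- gsub("\\b(the|and)\\b", " ", txt, perl = TRUE)    # drop a tiny, hypothetical stopword list
txt <- gsub("[[:digit:]]+", "", txt)                     # remove numbers
txt <- gsub("[[:punct:]]+", "", txt)                     # remove punctuation
txt <- trimws(gsub("\\s+", " ", txt))                    # collapse extra whitespace

print(txt)
# [1] "learning analytics data students"
```

Running the steps one at a time and printing txt after each line is a quick way to check which step removes what.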
Step 3: Creating the Wordcloud
Now, let’s create the Wordcloud.
# Create a term-document matrix
dtm <- TermDocumentMatrix(doc)
# Convert the matrix to a data frame of word frequencies
m <- as.matrix(dtm)
v <- sort(rowSums(m), decreasing = TRUE)
d <- data.frame(word = names(v), freq = v)
# Display the top words
head(d, 10)
                   word freq
learning       learning 3037
analytics     analytics 1443
data               data 1112
–                     – 1084
url                 url  638
education     education  466
educational educational  441
research       research  417
knowledge     knowledge  360
analysis       analysis  358
# Create the word cloud
set.seed(1234)
wordcloud(words = d$word, freq = d$freq, min.freq = 1,
          max.words = 100, random.order = FALSE, rot.per = 0.35,
          colors = brewer.pal(8, "Dark2"))
Question: How might customizing the word cloud (e.g., colors, word frequency) enhance the insights you gain from text data?
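The top-10 table above contains tokens such as "–" and "url" that carry little meaning. One way to handle this is to drop tokens that are not purely alphabetic, along with any extra words you choose, before plotting. The sketch below assumes a frequency table with the same word and freq columns as d. The small data frame d_demo and the extra_stop vector are hypothetical stand-ins.

```r
# Hypothetical stand-in for the frequency table d built above
d_demo <- data.frame(
  word = c("learning", "analytics", "data", "\u2013", "url"),
  freq = c(3037, 1443, 1112, 1084, 638),
  stringsAsFactors = FALSE
)

# Extra words to exclude (an assumption; adjust to your own text)
extra_stop <- c("url")

# Keep tokens made only of letters and not in the extra list
keep <- grepl("^[[:alpha:]]+$", d_demo$word) & !(d_demo$word %in% extra_stop)
d_clean <- d_demo[keep, ]

print(d_clean)
```

The filtered table can then be passed to wordcloud() in place of d.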
Step 1: Setup and Load/Inspect Data
We’ll use a dataset of student assignment scores to create a heatmap.
# We need several libraries this time.
library(ggplot2)
Attaching package: 'ggplot2'
The following object is masked from 'package:NLP':
    annotate
library(reshape2)
# Import/load the dataset
data_hm <- read.csv("data/student_assignment_scores.csv")
# Inspect your data
head(data_hm)
  Student_ID Assignment_1 Assignment_2 Assignment_3 Assignment_4 Assignment_5
1  Student_1           98           77           52           98           54
2  Student_2           86           89           70           67           58
3  Student_3           50           52           51           54           65
4  Student_4           74           82           87           95           75
5  Student_5           59           91           59           89           92
6  Student_6           85           73           89           89           91
  Assignment_6 Assignment_7 Assignment_8 Assignment_9 Assignment_10
1           87           54           80           82            79
2           80           84           57           87            54
3           55           63           65           92            71
4           69           63           52           90            69
5           64           77           86           71            99
6           83           82           62           75            90
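The table above is in wide format, with one column per assignment, while the heatmap needs one row per student-assignment pair. As a small illustration of what a wide-to-long reshape does, here is a sketch on a hypothetical two-student table. The toy values and the names wide_demo and long_demo are not part of the course data.

```r
library(reshape2)

# Hypothetical toy data in the same wide layout as data_hm
wide_demo <- data.frame(
  Student_ID   = c("Student_1", "Student_2"),
  Assignment_1 = c(98, 86),
  Assignment_2 = c(77, 89)
)

# Wide to long: one row per student-assignment pair
long_demo <- melt(wide_demo, id.vars = "Student_ID",
                  variable.name = "Assignment", value.name = "Score")

print(long_demo)
```

Each assignment column becomes a value in the Assignment column, and the scores are stacked into a single Score column.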
# Reshape the data for the heatmap
data_long <- melt(data_hm, id.vars = "Student_ID", variable.name = "Assignment", value.name = "Score")
Step 2: Generate the heatmap
# Generate the heatmap to visualize student progress across assignments
ggplot(data_long, aes(x = Assignment, y = Student_ID, fill = Score)) +
  geom_tile() +
  scale_fill_gradient(low = "black", high = "blue") +
  labs(title = "Heatmap of Student Progress Across Assignments", x = "Assignment", y = "Student ID") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Question: Try modifying the color gradient or axis labels. How does it change the interpretation of the data?
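As one possible starting point for the question above, the sketch below swaps in a light-to-dark gradient and different labels. The toy data, colors, and label text are illustrative assumptions, not a recommended setting.

```r
library(ggplot2)

# Hypothetical long-format scores (same columns as data_long)
long_demo <- data.frame(
  Student_ID = rep(c("Student_1", "Student_2"), times = 2),
  Assignment = rep(c("Assignment_1", "Assignment_2"), each = 2),
  Score      = c(98, 86, 77, 89)
)

# Light-to-dark gradient: low scores appear pale, high scores dark
p <- ggplot(long_demo, aes(x = Assignment, y = Student_ID, fill = Score)) +
  geom_tile(color = "white") +
  scale_fill_gradient(low = "white", high = "darkblue") +
  labs(title = "Scores by Student and Assignment",
       x = "Assignment", y = "Student", fill = "Score") +
  theme_minimal()

print(p)
```

Try replacing long_demo with data_long and compare the result with the black-to-blue version.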
Application and Impact: How can these advanced visualization techniques enhance your understanding and decision-making processes as an educator, instructional designer, or policymaker? Consider specific scenarios where these tools could provide deeper insights or drive more informed decisions.
Challenges and Opportunities: What challenges might you face in implementing these visualizations in real-world educational settings? How can you overcome these challenges to effectively utilize these tools?
Future Considerations: Reflect on how these techniques could evolve in your field. What future possibilities do you see for the use of advanced visualizations in learning analytics?
Congratulations, you’ve completed the final module!
To receive a full score, you will need to render this document and publish it via a method such as Quarto Pub, Posit Cloud, RPubs, GitHub Pages, or another method. Once you have shared a link to your published document with me and I have reviewed your work, you will be officially done with the current module.
Complete the following steps to submit your work for review:
First, change the name after author: in the YAML header at the very top of this document to your name. The YAML header controls the look and feel of the rendered document but doesn’t actually display in the final output.
Next, click the “Render” button in the toolbar above to “render” your R Markdown document to an HTML file that will be saved in your R Project folder. You should see a formatted webpage appear in your Viewer tab in the lower right pane or in a new browser window. Let me know if you run into any issues with rendering.
Finally, publish. To publish, follow the steps from the link
If you have any questions about this module, or run into any technical issues, don’t hesitate to contact me.
Once I have checked your link, you will be notified!
Social Network Analysis with Interaction Dataset
Step 1: Setup and Load data
Step 2: Prepare the Data
Convert the data into a format suitable for network analysis.
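The data-preparation code is not shown in this section, so the following is only a sketch of one common approach. It assumes the interaction data is an edge list with one row per interaction. The column names from and to and the student labels are hypothetical.

```r
# Hypothetical edge list: who replied to whom
edges <- data.frame(
  from = c("A", "A", "B", "C", "C"),
  to   = c("B", "C", "C", "A", "A"),
  stringsAsFactors = FALSE
)

# Use a shared set of levels so the matrix is square
people <- sort(unique(c(edges$from, edges$to)))
adj <- table(
  from = factor(edges$from, levels = people),
  to   = factor(edges$to,   levels = people)
)

# Rows are senders, columns are receivers, cells are interaction counts
print(adj)
```

An adjacency matrix like this, or the edge list itself, is the usual input for network analysis packages.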
Step 3: Visualize the network
It’s hard to see the interaction patterns in the default plot, so we will customize the network to make it easier to read.
Step 4: Customize the network
*Click the ‘Show in New Window’ icon on the top right of the preview.
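The plotting code for Steps 3 and 4 is not included in this section, so the sketch below shows one way such a network could be drawn and customized. The choice of the igraph package, the toy edge list, and all styling values are assumptions for illustration only.

```r
library(igraph)

# Hypothetical edge list: who interacted with whom
edges <- data.frame(
  from = c("A", "A", "B", "C", "D"),
  to   = c("B", "C", "C", "A", "A")
)

# Build a directed graph from the edge list
g <- graph_from_data_frame(edges, directed = TRUE)

# Size each node by its total number of connections (in + out)
V(g)$size <- 15 + 5 * degree(g, mode = "all")

set.seed(1234)  # make the layout reproducible
plot(g,
     layout = layout_with_fr(g),
     vertex.color = "lightblue",
     vertex.label.color = "black",
     edge.arrow.size = 0.4,
     main = "Interaction network (toy data)")
```

Scaling node size by degree is one simple way to make the most connected participants stand out.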