Install Required Packages
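
The code further down calls `%>%` and `count()` (dplyr), `unnest_tokens()` (tidytext), and `wordcloud2()` (wordcloud2), so those three packages need to be installed once. A minimal sketch, assuming a CRAN mirror is already configured; the variable names `required` and `to_install` are illustrative only:

```r
# Packages used by the snippets in this guide
required <- c("dplyr", "tidytext", "wordcloud2")

# Install only the ones that are not already present
to_install <- setdiff(required, rownames(installed.packages()))
if (length(to_install) > 0) {
  install.packages(to_install)
}
```

Checking `installed.packages()` first avoids reinstalling packages every time the script runs.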

Create a table

The table needs:

  1. one row per transcript or interview
  2. one column containing the text

A simple format like this works well:

ID  Speaker    Transcript
1   Patient    I have been feeling tired lately and sleeping poorly
2   Patient    The medication helped but I still feel anxious
3   Clinician  Patient reports improvement in mood

In R, the same table can be built as a data frame:

df <- data.frame(
  ID = c(1, 2, 3),
  Speaker = c("Patient", "Patient", "Clinician"),
  Transcript = c(
    "I have been feeling tired lately and sleeping poorly",
    "The medication helped but I still feel anxious",
    "Patient reports improvement in mood"
  ),
  stringsAsFactors = FALSE  # keep text as character, not factor (the default before R 4.0)
)

Word Cloud

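library(dplyr)       # %>% and count()
library(tidytext)    # unnest_tokens()
library(wordcloud2)  # wordcloud2()

# One word per row, then count how often each word occurs
words <- df %>%
  unnest_tokens(word, Transcript) %>%
  count(word, sort = TRUE)

wordcloud2(words)  # first column = word, second column = frequency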
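
A cloud built from raw counts tends to be dominated by function words such as "I" and "the". One common option, not part of the steps above, is to drop stop words before counting. The sketch below assumes the `stop_words` dataset that ships with tidytext; the name `words_filtered` is illustrative, and `df` is rebuilt here only so the snippet runs on its own:

```r
library(dplyr)
library(tidytext)

# Same example data as above, repeated so this snippet is self-contained
df <- data.frame(
  ID = c(1, 2, 3),
  Speaker = c("Patient", "Patient", "Clinician"),
  Transcript = c(
    "I have been feeling tired lately and sleeping poorly",
    "The medication helped but I still feel anxious",
    "Patient reports improvement in mood"
  ),
  stringsAsFactors = FALSE
)

# Tokenize, remove rows whose word appears in stop_words, then count
words_filtered <- df %>%
  unnest_tokens(word, Transcript) %>%
  anti_join(stop_words, by = "word") %>%
  count(word, sort = TRUE)
```

The result has the same two-column shape as `words`, so it can be passed to `wordcloud2()` in the same way. Whether stop-word removal is appropriate depends on the analysis; in interview data, pronouns and negations can carry meaning.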