1. Introduction

Measurement in school

Since the inception of modern schooling, measurement has been a central feature of its dynamics (Foucault, 2002). In recent decades this tendency has intensified, in interaction with globalization and a technical-managerial conception of accountability in educational settings. But are we measuring what we value, or valuing what is easy to measure? (Biesta, 2014). In Uruguayan middle and high schools, as in other countries, there are very sophisticated mechanisms to evaluate the cognitive and behavioral aspects of institutional life, yet emotion-related content tends to be overlooked, even though personal well-being and social-emotional skills are considered central goals of the educational system. The resulting asymmetry in available information can bias the decision-making of teachers, principals and other stakeholders.

While many of the common methods to assess emotion could produce useful outcomes (Mauss & Robinson, 2009), self-report seems to fit the conditions and constraints of educational settings. However, a combination of structural and circumstantial reasons has led to the relative absence of emotion and emotion-related measurement in school settings in countries such as Uruguay. Among these reasons are the scarcity of scales culturally adapted across Spanish-speaking countries (Vallejo et al., 2017), the unavailability of educational psychologists to administer and interpret them, and the unfamiliarity with and/or skepticism about psychometric scales in the Uruguayan educational community.

This document presents preliminary data about the validity of an Emotion Words dictionary in Rioplatense Spanish, together with a simple procedure to apply it and obtain interpretable data about emotional experience and emotion-related content, using text mining tools on open-ended texts that enumerate emotion words (EW).

Background

This approach is conceptually based on the word frequency paradigm (Boyd, 2017; Vine, Boyd & Pennebaker, 2020). It assumes the “word as attention principle” (Boyd & Schwartz, 2021): the words used to describe emotional experience do not reflect the experience itself but the way the person attends to affective states. For example, a report about the main emotions experienced during a period of time should be interpreted as a window into the way the person attends to and reflects on their own experience, and only indirectly, and with caution, as a measure of actual emotional experience (see Mauss & Robinson, 2009).

The essence of the process is similar to LIWC (Pennebaker et al., 2015), the most common software in psychological text analysis, as both methods rely on word counting and dictionary-based analysis. It also shares features with the sentiment analysis (SA) tradition (Kwartler, 2017), and specifically with SA applied in education (Misuraca et al., 2020; Zhou & Ye, 2020). However, there are some differences, both in objectives and mechanisms. First, LIWC uses a highly complex dictionary with as many as 70 dimensions, while this dictionary focuses only on valence or polarity, understood in relation to pleasantness. Also, both LIWC and SA approaches are meant to analyze a wide range of different kinds of text; in contrast, this process centers on a very specific one: the enumeration of EW. In addition, while one of the main attributes of the SA and LIWC tradition is dealing with big or relatively big datasets, this method is well adapted to relatively small ones. Finally, this procedure contrasts with typical SA in education in the sample used (middle/high school, non-virtual education), the language of the text (Spanish) and the approach (dictionary-based + manual) (Zhou & Ye, 2020).

To show the entire process, we apply it to two ad hoc collections of answers to questions about emotional experience, given by Uruguayan middle/high school students (n=322, n=207) and teachers (n=61, n=33).

We briefly present some of the technical basis of the procedure's code, written in R (R Core Team, 2020), using RStudio (RStudio Team, 2020). The code and the final dictionary are presented for reuse and adaptation.

In the final part, we discuss some possible applications in educational settings. We also suggest possible uses of this process to integrate open-ended questions in traditional measures (both self-report and task-based) of emotional vocabulary and social-emotional skills. We conclude by making explicit some limitations of this approach.

2. Procedures and Methods

Methods

Context and objectives

During 2021, the COVID-19 pandemic forced Uruguayan educational authorities to suspend face-to-face teaching for the entire first semester. This situation deeply affected AUIC Middle and High School, in the city of Rosario. Before returning to face-to-face education in the second semester, it was considered useful to gather students' and teachers' reports about their emotional experience, the emotions associated with the school, the level of negative affectation they faced in different areas (such as mental health, economy, etc.), and the main sources of stress and well-being, both in general and at school. In this paper we only report the questions to which the procedure was applied.

The main objective was to get a clear and reliable picture of reported emotional experience, with procedures and results that could be easily interpreted by teachers and the principal's team. The process of gathering, preparing and analyzing the data was constrained by time pressures, with a maximum of three weeks to complete the entire process.

Design

The presented analysis can be described as instrumental research, as it is committed to the development and evaluation of the assessment procedure and dictionary. The project was pre-registered in OSF on 12/10/2021 (www.osf.io/7peb4).

The heart of the procedure could be expressed in four successive steps.

  • First, the formulation of open-ended questions that limit the form but not the content of answers.

  • Second, the gathering of the data.

  • Third, the processing of the subjects’ answers.

  • Fourth, the analysis of the dictionary.

Each of these stages is detailed in the next sections.

Formulation of questions and Instruments

An ad hoc questionnaire was created, with 3 demographic questions (role, gender and school grade), 1 closed-ended questionnaire about affectation during the pandemic, and 1 open-ended questions. In the second collection, PANAS and SWLS were used instead of the closed-ended questionnaire about affectation.

The open-ended questions to which the procedure was applied were written considering three main criteria: they use “transparent” language; they suggest an answer style (enumeration of emotion words) but not a content, encouraging people to use familiar expressions, even colloquial ones; and they allow a wide range of emotion-related concepts (emotions, but also moods and sentiments).

A previous version of the questions was discussed with teachers and the principal in a virtual meeting. In this article we focus on this question: What have been the main emotions and moods you have experienced from March until now? List as many as you can (original Spanish question: ¿Cuáles han sido las principales emociones y estados de ánimo que has experimentado desde marzo hasta ahora? Enumerá el máximo que puedas).

Data gathering

The data collection process was centralized by the principal's office, using the usual communication channels. The questionnaire, distributed via Google Forms, was active for one week in the first collection and three weeks in the second one. A message and two reminders were sent, inviting the school community to complete it. Considering privacy and other ethical issues, participation was voluntary and anonymous. Informed consent was signed by students' parents; teachers signed a digital consent form.

  • Collection 1 25/6/2021 - 2/7/2021. The sample was composed of 322 students (from second year of middle school to the last year of high school) and 61 teachers/school workers, representing 69% and 48% of the population. 257 participants identified themselves as women, 120 as men, 2 did not identify themselves as man or woman, and 4 preferred to not report their gender identity.

  • Collection 2 24/11/2021 - 24/12/2021. The sample was composed of 207 students (from first year of middle school to the last year of high school) and 33 teachers/school workers. 159 participants identified themselves as women, 73 as men, 4 did not identify themselves as man or woman, and 4 preferred to not report their gender identity.

Dictionary development

The construction of lexicons is one of the main tasks in the dictionary-based sentiment analysis and word count tradition. Dictionaries have been compiled both automatically and manually (Özdemir & Bergler, 2015). The manual methods can involve reasoned assignment by the author (Nielsen, 2011), crowdsourcing (Mohammad & Turney, 2013) or expert judges (Pennebaker et al., 2015).

In this particular dictionary, the words were extracted bottom-up, combining material from the data collections with other useful material generated in normal school activities (mainly anonymous self-evaluations and reflective writing homework). The classification was done manually by NM, considering the normal use of the words and the ratings of common words given by 29 students (independently of this study). The dictionary has three categories: positive (153), negative (235 stems) and ambivalent (68). The ambivalent list includes both emotional states with no clear valence (e.g. “pensativa”) and words that can be interpreted as positive or negative depending on the context (e.g. “mutación”, “resignificación”).

Both enrichment via text analysis and categorization by expert judges are planned as part of this project.

Data transformation procedures

Preparation.

Once the raw data is available, an RStudio session is prepared and the data is imported. We load the packages we need: the essential ones are the tidyverse (Wickham et al., 2019), tidytext (Silge & Robinson, 2016) and SnowballC (Bouchet-Valat, 2020). We also use ggwordcloud (Le Pennec & Slowikowski, 2019) and ggcorrplot (Kassambara, 2019) to create graphics, and rebus (Cotton, 2017) and tm (Feinerer & Hornik, 2020) for some details.

library(tm)
library(tidyverse)
library(stringr)
library(tidytext)
library(rebus)
library(viridis)
library(SnowballC)
library(syuzhet)
library(widyr)
library(ggraph)
library(ggwordcloud)
library(ggcorrplot)

# Load the data we are going to use
SegundaRecoleccionBruta <- read.csv("~/R/LexicoDeEmociones-2-Liceo.csv", header = TRUE, sep = ",", dec = ".", comment.char = "", strip.white = TRUE, stringsAsFactors = FALSE, encoding="UTF-8")

Primera_recoleccion_bruta <- read.csv("~/R/Estados de ánimo y Liceo.csv", header = TRUE, sep = ",", dec = ".", comment.char = "", strip.white = TRUE, stringsAsFactors = FALSE, encoding="UTF-8")

Diccionario_de_emociones_versiónenero2022 <- read.csv("~/R/Diccionario_de_emociones_versiónenero2022.csv", header = TRUE, sep = ",", dec = ".", comment.char = "", strip.white = TRUE, stringsAsFactors = FALSE, encoding="UTF-8")

Calculate negative affectation index.

We calculate a general index of negative affectation during the semester. This measure was obtained by summing each subject's scores on a 3-point Likert scale (0 to 2), reporting how much negative affectation they perceived in 8 areas of life (family life, friendship, school, extra activities, intimacy, economy, health and mental health). Missing answers are scored as 0, so the index ranges from 0 to 16. This is considered an imperfect but synthetic measure of the negative stimuli experienced during the last semester. It will be used to evaluate the external validity of the dictionary.

# Areas covered by the affectation questionnaire and numeric value of each answer
columnas_afectacion <- c("afect_familia", "afect_economia", "afect_salud", "afect_estudios", "afect_amistad", "afect_intimidad", "afect_saludmental", "afect_actividadesextra")
valores_afectacion <- c("Nada afectada" = 0, "Un poco afectada" = 1, "Muy afectada" = 2)

Primera_recoleccion <- Primera_recoleccion_bruta %>% 
  mutate(rol = ifelse(rol == "Soy docente o funcionario/a", "docente", "estudiante")) %>% 
  # Recode every answer to 0/1/2; missing or unrecognized answers count as 0
  mutate(across(all_of(columnas_afectacion), ~ coalesce(unname(valores_afectacion[as.character(.x)]), 0))) %>%
  # The index is the sum of the eight recoded areas
  mutate(indice_de_afectacion = afect_familia + afect_economia + afect_actividadesextra + afect_salud + afect_estudios + afect_amistad + afect_intimidad + afect_saludmental) %>% 
  select(rol, genero, clase, afect_familia, afect_economia, afect_salud, afect_estudios, afect_amistad, afect_intimidad, afect_saludmental, afect_actividadesextra, indice_de_afectacion, emociones_generales) %>% 
  mutate(sujeto = row_number())

Dictionary application

  • We remove punctuation, normalize to lowercase and tokenize.

  • Stemming. As we are interested in the content, it is necessary to reduce variability due to inflection. One approach is stemming. We use the wordStem function from the SnowballC package, with the language set to Spanish. It is important to note that this process often does not identify the actual linguistic stem. For example, “ansiedad” and “ansioso” are not reduced to a single stem; they produce “ansied” and “ansio”. It is also important to acknowledge that a misspelled word (e.g. “anciedad”) produces its own stem (“ancied”). For this reason the dictionary maps each stem to a reconstructed word, and more stems than reconstructed words are expected: “ansiedad”, “ansioso” and “anciedad” are all reconstructed as one term, “ansiedad”.

  • Inner join with dictionary. Each stem that appears both in the response and in the dictionary is kept, preserving all the metadata from PD and the information from the dictionary (reconstruction and polarity). If the dictionary is well constructed, this process should keep and annotate all meaningful words and filter out non-meaningful ones (noise).
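
The three steps above can be sketched on toy data. This is a minimal illustration, not the authors' code: the mini-dictionary and the two answers are hypothetical, and the dictionary stems are generated with the same stemmer applied to the answers so that both sides of the join are comparable. The anti_join at the end is an addition that lists the tokens discarded by the dictionary, which may help when reviewing candidates for new entries.

```r
library(dplyr)
library(tidytext)
library(SnowballC)

# Hypothetical mini-dictionary (the real one has many more entries)
diccionario_ejemplo <- tibble(
  reconstruccion = c("ansiedad", "calma", "miedo"),
  polaridad      = c("negativa", "positiva", "negativa")
) %>%
  mutate(raiz = wordStem(reconstruccion, language = "spanish"))

# Hypothetical answers from two subjects
respuestas_ejemplo <- tibble(
  sujeto = c(1, 2),
  emociones_generales = c("Ansiedad, miedo y calma.", "Miedo; mucha ansiedad")
)

# Tokenize (lowercases and strips punctuation), then stem every token
tokens <- respuestas_ejemplo %>%
  unnest_tokens(word, emociones_generales, to_lower = TRUE) %>%
  mutate(raiz = wordStem(word, language = "spanish"))

# Keep and annotate only the tokens whose stem is in the dictionary
anotadas <- tokens %>% inner_join(diccionario_ejemplo, by = "raiz")

# Tokens dropped by the join: noise, or words missing from the dictionary
descartadas <- tokens %>% anti_join(diccionario_ejemplo, by = "raiz")
```

In this toy case the connective "y" and the quantifier "mucha" end up in descartadas, while the five emotion words are kept with their polarity.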

# Compute word proportions
#First collection

Primera_recoleccion_preparada <- Primera_recoleccion %>% 
  unnest_tokens(word, emociones_generales, to_lower = TRUE) %>%
  mutate(raiz = wordStem(word, "spanish")) %>% 
  inner_join(Diccionario_de_emociones_versiónenero2022, by = c("raiz" = "raiz")) %>% 
  group_by(sujeto) %>% 
  count(polaridad, rol, genero, clase, afect_familia, afect_economia, afect_salud, afect_estudios, afect_amistad, afect_intimidad, afect_saludmental, afect_actividadesextra, indice_de_afectacion) %>% 
  spread(polaridad, n) %>% 
  mutate(ambivalente = ifelse(is.na(ambivalente), 0, ambivalente), positiva = ifelse(is.na(positiva), 0, positiva), negativa = ifelse(is.na(negativa), 0, negativa))%>%
  mutate(palabras_escritas = sum(ambivalente, negativa, positiva), ambivalentes = ambivalente/palabras_escritas, negativas = negativa/palabras_escritas, positivas = positiva/palabras_escritas) %>%
  ungroup()

#Second collection

Segunda_recoleccion_preparada <- SegundaRecoleccionBruta %>% 
  select(Género, Rol, Clase, Emociones, EmocionesLiceo , AP , AN , SWLS) %>% 
  mutate(Emociones = removeNumbers(Emociones), EmocionesLiceo = removeNumbers(EmocionesLiceo)) %>% 
  mutate(sujeto = row_number()) %>% 
  unnest_tokens(word, Emociones, to_lower = TRUE) %>%
  mutate(raiz = wordStem(word, "spanish")) %>% 
  inner_join(Diccionario_de_emociones_versiónenero2022, by = c("raiz" = "raiz")) %>% 
  group_by(sujeto) %>% 
  count(polaridad, Género, Rol, AN, AP, SWLS) %>% 
  spread(polaridad, n) %>% 
  mutate(ambivalente = ifelse(is.na(ambivalente), 0, ambivalente), positiva = ifelse(is.na(positiva), 0, positiva), negativa = ifelse(is.na(negativa), 0, negativa))%>%
  mutate(palabras_escritas = sum(ambivalente, negativa, positiva), ambivalentes = ambivalente/palabras_escritas, negativas = negativa/palabras_escritas, positivas = positiva/palabras_escritas) %>%
  mutate(interac_negativo = negativas * AN, interac_positivo = positivas * AP) %>%
  ungroup()
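
The counting and proportion steps can be illustrated on a small hypothetical table. The sketch below is an alternative formulation, not the code used in the study: it relies on tidyr's pivot_wider, whose values_fill argument fills the missing polarity counts with 0 and so replaces the separate ifelse(is.na(...)) step. As with spread, a polarity that never occurs in the data would not get a column, so the sketch assumes all three appear at least once.

```r
library(dplyr)
library(tidyr)

# Hypothetical annotated tokens: one row per matched emotion word
anotadas_ejemplo <- tibble(
  sujeto    = c(1, 1, 1, 2, 2),
  polaridad = c("negativa", "negativa", "positiva", "negativa", "ambivalente")
)

proporciones_ejemplo <- anotadas_ejemplo %>%
  count(sujeto, polaridad) %>%
  # One column per polarity; absent combinations become 0 instead of NA
  pivot_wider(names_from = polaridad, values_from = n, values_fill = 0) %>%
  mutate(
    palabras_escritas = ambivalente + negativa + positiva,
    ambivalentes = ambivalente / palabras_escritas,
    negativas    = negativa / palabras_escritas,
    positivas    = positiva / palabras_escritas
  )
```

Subject 1 wrote three matched words, two of them negative, so negativas is 2/3; subject 2 wrote two, one negative and one ambivalent, so negativas is 0.5 and positivas is 0.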

Dictionary and prepared datasets

The dictionary: its present version (January, 2022)

Diccionario_de_emociones_versiónenero2022

Prepared Data: First Collection

Primera_recoleccion_preparada

Prepared Data: Second Collection

Segunda_recoleccion_preparada

3. Results: assessing validity

Predictive validity

Correlation with self-reported difficulties experienced during the pandemic

We exclude the 25% of respondents who wrote the fewest emotion words (as the measure is based on proportions, too few words can bias it).

paracorrelacionar_Primera_recoleccion <- Primera_recoleccion_preparada %>% filter(palabras_escritas >= 3) %>% select(negativas, ambivalentes, positivas, indice_de_afectacion) 
matrizcorrelacion_Primera_recoleccion <- paracorrelacionar_Primera_recoleccion  %>% cor() %>% data.frame()

p.mat_Primera_recoleccion <- cor_pmat(paracorrelacionar_Primera_recoleccion)

ggcorrplot(matrizcorrelacion_Primera_recoleccion, p.mat = p.mat_Primera_recoleccion, type = "upper", outline.col = "white", lab = TRUE) +
  ggtitle("Correlación entre emociones y afectación") +
  theme_classic() +
  labs(y="", x="") +
  theme(legend.position="none")
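
The filter in the listing uses a fixed threshold (palabras_escritas >= 3), while the text describes it as removing the 25% of respondents with the fewest words. How the threshold was chosen is not stated; one possible way to derive it from the data, shown here on hypothetical counts and using only base R, is to take the first quartile as the cutoff. With many tied counts this rule can remove more or less than exactly 25% of the rows.

```r
# Hypothetical number of matched emotion words per respondent
palabras_escritas <- c(1, 1, 2, 3, 3, 4, 5, 6)

# First quartile of the word counts (default quantile type in R)
corte <- quantile(palabras_escritas, probs = 0.25, names = FALSE)

# Keep respondents strictly above the cutoff
conservar <- palabras_escritas > corte
```

Here corte is 1.75, so the two respondents with a single word are excluded and the remaining six are kept.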

Correlation with SWLS

We exclude the 25% of respondents who wrote the fewest emotion words (as the measure is based on proportions, too few words can bias it).

paracorrelacionar_Segunda_recoleccion <- Segunda_recoleccion_preparada %>% filter(palabras_escritas >= 2) %>% select(ambivalentes, negativas, positivas, AP, AN, SWLS)

matrizcorrelacion_Segunda_recoleccion <- paracorrelacionar_Segunda_recoleccion  %>% cor() %>% data.frame()

p.mat_Segunda_recoleccion <- cor_pmat(paracorrelacionar_Segunda_recoleccion)

ggcorrplot(matrizcorrelacion_Segunda_recoleccion, p.mat = p.mat_Segunda_recoleccion, type = "upper", outline.col = "white", lab = TRUE) +
  ggtitle("Correlación con SWLS") +
  labs(y="", x="", subtitle = 
  "  SWLS: satisfacción con la vida 
  AP:Afecto positivo (PANAS)
  AN: Afecto negativo (PANAS)
  positivas/negativas/ambivalentes: proporción de palabras") +
  theme_classic() +
  theme(legend.position="none")

The proportion of EW correlates significantly with SWLS and with the difficulties experienced during the pandemic. In the case of SWLS, the correlation is equivalent to the one exhibited between PANAS and SWLS.
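
To make explicit what a single cell of these correlation plots represents, the following base R sketch computes one Pearson coefficient and its p-value with cor.test. The data are hypothetical and serve only to illustrate the calculation; they are not results from the study.

```r
# Hypothetical values: proportion of positive words and SWLS score
positivas_ejemplo <- c(0.10, 0.25, 0.40, 0.50, 0.65, 0.80, 0.30, 0.55)
swls_ejemplo      <- c(12, 15, 20, 19, 26, 30, 14, 24)

prueba <- cor.test(positivas_ejemplo, swls_ejemplo, method = "pearson")

r_ejemplo <- unname(prueba$estimate)  # same value that cor() returns
p_ejemplo <- prueba$p.value           # two-sided test of zero correlation
```

The coefficient is the number printed in the plot cell, and the p-value is what decides whether the cell is marked as non-significant.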

Interaction

Interestingly, the interaction between PANAS and EW shows a higher correlation. The difference is small, however, and this relationship should be explored in future research. In any case, the possible complementarity between Likert scales and automated analysis of open-ended answers is an important topic for discussion.

The relation between affectation, emotional experience and life satisfaction is to be expected (Forte et al., 2021). While the weak correlation could be surprising, it is consistent with classical philosophy (e.g. Marco Aurelio, 2005) and psychological research (e.g. Diener, Suh, Lucas & Smith, 1999) on the modest influence of external events, compared with personal traits, on emotional experience, and on the existence of factors besides emotion that explain happiness understood as satisfaction with life.

paracorrelacionar_Segunda_recoleccion3 <- Segunda_recoleccion_preparada %>% filter(palabras_escritas >= 2) %>% select(interac_negativo, interac_positivo, SWLS)

matrizcorrelacion_Segunda_recoleccion3 <- paracorrelacionar_Segunda_recoleccion3  %>% cor() %>% data.frame()

p.mat_Segunda_recoleccion3 <- cor_pmat(paracorrelacionar_Segunda_recoleccion3)

ggcorrplot(matrizcorrelacion_Segunda_recoleccion3, p.mat = p.mat_Segunda_recoleccion3, type = "upper", outline.col = "white", lab = TRUE) +
  ggtitle("Correlación de interacción open-likert con SWLS") +
  labs(y="", x="", subtitle = 
  "  SWLS: satisfacción con la vida 
  Interac_negativo= negativas * AN
  Interac_positivo= positivas * AP
  ") +
  theme_classic() +
  theme(legend.position="none")

Difference in emotion report between First and Second Collection

As expected, statistically significant differences were found between the measurement performed in a crisis situation (first collection) and the one performed in a normal situation (second collection).

t.test(Primera_recoleccion_preparada$positivas, Segunda_recoleccion_preparada$positivas, alternative = "two.sided", var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  Primera_recoleccion_preparada$positivas and Segunda_recoleccion_preparada$positivas
## t = -8.3033, df = 386.02, p-value = 1.727e-15
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.2733170 -0.1686617
## sample estimates:
## mean of x mean of y 
## 0.2519993 0.4729886
t.test(Primera_recoleccion_preparada$negativas, Segunda_recoleccion_preparada$negativas, alternative = "two.sided", var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  Primera_recoleccion_preparada$negativas and Segunda_recoleccion_preparada$negativas
## t = 9.3979, df = 391.26, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.2018239 0.3086060
## sample estimates:
## mean of x mean of y 
## 0.7203693 0.4651544
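
For readers less familiar with the test, the sketch below reproduces by hand the statistic and degrees of freedom that t.test reports when var.equal = FALSE (Welch's test). The two vectors are hypothetical; only base R is used.

```r
# Hypothetical proportions of negative words in two collections
x <- c(0.80, 0.70, 0.90, 0.60, 0.75, 0.85)
y <- c(0.40, 0.55, 0.50, 0.35, 0.60)

# Squared standard error of each mean, without assuming equal variances
se2_x <- var(x) / length(x)
se2_y <- var(y) / length(y)

# Welch t statistic
t_manual <- (mean(x) - mean(y)) / sqrt(se2_x + se2_y)

# Welch-Satterthwaite approximation of the degrees of freedom
gl_manual <- (se2_x + se2_y)^2 /
  (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))

resultado <- t.test(x, y, alternative = "two.sided", var.equal = FALSE)
```

t_manual and gl_manual match resultado$statistic and resultado$parameter, which explains why the degrees of freedom in the output above are not whole numbers.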

Convergent validity

Correlation with PANAS

paracorrelacionar_Segunda_recoleccion2 <- Segunda_recoleccion_preparada %>% filter(palabras_escritas >= 2) %>% select(ambivalentes, negativas, positivas, AP, AN)

matrizcorrelacion_Segunda_recoleccion2 <- paracorrelacionar_Segunda_recoleccion2  %>% cor() %>% data.frame()

p.mat_Segunda_recoleccion2 <- cor_pmat(paracorrelacionar_Segunda_recoleccion2)

ggcorrplot(matrizcorrelacion_Segunda_recoleccion2, p.mat = p.mat_Segunda_recoleccion2, type = "upper", outline.col = "white", lab = TRUE) +
  ggtitle("Correlación con PANAS") +
  labs(y="", x="") +
  theme_classic() +
  theme(legend.position="none")

Convergence with other dictionaries

Contingency table of words. Relation with other dictionaries (AFINN, NRC, LIWC)

To assess the convergent validity of the valence assignment, we compare it with three widely used lexicons: AFINN (Nielsen, 2011), NRC (Mohammad & Turney, 2013) and LIWC (Pennebaker et al., 2015), in their Spanish versions. We create an 8196-term lexicon that combines a simplified version of AFINN (where integer scores were transformed into positive/negative categories) and the positive/negative categories in NRC.

195 of the terms were present in this mixed lexicon. In this subset of words, the discrepancy is very small: only three words were categorized as positive in the dictionary and negative in AFINN/NRC. The words were “compromiso”, “fuerte” and “imponente”, which certainly have a positive valence in this context. In the opposite direction, the discrepancies were “sueño” and “inquietud”, which clearly have a negative valence in this context (“inquietud” as a reference to stress, and “sueño” as the need to sleep).

# Load AFINN
afinn <- read.csv("~/R/lexico_afinn.csv", header = TRUE, sep = ",", dec = ".", comment.char = "", strip.white = TRUE, stringsAsFactors = FALSE, encoding="UTF-8") %>% mutate(sentimiento = ifelse(puntuacion < 0, "negativo", "positivo")) %>% mutate(word = palabra, puntuacion = NULL, palabra = NULL) 

# Load NRC
positive_or_negative <- c("positive", "negative")
nrc <- read.csv("~/R/lexico_nrc.csv", header = TRUE, sep = ",", dec = ".", comment.char = "", strip.white = TRUE, stringsAsFactors = FALSE, encoding= "UTF-8") %>% filter(sentiment %in% positive_or_negative) %>% mutate(sentimiento = ifelse(sentiment == "positive", "positivo", "negativo"), sentiment = NULL, X = NULL, value = NULL)

# Load LIWC
liwc_simple <- read.csv("~/R/liwc_negativas_y_positivas_tidy.csv", header = TRUE, sep = ",", dec = ".", comment.char = "", strip.white = TRUE, stringsAsFactors = FALSE, encoding="UTF-8") %>% mutate(sentimiento = polaridad, X=NULL, polaridad=NULL)

# Build a combined lexicon
afinn_nrc_liwc <- bind_rows(afinn, nrc, liwc_simple) %>% mutate(sentimiento = str_replace_all(sentimiento, "negativo", "negativa")) %>% mutate(sentimiento = str_replace_all(sentimiento, "positivo", "positiva"))

# Join the dictionary with afinn_nrc_liwc
diccionarioconjunto <- Diccionario_de_emociones_versiónenero2022 %>% inner_join(afinn_nrc_liwc, by = c("reconstruccion" = "word")) %>% count(reconstruccion, polaridad, sentimiento)

# Build the contingency table
contingencia <- table(diccionarioconjunto$polaridad, diccionarioconjunto$sentimiento)
contingenciaporcentual <- prop.table(table(diccionarioconjunto$polaridad, diccionarioconjunto$sentimiento)) * 100
contingenciaporcentual
##              
##                 negativa   positiva
##   ambivalente  6.3725490  7.3529412
##   negativa    47.5490196  0.9803922
##   positiva     1.4705882 36.2745098
# Filter the words whose labels do not match
diccionarioconjunto %>% filter(sentimiento=="negativa", polaridad == "positiva")
diccionarioconjunto %>% filter(sentimiento=="positiva", polaridad == "negativa")
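
As a worked reading of the contingency table, the following base R sketch derives an agreement rate from the percentages printed above. It is an illustrative calculation added here, restricted by assumption to the words that have a definite polarity (positive or negative) in the dictionary, since the external lexicons have no ambivalent category.

```r
# Percentages copied from the contingency table printed above
contingencia_reportada <- matrix(
  c( 6.3725490,  7.3529412,
    47.5490196,  0.9803922,
     1.4705882, 36.2745098),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ambivalente", "negativa", "positiva"),
                  c("negativa", "positiva"))
)

# Restrict to words with a definite polarity in the dictionary
definidas <- contingencia_reportada[c("negativa", "positiva"), ]

# Share of those words that carry the same label in both sources
acuerdo <- sum(diag(definidas)) / sum(definidas)
```

With the reported figures, acuerdo is about 0.97, that is, roughly 97% of the non-ambivalent shared entries receive the same polarity in both sources.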

4. Discussion

This article presents a procedure that goes from open-ended answers enumerating EW to a clean dataset that provides qualitative insight into emotional vocabulary and is amenable to quantitative analysis. As shown, the process is simple and rule-based, overcoming two of the difficulties of manual coding: the lack of transparency of the process and the time it consumes. As mentioned in the Introduction, it also provides a way to deal with some frequent biases of closed-ended questionnaires.

What could such a procedure be used for? As a general procedure, it could be applied in a wide range of situations. In the school setting specifically, some uses could be:

  • Regular tracking of emotional experience in school, or emotions associated with concrete aspects of school (teachers, activities, grade system, etc.).

  • Experience sampling in school (for example, to evaluate an intervention program).

  • Use in combination with other measures (e.g. emotional intelligence, IQ, socioeconomic variables, etc.), providing insight into the factors related with emotional experience in school.

  • Development of task-based measures of emotional skills that include open-ended questions. The obvious use is in the evaluation of emotional vocabulary, but more refined options are viable. For example, in a task of recognizing emotion in faces or eyes (as in Baron-Cohen et al., 2001), instead of giving a list of emotions it is possible to ask participants to write their own words, using and updating a dictionary to capture variation in vocabulary.

It is important to weigh some of the limitations and strengths of this approach. As it starts from unstructured data, it implies some extra work compared with a closed-ended questionnaire. Also, while this approach can avoid or deal with some forms of bias, if it is used in a self-report questionnaire it is still susceptible to some of the most common biases in the design and administration of self-report questionnaires (Choy & Pak, 2005). It is also fundamental to note that this procedure is centered on unigrams; some adaptations would be needed to work with n-grams.
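
As a pointer for the n-gram limitation, the sketch below shows how the tokenization step could produce bigrams instead of single words, using the token = "ngrams" option of tidytext's unnest_tokens. This is only a starting point under stated assumptions: the answer is hypothetical, and a dictionary with multi-word entries (and a decision on how to stem them) would still have to be built.

```r
library(dplyr)
library(tidytext)

# Hypothetical answer from one subject
respuestas_ejemplo <- tibble(
  sujeto = 1,
  emociones_generales = "mucha ansiedad y miedo"
)

# Tokenize into overlapping pairs of consecutive words
bigramas <- respuestas_ejemplo %>%
  unnest_tokens(bigrama, emociones_generales, token = "ngrams", n = 2)
```

The four-word answer yields three bigrams ("mucha ansiedad", "ansiedad y", "y miedo"), which could then be matched against multi-word dictionary entries.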

5. Conclusion

This article describes a text analysis procedure that transforms open-ended answers into a dataset amenable to quantitative analysis. We explore this procedure by applying it to two ad hoc datasets of answers about emotional experience from students and teachers of a high school in Rosario, Uruguay.

Briefly, the procedure involves a precise formulation of the question, preprocessing of the data, and the updating and application of a dictionary. If the dictionary is updated properly, the output should be robust against regionalisms, spelling errors, typos and slang. It is also expected that, after the procedure has been applied several times in a given population, the dictionary will tend toward stability (requiring minimal manual work).

Even with its multiple limitations, this procedure offers interesting possibilities for assessment in educational settings. It could be used on its own, or as a complement to self-report scales, task-based measurements and open-ended questions. Also, as a general process, it could be integrated into task-based emotional skill measurement systems.

References

Baron-Cohen, S., Wheelwright, S., Hill, J., Raste, Y. & Plumb, I. (2001). The “Reading the Mind in the Eyes” Test Revised Version: A Study with Normal Adults, and Adults with Asperger Syndrome or High-functioning Autism. Journal of Child Psychology and Psychiatry, 42(2), 241–251.

Boyd, R. (2017) Psychological Text Analysis in the Digital Humanities. In: Hai-Jew S. (eds) Data Analytics in Digital Humanities. Multimedia Systems and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-54499-1_7

Boyd, R. & Schwartz, H. (2021). Natural language analysis and the psychology of verbal behavior: The past, present, and future states of the field. Journal of Language and Social Psychology, 40(1), 1–21.

Bouchet-Valat, M. (2020). Snowball Stemmers Based on the C “libstemmer” UTF-8 Library. https://cran.r-project.org/web/packages/SnowballC/SnowballC.pdf

Biesta, G. (2014). Measuring what we Value or Valuing what we Measure? Globalization, Accountability and the Question of Educational Purpose. Pensamiento Educativo. Revista de Investigación Educacional Latinoamericana, 51(1), 46-57.

Choi, B. & Pak, A. (2005). A catalog of biases in questionnaires. Preventing Chronic Disease, 2(1), A13.

Cotton, R. (2017). rebus: Build Regular Expressions in a Human Readable Way. R package version 0.1-3. https://CRAN.R-project.org/package=rebus

Diener, E., Suh, E. M., Lucas, R. E. & Smith, H. L. (1999). Subjective Well-Being: Three Decades of Progress. Psychological Bulletin, 125(2), 276–302.

Feinerer, I. & Hornik, K. (2020). tm: Text Mining Package. R package version 0.7-8. https://CRAN.R-project.org/package=

Foucault, M. (2002). Vigilar y castigar. Buenos Aires: Siglo XXI.

Forte, A., Orri, M., Brandizzi, M., Iannaco, C., Venturini, P., Liberato, D., … Monducci, E. (2021). “My Life during the Lockdown”: Emotional Experiences of European Adolescents during the COVID-19 Crisis. International Journal of Environmental Research and Public Health, 18(14), 7638.

Kassambara, A. (2019). ggcorrplot: Visualization of a Correlation Matrix using ‘ggplot2’. R package version 0.1.3. https://CRAN.R-project.org/package=ggcorrplot

Kwartler, T. (2016). Sentiment analysis in R. Datacamp course. https://www.datacamp.com/courses/sentiment-analysis-in-r

Kwartler, T. (2017). Text mining in R. Sussex: Wiley & Sons.

Le Pennec, E. & Slowikowski, K. (2019). ggwordcloud: A Word Cloud Geom for ‘ggplot2’. R package version 0.5.0. https://CRAN.R-project.org/package=ggwordcloud

Mauss, I. & Robinson, M. (2009). Measures of emotion: A review. Cognition and Emotion, 23(2), 209–237.

Marco Aurelio (2005). Meditaciones. Madrid: Gredos.

Mohammad, S. M., & Turney, P. D. (2013). Crowdsourcing a word–emotion association lexicon. Computational Intelligence, 29(3), 436–465.

Misuraca, M., Forciniti, A., Scepi, G., & Spano, M. (2020). Sentiment Analysis for Education with R: packages, methods and practical applications. ArXiv, abs/2005.12840.

Nielsen, F. A. (2011). A new ANEW: evaluation of a word list for sentiment analysis in microblogs. In Proceedings of the ESWC2011 Workshop on ’Making Sense of Microposts’: Big things come in small packages, Heraklion, Crete, Greece, May 30, 2011, 93–98.

Özdemir, C. & Bergler, S. (2015). A Comparative Study of Different Sentiment Lexica for Sentiment Analysis of Tweets. Proceedings of Recent Advances in Natural Language Processing, 488–496, Hissar, Bulgaria.

Pennebaker, J. W., Boyd, R. L., Jordan, K., & Blackburn, K. (2015). The development and psychometric properties of LIWC2015. Austin: University of Texas at Austin.

R Core Team (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

RStudio Team (2020). RStudio: Integrated Development for R. RStudio, PBC, Boston, MA. URL: http://www.rstudio.com/.

Silge, J. & Robinson, D. (2016). tidytext: Text Mining and Analysis Using Tidy Data Principles in R. JOSS, 1(3). doi: 10.21105/joss.00037, http://dx.doi.org/10.21105/joss.00037.

Vallejo, P., Gómez, M., Machal, L., Saavedra, A., Soler, F. & Morales, A. (2017). Developing Guidelines for Adapting Questionnaires into the Same Language in Another Culture. Terapia Psicológica, 35(2), 181-19.

Vine, V., Boyd, R. & Pennebaker, J. W. (2020). Natural emotion vocabularies as windows on distress and well-being. Nature Communications, 11, 4525.

Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54(6), 1063–1070.

Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T., Miller, E., Bache, S., Müller, K., Ooms, J., Robinson, D., Seidel, D., Spinu, V., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686.

Zhou, J., & Ye, J. (2020). Sentiment analysis in education research: a review of journal publications. Interactive Learning Environments, 1–13. https://doi.org/10.1080/10494820.2020.1826985