Week 09 - Posit Primer

Introduction

This visualization examines how the positive and negative emotional tone of students' discussion posts relates to their final grades across subjects.

Prepare the Environment

Load packages:

library(tidyverse)
library(ggplot2)

Load in the data:

course_text <- read_csv("data/course-text.csv")

Data Wrangling

Select columns of interest, name the subjects for easier interpretation, and drop NA values.

emotion_data <- course_text %>%
  select(subject,
         final_grade,
         posemo, 
         negemo) %>%
  mutate(subject = recode(subject, 
                          "AnPhA" = "Anatomy",
                          "BioA" = "Biology", 
                          "FrScA" = "Forensics", 
                          "OcnA" =  "Oceanography", 
                          "PhysA" = "Physics")) %>%
  drop_na()
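
To see what the recode and drop steps do, here is a minimal sketch on a small made-up data frame. The `toy` values are invented for illustration and are not taken from course-text.csv; only the column names and subject codes mirror the pipeline above. Note that drop_na() with no arguments removes any row that has an NA in any of the selected columns, so it is worth checking how many rows are lost.

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  subject     = c("AnPhA", "BioA", "PhysA", "OcnA"),
  final_grade = c(88, NA, 72, 95),
  posemo      = c(4.2, 3.1, 5.0, NA),
  negemo      = c(0.5, 1.2, 0.0, 0.8)
)

toy_clean <- toy %>%
  # Replace course codes with readable subject names
  mutate(subject = recode(subject,
                          "AnPhA" = "Anatomy",
                          "BioA"  = "Biology",
                          "PhysA" = "Physics",
                          "OcnA"  = "Oceanography")) %>%
  # Drop every row with an NA in any column
  drop_na()

# Number of rows removed by drop_na()
rows_dropped <- nrow(toy) - nrow(toy_clean)
```

Running the same before-and-after row count on `course_text` and `emotion_data` would show how much of the original data the plot actually represents.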

Visualize

Now plot the data:

ggplot(emotion_data, aes(x = negemo, y = posemo, color = final_grade)) +
  geom_point() +
  scale_color_gradient(low = "blue", high = "green") + 
  # Axis limits exclude extreme scores; points outside them are dropped
  ylim(0, 11) +
  xlim(0, 3) +
  labs(title = "Relationship between Negative and Positive Emotions by Subject",
       x = "Negative Emotion Score",
       y = "Positive Emotion Score",
       color = "Final Grade") +
  theme_minimal() + 
  facet_wrap(~subject)
Warning: Removed 9 rows containing missing values (`geom_point()`).

Communicate

The graph shows that in courses like Oceanography, Forensics, and Anatomy, students with no negative emotion scores seem to have lower grades. However, this could be because these students participated less in the discussion posts, which would itself lower their grade. In Physics, students whose discussion posts have a more negative tone are likely to have lower grades. Strangely, in Biology, students with more extreme positive and negative scores appear likely to have lower grades.
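
One way to check visual impressions like these is a per-subject summary. The sketch below uses a made-up `toy` data frame (hypothetical values, not the course data) and computes the correlation between negative emotion score and final grade within each subject. This is only one possible check, and it assumes a roughly linear relationship; a pattern like the one described for Biology, where both extremes go with lower grades, would not show up in a correlation coefficient.

```r
library(dplyr)

# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  subject     = rep(c("Physics", "Biology"), each = 4),
  negemo      = c(0.0, 0.5, 1.0, 1.5, 0.0, 0.5, 1.0, 1.5),
  final_grade = c(90, 85, 80, 75, 70, 75, 80, 85)
)

by_subject <- toy %>%
  group_by(subject) %>%
  summarise(
    n = n(),                                   # posts per subject
    cor_neg_grade = cor(negemo, final_grade),  # linear association only
    .groups = "drop"
  )
```

Applied to `emotion_data`, the same group_by() and summarise() call would give one row per subject to compare against the facets in the plot.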

These variations could reflect differences in how discussion is used in each course, in the subject matter, or in the classes themselves. They may also reflect how different teachers grade discussion posts. Further analysis of the proportion of points completed, the number of discussion posts, or specific emotions may help distinguish between these explanations.

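The clarity of these visualizations was hindered by extreme values, such as positive scores above 15 or negative scores above 3. Restricting the axis limits made the bulk of the data readable, at the cost of dropping the nine points that fell outside those limits (the rows reported in the warning above).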
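
As a rough sketch of how the axis limits interact with the data, the code below counts the rows that fall outside the limits used in the plot and shows coord_cartesian() as an alternative. The `toy` values are made up for illustration. Unlike xlim() and ylim(), which remove out-of-range rows before drawing, coord_cartesian() only zooms the view: the points outside the window are still not visible, but they stay in the data, so no rows are dropped and no warning is raised.

```r
library(dplyr)
library(ggplot2)

# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  negemo = c(0.2, 1.1, 3.4, 0.9, 2.5),
  posemo = c(4.0, 15.6, 6.2, 10.9, 11.3)
)

# Rows that xlim(0, 3) and ylim(0, 11) would remove before plotting
outside <- toy %>%
  filter(negemo < 0 | negemo > 3 | posemo < 0 | posemo > 11)
n_outside <- nrow(outside)

# Zoom the view instead of removing rows
p <- ggplot(toy, aes(x = negemo, y = posemo)) +
  geom_point() +
  coord_cartesian(xlim = c(0, 3), ylim = c(0, 11))
```

Running the same filter() on `emotion_data` would list exactly which posts were excluded from the figure.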