LA-1 Presentation

Authors

Nagendra P, Yashwanth

Problem Statement : Use a beeswarm plot to represent exam score distributions across departments.

Objective:

This R script visualizes exam and course grade distributions using beeswarm plots.

It generates mock student data with exam scores, sex, and semester.

Course grades are computed as the average of three exam scores.

Beeswarm plots are created to compare grade distributions by sex and semester, with optional boxplot overlays.

Step 1 : Load the required library

# install.packages("beeswarm")  # only once, if the package is not yet installed
library(beeswarm)

This package allows you to create beeswarm plots, which are useful for visualizing the distribution of a numeric variable grouped by categories (e.g., grades by sex or semester).

Step 2 : Generate mock student data

set.seed(123)
student_data <- data.frame(
  semester = rep(c("Spring", "Fall"), length.out = 233),
  sex = sample(c("Man", "Woman"), 233, replace = TRUE),
  exam1 = round(runif(233, 50, 100), 1),
  exam2 = round(runif(233, 50, 100), 1),
  exam3 = round(runif(233, 50, 100), 1)
)

# Course grade as average
student_data$course_grade <- round(rowMeans(student_data[, c("exam1", "exam2", "exam3")]), 1)

This code creates a data frame student_data with 233 students. Each student has the following:

  • semester: Alternates between “Spring” and “Fall”.

  • sex: Randomly assigned as “Man” or “Woman”.

  • exam1, exam2, exam3: Random scores between 50 and 100 for three exams.

  • course_grade: The average of the three exam scores, rounded to one decimal place.

  • set.seed(123): Fixes the random number generator so the random values are reproducible.
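The properties listed above can be checked directly. The following is a minimal sketch in base R; the helper name make_students is hypothetical and only wraps the Step 2 code so that it can be run twice with the same seed.

```r
# Hypothetical helper: wraps the Step 2 code so it can be called repeatedly.
make_students <- function(n = 233, seed = 123) {
  set.seed(seed)
  d <- data.frame(
    semester = rep(c("Spring", "Fall"), length.out = n),
    sex = sample(c("Man", "Woman"), n, replace = TRUE),
    exam1 = round(runif(n, 50, 100), 1),
    exam2 = round(runif(n, 50, 100), 1),
    exam3 = round(runif(n, 50, 100), 1)
  )
  d$course_grade <- round(rowMeans(d[, c("exam1", "exam2", "exam3")]), 1)
  d
}

student_data <- make_students()

str(student_data)                 # column types and a short preview
table(student_data$semester)      # 233 is odd, so Spring has 117 rows and Fall 116
range(student_data$course_grade)  # an average of scores in [50, 100] stays in [50, 100]

# Same seed, same data: both calls return identical data frames.
identical(student_data, make_students())
```

Changing the seed argument gives a different but equally reproducible data set.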

Step 3 : Beeswarm plot: Course grade distribution by sex

beeswarm(course_grade ~ sex, data = student_data,
         col = c("skyblue", "pink"),
         pch = 16,
         main = "Course Grades by Sex",
         xlab = "Sex",
         ylab = "Course Grade")

The code creates a beeswarm plot showing course grades by sex:

  • x-axis: Sex (Man or Woman).

  • y-axis: Course grades.

  • Colors: “skyblue” for men, “pink” for women. Colors are assigned to groups in alphabetical order (Man, then Woman).

  • Plot style: Solid circles (pch = 16), with a title and axis labels.
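The col vector is matched to groups by position, so the color assignment depends on the group order. One way to make the mapping explicit is to look colors up by group name. This is a sketch, not part of the original script; the names sex_colors and plot_colors are introduced here for illustration, and the plot is drawn only if the beeswarm package is installed.

```r
# Rebuild the Step 2 mock data so this block runs on its own.
set.seed(123)
student_data <- data.frame(
  semester = rep(c("Spring", "Fall"), length.out = 233),
  sex = sample(c("Man", "Woman"), 233, replace = TRUE),
  exam1 = round(runif(233, 50, 100), 1),
  exam2 = round(runif(233, 50, 100), 1),
  exam3 = round(runif(233, 50, 100), 1)
)
student_data$course_grade <- round(
  rowMeans(student_data[, c("exam1", "exam2", "exam3")]), 1)

# Character groups are sorted alphabetically; this is the x-axis order.
group_order <- levels(factor(student_data$sex))
group_order

# Hypothetical named lookup: the color is chosen by group name, not position.
sex_colors <- c(Woman = "pink", Man = "skyblue")
plot_colors <- unname(sex_colors[group_order])

if (requireNamespace("beeswarm", quietly = TRUE)) {
  beeswarm::beeswarm(course_grade ~ sex, data = student_data,
                     col = plot_colors, pch = 16,
                     main = "Course Grades by Sex",
                     xlab = "Sex", ylab = "Course Grade")
}
```

With the lookup in place, reordering or renaming the groups cannot silently swap the colors.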

Step 4 : Beeswarm plot: Exam 1 distribution by semester

::: {.cell}

```{.r .cell-code}
beeswarm(exam1 ~ semester, data = student_data,
         col = c("lightgreen", "orange"),
         pch = 16,
         main = "Exam 1 Grades by Semester",
         xlab = "Semester",
         ylab = "Exam 1 Grade")
```

::: {.cell-output-display}
![](project17-presentation_files/figure-html/unnamed-chunk-4-1.png){width=672}
:::
:::
  • x-axis: Displays semester. The groups appear in alphabetical order: Fall, then Spring.

  • y-axis: Displays Exam 1 grade.

  • Colors: “lightgreen” for Fall and “orange” for Spring, because colors are assigned in group order.

  • Points: Solid circles (pch = 16).

  • Title: “Exam 1 Grades by Semester”.

  • Labels: x-axis labeled as “Semester” and y-axis labeled as “Exam 1 Grade”.

This plot shows the distribution of Exam 1 grades for each semester.
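Because semester is stored as text, R sorts the groups alphabetically, so Fall is plotted before Spring and receives the first color. If Spring should come first and be drawn in light green, one option is to convert the column to a factor with explicit levels. The sketch below assumes that ordering is the intended one. It regenerates only the columns it needs, in the same order as Step 2, so exam1 has the same values.

```r
# Recreate semester, sex and exam1 in the Step 2 order (same seed, same values).
set.seed(123)
student_data <- data.frame(
  semester = rep(c("Spring", "Fall"), length.out = 233),
  sex = sample(c("Man", "Woman"), 233, replace = TRUE),
  exam1 = round(runif(233, 50, 100), 1)
)

# Default order for a character column: alphabetical.
levels(factor(student_data$semester))

# Explicit levels put Spring first, so it gets the first color.
student_data$semester <- factor(student_data$semester,
                                levels = c("Spring", "Fall"))
levels(student_data$semester)

if (requireNamespace("beeswarm", quietly = TRUE)) {
  beeswarm::beeswarm(exam1 ~ semester, data = student_data,
                     col = c("lightgreen", "orange"),  # Spring, Fall
                     pch = 16,
                     main = "Exam 1 Grades by Semester",
                     xlab = "Semester", ylab = "Exam 1 Grade")
}
```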

Step 5 : Beeswarm plot with boxplot overlay

# Draw the beeswarm first, then overlay the boxplot
beeswarm(course_grade ~ sex, data = student_data,
         col = c("skyblue", "pink"),
         pch = 16,
         method = "center",
         main = "Course Grades by Sex with Boxplot",
         xlab = "Sex",
         ylab = "Course Grade")

boxplot(course_grade ~ sex, data = student_data, 
        add = TRUE, border = "gray40", col = NA)

This code creates a beeswarm plot of course grades by sex, with a boxplot overlay:

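  • Beeswarm: Shows individual grades as colored points. method = "center" places the points on a grid so that each swarm is symmetric.

  • Boxplot: Adds the median and quartiles as a gray outline (add = TRUE draws on the existing plot, col = NA leaves the boxes unfilled).

  • Colors: “skyblue” for men, “pink” for women.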

  • Purpose: Compare grade distribution and summary stats by sex.
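The numbers behind the overlay can also be printed. As a sketch in base R, calling boxplot() with plot = FALSE returns the statistics it would draw without producing a figure; the row labels below are added here for readability.

```r
# Rebuild the Step 2 mock data so this block runs on its own.
set.seed(123)
student_data <- data.frame(
  semester = rep(c("Spring", "Fall"), length.out = 233),
  sex = sample(c("Man", "Woman"), 233, replace = TRUE),
  exam1 = round(runif(233, 50, 100), 1),
  exam2 = round(runif(233, 50, 100), 1),
  exam3 = round(runif(233, 50, 100), 1)
)
student_data$course_grade <- round(
  rowMeans(student_data[, c("exam1", "exam2", "exam3")]), 1)

# plot = FALSE returns the box statistics instead of drawing them.
bp <- boxplot(course_grade ~ sex, data = student_data, plot = FALSE)

# One column per group; rows are the five values each box is built from.
box_stats <- bp$stats
dimnames(box_stats) <- list(
  c("lower whisker", "lower hinge", "median", "upper hinge", "upper whisker"),
  bp$names
)
box_stats
bp$n  # number of students in each group
```

Comparing the median row across the two columns gives the same information as comparing the center lines of the two boxes.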

Conclusion:

The beeswarm plot with boxplot overlay shows:

  • Individual variation in course grades for each sex.

  • Boxplots summarize each group with its median and spread.

  • If both groups have similar medians and spreads, the plot gives no visual evidence of a difference in performance.

  • If one group sits higher or is more spread out, it suggests a possible difference in performance or consistency.

Visual inspection can hint at trends, but statistical tests (e.g., a t-test) are needed for confirmation. In this mock data, sex is assigned at random and independently of the scores, so any difference between the groups is due to chance.
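As an illustration of such a test, the sketch below runs a two-sample t-test on the mock data. It uses t.test() from base R, which performs the Welch test (unequal variances) by default. The choice of test is an assumption made for this example; real exam data may call for a different method.

```r
# Rebuild the Step 2 mock data so this block runs on its own.
set.seed(123)
student_data <- data.frame(
  semester = rep(c("Spring", "Fall"), length.out = 233),
  sex = sample(c("Man", "Woman"), 233, replace = TRUE),
  exam1 = round(runif(233, 50, 100), 1),
  exam2 = round(runif(233, 50, 100), 1),
  exam3 = round(runif(233, 50, 100), 1)
)
student_data$course_grade <- round(
  rowMeans(student_data[, c("exam1", "exam2", "exam3")]), 1)

# Welch two-sample t-test: is the mean course grade different between groups?
grade_test <- t.test(course_grade ~ sex, data = student_data)

grade_test$estimate  # mean course grade in each group
grade_test$conf.int  # 95% confidence interval for the difference in means
grade_test$p.value   # probability of a difference this large under no true effect
```

A small p-value would support a real difference between the groups, while a large one means the plot's visual impression cannot be distinguished from chance.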