In modern academic curricula, electives play a pivotal role in personalizing a student's education. Tracing the path from foundational subjects to final-year electives shows where student interests lie and how students move through the curriculum. This report uses a synthetic dataset to follow students across three academic years, visualized with a Sankey diagram and a violin plot.
Objectives
Analyze student subject flow from Year 1 to Year 3.
Visualize transitions using a Sankey diagram.
Evaluate the distribution of Year 3 elective selections with a violin plot.
Interpret academic trends and derive insights for curriculum planning.
Dataset Description
Dataset Name: synthetic_grade_sheet.csv
Attributes:
Roll.Number: Unique student ID.
Subject_1 to Subject_10: Letter grades (S, A, B, C, D, E, or U in the sample rows) for ten core subjects taken over three years.
Elective: Final year elective selected by the student.
Justification:
The dataset simulates academic records over three years.
Contains detailed progression across subjects.
The Elective column provides meaningful endpoints for visualization.
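A dataset with this structure could be generated along the following lines. This is a minimal sketch, not the procedure used to create synthetic_grade_sheet.csv: the cohort size, the seed, the uniform sampling, and the restriction to the two electives visible in the sample rows are all assumptions made for illustration.

```r
# Sketch of a generator for a grade sheet with the columns described above.
# Assumptions (not from the report): 100 students, uniform random grades,
# and only the two electives that appear in the sample rows.
set.seed(42)
n_students <- 100
grades     <- c("S", "A", "B", "C", "D", "E", "U")
electives  <- c("Cyber Security", "Data Science")

# Roll numbers follow the 21CS001-style pattern seen in the data
synthetic <- data.frame(
  Roll.Number = sprintf("21CS%03d", seq_len(n_students)),
  stringsAsFactors = FALSE
)

# One grade column per core subject, Subject_1 .. Subject_10
for (i in 1:10) {
  synthetic[[paste0("Subject_", i)]] <- sample(grades, n_students, replace = TRUE)
}

# Final-year elective for each student
synthetic$Elective <- sample(electives, n_students, replace = TRUE)

# Uncomment to write the file to disk:
# write.csv(synthetic, "synthetic_grade_sheet.csv", row.names = FALSE)
```

Uniform sampling produces no real relationship between grades and elective choice, so any pattern seen in plots of such data reflects chance rather than student behaviour.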
Methodology
Step 1: Load Required Libraries
library(ggplot2)
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidyr)  # provides pivot_longer(), used when reshaping data for the violin plot
library(networkD3)
Step 2: Load and Explore Data
data <- read.csv("synthetic_grade_sheet.csv")
head(data)
  Roll.Number Subject_1 Subject_2 Subject_3 Subject_4 Subject_5 Subject_6
1     21CS001         D         C         A         C         U         U
2     21CS002         D         B         A         A         S         C
3     21CS003         S         A         E         D         C         D
4     21CS004         A         D         E         C         S         A
5     21CS005         A         B         E         S         S         D
6     21CS006         U         S         D         B         U         D
  Subject_7 Subject_8 Subject_9 Subject_10       Elective
1         B         C         C          S Cyber Security
2         A         B         E          E   Data Science
3         U         A         E          U   Data Science
4         E         S         S          U   Data Science
5         A         S         E          C Cyber Security
6         S         U         B          A   Data Science
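A Sankey diagram of the transitions could be built from this data frame roughly as follows. This is a minimal sketch under stated assumptions, not the report's own code: using Subject_1 and Subject_6 as stand-ins for the earlier stages is a hypothetical choice, and the toy data frame simply copies those columns from the six rows printed above so the block runs on its own.

```r
library(dplyr)
library(networkD3)

# Toy stand-in for `data`: three columns copied from the six sample rows.
data <- data.frame(
  Subject_1 = c("D", "D", "S", "A", "A", "U"),
  Subject_6 = c("U", "C", "D", "A", "D", "D"),
  Elective  = c("Cyber Security", "Data Science", "Data Science",
                "Data Science", "Cyber Security", "Data Science")
)

# Prefix grades with their column name so the same grade in two stages
# becomes two distinct nodes (a Sankey diagram must not contain cycles).
stages <- data %>%
  transmute(
    s1 = paste("Subject_1:", Subject_1),
    s2 = paste("Subject_6:", Subject_6),
    s3 = Elective
  )

# Count students on each link between consecutive stages
links <- bind_rows(
  count(stages, source = s1, target = s2, name = "value"),
  count(stages, source = s2, target = s3, name = "value")
)

# Node table, plus the zero-based node indices that networkD3 expects
nodes <- data.frame(name = unique(c(links$source, links$target)))
links$source_id <- match(links$source, nodes$name) - 1
links$target_id <- match(links$target, nodes$name) - 1

sankey <- sankeyNetwork(
  Links = links, Nodes = nodes,
  Source = "source_id", Target = "target_id",
  Value = "value", NodeID = "name",
  fontSize = 12, nodeWidth = 30
)
sankey
```

Which columns represent Year 1 and Year 2 depends on how the ten subjects map to academic years, which the dataset description does not specify; the two columns in `transmute()` would need to be replaced accordingly.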
data_violin <- data %>%
  select(Year3 = Elective) %>%
  pivot_longer(cols = everything(), names_to = "Year", values_to = "Subject")

# Create Violin Plot
ggplot(data_violin, aes(x = Year, y = Subject, fill = Year)) +
  geom_violin(trim = FALSE, scale = "count", alpha = 0.8) +
  geom_jitter(width = 0.2, alpha = 0.3, size = 1) +
  theme_minimal(base_size = 14) +
  labs(
    title = "Distribution of Elective Subject Choices",
    subtitle = "Violin plot showing frequency of Year 3 Elective Selections",
    x = "Academic Year",
    y = "Elective Subject",
    fill = "Year"
  ) +
  theme(
    plot.title = element_text(face = "bold"),
    legend.position = "none"
  )
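Because Elective is a categorical variable, a plain frequency table can be a useful companion to the violin plot when comparing elective popularity. The following is a minimal sketch; the toy data frame holds only the Elective values from the six sample rows shown earlier, and the column names `n_students` and `share` are illustrative choices.

```r
library(dplyr)

# Toy stand-in for `data`: Elective values from the six sample rows.
data <- data.frame(
  Elective = c("Cyber Security", "Data Science", "Data Science",
               "Data Science", "Cyber Security", "Data Science")
)

# Number of students per elective and each elective's share of the cohort
elective_counts <- data %>%
  count(Elective, name = "n_students", sort = TRUE) %>%
  mutate(share = n_students / sum(n_students))

print(elective_counts)
```

For the six sample rows this yields four Data Science and two Cyber Security selections; the full dataset may contain other electives and different proportions.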
Interpretation and Insights
Subject Retention: Several core subjects from Year 1 transition into Year 2, suggesting consistent academic paths.
Elective Popularity: The violin plot shows which electives attract more students. The differences may reflect perceived career outcomes or teaching quality.
Decision Bottlenecks: Sankey flows narrow around specific Year 2 subjects, indicating critical academic decision points.
Conclusion
This analysis effectively showcases student academic flows using clear visualizations. The insights can assist educators and administrators in optimizing subject offerings and improving student experience.