Team-9 Student_heatmap

Anshul & Aaditya

TEAM - 9

rpub link : https://rpubs.com/PhEnOMeoN/1309652

Q - Develop a heatmap of student attendance across subjects and weeks during the semester.

Step 1: Load Packages and Data

# Load required libraries for data handling and visualization
library(tidyverse)     # Includes ggplot2 and dplyr for plotting and data wrangling
library(lubridate)     # For working with date columns
library(scales)        # For formatting axis/legend labels (e.g., percentages)
library(viridis)       # For colorblind-friendly color palettes

# Read the CSV file into R (update the filename/path if needed)
attendance_df <- read_csv("2018-2019_Daily_Attendance_20240429.csv")

Step 2: Convert and Prepare Columns

attendance_df <- attendance_df %>%
  mutate(
    Date = ymd(Date),            # Convert to date
    Week = isoweek(Date),        # Extract ISO week
    AttendanceRate = Present / Enrolled  # Compute attendance rate
  )

Step 3: Group by School and Week

heatmap_data <- attendance_df %>%
  group_by(`School DBN`, Week) %>%                     # Group data by school and week number
  summarise(AvgAttendance = mean(AttendanceRate, na.rm = TRUE)) %>%  # Compute weekly average attendance
  ungroup()                                            # Remove grouping to return regular dataframe

Step 4: Filter to Top 20 Schools

top_schools <- attendance_df %>%
  count(`School DBN`, sort = TRUE) %>%  # Count how many records each school has
  slice_head(n = 20) %>%                # Keep only the top 20 schools by record count
  pull(`School DBN`)                    # Extract just the school names into a vector

heatmap_filtered <- heatmap_data %>%
  filter(`School DBN` %in% top_schools)  # Keep only the records from the top 20 schools

Step 5: Create the Heatmap