LA-01

Author

Harshitha and Deekshitha

This report analyzes daily student attendance data for the 2018-2019 school year and visualizes patterns across weeks and schools using a heatmap, a line graph, and a bar chart.

Loading libraries

library(tidyverse)
Warning: package 'tidyverse' was built under R version 4.5.3
Warning: package 'lubridate' was built under R version 4.5.3
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.0     ✔ readr     2.1.6
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(lubridate)
library(ggplot2)
library(dplyr)

Student Attendance Analysis

Load the dataset containing daily student attendance records

The dataset includes the school ID (School.DBN), Date, Enrolled, Absent, Present, and Released

Reading data from file

attendance <- read.csv("C:/Users/ADMIN/OneDrive/Documents/2018-2019_Daily_Attendance_20240429.csv")

Dataset Overview

names(attendance)
[1] "School.DBN" "Date"       "Enrolled"   "Absent"     "Present"   
[6] "Released"  

Convert the Date column into Date format for processing

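# Date is read as a YYYYMMDD integer, so parse it with an explicit format rather than as a day count
attendance$Date <- as.Date(as.character(attendance$Date), format = "%Y%m%d")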
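
The printed dates further down fall in the year 57223. That is the result R gives when a YYYYMMDD integer such as 20180905 is read as a count of days since 1970-01-01. The sketch below shows the difference on three hypothetical raw values; `raw_dates`, `wrong`, and `right` are illustrative names, not part of the original analysis.

```r
# Hypothetical raw values in the YYYYMMDD form the Date column appears to use
raw_dates <- c(20180905L, 20180906L, 20180912L)

# Treating the integers as day counts since 1970-01-01
# (this is what as.Date() does with a bare number in R 4.3.0 and later)
wrong <- as.Date(raw_dates, origin = "1970-01-01")

# Converting to character and supplying the format recovers the calendar dates
right <- as.Date(as.character(raw_dates), format = "%Y%m%d")

format(wrong)  # "57223-06-11" "57223-06-12" "57223-06-18"
format(right)  # "2018-09-05" "2018-09-06" "2018-09-12"
```

Any week numbers derived from the misparsed dates are shifted too, so a week column would need to be recomputed after correcting the parse.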

Display the first rows and the structure of the data

head(attendance)
  School.DBN        Date Enrolled Absent Present Released week
1     01M015 57223-06-11      172     19     153        0   24
2     01M015 57223-06-12      171     17     154        0   24
3     01M015 57223-06-13      172     14     158        0   24
4     01M015 57223-06-18      173      7     166        0   25
5     01M015 57223-06-19      173      9     164        0   25
6     01M015 57223-06-20      173     11     162        0   25
str(attendance)
'data.frame':   277153 obs. of  7 variables:
 $ School.DBN: chr  "01M015" "01M015" "01M015" "01M015" ...
 $ Date      : Date, format: "57223-06-11" "57223-06-12" ...
 $ Enrolled  : int  172 171 172 173 173 173 173 174 174 174 ...
 $ Absent    : int  19 17 14 7 9 11 10 7 7 8 ...
 $ Present   : int  153 154 158 166 164 162 163 167 167 166 ...
 $ Released  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ week      : num  24 24 24 25 25 25 25 25 26 26 ...

Display summary statistics

summary(attendance)
  School.DBN             Date                Enrolled        Absent      
 Length:277153      Min.   :57223-06-10   Min.   :   1   Min.   :   0.0  
 Class :character   1st Qu.:57224-01-05   1st Qu.: 329   1st Qu.:  23.0  
 Mode  :character   Median :57248-09-11   Median : 476   Median :  38.0  
                    Mean   :57239-03-03   Mean   : 597   Mean   :  50.5  
                    3rd Qu.:57249-06-19   3rd Qu.: 684   3rd Qu.:  59.0  
                    Max.   :57250-01-21   Max.   :5955   Max.   :2151.0  
    Present          Released             week      
 Min.   :   1.0   Min.   :   0.000   Min.   : 1.00  
 1st Qu.: 291.0   1st Qu.:   0.000   1st Qu.:13.00  
 Median : 430.0   Median :   0.000   Median :27.00  
 Mean   : 544.5   Mean   :   1.983   Mean   :27.22  
 3rd Qu.: 640.0   3rd Qu.:   0.000   3rd Qu.:40.00  
 Max.   :5847.0   Max.   :5904.000   Max.   :53.00  
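
The plots that follow use a `week` column and a `weekly_data` table whose construction is not shown in this report. The sketch below shows one way they could be built. It assumes that `week` comes from `lubridate::week()` and that `avg_present` is the mean of `Present` per school and week. The `toy` data frame and its values are made up for illustration.

```r
library(dplyr)
library(lubridate)

# Toy stand-in for the attendance table (values are illustrative only)
toy <- data.frame(
  School.DBN = c("01M015", "01M015", "01M015", "01M019", "01M019"),
  Date = as.Date(c("2018-09-05", "2018-09-06", "2018-09-12",
                   "2018-09-05", "2018-09-12")),
  Present = c(153, 154, 166, 200, 210)
)

# Assumption: week is the calendar week number returned by lubridate::week()
weekly_data <- toy %>%
  mutate(week = week(Date)) %>%
  group_by(School.DBN, week) %>%
  # One row per school and week, holding the mean number of students present
  summarise(avg_present = mean(Present), .groups = "drop")

weekly_data
```

With one row per school and week, the table carries the three columns the heatmap maps to x, y, and fill.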

Visualization

Interpretation

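The heatmap shows attendance for each school (rows) across the weeks of the year (columns). Red tiles indicate a higher average number of students present, and yellow tiles a lower one. Because the fill is a head count rather than a rate, schools with larger enrollment tend to appear redder regardless of how regularly their students attend. The visualization helps identify trends and fluctuations in attendance across schools.

The line graph shows how the number of students present changes over time, which reveals the overall attendance trend during the academic year. The bar chart summarizes the average number of students present in each week.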

ggplot(weekly_data, aes(x = factor(week), y = School.DBN, fill = avg_present)) +
  geom_tile() +
  scale_fill_gradient(low = "yellow", high = "red") +
  labs(title = "Student Attendance Heatmap",
       x = "Weeks",
       y = "Schools",
       fill = "Avg Present") +
  theme_minimal() +
  theme(axis.text.y = element_blank()) 
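
Since `avg_present` is a head count, the heatmap colors largely reflect school size. One possible refinement, not part of the original analysis, is to fill by an attendance rate (`Present` divided by `Enrolled`) so that schools of different sizes are comparable. The sketch below uses a made-up `toy` table with hypothetical school IDs; `weekly_rate` and `attendance_rate` are illustrative names.

```r
library(dplyr)

# Toy data: a small and a large school over two weeks (values are illustrative only)
toy <- data.frame(
  School.DBN = c("A", "A", "B", "B"),  # hypothetical school IDs
  week = c(36, 37, 36, 37),
  Enrolled = c(172, 173, 600, 600),
  Present = c(153, 166, 480, 540)
)

# Share of enrolled students who were present, per school and week
weekly_rate <- toy %>%
  group_by(School.DBN, week) %>%
  summarise(attendance_rate = sum(Present) / sum(Enrolled), .groups = "drop")

weekly_rate
```

Mapping `fill = attendance_rate` in the heatmap call would then color each tile by the share of students present rather than by the count.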

Plot a line graph to show attendance trend over time

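ggplot(attendance, aes(x = Date, y = Present)) +
  # Sum Present across schools for each day so the line traces one overall series
  stat_summary(fun = sum, geom = "line", color = "blue") +
  labs(title = "Attendance Trend Over Time", x = "Date", y = "Total Present")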

Create a bar chart showing average attendance per week

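ggplot(weekly_data, aes(x = factor(week), y = avg_present)) +
  # Average across schools within each week; geom_col() would stack one segment per row
  stat_summary(fun = mean, geom = "bar", fill = "orange") +
  labs(title = "Average Weekly Attendance",
       x = "Week",
       y = "Average Present")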

Conclusion

The analysis of student attendance data using a heatmap, a line graph, and a bar chart gives a view of attendance from three angles. The heatmap highlights variation across weeks and schools, the line graph shows the overall trend over time, and the bar chart compares average attendance from week to week. Together, these visualizations help in understanding attendance patterns and can support better decision-making.