library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
 
data <- read.csv("C:/Users/atg516/Desktop/HHS Discharge Project/HHS_Unaccompanied_Alien_Children_Program.csv")
summary(data)
##      Date           Children.apprehended.and.placed.in.CBP.custody.
##  Length:1015        Min.   :  1.00                                 
##  Class :character   1st Qu.: 59.75                                 
##  Mode  :character   Median :123.00                                 
##                     Mean   :116.70                                 
##                     3rd Qu.:167.00                                 
##                     Max.   :333.00                                 
##                     NA's   :701                                    
##  Children.in.CBP.custody Children.transferred.out.of.CBP.custody
##  Min.   :  7.00          Min.   :  0.00                         
##  1st Qu.: 82.25          1st Qu.: 84.75                         
##  Median :233.50          Median :176.00                         
##  Mean   :218.11          Mean   :164.52                         
##  3rd Qu.:300.25          3rd Qu.:227.50                         
##  Max.   :531.00          Max.   :440.00                         
##  NA's   :701             NA's   :701                            
##  Children.in.HHS.Care Children.discharged.from.HHS.Care
##  Min.   : 2109        Min.   :  2.0                    
##  1st Qu.: 6940        1st Qu.:196.0                    
##  Median : 8998        Median :279.0                    
##  Mean   : 9196        Mean   :283.3                    
##  3rd Qu.:10766        3rd Qu.:360.0                    
##  Max.   :22557        Max.   :827.0                    
## 
data_clean <- na.omit(data)

summary(data_clean)
##      Date           Children.apprehended.and.placed.in.CBP.custody.
##  Length:314         Min.   :  1.00                                 
##  Class :character   1st Qu.: 59.75                                 
##  Mode  :character   Median :123.00                                 
##                     Mean   :116.70                                 
##                     3rd Qu.:167.00                                 
##                     Max.   :333.00                                 
##  Children.in.CBP.custody Children.transferred.out.of.CBP.custody
##  Min.   :  7.00          Min.   :  0.00                         
##  1st Qu.: 82.25          1st Qu.: 84.75                         
##  Median :233.50          Median :176.00                         
##  Mean   :218.11          Mean   :164.52                         
##  3rd Qu.:300.25          3rd Qu.:227.50                         
##  Max.   :531.00          Max.   :440.00                         
##  Children.in.HHS.Care Children.discharged.from.HHS.Care
##  Min.   :2109         Min.   :  2.0                    
##  1st Qu.:4972         1st Qu.:114.2                    
##  Median :6150         Median :166.0                    
##  Mean   :5810         Mean   :158.7                    
##  3rd Qu.:7334         3rd Qu.:214.0                    
##  Max.   :8902         Max.   :373.0

This data set tracks daily counts from the Unaccompanied Children Program, administered by the U.S. Department of Health and Human Services (HHS) and U.S. Customs and Border Protection (CBP).

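The cleaned data set contains 314 of the original 1015 observations. The other 701 rows, which fall in the years 2021-2024, were dropped by na.omit() because they had missing values in the three CBP variables: Children.apprehended.and.placed.in.CBP.custody, Children.in.CBP.custody, and Children.transferred.out.of.CBP.custody. The two HHS variables have no missing values, but their summaries change after cleaning (for example, the maximum of Children.in.HHS.Care falls from 22557 to 8902), so the analysis below describes only the 314 complete days.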
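The row loss described above can be checked by counting missing values per column before dropping anything. The following is a minimal sketch on a small made-up data frame (`toy`, with shortened hypothetical column names), not the HHS file, so the numbers are illustrative only.

```r
# Toy data frame standing in for the real file (all values are made up).
toy <- data.frame(
  Date       = c("d1", "d2", "d3", "d4", "d5"),
  in_cbp     = c(NA, NA, 120, 95, NA),          # CBP column with gaps
  in_hhs     = c(9000, 8800, 6100, 5900, 7000), # HHS columns are complete
  discharged = c(300, 280, 160, 150, 200)
)

# Missing values per column: shows which variables drive the row loss.
na_counts <- colSums(is.na(toy))
print(na_counts)

# na.omit() drops every row that has an NA in ANY column.
toy_clean <- na.omit(toy)
rows_dropped <- nrow(toy) - nrow(toy_clean)

# Alternative: select only the columns an analysis needs before dropping,
# so rows with gaps in unrelated columns are kept.
toy_hhs <- toy[, c("Date", "in_hhs", "discharged")]
rows_hhs <- nrow(na.omit(toy_hhs))
```

Because the first summary() output lists no NA's for the two HHS columns, the same column-selection idea would presumably let the HHS-only analyses use all 1015 rows; that is a design choice for the analyst, not something the original report does.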

These variables offer important context for understanding the flow and efficiency of the program.

hist(data_clean$Children.in.HHS.Care)

This histogram shows the distribution of the daily number of children in HHS care: each bar counts how many days the population fell within that range. The full data set spans 2021-2025, but only the 314 complete rows in data_clean are plotted.

The distribution is bimodal, meaning that daily counts cluster around two distinct levels. The tallest bar peaks above 60 on the y-axis, meaning that on more than 60 days the number of children in HHS custody fell between 6000 and 6500. The gap between the peaks may reflect periods of rapid change in admissions, transfers, or discharge policies, when the population passed quickly through intermediate levels.
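Reading bar heights off a plot is approximate; the bin counts can also be pulled out numerically. This sketch uses made-up daily counts (`in_care` is a hypothetical vector, not the HHS data) and assumes 500-wide bins, which may differ from the breaks hist() chooses by default for the real column.

```r
# Toy daily counts (made up), NOT the HHS data.
in_care <- c(2100, 2400, 3900, 4100, 4200,
             6050, 6100, 6200, 6300, 6450, 7900, 8100)

# Compute the histogram without drawing it, using explicit 500-wide bins.
h <- hist(in_care, breaks = seq(2000, 8500, by = 500), plot = FALSE)

# Locate the tallest bar and report its range and its number of days.
tallest     <- which.max(h$counts)
bin_range   <- c(h$breaks[tallest], h$breaks[tallest + 1])
days_in_bin <- h$counts[tallest]

cat("Tallest bin:", bin_range[1], "to", bin_range[2],
    "with", days_in_bin, "days\n")
```

Applied to data_clean$Children.in.HHS.Care, the same steps would give the exact day count behind the "above 60" reading.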

Overall, the number of children in HHS care is highly dynamic, which suggests fluctuating capacity demands and varying operational intensity across the program.

plot(data_clean$Children.in.HHS.Care, data_clean$Children.discharged.from.HHS.Care)

This scatter plot shows the relationship between the number of children in HHS care and the number of children discharged on the same day. The relationship is positive and roughly linear: on days when facilities are managing a larger population, they tend to process more discharges.

On days with around 6000 children in care, discharge counts are spread over a wide vertical range. This variation could be indicative of operational constraints, policy changes, or other factors affecting the discharge process.
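Both observations, the upward trend and the spread near 6000, can be quantified rather than judged by eye. The sketch below fits a least-squares line with lm() and measures the range of discharges in a window around 6000. The vectors are made up for illustration, and the 5500-6500 window is an arbitrary assumption.

```r
# Toy paired daily values (made up), NOT the HHS data.
in_care    <- c(3000, 4000, 5000, 5900, 6000, 6100, 7000, 8000)
discharged <- c(  80,  110,  140,  120,  170,  220,  190,  230)

# Least-squares line: discharges as a linear function of children in care.
fit   <- lm(discharged ~ in_care)
slope <- unname(coef(fit)["in_care"])  # extra discharges per extra child in care

# Spread of discharges on days with roughly 6000 children in care.
near_6000 <- in_care >= 5500 & in_care <= 6500
spread    <- range(discharged[near_6000])

cat("Slope:", slope, "\n")
cat("Discharges near 6000 range from", spread[1], "to", spread[2], "\n")
```

Calling abline(fit) after the plot() call would overlay the fitted line on the scatter plot.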

cor(data_clean$Children.in.HHS.Care, data_clean$Children.discharged.from.HHS.Care)
## [1] 0.8700262

The Pearson correlation between Children.in.HHS.Care and Children.discharged.from.HHS.Care is r = 0.87, which indicates a strong positive linear relationship: as the number of children in HHS custody increases on a given day, the number of children discharged also tends to increase. Correlation alone does not show that a larger population causes more discharges, and cor() does not test significance, so this result describes an association in the 314 cleaned observations rather than a causal effect on the discharge rate.
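cor() returns only the coefficient. If a p-value and confidence interval are wanted, cor.test() from the stats package provides them. The sketch below runs on made-up vectors, not the HHS data, and also computes discharges per child in care, since a rate is a different quantity from the raw daily count.

```r
# Toy paired daily values (made up), NOT the HHS data.
in_care    <- c(3000, 4000, 5000, 5900, 6000, 6100, 7000, 8000)
discharged <- c(  80,  110,  140,  120,  170,  220,  190,  230)

# Pearson correlation, same kind of call as in the report.
r <- cor(in_care, discharged)

# cor.test() adds a p-value and a 95% confidence interval for r.
ct      <- cor.test(in_care, discharged, method = "pearson")
p_value <- ct$p.value
ci      <- ct$conf.int

# Discharges per child in care: a rate, as opposed to a raw count.
discharge_rate <- discharged / in_care

print(ct)
```

One caveat: cor.test() assumes independent observations, and consecutive daily counts are usually serially correlated, so a p-value from the real data would likely be optimistic.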