The code below loads the dataset “CPSWaitingForAdoptionFY2014-2023.csv”, stores it as an object called “waitadopt”, and removes the original object from the workspace.

library(readr)
library(dplyr)  # provides %>% and mutate(), used below

CPSWaitingForAdoptionFY2014_2023 <- read_csv("CPSWaitingForAdoptionFY2014-2023.csv")
## Rows: 12158 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): Region, Gender, Race/Ethnicity, Age Group
## dbl (3): Fiscal Year, Chidlren Waiting on Adoption 31 August, Average Months...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
waitadopt <- CPSWaitingForAdoptionFY2014_2023

rm(CPSWaitingForAdoptionFY2014_2023)

The first independent variable selected is Gender. Most observations are “Male” or “Female,” but one was entered as “Unknown.” This chunk changes “Unknown” to NA.

waitadopt$Gender[waitadopt$Gender == 'Unknown'] <- NA

To use gender as a numeric predictor, it needs to be recoded. This chunk recodes “Female” as 1 and “Male” as 0 in a new variable called “Gendervar”; the observation with a missing gender stays NA.

waitadopt <- waitadopt %>% mutate(Gendervar = ifelse(Gender == 'Female', 1, 0))
waitadopt
## # A tibble: 12,158 × 8
##    `Fiscal Year` Region    Gender `Race/Ethnicity` `Age Group`         
##            <dbl> <chr>     <chr>  <chr>            <chr>               
##  1          2023 1-Lubbock Female African American Birth to 5 Years Old
##  2          2023 1-Lubbock Female African American 6-12 Years Old      
##  3          2023 1-Lubbock Female African American 13-17 Years Old     
##  4          2023 1-Lubbock Female African American 13-17 Years Old     
##  5          2023 1-Lubbock Female African American Birth to 5 Years Old
##  6          2023 1-Lubbock Female African American Birth to 5 Years Old
##  7          2023 1-Lubbock Female African American Birth to 5 Years Old
##  8          2023 1-Lubbock Female African American 6-12 Years Old      
##  9          2023 1-Lubbock Female African American 6-12 Years Old      
## 10          2023 1-Lubbock Female African American 6-12 Years Old      
## # ℹ 12,148 more rows
## # ℹ 3 more variables: `Chidlren Waiting on Adoption 31 August` <dbl>,
## #   `Average Months since Termination of Parental Rights` <dbl>,
## #   Gendervar <dbl>
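
A quick way to confirm that a recode like the one above behaved as intended is to cross-tabulate the old and new columns. The sketch below uses a small made-up data frame called `toy` (a hypothetical stand-in, not the CPS data) and base R only; the same `table()` call could be pointed at `waitadopt$Gender` and `waitadopt$Gendervar`.

```r
# Toy data standing in for waitadopt$Gender (values are illustrative only)
toy <- data.frame(
  Gender = c("Female", "Male", "Unknown", "Male", "Female"),
  stringsAsFactors = FALSE
)

# Same two steps as in the report: set "Unknown" to NA, then build the 0/1 indicator
toy$Gender[toy$Gender == "Unknown"] <- NA
toy$Gendervar <- ifelse(toy$Gender == "Female", 1, 0)

# Cross-tabulate old and new columns; useNA = "ifany" keeps missing values visible
check <- table(toy$Gender, toy$Gendervar, useNA = "ifany")
print(check)
```

Because `ifelse()` returns NA whenever its test is NA, the row that was “Unknown” ends up missing in the new column instead of being silently counted as 0.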

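The second independent variable is age group, and the dependent variable is the average number of months since termination of parental rights. This chunk fits a linear model of the dependent variable on age group and gender. The model uses the original Gender column, which lm() treats as a categorical predictor with “Female” as the reference level; using Gendervar instead would flip the sign of the gender coefficient and make males the reference group.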

genderage_model <- lm(`Average Months since Termination of Parental Rights`~`Age Group`+Gender, data = waitadopt)
summary(genderage_model)
## 
## Call:
## lm(formula = `Average Months since Termination of Parental Rights` ~ 
##     `Age Group` + Gender, data = waitadopt)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -39.390  -7.565  -1.845   4.636 141.540 
## 
## Coefficients:
##                                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                      35.5530     0.2976  119.48   <2e-16 ***
## `Age Group`6-12 Years Old       -19.8078     0.3468  -57.12   <2e-16 ***
## `Age Group`Birth to 5 Years Old -29.9454     0.3540  -84.59   <2e-16 ***
## GenderMale                        3.9366     0.2779   14.16   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 15.32 on 12153 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.3811, Adjusted R-squared:  0.3809 
## F-statistic:  2494 on 3 and 12153 DF,  p-value: < 2.2e-16
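
The coefficients in the summary above can be combined into a predicted average for each age group and gender. The sketch below copies the four estimates by hand, so the variable names (`b0`, `b_6_12`, `b_0_5`, `b_male`) are only illustrative labels and not part of the original analysis.

```r
# Coefficients copied from the summary() output above
b0     <- 35.5530   # intercept: Female, 13-17 Years Old (the reference levels)
b_6_12 <- -19.8078  # shift for 6-12 Years Old
b_0_5  <- -29.9454  # shift for Birth to 5 Years Old
b_male <-   3.9366  # shift for Male

age_shift <- c(
  "13-17 Years Old"      = 0,
  "6-12 Years Old"       = b_6_12,
  "Birth to 5 Years Old" = b_0_5
)
gender_shift <- c(Female = 0, Male = b_male)

# outer() adds every age shift to every gender shift, giving a 3 x 2 table
predicted <- b0 + outer(age_shift, gender_shift, "+")
print(round(predicted, 2))
```

The table gives 35.55 (female) and 39.49 (male) for ages 13 to 17, 15.75 and 19.68 for ages 6 to 12, and 5.61 and 9.54 for birth to 5. With the fitted model object available, calling `predict()` on a small data frame of the six combinations should give the same numbers.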

Based on the p-values, both independent variables have a statistically significant association with the dependent variable. The intercept of 35.55 is the predicted average for the reference group, females 13 to 17 years old; males in the same age group are predicted to average 35.55 + 3.94 = 39.49 months since termination of parental rights. Compared with 13-to-17-year-olds of the same gender, children 6 to 12 years old average about 19.81 fewer months, and children from birth to 5 years old about 29.95 fewer. However, part of this age effect is to be expected, since younger children have not lived long enough to accumulate long waits. The Gender coefficient indicates that, within the same age group, males have waited about 3.94 months longer than females, a difference that is also statistically significant. The R-squared value of 0.38 means that age group and gender together account for an estimated 38% of the variance in the average number of months since termination of parental rights. Age group appears to be a much stronger predictor than gender, since its coefficients are several times larger. To check the model assumptions, the chunk below draws the residuals-versus-fitted plot:

plot(genderage_model,which = 1)

The resulting graph can be difficult to read at first. The x-axis shows the fitted values, which run from about 5 to 40, but the observations fall on only six of them: roughly 6, 9.5, 16, 19.7, 36, and 39.5. These are not arbitrary: because both predictors are categorical, the model produces one fitted value for each of the six combinations of age group and gender. The y-axis shows the residuals, that is, the observed average number of months minus the fitted value, which is why some values fall below 0. The chunk below renders the normal Q-Q plot of the residuals:

plot(genderage_model,which = 2)

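This plot is easier to read. The x-axis shows the theoretical quantiles of a normal distribution, running from -4 to 4, and the y-axis shows the standardized residuals, running from -2 to 10. Most points follow the reference line, but the upper end of the curve rises well above it as the values increase, which suggests that the residuals are right-skewed and not normally distributed. This is consistent with the residual summary above, where the largest residual (141.54) is far bigger in magnitude than the smallest (-39.39).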
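
To see why the first diagnostic plot stacks its points on a handful of x values, and why a long upper tail shows up in the second, it can help to reproduce both patterns on simulated data. The sketch below is not the CPS dataset: the group means, the exponential noise, and the names `sim`, `fit`, and `n_fitted` are all assumptions chosen for illustration.

```r
set.seed(1)  # simulated data only; not the CPS dataset

n <- 600
sim <- data.frame(
  age    = sample(c("0-5", "6-12", "13-17"), n, replace = TRUE),
  gender = sample(c("Female", "Male"), n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Group effect plus right-skewed (exponential) noise, centred at zero
age_mean <- unname(c("0-5" = 6, "6-12" = 16, "13-17" = 36)[sim$age])
sim$months <- age_mean + ifelse(sim$gender == "Male", 4, 0) +
  rexp(n, rate = 1 / 10) - 10

fit <- lm(months ~ age + gender, data = sim)

# Two categorical predictors with 3 and 2 levels give at most 3 * 2 = 6 fitted values
n_fitted <- length(unique(round(fitted(fit), 6)))
print(n_fitted)

# With right-skewed residuals the median sits below the mean,
# and the upper tail of a Q-Q plot bends away from the reference line
res <- residuals(fit)
print(c(mean = mean(res), median = median(res)))
```

Running `plot(fit, which = 1)` and `plot(fit, which = 2)` on this simulated model should show the same two features: vertical stacks of points in the residuals-versus-fitted plot, and an upper tail that leaves the line in the Q-Q plot.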