The aim of this analysis is to examine the relationship between poverty and education and to identify key factors in an individual’s educational achievement. A closer look at this topic may provide useful insight into the United States’ struggles with its education system, as some political figures look to disband and defund the U.S. Department of Education. This analysis is relevant to state- and country-wide educational boards as they seek solutions for struggling schools and education systems.
The datasets come from several sources, including the U.S. Census Bureau and the Bureau of Labor Statistics. The four datasets used are the Presidential Election Data, Unemployment Data, Poverty Data, and Education Data. Links to these datasets can be found below:
R and R Markdown were used in this project because they are free and open-source, allowing users to customize their workflow with libraries and features that other statistical software, such as SAS, does not have. R provides an easy-to-use and comprehensive toolset of statistical analyses and tests. While these more advanced analyses are not used in this project, further work could apply them to provide additional insight into education and the role poverty plays in adults’ educational experiences.
This analysis seeks to answer the question: what role does poverty play in U.S. adults’ educational experience?
Per the project directions, I kept only the relevant information in each dataset.
Before filtering the Presidential Election Data, the dataset had 72617 observations of 12 variables. Below were the instructions for the Presidential Election Data.
The following information should be kept in the data:
After filtering the Presidential Election Data, the dataset had 3153 observations of 5 variables.
Before filtering the Unemployment Data, the dataset had 290441 observations of 5 variables. Below were the instructions for the Unemployment Data.
The following information should be kept in the data:
After filtering the Unemployment Data, the dataset had 3193 observations of 3 variables.
Before filtering the Poverty Data, the dataset had 79919 observations of 5 variables. Below were the instructions for the Poverty Data.
The following information should be kept in the data:
After filtering the Poverty Data, the dataset had 3193 observations of 3 variables.
Before filtering the Education Data, the dataset had 3283 observations of 47 variables. Below were the instructions for the Education Data.
The following information should be kept in the data:
After filtering the Education Data, the dataset had 3273 observations of 5 variables.
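Each filtering step above follows the same pattern: record the dimensions, keep only the needed rows and columns, then record the dimensions again. The following is a minimal base R sketch of that pattern. The data frame `raw`, its column names, and the filter condition are hypothetical stand-ins, not the actual project instructions.

```r
# Toy stand-in for one raw dataset; column names and values are hypothetical.
raw <- data.frame(
  FIPS  = c(1001, 1001, 1003, 1003),
  Year  = c(2019, 2020, 2019, 2020),
  Value = c(3.1, 5.2, 2.9, 5.6),
  Note  = c("a", "b", "c", "d")
)

dim(raw)  # observations and variables before filtering

# Keep only the rows and columns of interest (here: one year, two columns).
filtered <- raw[raw$Year == 2020, c("FIPS", "Value")]

dim(filtered)  # observations and variables after filtering
```

Comparing the two `dim()` results is a quick check that the filter removed what was intended and nothing more.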
Before merging the datasets into one large dataset, I first converted each individual dataset’s FIPS value to a character variable.
Then, I joined all 4 individual datasets into one large dataset named Final Data.
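The two steps above can be sketched in base R as follows. This is an illustration under assumptions, not the project's actual code: the four toy data frames, their non-FIPS column names, and the choice of an inner join are all hypothetical. An inner join would be consistent with Final Data having fewer observations than any single filtered dataset, since only counties present in all four sources survive.

```r
# Toy stand-ins for the four filtered datasets (hypothetical values).
election     <- data.frame(FIPS = c(1001, 1003, 1005), Winner = c("A", "B", "A"))
unemployment <- data.frame(FIPS = c(1001, 1003, 1005), Unemployment.Rate = c(5.2, 5.6, 7.0))
poverty      <- data.frame(FIPS = c("1001", "1003", "1007"), Poverty.Rate = c(12.1, 10.2, 20.3))
education    <- data.frame(FIPS = c(1001, 1003, 1005), Bachelors = c(26.6, 31.9, 11.6))

datasets <- list(election, unemployment, poverty, education)

# Convert every FIPS column to character so the join key has one type everywhere.
datasets <- lapply(datasets, function(d) {
  d$FIPS <- as.character(d$FIPS)
  d
})

# Chain inner joins on FIPS; only counties present in all four datasets survive.
final_data <- Reduce(function(x, y) merge(x, y, by = "FIPS"), datasets)

dim(final_data)
```

Converting the key first avoids mismatches when one source stores FIPS as a number and another stores it as text.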
The Final Data dataset now has 3114 observations of 13 variables. Below are the descriptions of each remaining variable:
The histogram of Bachelors above shows that in the majority of U.S. counties, less than 40% of the adult population has at least a Bachelor’s degree. The highest concentration is around 15-20%. The distribution of Bachelors is right-skewed, indicating that relatively few counties have a high proportion of adult residents who finished a 4-year college or university. There are a few potential outliers beyond a 70% Bachelor’s degree rate, indicating a higher-than-usual concentration of college graduates.
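As a rough illustration of how the shape described above can be checked numerically, the sketch below bins a small vector and compares its mean and median. The values in `bachelors` are made up for illustration and are not the county data; the 5-point bin width is also an assumption.

```r
# Hypothetical county-level percentages, not the real data.
bachelors <- c(9, 12, 14, 15, 16, 17, 18, 19, 20, 22, 25, 31, 44, 72)

# Histogram with 5-point bins; plot = FALSE returns the counts without drawing.
h <- hist(bachelors, breaks = seq(0, 100, by = 5), plot = FALSE)

# Midpoint of the most populated bin (the peak of the histogram).
h$mids[which.max(h$counts)]

# In a right-skewed distribution the mean is pulled above the median.
mean(bachelors) > median(bachelors)
```

Dropping `plot = FALSE` draws the histogram instead of only returning the bin counts.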
The scatterplot of Bachelors and Poverty.Rate offers a potential explanation for the right-skewed distribution of the Bachelors variable. As shown above, there is a negative association between Bachelors and Poverty.Rate: as a county’s poverty rate increases, the percentage of residents with at least a Bachelor’s degree decreases. This negative correlation suggests that adults or soon-to-be adults in higher-poverty areas have a lower chance of finishing a traditional 4-year college or university. There appear to be two counties with a Bachelor’s rate close to 0% and a relatively low poverty rate, suggesting a potential confounding variable that the dataset may not account for, such as access to available schooling.
To add to the narrative, the scatterplot above shows the relationship between Less.Than.HS and Poverty.Rate. Less.Than.HS is the “lowest” education category available for residents in this study. There is a slight positive association between Less.Than.HS and Poverty.Rate: as a county’s poverty rate increases, the percentage of residents without a high school diploma also increases. This positive correlation suggests that adults or soon-to-be adults in higher-poverty areas have a higher chance of not graduating high school. This aligns with the previous scatterplot, which suggested that residents of higher-poverty counties are less likely to attain higher education. There appear to be a few outliers, one of which is around the 75% Less.Than.HS rate. Further research could be done on this county, potentially finding confounding variables that this dataset does not account for.
This exploratory data analysis provides insight into how adults’ educational experiences relate to poverty levels. Counties with a higher level of poverty have a higher proportion of adults who did not graduate high school and a lower proportion of adults with at least a Bachelor’s degree. Together, these findings show a pattern of high poverty coinciding with low educational attainment at the county level. They are relevant to policy makers, who could use this information to target higher-poverty counties and provide more funding for their educational systems, with the goal of improving residents’ educational outcomes.
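The direction and strength of the two associations described in the scatterplots can be quantified with a correlation coefficient and a fitted line. The sketch below uses a small made-up data frame that borrows the variable names from this report; the values are hypothetical, and Pearson correlation with a simple linear fit is one reasonable choice rather than the method used in the project.

```r
# Hypothetical county-level percentages, not the real data.
final_data <- data.frame(
  Poverty.Rate = c(8, 10, 12, 15, 18, 22, 27),
  Bachelors    = c(38, 30, 27, 20, 17, 14, 10),
  Less.Than.HS = c(7, 9, 8, 12, 14, 13, 19)
)

# Pearson correlation: the sign gives the direction, the magnitude the strength.
r_bachelors <- cor(final_data$Poverty.Rate, final_data$Bachelors)
r_less_hs   <- cor(final_data$Poverty.Rate, final_data$Less.Than.HS)

# Simple linear fit; the line can be overlaid on a scatterplot with abline(fit).
fit <- lm(Bachelors ~ Poverty.Rate, data = final_data)

# Slope: estimated change in Bachelors per 1-point rise in the poverty rate.
coef(fit)["Poverty.Rate"]
```

A negative `r_bachelors` and a positive `r_less_hs` would match the patterns read off the plots, although neither establishes that poverty causes the difference.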