Census data is an important source of information about how people in the United States live. It measures many factors, but perhaps the most important are those related to economic well-being. These measures are closely tied to lifestyle patterns and changes, and can have a significant impact on one's quality of life. We therefore decided to study two questions about indicators of financial status and their relation to different lifestyle patterns.
Our first question examines the correlation between household income and the type of health insurance people hold, either public or private. This question touches on how accessible certain types of health insurance are, which opens conversations about how people qualify for health insurance and what opportunities allow people to choose between public and private coverage. It can also open discussions about health differences between people with public and private insurance, and about whether coverage options have anything to do with potential discrepancies in health outcomes. Health insurance access and coverage directly relate to how well health issues are managed, and poorly managed health issues can lead to further financial burdens for individuals and their families. Asking questions like this can inform health insurance policy reform by considering how access to and qualification for public and private insurance relate to income.
Our second question asks how much income people of different marital statuses are left with after paying monthly rent. We asked this question as an equalizer, to see how well individuals were able to afford rent, because there were major discrepancies between the household income and monthly rent categories. This question touches on how affordable rent is given one's income, and it also sheds light on what people can generally afford for housing based on marital status. Additionally, it gives insight into how well people are able to afford other necessary expenses after rent, such as grocery bills and transportation costs. Marital status can have a large impact on finances, from filing taxes to the number of people supported on one's income. This research can further inform our first question by asking how much income people have to spend on other needs, like health insurance, and whether they have the means to pay for the coverage they would like to receive.
These questions bring to light major facets of the well-being of many Americans, and they could spark changes in policy regarding issues like health insurance and income assistance programs. Asking these questions can effect change that has a meaningful impact on the economic status of many Americans.
We obtained our 2019 data from the US Census Bureau. We selected approximately 40 variables from this database, which we then used to create our questions and analyses. The variables we mainly focused on to study our two questions were: household income (HHINCOME), access to private and public health insurance (HCOVPRIV and HCOVPUB), marital status (MARST), and monthly gross rent (RENTGRS). We chose to use 2019 data because many of our analyses dealt with economic well-being, which could have been negatively affected by the COVID-19 pandemic in later years. Below is a portion of the data collected for these variables. Our full dataset contains over 3 million observations in total, but many cases were removed during our preliminary analyses for data cleaning purposes.
The total household income is a variable the Census defines as the total
gross income between everyone in the household that is over the age of
15. Gross income is defined here as any income received on a regular
basis before payments for taxes, social security, etc. In the dataset,
there are almost 2400 observations in which this variable has an entered
value of 9999999, which we omitted as it represents large numbers out of
the scope of the Census's records. In addition to total household
income, we also examined monthly gross rent, which accounts for both the
monthly rent and the average monthly utility payment taken from the
total utility payments from that calendar year.
Type of health insurance is split between public and private health insurance, and these two variables were measured as binary variables, with 1 representing not having that particular type of insurance and 2 representing having it. We chose to focus only on public and private health insurance and to exclude the variables for Medicare and Medicaid because there is a larger difference in policy offerings between public and private coverage. We also excluded Medicare and Medicaid because they fall under public health insurance, which could lead to collinearity with the public insurance observations in our results. To produce some of our initial figures, we used the variable `HCOVANY`, which indicates whether the person had any kind of health insurance. This helped us form follow-up questions to investigate further. This variable was also coded as 1 for no and 2 for yes.
In terms of marital status, the Census Bureau has 6 different options
to choose from. The way they are measured is as follows: Married and
living together (1), Married and not living together (2), Separated
(3), Divorced (4), Widowed (5), and Never married/single (6). Below is a
bar chart showing the proportion of each group in the dataset. It is
clear that the majority of American residents in 2019 that responded to
the census were either married and together or never married.
Our overall dataset had 49 different variables, so it was difficult to narrow down which variables we wanted to use. As stated earlier, there are over 3 million observations, so we were working with a very large dataset. This was useful as we had lots of data to use for our predictive models, but also a hindrance at times due to the volume of observations we were attempting to work with.
Through our research, we sought to understand how household income relates to health insurance. First, we looked at whether household income was related to whether Americans had any health insurance. We used a box plot to analyze a binary variable entitled `HCOVANY`, which takes the value 1 if the person did not have any insurance and 2 if they did. Our preliminary analysis showed that median income was higher for those who did have insurance. We then turned to the question of whether the type of health insurance, private versus public, was correlated with household income for American residents in 2019. To answer this question, we collected data from the US Census on many indicators of quality of life, including multiple indicators of household income and binary variables indicating private and public health insurance. During our preliminary analysis, we found several edge cases in the private and public health insurance variables. First, several people reported having both public and private health insurance. Because we were interested in the difference between the two types as it related to household income, we removed those cases from our analysis. There were also several people in our dataset who did not have health insurance at all. Thus, we created a variable entitled `real_cases` to identify only those having one type of insurance or the other. From there, we tidied the data so that a single column represented health insurance type.
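The filtering and tidying step described above could be sketched as follows. This is a minimal Python illustration rather than the code used for the analysis: the function name `tidy_insurance`, the `insurance_type` column, and the sample rows are hypothetical, while the 1/2 coding of `HCOVPRIV` and `HCOVPUB` follows the description in the data section.

```python
# Codes follow the text: 1 = does not have this coverage, 2 = has it.
HAS = 2


def tidy_insurance(rows):
    """Keep respondents with exactly one insurance type and label it in one column."""
    tidy = []
    for row in rows:
        has_private = row["HCOVPRIV"] == HAS
        has_public = row["HCOVPUB"] == HAS
        if has_private == has_public:
            # Both types or neither type: excluded from the comparison.
            continue
        tidy.append({
            "HHINCOME": row["HHINCOME"],
            "insurance_type": "private" if has_private else "public",
        })
    return tidy


# Made-up rows for illustration only.
sample = [
    {"HHINCOME": 82000, "HCOVPRIV": 2, "HCOVPUB": 1},  # private only: kept
    {"HHINCOME": 31000, "HCOVPRIV": 1, "HCOVPUB": 2},  # public only: kept
    {"HHINCOME": 56000, "HCOVPRIV": 2, "HCOVPUB": 2},  # both: dropped
    {"HHINCOME": 24000, "HCOVPRIV": 1, "HCOVPUB": 1},  # neither: dropped
]
real_cases = tidy_insurance(sample)
```

Comparing the two boolean flags for equality drops the "both" and "neither" cases in a single check, which leaves one label per remaining respondent.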
Once we tidied the data, we were able to start visualizing it. We created a boxplot to compare household income across the two types of health insurance. We chose a boxplot in particular because we knew that there were outliers in the household income variable, so the median would be a more robust measure of central tendency than the mean. Here, we saw that the median household income for those with private health insurance was vastly larger than that for those with public health insurance. In fact, the median household income for those with private health insurance was just below the third quartile of household income for those with public health insurance. In other words, the median household income for those with private health insurance is nearly as high as the 75th percentile of household income for those with public health insurance.
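The comparison a boxplot makes, between one group's median and another group's third quartile, can also be computed directly. The sketch below uses Python's standard `statistics` module; the helper name `quartiles` and the income values are made up for illustration and are not taken from the dataset.

```python
from statistics import quantiles


def quartiles(values):
    """Return (Q1, median, Q3) using the statistics module's default method."""
    q1, q2, q3 = quantiles(values, n=4)
    return q1, q2, q3


# Made-up incomes for illustration only; they are not taken from the dataset.
incomes = {
    "private": [38000, 52000, 61000, 70000, 88000, 115000, 240000],
    "public": [12000, 19000, 27000, 33000, 41000, 58000, 76000],
}

for insurance_type, values in incomes.items():
    q1, median, q3 = quartiles(values)
    print(f"{insurance_type}: Q1={q1:,.0f} median={median:,.0f} Q3={q3:,.0f}")
```

Because quartiles depend only on rank order, a single very large income (such as the last private value above) shifts the mean but leaves the median largely unchanged.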
We also ran logistic regressions to predict private and public health
insurance coverage from household income. To do this, we first re-coded
both the private and public health insurance variables into binary 0-1
variables and then ran a logistic regression for each. Household income
was not a statistically significant predictor of having public health
insurance (p = 0.763). For private health insurance, the p-value of
0.056 fell just above the 0.05 threshold, so income was not a
significant predictor at the 95% confidence level either, although the
result was close. When we plotted the fitted logistic curves, the curve
predicting whether or not an observation had private health insurance
based on income was not representative of our points. For public
insurance, however, the fitted curve did reasonably fit our data. When
interpreting these results, we must remember that at the 95% confidence
level there was no evidence that household income is a statistically
significant predictor of either type of insurance.
We also conducted a two-sample t-test comparing household income between
the two types of health insurance. Here, we found statistically
significant evidence at the 95% confidence level to reject the null
hypothesis that mean household income was equal for the two insurance
groups. The mean household income for the private health insurance group
was $68,332.31, while the mean household income for the public health
insurance group was far smaller at only $34,444.55. When interpreting
this result, however, we must keep in mind that in the boxplot above we
saw large positive outliers for household income in the private
insurance group, which could be inflating that group's mean. Regardless,
this t-test, in conjunction with our logistic regressions and initial
analyses, seems to signify an income imbalance between the two groups.
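The test statistic for such a comparison can be sketched as follows. The text does not say whether a pooled-variance or unequal-variance test was used, so this example assumes Welch's unequal-variance form; the function name `welch_t` and the income values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up incomes for illustration only.
private_income = [52000, 61000, 70000, 88000, 115000]
public_income = [19000, 27000, 33000, 41000, 58000]

t_stat, df = welch_t(private_income, public_income)
print(f"t={t_stat:.2f} df={df:.1f}")
```

The p-value would then come from a t distribution with `df` degrees of freedom, which is not available in the standard library and is left out here.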
For our final follow-up question, we saw that married and widowed individuals had the most variation in income, and decided to investigate that occurrence. We wanted to see if they have higher-paying jobs or mortgages, or if they are living in houses with higher values. To study this, we first tidied the data to eliminate extraneous variables that were unrelated to marital status and non-rent income. We then created the non-rent income variable by subtracting `RENTGRS`, monthly gross rent, from `HHINCOME`, household income. As mentioned previously, this helps account for the discrepancies between household income and rent. The new variable, non-rent income, measures people's income after paying monthly rent, and by using it as our dependent variable we were able to analyze how much money was left over for each marital status group.
After focusing on the variables needed, we found that, in comparison to widowed individuals, married people seemed to have higher non-rent income. We visualized this with a box plot, as seen below, that compares non-rent income across marital statuses, which are shown along the x-axis. The six marital status categories are married together, married not together, separated, divorced, widowed, and never married/single, numbered 1-6 respectively. As seen in this box plot, non-rent income is higher for individuals who are married than for those who are not, and the groups differ in both skew and spread. The final marital status, 6, is slightly positively skewed, and the married together box sits highest, with its quartiles above those of the other marital statuses. This suggests that non-rent income is lower for individuals who have never been married or are single.
This claim is further supported by the summary statistics, which show a slight negative correlation of -0.216 between non-rent income and the numeric marital status code. Because married and together is coded as 1, this indicates that non-rent income tends to be higher at lower codes, particularly for individuals who are married and together.
We then ran an ANOVA test to compare non-rent income levels across the different marital statuses. Here we saw a statistically significant effect of marital status as a predictor of non-rent income. We then ran a Tukey multiple comparison test to find which pairs were significantly different. All pairs except 5 and 3, widowed and separated, were statistically significant at the 95% confidence level. The difference between widowed and separated, though, was close to significance, meaning that if our sample size were larger we might have seen a statistically significant difference at the 95% confidence level.
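The F statistic behind a one-way ANOVA of this kind can be sketched with the Python standard library. The rows below are made up, the function name `one_way_anova_f` is hypothetical, and non-rent income is computed as `HHINCOME` minus `RENTGRS`, as described in the text. The Tukey comparison is not reproduced here.

```python
from collections import defaultdict
from statistics import mean


def one_way_anova_f(groups):
    """Return (F statistic, between-group df, within-group df)."""
    values = [v for g in groups.values() for v in g]
    grand_mean = mean(values)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups.values():
        group_mean = mean(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up rows for illustration only; MARST codes follow the data section.
rows = [
    {"MARST": 1, "HHINCOME": 90000, "RENTGRS": 1500},
    {"MARST": 1, "HHINCOME": 72000, "RENTGRS": 1300},
    {"MARST": 4, "HHINCOME": 48000, "RENTGRS": 1100},
    {"MARST": 4, "HHINCOME": 39000, "RENTGRS": 950},
    {"MARST": 6, "HHINCOME": 41000, "RENTGRS": 1000},
    {"MARST": 6, "HHINCOME": 35000, "RENTGRS": 900},
]

non_rent_by_status = defaultdict(list)
for row in rows:
    non_rent_by_status[row["MARST"]].append(row["HHINCOME"] - row["RENTGRS"])

f_stat, df_between, df_within = one_way_anova_f(non_rent_by_status)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F value means the variation between marital status groups is large relative to the variation within them; the pairwise follow-up then identifies which groups differ.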
These results speak to the relationship between the income of married individuals and the income of those who are unmarried or widowed. Our investigation of these variables indicates that married people have higher non-rent income than those who are not married.
We originally set out to find the relationship between household income and the type of health insurance one possesses. Through our analysis, we found that people with private health insurance generally have a much greater income than those with public health insurance. In fact, the median income of those with public insurance was less than half that of those with private insurance. While we would expect those with private insurance to generally have higher earnings, the magnitude of the disparity was shocking. The data shows that those able to afford private health insurance generally choose it in favor of public insurance, which may suggest that there is a difference in quality between the two types of plans. These results convey the possibility that lower-income families are being priced out of high-quality healthcare, which could be a pressing national issue. Not only does this impact families with a low household income, it has nation-wide ramifications. When people are unable to work due to illness or the illness of a family member, they are not contributing to the economy. When a considerable portion of the population is more likely than necessary to miss work, the labor force is not operating at its full potential, which could have a substantial impact on GDP. Aside from the economic implications of these results, the public welfare externalities must be considered. The ability of families to quickly and effectively manage illness would impact not just the welfare of these families, but also public health across the country by more effectively controlling the spread of disease. Additionally, individuals dealing with chronic illnesses may not be getting the symptom management they need, leading to a lower quality of life and an inability to hold certain jobs, work certain hours, or make enough money to pay for housing, bills, and medical expenses. 
To continue research on this concerning reality, the types of insurance should be further broken down by coverage to establish more categories of quality of care and create a clearer picture of how exactly the affordability of these plans impacts American families. Furthermore, geographic location of these households should be taken into account to determine if the system as it stands appears to more harshly affect certain populations than others. In this way, localized issues could receive more attention and policies in these areas could be tailored to reflect the specific struggles of the residing families. Overall, the disparity in affordability and accessibility between private and public health insurance is a concerning issue with national consequences and merits further research.
Our second research question examined the relationship between marital status and non-rent income. We found that individuals who were separated, divorced, widowed, or single tended to earn less after accounting for rent than married individuals. There was also greater variation in non-rent income for unmarried people. The Tukey pairwise comparison showed that all but one of the marital status pairs had a statistically significant difference in mean earnings, and that particular pair was still close to significance. Expecting all groups to have roughly equivalent mean non-rent income, we were surprised at how stratified income was by marital status. When drafting policy at any level of government, it is very important to keep in mind the challenges faced by constituents. Any measures taken to improve the quality of life for a given population should aim to do the most good for the most people. As such, it is important to analyze which groups face the greatest obstacles and focus on removing them or alleviating their negative consequences. People with different marital statuses experience distinct financial pressures. For example, many married people have children, which constitute a major expense, while those who are separated or divorced often have significant legal expenses. Given these differences, policies aimed at easing the financial burden on any one of these groups should be specialized accordingly. Our research found that certain marital status strata (separated, divorced, single, widowed) faced consistently lower incomes than their married counterparts. Thus, while we should aim to enhance every citizen's quality of life, it may be wise to begin with the unmarried population, given that they appear to carry a greater financial burden than the married population.
To further this research, other aspects of financial health aside from non-rent income should be considered for each marital stratum in order to establish how much of this income disparity is due to inherent unobservables in the individuals, perhaps leading them to join a less lucrative career field, and how much is due to financial obstacles disproportionately affecting a certain class. This would also help to create a more holistic image of the struggles faced by individuals of each marital status, so that these hindrances can be more effectively targeted. In conclusion, married individuals generate significantly more non-rent income than their unmarried counterparts, and the reasons for this disparity should be further examined to understand the effects of policies on certain groups of people.