Introduction
Research Question: How much does financial aid affect graduation rates in different types of institutions?
College Completion, the source, examines data and trends at 3,800 degree-granting institutions in the United States (excluding territories) that reported a first-time, full-time degree-seeking undergraduate cohort, had a total of at least 100 students at the undergraduate level in 2013, and awarded undergraduate degrees between 2011 and 2013. It also includes colleges and universities that met the same criteria in 2010.
One limitation is that the data are dated: they were collected between 2010 and 2013. As such, the relationships shown here might not reflect more recent trends.
Exploratory data analysis
Preview of the Dataset
| Institution | Type | Financial aid amount ($) | Graduation rate (%) |
|---|---|---|---|
| Alabama A&M University | Public | 7142 | 10.0 |
| University of Alabama at Birmingham | Public | 6088 | 29.4 |
| Amridge University | Private not-for-profit | 2540 | 0.0 |
| University of Alabama at Huntsville | Public | 6647 | 16.5 |
| Alabama State University | Public | 7256 | 8.8 |
| University of Alabama at Tuscaloosa | Public | 10390 | 42.7 |
Each row represents a university, its type, financial aid value, and graduation rate.
- For private for-profit colleges, the median graduation rate is 29.11% and the median financial aid amount is $4,632.
- For private not-for-profit colleges, the median graduation rate is 41.00% and the median financial aid amount is $13,322.
- For public colleges, the median graduation rate is 13.20% and the median financial aid amount is $4,667.
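As a minimal sketch of how such group medians can be computed, the snippet below groups rows by institution type and takes the median of each column. It uses only the six preview rows shown above, so its output describes that small sample and will not match the full-dataset medians; the names `preview_rows` and `medians_by_type` are hypothetical.

```python
from collections import defaultdict
from statistics import median

# The six preview rows: (institution, type, financial aid amount, graduation rate).
preview_rows = [
    ("Alabama A&M University", "Public", 7142, 10.0),
    ("University of Alabama at Birmingham", "Public", 6088, 29.4),
    ("Amridge University", "Private not-for-profit", 2540, 0.0),
    ("University of Alabama at Huntsville", "Public", 6647, 16.5),
    ("Alabama State University", "Public", 7256, 8.8),
    ("University of Alabama at Tuscaloosa", "Public", 10390, 42.7),
]


def medians_by_type(rows):
    """Return {institution type: (median aid amount, median graduation rate)}."""
    grouped = defaultdict(list)
    for _name, inst_type, aid, grad_rate in rows:
        grouped[inst_type].append((aid, grad_rate))
    return {
        inst_type: (
            median(aid for aid, _ in values),
            median(rate for _, rate in values),
        )
        for inst_type, values in grouped.items()
    }


summary = medians_by_type(preview_rows)
for inst_type, (aid_median, rate_median) in summary.items():
    print(f"{inst_type}: median aid ${aid_median:,}, median graduation rate {rate_median:.1f}%")
```

Medians are used rather than means because they are less sensitive to extreme values, such as institutions with a recorded graduation rate of 0%.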
Graph 
Each point represents a university.
All three graphs portray a positive correlation between graduation rate and financial aid amount. The slopes of the regression lines for private for-profit and public institutions seem approximately equal. The regression line for private not-for-profit institutions is less steep than the other two.
All three types of institution have graduation rates ranging from 0% to 100%. Private not-for-profit has the highest median graduation rate at 41.00%, followed by private for-profit at 29.11%, and public at 13.20%.
Regarding financial aid amount, the ranges for private for-profit and public are considerably smaller than the range for private not-for-profit. Private not-for-profit has the highest median financial aid amount at $13,322, followed by public at $4,667, and private for-profit at $4,632.
Multiple regression
| Term | Estimate | Std. error | Statistic | P-value | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| intercept | 13.361 | 2.239 | 5.968 | 0.000 | 8.972 | 17.751 |
| aid_values | 0.003 | 0.000 | 7.335 | 0.000 | 0.002 | 0.004 |
| typePrivate not-for-profit | 1.589 | 2.509 | 0.633 | 0.527 | -3.331 | 6.509 |
| typePublic | -12.760 | 2.548 | -5.008 | 0.000 | -17.755 | -7.765 |
| aid_values:typePrivate not-for-profit | -0.002 | 0.000 | -3.296 | 0.001 | -0.002 | -0.001 |
| aid_values:typePublic | 0.000 | 0.001 | -0.152 | 0.879 | -0.001 | 0.001 |
The numerical explanatory variable is the financial aid amount, the categorical explanatory variable is the type of institution, and the numerical outcome variable is the graduation rate.
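Written out, a model with these variables and the interaction terms listed in the regression table has the form below. This is a sketch consistent with the table's terms rather than the authors' own notation: the coefficient labels \(b_0, \dots, b_5\) and the indicator variables \(I_{\text{NFP}}\) and \(I_{\text{Public}}\) (equal to 1 for that institution type and 0 otherwise) are introduced here for illustration.

\[
\widehat{grad rate} = b_0 + b_1 \times finaid + b_2 \times I_{\text{NFP}} + b_3 \times I_{\text{Public}} + b_4 \times finaid \times I_{\text{NFP}} + b_5 \times finaid \times I_{\text{Public}}
\]

Private for-profit is the baseline type, so both indicators are 0 for it and its line is \(b_0 + b_1 \times finaid\). For the other two types, \(b_2\) or \(b_3\) shifts the intercept and \(b_4\) or \(b_5\) shifts the slope.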
Statistical interpretation
Intercept:
- The intercept of 13.4% represents the predicted graduation rate for a private for-profit institution with a financial aid amount of $0.
- The intercept of 14.99% (= 13.4 + 1.59) represents the predicted graduation rate for a private not-for-profit institution with a financial aid amount of $0.
- The intercept of 0.6% (= 13.4 - 12.8) represents the predicted graduation rate for a public institution with a financial aid amount of $0.
In our data, however, the intercepts have limited practical interpretation because no institution has an average financial aid amount of $0.
Slope:
- The slope of 0.00300 represents that, all other things being equal, for every $1 increase in financial aid amount, there is on average an associated increase of 0.003 percentage points in graduation rate for private for-profit institutions.
- The slope of 0.00100 (= 0.00300 - 0.00200) represents that, all other things being equal, for every $1 increase in financial aid amount, there is on average an associated increase of 0.001 percentage points in graduation rate for private not-for-profit institutions.
- The slope of 0.00300 (= 0.00300 + 0.00000) represents that, all other things being equal, for every $1 increase in financial aid amount, there is on average an associated increase of 0.003 percentage points in graduation rate for public institutions.
Modelling equation for:
- Private for-profit: \(\widehat{grad rate} = 13.4 + 0.00300 \times finaid\) (%)
- Private not-for-profit: \(\widehat{grad rate} = 14.99 + 0.00100 \times finaid\) (%)
- Public: \(\widehat{grad rate} = 0.6 + 0.00300 \times finaid\) (%)
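The three lines can be derived mechanically from the regression table by adding each type's offsets to the baseline intercept and slope. The sketch below does this with the table's rounded estimates; the dictionary keys and the functions `group_line` and `predict` are hypothetical names. Because it adds the unrounded baseline intercept of 13.361, it gives 14.95 for the private not-for-profit intercept, slightly different from the 14.99 obtained in the text by rounding to 13.4 first.

```python
# Coefficient estimates copied from the regression table (rounded to 3 decimals there).
coef = {
    "intercept": 13.361,      # baseline: private for-profit
    "aid": 0.003,             # baseline slope
    "type_nfp": 1.589,        # intercept offset, private not-for-profit
    "type_public": -12.760,   # intercept offset, public
    "aid_x_nfp": -0.002,      # slope offset, private not-for-profit
    "aid_x_public": 0.000,    # slope offset, public
}


def group_line(inst_type):
    """Return (intercept, slope) of the fitted line for one institution type."""
    intercept, slope = coef["intercept"], coef["aid"]
    if inst_type == "Private not-for-profit":
        intercept += coef["type_nfp"]
        slope += coef["aid_x_nfp"]
    elif inst_type == "Public":
        intercept += coef["type_public"]
        slope += coef["aid_x_public"]
    elif inst_type != "Private for-profit":
        raise ValueError(f"unknown institution type: {inst_type}")
    return intercept, slope


def predict(inst_type, aid):
    """Predicted graduation rate (%) for a given financial aid amount ($)."""
    intercept, slope = group_line(inst_type)
    return intercept + slope * aid


for inst_type in ("Private for-profit", "Private not-for-profit", "Public"):
    intercept, slope = group_line(inst_type)
    print(f"{inst_type}: grad rate = {intercept:.2f} + {slope:.3f} * finaid")
```

For example, `predict("Private for-profit", 5000)` evaluates the baseline line at a financial aid amount of $5,000.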
As observed initially, the regression lines for private for-profit institutions and for public institutions are approximately parallel, and they are steeper than the regression line for private not-for-profit institutions. For a given financial aid amount, the predicted graduation rate for private for-profit institutions is about 12.8 percentage points higher than that of public institutions.
One limitation is that this analysis only establishes correlation; it does not establish causation. Also, institutions with a recorded graduation rate of 0% are included in the analysis, and such data points may distort the fitted lines.
Non-statistical interpretation
As expected, the graduation rates increase with financial aid amount for all three types of institutions. The rate of increase is faster for private for-profit and public institutions than it is for private not-for-profit institutions.
Inference for multiple regression
Regression Equations:
- Private for-profit: \(\widehat{grad rate} = 13.4 + 0.00300 \times finaid\) (%)
- Private not-for-profit: \(\widehat{grad rate} = 14.99 + 0.00100 \times finaid\) (%)
- Public: \(\widehat{grad rate} = 0.6 + 0.00300 \times finaid\) (%)
Confidence Interval Analysis:
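- For private for-profit colleges, the 95% confidence interval for the slope was (0.002, 0.004). As such, we are 95% confident this interval captures the true slope. In other words, if we repeated this process many times, about 95% of the confidence intervals constructed this way would capture the true slope.
- For private not-for-profit colleges, the table reports a 95% confidence interval of (-0.002, -0.001) for the difference between this group's slope and the private for-profit slope. Shifting it by the baseline slope estimate gives roughly (0.003 + (-0.002), 0.003 + (-0.001)) = (0.001, 0.002) for the slope itself. This is only an approximation, because it ignores the uncertainty in the baseline slope estimate.
- For public colleges, the table reports a 95% confidence interval of (-0.001, 0.001) for the difference between this group's slope and the private for-profit slope. Shifting it by the baseline slope estimate gives roughly (0.003 + (-0.001), 0.003 + 0.001) = (0.002, 0.004) for the slope itself, with the same caveat.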
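For the two non-baseline types, the slope is the sum of two estimated coefficients, so an exact interval needs the standard error of that sum. As a sketch, with \(b_1\) denoting the baseline slope and \(b_4\) the interaction coefficient for the type in question (labels introduced here, not the authors' notation):

\[
\operatorname{SE}(b_1 + b_4) = \sqrt{\operatorname{Var}(b_1) + \operatorname{Var}(b_4) + 2\operatorname{Cov}(b_1, b_4)}
\]

\[
(b_1 + b_4) \pm t^{*} \times \operatorname{SE}(b_1 + b_4)
\]

Here \(t^{*}\) is the critical value for a 95% interval. The covariance term is not reported in the regression table, so this interval cannot be computed from the table alone; it would have to come from the fitted model's coefficient covariance matrix.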
P-value Analysis:
- For private for-profit colleges, the p-value for the slope was reported as 0.000, which is less than \(\alpha\) = 0.01, so we reject the null hypothesis that the slope is equal to 0. Thus, we have strong evidence that the slope is not equal to 0; in other words, there exists a relationship between graduation rate and financial aid amount in private for-profit colleges.
- For private not-for-profit colleges, the p-value for the interaction term was 0.001, which is less than \(\alpha\) = 0.01. This term measures the difference between the private not-for-profit slope and the private for-profit slope, so we reject the null hypothesis that the two slopes are equal. Thus, we have strong evidence that the relationship between graduation rate and financial aid amount is weaker in private not-for-profit colleges than in private for-profit colleges. The estimated slope of 0.001 is still positive, although this test alone does not establish that it differs from 0.
- For public colleges, the p-value for the interaction term was 0.879, which is greater than \(\alpha\) = 0.01, so we fail to reject the null hypothesis that the public slope is equal to the private for-profit slope. Thus, we do not have sufficient evidence that the two slopes differ. Note that this test compares the two slopes; it does not test whether the public slope itself is equal to 0.
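The statistic and p-value columns of the regression table can be approximately reproduced from the estimate and standard error of each term. The sketch below divides one by the other and uses a normal approximation for the two-sided p-value, which assumes the sample is large enough for the t distribution to be close to normal; the function name `wald_test` is hypothetical. Only terms whose standard error survives rounding in the table are used, since the slope terms have standard errors that round to 0.000 or 0.001.

```python
import math


def wald_test(estimate, std_error):
    """Return (test statistic, two-sided p-value) under a normal approximation."""
    z = estimate / std_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# (estimate, standard error) pairs copied from the regression table.
terms = {
    "intercept": (13.361, 2.239),
    "typePrivate not-for-profit": (1.589, 2.509),
    "typePublic": (-12.760, 2.548),
}

results = {name: wald_test(est, se) for name, (est, se) in terms.items()}
for name, (z, p_value) in results.items():
    print(f"{name}: statistic = {z:.3f}, p-value = {p_value:.3f}")
```

Small differences from the table are expected because the inputs are already rounded to three decimals.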
Residual Visualization: 

Residual Analysis:
- The scatterplot of residuals shows no apparent pattern, which suggests the spread of the residuals is constant. Thus, the equal variance condition is met. The histogram shows a unimodal and symmetric distribution of residuals, which suggests the residuals are approximately normally distributed. Thus, the normal population condition is met.
- Also, from the original scatterplot, we can see that the points are approximately linear.
- Therefore, the conditions for inference are met and our inference for regression is valid.
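A residual is the observed graduation rate minus the rate predicted by the fitted line for that institution's type. As a minimal sketch, the snippet below computes residuals for the six preview rows using the rounded regression equations from this report; the names `fitted_lines`, `preview_rows`, and `residual` are hypothetical, and the values are rough because the slopes are rounded to three decimals.

```python
# Fitted lines as (intercept, slope), taken from the rounded regression equations.
fitted_lines = {
    "Private for-profit": (13.4, 0.003),
    "Private not-for-profit": (14.99, 0.001),
    "Public": (0.6, 0.003),
}

# The six preview rows: (institution, type, financial aid amount, graduation rate).
preview_rows = [
    ("Alabama A&M University", "Public", 7142, 10.0),
    ("University of Alabama at Birmingham", "Public", 6088, 29.4),
    ("Amridge University", "Private not-for-profit", 2540, 0.0),
    ("University of Alabama at Huntsville", "Public", 6647, 16.5),
    ("Alabama State University", "Public", 7256, 8.8),
    ("University of Alabama at Tuscaloosa", "Public", 10390, 42.7),
]


def residual(inst_type, aid, grad_rate):
    """Observed graduation rate minus the rate predicted for this type and aid amount."""
    intercept, slope = fitted_lines[inst_type]
    return grad_rate - (intercept + slope * aid)


residuals = {
    name: residual(inst_type, aid, grad_rate)
    for name, inst_type, aid, grad_rate in preview_rows
}
for name, value in residuals.items():
    print(f"{name}: residual = {value:+.2f}")
```

Plotting such residuals against the fitted values, and as a histogram, is what the equal variance and normality checks above rely on.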
Conclusion
- Summary: In our analysis of the relationship between financial aid and graduation rates for different types of institutions, all three types of institutions showed a positive correlation. Hypothesis testing showed strong evidence of a relationship between the two variables for private for-profit colleges, and strong evidence that the slope for private not-for-profit colleges differs from the private for-profit slope. For public colleges, the test found no evidence that the slope differs from the private for-profit slope. Linear regression models were used to describe the relationship for each type.
- Take-home message: Graduation rate increases as financial aid amount increases for all three types of institutions. The rate at which graduation rate increases for a given increase in financial aid amount is greater for private for-profit and public colleges than it is for private not-for-profit colleges.
- Limitations: One limitation is that the data used are dated; they were collected between 2010 and 2013. Also, the data include a substantial number of colleges with a recorded graduation rate of 0%. In addition, the hypothesis tests for private not-for-profit and public colleges compare their slopes with the private for-profit slope rather than testing each slope against 0 directly. Lastly, since this study is observational and not an experiment, we cannot conclude causation; we can only infer correlation.
- Future work: Future studies could identify potential confounding factors and incorporate them into the model. For example, we could examine the data while considering factors such as race, grades, wealth, and gender to inspect each factor's relationship with graduation rate.
Citations and References
These data were pulled from the College Completion microsite produced by The Chronicle of Higher Education with support from the Bill & Melinda Gates Foundation (Data Collector; Original Dataset).