State your research question, a description of the variables you’ll use, and your data sources (please include website links if possible).
Research Question: How much does financial aid affect graduation rates and retention rates at different types of institutions?
Variables:
- The name of the university/college (name) is our identification variable.
- The amount of financial aid (aid_values) is our numerical explanatory/predictor variable.
- The type of university (type) is our categorical explanatory/predictor variable.
- The graduation rate in expected time (i.e., completing a degree within 4 years at 4-year institutions) (grad_100), the graduation rate in 1.5 times expected time (i.e., completing a degree within 6 years at 4-year institutions) (grad_150), and the retention rate (retain) are our numerical outcome variables.
These data were pulled from the College Completion microsite produced by The Chronicle of Higher Education with support from the Bill & Melinda Gates Foundation. Links: http://collegecompletion.chronicle.com/ and https://data.world/databeats/college-completion/workspace/file?filename=cc_institution_details.csv
Use the clean_names() function from the janitor package, then select() only the variables you are going to use. Example:
| name | type | aid_values | grad_100 | grad_150 | retain |
|---|---|---|---|---|---|
| Alabama A&M University | Public | 7142 | 10.0 | 29.1 | 63.1 |
| University of Alabama at Birmingham | Public | 6088 | 29.4 | 53.5 | 80.2 |
| Amridge University | Private not-for-profit | 2540 | 0.0 | 66.7 | 37.5 |
| University of Alabama at Huntsville | Public | 6647 | 16.5 | 48.4 | 81.0 |
| Alabama State University | Public | 7256 | 8.8 | 25.2 | 62.2 |
| University of Alabama at Tuscaloosa | Public | 10390 | 42.7 | 66.7 | 87.0 |
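
The cleaning step that produced a table like this one can be sketched as follows. This is a minimal illustration, not the exact code used for the project: the raw column headers and the `Row ID` column are hypothetical stand-ins for whatever the downloaded CSV actually contains, and the values are the first three rows of the table above.

```r
library(janitor)
library(dplyr)

# Hypothetical raw data: the messy headers and the "Row ID" column are
# assumptions for illustration; the values come from the example table.
raw_colleges <- data.frame(
  "Row ID" = 1:3,
  "Name" = c("Alabama A&M University",
             "University of Alabama at Birmingham",
             "Amridge University"),
  "Type" = c("Public", "Public", "Private not-for-profit"),
  "Aid Values" = c(7142, 6088, 2540),
  "Grad 100" = c(10.0, 29.4, 0.0),
  "Grad 150" = c(29.1, 53.5, 66.7),
  "Retain" = c(63.1, 80.2, 37.5),
  check.names = FALSE  # keep the spaces so clean_names() has something to fix
)

colleges <- raw_colleges %>%
  clean_names() %>%  # convert headers to snake_case, e.g. "Aid Values" -> aid_values
  select(name, type, aid_values, grad_100, grad_150, retain)  # keep only the variables we use

head(colleges)
```

With the real file, the data frame would instead come from reading the CSV (for example with read.csv()), and the select() call would list whichever cleaned column names correspond to these six variables.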
Create “exploratory data analysis” visualizations of your data. At this point these are preliminary and can change for the submission, but the only requirement is that your visualizations use each of the measurement variables included in your dataset to test out if they work.
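
One possible preliminary visualization, sketched here under the assumption that the cleaned data look like the example table, plots financial aid against each of the three outcome rates, colored by institution type. The data frame below simply re-enters the six example rows so the block runs on its own; the object names (`colleges`, `colleges_long`, `eda_plot`) are illustrative.

```r
library(ggplot2)
library(tidyr)

# The six example rows from the table, re-entered so this block is self-contained.
colleges <- data.frame(
  name = c("Alabama A&M University",
           "University of Alabama at Birmingham",
           "Amridge University",
           "University of Alabama at Huntsville",
           "Alabama State University",
           "University of Alabama at Tuscaloosa"),
  type = c("Public", "Public", "Private not-for-profit",
           "Public", "Public", "Public"),
  aid_values = c(7142, 6088, 2540, 6647, 7256, 10390),
  grad_100 = c(10.0, 29.4, 0.0, 16.5, 8.8, 42.7),
  grad_150 = c(29.1, 53.5, 66.7, 48.4, 25.2, 66.7),
  retain = c(63.1, 80.2, 37.5, 81.0, 62.2, 87.0)
)

# Stack the three outcome columns so each can get its own panel.
colleges_long <- pivot_longer(
  colleges,
  cols = c(grad_100, grad_150, retain),
  names_to = "outcome",
  values_to = "rate"
)

# Scatterplot of aid vs. rate, one panel per outcome, colored by institution type.
eda_plot <- ggplot(colleges_long, aes(x = aid_values, y = rate, color = type)) +
  geom_point() +
  facet_wrap(~ outcome) +
  labs(
    x = "Financial aid (aid_values)",
    y = "Rate (%)",
    color = "Institution type"
  )

print(eda_plot)
```

Faceting by outcome keeps all three measurement variables on a shared scale in one figure; separate plots per outcome would work equally well for a first look.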