library(git2r)
library(usethis)
Each of our team members originally selected a few datasets that we compared as a group. Our group decided to explore a dataset Ana found related to the topic of racial and ethnic diversity at two-year and four-year colleges and universities in the United States. We initially planned to compare diversity at two-year community colleges versus four-year schools and were also curious about how diversity at the two types of schools varies across the country.
As we were exploring the dataset we identified a number of named colleges and universities that are known as “for-profit” institutions. Some of these institutions have been the subject of recent lawsuits regarding predatory practices (Halperin, 2020; Legal Services Center, 2020; Redman, 2020). The high number of current lawsuits regarding for-profit colleges’ predatory behaviors is a result of the actions of current Education Secretary Betsy DeVos. In 2019, DeVos “repealed an Obama-era regulation that sought to crack down on for-profit colleges and universities that produced graduates with no meaningful job prospects and mountains of student debt they could not hope to repay” (Green, 2019, para. 1). Another action that DeVos took was to deny debt forgiveness to students who had been prey to for-profit predatory educational institutions (Lobosco, 2019; Turner, 2019). We were aware of past and recent news articles describing the marketing strategies of these colleges that were aimed at students of color, low-income students, immigrant communities and students who are first in their family to go to college (Bonadies et al., 2018; Conti, 2019; Voorhees, 2019). Previous studies have found that loan debt is higher for students who have attended for-profit institutions, which disproportionately affects students of color.
Race and For-Profits
(Body, 2019)
Loan Default Rates at For-Profits
(Body, 2019)
Despite their high cost, for-profit institutions have lower graduation and employment rates than non-profit institutions.
For-Profit Graduation & Employment Rates
(Lopez, 2015)
We will examine our dataset and explore possible consistencies and/or inconsistencies with these previous reports.
How do racial and ethnic demographics vary between not-for-profit institutions and for-profit colleges in our dataset?
In our dataset, do the For-Profit Institutions have a higher proportion of students of color than other community colleges and four-year colleges, thus providing support to the argument that For-Profit colleges target students of color for enrollment?
For-Profit Institutions: For-profit institutions are defined by the way that “revenue earned by the school is invested”. For-profit colleges have investors who want to make a profit. Their operations management is based in part on maximizing the return for investors. “Money earned by the school may be used to pay out investors and award bonuses to executives, as well as sustain the operation’s profitability through aggressive marketing and recruitment strategies” (TBS Staff, 2019, para. 8).
Not-for-Profit Institutions: Non-profit colleges can be either public or private. Regardless of whether the school is public or private, non-profit colleges must “reinvest the money earned through enrollment into the educational mission” (TBS Staff, 2019, para. 8).
Our dataset did not include information on whether colleges were for-profit.
To determine which of the schools on our list were for-profit institutions, we decided to find a website with a list of for-profit institutions and scrape the data from that website to create a second dataset listing for-profit schools.
#Loading the rvest package
library('rvest')
## Loading required package: xml2
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2 v purrr 0.3.4
## v tibble 3.0.1 v dplyr 1.0.0
## v tidyr 1.1.0 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x readr::guess_encoding() masks rvest::guess_encoding()
## x purrr::is_empty() masks git2r::is_empty()
## x dplyr::lag() masks stats::lag()
## x purrr::pluck() masks rvest::pluck()
## x dplyr::pull() masks git2r::pull()
## x purrr::when() masks git2r::when()
library(dplyr)
#Specifying the url for desired website to be scraped
url <- 'https://en.wikipedia.org/wiki/List_of_for-profit_universities_and_colleges'
#Reading the HTML code from the website
webpage <- read_html(url)
#Using a CSS selector to scrape every link inside a list item
for_profit_html <- html_nodes(webpage,'li a')
#Converting the scraped links to text
for_profit <- html_text(for_profit_html)
#Let's have a look at the first few entries
head(for_profit)
## [1] "1 In the United States"
## [2] "2 Distance education (online)"
## [3] "3 Outside the United States"
## [4] "3.1 Closed or merged"
## [5] "4 For-profit colleges that became non-profit colleges"
## [6] "4.1 For-profit colleges that became non-profit colleges (closed)"
# for_profit
for_profit_new <- for_profit[25:215]
head(for_profit_new)
## [1] "Academy of Art University"
## [2] "American Career College"
## [3] "American College of Education"
## [4] "American InterContinental University"
## [5] "Career Education Corporation"
## [6] "American Military University"
for_profit_new[167:168] <- NA
for_profit_new
## [1] "Academy of Art University"
## [2] "American Career College"
## [3] "American College of Education"
## [4] "American InterContinental University"
## [5] "Career Education Corporation"
## [6] "American Military University"
## [7] "American National University"
## [8] "American University"
## [9] "National American University"
## [10] "American Public University"
## [11] "American Sentinel University"
## [12] "Ancora Education"
## [13] "Antonelli College"
## [14] "Art Institutes"
## [15] "ASA College"
## [16] "Aspen University"
## [17] "Bay State College"
## [18] "Berkeley College"
## [19] "University of California, Berkeley"
## [20] "Berklee College of Music"
## [21] "Berkeley College at Yale University"
## [22] "Blair College"
## [23] "Colorado Springs, Colorado"
## [24] "Blue Cliff College"
## [25] "Bradford School (Columbus)"
## [26] "Columbus, Ohio"
## [27] "Bradford School (Pittsburgh)"
## [28] "Pittsburgh, Pennsylvania"
## [29] "Branford Hall Career Institute"
## [30] "Broadview University"
## [31] "Brookline College"
## [32] "Bryant & Stratton College"
## [33] "Burrell College of Osteopathic Medicine"
## [34] "California Miramar University"
## [35] "California Northstate University College of Medicine"
## [36] "Capella University"
## [37] "Centura College"
## [38] "Chamberlain College of Nursing"
## [39] "Adtalem"
## [40] "Charleston School of Law"
## [41] "Charter College"
## [42] "The College of Westchester"
## [43] "West Chester University"
## [44] "West Chester, Pennsylvania"
## [45] "Colorado Technical University"
## [46] "Career Education Corporation"
## [47] "Columbia Southern University"
## [48] "Columbia University"
## [49] "Conservatory of Recording Arts and Sciences"
## [50] "Cortiva Institute"
## [51] "Daymar College"
## [52] "DeVry University"
## [53] "Keller School of Management"
## [54] "DigiPen Institute of Technology"
## [55] "Redmond, Washington"
## [56] "Eagle Gate College"
## [57] "ECPI University"
## [58] "Engine City Technical Institute"
## [59] "South Plainfield, New Jersey"
## [60] "Fashion Institute of Design & Merchandising"
## [61] "Fashion Institute of Technology"
## [62] "New York City"
## [63] "Five Towns College"
## [64] "Dix Hills, New York"
## [65] "Florida Career College"
## [66] "Florida Coastal School of Law"
## [67] "InfiLaw System"
## [68] "Florida Metropolitan University"
## [69] "Florida National University"
## [70] "Hialeah, Florida"
## [71] "Fortis College"
## [72] "Fox College"
## [73] "Chicago metropolitan area"
## [74] "Bedford Park"
## [75] "Tinley Park"
## [76] "Full Sail University"
## [77] "Winter Park, Florida"
## [78] "Georgia Medical Institute"
## [79] "Medical College of Georgia"
## [80] "Augusta University"
## [81] "Grand Canyon University"
## [82] "Grantham University"
## [83] "Kansas City, Missouri"
## [84] "Hamilton College"
## [85] "Hamilton College"
## [86] "Hamilton University"
## [87] "Harris School of Business"
## [88] "Idaho College of Osteopathic Medicine"
## [89] "International Education Corporation"
## [90] "Las Vegas College"
## [91] "Laureate International Universities"
## [92] "Walden University"
## [93] "Lincoln Tech"
## [94] "Lincoln University"
## [95] "Los Angeles Film School"
## [96] "McCann School of Business and Technology"
## [97] "Miami International University of Art and Design"
## [98] "Mildred Elley"
## [99] "Miller-Motte"
## [100] "Minneapolis Business College"
## [101] "Roseville, Minnesota"
## [102] "Monroe College"
## [103] "Mountain West College"
## [104] "National American University"
## [105] "Mall of America"
## [106] "American University"
## [107] "National College"
## [108] "National Institute of Technology (United States)"
## [109] "National Institutes of Technology"
## [110] "National Paralegal College"
## [111] "National University College"
## [112] "Neumont University"
## [113] "NewSchool of Architecture and Design"
## [114] "The New School"
## [115] "Northwestern College"
## [116] "Northwestern University"
## [117] "Ohio Business College"
## [118] "Olympia Career Training Institute"
## [119] "Pacific College of Oriental Medicine"
## [120] "Parks College"
## [121] "Paier College of Art"
## [122] "Pennco Tech"
## [123] "Pima Medical Institute"
## [124] "Pinnacle Career Institute"
## [125] "Pioneer Pacific College"
## [126] "Pittsburgh Technical Institute"
## [127] "Oakdale, Pennsylvania"
## [128] "Cranberry Township, Butler County, Pennsylvania"
## [129] "Platt College"
## [130] "Plaza College"
## [131] "Porter and Chester Institute"
## [132] "Post University"
## [133] "LIU Post"
## [134] "Potomac College"
## [135] "Provo College"
## [136] "Rasmussen College"
## [137] "Recording Radio Film Connection"
## [138] "Redstone College"
## [139] "Rocky Mountain College of Art and Design"
## [140] "Lakewood, Colorado"
## [141] "Rocky Mountain University of Health Professions"
## [142] "Rocky Vista University College of Osteopathic Medicine"
## [143] "SAE Institute"
## [144] "Salem International University"
## [145] "Salem, West Virginia"
## [146] "Salter College"
## [147] "San Joaquin Valley College"
## [148] "Schiller International University"
## [149] "School of Visual Arts"
## [150] "Seacoast Career Schools"
## [151] "South College"
## [152] "South University"
## [153] "Southern University"
## [154] "University of the South"
## [155] "Southern Careers Institute"
## [156] "Southern States University"
## [157] "Southwestern College"
## [158] "Southwestern University"
## [159] "Lincoln University"
## [160] "Spartan College of Aeronautics and Technology"
## [161] "Specs Howard School of Media Arts"
## [162] "Spencerian College"
## [163] "Stevens-Henager College"
## [164] "Stratford University"
## [165] "Strayer University"
## [166] "Sullivan University"
## [167] NA
## [168] NA
## [169] "UEI College"
## [170] "United States University"
## [171] "Universal Technical Institute"
## [172] "University of Advancing Technology"
## [173] "University of Phoenix"
## [174] "University of the Potomac"
## [175] "U.S. Career Institute"
## [176] "Fort Collins, Colorado"
## [177] "Vista College"
## [178] "Walden University"
## [179] "Waldorf College"
## [180] "Washington Technology University"
## [181] "West Coast University"
## [182] "Western Business College"
## [183] "Western International University"
## [184] "Apollo Group"
## [185] "Western State College of Law"
## [186] "Western State University College of Law"
## [187] "Fullerton, California"
## [188] "Western Governors University"
## [189] "Wood Tobé-Coburn School"
## [190] "New York City"
## [191] "Wyoming Technical Institute (WyoTech)"
final_for_profit <- na.omit(for_profit_new)
#final_for_profit
# tibble() replaces the deprecated data_frame()
df <- tibble(final_for_profit)
df
## # A tibble: 189 x 1
## final_for_profit
## <chr>
## 1 Academy of Art University
## 2 American Career College
## 3 American College of Education
## 4 American InterContinental University
## 5 Career Education Corporation
## 6 American Military University
## 7 American National University
## 8 American University
## 9 National American University
## 10 American Public University
## # ... with 179 more rows
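The scraped list still contains entries that are not schools, such as "Colorado Springs, Colorado" and "Columbus, Ohio". One possible way to drop them is a keyword filter. The following is a minimal sketch on a toy vector copied from the output above; the `school_pattern` keyword list is our own assumption, not part of the original analysis, and it would not remove non-profit schools that are merely linked from the page (for example "University of California, Berkeley").

```r
# Toy subset of scraped link texts, copied from the printed list above
scraped <- c("Blair College", "Colorado Springs, Colorado",
             "Bradford School (Columbus)", "Columbus, Ohio",
             "DigiPen Institute of Technology", "Redmond, Washington")

# Hypothetical keyword pattern; adjust it to the names actually present
school_pattern <- "College|University|Institute|School|Academy"

# Keep only entries whose text contains one of the keywords
likely_schools <- scraped[grepl(school_pattern, scraped)]
likely_schools
```

A manual review of the remaining entries would still be needed, since a keyword match does not establish that a school is for-profit.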
After trimming, the webscraping exercise produced a list of 189 entries, some of which are place names or other links rather than for-profit schools. We decided to take a random sample of n=35 schools from that list that also appeared in our dataset. We selected 35 schools so that the sample would exceed the common n ≥ 30 rule of thumb associated with the Central Limit Theorem. The result of the randomization we ran on the webscraped data is below. Each of us compared a portion of the list with our dataset to ensure that all of our selections were present on both lists. We ran multiple randomizations until we had a consistent list of 35 schools appearing in both datasets. Some of the randomly chosen schools had multiple locations, as is shown in the example set of schools below. The final list of schools was geographically diverse.
sample_n(df,35)
## # A tibble: 35 x 1
## final_for_profit
## <chr>
## 1 Centura College
## 2 Pittsburgh Technical Institute
## 3 The College of Westchester
## 4 SAE Institute
## 5 Academy of Art University
## 6 Pittsburgh, Pennsylvania
## 7 Pima Medical Institute
## 8 West Chester, Pennsylvania
## 9 Rocky Mountain University of Health Professions
## 10 Wyoming Technical Institute (WyoTech)
## # ... with 25 more rows
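Because the sample is redrawn on every run, the draw above cannot be repeated exactly. A minimal sketch of one way to make it repeatable, and to sample only from names found in both sources, is shown here on toy vectors. The vector contents and the seed value are illustrative assumptions, and exact name matching would miss campuses whose dataset names carry a location suffix.

```r
# Toy stand-ins for the scraped list and the diversity dataset names
scraped_names <- c("Centura College", "SAE Institute", "Pittsburgh, Pennsylvania",
                   "Vista College", "Bay State College")
dataset_names <- c("Centura College", "Vista College", "Bay State College",
                   "Smith College")

# Only names present in both sources are eligible for sampling
eligible <- intersect(scraped_names, dataset_names)

# Fixing the seed makes the draw repeatable (the seed value is arbitrary)
set.seed(101)
chosen <- sample(eligible, size = 2)
chosen
```

With the real data, `size` would be 35 and the two name vectors would come from `df` and the diversity file.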
The dataset we chose to work with provides diversity counts at institutions of higher learning in the United States. The dataset was retrieved from Kaggle, available here: https://www.kaggle.com/jessemostipak/college-tuition-diversity-and-pay?select=diversity_school.csv
The entire dataset is composed of five separate .csv files addressing school diversity, historical tuition rates, salary potential, the cost of tuition in 2016 and the income of students compared to tuition. We utilized the school diversity file only.
setwd("C:/Users/Jim/Desktop/jen/Data 101/class 7.28/Data101_Project_1")
diversity <- read_csv("diversity2.csv")
## Parsed with column specification:
## cols(
## name = col_character(),
## total_enrollment = col_double(),
## state = col_character(),
## category = col_character(),
## enrollment = col_double()
## )
head(diversity)
## # A tibble: 6 x 5
## name total_enrollment state category enrollment
## <chr> <dbl> <chr> <chr> <dbl>
## 1 University of Phoe~ 195059 Arizo~ Women 134722
## 2 University of Phoe~ 195059 Arizo~ American Indian / Alas~ 876
## 3 University of Phoe~ 195059 Arizo~ Asian 1959
## 4 University of Phoe~ 195059 Arizo~ Black 31455
## 5 University of Phoe~ 195059 Arizo~ Hispanic 13984
## 6 University of Phoe~ 195059 Arizo~ Native Hawaiian / Paci~ 1019
There were numerous “tidying” exercises that we undertook to clean our data and make the webscraping dataset and the diversity dataset comparable. Our tidying exercises included, but were not limited to, creating the following:
1. a new column calculating the percent of students in attendance based on race and ethnicity
2. a new binary column identifying the selected for-profit and not-for-profit schools represented by “1” and “0” respectively
3. two new datasets: one for our randomly selected for-profit schools and one for our randomly selected not-for-profit schools
diversity_new <- diversity %>% mutate(enrollment_percentage = enrollment/ total_enrollment*100)
head(diversity_new)
## # A tibble: 6 x 6
## name total_enrollment state category enrollment enrollment_perce~
## <chr> <dbl> <chr> <chr> <dbl> <dbl>
## 1 University~ 195059 Arizo~ Women 134722 69.1
## 2 University~ 195059 Arizo~ American Ind~ 876 0.449
## 3 University~ 195059 Arizo~ Asian 1959 1.00
## 4 University~ 195059 Arizo~ Black 31455 16.1
## 5 University~ 195059 Arizo~ Hispanic 13984 7.17
## 6 University~ 195059 Arizo~ Native Hawai~ 1019 0.522
We removed the categories Women, Two Or More Races, Non-Resident Foreign and Unknown because they could count the same students more than once.
diversity_new5 <- diversity_new %>% filter(category != "Women" & category != "Two Or More Races" & category != "Non-Resident Foreign" & category != "Unknown")
The unique() function was used to assess how many separate schools were included in the dataset. There were over 4,000 schools listed.
#unique(diversity_new5$name)
We also used these functions to look up individual schools, which was successful.
We checked the structure of our new dataset.
str(diversity_new5)
## tibble [32,235 x 6] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ name : chr [1:32235] "University of Phoenix-Arizona" "University of Phoenix-Arizona" "University of Phoenix-Arizona" "University of Phoenix-Arizona" ...
## $ total_enrollment : num [1:32235] 195059 195059 195059 195059 195059 ...
## $ state : chr [1:32235] "Arizona" "Arizona" "Arizona" "Arizona" ...
## $ category : chr [1:32235] "American Indian / Alaska Native" "Asian" "Black" "Hispanic" ...
## $ enrollment : num [1:32235] 876 1959 31455 13984 1019 ...
## $ enrollment_percentage: num [1:32235] 0.449 1.004 16.126 7.169 0.522 ...
## - attr(*, "spec")=
## .. cols(
## .. name = col_character(),
## .. total_enrollment = col_double(),
## .. state = col_character(),
## .. category = col_character(),
## .. enrollment = col_double()
## .. )
This initial approach renamed schools one by one. For schools with multiple locations, some with 20 or more, we needed a faster and more efficient option.
diversity_new5$name[diversity_new5$name == "Spencerian College at Louisville (Ky.)"] <- "Spencerian College"
diversity_new5$name[diversity_new5$name == "Spencerian College at Lexington (Ky.)"] <- "Spencerian College"
diversity_new5$name[diversity_new5$name == "NewSchool of Architecture and Design"] <- "NewSchool Arch. Design"
diversity_new5$name[diversity_new5$name == "Western International University"] <- "Western Intl Univ."
diversity_new5$name[diversity_new5$name == "Schiller International University"] <- "Schiller Intl Univ."
diversity_new5$name[diversity_new5$name == "National Paralegal College"] <- "Natl Paralegal College"
diversity_new5$name[diversity_new5$name == "WestCoast University"] <- "West Coast Univ."
#Failed Attempt
# ME <- diversity_new %>%
# select(contains('Mildred Elley'))
# ME
We discovered the “grepl” function which allowed us to change entire groups of rows that contained a similar word or set of words. All of the different locations could be changed in one set of code instead of multiple sets.
diversity_new5$name[grepl('Mildred Elley', diversity_new5$name)] <- 'Mildred Elley'
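To make the mechanics of this step concrete, here is a small sketch on a toy vector. The campus suffixes are invented for illustration and are not taken from the dataset.

```r
# Toy vector of names; the campus suffixes are illustrative
school <- c("Mildred Elley-Albany", "Smith College", "Mildred Elley-Pittsfield")

# grepl() returns one TRUE/FALSE per element
hits <- grepl("Mildred Elley", school)
hits

# Assigning through the logical index rewrites only the matching elements
school[hits] <- "Mildred Elley"
school
```

The same pattern is what the single line above does in one step: the logical vector from grepl() selects the rows, and the assignment overwrites only those rows.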
This code identified all of the entries that included “West Coast University” in the name.
which(grepl("West Coast University", diversity_new5$name))
## [1] 14911 14912 14913 14914 14915 14916 14917 15702 15703 15704 15705 15706
## [13] 15707 15708 17809 17810 17811 17812 17813 17814 17815 19538 19539 19540
## [25] 19541 19542 19543 19544 23465 23466 23467 23468 23469 23470 23471 31669
## [37] 31670 31671 31672 31673 31674 31675
These indices were used to show the current names in the dataset.
diversity_new5$name[14911]
## [1] "West Coast University -Los Angeles"
diversity_new5$name[15707]
## [1] "West Coast University -Orange County"
diversity_new5$name[17813]
## [1] "West Coast University at Ontario"
diversity_new5$name[19541]
## [1] "West Coast University trasound Institute"
diversity_new5$name[23466]
## [1] "West Coast University at Dallas"
diversity_new5$name[31675]
## [1] "West Coast University at Miami"
diversity_new5$name[grepl('West Coast University', diversity_new5$name)] <- 'West Coast Univ.'
We used the same indices to ensure that the names of all groups changed to the single name.
diversity_new5$name[14911]
## [1] "West Coast Univ."
diversity_new5$name[15707]
## [1] "West Coast Univ."
diversity_new5$name[17813]
## [1] "West Coast Univ."
diversity_new5$name[19541]
## [1] "West Coast Univ."
diversity_new5$name[23466]
## [1] "West Coast Univ."
diversity_new5$name[31675]
## [1] "West Coast Univ."
which(grepl("Brookline College", diversity_new5$name))
## [1] 15744 15745 15746 15747 15748 15749 15750 22324 22325 22326 22327 22328
## [13] 22329 22330 23661 23662 23663 23664 23665 23666 23667 24788 24789 24790
## [25] 24791 24792 24793 24794
diversity_new5$name[15746]
## [1] "Brookline College at Phoenix"
diversity_new5$name[24791]
## [1] "Brookline College at Albuquerque"
diversity_new5$name[grepl("Brookline College", diversity_new5$name)] <- "Brookline College"
diversity_new5$name[15746]
## [1] "Brookline College"
diversity_new5$name[24791]
## [1] "Brookline College"
We changed all of the names of our schools to simplified versions, which was particularly important for the schools with multiple locations.
diversity_new5$name[grepl('Clover Park Technical College', diversity_new5$name)] <- 'Clover Park Tech College'
diversity_new5$name[grepl('San Diego City College', diversity_new5$name)] <- 'San Diego City College'
diversity_new5$name[grepl('Aspen University', diversity_new5$name)] <- 'Aspen Univ.'
diversity_new5$name[grepl('Grand Canyon University', diversity_new5$name)] <- 'Grand Canyon Univ.'
diversity_new5$name[grepl('American Public University', diversity_new5$name)] <- 'American Public Univ.'
diversity_new5$name[grepl('Blue Cliff College', diversity_new5$name)] <- 'Blue Cliff College'
diversity_new5$name[grepl('University of Phoenix', diversity_new5$name)] <- 'University of Phoenix'
diversity_new5$name[grepl('Stevens-Henager College', diversity_new5$name)] <- 'Stevens-Henager College'
diversity_new5$name[grepl('DeVry University', diversity_new5$name)] <- 'DeVry Univ.'
diversity_new5$name[grepl('Pioneer Pacific College', diversity_new5$name)] <- 'Pioneer Pacific College'
diversity_new5$name[grepl('National College at', diversity_new5$name)] <- 'National College'
diversity_new5$name[grepl('Strayer University',diversity_new5$name)]<- 'Strayer Univ.'
diversity_new5$name[grepl('Lincoln Tech',diversity_new5$name)]<- 'Lincoln Tech'
diversity_new5$name[grepl('Fashion Institute of Design and Merchandising', diversity_new5$name)]<- 'Fashion Institute'
diversity_new5$name[grepl('Centura College',diversity_new5$name)]<- 'Centura College'
diversity_new5$name[grepl('Rasmussen College',diversity_new5$name)]<- 'Rasmussen College'
diversity_new5$name[grepl('Fortis College',diversity_new5$name)]<- 'Fortis College'
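The repeated renaming lines above could also be driven from a single lookup table. The following is a sketch under assumptions: `schools` is a toy stand-in for `diversity_new5`, the campus suffixes are invented, and `renames` is a hypothetical named vector holding only two of the pairs.

```r
# Toy data frame standing in for diversity_new5
schools <- data.frame(
  name = c("DeVry University-Illinois", "DeVry University-Ohio",
           "Strayer University-Virginia", "Smith College"),
  stringsAsFactors = FALSE
)

# Lookup table: search text on the left, simplified name on the right
renames <- c("DeVry University"   = "DeVry Univ.",
             "Strayer University" = "Strayer Univ.")

# fixed = TRUE treats each search text literally rather than as a regex
for (pattern in names(renames)) {
  hit <- grepl(pattern, schools$name, fixed = TRUE)
  schools$name[hit] <- renames[[pattern]]
}
schools$name
```

Keeping the pairs in one vector makes it easier to review the full set of renames and to add a school without copying a line of code.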
Southeastern Community College (Iowa) would not change names, and it did not appear when we searched for it using the which function. We were only able to filter this school by the total_enrollment column. The cause is the parentheses: grepl treats its pattern as a regular expression, in which parentheses group characters rather than match literally, so the pattern below looks for the text “College Iowa” and finds nothing.
which(grepl("Southeastern Community College (Iowa)", diversity_new5$name))
## integer(0)
diversity_new5$name[grepl("Southeastern Community College (Iowa)", diversity_new5$name)] <- "Southeastern Community College Iowa"
which(grepl("2987", diversity_new5$total_enrollment))
## [1] 10753 10754 10755 10756 10757 10758 10759
diversity_new5$name[10753:10759]
## [1] "Southeastern Community College (Iowa)"
## [2] "Southeastern Community College (Iowa)"
## [3] "Southeastern Community College (Iowa)"
## [4] "Southeastern Community College (Iowa)"
## [5] "Southeastern Community College (Iowa)"
## [6] "Southeastern Community College (Iowa)"
## [7] "Southeastern Community College (Iowa)"
which(grepl("Southeastern Community College", diversity_new5$name))
## [1] 10753 10754 10755 10756 10757 10758 10759 16318 16319 16320 16321 16322
## [13] 16323 16324
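The empty result for the "(Iowa)" search is consistent with grepl() reading the parentheses as regular-expression grouping. A short sketch on a toy vector shows the difference between the default behavior, the `fixed = TRUE` argument, and escaped parentheses; the vector `x` is illustrative only.

```r
x <- c("Southeastern Community College (Iowa)", "Southeastern Community College")

# As a regex, "(Iowa)" is a group matching "Iowa" without literal parentheses,
# so the pattern requires the text "College Iowa" and matches nothing here
as_regex <- grepl("Southeastern Community College (Iowa)", x)

# fixed = TRUE matches the parentheses literally
as_literal <- grepl("Southeastern Community College (Iowa)", x, fixed = TRUE)

# Escaping the parentheses in the regex is an equivalent alternative
escaped <- grepl("Southeastern Community College \\(Iowa\\)", x)

as_regex
as_literal
escaped
```

Either of the last two forms could be used in the rename and in which(), instead of filtering on total_enrollment.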
All of the schools with multiple locations have a single name now; however, their locations still occupy separate rows. We would like to combine all of the statistics for each “category” across all of the locations into one summarized row. We have not yet determined how to accomplish this. For example, in the following image you can see four locations for the University of Houston.
University of Houston Example
What we would like to do is combine each of the four into one set of 7 lines with the averages calculated for each category’s enrollment and enrollment percentage.
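One possible way to combine the locations is dplyr's group_by() and summarise(). The sketch below uses toy numbers, not values from the dataset. Note one deliberate difference from the description above: instead of averaging the per-campus percentages, it sums the counts and recomputes the percentage, so that large campuses are not weighted the same as small ones. Replacing sum() with mean() would give the simple averages instead.

```r
library(dplyr)

# Toy rows standing in for two campuses of one renamed school
campuses <- tibble(
  name = "Univ. of Houston",
  category = c("Asian", "Black", "Asian", "Black"),
  enrollment = c(100, 200, 300, 400),
  total_enrollment = c(1000, 1000, 2000, 2000)
)

# One row per school and category, with counts pooled across campuses
combined <- campuses %>%
  group_by(name, category) %>%
  summarise(
    enrollment = sum(enrollment),
    total_enrollment = sum(total_enrollment),
    .groups = "drop"
  ) %>%
  mutate(enrollment_percentage = enrollment / total_enrollment * 100)
combined
```

Applied to the renamed data, this would leave each multi-location school with one row per remaining category.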
For consistency of methodology we decided to select 35 schools from the not-for-profit educational institutions to compare to the 35 randomly selected for-profit schools. The final list of randomly selected not-for-profit schools was also geographically diverse.
sample_n(diversity_new,35)
## # A tibble: 35 x 6
## name total_enrollment state category enrollment enrollment_perc~
## <chr> <dbl> <chr> <chr> <dbl> <dbl>
## 1 <NA> 372 <NA> Native Hawa~ 3 0.806
## 2 ITT Technic~ 402 New M~ Non-Residen~ 0 0
## 3 Colby Commu~ 1508 Kansas Women 967 64.1
## 4 Danville Ar~ 3207 Illin~ American In~ 14 0.437
## 5 Saint Josep~ 135 Louis~ American In~ 0 0
## 6 Lane Commun~ 9236 Oregon Women 4676 50.6
## 7 Horizon Uni~ 34 Calif~ Two Or More~ 2 5.88
## 8 Stephens Co~ 862 Misso~ Black 117 13.6
## 9 St. John's ~ 20445 New Y~ Two Or More~ 642 3.14
## 10 Northland P~ 3211 Arizo~ Two Or More~ 48 1.49
## # ... with 25 more rows
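The call above draws 35 rows, that is, 35 school-and-category pairs, rather than 35 schools, and it can return rows for for-profit schools or rows with a missing name. A hedged sketch of sampling at the school level is shown here on toy data; `rows`, `for_profit_names` and the seed are illustrative assumptions.

```r
# Toy long-format data: several category rows per school
rows <- data.frame(
  name = rep(c("Smith College", "Clark College", "Colgate University",
               "Vista College"), each = 2),
  category = rep(c("Black", "White"), times = 4),
  stringsAsFactors = FALSE
)
for_profit_names <- c("Vista College")  # schools already classed as for-profit

# Sample from distinct school names, leaving out the for-profit ones
candidates <- setdiff(unique(rows$name), for_profit_names)
set.seed(101)  # arbitrary seed so the draw can be repeated
picked <- sample(candidates, size = 2)

# Keep every category row for each sampled school
not_profit_rows <- rows[rows$name %in% picked, ]
not_profit_rows
```

Sampling names first and then keeping all of their rows ensures that each selected school contributes a complete set of categories.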
diversity_new5$name[grepl('University of Houston', diversity_new5$name)] <- 'Univ. of Houston'
diversity_new5$name[grepl('University of Colorado', diversity_new5$name)] <- 'Univ. of Colorado'
diversity_new5$name[grepl('University of Massachusetts', diversity_new5$name)] <- 'Univ. of Massachusetts'
diversity_new5$name[grepl('Arizona College', diversity_new5$name)] <- 'Arizona College'
diversity_new5$name[grepl('University of Idaho', diversity_new5$name)] <- 'Univ. of Idaho'
diversity_new5$name[grepl('Clark College', diversity_new5$name)] <- 'Clark College'
diversity_new5$name[grepl('City University of New York Hunter College', diversity_new5$name)] <- 'City Univ. New York Hunter College'
diversity_new5$name[grepl('State University of New York', diversity_new5$name)] <- 'State Univ. of New York'
diversity_new5$name[grepl('Pennsylvania State University',diversity_new5$name)]<- 'Penn State Univ.'
diversity_new5$name[grepl('University of Minnesota',diversity_new5$name)]<- 'Univ. of Minn'
diversity_new5$name[grepl('Arizona State University',diversity_new5$name)]<- 'Arizona State'
diversity_new5$name[grepl('ITT Technical Institute',diversity_new5$name)]<- 'ITT Tech'
We were initially going to use this column to create our plots and complete our analysis; however, we realized it would be faster to create separate datasets with the randomly selected schools. The code below creates a new column with “1” for the identified for-profit colleges and “0” for the rest. Note that the check on the Spencerian College entry returns 0, most likely because the string "Spencerian_College" does not exactly match any name in the dataset.
# The data frame is already supplied by the pipe, so it is not repeated inside mutate()
diversity_new3 <- diversity_new %>%
  mutate(forProfit = ifelse(name == "Spencerian_College", 1,
                     ifelse(name == "Aspen University", 1,
                     ifelse(name == "American Public University system", 1, 0))))
diversity_new3$forProfit[47834] # Check on Spencerian_College entry
## [1] 0
diversity_new3$forProfit[24533] # Check on Aspen University
## [1] 1
diversity_new3$forProfit[123] # Check on American Public University system
## [1] 1
Is there a simpler way to group these?
for_profit2 <- diversity_new5 %>%
filter(name == "Spencerian College" |
name == "Mildred Elley" |
name == "Brookline College" |
name == "Grand Canyon Univ." |
name == "Aspen Univ." |
name == "American Public Univ." |
name == "Western Intl Univ." |
name == "NewSchool Arch. Design" |
name == "Schiller Intl Univ." |
name == "Natl Paralegal College" |
name == "West Coast Univ." |
name == "Blue Cliff College" |
name == "Walden University" |
name == "Neumont University" |
name == "University of Phoenix" |
name == "Stevens-Henager College" |
name == "DeVry Univ." |
name == "Pioneer Pacific College" |
name == "Stratford University" |
name == "Capella University" |
name == "Grantham University" |
name == "Redstone College" |
name == "National College" |
name == "Strayer Univ." |
name == "Lincoln Tech" |
name == "Fashion Institute" |
name == "Centura College" |
name == "Rasmussen College" |
name == "Fortis College" |
name == "Full Sail University" |
name == "Rocky Mountain College of Art & Design" |
name == "Minneapolis Business College" |
name == "Paier College of Art" |
name == "Vista College" |
name == "Bay State College" )
str(for_profit2)
## tibble [1,183 x 6] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ name : chr [1:1183] "University of Phoenix" "University of Phoenix" "University of Phoenix" "University of Phoenix" ...
## $ total_enrollment : num [1:1183] 195059 195059 195059 195059 195059 ...
## $ state : chr [1:1183] "Arizona" "Arizona" "Arizona" "Arizona" ...
## $ category : chr [1:1183] "American Indian / Alaska Native" "Asian" "Black" "Hispanic" ...
## $ enrollment : num [1:1183] 876 1959 31455 13984 1019 ...
## $ enrollment_percentage: num [1:1183] 0.449 1.004 16.126 7.169 0.522 ...
## - attr(*, "spec")=
## .. cols(
## .. name = col_character(),
## .. total_enrollment = col_double(),
## .. state = col_character(),
## .. category = col_character(),
## .. enrollment = col_double()
## .. )
head(for_profit2)
## # A tibble: 6 x 6
## name total_enrollment state category enrollment enrollment_perce~
## <chr> <dbl> <chr> <chr> <dbl> <dbl>
## 1 Universit~ 195059 Arizo~ American Indi~ 876 0.449
## 2 Universit~ 195059 Arizo~ Asian 1959 1.00
## 3 Universit~ 195059 Arizo~ Black 31455 16.1
## 4 Universit~ 195059 Arizo~ Hispanic 13984 7.17
## 5 Universit~ 195059 Arizo~ Native Hawaii~ 1019 0.522
## 6 Universit~ 195059 Arizo~ White 58209 29.8
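On the question of a simpler way to group these schools: the long chain of `name == ... |` comparisons can be replaced by a single `%in%` test against a character vector. This is a minimal sketch on toy data; `schools` and `for_profit_list` are illustrative names, and the real vector would hold all 35 school names.

```r
library(dplyr)

# Toy data standing in for diversity_new5
schools <- tibble(
  name = c("Vista College", "Smith College", "Bay State College"),
  enrollment = c(10, 20, 30)
)

# Names to keep; the full list would hold all 35 selected schools
for_profit_list <- c("Vista College", "Bay State College")

# %in% is TRUE for any row whose name appears in the vector
for_profit_subset <- schools %>% filter(name %in% for_profit_list)
for_profit_subset
```

The same vector could then be reused to build the binary for-profit column, which keeps the two steps consistent.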
not_profit <- diversity_new5 %>%
filter(name == "Univ. of Idaho" |
total_enrollment == 2987 |
name == "Clover Park Tech College"|
name == "Clark College"|
name == "City Univ. New York Hunter College"|
name == "Univ. of Houston" |
name == "Univ. of Colorado"|
name == "Univ. of Massachusetts" |
name == "Arizona College" |
name == "San Diego City College" |
name == "State Univ. of New York" |
name == "Eastern Shore Community College" |
name == "Southern Oregon University" |
name == "Santa Monica College" |
name == "Adirondack Community College" |
name == "East Georgia State College" |
name == "Smith College" |
name == "East Tennessee State University" |
name == "Austin Community College" |
name == "Tompkins Cortland Community College" |
name == "Penn State Univ." |
name == "Univ. of Minn" |
name == "Arizona State" |
name == "ITT Tech" |
name == "University of Hawaii Hawaii Community College"|
name == "Burlington College" |
name == "University of Central Oklahoma" |
name == "Southern Virginia University" |
name == "Pennsylvania Highlands Community College" |
name == "Colgate University"|
name == "University of Pittsburg")
For some reason the name filter for “Southeastern Community College (Iowa)” did not select the school, just as the attempted name change above did not, so we selected it by its total_enrollment value (2987) instead.
head(not_profit)
## # A tibble: 6 x 6
## name total_enrollment state category enrollment enrollment_perce~
## <chr> <dbl> <chr> <chr> <dbl> <dbl>
## 1 Univ. o~ 51147 Minnes~ American India~ 163 0.319
## 2 Univ. o~ 51147 Minnes~ Asian 3998 7.82
## 3 Univ. o~ 51147 Minnes~ Black 1785 3.49
## 4 Univ. o~ 51147 Minnes~ Hispanic 1544 3.02
## 5 Univ. o~ 51147 Minnes~ Native Hawaiia~ 31 0.0606
## 6 Univ. o~ 51147 Minnes~ White 33674 65.8
library(ggplot2)
plot1 <- for_profit2 %>%
  ggplot() +
  geom_bar(aes(x = name, y = enrollment_percentage, fill = category),
           position = "fill",
           stat = "identity") +
  coord_flip() +
  theme_minimal() +
  ggtitle("Diversity % of Enrollment in For Profit US Schools") +
  theme(plot.title = element_text(hjust = .01, size = 15)) +
  labs(fill = "Race") +
  theme(legend.justification = -20,
        legend.position = "bottom",
        legend.text = element_text(size = 6)) +
  xlab("School Name") +
  ylab("Percent of Enrollment")
plot1
plot2 <- not_profit %>%
  ggplot() +
  geom_bar(aes(x = name, y = enrollment_percentage, fill = category),
           position = "fill",
           stat = "identity") +
  coord_flip() +
  theme_minimal() +
  ggtitle("Diversity % of Enrollment in Not for Profit US Schools") +
  theme(plot.title = element_text(hjust = .01, size = 12)) +
  labs(fill = "Race") +
  theme(legend.position = "bottom",
        legend.justification = "left",
        legend.text = element_text(size = 6)) +
  xlab("School Name") +
  ylab("Percent of Enrollment")
plot2
As the graphs above illustrate, our data confirm that there is a higher proportion of students of color at for-profit colleges than at non-profit colleges. White students (shown in pink in the above plots) make up a much larger share of enrollment at non-profit colleges. These findings support the argument that students of color enroll at for-profit colleges at higher rates. One possible explanation is targeted marketing by for-profit colleges to students of color, but further research is required.
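The visual comparison can also be backed by a single number per group. The following is a minimal sketch, not the analysis we ran: it computes the share of total enrollment that is White in each group, using invented toy rows that borrow only the `category` and `enrollment` column names from our data frames. The helper `white_share()` and the `*_toy` objects are hypothetical, and the calculation assumes the categories in each table are mutually exclusive.

```r
library(dplyr)

# Invented toy rows; only the column names match the report's data frames.
for_profit_toy <- tibble(
  name       = c("School A", "School A", "School B", "School B"),
  category   = c("White", "Black", "White", "Hispanic"),
  enrollment = c(40, 60, 50, 50)
)
not_profit_toy <- tibble(
  name       = c("School C", "School C", "School D", "School D"),
  category   = c("White", "Black", "White", "Hispanic"),
  enrollment = c(70, 30, 80, 20)
)

# Hypothetical helper: percent of all enrolled students in the table
# whose category is "White" (pooled across schools, not averaged).
white_share <- function(df) {
  df %>%
    summarise(share = 100 * sum(enrollment[category == "White"]) / sum(enrollment)) %>%
    pull(share)
}

white_share(for_profit_toy)
white_share(not_profit_toy)
```

Pooling enrollment across schools weights large schools more heavily than small ones; averaging each school's percentage instead would answer a slightly different question.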
Given the well-documented high cost of for-profit education, the exorbitant student debt incurred at these schools, the lower rates of employment upon graduation and the low levels of satisfaction with educational quality reported by former students, all potential students should think twice before enrolling in for-profit colleges. This is particularly true for students of color, who may be identified by for-profit colleges as susceptible to predatory, exploitative practices that provide additional profits to the college’s shareholders without making good on the promise of a marketable, quality education. The predatory actions of for-profit colleges contribute to our nation’s growing economic divide between the haves and the have-nots and perpetuate the unequal education system we have today. Fortunately, there are many people and organizations working to end these predatory practices, but while the practices continue, we must inform one another about the importance of obtaining a quality education at not-for-profit institutions, like Montgomery College.
The following are more detailed analyses that could be done with this dataset:
Comparing the diversity of for-profit and not-for-profit schools in the same geographic regions, using local racial demographics as a baseline.
Comparing community colleges, four-year institutions and for-profit institutions.
Comparing the tuition rates, salary potential and overall profits of community colleges, four-year institutions and for-profit institutions.
Conducting a longitudinal study to see whether the demographic makeup of for-profit colleges changed across three periods: before the Obama-era regulations, during the lack of regulation in the DeVos era, and after the predatory lending lawsuits and any resulting legislation.
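As a starting point for the first idea in the list above, the two data frames could be stacked with a sector label and summarised by state. This is only a sketch on invented toy rows: the column names `name`, `state`, `category` and `enrollment` come from our data, while the `sector` label, the `*_toy` objects and `by_state` are assumptions for illustration. It does not yet bring in local demographic data.

```r
library(dplyr)

# Invented toy rows; only the column names come from the report.
for_profit_toy <- tibble(
  name       = c("School A", "School A"),
  state      = c("Maryland", "Maryland"),
  category   = c("White", "Black"),
  enrollment = c(30, 70)
)
not_profit_toy <- tibble(
  name       = c("School B", "School B", "School C", "School C"),
  state      = c("Maryland", "Maryland", "Iowa", "Iowa"),
  category   = c("White", "Black", "White", "Black"),
  enrollment = c(60, 40, 90, 10)
)

by_state <- bind_rows(
  for_profit = for_profit_toy,
  not_profit = not_profit_toy,
  .id = "sector"                      # new column holding the argument names
) %>%
  group_by(state, sector, category) %>%
  summarise(enrollment = sum(enrollment), .groups = "drop") %>%
  group_by(state, sector) %>%
  mutate(share = 100 * enrollment / sum(enrollment)) %>%  # percent within state and sector
  ungroup()

by_state
```

States that appear in only one sector, like Iowa in the toy data, would need to be dropped or flagged before any side-by-side comparison.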
Body, D. (2019, Mar. 19). Worse Off Than When They Enrolled: The Consequence of For-Profit Colleges for People of Color. The Aspen Institute. https://www.aspeninstitute.org/blog-posts/worse-off-than-when-they-enrolled-the-consequence-of-for-profit-colleges-for-people-of-color/
Bonadies, G.G., Rovenger, J., Connor, E., Shum, B. & Merrill, T. (2018, Jul. 30). For-Profit Schools’ Predatory Practices and Students of Color: A Mission to Enroll Rather than Educate. Harvard Law Review Blog. https://blog.harvardlawreview.org/for-profit-schools-predatory-practices-and-students-of-color-a-mission-to-enroll-rather-than-educate/
Conti, A. (2019, Sep. 10). How For-Profit Colleges Have Targeted and Taken Advantage of Black Students. Vice. https://www.vice.com/en_us/article/bjwj3d/how-for-profit-colleges-have-targeted-and-taken-advantage-of-black-students
Green, E.L. (2019, Jun. 28). DeVos Repeals Obama-Era Rule Cracking Down on For-Profit Colleges. New York Times. https://www.nytimes.com/2019/06/28/us/politics/betsy-devos-for-profit-colleges.html
Halperin, D. (2020). 22 States Sue DeVos to Overturn Anti-Student Rule. Republic Report. https://www.republicreport.org/2020/22-states-sue-devos-to-overturn-anti-student-rule/
Legal Services Center. (2020). Project on Predatory Student Lending: Cases. Harvard Law School. https://predatorystudentlending.org/cases/
Lobosco, K. (2019, Jul. 23). For-profit college students are waiting 958 days for loan relief. CNN. https://www.cnn.com/2019/07/23/politics/betsy-devos-loan-forgiveness-for-profit-college-students/index.html
Lopez, M. (2015, Feb. 12). BEWARE: For-Profit Colleges. The Patriot Post. https://bcchspatriotpost.com/2391/news/beware-for-profit-colleges/
Redman, H. (2020, Jun. 27). AG Sues Department of Education Over For-Profit College Rules. Urban Milwaukee. https://urbanmilwaukee.com/2020/06/27/ag-sues-department-of-education-over-for-profit-college-rules/
TBS Staff. (2019, Jul. 29). For-Profit Colleges vs. Non-Profit Colleges - What’s The Difference? The Best Schools Magazine. https://thebestschools.org/magazine/for-profit-vs-non-profit
Turner, C. (2019, Nov. 14). DeVos Refuses to Forgive Student Debt For Those Defrauded by For-Profit Colleges. All Things Considered, NPR. https://www.npr.org/2019/11/14/779465130/devos-refuses-to-forgive-student-debt-for-those-defrauded-by-for-profit-colleges
Voorhees, K. (2019, Oct. 17). Civil Rights Groups: For-Profit Colleges Exploit Black and Latino Students. The Leadership Conference Education Fund. https://civilrights.org/edfund/2019/10/17/civil-rights-groups-for-profit-colleges-exploit-black-and-latino-students/