library(git2r)
library(usethis)

Racial Demographics at For-Profit and Not-For-Profit Educational Institutions

“Until we get equality in education, we won’t have an equal society.”

~ Supreme Court Justice Sonia Sotomayor

Topic Background

Each of our team members originally selected a few datasets, which we compared as a group. We decided to explore a dataset Ana found on racial and ethnic diversity at two-year and four-year colleges and universities in the United States. We initially planned to compare diversity at two-year community colleges with diversity at four-year schools, and we were also curious how that comparison varies across the country.

As we explored the dataset, we identified a number of named colleges and universities that are known as “for-profit” institutions. Some of these institutions have been the subject of recent lawsuits regarding predatory behavior (Halperin, 2020; Legal Services Center, 2020; Redman, 2020). The high number of current lawsuits regarding for-profit colleges’ predatory behaviors is a result of the actions of current Education Secretary Betsy DeVos. In 2019, DeVos “repealed an Obama-era regulation that sought to crack down on for-profit colleges and universities that produced graduates with no meaningful job prospects and mountains of student debt they could not hope to repay” (Green, 2019, para. 1). DeVos also denied debt forgiveness to students who had been prey to predatory for-profit educational institutions (Lobosco, 2019; Turner, 2019). We were aware of past and recent news articles describing how these colleges aimed their marketing at students of color, low-income students, immigrant communities, and students who are the first in their family to go to college (Bonadies et al., 2018; Conti, 2019; Voorhees, 2019). Previous studies have found that loan debt is higher for students who have attended for-profit institutions, which disproportionately affects students of color.

**Race and For-Profits**

(Body, 2019)

**Loan Default Rates at For-Profits**

(Body, 2019)

Despite their high cost, for-profit institutions have lower graduation and employment rates than non-profit institutions.

**For-Profit Graduation & Employment Rates**

(Lopez, 2015)

We will examine our dataset and explore possible consistencies and/or inconsistencies with these previous reports.

Initial Questions

  1. How do racial and ethnic demographics vary between not-for-profit institutions and for-profit colleges in our dataset?

  2. In our dataset, do the for-profit institutions have a higher proportion of students of color than other community colleges and four-year colleges, which would support the argument that for-profit colleges target students of color for enrollment?

Definitions:

For-Profit Institutions: For-profit institutions are defined by the way that “revenue earned by the school is invested”. For-profit colleges have investors who want to make a profit, and their operations are managed in part to maximize the return for those investors. “Money earned by the school may be used to pay out investors and award bonuses to executives, as well as sustain the operation’s profitability through aggressive marketing and recruitment strategies” (TBS Staff, 2019, para. 8).

Not-for-Profit Institutions: Non-profit colleges can be either public or private. In either case, non-profit colleges must “reinvest the money earned through enrollment into the educational mission” (TBS Staff, 2019, para. 8).

Webscraping and Randomly Selecting Sample of 35 For-Profit Colleges

Our dataset did not include information on whether colleges were for-profit.

To determine which of the schools in our dataset were for-profit institutions, we found a website that lists for-profit institutions and scraped it to create a second dataset of for-profit schools.
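Once a list of for-profit names exists, flagging schools in the main dataset could be sketched roughly as follows. The data frame `diversity`, its `name` column, the helper `normalise()`, and the example school names are all hypothetical stand-ins, not the actual dataset; exact name matching is also an assumption and would likely miss spelling variants.

```r
# Hypothetical stand-in for the diversity dataset; the column name is assumed
diversity <- data.frame(
  name = c("Capella University", "Example State College", "Strayer University"),
  stringsAsFactors = FALSE
)

# Hypothetical excerpt of a scraped list of for-profit schools
for_profit_names <- c("Capella University", "Strayer University", "DeVry University")

# Normalise case and surrounding whitespace before comparing names
normalise <- function(x) tolower(trimws(x))

# TRUE where the school's name appears in the for-profit list
diversity$for_profit <- normalise(diversity$name) %in% normalise(for_profit_names)
diversity
```

A logical column keeps the original dataset intact while making it easy to group or filter by institution type later.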

Following Web-Scraping Tutorial to Scrape Wikipedia Website

Loading Libraries

#Loading the rvest package
library('rvest')
## Loading required package: xml2
library(tidyverse)
## -- Attaching packages ---------------------------------------------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2     v purrr   0.3.4
## v tibble  3.0.1     v dplyr   1.0.0
## v tidyr   1.1.0     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0
## -- Conflicts ------------------------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter()         masks stats::filter()
## x readr::guess_encoding() masks rvest::guess_encoding()
## x purrr::is_empty()       masks git2r::is_empty()
## x dplyr::lag()            masks stats::lag()
## x purrr::pluck()          masks rvest::pluck()
## x dplyr::pull()           masks git2r::pull()
## x purrr::when()           masks git2r::when()
library(dplyr)
#Specifying the url for desired website to be scraped
url <- 'https://en.wikipedia.org/wiki/List_of_for-profit_universities_and_colleges'

#Reading the HTML code from the website
webpage <- read_html(url)

#Using a CSS selector to select every link inside a list item
for_profit_html <- html_nodes(webpage,'li a')

#Converting the selected links to text
for_profit <- html_text(for_profit_html)

#Let's have a look at the first few entries
head(for_profit)
## [1] "1 In the United States"                                          
## [2] "2 Distance education (online)"                                   
## [3] "3 Outside the United States"                                     
## [4] "3.1 Closed or merged"                                            
## [5] "4 For-profit colleges that became non-profit colleges"           
## [6] "4.1 For-profit colleges that became non-profit colleges (closed)"
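The first entries are table-of-contents links rather than schools, because the selector `li a` matches every link inside any list item on the page. The sketch below illustrates the effect on a small made-up page and shows how a narrower selector can reduce the noise. The HTML and the id `school-list` are invented for this illustration; the real Wikipedia page would need to be inspected to find a suitable selector.

```r
library(rvest)

# Hypothetical miniature page: a table of contents plus a list of schools.
# The ids "toc" and "school-list" are invented for this illustration.
page <- read_html('
<html><body>
  <ul id="toc">
    <li><a href="#us">1 In the United States</a></li>
  </ul>
  <ul id="school-list">
    <li><a href="/wiki/Capella_University">Capella University</a></li>
    <li><a href="/wiki/Strayer_University">Strayer University</a>,
        <a href="/wiki/Washington">Washington</a></li>
  </ul>
</body></html>')

# The broad selector matches every link inside any list item
all_links <- html_text(html_nodes(page, "li a"))

# Scoping to one list and keeping only the first link of each item
# drops the table of contents and the trailing location link
first_links <- html_text(html_nodes(page, "#school-list li > a:first-child"))

all_links
first_links
```

On this toy page the broad selector returns four links and the narrower one returns only the two school names.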

Viewing the full scraped list

for_profit
##   [1] "1 In the United States"                                                                                               
##   [2] "2 Distance education (online)"                                                                                        
##   [3] "3 Outside the United States"                                                                                          
##   [4] "3.1 Closed or merged"                                                                                                 
##   [5] "4 For-profit colleges that became non-profit colleges"                                                                
##   [6] "4.1 For-profit colleges that became non-profit colleges (closed)"                                                     
##   [7] "5 See also"                                                                                                           
##   [8] "6 References"                                                                                                         
##   [9] "By state"                                                                                                             
##  [10] "in insular areas"                                                                                                     
##  [11] "By subject area"                                                                                                      
##  [12] "History of"                                                                                                           
##  [13] "Finance"                                                                                                              
##  [14] "Law"                                                                                                                  
##  [15] "Literacy"                                                                                                             
##  [16] "Reform"                                                                                                               
##  [17] "Pre-kindergarten"                                                                                                     
##  [18] "Primary"                                                                                                              
##  [19] "Secondary"                                                                                                            
##  [20] "Higher"                                                                                                               
##  [21] "Organizations"                                                                                                        
##  [22] "v"                                                                                                                    
##  [23] "t"                                                                                                                    
##  [24] "e"                                                                                                                    
##  [25] "Academy of Art University"                                                                                            
##  [26] "American Career College"                                                                                              
##  [27] "American College of Education"                                                                                        
##  [28] "American InterContinental University"                                                                                 
##  [29] "Career Education Corporation"                                                                                         
##  [30] "American Military University"                                                                                         
##  [31] "American National University"                                                                                         
##  [32] "American University"                                                                                                  
##  [33] "National American University"                                                                                         
##  [34] "American Public University"                                                                                           
##  [35] "American Sentinel University"                                                                                         
##  [36] "Ancora Education"                                                                                                     
##  [37] "Antonelli College"                                                                                                    
##  [38] "Art Institutes"                                                                                                       
##  [39] "ASA College"                                                                                                          
##  [40] "Aspen University"                                                                                                     
##  [41] "Bay State College"                                                                                                    
##  [42] "Berkeley College"                                                                                                     
##  [43] "University of California, Berkeley"                                                                                   
##  [44] "Berklee College of Music"                                                                                             
##  [45] "Berkeley College at Yale University"                                                                                  
##  [46] "Blair College"                                                                                                        
##  [47] "Colorado Springs, Colorado"                                                                                           
##  [48] "Blue Cliff College"                                                                                                   
##  [49] "Bradford School (Columbus)"                                                                                           
##  [50] "Columbus, Ohio"                                                                                                       
##  [51] "Bradford School (Pittsburgh)"                                                                                         
##  [52] "Pittsburgh, Pennsylvania"                                                                                             
##  [53] "Branford Hall Career Institute"                                                                                       
##  [54] "Broadview University"                                                                                                 
##  [55] "Brookline College"                                                                                                    
##  [56] "Bryant & Stratton College"                                                                                            
##  [57] "Burrell College of Osteopathic Medicine"                                                                              
##  [58] "California Miramar University"                                                                                        
##  [59] "California Northstate University College of Medicine"                                                                 
##  [60] "Capella University"                                                                                                   
##  [61] "Centura College"                                                                                                      
##  [62] "Chamberlain College of Nursing"                                                                                       
##  [63] "Adtalem"                                                                                                              
##  [64] "Charleston School of Law"                                                                                             
##  [65] "Charter College"                                                                                                      
##  [66] "The College of Westchester"                                                                                           
##  [67] "West Chester University"                                                                                              
##  [68] "West Chester, Pennsylvania"                                                                                           
##  [69] "Colorado Technical University"                                                                                        
##  [70] "Career Education Corporation"                                                                                         
##  [71] "Columbia Southern University"                                                                                         
##  [72] "Columbia University"                                                                                                  
##  [73] "Conservatory of Recording Arts and Sciences"                                                                          
##  [74] "Cortiva Institute"                                                                                                    
##  [75] "Daymar College"                                                                                                       
##  [76] "DeVry University"                                                                                                     
##  [77] "Keller School of Management"                                                                                          
##  [78] "DigiPen Institute of Technology"                                                                                      
##  [79] "Redmond, Washington"                                                                                                  
##  [80] "Eagle Gate College"                                                                                                   
##  [81] "ECPI University"                                                                                                      
##  [82] "Engine City Technical Institute"                                                                                      
##  [83] "South Plainfield, New Jersey"                                                                                         
##  [84] "Fashion Institute of Design & Merchandising"                                                                          
##  [85] "Fashion Institute of Technology"                                                                                      
##  [86] "New York City"                                                                                                        
##  [87] "Five Towns College"                                                                                                   
##  [88] "Dix Hills, New York"                                                                                                  
##  [89] "Florida Career College"                                                                                               
##  [90] "Florida Coastal School of Law"                                                                                        
##  [91] "InfiLaw System"                                                                                                       
##  [92] "Florida Metropolitan University"                                                                                      
##  [93] "Florida National University"                                                                                          
##  [94] "Hialeah, Florida"                                                                                                     
##  [95] "Fortis College"                                                                                                       
##  [96] "Fox College"                                                                                                          
##  [97] "Chicago metropolitan area"                                                                                            
##  [98] "Bedford Park"                                                                                                         
##  [99] "Tinley Park"                                                                                                          
## [100] "Full Sail University"                                                                                                 
## [101] "Winter Park, Florida"                                                                                                 
## [102] "Georgia Medical Institute"                                                                                            
## [103] "Medical College of Georgia"                                                                                           
## [104] "Augusta University"                                                                                                   
## [105] "Grand Canyon University"                                                                                              
## [106] "Grantham University"                                                                                                  
## [107] "Kansas City, Missouri"                                                                                                
## [108] "Hamilton College"                                                                                                     
## [109] "Hamilton College"                                                                                                     
## [110] "Hamilton University"                                                                                                  
## [111] "Harris School of Business"                                                                                            
## [112] "Idaho College of Osteopathic Medicine"                                                                                
## [113] "International Education Corporation"                                                                                  
## [114] "Las Vegas College"                                                                                                    
## [115] "Laureate International Universities"                                                                                  
## [116] "Walden University"                                                                                                    
## [117] "Lincoln Tech"                                                                                                         
## [118] "Lincoln University"                                                                                                   
## [119] "Los Angeles Film School"                                                                                              
## [120] "McCann School of Business and Technology"                                                                             
## [121] "Miami International University of Art and Design"                                                                     
## [122] "Mildred Elley"                                                                                                        
## [123] "Miller-Motte"                                                                                                         
## [124] "Minneapolis Business College"                                                                                         
## [125] "Roseville, Minnesota"                                                                                                 
## [126] "Monroe College"                                                                                                       
## [127] "Mountain West College"                                                                                                
## [128] "National American University"                                                                                         
## [129] "Mall of America"                                                                                                      
## [130] "American University"                                                                                                  
## [131] "National College"                                                                                                     
## [132] "National Institute of Technology (United States)"                                                                     
## [133] "National Institutes of Technology"                                                                                    
## [134] "National Paralegal College"                                                                                           
## [135] "National University College"                                                                                          
## [136] "Neumont University"                                                                                                   
## [137] "NewSchool of Architecture and Design"                                                                                 
## [138] "The New School"                                                                                                       
## [139] "Northwestern College"                                                                                                 
## [140] "Northwestern University"                                                                                              
## [141] "Ohio Business College"                                                                                                
## [142] "Olympia Career Training Institute"                                                                                    
## [143] "Pacific College of Oriental Medicine"                                                                                 
## [144] "Parks College"                                                                                                        
## [145] "Paier College of Art"                                                                                                 
## [146] "Pennco Tech"                                                                                                          
## [147] "Pima Medical Institute"                                                                                               
## [148] "Pinnacle Career Institute"                                                                                            
## [149] "Pioneer Pacific College"                                                                                              
## [150] "Pittsburgh Technical Institute"                                                                                       
## [151] "Oakdale, Pennsylvania"                                                                                                
## [152] "Cranberry Township, Butler County, Pennsylvania"                                                                      
## [153] "Platt College"                                                                                                        
## [154] "Plaza College"                                                                                                        
## [155] "Porter and Chester Institute"                                                                                         
## [156] "Post University"                                                                                                      
## [157] "LIU Post"                                                                                                             
## [158] "Potomac College"                                                                                                      
## [159] "Provo College"                                                                                                        
## [160] "Rasmussen College"                                                                                                    
## [161] "Recording Radio Film Connection"                                                                                      
## [162] "Redstone College"                                                                                                     
## [163] "Rocky Mountain College of Art and Design"                                                                             
## [164] "Lakewood, Colorado"                                                                                                   
## [165] "Rocky Mountain University of Health Professions"                                                                      
## [166] "Rocky Vista University College of Osteopathic Medicine"                                                               
## [167] "SAE Institute"                                                                                                        
## [168] "Salem International University"                                                                                       
## [169] "Salem, West Virginia"                                                                                                 
## [170] "Salter College"                                                                                                       
## [171] "San Joaquin Valley College"                                                                                           
## [172] "Schiller International University"                                                                                    
## [173] "School of Visual Arts"                                                                                                
## [174] "Seacoast Career Schools"                                                                                              
## [175] "South College"                                                                                                        
## [176] "South University"                                                                                                     
## [177] "Southern University"                                                                                                  
## [178] "University of the South"                                                                                              
## [179] "Southern Careers Institute"                                                                                           
## [180] "Southern States University"                                                                                           
## [181] "Southwestern College"                                                                                                 
## [182] "Southwestern University"                                                                                              
## [183] "Lincoln University"                                                                                                   
## [184] "Spartan College of Aeronautics and Technology"                                                                        
## [185] "Specs Howard School of Media Arts"                                                                                    
## [186] "Spencerian College"                                                                                                   
## [187] "Stevens-Henager College"                                                                                              
## [188] "Stratford University"                                                                                                 
## [189] "Strayer University"                                                                                                   
## [190] "Sullivan University"                                                                                                  
## [191] "[1]"                                                                                                                  
## [192] "[2]"                                                                                                                  
## [193] "UEI College"                                                                                                          
## [194] "United States University"                                                                                             
## [195] "Universal Technical Institute"                                                                                        
## [196] "University of Advancing Technology"                                                                                   
## [197] "University of Phoenix"                                                                                                
## [198] "University of the Potomac"                                                                                            
## [199] "U.S. Career Institute"                                                                                                
## [200] "Fort Collins, Colorado"                                                                                               
## [201] "Vista College"                                                                                                        
## [202] "Walden University"                                                                                                    
## [203] "Waldorf College"                                                                                                      
## [204] "Washington Technology University"                                                                                     
## [205] "West Coast University"                                                                                                
## [206] "Western Business College"                                                                                             
## [207] "Western International University"                                                                                     
## [208] "Apollo Group"                                                                                                         
## [209] "Western State College of Law"                                                                                         
## [210] "Western State University College of Law"                                                                              
## [211] "Fullerton, California"                                                                                                
## [212] "Western Governors University"                                                                                         
## [213] "Wood Tobé-Coburn School"                                                                                              
## [214] "New York City"                                                                                                        
## [215] "Wyoming Technical Institute (WyoTech)"                                                                                
## [216] "American College of Technology"                                                                                       
## [217] "American Public University System"                                                                                    
## [218] "Charles Town, West Virginia"                                                                                          
## [219] "Manassas, Virginia"                                                                                                   
## [220] "American Sentinel University"                                                                                         
## [221] "Ashworth College"                                                                                                     
## [222] "Aspen University"                                                                                                     
## [223] "California InterContinental University"                                                                               
## [224] "California Southern University"                                                                                       
## [225] "University of Southern California"                                                                                    
## [226] "Capella University"                                                                                                   
## [227] "Grantham University"                                                                                                  
## [228] "London School of Business and Finance"                                                                                
## [229] "New Charter University"                                                                                               
## [230] "Hoover, Alabama"                                                                                                      
## [231] "New England College of Business and Finance"                                                                          
## [232] "New England College"                                                                                                  
## [233] "Setanta College"                                                                                                      
## [234] "Trident University International"                                                                                     
## [235] "University of Atlanta"                                                                                                
## [236] "Atlanta University Center"                                                                                            
## [237] "Clark Atlanta University"                                                                                             
## [238] "University of Liverpool"                                                                                              
## [239] "Laureate Education"                                                                                                   
## [240] "University of the Potomac"                                                                                            
## [241] "Washington, D.C."                                                                                                     
## [242] "Vienna, Virginia"                                                                                                     
## [243] "AMA Computer University"                                                                                              
## [244] "Anhembi Morumbi University"                                                                                           
## [245] "Arden University"                                                                                                     
## [246] "Global University Systems"                                                                                            
## [247] "[3]"                                                                                                                  
## [248] "BPP University"                                                                                                       
## [249] "Apollo Education Group"                                                                                               
## [250] "[4]"                                                                                                                  
## [251] "Cyprus College"                                                                                                       
## [252] "Nicosia, Cyprus"                                                                                                      
## [253] "Dnyaneshwar Vidyapeeth"                                                                                               
## [254] "London School of Business and Finance"                                                                                
## [255] "Global University Systems"                                                                                            
## [256] "[3]"                                                                                                                  
## [257] "Multimedia University"                                                                                                
## [258] "Nyenrode Business University"                                                                                         
## [259] "Breukelen"                                                                                                            
## [260] "Rai University"                                                                                                       
## [261] "Regenesys Business School"                                                                                            
## [262] "Sandton"                                                                                                              
## [263] "Ross University"                                                                                                      
## [264] "Ross University School of Medicine"                                                                                   
## [265] "Picard, Dominica"                                                                                                     
## [266] "Saint Kitts"                                                                                                          
## [267] "St. George's University"                                                                                              
## [268] "Grenada"                                                                                                              
## [269] "St Patrick's College, London"                                                                                         
## [270] "Global University Systems"                                                                                            
## [271] "[3]"                                                                                                                  
## [272] "Taylor's University"                                                                                                  
## [273] "Trinity School of Medicine"                                                                                           
## [274] "Universidad Europea de Madrid"                                                                                        
## [275] "University of Medicine and Health Sciences"                                                                           
## [276] "University of the Latin American Educational Center"                                                                  
## [277] "Rosario, Argentina"                                                                                                   
## [278] "University of Law"                                                                                                    
## [279] "Global University Systems"                                                                                            
## [280] "[3]"                                                                                                                  
## [281] "Allied American University"                                                                                           
## [282] "Anthem Institute"                                                                                                     
## [283] "Arizona Summit Law School"                                                                                            
## [284] "InfiLaw System"                                                                                                       
## [285] "Ashmead College"                                                                                                      
## [286] "ATI Enterprises"                                                                                                      
## [287] "Banner College"                                                                                                       
## [288] "Arlington, Virginia"                                                                                                  
## [289] "Banner Institute"                                                                                                     
## [290] "Briarcliffe College"                                                                                                  
## [291] "Career Education Corporation"                                                                                         
## [292] "Brightwood College"                                                                                                   
## [293] "[8]"                                                                                                                  
## [294] "Brooks College"                                                                                                       
## [295] "Brooks Institute of Photography"                                                                                      
## [296] "Brown Mackie College"                                                                                                 
## [297] "Education Management Corporation"                                                                                     
## [298] "Bryman College"                                                                                                       
## [299] "Collins College"                                                                                                      
## [300] "Charlotte School of Law"                                                                                              
## [301] "InfiLaw System"                                                                                                       
## [302] "Corinthian Colleges"                                                                                                  
## [303] "Le Cordon Bleu"                                                                                                       
## [304] "Career Education Corporation"                                                                                         
## [305] "Crown College"                                                                                                        
## [306] "Daniel Webster College"                                                                                               
## [307] "Nashua, New Hampshire"                                                                                                
## [308] "ITT Educational Services"                                                                                             
## [309] "Decker College"                                                                                                       
## [310] "Eagle Gate College"                                                                                                   
## [311] "Everest College"                                                                                                      
## [312] "Corinthian Colleges"                                                                                                  
## [313] "Everest Institute"                                                                                                    
## [314] "Corinthian Colleges"                                                                                                  
## [315] "FastTrain College"                                                                                                    
## [316] "[9]"                                                                                                                  
## [317] "Gibbs College"                                                                                                        
## [318] "Harrington College of Design"                                                                                         
## [319] "Career Education Corporation"                                                                                         
## [320] "Harrison College"                                                                                                     
## [321] "Heald College"                                                                                                        
## [322] "Corinthian Colleges"                                                                                                  
## [323] "High-Tech Institute"                                                                                                  
## [324] "International Academy of Design and Technology"                                                                       
## [325] "ITT Technical Institute"                                                                                              
## [326] "Kee Business College"                                                                                                 
## [327] "Corinthian Colleges, Inc."                                                                                            
## [328] "King's College"                                                                                                       
## [329] "Miami-Jacobs Career College"                                                                                          
## [330] "[10]"                                                                                                                 
## [331] "Missouri College"                                                                                                     
## [332] "Career Education Corporation"                                                                                         
## [333] "Mount Washington College"                                                                                             
## [334] "McNally Smith College of Music"                                                                                       
## [335] "Sanford-Brown College"                                                                                                
## [336] "Career Education Corporation"                                                                                         
## [337] "Stanford University"                                                                                                  
## [338] "Samford University"                                                                                                   
## [339] "Springfield College"                                                                                                  
## [340] "Springfield, Missouri"                                                                                                
## [341] "Springfield College"                                                                                                  
## [342] "Springfield, Massachusetts"                                                                                           
## [343] "Trump University"                                                                                                     
## [344] "University of the Rockies"                                                                                            
## [345] "Colorado Springs, Colorado"                                                                                           
## [346] "Zovio"                                                                                                                
## [347] "Ashford University"                                                                                                   
## [348] "Vatterott College"                                                                                                    
## [349] "Victory University"                                                                                                   
## [350] "Virginia College"                                                                                                     
## [351] "University of Virginia"                                                                                               
## [352] "Westwood College"                                                                                                     
## [353] "Art Institutes"                                                                                                       
## [354] "[11]"                                                                                                                 
## [355] "Concord Law School"                                                                                                   
## [356] "Keiser University"                                                                                                    
## [357] "[12]"                                                                                                                 
## [358] "Kendall College"                                                                                                      
## [359] "National Louis University"                                                                                            
## [360] "[13]"                                                                                                                 
## [361] "[14]"                                                                                                                 
## [362] "Purdue University Global"                                                                                             
## [363] "Kaplan University"                                                                                                    
## [364] "[15]"                                                                                                                 
## [365] "South University"                                                                                                     
## [366] "[16]"                                                                                                                 
## [367] "Southern New Hampshire University"                                                                                    
## [368] "[17]"                                                                                                                 
## [369] "Argosy University"                                                                                                    
## [370] "Student loan debt"                                                                                                    
## [371] "List of universities and colleges by country"                                                                         
## [372] "For-profit higher education in the United States"                                                                     
## [373] "^"                                                                                                                    
## [374] "Troubled Manhattan Commercial College to Close"                                                                       
## [375] "The New York Times"                                                                                                   
## [376] "^"                                                                                                                    
## [377] "Interboro Institute"                                                                                                  
## [378] "a"                                                                                                                    
## [379] "b"                                                                                                                    
## [380] "c"                                                                                                                    
## [381] "d"                                                                                                                    
## [382] "\"Arden University sold to Global University Systems\""                                                               
## [383] "Times Higher Education"                                                                                               
## [384] "^"                                                                                                                    
## [385] "\"BPP Law School changes hands in $1.1bn private equity deal\""                                                       
## [386] "the original"                                                                                                         
## [387] "^"                                                                                                                    
## [388] "\"Student protesters march on to root out Chile's false profits\""                                                    
## [389] "^"                                                                                                                    
## [390] "\"FT interview: Sebastián Piñera\""                                                                                   
## [391] "^"                                                                                                                    
## [392] "\"Committee accuses Chilean universities of financial irregularities\""                                               
## [393] "^"                                                                                                                    
## [394] "https://kfoxtv.com/news/local/brightwood-college-campuses-nationwide-including-el-paso-location-to-close"             
## [395] "^"                                                                                                                    
## [396] "\"Fla. Rep. Hastings Tied To FastTrain, For-Profit College Raided by FBI – Republic Report\""                         
## [397] "^"                                                                                                                    
## [398] "\"Miami-Jacobs to close four campuses\""                                                                              
## [399] "^"                                                                                                                    
## [400] "\"Art Institute campuses to be sold to foundation\""                                                                  
## [401] "^"                                                                                                                    
## [402] "\"Keiser U. Goes Nonprofit\""                                                                                         
## [403] "^"                                                                                                                    
## [404] "\"History\""                                                                                                          
## [405] "^"                                                                                                                    
## [406] "\"Kendall College's culinary, hospitality programs to land on Michigan Ave. after sale to National Louis University\""
## [407] "^"                                                                                                                    
## [408] "\"Kaplan Closes Transaction with Purdue for the Assets of Kaplan University\""                                        
## [409] "^"                                                                                                                    
## [410] "\"Large for-profit chain EDMC to be bought by the Dream Center, a missionary group\""                                 
## [411] "^"                                                                                                                    
## [412] "\"History\""                                                                                                          
## [413] "For-profit universities and colleges"                                                                                 
## [414] "Articles with limited geographic scope from November 2011"                                                            
## [415] "United States-centric"                                                                                                
## [416] "Dynamic lists"                                                                                                        
## [417] "Talk"                                                                                                                 
## [418] "Contributions"                                                                                                        
## [419] "Create account"                                                                                                       
## [420] "Log in"                                                                                                               
## [421] "Article"                                                                                                              
## [422] "Talk"                                                                                                                 
## [423] "Read"                                                                                                                 
## [424] "Edit"                                                                                                                 
## [425] "View history"                                                                                                         
## [426] "Main page"                                                                                                            
## [427] "Contents"                                                                                                             
## [428] "Current events"                                                                                                       
## [429] "Random article"                                                                                                       
## [430] "About Wikipedia"                                                                                                      
## [431] "Contact us"                                                                                                           
## [432] "Donate"                                                                                                               
## [433] "Wikipedia store"                                                                                                      
## [434] "Help"                                                                                                                 
## [435] "Community portal"                                                                                                     
## [436] "Recent changes"                                                                                                       
## [437] "Upload file"                                                                                                          
## [438] "What links here"                                                                                                      
## [439] "Related changes"                                                                                                      
## [440] "Upload file"                                                                                                          
## [441] "Special pages"                                                                                                        
## [442] "Permanent link"                                                                                                       
## [443] "Page information"                                                                                                     
## [444] "Cite this page"                                                                                                       
## [445] "Wikidata item"                                                                                                        
## [446] "Download as PDF"                                                                                                      
## [447] "Printable version"                                                                                                    
## [448] "Creative Commons Attribution-ShareAlike License"                                                                      
## [449] ""                                                                                                                     
## [450] "Terms of Use"                                                                                                         
## [451] "Privacy Policy"                                                                                                       
## [452] "Wikimedia Foundation, Inc."                                                                                           
## [453] "Privacy policy"                                                                                                       
## [454] "About Wikipedia"                                                                                                      
## [455] "Disclaimers"                                                                                                          
## [456] "Contact Wikipedia"                                                                                                    
## [457] "Developers"                                                                                                           
## [458] "Statistics"                                                                                                           
## [459] "Cookie statement"                                                                                                     
## [460] "Mobile view"                                                                                                          
## [461] ""                                                                                                                     
## [462] ""
for_profit_new <- for_profit[25:215]
head(for_profit_new)
## [1] "Academy of Art University"           
## [2] "American Career College"             
## [3] "American College of Education"       
## [4] "American InterContinental University"
## [5] "Career Education Corporation"        
## [6] "American Military University"

Cleaning our for-profit list

for_profit_new[167:168] <- NA
for_profit_new
##   [1] "Academy of Art University"                             
##   [2] "American Career College"                               
##   [3] "American College of Education"                         
##   [4] "American InterContinental University"                  
##   [5] "Career Education Corporation"                          
##   [6] "American Military University"                          
##   [7] "American National University"                          
##   [8] "American University"                                   
##   [9] "National American University"                          
##  [10] "American Public University"                            
##  [11] "American Sentinel University"                          
##  [12] "Ancora Education"                                      
##  [13] "Antonelli College"                                     
##  [14] "Art Institutes"                                        
##  [15] "ASA College"                                           
##  [16] "Aspen University"                                      
##  [17] "Bay State College"                                     
##  [18] "Berkeley College"                                      
##  [19] "University of California, Berkeley"                    
##  [20] "Berklee College of Music"                              
##  [21] "Berkeley College at Yale University"                   
##  [22] "Blair College"                                         
##  [23] "Colorado Springs, Colorado"                            
##  [24] "Blue Cliff College"                                    
##  [25] "Bradford School (Columbus)"                            
##  [26] "Columbus, Ohio"                                        
##  [27] "Bradford School (Pittsburgh)"                          
##  [28] "Pittsburgh, Pennsylvania"                              
##  [29] "Branford Hall Career Institute"                        
##  [30] "Broadview University"                                  
##  [31] "Brookline College"                                     
##  [32] "Bryant & Stratton College"                             
##  [33] "Burrell College of Osteopathic Medicine"               
##  [34] "California Miramar University"                         
##  [35] "California Northstate University College of Medicine"  
##  [36] "Capella University"                                    
##  [37] "Centura College"                                       
##  [38] "Chamberlain College of Nursing"                        
##  [39] "Adtalem"                                               
##  [40] "Charleston School of Law"                              
##  [41] "Charter College"                                       
##  [42] "The College of Westchester"                            
##  [43] "West Chester University"                               
##  [44] "West Chester, Pennsylvania"                            
##  [45] "Colorado Technical University"                         
##  [46] "Career Education Corporation"                          
##  [47] "Columbia Southern University"                          
##  [48] "Columbia University"                                   
##  [49] "Conservatory of Recording Arts and Sciences"           
##  [50] "Cortiva Institute"                                     
##  [51] "Daymar College"                                        
##  [52] "DeVry University"                                      
##  [53] "Keller School of Management"                           
##  [54] "DigiPen Institute of Technology"                       
##  [55] "Redmond, Washington"                                   
##  [56] "Eagle Gate College"                                    
##  [57] "ECPI University"                                       
##  [58] "Engine City Technical Institute"                       
##  [59] "South Plainfield, New Jersey"                          
##  [60] "Fashion Institute of Design & Merchandising"           
##  [61] "Fashion Institute of Technology"                       
##  [62] "New York City"                                         
##  [63] "Five Towns College"                                    
##  [64] "Dix Hills, New York"                                   
##  [65] "Florida Career College"                                
##  [66] "Florida Coastal School of Law"                         
##  [67] "InfiLaw System"                                        
##  [68] "Florida Metropolitan University"                       
##  [69] "Florida National University"                           
##  [70] "Hialeah, Florida"                                      
##  [71] "Fortis College"                                        
##  [72] "Fox College"                                           
##  [73] "Chicago metropolitan area"                             
##  [74] "Bedford Park"                                          
##  [75] "Tinley Park"                                           
##  [76] "Full Sail University"                                  
##  [77] "Winter Park, Florida"                                  
##  [78] "Georgia Medical Institute"                             
##  [79] "Medical College of Georgia"                            
##  [80] "Augusta University"                                    
##  [81] "Grand Canyon University"                               
##  [82] "Grantham University"                                   
##  [83] "Kansas City, Missouri"                                 
##  [84] "Hamilton College"                                      
##  [85] "Hamilton College"                                      
##  [86] "Hamilton University"                                   
##  [87] "Harris School of Business"                             
##  [88] "Idaho College of Osteopathic Medicine"                 
##  [89] "International Education Corporation"                   
##  [90] "Las Vegas College"                                     
##  [91] "Laureate International Universities"                   
##  [92] "Walden University"                                     
##  [93] "Lincoln Tech"                                          
##  [94] "Lincoln University"                                    
##  [95] "Los Angeles Film School"                               
##  [96] "McCann School of Business and Technology"              
##  [97] "Miami International University of Art and Design"      
##  [98] "Mildred Elley"                                         
##  [99] "Miller-Motte"                                          
## [100] "Minneapolis Business College"                          
## [101] "Roseville, Minnesota"                                  
## [102] "Monroe College"                                        
## [103] "Mountain West College"                                 
## [104] "National American University"                          
## [105] "Mall of America"                                       
## [106] "American University"                                   
## [107] "National College"                                      
## [108] "National Institute of Technology (United States)"      
## [109] "National Institutes of Technology"                     
## [110] "National Paralegal College"                            
## [111] "National University College"                           
## [112] "Neumont University"                                    
## [113] "NewSchool of Architecture and Design"                  
## [114] "The New School"                                        
## [115] "Northwestern College"                                  
## [116] "Northwestern University"                               
## [117] "Ohio Business College"                                 
## [118] "Olympia Career Training Institute"                     
## [119] "Pacific College of Oriental Medicine"                  
## [120] "Parks College"                                         
## [121] "Paier College of Art"                                  
## [122] "Pennco Tech"                                           
## [123] "Pima Medical Institute"                                
## [124] "Pinnacle Career Institute"                             
## [125] "Pioneer Pacific College"                               
## [126] "Pittsburgh Technical Institute"                        
## [127] "Oakdale, Pennsylvania"                                 
## [128] "Cranberry Township, Butler County, Pennsylvania"       
## [129] "Platt College"                                         
## [130] "Plaza College"                                         
## [131] "Porter and Chester Institute"                          
## [132] "Post University"                                       
## [133] "LIU Post"                                              
## [134] "Potomac College"                                       
## [135] "Provo College"                                         
## [136] "Rasmussen College"                                     
## [137] "Recording Radio Film Connection"                       
## [138] "Redstone College"                                      
## [139] "Rocky Mountain College of Art and Design"              
## [140] "Lakewood, Colorado"                                    
## [141] "Rocky Mountain University of Health Professions"       
## [142] "Rocky Vista University College of Osteopathic Medicine"
## [143] "SAE Institute"                                         
## [144] "Salem International University"                        
## [145] "Salem, West Virginia"                                  
## [146] "Salter College"                                        
## [147] "San Joaquin Valley College"                            
## [148] "Schiller International University"                     
## [149] "School of Visual Arts"                                 
## [150] "Seacoast Career Schools"                               
## [151] "South College"                                         
## [152] "South University"                                      
## [153] "Southern University"                                   
## [154] "University of the South"                               
## [155] "Southern Careers Institute"                            
## [156] "Southern States University"                            
## [157] "Southwestern College"                                  
## [158] "Southwestern University"                               
## [159] "Lincoln University"                                    
## [160] "Spartan College of Aeronautics and Technology"         
## [161] "Specs Howard School of Media Arts"                     
## [162] "Spencerian College"                                    
## [163] "Stevens-Henager College"                               
## [164] "Stratford University"                                  
## [165] "Strayer University"                                    
## [166] "Sullivan University"                                   
## [167] NA                                                      
## [168] NA                                                      
## [169] "UEI College"                                           
## [170] "United States University"                              
## [171] "Universal Technical Institute"                         
## [172] "University of Advancing Technology"                    
## [173] "University of Phoenix"                                 
## [174] "University of the Potomac"                             
## [175] "U.S. Career Institute"                                 
## [176] "Fort Collins, Colorado"                                
## [177] "Vista College"                                         
## [178] "Walden University"                                     
## [179] "Waldorf College"                                       
## [180] "Washington Technology University"                      
## [181] "West Coast University"                                 
## [182] "Western Business College"                              
## [183] "Western International University"                      
## [184] "Apollo Group"                                          
## [185] "Western State College of Law"                          
## [186] "Western State University College of Law"               
## [187] "Fullerton, California"                                 
## [188] "Western Governors University"                          
## [189] "Wood Tobé-Coburn School"                               
## [190] "New York City"                                         
## [191] "Wyoming Technical Institute (WyoTech)"
final_for_profit <- na.omit(for_profit_new)
#final_for_profit
df <- tibble(final_for_profit)
df
## # A tibble: 189 x 1
##    final_for_profit                    
##    <chr>                               
##  1 Academy of Art University           
##  2 American Career College             
##  3 American College of Education       
##  4 American InterContinental University
##  5 Career Education Corporation        
##  6 American Military University        
##  7 American National University        
##  8 American University                 
##  9 National American University        
## 10 American Public University          
## # ... with 179 more rows

The web-scraping exercise produced a list of nearly 200 entries (189 after the NA values were removed), some of which are place names or parent companies rather than schools. We decided to take a random sample of n = 35 schools from that list that also appeared in our dataset. We chose 35 so that the sample would exceed the common n ≥ 30 rule of thumb for relying on the Central Limit Theorem. The result of the randomization we ran on the scraped data is below. Each of us compared a portion of the list with our dataset to ensure that all of our selections were present on both lists. We ran multiple randomizations until we had a consistent list of 35 schools appearing in both datasets. Some of the randomly chosen schools had multiple locations, as shown in the example set of schools below. The final list of schools was geographically diverse.

sample_n(df,35)
## # A tibble: 35 x 1
##    final_for_profit                           
##    <chr>                                      
##  1 Augusta University                         
##  2 Lincoln University                         
##  3 Redmond, Washington                        
##  4 Paier College of Art                       
##  5 South University                           
##  6 Grand Canyon University                    
##  7 Antonelli College                          
##  8 American Sentinel University               
##  9 Miller-Motte                               
## 10 Conservatory of Recording Arts and Sciences
## # ... with 25 more rows
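
The matching and sampling steps described above can be sketched as follows. This is a minimal illustration, not the procedure we actually ran: the two name vectors are hypothetical stand-ins, the seed value is arbitrary, and an exact-match intersect would miss schools whose dataset names carry a location suffix.

```r
library(dplyr)

# Hypothetical stand-ins for the scraped list and the dataset's school names
scraped_names <- c("Aspen University", "Capella University",
                   "Redmond, Washington", "Walden University")
dataset_names <- c("Aspen University", "Capella University",
                   "Walden University", "Smith College")

# Keep only scraped entries that also appear in the dataset
eligible <- tibble(school = intersect(scraped_names, dataset_names))

# Fixing the seed makes the random draw repeatable across runs
set.seed(101)
for_profit_sample <- sample_n(eligible, 2)
for_profit_sample
```

Filtering to the shared names before sampling avoids having to re-run the randomization until every drawn entry exists in both lists.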

Jennifer

  • Spencerian College
    • Spencerian College at Louisville (Ky.)
    • Spencerian College at Lexington (Ky.)
  • Aspen University
  • American Public University
    • American Public University system
  • Western International University
  • NewSchool of Architecture and Design
  • Schiller International University
  • National Paralegal College
  • West Coast University
    • West Coast University at Ontario
    • West Coast University trasound Institute
    • West Coast University at Dallas
    • West Coast University at Miami
    • West Coast University -Los Angeles
    • West Coast University -Orange County
  • Mildred Elley
    • Mildred Elley
    • Mildred Elley at New York City
  • Brookline College
    • Brookline College at Tucson
    • Brookline College at Tempe (Ariz.)
    • Brookline College at Albuquerque
    • Brookline College at Phoenix
  • Grand Canyon University

Ana

  • Walden University
  • Blue Cliff College
    • Blue Cliff College at Metairie
    • Blue Cliff College at Alexandria
    • Blue Cliff College at Shreveport
    • Blue Cliff College at Gulfport
  • Neumont University
  • University of Phoenix
  • Stevens-Henager College
  • DeVry University
  • Pioneer Pacific College
  • Capella University
  • Grantham University
  • Stratford University
  • Redstone College
  • National College

Tiffany

  • Strayer University
  • Lincoln Tech
  • Full Sail University
  • Rocky Mountain College of Art and Design
  • Minneapolis Business College
  • Fashion Institute of Design & Merchandising
  • Paier College of Art
  • Vista College
  • Centura College
  • Rasmussen College
  • Redstone College
  • Fortis College
  • Bay State College

Data Wrangling & Analysis

The dataset we chose to work with provides diversity counts at institutions of higher learning in the United States. The dataset was retrieved from Kaggle, available here: https://www.kaggle.com/jessemostipak/college-tuition-diversity-and-pay?select=diversity_school.csv

The entire dataset is composed of five separate .csv files addressing school diversity, historical tuition rates, salary potential, the cost of tuition in 2016 and the income of students compared to tuition. We utilized the school diversity file only.

Setting working directory

setwd("C:/Users/Jim/Desktop/jen/Data 101/class 7.28/Data101_Project_1") 
diversity <- read_csv("diversity2.csv")
## Parsed with column specification:
## cols(
##   name = col_character(),
##   total_enrollment = col_double(),
##   state = col_character(),
##   category = col_character(),
##   enrollment = col_double()
## )

Viewing data

head(diversity)
## # A tibble: 6 x 5
##   name                total_enrollment state  category                enrollment
##   <chr>                          <dbl> <chr>  <chr>                        <dbl>
## 1 University of Phoe~           195059 Arizo~ Women                       134722
## 2 University of Phoe~           195059 Arizo~ American Indian / Alas~        876
## 3 University of Phoe~           195059 Arizo~ Asian                         1959
## 4 University of Phoe~           195059 Arizo~ Black                        31455
## 5 University of Phoe~           195059 Arizo~ Hispanic                     13984
## 6 University of Phoe~           195059 Arizo~ Native Hawaiian / Paci~       1019

Cleaning the Data

There were numerous “tidying” exercises that we undertook to clean our data and make the webscraping dataset and the diversity dataset comparable. Our tidying exercises included, but were not limited to, creating the following:

  1. a new column calculating the percent of students in attendance based on race and ethnicity
  2. a new binary column identifying the selected for-profit and not-for-profit schools represented by “1” and “0” respectively
  3. two new datasets: one for our randomly selected for-profit schools and one for our randomly selected not-for-profit schools
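
As a rough sketch of how these three steps fit together, the toy example below builds the percentage column, the binary flag and the two subsets. The data, the vector for_profit_names and the column name for_profit are illustrative assumptions, not objects from our analysis.

```r
library(dplyr)

# Toy data; the schools and counts are made up for illustration
toy <- tibble(
  name = c("Aspen Univ.", "Smith College", "DeVry Univ."),
  enrollment = c(10, 20, 30),
  total_enrollment = c(100, 100, 100)
)

# Hypothetical vector of sampled for-profit school names
for_profit_names <- c("Aspen Univ.", "DeVry Univ.")

toy_flagged <- toy %>%
  mutate(
    # Step 1: percent of total enrollment
    enrollment_percentage = enrollment / total_enrollment * 100,
    # Step 2: 1 = for-profit, 0 = not-for-profit
    for_profit = if_else(name %in% for_profit_names, 1, 0)
  )

# Step 3: one dataset per group
for_profit_schools <- filter(toy_flagged, for_profit == 1)
not_for_profit_schools <- filter(toy_flagged, for_profit == 0)
```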

Creating new column for percent enrollment

diversity_new <- diversity %>%  mutate(enrollment_percentage = enrollment/ total_enrollment*100)
head(diversity_new)
## # A tibble: 6 x 6
##   name        total_enrollment state  category      enrollment enrollment_perce~
##   <chr>                  <dbl> <chr>  <chr>              <dbl>             <dbl>
## 1 University~           195059 Arizo~ Women             134722            69.1  
## 2 University~           195059 Arizo~ American Ind~        876             0.449
## 3 University~           195059 Arizo~ Asian               1959             1.00 
## 4 University~           195059 Arizo~ Black              31455            16.1  
## 5 University~           195059 Arizo~ Hispanic           13984             7.17 
## 6 University~           195059 Arizo~ Native Hawai~       1019             0.522

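Removing Categories: Women, Two or More Races, Non-Resident Foreign, Unknown and Total Minority

We removed these categories to avoid counting the same students more than once, since they overlap with the remaining race and ethnicity categories or do not identify a specific group.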

diversity_new5 <- diversity_new %>% filter(category != "Women" & category != "Two Or More Races" & category != "Non-Resident Foreign" & category != "Unknown" & category != "Total Minority") 

This function was used to assess how many separate schools were included in the dataset. There were over 4000 schools listed.

#unique(diversity_new5$name)

We also used these functions to look up individual schools by name, and the lookups were successful.

We checked the structure of our new dataset.

str(diversity_new5)
## tibble [27,630 x 6] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ name                 : chr [1:27630] "University of Phoenix-Arizona" "University of Phoenix-Arizona" "University of Phoenix-Arizona" "University of Phoenix-Arizona" ...
##  $ total_enrollment     : num [1:27630] 195059 195059 195059 195059 195059 ...
##  $ state                : chr [1:27630] "Arizona" "Arizona" "Arizona" "Arizona" ...
##  $ category             : chr [1:27630] "American Indian / Alaska Native" "Asian" "Black" "Hispanic" ...
##  $ enrollment           : num [1:27630] 876 1959 31455 13984 1019 ...
##  $ enrollment_percentage: num [1:27630] 0.449 1.004 16.126 7.169 0.522 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   name = col_character(),
##   ..   total_enrollment = col_double(),
##   ..   state = col_character(),
##   ..   category = col_character(),
##   ..   enrollment = col_double()
##   .. )

Changing Names of Schools with Multiple Locations into One Name (separate rows remain)

This initial approach renamed schools one at a time. For schools with multiple locations, some with 20 or more, we needed a faster and more efficient option.

diversity_new5$name[diversity_new5$name == "Spencerian College at Louisville (Ky.)"] <- "Spencerian College"

diversity_new5$name[diversity_new5$name == "Spencerian College at Lexington (Ky.)"] <- "Spencerian College"
diversity_new5$name[diversity_new5$name == "NewSchool of Architecture and Design"] <- "NewSchool Arch. Design"
diversity_new5$name[diversity_new5$name == "Western International University"] <- "Western Intl Univ."
diversity_new5$name[diversity_new5$name == "Schiller International University"] <- "Schiller Intl Univ."
diversity_new5$name[diversity_new5$name == "National Paralegal College"] <- "Natl Paralegal College"
diversity_new5$name[diversity_new5$name == "WestCoast University"] <- "West Coast Univ."
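#Failed Attempt: select() chooses columns by name, not rows by value,
#so contains('Mildred Elley') finds no matching column.
# ME <- diversity_new %>% 
# select(contains('Mildred Elley'))
# ME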

Name change for the entire group containing a portion of the name

We discovered the “grepl” function which allowed us to change entire groups of rows that contained a similar word or set of words. All of the different locations could be changed in one set of code instead of multiple sets.

diversity_new5$name[grepl('Mildred Elley', diversity_new5$name)] <- 'Mildred Elley'
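
To illustrate why this works, here is a small sketch on a made-up character vector (school_names is a hypothetical stand-in for the name column). grepl returns one TRUE or FALSE per element, and that logical vector selects the elements to overwrite.

```r
# Hypothetical stand-in for diversity_new5$name
school_names <- c("Mildred Elley",
                  "Mildred Elley at New York City",
                  "Smith College")

# TRUE wherever the pattern occurs anywhere in the string
is_match <- grepl("Mildred Elley", school_names)
is_match
# [1]  TRUE  TRUE FALSE

# Overwrite every matching element with the single group name
school_names[is_match] <- "Mildred Elley"
school_names
# [1] "Mildred Elley" "Mildred Elley" "Smith College"
```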

Showing the Multiple-Location Indices to Verify the Name Change Afterward

This code identified all of the entries that included “West Coast University” in the name.

which(grepl("West Coast University", diversity_new5$name))
##  [1] 12781 12782 12783 12784 12785 12786 13459 13460 13461 13462 13463 13464
## [13] 15265 15266 15267 15268 15269 15270 16747 16748 16749 16750 16751 16752
## [25] 20113 20114 20115 20116 20117 20118 27145 27146 27147 27148 27149 27150

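These indices were used to show the current names in the dataset. Note that the hard-coded indices below do not match the ones returned by which above, so the output shows other schools, and index 31675 returns NA because it exceeds the 27,630 rows in the dataset.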

diversity_new5$name[14911]
## [1] "Salus University"
diversity_new5$name[15707]
## [1] "Concorde Career College at Garden Grove (Calif.)"
diversity_new5$name[17813]
## [1] "Argosy University Inland Empire"
diversity_new5$name[19541]
## [1] "Fortis College at Cincinnati"
diversity_new5$name[23466]
## [1] "Beacon College"
diversity_new5$name[31675]
## [1] NA

Single Line of Code to Change All Location Names to One Name

diversity_new5$name[grepl('West Coast University', diversity_new5$name)] <- 'West Coast Univ.'

Verifying that the group name change worked for each group

We used the same indices to ensure that the names of all groups changed to the single name.

diversity_new5$name[14911]
## [1] "Salus University"
diversity_new5$name[15707]
## [1] "Concorde Career College at Garden Grove (Calif.)"
diversity_new5$name[17813]
## [1] "Argosy University Inland Empire"
diversity_new5$name[19541]
## [1] "Fortis College at Cincinnati"
diversity_new5$name[23466]
## [1] "Beacon College"
diversity_new5$name[31675]
## [1] NA
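
A check that does not depend on hard-coded row numbers is to store the matching indices before the rename and inspect those same positions afterward. The sketch below uses a made-up vector (names_vec is hypothetical) rather than our dataset.

```r
# Hypothetical stand-in for diversity_new5$name
names_vec <- c("West Coast University at Dallas",
               "Salus University",
               "West Coast University at Miami")

# Capture the matching positions BEFORE renaming
idx <- which(grepl("West Coast University", names_vec, fixed = TRUE))

before <- unique(names_vec[idx])   # the location-specific names
names_vec[idx] <- "West Coast Univ."
after <- unique(names_vec[idx])    # should now be the single group name

before
after
```

Because idx is computed from the data itself, the check stays valid even if rows were filtered out earlier and the row numbers shifted.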

One More Name Change Test

which(grepl("Brookline College", diversity_new5$name))
##  [1] 13495 13496 13497 13498 13499 13500 19135 19136 19137 19138 19139 19140
## [13] 20281 20282 20283 20284 20285 20286 21247 21248 21249 21250 21251 21252
diversity_new5$name[15746]
## [1] "Concorde Career College at San Diego"
diversity_new5$name[24791]
## [1] "Gupton Jones College of Funeral Service"
diversity_new5$name[grepl("Brookline College", diversity_new5$name)] <- "Brookline College"
diversity_new5$name[15746]
## [1] "Concorde Career College at San Diego"
diversity_new5$name[24791]
## [1] "Gupton Jones College of Funeral Service"

Simplifying Names for all Schools in our Samples

We changed all of the names of our schools to simplified versions, which was particularly important for the schools with multiple locations.

Jennifer - For-Profit Name Changes

diversity_new5$name[grepl('Clover Park Technical College', diversity_new5$name)] <- 'Clover Park Tech College'

diversity_new5$name[grepl('San Diego City College', diversity_new5$name)] <- 'San Diego City College'

diversity_new5$name[grepl('Aspen University', diversity_new5$name)] <- 'Aspen Univ.'

diversity_new5$name[grepl('Grand Canyon University', diversity_new5$name)] <- 'Grand Canyon Univ.'

diversity_new5$name[grepl('American Public University', diversity_new5$name)] <- 'American Public Univ.'

Ana - For-Profit Name Changes

diversity_new5$name[grepl('Blue Cliff College', diversity_new5$name)] <- 'Blue Cliff College' 

diversity_new5$name[grepl('University of Phoenix', diversity_new5$name)] <- 'University of Phoenix' 

diversity_new5$name[grepl('Stevens-Henager College', diversity_new5$name)] <- 'Stevens-Henager College' 

diversity_new5$name[grepl('DeVry University', diversity_new5$name)] <- 'DeVry Univ.' 

diversity_new5$name[grepl('Pioneer Pacific College', diversity_new5$name)] <- 'Pioneer Pacific College' 

diversity_new5$name[grepl('National College at', diversity_new5$name)] <- 'National College'

Tiffany - For-Profit Name Changes

diversity_new5$name[grepl('Strayer University',diversity_new5$name)]<- 'Strayer Univ.'

diversity_new5$name[grepl('Lincoln Tech',diversity_new5$name)]<- 'Lincoln Tech' 

diversity_new5$name[grepl('Fashion Institute of Design and Merchandising', diversity_new5$name)]<- 'Fashion Institute'

diversity_new5$name[grepl('Centura College',diversity_new5$name)]<- 'Centura College'

diversity_new5$name[grepl('Rasmussen College',diversity_new5$name)]<- 'Rasmussen College' 

diversity_new5$name[grepl('Fortis College',diversity_new5$name)]<- 'Fortis College'

One Name that Refused to Change

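For some reason Southeastern Community College (Iowa) would not change names. It also would not appear if we searched for it using the which function. We were only able to locate this school through the total_enrollment column. The parentheses are the likely cause: grepl treats its pattern as a regular expression, in which parentheses form a group rather than matching literal “(” and “)” characters, so the pattern “Southeastern Community College (Iowa)” actually looks for the text “Southeastern Community College Iowa”.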

#library(stringr)
#str_replace(diversity_new5$name, "\\(.*\\)", "")
#diversity_new5$name <- as.character(diversity_new5$name)
#unlist(strsplit(diversity_new5$name, " \\(.*\\)"))
#library(plyr)
#diversity_new5$name <- as.character(diversity_new5$name)
#c<-strsplit(diversity_new5$name, "\\(")
#ldply(c)
which(grepl("Southeastern Community College Iowa", diversity_new5$name))
## integer(0)
which(grepl("Southeastern Community College (Iowa)", diversity_new5$name))
## integer(0)
diversity_new5$name[grepl("Southeastern Community College (Iowa)", diversity_new5$name)] <- "Southeastern Community College Iowa"
which(grepl("2987", diversity_new5$total_enrollment))
## [1] 9217 9218 9219 9220 9221 9222
diversity_new5$name[10753:10759]
## [1] "Kent State University -Tuscarawas (Ohio)"
## [2] "Kent State University -Tuscarawas (Ohio)"
## [3] "Kent State University -Tuscarawas (Ohio)"
## [4] "Kent State University -Tuscarawas (Ohio)"
## [5] "Kent State University -Tuscarawas (Ohio)"
## [6] "Kent State University -Tuscarawas (Ohio)"
## [7] "Bellingham Technical College"
which(grepl("Southeastern Community College", diversity_new5$name))
##  [1]  9217  9218  9219  9220  9221  9222 13987 13988 13989 13990 13991 13992
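
The behavior above can be reproduced on a small made-up vector. In this sketch the second school name is a hypothetical example; the point is that either fixed = TRUE or escaped parentheses makes grepl match the literal name.

```r
# Hypothetical stand-in for the two Southeastern Community College entries
x <- c("Southeastern Community College (Iowa)",
       "Southeastern Community College (N.C.)")

# As a regular expression, "(Iowa)" is a group that matches the text "Iowa"
# with no parentheses around it, so nothing matches
regex_match <- grepl("Southeastern Community College (Iowa)", x)

# fixed = TRUE treats the pattern as a literal string
fixed_match <- grepl("Southeastern Community College (Iowa)", x, fixed = TRUE)

# Escaping the parentheses also matches them literally
escaped_match <- grepl("Southeastern Community College \\(Iowa\\)", x)

regex_match    # FALSE FALSE
fixed_match    # TRUE FALSE
escaped_match  # TRUE FALSE

# With a literal match, the rename goes through
x[fixed_match] <- "Southeastern Community College Iowa"
```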

Merging and Summarizing the Data for Schools with Multiple Locations

All of the schools with multiple locations now have a single name; however, they are still split across separate rows. We would like to combine the statistics for each “category” across all of the locations into one summarized row. At this stage we had not yet determined how to accomplish this. For example, in the following image you can see four locations for the University of Houston.

**University of Houston Example**

What we would like to do is combine each of the four into one set of 7 lines with the averages calculated for each category’s enrollment and enrollment percentage.

We were able to resolve this with the creation of dataset “c4” below (Rcode line number 871)
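
One general way to collapse the location rows is to group by school name and category and then average within each group. The sketch below uses made-up numbers and is not the code behind “c4”; the object names toy and combined are illustrative only.

```r
library(dplyr)

# Toy data: two locations of one school, two categories each
toy <- tibble(
  name = rep("Univ. of Houston", 4),
  category = c("Asian", "Asian", "Black", "Black"),
  enrollment = c(100, 300, 50, 150),
  enrollment_percentage = c(10, 30, 5, 15)
)

# One row per school and category, averaging across locations
combined <- toy %>%
  group_by(name, category) %>%
  summarise(
    enrollment = mean(enrollment),
    enrollment_percentage = mean(enrollment_percentage)
  ) %>%
  ungroup()

combined
```

Averaging percentages weights every location equally regardless of size; summing enrollment and total_enrollment across locations and recomputing the percentage would be an alternative if a size-weighted figure is wanted.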

35 Randomly selected not-for-profit educational institutions

For consistency of methodology we decided to select 35 schools from the not-for-profit educational institutions to compare to the 35 randomly selected for-profit schools. The final list of randomly selected not-for-profit schools was also geographically diverse.

sample_n(diversity_new,35)
## # A tibble: 35 x 6
##    name         total_enrollment state   category    enrollment enrollment_perc~
##    <chr>                   <dbl> <chr>   <chr>            <dbl>            <dbl>
##  1 Clinton Com~             1870 New Yo~ Asian               32           1.71  
##  2 State Unive~             5968 New Yo~ Non-Reside~        393           6.59  
##  3 Master's Co~             1572 Califo~ Non-Reside~        115           7.32  
##  4 College of ~             1444 Arkans~ Black              181          12.5   
##  5 Ohio State ~             1188 Ohio    Two Or Mor~         29           2.44  
##  6 Trinity Int~             2202 Illino~ Hispanic            84           3.81  
##  7 Gordon Coll~             2105 Massac~ Black               66           3.14  
##  8 Transylvani~             1014 Kentuc~ Black               31           3.06  
##  9 Argosy Univ~              532 Tennes~ Black              278          52.3   
## 10 Monroe Coun~             3482 Michig~ Native Haw~          3           0.0862
## # ... with 25 more rows
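Note that `sample_n()` applied to the long table draws 35 rows, not 35 schools, as the output above shows (one category row per sampled school). A possible way to draw 35 whole schools is to sample the distinct names first. The sketch below uses an invented toy table; the seed value is arbitrary.

```r
library(dplyr)

set.seed(2020)  # arbitrary seed, only so the draw is repeatable

# Toy stand-in for diversity_new: 50 invented schools, 6 category rows each
toy <- data.frame(
  name = rep(paste("School", 1:50), each = 6),
  category = rep(c("American Indian / Alaska Native", "Asian", "Black",
                   "Hispanic", "Native Hawaiian / Pacific Islander", "White"),
                 times = 50),
  stringsAsFactors = FALSE
)

# Sample school names first, then keep every row for the sampled schools
sampled_names <- toy %>%
  distinct(name) %>%
  sample_n(35) %>%
  pull(name)

sampled_rows <- toy %>% filter(name %in% sampled_names)
```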

Jennifer

  • University of Idaho
  • Southeastern Community College (Iowa)
  • Clover Park Technical College
  • Clark College
  • City University of New York Hunter College
  • University of Houston
    • University of Houston
    • University of Houston-Downtown
    • University of Houston-Clear Lake
    • University of Houston-Victoria
  • University of Colorado
    • University of Colorado at Colorado Springs
    • University of Colorado at Boulder
    • University of Colorado at Denver
  • University of Massachusetts
    • University of Massachusetts at Dartmouth
    • University of Massachusetts at Worcester
    • University of Massachusetts at Amherst
    • University of Massachusetts at Lowell
    • University of Massachusetts at Boston
  • Arizona College
    • Arizona College at Glendale
    • Arizona College at Mesa
  • San Diego City College

Ana

  • State University of New York College at Purchase
  • Eastern Shore Community College
  • Southern Oregon University
  • Santa Monica College
  • Adirondack Community College
  • East Georgia State College
  • Smith College
  • East Tennessee State University
  • Austin Community College
  • Tompkins Cortland Community College

Tiffany

  • University of Hawaii Hawaii Community College
  • Burlington College
  • Pennsylvania State University - Harrisburg
  • University of Minnesota
  • University of Central Oklahoma
  • Southern Virginia University
  • Pennsylvania Highlands Community College
  • Arizona State University
  • ITT Technical Institute at Knoxville
  • Colgate University
  • University of Pittsburg

Jennifer- Non-Profit Name Changes

diversity_new5$name[grepl('University of Houston', diversity_new5$name)] <- 'Univ. of Houston'

diversity_new5$name[grepl('University of Colorado', diversity_new5$name)] <- 'Univ. of Colorado'

diversity_new5$name[grepl('University of Massachusetts', diversity_new5$name)] <- 'Univ. of Massachusetts'

diversity_new5$name[grepl('Arizona College', diversity_new5$name)] <- 'Arizona College'

diversity_new5$name[grepl('University of Idaho', diversity_new5$name)] <- 'Univ. of Idaho'

diversity_new5$name[grepl('Clark College', diversity_new5$name)] <- 'Clark College'

diversity_new5$name[grepl('City University of New York Hunter College', diversity_new5$name)] <- 'City Univ. New York Hunter College'

Ana - Non-Profit Name Changes

diversity_new5$name[grepl('State University of New York', diversity_new5$name)] <- 'State Univ. of New York' 

Tiffany - Non-Profit Name Changes

diversity_new5$name[grepl('Pennsylvania State University',diversity_new5$name)]<- 'Penn State Univ.'

diversity_new5$name[grepl('University of Minnesota',diversity_new5$name)]<- 'Univ. of Minn'

diversity_new5$name[grepl('Arizona State University',diversity_new5$name)]<- 'Arizona State'

diversity_new5$name[grepl('ITT Technical Institute',diversity_new5$name)]<- 'ITT Tech' 
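The repeated `grepl()` assignments above could also be driven by a single lookup table. The sketch below is one way to do that in base R; the helper `rename_schools()` and the vector `short_names` are hypothetical names introduced for illustration. It passes `fixed = TRUE` so that characters such as parentheses in a school name are matched literally rather than as regular-expression syntax.

```r
# Hypothetical lookup: text found in the long name -> short label
short_names <- c(
  "University of Houston"   = "Univ. of Houston",
  "University of Colorado"  = "Univ. of Colorado",
  "ITT Technical Institute" = "ITT Tech"
)

rename_schools <- function(x, lookup) {
  for (pattern in names(lookup)) {
    # fixed = TRUE treats the pattern as literal text, not a regular expression
    x[grepl(pattern, x, fixed = TRUE)] <- lookup[[pattern]]
  }
  x
}

toy_names <- c("University of Houston-Downtown",
               "University of Colorado at Boulder",
               "ITT Technical Institute at Knoxville",
               "Smith College")

renamed <- rename_schools(toy_names, short_names)
renamed
```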

Create a new column to identify the randomly selected for-profit colleges sample.

We were initially going to use the column to create our plots and complete our analysis; however, we realized it would be faster to create separate datasets with the randomly selected schools. The code below creates a new column with “1” for the identified for-profit colleges and “0” for the rest.

Produces 1 for For-Profit Random Selections and 0 for the rest

# forProfit is 1 for the listed for-profit schools and 0 for every other row
diversity_new3 <- diversity_new %>% mutate(forProfit = ifelse(name=="Spencerian_College", 1,
     ifelse(name=="Aspen University", 1, 
        ifelse(name=="American Public University system", 1, 0))))
diversity_new3$forProfit[47834] # Check on Spencerian_College entry
## [1] 0
diversity_new3$forProfit[24533] # Check on Aspen University
## [1] 1
diversity_new3$forProfit[123] # Check on American Public University system
## [1] 1
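The check for Spencerian College above returned 0, which may mean the string in the code does not match the name stored in the data (the underscore is one candidate, though that is an assumption). A possible way to build the flag and catch such mismatches is `%in%` plus `setdiff()`. The toy data frame below is invented for the example.

```r
library(dplyr)

# Names as they were typed in the code (illustrative subset)
for_profit_names <- c("Spencerian_College", "Aspen University")

# Toy stand-in for diversity_new; the spellings here are assumptions
toy <- data.frame(
  name = c("Spencerian College", "Aspen University", "Smith College"),
  stringsAsFactors = FALSE
)

# 1 when the school name appears in the for-profit list, 0 otherwise
flagged <- toy %>%
  mutate(forProfit = as.integer(name %in% for_profit_names))

# Names in the list that match nothing in the data
unmatched <- setdiff(for_profit_names, toy$name)
unmatched
```

Checking by name instead of by row number also keeps the check valid if the row order changes.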

Creating two new datasets to compare for-profit and not-for-profit institutions

Is there a simpler way to group these?

for_profit2 <- diversity_new5 %>% 
  filter(name == "Spencerian College" |
           name == "Mildred Elley" | 
           name == "Brookline College" | 
           name == "Grand Canyon Univ." | 
           name == "Aspen Univ." | 
           name == "American Public Univ." | 
           name == "Western Intl Univ." | 
           name == "NewSchool Arch. Design" | 
           name == "Schiller Intl Univ." | 
           name == "Natl Paralegal College" | 
           name == "West Coast Univ." | 
           name == "Blue Cliff College" | 
           name == "Walden University" |
           name == "Neumont University" |
           name == "University of Phoenix" | 
           name == "Stevens-Henager College" | 
           name == "DeVry Univ." | 
           name == "Pioneer Pacific College" | 
           name == "Stratford University" | 
           name == "Capella University" | 
           name == "Grantham University" | 
           name == "Redstone College" | 
           name == "National College" | 
           name == "Strayer Univ." | 
           name == "Lincoln Tech" | 
           name == "Fashion Institute" | 
           name == "Centura College" |  
           name == "Rasmussen College" | 
           name == "Fortis College" | 
           name == "Full Sail University" | 
           name == "Rocky Mountain College of Art & Design" | 
           name == "Minneapolis Business College" | 
           name == "Paier College of Art" | 
           name == "Vista College" | 
           name == "Bay State College" )
str(for_profit2)
## tibble [1,014 x 6] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ name                 : chr [1:1014] "University of Phoenix" "University of Phoenix" "University of Phoenix" "University of Phoenix" ...
##  $ total_enrollment     : num [1:1014] 195059 195059 195059 195059 195059 ...
##  $ state                : chr [1:1014] "Arizona" "Arizona" "Arizona" "Arizona" ...
##  $ category             : chr [1:1014] "American Indian / Alaska Native" "Asian" "Black" "Hispanic" ...
##  $ enrollment           : num [1:1014] 876 1959 31455 13984 1019 ...
##  $ enrollment_percentage: num [1:1014] 0.449 1.004 16.126 7.169 0.522 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   name = col_character(),
##   ..   total_enrollment = col_double(),
##   ..   state = col_character(),
##   ..   category = col_character(),
##   ..   enrollment = col_double()
##   .. )
head(for_profit2)
## # A tibble: 6 x 6
##   name       total_enrollment state  category       enrollment enrollment_perce~
##   <chr>                 <dbl> <chr>  <chr>               <dbl>             <dbl>
## 1 Universit~           195059 Arizo~ American Indi~        876             0.449
## 2 Universit~           195059 Arizo~ Asian                1959             1.00 
## 3 Universit~           195059 Arizo~ Black               31455            16.1  
## 4 Universit~           195059 Arizo~ Hispanic            13984             7.17 
## 5 Universit~           195059 Arizo~ Native Hawaii~       1019             0.522
## 6 Universit~           195059 Arizo~ White               58209            29.8
not_profit <- diversity_new5 %>% 
  filter(name == "Univ. of Idaho" |
           total_enrollment == 2987 | # presumably the workaround for Southeastern Community College (Iowa); see note below
           name == "Clover Park Technical College"|
           name == "Clark College"|
           name == "City Univ. New York Hunter College"|
           name == "Univ. of Houston" | 
           name == "Univ. of Colorado"|
           name == "Univ. of Massachusetts" |
           name == "Arizona College" |
           name == "San Diego City College" | 
           name == "State Univ. of New York" | 
           name == "Eastern Shore Community College" | 
           name == "Southern Oregon University" | 
           name == "Santa Monica College" | 
           name == "Adirondack Community College" | 
           name == "East Georgia State College" | 
           name == "Smith College" | 
           name == "East Tennessee State University" | 
           name == "Austin Community College" | 
           name == "Tompkins Cortland Community College" | 
           name == "Penn State Univ." | 
           name == "Univ. of Minn" | 
           name == "Arizona State" | 
           name == "ITT Tech" | 
           name == "University of Hawaii Hawaii Community College"|
           name == "Burlington College" | 
           name == "University of Central Oklahoma" | 
           name == "Southern Virginia University" | 
           name == "Pennsylvania Highlands Community College" | 
           name == "Colgate University"|
           name == "University of Pittsburg")

For some reason the name filter for “Southeastern Community College (Iowa)” would not select properly. The same happened for the attempted name change above.
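One possible explanation, not verified against the data: the stored name has no “(Iowa)” suffix and is shared by more than one school, which the two blocks of row indices returned by the earlier `which(grepl(...))` call would be consistent with. In addition, parentheses in a `grepl()` pattern are regular-expression grouping characters unless `fixed = TRUE` is set. The sketch below disambiguates by `state` and also shows `%in%` as a shorter alternative to a long chain of `name == ... |` conditions. The toy rows are invented.

```r
library(dplyr)

# Toy stand-in for diversity_new5 (rows invented for illustration)
toy <- data.frame(
  name = c("Southeastern Community College", "Southeastern Community College",
           "Clark College", "Smith College", "Colgate University"),
  state = c("Iowa", "North Carolina", "Washington", "Massachusetts", "New York"),
  stringsAsFactors = FALSE
)

# Schools that can be selected by name alone
wanted <- c("Clark College", "Smith College")

picked <- toy %>%
  filter(name %in% wanted |
           (name == "Southeastern Community College" & state == "Iowa"))

picked
```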

**New columns of for-profit and non-profit prior to joining the datasets**

Adding for-profit column

for_profit3 <- cbind(for_profit2, profit_status = "for-profit") 

for_profit4 <- for_profit3

Adding non-profit column

not_profit2 <- cbind(not_profit, profit_status = "non-profit")

not_profit3 <- not_profit2

Creating a Dataset with US Census information

According to the U.S. Census population statistics website, the racial and ethnic breakdown of the United States is as follows:

  • White - 72%
  • Black or African American alone - 12.7%
  • American Indian and Alaska Native alone - 0.9%
  • Asian alone - 5.6%
  • Native Hawaiian and Other Pacific Islander alone - 0.2%
  • Hispanic - 18%
  • Some other race alone - 5.0%
  • Two or more races - 3.4%
# Shares are proportions (0.9% is 0.009), matching the list above
census <- data.frame("category" = c("White", "Black", "American Indian/Alaska Native", "Asian", "Native Hawaiian / Pacific Islander", "Hispanic"), "Percent_Pop" = c(.72, .127, .009, .056, .002, .18))

census
##                             category Percent_Pop
## 1                              White       0.720
## 2                              Black       0.127
## 3      American Indian/Alaska Native       0.009
## 4                              Asian       0.056
## 5 Native Hawaiian / Pacific Islander       0.002
## 6                           Hispanic       0.180
pcensus <- ggplot(census, aes(reorder(x = category, Percent_Pop), y = Percent_Pop, fill = category))+
  geom_bar(stat = "identity")+ 
  geom_text(aes(label = scales::percent(Percent_Pop), 
                  y = Percent_Pop, 
                  group = category, vjust = -.2))+
  ggtitle("Percentage of U.S. Population by Race/Ethnicity") +
  labs(y = "Percent of Population", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()+
   scale_fill_brewer()

pcensus 

Creating a census dataset with the same columns so it can be bound with the for-profit and not-for-profit groups

census4bind <- cbind(not_profit, profit_status = "census")

# census percentages

census4bind$enrollment_percentage[census4bind$category == 'White'] <- '72'

census4bind$enrollment_percentage[census4bind$category == 'Black'] <- '12.7'

census4bind$enrollment_percentage[census4bind$category == 'American Indian / Alaska Native'] <- '0.9'

census4bind$enrollment_percentage[census4bind$category == 'Asian'] <- '5.6'

census4bind$enrollment_percentage[census4bind$category == 'Native Hawaiian / Pacific Islander'] <- '0.2'

census4bind$enrollment_percentage[census4bind$category == 'Hispanic'] <- '18'

# population totals

#census4bind$enrollment[census4bind$category == 'White'] <- '235560556'

#census4bind$enrollment[census4bind$category == 'Black'] <- '41550265'

#census4bind$enrollment[census4bind$category == 'American Indian / Alaska Native'] <- '2944507'

#census4bind$enrollment[census4bind$category == 'Asian'] <- '18321377'

#census4bind$enrollment[census4bind$category == 'Native Hawaiian / Pacific Islander'] <- '654334.9'

#census4bind$enrollment[census4bind$category == 'Hispanic'] <- '58890139'

Determining actual U.S. population numbers from Census total and percentages

327167439*0.72 # white population
## [1] 235560556
327167439*0.127 # black population
## [1] 41550265
327167439*0.009 # American Indian / Alaska Native population
## [1] 2944507
327167439*0.056 # Asian population
## [1] 18321377
327167439*0.002 # Native Hawaiian / Pacific Islander population
## [1] 654334.9
327167439*0.18 # Hispanic population
## [1] 58890139
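The same arithmetic can be done in one step with a named vector. This is a small sketch using the total and shares from the calculations above; the object names `us_total`, `shares`, and `counts` are introduced here for illustration.

```r
us_total <- 327167439  # population total used in the calculations above

# Shares as proportions, taken from the census list
shares <- c(
  "White" = 0.72,
  "Black" = 0.127,
  "American Indian / Alaska Native" = 0.009,
  "Asian" = 0.056,
  "Native Hawaiian / Pacific Islander" = 0.002,
  "Hispanic" = 0.18
)

# Multiply once and round to whole people
counts <- round(us_total * shares)
counts
```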

Combine Random Sample For-Profit and Non-Profit Data Sets

# Bind the for-profit rows to the non-profit rows (rbind(3, ...) added a row of 3s instead)
combo_set <- rbind(for_profit3, not_profit2)
str(combo_set)
## 'data.frame':    2454 obs. of  7 variables:
##  $ name                 : chr  "University of Phoenix" "University of Phoenix" "University of Phoenix" "University of Phoenix" ...
##  $ total_enrollment     : num  195059 195059 195059 195059 195059 ...
##  $ state                : chr  "Arizona" "Arizona" "Arizona" "Arizona" ...
##  $ category             : chr  "American Indian / Alaska Native" "Asian" "Black" "Hispanic" ...
##  $ enrollment           : num  876 1959 31455 13984 1019 ...
##  $ enrollment_percentage: num  0.449 1.004 16.126 7.169 0.522 ...
##  $ profit_status        : chr  "for-profit" "for-profit" "for-profit" "for-profit" ...

Shortening the Enrollment percentage decimal points

combo_set2 <- combo_set %>% 
  mutate_if(is.numeric, round, digits = 2)

not_profit2 <- not_profit %>% 
  mutate_if(is.numeric, round, digits = 2)

for_profit3<- for_profit2 %>% 
  mutate_if(is.numeric, round, digits = 2)

Completed a bind with for-profit, non-profit and census percentages.

#c4b <- census4bind[,-3]
bindwcensus <- rbind(census4bind, for_profit4, not_profit3 )
str(bindwcensus)
## 'data.frame':    3894 obs. of  7 variables:
##  $ name                 : chr  "Univ. of Minn" "Univ. of Minn" "Univ. of Minn" "Univ. of Minn" ...
##  $ total_enrollment     : num  51147 51147 51147 51147 51147 ...
##  $ state                : chr  "Minnesota" "Minnesota" "Minnesota" "Minnesota" ...
##  $ category             : chr  "American Indian / Alaska Native" "Asian" "Black" "Hispanic" ...
##  $ enrollment           : num  163 3998 1785 1544 31 ...
##  $ enrollment_percentage: chr  "0.9" "5.6" "12.7" "18" ...
##  $ profit_status        : chr  "census" "census" "census" "census" ...

However, this bind ended up with nearly 4,000 observations (3,894 rows of 7 variables). We had to simplify.

Because the census percentages were assigned as quoted strings, the enrollment_percentage variable became character type, so we changed it back to numeric type.

bindwcensus$enrollment_percentage <- as.numeric(as.character(bindwcensus$enrollment_percentage))
str(bindwcensus)
## 'data.frame':    3894 obs. of  7 variables:
##  $ name                 : chr  "Univ. of Minn" "Univ. of Minn" "Univ. of Minn" "Univ. of Minn" ...
##  $ total_enrollment     : num  51147 51147 51147 51147 51147 ...
##  $ state                : chr  "Minnesota" "Minnesota" "Minnesota" "Minnesota" ...
##  $ category             : chr  "American Indian / Alaska Native" "Asian" "Black" "Hispanic" ...
##  $ enrollment           : num  163 3998 1785 1544 31 ...
##  $ enrollment_percentage: num  0.9 5.6 12.7 18 0.2 72 0.9 5.6 12.7 18 ...
##  $ profit_status        : chr  "census" "census" "census" "census" ...
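A smaller alternative, sketched under assumptions: instead of copying every non-profit row and overwriting its percentages, the census figures can be stored as six rows with the same seven columns and numeric (unquoted) percentages, so no type conversion is needed after the bind. The label `"U.S. Census"` and the `NA` placeholders are choices made for this illustration; the single school row reuses the University of Phoenix values shown earlier.

```r
# Six census rows with the same columns as the school datasets
census_rows <- data.frame(
  name = "U.S. Census",              # hypothetical label for the census rows
  total_enrollment = NA_real_,
  state = NA_character_,
  category = c("White", "Black", "American Indian / Alaska Native",
               "Asian", "Native Hawaiian / Pacific Islander", "Hispanic"),
  enrollment = NA_real_,
  enrollment_percentage = c(72, 12.7, 0.9, 5.6, 0.2, 18),  # numeric, not quoted
  profit_status = "census",
  stringsAsFactors = FALSE
)

# One-row stand-in for for_profit4 / not_profit3
schools <- data.frame(
  name = "University of Phoenix",
  total_enrollment = 195059,
  state = "Arizona",
  category = "Black",
  enrollment = 31455,
  enrollment_percentage = 16.126,
  profit_status = "for-profit",
  stringsAsFactors = FALSE
)

combined <- rbind(census_rows, schools)
str(combined)
```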

Visualizations

library(ggplot2)

Racial & Ethnic Composition at For-Profit Schools

plot1 <- for_profit2 %>%
  ggplot() +
  # position = "fill" rescales each school's bar so the categories sum to 1
  geom_bar(aes(x = name, y = enrollment_percentage, fill = category),
           position = "fill",
           stat = "identity") +
  coord_flip() +
  theme_minimal() +
  ggtitle("Diversity % of Enrollment in For Profit US Schools") +
  theme(plot.title = element_text(hjust = .01, size = 15),
        legend.justification = -20,
        legend.position = "bottom",
        legend.text = element_text(size = 6)) +
  labs(fill = "Race") +
  xlab("School Name") +
  ylab("Percent of Enrollment")
plot1

Racial & Ethnic Composition at Non-Profit Schools

plot2 <- not_profit %>%
  ggplot() +
  # position = "fill" rescales each school's bar so the categories sum to 1
  geom_bar(aes(x = name, y = enrollment_percentage, fill = category),
           position = "fill",
           stat = "identity") +
  coord_flip() +
  theme_minimal() +
  ggtitle("Diversity % of Enrollment in Not for Profit US Schools") +
  theme(plot.title = element_text(hjust = .01, size = 12),
        legend.position = "bottom",
        legend.justification = "left",
        legend.text = element_text(size = 6)) +
  labs(fill = "Race") +
  xlab("School Name") +
  ylab("Percent of Enrollment")
plot2

Visualizations with the Combined Set

# attempt to wrap labels, but we found a more elegant solution using str_wrap() for the plots below 

#wrap.it <- function(x, len)
#{ 
 # sapply(x, function(y) paste(strwrap(y, len), 
  #                            collapse = "\n"), 
   #      USE.NAMES = FALSE)
#}


#wrap.labels <- function(x, len)
#{
 # if (is.list(x))
  #{
   # lapply(x, wrap.it, len)
  #} else {
  #  wrap.it(x, len)
  #}
#}
#}
#wr.lap <- wrap.labels(combo_set$category, 100)
library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:git2r':
## 
##     config
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout

Plot of Enrollment Percentages based on Race & Ethnicity: Combined For-profit and Non-profit Data, Without Census Data

ggplot(data=combo_set, aes(x=category, y=enrollment_percentage, fill=profit_status)) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("For-Profit/Non-Profit Racial & Ethnic Percent Enrollment Comparisons") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()+
   scale_fill_manual(values=c("#9999CC", "#66CC99"))

library(dplyr)
combo_set2 %>%
    group_by(name, category, profit_status) %>% 
    summarise_each(funs(mean)) %>% 
  ggplot(aes(x=category, y=enrollment_percentage, fill=profit_status, text = paste("Percent:", enrollment_percentage, '</br>', '</br>School Name:', name))) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("For-Profit/Non-Profit Racial & Ethnic Comparisons") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()
## Warning: `summarise_each_()` is deprecated as of dplyr 0.7.0.
## Please use `across()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
## Warning: `funs()` is deprecated as of dplyr 0.8.0.
## Please use a list of either functions or lambdas: 
## 
##   # Simple named list: 
##   list(mean = mean, median = median)
## 
##   # Auto named with `tibble::lst()`: 
##   tibble::lst(mean, median)
## 
##   # Using lambdas
##   list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
## Warning in mean.default(state): argument is not numeric or logical: returning NA
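The deprecation messages above point to `across()`, and the `mean.default(state)` warnings arise because `summarise_each()` also tries to average the character column `state`. A minimal sketch of the replacement, on invented toy rows, restricts the mean to numeric columns:

```r
library(dplyr)

# Toy stand-in for combo_set2 (values invented for illustration)
toy <- data.frame(
  name = c("A", "A", "B"),
  category = c("Asian", "Asian", "Asian"),
  profit_status = c("for-profit", "for-profit", "non-profit"),
  state = c("Ohio", "Ohio", "Iowa"),
  enrollment_percentage = c(10, 20, 5),
  stringsAsFactors = FALSE
)

# Average only the numeric columns within each group; state is left out
averaged <- toy %>%
  group_by(name, category, profit_status) %>%
  summarise(across(where(is.numeric), mean), .groups = "drop")

averaged
```

Because `state` is neither a grouping column nor numeric, it is dropped from the result rather than returned as `NA`.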


ggplotly(tooltip = "text")
ggplot(data=combo_set2, aes(x=category, y=enrollment_percentage, fill=profit_status, text = paste("Percent:", enrollment_percentage, '</br>', '</br>School Name:', name))) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("For-Profit/Non-Profit Racial & Ethnic Enrollment Percentage Comparisons") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()+
  scale_fill_brewer()

ggplotly(tooltip = "text")

This plot is interesting to look at; however, the bars are not universally in ascending order. The interactive tooltip provides information per school from our random samples, but it is too much information for one visual.
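
One way to put the bars in a consistent order is to reorder the category levels by a summary of the percentages before plotting. The following is a minimal base R sketch, not the code used for the plots above; the data frame `toy` and its values are hypothetical stand-ins for combo_set2.

```r
# Hypothetical toy data standing in for combo_set2 (values are made up)
toy <- data.frame(
  category = c("White", "Black", "Asian", "White", "Black", "Asian"),
  enrollment_percentage = c(60, 20, 5, 50, 30, 7),
  profit_status = rep(c("non-profit", "for-profit"), each = 3)
)

# reorder() sorts the factor levels by the mean percentage of each category
toy$category <- reorder(toy$category, toy$enrollment_percentage, FUN = mean)

# Levels now run from the smallest to the largest mean percentage
levels(toy$category)
```

Mapping the reordered column to x in ggplot() would then draw the categories from smallest to largest mean. Because every school contributes its own row, the dodged bars would still overlap within each category, so summarising to one value per category and profit status first may give a cleaner result.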

Plot of Enrollment Percentages based on Race & Ethnicity: Combined For-profit and Non-profit Data with Census Data Added

ggplot(data=bindwcensus, aes(x=category, y=enrollment_percentage, fill=profit_status)) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("For-Profit/Non-Profit Racial & Ethnic Percent Enrollment Comparisons") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()+
   scale_fill_manual(values=c("#CC6666","#9999CC", "#66CC99"))

ggplot(data=combo_set2, aes(x=category, y=enrollment, fill=profit_status, text = paste("Enrollment:", enrollment, '</br>', '</br>School Name:', name))) + 
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("For-Profit/Non-Profit Racial & Ethnic Enrollment Numbers Comparisons") +
  labs(y = "Enrollment", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  scale_y_continuous(limits = c(0, 20000)) +
  theme_minimal()
## Warning: Removed 4 rows containing missing values (geom_bar).

ggplotly(tooltip = "text") 

Again, this is too much information and the bars are not in complete ascending order.

A beautiful mistake

#barchart <- table(bindwcensus$total_enrollment, bindwcensus$category)
#barplot(barchart, main="title ", horiz=TRUE,
 # names.arg="category")

Plot of Enrollment Numbers based on Race & Ethnicity: Combined For-profit and Non-profit Data, Without Census Data

ggplot(data=combo_set, aes(x=category, y=enrollment, fill=profit_status)) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("For-Profit/Non-Profit Racial & Ethnic Total Enrollment Comparisons") +
  labs(y = "Enrollment", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()+
   scale_fill_manual(values=c("#9999CC", "#66CC99"))

Plot of Enrollment Numbers based on Race & Ethnicity: Combined For-profit and Non-profit Data, With Census Data

ggplot(data=bindwcensus, aes(x=category, y=enrollment, fill=profit_status)) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("For-Profit/Non-Profit Racial & Ethnic Total Enrollment Comparisons") +
  labs(y = "Enrollment", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()+
   scale_fill_manual(values=c("#CC6666","#9999CC", "#66CC99"))

The error here is that the census and non-profit enrollment numbers are the same: the non-profit dataset was used to make the census dataset, and we changed only the enrollment_percentage values, not the enrollment values.

The percent enrollment and enrollment total visualizations are quite different. So we examined the difference between the total enrollment figures.

sum(for_profit4$total_enrollment)
## [1] 3858828
sum(not_profit2$total_enrollment)
## [1] 5139036

Creating a shorter Combined Dataset

Combining all of the school locations into one so that the plots are less crowded

Since the final combined dataset had nearly 4000 rows we decided to make a new dataset with simplified data. We combined all of the school entries with multiple locations into one entry with the rows populated by the averages for each variable.

# Removing the "state" category so we can make a new dataset with means for each column without an error due to categorical differences that cannot be merged with a mean calculation.
c3 <- combo_set2[,-3]

Creating a dataset with all locations combined into one entry per school

# creating a new dataset with all locations in one entry for each school
# (summarise(across()) replaces the deprecated summarise_each(funs()))
c4 <- c3 %>%
    group_by(name, category, profit_status) %>% 
    summarise(across(everything(), mean))
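
Averaging the percentages across locations gives a small campus the same weight as a large one. As an alternative, the counts could be summed across locations and the percentage recomputed from the totals. The sketch below is only an illustration of that idea in base R; the data frame `toy`, the school names and the numbers are hypothetical.

```r
# Hypothetical rows: School A has two campuses of very different sizes
toy <- data.frame(
  name = c("School A", "School A", "School B"),
  category = c("Black", "Black", "Black"),
  profit_status = c("for-profit", "for-profit", "non-profit"),
  total_enrollment = c(1000, 100, 500),
  enrollment = c(100, 50, 50)
)

# Sum the counts per school, category and profit status
collapsed <- aggregate(
  cbind(total_enrollment, enrollment) ~ name + category + profit_status,
  data = toy, FUN = sum
)

# Recompute the percentage from the summed counts
collapsed$enrollment_percentage <-
  round(100 * collapsed$enrollment / collapsed$total_enrollment, 2)

collapsed
```

For School A the simple mean of the two campus percentages (10 and 50) would be 30, while the percentage based on summed counts is about 13.6, which reflects that most of its students attend the larger campus.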

This list was based on unmerged school locations, so we first combined the locations within each dataset before merging the datasets.

Combining all locations within the for-profit, non-profit and census datasets before merging them.

# For-profit short combined list
c3a <- c3 %>% 
  filter(profit_status == "for-profit") %>% 
  group_by(name, category, profit_status) %>% 
    summarise(across(everything(), mean))
#str(c3a)
# Non-profit short combined list
c3b <- c3 %>% 
  filter(profit_status == "non-profit") %>% 
  group_by(name, category, profit_status) %>% 
    summarise(across(everything(), mean))
#str(c3b)

Creating a category of census to bind with for profit and not for profit groups

# creating census short combined list
c3c <- c3b
c3c$profit_status[c3c$profit_status == "non-profit"] <- "census"
#str(c3c)

Changing the percentages in the census list to the U.S. census percentages

# census percentages

c3c$enrollment_percentage[c3c$category == 'White'] <- as.numeric(72)

c3c$enrollment_percentage[c3c$category == 'Black'] <- as.numeric(12.7)

c3c$enrollment_percentage[c3c$category == 'American Indian / Alaska Native'] <- as.numeric(0.9)

c3c$enrollment_percentage[c3c$category == 'Asian'] <- as.numeric(5.6)

c3c$enrollment_percentage[c3c$category == 'Native Hawaiian / Pacific Islander'] <- as.numeric(0.2)

c3c$enrollment_percentage[c3c$category == 'Hispanic'] <- as.numeric(18)

# population totals

#census4bind$enrollment[census4bind$category == 'White'] <- '235560556'

#census4bind$enrollment[census4bind$category == 'Black'] <- '41550265'

#census4bind$enrollment[census4bind$category == 'American Indian / Alaska Native'] <- '2944507'

#census4bind$enrollment[census4bind$category == 'Asian'] <- '18321377'

#census4bind$enrollment[census4bind$category == 'Native Hawaiian / Pacific Islander'] <- '654334.9'

#census4bind$enrollment[census4bind$category == 'Hispanic'] <- '58890139'

c3c$enrollment_percentage <- as.numeric(as.character(c3c$enrollment_percentage))

#str(c3c)
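
The six assignments above could also be written as a single lookup. The sketch below shows the idea with a named vector in base R; `census_pct` and `toy` are hypothetical names, and `toy` is a small stand-in for c3c rather than the real data.

```r
# Census percentages copied from the assignments above
census_pct <- c(
  "White" = 72,
  "Black" = 12.7,
  "American Indian / Alaska Native" = 0.9,
  "Asian" = 5.6,
  "Native Hawaiian / Pacific Islander" = 0.2,
  "Hispanic" = 18
)

# Hypothetical stand-in for c3c with a few category rows
toy <- data.frame(
  category = c("Asian", "White", "Hispanic"),
  enrollment_percentage = c(4.5, 60, 20),
  stringsAsFactors = FALSE
)

# Index the named vector by category to fill every row in one step
toy$enrollment_percentage <- unname(census_pct[toy$category])

toy
```

Because the values stay numeric throughout, no later as.numeric(as.character()) conversion would be needed with this approach.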

Merge of shortened datasets

c3bind <-rbind(c3c, c3b, c3a )
#str(c3bind)

This was successful. There are now approximately 500 entries instead of 4000.

We determined population figures from the percentages for each group in the U.S. census. These were ultimately not used because they were far out of proportion with the enrollment figures in the dataset. Hence we used the percentages instead.

327167439*0.72 # white population
## [1] 235560556
327167439*0.127 # black population
## [1] 41550265
327167439*0.009 # American Indian / Alaska Native population
## [1] 2944507
327167439*0.056 # Asian population
## [1] 18321377
327167439*0.002 # Native Hawaiian / Pacific Islander population
## [1] 654334.9
327167439*0.18 # Hispanic population
## [1] 58890139

Visualization of Percent Enrollment with Census, For-Profit and Non-Profit Data

ggplot(data=c3bind, aes(x=category, y=enrollment_percentage, fill=profit_status)) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("For-Profit/Non-Profit Racial & Ethnic Percent Enrollment Comparisons") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()+
   scale_fill_manual(values=c("#CC6666","#9999CC", "#66CC99"))

The above plot works. Let’s see if we can now create a clearer interactive plot with information from each school based off of the shorter combined for-profit and non-profit datasets.

c4 %>%
  ggplot(aes(category, enrollment_percentage, fill=profit_status, text = paste("Percent:", enrollment_percentage, '</br>', '</br>School Name:', name))) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("For-Profit/Non-Profit Racial & Ethnic Enrollment Percentage Comparisons") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()

ggplotly(tooltip = "text")

So, shortening the dataset did not help us much in this instance. Let us try a facet wrap.

c4 %>%
  ggplot(aes(category, enrollment_percentage, fill=profit_status, text = paste("Percent:", enrollment_percentage, '</br>', '</br>School Name:', name))) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("For-Profit/Non-Profit Racial & Ethnic Comparisons") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()+
   facet_wrap(~ category)

ggplotly(tooltip = "text")
## Warning: `group_by_()` is deprecated as of dplyr 0.7.0.
## Please use `group_by()` instead.
## See vignette('programming') for more help
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.

That is not too clear either. How about a facet wrap with points or bars rather than interactive individual information?

library(RColorBrewer)

ggplot(c4, aes(reorder(profit_status, enrollment_percentage), enrollment_percentage, col = profit_status)) +
  geom_point(stat = 'identity') +
  facet_wrap(~ category)+
  theme_bw()

library(RColorBrewer)

ggplot(c4, aes(reorder(profit_status, enrollment_percentage), enrollment_percentage, fill = profit_status)) +
  geom_bar(stat = 'identity') +
  facet_wrap(~ category)+
  theme_bw()+
  scale_fill_brewer()

That is much more effective. However we are missing the census information for proportional comparisons. So we reproduced the above with the census information added. First, with the larger combined dataset of nearly 4000 rows.

ggplot(bindwcensus, aes(reorder(profit_status, enrollment_percentage), enrollment_percentage, fill = profit_status)) +
  geom_bar(stat = 'identity') +
  facet_wrap(~ category)+
  theme_bw()+
  scale_fill_brewer()

Then with the shorter dataset of 550 rows.

ggplot(c3bind, aes(reorder(profit_status, enrollment_percentage), enrollment_percentage, fill = profit_status)) +
  geom_bar(stat = 'identity') +
  facet_wrap(~ category)+
  theme_bw()+
  scale_fill_brewer()

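The two datasets produce different percentages. That is interesting, and it is likely because geom_bar(stat = 'identity') stacks every row by default, so the height of each bar depends on how many rows the dataset contains.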

Isolated Plots based on Race/Ethnicity

Now, what if we wanted to examine individual school information? The larger interactive plots contained too much information. We tried to simplify it with a facet wrap, but again, it was too much information. So perhaps we can isolate each race/ethnicity into a single plot. It may expose some new information.

c4 %>%
  filter(category =="Asian") %>% 
  ggplot(aes(x=category, y=enrollment_percentage, fill=profit_status, text = paste("Percent:", enrollment_percentage, '</br>', '</br>School Name:', name))) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("Percent Enrollment of Asian Students") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()

ggplotly(tooltip = "text")

This is useful for examining the racial makeup of individual schools. The outliers seem to be explained primarily by geographic location, which would have to be examined further. Overall, though, Asian students seem to be enrolled in for-profits and non-profits somewhat equally.

c4 %>%
  filter(category =="Native Hawaiian / Pacific Islander") %>% 
  ggplot(aes(x=reorder(category, -enrollment_percentage), y=enrollment_percentage, fill=profit_status, text = paste("Percent:", enrollment_percentage, '</br>', '</br>School Name:', name))) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("Percent Enrollment of Native Hawaiian / Pacific Islander") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()

ggplotly(tooltip = "text")
#x = reorder(f.name, -age)

This visual shows a slightly greater proportion of Native Hawaiian and Pacific Islander students at for-profit institutions than at non-profit institutions. Calculations will have to be made to determine any significance.

c4 %>%
  filter(category =="Hispanic") %>% 
  ggplot(aes(reorder(category, enrollment_percentage), y=enrollment_percentage, fill=profit_status, text = paste("Percent:", enrollment_percentage, '</br>', '</br>School Name:', name))) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("Percent Enrollment of Latinx Students") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()

ggplotly(tooltip = "text")

This visual shows a slightly greater proportion of Latinx students at for-profit institutions rather than non-profit institutions. Again calculations will have to be made to determine any significance.

c4 %>%
  filter(category =="American Indian / Alaska Native") %>% 
  ggplot(aes(x=category, y=enrollment_percentage, fill=profit_status, text = paste("Percent:", enrollment_percentage, '</br>', '</br>School Name:', name))) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("Percent Enrollment of American Indian / Alaska Native Students") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()

ggplotly(tooltip = "text")

This visual shows a slightly greater proportion of American Indian and Alaskan Native students at for-profit institutions rather than non-profit institutions. Calculations will have to be made to determine any significance.
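
One way the significance question could be approached is a two-sample test on the per-school percentages for a single category. The sketch below uses a Wilcoxon rank-sum test from base R as one possible choice, not a method used in this report; the two vectors are made-up values, not figures from our datasets.

```r
# Hypothetical per-school percentages for one category (values are made up)
for_profit <- c(1.2, 1.0, 2.5, 1.9, 3.1, 0.6, 1.4)
non_profit <- c(0.7, 0.5, 1.1, 0.9, 1.6, 0.4, 0.8)

# Wilcoxon rank-sum test: does not assume normally distributed percentages
result <- wilcox.test(for_profit, non_profit)

# A small p-value would suggest the two groups differ
result$p.value
```

With the real data, the two vectors would come from filtering c4 to one category and splitting on profit_status. A rank-based test is suggested here because enrollment percentages are often skewed, but a t-test or a regression model could be equally reasonable depending on the data.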

c4 %>%
  filter(category =="Black") %>% 
  ggplot(aes(x=reorder(category, -enrollment_percentage), y=enrollment_percentage, fill=profit_status, text = paste("Percent:", enrollment_percentage, '</br>', '</br>School Name:', name))) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("Percent Enrollment of Black Students") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()

ggplotly(tooltip = "text")
#x = reorder(f.name, -age)

This plot is very interesting. It provides quite a striking visual of how much greater the proportion of Black students is at for-profit institutions compared to non-profit institutions. This difference certainly merits further exploration.

c4 %>%
  filter(category =="White") %>% 
  ggplot(aes(reorder(category, enrollment_percentage), y=enrollment_percentage, fill=profit_status, text = paste("Percent:", enrollment_percentage, '</br>', '</br>School Name:', name))) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("Percent Enrollment of White Students") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()

ggplotly(tooltip = "text")

This plot shows a fairly even representation of white people at for-profit and non-profit institutions.

Comparisons of People of Color and White People at For-Profit and Non-Profit Institutions

With these individual breakdowns we thought it would be interesting to see what the overall breakdown would be between aggregated people of color as a category and white people as a category, particularly because of the vastly higher number of white people in the U.S. population compared with each individual group of people of color.
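
One way such an aggregate could be built is to add up the non-White percentages within each school, so that the combined share stays comparable to the White share. The base R sketch below illustrates that idea only; `toy` and its values are hypothetical, and it differs from the code that follows, which takes a mean across the categories.

```r
# Hypothetical long-format rows for two schools (values are made up)
toy <- data.frame(
  name = rep(c("School A", "School B"), each = 3),
  category = rep(c("White", "Black", "Hispanic"), times = 2),
  enrollment_percentage = c(50, 30, 15, 70, 10, 12)
)

# Label every non-White category as "People of Color"
toy$group <- ifelse(toy$category == "White", "White", "People of Color")

# Sum the percentages within each school and group
poc <- aggregate(enrollment_percentage ~ name + group, data = toy, FUN = sum)

poc
```

Whether a sum or a mean is the better summary depends on whether the categories overlap in the source data, which would need to be checked before relying on either.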

# creating a set with only categories for people of color 
c5a <- c4 %>% 
  filter(category != "White")

# removing individual categories
c5b <- c5a[,-2]
# creating a new dataset with values averaged across the categories for overall people of color
c5c <- c5b %>%
    group_by(name, profit_status) %>% 
    summarise(across(everything(), mean))

# adding a column to distinguish POC
c5c <- cbind(c5c, category = "People of Color")
# creating a set of all white categories
c5d <- c4 %>% 
  filter(category == "White")
# joining the overall people of color dataset with the white people dataset for comparison
c6 <- rbind(c5c, c5d)
ggplot(data=c6, aes(x=category, y=enrollment, fill=profit_status, text = paste("Enrollment:", enrollment, '</br>', '</br>School Name:', name))) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("For-Profit/Non-Profit Racial & Ethnic Total Enrollment Comparisons") +
  labs(y = "Enrollment", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()

ggplotly(tooltip = "text")
ggplot(data=c6, aes(x=category, y=enrollment_percentage, fill=profit_status, text = paste("Enrollment Percentage:", enrollment_percentage, '</br>', '</br>School Name:', name))) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("For-Profit/Non-Profit Racial & Ethnic Enrollment Percentage Comparisons") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()

ggplotly(tooltip = "text")
ggplot(data=c6, aes(x=category, y=enrollment_percentage, fill=profit_status)) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("For-Profit/Non-Profit Racial & Ethnic Percent Enrollment Comparisons") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()+
   scale_fill_manual(values=c("#9999CC", "#66CC99"))

ggplot(data=c6, aes(x=category, y=enrollment_percentage, fill=profit_status)) +
  geom_bar(stat="identity") +
   ggtitle("For-Profit/Non-Profit Racial & Ethnic Percent Enrollment Comparisons") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete()+
  theme_minimal()+
   scale_fill_manual(values=c("#9999CC", "#66CC99"))

ggplot(c6, aes(reorder(category, enrollment_percentage), enrollment_percentage, fill = category, label = scales::percent(enrollment_percentage, scale = 1))) +
  geom_bar(stat = 'identity') +
   ggtitle("For-Profit/Non-Profit Racial & Ethnic Percent Enrollment Comparisons") +
   geom_text(position = position_dodge(width = .5),    # move to center of bars
              vjust = -0.5,    # nudge above top of bar
              size = 2) + 
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity") +
  scale_fill_manual(values=c("#9999CC", "#66CC99")) +
  theme_light()+
  facet_wrap(~ profit_status) 

ggplot(c6, aes(reorder(category, enrollment_percentage), enrollment_percentage, fill = category)) +
  geom_bar(stat = 'identity') +
   ggtitle("For-Profit/Non-Profit Racial & Ethnic Percent Enrollment Comparisons") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity") +
  scale_fill_manual(values=c("#9999CC", "#66CC99")) +
  theme_light()+
  facet_wrap(~ profit_status) 

One factor we did not determine is why the Enrollment Percentage y-axis scale in the above plots was on a 2000 scale instead of a 100% scale.
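
A likely explanation, stated here as an assumption that has not been checked against the data: geom_bar(stat = 'identity') stacks rows by default, so when the data has one row per school each bar shows the sum of all the schools' percentages rather than a single percentage. The sketch below uses hypothetical values to show the difference between the stacked height and a per-group mean.

```r
# Hypothetical per-school percentages, three schools per category
toy <- data.frame(
  category = rep(c("People of Color", "White"), each = 3),
  enrollment_percentage = c(45, 60, 38, 52, 71, 40)
)

# Stacked bars add every row together, so the bar height is the sum
stacked_height <- tapply(toy$enrollment_percentage, toy$category, sum)

# Averaging first keeps each value on a 0 to 100 scale
mean_height <- tapply(toy$enrollment_percentage, toy$category, mean)

stacked_height
mean_height
```

With a few dozen schools per group, sums in the thousands would be expected. Summarising to one mean per category and profit status before plotting would keep the y-axis within 0 to 100.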

Community Colleges versus 4 year schools

Creating a Dataset of 35 Randomly Sampled Community Colleges

#create new dataset for only Community Colleges
comm_coll <- diversity_new5 %>% 
 filter(grepl('Community',diversity_new5$name)) %>% 
  mutate_if(is.numeric, round, digits = 2)

#Randomly Select 35 community colleges
comm_collsample <- sample_n(comm_coll,35)

comm_collsample
## # A tibble: 35 x 6
##    name         total_enrollment state   category    enrollment enrollment_perc~
##    <chr>                   <dbl> <chr>   <chr>            <dbl>            <dbl>
##  1 Oklahoma Ci~            13444 Oklaho~ American I~        565             4.2 
##  2 Nash Commun~             3537 North ~ Native Haw~          6             0.17
##  3 Mesalands C~              793 New Me~ Hispanic           352            44.4 
##  4 Rockland Co~             7520 New Yo~ Asian              343             4.56
##  5 Cankdeska C~              185 North ~ Asian                0             0   
##  6 Bluegrass C~            10952 Kentuc~ Black             1321            12.1 
##  7 Virginia Hi~             2505 Virgin~ American I~          4             0.16
##  8 University ~             1995 Arkans~ Asian               14             0.7 
##  9 Mayland Com~              973 North ~ Asian                4             0.41
## 10 Cerro Coso ~             4731 Califo~ Hispanic          1824            38.6 
## # ... with 25 more rows
#combine multiple locations into one name for school
comm_collshort <- comm_collsample[,-3] %>% 
  group_by(name, category) %>% 
  summarise(across(everything(), mean))
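
Because the data is in long format, with one row per school and category, sampling 35 rows returns 35 school-category pairs rather than 35 complete schools. A possible alternative is to sample school names and then keep every row for the chosen schools. The base R sketch below illustrates this with hypothetical data; `toy`, `picked` and `school_sample` are made-up names.

```r
# Hypothetical long-format data: 4 schools with 3 category rows each
toy <- data.frame(
  name = rep(paste("Community College", 1:4), each = 3),
  category = rep(c("White", "Black", "Asian"), times = 4),
  enrollment_percentage = c(60, 20, 5, 50, 30, 7, 70, 10, 3, 40, 35, 9)
)

# Fix the seed so the same schools are drawn on every run
set.seed(1)

# Sample school names, not rows, then keep every row for those schools
picked <- sample(unique(toy$name), 2)
school_sample <- toy[toy$name %in% picked, ]

school_sample
```

Each sampled school then contributes all of its categories, so comparisons across race and ethnicity draw on the same set of schools.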

Adding Type Column

comm_collshort2 <- cbind(comm_collshort, type = "Comm. College") 

Creating a Dataset of 35 Randomly Sampled Universities

#create new dataset for 4 year Universities
uni <- diversity_new5 %>% 
 filter(grepl('University',diversity_new5$name)) %>% 
  mutate_if(is.numeric, round, digits = 2)

#Randomly Select 35 universities
set.seed(1)
uni_sample <- sample_n(uni,35)

uni_sample
## # A tibble: 35 x 6
##    name         total_enrollment state   category    enrollment enrollment_perc~
##    <chr>                   <dbl> <chr>   <chr>            <dbl>            <dbl>
##  1 University ~            17866 Califo~ Black              339             1.9 
##  2 Southwest U~              202 New Me~ White               78            38.6 
##  3 City Univer~             2545 Washin~ Native Haw~         10             0.39
##  4 Campbellsvi~             3427 Kentuc~ White             2648            77.3 
##  5 University ~            14906 North ~ Native Haw~         15             0.1 
##  6 William Pat~            11048 New Je~ Native Haw~         25             0.23
##  7 University ~              122 New Yo~ Black               30            24.6 
##  8 Bowling Gre~            16554 Ohio    American I~         34             0.21
##  9 Georgia Sou~            20517 Georgia Asian              280             1.36
## 10 South Unive~              249 Texas   Black               52            20.9 
## # ... with 25 more rows
#combine multiple locations into one name for school
uni_sampleshort <- uni_sample[,-3] %>% 
  group_by(name, category) %>% 
  summarise(across(everything(), mean))
# Adding Type Column
uni_sampleshort2 <- cbind(uni_sampleshort, type = "Four Year") 
#Creating Census set to bind
census4unicc <- cbind(uni_sampleshort, type = "Census")
# census percentages

census4unicc$enrollment_percentage[census4unicc$category == 'White'] <- as.numeric(72)

census4unicc$enrollment_percentage[census4unicc$category == 'Black'] <- as.numeric(12.7)

census4unicc$enrollment_percentage[census4unicc$category == 'American Indian / Alaska Native'] <- as.numeric(0.9)

census4unicc$enrollment_percentage[census4unicc$category == 'Asian'] <- as.numeric(5.6)

census4unicc$enrollment_percentage[census4unicc$category == 'Native Hawaiian / Pacific Islander'] <- as.numeric(0.2)

census4unicc$enrollment_percentage[census4unicc$category == 'Hispanic'] <- as.numeric(18)


census4unicc$enrollment_percentage <- as.numeric(as.character(census4unicc$enrollment_percentage))

Combined Community College and University Dataset

#Combined Community College and Four Year
uni_commcoll <- rbind(comm_collshort2, uni_sampleshort2)

# Combined Community College, Four Year, and Census
uni_comm_census <- rbind(comm_collshort2, uni_sampleshort2, census4unicc)
bp <- ggplot(data = uni_commcoll, aes(x=category, y=enrollment_percentage)) + 
             geom_boxplot(aes(fill=type))+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))
bp + facet_wrap( ~ category)

ggplot(data=uni_commcoll, aes(x=category, y=enrollment_percentage, fill=type)) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("Community College/Four-Year Racial & Ethnic Percent Enrollment Comparisons") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()+
   scale_fill_manual(values=c("#9999CC", "#66CC99"))

ggplot(data=uni_comm_census, aes(x=category, y=enrollment_percentage, fill=type)) +
geom_bar(stat="identity", position=position_dodge()) +
  ggtitle("Community College/Four-Year Racial & Ethnic Percent Enrollment Comparisons") +
  labs(y = "Enrollment Percentage", x = "Race/Ethnicity")+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  theme_minimal()+
   scale_fill_manual(values=c("#CC6666","#9999CC", "#66CC99"))

ggplot(uni_commcoll, aes(reorder(type, enrollment_percentage), enrollment_percentage, fill = type)) +
  geom_bar(stat = 'identity') +
  facet_wrap(~ category)+
  theme_bw()+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  scale_fill_brewer()

ggplot(uni_comm_census, aes(reorder(type, enrollment_percentage), enrollment_percentage, fill = type)) +
  geom_bar(stat = 'identity') +
  facet_wrap(~ category)+
  theme_bw()+
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))+
  scale_fill_brewer()

salary <- read_csv("salary_potential.csv")
## Parsed with column specification:
## cols(
##   rank = col_double(),
##   name = col_character(),
##   state_name = col_character(),
##   early_career_pay = col_double(),
##   mid_career_pay = col_double(),
##   make_world_better_percent = col_double(),
##   stem_percent = col_double()
## )
salary_added <- left_join(diversity,salary,by="name")
salary_added_final <- na.omit(salary_added)
tuition <- read_csv("tuition_cost.csv")
## Parsed with column specification:
## cols(
##   name = col_character(),
##   state = col_character(),
##   state_code = col_character(),
##   type = col_character(),
##   degree_length = col_character(),
##   room_and_board = col_double(),
##   in_state_tuition = col_double(),
##   in_state_total = col_double(),
##   out_of_state_tuition = col_double(),
##   out_of_state_total = col_double()
## )
tuition_added <- left_join(tuition,salary_added_final,by="name")
tuition_added_final <- na.omit(tuition_added)
diversity <- read_csv("diversity2.csv")
## Parsed with column specification:
## cols(
##   name = col_character(),
##   total_enrollment = col_double(),
##   state = col_character(),
##   category = col_character(),
##   enrollment = col_double()
## )
monsterfinal <- tuition_added_final 
ptc1<- monsterfinal %>%
  ggplot()+
  geom_point(aes(x=in_state_total, y=mid_career_pay))+
  stat_smooth(aes(x=in_state_total, y=mid_career_pay), method = 'lm', se = FALSE, lwd = 0.5, col="red")+
  ggtitle("In-State Total Cost and Mid-Career Salary")+
  labs(x="In-State Total Cost (per year)", y="Mid-Career Salary")+
  scale_fill_brewer()+
  theme_minimal()+
  theme(plot.title = element_text(hjust = 0.5))

ptc1
## `geom_smooth()` using formula 'y ~ x'
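
The trend line in the plot could be backed by numbers. The sketch below shows one way to do that in base R with a correlation and a linear model; the data frame `toy` holds made-up values and only borrows the column names in_state_total and mid_career_pay from monsterfinal.

```r
# Hypothetical cost and salary values (made up for illustration)
toy <- data.frame(
  in_state_total = c(15000, 22000, 30000, 41000, 52000, 60000),
  mid_career_pay = c(70000, 76000, 81000, 90000, 98000, 110000)
)

# Pearson correlation between yearly cost and mid-career pay
r <- cor(toy$in_state_total, toy$mid_career_pay)

# Linear model of the same form as the line from stat_smooth(method = 'lm')
fit <- lm(mid_career_pay ~ in_state_total, data = toy)

# Slope: estimated change in mid-career pay per extra dollar of yearly cost
slope <- coef(fit)[["in_state_total"]]

r
slope
```

Running summary(fit) on the real data would also report the standard error and p-value of the slope, which would indicate how much weight the visual trend can bear.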

Data Analysis

As the graphs above illustrate, our data suggest that there is a higher proportion of students of color at for-profit colleges than at non-profit colleges. White students (shown in pink in the above plots) have much greater attendance at non-profit colleges. These findings support the argument that students of color enroll at for-profit colleges at higher rates. One possible explanation is targeted marketing by for-profit colleges to students of color. Further research is required.

Given the well-documented high cost of for-profit education, the exorbitant student debt incurred at these schools, the lower rates of employment upon graduation and the low levels of satisfaction with educational quality reported by previous students, all potential students should think twice before enrolling in for-profit colleges. This is particularly true for students of color, who may be identified by for-profit colleges as susceptible to predatory, exploitative practices that provide additional profits to the college’s shareholders without making good on the promise of a marketable, quality education. The predatory actions of for-profit colleges contribute to our nation’s growing economic divide between the haves and the have-nots and perpetuate the unequal education system we have today. Fortunately, there are many people and organizations working to end these predatory practices, but while the practices continue, we must inform one another about the importance of obtaining a quality education at not-for-profit institutions, like Montgomery College.

Future Research

The following is a list of more detailed analyses that could be done with this dataset:

  1. Comparing diversity of for-profit and not-for-profit schools in the same geographic regions utilizing local racial demographics.

  2. Comparing community colleges, four-year institutions and for-profit institutions.

  3. Comparing the tuition rates, salary potential and overall profits received from community colleges, four-year institutions and for-profit institutions.

  4. A longitudinal study could be conducted to see if there have been changes in the demographic makeup of for-profit colleges in the time before the Obama-era regulations, the lack of regulation during the DeVos era and the patterns that emerge after the predatory lending lawsuits and any resulting legislation.

References

Body, D. (2019, Mar. 19). Worse Off Than When They Enrolled: The Consequence of For-Profit Colleges for People of Color. The Aspen Institute. https://www.aspeninstitute.org/blog-posts/worse-off-than-when-they-enrolled-the-consequence-of-for-profit-colleges-for-people-of-color/

Bonadies, G.G., Rovenger, J., Connor, E., Shum, B. & Merrill, T. (2018, Jul. 30). For-Profit Schools’ Predatory Practices and Students of Color: A Mission to Enroll Rather than Educate. Harvard Law Review Blog. https://blog.harvardlawreview.org/for-profit-schools-predatory-practices-and-students-of-color-a-mission-to-enroll-rather-than-educate/

Conti, A. (2019, Sep. 10). How For-Profit Colleges Have Targeted and Taken Advantage of Black Students. Vice. https://www.vice.com/en_us/article/bjwj3d/how-for-profit-colleges-have-targeted-and-taken-advantage-of-black-students

Green, E.L. (2019, Jun. 28). DeVos Repeals Obama-Era Rule Cracking Down on For-Profit Colleges. New York Times. https://www.nytimes.com/2019/06/28/us/politics/betsy-devos-for-profit-colleges.html

Halperin, D. (2020). 22 States Sue DeVos to Overturn Anti-Student Rule. Republic Report. https://www.republicreport.org/2020/22-states-sue-devos-to-overturn-anti-student-rule/

Legal Services Center (2020). Project on Predatory Student Lending: Cases. Harvard Law School. https://predatorystudentlending.org/cases/

Lobosco, K. (2019, Jul. 23). For-profit college students are waiting 958 days for loan relief. CNN. https://www.cnn.com/2019/07/23/politics/betsy-devos-loan-forgiveness-for-profit-college-students/index.html

Lopez, M. (2015, Feb. 12). BEWARE: For-Profit Colleges. The Patriot Post. https://bcchspatriotpost.com/2391/news/beware-for-profit-colleges/

Redman, H. (2020, Jun. 27). AG Sues Department of Education Over For-Profit College Rules. Urban Milwaukee. https://urbanmilwaukee.com/2020/06/27/ag-sues-department-of-education-over-for-profit-college-rules/

TBS Staff (2019, Jul. 29). For-Profit Colleges vs. Non-Profit Colleges - What’s The Difference? The Best Schools Magazine. https://thebestschools.org/magazine/for-profit-vs-non-profit

Turner, C. (2019, Nov. 14). DeVos Refuses to Forgive Student Debt for Those Defrauded by For-Profit Colleges. All Things Considered, NPR. https://www.npr.org/2019/11/14/779465130/devos-refuses-to-forgive-student-debt-for-those-defrauded-by-for-profit-colleges

Voorhees, K. (2019, Oct. 17). Civil Rights Groups: For-Profit Colleges Exploit Black and Latino Students. The Leadership Conference Education Fund. https://civilrights.org/edfund/2019/10/17/civil-rights-groups-for-profit-colleges-exploit-black-and-latino-students/