Prospective students face many options for where to earn a graduate accounting degree. On the East Coast alone, well over a hundred universities offer one; the data set used here covers 150 schools. Adding to this complexity, students must also evaluate whether each program teaches the skills that employers are recruiting for. Some skills are technical and cover specific topics such as SQL, Python and statistics, while others are ‘soft’ skills like teamwork and collaboration.
We approach this project in three parts. Part I is the construction of a recommender system. Part II creates a mapping system for the recommended schools. Part III discusses some of the issues with obtaining the data through web scraping or copy and paste.
We will show how to build a recommender system using tidymodels by first loading pre-processed college metadata and sought-after data science skills. After classifying each program by whether it offers sufficient data science training, we create a sample set and build the recommender system.
The recommender system at the end will be able to categorize any other accounting program as having a good data science program or not.
# Libraries
library(tidyverse)
## Warning: package 'tidyr' was built under R version 4.1.2
library(tidytext)
## Warning: package 'tidytext' was built under R version 4.1.2
# For tidymodels
library(DiceDesign)
library(tidymodels)
## Warning: package 'tidymodels' was built under R version 4.1.2
## Warning: package 'rsample' was built under R version 4.1.2
library(workflows)
library(tune)
library(mlbench)
## Warning: package 'mlbench' was built under R version 4.1.2
library(rsample)
library(recipes)
library(parsnip)
library(yardstick)
library(tm)
## Warning: package 'tm' was built under R version 4.1.2
# For mapping
library(sf)
## Warning: package 'sf' was built under R version 4.1.2
library(leaflet)
library(htmltools)
Building on the prior work by Team Four, we load three data frames:
* Graduate Accounting Programs on the east coast
* Dictionary of desired technical skills by employers
* Dictionary of desired soft skills by employers
accounting_programs <- read_csv("https://github.com/cliftonleesps/607_final_project/blob/master/Acct_Curricula2.csv?raw=true", show_col_types = FALSE)
technical_skills <- read_csv("https://github.com/cliftonleesps/607_final_project/raw/master/technical_skills.csv", show_col_types = FALSE)
soft_skills <- read_csv("https://github.com/cliftonleesps/607_final_project/raw/master/soft_skills.csv", show_col_types = FALSE)
# Geocoding Schools from Kratika Patel
library(sf)
library(tidyverse)
url1 <- "https://raw.githubusercontent.com/cliftonleesps/607_final_project/master/Acct_Curricula2.csv"
AcctCurricula <- data.frame(read.csv(url1))
col <- colnames(AcctCurricula)
col <- toupper(col)
col[1] <- "NAME"
colnames(AcctCurricula) <- col
Names <- AcctCurricula %>% select("NAME")
Names <- data.frame(NAME = unique(Names$NAME))
url2 <- "https://raw.githubusercontent.com/cliftonleesps/607_final_project/master/EDGE_GEOCODE_POSTSECSCH_2021.csv"
schools <- data.frame(read.csv(url2))
col <- colnames(schools)
col[1] <- "UNITID"
colnames(schools) <- col
#head(schools)
SchoolGeo <- schools %>%
filter(NAME %in% Names$NAME)
#Correct typos and clean names of Universities not detected in schools dataframe
Names %>%
filter(!(NAME %in% schools$NAME))
## NAME
## 1 Fitchberg State University
## 2 Pennsylvania State University
## 3 Saint Joseph's University\n
## 4 Strayer University - Delaware
## 5 Strayer University-North Carolina (online, for-profit)
## 6 University of Massachussetts - Amherst
## 7 University of Massachussetts - Dartmouth
## 8 University of North Carolina Chapel Hill
Names$NAME[Names$NAME == "Fitchberg State University"] <- "Fitchburg State University"
Names$NAME[Names$NAME == "Saint Joseph's University\n"] <- "Saint Joseph's University"
Names$NAME[Names$NAME == "Pennsylvania State University"] <- "Pennsylvania State University-Penn State Harrisburg"
Names$NAME[Names$NAME == "Strayer University - Delaware"] <- "Strayer University-Delaware"
Names$NAME[Names$NAME == "Strayer University-North Carolina (online, for-profit)"] <- "Strayer University-North Carolina"
Names$NAME[Names$NAME == "University of Massachussetts - Amherst"] <- "University of Massachusetts-Amherst"
Names$NAME[Names$NAME == "University of Massachussetts - Dartmouth"] <- "University of Massachusetts-Dartmouth"
Names$NAME[Names$NAME == "University of North Carolina Chapel Hill"] <- "University of North Carolina at Chapel Hill"
SchoolGeo <- schools %>%
filter(NAME %in% Names$NAME)
s <- schools %>% filter(NAME== "University of Connecticut")
SchoolGeo <- add_row(SchoolGeo, s)
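# Restore the original spellings so the later join on school name still matches.
# Caution: the row numbers below are hard-coded for one specific row order of SchoolGeo.
# The glimpse() output further down shows row 1 (UNITID 129020, Storrs, CT) carrying the
# name "Pennsylvania State University", so verify each index before reusing this code.
SchoolGeo[39,2] <- "Fitchberg State University"
SchoolGeo[130,2] <- "Saint Joseph's University\n"
SchoolGeo[1,2] <- "Pennsylvania State University"
SchoolGeo[142,2] <- "Strayer University - Delaware"
SchoolGeo[143,2] <- "Strayer University-North Carolina (online, for-profit)"
SchoolGeo[41,2] <- "University of Massachussetts - Amherst"
SchoolGeo[45,2] <- "University of Massachussetts - Dartmouth"
SchoolGeo[113,2] <- "University of North Carolina Chapel Hill"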
#Remove Duplicate row for Pennsylvania State University-Penn State Harrisburg
SchoolGeo <- SchoolGeo[!(SchoolGeo$UNITID == 49576722),]
#glimpse(SchoolGeo)
# subset(SchoolGeo, NAME == "Ramapo College of New Jersey")
#
# ?inner_join
#
# t <- right_join(SchoolGeo, temp_schools, by = c("NAME"= "name"))
# subset(t, NAME == "Ramapo College of New Jersey")
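Because positional assignments depend on the exact row order of the data frame, one alternative is to relabel rows by name. The sketch below is only an illustration under stated assumptions: `geo_toy` is a made-up stand-in for `SchoolGeo`, its `UNITID` values are invented, and `original_names` is a hypothetical lookup vector that is not part of the original analysis.

```r
# Toy stand-in for SchoolGeo; the UNITID values are made up for illustration
geo_toy <- data.frame(
  UNITID = c(1, 2, 3),
  NAME = c("Fitchburg State University",
           "University of Connecticut",
           "University of Massachusetts-Amherst"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: official name -> spelling used in the curricula file
original_names <- c(
  "Fitchburg State University" = "Fitchberg State University",
  "University of Massachusetts-Amherst" = "University of Massachussetts - Amherst"
)

# Relabel only the rows whose NAME appears in the lookup, wherever they sit
hit <- geo_toy$NAME %in% names(original_names)
geo_toy$NAME[hit] <- unname(original_names[geo_toy$NAME[hit]])

geo_toy$NAME
```

Rows that are not in the lookup, such as the University of Connecticut here, keep their name regardless of where they appear in the data frame.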
The collegiate accounting data in its native form requires a little tidying. Each row is an observation of a course and its curriculum description. We’ll combine each school's descriptions into a vector of words and join it with a vector of technical skills and a vector of soft skills. If there are any matches, the match_technical_skills or match_soft_skills attribute is set from zero to one.
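As a toy illustration of the matching idea, the sketch below flags one made-up course description against two small dictionaries. The names `technical_toy`, `soft_toy` and `description_toy` are assumptions for this example only; the real dictionaries are the CSV files loaded earlier.

```r
# Hypothetical skill dictionaries; the real ones are loaded from CSV files
technical_toy <- c("sql", "python", "statistics")
soft_toy <- c("collaboration", "communication")

# One made-up course description
description_toy <- "Students use SQL and statistics to analyze audit data"

# Lower-case, strip punctuation, and split into unique words
words <- unique(strsplit(gsub("[[:punct:]]", "", tolower(description_toy)), " +")[[1]])

# A flag is 1 when at least one word appears in the dictionary, else 0
match_technical <- as.integer(any(words %in% technical_toy))
match_soft <- as.integer(any(words %in% soft_toy))

c(technical = match_technical, soft = match_soft)
```

Here the description mentions two technical skills and no soft skills, so only the technical flag is set.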
# initialize some counters
current_school <- accounting_programs$School[1]
description <- accounting_programs$Description[1]
# temp_schools is where we keep our tidy data
temp_schools <- tibble(
name = "",
description = "",
match_technical_skills = 0,
match_soft_skills = 0
)
# Iterate through the accounting programs
# Since a college appears on more than one row, we have to aggregate all of the course descriptions grouping
# by college name
for (row in 2:nrow(accounting_programs)) {
  # if we detect a different school name, then save the data to the tibble
  if (current_school != accounting_programs$School[row]) {
    temp_schools <- temp_schools %>%
      add_row(
        name = current_school,
        # save only the text accumulated so far; the current row already
        # belongs to the next school
        description = description,
        match_technical_skills = 0,
        match_soft_skills = 0
      )
    description <- accounting_programs$Description[row]
    current_school <- accounting_programs$School[row]
  } else if (!is.na(accounting_programs$Description[row])) {
    # Just keep pasting the description for later, separated by a space
    # so that words from adjacent descriptions do not fuse together
    description <- paste(description, accounting_programs$Description[row])
  }
}
# Add the last school to the tibble; its final row was already handled in the loop
temp_schools <- temp_schools %>%
  add_row(
    name = current_school,
    description = description,
    match_technical_skills = 0,
    match_soft_skills = 0
  )
# delete the first row
nrow(temp_schools)
## [1] 151
temp_schools <- temp_schools[-1,]
nrow(temp_schools)
## [1] 150
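The same aggregation can also be written as a grouped summary. This is a minimal sketch on made-up rows, not the code used to produce the results in this document; `programs_toy` and `schools_toy` are hypothetical names.

```r
library(dplyr)

# Made-up rows in the same shape as the curricula file (one row per course)
programs_toy <- tibble(
  School = c("College A", "College A", "College B", "College B"),
  Description = c("Intro to auditing.", "Data analytics with SQL.", NA, "Tax research.")
)

# Collapse all non-missing descriptions of a school into one string
schools_toy <- programs_toy %>%
  group_by(School) %>%
  summarise(
    description = paste(na.omit(Description), collapse = " "),
    .groups = "drop"
  )

schools_toy
```

Unlike a row-by-row loop, the grouped version does not require the rows of a school to be contiguous, and it needs no placeholder first row. Note that `summarise()` returns the schools sorted by name rather than in file order.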
# Function to clean a description and remove duplicate words, to be used in the next for loop
rem_dup_word <- function(x){
  x <- tolower(x)
  x <- gsub("-", " ", x)
  x <- gsub("/", " ", x)
  x <- gsub("[[:punct:]]", "", x)
  x <- gsub("[[:digit:]]", "", x)
  x <- gsub("this course", "", x)
  x <- gsub("topics include", "", x)
  # split into words, drop stop words, then keep each remaining word once
  words <- tibble(word = unlist(strsplit(x, split = " ", fixed = FALSE, perl = TRUE))) %>%
    anti_join(stop_words, by = "word") %>%
    pull(word)
  return(paste(unique(trimws(words)), collapse = " "))
}
# now iterate through the schools and split the descriptions
for (count in 1:nrow(temp_schools)) {
  # get the current row
  ts <- temp_schools[count,]
  # Obtain the school name
  school_name <- ts$name
  # Use the rem_dup_word function on the course description
  description_string <- rem_dup_word(ts$description)
  # Make each word in the `description_string` character vector a row element
  # in a `school_descriptions` dataframe with a single `word` column
  school_descriptions <- data.frame(word = str_split(description_string, " ")[[1]])
  # now join with the technical skills
  # If any words match the vector of technical skills then we
  # set match_technical_skills = 1
  technical_skill_match <- inner_join(technical_skills, school_descriptions, by = "word")
  if (nrow(technical_skill_match) > 0) {
    #print (school_name)
    temp_schools$match_technical_skills[count] <- 1
  }
  # now join with the soft skills
  # If any words match the vector of soft skills, we
  # set match_soft_skills = 1
  soft_skills_match <- inner_join(soft_skills, school_descriptions, by = "word")
  if (nrow(soft_skills_match) > 0) {
    #print (school_name)
    temp_schools$match_soft_skills[count] <- 1
  }
}
# create a new column school_score = match_technical_skills + match_soft_skills
temp_schools <- temp_schools %>% mutate(school_score = match_technical_skills + match_soft_skills)
# create another new column good_data_science_program = [YES,NO];
# a program is "YES" only when it matches both a technical and a soft skill
temp_schools <- temp_schools %>% mutate(good_data_science_program = ifelse(school_score >= 2, "YES", "NO"))
# drop the description column since it takes a lot of
# memory
temp_schools <- subset(temp_schools, select = -description)
ncol(temp_schools)
## [1] 5
# now join with SchoolGeo so we get the latitude and longitude
temp_schools <- right_join(SchoolGeo, temp_schools, by = c("NAME"= "name"))
#temp_schools <- inner_join(SchoolGeo, temp_schools, by = c("NAME"= "name"))
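Before relying on the joined coordinates, it can help to check which scored schools have no geocode match. The following is a sketch on made-up data; `geo_toy` and `scores_toy` are hypothetical stand-ins for `SchoolGeo` and `temp_schools`, and the coordinates are invented.

```r
library(dplyr)

# Made-up geocode table and scored schools
geo_toy <- tibble(
  NAME = c("College A", "College B"),
  LAT = c(40.1, 41.2),
  LON = c(-74.0, -73.5)
)
scores_toy <- tibble(
  name = c("College A", "College B", "College C"),
  school_score = c(2, 1, 2)
)

# Scored schools that have no row in the geocode table
unmatched <- anti_join(scores_toy, geo_toy, by = c("name" = "NAME"))

# right_join keeps every scored school; unmatched ones get NA coordinates
joined <- right_join(geo_toy, scores_toy, by = c("NAME" = "name"))

unmatched$name
```

An empty `unmatched` table would indicate that every scored school received coordinates; any name listed there would show up with `NA` latitude and longitude after the right join.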
# Start building the model
set.seed(4393003)
sample_size <- 100
glimpse(temp_schools)
## Rows: 150
## Columns: 27
## $ UNITID <int> 129020, 129215, 129242, 130253, 130943, 1309~
## $ NAME <chr> "Pennsylvania State University", "Eastern Co~
## $ STREET <chr> "352 Mansfield Road", "83 Windham St", "1073~
## $ CITY <chr> "Storrs", "Willimantic", "Fairfield", "Fairf~
## $ STATE <chr> "CT", "CT", "CT", "CT", "DE", "DE", "DE", "F~
## $ ZIP <chr> "06269", "06226", "06824-5195", "06825-1000"~
## $ STFIP <chr> "09", "09", "09", "09", "10", "10", "10", "1~
## $ CNTY <chr> "09013", "09015", "09001", "09001", "10003",~
## $ NMCNTY <chr> "Tolland County", "Windham County", "Fairfie~
## $ LOCALE <chr> "21", "31", "21", "21", "21", "21", "21", "2~
## $ LAT <dbl> 41.80910, 41.72167, 41.15767, 41.22089, 39.6~
## $ LON <dbl> -72.24995, -72.21875, -73.25590, -73.24333, ~
## $ CBSA <chr> "25540", "49340", "14860", "14860", "37980",~
## $ NMCBSA <chr> "Hartford-East Hartford-Middletown, CT", "Wo~
## $ CBSATYPE <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,~
## $ CSA <chr> "278", "148", "408", "408", "428", "428", "4~
## $ NMCSA <chr> "Hartford-East Hartford, CT", "Boston-Worces~
## $ NECTA <chr> "73450", "79300", "71950", "71950", "N", "N"~
## $ NMNECTA <chr> "Hartford-East Hartford-Middletown, CT", "Wi~
## $ CD <chr> "0902", "0902", "0904", "0904", "1000", "100~
## $ SLDL <chr> "09054", "09049", "09133", "09134", "10025",~
## $ SLDU <chr> "09029", "09029", "09028", "09028", "10008",~
## $ SCHOOLYEAR <chr> "2020-2021", "2020-2021", "2020-2021", "2020~
## $ match_technical_skills <dbl> 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,~
## $ match_soft_skills <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1,~
## $ school_score <dbl> 2, 1, 2, 2, 2, 2, 2, 2, 0, 1, 2, 2, 2, 2, 2,~
## $ good_data_science_program <chr> "YES", "NO", "YES", "YES", "YES", "YES", "YE~
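# Note: sample() on a data frame draws columns, not rows, so this call returns two
# randomly chosen columns (CITY and school_score in the output below) for all 150 schools.
# To draw two schools instead, use slice_sample(temp_schools, n = 2).
random_schools <- sample(temp_schools, size= 2, replace = FALSE)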
random_schools
## CITY school_score
## 1 Storrs 2
## 2 Willimantic 1
## 3 Fairfield 2
## 4 Fairfield 2
## 5 Newark 2
## 6 Wilmington 2
## 7 New Castle 2
## 8 Miami 2
## 9 Orlando 0
## 10 Boca Raton 1
## 11 Miami 2
## 12 Lakeland 2
## 13 Tallahassee 2
## 14 Gainesville 2
## 15 Coral Gables 2
## 16 Jacksonville 2
## 17 Tampa 2
## 18 Tampa 2
## 19 Babson Park 2
## 20 Pensacola 2
## 21 Orono 2
## 22 Standish 2
## 23 Portland 2
## 24 Waterville 2
## 25 Baltimore 2
## 26 Baltimore 0
## 27 Adelphi 2
## 28 College Park 2
## 29 Baltimore 2
## 30 Towson 2
## 31 Worcester 2
## 32 Wellesley 2
## 33 Longmeadow 2
## 34 Waltham 2
## 35 Chestnut Hill 2
## 36 Bridgewater 2
## 37 Worcester 2
## 38 Milton 2
## 39 Fitchburg 2
## 40 Lowell 2
## 41 Amherst 2
## 42 Boston 2
## 43 Dudley 2
## 44 Boston 2
## 45 North Dartmouth 2
## 46 Boston 2
## 47 Springfield 2
## 48 Westfield 2
## 49 Bloomfield 2
## 50 Caldwell 2
## 51 Teaneck 2
## 52 Madison 2
## 53 Jersey City 2
## 54 Union 2
## 55 West Long Branch 2
## 56 Montclair 2
## 57 Mahwah 2
## 58 Newark 2
## 59 Jersey City 2
## 60 South Orange 2
## 61 Garden City 2
## 62 Alfred 1
## 63 New York 2
## 64 Brooklyn 2
## 65 Staten Island 2
## 66 New York 2
## 67 Bronx 2
## 68 Queens 2
## 69 Bronx 2
## 70 Hempstead 2
## 71 New Rochelle 2
## 72 Ithaca 2
## 73 Syracuse 2
## 74 Brookville 2
## 75 Riverdale 1
## 76 Poughkeepsie 2
## 77 Dobbs Ferry 2
## 78 Rockville Centre 2
## 79 Bronx 2
## 80 Newburgh 2
## 81 Rochester 2
## 82 New York 2
## 83 Old Westbury 2
## 84 New York 2
## 85 Rochester 2
## 86 Saint Bonaventure 0
## 87 Brooklyn Heights 2
## 88 Albany 2
## 89 Loudonville 2
## 90 Brooklyn 2
## 91 Patchogue 2
## 92 Rochester 2
## 93 Queens 2
## 94 Albany 2
## 95 Vestal 2
## 96 Stony Brook 2
## 97 Utica 0
## 98 Geneseo 2
## 99 New Paltz 2
## 100 Oswego 2
## 101 Old Westbury 2
## 102 Syracuse 2
## 103 New York 2
## 104 Albany 2
## 105 Utica 2
## 106 Staten Island 2
## 107 New York 2
## 108 Buies Creek 2
## 109 Greenville 2
## 110 Elon 2
## 111 Boiling Springs 2
## 112 Greensboro 2
## 113 Chapel Hill 2
## 114 Raleigh 2
## 115 Wingate 2
## 116 Cullowhee 2
## 117 Bloomsburg 2
## 118 Radnor 2
## 119 Philadelphia 2
## 120 Elizabethtown 2
## 121 Philadelphia 0
## 122 La Plume 2
## 123 Wilkes-Barre 1
## 124 Philadelphia 1
## 125 Bethlehem 1
## 126 Aston 2
## 127 Philadelphia 2
## 128 Langhorne 1
## 129 Philadelphia 2
## 130 Scranton 0
## 131 Philadelphia 2
## 132 Villanova 0
## 133 Chester 2
## 134 York 2
## 135 Burlington 2
## 136 Fairfax 2
## 137 Fort Myers 2
## 138 Fort Myers 2
## 139 Trevose 1
## 140 Danville 2
## 141 Wilmington 2
## 142 Morrisville 2
## 143 Fairfax 2
## 144 Melbourne 2
## 145 New York 2
## 146 Miramar 2
## 147 Charlotte 2
## 148 Ft. Washington 1
## 149 Arlington 2
## 150 Storrs 0
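The output above has two columns and 150 rows because `sample()` treats a data frame as a list of columns. The sketch below contrasts column sampling with row sampling on a small made-up data frame; `df_toy` and the seed are assumptions for illustration only.

```r
set.seed(1)  # arbitrary seed, only for reproducibility of this sketch

# Made-up data frame with 5 rows and 3 columns
df_toy <- data.frame(id = 1:5, city = letters[1:5], score = c(2, 1, 2, 0, 2))

# sample() treats a data frame as a list of columns
two_columns <- sample(df_toy, size = 2)

# Indexing with sampled row numbers draws rows instead
two_rows <- df_toy[sample(nrow(df_toy), size = 2), ]

dim(two_columns)  # 5 rows, 2 columns
dim(two_rows)     # 2 rows, 3 columns
```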
# Build the sample: keep every "NO" school and fill the rest with randomly selected "YES" schools
schools_bad <- temp_schools %>% filter(good_data_science_program == "NO")
schools_good <- temp_schools %>% filter(good_data_science_program == "YES")
sample_schools <- schools_good[sample(nrow(schools_good), sample_size - nrow(schools_bad)), ]
# Append the "NO" schools in one step; bind_rows() keeps every column,
# including NAME, for the appended schools
sample_schools <- bind_rows(sample_schools, schools_bad)
# now we have our samples
school_split <- initial_split(sample_schools,
prop = 3/4)
school_split
## <Analysis/Assess/Total>
## <75/25/100>
school_train <- training(school_split)
school_test <- testing(school_split)
school_cv <- vfold_cv(school_train)
school_cv
## # 10-fold cross-validation
## # A tibble: 10 x 2
## splits id
## <list> <chr>
## 1 <split [67/8]> Fold01
## 2 <split [67/8]> Fold02
## 3 <split [67/8]> Fold03
## 4 <split [67/8]> Fold04
## 5 <split [67/8]> Fold05
## 6 <split [68/7]> Fold06
## 7 <split [68/7]> Fold07
## 8 <split [68/7]> Fold08
## 9 <split [68/7]> Fold09
## 10 <split [68/7]> Fold10
# define the recipe
school_recipe <-
# which consists of the formula (outcome ~ predictors)
recipe(good_data_science_program ~ match_technical_skills + match_soft_skills + school_score,
data = sample_schools) %>%
step_normalize(all_numeric()) %>%
step_impute_knn(all_predictors())
school_recipe
## Recipe
##
## Inputs:
##
## role #variables
## outcome 1
## predictor 3
##
## Operations:
##
## Centering and scaling for all_numeric()
## K-nearest neighbor imputation for all_predictors()
school_train_preprocessed <- school_recipe %>%
# apply the recipe to the training data
prep(school_train) %>%
# extract the pre-processed training dataset
juice()
school_train_preprocessed
## # A tibble: 75 x 4
## match_technical_skills match_soft_skills school_score good_data_science_prog~
## <dbl> <dbl> <dbl> <fct>
## 1 0.367 0.412 0.444 YES
## 2 0.367 0.412 0.444 YES
## 3 0.367 -2.40 -1.22 NO
## 4 0.367 0.412 0.444 YES
## 5 0.367 0.412 0.444 YES
## 6 0.367 0.412 0.444 YES
## 7 0.367 0.412 0.444 YES
## 8 0.367 0.412 0.444 YES
## 9 0.367 0.412 0.444 YES
## 10 0.367 0.412 0.444 YES
## # ... with 65 more rows
rf_model <-
# specify that the model is a random forest
rand_forest() %>%
# specify that the `mtry` parameter needs to be tuned
set_args(mtry = tune()) %>%
# select the engine/package that underlies the model
set_engine("ranger", importance = "impurity") %>%
# choose either the continuous regression or binary classification mode
set_mode("classification")
# set the workflow
rf_workflow <- workflow() %>%
# add the recipe
add_recipe(school_recipe) %>%
# add the model
add_model(rf_model)
rf_workflow
## == Workflow ====================================================================
## Preprocessor: Recipe
## Model: rand_forest()
##
## -- Preprocessor ----------------------------------------------------------------
## 2 Recipe Steps
##
## * step_normalize()
## * step_impute_knn()
##
## -- Model -----------------------------------------------------------------------
## Random Forest Model Specification (classification)
##
## Main Arguments:
## mtry = tune()
##
## Engine-Specific Arguments:
## importance = impurity
##
## Computational engine: ranger
rf_grid <- expand.grid(mtry = c(2,3))
rf_tune_results <- rf_workflow %>%
tune_grid(resamples = school_cv, #CV object
grid = rf_grid, # grid of values to try
metrics = metric_set(accuracy, roc_auc) # metrics we care about
)
## Warning: package 'ranger' was built under R version 4.1.2
## ! Fold05: internal: No event observations were detected in `truth` with event leve...
## ! Fold10: internal: No event observations were detected in `truth` with event leve...
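The two fold warnings above report that some assessment sets contained no observations of the event level. One possible mitigation, sketched here on made-up data, is to stratify the split and the folds on the outcome through the `strata` argument of `initial_split()` and `vfold_cv()`. The data frame `toy`, the seed, and the choice of `v = 5` are assumptions; this is not the configuration used for the results in this document.

```r
library(rsample)

set.seed(1)  # arbitrary seed for this sketch

# Made-up sample with the same class balance as sample_schools (82 YES, 18 NO)
toy <- data.frame(
  school_score = c(rep(2, 82), rep(1, 18)),
  good_data_science_program = c(rep("YES", 82), rep("NO", 18))
)

# Stratify so the training and test sets keep roughly the same class balance
toy_split <- initial_split(toy, prop = 3/4, strata = good_data_science_program)
toy_train <- training(toy_split)
toy_test <- testing(toy_split)

# Stratified folds; v = 5 is an assumption chosen to keep more rows per fold
toy_cv <- vfold_cv(toy_train, v = 5, strata = good_data_science_program)

table(toy_test$good_data_science_program)
```

With only 18 "NO" schools, fewer and larger folds make it more likely that each assessment set contains both classes, which metrics such as `roc_auc` need in order to be computed.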
rf_tune_results %>%
collect_metrics()
## # A tibble: 4 x 7
## mtry .metric .estimator mean n std_err .config
## <dbl> <chr> <chr> <dbl> <int> <dbl> <chr>
## 1 2 accuracy binary 1 10 0 Preprocessor1_Model1
## 2 2 roc_auc binary 1 8 0 Preprocessor1_Model1
## 3 3 accuracy binary 1 10 0 Preprocessor1_Model2
## 4 3 roc_auc binary 1 8 0 Preprocessor1_Model2
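The perfect accuracy and ROC AUC above are consistent with how the label was built: good_data_science_program is "YES" exactly when school_score is at least 2, and school_score is the sum of the two binary flags. The check below enumerates the four possible flag combinations; `combos` is a hypothetical name used only in this sketch.

```r
# All four possible combinations of the two binary flags
combos <- expand.grid(match_technical_skills = c(0, 1), match_soft_skills = c(0, 1))

# Recreate the derived columns exactly as defined earlier in the document
combos$school_score <- combos$match_technical_skills + combos$match_soft_skills
combos$good_data_science_program <- ifelse(combos$school_score >= 2, "YES", "NO")

# The label is "YES" only when both flags are 1
combos
```

Because the outcome is a deterministic function of the predictors, these metrics likely show that the model recovered the labeling rule rather than how well it would generalize to independently labeled programs.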
param_final <- rf_tune_results %>%
select_best(metric = "accuracy")
#param_final
rf_workflow <- rf_workflow %>%
finalize_workflow(param_final)
rf_fit <- rf_workflow %>%
# fit on the training set and evaluate on test set
last_fit(school_split)
#rf_fit
test_performance <- rf_fit %>% collect_metrics()
#test_performance
test_predictions <- rf_fit %>% collect_predictions()
#test_predictions
final_model <- fit(rf_workflow, sample_schools)
#final_model
# predict fictitious colleges
test_bad_college <- tibble(
name = "Test Bad college",
match_technical_skills = 0,
match_soft_skills = 1,
school_score = 1
)
test_good_college <- tibble(
name = "Test Good college",
match_technical_skills = 1,
match_soft_skills = 1,
school_score = 2
)
# Predict will output if the college has a good data science program
recommendation <- predict(final_model, new_data = test_bad_college)
print(paste0("For a college without a data science program the recommendation is ", recommendation$.pred_class))
## [1] "For a college without a data science program the recommendation is NO"
recommendation <- predict(final_model, new_data = test_good_college)
print(paste0("For a college with a data science program the recommendation is ", recommendation$.pred_class))
## [1] "For a college with a data science program the recommendation is YES"
# Dataframe of recommended schools.
temp_schools %>% filter(good_data_science_program == "YES")
## UNITID NAME
## 1 129020 Pennsylvania State University
## 2 129242 Fairfield University
## 3 130253 Sacred Heart University
## 4 130943 University of Delaware
## 5 130989 Goldey-Beacom College
## 6 131113 Wilmington University
## 7 132471 Barry University
## 8 133951 Florida International University
## 9 134079 Florida Southern College
## 10 134097 Florida State University
## 11 134130 University of Florida
## 12 135726 University of Miami
## 13 136172 University of North Florida
## 14 137351 University of South Florida
## 15 137847 The University of Tampa
## 16 138293 Webber International University
## 17 138354 The University of West Florida
## 18 161253 University of Maine
## 19 161518 Saint Joseph's College of Maine
## 20 161554 University of Southern Maine
## 21 161563 Thomas College
## 22 161873 University of Baltimore
## 23 163204 University of Maryland Global Campus
## 24 163286 University of Maryland-College Park
## 25 163453 Morgan State University
## 26 164076 Towson University
## 27 164562 Assumption University
## 28 164580 Babson College
## 29 164632 Bay Path University
## 30 164739 Bentley University
## 31 164924 Boston College
## 32 165024 Bridgewater State University
## 33 165334 Clark University
## 34 165529 Curry College
## 35 165820 Fitchberg State University
## 36 166513 University of Massachusetts-Lowell
## 37 166629 University of Massachussetts - Amherst
## 38 166638 University of Massachusetts-Boston
## 39 167260 Nichols College
## 40 167358 Northeastern University
## 41 167987 University of Massachussetts - Dartmouth
## 42 168005 Suffolk University
## 43 168254 Western New England University
## 44 168263 Westfield State University
## 45 183822 Bloomfield College
## 46 183910 Caldwell University
## 47 184603 Fairleigh Dickinson University-Metropolitan Campus
## 48 184694 Fairleigh Dickinson University-Florham Campus
## 49 185129 New Jersey City University
## 50 185262 Kean University
## 51 185572 Monmouth University
## 52 185590 Montclair State University
## 53 186201 Ramapo College of New Jersey
## 54 186399 Rutgers University-Newark
## 55 186432 Saint Peter's University
## 56 186584 Seton Hall University
## 57 188429 Adelphi University
## 58 190512 CUNY Bernard M Baruch College
## 59 190549 CUNY Brooklyn College
## 60 190558 College of Staten Island CUNY
## 61 190594 CUNY Hunter College
## 62 190637 CUNY Lehman College
## 63 190664 CUNY Queens College
## 64 191241 Fordham University
## 65 191649 Hofstra University
## 66 191931 Iona College
## 67 191968 Ithaca College
## 68 192323 Le Moyne College
## 69 192448 Long Island University
## 70 192819 Marist College
## 71 193016 Mercy College
## 72 193292 Molloy College
## 73 193308 Monroe College
## 74 193353 Mount Saint Mary College
## 75 193584 Nazareth College
## 76 193900 New York University
## 77 194091 New York Institute of Technology
## 78 194310 Pace University
## 79 195003 Rochester Institute of Technology
## 80 195173 St Francis College
## 81 195234 The College of Saint Rose
## 82 195474 Siena College
## 83 195544 St. Joseph's College-New York
## 84 195562 St. Joseph's College-Long Island
## 85 195720 Saint John Fisher College
## 86 195809 St. John's University-New York
## 87 196060 SUNY at Albany
## 88 196079 Binghamton University
## 89 196097 Stony Brook University
## 90 196167 SUNY College at Geneseo
## 91 196176 State University of New York at New Paltz
## 92 196194 SUNY College at Oswego
## 93 196237 SUNY College at Old Westbury
## 94 196413 Syracuse University
## 95 196592 Touro College
## 96 196680 Excelsior College
## 97 197045 Utica College
## 98 197197 Wagner College
## 99 197708 Yeshiva University
## 100 198136 Campbell University
## 101 198464 East Carolina University
## 102 198516 Elon University
## 103 198561 Gardner-Webb University
## 104 199102 North Carolina A & T State University
## 105 199120 University of North Carolina Chapel Hill
## 106 199193 North Carolina State University at Raleigh
## 107 199962 Wingate University
## 108 200004 Western Carolina University
## 109 211158 Bloomsburg University of Pennsylvania
## 110 211352 Cabrini University
## 111 212054 Drexel University
## 112 212197 Elizabethtown College
## 113 213303 Keystone College
## 114 214272 Neumann University
## 115 215062 University of Pennsylvania
## 116 215770 Saint Joseph's University\n
## 117 216339 Temple University
## 118 216852 Widener University
## 119 217059 York College of Pennsylvania
## 120 231174 University of Vermont
## 121 232186 George Mason University
## 122 367884 Hodges University
## 123 433660 Florida Gulf Coast University
## 124 449931 Averett University-Non-Traditional Programs
## 125 450298 Strayer University - Delaware
## 126 453163 Strayer University-North Carolina (online, for-profit)
## 127 460376 Fairfax University of America
## 128 480569 Florida Institute of Technology-Online
## 129 482413 DeVry College of New York
## 130 482459 DeVry University-Florida
## 131 482565 DeVry University-North Carolina
## 132 482653 DeVry University-Virginia
## STREET CITY STATE
## 1 352 Mansfield Road Storrs CT
## 2 1073 N Benson Rd Fairfield CT
## 3 5151 Park Ave Fairfield CT
## 4 104 Hullihen Hall Newark DE
## 5 4701 Limestone Rd Wilmington DE
## 6 320 Dupont Hwy New Castle DE
## 7 11300 NE 2nd Ave Miami FL
## 8 11200 S. W. 8 Street Miami FL
## 9 111 Lake Hollingsworth Dr Lakeland FL
## 10 222 S. Copeland Street Tallahassee FL
## 11 Tigert Hall Gainesville FL
## 12 University of Miami Coral Gables FL
## 13 1 UNF Drive Jacksonville FL
## 14 4202 East Fowler Ave Tampa FL
## 15 401 W Kennedy Blvd Tampa FL
## 16 1201 N Scenic Hwy Babson Park FL
## 17 11000 University Parkway Pensacola FL
## 18 168 College Avenue Orono ME
## 19 278 Whites Bridge Rd Standish ME
## 20 96 Falmouth St Portland ME
## 21 180 W River Rd Waterville ME
## 22 Charles at Mount Royal Baltimore MD
## 23 3501 University Blvd East Adelphi MD
## 24 M College Park MD
## 25 1700 East Cold Spring Lane Baltimore MD
## 26 8000 York Rd Towson MD
## 27 500 Salisbury St Worcester MA
## 28 231 Forest Street Wellesley MA
## 29 588 Longmeadow Street Longmeadow MA
## 30 175 Forest St Waltham MA
## 31 140 Commonwealth Avenue Chestnut Hill MA
## 32 131 Summer Street Bridgewater MA
## 33 950 Main St Worcester MA
## 34 1071 Blue Hill Ave Milton MA
## 35 160 Pearl St Fitchburg MA
## 36 1 University Ave Lowell MA
## 37 374 Whitmore Building 181 Presidents Drive Amherst MA
## 38 100 Morrissey Boulevard Boston MA
## 39 Center Rd Dudley MA
## 40 360 Huntington Ave Boston MA
## 41 285 Old Westport Rd North Dartmouth MA
## 42 73 Tremont St. Boston MA
## 43 1215 Wilbraham Rd Springfield MA
## 44 577 Western Ave Westfield MA
## 45 467 Franklin St Bloomfield NJ
## 46 120 Bloomfield Avenue Caldwell NJ
## 47 1000 River Rd Teaneck NJ
## 48 285 Madison Ave Madison NJ
## 49 2039 Kennedy Blvd Jersey City NJ
## 50 1000 Morris Avenue Union NJ
## 51 400 Cedar Ave West Long Branch NJ
## 52 1 Normal Avenue Montclair NJ
## 53 505 Ramapo Valley Rd Mahwah NJ
## 54 249 University Avenue, Blumenthal Hall Newark NJ
## 55 2641 Kennedy Blvd Jersey City NJ
## 56 400 S Orange Ave South Orange NJ
## 57 South Ave Garden City NY
## 58 One Bernard Baruch Way (55 Lexington Ave at 24th St) New York NY
## 59 2900 Bedford Ave Brooklyn NY
## 60 2800 Victory Blvd Staten Island NY
## 61 695 Park Ave New York NY
## 62 250 Bedford Park Blvd West Bronx NY
## 63 65-30 Kissena Blvd Queens NY
## 64 441 E Fordham Rd Bronx NY
## 65 100 Hofstra University Hempstead NY
## 66 715 North Ave New Rochelle NY
## 67 953 Danby Road Ithaca NY
## 68 1419 Salt Springs Rd Syracuse NY
## 69 720 Northern Blvd Brookville NY
## 70 3399 North Rd Poughkeepsie NY
## 71 555 Broadway Dobbs Ferry NY
## 72 1000 Hempstead Ave Rockville Centre NY
## 73 2501 Jerome Avenue Bronx NY
## 74 330 Powell Avenue Newburgh NY
## 75 4245 East Ave Rochester NY
## 76 70 Washington Sq South New York NY
## 77 Northern Blvd Old Westbury NY
## 78 1 Pace Plaza New York NY
## 79 1 Lomb Memorial Dr Rochester NY
## 80 180 Remsen Street Brooklyn Heights NY
## 81 432 Western Ave Albany NY
## 82 515 Loudon Rd Loudonville NY
## 83 245 Clinton Ave Brooklyn NY
## 84 155 W Roe Blvd Patchogue NY
## 85 3690 East Ave Rochester NY
## 86 8000 Utopia Pky Queens NY
## 87 1400 Washington Avenue Albany NY
## 88 4400 Vestal Parkway East Vestal NY
## 89 310 Administration Building Stony Brook NY
## 90 1 College Circle Geneseo NY
## 91 1 Hawk Drive New Paltz NY
## 92 7060 State Route 104 Oswego NY
## 93 223 Store Hill Rd Old Westbury NY
## 94 900 South Crouse Ave. Syracuse NY
## 95 500 7th Avenue New York NY
## 96 7 Columbia Cir Albany NY
## 97 1600 Burrstone Rd Utica NY
## 98 One Campus Rd Staten Island NY
## 99 500 W 185th St New York NY
## 100 143 Main Street Buies Creek NC
## 101 East 5th Street Greenville NC
## 102 100 Campus Drive Elon NC
## 103 Main St Boiling Springs NC
## 104 1601 E Market St Greensboro NC
## 105 103 South Bldg Cb 9100 Chapel Hill NC
## 106 2101 Hillsborough Street Raleigh NC
## 107 301 E. Wilson Street Wingate NC
## 108 Highway 107 Cullowhee NC
## 109 400 E Second St Bloomsburg PA
## 110 610 King of Prussia Rd Radnor PA
## 111 3141 Chestnut St Philadelphia PA
## 112 One Alpha Drive Elizabethtown PA
## 113 One College Green La Plume PA
## 114 One Neumann Drive Aston PA
## 115 34th & Spruce Street Philadelphia PA
## 116 5600 City Avenue Philadelphia PA
## 117 1801 North Broad Street Philadelphia PA
## 118 One University Place Chester PA
## 119 441 Country Club Rd York PA
## 120 85 S Prospect St Burlington VT
## 121 4400 University Dr Fairfax VA
## 122 4501 Colonial Blvd Fort Myers FL
## 123 10501 Fgcu Blvd S Fort Myers FL
## 124 420 W Main St Danville VA
## 125 800 North King Street Suite 101 Wilmington DE
## 126 4 Copley Pkwy Morrisville NC
## 127 4401 Village Drive Fairfax VA
## 128 150 West University Blvd Melbourne FL
## 129 180 Madison Ave., Ste. 1200 New York NY
## 130 2300 SW 145th Ave. Miramar FL
## 131 2015 Ayrsley Town Blvd., Ste. 109 Charlotte NC
## 132 1400 Crystal Dr, Ste 120 Arlington VA
## ZIP STFIP CNTY NMCNTY LOCALE LAT LON
## 1 06269 09 09013 Tolland County 21 41.80910 -72.24995
## 2 06824-5195 09 09001 Fairfield County 21 41.15767 -73.25590
## 3 06825-1000 09 09001 Fairfield County 21 41.22089 -73.24333
## 4 19716 10 10003 New Castle County 21 39.67958 -75.75282
## 5 19808 10 10003 New Castle County 21 39.74150 -75.68962
## 6 19720 10 10003 New Castle County 21 39.68230 -75.58700
## 7 33161-6695 12 12086 Miami-Dade County 21 25.87891 -80.19893
## 8 33199 12 12086 Miami-Dade County 21 25.75732 -80.37393
## 9 33801-5698 12 12105 Polk County 12 28.03244 -81.94820
## 10 32306-1037 12 12073 Leon County 12 30.44076 -84.29192
## 11 32611 12 12001 Alachua County 12 29.64629 -82.34791
## 12 33146 12 12086 Miami-Dade County 13 25.72126 -80.27866
## 13 32224-7699 12 12031 Duval County 11 30.27194 -81.50914
## 14 33620-9951 12 12057 Hillsborough County 11 28.06146 -82.41323
## 15 33606-1490 12 12057 Hillsborough County 11 27.94845 -82.46483
## 16 33827-0096 12 12105 Polk County 31 27.83878 -81.53231
## 17 32514-5750 12 12033 Escambia County 13 30.54908 -87.21851
## 18 04469 23 23019 Penobscot County 23 44.89926 -68.66933
## 19 04084-5236 23 23005 Cumberland County 41 43.82631 -70.48337
## 20 04103 23 23005 Cumberland County 13 43.66286 -70.27425
## 21 04901-5097 23 23011 Kennebec County 41 44.52491 -69.66473
## 22 21201-5720 24 24510 Baltimore city 11 39.30583 -76.61659
## 23 20783-8010 24 24033 Prince George's County 21 38.91271 -76.84758
## 24 20742 24 24033 Prince George's County 21 38.98818 -76.94472
## 25 21251-0001 24 24510 Baltimore city 11 39.34416 -76.58557
## 26 21252-0001 24 24005 Baltimore County 13 39.39362 -76.61116
## 27 01609-1296 25 25027 Worcester County 12 42.29423 -71.82899
## 28 02457-0310 25 25021 Norfolk County 21 42.29702 -71.26406
## 29 01106 25 25013 Hampden County 21 42.05509 -72.58338
## 30 02452-4705 25 25017 Middlesex County 13 42.38600 -71.22284
## 31 02467 25 25017 Middlesex County 13 42.33621 -71.16924
## 32 02325 25 25023 Plymouth County 21 41.98749 -70.97455
## 33 01610-1477 25 25027 Worcester County 12 42.24999 -71.82336
## 34 02186-2395 25 25021 Norfolk County 21 42.23806 -71.11654
## 35 01420-2697 25 25027 Worcester County 22 42.58830 -71.78967
## 36 01854-5104 25 25017 Middlesex County 21 42.65286 -71.32681
## 37 01003 25 25015 Hampshire County 21 42.38600 -72.52673
## 38 02125-3393 25 25025 Suffolk County 11 42.31288 -71.03687
## 39 01571-5000 25 25027 Worcester County 21 42.04403 -71.93028
## 40 02115-5005 25 25025 Suffolk County 11 42.33999 -71.08878
## 41 02747-2300 25 25005 Bristol County 22 41.62869 -71.00455
## 42 02108-3901 25 25025 Suffolk County 11 42.35795 -71.06092
## 43 01119-2684 25 25013 Hampden County 12 42.11502 -72.52047
## 44 01086-1630 25 25013 Hampden County 21 42.13270 -72.79650
## 45 07003 34 34013 Essex County 21 40.79510 -74.19431
## 46 07006-6195 34 34013 Essex County 21 40.83275 -74.27257
## 47 07666 34 34003 Bergen County 21 40.89721 -74.02899
## 48 07940 34 34027 Morris County 21 40.77450 -74.43212
## 49 07305 34 34017 Hudson County 11 40.70994 -74.08727
## 50 07083 34 34039 Union County 21 40.67798 -74.23350
## 51 07764-1898 34 34025 Monmouth County 21 40.28007 -74.00645
## 52 07043-1624 34 34031 Passaic County 21 40.86041 -74.19814
## 53 07430-1680 34 34003 Bergen County 21 41.08094 -74.17409
## 54 07102 34 34013 Essex County 11 40.73912 -74.17581
## 55 07306-5997 34 34017 Hudson County 11 40.72711 -74.07154
## 56 07079-2697 34 34013 Essex County 21 40.74234 -74.24603
## 57 11530-0701 36 36059 Nassau County 21 40.72144 -73.65332
## 58 10010 36 36061 New York County 11 40.74024 -73.98342
## 59 11210 36 36047 Kings County 11 40.63152 -73.94990
## 60 10314 36 36085 Richmond County 11 40.60183 -74.14849
## 61 10065 36 36061 New York County 11 40.76867 -73.96479
## 62 10468 36 36005 Bronx County 11 40.87296 -73.89538
## 63 11367 36 36081 Queens County 11 40.73518 -73.81610
## 64 10458 36 36005 Bronx County 11 40.85935 -73.88271
## 65 11549 36 36059 Nassau County 21 40.71596 -73.60078
## 66 10801-1890 36 36119 Westchester County 21 40.92572 -73.78805
## 67 14850-7002 36 36109 Tompkins County 23 42.42215 -76.49414
## 68 13214-1301 36 36067 Onondaga County 21 43.04919 -76.09043
## 69 11548-1327 36 36059 Nassau County 21 40.82071 -73.59368
## 70 12601 36 36027 Dutchess County 21 41.72094 -73.93548
## 71 10522 36 36119 Westchester County 21 41.02163 -73.87445
## 72 11571-5002 36 36059 Nassau County 21 40.68594 -73.62618
## 73 10468 36 36005 Bronx County 11 40.86446 -73.90022
## 74 12550 36 36071 Orange County 13 41.51387 -74.01265
## 75 14618-3790 36 36055 Monroe County 21 43.10158 -77.51858
## 76 10012-1091 36 36061 New York County 11 40.72945 -73.99726
## 77 11568-8000 36 36059 Nassau County 21 40.81245 -73.60780
## 78 10038-1598 36 36061 New York County 11 40.71101 -74.00472
## 79 14623-5603 36 36055 Monroe County 21 43.08419 -77.67386
## 80 11201-4305 36 36047 Kings County 11 40.69323 -73.99216
## 81 12203-1490 36 36001 Albany County 13 42.66430 -73.78666
## 82 12211-1462 36 36001 Albany County 21 42.71760 -73.75260
## 83 11205-3688 36 36047 Kings County 11 40.69042 -73.96766
## 84 11772 36 36103 Suffolk County 21 40.77593 -73.02466
## 85 14618-3597 36 36055 Monroe County 21 43.11626 -77.51306
## 86 11439 36 36081 Queens County 11 40.72252 -73.79610
## 87 12222 36 36001 Albany County 13 42.68549 -73.82466
## 88 13850-6000 36 36007 Broome County 22 42.08787 -75.96689
## 89 11794-0701 36 36103 Suffolk County 21 40.91476 -73.12046
## 90 14454-1465 36 36051 Livingston County 32 42.79664 -77.82189
## 91 12561-2443 36 36111 Ulster County 21 41.74094 -74.08219
## 92 13126 36 36075 Oswego County 32 43.45429 -76.54080
## 93 11568-0210 36 36059 Nassau County 21 40.79902 -73.57191
## 94 13244 36 36067 Onondaga County 12 43.04018 -76.13698
## 95 10018 36 36061 New York County 11 40.75320 -73.98940
## 96 12203-5159 36 36001 Albany County 13 42.70549 -73.86298
## 97 13502-4892 36 36065 Oneida County 13 43.09621 -75.27292
## 98 10301-4495 36 36085 Richmond County 11 40.61559 -74.09291
## 99 10033-3299 36 36061 New York County 11 40.85061 -73.92987
## 100 27506 37 37085 Harnett County 31 35.40915 -78.73824
## 101 27858-4353 37 37147 Pitt County 13 35.60719 -77.36829
## 102 27244-2010 37 37001 Alamance County 22 36.10415 -79.50344
## 103 28017-0997 37 37045 Cleveland County 32 35.24732 -81.66814
## 104 27411 37 37081 Guilford County 11 36.07282 -79.77338
## 105 27599 37 37135 Orange County 13 35.91177 -79.05097
## 106 27695-7001 37 37183 Wake County 11 35.78511 -78.67452
## 107 28174-0159 37 37179 Union County 21 34.98606 -80.44305
## 108 28723-9646 37 37099 Jackson County 32 35.30898 -83.18626
## 109 17815 42 42037 Columbia County 13 41.00782 -76.44784
## 110 19087-3698 42 42045 Delaware County 21 40.05636 -75.37526
## 111 19104 42 42101 Philadelphia County 11 39.95522 -75.19005
## 112 17022-2298 42 42071 Lancaster County 21 40.14924 -76.59322
## 113 18440-0200 42 42131 Wyoming County 21 41.55897 -75.77746
## 114 19014-1298 42 42045 Delaware County 21 39.87488 -75.44002
## 115 19104-6303 42 42101 Philadelphia County 11 39.95093 -75.19391
## 116 19131-1395 42 42101 Philadelphia County 11 39.99444 -75.23834
## 117 19122-6096 42 42101 Philadelphia County 11 39.98055 -75.15686
## 118 19013-5792 42 42045 Delaware County 21 39.86169 -75.35536
## 119 17403-3651 42 42133 York County 22 39.94614 -76.72798
## 120 05405-0160 50 50007 Chittenden County 13 44.47733 -73.19665
## 121 22030-4444 51 51059 Fairfax County 21 38.82998 -77.30743
## 122 33966 12 12071 Lee County 13 26.61087 -81.82142
## 123 33965-6565 12 12071 Lee County 21 26.46364 -81.77260
## 124 24541 51 51590 Danville city 32 36.57729 -79.41320
## 125 19801 10 10003 New Castle County 13 39.74336 -75.54791
## 126 27560 37 37183 Wake County 21 35.86546 -78.82245
## 127 22030-0000 51 51059 Fairfax County 21 38.84905 -77.34775
## 128 32901-6975 12 12009 Brevard County 13 28.06575 -80.62438
## 129 10016 36 36061 New York County 11 40.74775 -73.98349
## 130 33027 12 12011 Broward County 21 25.98764 -80.33984
## 131 28273 37 37119 Mecklenburg County 11 35.13766 -80.93197
## 132 22202 51 51013 Arlington County 12 38.86110 -77.04976
## CBSA NMCBSA CBSATYPE CSA
## 1 25540 Hartford-East Hartford-Middletown, CT 1 278
## 2 14860 Bridgeport-Stamford-Norwalk, CT 1 408
## 3 14860 Bridgeport-Stamford-Norwalk, CT 1 408
## 4 37980 Philadelphia-Camden-Wilmington, PA-NJ-DE-MD 1 428
## 5 37980 Philadelphia-Camden-Wilmington, PA-NJ-DE-MD 1 428
## 6 37980 Philadelphia-Camden-Wilmington, PA-NJ-DE-MD 1 428
## 7 33100 Miami-Fort Lauderdale-Pompano Beach, FL 1 370
## 8 33100 Miami-Fort Lauderdale-Pompano Beach, FL 1 370
## 9 29460 Lakeland-Winter Haven, FL 1 422
## 10 45220 Tallahassee, FL 1 N
## 11 23540 Gainesville, FL 1 264
## 12 33100 Miami-Fort Lauderdale-Pompano Beach, FL 1 370
## 13 27260 Jacksonville, FL 1 300
## 14 45300 Tampa-St. Petersburg-Clearwater, FL 1 N
## 15 45300 Tampa-St. Petersburg-Clearwater, FL 1 N
## 16 29460 Lakeland-Winter Haven, FL 1 422
## 17 37860 Pensacola-Ferry Pass-Brent, FL 1 426
## 18 12620 Bangor, ME 1 N
## 19 38860 Portland-South Portland, ME 1 438
## 20 38860 Portland-South Portland, ME 1 438
## 21 12300 Augusta-Waterville, ME 2 N
## 22 12580 Baltimore-Columbia-Towson, MD 1 548
## 23 47900 Washington-Arlington-Alexandria, DC-VA-MD-WV 1 548
## 24 47900 Washington-Arlington-Alexandria, DC-VA-MD-WV 1 548
## 25 12580 Baltimore-Columbia-Towson, MD 1 548
## 26 12580 Baltimore-Columbia-Towson, MD 1 548
## 27 49340 Worcester, MA-CT 1 148
## 28 14460 Boston-Cambridge-Newton, MA-NH 1 148
## 29 44140 Springfield, MA 1 N
## 30 14460 Boston-Cambridge-Newton, MA-NH 1 148
## 31 14460 Boston-Cambridge-Newton, MA-NH 1 148
## 32 14460 Boston-Cambridge-Newton, MA-NH 1 148
## 33 49340 Worcester, MA-CT 1 148
## 34 14460 Boston-Cambridge-Newton, MA-NH 1 148
## 35 49340 Worcester, MA-CT 1 148
## 36 14460 Boston-Cambridge-Newton, MA-NH 1 148
## 37 44140 Springfield, MA 1 N
## 38 14460 Boston-Cambridge-Newton, MA-NH 1 148
## 39 49340 Worcester, MA-CT 1 148
## 40 14460 Boston-Cambridge-Newton, MA-NH 1 148
## 41 39300 Providence-Warwick, RI-MA 1 148
## 42 14460 Boston-Cambridge-Newton, MA-NH 1 148
## 43 44140 Springfield, MA 1 N
## 44 44140 Springfield, MA 1 N
## 45 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 46 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 47 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 48 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 49 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 50 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 51 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 52 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 53 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 54 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 55 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 56 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 57 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 58 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 59 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 60 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 61 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 62 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 63 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 64 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 65 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 66 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 67 27060 Ithaca, NY 1 296
## 68 45060 Syracuse, NY 1 532
## 69 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 70 39100 Poughkeepsie-Newburgh-Middletown, NY 1 408
## 71 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 72 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 73 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 74 39100 Poughkeepsie-Newburgh-Middletown, NY 1 408
## 75 40380 Rochester, NY 1 464
## 76 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 77 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 78 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 79 40380 Rochester, NY 1 464
## 80 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 81 10580 Albany-Schenectady-Troy, NY 1 104
## 82 10580 Albany-Schenectady-Troy, NY 1 104
## 83 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 84 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 85 40380 Rochester, NY 1 464
## 86 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 87 10580 Albany-Schenectady-Troy, NY 1 104
## 88 13780 Binghamton, NY 1 N
## 89 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 90 40380 Rochester, NY 1 464
## 91 28740 Kingston, NY 1 408
## 92 45060 Syracuse, NY 1 532
## 93 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 94 45060 Syracuse, NY 1 532
## 95 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 96 10580 Albany-Schenectady-Troy, NY 1 104
## 97 46540 Utica-Rome, NY 1 N
## 98 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 99 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 100 22180 Fayetteville, NC 1 246
## 101 24780 Greenville, NC 1 272
## 102 15500 Burlington, NC 1 268
## 103 43140 Shelby, NC 2 172
## 104 24660 Greensboro-High Point, NC 1 268
## 105 20500 Durham-Chapel Hill, NC 1 450
## 106 39580 Raleigh-Cary, NC 1 450
## 107 16740 Charlotte-Concord-Gastonia, NC-SC 1 172
## 108 19000 Cullowhee, NC 2 N
## 109 14100 Bloomsburg-Berwick, PA 1 146
## 110 37980 Philadelphia-Camden-Wilmington, PA-NJ-DE-MD 1 428
## 111 37980 Philadelphia-Camden-Wilmington, PA-NJ-DE-MD 1 428
## 112 29540 Lancaster, PA 1 N
## 113 42540 Scranton--Wilkes-Barre, PA 1 N
## 114 37980 Philadelphia-Camden-Wilmington, PA-NJ-DE-MD 1 428
## 115 37980 Philadelphia-Camden-Wilmington, PA-NJ-DE-MD 1 428
## 116 37980 Philadelphia-Camden-Wilmington, PA-NJ-DE-MD 1 428
## 117 37980 Philadelphia-Camden-Wilmington, PA-NJ-DE-MD 1 428
## 118 37980 Philadelphia-Camden-Wilmington, PA-NJ-DE-MD 1 428
## 119 49620 York-Hanover, PA 1 276
## 120 15540 Burlington-South Burlington, VT 1 162
## 121 47900 Washington-Arlington-Alexandria, DC-VA-MD-WV 1 548
## 122 15980 Cape Coral-Fort Myers, FL 1 163
## 123 15980 Cape Coral-Fort Myers, FL 1 163
## 124 19260 Danville, VA 2 N
## 125 37980 Philadelphia-Camden-Wilmington, PA-NJ-DE-MD 1 428
## 126 39580 Raleigh-Cary, NC 1 450
## 127 47900 Washington-Arlington-Alexandria, DC-VA-MD-WV 1 548
## 128 37340 Palm Bay-Melbourne-Titusville, FL 1 N
## 129 35620 New York-Newark-Jersey City, NY-NJ-PA 1 408
## 130 33100 Miami-Fort Lauderdale-Pompano Beach, FL 1 370
## 131 16740 Charlotte-Concord-Gastonia, NC-SC 1 172
## 132 47900 Washington-Arlington-Alexandria, DC-VA-MD-WV 1 548
## NMCSA NECTA
## 1 Hartford-East Hartford, CT 73450
## 2 New York-Newark, NY-NJ-CT-PA 71950
## 3 New York-Newark, NY-NJ-CT-PA 71950
## 4 Philadelphia-Reading-Camden, PA-NJ-DE-MD N
## 5 Philadelphia-Reading-Camden, PA-NJ-DE-MD N
## 6 Philadelphia-Reading-Camden, PA-NJ-DE-MD N
## 7 Miami-Port St. Lucie-Fort Lauderdale, FL N
## 8 Miami-Port St. Lucie-Fort Lauderdale, FL N
## 9 Orlando-Lakeland-Deltona, FL N
## 10 N N
## 11 Gainesville-Lake City, FL N
## 12 Miami-Port St. Lucie-Fort Lauderdale, FL N
## 13 Jacksonville-St. Marys-Palatka, FL-GA N
## 14 N N
## 15 N N
## 16 Orlando-Lakeland-Deltona, FL N
## 17 Pensacola-Ferry Pass, FL-AL N
## 18 N 70750
## 19 Portland-Lewiston-South Portland, ME 76750
## 20 Portland-Lewiston-South Portland, ME 76750
## 21 N 78850
## 22 Washington-Baltimore-Arlington, DC-MD-VA-WV-PA N
## 23 Washington-Baltimore-Arlington, DC-MD-VA-WV-PA N
## 24 Washington-Baltimore-Arlington, DC-MD-VA-WV-PA N
## 25 Washington-Baltimore-Arlington, DC-MD-VA-WV-PA N
## 26 Washington-Baltimore-Arlington, DC-MD-VA-WV-PA N
## 27 Boston-Worcester-Providence, MA-RI-NH-CT 79600
## 28 Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 29 N 78100
## 30 Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 31 Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 32 Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 33 Boston-Worcester-Providence, MA-RI-NH-CT 79600
## 34 Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 35 Boston-Worcester-Providence, MA-RI-NH-CT 74500
## 36 Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 37 N 78100
## 38 Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 39 Boston-Worcester-Providence, MA-RI-NH-CT 79600
## 40 Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 41 Boston-Worcester-Providence, MA-RI-NH-CT 75550
## 42 Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 43 N 78100
## 44 N 78100
## 45 New York-Newark, NY-NJ-CT-PA N
## 46 New York-Newark, NY-NJ-CT-PA N
## 47 New York-Newark, NY-NJ-CT-PA N
## 48 New York-Newark, NY-NJ-CT-PA N
## 49 New York-Newark, NY-NJ-CT-PA N
## 50 New York-Newark, NY-NJ-CT-PA N
## 51 New York-Newark, NY-NJ-CT-PA N
## 52 New York-Newark, NY-NJ-CT-PA N
## 53 New York-Newark, NY-NJ-CT-PA N
## 54 New York-Newark, NY-NJ-CT-PA N
## 55 New York-Newark, NY-NJ-CT-PA N
## 56 New York-Newark, NY-NJ-CT-PA N
## 57 New York-Newark, NY-NJ-CT-PA N
## 58 New York-Newark, NY-NJ-CT-PA N
## 59 New York-Newark, NY-NJ-CT-PA N
## 60 New York-Newark, NY-NJ-CT-PA N
## 61 New York-Newark, NY-NJ-CT-PA N
## 62 New York-Newark, NY-NJ-CT-PA N
## 63 New York-Newark, NY-NJ-CT-PA N
## 64 New York-Newark, NY-NJ-CT-PA N
## 65 New York-Newark, NY-NJ-CT-PA N
## 66 New York-Newark, NY-NJ-CT-PA N
## 67 Ithaca-Cortland, NY N
## 68 Syracuse-Auburn, NY N
## 69 New York-Newark, NY-NJ-CT-PA N
## 70 New York-Newark, NY-NJ-CT-PA N
## 71 New York-Newark, NY-NJ-CT-PA N
## 72 New York-Newark, NY-NJ-CT-PA N
## 73 New York-Newark, NY-NJ-CT-PA N
## 74 New York-Newark, NY-NJ-CT-PA N
## 75 Rochester-Batavia-Seneca Falls, NY N
## 76 New York-Newark, NY-NJ-CT-PA N
## 77 New York-Newark, NY-NJ-CT-PA N
## 78 New York-Newark, NY-NJ-CT-PA N
## 79 Rochester-Batavia-Seneca Falls, NY N
## 80 New York-Newark, NY-NJ-CT-PA N
## 81 Albany-Schenectady, NY N
## 82 Albany-Schenectady, NY N
## 83 New York-Newark, NY-NJ-CT-PA N
## 84 New York-Newark, NY-NJ-CT-PA N
## 85 Rochester-Batavia-Seneca Falls, NY N
## 86 New York-Newark, NY-NJ-CT-PA N
## 87 Albany-Schenectady, NY N
## 88 N N
## 89 New York-Newark, NY-NJ-CT-PA N
## 90 Rochester-Batavia-Seneca Falls, NY N
## 91 New York-Newark, NY-NJ-CT-PA N
## 92 Syracuse-Auburn, NY N
## 93 New York-Newark, NY-NJ-CT-PA N
## 94 Syracuse-Auburn, NY N
## 95 New York-Newark, NY-NJ-CT-PA N
## 96 Albany-Schenectady, NY N
## 97 N N
## 98 New York-Newark, NY-NJ-CT-PA N
## 99 New York-Newark, NY-NJ-CT-PA N
## 100 Fayetteville-Sanford-Lumberton, NC N
## 101 Greenville-Kinston-Washington, NC N
## 102 Greensboro--Winston-Salem--High Point, NC N
## 103 Charlotte-Concord, NC-SC N
## 104 Greensboro--Winston-Salem--High Point, NC N
## 105 Raleigh-Durham-Cary, NC N
## 106 Raleigh-Durham-Cary, NC N
## 107 Charlotte-Concord, NC-SC N
## 108 N N
## 109 Bloomsburg-Berwick-Sunbury, PA N
## 110 Philadelphia-Reading-Camden, PA-NJ-DE-MD N
## 111 Philadelphia-Reading-Camden, PA-NJ-DE-MD N
## 112 N N
## 113 N N
## 114 Philadelphia-Reading-Camden, PA-NJ-DE-MD N
## 115 Philadelphia-Reading-Camden, PA-NJ-DE-MD N
## 116 Philadelphia-Reading-Camden, PA-NJ-DE-MD N
## 117 Philadelphia-Reading-Camden, PA-NJ-DE-MD N
## 118 Philadelphia-Reading-Camden, PA-NJ-DE-MD N
## 119 Harrisburg-York-Lebanon, PA N
## 120 Burlington-South Burlington-Barre, VT 72400
## 121 Washington-Baltimore-Arlington, DC-MD-VA-WV-PA N
## 122 Cape Coral-Fort Myers-Naples, FL N
## 123 Cape Coral-Fort Myers-Naples, FL N
## 124 N N
## 125 Philadelphia-Reading-Camden, PA-NJ-DE-MD N
## 126 Raleigh-Durham-Cary, NC N
## 127 Washington-Baltimore-Arlington, DC-MD-VA-WV-PA N
## 128 N N
## 129 New York-Newark, NY-NJ-CT-PA N
## 130 Miami-Port St. Lucie-Fort Lauderdale, FL N
## 131 Charlotte-Concord, NC-SC N
## 132 Washington-Baltimore-Arlington, DC-MD-VA-WV-PA N
## NMNECTA CD SLDL SLDU SCHOOLYEAR
## 1 Hartford-East Hartford-Middletown, CT 0902 09054 09029 2020-2021
## 2 Bridgeport-Stamford-Norwalk, CT 0904 09133 09028 2020-2021
## 3 Bridgeport-Stamford-Norwalk, CT 0904 09134 09028 2020-2021
## 4 N 1000 10025 10008 2020-2021
## 5 N 1000 10021 10004 2020-2021
## 6 N 1000 10017 10012 2020-2021
## 7 N 1224 12108 12038 2020-2021
## 8 N 1226 12116 12039 2020-2021
## 9 N 1215 12040 12022 2020-2021
## 10 N 1202 12009 12003 2020-2021
## 11 N 1203 12021 12008 2020-2021
## 12 N 1227 12114 12037 2020-2021
## 13 N 1204 12012 12004 2020-2021
## 14 N 1214 12063 12020 2020-2021
## 15 N 1214 12060 12018 2020-2021
## 16 N 1209 12042 12026 2020-2021
## 17 N 1201 12001 12001 2020-2021
## 18 Bangor, ME 2302 23123 23005 2020-2021
## 19 Portland-South Portland, ME 2301 23023 23026 2020-2021
## 20 Portland-South Portland, ME 2301 23040 23027 2020-2021
## 21 Waterville, ME 2301 23109 23016 2020-2021
## 22 N 2407 24045 24045 2020-2021
## 23 N 2404 24024 24024 2020-2021
## 24 N 2405 24021 24021 2020-2021
## 25 N 2407 24043 24043 2020-2021
## 26 N 2402 2442A 24042 2020-2021
## 27 Worcester, MA-CT 2502 25215 25010 2020-2021
## 28 Boston-Cambridge-Newton, MA-NH 2504 25170 25017 2020-2021
## 29 Springfield, MA-CT 2501 25103 25007 2020-2021
## 30 Boston-Cambridge-Newton, MA-NH 2505 25127 25016 2020-2021
## 31 Boston-Cambridge-Newton, MA-NH 2504 25129 25029 2020-2021
## 32 Boston-Cambridge-Newton, MA-NH 2508 25179 25036 2020-2021
## 33 Worcester, MA-CT 2502 25219 25010 2020-2021
## 34 Boston-Cambridge-Newton, MA-NH 2507 25163 25032 2020-2021
## 35 Leominster-Gardner, MA 2503 25205 25009 2020-2021
## 36 Boston-Cambridge-Newton, MA-NH 2503 25135 25013 2020-2021
## 37 Springfield, MA-CT 2502 25117 25006 2020-2021
## 38 Boston-Cambridge-Newton, MA-NH 2508 25187 25001 2020-2021
## 39 Worcester, MA-CT 2501 25208 25012 2020-2021
## 40 Boston-Cambridge-Newton, MA-NH 2507 25190 25028 2020-2021
## 41 New Bedford, MA 2509 25077 25038 2020-2021
## 42 Boston-Cambridge-Newton, MA-NH 2508 25186 25027 2020-2021
## 43 Springfield, MA-CT 2501 25110 25007 2020-2021
## 44 Springfield, MA-CT 2501 25105 25005 2020-2021
## 45 N 3410 34028 34028 2020-2021
## 46 N 3411 34027 34027 2020-2021
## 47 N 3405 34037 34037 2020-2021
## 48 N 3411 34027 34027 2020-2021
## 49 N 3410 34031 34031 2020-2021
## 50 N 3410 34020 34020 2020-2021
## 51 N 3406 34011 34011 2020-2021
## 52 N 3411 34040 34040 2020-2021
## 53 N 3405 34039 34039 2020-2021
## 54 N 3410 34029 34029 2020-2021
## 55 N 3410 34033 34033 2020-2021
## 56 N 3410 34027 34027 2020-2021
## 57 N 3604 36019 36006 2020-2021
## 58 N 3612 36075 36028 2020-2021
## 59 N 3609 36042 36021 2020-2021
## 60 N 3611 36063 36024 2020-2021
## 61 N 3612 36073 36028 2020-2021
## 62 N 3613 36081 36034 2020-2021
## 63 N 3606 36025 36016 2020-2021
## 64 N 3615 36078 36034 2020-2021
## 65 N 3604 36019 36006 2020-2021
## 66 N 3616 36088 36037 2020-2021
## 67 N 3623 36125 36058 2020-2021
## 68 N 3624 36128 36050 2020-2021
## 69 N 3603 36019 36005 2020-2021
## 70 N 3618 36106 36041 2020-2021
## 71 N 3617 36092 36035 2020-2021
## 72 N 3604 36021 36009 2020-2021
## 73 N 3613 36078 36033 2020-2021
## 74 N 3618 36104 36039 2020-2021
## 75 N 3625 36133 36055 2020-2021
## 76 N 3610 36066 36027 2020-2021
## 77 N 3603 36019 36005 2020-2021
## 78 N 3610 36066 36026 2020-2021
## 79 N 3625 36138 36059 2020-2021
## 80 N 3607 36052 36026 2020-2021
## 81 N 3620 36109 36044 2020-2021
## 82 N 3620 36110 36044 2020-2021
## 83 N 3608 36057 36025 2020-2021
## 84 N 3601 36007 36003 2020-2021
## 85 N 3625 36133 36055 2020-2021
## 86 N 3605 36024 36014 2020-2021
## 87 N 3620 36109 36044 2020-2021
## 88 N 3622 36123 36052 2020-2021
## 89 N 3601 36004 36002 2020-2021
## 90 N 3627 36133 36059 2020-2021
## 91 N 3619 36103 36042 2020-2021
## 92 N 3624 36130 36048 2020-2021
## 93 N 3603 36015 36005 2020-2021
## 94 N 3624 36129 36053 2020-2021
## 95 N 3610 36075 36031 2020-2021
## 96 N 3620 36109 36044 2020-2021
## 97 N 3622 36119 36047 2020-2021
## 98 N 3611 36063 36023 2020-2021
## 99 N 3613 36072 36031 2020-2021
## 100 N 3702 37053 37012 2020-2021
## 101 N 3701 37008 37005 2020-2021
## 102 N 3706 37064 37024 2020-2021
## 103 N 3710 37111 37044 2020-2021
## 104 N 3706 37061 37028 2020-2021
## 105 N 3704 37056 37023 2020-2021
## 106 N 3704 37033 37015 2020-2021
## 107 N 3709 37069 37035 2020-2021
## 108 N 3711 37119 37050 2020-2021
## 109 N 4209 42109 42027 2020-2021
## 110 N 4205 42165 42017 2020-2021
## 111 N 4203 42188 42007 2020-2021
## 112 N 4211 42098 42036 2020-2021
## 113 N 4212 42117 42020 2020-2021
## 114 N 4205 42161 42009 2020-2021
## 115 N 4203 42188 42008 2020-2021
## 116 N 4203 42192 42007 2020-2021
## 117 N 4202 42181 42003 2020-2021
## 118 N 4205 42159 42009 2020-2021
## 119 N 4210 42095 42028 2020-2021
## 120 Burlington-South Burlington, VT 5000 50C64 50CHI 2020-2021
## 121 N 5111 51037 51034 2020-2021
## 122 N 1219 12078 12027 2020-2021
## 123 N 1219 12078 12027 2020-2021
## 124 N 5105 51014 51020 2020-2021
## 125 N 1000 10002 10003 2020-2021
## 126 N 3704 37041 37016 2020-2021
## 127 N 5111 51037 51037 2020-2021
## 128 N 1208 12052 12017 2020-2021
## 129 N 3612 36075 36027 2020-2021
## 130 N 1220 12103 12035 2020-2021
## 131 N 3712 37092 37037 2020-2021
## 132 N 5108 51048 51030 2020-2021
## match_technical_skills match_soft_skills school_score
## 1 1 1 2
## 2 1 1 2
## 3 1 1 2
## 4 1 1 2
## 5 1 1 2
## 6 1 1 2
## 7 1 1 2
## 8 1 1 2
## 9 1 1 2
## 10 1 1 2
## 11 1 1 2
## 12 1 1 2
## 13 1 1 2
## 14 1 1 2
## 15 1 1 2
## 16 1 1 2
## 17 1 1 2
## 18 1 1 2
## 19 1 1 2
## 20 1 1 2
## 21 1 1 2
## 22 1 1 2
## 23 1 1 2
## 24 1 1 2
## 25 1 1 2
## 26 1 1 2
## 27 1 1 2
## 28 1 1 2
## 29 1 1 2
## 30 1 1 2
## 31 1 1 2
## 32 1 1 2
## 33 1 1 2
## 34 1 1 2
## 35 1 1 2
## 36 1 1 2
## 37 1 1 2
## 38 1 1 2
## 39 1 1 2
## 40 1 1 2
## 41 1 1 2
## 42 1 1 2
## 43 1 1 2
## 44 1 1 2
## 45 1 1 2
## 46 1 1 2
## 47 1 1 2
## 48 1 1 2
## 49 1 1 2
## 50 1 1 2
## 51 1 1 2
## 52 1 1 2
## 53 1 1 2
## 54 1 1 2
## 55 1 1 2
## 56 1 1 2
## 57 1 1 2
## 58 1 1 2
## 59 1 1 2
## 60 1 1 2
## 61 1 1 2
## 62 1 1 2
## 63 1 1 2
## 64 1 1 2
## 65 1 1 2
## 66 1 1 2
## 67 1 1 2
## 68 1 1 2
## 69 1 1 2
## 70 1 1 2
## 71 1 1 2
## 72 1 1 2
## 73 1 1 2
## 74 1 1 2
## 75 1 1 2
## 76 1 1 2
## 77 1 1 2
## 78 1 1 2
## 79 1 1 2
## 80 1 1 2
## 81 1 1 2
## 82 1 1 2
## 83 1 1 2
## 84 1 1 2
## 85 1 1 2
## 86 1 1 2
## 87 1 1 2
## 88 1 1 2
## 89 1 1 2
## 90 1 1 2
## 91 1 1 2
## 92 1 1 2
## 93 1 1 2
## 94 1 1 2
## 95 1 1 2
## 96 1 1 2
## 97 1 1 2
## 98 1 1 2
## 99 1 1 2
## 100 1 1 2
## 101 1 1 2
## 102 1 1 2
## 103 1 1 2
## 104 1 1 2
## 105 1 1 2
## 106 1 1 2
## 107 1 1 2
## 108 1 1 2
## 109 1 1 2
## 110 1 1 2
## 111 1 1 2
## 112 1 1 2
## 113 1 1 2
## 114 1 1 2
## 115 1 1 2
## 116 1 1 2
## 117 1 1 2
## 118 1 1 2
## 119 1 1 2
## 120 1 1 2
## 121 1 1 2
## 122 1 1 2
## 123 1 1 2
## 124 1 1 2
## 125 1 1 2
## 126 1 1 2
## 127 1 1 2
## 128 1 1 2
## 129 1 1 2
## 130 1 1 2
## 131 1 1 2
## 132 1 1 2
## good_data_science_program
## 1 YES
## 2 YES
## 3 YES
## 4 YES
## 5 YES
## 6 YES
## 7 YES
## 8 YES
## 9 YES
## 10 YES
## 11 YES
## 12 YES
## 13 YES
## 14 YES
## 15 YES
## 16 YES
## 17 YES
## 18 YES
## 19 YES
## 20 YES
## 21 YES
## 22 YES
## 23 YES
## 24 YES
## 25 YES
## 26 YES
## 27 YES
## 28 YES
## 29 YES
## 30 YES
## 31 YES
## 32 YES
## 33 YES
## 34 YES
## 35 YES
## 36 YES
## 37 YES
## 38 YES
## 39 YES
## 40 YES
## 41 YES
## 42 YES
## 43 YES
## 44 YES
## 45 YES
## 46 YES
## 47 YES
## 48 YES
## 49 YES
## 50 YES
## 51 YES
## 52 YES
## 53 YES
## 54 YES
## 55 YES
## 56 YES
## 57 YES
## 58 YES
## 59 YES
## 60 YES
## 61 YES
## 62 YES
## 63 YES
## 64 YES
## 65 YES
## 66 YES
## 67 YES
## 68 YES
## 69 YES
## 70 YES
## 71 YES
## 72 YES
## 73 YES
## 74 YES
## 75 YES
## 76 YES
## 77 YES
## 78 YES
## 79 YES
## 80 YES
## 81 YES
## 82 YES
## 83 YES
## 84 YES
## 85 YES
## 86 YES
## 87 YES
## 88 YES
## 89 YES
## 90 YES
## 91 YES
## 92 YES
## 93 YES
## 94 YES
## 95 YES
## 96 YES
## 97 YES
## 98 YES
## 99 YES
## 100 YES
## 101 YES
## 102 YES
## 103 YES
## 104 YES
## 105 YES
## 106 YES
## 107 YES
## 108 YES
## 109 YES
## 110 YES
## 111 YES
## 112 YES
## 113 YES
## 114 YES
## 115 YES
## 116 YES
## 117 YES
## 118 YES
## 119 YES
## 120 YES
## 121 YES
## 122 YES
## 123 YES
## 124 YES
## 125 YES
## 126 YES
## 127 YES
## 128 YES
## 129 YES
## 130 YES
## 131 YES
## 132 YES
# A visual of the database of recommended schools.
view(temp_schools)
# Build an HTML label for each school: the name on the first line and "City, State" on the second.
# Each <p> element starts a new line in the rendered label.
temp_schools$label <- paste0("<p>", temp_schools$NAME, "</p>",
                             "<p>", temp_schools$CITY, ", ", temp_schools$STATE, "</p>")
# Create a Leaflet map centered on the US eastern seaboard.
# Wrapping each label in HTML() (applied with lapply) makes Leaflet render the markup instead of showing it as literal text.
leaflet(temp_schools) %>%
  addProviderTiles("CartoDB") %>%
  setView(lng = -80.95, lat = 35.635, zoom = 4) %>%
  addCircles(lat = ~ LAT, lng = ~ LON, label = lapply(temp_schools$label, HTML))
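Before plotting, it may help to confirm that every row has usable coordinates, since a missing or out-of-range LAT or LON value would leave a school off the map. The following is a minimal sketch in base R, not part of the original analysis. The first three rows are copied from the printed output above, the fourth row is deliberately broken for illustration, and `has_valid_coords` is a hypothetical helper name.

```r
# Toy data: three rows copied from the printed output, plus one made-up
# row with a missing latitude to show what the check catches.
schools <- data.frame(
  ZIP = c("06269", "19716", "33199", "00000"),
  LAT = c(41.80910, 39.67958, 25.75732, NA),
  LON = c(-72.24995, -75.75282, -80.37393, -75.00000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for rows whose coordinates are present
# and fall inside the valid latitude/longitude ranges.
has_valid_coords <- function(df) {
  !is.na(df$LAT) & !is.na(df$LON) &
    df$LAT >= -90 & df$LAT <= 90 &
    df$LON >= -180 & df$LON <= 180
}

valid <- has_valid_coords(schools)
mappable <- schools[valid, ]  # rows that are safe to pass to leaflet()
print(mappable)
```

Under these assumptions, only the made-up fourth row is dropped, and `mappable` could be passed to `leaflet()` in place of the full data frame.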
College websites differ widely from one another in HTML structure and page layout. For example, for some colleges, the course descriptions page contains links to PDFs rather than the descriptions themselves.
Figure 1: Course description page for Angelo State University
For other colleges, the descriptions appear on the course description page itself instead of in a PDF, as shown in Figure 2.
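The difference between the two layouts can be sketched with a small check for PDF links. This is an illustration only, not the team's code: the two page fragments are made up, `find_pdf_links` is a hypothetical helper, and the regular expression is a toy that assumes double-quoted `href` attributes. A real scraper would more likely use an HTML parser.

```r
# Hypothetical helper: return the href values in an HTML string
# that point at PDF files.
find_pdf_links <- function(html) {
  # Pull out every href="..." attribute (double-quoted only).
  hrefs <- regmatches(html, gregexpr('href="[^"]+"', html))[[1]]
  # Strip the surrounding href=" and closing quote.
  hrefs <- sub('^href="', "", sub('"$', "", hrefs))
  # Keep only links that end in .pdf (case-insensitive).
  hrefs[grepl("\\.pdf$", hrefs, ignore.case = TRUE)]
}

# Made-up fragments standing in for the two layouts.
page_with_pdfs <- '<ul><li><a href="/catalog/graduate-2019.pdf">Graduate catalog</a></li></ul>'
page_inline <- '<div><h3>ACC 6301</h3><p>Advanced auditing topics.</p></div>'

print(find_pdf_links(page_with_pdfs))
print(find_pdf_links(page_inline))
```

A page that yields PDF links would need a PDF text-extraction step, while a page that yields none would need its descriptions parsed from the HTML itself, so a single scraper would have to handle at least two separate code paths.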
Another plan was to bypass the websites and parse the course catalog PDFs for all of the colleges with graduate accounting programs. However, we ran into a similar problem: the PDFs also differ widely from one another in layout, as a comparison of Figure 3 and Figure 4 shows.
Figure 3: A snippet of the graduate accounting course descriptions for Angelo State University taken from the 2019-2020 graduate catalogue
Figure 4: A snippet of the graduate accounting course descriptions for Bay Path University taken from the 2019-2020 graduate catalogue
Given these obstacles to web scraping college course descriptions, the team decided that it would be best to use the data collected by Dr. Foy’s students, which was manually copied and pasted.
By mining the words in course descriptions and joining them with vectors of desired skills, we successfully built a recommender system with a few key predictors. We extended the model by adding geocoding and mapping features to perform basic cluster analysis. From a visual overview centered on the eastern U.S. coastline, we can observe that the schools cluster predominantly in the Northeast: the NYC, Boston and Philadelphia metro areas. North Carolina and Florida also show significant clustering. Constructing a database of courses by web scraping would probably not be as efficient as copying and pasting the data directly from the schools’ websites.
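The visual clustering can also be checked numerically by counting schools per metro area. The sketch below uses base R on a handful of rows copied from the NMCBSA column of the printed output above. It is an illustration under that small sample, not the team's analysis, and the `row` column simply records which printed row each value came from.

```r
# A few rows copied from the NMCBSA column of the printed output.
schools <- data.frame(
  row = c(28, 30, 58, 59, 60, 111, 115, 67),
  NMCBSA = c(
    "Boston-Cambridge-Newton, MA-NH",
    "Boston-Cambridge-Newton, MA-NH",
    "New York-Newark-Jersey City, NY-NJ-PA",
    "New York-Newark-Jersey City, NY-NJ-PA",
    "New York-Newark-Jersey City, NY-NJ-PA",
    "Philadelphia-Camden-Wilmington, PA-NJ-DE-MD",
    "Philadelphia-Camden-Wilmington, PA-NJ-DE-MD",
    "Ithaca, NY"
  ),
  stringsAsFactors = FALSE
)

# Count schools per metro area and list the largest groups first.
metro_counts <- sort(table(schools$NMCBSA), decreasing = TRUE)
print(metro_counts)
```

Applied to the full `temp_schools` data frame, the same `table()` call would give a count for each metro area that could be compared against what the map suggests.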
We can further extend the model by adding other factors such as tuition costs, post-graduation employment percentage and national university ranking.
Note that some colleges did not publish course descriptions, so they were penalized by the recommender system.