Overview

Prospective students face many options on where to earn a graduate accounting degree. On the east coast alone, there are thousands of universities offering a graduate accounting degree. Adding to this complexity, students must also evaluate whether each university program teaches the skills that employers are recruiting for. Some skills are technical and deal with specific topics such as SQL, Python and statistics, while others are ‘soft’ skills like teamwork and collaboration.

We approach this project in three parts. Part I constructs a recommender system. Part II creates a mapping system for the recommended schools. Part III discusses some of the issues with obtaining the data through web scraping or a copy-and-paste approach.

PART I: Recommender System Construction

We will show how to build a recommender system using tidymodels by first loading pre-processed college metadata and a list of sought-after data science skills. After classifying each program by whether it offers sufficient data science training, we create a sample set and build the recommender system.

The finished recommender system can categorize any other accounting program as having a good data science program or not.

Loading the required libraries for the overall project.

# Libraries
library(tidyverse)
library(tidytext)
# For tidymodels
library(DiceDesign)
library(tidymodels)
library(workflows)
library(tune)
library(mlbench)
library(rsample)
library(recipes)
library(parsnip)
library(yardstick)
library(tm)
# For mapping
library(sf)
library(leaflet)
library(htmltools)

Loading Collegiate and Desired Skills Data

Building on the prior work by Team Four, we load three data frames:

* Graduate accounting programs on the east coast
* Dictionary of technical skills desired by employers
* Dictionary of soft skills desired by employers

accounting_programs <- read_csv("https://github.com/cliftonleesps/607_final_project/blob/master/Acct_Curricula2.csv?raw=true", show_col_types = FALSE)
technical_skills <- read_csv("https://github.com/cliftonleesps/607_final_project/raw/master/technical_skills.csv", show_col_types = FALSE)
soft_skills <- read_csv("https://github.com/cliftonleesps/607_final_project/raw/master/soft_skills.csv", show_col_types = FALSE)
# Geocoding Schools from Kratika Patel
library(sf)
library(tidyverse)

url1 <- "https://raw.githubusercontent.com/cliftonleesps/607_final_project/master/Acct_Curricula2.csv"
AcctCurricula <- data.frame(read.csv(url1))
col <- colnames(AcctCurricula) 
col <- toupper(col)
col[1] <- "NAME"
colnames(AcctCurricula) <- col
Names <- AcctCurricula %>% select("NAME")

Names <- data.frame(NAME = unique(Names$NAME))

url2 <- "https://raw.githubusercontent.com/cliftonleesps/607_final_project/master/EDGE_GEOCODE_POSTSECSCH_2021.csv"
schools <- data.frame(read.csv(url2))
col <- colnames(schools)
col[1] <- "UNITID"
colnames(schools) <- col
#head(schools)


SchoolGeo <- schools %>%
  filter(NAME %in% Names$NAME)

#Correct typos and clean names of Universities not detected in schools dataframe
Names %>%
  filter(!(NAME %in% schools$NAME))
##                                                     NAME
## 1                             Fitchberg State University
## 2                          Pennsylvania State University
## 3                            Saint Joseph's University\n
## 4                          Strayer University - Delaware
## 5 Strayer University-North Carolina (online, for-profit)
## 6                 University of Massachussetts - Amherst
## 7               University of Massachussetts - Dartmouth
## 8               University of North Carolina Chapel Hill
Names$NAME[Names$NAME == "Fitchberg State University"] <- "Fitchburg State University"
Names$NAME[Names$NAME == "Saint Joseph's University\n"] <- "Saint Joseph's University"
Names$NAME[Names$NAME == "Pennsylvania State University"] <- "Pennsylvania State University-Penn State Harrisburg"
Names$NAME[Names$NAME == "Strayer University - Delaware"] <- "Strayer University-Delaware"
Names$NAME[Names$NAME == "Strayer University-North Carolina (online, for-profit)"] <- "Strayer University-North Carolina"
Names$NAME[Names$NAME == "University of Massachussetts - Amherst"] <- "University of Massachusetts-Amherst"
Names$NAME[Names$NAME == "University of Massachussetts - Dartmouth"] <- "University of Massachusetts-Dartmouth"
Names$NAME[Names$NAME == "University of North Carolina Chapel Hill"] <- "University of North Carolina at Chapel Hill"

SchoolGeo <- schools %>%
  filter(NAME %in% Names$NAME)

s <- schools %>% filter(NAME== "University of Connecticut")
SchoolGeo <- add_row(SchoolGeo, s)

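# Restore the original (uncorrected) spellings so that NAME still matches the
# school names in accounting_programs when the two tables are joined later.
# NOTE: these hard-coded row indices depend on the exact row order of SchoolGeo.
# In the glimpse() output further down, row 1 carries UNITID 129020
# (352 Mansfield Road, Storrs, CT), so check each index against the data
# before reusing this block.
SchoolGeo[39,2] <- "Fitchberg State University"
SchoolGeo[130,2] <- "Saint Joseph's University\n"
SchoolGeo[1,2] <- "Pennsylvania State University"
SchoolGeo[142,2] <- "Strayer University - Delaware"
SchoolGeo[143,2] <- "Strayer University-North Carolina (online, for-profit)"
SchoolGeo[41,2] <- "University of Massachussetts - Amherst"
SchoolGeo[45,2] <- "University of Massachussetts - Dartmouth"
SchoolGeo[113,2] <- "University of North Carolina Chapel Hill"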


#Remove Duplicate row for Pennsylvania State University-Penn State Harrisburg
SchoolGeo <- SchoolGeo[!(SchoolGeo$UNITID == 49576722),]
#glimpse(SchoolGeo)
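
Assigning names by row number only works while SchoolGeo keeps its current row order. One alternative, sketched here on made-up rows, is to keep the spelling corrections in a named vector and apply them by name. The `corrections` vector, the `fix_names` helper, and the `name_key` table are hypothetical names used for illustration, not part of the original project.

```r
# Hypothetical lookup: name as spelled in the curricula file ->
# name as spelled in the geocode file (pairs taken from the corrections above)
corrections <- c(
  "Fitchberg State University"    = "Fitchburg State University",
  "Strayer University - Delaware" = "Strayer University-Delaware"
)

# Replace every name that has an entry in `lookup`; leave the rest untouched
fix_names <- function(x, lookup) {
  hit <- x %in% names(lookup)
  x[hit] <- unname(lookup[x[hit]])
  x
}

# Toy stand-in for the school names found in the curricula file
curricula_names <- c("Fitchberg State University", "Boston College",
                     "Strayer University - Delaware")

# Keep the original spelling next to the corrected one, so geocode rows can
# be joined on `geo_name` and still be reported under `name`
name_key <- data.frame(
  name = curricula_names,
  geo_name = fix_names(curricula_names, corrections),
  stringsAsFactors = FALSE
)
print(name_key)
```

With a key table like this, the geocode data could be joined on `geo_name` without renaming any rows back by position.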



Tidying Collegiate Data And Creating Categorizations

The collegiate accounting data requires a little tidying in its raw form. Each row is an observation of a single course and its curriculum description. We combine the descriptions for each school, split them into words, and join those words against the dictionary of technical skills and the dictionary of soft skills. If there are any matches, the corresponding flag (match_technical_skills or match_soft_skills) is changed from zero to one.

# initialize some counters
current_school <- accounting_programs$School[1]
description <- accounting_programs$Description[1]

# temp_schools is where we keep our tidy data
temp_schools <- tibble(
  name = "",
  description = "",
  match_technical_skills = 0, 
  match_soft_skills = 0
)

# Iterate through the accounting programs.
# A college appears on more than one row, so we aggregate all of its course
# descriptions, grouping by college name.
for (row in 2:nrow(accounting_programs)) {
  
  # if we detect a different school name, then save the data to the tibble
  if (current_school != accounting_programs$School[row]) {
      temp_schools <- temp_schools %>%
        add_row( 
                name = current_school, 
                # save only the finished school's text; the current row
                # already belongs to the next school
                description = description,
                match_technical_skills = 0, 
                match_soft_skills = 0
        )
      description <- accounting_programs$Description[row]
      current_school <- accounting_programs$School[row]
  } else if (!is.na(accounting_programs$Description[row])) {
    # Keep appending the description, separated by a space so that words
    # from adjacent descriptions do not run together
    description <- paste(description, accounting_programs$Description[row])
  }
}


# Add the last school to the tibble
temp_schools <- temp_schools %>%
  add_row( 
    name = current_school, 
    # the loop has already included the final row's description
    description = description,
    match_technical_skills = 0, 
    match_soft_skills = 0
  )

# delete the first row, the empty placeholder used to initialize the tibble
nrow(temp_schools)
## [1] 151
temp_schools <- temp_schools[-1,]
nrow(temp_schools)
## [1] 150
# Function to remove duplicate words to be used in the next for loop

rem_dup_word <- function(x){
  x <- tolower(x)
  x <- gsub("-", " ", x)
  x <- gsub("/", " ", x)
  x <- gsub("[[:punct:]]", "", x)
  x <- gsub("[[:digit:]]", "", x)
  x <- gsub("this course", "", x)
  x <- gsub("topics include", "", x)
  # split into words, drop stop words, and keep each remaining word only once
  words <- tibble(word = unlist(strsplit(x, split = " ", perl = TRUE))) %>%
    anti_join(stop_words, by = "word") %>%
    pull(word)
  return(paste(unique(trimws(words)), collapse = " "))
}
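
To see what this cleaning step does to one description, here is a simplified base R sketch. It is an approximation, not the function above: `toy_stop_words` is a small made-up list standing in for tidytext's `stop_words`, `clean_words` is a hypothetical helper, and the removal of phrases such as "this course" is left out.

```r
# Stand-in stop-word list; the original code uses tidytext's `stop_words`
toy_stop_words <- c("this", "to", "and", "the", "of", "in")

clean_words <- function(x, stop_list) {
  x <- tolower(x)
  x <- gsub("[-/]", " ", x)                     # split hyphenated/slashed terms
  x <- gsub("[[:punct:]]|[[:digit:]]", "", x)   # drop punctuation and digits
  words <- unlist(strsplit(x, " +"))
  # drop empty strings and stop words, then keep each word once
  words <- words[nzchar(words) & !(words %in% stop_list)]
  unique(words)
}

cleaned <- clean_words(
  "Data-driven auditing: SQL/Python in Accounting 101. SQL labs.",
  toy_stop_words
)
print(cleaned)
# "data" "driven" "auditing" "sql" "python" "accounting" "labs"
```

Splitting on "-" and "/" before removing punctuation matters: otherwise "SQL/Python" would collapse into the single token "sqlpython" and match neither skill.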

# now iterate through the schools and split the descriptions

for (count in 1:nrow(temp_schools)) {
  # get the current row
  ts <- temp_schools[count,]
  
  # Obtain the school name
  school_name <- ts$name
  
  # Use the rem_dup_word function on the description column of ts, which
  # contains the combined course descriptions
  description_string <- rem_dup_word(ts$description)
  
  # Make each word in the `description_string` character vector a row element
  # in a `school_descriptions` dataframe.
  school_descriptions <- data.frame(as.list(str_split(description_string, " ")))
  
  # Change the column name in the `school_descriptions` dataframe
  colnames(school_descriptions) <- c("word")

  # now join with the technical skills
  # If any words match the dictionary of technical skills then we
  # set match_technical_skills = 1
  technical_skill_match <- inner_join(technical_skills, school_descriptions,by="word")
  if (nrow(technical_skill_match) > 0) {
    #print (school_name)
    temp_schools$match_technical_skills[count] <- 1
  }
  
  # now join with the soft skills
  # If any words match the dictionary of soft skills then we
  # set match_soft_skills = 1
  soft_skills_match <- inner_join(soft_skills, school_descriptions,by="word")
  if (nrow(soft_skills_match) > 0) {
    #print (school_name)
    temp_schools$match_soft_skills[count] <- 1
  }
  
  
}

# create a new column school_score = match_technical_skills + match_soft_skills
temp_schools <- temp_schools %>% mutate (school_score = match_technical_skills + match_soft_skills)

# create another new column good_data_science_program = [YES,NO] 
temp_schools <- temp_schools %>% mutate (good_data_science_program = ifelse( school_score >= 2, "YES", "NO"))



# drop the description column since it takes a lot of
# memory
temp_schools <- subset(temp_schools, select = -c(2))
ncol(temp_schools)
## [1] 5
# now join with SchoolGeo so we get the latitude and longitude
temp_schools <- right_join(SchoolGeo, temp_schools, by = c("NAME"= "name"))
#temp_schools <- inner_join(SchoolGeo, temp_schools, by = c("NAME"= "name"))
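
The row-by-row aggregation and matching in this section can also be written as a grouped summary. The following is a minimal base R sketch on made-up data, not the project's code: `courses`, `technical_words`, and `has_skill` are hypothetical stand-ins for the curricula table, the skills dictionary, and the matching step.

```r
# Toy stand-ins for the course table and the technical skills dictionary
courses <- data.frame(
  School = c("A College", "A College", "B University"),
  Description = c("Intro to SQL and databases", NA, "Financial reporting"),
  stringsAsFactors = FALSE
)
technical_words <- c("sql", "python", "statistics")

# Collapse all descriptions of a school into one string. The formula
# interface of aggregate() drops rows with a missing Description, so a
# school with no descriptions at all would be left out.
per_school <- aggregate(
  Description ~ School, data = courses,
  FUN = function(d) paste(d, collapse = " ")
)

# Return 1 when any word of the text appears in the dictionary, else 0
has_skill <- function(text, dictionary) {
  words <- unlist(strsplit(tolower(gsub("[[:punct:]]", "", text)), "\\s+"))
  as.integer(any(words %in% dictionary))
}

per_school$match_technical_skills <- vapply(
  per_school$Description, has_skill, integer(1),
  dictionary = technical_words, USE.NAMES = FALSE
)
print(per_school)
```

Grouping by school name removes the need to track the current school by hand, which is where loop-based aggregation tends to pick up off-by-one mistakes.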

Constructing A Recommender System With Tidymodels

# Start building the model
set.seed(4393003)
sample_size <- 100



glimpse(temp_schools)
## Rows: 150
## Columns: 27
## $ UNITID                    <int> 129020, 129215, 129242, 130253, 130943, 1309~
## $ NAME                      <chr> "Pennsylvania State University", "Eastern Co~
## $ STREET                    <chr> "352 Mansfield Road", "83 Windham St", "1073~
## $ CITY                      <chr> "Storrs", "Willimantic", "Fairfield", "Fairf~
## $ STATE                     <chr> "CT", "CT", "CT", "CT", "DE", "DE", "DE", "F~
## $ ZIP                       <chr> "06269", "06226", "06824-5195", "06825-1000"~
## $ STFIP                     <chr> "09", "09", "09", "09", "10", "10", "10", "1~
## $ CNTY                      <chr> "09013", "09015", "09001", "09001", "10003",~
## $ NMCNTY                    <chr> "Tolland County", "Windham County", "Fairfie~
## $ LOCALE                    <chr> "21", "31", "21", "21", "21", "21", "21", "2~
## $ LAT                       <dbl> 41.80910, 41.72167, 41.15767, 41.22089, 39.6~
## $ LON                       <dbl> -72.24995, -72.21875, -73.25590, -73.24333, ~
## $ CBSA                      <chr> "25540", "49340", "14860", "14860", "37980",~
## $ NMCBSA                    <chr> "Hartford-East Hartford-Middletown, CT", "Wo~
## $ CBSATYPE                  <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,~
## $ CSA                       <chr> "278", "148", "408", "408", "428", "428", "4~
## $ NMCSA                     <chr> "Hartford-East Hartford, CT", "Boston-Worces~
## $ NECTA                     <chr> "73450", "79300", "71950", "71950", "N", "N"~
## $ NMNECTA                   <chr> "Hartford-East Hartford-Middletown, CT", "Wi~
## $ CD                        <chr> "0902", "0902", "0904", "0904", "1000", "100~
## $ SLDL                      <chr> "09054", "09049", "09133", "09134", "10025",~
## $ SLDU                      <chr> "09029", "09029", "09028", "09028", "10008",~
## $ SCHOOLYEAR                <chr> "2020-2021", "2020-2021", "2020-2021", "2020~
## $ match_technical_skills    <dbl> 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,~
## $ match_soft_skills         <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1,~
## $ school_score              <dbl> 2, 1, 2, 2, 2, 2, 2, 2, 0, 1, 2, 2, 2, 2, 2,~
## $ good_data_science_program <chr> "YES", "NO", "YES", "YES", "YES", "YES", "YE~
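# Note: sample() treats a data frame as a list of columns, so this call draws
# two random columns (CITY and school_score in this run) for all 150 rows.
# To draw two random schools, use slice_sample(temp_schools, n = 2) instead.
random_schools <- sample(temp_schools, size= 2,  replace = FALSE)
random_schools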
##                  CITY school_score
## 1              Storrs            2
## 2         Willimantic            1
## 3           Fairfield            2
## 4           Fairfield            2
## 5              Newark            2
## 6          Wilmington            2
## 7          New Castle            2
## 8               Miami            2
## 9             Orlando            0
## 10         Boca Raton            1
## 11              Miami            2
## 12           Lakeland            2
## 13        Tallahassee            2
## 14        Gainesville            2
## 15       Coral Gables            2
## 16       Jacksonville            2
## 17              Tampa            2
## 18              Tampa            2
## 19        Babson Park            2
## 20          Pensacola            2
## 21              Orono            2
## 22           Standish            2
## 23           Portland            2
## 24         Waterville            2
## 25          Baltimore            2
## 26          Baltimore            0
## 27            Adelphi            2
## 28       College Park            2
## 29          Baltimore            2
## 30             Towson            2
## 31          Worcester            2
## 32          Wellesley            2
## 33         Longmeadow            2
## 34            Waltham            2
## 35      Chestnut Hill            2
## 36        Bridgewater            2
## 37          Worcester            2
## 38             Milton            2
## 39          Fitchburg            2
## 40             Lowell            2
## 41            Amherst            2
## 42             Boston            2
## 43             Dudley            2
## 44             Boston            2
## 45    North Dartmouth            2
## 46             Boston            2
## 47        Springfield            2
## 48          Westfield            2
## 49         Bloomfield            2
## 50           Caldwell            2
## 51            Teaneck            2
## 52            Madison            2
## 53        Jersey City            2
## 54              Union            2
## 55   West Long Branch            2
## 56          Montclair            2
## 57             Mahwah            2
## 58             Newark            2
## 59        Jersey City            2
## 60       South Orange            2
## 61        Garden City            2
## 62             Alfred            1
## 63           New York            2
## 64           Brooklyn            2
## 65      Staten Island            2
## 66           New York            2
## 67              Bronx            2
## 68             Queens            2
## 69              Bronx            2
## 70          Hempstead            2
## 71       New Rochelle            2
## 72             Ithaca            2
## 73           Syracuse            2
## 74         Brookville            2
## 75          Riverdale            1
## 76       Poughkeepsie            2
## 77        Dobbs Ferry            2
## 78   Rockville Centre            2
## 79              Bronx            2
## 80           Newburgh            2
## 81          Rochester            2
## 82           New York            2
## 83       Old Westbury            2
## 84           New York            2
## 85          Rochester            2
## 86  Saint Bonaventure            0
## 87   Brooklyn Heights            2
## 88             Albany            2
## 89        Loudonville            2
## 90           Brooklyn            2
## 91          Patchogue            2
## 92          Rochester            2
## 93             Queens            2
## 94             Albany            2
## 95             Vestal            2
## 96        Stony Brook            2
## 97              Utica            0
## 98            Geneseo            2
## 99          New Paltz            2
## 100            Oswego            2
## 101      Old Westbury            2
## 102          Syracuse            2
## 103          New York            2
## 104            Albany            2
## 105             Utica            2
## 106     Staten Island            2
## 107          New York            2
## 108       Buies Creek            2
## 109        Greenville            2
## 110              Elon            2
## 111   Boiling Springs            2
## 112        Greensboro            2
## 113       Chapel Hill            2
## 114           Raleigh            2
## 115           Wingate            2
## 116         Cullowhee            2
## 117        Bloomsburg            2
## 118            Radnor            2
## 119      Philadelphia            2
## 120     Elizabethtown            2
## 121      Philadelphia            0
## 122          La Plume            2
## 123      Wilkes-Barre            1
## 124      Philadelphia            1
## 125         Bethlehem            1
## 126             Aston            2
## 127      Philadelphia            2
## 128         Langhorne            1
## 129      Philadelphia            2
## 130          Scranton            0
## 131      Philadelphia            2
## 132         Villanova            0
## 133           Chester            2
## 134              York            2
## 135        Burlington            2
## 136           Fairfax            2
## 137        Fort Myers            2
## 138        Fort Myers            2
## 139           Trevose            1
## 140          Danville            2
## 141        Wilmington            2
## 142       Morrisville            2
## 143           Fairfax            2
## 144         Melbourne            2
## 145          New York            2
## 146           Miramar            2
## 147         Charlotte            2
## 148    Ft. Washington            1
## 149         Arlington            2
## 150            Storrs            0
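
The listing above has two columns and 150 rows because `sample()` treats a data frame as a list of columns. A small base R sketch on a made-up data frame (`df` is hypothetical) shows the difference between sampling columns and sampling rows:

```r
set.seed(1)
df <- data.frame(
  id = 1:5,
  city = letters[1:5],
  score = c(2, 1, 2, 0, 2)
)

# sample() picks 2 of the 3 columns and keeps all 5 rows
by_column <- sample(df, size = 2)
dim(by_column)

# Indexing with sampled row numbers keeps every column for 2 random rows
by_row <- df[sample(nrow(df), size = 2), ]
dim(by_row)
```

dplyr's `slice_sample(df, n = 2)` is the tidyverse equivalent of the row-wise version.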
# Randomly select schools
schools_bad <- temp_schools %>% filter(good_data_science_program == "NO")
schools_good <- temp_schools %>% filter(good_data_science_program == "YES")


# Take a random sample of the good schools and keep every school labelled "NO",
# so that the 100-school sample contains both classes.
# After the join the school name column is NAME (not name), so the rows are
# combined with bind_rows() instead of being rebuilt field by field.
sample_schools <- schools_good[sample(nrow(schools_good), sample_size - nrow(schools_bad)), ]
sample_schools <- bind_rows(sample_schools, schools_bad)

# now we have our samples
school_split <- initial_split(sample_schools, 
                             prop = 3/4)
school_split
## <Analysis/Assess/Total>
## <75/25/100>
school_train <- training(school_split)
school_test <- testing(school_split)


school_cv <- vfold_cv(school_train)
school_cv
## #  10-fold cross-validation 
## # A tibble: 10 x 2
##    splits         id    
##    <list>         <chr> 
##  1 <split [67/8]> Fold01
##  2 <split [67/8]> Fold02
##  3 <split [67/8]> Fold03
##  4 <split [67/8]> Fold04
##  5 <split [67/8]> Fold05
##  6 <split [68/7]> Fold06
##  7 <split [68/7]> Fold07
##  8 <split [68/7]> Fold08
##  9 <split [68/7]> Fold09
## 10 <split [68/7]> Fold10
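
With relatively few "NO" schools in the sample, an unstratified split can leave a fold containing only one class. The sketch below assigns folds within each class in base R. The class counts are made up and `stratified_folds` is a hypothetical helper; in rsample, `vfold_cv()` accepts a `strata` argument for the same purpose.

```r
# Assign fold ids 1..v within each class so every fold gets a share of each
stratified_folds <- function(labels, v) {
  folds <- integer(length(labels))
  for (cls in unique(labels)) {
    idx <- which(labels == cls)
    # shuffle the members of this class, then deal them out round-robin
    idx <- idx[sample.int(length(idx))]
    folds[idx] <- rep_len(seq_len(v), length(idx))
  }
  folds
}

set.seed(4393003)
labels <- c(rep("YES", 62), rep("NO", 13))   # made-up class counts
folds <- stratified_folds(labels, v = 10)

# Rows are classes, columns are folds
fold_table <- table(labels, folds)
print(fold_table)
```

Because the 13 "NO" rows are dealt out across 10 folds, every fold holds at least one of them, which is what metrics such as ROC AUC need in order to be computed per fold.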
# define the recipe
school_recipe <- 
  # which consists of the formula (outcome ~ predictors)
  recipe(good_data_science_program ~ match_technical_skills + match_soft_skills + school_score, 
         data = sample_schools) %>%
  step_normalize(all_numeric()) %>%
  step_impute_knn(all_predictors())

school_recipe
## Recipe
## 
## Inputs:
## 
##       role #variables
##    outcome          1
##  predictor          3
## 
## Operations:
## 
## Centering and scaling for all_numeric()
## K-nearest neighbor imputation for all_predictors()
school_train_preprocessed <- school_recipe %>%
  # apply the recipe to the training data
  prep(school_train) %>%
  # extract the pre-processed training dataset
  juice()
school_train_preprocessed
## # A tibble: 75 x 4
##    match_technical_skills match_soft_skills school_score good_data_science_prog~
##                     <dbl>             <dbl>        <dbl> <fct>                  
##  1                  0.367             0.412        0.444 YES                    
##  2                  0.367             0.412        0.444 YES                    
##  3                  0.367            -2.40        -1.22  NO                     
##  4                  0.367             0.412        0.444 YES                    
##  5                  0.367             0.412        0.444 YES                    
##  6                  0.367             0.412        0.444 YES                    
##  7                  0.367             0.412        0.444 YES                    
##  8                  0.367             0.412        0.444 YES                    
##  9                  0.367             0.412        0.444 YES                    
## 10                  0.367             0.412        0.444 YES                    
## # ... with 65 more rows
rf_model <- 
  # specify that the model is a random forest
  rand_forest() %>%
  # specify that the `mtry` parameter needs to be tuned
  set_args(mtry = tune()) %>%
  # select the engine/package that underlies the model
  set_engine("ranger", importance = "impurity") %>%
  # choose either the continuous regression or binary classification mode
  set_mode("classification") 


# set the workflow
rf_workflow <- workflow() %>%
  # add the recipe
  add_recipe(school_recipe) %>%
  # add the model
  add_model(rf_model)
rf_workflow
## == Workflow ====================================================================
## Preprocessor: Recipe
## Model: rand_forest()
## 
## -- Preprocessor ----------------------------------------------------------------
## 2 Recipe Steps
## 
## * step_normalize()
## * step_impute_knn()
## 
## -- Model -----------------------------------------------------------------------
## Random Forest Model Specification (classification)
## 
## Main Arguments:
##   mtry = tune()
## 
## Engine-Specific Arguments:
##   importance = impurity
## 
## Computational engine: ranger
rf_grid <- expand.grid(mtry = c(2,3))
rf_tune_results <- rf_workflow %>%
  tune_grid(resamples = school_cv, #CV object
            grid = rf_grid, # grid of values to try
            metrics = metric_set(accuracy, roc_auc) # metrics we care about
  )
## ! Fold05: internal: No event observations were detected in `truth` with event leve...
## ! Fold10: internal: No event observations were detected in `truth` with event leve...
rf_tune_results %>%
  collect_metrics()
## # A tibble: 4 x 7
##    mtry .metric  .estimator  mean     n std_err .config             
##   <dbl> <chr>    <chr>      <dbl> <int>   <dbl> <chr>               
## 1     2 accuracy binary         1    10       0 Preprocessor1_Model1
## 2     2 roc_auc  binary         1     8       0 Preprocessor1_Model1
## 3     3 accuracy binary         1    10       0 Preprocessor1_Model2
## 4     3 roc_auc  binary         1     8       0 Preprocessor1_Model2
param_final <- rf_tune_results %>%
  select_best(metric = "accuracy")
#param_final

rf_workflow <- rf_workflow %>%
  finalize_workflow(param_final)


rf_fit <- rf_workflow %>%
  # fit on the training set and evaluate on test set
  last_fit(school_split)
#rf_fit


test_performance <- rf_fit %>% collect_metrics()
#test_performance

test_predictions <- rf_fit %>% collect_predictions()
#test_predictions

final_model <- fit(rf_workflow, sample_schools)
#final_model
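
A note on reading the metrics: `good_data_science_program` was derived from `school_score`, which is itself the sum of the two match flags, and all three are used as predictors. Perfect accuracy is therefore expected and does not by itself show that the model generalizes. A minimal check on made-up rows, using a plain rule in place of the random forest (`rule_based_label` is a hypothetical helper):

```r
# The same rule that was used to create the label in the first place
rule_based_label <- function(match_technical_skills, match_soft_skills) {
  ifelse(match_technical_skills + match_soft_skills >= 2, "YES", "NO")
}

# All four possible combinations of the two flags
toy <- data.frame(
  match_technical_skills = c(1, 0, 1, 0),
  match_soft_skills      = c(1, 1, 0, 0)
)
toy$school_score <- toy$match_technical_skills + toy$match_soft_skills
toy$good_data_science_program <- ifelse(toy$school_score >= 2, "YES", "NO")

# The rule reproduces the label for every combination
toy$rule <- rule_based_label(toy$match_technical_skills, toy$match_soft_skills)
rule_accuracy <- mean(toy$rule == toy$good_data_science_program)
print(rule_accuracy)
```

A harder test of the approach would predict the label from features that were not used to construct it, for example word counts from the course descriptions.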

Testing The Recommender System on Example College Programs

# predict fictitious colleges
test_bad_college <- tibble(
  name = "Test Bad college",
  match_technical_skills = 0,
  match_soft_skills = 1, 
  school_score = 1
)

test_good_college <- tibble(
  name = "Test Good college",
  match_technical_skills = 1,
  match_soft_skills = 1, 
  school_score = 2
)

# Predict will output if the college has a good data science program
recommendation <- predict(final_model, new_data = test_bad_college)
print(paste0("For a college without a data science program the recommendation is ", recommendation$.pred_class))
## [1] "For a college without a data science program the recommendation is NO"
recommendation <- predict(final_model, new_data = test_good_college)
print(paste0("For a college with a data science program the recommendation is ", recommendation$.pred_class))
## [1] "For a college with a data science program the recommendation is YES"
# Dataframe of recommended schools.
temp_schools %>% filter(good_data_science_program == "YES")
##     UNITID                                                   NAME
## 1   129020                          Pennsylvania State University
## 2   129242                                   Fairfield University
## 3   130253                                Sacred Heart University
## 4   130943                                 University of Delaware
## 5   130989                                  Goldey-Beacom College
## 6   131113                                  Wilmington University
## 7   132471                                       Barry University
## 8   133951                       Florida International University
## 9   134079                               Florida Southern College
## 10  134097                               Florida State University
## 11  134130                                  University of Florida
## 12  135726                                    University of Miami
## 13  136172                            University of North Florida
## 14  137351                            University of South Florida
## 15  137847                                The University of Tampa
## 16  138293                        Webber International University
## 17  138354                         The University of West Florida
## 18  161253                                    University of Maine
## 19  161518                        Saint Joseph's College of Maine
## 20  161554                           University of Southern Maine
## 21  161563                                         Thomas College
## 22  161873                                University of Baltimore
## 23  163204                   University of Maryland Global Campus
## 24  163286                    University of Maryland-College Park
## 25  163453                                Morgan State University
## 26  164076                                      Towson University
## 27  164562                                  Assumption University
## 28  164580                                         Babson College
## 29  164632                                    Bay Path University
## 30  164739                                     Bentley University
## 31  164924                                         Boston College
## 32  165024                           Bridgewater State University
## 33  165334                                       Clark University
## 34  165529                                          Curry College
## 35  165820                             Fitchberg State University
## 36  166513                     University of Massachusetts-Lowell
## 37  166629                 University of Massachussetts - Amherst
## 38  166638                     University of Massachusetts-Boston
## 39  167260                                        Nichols College
## 40  167358                                Northeastern University
## 41  167987               University of Massachussetts - Dartmouth
## 42  168005                                     Suffolk University
## 43  168254                         Western New England University
## 44  168263                             Westfield State University
## 45  183822                                     Bloomfield College
## 46  183910                                    Caldwell University
## 47  184603     Fairleigh Dickinson University-Metropolitan Campus
## 48  184694          Fairleigh Dickinson University-Florham Campus
## 49  185129                             New Jersey City University
## 50  185262                                        Kean University
## 51  185572                                    Monmouth University
## 52  185590                             Montclair State University
## 53  186201                           Ramapo College of New Jersey
## 54  186399                              Rutgers University-Newark
## 55  186432                               Saint Peter's University
## 56  186584                                  Seton Hall University
## 57  188429                                     Adelphi University
## 58  190512                          CUNY Bernard M Baruch College
## 59  190549                                  CUNY Brooklyn College
## 60  190558                          College of Staten Island CUNY
## 61  190594                                    CUNY Hunter College
## 62  190637                                    CUNY Lehman College
## 63  190664                                    CUNY Queens College
## 64  191241                                     Fordham University
## 65  191649                                     Hofstra University
## 66  191931                                           Iona College
## 67  191968                                         Ithaca College
## 68  192323                                       Le Moyne College
## 69  192448                                 Long Island University
## 70  192819                                         Marist College
## 71  193016                                          Mercy College
## 72  193292                                         Molloy College
## 73  193308                                         Monroe College
## 74  193353                               Mount Saint Mary College
## 75  193584                                       Nazareth College
## 76  193900                                    New York University
## 77  194091                       New York Institute of Technology
## 78  194310                                        Pace University
## 79  195003                      Rochester Institute of Technology
## 80  195173                                     St Francis College
## 81  195234                              The College of Saint Rose
## 82  195474                                          Siena College
## 83  195544                          St. Joseph's College-New York
## 84  195562                       St. Joseph's College-Long Island
## 85  195720                              Saint John Fisher College
## 86  195809                         St. John's University-New York
## 87  196060                                         SUNY at Albany
## 88  196079                                  Binghamton University
## 89  196097                                 Stony Brook University
## 90  196167                                SUNY College at Geneseo
## 91  196176              State University of New York at New Paltz
## 92  196194                                 SUNY College at Oswego
## 93  196237                           SUNY College at Old Westbury
## 94  196413                                    Syracuse University
## 95  196592                                          Touro College
## 96  196680                                      Excelsior College
## 97  197045                                          Utica College
## 98  197197                                         Wagner College
## 99  197708                                     Yeshiva University
## 100 198136                                    Campbell University
## 101 198464                               East Carolina University
## 102 198516                                        Elon University
## 103 198561                                Gardner-Webb University
## 104 199102                  North Carolina A & T State University
## 105 199120               University of North Carolina Chapel Hill
## 106 199193             North Carolina State University at Raleigh
## 107 199962                                     Wingate University
## 108 200004                            Western Carolina University
## 109 211158                  Bloomsburg University of Pennsylvania
## 110 211352                                     Cabrini University
## 111 212054                                      Drexel University
## 112 212197                                  Elizabethtown College
## 113 213303                                       Keystone College
## 114 214272                                     Neumann University
## 115 215062                             University of Pennsylvania
## 116 215770                            Saint Joseph's University\n
## 117 216339                                      Temple University
## 118 216852                                     Widener University
## 119 217059                           York College of Pennsylvania
## 120 231174                                  University of Vermont
## 121 232186                                George Mason University
## 122 367884                                      Hodges University
## 123 433660                          Florida Gulf Coast University
## 124 449931            Averett University-Non-Traditional Programs
## 125 450298                          Strayer University - Delaware
## 126 453163 Strayer University-North Carolina (online, for-profit)
## 127 460376                          Fairfax University of America
## 128 480569                 Florida Institute of Technology-Online
## 129 482413                              DeVry College of New York
## 130 482459                               DeVry University-Florida
## 131 482565                        DeVry University-North Carolina
## 132 482653                              DeVry University-Virginia
##                                                   STREET             CITY STATE
## 1                                     352 Mansfield Road           Storrs    CT
## 2                                       1073 N Benson Rd        Fairfield    CT
## 3                                          5151 Park Ave        Fairfield    CT
## 4                                      104 Hullihen Hall           Newark    DE
## 5                                      4701 Limestone Rd       Wilmington    DE
## 6                                         320 Dupont Hwy       New Castle    DE
## 7                                       11300 NE 2nd Ave            Miami    FL
## 8                                   11200 S. W. 8 Street            Miami    FL
## 9                              111 Lake Hollingsworth Dr         Lakeland    FL
## 10                                222 S. Copeland Street      Tallahassee    FL
## 11                                           Tigert Hall      Gainesville    FL
## 12                                   University of Miami     Coral Gables    FL
## 13                                           1 UNF Drive     Jacksonville    FL
## 14                                  4202 East Fowler Ave            Tampa    FL
## 15                                    401 W Kennedy Blvd            Tampa    FL
## 16                                     1201 N Scenic Hwy      Babson Park    FL
## 17                              11000 University Parkway        Pensacola    FL
## 18                                    168 College Avenue            Orono    ME
## 19                                  278 Whites Bridge Rd         Standish    ME
## 20                                        96 Falmouth St         Portland    ME
## 21                                        180 W River Rd       Waterville    ME
## 22                                Charles at Mount Royal        Baltimore    MD
## 23                             3501 University Blvd East          Adelphi    MD
## 24                                                     M     College Park    MD
## 25                            1700 East Cold Spring Lane        Baltimore    MD
## 26                                          8000 York Rd           Towson    MD
## 27                                      500 Salisbury St        Worcester    MA
## 28                                     231 Forest Street        Wellesley    MA
## 29                                 588 Longmeadow Street       Longmeadow    MA
## 30                                         175 Forest St          Waltham    MA
## 31                               140 Commonwealth Avenue    Chestnut Hill    MA
## 32                                     131 Summer Street      Bridgewater    MA
## 33                                           950 Main St        Worcester    MA
## 34                                    1071 Blue Hill Ave           Milton    MA
## 35                                          160 Pearl St        Fitchburg    MA
## 36                                      1 University Ave           Lowell    MA
## 37            374 Whitmore Building 181 Presidents Drive          Amherst    MA
## 38                               100 Morrissey Boulevard           Boston    MA
## 39                                             Center Rd           Dudley    MA
## 40                                    360 Huntington Ave           Boston    MA
## 41                                   285 Old Westport Rd  North Dartmouth    MA
## 42                                        73 Tremont St.           Boston    MA
## 43                                     1215 Wilbraham Rd      Springfield    MA
## 44                                       577 Western Ave        Westfield    MA
## 45                                       467 Franklin St       Bloomfield    NJ
## 46                                 120 Bloomfield Avenue         Caldwell    NJ
## 47                                         1000 River Rd          Teaneck    NJ
## 48                                       285 Madison Ave          Madison    NJ
## 49                                     2039 Kennedy Blvd      Jersey City    NJ
## 50                                    1000 Morris Avenue            Union    NJ
## 51                                         400 Cedar Ave West Long Branch    NJ
## 52                                       1 Normal Avenue        Montclair    NJ
## 53                                  505 Ramapo Valley Rd           Mahwah    NJ
## 54                249 University Avenue, Blumenthal Hall           Newark    NJ
## 55                                     2641 Kennedy Blvd      Jersey City    NJ
## 56                                      400 S Orange Ave     South Orange    NJ
## 57                                             South Ave      Garden City    NY
## 58  One Bernard Baruch Way (55 Lexington Ave at 24th St)         New York    NY
## 59                                      2900 Bedford Ave         Brooklyn    NY
## 60                                     2800 Victory Blvd    Staten Island    NY
## 61                                          695 Park Ave         New York    NY
## 62                            250 Bedford Park Blvd West            Bronx    NY
## 63                                    65-30 Kissena Blvd           Queens    NY
## 64                                      441 E Fordham Rd            Bronx    NY
## 65                                100 Hofstra University        Hempstead    NY
## 66                                         715 North Ave     New Rochelle    NY
## 67                                        953 Danby Road           Ithaca    NY
## 68                                  1419 Salt Springs Rd         Syracuse    NY
## 69                                     720 Northern Blvd       Brookville    NY
## 70                                         3399 North Rd     Poughkeepsie    NY
## 71                                          555 Broadway      Dobbs Ferry    NY
## 72                                    1000 Hempstead Ave Rockville Centre    NY
## 73                                    2501 Jerome Avenue            Bronx    NY
## 74                                     330 Powell Avenue         Newburgh    NY
## 75                                         4245 East Ave        Rochester    NY
## 76                                70 Washington Sq South         New York    NY
## 77                                         Northern Blvd     Old Westbury    NY
## 78                                          1 Pace Plaza         New York    NY
## 79                                    1 Lomb Memorial Dr        Rochester    NY
## 80                                     180 Remsen Street Brooklyn Heights    NY
## 81                                       432 Western Ave           Albany    NY
## 82                                         515 Loudon Rd      Loudonville    NY
## 83                                       245 Clinton Ave         Brooklyn    NY
## 84                                        155 W Roe Blvd        Patchogue    NY
## 85                                         3690 East Ave        Rochester    NY
## 86                                       8000 Utopia Pky           Queens    NY
## 87                                1400 Washington Avenue           Albany    NY
## 88                              4400 Vestal Parkway East           Vestal    NY
## 89                           310 Administration Building      Stony Brook    NY
## 90                                      1 College Circle          Geneseo    NY
## 91                                          1 Hawk Drive        New Paltz    NY
## 92                                  7060 State Route 104           Oswego    NY
## 93                                     223 Store Hill Rd     Old Westbury    NY
## 94                                 900 South Crouse Ave.         Syracuse    NY
## 95                                        500 7th Avenue         New York    NY
## 96                                        7 Columbia Cir           Albany    NY
## 97                                     1600 Burrstone Rd            Utica    NY
## 98                                         One Campus Rd    Staten Island    NY
## 99                                        500 W 185th St         New York    NY
## 100                                      143 Main Street      Buies Creek    NC
## 101                                      East 5th Street       Greenville    NC
## 102                                     100 Campus Drive             Elon    NC
## 103                                              Main St  Boiling Springs    NC
## 104                                    1601 E Market  St       Greensboro    NC
## 105                               103 South Bldg Cb 9100      Chapel Hill    NC
## 106                             2101 Hillsborough Street          Raleigh    NC
## 107                                 301 E. Wilson Street          Wingate    NC
## 108                                          Highway 107        Cullowhee    NC
## 109                                      400 E Second St       Bloomsburg    PA
## 110                               610 King of Prussia Rd           Radnor    PA
## 111                                     3141 Chestnut St     Philadelphia    PA
## 112                                      One Alpha Drive    Elizabethtown    PA
## 113                                    One College Green         La Plume    PA
## 114                                    One Neumann Drive            Aston    PA
## 115                                 34th & Spruce Street     Philadelphia    PA
## 116                                     5600 City Avenue     Philadelphia    PA
## 117                              1801 North Broad Street     Philadelphia    PA
## 118                                 One University Place          Chester    PA
## 119                                  441 Country Club Rd             York    PA
## 120                                     85 S Prospect St       Burlington    VT
## 121                                   4400 University Dr          Fairfax    VA
## 122                                   4501 Colonial Blvd       Fort Myers    FL
## 123                                    10501 Fgcu Blvd S       Fort Myers    FL
## 124                                        420 W Main St         Danville    VA
## 125                      800 North King Street Suite 101       Wilmington    DE
## 126                                        4 Copley Pkwy      Morrisville    NC
## 127                                   4401 Village Drive          Fairfax    VA
## 128                             150 West University Blvd        Melbourne    FL
## 129                          180 Madison Ave., Ste. 1200         New York    NY
## 130                                   2300 SW 145th Ave.          Miramar    FL
## 131                    2015 Ayrsley Town Blvd., Ste. 109        Charlotte    NC
## 132                             1400 Crystal Dr, Ste 120        Arlington    VA
##            ZIP STFIP  CNTY                 NMCNTY LOCALE      LAT       LON
## 1        06269    09 09013         Tolland County     21 41.80910 -72.24995
## 2   06824-5195    09 09001       Fairfield County     21 41.15767 -73.25590
## 3   06825-1000    09 09001       Fairfield County     21 41.22089 -73.24333
## 4        19716    10 10003      New Castle County     21 39.67958 -75.75282
## 5        19808    10 10003      New Castle County     21 39.74150 -75.68962
## 6        19720    10 10003      New Castle County     21 39.68230 -75.58700
## 7   33161-6695    12 12086      Miami-Dade County     21 25.87891 -80.19893
## 8        33199    12 12086      Miami-Dade County     21 25.75732 -80.37393
## 9   33801-5698    12 12105            Polk County     12 28.03244 -81.94820
## 10  32306-1037    12 12073            Leon County     12 30.44076 -84.29192
## 11       32611    12 12001         Alachua County     12 29.64629 -82.34791
## 12       33146    12 12086      Miami-Dade County     13 25.72126 -80.27866
## 13  32224-7699    12 12031           Duval County     11 30.27194 -81.50914
## 14  33620-9951    12 12057    Hillsborough County     11 28.06146 -82.41323
## 15  33606-1490    12 12057    Hillsborough County     11 27.94845 -82.46483
## 16  33827-0096    12 12105            Polk County     31 27.83878 -81.53231
## 17  32514-5750    12 12033        Escambia County     13 30.54908 -87.21851
## 18       04469    23 23019       Penobscot County     23 44.89926 -68.66933
## 19  04084-5236    23 23005      Cumberland County     41 43.82631 -70.48337
## 20       04103    23 23005      Cumberland County     13 43.66286 -70.27425
## 21  04901-5097    23 23011        Kennebec County     41 44.52491 -69.66473
## 22  21201-5720    24 24510         Baltimore city     11 39.30583 -76.61659
## 23  20783-8010    24 24033 Prince George's County     21 38.91271 -76.84758
## 24       20742    24 24033 Prince George's County     21 38.98818 -76.94472
## 25  21251-0001    24 24510         Baltimore city     11 39.34416 -76.58557
## 26  21252-0001    24 24005       Baltimore County     13 39.39362 -76.61116
## 27  01609-1296    25 25027       Worcester County     12 42.29423 -71.82899
## 28  02457-0310    25 25021         Norfolk County     21 42.29702 -71.26406
## 29       01106    25 25013         Hampden County     21 42.05509 -72.58338
## 30  02452-4705    25 25017       Middlesex County     13 42.38600 -71.22284
## 31       02467    25 25017       Middlesex County     13 42.33621 -71.16924
## 32       02325    25 25023        Plymouth County     21 41.98749 -70.97455
## 33  01610-1477    25 25027       Worcester County     12 42.24999 -71.82336
## 34  02186-2395    25 25021         Norfolk County     21 42.23806 -71.11654
## 35  01420-2697    25 25027       Worcester County     22 42.58830 -71.78967
## 36  01854-5104    25 25017       Middlesex County     21 42.65286 -71.32681
## 37       01003    25 25015       Hampshire County     21 42.38600 -72.52673
## 38  02125-3393    25 25025         Suffolk County     11 42.31288 -71.03687
## 39  01571-5000    25 25027       Worcester County     21 42.04403 -71.93028
## 40  02115-5005    25 25025         Suffolk County     11 42.33999 -71.08878
## 41  02747-2300    25 25005         Bristol County     22 41.62869 -71.00455
## 42  02108-3901    25 25025         Suffolk County     11 42.35795 -71.06092
## 43  01119-2684    25 25013         Hampden County     12 42.11502 -72.52047
## 44  01086-1630    25 25013         Hampden County     21 42.13270 -72.79650
## 45       07003    34 34013           Essex County     21 40.79510 -74.19431
## 46  07006-6195    34 34013           Essex County     21 40.83275 -74.27257
## 47       07666    34 34003          Bergen County     21 40.89721 -74.02899
## 48       07940    34 34027          Morris County     21 40.77450 -74.43212
## 49       07305    34 34017          Hudson County     11 40.70994 -74.08727
## 50       07083    34 34039           Union County     21 40.67798 -74.23350
## 51  07764-1898    34 34025        Monmouth County     21 40.28007 -74.00645
## 52  07043-1624    34 34031         Passaic County     21 40.86041 -74.19814
## 53  07430-1680    34 34003          Bergen County     21 41.08094 -74.17409
## 54       07102    34 34013           Essex County     11 40.73912 -74.17581
## 55  07306-5997    34 34017          Hudson County     11 40.72711 -74.07154
## 56  07079-2697    34 34013           Essex County     21 40.74234 -74.24603
## 57  11530-0701    36 36059          Nassau County     21 40.72144 -73.65332
## 58       10010    36 36061        New York County     11 40.74024 -73.98342
## 59       11210    36 36047           Kings County     11 40.63152 -73.94990
## 60       10314    36 36085        Richmond County     11 40.60183 -74.14849
## 61       10065    36 36061        New York County     11 40.76867 -73.96479
## 62       10468    36 36005           Bronx County     11 40.87296 -73.89538
## 63       11367    36 36081          Queens County     11 40.73518 -73.81610
## 64       10458    36 36005           Bronx County     11 40.85935 -73.88271
## 65       11549    36 36059          Nassau County     21 40.71596 -73.60078
## 66  10801-1890    36 36119     Westchester County     21 40.92572 -73.78805
## 67  14850-7002    36 36109        Tompkins County     23 42.42215 -76.49414
## 68  13214-1301    36 36067        Onondaga County     21 43.04919 -76.09043
## 69  11548-1327    36 36059          Nassau County     21 40.82071 -73.59368
## 70       12601    36 36027        Dutchess County     21 41.72094 -73.93548
## 71       10522    36 36119     Westchester County     21 41.02163 -73.87445
## 72  11571-5002    36 36059          Nassau County     21 40.68594 -73.62618
## 73       10468    36 36005           Bronx County     11 40.86446 -73.90022
## 74       12550    36 36071          Orange County     13 41.51387 -74.01265
## 75  14618-3790    36 36055          Monroe County     21 43.10158 -77.51858
## 76  10012-1091    36 36061        New York County     11 40.72945 -73.99726
## 77  11568-8000    36 36059          Nassau County     21 40.81245 -73.60780
## 78  10038-1598    36 36061        New York County     11 40.71101 -74.00472
## 79  14623-5603    36 36055          Monroe County     21 43.08419 -77.67386
## 80  11201-4305    36 36047           Kings County     11 40.69323 -73.99216
## 81  12203-1490    36 36001          Albany County     13 42.66430 -73.78666
## 82  12211-1462    36 36001          Albany County     21 42.71760 -73.75260
## 83  11205-3688    36 36047           Kings County     11 40.69042 -73.96766
## 84       11772    36 36103         Suffolk County     21 40.77593 -73.02466
## 85  14618-3597    36 36055          Monroe County     21 43.11626 -77.51306
## 86       11439    36 36081          Queens County     11 40.72252 -73.79610
## 87       12222    36 36001          Albany County     13 42.68549 -73.82466
## 88  13850-6000    36 36007          Broome County     22 42.08787 -75.96689
## 89  11794-0701    36 36103         Suffolk County     21 40.91476 -73.12046
## 90  14454-1465    36 36051      Livingston County     32 42.79664 -77.82189
## 91  12561-2443    36 36111          Ulster County     21 41.74094 -74.08219
## 92       13126    36 36075          Oswego County     32 43.45429 -76.54080
## 93  11568-0210    36 36059          Nassau County     21 40.79902 -73.57191
## 94       13244    36 36067        Onondaga County     12 43.04018 -76.13698
## 95       10018    36 36061        New York County     11 40.75320 -73.98940
## 96  12203-5159    36 36001          Albany County     13 42.70549 -73.86298
## 97  13502-4892    36 36065          Oneida County     13 43.09621 -75.27292
## 98  10301-4495    36 36085        Richmond County     11 40.61559 -74.09291
## 99  10033-3299    36 36061        New York County     11 40.85061 -73.92987
## 100      27506    37 37085         Harnett County     31 35.40915 -78.73824
## 101 27858-4353    37 37147            Pitt County     13 35.60719 -77.36829
## 102 27244-2010    37 37001        Alamance County     22 36.10415 -79.50344
## 103 28017-0997    37 37045       Cleveland County     32 35.24732 -81.66814
## 104      27411    37 37081        Guilford County     11 36.07282 -79.77338
## 105      27599    37 37135          Orange County     13 35.91177 -79.05097
## 106 27695-7001    37 37183            Wake County     11 35.78511 -78.67452
## 107 28174-0159    37 37179           Union County     21 34.98606 -80.44305
## 108 28723-9646    37 37099         Jackson County     32 35.30898 -83.18626
## 109      17815    42 42037        Columbia County     13 41.00782 -76.44784
## 110 19087-3698    42 42045        Delaware County     21 40.05636 -75.37526
## 111      19104    42 42101    Philadelphia County     11 39.95522 -75.19005
## 112 17022-2298    42 42071       Lancaster County     21 40.14924 -76.59322
## 113 18440-0200    42 42131         Wyoming County     21 41.55897 -75.77746
## 114 19014-1298    42 42045        Delaware County     21 39.87488 -75.44002
## 115 19104-6303    42 42101    Philadelphia County     11 39.95093 -75.19391
## 116 19131-1395    42 42101    Philadelphia County     11 39.99444 -75.23834
## 117 19122-6096    42 42101    Philadelphia County     11 39.98055 -75.15686
## 118 19013-5792    42 42045        Delaware County     21 39.86169 -75.35536
## 119 17403-3651    42 42133            York County     22 39.94614 -76.72798
## 120 05405-0160    50 50007      Chittenden County     13 44.47733 -73.19665
## 121 22030-4444    51 51059         Fairfax County     21 38.82998 -77.30743
## 122      33966    12 12071             Lee County     13 26.61087 -81.82142
## 123 33965-6565    12 12071             Lee County     21 26.46364 -81.77260
## 124      24541    51 51590          Danville city     32 36.57729 -79.41320
## 125      19801    10 10003      New Castle County     13 39.74336 -75.54791
## 126      27560    37 37183            Wake County     21 35.86546 -78.82245
## 127 22030-0000    51 51059         Fairfax County     21 38.84905 -77.34775
## 128 32901-6975    12 12009         Brevard County     13 28.06575 -80.62438
## 129      10016    36 36061        New York County     11 40.74775 -73.98349
## 130      33027    12 12011         Broward County     21 25.98764 -80.33984
## 131      28273    37 37119     Mecklenburg County     11 35.13766 -80.93197
## 132      22202    51 51013       Arlington County     12 38.86110 -77.04976
##      CBSA                                       NMCBSA CBSATYPE CSA
## 1   25540        Hartford-East Hartford-Middletown, CT        1 278
## 2   14860              Bridgeport-Stamford-Norwalk, CT        1 408
## 3   14860              Bridgeport-Stamford-Norwalk, CT        1 408
## 4   37980  Philadelphia-Camden-Wilmington, PA-NJ-DE-MD        1 428
## 5   37980  Philadelphia-Camden-Wilmington, PA-NJ-DE-MD        1 428
## 6   37980  Philadelphia-Camden-Wilmington, PA-NJ-DE-MD        1 428
## 7   33100      Miami-Fort Lauderdale-Pompano Beach, FL        1 370
## 8   33100      Miami-Fort Lauderdale-Pompano Beach, FL        1 370
## 9   29460                    Lakeland-Winter Haven, FL        1 422
## 10  45220                              Tallahassee, FL        1   N
## 11  23540                              Gainesville, FL        1 264
## 12  33100      Miami-Fort Lauderdale-Pompano Beach, FL        1 370
## 13  27260                             Jacksonville, FL        1 300
## 14  45300          Tampa-St. Petersburg-Clearwater, FL        1   N
## 15  45300          Tampa-St. Petersburg-Clearwater, FL        1   N
## 16  29460                    Lakeland-Winter Haven, FL        1 422
## 17  37860               Pensacola-Ferry Pass-Brent, FL        1 426
## 18  12620                                   Bangor, ME        1   N
## 19  38860                  Portland-South Portland, ME        1 438
## 20  38860                  Portland-South Portland, ME        1 438
## 21  12300                       Augusta-Waterville, ME        2   N
## 22  12580                Baltimore-Columbia-Towson, MD        1 548
## 23  47900 Washington-Arlington-Alexandria, DC-VA-MD-WV        1 548
## 24  47900 Washington-Arlington-Alexandria, DC-VA-MD-WV        1 548
## 25  12580                Baltimore-Columbia-Towson, MD        1 548
## 26  12580                Baltimore-Columbia-Towson, MD        1 548
## 27  49340                             Worcester, MA-CT        1 148
## 28  14460               Boston-Cambridge-Newton, MA-NH        1 148
## 29  44140                              Springfield, MA        1   N
## 30  14460               Boston-Cambridge-Newton, MA-NH        1 148
## 31  14460               Boston-Cambridge-Newton, MA-NH        1 148
## 32  14460               Boston-Cambridge-Newton, MA-NH        1 148
## 33  49340                             Worcester, MA-CT        1 148
## 34  14460               Boston-Cambridge-Newton, MA-NH        1 148
## 35  49340                             Worcester, MA-CT        1 148
## 36  14460               Boston-Cambridge-Newton, MA-NH        1 148
## 37  44140                              Springfield, MA        1   N
## 38  14460               Boston-Cambridge-Newton, MA-NH        1 148
## 39  49340                             Worcester, MA-CT        1 148
## 40  14460               Boston-Cambridge-Newton, MA-NH        1 148
## 41  39300                    Providence-Warwick, RI-MA        1 148
## 42  14460               Boston-Cambridge-Newton, MA-NH        1 148
## 43  44140                              Springfield, MA        1   N
## 44  44140                              Springfield, MA        1   N
## 45  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 46  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 47  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 48  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 49  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 50  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 51  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 52  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 53  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 54  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 55  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 56  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 57  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 58  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 59  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 60  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 61  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 62  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 63  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 64  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 65  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 66  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 67  27060                                   Ithaca, NY        1 296
## 68  45060                                 Syracuse, NY        1 532
## 69  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 70  39100         Poughkeepsie-Newburgh-Middletown, NY        1 408
## 71  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 72  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 73  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 74  39100         Poughkeepsie-Newburgh-Middletown, NY        1 408
## 75  40380                                Rochester, NY        1 464
## 76  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 77  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 78  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 79  40380                                Rochester, NY        1 464
## 80  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 81  10580                  Albany-Schenectady-Troy, NY        1 104
## 82  10580                  Albany-Schenectady-Troy, NY        1 104
## 83  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 84  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 85  40380                                Rochester, NY        1 464
## 86  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 87  10580                  Albany-Schenectady-Troy, NY        1 104
## 88  13780                               Binghamton, NY        1   N
## 89  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 90  40380                                Rochester, NY        1 464
## 91  28740                                 Kingston, NY        1 408
## 92  45060                                 Syracuse, NY        1 532
## 93  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 94  45060                                 Syracuse, NY        1 532
## 95  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 96  10580                  Albany-Schenectady-Troy, NY        1 104
## 97  46540                               Utica-Rome, NY        1   N
## 98  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 99  35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 100 22180                             Fayetteville, NC        1 246
## 101 24780                               Greenville, NC        1 272
## 102 15500                               Burlington, NC        1 268
## 103 43140                                   Shelby, NC        2 172
## 104 24660                    Greensboro-High Point, NC        1 268
## 105 20500                       Durham-Chapel Hill, NC        1 450
## 106 39580                             Raleigh-Cary, NC        1 450
## 107 16740            Charlotte-Concord-Gastonia, NC-SC        1 172
## 108 19000                                Cullowhee, NC        2   N
## 109 14100                       Bloomsburg-Berwick, PA        1 146
## 110 37980  Philadelphia-Camden-Wilmington, PA-NJ-DE-MD        1 428
## 111 37980  Philadelphia-Camden-Wilmington, PA-NJ-DE-MD        1 428
## 112 29540                                Lancaster, PA        1   N
## 113 42540                   Scranton--Wilkes-Barre, PA        1   N
## 114 37980  Philadelphia-Camden-Wilmington, PA-NJ-DE-MD        1 428
## 115 37980  Philadelphia-Camden-Wilmington, PA-NJ-DE-MD        1 428
## 116 37980  Philadelphia-Camden-Wilmington, PA-NJ-DE-MD        1 428
## 117 37980  Philadelphia-Camden-Wilmington, PA-NJ-DE-MD        1 428
## 118 37980  Philadelphia-Camden-Wilmington, PA-NJ-DE-MD        1 428
## 119 49620                             York-Hanover, PA        1 276
## 120 15540              Burlington-South Burlington, VT        1 162
## 121 47900 Washington-Arlington-Alexandria, DC-VA-MD-WV        1 548
## 122 15980                    Cape Coral-Fort Myers, FL        1 163
## 123 15980                    Cape Coral-Fort Myers, FL        1 163
## 124 19260                                 Danville, VA        2   N
## 125 37980  Philadelphia-Camden-Wilmington, PA-NJ-DE-MD        1 428
## 126 39580                             Raleigh-Cary, NC        1 450
## 127 47900 Washington-Arlington-Alexandria, DC-VA-MD-WV        1 548
## 128 37340            Palm Bay-Melbourne-Titusville, FL        1   N
## 129 35620        New York-Newark-Jersey City, NY-NJ-PA        1 408
## 130 33100      Miami-Fort Lauderdale-Pompano Beach, FL        1 370
## 131 16740            Charlotte-Concord-Gastonia, NC-SC        1 172
## 132 47900 Washington-Arlington-Alexandria, DC-VA-MD-WV        1 548
##                                              NMCSA NECTA
## 1                       Hartford-East Hartford, CT 73450
## 2                     New York-Newark, NY-NJ-CT-PA 71950
## 3                     New York-Newark, NY-NJ-CT-PA 71950
## 4         Philadelphia-Reading-Camden, PA-NJ-DE-MD     N
## 5         Philadelphia-Reading-Camden, PA-NJ-DE-MD     N
## 6         Philadelphia-Reading-Camden, PA-NJ-DE-MD     N
## 7         Miami-Port St. Lucie-Fort Lauderdale, FL     N
## 8         Miami-Port St. Lucie-Fort Lauderdale, FL     N
## 9                     Orlando-Lakeland-Deltona, FL     N
## 10                                               N     N
## 11                       Gainesville-Lake City, FL     N
## 12        Miami-Port St. Lucie-Fort Lauderdale, FL     N
## 13           Jacksonville-St. Marys-Palatka, FL-GA     N
## 14                                               N     N
## 15                                               N     N
## 16                    Orlando-Lakeland-Deltona, FL     N
## 17                     Pensacola-Ferry Pass, FL-AL     N
## 18                                               N 70750
## 19            Portland-Lewiston-South Portland, ME 76750
## 20            Portland-Lewiston-South Portland, ME 76750
## 21                                               N 78850
## 22  Washington-Baltimore-Arlington, DC-MD-VA-WV-PA     N
## 23  Washington-Baltimore-Arlington, DC-MD-VA-WV-PA     N
## 24  Washington-Baltimore-Arlington, DC-MD-VA-WV-PA     N
## 25  Washington-Baltimore-Arlington, DC-MD-VA-WV-PA     N
## 26  Washington-Baltimore-Arlington, DC-MD-VA-WV-PA     N
## 27        Boston-Worcester-Providence, MA-RI-NH-CT 79600
## 28        Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 29                                               N 78100
## 30        Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 31        Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 32        Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 33        Boston-Worcester-Providence, MA-RI-NH-CT 79600
## 34        Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 35        Boston-Worcester-Providence, MA-RI-NH-CT 74500
## 36        Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 37                                               N 78100
## 38        Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 39        Boston-Worcester-Providence, MA-RI-NH-CT 79600
## 40        Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 41        Boston-Worcester-Providence, MA-RI-NH-CT 75550
## 42        Boston-Worcester-Providence, MA-RI-NH-CT 71650
## 43                                               N 78100
## 44                                               N 78100
## 45                    New York-Newark, NY-NJ-CT-PA     N
## 46                    New York-Newark, NY-NJ-CT-PA     N
## 47                    New York-Newark, NY-NJ-CT-PA     N
## 48                    New York-Newark, NY-NJ-CT-PA     N
## 49                    New York-Newark, NY-NJ-CT-PA     N
## 50                    New York-Newark, NY-NJ-CT-PA     N
## 51                    New York-Newark, NY-NJ-CT-PA     N
## 52                    New York-Newark, NY-NJ-CT-PA     N
## 53                    New York-Newark, NY-NJ-CT-PA     N
## 54                    New York-Newark, NY-NJ-CT-PA     N
## 55                    New York-Newark, NY-NJ-CT-PA     N
## 56                    New York-Newark, NY-NJ-CT-PA     N
## 57                    New York-Newark, NY-NJ-CT-PA     N
## 58                    New York-Newark, NY-NJ-CT-PA     N
## 59                    New York-Newark, NY-NJ-CT-PA     N
## 60                    New York-Newark, NY-NJ-CT-PA     N
## 61                    New York-Newark, NY-NJ-CT-PA     N
## 62                    New York-Newark, NY-NJ-CT-PA     N
## 63                    New York-Newark, NY-NJ-CT-PA     N
## 64                    New York-Newark, NY-NJ-CT-PA     N
## 65                    New York-Newark, NY-NJ-CT-PA     N
## 66                    New York-Newark, NY-NJ-CT-PA     N
## 67                             Ithaca-Cortland, NY     N
## 68                             Syracuse-Auburn, NY     N
## 69                    New York-Newark, NY-NJ-CT-PA     N
## 70                    New York-Newark, NY-NJ-CT-PA     N
## 71                    New York-Newark, NY-NJ-CT-PA     N
## 72                    New York-Newark, NY-NJ-CT-PA     N
## 73                    New York-Newark, NY-NJ-CT-PA     N
## 74                    New York-Newark, NY-NJ-CT-PA     N
## 75              Rochester-Batavia-Seneca Falls, NY     N
## 76                    New York-Newark, NY-NJ-CT-PA     N
## 77                    New York-Newark, NY-NJ-CT-PA     N
## 78                    New York-Newark, NY-NJ-CT-PA     N
## 79              Rochester-Batavia-Seneca Falls, NY     N
## 80                    New York-Newark, NY-NJ-CT-PA     N
## 81                          Albany-Schenectady, NY     N
## 82                          Albany-Schenectady, NY     N
## 83                    New York-Newark, NY-NJ-CT-PA     N
## 84                    New York-Newark, NY-NJ-CT-PA     N
## 85              Rochester-Batavia-Seneca Falls, NY     N
## 86                    New York-Newark, NY-NJ-CT-PA     N
## 87                          Albany-Schenectady, NY     N
## 88                                               N     N
## 89                    New York-Newark, NY-NJ-CT-PA     N
## 90              Rochester-Batavia-Seneca Falls, NY     N
## 91                    New York-Newark, NY-NJ-CT-PA     N
## 92                             Syracuse-Auburn, NY     N
## 93                    New York-Newark, NY-NJ-CT-PA     N
## 94                             Syracuse-Auburn, NY     N
## 95                    New York-Newark, NY-NJ-CT-PA     N
## 96                          Albany-Schenectady, NY     N
## 97                                               N     N
## 98                    New York-Newark, NY-NJ-CT-PA     N
## 99                    New York-Newark, NY-NJ-CT-PA     N
## 100             Fayetteville-Sanford-Lumberton, NC     N
## 101              Greenville-Kinston-Washington, NC     N
## 102      Greensboro--Winston-Salem--High Point, NC     N
## 103                       Charlotte-Concord, NC-SC     N
## 104      Greensboro--Winston-Salem--High Point, NC     N
## 105                        Raleigh-Durham-Cary, NC     N
## 106                        Raleigh-Durham-Cary, NC     N
## 107                       Charlotte-Concord, NC-SC     N
## 108                                              N     N
## 109                 Bloomsburg-Berwick-Sunbury, PA     N
## 110       Philadelphia-Reading-Camden, PA-NJ-DE-MD     N
## 111       Philadelphia-Reading-Camden, PA-NJ-DE-MD     N
## 112                                              N     N
## 113                                              N     N
## 114       Philadelphia-Reading-Camden, PA-NJ-DE-MD     N
## 115       Philadelphia-Reading-Camden, PA-NJ-DE-MD     N
## 116       Philadelphia-Reading-Camden, PA-NJ-DE-MD     N
## 117       Philadelphia-Reading-Camden, PA-NJ-DE-MD     N
## 118       Philadelphia-Reading-Camden, PA-NJ-DE-MD     N
## 119                    Harrisburg-York-Lebanon, PA     N
## 120          Burlington-South Burlington-Barre, VT 72400
## 121 Washington-Baltimore-Arlington, DC-MD-VA-WV-PA     N
## 122               Cape Coral-Fort Myers-Naples, FL     N
## 123               Cape Coral-Fort Myers-Naples, FL     N
## 124                                              N     N
## 125       Philadelphia-Reading-Camden, PA-NJ-DE-MD     N
## 126                        Raleigh-Durham-Cary, NC     N
## 127 Washington-Baltimore-Arlington, DC-MD-VA-WV-PA     N
## 128                                              N     N
## 129                   New York-Newark, NY-NJ-CT-PA     N
## 130       Miami-Port St. Lucie-Fort Lauderdale, FL     N
## 131                       Charlotte-Concord, NC-SC     N
## 132 Washington-Baltimore-Arlington, DC-MD-VA-WV-PA     N
##                                   NMNECTA   CD  SLDL  SLDU SCHOOLYEAR
## 1   Hartford-East Hartford-Middletown, CT 0902 09054 09029  2020-2021
## 2         Bridgeport-Stamford-Norwalk, CT 0904 09133 09028  2020-2021
## 3         Bridgeport-Stamford-Norwalk, CT 0904 09134 09028  2020-2021
## 4                                       N 1000 10025 10008  2020-2021
## 5                                       N 1000 10021 10004  2020-2021
## 6                                       N 1000 10017 10012  2020-2021
## 7                                       N 1224 12108 12038  2020-2021
## 8                                       N 1226 12116 12039  2020-2021
## 9                                       N 1215 12040 12022  2020-2021
## 10                                      N 1202 12009 12003  2020-2021
## 11                                      N 1203 12021 12008  2020-2021
## 12                                      N 1227 12114 12037  2020-2021
## 13                                      N 1204 12012 12004  2020-2021
## 14                                      N 1214 12063 12020  2020-2021
## 15                                      N 1214 12060 12018  2020-2021
## 16                                      N 1209 12042 12026  2020-2021
## 17                                      N 1201 12001 12001  2020-2021
## 18                             Bangor, ME 2302 23123 23005  2020-2021
## 19            Portland-South Portland, ME 2301 23023 23026  2020-2021
## 20            Portland-South Portland, ME 2301 23040 23027  2020-2021
## 21                         Waterville, ME 2301 23109 23016  2020-2021
## 22                                      N 2407 24045 24045  2020-2021
## 23                                      N 2404 24024 24024  2020-2021
## 24                                      N 2405 24021 24021  2020-2021
## 25                                      N 2407 24043 24043  2020-2021
## 26                                      N 2402 2442A 24042  2020-2021
## 27                       Worcester, MA-CT 2502 25215 25010  2020-2021
## 28         Boston-Cambridge-Newton, MA-NH 2504 25170 25017  2020-2021
## 29                     Springfield, MA-CT 2501 25103 25007  2020-2021
## 30         Boston-Cambridge-Newton, MA-NH 2505 25127 25016  2020-2021
## 31         Boston-Cambridge-Newton, MA-NH 2504 25129 25029  2020-2021
## 32         Boston-Cambridge-Newton, MA-NH 2508 25179 25036  2020-2021
## 33                       Worcester, MA-CT 2502 25219 25010  2020-2021
## 34         Boston-Cambridge-Newton, MA-NH 2507 25163 25032  2020-2021
## 35                 Leominster-Gardner, MA 2503 25205 25009  2020-2021
## 36         Boston-Cambridge-Newton, MA-NH 2503 25135 25013  2020-2021
## 37                     Springfield, MA-CT 2502 25117 25006  2020-2021
## 38         Boston-Cambridge-Newton, MA-NH 2508 25187 25001  2020-2021
## 39                       Worcester, MA-CT 2501 25208 25012  2020-2021
## 40         Boston-Cambridge-Newton, MA-NH 2507 25190 25028  2020-2021
## 41                        New Bedford, MA 2509 25077 25038  2020-2021
## 42         Boston-Cambridge-Newton, MA-NH 2508 25186 25027  2020-2021
## 43                     Springfield, MA-CT 2501 25110 25007  2020-2021
## 44                     Springfield, MA-CT 2501 25105 25005  2020-2021
## 45                                      N 3410 34028 34028  2020-2021
## 46                                      N 3411 34027 34027  2020-2021
## 47                                      N 3405 34037 34037  2020-2021
## 48                                      N 3411 34027 34027  2020-2021
## 49                                      N 3410 34031 34031  2020-2021
## 50                                      N 3410 34020 34020  2020-2021
## 51                                      N 3406 34011 34011  2020-2021
## 52                                      N 3411 34040 34040  2020-2021
## 53                                      N 3405 34039 34039  2020-2021
## 54                                      N 3410 34029 34029  2020-2021
## 55                                      N 3410 34033 34033  2020-2021
## 56                                      N 3410 34027 34027  2020-2021
## 57                                      N 3604 36019 36006  2020-2021
## 58                                      N 3612 36075 36028  2020-2021
## 59                                      N 3609 36042 36021  2020-2021
## 60                                      N 3611 36063 36024  2020-2021
## 61                                      N 3612 36073 36028  2020-2021
## 62                                      N 3613 36081 36034  2020-2021
## 63                                      N 3606 36025 36016  2020-2021
## 64                                      N 3615 36078 36034  2020-2021
## 65                                      N 3604 36019 36006  2020-2021
## 66                                      N 3616 36088 36037  2020-2021
## 67                                      N 3623 36125 36058  2020-2021
## 68                                      N 3624 36128 36050  2020-2021
## 69                                      N 3603 36019 36005  2020-2021
## 70                                      N 3618 36106 36041  2020-2021
## 71                                      N 3617 36092 36035  2020-2021
## 72                                      N 3604 36021 36009  2020-2021
## 73                                      N 3613 36078 36033  2020-2021
## 74                                      N 3618 36104 36039  2020-2021
## 75                                      N 3625 36133 36055  2020-2021
## 76                                      N 3610 36066 36027  2020-2021
## 77                                      N 3603 36019 36005  2020-2021
## 78                                      N 3610 36066 36026  2020-2021
## 79                                      N 3625 36138 36059  2020-2021
## 80                                      N 3607 36052 36026  2020-2021
## 81                                      N 3620 36109 36044  2020-2021
## 82                                      N 3620 36110 36044  2020-2021
## 83                                      N 3608 36057 36025  2020-2021
## 84                                      N 3601 36007 36003  2020-2021
## 85                                      N 3625 36133 36055  2020-2021
## 86                                      N 3605 36024 36014  2020-2021
## 87                                      N 3620 36109 36044  2020-2021
## 88                                      N 3622 36123 36052  2020-2021
## 89                                      N 3601 36004 36002  2020-2021
## 90                                      N 3627 36133 36059  2020-2021
## 91                                      N 3619 36103 36042  2020-2021
## 92                                      N 3624 36130 36048  2020-2021
## 93                                      N 3603 36015 36005  2020-2021
## 94                                      N 3624 36129 36053  2020-2021
## 95                                      N 3610 36075 36031  2020-2021
## 96                                      N 3620 36109 36044  2020-2021
## 97                                      N 3622 36119 36047  2020-2021
## 98                                      N 3611 36063 36023  2020-2021
## 99                                      N 3613 36072 36031  2020-2021
## 100                                     N 3702 37053 37012  2020-2021
## 101                                     N 3701 37008 37005  2020-2021
## 102                                     N 3706 37064 37024  2020-2021
## 103                                     N 3710 37111 37044  2020-2021
## 104                                     N 3706 37061 37028  2020-2021
## 105                                     N 3704 37056 37023  2020-2021
## 106                                     N 3704 37033 37015  2020-2021
## 107                                     N 3709 37069 37035  2020-2021
## 108                                     N 3711 37119 37050  2020-2021
## 109                                     N 4209 42109 42027  2020-2021
## 110                                     N 4205 42165 42017  2020-2021
## 111                                     N 4203 42188 42007  2020-2021
## 112                                     N 4211 42098 42036  2020-2021
## 113                                     N 4212 42117 42020  2020-2021
## 114                                     N 4205 42161 42009  2020-2021
## 115                                     N 4203 42188 42008  2020-2021
## 116                                     N 4203 42192 42007  2020-2021
## 117                                     N 4202 42181 42003  2020-2021
## 118                                     N 4205 42159 42009  2020-2021
## 119                                     N 4210 42095 42028  2020-2021
## 120       Burlington-South Burlington, VT 5000 50C64 50CHI  2020-2021
## 121                                     N 5111 51037 51034  2020-2021
## 122                                     N 1219 12078 12027  2020-2021
## 123                                     N 1219 12078 12027  2020-2021
## 124                                     N 5105 51014 51020  2020-2021
## 125                                     N 1000 10002 10003  2020-2021
## 126                                     N 3704 37041 37016  2020-2021
## 127                                     N 5111 51037 51037  2020-2021
## 128                                     N 1208 12052 12017  2020-2021
## 129                                     N 3612 36075 36027  2020-2021
## 130                                     N 1220 12103 12035  2020-2021
## 131                                     N 3712 37092 37037  2020-2021
## 132                                     N 5108 51048 51030  2020-2021
##     match_technical_skills match_soft_skills school_score
## 1                        1                 1            2
## 2                        1                 1            2
## 3                        1                 1            2
## 4                        1                 1            2
## 5                        1                 1            2
## 6                        1                 1            2
## 7                        1                 1            2
## 8                        1                 1            2
## 9                        1                 1            2
## 10                       1                 1            2
## 11                       1                 1            2
## 12                       1                 1            2
## 13                       1                 1            2
## 14                       1                 1            2
## 15                       1                 1            2
## 16                       1                 1            2
## 17                       1                 1            2
## 18                       1                 1            2
## 19                       1                 1            2
## 20                       1                 1            2
## 21                       1                 1            2
## 22                       1                 1            2
## 23                       1                 1            2
## 24                       1                 1            2
## 25                       1                 1            2
## 26                       1                 1            2
## 27                       1                 1            2
## 28                       1                 1            2
## 29                       1                 1            2
## 30                       1                 1            2
## 31                       1                 1            2
## 32                       1                 1            2
## 33                       1                 1            2
## 34                       1                 1            2
## 35                       1                 1            2
## 36                       1                 1            2
## 37                       1                 1            2
## 38                       1                 1            2
## 39                       1                 1            2
## 40                       1                 1            2
## 41                       1                 1            2
## 42                       1                 1            2
## 43                       1                 1            2
## 44                       1                 1            2
## 45                       1                 1            2
## 46                       1                 1            2
## 47                       1                 1            2
## 48                       1                 1            2
## 49                       1                 1            2
## 50                       1                 1            2
## 51                       1                 1            2
## 52                       1                 1            2
## 53                       1                 1            2
## 54                       1                 1            2
## 55                       1                 1            2
## 56                       1                 1            2
## 57                       1                 1            2
## 58                       1                 1            2
## 59                       1                 1            2
## 60                       1                 1            2
## 61                       1                 1            2
## 62                       1                 1            2
## 63                       1                 1            2
## 64                       1                 1            2
## 65                       1                 1            2
## 66                       1                 1            2
## 67                       1                 1            2
## 68                       1                 1            2
## 69                       1                 1            2
## 70                       1                 1            2
## 71                       1                 1            2
## 72                       1                 1            2
## 73                       1                 1            2
## 74                       1                 1            2
## 75                       1                 1            2
## 76                       1                 1            2
## 77                       1                 1            2
## 78                       1                 1            2
## 79                       1                 1            2
## 80                       1                 1            2
## 81                       1                 1            2
## 82                       1                 1            2
## 83                       1                 1            2
## 84                       1                 1            2
## 85                       1                 1            2
## 86                       1                 1            2
## 87                       1                 1            2
## 88                       1                 1            2
## 89                       1                 1            2
## 90                       1                 1            2
## 91                       1                 1            2
## 92                       1                 1            2
## 93                       1                 1            2
## 94                       1                 1            2
## 95                       1                 1            2
## 96                       1                 1            2
## 97                       1                 1            2
## 98                       1                 1            2
## 99                       1                 1            2
## 100                      1                 1            2
## 101                      1                 1            2
## 102                      1                 1            2
## 103                      1                 1            2
## 104                      1                 1            2
## 105                      1                 1            2
## 106                      1                 1            2
## 107                      1                 1            2
## 108                      1                 1            2
## 109                      1                 1            2
## 110                      1                 1            2
## 111                      1                 1            2
## 112                      1                 1            2
## 113                      1                 1            2
## 114                      1                 1            2
## 115                      1                 1            2
## 116                      1                 1            2
## 117                      1                 1            2
## 118                      1                 1            2
## 119                      1                 1            2
## 120                      1                 1            2
## 121                      1                 1            2
## 122                      1                 1            2
## 123                      1                 1            2
## 124                      1                 1            2
## 125                      1                 1            2
## 126                      1                 1            2
## 127                      1                 1            2
## 128                      1                 1            2
## 129                      1                 1            2
## 130                      1                 1            2
## 131                      1                 1            2
## 132                      1                 1            2
##     good_data_science_program
## 1                         YES
## 2                         YES
## 3                         YES
## 4                         YES
## 5                         YES
## 6                         YES
## 7                         YES
## 8                         YES
## 9                         YES
## 10                        YES
## 11                        YES
## 12                        YES
## 13                        YES
## 14                        YES
## 15                        YES
## 16                        YES
## 17                        YES
## 18                        YES
## 19                        YES
## 20                        YES
## 21                        YES
## 22                        YES
## 23                        YES
## 24                        YES
## 25                        YES
## 26                        YES
## 27                        YES
## 28                        YES
## 29                        YES
## 30                        YES
## 31                        YES
## 32                        YES
## 33                        YES
## 34                        YES
## 35                        YES
## 36                        YES
## 37                        YES
## 38                        YES
## 39                        YES
## 40                        YES
## 41                        YES
## 42                        YES
## 43                        YES
## 44                        YES
## 45                        YES
## 46                        YES
## 47                        YES
## 48                        YES
## 49                        YES
## 50                        YES
## 51                        YES
## 52                        YES
## 53                        YES
## 54                        YES
## 55                        YES
## 56                        YES
## 57                        YES
## 58                        YES
## 59                        YES
## 60                        YES
## 61                        YES
## 62                        YES
## 63                        YES
## 64                        YES
## 65                        YES
## 66                        YES
## 67                        YES
## 68                        YES
## 69                        YES
## 70                        YES
## 71                        YES
## 72                        YES
## 73                        YES
## 74                        YES
## 75                        YES
## 76                        YES
## 77                        YES
## 78                        YES
## 79                        YES
## 80                        YES
## 81                        YES
## 82                        YES
## 83                        YES
## 84                        YES
## 85                        YES
## 86                        YES
## 87                        YES
## 88                        YES
## 89                        YES
## 90                        YES
## 91                        YES
## 92                        YES
## 93                        YES
## 94                        YES
## 95                        YES
## 96                        YES
## 97                        YES
## 98                        YES
## 99                        YES
## 100                       YES
## 101                       YES
## 102                       YES
## 103                       YES
## 104                       YES
## 105                       YES
## 106                       YES
## 107                       YES
## 108                       YES
## 109                       YES
## 110                       YES
## 111                       YES
## 112                       YES
## 113                       YES
## 114                       YES
## 115                       YES
## 116                       YES
## 117                       YES
## 118                       YES
## 119                       YES
## 120                       YES
## 121                       YES
## 122                       YES
## 123                       YES
## 124                       YES
## 125                       YES
## 126                       YES
## 127                       YES
## 128                       YES
## 129                       YES
## 130                       YES
## 131                       YES
## 132                       YES
# A visual of the database of recommended schools.
view(temp_schools)
# Create a label that combines multiple variables. The <p> HTML tag forces a line break that separates the school name from the city and state.
temp_schools$label <- paste0("<p><a>", temp_schools$NAME, "</a></p>",
                             temp_schools$CITY, ", ",
                             temp_schools$STATE)
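
The label column is presumably meant for map popups. A minimal sketch of how such a label might be used with leaflet follows. The toy data frame `demo_schools`, its rows, and the `LAT` and `LON` column names are assumptions made for illustration; they are not taken from the project data.

```r
library(leaflet)

# Hypothetical stand-in for temp_schools; names and coordinates are illustrative only.
demo_schools <- data.frame(
  NAME  = c("Example University", "Sample College"),
  CITY  = c("New York", "Boston"),
  STATE = c("NY", "MA"),
  LAT   = c(40.73, 42.36),   # assumed latitude column
  LON   = c(-73.99, -71.06)  # assumed longitude column
)

# Build the HTML label: school name on one line, city and state on the next.
demo_schools$label <- paste0("<p><a>", demo_schools$NAME, "</a></p>",
                             demo_schools$CITY, ", ", demo_schools$STATE)

# Draw one marker per school; the popup argument renders the HTML label on click.
school_map <- leaflet(demo_schools) %>%
  addTiles() %>%
  addMarkers(lng = ~LON, lat = ~LAT, popup = ~label)

school_map
```

Passing the label through `popup` lets leaflet render the HTML, so the name and location appear on separate lines when a marker is clicked.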

PART III: Data Acquisition Issues with Webscraping

Web Scraping College Course Descriptions

College websites differ vastly from one another in HTML structure and layout. For some colleges, for example, the course descriptions page contains links to PDFs.

Figure 1: Course description page for Angelo State University

For other colleges, the descriptions appear on the course description page itself instead of in a PDF, as shown in Figure 2.

Figure 2: Three of the concentration requirements for the Masters of Accounting program taken from the Appalachian State University website

Another plan that the team considered was to ignore the websites themselves and parse the course catalog PDFs for all of the colleges with graduate accounting programs. However, we ran into a similar problem: the PDFs themselves differ vastly from one another in layout, as a comparison of Figure 3 and Figure 4 shows.

Figure 3: A snippet of the graduate accounting course descriptions for Angelo State University taken from the 2019-2020 graduate catalogue

Figure 4: A snippet of the graduate accounting course descriptions for Bay Path University taken from the 2019-2020 graduate catalogue

Given the obstacles that the team encountered while exploring web scraping for college course descriptions, the team decided that it would be best to use the data collected from Dr. Foy’s students, which had been manually copied and pasted.
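
To make the layout problem concrete, the sketch below parses two tiny, made-up HTML pages with rvest: one that only links to a PDF catalog and one that carries descriptions inline. Everything in it is hypothetical (the markup, the `div.course` class, the file path and the course text), and rvest is an assumption since it is not among the libraries loaded for this project. It simply shows that a selector written for one layout returns nothing on the other.

```r
library(rvest)

# Hypothetical page of the first kind: descriptions live in a linked PDF.
page_with_pdfs <- read_html('<html><body>
  <ul>
    <li><a href="/catalog/graduate-2019-2020.pdf">Graduate Catalog</a></li>
    <li><a href="/about.html">About</a></li>
  </ul>
</body></html>')

# Hypothetical page of the second kind: descriptions are on the page itself.
page_inline <- read_html('<html><body>
  <div class="course"><h3>Course A</h3><p>Made-up description text.</p></div>
</body></html>')

# Layout 1: collect every link, then keep only those ending in .pdf.
all_links <- page_with_pdfs %>% html_elements("a") %>% html_attr("href")
pdf_links <- grep("\\.pdf$", all_links, value = TRUE)

# Layout 2: read the description paragraphs directly from the page.
inline_text <- page_inline %>% html_elements("div.course p") %>% html_text2()

# The layout-2 selector finds nothing on the layout-1 page.
inline_on_pdf_page <- page_with_pdfs %>%
  html_elements("div.course p") %>%
  html_text2()

pdf_links
inline_text
length(inline_on_pdf_page)
```

Each additional college would need its own selectors, or a PDF parser, which is the cost that made manual collection the more practical option here.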

Conclusion

By mining the words in course descriptions and joining them with vectors of desired skills, we successfully built a recommender system with a few key predictors. We extended the model by adding geocoding and mapping features to perform basic cluster analysis. From a visual overview centered on the Eastern U.S. coastline, we can observe that the schools cluster predominantly in the Northeast: the NYC, Boston and Philadelphia metro areas. North Carolina and Florida also show significant clustering. Constructing a database of courses by web scraping would probably not be as efficient as copying and pasting data directly from the schools’ websites.

We can further extend the model by adding other factors such as tuition costs, postgraduate employment percentage and national university ranking.

Note that some colleges did not publish course descriptions, so they were penalized by the recommender system.