Background

The USAspending API (Application Programming Interface) allows the public to access comprehensive U.S. government spending data.

The data include spending on awards (e.g., who received federal contracts or grants, geographic breakdowns, agency breakdowns, and much more). The data also include account-level, non-award spending such as federal employee compensation. You can learn more about the data and the federal law that requires it to be publicly accessible (the DATA Act) at http://fedspendingtransparency.github.io.

As of June 16, 2017, one limitation of the USAspending API is that it only queries FY17 data. As more years pass, this data source will provide a richer historical source of federal spending across the nation.

Querying the API from RStudio

Loading Necessary Packages

Before querying USAspending data from their API, place this code at the beginning of your script.

################## Load Necessary Packages ###################
# load necessary packages
library( httr )
library( jsonlite )
library( compiler )
library( purrr )
# compilePKGS()
setCompilerOptions(suppressAll = TRUE )
# enableJIT enables or disables just-in-time (JIT) compilation. 
# JIT is disabled if the argument is 0. 
# If level is 3, all top level loops are compiled before they are executed.
enableJIT(3) # 3 will appear

Understanding APIs

Querying data through an API requires how to speak the language of the internet. After reviewing “The Anatomy of a URL: Protocol, Hostname, Path, and Parameters all in one string of content”, you’ll understand how parameters enable you to specify a query tailor-made for your needs.

The path

The DATA ACT API website api.usaspending.gov. The website hosts a variety of paths, labeled as “endpoints”. These endpoints host different categories of data tables. This tutorial uses the “/api/v1/transactions/” endpoint.

The transactions endpoint returns a list of transactions for all federal spending (i.e. contracts, loans, grants, other financial assistance, etc…). For more information about the different types of federal spending categories, please see the “USAspending Guide Source File”.

The parameters

This looks like a lot of information. To help narrow it down, I’m interested in Illinois and the 6th congressional district. In the language of the internet, these two parameters are translated as:

  • Illinois = place_of_performance__state_code=IL
  • 6th congressional district = place_of_performance__congressional_code=06

Together, the formal API query looks like this:

https://api.usaspending.gov/api/v1/transactions/?place_of_performance__state_code=IL&place_of_performance__congressional_code=06.

The formal API query returns one page of 100 results that satisfy the parameters we gave.

The field choice “place_of_performance” is the principal place of business, where the majority of the work is performed. It is a nested object, with other variables inside of this object. To call the variable “state_code”, I first called “place_of_performance”, added “__“, and finally added”state_code“.

To learn more about what field choices the USAspending API supports for each endpoint, please see the USAspending API Data Dictionary.

General GET_all_pages Function

It would be helpful to grab every page of results and return a data frame with all the pages.

################### General Purpose GET Function ############################################
#############################################################################################
GET_all_pages <- function( PATH, QUERY ) {
  # Create empty list
  pages <- list()
  # Create initial API url
  url <- modify_url("https://api.usaspending.gov"
                    , path = PATH
                    , query = QUERY
  )
  # Get API url
  raw.resp <- GET(url)
  if (http_type(raw.resp) != "application/json") {
    stop("API did not return json. Check 'status code = 200'"
         , call. = FALSE)
  }
  this.char.resp <- rawToChar( raw.resp$content) # convert from raw to char
  # convert JSON object into R object
  this.clean.resp <- fromJSON(this.char.resp
                              , flatten = TRUE
  ) 
  # Set initial page number
  page_number <- 1
  
  # conditional element selection
  # if results page is does not have a next page
  # return a data frame for these results
  if( this.clean.resp$page_metadata$has_next_page == FALSE ){
      return( this.clean.resp$results)
    }
  # while loop with boolean condition
  # if the results page contains a next page
  # call the next page and bind the results to a data frame
  # return the data frame with all the page results
  while( this.clean.resp$page_metadata$has_next_page == TRUE ) {
    # identify current page url
    current.page.url <- this.clean.resp$page_metadata$current
    # subsitute "&page=XX" with "&page='page_number'"
    next.page.url <- gsub( pattern = "&page=[[:digit:]]+"
                           , replacement = paste0( "&page=", page_number)
                           , x = current.page.url
    )
    # Get new API url
    raw.resp <- GET( url = next.page.url )
    # Convert raw vector to character vector
    this.char.resp <- rawToChar( raw.resp$content )
    # Convert JSON object into R object
    this.clean.resp <- fromJSON( this.char.resp
                                 , flatten = TRUE
    )
    # For every page number (1, 2, 3...), insert that page's "results" inside the list
    pages[[ page_number ]] <- this.clean.resp$results
    # Add to the page number and restart the loop
    page_number <- page_number + 1
  }
  # once all the pages have been collected,
  data_api_data <- rbind_pages(pages) # rbind.pages() is deprecated
  # return what we've collected
  return( data_api_data )
  
  
  # Turn API errors into R errors
  if (http_error( raw.resp )) {
    stop(
      sprintf(
        "USASpending.gov API request failed [%s]\n%s\n<%s>", 
        status_code( raw.resp),
        this.clean.resp$message,
        this.clean.resp$documentation_url
      ),
      call. = FALSE
    )
  }
  # add some structure stuff 
  structure(
    list(
      content = this.clean.resp
      , path = PATH
      , response = raw.resp
    )
    , class = "usa_spending_api"
  )
} # end of function

Geographic Specific Functions

Once the “GET_all_pages” function is created, geographic specific functions are easier to create and maintain.

City/State Pair

################### City, State API Function #####################################
city_state <- function( CITY_NAME, STATE_NAME ) {
  api_data <- GET_all_pages( PATH = "/api/v1/transactions/"
                             , QUERY = list(place_of_performance__city_name = CITY_NAME
                                            , place_of_performance__state_name = STATE_NAME
                             )
  )
  # replace character(0) with NAs functions
  char0_to_NA <- function( list_object ) {
    # set counter
    i <- 1
    # create empty list vector
    emptyList <- list()
    # while loop
    while( i <= length( list_object )) {
      if (identical(list_object[[i]], character(0))) {
        emptyList[[i]] <- NA
      } else {
        emptyList[[i]] <- list_object[[i]]
      } # end of if statement
      # add to counter
      i <- i + 1
    } # end of while statement
    # return empty character vector
    return( emptyList )
  } # end of function
  #
  # Replace the character(0) values with NA
  # in the 'recipient.business_categories' column
  #
  api_data$recipient.business_categories <- char0_to_NA( api_data$recipient.business_categories)
  # 
  # Replace the list with character vector 
  # of the first element of the list
  api_data$recipient.business_categories_1 <- unlist( purrr::map( api_data$recipient.business_categories, 1) )
  # Create a placeholder variable
  api_data$recipient.business_categories_2 <- NA 
  # Create a dataframe with rows that contain a list
  # with only 2 elements
  ad_2 <- api_data[ lapply(api_data$recipient.business_categories
                           , length) == 2
                    ,  ]
  # Fill in the placeholder variable with the second element of the list
  ad_2$recipient.business_categories_2 <- unlist( purrr::map( ad_2$recipient.business_categories, 2) )
  # create a dataframe with rows that DO NOT contain a list
  # with only 2 elements
  ad_3 <- api_data[ !lapply(api_data$recipient.business_categories
                           , length) == 2
                    ,  ]
  # Rename api_data to be the result of an rbind.data.frame
  # from ad_2 and ad_3
  api_data_final <- rbind.data.frame( ad_2, ad_3)
  # relabel the row names
  rownames( api_data_final ) <- as.character(1:nrow(api_data_final))
  # make the list variable 'recipient.business_categories' NULL
  # hereby removing it from the returned dataframe
  api_data_final <- subset( x = api_data_final
                      , select = -recipient.business_categories
                      )
  # convert na to "marked_unknown/other"
  # for recipient.business_categories_1
  api_data_final <- within(api_data_final, recipient.business_categories_1 <-
                             ifelse( test = is.na(recipient.business_categories_1)
                                     , yes = "marked_unknown/other"
                                     , no = recipient.business_categories_1
                                     )
                           )
  # convert na to "marked_unknown/other"
  # for recipient.business_categories_2
  api_data_final <- within(api_data_final, recipient.business_categories_2 <-
                             ifelse( test = is.na(recipient.business_categories_2)
                                     , yes = "marked_unknown/other"
                                     , no = recipient.business_categories_2
                                     )
                           )
  # check if column "funding_agency"
  if( "funding_agency" %in% names(api_data_final) == TRUE ) {
    # delete the column "funding_agency"
    api_data_final <- subset( x = api_data_final
                        , select = -funding_agency
                        )
  }
  return( api_data_final )
} # end of function

Let’s explore Austin, Texas!

austin_tx <- city_state("AUSTIN", "TEXAS")
dim(austin_tx)
## [1] 1190   83
names(austin_tx)
##  [1] "id"                                                          
##  [2] "type"                                                        
##  [3] "type_description"                                            
##  [4] "period_of_performance_start_date"                            
##  [5] "period_of_performance_current_end_date"                      
##  [6] "action_date"                                                 
##  [7] "action_type"                                                 
##  [8] "action_type_description"                                     
##  [9] "federal_action_obligation"                                   
## [10] "modification_number"                                         
## [11] "description"                                                 
## [12] "contract_data"                                               
## [13] "awarding_agency.id"                                          
## [14] "awarding_agency.toptier_flag"                                
## [15] "awarding_agency.office_agency"                               
## [16] "awarding_agency.toptier_agency.cgac_code"                    
## [17] "awarding_agency.toptier_agency.fpds_code"                    
## [18] "awarding_agency.toptier_agency.abbreviation"                 
## [19] "awarding_agency.toptier_agency.name"                         
## [20] "awarding_agency.subtier_agency.subtier_code"                 
## [21] "awarding_agency.subtier_agency.abbreviation"                 
## [22] "awarding_agency.subtier_agency.name"                         
## [23] "funding_agency.id"                                           
## [24] "funding_agency.toptier_flag"                                 
## [25] "funding_agency.office_agency"                                
## [26] "funding_agency.toptier_agency.cgac_code"                     
## [27] "funding_agency.toptier_agency.fpds_code"                     
## [28] "funding_agency.toptier_agency.abbreviation"                  
## [29] "funding_agency.toptier_agency.name"                          
## [30] "funding_agency.subtier_agency.subtier_code"                  
## [31] "funding_agency.subtier_agency.abbreviation"                  
## [32] "funding_agency.subtier_agency.name"                          
## [33] "recipient.legal_entity_id"                                   
## [34] "recipient.parent_recipient_unique_id"                        
## [35] "recipient.recipient_name"                                    
## [36] "recipient.business_types"                                    
## [37] "recipient.business_types_description"                        
## [38] "recipient.recipient_unique_id"                               
## [39] "recipient.domestic_or_foreign_entity_description"            
## [40] "recipient.small_business_description"                        
## [41] "recipient.location.country_name"                             
## [42] "recipient.location.state_code"                               
## [43] "recipient.location.state_name"                               
## [44] "recipient.location.state_description"                        
## [45] "recipient.location.city_name"                                
## [46] "recipient.location.address_line1"                            
## [47] "recipient.location.address_line2"                            
## [48] "recipient.location.address_line3"                            
## [49] "recipient.location.foreign_location_description"             
## [50] "recipient.location.zip5"                                     
## [51] "recipient.location.foreign_postal_code"                      
## [52] "recipient.location.foreign_province"                         
## [53] "recipient.location.foreign_city_name"                        
## [54] "recipient.location.location_country_code"                    
## [55] "place_of_performance.country_name"                           
## [56] "place_of_performance.state_code"                             
## [57] "place_of_performance.state_name"                             
## [58] "place_of_performance.state_description"                      
## [59] "place_of_performance.city_name"                              
## [60] "place_of_performance.address_line1"                          
## [61] "place_of_performance.address_line2"                          
## [62] "place_of_performance.address_line3"                          
## [63] "place_of_performance.foreign_location_description"           
## [64] "place_of_performance.zip5"                                   
## [65] "place_of_performance.foreign_postal_code"                    
## [66] "place_of_performance.foreign_province"                       
## [67] "place_of_performance.foreign_city_name"                      
## [68] "place_of_performance.location_country_code"                  
## [69] "assistance_data.fain"                                        
## [70] "assistance_data.uri"                                         
## [71] "assistance_data.business_funds_indicator_description"        
## [72] "assistance_data.face_value_loan_guarantee"                   
## [73] "assistance_data.original_loan_subsidy_cost"                  
## [74] "assistance_data.record_type_description"                     
## [75] "assistance_data.correction_late_delete_indicator_description"
## [76] "assistance_data.cfda.id"                                     
## [77] "assistance_data.cfda.program_number"                         
## [78] "assistance_data.cfda.program_title"                          
## [79] "assistance_data.cfda.popular_name"                           
## [80] "assistance_data.cfda.objectives"                             
## [81] "assistance_data.cfda.website_address"                        
## [82] "recipient.business_categories_1"                             
## [83] "recipient.business_categories_2"

What types of FY17 federal spending is most popular in Austin, Texas?

# what kinds of awards is Austin, Texas receiving?
austintx_type_df <- as.data.frame( sort( table( as.factor( austin_tx$type_description ) ) )
                               , stringsAsFactors = FALSE
                               )
colnames( austintx_type_df ) <- c("Award Type", "FY17 Count")

### Round up function
### see: "How to Round Up to the nearest 10 or 100" post on StackOverflow https://stackoverflow.com/questions/6461209/how-to-round-up-to-the-nearest-10-or-100-or-x
roundUpNice <- function(x, nice=c(1,2,4,5,6,8,10)) {
  if(length(x) != 1) stop("'x' must be of length 1")
  10^floor(log10(x)) * nice[[which(x <= 10^floor(log10(x)) * nice)[[1]]]]
}

# plot
par( mar = c(5, 15, 4, 4))
barplot( height = sort( table( as.factor( austin_tx$type_description ) ) )
         , horiz = TRUE
         , las = 1
         , main = "FY17 Award Types in Austin, Texas"
         , xlim = c(0, roundUpNice( max( austintx_type_df$`FY17 Count`) ))
         , xlab = "Award Type Frequency"
)
# draw vertical lines
# Add vertical lines to separate the years
abline( v = seq( from = 0
                 , to = roundUpNice( max( austintx_type_df$`FY17 Count`) )
                 , by = 200
                 )
        , col = "gray"
        , lty = 2
)

# unique(austintx_type_df$`Award Type`)

What are the most popular recipients?

austintx_rtype_df <- as.data.frame( sort( table( as.factor( austin_tx$recipient.business_categories_1 ) ) )
                               , stringsAsFactors = FALSE
                               )
colnames( austintx_rtype_df ) <- c("Detailed Recipient Type", "FY17 Count")

# plot
par( mar = c(5, 17, 4, 4))
barplot( height = sort( table( as.factor( austin_tx$recipient.business_categories_1 ) ) )
         , horiz = TRUE
         , las = 1
         , main = "FY17 Detailed Recipient Types in Austin, Texas"
         , xlim = c(0, roundUpNice( max( austintx_rtype_df$`FY17 Count`) ))
         , xlab = "Award Type Frequency"
)
# draw vertical lines
# Add vertical lines to separate the years
abline( v = seq( from = 0
                 , to = roundUpNice( max( austintx_rtype_df$`FY17 Count`) )
                 , by = 100
                 )
        , col = "gray"
        , lty = 2
)

What about broad categories?

austintx_rtype_simple_df <- as.data.frame( sort( table( as.factor( austin_tx$recipient.business_categories_2 ) ) )
                               , stringsAsFactors = FALSE
                               )
colnames( austintx_rtype_simple_df ) <- c("Simple Recipient Type", "FY17 Count")

# plot
par( mar = c(5, 17, 4, 4))
barplot( height = sort( table( as.factor( austin_tx$recipient.business_categories_2 ) ) )
         , horiz = TRUE
         , las = 1
         , main = "FY17 Simple Recipient Types in Austin, Texas"
         , xlim = c(0, roundUpNice( max( austintx_rtype_simple_df$`FY17 Count`) ))
         , xlab = "Award Type Frequency"
)
# draw vertical lines
# Add vertical lines to separate the years
abline( v = seq( from = 0
                 , to = roundUpNice( max( austintx_rtype_simple_df$`FY17 Count`) )
                 , by = 100
                 )
        , col = "gray"
        , lty = 2
)

# unique( austin_tx$recipient.business_types_description)
# whoa <- austin_tx$recipient.business_types_description

austin_tx$recipient.business_types_description <- gsub( pattern = "Nonprofit with 501(c)(3) IRS Status (Other than Institution of Higher Education)"
              , replacement = "Non-Higher Ed Nonprofit with 501(c)(3) IRS Status"
              , x = austin_tx$recipient.business_types_description
              , fixed = TRUE
              )
# unique(whoa2)
austintx_rtype_df_all <- as.data.frame( sort( table( as.factor( austin_tx$recipient.business_types_description) ) )
                               , stringsAsFactors = FALSE
                               )
colnames( austintx_rtype_df_all ) <- c("Recipient Type", "FY17 Count")

# plot
par( mar = c(5, 22, 4, 4))
barplot( height = sort( table( as.factor( austin_tx$recipient.business_types_description ) ) )
         , horiz = TRUE
         , las = 1
         , main = "FY17 Recipient Types in Austin, Texas"
         , xlim = c(0, roundUpNice( max( austintx_rtype_df_all$`FY17 Count`) ))
         , xlab = "Award Type Frequency"
)
# draw vertical lines
# Add vertical lines to separate the years
abline( v = seq( from = 0
                 , to = roundUpNice( max( austintx_rtype_df_all$`FY17 Count`) )
                 , by = 100
                 )
        , col = "gray"
        , lty = 2
)

Anyway we slice it, it looks like the recipient type “unknown types” and recipient business category “marked unknown/other” compose a large part of FY17 federal spending in Austin, Texas.

library( magrittr )
library( pander )
# give me a data frame
# where recipient.business_categories_2 == "marked_unknown/other"
hidden.austintx <- austin_tx[ austin_tx$recipient.business_categories_2 == "marked_unknown/other", ]
dim( hidden.austintx )
## [1] 371  83
table( hidden.austintx$recipient.recipient_name) %>% pander
Table continues below
AUSTIN INDEPENDENT SCHOOL DISTRICT AUSTIN, CITY OF BAT CONSERVATION INTERNATIONAL, INC .
2 3 1
Table continues below
BCL OF TEXAS CARITAS OF AUSTIN Community Partnership for the Homeless
2 7 5
Table continues below
DISABILITY RIGHTS TEXAS EL CAMINO REAL DE LOS TEJAS NATIONAL HISTORIC
1 1
Table continues below
ENDING COMMUNITY HOMELESSNESS COALITION, INC. FRONT STEPS INC. HEALTH & HUMAN SERVICES-COMMISSION
2 3 5
Table continues below
HISTORICAL COMMISSION, TEXAS LUTHERAN SOCIAL SERVICES OF THE SOUTH, INC
2 1
Table continues below
National Domestic Violence Hotline SALVATION ARMY, THE
1 1
Table continues below
South Central Partnership For Energy Efficien STATE HEALTH SERVICES, TEXAS DEPARTMENT OF
2 7
Table continues below
TEXAS ASSOCIATION OF COMMUNITY HEALTH CENTERS INC TEXAS HEALTH INSTITUTE
1 1
Table continues below
Texas Legal Services Center TEXAS MEDICAL CAREERS LIMITED
1 52
Table continues below
TEXAS MILITARY DEPARTMENT TEXAS PARKS AND WILDLIFE DEPARTMENT
5 2
Table continues below
TEXAS WATER DEVELOPMENT BOARD TEXAS WORKFORCE COMMISSION The SAFE Alliance
6 8 6
Table continues below
TRANSPORTATION, TEXAS DEPARTMENT OF UNIVERSITY OF TEXAS AT AUSTIN
6 235
WOMEN’S HEALTH AND FAMILY PLANNING ASSOCIATION OF TEXAS YOUTH AND FAMILY ALLIANCE
1 1

It looks like the University of Texas at Austin (UT Austin) was marked as a business category of “marked_unknown/other”. Let’s reassign the recipient business type description and recipient business categories for all transactions with recipient name “UNIVERSITY OF TEXAS AT AUSTIN”.

utaustin <- hidden.austintx[ hidden.austintx$recipient.recipient_name == "UNIVERSITY OF TEXAS AT AUSTIN", ]
dim( utaustin )
## [1] 235  83
table( utaustin$type_description) # 1 Block Grant, 45 Cooperative Agreements, and 189 Project Grants
## 
##           Block Grant Cooperative Agreement         Project Grant 
##                     1                    45                   189
# austin_tx$recipient.business_categories_2
# Store the row names of this ut austin data frame
utaustin.rownnames <- rownames(utaustin)
# reclassify the recipient.business_categories_2
austin_tx$recipient.business_categories_2[ as.numeric( utaustin.rownnames ) ] <- "higher_education"

# reclassify the recipient.business_categories_1
austin_tx$recipient.business_categories_1[ as.numeric( utaustin.rownnames ) ] <- "public_institution_of_higher_education"

# reclassify the recipient.business_types_description
austin_tx$recipient.business_types_description[ as.numeric( utaustin.rownnames ) ] <- "Public/State Controlled Institution of Higher Education"

#View( austin_tx[ as.numeric(utaustin.rownnames), ])
#unique( austin_tx$recipient.business_types_description)

The FY17 distribution of recipients in Austin, Texas looks a lot different now!

austintx_rtype_df_all <- as.data.frame( sort( table( as.factor( austin_tx$recipient.business_types_description) ) )
                               , stringsAsFactors = FALSE
                               )
colnames( austintx_rtype_df_all ) <- c("Recipient Type", "FY17 Count")

# plot
par( mar = c(5, 22, 4, 4))
barplot( height = sort( table( as.factor( austin_tx$recipient.business_types_description ) ) )
         , horiz = TRUE
         , las = 1
         , main = "FY17 Recipient Types in Austin, Texas"
         , xlim = c(0, roundUpNice( max( austintx_rtype_df_all$`FY17 Count`) ))
         , xlab = "Award Type Frequency"
)
# draw vertical lines
# Add vertical lines to separate the years
abline( v = seq( from = 0
                 , to = roundUpNice( max( austintx_rtype_df_all$`FY17 Count`) )
                 , by = 100
                 )
        , col = "gray"
        , lty = 2
)

# change from character to numeric
austin_tx$federal_action_obligation <- as.numeric(austin_tx$federal_action_obligation)

# only include rows where federal action obligation amount is NOT NA
nonNA.austin_tx <- austin_tx[ is.na( austin_tx$federal_action_obligation ) == FALSE, ]

# how many rows did we drop?
nrow( austin_tx ) # 1190
## [1] 1190
nrow( nonNA.austin_tx ) # 1018
## [1] 1018
# how many transactions are less than $1M?
nrow( nonNA.austin_tx[ nonNA.austin_tx$federal_action_obligation < 1000000, ]) # 997
## [1] 997
# how many transactions are more than $1M?
nrow( nonNA.austin_tx[ nonNA.austin_tx$federal_action_obligation > 1000000, ]) # 21
## [1] 21
# Let's GeoCode and map our results!
unique( nonNA.austin_tx$recipient.business_types_description)
##  [1] "For-Profit Organization (Other than Small Business)"    
##  [2] "Private Institution of Higher Education"                
##  [3] "City or Township Government"                            
##  [4] "Independent School District"                            
##  [5] "State government"                                       
##  [6] "Public/State Controlled Institution of Higher Education"
##  [7] "Small Business"                                         
##  [8] "Other"                                                  
##  [9] "Individual"                                             
## [10] "Unknown Types"                                          
## [11] "Non-Higher Ed Nonprofit with 501(c)(3) IRS Status"
# load packages
library( ggmap)
library( dplyr )

# how many geocode queries do we have left?
geocodeQueryCheck() # 2500

# create full address
full.address <- function( address, city_name, state_code, zip_code ) {
  paste0( address
          , ", "
          , city_name
          , ", "
          , state_code
          , ifelse( test = is.na( zip_code )
                    , yes = ""
                    , no = paste0( " ", zip_code )
          )
  )
} # end of function
# make full address
nonNA.austin_tx$recipient.full.address <- full.address( address = nonNA.austin_tx$recipient.location.address_line1
                                                        , city_name = nonNA.austin_tx$recipient.location.city_name
                                                        , state_code = nonNA.austin_tx$recipient.location.state_code
                                                        , zip_code = nonNA.austin_tx$recipient.location.zip5
                                                        )
# Extract unique full addresses
length( nonNA.austin_tx$recipient.full.address ) # 1018
## [1] 1018
unique_recipient_austin_tx_fulladdress <- unique( nonNA.austin_tx$recipient.full.address ) # 
length( unique_recipient_austin_tx_fulladdress ) # 76
## [1] 76
# geocode results
geocode_results <- geocode( location = unique_recipient_austin_tx_fulladdress
                            , output = "latlon"
                            )
## Warning: geocode failed with status ZERO_RESULTS, location = "DBA-HHSC 4900
## NORTH LAMAR BLV 4-FL, AUSTIN, TX 78751"
# Add unique full address to geocode results
geocode_results$recipient.full.address <- unique_recipient_austin_tx_fulladdress

# merge!
nonNA.austin_tx <- left_join( x = geocode_results
                              , y = nonNA.austin_tx
                              , by = "recipient.full.address"
                              )

# how many business types are there?
unique(nonNA.austin_tx$recipient.business_types_description) # 11
##  [1] "For-Profit Organization (Other than Small Business)"    
##  [2] "Private Institution of Higher Education"                
##  [3] "City or Township Government"                            
##  [4] "Independent School District"                            
##  [5] "State government"                                       
##  [6] "Public/State Controlled Institution of Higher Education"
##  [7] "Small Business"                                         
##  [8] "Other"                                                  
##  [9] "Individual"                                             
## [10] "Unknown Types"                                          
## [11] "Non-Higher Ed Nonprofit with 501(c)(3) IRS Status"
# 11 colors by recipient business type
# colors come from the argument 'markerColor' inside the 'makeAwesomeIcon'
# function from the leaflet package
col.schema <- c("red", "beige", "cadetblue"
                , "white", "green"
                , "pink", "orange", "blue"
                , "black", "lightblue"
                , "darkgreen"
)

# Make data frame assigning colors to unique business types
business_colors <- data.frame( recipient.business_types_description = unique(nonNA.austin_tx$recipient.business_types_description)
                              , color = col.schema
                              , stringsAsFactors = FALSE
)

# merge this data frame onto the original data frame
nonNA.austin_tx <- left_join( x = business_colors
                              , y = nonNA.austin_tx
                              , by = "recipient.business_types_description"
                              )
# make lists based on business types
business_types <- split( nonNA.austin_tx
                         , nonNA.austin_tx$recipient.business_types_description
                         )

Interactive Map

# Load necessary packages
library(leaflet)
library(htmltools)
library(htmlwidgets)
library( scales )
library( magrittr )
library( stringi )

# edit map
map <- leaflet() %>%
  # set the view to Austin, Texas
  setView( lng = -97.743061
           , lat = 30.267153
           , zoom = 9
           ) %>%
   # add background to map
  addProviderTiles( providers$Esri.WorldStreetMap ) %>%
    # add mini map
  addMiniMap(
    tiles = providers$Esri.WorldStreetMap
    , toggleDisplay = TRUE
    , minimized = TRUE
    ) %>%
  # add zoom out button
  addEasyButton( easyButton(
    icon = "ion-android-globe", title = "Zoom Back Out"
    , onClick = JS("function(btn, map){ map.setZoom(9); }")
  ) )

# obtain each data frame within the list
names( business_types ) %>%
  purrr::walk( function( df ){
    map <<- map %>%
      addAwesomeMarkers( data = business_types[[df]]
                         , lng = ~lon
                         , lat = ~lat
                         # Add useful information onto the popup
                         , popup = paste0( "<b>Recipient Name: </b>"
                                           , stri_trans_totitle( business_types[[df]]$recipient.recipient_name )
                                           , "<br>"
                                           , "<b>Business Type: </b>"
                                           , business_types[[df]]$recipient.business_types_description
                                           , "<br>"
                                           , "<b>Total Obligation ($): </b>"
                                           , dollar( business_types[[df]]$federal_action_obligation )
                                           , "<br>"
                                           , "<b>Awarding Agency Name: </b>"
                                           , business_types[[df]]$awarding_agency.toptier_agency.name
                                           , "<br>"
                                           , "<b>Award Type: </b>"
                                           , business_types[[df]]$type_description
                                           , "<br>"
                                           , "<b>Date Signed: </b>"
                                           , business_types[[df]]$date_signed
                         )
                         , icon = awesomeIcons( icon = "ion-social-usd"
                                                , iconColor = "#eefef7"
                                                , library = "ion"
                                                , markerColor = business_types[[df]]$color
                         )
                         , group = df
                         , clusterOptions = markerClusterOptions()
                         , labelOptions = labelOptions( noHide = FALSE
                                                        , direction = "auto"))
  }) # end of combing through the list

map <- map %>%
  # Add widget to allow user to click which department
  # they would like to view
  addLayersControl(
    overlayGroups = names( business_types )
    , options = layersControlOptions( collapsed = TRUE )
  )
# View map
map

Check out more of my work on GitHub.

Last updated on June 18, 2017