You may have to install some if they give you an error.
require(tidyverse)
require(DT)
require(lubridate)
require(scales)
I had started with the idea we could use the API, but it was possible to just make a query that goes back to 2008 for all transactions. The only trick is that you have to choose the “sub-agency” FEMA, not the usual one, because before DHS it was an independent agency.
It says you can use the download link anytime, but if you don’t copy it (and the Copy Link button doesn’t work), I don’t see a way to get back to it.
Note that my search is on “Funding Agency”, not “Award agency”. The reason is that there are sometimes government-wide contracts that are managed by other agencies, and the agency just tags along. This happened during the Iraq war, when the Interior Department was hirng guards at Abu Ghraib.
I renamed the folder that was created from the zip file that starts with “all_prime_transactions_subawards…” to “fema_downlaods”. It was downloaded on June 4, 2019.
There were 4 files in there:
Here’s an “analysts’ guide” to the USA spending that might answer some questions: https://datalab.usaspending.gov/assets/analyst-guide-1-2.pdf
The easiest thing to get is contracts. Let’s see how it guesses.
It seemed to guess pretty well except for a few fields that it guesses as logical that have some illegal values. I changed them so to override the default types.
This section shouldn’t have to be run again, and won’t be when you knit it.
Then I went back and fixed the import, this time taking care to set the problematic fields to “c” instead of logical or numeric. This code should never have to run again, just load the saved file.
fema_contracts_orig <-
read_csv(file="raw_fema_data/fema_downlaods/all_contracts_prime_transactions_1.csv",
col_types = cols (.default = col_guess(),
sam_exception = "c",
other_statutory_authority = "c",
research_code = "c",
research = "c",
awarding_sub_agency_code = "c",
action_date=col_date())
)
#just turning it into a simpler object so you don't accidentally try to print the whole thing. Also, this will show you the data types at the top of the view.
fema_contracts_orig <- as_tibble(fema_contracts_orig)
#let's go ahead and save it so you don't have to do this again
save(fema_contracts_orig, file="usaspending_orig.RData")
rm(fema_contracts_orig)
This code is how I trouble-shooted the original reading. First, I looked at problems(fema_contracts_orig), which tells me which rows had issues. There, I could see that there were some fields that were being read as “logical” (T/F/1/0/etc) but weren’t, and a few others that should have been numbers but werent.
Second, I chose some of the rows (1650, 6664, and 16733) to just look at. Again, I just looked at it then removed the interim dataset. This code chouldn’t have to be run again, so it’s set in the r chunk options to {r eval=FALSE}
problems(fema_contracts_orig)
#look at a couple of the rows with problems
subset <- fema_contracts_orig [c(1650, 6664, 16733), ]
glimpse(subset)
rm(subset)
Let’s see which states get the most by year:
First, we have to load the dataset I saved above. This takes a while because it has SO MANY columns (261 of them to be exact.) Note that the RData file name isn’t the same as the data frame / tibble name. That’s because you can save lots of data frames and objects into an RData file, which will then package them up for you and let you load them whenever you want.
load("fema_contracts.RData")
The lubridate library lets us get the year, month, day from a date variable, so this way we could get the fiscal year. If the action_date is in months 10, 11, 12, then it’s the following fiscal year. (A fiscal year is noted by the year in which it ends, not begins. So we’re currently in FY 19 until Sept. 30)
This is a pretty straightforward set of code: figure out the fiscal year, group by and summarize, sort by descending amounts, then format the numbers into readable values in the DT (datatable) package. It will look better when it’s knitted.
fema_sumdata <-
fema_contracts_orig %>%
mutate( fiscal_year = ifelse(month(action_date) >= 10, year(action_date+1), year(action_date))) %>%
select (fiscal_year, st_abbr = primary_place_of_performance_state_code, recipient_duns, total_dollars_obligated,
federal_action_obligation, award_id_piid) %>%
group_by(fiscal_year, st_abbr) %>%
summarise ( num_of_actions = n(), num_of_contractors = n_distinct(recipient_duns),
obligations = sum(total_dollars_obligated, na.rm=TRUE),
num_of_contracts = n_distinct(award_id_piid),
federal_dollars = sum(federal_action_obligation, na.rm=TRUE)
)
fema_sumdata %>%
arrange(desc(federal_dollars)) %>%
datatable() %>%
formatCurrency(c("obligations", "federal_dollars")) %>%
formatRound(c("num_of_actions", "num_of_contractors"), digits=0, mark=",")
A couple of notes on this:
One thing I don’t understand - how, for example, the “federal action obligation” field sum can be negative, while the “total dollars obligated” can be positive.
Let’s look at the Lousisiana in FY 2008 to see how it works. It may be double-counting by summing.
USA Spending defines them this way:
Look at all dollar fields for Lousiana FY 2008 values.
fema_contracts_orig %>%
filter (primary_place_of_performance_state_code == 'LA' &
between (action_date, as.Date('2007-10-01'), as.Date('2008-09-30'))
) %>%
select (parent_award_id, parent_award_agency_id, parent_award_modification_number, award_id_piid, modification_number, transaction_number, recipient_name,
federal_action_obligation, total_dollars_obligated, base_and_all_options_value, potential_total_value_of_award,
action_date, period_of_performance_potential_end_date) %>%
mutate (difference = total_dollars_obligated - federal_action_obligation) %>%
arrange (award_id_piid, modification_number, transaction_number)
Use federal action obligated for actual spending. Use IDV values for the potential value, or look them up.
Here’s what I found based on this query. It’s very confusing.
The contract 0016 (Phillips & Jordan, and Enviornmental Chem Corp) are hired under the DOD contract shown in the parent award (W912P805D0022). The federal obligation is how much went to that contract, even though it’s really kind of a subcontract. The amount in “total obligation” is for the parent contract, which could include lots of different vendors in this case.
Take a look at this parent contract in the usaspending.gov site to see what I mean: https://www.usaspending.gov/#/award/69333261
(I’ll get rid of this field in the smaller sets so that you don’t get confused. It will end up double-counting anything that has these kinds of split-up contracts, which I think is common.)
I suggest going through an example item that you already know about – say, one that you had to write a report on for class – and then you can choose the fields you want to save.
We really need to get the years before 2007 if we’re going to include Louisiana (Hurricane Katrina). The reason is that this is the year they had to pay back money for work that was canceled or never done. There are a lot of negative entries for it, because that is money they never used – either they didn’t need it, they got it back because they cracked down on fraud, or something else.
Some thoughts:
fema_contracts_orig %>%
select ( state = primary_place_of_performance_state_name,
recipient_name,
award_description,
fed_dollars = federal_action_obligation,
solicitation_identifier,
total_value = base_and_exercised_options_value,
action_date,
award = award_id_piid) %>%
filter (fed_dollars > 1000000 ,
state == "CALIFORNIA",
action_date > "2018-01-01") %>%
arrange(desc(total_value))
So let’s look up that one – the MLU Services Inc in 2019-05-07, which is $53 million!
A couple of other things to notice:
The description SOMETIMES includes the disaster declaration. In this case, DR-4407. But not always!
Of course, this isn’t everything from MLU Corp. Only things that were in California and more than $1 million in 2019
Look it up in USA Spending as a way to see what all of the other fields are for. Example:
fema_contracts_orig %>%
filter (award_id_piid %in% c("70FB8019F00000043", "70FB8018D00000013")) %>%
select (award_id_piid:solicitation_identifier)
There should be a way to build a search for it into the table, but I can’t quite figure out what’s happening on the USA spending page. Instead, you can go to the Advanced Search page, and copy and paste the award ID into the Award ID box, then search. If I do that, I get one answer, and that leads to this page:
https://www.usaspending.gov/#/award/80471094
This gives you – in English, or to the extent contracting language is IN English – what each field means. Your data is at the transaction level, this is at the “award” level. In other words, this is an award with a total amount over time, but each transaction is a payment that’s made for pieces of the work.
I like a couple of things about this – you can see what office it’s coming from, and also what agency awarded the contract. For example, FEMA might be funding things that are done by HHS.
Once you find something of interest, you might look it up in FedBizOpps. For example, after fumbling around in FedBiz opps and working backwards from this, I found this:
Note that at the bottom, this company got a $50 million (?) contract with FEMA after a bid protest kicked out the other company! This is actually a great little tip: The previous company, Timberline Homes of LA, seemed to have had problems in past performance. (This is for a different contract.)
I can explain a little about how you get this. There usually won’t be solicitation numbers on “award” items, just on “IDV” items. I can explain. I actually took a little course in federal contracting rules when I was a reporter at the Post because we couldn’t figure this stuff out!
Don’t worry for now about the sub-contractors, but when you find something you want to dig into, you might check for them. They’ll be linked by the award numbers. (There may be none. Only the largest contractors have to report them, and that has been relatively recently)
Try finding a way to isolate events, like Sandy, Katrina, Maria, etc. – there may be something in here that does link them, and that would be good.
See about contractors that have a lot of de-obligations. This is money they never spent or had to give back to the government for some reason. Don’t worry about NET (sum) amounts – just look for substantial (say, over $100,000) negative numbers. If they’re all in a time or place, you might find some problems with the state or the federal government at that time.
Look into contractors that have high IDV’s (that means they contracted for a lot of money) but were actually awarded very little of it. That suggests that they didn’t do the job.
You might look for telltale signs in the data, using Tribute Contracting LLC , which this Times story said never did the work.
fema_contracts_orig %>%
filter (str_detect(recipient_name, "TRIBUTE CONTRACTING")) %>%
glimpse()
## Observations: 20
## Variables: 261
## $ award_id_piid <chr> "HSFE…
## $ modification_number <chr> "0", …
## $ transaction_number <dbl> NA, N…
## $ parent_award_agency_id <dbl> NA, N…
## $ parent_award_agency_name <chr> NA, N…
## $ parent_award_id <chr> NA, N…
## $ parent_award_modification_number <chr> "0", …
## $ federal_action_obligation <dbl> 500, …
## $ total_dollars_obligated <dbl> NA, N…
## $ base_and_exercised_options_value <dbl> 0, 0,…
## $ current_total_value_of_award <dbl> NA, N…
## $ base_and_all_options_value <dbl> 22000…
## $ potential_total_value_of_award <dbl> NA, N…
## $ action_date <date> 2013…
## $ period_of_performance_start_date <dttm> 2013…
## $ period_of_performance_current_end_date <dttm> NA, …
## $ period_of_performance_potential_end_date <dttm> NA, …
## $ ordering_period_end_date <date> 2018…
## $ awarding_agency_code <chr> "070"…
## $ awarding_agency_name <chr> "DEPA…
## $ awarding_sub_agency_code <chr> "7022…
## $ awarding_sub_agency_name <chr> "FEDE…
## $ awarding_office_code <chr> "LOG7…
## $ awarding_office_name <chr> "LOGI…
## $ funding_agency_code <chr> "070"…
## $ funding_agency_name <chr> "DEPA…
## $ funding_sub_agency_code <dbl> 7022,…
## $ funding_sub_agency_name <chr> "FEDE…
## $ funding_office_code <chr> "LOG7…
## $ funding_office_name <chr> "LOGI…
## $ foreign_funding <chr> NA, N…
## $ foreign_funding_description <chr> "NOT …
## $ sam_exception <chr> NA, N…
## $ sam_exception_description <chr> "NAN"…
## $ recipient_duns <chr> "0790…
## $ recipient_name <chr> "TRIB…
## $ recipient_doing_business_as_name <chr> NA, N…
## $ cage_code <chr> NA, N…
## $ recipient_parent_name <chr> "TRIB…
## $ recipient_parent_duns <chr> "0790…
## $ recipient_country_code <chr> "UNIT…
## $ recipient_country_name <chr> NA, N…
## $ recipient_address_line_1 <chr> "1097…
## $ recipient_address_line_2 <chr> NA, N…
## $ recipient_city_name <chr> "ATLA…
## $ recipient_state_code <chr> "GA",…
## $ recipient_state_name <chr> NA, N…
## $ recipient_zip_4_code <chr> "3030…
## $ recipient_congressional_district <chr> "GA05…
## $ recipient_phone_number <chr> "4048…
## $ recipient_fax_number <chr> NA, N…
## $ primary_place_of_performance_country_code <chr> NA, N…
## $ primary_place_of_performance_country_name <chr> NA, N…
## $ primary_place_of_performance_city_name <chr> NA, N…
## $ primary_place_of_performance_county_name <chr> NA, N…
## $ primary_place_of_performance_state_code <chr> NA, N…
## $ primary_place_of_performance_state_name <chr> NA, N…
## $ primary_place_of_performance_zip_4 <chr> NA, N…
## $ primary_place_of_performance_congressional_district <chr> NA, N…
## $ award_or_idv_flag <chr> "IDV"…
## $ award_type_code <chr> NA, N…
## $ award_type <chr> NA, N…
## $ idv_type_code <chr> "B", …
## $ idv_type <chr> "IDC"…
## $ multiple_or_single_award_idv_code <chr> "M", …
## $ multiple_or_single_award_idv <chr> "MULT…
## $ type_of_idc_code <chr> NA, N…
## $ type_of_idc <chr> "INDE…
## $ type_of_contract_pricing_code <chr> "J", …
## $ type_of_contract_pricing <chr> "FIRM…
## $ award_description <chr> "CLEA…
## $ action_type_code <chr> NA, N…
## $ action_type <chr> NA, N…
## $ solicitation_identifier <chr> "HSFE…
## $ number_of_actions <dbl> NA, N…
## $ inherently_governmental_functions <chr> NA, N…
## $ inherently_governmental_functions_description <chr> NA, N…
## $ product_or_service_code <chr> "7920…
## $ product_or_service_code_description <chr> "BROO…
## $ contract_bundling_code <chr> "D", …
## $ contract_bundling <chr> "NOT …
## $ dod_claimant_program_code <chr> NA, N…
## $ dod_claimant_program_description <chr> NA, N…
## $ naics_code <dbl> 42385…
## $ naics_description <chr> "SERV…
## $ recovered_materials_sustainability_code <chr> "E", …
## $ recovered_materials_sustainability <chr> "BIO-…
## $ domestic_or_foreign_entity_code <chr> "A", …
## $ domestic_or_foreign_entity <chr> "U.S.…
## $ dod_acquisition_program_code <chr> NA, N…
## $ dod_acquisition_program_description <chr> NA, N…
## $ information_technology_commercial_item_category_code <chr> NA, N…
## $ information_technology_commercial_item_category <chr> NA, N…
## $ epa_designated_product_code <chr> NA, N…
## $ epa_designated_product <chr> NA, N…
## $ country_of_product_or_service_origin_code <chr> "NAN"…
## $ country_of_product_or_service_origin <chr> NA, N…
## $ place_of_manufacture_code <chr> NA, N…
## $ place_of_manufacture <chr> NA, N…
## $ subcontracting_plan_code <chr> "B", …
## $ subcontracting_plan <chr> "PLAN…
## $ extent_competed_code <chr> "D", …
## $ extent_competed <chr> "FULL…
## $ solicitation_procedures_code <chr> "NP",…
## $ solicitation_procedures <chr> "NEGO…
## $ type_of_set_aside_code <chr> "SBA"…
## $ type_of_set_aside <chr> "SMAL…
## $ evaluated_preference_code <chr> "NONE…
## $ evaluated_preference <chr> "NO P…
## $ research_code <chr> NA, N…
## $ research <chr> "NAN"…
## $ fair_opportunity_limited_sources_code <chr> NA, N…
## $ fair_opportunity_limited_sources <chr> "NAN"…
## $ other_than_full_and_open_competition_code <chr> NA, N…
## $ other_than_full_and_open_competition <chr> NA, N…
## $ number_of_offers_received <dbl> 3, 4,…
## $ commercial_item_acquisition_procedures_code <chr> "A", …
## $ commercial_item_acquisition_procedures <chr> NA, N…
## $ small_business_competitiveness_demonstration_program <lgl> FALSE…
## $ commercial_item_test_program_code <chr> "N", …
## $ commercial_item_test_program <chr> "NO",…
## $ a76_fair_act_action_code <chr> "NO",…
## $ a76_fair_act_action <chr> NA, N…
## $ fed_biz_opps_code <chr> "Y", …
## $ fed_biz_opps <chr> "YES"…
## $ local_area_set_aside_code <chr> "N", …
## $ local_area_set_aside <chr> NA, N…
## $ price_evaluation_adjustment_preference_percent_difference <dbl> NA, N…
## $ clinger_cohen_act_planning_code <chr> "Y: Y…
## $ clinger_cohen_act_planning <chr> NA, N…
## $ materials_supplies_articles_equipment_code <chr> "X", …
## $ materials_supplies_articles_equipment <chr> "NOT …
## $ labor_standards_code <chr> "X", …
## $ labor_standards <chr> "NOT …
## $ construction_wage_rate_requirements_code <chr> "X", …
## $ construction_wage_rate_requirements <chr> "NOT …
## $ interagency_contracting_authority_code <chr> "X", …
## $ interagency_contracting_authority <chr> "NOT …
## $ other_statutory_authority <chr> NA, N…
## $ program_acronym <chr> NA, N…
## $ parent_award_type_code <chr> NA, N…
## $ parent_award_type <chr> NA, N…
## $ parent_award_single_or_multiple_code <chr> NA, N…
## $ parent_award_single_or_multiple <chr> NA, N…
## $ major_program <chr> NA, N…
## $ national_interest_action_code <chr> "NONE…
## $ national_interest_action <chr> "NONE…
## $ cost_or_pricing_data_code <chr> "N", …
## $ cost_or_pricing_data <chr> "NO",…
## $ cost_accounting_standards_clause_code <chr> "X", …
## $ cost_accounting_standards_clause <chr> "NOT …
## $ gfe_gfp_code <chr> "N", …
## $ gfe_gfp <chr> "TRAN…
## $ sea_transportation_code <chr> NA, N…
## $ sea_transportation <chr> NA, N…
## $ undefinitized_action_code <chr> "X", …
## $ undefinitized_action <chr> "NO",…
## $ consolidated_contract_code <chr> "N", …
## $ consolidated_contract <chr> "NO",…
## $ performance_based_service_acquisition_code <chr> "X", …
## $ performance_based_service_acquisition <chr> "NOT …
## $ multi_year_contract_code <chr> "N", …
## $ multi_year_contract <chr> "NO",…
## $ contract_financing_code <chr> NA, N…
## $ contract_financing <chr> NA, N…
## $ purchase_card_as_payment_method_code <chr> NA, N…
## $ purchase_card_as_payment_method <chr> NA, N…
## $ contingency_humanitarian_or_peacekeeping_operation_code <chr> NA, N…
## $ contingency_humanitarian_or_peacekeeping_operation <chr> NA, N…
## $ alaskan_native_owned_corporation_or_firm <lgl> FALSE…
## $ american_indian_owned_business <lgl> FALSE…
## $ indian_tribe_federally_recognized <lgl> FALSE…
## $ native_hawaiian_owned_business <lgl> FALSE…
## $ tribally_owned_business <lgl> FALSE…
## $ veteran_owned_business <lgl> FALSE…
## $ service_disabled_veteran_owned_business <lgl> FALSE…
## $ woman_owned_business <lgl> TRUE,…
## $ women_owned_small_business <lgl> FALSE…
## $ economically_disadvantaged_women_owned_small_business <lgl> FALSE…
## $ joint_venture_women_owned_small_business <lgl> FALSE…
## $ joint_venture_economic_disadvantaged_women_owned_small_bus <lgl> FALSE…
## $ minority_owned_business <lgl> TRUE,…
## $ subcontinent_asian_asian_indian_american_owned_business <lgl> FALSE…
## $ asian_pacific_american_owned_business <lgl> FALSE…
## $ black_american_owned_business <lgl> TRUE,…
## $ hispanic_american_owned_business <lgl> FALSE…
## $ native_american_owned_business <lgl> FALSE…
## $ other_minority_owned_business <lgl> FALSE…
## $ contracting_officers_determination_of_business_size <chr> "SMAL…
## $ contracting_officers_determination_of_business_size_code <chr> "S", …
## $ emerging_small_business <lgl> FALSE…
## $ community_developed_corporation_owned_firm <lgl> FALSE…
## $ labor_surplus_area_firm <lgl> FALSE…
## $ us_federal_government <lgl> FALSE…
## $ federally_funded_research_and_development_corp <lgl> FALSE…
## $ federal_agency <lgl> FALSE…
## $ us_state_government <lgl> FALSE…
## $ us_local_government <lgl> FALSE…
## $ city_local_government <lgl> FALSE…
## $ county_local_government <lgl> FALSE…
## $ inter_municipal_local_government <lgl> FALSE…
## $ local_government_owned <lgl> FALSE…
## $ municipality_local_government <lgl> FALSE…
## $ school_district_local_government <lgl> FALSE…
## $ township_local_government <lgl> FALSE…
## $ us_tribal_government <lgl> FALSE…
## $ foreign_government <lgl> FALSE…
## $ organizational_type <chr> NA, N…
## $ corporate_entity_not_tax_exempt <lgl> FALSE…
## $ corporate_entity_tax_exempt <lgl> FALSE…
## $ partnership_or_limited_liability_partnership <lgl> TRUE,…
## $ sole_proprietorship <lgl> FALSE…
## $ small_agricultural_cooperative <lgl> FALSE…
## $ international_organization <lgl> FALSE…
## $ us_government_entity <lgl> FALSE…
## $ community_development_corporation <lgl> FALSE…
## $ domestic_shelter <lgl> FALSE…
## $ educational_institution <lgl> FALSE…
## $ foundation <lgl> FALSE…
## $ hospital_flag <lgl> FALSE…
## $ manufacturer_of_goods <lgl> FALSE…
## $ veterinary_hospital <lgl> FALSE…
## $ hispanic_servicing_institution <lgl> FALSE…
## $ receives_contracts <lgl> TRUE,…
## $ receives_grants <lgl> TRUE,…
## $ receives_contracts_and_grants <lgl> TRUE,…
## $ airport_authority <lgl> FALSE…
## $ council_of_governments <lgl> FALSE…
## $ housing_authorities_public_tribal <lgl> FALSE…
## $ interstate_entity <lgl> FALSE…
## $ planning_commission <lgl> FALSE…
## $ port_authority <lgl> FALSE…
## $ transit_authority <lgl> FALSE…
## $ subchapter_scorporation <lgl> FALSE…
## $ limited_liability_corporation <lgl> FALSE…
## $ foreign_owned_and_located <lgl> FALSE…
## $ for_profit_organization <lgl> TRUE,…
## $ nonprofit_organization <lgl> FALSE…
## $ other_not_for_profit_organization <lgl> FALSE…
## $ the_ability_one_program <lgl> FALSE…
## $ number_of_employees <dbl> 1, 1,…
## $ annual_revenue <dbl> 1000,…
## $ private_university_or_college <lgl> FALSE…
## $ state_controlled_institution_of_higher_learning <lgl> FALSE…
## $ `1862_land_grant_college` <lgl> FALSE…
## $ `1890_land_grant_college` <lgl> FALSE…
## $ `1994_land_grant_college` <lgl> FALSE…
## $ minority_institution <lgl> FALSE…
## $ historically_black_college <lgl> FALSE…
## $ tribal_college <lgl> FALSE…
## $ alaskan_native_servicing_institution <lgl> FALSE…
## $ native_hawaiian_servicing_institution <lgl> FALSE…
## $ school_of_forestry <lgl> FALSE…
## $ veterinary_college <lgl> FALSE…
## $ dot_certified_disadvantage <lgl> FALSE…
## $ self_certified_small_disadvantaged_business <lgl> FALSE…
## $ small_disadvantaged_business <lgl> TRUE,…
## $ c8a_program_participant <lgl> FALSE…
## $ historically_underutilized_business_zone_hubzone_firm <lgl> FALSE…
## $ sba_certified_8a_joint_venture <lgl> FALSE…
## $ last_modified_date <dttm> 2013…
I put the results of the previous step into Excel, then parsed it to get the names of the fields. I create a column YN for whether we wanted to keep it, filtered just for them, and the renamed them in a big, long select statemet. At least this gets it down to 70-some fields, which is still a lot, but they also have shorter names for typing. Look at the file here called rename_select_contract_variables.xlsx to see how it’s done.
Here is a quote on how to create a unique ID for each transaction:
The unique_transaction_ids were assigned by the legacy database and, therefore, couldn’t be transferred over to the new USAspending.gov database. However, you can create your own by concatenation. For contracts, you’d need these 6 fields: award_id_piid, modification_number, transaction_number, parent_award_id, parent_award_modification_number, and awarding_sub_agency_code.
Also, the analyst’s guide mentioned above has some really important information. For example:
Finally, it’s worth noting that if all initially obligated money isn’t used, the last modification will display a negative amount. For example, if an initial contract was made for $10,000 and an agency only used $9,000 of that initial obligation, the last record would display an amount of -$1,000.
Create the unique id using those 6 fields, with the paste() command, separating the pieces with “-”. (It will have “-NA-” if it doesn’t exist, but we don’t really care. It’s still unique.)
Add a fiscal year to the dataset so we don’t have to keep doing it over and over. I also suggest that you split out IDV from AWARD data – they kind of have different things in them.
Crap - there are no FIP codes anywhere in the original data! Do they keep this from us on purpose?? How do they do their maps?? It’s possible that they are consistent enough that they can be used. Otherwise, have to geocode the city.
See the spreadsheet in this folder called
fema_all <-
fema_contracts_orig %>%
mutate( unique_id =
paste(award_id_piid,
modification_number,
transaction_number,
parent_award_id,
parent_award_modification_number,
awarding_sub_agency_code, sep="-") ,
fiscal_year = ifelse(month(action_date) >= 10, year(action_date+1), year(action_date))
) %>%
select (fiscal_year,
piid=award_id_piid,
mod=modification_number,
trans_num=transaction_number,
parent_agency_id=parent_award_agency_id,
parent_agency=parent_award_agency_name,
parent_award_id=parent_award_id,
parent_mod=parent_award_modification_number,
fed_obligation=federal_action_obligation,
exercised_value=base_and_exercised_options_value,
current_value=current_total_value_of_award,
total_value=base_and_all_options_value,
potential_value=potential_total_value_of_award,
action_date=action_date,
start_date=period_of_performance_start_date,
current_end_date=period_of_performance_current_end_date,
potential_end_date=period_of_performance_potential_end_date,
order_end_date=ordering_period_end_date,
award_agency_id=awarding_agency_code,
award_agency=awarding_agency_name,
award_subagency_id=awarding_sub_agency_code,
award_subagency=awarding_sub_agency_name,
award_office_id=awarding_office_code,
award_office=awarding_office_name,
funding_agency_id=funding_agency_code,
funding_agency=funding_agency_name,
funding_subagency_id=funding_sub_agency_code,
funding_subagency=funding_sub_agency_name,
funding_office_id=funding_office_code,
funding_office=funding_office_name,
recip_duns=recipient_duns,
recip_name=recipient_name,
recip_dba=recipient_doing_business_as_name,
recip_parent=recipient_parent_name,
recip_parent_duns=recipient_parent_duns,
recip_address=recipient_address_line_1,
recip_city=recipient_city_name,
recip_stabbr=recipient_state_code,
recip_state=recipient_state_name,
recip_zip4=recipient_zip_4_code,
recip_cd=recipient_congressional_district,
recip_phone=recipient_phone_number,
pop_city=primary_place_of_performance_city_name,
pop_county=primary_place_of_performance_county_name,
pop_stabbr=primary_place_of_performance_state_code,
pop_state=primary_place_of_performance_state_name,
pop_zip4=primary_place_of_performance_zip_4,
pop_cd=primary_place_of_performance_congressional_district,
award_idv_flag=award_or_idv_flag,
award_type=award_type,
idv_type=idv_type,
mult_award_idv=multiple_or_single_award_idv,
type_idc=type_of_idc,
type_pricing=type_of_contract_pricing,
description=award_description,
type_action=action_type,
solicitation_id=solicitation_identifier,
action_ct=number_of_actions,
gov_function=inherently_governmental_functions_description,
prod_code=product_or_service_code,
prod_name=product_or_service_code_description,
bundled=contract_bundling,
naics_code=naics_code,
naics_name=naics_description,
subcontracting=subcontracting_plan,
competed=extent_competed,
solicitation_procedure=solicitation_procedures,
set_aside=type_of_set_aside,
preference=evaluated_preference,
not_competed=other_than_full_and_open_competition,
offers_received=number_of_offers_received,
fed_biz_opps_yn=fed_biz_opps,
acronym=program_acronym,
national_interest=national_interest_action,
multi_year=multi_year_contract,
last_modified_date=last_modified_date,
unique_id)
#check to see that all of them are in the two words. - it matches
#the number of records in the dataset.
fema_all %>%
filter(award_idv_flag %in% c("IDV", "AWARD")) %>%
summarise (n())
fema_idv <-
fema_all %>%
filter(award_idv_flag == "IDV")
fema_award <-
fema_all %>%
filter (award_idv_flag == "AWARD")
#Now create a new R data file that contains both smaller tables.
save(fema_idv, fema_award , file="usaspending_split.RData")
rm(list=c('fema_idv', 'fema_award', 'fema_all'))
Now you can just load them when you need them. This time look at what’s in the National Interest field, to see if we can use it to isolate specific important events.
load("usaspending_split.RData")
#notice that you might get some of what you want this way, but I wouldn't count on them to fill it out.
fema_idv %>%
filter (national_interest != "NONE") %>%
group_by (national_interest) %>%
summarise( num_idvs = n(), num_contracts = n_distinct(piid))
fema_idv %>%
filter (national_interest== "HURRICANE MARIA 2017")
Now look at the awards that came from it:
fema_award %>%
filter (national_interest == "HURRICANE MARIA 2017") %>%
arrange (action_date)