This markdown gives you instructions for importing and understanding the PPP data for your mini-memo. There are other background documents and example stories in the Canvas module that you can also review.

Download the data

To load the data into your Global Environment add this line to a code chunk:

    data_url <- "https://github.com/cronkitedata/rstudyguide/blob/master/data/az_ppp.Rda?raw=true"

    load (url ( data_url) )
    

That will create a data fram called az_ppp in your RStudio session.

Source

The original data was published by the SBA here https://sba.app.box.com/s/5myd1nxutoq8wxecx2562baruz774si6 on Feb. 1, 2021. I downloaded and unzipped the PPP portion, and read it into R using a different program.

The SBA has never released a record layout or data dictionary for this data that I can find. I worked on a project over the Christmas break using an earlier version of the data, so I think I understand much of what it contains.

I selected about half of the variables that are included in the original data, and simplified a few as shown below. I also converted all character fields to upper case to ease filtering, and converted the Zip Codes to 5-digits .

This dataset includes all loans that were made to companies in Arizona. A handful (6) of these are going toward projects in other states.

Here are the columns I kept, and what I understand them to mean. Any changes other than those above are indicated in the record layout.

The loan application form . There’s one important thing I don’t know the answer to: There are no rows with a loan statust that suggests that it was forgiven, though we know from reports that more than $100 billion have already been forgiven. There is a “loan status” field in the data, but that doesn’t have anything related to forgiveness, and the “PAID IN FULL” indicator doesn’t add up to nearly enough money. I’m trying to find out what the answer is there, but I wanted to warn you not to use that column until we get some clarity on it.

column name type description
id n A unique identifier supplied by the SBA
date_approved d Date the loan was initially approved
status_date d When the status was reported (NA if unreported)
borrower c Name of the borrower, all upper case
address c Address of the borrower
city c
state c
zip c 5-digit Zip code of the borrower
loan_status c Blank if redacted because of EXEMPTION 4 - VERY unclear what this means!
term n Months to pay back
initial_amt n Initial amount approved by SBA
payroll_amt n Amount used for payroll
nonpayroll_amt n Total amount used for other costs
franchise c Name of a franchisee
lender c Name of originating lender (not servicing lender)
lender_state c
rural_urban c
business_age c How long the business had been open before the loan -
project_city c City where the money will be used
project_county c County where the money will be used
project_state c State where the county will be used
jobs_reported n Number of jobs “saved” by the loan
naics_code c Standard industry code used in the federal government
naics_sector c Sector of that standard industry code, where it was a valid one
naics_descript c 2017 description of the industry, blank if a previous year’s version used
race_ethnicity c Race and ethnicity of the business owner, but usually not filled out
gender c Same with gender
veteran c Same with veteran
non_profit c Only filled out if the borrower is a non-profit organization
LS0tCnRpdGxlOiAiUFBQIGxvYW5zIGluIEFyaXpvbmEiCmF1dGhvcjogIlNhcmFoIENvaGVuIgpkYXRlOiAiYHIgU3lzLkRhdGUoKWAiCm91dHB1dDogCiAgaHRtbF9kb2N1bWVudDoKICAgIHRoZW1lOiB5ZXRpCiAgICBoaWdobGlnaHQ6IGthdGUKICAgIHRvYzogdHJ1ZQogICAgdG9jX2Zsb2F0OiB0cnVlCiAgICBkZl9wcmludDogcGFnZWQKICAgIGNvZGVfZG93bmxvYWQ6IHRydWUKLS0tCgoKYGBge3Igc2V0dXAsIGluY2x1ZGU9RkFMU0UsIG1lc3NhZ2U9RkFMU0UsIHdhcm5pbmc9RkFMU0V9CgpsaWJyYXJ5KGtuaXRyKQpsaWJyYXJ5KHJtZGZvcm1hdHMpCgprbml0cjo6b3B0c19jaHVuayRzZXQoZWNobyA9IFRSVUUpCgpsaWJyYXJ5KHRpZHl2ZXJzZSkKbGlicmFyeShqYW5pdG9yKQpsaWJyYXJ5KGx1YnJpZGF0ZSkKCgpvcHRpb25zKHNjaXBlbj05OTksIGRpZ2l0cz0xNSkKCmRhdGFfdXJsIDwtICJodHRwczovL2dpdGh1Yi5jb20vY3JvbmtpdGVkYXRhL3JzdHVkeWd1aWRlL2Jsb2IvbWFzdGVyL2RhdGEvYXpfcHBwLlJkYT9yYXc9dHJ1ZSIKbG9hZCAodXJsIChkYXRhX3VybCkgKQpgYGAKCgpUaGlzIG1hcmtkb3duIGdpdmVzIHlvdSBpbnN0cnVjdGlvbnMgZm9yIGltcG9ydGluZyBhbmQgdW5kZXJzdGFuZGluZyB0aGUgUFBQIGRhdGEgZm9yIHlvdXIgbWluaS1tZW1vLiBUaGVyZSBhcmUgb3RoZXIgYmFja2dyb3VuZCBkb2N1bWVudHMgYW5kIGV4YW1wbGUgc3RvcmllcyBpbiB0aGUgQ2FudmFzIG1vZHVsZSB0aGF0IHlvdSBjYW4gYWxzbyByZXZpZXcuIAoKIyMgRG93bmxvYWQgdGhlIGRhdGEgCgoKVG8gbG9hZCB0aGUgZGF0YSBpbnRvIHlvdXIgR2xvYmFsIEVudmlyb25tZW50IGFkZCB0aGlzIGxpbmUgdG8gYSBjb2RlIGNodW5rOiAKCiAgICAgICAgZGF0YV91cmwgPC0gImh0dHBzOi8vZ2l0aHViLmNvbS9jcm9ua2l0ZWRhdGEvcnN0dWR5Z3VpZGUvYmxvYi9tYXN0ZXIvZGF0YS9hel9wcHAuUmRhP3Jhdz10cnVlIgoKICAgICAgICBsb2FkICh1cmwgKCBkYXRhX3VybCkgKQogICAgICAgIAoKClRoYXQgd2lsbCBjcmVhdGUgYSBkYXRhIGZyYW0gY2FsbGVkIGBhel9wcHBgIGluIHlvdXIgUlN0dWRpbyBzZXNzaW9uLiAgCgoKCiMjIFNvdXJjZQoKVGhlIG9yaWdpbmFsIGRhdGEgd2FzIHB1Ymxpc2hlZCBieSB0aGUgU0JBIGhlcmUgPGh0dHBzOi8vc2JhLmFwcC5ib3guY29tL3MvNW15ZDFueHV0b3E4d3hlY3gyNTYyYmFydXo3NzRzaTY+IG9uIEZlYi4gMSwgMjAyMS4gSSBkb3dubG9hZGVkIGFuZCB1bnppcHBlZCB0aGUgUFBQIHBvcnRpb24sIGFuZCByZWFkIGl0IGludG8gUiB1c2luZyBhIGRpZmZlcmVudCBwcm9ncmFtLiAKClRoZSBTQkEgaGFzIG5ldmVyIHJlbGVhc2VkIGEgcmVjb3JkIGxheW91dCBvciBkYXRhIGRpY3Rpb25hcnkgZm9yIHRoaXMgZGF0YSB0aGF0IEkgY2FuIGZpbmQuIEkgd29ya2VkIG9uIGEgcHJvamVjdCBvdmVyIHRoZSBDaHJpc3RtYXMgYnJlYWsgdXNpbmcgYW4gZWFybGllciB2ZXJzaW9uIG9mIHRoZSBkYXRhLCBzbyBJICp0aGluayogSSB1bmRlcnN0YW5kIG11Y2ggb2Ygd2hhdCBpdCBjb250YWlucy4gCgpJIHNlbGVjdGVkIGFib3V0IGhhbGYgb2YgdGhlIHZhcmlhYmxlcyB0aGF0IGFyZSBpbmNsdWRlZCBpbiB0aGUgb3JpZ2luYWwgZGF0YSwgYW5kIHNpbXBsaWZpZWQgYSBmZXcgYXMgc2hvd24gYmVsb3cuIEkgYWxzbyBjb252ZXJ0ZWQgYWxsIGNoYXJhY3RlciBmaWVsZHMgdG8gdXBwZXIgY2FzZSB0byBlYXNlIGZpbHRlcmluZywgYW5kIGNvbnZlcnRlZCB0aGUgWmlwIENvZGVzIHRvIDUtZGlnaXRzIC4gCgpUaGlzIGRhdGFzZXQgaW5jbHVkZXMgYWxsIGxvYW5zIHRoYXQgd2VyZSBtYWRlIHRvIGNvbXBhbmllcyBpbiBBcml6b25hLiBBIGhhbmRmdWwgKDYpIG9mIHRoZXNlIGFyZSBnb2luZyB0b3dhcmQgcHJvamVjdHMgaW4gb3RoZXIgc3RhdGVzLiAgIAoKSGVyZSBhcmUgdGhlIGNvbHVtbnMgSSBrZXB0LCBhbmQgd2hhdCBJIHVuZGVyc3RhbmQgdGhlbSB0byBtZWFuLiBBbnkgY2hhbmdlcyBvdGhlciB0aGFuIHRob3NlIGFib3ZlIGFyZSBpbmRpY2F0ZWQgaW4gdGhlIHJlY29yZCBsYXlvdXQuIAoKClRoZSBbbG9hbiBhcHBsaWNhdGlvbiBmb3JtXShodHRwczovL3d3dy5zYmEuZ292L3NpdGVzL2RlZmF1bHQvZmlsZXMvMjAyMC0wNC9QUFAtQm9ycm93ZXItQXBwbGljYXRpb24tRm9ybS1GaWxsYWJsZS5wZGYpIC4gVGhlcmUncyBvbmUgaW1wb3J0YW50IHRoaW5nIEkgZG9uJ3Qga25vdyB0aGUgYW5zd2VyIHRvOiBUaGVyZSBhcmUgbm8gcm93cyB3aXRoIGEgbG9hbiBzdGF0dXN0IHRoYXQgc3VnZ2VzdHMgdGhhdCBpdCB3YXMgZm9yZ2l2ZW4sIHRob3VnaCB3ZSBrbm93IGZyb20gcmVwb3J0cyB0aGF0IG1vcmUgdGhhbiAkMTAwIGJpbGxpb24gaGF2ZSBhbHJlYWR5IGJlZW4gZm9yZ2l2ZW4uIFRoZXJlIGlzIGEgImxvYW4gc3RhdHVzIiBmaWVsZCBpbiB0aGUgZGF0YSwgYnV0IHRoYXQgZG9lc24ndCBoYXZlIGFueXRoaW5nIHJlbGF0ZWQgdG8gZm9yZ2l2ZW5lc3MsIGFuZCB0aGUgIlBBSUQgSU4gRlVMTCIgaW5kaWNhdG9yIGRvZXNuJ3QgYWRkIHVwIHRvIG5lYXJseSBlbm91Z2ggbW9uZXkuIEknbSB0cnlpbmcgdG8gZmluZCBvdXQgd2hhdCB0aGUgYW5zd2VyIGlzIHRoZXJlLCBidXQgSSB3YW50ZWQgdG8gd2FybiB5b3Ugbm90IHRvIHVzZSB0aGF0IGNvbHVtbiB1bnRpbCB3ZSBnZXQgc29tZSBjbGFyaXR5IG9uIGl0LiAKCjxkaXYgc3R5bGU9IndpZHRoOjcwJSI+Cgpjb2x1bW4gbmFtZSAgICAgICB8IHR5cGUgfCBkZXNjcmlwdGlvbgotLS0tLS0tLS0tLS0tLS0tLS18IC0tLS0gfCAtLS0tLS0tLS0tLQppZCAgICAgICAgICAgICAgICB8ICBuICB8IEEgdW5pcXVlIGlkZW50aWZpZXIgc3VwcGxpZWQgYnkgdGhlIFNCQQpkYXRlX2FwcHJvdmVkICAgICB8ICBkICB8IERhdGUgdGhlIGxvYW4gd2FzIGluaXRpYWxseSBhcHByb3ZlZApzdGF0dXNfZGF0ZSAgICAgICB8ICBkICB8IFdoZW4gdGhlIHN0YXR1cyB3YXMgcmVwb3J0ZWQgKE5BIGlmIHVucmVwb3J0ZWQpCmJvcnJvd2VyICAgICAgICAgIHwgIGMgIHwgTmFtZSBvZiB0aGUgYm9ycm93ZXIsIGFsbCB1cHBlciBjYXNlCmFkZHJlc3MgICAgICAgICAgIHwgIGMgIHwgQWRkcmVzcyBvZiB0aGUgYm9ycm93ZXIKY2l0eSAgICAgICAgICAgICAgfCAgYyAgfApzdGF0ZSAgICAgICAgICAgICB8ICBjICB8CnppcCAgICAgICAgICAgICAgIHwgIGMgIHwgNS1kaWdpdCBaaXAgY29kZSBvZiB0aGUgYm9ycm93ZXIKbG9hbl9zdGF0dXMgICAgICAgfCAgYyAgfCBCbGFuayBpZiByZWRhY3RlZCBiZWNhdXNlIG9mIEVYRU1QVElPTiA0IC0gVkVSWSB1bmNsZWFyIHdoYXQgdGhpcyBtZWFucyEgCnRlcm0gICAgICAgICAgICAgIHwgIG4gIHwgTW9udGhzIHRvIHBheSBiYWNrCmluaXRpYWxfYW10ICAgICAgIHwgIG4gIHwgSW5pdGlhbCBhbW91bnQgYXBwcm92ZWQgYnkgU0JBCnBheXJvbGxfYW10ICAgICAgIHwgIG4gIHwgQW1vdW50IHVzZWQgZm9yIHBheXJvbGwKbm9ucGF5cm9sbF9hbXQgICAgfCAgbiAgfCBUb3RhbCBhbW91bnQgdXNlZCBmb3Igb3RoZXIgY29zdHMKZnJhbmNoaXNlICAgICAgICAgfCAgYyAgfCBOYW1lIG9mIGEgZnJhbmNoaXNlZSAKbGVuZGVyICAgICAgICAgICAgfCAgYyAgfCBOYW1lIG9mIG9yaWdpbmF0aW5nIGxlbmRlciAobm90IHNlcnZpY2luZyBsZW5kZXIpCmxlbmRlcl9zdGF0ZSAgICAgIHwgIGMgIHwKcnVyYWxfdXJiYW4gICAgICAgfCAgYyAgfApidXNpbmVzc19hZ2UgICAgICB8ICBjICB8IEhvdyBsb25nIHRoZSBidXNpbmVzcyBoYWQgYmVlbiBvcGVuIGJlZm9yZSB0aGUgbG9hbiAtIApwcm9qZWN0X2NpdHkgICAgICB8ICBjICB8IENpdHkgd2hlcmUgdGhlIG1vbmV5IHdpbGwgYmUgdXNlZApwcm9qZWN0X2NvdW50eSAgICB8ICBjICB8IENvdW50eSB3aGVyZSB0aGUgbW9uZXkgd2lsbCBiZSB1c2VkCnByb2plY3Rfc3RhdGUgICAgIHwgIGMgIHwgU3RhdGUgd2hlcmUgdGhlIGNvdW50eSB3aWxsIGJlIHVzZWQKam9ic19yZXBvcnRlZCAgICAgfCAgbiAgfCBOdW1iZXIgb2Ygam9icyAic2F2ZWQiIGJ5IHRoZSBsb2FuCm5haWNzX2NvZGUgICAgICAgIHwgIGMgIHwgU3RhbmRhcmQgaW5kdXN0cnkgY29kZSB1c2VkIGluIHRoZSBmZWRlcmFsIGdvdmVybm1lbnQKbmFpY3Nfc2VjdG9yICAgICAgfCAgYyAgfCBTZWN0b3Igb2YgdGhhdCBzdGFuZGFyZCBpbmR1c3RyeSBjb2RlLCB3aGVyZSBpdCB3YXMgYSB2YWxpZCBvbmUKbmFpY3NfZGVzY3JpcHQgICAgfCAgYyAgfCAyMDE3IGRlc2NyaXB0aW9uIG9mIHRoZSBpbmR1c3RyeSwgYmxhbmsgaWYgYSBwcmV2aW91cyB5ZWFyJ3MgdmVyc2lvbiB1c2VkCnJhY2VfZXRobmljaXR5ICAgIHwgIGMgIHwgUmFjZSBhbmQgZXRobmljaXR5IG9mIHRoZSBidXNpbmVzcyBvd25lciwgYnV0IHVzdWFsbHkgbm90IGZpbGxlZCBvdXQKZ2VuZGVyICAgICAgICAgICAgfCAgYyAgfCBTYW1lIHdpdGggZ2VuZGVyCnZldGVyYW4gICAgICAgICAgIHwgIGMgIHwgU2FtZSB3aXRoIHZldGVyYW4Kbm9uX3Byb2ZpdCAgICAgICAgfCAgYyAgfCBPbmx5IGZpbGxlZCBvdXQgaWYgdGhlIGJvcnJvd2VyIGlzIGEgbm9uLXByb2ZpdCBvcmdhbml6YXRpb24KIAo8L2Rpdj4KCgo=