Synopsis

This is my homework report for week 2, produced with R Markdown. In this homework I perform the five data importing exercises listed under Week 2’s Assignment section, which involve importing the following three data sets:

  1. Artificial Reddit user data
  2. HUD data regarding 2017 fair market rent
  3. Average daily temperatures for Cincinnati dating back to 1995

Packages Required

I used the following packages to produce the code and results in this homework assignment:

library(readxl)         # for reading in the .xlsx file in exercise #3
library(gdata)          # for scraping the .xlsx file in exercise #4

Homework Problems

For each problem I imported the data and saved it as a data frame. I then used head() to display the first few rows of each data frame and str() to display its structure. In this example I do not display the code, so that you can have the enjoyment of finding the required code on your own; in your homework, however, I expect you to show all your code.
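
As a rough sketch of the import-then-inspect pattern described above (not the solution to any of the exercises), the following uses a made-up three-row CSV written to a temporary file. The file contents and the names toy_path and toy are illustrative assumptions, not part of the assignment data.

```r
# Toy stand-in for a downloaded file: write a tiny CSV to a temporary path.
# The column names and values are invented for illustration only.
toy_path <- tempfile(fileext = ".csv")
writeLines(c("id,gender,state",
             "1,0,Ohio",
             "2,1,Kentucky",
             "3,0,Ohio"),
           toy_path)

# Import the file and save the result as a data frame.
toy <- read.csv(toy_path)

# Display the first few rows, then the structure of the data frame.
head(toy)
str(toy)
```

The same two inspection calls apply whichever import function produced the data frame; only the import step changes from exercise to exercise.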


1. Download & import the csv file located at: https://bradleyboehmke.github.io/public/data/reddit.csv

'data.frame':   32754 obs. of  14 variables:
 $ id               : int  1 2 3 4 5 6 7 8 9 10 ...
 $ gender           : int  0 0 1 0 1 0 0 0 0 0 ...
 $ age.range        : Factor w/ 7 levels "18-24","25-34",..: 2 2 1 2 2 2 2 1 3 2 ...
 $ marital.status   : Factor w/ 6 levels "Engaged","Forever Alone",..: NA NA NA NA NA 4 3 4 4 3 ...
 $ employment.status: Factor w/ 6 levels "Employed full time",..: 1 1 2 2 1 1 1 4 1 2 ...
 $ military.service : Factor w/ 2 levels "No","Yes": NA NA NA NA NA 1 1 1 1 1 ...
 $ children         : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
 $ education        : Factor w/ 7 levels "Associate degree",..: 2 2 5 2 2 2 5 2 2 5 ...
 $ country          : Factor w/ 439 levels " Canada"," Canada eh",..: 394 394 394 394 394 394 125 394 394 125 ...
 $ state            : Factor w/ 53 levels "","Alabama","Alaska",..: 33 33 48 33 6 33 1 6 33 1 ...
 $ income.range     : Factor w/ 8 levels "$100,000 - $149,999",..: 2 2 8 2 7 2 NA 7 2 7 ...
 $ fav.reddit       : Factor w/ 1834 levels "","___","-","?",..: 720 691 1511 1528 188 691 1318 571 1629 1 ...
 $ dog.cat          : Factor w/ 3 levels "I like cats.",..: NA NA NA NA NA 2 2 2 1 1 ...
 $ cheese           : Factor w/ 11 levels "American","Brie",..: NA NA NA NA NA 3 3 1 10 7 ...


2. Now import the above csv file directly from the URL provided (without downloading it to your local hard drive)


3. Import the .xlsx file located at: http://www.huduser.gov/portal/datasets/fmr/fmr2017/FY2017_4050_FMR.xlsx

Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   4769 obs. of  21 variables:
 $ fips2010         : chr  "2300512300" "6099999999" "6999999999" "0100199999" ...
 $ fips2000         : chr  NA NA NA "0100199999" ...
 $ fmr2             : num  1078 677 666 822 977 ...
 $ fmr0             : num  755 502 411 587 807 501 665 665 491 464 ...
 $ fmr1             : num  851 506 498 682 847 505 751 751 494 467 ...
 $ fmr3             : num  1454 987 961 1054 1422 ...
 $ fmr4             : num  1579 1038 1158 1425 1634 ...
 $ State            : num  23 60 69 1 1 1 1 1 1 1 ...
 $ Metro_code       : chr  "METRO38860MM6400" "NCNTY60999N60999" "NCNTY69999N69999" "METRO33860M33860" ...
 $ areaname         : chr  "Portland, ME HUD Metro FMR Area" "American Samoa" "Northern Mariana Islands" "Montgomery, AL MSA" ...
 $ county           : num  NA 999 999 1 3 5 7 9 11 13 ...
 $ CouSub           : chr  "12300" "99999" "99999" "99999" ...
 $ countyname       : chr  "Cumberland County" "American Samoa" "Northern Mariana Islands" "Autauga County" ...
 $ county_town_name : chr  "Chebeague Island town" "American Samoa" "Northern Mariana Islands" "Autauga County" ...
 $ pop2010          : num  341 55519 53883 54571 182265 ...
 $ acs_2016_2       : num  1109 653 642 788 873 ...
 $ state_alpha      : chr  "ME" "AS" "MP" "AL" ...
 $ fmr_type         : num  40 40 40 40 40 40 40 40 40 40 ...
 $ metro            : num  1 0 0 1 1 0 1 1 0 0 ...
 $ FMR_PCT_Change   : num  0.972 1.037 1.037 1.043 1.119 ...
 $ FMR_Dollar_Change: num  -31 24 24 34 104 35 26 26 52 52 ...


4. Now import the above .xlsx file directly from the URL provided (without downloading it to your local hard drive)

trying URL 'http://www.huduser.gov/portal/datasets/fmr/fmr2017/FY2017_4050_FMR.xlsx'
Content type 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' length 615031 bytes (600 KB)
==================================================
downloaded 600 KB
'data.frame':   4769 obs. of  21 variables:
 $ fips2010         : num  2.3e+09 6.1e+09 7.0e+09 1.0e+08 1.0e+08 ...
 $ fips2000         : num  NA NA NA 1e+08 1e+08 ...
 $ fmr2             : int  1078 677 666 822 977 671 866 866 621 621 ...
 $ fmr0             : int  755 502 411 587 807 501 665 665 491 464 ...
 $ fmr1             : int  851 506 498 682 847 505 751 751 494 467 ...
 $ fmr3             : int  1454 987 961 1054 1422 839 1163 1163 853 849 ...
 $ fmr4             : int  1579 1038 1158 1425 1634 958 1298 1298 856 1094 ...
 $ State            : int  23 60 69 1 1 1 1 1 1 1 ...
 $ Metro_code       : Factor w/ 2598 levels "METRO10180M10180",..: 451 2592 2594 384 160 625 55 55 626 627 ...
 $ areaname         : Factor w/ 2598 levels " Santa Ana-Anaheim-Irvine, CA HUD Metro FMR Area",..: 1903 52 1723 1633 571 122 186 186 263 271 ...
 $ county           : int  NA 999 999 1 3 5 7 9 11 13 ...
 $ CouSub           : int  12300 99999 99999 99999 99999 99999 99999 99999 99999 99999 ...
 $ countyname       : Factor w/ 1961 levels "A\xf1asco Municipio",..: 462 42 1265 92 99 110 163 178 239 249 ...
 $ county_town_name : Factor w/ 3175 levels "A\xf1asco Municipio",..: 533 61 2024 136 149 165 254 277 386 401 ...
 $ pop2010          : int  341 55519 53883 54571 182265 27457 22915 57322 10914 20947 ...
 $ acs_2016_2       : int  1109 653 642 788 873 636 840 840 569 569 ...
 $ state_alpha      : Factor w/ 56 levels "AK","AL","AR",..: 24 4 28 2 2 2 2 2 2 2 ...
 $ fmr_type         : int  40 40 40 40 40 40 40 40 40 40 ...
 $ metro            : int  1 0 0 1 1 0 1 1 0 0 ...
 $ FMR_PCT_Change   : num  0.972 1.037 1.037 1.043 1.119 ...
 $ FMR_Dollar_Change: int  -31 24 24 34 104 35 26 26 52 52 ...


5. Go to this University of Dayton webpage http://academic.udayton.edu/kissock/http/Weather/citylistUS.htm, scroll down to Ohio and import the Cincinnati (OHCINCIN.txt) file

'data.frame':   7948 obs. of  4 variables:
 $ V1: int  1 1 1 1 1 1 1 1 1 1 ...
 $ V2: int  1 2 3 4 5 6 7 8 9 10 ...
 $ V3: int  1995 1995 1995 1995 1995 1995 1995 1995 1995 1995 ...
 $ V4: num  41.1 22.2 22.8 14.9 9.5 23.8 31.1 26.9 31.3 31.5 ...