Week 2 Homework

Synopsis

This is my homework report for week 2, produced with R Markdown. In this homework I perform the five data importing exercises listed under Week 2’s Assignment section, which includes importing the following three data sets:

  1. Artificial Reddit user data
  2. HUD data regarding 2017 fair market rent
  3. Average daily temperatures for Cincinnati dating back to 1995

Packages Required

To reproduce the code and results throughout this homework assignment I used the following packages:

library(readxl)
library(gdata)

Homework Problems

For each problem I imported the data and save as a data frame. I then used head() to display the first few rows of the data frame and str() to display the structure of each data frame. In this example I do not display the code so that you can have the enjoyment of finding the required code on your own; however, in your homework I expect you to show all your code.

  1. Download & import the csv file located at: https://bradleyboehmke.github.io/public/data/reddit.csv
'data.frame':   32754 obs. of  14 variables:
 $ id               : int  1 2 3 4 5 6 7 8 9 10 ...
 $ gender           : int  0 0 1 0 1 0 0 0 0 0 ...
 $ age.range        : Factor w/ 7 levels "18-24","25-34",..: 2 2 1 2 2 2 2 1 3 2 ...
 $ marital.status   : Factor w/ 6 levels "Engaged","Forever Alone",..: NA NA NA NA NA 4 3 4 4 3 ...
 $ employment.status: Factor w/ 6 levels "Employed full time",..: 1 1 2 2 1 1 1 4 1 2 ...
 $ military.service : Factor w/ 2 levels "No","Yes": NA NA NA NA NA 1 1 1 1 1 ...
 $ children         : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
 $ education        : Factor w/ 7 levels "Associate degree",..: 2 2 5 2 2 2 5 2 2 5 ...
 $ country          : Factor w/ 439 levels " Canada"," Canada eh",..: 394 394 394 394 394 394 125 394 394 125 ...
 $ state            : Factor w/ 53 levels "","Alabama","Alaska",..: 33 33 48 33 6 33 1 6 33 1 ...
 $ income.range     : Factor w/ 8 levels "$100,000 - $149,999",..: 2 2 8 2 7 2 NA 7 2 7 ...
 $ fav.reddit       : Factor w/ 1834 levels "","'home' page (or front page if you prefer)",..: 720 691 1511 1528 188 691 1318 571 1629 1 ...
 $ dog.cat          : Factor w/ 3 levels "I like cats.",..: NA NA NA NA NA 2 2 2 1 1 ...
 $ cheese           : Factor w/ 11 levels "American","Brie",..: NA NA NA NA NA 3 3 1 10 7 ...
  1. Now import the above csv file directly from the url provided (without downloading to your local hard drive)

  1. Import the .xlsx file located at: http://www.huduser.gov/portal/datasets/fmr/fmr2017/FY2017_4050_FMR.xlsx
'data.frame':   32754 obs. of  14 variables:
 $ id               : int  1 2 3 4 5 6 7 8 9 10 ...
 $ gender           : int  0 0 1 0 1 0 0 0 0 0 ...
 $ age.range        : Factor w/ 7 levels "18-24","25-34",..: 2 2 1 2 2 2 2 1 3 2 ...
 $ marital.status   : Factor w/ 6 levels "Engaged","Forever Alone",..: NA NA NA NA NA 4 3 4 4 3 ...
 $ employment.status: Factor w/ 6 levels "Employed full time",..: 1 1 2 2 1 1 1 4 1 2 ...
 $ military.service : Factor w/ 2 levels "No","Yes": NA NA NA NA NA 1 1 1 1 1 ...
 $ children         : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
 $ education        : Factor w/ 7 levels "Associate degree",..: 2 2 5 2 2 2 5 2 2 5 ...
 $ country          : Factor w/ 439 levels " Canada"," Canada eh",..: 394 394 394 394 394 394 125 394 394 125 ...
 $ state            : Factor w/ 53 levels "","Alabama","Alaska",..: 33 33 48 33 6 33 1 6 33 1 ...
 $ income.range     : Factor w/ 8 levels "$100,000 - $149,999",..: 2 2 8 2 7 2 NA 7 2 7 ...
 $ fav.reddit       : Factor w/ 1834 levels "","'home' page (or front page if you prefer)",..: 720 691 1511 1528 188 691 1318 571 1629 1 ...
 $ dog.cat          : Factor w/ 3 levels "I like cats.",..: NA NA NA NA NA 2 2 2 1 1 ...
 $ cheese           : Factor w/ 11 levels "American","Brie",..: NA NA NA NA NA 3 3 1 10 7 ...
  1. Now import the above .xlsx file directly from the url provided (without downloading to your local hard drive)

  1. Go to this University of Dayton webpage http://academic.udayton.edu/kissock/http/Weather/citylistUS.htm, scroll down to Ohio and import the Cincinnati (OHCINCIN.txt) file
'data.frame':   7963 obs. of  4 variables:
 $ V1: int  1 1 1 1 1 1 1 1 1 1 ...
 $ V2: int  1 2 3 4 5 6 7 8 9 10 ...
 $ V3: int  1995 1995 1995 1995 1995 1995 1995 1995 1995 1995 ...
 $ V4: num  41.1 22.2 22.8 14.9 9.5 23.8 31.1 26.9 31.3 31.5 ...