The goal of this assignment is to give you practice in preparing different datasets for downstream analysis work.
Your task is to:
(1) Choose any three of the “wide” datasets identified in the Week 6 Discussion items. (You may use your own dataset; please don’t use my Sample Post dataset, since that was used in your Week 6 assignment!) For each of the three chosen datasets:
- Create a .CSV file (or optionally, a MySQL database!) that includes all of the information included in the dataset. You’re encouraged to use a “wide” structure similar to how the information appears in the discussion item, so that you can practice tidying and transformations as described below.
- Read the information from your .CSV file into R, and use tidyr and dplyr as needed to tidy and transform your data. [Most of your grade will be based on this step!]
- Perform the analysis requested in the discussion item.
- Your code should be in an R Markdown file, posted to rpubs.com, and should include narrative descriptions of your data cleanup work, analysis, and conclusions.
(2) Please include in your homework submission, for each of the three chosen datasets:
- The URL to the .Rmd file in your GitHub repository, and
- The URL for your rpubs.com web page.
I used the Mobile Food Facility Permit dataset for the San Francisco area, which is available at https://data.sfgov.org/Economy-and-Community/Mobile-Food-Facility-Permit/rqzj-sfat
I downloaded the CSV file, uploaded it to my GitHub account, and read it from there for the rest of this work.
library(RCurl)
## Warning: package 'RCurl' was built under R version 3.5.1
## Loading required package: bitops
mobile_food_permit <- read.csv("https://raw.githubusercontent.com/maharjansudhan/DATA607/master/Mobile_Food_Facility_Permit.csv", header = TRUE, sep = ',')
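With the defaults of the R version used here, read.csv() turns text columns into factors and keeps empty fields as empty strings. A minimal sketch of an alternative read is shown below; it assumes a small inline excerpt (the `csv_text` string and the `permits` name are made up for illustration) instead of the real file, so it runs without a network connection.

```r
# Hypothetical three-row excerpt standing in for the downloaded CSV.
csv_text <- "locationid,Applicant,FacilityType,Approved
1222440,Faith Sandwich,Push Cart,
751253,Pipo's Grill,Truck,
1193831,Eva's Catering,Truck,07/26/2018 12:00:00 AM"

permits <- read.csv(
  text = csv_text,
  header = TRUE,
  stringsAsFactors = FALSE,  # keep text columns as character, not factor
  na.strings = ""            # read empty fields as NA instead of ""
)

str(permits)             # column types after the read
colSums(is.na(permits))  # number of missing values per column
```

Reading blanks as NA makes missing values visible to is.na() and to summary functions, which is harder when they are stored as an empty factor level.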
Check the structure of the data to see what it contains and how each column is stored.
library(tidyverse, quietly= TRUE)
## Warning: package 'tidyverse' was built under R version 3.5.1
## -- Attaching packages ---------------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.0.0 v purrr 0.2.5
## v tibble 1.4.2 v dplyr 0.7.6
## v tidyr 0.8.1 v stringr 1.3.1
## v readr 1.1.1 v forcats 0.3.0
## Warning: package 'ggplot2' was built under R version 3.5.1
## Warning: package 'tibble' was built under R version 3.5.1
## Warning: package 'tidyr' was built under R version 3.5.1
## Warning: package 'readr' was built under R version 3.5.1
## Warning: package 'purrr' was built under R version 3.5.1
## Warning: package 'dplyr' was built under R version 3.5.1
## Warning: package 'stringr' was built under R version 3.5.1
## Warning: package 'forcats' was built under R version 3.5.1
## -- Conflicts ------------------------------------------------------------------------- tidyverse_conflicts() --
## x tidyr::complete() masks RCurl::complete()
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
Let’s have a look at the data.
dim(mobile_food_permit)
## [1] 621 24
str(mobile_food_permit)
## 'data.frame': 621 obs. of 24 variables:
## $ locationid : int 1222440 751253 735318 364218 735315 773095 773105 848184 848185 812018 ...
## $ Applicant : Factor w/ 111 levels "Akuranvyka USA Inc",..: 29 68 111 100 111 4 4 71 71 86 ...
## $ FacilityType : Factor w/ 3 levels "","Push Cart",..: 2 3 2 2 2 2 2 3 3 3 ...
## $ cnn : int 9090000 5688000 30727000 9543000 4969000 30747000 417000 2799106 211101 9094000 ...
## $ LocationDescription: Factor w/ 462 levels "","01ST ST: CLEMENTINA ST to FOLSOM ST (245 - 299)",..: 345 191 308 356 171 304 46 103 31 339 ...
## $ Address : Factor w/ 558 levels "1 BUSH ST","1 CALIFORNIA ST",..: 420 169 386 479 2 11 459 189 382 456 ...
## $ blocklot : Factor w/ 549 levels "","0042022","0043001",..: 194 154 17 187 37 129 274 503 491 212 ...
## $ block : Factor w/ 390 levels "","0042","0043",..: 146 122 16 145 27 107 191 357 346 156 ...
## $ lot : Factor w/ 119 levels "","000","001",..: 93 86 33 30 13 13 21 38 55 80 ...
## $ permit : Factor w/ 155 levels "11MFF-0175","12MFF-0083",..: 155 8 7 2 7 6 6 16 16 11 ...
## $ Status : Factor w/ 7 levels "APPROVED","EXPIRED",..: 6 6 6 7 6 6 6 6 6 6 ...
## $ FoodItems : Factor w/ 132 levels "","7 Multiple Trucks on rotation (1 on Mission Bay Blvd South & 6 on 4th St). Serving everything but hot dogs",..: 132 110 1 64 1 46 46 44 44 16 ...
## $ X : num 6012851 6007857 6013917 6012504 6013553 ...
## $ Y : num 2115275 2107724 2117244 2114927 2116845 ...
## $ Latitude : num 37.8 37.8 37.8 37.8 37.8 ...
## $ Longitude : num -122 -122 -122 -122 -122 ...
## $ Schedule : Factor w/ 155 levels "http://bsm.sfdpw.org/PermitsTracker/reports/report.aspx?title=schedule&report=rptSchedule&params=permit=11MFF-0"| __truncated__,..: 155 8 7 2 7 6 6 16 16 11 ...
## $ dayshours : Factor w/ 221 levels "","Fr:10AM-3PM",..: 57 1 1 89 1 217 95 97 187 1 ...
## $ NOISent : Factor w/ 4 levels "","05/15/2018 12:00:00 AM",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Approved : Factor w/ 49 levels "","02/07/2018 12:00:00 AM",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Received : Factor w/ 89 levels "2011-09-26","2012-04-03",..: 89 8 7 2 7 6 6 15 15 11 ...
## $ PriorPermit : int 0 0 0 0 0 0 0 0 0 0 ...
## $ ExpirationDate : Factor w/ 12 levels "","01/30/2018 12:00:00 AM",..: 11 1 4 1 4 1 1 5 5 1 ...
## $ Location : Factor w/ 513 levels "(0, 0)","(37.7093754640014, -122.404154378509)",..: 431 256 492 418 480 307 278 115 78 403 ...
glimpse(mobile_food_permit)
## Observations: 621
## Variables: 24
## $ locationid <int> 1222440, 751253, 735318, 364218, 735315, 7...
## $ Applicant <fct> Faith Sandwich, Pipo's Grill, Ziaurehman A...
## $ FacilityType <fct> Push Cart, Truck, Push Cart, Push Cart, Pu...
## $ cnn <int> 9090000, 5688000, 30727000, 9543000, 49690...
## $ LocationDescription <fct> MISSION ST: SHAW ALY to ANTHONY ST (543 - ...
## $ Address <fct> 560 MISSION ST, 1800 FOLSOM ST, 5 THE EMBA...
## $ blocklot <fct> 3708095, 3549083, 0234017, 3707014, 026400...
## $ block <fct> 3708, 3549, 0234, 3707, 0264, 3506, 3783, ...
## $ lot <fct> 095, 083, 017, 014, 004, 004, 009, 021, 03...
## $ permit <fct> 18MFF-0108, 16MFF-0010, 15MFF-0159, 12MFF-...
## $ Status <fct> REQUESTED, REQUESTED, REQUESTED, SUSPEND, ...
## $ FoodItems <fct> Vietnamese sandwiches: various meat rice p...
## $ X <dbl> 6012851, 6007857, 6013917, 6012504, 601355...
## $ Y <dbl> 2115275, 2107724, 2117244, 2114927, 211684...
## $ Latitude <dbl> 37.78886, 37.76785, 37.79433, 37.78789, 37...
## $ Longitude <dbl> -122.3994, -122.4161, -122.3958, -122.4005...
## $ Schedule <fct> http://bsm.sfdpw.org/PermitsTracker/report...
## $ dayshours <fct> Mo-Fr:8AM-3PM, , , Mo-Su:7AM-6PM, , We/Th/...
## $ NOISent <fct> , , , , , , , , , , , , , , , , , , , , , ...
## $ Approved <fct> , , , , , , , , , , , , , , , , , , 12/28/...
## $ Received <fct> 2018-09-25, 2016-02-04, 2015-12-31, 2012-0...
## $ PriorPermit <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
## $ ExpirationDate <fct> 07/15/2019 12:00:00 AM, , 03/15/2016 12:00...
## $ Location <fct> (37.788864715343, -122.399359351363), (37....
head(mobile_food_permit)
## locationid Applicant FacilityType cnn
## 1 1222440 Faith Sandwich Push Cart 9090000
## 2 751253 Pipo's Grill Truck 5688000
## 3 735318 Ziaurehman Amini Push Cart 30727000
## 4 364218 The Chai Cart Push Cart 9543000
## 5 735315 Ziaurehman Amini Push Cart 4969000
## 6 773095 Athena SF Gyro Push Cart 30747000
## LocationDescription
## 1 MISSION ST: SHAW ALY to ANTHONY ST (543 - 586)
## 2 FOLSOM ST: 14TH ST to 15TH ST (1800 - 1899)
## 3 MARKET ST: DRUMM ST intersection
## 4 NEW MONTGOMERY ST: AMBROSE BIERCE ST to MISSION ST (77 - 99)
## 5 DRUMM ST: MARKET ST to CALIFORNIA ST (1 - 6)
## 6 MARKET ST: 11TH ST intersection
## Address blocklot block lot permit Status
## 1 560 MISSION ST 3708095 3708 095 18MFF-0108 REQUESTED
## 2 1800 FOLSOM ST 3549083 3549 083 16MFF-0010 REQUESTED
## 3 5 THE EMBARCADERO 0234017 0234 017 15MFF-0159 REQUESTED
## 4 79 NEW MONTGOMERY ST 3707014 3707 014 12MFF-0083 SUSPEND
## 5 1 CALIFORNIA ST 0264004 0264 004 15MFF-0159 REQUESTED
## 6 10 SOUTH VAN NESS AVE 3506004 3506 004 15MFF-0145 REQUESTED
## FoodItems
## 1 Vietnamese sandwiches: various meat rice plates & bowls: vermicelli: spring rolls: sticky rice: Vietnamese Goi: pho: noodles: coffee: various flavored tea : various soda and juices: water
## 2 Tacos: Burritos: Hot Dogs: and Hamburgers
## 3
## 4 Hot Indian Chai (Tea)
## 5
## 6 Gyro pita bread (Lamb or chicken): lamb over rice: chicken over rice: chicken biryani rice: soft drinks
## X Y Latitude Longitude
## 1 6012851 2115275 37.78886 -122.3994
## 2 6007857 2107724 37.76785 -122.4161
## 3 6013917 2117244 37.79433 -122.3958
## 4 6012504 2114927 37.78789 -122.4005
## 5 6013553 2116845 37.79321 -122.3970
## 6 6006927 2110076 37.77426 -122.4195
## Schedule
## 1 http://bsm.sfdpw.org/PermitsTracker/reports/report.aspx?title=schedule&report=rptSchedule&params=permit=18MFF-0108&ExportPDF=1&Filename=18MFF-0108_schedule.pdf
## 2 http://bsm.sfdpw.org/PermitsTracker/reports/report.aspx?title=schedule&report=rptSchedule&params=permit=16MFF-0010&ExportPDF=1&Filename=16MFF-0010_schedule.pdf
## 3 http://bsm.sfdpw.org/PermitsTracker/reports/report.aspx?title=schedule&report=rptSchedule&params=permit=15MFF-0159&ExportPDF=1&Filename=15MFF-0159_schedule.pdf
## 4 http://bsm.sfdpw.org/PermitsTracker/reports/report.aspx?title=schedule&report=rptSchedule&params=permit=12MFF-0083&ExportPDF=1&Filename=12MFF-0083_schedule.pdf
## 5 http://bsm.sfdpw.org/PermitsTracker/reports/report.aspx?title=schedule&report=rptSchedule&params=permit=15MFF-0159&ExportPDF=1&Filename=15MFF-0159_schedule.pdf
## 6 http://bsm.sfdpw.org/PermitsTracker/reports/report.aspx?title=schedule&report=rptSchedule&params=permit=15MFF-0145&ExportPDF=1&Filename=15MFF-0145_schedule.pdf
## dayshours NOISent Approved Received PriorPermit
## 1 Mo-Fr:8AM-3PM 2018-09-25 0
## 2 2016-02-04 0
## 3 2015-12-31 0
## 4 Mo-Su:7AM-6PM 2012-04-03 0
## 5 2015-12-31 0
## 6 We/Th/Fr:6AM-6PM 2015-09-01 0
## ExpirationDate Location
## 1 07/15/2019 12:00:00 AM (37.788864715343, -122.399359351363)
## 2 (37.7678524427181, -122.416104892532)
## 3 03/15/2016 12:00:00 AM (37.7943310032468, -122.395811053023)
## 4 (37.7878896999061, -122.400535326777)
## 5 03/15/2016 12:00:00 AM (37.7932137316634, -122.397043036718)
## 6 (37.77425926306, -122.419485988398)
tail(mobile_food_permit)
## locationid Applicant FacilityType cnn
## 616 1058991 Ruru Juice LLC Truck 9092000
## 617 1193831 Eva's Catering Truck 10006000
## 618 1184111 D & T Catering Truck 13451000
## 619 1219122 Halal Cart of San Francisco Push Cart 3529000
## 620 1163788 SOHOMEI, LLC Truck 9094000
## 621 1181515 John's Catering #5 Truck 6742000
## LocationDescription
## 616 MISSION ST: 02ND ST to NEW MONTGOMERY ST (600 - 634)
## 617 ORTEGA ST: 18TH AVE to 19TH AVE (1100 - 1199)
## 618 WASHINGTON ST: MAPLE ST to CHERRY ST (3800 - 3899)
## 619 CALIFORNIA ST: LEIDESDORFF ST to MONTGOMERY ST (449 - 499)
## 620 MISSION ST: ANNIE ST to 03RD ST (663 - 699)
## 621 HARRISON ST: ALAMEDA ST to 15TH ST (1830 - 1899)
## Address blocklot block lot permit Status
## 616 601 MISSION ST 3722001 3722 001 17MFF-0198 REQUESTED
## 617 1199 ORTEGA ST 1404045 1404 045 18MFF-0086 APPROVED
## 618 3839 WASHINGTON ST 0992034 0992 034 18MFF-0057 APPROVED
## 619 400 MONTGOMERY ST 0239009 0239 009 18MFF-0105 APPROVED
## 620 690 MISSION ST 3707024 3707 024 18MFF-0028 REQUESTED
## 621 1830 HARRISON ST 3550020 3550 020 18MFF-0045 APPROVED
## FoodItems
## 616 Smoothies: Juice: Salads: Fruit Bowls: Soup
## 617 Cold Truck: Burrito: Corn Dog: Salads: Sandwiches: Quesadilla: Tacos: Fried Rice: Cow Mein: Chinese Rice: Noodle Plates: Soup: Bacon: Eggs: Ham: Avacado: Sausages: Beverages
## 618 Cold Truck: Pre-packaged sandwiches: Chicken Bake: Canned Soup: Chili Dog: Corn Dog: Cup of Noodles: Egg Muffins: Hamburgers: Cheeseburgers: Hot Dog: Hot sandwiches: quesadillas: Beverages: Flan: Fruits: Yogurt: Candy: Cookies: Chips: Donuts: Snacks
## 619 Halal Gyro over Rice: Halal Chicken over Rice: Halal Gyro: and Chicken Sandwich
## 620 COLD TRUCK. Deli: bbq chicken skewer: Chinese spring roll: Chinese fried rice/noodle: fried chicken leg/wing: bbq chicken sandwich: chicken cheese burger: burrito: lumpia. Snack: sunflower seeds: muffins: chips: snickers: kit-kat: 10 types of chocolate. Drinks: Coke: 7-Up: Dr. Pepper: Pepsi: Redbull: Vitamin Water: Rockstar: Coconut Juice: Water. Hot drinks: coffee: tea.
## 621 Cold Truck: Soda:Chips:Candy: Cold/Hot Sandwiches: Donuts. (Pitco Wholesale)
## X Y Latitude Longitude
## 616 6012726 2114843 37.78767 -122.3998
## 617 5986603 2113870 37.78351 -122.4901
## 618 5996468 2115510 37.78858 -122.4561
## 619 6011956 2116853 37.79315 -122.4026
## 620 NA NA 0.00000 0.0000
## 621 6008479 2107759 37.76798 -122.4140
## Schedule
## 616 http://bsm.sfdpw.org/PermitsTracker/reports/report.aspx?title=schedule&report=rptSchedule&params=permit=17MFF-0198&ExportPDF=1&Filename=17MFF-0198_schedule.pdf
## 617 http://bsm.sfdpw.org/PermitsTracker/reports/report.aspx?title=schedule&report=rptSchedule&params=permit=18MFF-0086&ExportPDF=1&Filename=18MFF-0086_schedule.pdf
## 618 http://bsm.sfdpw.org/PermitsTracker/reports/report.aspx?title=schedule&report=rptSchedule&params=permit=18MFF-0057&ExportPDF=1&Filename=18MFF-0057_schedule.pdf
## 619 http://bsm.sfdpw.org/PermitsTracker/reports/report.aspx?title=schedule&report=rptSchedule&params=permit=18MFF-0105&ExportPDF=1&Filename=18MFF-0105_schedule.pdf
## 620 http://bsm.sfdpw.org/PermitsTracker/reports/report.aspx?title=schedule&report=rptSchedule&params=permit=18MFF-0028&ExportPDF=1&Filename=18MFF-0028_schedule.pdf
## 621 http://bsm.sfdpw.org/PermitsTracker/reports/report.aspx?title=schedule&report=rptSchedule&params=permit=18MFF-0045&ExportPDF=1&Filename=18MFF-0045_schedule.pdf
## dayshours NOISent Approved
## 616 Tu/Th/Fr:9AM-1PM
## 617 Sa:8AM-10AM 07/26/2018 12:00:00 AM
## 618 Mo-Fr:10AM-11AM 07/11/2018 12:00:00 AM
## 619 Tu/Th:10AM-4PM 09/12/2018 12:00:00 AM
## 620 Mo-Fr:7AM-8AM/10AM-11AM/12PM-1PM
## 621 Mo-Fr:2PM-3PM 07/06/2018 12:00:00 AM
## Received PriorPermit ExpirationDate
## 616 2017-11-29 0
## 617 2018-07-26 1 07/15/2019 12:00:00 AM
## 618 2018-07-11 1 07/15/2019 12:00:00 AM
## 619 2018-09-12 0 07/15/2019 12:00:00 AM
## 620 2018-05-21 1 07/15/2019 12:00:00 AM
## 621 2018-07-06 1 07/15/2019 12:00:00 AM
## Location
## 616 (37.7876716444879, -122.399762925839)
## 617 (37.7835091222127, -122.4900712081)
## 618 (37.788583655156, -122.456059894366)
## 619 (37.7931486269835, -122.402567175578)
## 620 (0, 0)
## 621 (37.7679845948645, -122.413953822522)
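The output above shows two things that may need cleaning before analysis: the date columns are stored as factors with a constant "12:00:00 AM" time part, and row 620 has (0, 0) coordinates. The sketch below is one possible cleanup, assuming that (0, 0) is a placeholder for an unknown location. The `permits` data frame is a hypothetical stand-in built from three of the rows shown above.

```r
# Values copied from the tail() output above; the data frame itself is a
# hypothetical stand-in for mobile_food_permit.
permits <- data.frame(
  locationid = c(1219122, 1163788, 1181515),
  Approved   = c("09/12/2018 12:00:00 AM", "", "07/06/2018 12:00:00 AM"),
  Latitude   = c(37.79315, 0, 37.76798),
  Longitude  = c(-122.4026, 0, -122.4140),
  stringsAsFactors = TRUE  # mimic the factor columns seen in str()
)

# Factor -> character -> Date. Only the date part is parsed; blank strings
# become NA.
permits$Approved <- as.Date(as.character(permits$Approved), format = "%m/%d/%Y")

# Assumption: (0, 0) marks an unknown location, so treat it as missing.
no_coords <- permits$Latitude == 0 & permits$Longitude == 0
permits$Latitude[no_coords]  <- NA
permits$Longitude[no_coords] <- NA

permits
```

Once Approved is a Date, comparisons and differences between dates work directly.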
columns <- paste('V', c(4,7,8,9,10,12,13,14,15,16,17,18,19,21,24), sep="")
columns
## [1] "V4" "V7" "V8" "V9" "V10" "V12" "V13" "V14" "V15" "V16" "V17"
## [12] "V18" "V19" "V21" "V24"
The data is a mess right now. It’s hard to see exactly what is going on and what kind of information we can get from this dataset, and we don’t need all of it for this project.
I’ll remove the columns that are not needed or not meaningful for this project.
# Note: tail() keeps only the last six rows (616-621), so the steps below work on those rows only
tidymobile_food_permit <- tail(mobile_food_permit) %>% select(-c(4,7,8,9,10,12,13,14,15,16,17,18,19,21,24))
head(tidymobile_food_permit)
## locationid Applicant FacilityType
## 616 1058991 Ruru Juice LLC Truck
## 617 1193831 Eva's Catering Truck
## 618 1184111 D & T Catering Truck
## 619 1219122 Halal Cart of San Francisco Push Cart
## 620 1163788 SOHOMEI, LLC Truck
## 621 1181515 John's Catering #5 Truck
## LocationDescription
## 616 MISSION ST: 02ND ST to NEW MONTGOMERY ST (600 - 634)
## 617 ORTEGA ST: 18TH AVE to 19TH AVE (1100 - 1199)
## 618 WASHINGTON ST: MAPLE ST to CHERRY ST (3800 - 3899)
## 619 CALIFORNIA ST: LEIDESDORFF ST to MONTGOMERY ST (449 - 499)
## 620 MISSION ST: ANNIE ST to 03RD ST (663 - 699)
## 621 HARRISON ST: ALAMEDA ST to 15TH ST (1830 - 1899)
## Address Status Approved PriorPermit
## 616 601 MISSION ST REQUESTED 0
## 617 1199 ORTEGA ST APPROVED 07/26/2018 12:00:00 AM 1
## 618 3839 WASHINGTON ST APPROVED 07/11/2018 12:00:00 AM 1
## 619 400 MONTGOMERY ST APPROVED 09/12/2018 12:00:00 AM 0
## 620 690 MISSION ST REQUESTED 1
## 621 1830 HARRISON ST APPROVED 07/06/2018 12:00:00 AM 1
## ExpirationDate
## 616
## 617 07/15/2019 12:00:00 AM
## 618 07/15/2019 12:00:00 AM
## 619 07/15/2019 12:00:00 AM
## 620 07/15/2019 12:00:00 AM
## 621 07/15/2019 12:00:00 AM
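The positional indices in the select() call above depend on the column order of the CSV, so they break silently if the source file changes. A sketch of the same idea using column names is shown below. It assumes dplyr is installed, and the `permits` data frame is a hypothetical two-row stand-in with only a few of the 24 original columns.

```r
library(dplyr)

# Hypothetical two-row stand-in with a subset of the original columns.
permits <- data.frame(
  locationid   = c(1193831, 1219122),
  Applicant    = c("Eva's Catering", "Halal Cart of San Francisco"),
  FacilityType = c("Truck", "Push Cart"),
  cnn          = c(10006000, 3529000),
  Status       = c("APPROVED", "APPROVED"),
  PriorPermit  = c(1, 0),
  stringsAsFactors = FALSE
)

# Keep columns by name; everything not listed (here, cnn) is dropped.
kept <- permits %>%
  select(locationid, Applicant, FacilityType, Status, PriorPermit)

names(kept)
```

Naming the columns to keep also documents the intent of the step for anyone reading the report.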
# remove any rows where Applicant is blank (empty or whitespace only)
tidymobile_food_permit <- tidymobile_food_permit %>% filter(trimws(as.character(Applicant)) != "")
## Warning: package 'bindrcpp' was built under R version 3.5.1
# confirm that there are no more blank rows
head(tidymobile_food_permit,20)
## locationid Applicant FacilityType
## 1 1058991 Ruru Juice LLC Truck
## 2 1193831 Eva's Catering Truck
## 3 1184111 D & T Catering Truck
## 4 1219122 Halal Cart of San Francisco Push Cart
## 5 1163788 SOHOMEI, LLC Truck
## 6 1181515 John's Catering #5 Truck
## LocationDescription
## 1 MISSION ST: 02ND ST to NEW MONTGOMERY ST (600 - 634)
## 2 ORTEGA ST: 18TH AVE to 19TH AVE (1100 - 1199)
## 3 WASHINGTON ST: MAPLE ST to CHERRY ST (3800 - 3899)
## 4 CALIFORNIA ST: LEIDESDORFF ST to MONTGOMERY ST (449 - 499)
## 5 MISSION ST: ANNIE ST to 03RD ST (663 - 699)
## 6 HARRISON ST: ALAMEDA ST to 15TH ST (1830 - 1899)
## Address Status Approved PriorPermit
## 1 601 MISSION ST REQUESTED 0
## 2 1199 ORTEGA ST APPROVED 07/26/2018 12:00:00 AM 1
## 3 3839 WASHINGTON ST APPROVED 07/11/2018 12:00:00 AM 1
## 4 400 MONTGOMERY ST APPROVED 09/12/2018 12:00:00 AM 0
## 5 690 MISSION ST REQUESTED 1
## 6 1830 HARRISON ST APPROVED 07/06/2018 12:00:00 AM 1
## ExpirationDate
## 1
## 2 07/15/2019 12:00:00 AM
## 3 07/15/2019 12:00:00 AM
## 4 07/15/2019 12:00:00 AM
## 5 07/15/2019 12:00:00 AM
## 6 07/15/2019 12:00:00 AM
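A blank applicant can show up as an empty string, as whitespace, or as NA, and a comparison against a single space catches only one of those. The sketch below shows a filter that covers all three cases. It assumes dplyr is installed, and the rows in `applicants` are made up for illustration.

```r
library(dplyr)

# Hypothetical rows: two real names plus three kinds of "blank".
applicants <- data.frame(
  Applicant = c("Ruru Juice LLC", "", " ", NA, "SOHOMEI, LLC"),
  stringsAsFactors = FALSE
)

# Drop NA first, then anything that is empty after trimming whitespace.
cleaned <- applicants %>%
  filter(!is.na(Applicant), trimws(Applicant) != "")

nrow(cleaned)
```

Checking nrow() before and after the filter is a quick way to see how many rows were actually removed.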
tidymobile_food_permit <- tidymobile_food_permit %>%
group_by(PriorPermit)
hist(tidymobile_food_permit$PriorPermit)
According to the histogram, four of the six applicants plotted already held a prior permit, which suggests that established vendors make up a large share of the permit holders. That doesn’t mean new applicants aren’t entering the market, but many of the businesses have been operating under earlier permits.
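Because PriorPermit takes only the values 0 and 1, a count table and a bar plot may describe it more directly than a histogram. The sketch below uses the six PriorPermit values shown in the output above. The category labels are my own wording and are not part of the dataset.

```r
# PriorPermit values for the six rows shown in the output above.
prior_permit <- c(0, 1, 1, 0, 1, 1)

# Label the two categories (labels are illustrative) and count them.
counts <- table(factor(prior_permit,
                       levels = c(0, 1),
                       labels = c("No prior permit", "Prior permit")))

counts              # absolute counts per category
prop.table(counts)  # share of each category

barplot(counts,
        ylab = "Number of permits",
        main = "Prior permit status")
```

The same calls work unchanged on the PriorPermit column of the full dataset, which would give a firmer basis for the conclusion than six rows.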