IS 607 - Project 2

The goal of this assignment is to give you practice in preparing different datasets for downstream analysis work. Your task is to:

  1. Choose any three of the “wide” datasets identified in the Week 6 Discussion items. (You may use your own dataset; please don’t use my Sample Post dataset, since that was used in your Week 6 assignment!) For each of the three chosen datasets:
     - Create a .CSV file (or optionally, a MySQL database!) that includes all of the information included in the dataset. You’re encouraged to use a “wide” structure similar to how the information appears in the discussion item, so that you can practice tidying and transformations as described below.
     - Read the information from your .CSV file into R, and use tidyr and dplyr as needed to tidy and transform your data. [Most of your grade will be based on this step!]
     - Perform the analysis requested in the discussion item.
     - Your code should be in an R Markdown file, posted to rpubs.com, and should include narrative descriptions of your data cleanup work, analysis, and conclusions.
  2. Please include in your homework submission, for each of the three chosen datasets:
     - The URL to the .Rmd file in your GitHub repository, and
     - The URL for your rpubs.com web page.

Data acquired from: https://data.cityofnewyork.us/Health/DOHMH-New-York-City-Restaurant-Inspection-Results/43nn-pn8j/data (courtesy of Robert Lauto)

I will take the restaurant inspection results and tidy and reshape them so that I can compare cuisine types and the overall cleanliness of restaurants, as measured by inspection score, between the Bronx and Brooklyn.

Loading the packages

library(tidyr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Loading the data into R

# Read the inspection results directly from GitHub, keeping text columns as character
Rawwide_data <- read.csv("https://raw.githubusercontent.com/Fyoun123/Data607/master/Project%202/DOHMH_New_York_City_Restaurant_Inspection_Results%20(3).csv", stringsAsFactors = FALSE)
head(Rawwide_data)
##      CAMIS                        DBA     BORO BUILDING
## 1 50065189 NORTH COAST SHARK AND BAKE BROOKLYN      873
## 2 50070029          LA COCINA DE YALA    BRONX     1756
## 3 50071125                   TAP BEER BROOKLYN     1781
## 4 50005487           EMPIRE KITCHEN . BROOKLYN     1575
## 5 50051703               LI'S KITCHEN    BRONX     2515
## 6 41040405                 MCDONALD'S BROOKLYN      139
##                STREET ZIPCODE      PHONE
## 1     SCHENECTADY AVE   11203 3472953930
## 2          E 174TH ST   10472 3474222693
## 3 SHEEPSHEAD BAY ROAD   11235 3475306443
## 4             PARK PL   11233 7189533532
## 5   WILLIAMSBRIDGE RD   10469 7185191881
## 6     FLATBUSH AVENUE   11217 7182300176
##                                                CUISINE.DESCRIPTION
## 1                                                        Caribbean
## 2 Latin (Cuban, Dominican, Puerto Rican, South & Central American)
## 3                                                          Russian
## 4                                                          Chinese
## 5                                                          Chinese
## 6                                                       Hamburgers
##   INSPECTION.DATE                                          ACTION
## 1      03/17/2018 Violations were cited in the following area(s).
## 2      04/07/2018 Violations were cited in the following area(s).
## 3      05/31/2018 Violations were cited in the following area(s).
## 4      07/18/2018 Violations were cited in the following area(s).
## 5      01/03/2018 Violations were cited in the following area(s).
## 6      08/24/2018 Violations were cited in the following area(s).
##   VIOLATION.CODE
## 1            10B
## 2            08A
## 3            10F
## 4            10F
## 5            09B
## 6            04N
##                                                                                                                                                                                                                                                                                  VIOLATION.DESCRIPTION
## 1                                                                   Plumbing not properly installed or maintained; anti-siphonage or backflow prevention device not provided where required; equipment or floor not properly drained; sewage disposal system in disrepair or not functioning properly.
## 2                                                                                                                                                                   Facility not vermin proof. Harborage or conditions conducive to attracting vermin to the premises and/or allowing vermin to exist.
## 3                      Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit.
## 4                      Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit.
## 5                                                                                                                                                                                                                                                                         Thawing procedures improper.
## 6 Filth flies or food/refuse/sewage-associated (FRSA) flies present in facility\032s food and/or non-food areas. Filth flies include house flies, little house flies, blow flies, bottle flies and flesh flies. Food/refuse/sewage-associated flies include fruit flies, drain flies and Phorid flies.
##   CRITICAL.FLAG SCORE GRADE GRADE.DATE RECORD.DATE
## 1  Not Critical    11                   10/04/2018
## 2  Not Critical    28                   10/04/2018
## 3  Not Critical     4     A 05/31/2018  10/04/2018
## 4  Not Critical     7     A 07/18/2018  10/04/2018
## 5  Not Critical    14                   10/04/2018
## 6      Critical    13     A 08/24/2018  10/04/2018
##                                 INSPECTION.TYPE
## 1         Cycle Inspection / Initial Inspection
## 2 Pre-permit (Operational) / Initial Inspection
## 3 Pre-permit (Operational) / Initial Inspection
## 4              Cycle Inspection / Re-inspection
## 5         Cycle Inspection / Initial Inspection
## 6         Cycle Inspection / Initial Inspection

Selecting the columns I need: borough, cuisine description, and inspection score

DF2 <- Rawwide_data %>%
  select(BORO,CUISINE.DESCRIPTION,SCORE);head(DF2)
##       BORO
## 1 BROOKLYN
## 2    BRONX
## 3 BROOKLYN
## 4 BROOKLYN
## 5    BRONX
## 6 BROOKLYN
##                                                CUISINE.DESCRIPTION SCORE
## 1                                                        Caribbean    11
## 2 Latin (Cuban, Dominican, Puerto Rican, South & Central American)    28
## 3                                                          Russian     4
## 4                                                          Chinese     7
## 5                                                          Chinese    14
## 6                                                       Hamburgers    13

Renaming the BORO column

colnames(DF2)[1] <- "BROOKLYN/BRONX"
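
The same rename can also be written with dplyr, which refers to the column by name rather than by position. The sketch below is only an illustration: `toy` is a hypothetical three-column data frame standing in for DF2, not the real inspection data.

```r
library(dplyr)

# Hypothetical toy data standing in for DF2 (not the real inspection records)
toy <- data.frame(
  BORO = c("BROOKLYN", "BRONX"),
  CUISINE.DESCRIPTION = c("Caribbean", "Chinese"),
  SCORE = c(11, 14),
  stringsAsFactors = FALSE
)

# rename(new = old); backticks are required because the new name contains "/"
toy_renamed <- toy %>% rename(`BROOKLYN/BRONX` = BORO)

names(toy_renamed)
```

Renaming by name keeps working if the column order changes, whereas `colnames(DF2)[1]` depends on BORO staying in the first position.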

Filtering out rows with a missing or negative score

DF3 <- DF2 %>% filter(SCORE>=0);head(DF3)
##   BROOKLYN/BRONX
## 1       BROOKLYN
## 2          BRONX
## 3       BROOKLYN
## 4       BROOKLYN
## 5          BRONX
## 6       BROOKLYN
##                                                CUISINE.DESCRIPTION SCORE
## 1                                                        Caribbean    11
## 2 Latin (Cuban, Dominican, Puerto Rican, South & Central American)    28
## 3                                                          Russian     4
## 4                                                          Chinese     7
## 5                                                          Chinese    14
## 6                                                       Hamburgers    13
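
As a quick check of what this filter does, the sketch below applies the same condition to a hypothetical toy data frame (`toy` and its values are made up for illustration). `filter()` keeps only rows where the condition is TRUE, so a missing score is dropped along with a negative one.

```r
library(dplyr)

# Hypothetical toy data: one valid score, one NA, one negative, one zero
toy <- data.frame(
  CUISINE.DESCRIPTION = c("Caribbean", "Chinese", "Russian", "Bakery"),
  SCORE = c(11, NA, -1, 0),
  stringsAsFactors = FALSE
)

# NA >= 0 evaluates to NA, which filter() treats as "do not keep"
kept <- toy %>% filter(SCORE >= 0)

kept
```

Only the rows with scores 11 and 0 remain, which is why `mean()` can later be called on the filtered column without `na.rm = TRUE`.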

Checking the overall mean score

mean(DF3$SCORE)
## [1] 21.49978

Getting an average score by cuisine type within each borough.

Performance <- DF3 %>% group_by(`BROOKLYN/BRONX`,CUISINE.DESCRIPTION) %>% summarise(Average = mean(SCORE));Performance
## # A tibble: 133 x 3
## # Groups:   BROOKLYN/BRONX [?]
##    `BROOKLYN/BRONX` CUISINE.DESCRIPTION                            Average
##    <chr>            <chr>                                            <dbl>
##  1 BRONX            African                                           29.1
##  2 BRONX            American                                          19.4
##  3 BRONX            Asian                                             20.7
##  4 BRONX            Bagels/Pretzels                                   16.1
##  5 BRONX            Bakery                                            20.0
##  6 BRONX            Bangladeshi                                       31.6
##  7 BRONX            Barbecue                                          32.9
##  8 BRONX            Bottled beverages, including water, sodas, ju~    17.6
##  9 BRONX            Café/Coffee/Tea                                   20.8
## 10 BRONX            Caribbean                                         22.6
## # ... with 123 more rows
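
An average based on one or two inspections is less reliable than one based on hundreds, so it can help to carry the group size along with the mean. The following is a minimal sketch on hypothetical toy data (`toy`, `CUISINE`, and the `Inspections` column are illustrative names, not part of the original analysis).

```r
library(dplyr)

# Hypothetical toy data standing in for DF3
toy <- data.frame(
  BORO = c("BRONX", "BRONX", "BROOKLYN", "BROOKLYN", "BROOKLYN"),
  CUISINE = c("Chinese", "Chinese", "Chinese", "Bakery", "Bakery"),
  SCORE = c(10, 20, 12, 8, 16),
  stringsAsFactors = FALSE
)

# Mean score plus the number of rows behind each mean
summary_tbl <- toy %>%
  group_by(BORO, CUISINE) %>%
  summarise(Average = mean(SCORE), Inspections = n()) %>%
  ungroup()  # drop the remaining grouping so later steps see a plain table

summary_tbl
```

Applied to the real data, a count column like this would make it easier to judge which cuisine averages rest on enough inspections to be worth comparing.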

Comparing the two boroughs side by side by cuisine type.

Performance2 <- spread(Performance,"BROOKLYN/BRONX","Average");Performance2
## # A tibble: 81 x 3
##    CUISINE.DESCRIPTION BRONX BROOKLYN
##    <chr>               <dbl>    <dbl>
##  1 Afghan               NA       20.4
##  2 African              29.1     30.5
##  3 American             19.4     19.4
##  4 Armenian             NA       10  
##  5 Asian                20.7     21.2
##  6 Australian           NA       11.4
##  7 Bagels/Pretzels      16.1     24.7
##  8 Bakery               20.0     21.2
##  9 Bangladeshi          31.6     31.9
## 10 Barbecue             32.9     21.2
## # ... with 71 more rows
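
One way to turn the wide table into a direct comparison is to subtract one borough's column from the other. The sketch below does this on three cuisines whose averages are copied from the output above; it uses `pivot_wider()`, which supersedes `spread()` in tidyr 1.0.0 and later. The names `toy_perf`, `CUISINE`, and `Difference` are assumptions made for the example.

```r
library(dplyr)
library(tidyr)

# Small long-format sample; averages copied from the tables above
toy_perf <- data.frame(
  BORO = c("BRONX", "BROOKLYN", "BRONX", "BROOKLYN", "BROOKLYN"),
  CUISINE = c("African", "African", "Barbecue", "Barbecue", "Afghan"),
  Average = c(29.1, 30.5, 32.9, 21.2, 20.4),
  stringsAsFactors = FALSE
)

comparison <- toy_perf %>%
  # One row per cuisine, one column per borough (same reshaping as spread())
  pivot_wider(names_from = BORO, values_from = Average) %>%
  # Positive values mean the Bronx average score is higher than Brooklyn's
  mutate(Difference = BRONX - BROOKLYN) %>%
  # Largest gaps first; cuisines missing in one borough (NA) sort last
  arrange(desc(abs(Difference)))

comparison
```

In the NYC inspection system a lower score means fewer violation points, so a positive `Difference` here would point to Brooklyn scoring better for that cuisine. Cuisines present in only one borough, such as Afghan in this sample, produce NA and cannot be compared.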