DACS 601 - HW4

DACS 601 - HW4 in distill format

Apoorva Saraswat
2022-07-11

HW4: Visualizations

This is the R Markdown file for HW4 of DACS-601, Summer 2022. I’m using the New York City Airbnb CSV file from the Sample Datasets.

Loading data

# A tibble: 6 × 16
     id name          host_id host_name neighbourhood_g… neighbourhood
  <dbl> <chr>           <dbl> <chr>     <chr>            <chr>        
1  2539 Clean & quie…    2787 John      Brooklyn         Kensington   
2  2595 Skylit Midto…    2845 Jennifer  Manhattan        Midtown      
3  3647 THE VILLAGE …    4632 Elisabeth Manhattan        Harlem       
4  3831 Cozy Entire …    4869 LisaRoxa… Brooklyn         Clinton Hill 
5  5022 Entire Apt: …    7192 Laura     Manhattan        East Harlem  
6  5099 Large Cozy 1…    7322 Chris     Manhattan        Murray Hill  
# … with 10 more variables: latitude <dbl>, longitude <dbl>,
#   room_type <chr>, price <dbl>, minimum_nights <dbl>,
#   number_of_reviews <dbl>, last_review <date>,
#   reviews_per_month <dbl>, calculated_host_listings_count <dbl>,
#   availability_365 <dbl>
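
The chunk that produced the preview above is not shown in the rendered output. A minimal sketch of how such a file could be loaded with `readr` follows. The temporary file and its toy rows are hypothetical stand-ins so the example runs on its own; in the homework, `csv_path` would point to the NYC Airbnb CSV instead.

```r
library(readr)

# Hypothetical stand-in file (NOT the real dataset), written to a temp path
# so this sketch is runnable without the original CSV.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,name,host_id,neighbourhood_group,price,last_review",
  "1,Toy listing A,11,Brooklyn,80,2019-01-05",
  "2,Toy listing B,22,Manhattan,150,"
), csv_path)

# read_csv() guesses column types and returns a tibble.
airbnb <- read_csv(csv_path, show_col_types = FALSE)

# head() shows the first rows (up to six), like the preview above.
head(airbnb)
```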

Data analysis

1. Number of Airbnb listings in each neighbourhood group

# A tibble: 5 × 2
  neighbourhood_group     n
  <chr>               <int>
1 Bronx                1091
2 Brooklyn            20104
3 Manhattan           21661
4 Queens               5666
5 Staten Island         373
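
A table like the one above can be produced with `dplyr::count()`. The sketch below uses a small hypothetical `listings` tibble rather than the real data, so the counts it prints are illustrative only.

```r
library(dplyr)

# Hypothetical toy rows standing in for the real listings table.
listings <- tibble(
  id = 1:6,
  neighbourhood_group = c("Brooklyn", "Manhattan", "Manhattan",
                          "Queens", "Brooklyn", "Manhattan")
)

# count() groups by the column and tallies rows into a column named n.
group_counts <- listings %>%
  count(neighbourhood_group)

print(group_counts)
```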

2. Average price of Airbnb listings in each neighbourhood group

# A tibble: 5 × 2
  neighbourhood_group avg_price
  <chr>                   <dbl>
1 Manhattan               197. 
2 Brooklyn                124. 
3 Staten Island           115. 
4 Queens                   99.5
5 Bronx                    87.5

3. Median price of Airbnb listings in each neighbourhood group

# A tibble: 5 × 2
  neighbourhood_group median_price
  <chr>                      <dbl>
1 Manhattan                    150
2 Brooklyn                      90
3 Queens                        75
4 Staten Island                 75
5 Bronx                         65

4. Standard deviation of Airbnb listing prices in each neighbourhood group

# A tibble: 5 × 2
  neighbourhood_group sd_price
  <chr>                  <dbl>
1 Manhattan               291.
2 Staten Island           278.
3 Brooklyn                187.
4 Queens                  167.
5 Bronx                   107.
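
The mean, median, and standard deviation of price per neighbourhood group can be computed in a single `group_by()` and `summarise()` pass. The sketch below assumes the column names shown in the tables (`avg_price`, `median_price`, `sd_price`) and runs on hypothetical toy prices, not the real data.

```r
library(dplyr)

# Hypothetical toy prices; the single large Manhattan value mimics a
# right-skewed distribution.
listings <- tibble(
  neighbourhood_group = c("Brooklyn", "Brooklyn",
                          "Manhattan", "Manhattan", "Manhattan"),
  price = c(80, 100, 150, 200, 1000)
)

# One summary row per group, sorted by descending average price.
price_summary <- listings %>%
  group_by(neighbourhood_group) %>%
  summarise(
    avg_price    = mean(price, na.rm = TRUE),
    median_price = median(price, na.rm = TRUE),
    sd_price     = sd(price, na.rm = TRUE)
  ) %>%
  arrange(desc(avg_price))

print(price_summary)
```

In the tables above, each group's mean is higher than its median (for example 197 versus 150 for Manhattan). This is what a right-skewed price distribution with a few very expensive listings would produce.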

5. Hosts with the most listings

# A tibble: 6 × 2
    host_id     n
      <dbl> <int>
1 219517861   327
2 107434423   232
3  30283594   121
4 137358866   103
5  12243051    96
6  16098958    96
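
Assuming each row of the dataset is one listing, counting rows per `host_id` gives the number of listings per host. The sketch below uses hypothetical host ids and shows one way to get such a ranking.

```r
library(dplyr)

# Hypothetical toy data: one row per listing.
listings <- tibble(
  id      = 1:6,
  host_id = c(11, 22, 11, 33, 11, 22)
)

# sort = TRUE orders hosts by descending count; head() keeps the top rows
# (up to six).
top_hosts <- listings %>%
  count(host_id, sort = TRUE) %>%
  head()

print(top_hosts)
```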

6. Airbnb listings with the most reviews

# A tibble: 6 × 3
# Groups:   id [6]
        id name                              reviews
     <dbl> <chr>                               <dbl>
1  9145202 Room near JFK Queen Bed               629
2   903972 Great Bedroom in Manhattan            607
3   903947 Beautiful Bedroom in Manhattan        597
4   891117 Private Bedroom in Manhattan          594
5 10101135 Room Near JFK Twin Beds               576
6  8168619 Steps away from Laguardia airport     543
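
One possible way to rank listings by review count is `dplyr::slice_max()`. This is a sketch on hypothetical rows and is not necessarily how the table above was built. The `# Groups` line in that output suggests the original chunk used a grouped summary instead.

```r
library(dplyr)

# Hypothetical toy listings with made-up review counts.
listings <- tibble(
  id                = c(1, 2, 3),
  name              = c("Toy A", "Toy B", "Toy C"),
  number_of_reviews = c(10, 50, 30)
)

# slice_max() keeps the n rows with the largest values, in descending order.
top_reviewed <- listings %>%
  select(id, name, reviews = number_of_reviews) %>%
  slice_max(reviews, n = 2)

print(top_reviewed)
```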

7. Most available Airbnb listings

# A tibble: 6 × 3
# Groups:   id [6]
     id name                                availability
  <dbl> <chr>                                      <dbl>
1  2539 Clean & quiet apt home by the park           365
2  3647 THE VILLAGE OF HARLEM....NEW YORK !          365
3 11452 Clean and Quiet in Brooklyn                  365
4 11943 Country space in the city                    365
5 21644 Upper Manhattan, New York                    365
6 32037 Huge Private  Floor at The Waverly           365

8. Least available Airbnb listings

# A tibble: 6 × 3
# Groups:   id [6]
     id name                                              availability
  <dbl> <chr>                                                    <dbl>
1  5022 Entire Apt: Spacious Studio/Loft by central park             0
2  5121 BlissArtsSpace!                                              0
3  5203 Cozy Clean Guest Room - Family Apt                           0
4  6090 West Village Nest - Superhost                                0
5  7801 Sweet and Spacious Brooklyn Loft                             0
6  8700 Magnifique Suite au N de Manhattan - vue Cloitres            0
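
All six rows in each availability table share the same value (365 or 0), so the rows shown are probably only the first few of a larger tie. A minimal sketch for selecting every tied listing, using hypothetical toy data:

```r
library(dplyr)

# Hypothetical toy listings; availability_365 is days available per year.
listings <- tibble(
  id               = 1:5,
  name             = c("Toy A", "Toy B", "Toy C", "Toy D", "Toy E"),
  availability_365 = c(365, 0, 365, 120, 0)
)

# Keep every listing tied at the maximum and at the minimum availability.
most_available <- listings %>%
  filter(availability_365 == max(availability_365))

least_available <- listings %>%
  filter(availability_365 == min(availability_365))

# nrow() reports how many listings share each extreme.
nrow(most_available)
nrow(least_available)
```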

Plots

Univariate

Plot 1 - Price distribution

Because the full price distribution is hard to read, we zoom in on x values from 0 to 500, where the majority of the distribution lies.
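
The zoom can be sketched with `ggplot2` as follows. The prices are simulated and hypothetical, and the bin width of 25 is an assumption. `coord_cartesian()` zooms the view without discarding rows, whereas `xlim()` would drop prices above 500 before the bins are computed.

```r
library(ggplot2)

# Hypothetical simulated prices with two extreme values, to mimic a long
# right tail. These are NOT the real listing prices.
set.seed(1)
listings <- data.frame(
  price = c(round(rlnorm(200, meanlog = 4.7, sdlog = 0.6)), 5000, 10000)
)

# Histogram of price, zoomed to the 0-500 range on the x axis.
price_hist <- ggplot(listings, aes(x = price)) +
  geom_histogram(binwidth = 25, boundary = 0) +
  coord_cartesian(xlim = c(0, 500)) +
  labs(x = "Price", y = "Number of listings",
       title = "Price distribution (0 to 500)")

price_hist
```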

Limitations

Bivariate

Plot 2 - Average Airbnb price for each neighbourhood group
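
A bar chart of this kind can be sketched by summarising first and then plotting with `geom_col()`. The toy data and axis labels are hypothetical. Ordering the bars by descending average is a presentation choice, not something the original plot is known to do.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy listings, not the real data.
listings <- tibble(
  neighbourhood_group = c("Brooklyn", "Brooklyn",
                          "Manhattan", "Manhattan", "Queens"),
  price = c(80, 100, 150, 250, 70)
)

# One row per group with its mean price.
avg_by_group <- listings %>%
  group_by(neighbourhood_group) %>%
  summarise(avg_price = mean(price, na.rm = TRUE))

# geom_col() draws bar heights directly from the avg_price values.
avg_price_plot <- ggplot(
  avg_by_group,
  aes(x = reorder(neighbourhood_group, -avg_price), y = avg_price)
) +
  geom_col() +
  labs(x = "Neighbourhood group", y = "Average price")

avg_price_plot
```

The same pattern should work for any other grouping column, such as `room_type`, by swapping it in for `neighbourhood_group`.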

Limitations

Plot 3 - Average Airbnb price for each room type

Limitations