# A tibble: 6 × 18
id name host_id host_name neighbourhood_group neighbourhood latitude
<dbl> <chr> <dbl> <chr> <lgl> <chr> <dbl>
1 3686 Vita's Hid… 4645 Vita NA Historic Ana… 38.9
2 3943 Historic R… 5059 Vasa NA Edgewood, Bl… 38.9
3 4197 Capitol Hi… 5061 Sandra NA Capitol Hill… 38.9
4 4529 Bertina's … 5803 Bertina NA Eastland Gar… 38.9
5 5589 Cozy apt i… 6527 Ami NA Kalorama Hei… 38.9
6 7103 Lovely gue… 17633 Charlotte NA Spring Valle… 38.9
# ℹ 11 more variables: longitude <dbl>, room_type <chr>, price <dbl>,
# minimum_nights <dbl>, number_of_reviews <dbl>, last_review <dttm>,
# reviews_per_month <dbl>, calculated_host_listings_count <dbl>,
# availability_365 <dbl>, number_of_reviews_ltm <dbl>, license <chr>
Summary Statistics
summary(df)
id name host_id host_name
Min. :3.686e+03 Length:6257 Min. : 4617 Length:6257
1st Qu.:3.792e+07 Class :character 1st Qu.: 22024017 Class :character
Median :7.501e+17 Mode :character Median : 81005284 Mode :character
Mean :6.159e+17 Mean :176451046
3rd Qu.:1.143e+18 3rd Qu.:304261532
Max. :1.375e+18 Max. :681391481
neighbourhood_group neighbourhood latitude longitude
Mode:logical Length:6257 Min. :38.82 Min. :-77.11
NA's:6257 Class :character 1st Qu.:38.90 1st Qu.:-77.03
Mode :character Median :38.91 Median :-77.01
Mean :38.91 Mean :-77.01
3rd Qu.:38.92 3rd Qu.:-76.99
Max. :38.99 Max. :-76.91
room_type price minimum_nights number_of_reviews
Length:6257 Min. : 10.0 Min. : 1.00 Min. : 0.00
Class :character 1st Qu.: 88.0 1st Qu.: 1.00 1st Qu.: 1.00
Mode :character Median : 131.0 Median : 2.00 Median : 19.00
Mean : 168.7 Mean : 13.23 Mean : 66.38
3rd Qu.: 193.0 3rd Qu.: 31.00 3rd Qu.: 86.00
Max. :7000.0 Max. :701.00 Max. :1205.00
NA's :1488
last_review reviews_per_month calculated_host_listings_count
Min. :2013-06-15 00:00:00 Min. : 0.010 Min. : 1.00
1st Qu.:2024-10-17 00:00:00 1st Qu.: 0.470 1st Qu.: 1.00
Median :2025-01-23 00:00:00 Median : 1.460 Median : 3.00
Mean :2024-09-12 12:48:19 Mean : 1.974 Mean : 33.15
3rd Qu.:2025-02-27 00:00:00 3rd Qu.: 2.940 3rd Qu.: 14.00
Max. :2025-03-14 00:00:00 Max. :28.200 Max. :289.00
NA's :1236 NA's :1236
availability_365 number_of_reviews_ltm license
Min. : 0.0 Min. : 0.0 Length:6257
1st Qu.: 43.0 1st Qu.: 0.0 Class :character
Median :175.0 Median : 5.0 Mode :character
Mean :175.8 Mean : 15.8
3rd Qu.:303.0 3rd Qu.: 25.0
Max. :365.0 Max. :290.0
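The summary above reports missing values in several columns (for example, 1488 in price and all 6257 in neighbourhood_group). A minimal sketch of counting missing values per column follows. It uses a small hypothetical tibble named `toy` in place of `df`, since the CSV is not reproduced here, and the values are invented for illustration only.

```r
library(dplyr)

# Hypothetical stand-in for df; these values are made up.
toy <- tibble(
  price = c(120, NA, 95, NA),
  reviews_per_month = c(1.2, 0.4, NA, 2.0),
  neighbourhood_group = c(NA, NA, NA, NA)
)

# Count the missing values in every column.
na_counts <- toy |>
  summarise(across(everything(), ~ sum(is.na(.x))))

print(na_counts)
```

Run against the real data, the same `summarise()` call on `df` should reproduce the NA counts that `summary(df)` prints. Note that `filter()` drops rows where a condition evaluates to NA, so any later filter on price silently excludes the listings with a missing price.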
Loading tidyverse
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.2.0 ✔ readr 2.1.6
✔ forcats 1.0.1 ✔ stringr 1.6.0
✔ ggplot2 4.0.2 ✔ tibble 3.3.1
✔ lubridate 1.9.5 ✔ tidyr 1.3.2
✔ purrr 1.2.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Data Visualization
# Filter the data first, then plot; a filter() call cannot be added to a ggplot with `+`.
df |>
  filter(
    number_of_reviews > 1, number_of_reviews < 86,
    price > 1, price < 500,
    reviews_per_month > 1, reviews_per_month < 28
  ) |>
  ggplot(aes(x = number_of_reviews, y = price, fill = reviews_per_month)) +
  # Shape 23 is a filled diamond, so the fill aesthetic controls its color.
  geom_point(size = 2, shape = 23, alpha = 0.5) +
  scale_fill_gradient(low = "blue", high = "red") +
  labs(
    title = "How the number of reviews relates to price",
    caption = "Source: Airbnb_DC_25.csv",
    x = "Number of Reviews",
    y = "Price"
  )
Short Paragraph:
This visualization shows the relationship between the number of reviews and the price of Airbnb listings in D.C. It also shows whether reviews per month (the fill color) relates to those two variables. I learned that the listings with the most reviews per month were typically priced at around 100. I also learned that a higher price generally was not associated with a change in the total number of reviews, but rather with how often a listing is reviewed. Lower-priced listings also tended to have more reviews.
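One way to check these visual impressions numerically is a rank-based (Spearman) correlation. This is a technique added here for illustration, not part of the original analysis. The sketch below uses a small hypothetical tibble named `toy` with invented values, so its results say nothing about the real listings.

```r
library(dplyr)

# Hypothetical stand-in for df; these values are made up.
toy <- tibble(
  number_of_reviews = c(5, 12, 30, 48, 70, 85),
  price = c(310, 240, 180, 150, 120, 95),
  reviews_per_month = c(1.1, 1.5, 2.4, 3.0, 4.2, 5.5)
)

# Apply range filters like those used for the plot, then compute
# Spearman correlations of each review measure against price.
cors <- toy |>
  filter(number_of_reviews > 1, number_of_reviews < 86,
         price > 1, price < 500) |>
  summarise(
    reviews_vs_price = cor(number_of_reviews, price, method = "spearman"),
    monthly_vs_price = cor(reviews_per_month, price, method = "spearman")
  )

print(cors)
```

On the real data, replacing `toy` with `df` would give a value between -1 and 1 for each pair. A value near 0 would support the reading that price and review counts are largely unrelated, while a clearly negative value would support the reading that cheaper listings collect more reviews.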