Airbnb in DC

Author

Sarah Abdela

Airbnb

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.0     ✔ readr     2.1.6
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(knitr)
library(readxl)


airbnb <- read_excel("C:/Users/ss671/OneDrive/Documents/Airbnb_DC_25 (1).csv")
head(airbnb)
# A tibble: 6 × 18
     id name        host_id host_name neighbourhood_group neighbourhood latitude
  <dbl> <chr>         <dbl> <chr>     <lgl>               <chr>            <dbl>
1  3686 Vita's Hid…    4645 Vita      NA                  Historic Ana…     38.9
2  3943 Historic R…    5059 Vasa      NA                  Edgewood, Bl…     38.9
3  4197 Capitol Hi…    5061 Sandra    NA                  Capitol Hill…     38.9
4  4529 Bertina's …    5803 Bertina   NA                  Eastland Gar…     38.9
5  5589 Cozy apt i…    6527 Ami       NA                  Kalorama Hei…     38.9
6  7103 Lovely gue…   17633 Charlotte NA                  Spring Valle…     38.9
# ℹ 11 more variables: longitude <dbl>, room_type <chr>, price <dbl>,
#   minimum_nights <dbl>, number_of_reviews <dbl>, last_review <dttm>,
#   reviews_per_month <dbl>, calculated_host_listings_count <dbl>,
#   availability_365 <dbl>, number_of_reviews_ltm <dbl>, license <chr>
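# Average listing price for each room type (listings with a missing price are ignored)
avg_price <- airbnb %>%
  group_by(room_type) %>%
  summarize(avg_price = mean(price, na.rm = TRUE))

# Bar chart of the averages; geom_col() is equivalent to geom_bar(stat = "identity")
ggplot(avg_price, aes(x = room_type, y = avg_price, fill = room_type)) +
  geom_col() +
  labs(
    title = "Average price of Airbnb by room type in DC",
    x = "Type of Room",
    y = "Average price",
    fill = "Type of Room",
    caption = "Airbnb_DC_25 dataset"
  )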

airbnb
# A tibble: 6,257 × 18
      id name       host_id host_name neighbourhood_group neighbourhood latitude
   <dbl> <chr>        <dbl> <chr>     <lgl>               <chr>            <dbl>
 1  3686 Vita's Hi…    4645 Vita      NA                  Historic Ana…     38.9
 2  3943 Historic …    5059 Vasa      NA                  Edgewood, Bl…     38.9
 3  4197 Capitol H…    5061 Sandra    NA                  Capitol Hill…     38.9
 4  4529 Bertina's…    5803 Bertina   NA                  Eastland Gar…     38.9
 5  5589 Cozy apt …    6527 Ami       NA                  Kalorama Hei…     38.9
 6  7103 Lovely gu…   17633 Charlotte NA                  Spring Valle…     38.9
 7 11785 Sanctuary…   32015 Teresa    NA                  Cathedral He…     38.9
 8 12442 Peaches &…   32015 Teresa    NA                  Cathedral He…     38.9
 9 13744 Heart of …   53927 Victoria  NA                  Columbia Hei…     38.9
10 14218 Quiet Com…   32015 Teresa    NA                  Cathedral He…     38.9
# ℹ 6,247 more rows
# ℹ 11 more variables: longitude <dbl>, room_type <chr>, price <dbl>,
#   minimum_nights <dbl>, number_of_reviews <dbl>, last_review <dttm>,
#   reviews_per_month <dbl>, calculated_host_listings_count <dbl>,
#   availability_365 <dbl>, number_of_reviews_ltm <dbl>, license <chr>

I chose to compare the average Airbnb price across room types in Washington, DC, using the Airbnb_DC_25 dataset. The x-axis shows the type of room offered, while the y-axis shows the average listing price. I mapped fill = room_type to give each room type its own color, which also creates the legend. One interesting thing I noticed was that shared rooms had a higher average price than any other room type. One possible explanation is that shared rooms are primarily rented in expensive neighborhoods, although the chart alone does not confirm this.
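
The guess about shared rooms could be checked against the data. The sketch below is not part of the original analysis. It assumes the airbnb data frame loaded above, with its room_type, price, and neighbourhood columns. The helper names summarise_room_types and shared_by_neighbourhood are hypothetical, and the label "Shared room" is an assumption about how that category is spelled in the dataset. The first helper reports the listing count and median next to the mean, since a mean based on only a few listings can be pulled up by one or two high prices. The second counts listings of one room type per neighbourhood.

```r
library(dplyr)

# Hypothetical helper: count, mean, and median price for each room type.
summarise_room_types <- function(listings) {
  listings %>%
    group_by(room_type) %>%
    summarize(
      n_listings   = n(),                        # all listings of this type
      n_priced     = sum(!is.na(price)),         # listings with a recorded price
      mean_price   = mean(price, na.rm = TRUE),
      median_price = median(price, na.rm = TRUE),
      .groups = "drop"
    ) %>%
    arrange(desc(mean_price))
}

# Hypothetical helper: number of listings of one room type per neighbourhood.
# The default label "Shared room" is an assumption; check it with
# unique(airbnb$room_type) before relying on it.
shared_by_neighbourhood <- function(listings, type = "Shared room") {
  listings %>%
    filter(room_type == type) %>%
    count(neighbourhood, sort = TRUE)
}

# Usage with the data frame loaded earlier:
# summarise_room_types(airbnb)
# shared_by_neighbourhood(airbnb)
```

If the shared-room group turns out to have very few priced listings, or a median well below its mean, the high average is more likely driven by a handful of expensive listings than by neighbourhood alone.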