library(readxl)
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
df <- read_excel("C:/Users/panca/Downloads/Airbnb_DC_25.csv")
head(df)
## # A tibble: 6 × 18
##      id name        host_id host_name neighbourhood_group neighbourhood latitude
##   <dbl> <chr>         <dbl> <chr>     <lgl>               <chr>            <dbl>
## 1  3686 Vita's Hid…    4645 Vita      NA                  Historic Ana…     38.9
## 2  3943 Historic R…    5059 Vasa      NA                  Edgewood, Bl…     38.9
## 3  4197 Capitol Hi…    5061 Sandra    NA                  Capitol Hill…     38.9
## 4  4529 Bertina's …    5803 Bertina   NA                  Eastland Gar…     38.9
## 5  5589 Cozy apt i…    6527 Ami       NA                  Kalorama Hei…     38.9
## 6  7103 Lovely gue…   17633 Charlotte NA                  Spring Valle…     38.9
## # ℹ 11 more variables: longitude <dbl>, room_type <chr>, price <dbl>,
## # minimum_nights <dbl>, number_of_reviews <dbl>, last_review <dttm>,
## # reviews_per_month <dbl>, calculated_host_listings_count <dbl>,
## # availability_365 <dbl>, number_of_reviews_ltm <dbl>, license <chr>
# Count the listings in each room type
rooms <- df |>
  group_by(room_type) |>
  summarize(count = n())

# Bar chart of listing counts; geom_col() plots the precomputed counts as bar heights
ggplot(rooms, aes(x = room_type, y = count, fill = room_type)) +
  geom_col() +
  labs(
    title = "Airbnb listings by room type",
    x = "Room type",
    y = "Number of listings",
    caption = "Source: Airbnb_DC_25"
  )
I chose a bar graph because the variable I plotted, room type, is categorical. The height of each bar shows the number of listings of that room type in the dataset. The graph shows that entire home/apt listings are the most common and shared rooms are the least common. Because the bars count listings rather than bookings, they describe what hosts offer, not necessarily what guests prefer.
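To back up the reading of the graph with numbers, the counts can also be expressed as shares of all listings. The sketch below is only an illustration: `toy` is a small made-up data frame standing in for `df`, and the names `room_share` and `share` are assumptions that do not appear in the original analysis.

```r
library(dplyr)

# Hypothetical stand-in for `df`: one row per listing (made-up values)
toy <- data.frame(
  room_type = c("Entire home/apt", "Entire home/apt", "Entire home/apt",
                "Private room", "Private room", "Shared room")
)

room_share <- toy |>
  count(room_type, name = "count") |>      # same counts as group_by() + summarize(n())
  mutate(share = count / sum(count)) |>    # proportion of all listings
  arrange(desc(count))                     # most common room type first

room_share
```

Running the same pipeline on `df` instead of `toy` would give the actual proportions behind each bar.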