Week 5 Assignment: Airbnb DC Data Visualization

Author

Emmanuel Gkatongoni

Published

March 11, 2026

Load Libraries and Dataset

library(readxl)
Warning: package 'readxl' was built under R version 4.3.3
library(ggplot2)
Warning: package 'ggplot2' was built under R version 4.3.3
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
df <- read_excel("Airbnb_DC_25.csv")


head(df)
# A tibble: 6 × 18
     id name        host_id host_name neighbourhood_group neighbourhood latitude
  <dbl> <chr>         <dbl> <chr>     <lgl>               <chr>            <dbl>
1  3686 Vita's Hid…    4645 Vita      NA                  Historic Ana…     38.9
2  3943 Historic R…    5059 Vasa      NA                  Edgewood, Bl…     38.9
3  4197 Capitol Hi…    5061 Sandra    NA                  Capitol Hill…     38.9
4  4529 Bertina's …    5803 Bertina   NA                  Eastland Gar…     38.9
5  5589 Cozy apt i…    6527 Ami       NA                  Kalorama Hei…     38.9
6  7103 Lovely gue…   17633 Charlotte NA                  Spring Valle…     38.9
# ℹ 11 more variables: longitude <dbl>, room_type <chr>, price <dbl>,
#   minimum_nights <dbl>, number_of_reviews <dbl>, last_review <dttm>,
#   reviews_per_month <dbl>, calculated_host_listings_count <dbl>,
#   availability_365 <dbl>, number_of_reviews_ltm <dbl>, license <chr>

Data Visualization: Median Airbnb Price by Room Type in Washington, D.C.

# Using dplyr to summarize median price by room type, removing extreme outliers
price_summary <- df |>
  filter(price > 0, price <= 1000) |>
  group_by(room_type) |>
  summarize(median_price = median(price, na.rm = TRUE), count = n()) |>
  arrange(desc(median_price))

# custom colors (one per room type)
room_colors <- c(
  "Entire home/apt" = "orange",
  "Hotel room"      = "blue",
  "Private room"    = "green",
  "Shared room"     = "purple"
)

# Building bar chart
ggplot(price_summary, aes(x = reorder(room_type, -median_price), y = median_price, fill = room_type)) +
  geom_col(width = 0.6, show.legend = TRUE) +
  geom_text(aes(label = paste0("$", median_price)), vjust = -0.5, fontface = "bold", size = 4.5) +
  scale_fill_manual(values = room_colors, name = "Room Type") +
  scale_y_continuous(labels = scales::dollar_format(), expand = expansion(mult = c(0, 0.12))) +
  labs(
    title   = "Median Nightly Airbnb Price by Room Type in Washington, D.C.",
    x       = "Room Type",
    y       = "Median Nightly Price (USD)",
    caption = "Data Source: Airbnb DC Listings, 2025 (Airbnb_DC_25.csv)"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title         = element_text(face = "bold", hjust = 0.5, size = 14),
    plot.caption       = element_text(color = "gray50", size = 9),
    axis.text.x        = element_text(size = 11),
    legend.position    = "right",
    panel.grid.major.x = element_blank()
  )

Interpretation

This bar chart shows the median nightly price for each Airbnb room type in Washington, D.C., using the 2025 Airbnb dataset. I used dplyr to drop listings priced at $0 or above $1,000 (the same filter also drops listings with a missing price), then grouped the remaining listings by room type and computed the median price for each group. Hotel rooms have the highest median price at $286, followed by entire homes/apartments at $146, while private rooms and shared rooms are much cheaper at $68 and $66. The clear pattern is that hotel rooms and entire homes/apartments cost roughly two to four times as much as private or shared rooms, while the gap between private and shared rooms is only $2. Overall, this suggests that room type plays a big role in Airbnb pricing in Washington, D.C.
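To make the filter-then-summarize steps easier to follow, here is a minimal sketch on a small made-up data frame. The `toy` listings and their prices are hypothetical and are not taken from Airbnb_DC_25.csv; only the dplyr steps mirror the ones used for the chart. It also shows one way to check how many listings the price filter removes.

```r
library(dplyr)

# Hypothetical toy listings (made up for illustration, not from the real dataset)
toy <- tibble(
  room_type = c("Entire home/apt", "Entire home/apt", "Entire home/apt",
                "Private room", "Private room", "Private room"),
  price     = c(100, 150, 5000, 0, 60, 80)
)

# Same steps as the chart's summary: keep prices above 0 and at most 1000,
# then compute the median price and listing count per room type
toy_summary <- toy |>
  filter(price > 0, price <= 1000) |>
  group_by(room_type) |>
  summarize(median_price = median(price), count = n()) |>
  arrange(desc(median_price))

# Number of listings dropped by the price filter
removed <- nrow(toy) - sum(toy_summary$count)

print(toy_summary)
print(removed)
```

In this toy example the $5,000 and $0 listings are dropped, so each median is based on two listings. Running the same check on `df` and looking at the `count` column of `price_summary` would show how many listings each bar in the chart is based on.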