Assignment 5

# Load required libraries
library(dplyr)
library(ggplot2)
library(readxl)

# Load the dataset. Update this path to match where your file is saved.
df <- read_excel("C:/Users/amanu/OneDrive/Documentos/Assignment 5 data 110_files/Airbnb_DC_25.csv.xlsx")

Price Distribution by Room Type

# Use dplyr to drop extreme outliers (prices of $1,000 or more)
# so the chart is easier to read
df_filtered <- df |>
  filter(price < 1000)

# Create a boxplot of price by room type
# fill = room_type gives each room type its own color
ggplot(df_filtered, aes(x = room_type, y = price, fill = room_type)) +
  geom_boxplot() +
  scale_fill_brewer(palette = "Set2") +   # one Set2 color per room type
  labs(
    title = "Airbnb DC: Nightly Price Distribution by Room Type",
    x = "Room Type",
    y = "Nightly Price (USD)",
    fill = "Room Type",                   # legend label
    caption = "Source: Airbnb DC 2025 dataset (Airbnb_DC_25.csv)"
  ) +
  theme_minimal()
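
The $1,000 cutoff used above is a judgment call, so it can help to check how many listings it removes before trusting the chart. The following is a minimal sketch, not part of the original analysis: the `toy` data frame and its values are hypothetical stand-ins for `df`, and only the `room_type` and `price` column names are taken from the code above.

```r
library(dplyr)

# Hypothetical toy data standing in for the Airbnb listings (not real values)
toy <- data.frame(
  room_type = c("Entire home/apt", "Entire home/apt", "Private room", "Shared room"),
  price     = c(150, 2500, 80, 1000)
)

cutoff <- 1000  # same threshold as the filter in the plot code

# filter() keeps rows where the condition is TRUE,
# so rows priced at exactly the cutoff (or with a missing price) are dropped too
toy_filtered <- toy |>
  filter(price < cutoff)

n_total   <- nrow(toy)
n_kept    <- nrow(toy_filtered)
n_dropped <- n_total - n_kept

cat("Kept", n_kept, "of", n_total, "listings; dropped", n_dropped, "\n")
```

Swapping `toy` for `df` would give the same check on the real data. If the share of dropped listings turns out to be large, a different cutoff or a log-scaled price axis might be worth considering.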

Summary

This boxplot shows the distribution of nightly Airbnb prices in Washington, DC, by room type, restricted to listings under $1,000 per night to remove extreme outliers. Entire home/apartment listings are typically the most expensive and show the widest spread in prices, while shared rooms are the most affordable and cluster tightly at the lower end. Private rooms fall in between, offering a mid-range option for visitors. This pattern suggests that guests looking for budget-friendly stays in DC should prioritize shared or private room listings over entire-home rentals.
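
The reading of the boxplot can be cross-checked with per-group summary statistics. The sketch below is illustrative only: the `toy` data frame holds made-up prices rather than values from the Airbnb file, and `price_summary` is a hypothetical name. It assumes the same `room_type` and `price` columns used in the plot.

```r
library(dplyr)

# Hypothetical toy data (made-up prices, not taken from the Airbnb file)
toy <- data.frame(
  room_type = c("Entire home/apt", "Entire home/apt", "Entire home/apt",
                "Private room", "Private room",
                "Shared room", "Shared room"),
  price     = c(120, 200, 400, 70, 110, 40, 50)
)

# Median and IQR correspond to the box's center line and height in a boxplot
price_summary <- toy |>
  filter(price < 1000) |>
  group_by(room_type) |>
  summarise(
    n            = n(),
    median_price = median(price),
    mean_price   = mean(price),
    iqr_price    = IQR(price),
    .groups      = "drop"
  ) |>
  arrange(desc(median_price))

print(price_summary)
```

Running the same pipeline on `df` instead of `toy` would show whether the ordering of room types by median price and spread matches what the chart suggests.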
