hw 5

Author

Kenneth

library(readxl)
Warning: package 'readxl' was built under R version 4.5.2
library(tidyverse)
Warning: package 'tidyverse' was built under R version 4.5.2
Warning: package 'ggplot2' was built under R version 4.5.2
Warning: package 'tibble' was built under R version 4.5.2
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.1.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
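# Point R at the folder that holds the data file (this path is specific to my machine)
setwd("C:/Users/kenne/Downloads")
# read_excel() ran without error, so the file appears to be an Excel workbook despite its .csv extension
df <- read_excel("Airbnb_DC_25.csv")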
df
# A tibble: 6,257 × 18
      id name       host_id host_name neighbourhood_group neighbourhood latitude
   <dbl> <chr>        <dbl> <chr>     <lgl>               <chr>            <dbl>
 1  3686 Vita's Hi…    4645 Vita      NA                  Historic Ana…     38.9
 2  3943 Historic …    5059 Vasa      NA                  Edgewood, Bl…     38.9
 3  4197 Capitol H…    5061 Sandra    NA                  Capitol Hill…     38.9
 4  4529 Bertina's…    5803 Bertina   NA                  Eastland Gar…     38.9
 5  5589 Cozy apt …    6527 Ami       NA                  Kalorama Hei…     38.9
 6  7103 Lovely gu…   17633 Charlotte NA                  Spring Valle…     38.9
 7 11785 Sanctuary…   32015 Teresa    NA                  Cathedral He…     38.9
 8 12442 Peaches &…   32015 Teresa    NA                  Cathedral He…     38.9
 9 13744 Heart of …   53927 Victoria  NA                  Columbia Hei…     38.9
10 14218 Quiet Com…   32015 Teresa    NA                  Cathedral He…     38.9
# ℹ 6,247 more rows
# ℹ 11 more variables: longitude <dbl>, room_type <chr>, price <dbl>,
#   minimum_nights <dbl>, number_of_reviews <dbl>, last_review <dttm>,
#   reviews_per_month <dbl>, calculated_host_listings_count <dbl>,
#   availability_365 <dbl>, number_of_reviews_ltm <dbl>, license <chr>
# Average price for each room type, ignoring listings with a missing price
price_of_room <- df |>
  group_by(room_type) |>
  summarize(avg_price = mean(price, na.rm = TRUE))
# Bar graph of the averages; geom_col() draws the same bars as geom_bar(stat = "identity")
ggplot(price_of_room, aes(x = room_type, y = avg_price, fill = room_type)) +
  geom_col() +
  labs(
    title = "Average Airbnb price by room type in DC",
    x = "Room Type",
    y = "Average Price",
    caption = "Source: Airbnb_DC_25 Dataset"
  ) +
  theme_dark()

P.S. I wanted to make a boxplot but couldn’t because of an error, so I made a bar graph instead. I know you wanted us to start early in case we ran into an error, but I thought it would be easier to switch to a different graph.
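For reference, here is a rough sketch of what a boxplot of price by room type could look like. The error itself is not shown in this report, so this is not a diagnosis of it. The `toy` data frame is made up for illustration and only stands in for the `room_type` and `price` columns of `df`; the log scale on the y-axis is my own assumption to keep a few very expensive listings from squashing the boxes.

```r
library(ggplot2)

# Hypothetical toy data, NOT the real Airbnb_DC_25 values
toy <- data.frame(
  room_type = c("Entire home/apt", "Entire home/apt", "Entire home/apt",
                "Private room", "Private room", "Private room"),
  price = c(120, 180, NA, 60, 75, 900)
)

# Drop rows with a missing price so geom_boxplot() has nothing to discard
toy_clean <- toy[!is.na(toy$price), ]

# One box per room type; prices must be above 0 for the log scale to work
box_plot <- ggplot(toy_clean, aes(x = room_type, y = price, fill = room_type)) +
  geom_boxplot() +
  scale_y_log10() +
  labs(
    title = "Airbnb price by room type (toy data)",
    x = "Room Type",
    y = "Price (log scale)"
  )

box_plot
```

To try this on the real data, `toy` would be replaced with `df`, assuming `price` is numeric as the tibble printout above suggests.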

Essay

The bar graph I created above shows the average Airbnb price by room type in Washington, DC. The graph has four bars, one for each room type: from left to right, “entire home”, “hotel room”, “private room”, and “shared room”. The x-axis shows the room type and the y-axis shows the average price. Looking at this graph, I could tell that private rooms cost the least on average and shared rooms cost the most, with an average of almost 1,500. It makes sense to me that shared rooms cost the most, because the people staying in a shared room are splitting the cost. The hotel room price also seems right, because a hotel stay usually lasts only a few days.
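One way to double-check a reading like this is to look at the number of listings and the median price next to the average, since an average can be pulled up by a few very expensive listings. The sketch below uses made-up numbers in a hypothetical `toy` data frame, not the real dataset, so it says nothing about what the real data would show.

```r
library(dplyr)

# Hypothetical toy data, NOT the real Airbnb_DC_25 values
toy <- data.frame(
  room_type = c(rep("Shared room", 3), rep("Private room", 4)),
  price = c(40, 50, 4000, 70, 80, 90, 100)
)

# Count, mean, and median price for each room type
price_summary <- toy |>
  group_by(room_type) |>
  summarize(
    n_listings = n(),
    avg_price = mean(price, na.rm = TRUE),
    median_price = median(price, na.rm = TRUE)
  )

price_summary
```

In this toy example the single 4000 listing lifts the shared room average far above its median of 50. Running the same `summarize()` call on `df` would show whether something similar happens in the real data.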