library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.5.2
## Warning: package 'ggplot2' was built under R version 4.5.2
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 4.0.2 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readxl)
library(dplyr)
setwd("C:/Users/rjzavaleta/Downloads/Data 110")
# Despite its .csv extension, this file is an Excel workbook (a zip archive),
# so read_csv() only reads '[Content_Types].xml' from it. Use read_excel() instead.
df <- read_excel("airbnb_DC_25.csv")
df
## # A tibble: 6,257 × 18
## id name host_id host_name neighbourhood_group neighbourhood latitude
## <dbl> <chr> <dbl> <chr> <lgl> <chr> <dbl>
## 1 3686 Vita's Hi… 4645 Vita NA Historic Ana… 38.9
## 2 3943 Historic … 5059 Vasa NA Edgewood, Bl… 38.9
## 3 4197 Capitol H… 5061 Sandra NA Capitol Hill… 38.9
## 4 4529 Bertina's… 5803 Bertina NA Eastland Gar… 38.9
## 5 5589 Cozy apt … 6527 Ami NA Kalorama Hei… 38.9
## 6 7103 Lovely gu… 17633 Charlotte NA Spring Valle… 38.9
## 7 11785 Sanctuary… 32015 Teresa NA Cathedral He… 38.9
## 8 12442 Peaches &… 32015 Teresa NA Cathedral He… 38.9
## 9 13744 Heart of … 53927 Victoria NA Columbia Hei… 38.9
## 10 14218 Quiet Com… 32015 Teresa NA Cathedral He… 38.9
## # ℹ 6,247 more rows
## # ℹ 11 more variables: longitude <dbl>, room_type <chr>, price <dbl>,
## # minimum_nights <dbl>, number_of_reviews <dbl>, last_review <dttm>,
## # reviews_per_month <dbl>, calculated_host_listings_count <dbl>,
## # availability_365 <dbl>, number_of_reviews_ltm <dbl>, license <chr>
head(df)
## # A tibble: 6 × 18
## id name host_id host_name neighbourhood_group neighbourhood latitude
## <dbl> <chr> <dbl> <chr> <lgl> <chr> <dbl>
## 1 3686 Vita's Hid… 4645 Vita NA Historic Ana… 38.9
## 2 3943 Historic R… 5059 Vasa NA Edgewood, Bl… 38.9
## 3 4197 Capitol Hi… 5061 Sandra NA Capitol Hill… 38.9
## 4 4529 Bertina's … 5803 Bertina NA Eastland Gar… 38.9
## 5 5589 Cozy apt i… 6527 Ami NA Kalorama Hei… 38.9
## 6 7103 Lovely gue… 17633 Charlotte NA Spring Valle… 38.9
## # ℹ 11 more variables: longitude <dbl>, room_type <chr>, price <dbl>,
## # minimum_nights <dbl>, number_of_reviews <dbl>, last_review <dttm>,
## # reviews_per_month <dbl>, calculated_host_listings_count <dbl>,
## # availability_365 <dbl>, number_of_reviews_ltm <dbl>, license <chr>
airbnb2 <- df |>
  filter(availability_365 == 365) |> # availability_365 is numeric, so compare it to a number
  group_by(room_type)
airbnb2
## # A tibble: 224 × 18
## # Groups: room_type [3]
## id name host_id host_name neighbourhood_group neighbourhood latitude
## <dbl> <chr> <dbl> <chr> <lgl> <chr> <dbl>
## 1 161913 X-tra la… 767543 Dana NA Dupont Circl… 38.9
## 2 178395 Spare Ro… 852801 Allison NA Takoma, Brig… 39.0
## 3 223203 ROOM FOR… 1159505 Elizabeth NA Colonial Vil… 39.0
## 4 251611 LUXURY L… 1159505 Elizabeth NA Colonial Vil… 39.0
## 5 251615 NICE HOU… 1159505 Elizabeth NA Colonial Vil… 39.0
## 6 251619 LUXURY L… 1159505 Elizabeth NA Colonial Vil… 39.0
## 7 501809 Upstair … 481929 Chris NA Howard Unive… 38.9
## 8 654835 Georgeto… 1671809 Mary Beth NA Georgetown, … 38.9
## 9 688914 Master B… 3517743 Kanita NA Douglas, Shi… 38.9
## 10 792578 Lovely C… 4027780 Leonard NA River Terrac… 38.9
## # ℹ 214 more rows
## # ℹ 11 more variables: longitude <dbl>, room_type <chr>, price <dbl>,
## # minimum_nights <dbl>, number_of_reviews <dbl>, last_review <dttm>,
## # reviews_per_month <dbl>, calculated_host_listings_count <dbl>,
## # availability_365 <dbl>, number_of_reviews_ltm <dbl>, license <chr>
ggplot(data = airbnb2, aes(x = room_type, fill = room_type)) +
  geom_bar(alpha = 0.5) + # try replacing alpha = 0.5 with 0.8 to see how it changes
  labs(x = "Room Type", y = "Count",
       title = "Counts of Rooms Available All Year Round Based on Room Type")
This bar plot shows how many Airbnb listings are available to rent all year round, broken down by room type. The room types are entire home/apartment, private room, and shared room, with entire homes/apartments being the most common. I used the filter command to keep only the listings that were available all 365 days of the year. That left only 224 listings out of the 6,257 in the original dataset (about 3.6%). Something I notice about this graph is that there are few to no shared rooms, which suggests that hosts rarely offer shared spaces year-round, possibly because most guests would rather have an Airbnb to themselves.
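The counts behind the bars can also be checked as a table. The sketch below is only an illustration: `toy` is a small made-up data frame standing in for `df`, and the names `all_year`, `room_counts`, and `share_all_year` are hypothetical, not part of the original analysis. It applies the same filter and then tabulates listings per room type with `count()`.

```r
library(dplyr)

# Hypothetical toy data standing in for `df`; these values are made up for illustration.
toy <- data.frame(
  room_type = c("Entire home/apt", "Entire home/apt", "Private room",
                "Shared room", "Entire home/apt", "Private room"),
  availability_365 = c(365, 120, 365, 0, 365, 365)
)

# Keep only listings available every day of the year.
all_year <- toy |>
  filter(availability_365 == 365)

# Tabulate listings per room type and each type's share of the filtered set.
room_counts <- all_year |>
  count(room_type, name = "n_listings") |>
  mutate(share = n_listings / sum(n_listings))

# Fraction of all listings that are available all year.
share_all_year <- nrow(all_year) / nrow(toy)

print(room_counts)
print(share_all_year)
```

Running the same steps on `df` instead of `toy` should give the exact number of listings in each bar, which makes it easier to say how rare shared rooms are than reading the plot by eye.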