I created a boxplot showing the relationship between price and room type for Airbnb listings in Capitol Hill, Washington, DC. I used the ggplot() function from the ggplot2 package, together with dplyr, to define capitol_hill_data and map the plot variables: room_type on the x-axis and price on the y-axis. I set fill = room_type so that each color represents a room type: dark blue for a shared room, green for a private room, yellow for a hotel room, and red for an entire home.
The geom_boxplot() function draws the box plots, with alpha = 0.7 to make the boxes slightly transparent and outlier.shape = NA to hide the outlier points. The coord_cartesian() function limits the visible y-axis to $0–$500, which focuses the plot on the main price range without dropping any listings from the box plot calculations. I set custom colors with scale_fill_manual(): shared rooms in dark blue, private rooms in green, hotel rooms in yellow, and entire homes in red. Titles and axis labels were added with labs(), and I used theme_minimal() to keep the plot neat.
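To illustrate why coord_cartesian() is used for the y-axis limit, here is a small sketch with made-up prices (the data frame toy and its values are assumptions, not the Airbnb data). It compares zooming with coord_cartesian() against setting limits on the scale, which removes out-of-range values before the box plot statistics are computed.

```r
library(ggplot2)

# Made-up prices: four typical listings and three expensive ones
toy <- data.frame(room_type = "Entire home/apt",
                  price = c(50, 100, 150, 200, 800, 900, 2000))

base <- ggplot(toy, aes(x = room_type, y = price)) + geom_boxplot()

p_zoom <- base + coord_cartesian(ylim = c(0, 500))       # zooms the view only
p_clip <- base + scale_y_continuous(limits = c(0, 500))  # drops prices above 500

# layer_data() returns the statistics computed for the box plot layer
median_zoom <- layer_data(p_zoom)$middle                    # median of all 7 prices
median_clip <- suppressWarnings(layer_data(p_clip)$middle)  # median of the 4 kept prices
```

With these toy values, median_zoom is 200 and median_clip is 125, so the two approaches can draw different boxes for the same data.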
Two things I would change are the yellow box plot and the legend title. For some reason the line in the yellow box plot isn’t showing, and the legend title shows up as “room_type” instead of just “Rooms”. I would also add the cities to the box plot so it’s not so broad.
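One possible way to fix the legend title is to pass fill = to labs(), since the legend belongs to the fill aesthetic. This is a minimal sketch on made-up data (toy is a hypothetical stand-in for capitol_hill_data), not the final plot.

```r
library(ggplot2)

# Hypothetical stand-in data; the real plot would use capitol_hill_data
toy <- data.frame(
  room_type = rep(c("Entire home/apt", "Private room"), each = 4),
  price     = c(150, 180, 210, 260, 60, 75, 90, 120)
)

p <- ggplot(toy, aes(x = room_type, y = price, fill = room_type)) +
  geom_boxplot(alpha = 0.7) +
  scale_fill_manual(values = c("Entire home/apt" = "red",
                               "Private room"    = "green")) +
  # fill = sets the legend title, replacing the default "room_type"
  labs(x = "Room Type", y = "Price ($)", fill = "Rooms") +
  theme_minimal()
```

Printing p should show the legend headed “Rooms”.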
library(readxl)
library(dplyr)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
library(ggplot2)
df <- read_excel(file.choose())
# Keep only listings whose neighbourhood mentions Capitol Hill
capitol_hill_data <- df %>%
  filter(grepl("Capitol Hill", neighbourhood))
# Make sure price is numeric so it can be plotted
capitol_hill_data <- capitol_hill_data %>%
  mutate(price = as.numeric(price))
print(nrow(capitol_hill_data))
[1] 587
summary(capitol_hill_data$price)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
   23.0   116.0   155.0   188.2   210.0  2000.0      97
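The 97 NA's in the summary may be listings with no price, or price strings that as.numeric() cannot parse. The raw column is not shown here, so the following is only a sketch under the assumption that some prices look like "$1,200.00". The helper name clean_price and the example values are hypothetical.

```r
# Hypothetical helper: strip "$" and "," before converting to numeric
clean_price <- function(x) {
  as.numeric(gsub("[$,]", "", as.character(x)))
}

raw <- c("$1,200.00", "85", NA)  # made-up example values
prices <- clean_price(raw)

# Count what is still missing, then drop those values before plotting
n_missing <- sum(is.na(prices))
prices_ok <- prices[!is.na(prices)]
```

Dropping the missing prices before plotting would also avoid the "Removed 97 rows" warning from stat_boxplot().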
ggplot(capitol_hill_data, aes(x = room_type, y = price, fill = room_type)) +
  geom_boxplot(alpha = 0.7, outlier.shape = NA) +
  coord_cartesian(ylim = c(0, 500)) +
  labs(
    title = "Distribution of Airbnb Prices by Room Type in Capitol Hill, DC",
    x = "Room Type",
    y = "Price ($)",
    fill = "Room Type",
    caption = "Data Source: Airbnb_DC_25 Dataset"
  ) +
  theme_minimal()
Warning message: Removed 97 rows containing non-finite outside the scale range (stat_boxplot()).
To change the colors to the ones I wanted (red, yellow, green, dark blue), I used this code:
library(ggplot2)
ggplot(capitol_hill_data, aes(x = room_type, y = price, fill = room_type)) +
  geom_boxplot(alpha = 0.7, outlier.shape = NA) +
  coord_cartesian(ylim = c(0, 500)) +
  # Custom fill color for each room type
  scale_fill_manual(values = c(
    "Shared room" = "darkblue",
    "Private room" = "green",
    "Hotel room" = "yellow",
    "Entire home/apt" = "red"
  )) +
  labs(
    title = "Airbnb Prices by Room Type in Capitol Hill",
    x = "Room Type",
    y = "Price ($)"
  ) +
  theme_minimal()