Loading Data

## Rows: 15244 Columns: 18
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr   (3): name, host_name, room_type
## dbl  (12): id, host_id, neighbourhood, latitude, longitude, price, minimum_n...
## lgl   (2): neighbourhood_group, license
## date  (1): last_review
## 
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.

Numeric Summaries

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##     8.0    91.0   144.0   282.1   250.0 38143.0    4061
##   25%   50%   75%   95% 
##  91.0 144.0 250.0 780.8
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##    1.000    1.000    2.000    7.643    3.000 1124.000
## 25% 50% 75% 95% 
##   1   2   3  30

The price column ranges from $8 to $38,143, with a median of $144 and a mean of $282.1; 4,061 listings have no price recorded. The minimum_nights column ranges from 1 to 1,124 nights, with a median of 2 nights.

The 95th percentile of price is $780.8, so 95% of the listings with a recorded price cost no more than this. The 75th percentile of minimum_nights is 3 nights, which indicates that most listings allow short stays.
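Summaries of this kind can be produced with base R's `summary()` and `quantile()`. The sketch below is only an illustration on a small invented data frame (the `listings` object and its values are assumptions, not the file loaded above); it shows why `na.rm = TRUE` is needed when a column such as price contains missing values.

```r
# Toy data, invented for illustration; not the listings file used in this report.
listings <- data.frame(
  price          = c(50, 80, NA, 120, 150, 200, 400, NA, 95, 3000),
  minimum_nights = c(1, 2, 1, 3, 2, 30, 1, 2, 1, 90)
)

# summary() reports min, quartiles, mean, max and the number of NAs.
price_summary <- summary(listings$price)
print(price_summary)

# quantile() stops with an error on NAs unless na.rm = TRUE is set.
probs    <- c(0.25, 0.50, 0.75, 0.95)
price_q  <- quantile(listings$price, probs = probs, na.rm = TRUE)
nights_q <- quantile(listings$minimum_nights, probs = probs)
print(price_q)
print(nights_q)
```

Reporting the NA count next to the quantiles makes it clear that the price percentiles describe only the listings that have a price.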

Categorical Summaries

## 
## 78701 78702 78703 78704 78705 78712 78717 78719 78721 78722 78723 78724 78725 
##  1112  1821   715  2264   568     1    73    23   400   289   496   232    99 
## 78726 78727 78728 78729 78730 78731 78732 78733 78734 78735 78736 78737 78738 
##    24   177   116   166    43   175    76    99   359    76    93   199    83 
## 78739 78741 78742 78744 78745 78746 78747 78748 78749 78750 78751 78752 78753 
##    36   927     6   461   813   285   124   261   173    95   546   274   223 
## 78754 78756 78757 78758 78759 
##   179   193   287   407   175
## 
## Entire home/apt      Hotel room    Private room     Shared room 
##           12429             134            2562             119
## [1] 44
## [1] 4

There are 44 unique neighbourhoods (recorded as ZIP codes) and 4 unique room types.
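Frequency tables and distinct counts like these can be obtained with `table()` and `length(unique())`. The following is a minimal sketch on invented rows; the `listings` data frame here is a stand-in, not the real data.

```r
# Toy data, invented for illustration only.
listings <- data.frame(
  neighbourhood = c(78701, 78702, 78702, 78704, 78704, 78704),
  room_type     = c("Entire home/apt", "Private room", "Entire home/apt",
                    "Entire home/apt", "Shared room", "Private room")
)

# Count listings per category.
zip_counts  <- table(listings$neighbourhood)
room_counts <- table(listings$room_type)
print(zip_counts)
print(room_counts)

# Number of distinct values in each column.
n_neighbourhoods <- length(unique(listings$neighbourhood))
n_room_types     <- length(unique(listings$room_type))
print(n_neighbourhoods)
print(n_room_types)
```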

Analysis and Visualizations

Q1. Which neighbourhood has the highest average price per night?

## # A tibble: 44 x 2
##    neighbourhood Average_price
##            <dbl>         <dbl>
##  1         78750         1433.
##  2         78732         1393.
##  3         78746          779.
##  4         78731          755.
##  5         78730          735.
##  6         78727          718.
##  7         78733          633.
##  8         78729          550.
##  9         78701          396.
## 10         78737          390.
## # i 34 more rows

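The 78750 neighbourhood has the highest average price per night ($1432.51), followed by 78732 at $1392.82. Both ZIP codes belong to Travis County, and their averages are well above those of the other neighbourhoods, which suggests that location influences pricing. Because the mean is sensitive to extreme values (the maximum price is $38,143), a few very expensive listings may account for part of this gap.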
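A ranking like the one above can be built by grouping on neighbourhood and averaging price. The sketch below uses base R's `aggregate()` on invented rows and, as an added assumption not taken from the original analysis, also computes the median per neighbourhood so the effect of a single extreme price is visible. The `Median_price` column name is hypothetical.

```r
# Toy data, invented for illustration; 78704 contains one extreme price.
listings <- data.frame(
  neighbourhood = c(78701, 78701, 78702, 78702, 78704, 78704, 78704),
  price         = c(100, 300, 80, NA, 150, 250, 5000)
)

# The formula interface drops rows with a missing price (na.action = na.omit).
avg <- aggregate(price ~ neighbourhood, data = listings, FUN = mean)
names(avg)[2] <- "Average_price"
med <- aggregate(price ~ neighbourhood, data = listings, FUN = median)
names(med)[2] <- "Median_price"

# Combine both statistics and sort by average price, highest first.
by_zip <- merge(avg, med, by = "neighbourhood")
by_zip <- by_zip[order(by_zip$Average_price, decreasing = TRUE), ]
print(by_zip)
```

In this toy example the top neighbourhood has a much higher mean than median, which is the pattern to look for before concluding that a ZIP code is expensive across the board.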

The price distribution is right-skewed, indicating most listings are affordable, with a few luxury options.

Note: I set the maximum at 2000 to remove outliers and see a better visualisation.

The minimum_nights distribution is concentrated at 1-2 nights, which suggests that most hosts allow short-term stays.

Note: I set the maximum at 100 to remove outliers and see a better visualisation.
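The report does not say how the caps were applied, so the following is only one possible approach, assuming the values above the cap are filtered out before binning. It uses invented prices, and the `cap` variable is a hypothetical name; counting the excluded values makes it easy to state how much data the plot leaves out.

```r
# Toy prices, invented for illustration only.
price <- c(40, 75, 90, 110, 150, 220, 480, 1900, 2500, 38143, NA)
cap   <- 2000

# Keep non-missing prices at or below the cap; count what is excluded.
shown   <- price[!is.na(price) & price <= cap]
dropped <- sum(price > cap, na.rm = TRUE)

# Bin the remaining prices in steps of 100; call plot(h) to draw the histogram.
h <- hist(shown, breaks = seq(0, cap, by = 100), plot = FALSE)
print(h$counts)
cat("Listings above the cap:", dropped, "\n")
```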

Q2. Is there a relationship between the number of reviews and the price of a listing?

## Warning: Removed 4061 rows containing missing values (geom_point).

The scatter plot, which excludes the 4,061 listings without a price, suggests a negative relationship between price and the number of reviews: the higher the price, the fewer the reviews. Listings with low prices tend to have more reviews, possibly because they are booked more often.
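A correlation coefficient can back up what the scatter plot shows. The sketch below uses invented values and assumes the review count is stored in a column named `number_of_reviews`. Spearman's rank correlation is chosen here as an assumption because price is heavily right-skewed; it is not necessarily the measure the original analysis relied on.

```r
# Toy data, invented for illustration only.
listings <- data.frame(
  price             = c(60, 80, 100, NA, 150, 300, 500, NA, 1200),
  number_of_reviews = c(210, 150, 160, 40, 90, 35, 20, 5, 2)
)

# Drop listings without a price, as the scatter plot does.
complete <- listings[!is.na(listings$price), ]

# Rank-based correlation is less affected by a few extreme prices.
rho <- cor(complete$price, complete$number_of_reviews, method = "spearman")
print(rho)
```

A negative value supports the direction of the relationship, but it does not show that low prices cause more reviews.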

Q3. How does the room type affect the minimum nights required by hosts?

## # A tibble: 4 x 2
##   room_type       median_minimum_nights
##   <chr>                           <dbl>
## 1 Entire home/apt                     2
## 2 Hotel room                          1
## 3 Private room                        1
## 4 Shared room                        30

Shared-room listings tend to require much longer minimum stays (a median of 30 nights) than entire homes/apartments (a median of 2 nights). Hotel rooms and private rooms have the shortest minimum-stay requirement, with a median of 1 night.
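A per-group median like the one in the table can be computed with `tapply()`. This sketch runs on invented rows and adds the number of listings per room type next to each median, an extra column (hypothetically named `n_listings`) that is not in the original table but helps judge how much data each median rests on.

```r
# Toy data, invented for illustration only.
listings <- data.frame(
  room_type = c("Entire home/apt", "Entire home/apt", "Entire home/apt",
                "Private room", "Private room",
                "Shared room", "Shared room", "Shared room"),
  minimum_nights = c(2, 3, 1, 1, 1, 30, 30, 2)
)

# Median minimum stay and group size for each room type.
med <- tapply(listings$minimum_nights, listings$room_type, median)
n   <- table(listings$room_type)

result <- data.frame(
  room_type             = names(med),
  median_minimum_nights = as.vector(med),
  n_listings            = as.vector(n[names(med)])
)
print(result)
```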

Further question: Why do shared rooms require a longer stay compared to others? This could be due to various factors such as shared facilities, cleaning schedules, or agreements between multiple occupants. Further data and analysis would be needed to explore this question fully.