References: http://insideairbnb.com/singapore/, https://singaporelegaladvice.com/law-articles/is-airbnb-illegal-singapore
Data sources: http://insideairbnb.com/get-the-data.html
To many, Airbnb is a much cheaper alternative to traditional hotels and offers benefits and flexibility to both hosts and guests. Because land in Singapore is scarce, the rise of Airbnb has affected housing prices, and the local government has introduced measures to curb short-term rentals to tourists. As of 25 Sept 2019, there were 7,674 listings and 2,640 hosts in Singapore. This article seeks to answer these questions:
We see that the Central Region has more listings than the other regions combined, which is unsurprising. Airbnb guests are likely to be either tourists or expatriates working in Singapore, and most offices and attractions are located in the Central Region, so hosts have an incentive to locate their listings there.
In the second barplot, Kallang and Geylang have significantly more listings than other neighbourhoods. While most locals would agree that these two neighbourhoods would be better classified as East Region, they are in fact classified as Central Region by Airbnb. They are residential areas that are not far from the city centre and have efficient transport links to it. Listings there would therefore be cheaper than in other central neighbourhoods without sacrificing much accessibility.
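The counts behind the two barplots can be reproduced with a simple tally. The snippet below is a minimal sketch in Python (the original analysis appears to have been done in R), using a handful of made-up rows. The column name `neighbourhood_group_cleansed` appears in the model output later in this article; `neighbourhood_cleansed` and the sample values are assumptions for illustration.

```python
from collections import Counter

# Hypothetical sample rows; the real listings file has one row per listing.
listings = [
    {"neighbourhood_group_cleansed": "Central Region", "neighbourhood_cleansed": "Kallang"},
    {"neighbourhood_group_cleansed": "Central Region", "neighbourhood_cleansed": "Geylang"},
    {"neighbourhood_group_cleansed": "Central Region", "neighbourhood_cleansed": "Kallang"},
    {"neighbourhood_group_cleansed": "East Region", "neighbourhood_cleansed": "Bedok"},
    {"neighbourhood_group_cleansed": "West Region", "neighbourhood_cleansed": "Jurong West"},
]


def count_by(rows, column):
    """Return a Counter of how many rows fall under each value of `column`."""
    return Counter(row[column] for row in rows)


region_counts = count_by(listings, "neighbourhood_group_cleansed")
neighbourhood_counts = count_by(listings, "neighbourhood_cleansed")

# Does the Central Region hold more listings than all other regions combined?
central = region_counts["Central Region"]
central_dominates = central > sum(region_counts.values()) - central

print(region_counts.most_common())
print(neighbourhood_counts.most_common(2))
print(central_dominates)
```

Sorting with `most_common` gives the same ordering a barplot would show, from the largest group down.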
Singapore housing laws state that, in general, property rentals must last at least 6 months (for HDB flats) or 3 months (for private properties). HDB flats also cannot be rented to tourists.
From the histogram, it is clear that the vast majority of hosts are looking for short-term rather than long-term occupants, which would contravene these housing laws.
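One way to make this check explicit is to compare each listing's minimum stay against the legal minimum. The sketch below is in Python and rests on several assumptions: that the data has a minimum-stay value per listing (called `minimum_nights` here), that a flag for HDB flats is available (the `is_hdb` argument is hypothetical), and that 6 and 3 months can be approximated as 180 and 90 days.

```python
# Minimum legal rental periods from the text, converted to days.
# The day counts (180 and 90) are an approximation assumed for this sketch.
MIN_DAYS_HDB = 180      # "at least 6 months" for HDB flats
MIN_DAYS_PRIVATE = 90   # "at least 3 months" for private properties


def allows_short_term(minimum_nights, is_hdb):
    """True if the listing accepts stays shorter than the legal minimum."""
    threshold = MIN_DAYS_HDB if is_hdb else MIN_DAYS_PRIVATE
    return minimum_nights < threshold


# Hypothetical (minimum_nights, is_hdb) pairs.
sample = [(1, False), (3, True), (90, False), (90, True), (180, True)]
flags = [allows_short_term(nights, hdb) for nights, hdb in sample]
share_short_term = sum(flags) / len(flags)

print(flags)
print(share_short_term)
```

A listing flagged here is not proof of a violation; it only means the host accepts bookings shorter than the stated minimum.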
Availability, the number of days per year on which a listing is available for booking, also indicates whether a listing is likely to be for short-term or long-term rental. According to the Inside Airbnb website, “entire homes or apartments highly available year-round for tourists, probably don’t have the owner present and could be illegal”.
The first graph (truncated y-axis) shows that most listings are either highly available or not available (extremes). The barplot shows that more than half the listings are potential red flags for illegal short-term rentals.
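The red-flag rule can be written down as a small classifier. This is a hedged Python sketch: `availability_365` is a column named in the model output later in this article, but the 90-day cut-off for "highly available" and the `"Entire home/apt"` label are assumptions, not values taken from the article.

```python
HIGH_AVAILABILITY_DAYS = 90  # hypothetical cut-off, not taken from the article


def availability_band(availability_365, threshold=HIGH_AVAILABILITY_DAYS):
    """Bucket a listing by the number of days per year it can be booked."""
    if availability_365 == 0:
        return "not available"
    return "high" if availability_365 > threshold else "low"


def is_red_flag(listing, threshold=HIGH_AVAILABILITY_DAYS):
    """Entire home or apartment that is highly available year-round."""
    return (
        listing["room_type"] == "Entire home/apt"  # assumed category label
        and availability_band(listing["availability_365"], threshold) == "high"
    )


# Hypothetical listings for illustration.
rows = [
    {"room_type": "Entire home/apt", "availability_365": 365},
    {"room_type": "Entire home/apt", "availability_365": 0},
    {"room_type": "Private room", "availability_365": 300},
    {"room_type": "Entire home/apt", "availability_365": 45},
]
red_flags = [is_red_flag(row) for row in rows]
share_flagged = sum(red_flags) / len(rows)

print(red_flags)
print(share_flagged)
```

Keeping the threshold as a parameter makes it easy to see how sensitive the flagged share is to the chosen cut-off.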
According to the Inside Airbnb website, “Hosts with multiple listings are more likely to be running a business, are unlikely to be living in the property, and in violation of most short term rental laws designed to protect residential housing.” From the first histogram, we see that most hosts (more than 70%) have only one listing, while some have multiple listings. One host even has 285 listings! The second graph identifies the top 10 hosts with the most listings.
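Both figures (the share of single-listing hosts and the top hosts) come from counting listings per host. The following Python sketch assumes each listing carries a host identifier (called `host_id` here, which is an assumption) and uses made-up IDs.

```python
from collections import Counter

# Hypothetical host_id values, one entry per listing.
host_ids = [101, 101, 101, 202, 303, 404, 404, 505]

listings_per_host = Counter(host_ids)

# Share of hosts with exactly one listing.
single_hosts = sum(1 for n in listings_per_host.values() if n == 1)
share_single = single_hosts / len(listings_per_host)

# Hosts with the most listings (use 10 instead of 2 for a top-10 ranking).
top_hosts = listings_per_host.most_common(2)

print(share_single)
print(top_hosts)
```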
The first graph shows a rather similar shape across all five regions, with all peaking around the $50 to $60 mark. However, the median price for the Central Region is significantly higher than for the other regions. This is again unsurprising, as facilities, accessibility and attractions are more abundant in the Central Region and should drive up prices. The mean price is not a useful indicator because there are plenty of outliers, as seen from the boxplots. Notice that the region with the highest mean price is the West, due to one particular listing priced at $10,000.
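A small numeric example shows why a single extreme listing distorts the mean but barely moves the median. The prices below are made up for illustration, apart from the $10,000 outlier mentioned above.

```python
from statistics import mean, median

# Hypothetical nightly prices for a small region.
prices = [50, 55, 60, 65, 70]
with_outlier = prices + [10000]  # add one listing priced at $10,000

print(mean(prices), median(prices))
print(mean(with_outlier), median(with_outlier))
```

With the outlier included, the mean jumps from 60 to roughly 1,717, while the median only moves from 60 to 62.5.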
As expected, the most expensive neighbourhoods also belong to the Central Region.
The Airbnb data provides a wealth of features (106, to be exact), but many of them are not useful for this analysis and must be dropped.
Looking at the heatmap, we do not observe any strong correlation between price and the numerical variables.
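Each cell of such a heatmap is a pairwise correlation coefficient. As a minimal sketch, assuming the heatmap uses the Pearson coefficient (the article does not say which one), the Python below computes the correlation of price against a couple of numeric features. The feature names come from the model output later in this article; the values are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for five listings.
price = [80, 120, 150, 200, 260]
features = {
    "bedrooms": [1, 2, 2, 3, 4],
    "number_of_reviews": [40, 5, 22, 9, 30],
}
correlations = {name: pearson(values, price) for name, values in features.items()}

for name, r in correlations.items():
    print(name, round(r, 3))
```

Values near 0 indicate little linear relationship with price, which is what the heatmap shows for the real data.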
##
## Call:
## lm(formula = price ~ accommodates + extra_people + room_type +
## cancellation_policy + bedrooms + property_type + security_deposit +
## reviews_per_month + calculated_host_listings_count + host_identity_verified +
## review_scores_rating + neighbourhood_group_cleansed + instant_bookable +
## cleaning_fee + availability_365 + beds + number_of_reviews,
## data = train)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
##  -708.3   -57.6   -12.3    28.5  8306.4
##
## Coefficients:
##                                     Estimate Std. Error t value Pr(>|t|)
## (Intercept)                        7.840e+00  3.937e+01   0.199 0.842146
## accommodates                       1.107e+01  2.129e+00   5.201 2.05e-07 ***
## extra_people                       2.906e+00  1.726e-01  16.834  < 2e-16 ***
## room_typeHotel room               -5.661e+01  1.435e+01  -3.945 8.07e-05 ***
## room_typePrivate room             -6.247e+01  8.419e+00  -7.420 1.36e-13 ***
## room_typeShared room              -1.092e+02  2.084e+01  -5.240 1.67e-07 ***
## cancellation_policymoderate       -7.470e+01  1.151e+01  -6.491 9.28e-11 ***
## cancellation_policystrict         -7.282e+01  9.371e+00  -7.771 9.24e-15 ***
## bedrooms                           3.997e+01  4.580e+00   8.728  < 2e-16 ***
## property_typeOther                 1.246e+01  7.098e+00   1.755 0.079369 .
## property_typeServiced apartment    6.469e+01  1.342e+01   4.820 1.47e-06 ***
## security_deposit                   5.411e-02  8.763e-03   6.175 7.10e-10 ***
## reviews_per_month                 -5.398e+00  3.753e+00  -1.438 0.150376
## calculated_host_listings_count    -2.297e-01  5.917e-02  -3.882 0.000105 ***
## host_identity_verifiedTRUE        -1.601e+01  7.843e+00  -2.042 0.041222 *
## review_scores_rating               6.190e-01  3.430e-01   1.805 0.071151 .
## neighbourhood_group_cleansedOther  5.152e+01  1.974e+01   2.610 0.009086 **
## instant_bookableTRUE              -1.444e+01  7.072e+00  -2.042 0.041183 *
## cleaning_fee                       1.072e-01  8.291e-02   1.293 0.196017
## availability_365                   7.076e-02  2.431e-02   2.910 0.003629 **
## beds                              -4.186e+00  2.065e+00  -2.027 0.042717 *
## number_of_reviews                 -2.047e-01  1.414e-01  -1.448 0.147583
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 234.8 on 5348 degrees of freedom
## Multiple R-squared: 0.19, Adjusted R-squared: 0.1869
## F-statistic: 59.75 on 21 and 5348 DF, p-value: < 2.2e-16
## [1] "Correlation: 0.250810195348764"
##          mae          mse         rmse         mape
##    86.788612 96584.220627   310.780020     0.666494
## [1] "Mean Absolute Scaled Error: 0.588837178424066"
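Forward variable selection using Akaike’s Information Criterion (AIC) was performed on the features, and a linear model was fitted on the selected features. The model is statistically significant overall (F-test p-value < 2.2e-16) but has a poor R-squared of 0.19, meaning it explains only about 19% of the variance in price. Prediction did not yield good results either: the RMSE is about 311 and the reported correlation is only about 0.25.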
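The summary figures in the output can be sanity-checked by hand. The Python sketch below uses the standard textbook definitions of MAE, MSE, RMSE, MAPE and adjusted R-squared; it is an assumption that the R functions behind the output use the same definitions. The function names and the toy `actual` and `predicted` values are hypothetical. The observation count of 5,370 is inferred from the 5,348 residual degrees of freedom plus 21 predictors plus the intercept.

```python
import math


def error_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and MAPE (as a fraction) for paired values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined when an actual value is 0.
    mape = sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return {"mae": mae, "mse": mse, "rmse": rmse, "mape": mape}


def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Penalise R-squared for the number of predictors in the model."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Toy example with hypothetical prices.
actual = [100, 200, 300]
predicted = [110, 190, 330]
metrics = error_metrics(actual, predicted)
print(metrics)

# Consistency checks against the figures reported in the output above.
rmse_from_mse = math.sqrt(96584.220627)              # reported rmse: 310.780020
adj_r2 = adjusted_r_squared(0.19, 5348 + 21 + 1, 21)  # reported: 0.1869
print(round(rmse_from_mse, 2), round(adj_r2, 4))
```

The adjusted R-squared computed from the rounded R-squared of 0.19 differs slightly from the reported 0.1869 because of that rounding.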
It seems that Shirley’s place is clean, accessible, convenient and comfortable!
This graph shows the number of reviews per month for all listings in Singapore. We can observe a roughly exponential upward trend since 2012, as Airbnb became increasingly popular.
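One way to check whether a trend is roughly exponential is to fit a straight line to the logarithm of the counts: if log(reviews) grows linearly with time, the growth is exponential. The sketch below is a minimal Python illustration on synthetic data, not the article's review counts. The function name `fit_exponential` is hypothetical.

```python
import math


def fit_exponential(ts, ys):
    """Least-squares fit of log(y) = a + b * t.

    Returns (exp(a), b): the fitted level at t = 0 and the growth rate.
    All y values must be positive.
    """
    logs = [math.log(y) for y in ys]
    n = len(ts)
    mean_t = sum(ts) / n
    mean_log = sum(logs) / n
    slope = sum((t - mean_t) * (v - mean_log) for t, v in zip(ts, logs)) / sum(
        (t - mean_t) ** 2 for t in ts
    )
    intercept = mean_log - slope * mean_t
    return math.exp(intercept), slope


# Synthetic series that grows exactly exponentially: y = 2 * exp(0.5 * t).
years = list(range(6))
counts = [2 * math.exp(0.5 * t) for t in years]

level, rate = fit_exponential(years, counts)
doubling_time = math.log(2) / rate

print(round(level, 3), round(rate, 3), round(doubling_time, 3))
```

On real data the fit will not be exact, and the size of the residuals in log space indicates how well the exponential description holds.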