Introduction
I conduct an exploratory analysis of Airbnb listings in Zürich, Switzerland, based on a publicly available data set scraped on September 22nd, 2022. The purpose of this report is to gain insight into Zürich’s Airbnb vacation rental business. More specifically, I try to answer the following questions:
- How does the number of listings differ between Zürich’s districts (Kreise)?
- Which room types are most widespread?
- How does price vary by district?
- Do hosts with longer experience get better ratings?
Data preparation
Data retrieval and import
I use data provided by Inside Airbnb, a “mission-driven project that provides data and advocacy about Airbnb’s impact on residential communities”. The data are available under a Creative Commons Attribution 4.0 International License, which allows the free sharing and adaptation of the material for any purpose.
Inside Airbnb scrapes listings in specified cities from Airbnb’s website on a quarterly basis and provides free access to data from the last 12 months. My analysis uses a data set that represents a snapshot of Airbnb listings in Zürich on September 22nd, 2022.
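As a rough illustration of the import step, the sketch below reads a listings export into memory using only the Python standard library. It is not the code behind this report (the summary further down comes from an R session), and the file name in the usage note is an assumption about how the download is saved locally.

```python
import csv
import gzip
from pathlib import Path


def load_listings(path):
    """Read a listings export into a list of dicts (one dict per listing).

    Handles both plain .csv files and gzip-compressed .csv.gz files.
    Every value is returned as a string; type conversion happens later.
    """
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    # newline="" lets the csv module deal with line breaks embedded in
    # quoted fields, which free-text columns such as descriptions can contain.
    with opener(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))
```

Calling `load_listings("listings.csv.gz")` (hypothetical local file name) would return one dictionary per listing, keyed by column name.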
Data cleaning & transformation
I print a summary of the airbnb data frame to check whether the variable classes have been correctly specified and whether there are mistakes or inconsistencies in the data.
Rows: 2,246
Columns: 51
$ id <fct> 73282, 86645, 143821, 178448, 204586, 2…
$ listing_url <chr> "https://www.airbnb.com/rooms/73282", "…
$ last_scraped <date> 2022-09-23, 2022-09-23, 2022-09-23, 20…
$ name <chr> "Clean, central, quiet", "Stadium Letzi…
$ description <chr> "Arty neighborhood<br /><br /><b>The sp…
$ neighborhood_overview <chr> "", "Located 300 meters to Zurich Letzi…
$ picture_url <chr> "https://a0.muscache.com/pictures/48107…
$ host_id <fct> 377532, 475053, 697307, 854016, 1004816…
$ host_url <chr> "https://www.airbnb.com/users/show/3775…
$ host_name <chr> "Simona", "James", "Erhan", "Delphine",…
$ host_since <date> 2011-02-04, 2011-03-31, 2011-06-13, 20…
$ host_location <chr> "Zurich, Switzerland", "", "Zürich, Swi…
$ host_about <chr> "I am from Italy and have lived in Zuri…
$ host_response_time <fct> NA, within an hour, NA, within an hour,…
$ host_response_rate <chr> NA, "100%", NA, "100%", NA, "100%", "10…
$ host_acceptance_rate <chr> "0%", "98%", "0%", NA, NA, "92%", "40%"…
$ host_is_superhost <lgl> FALSE, TRUE, FALSE, FALSE, FALSE, FALSE…
$ host_thumbnail_url <chr> "https://a0.muscache.com/im/users/37753…
$ host_picture_url <chr> "https://a0.muscache.com/im/users/37753…
$ host_neighbourhood <chr> "", "", "", "", "", "", "", "", "", "",…
$ host_listings_count <int> 1, 36, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1,…
$ host_total_listings_count <int> 1, 41, 1, 1, 1, 4, 2, 1, 1, 3, 1, 1, 1,…
$ host_verifications <chr> "['email', 'phone']", "['email', 'phone…
$ host_has_profile_pic <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRU…
$ host_identity_verified <lgl> TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, T…
$ neighbourhood_cleansed <fct> Sihlfeld, Sihlfeld, Alt-Wiedikon, Enge,…
$ neighbourhood_group_cleansed <fct> Kreis 3, Kreis 3, Kreis 3, Kreis 2, Kre…
$ latitude <dbl> 47.37374, 47.38038, 47.35724, 47.36565,…
$ longitude <dbl> 8.519570, 8.504610, 8.523040, 8.527530,…
$ property_type <fct> Entire rental unit, Entire rental unit,…
$ room_type <fct> Entire home/apt, Entire home/apt, Entir…
$ accommodates <dbl> 4, 3, 2, 1, 1, 3, 3, 2, 1, 4, 10, 6, 1,…
$ bathrooms_text <chr> "1 bath", "1 bath", "1.5 baths", "1 bat…
$ bedrooms <dbl> 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, 3, 3, 1, …
$ beds <dbl> 1, 2, 2, 1, 1, 2, 2, 1, 1, 2, 9, 2, 1, …
$ amenities <chr> "[\"Essentials\", \"Kitchen\", \"Hot tu…
$ price <chr> "$100.00", "$184.00", "$200.00", "$60.0…
$ has_availability <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRU…
$ number_of_reviews <dbl> 49, 50, 0, 9, 0, 227, 30, 0, 334, 293, …
$ first_review <date> 2012-05-19, 2011-06-10, NA, 2011-08-30…
$ last_review <date> 2019-04-27, 2021-07-16, NA, 2016-05-10…
$ review_scores_rating <dbl> 4.78, 4.52, NA, 4.89, NA, 4.59, 4.97, N…
$ review_scores_accuracy <dbl> 4.87, 4.67, NA, 4.89, NA, 4.65, 5.00, N…
$ review_scores_cleanliness <dbl> 4.80, 4.70, NA, 4.89, NA, 4.33, 4.93, N…
$ review_scores_checkin <dbl> 4.84, 4.64, NA, 4.89, NA, 4.82, 4.97, N…
$ review_scores_communication <dbl> 4.93, 4.77, NA, 4.89, NA, 4.83, 4.93, N…
$ review_scores_location <dbl> 4.71, 4.60, NA, 5.00, NA, 4.77, 4.93, N…
$ review_scores_value <dbl> 4.61, 4.47, NA, 4.89, NA, 4.61, 4.83, N…
$ instant_bookable <lgl> FALSE, TRUE, FALSE, FALSE, FALSE, FALSE…
$ calculated_host_listings_count <dbl> 1, 17, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1,…
$ reviews_per_month <dbl> 0.39, 0.36, NA, 0.07, NA, 1.71, 0.23, N…
Below is a list of the cleaning and transforming operations conducted on selected variables.
- host_since: a date. I calculate the number of years since the host began hosting on Airbnb.
- host_response_time: it takes four different values (within an hour, within a day, within a few hours, a few days or more). First, I re-code the categories to make them shorter, and then I reorder them in a meaningful way.
- host_response_rate: the percentages are recorded as text strings (e.g. “80%”). I convert these to numbers.
- host_acceptance_rate: same treatment as host_response_rate.
- neighbourhood_cleansed and neighbourhood_group_cleansed: I rename these to neighborhood and district to follow the official administrative nomenclature. The neighborhood name Schwamendingen-Mitte is also changed so that the text wraps onto two lines (useful in plots).
- price: the values are stored as character strings (e.g. “$184.00”). I convert them to numeric values. Moreover, one listing has a price of $4600.00. After manually checking the listing, I am confident this is either a system error or an error on the part of the host, since the listing is similar to others priced below $200.00. This record is therefore omitted from the data set.
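The string conversions in the list above can be sketched as follows. This is a minimal Python illustration rather than the report's own code: the function names are hypothetical, the snapshot date is taken from the introduction, and the shortened response-time labels are not stated in the text, so only the ordering is shown.

```python
from datetime import date

# Snapshot date of the data set, as stated in the introduction.
SNAPSHOT = date(2022, 9, 22)

# Response-time categories ordered from fastest to slowest.
RESPONSE_TIME_ORDER = [
    "within an hour",
    "within a few hours",
    "within a day",
    "a few days or more",
]


def parse_price(text):
    """Convert a price string such as '$184.00' to a float (None if missing)."""
    if not text:
        return None
    # Drop the currency symbol and any thousands separators.
    return float(text.replace("$", "").replace(",", ""))


def parse_percent(text):
    """Convert a percentage string such as '80%' to a float (None if missing)."""
    if not text or not text.endswith("%"):
        return None
    return float(text.rstrip("%"))


def years_hosting(host_since, snapshot=SNAPSHOT):
    """Approximate years between an ISO date string and the snapshot date."""
    if not host_since:
        return None
    start = date.fromisoformat(host_since)
    return (snapshot - start).days / 365.25


def response_time_rank(category):
    """Sort key that places faster response categories first."""
    return RESPONSE_TIME_ORDER.index(category)
```

Missing values are mapped to `None` rather than zero, so that they do not distort averages later on.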
Data visualizations
Maps
The municipality of Zürich is divided into 12 districts (Kreise in German), numbered 1 to 12. Each district contains between one and four neighborhoods (Quartiere in German), for a total of 34. These administrative subdivisions are shown in the interactive maps below:
Zürich’s districts (left) and neighborhoods (right). (Note: the two maps are synchronized. Zooming, panning, and cursor positioning in one map are reflected in the other.)
The approximate locations of the listings in the data set relative to the city’s districts are shown in the figure below (Note: locations are accurate only to within 150 m of the actual address due to Airbnb’s privacy and security policies).
Locations of Airbnb listings in the city of Zürich on September 22nd, 2022 (n = 2245). Green points show listings within the city boundaries. Red points show listings outside of the city’s boundaries (Note: all points are clickable and display the advertised price and a link to the listing’s webpage).
Since every listing in the data set has a value for district and neighborhood, the fact that many points fall outside the district polygons means that some listings (n = 256) have been wrongly classified as lying within the city limits of Zürich. Many listings outside the city’s boundaries are described on their webpages as being in Zürich, with the location given, for example, as “Regensdorf, Zürich, Switzerland”. This may refer to the canton of Zürich rather than the city. Inside Airbnb scrapes this information and then, in a separate data cleaning step, assigns the scraped location names to a particular district and neighborhood. The errors are likely introduced in this cleaning step.
In view of the above, I will perform all further analyses on the subset of the original data set containing only those listings that are strictly within the city’s boundaries (n = 1988).
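The report does not show how this spatial filter was implemented. One common way to test whether a point lies inside a boundary polygon is ray casting, sketched below in plain Python. The sketch treats longitude and latitude as flat coordinates, which is a simplifying assumption that is reasonable at city scale. The function names are hypothetical, and real district boundaries would have to be loaded from an official boundary file.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if the point (lon, lat) lies inside the polygon.

    `polygon` is a list of (lon, lat) vertices; the last vertex is
    implicitly connected back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            # Count only crossings to the east of the point.
            if lon < x_cross:
                inside = not inside
    return inside


def split_by_boundary(listings, polygons):
    """Partition listings into (inside, outside) relative to the polygons.

    A listing counts as inside if its coordinates fall within any polygon.
    """
    inside, outside = [], []
    for listing in listings:
        lon = float(listing["longitude"])
        lat = float(listing["latitude"])
        hit = any(point_in_polygon(lon, lat, poly) for poly in polygons)
        (inside if hit else outside).append(listing)
    return inside, outside
```

A point is inside when a ray cast eastward from it crosses the boundary an odd number of times. Points lying exactly on an edge are a borderline case that this sketch does not treat specially.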
Plots
District 4 is home to the largest number of listings, largely concentrated in the neighborhood of Langstrasse. This is Zürich’s most multicultural neighborhood, known for its lively night life, with dance clubs, bars, and music venues. It used to be an area with above-average crime rates, drug dealing, and prostitution, but public order and safety have improved greatly over the last two decades. An accelerated process of gentrification in the last decade has changed the face of this neighborhood drastically, with many buildings changing owners and being renovated. Combined with the neighborhood’s proximity to the center of Zürich, this gentrification has injected a lot of energy into Langstrasse’s real estate market, and it is no surprise that many previously run-down buildings and apartments have been converted to Airbnb rentals.
Despite the large number of listings in the neighborhood of Langstrasse, Altstetten ranks first among the neighborhoods by number of listings. This is another quarter of Zürich that used to be less desirable because it was poorly developed, but it is currently undergoing rapid gentrification, with large investments in its infrastructure. A large number of apartments are available there, not just for short-term rental but on the real estate market more generally.