---
title: "Exempt Airbnb Listings in Lisbon"
author: "Eden Flaherty"
date: "2026-01-20"
output:
  html_document:
    toc: true
    toc_float: true
    code_download: true
    theme: united
  
---


```{r setup, include=FALSE}
library("tidyverse")
library("scales")
library("leaflet")
library("reactable")
library("plotly")
library("RColorBrewer")
library("janitor")
```

This is an analysis of data from Inside Airbnb.

The data consists of listings scraped from Airbnb in Lisbon, Portugal.

## The data

Short-term rentals have been an issue in many cities across Europe, including Lisbon, the focus of online publication [Atlas Lisboa](https://www.atlaslisboa.com/).

I searched for data on short-term rentals in Lisbon and found [Inside Airbnb](https://insideairbnb.com/), which scrapes listings from [Airbnb](https://airbnb.com/), the world's largest short-term rental platform. 

I found data for the city of Lisbon, and decided to explore what it could tell me about short-term rentals in the city. 

The data can be found [here](https://insideairbnb.com/get-the-data/).

The data dictionary by Inside Airbnb can be found [here](https://docs.google.com/spreadsheets/d/1iWCNJcSutYqpULSQHlNyGInUvHg2BoUGoNRIGa6Szc4/edit?gid=1322284596#gid=1322284596).

This data is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).


```{r}
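# Load the Inside Airbnb listings file and standardise the column names
df <- read.csv("inside_airbnb_listings.csv") %>% 
  clean_names()

# Strip dollar signs and thousands separators so price can be parsed as a number
df$price <- gsub("[$,]", "", df$price)
df$price <- as.numeric(df$price)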

```
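
As a small illustration of the price cleaning above, the sketch below applies the same `gsub()` pattern to a few made-up price strings. The values and the names `example_prices` and `cleaned_prices` are hypothetical and not taken from the dataset. Anything that is still not a number after removing `$` and `,`, such as an empty string, becomes `NA`.

```{r}
# Hypothetical price strings in the style of the raw price column
example_prices <- c("$85.00", "$1,250.00", "")

# Remove dollar signs and thousands separators, then convert to numeric
cleaned_prices <- as.numeric(gsub("[$,]", "", example_prices))

cleaned_prices
```

This is why `na.rm = TRUE` is needed when averaging prices later on.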


## Filtering 

The data includes a lot of information, such as host bios and listing descriptions. For this analysis, I am focusing on licensing, so I start by selecting only the variables I think will be useful.

The data is then filtered by *neighbourhood_group_cleansed* to only show listings in the city of Lisbon (Lisboa) rather than Greater Lisbon.


```{r}
lisbon_listings <- df %>% 
  select(id, listing_url, host_id, host_name, neighbourhood_group_cleansed, license, latitude, longitude, room_type, price, minimum_nights) %>% 
  filter(neighbourhood_group_cleansed == "Lisboa")
```

Next, I filter by *license* to find listings that self-declare as "exempt".

I also filter by *license* to find listings that have no licence listed (which I will refer to as 'blank' listings).

Each listing has a unique ID, so I am using these to count individual listings.

```{r}

exempt_listings <- lisbon_listings %>%
    filter(license == "Exempt")

blank_listings <- lisbon_listings %>%
    filter(license == "")

# Number of listings in Lisbon city 
length(lisbon_listings$id)

# Number of listings in Lisbon that self-declared as 'exempt' on Airbnb
length(exempt_listings$id)

# Number of listings in Lisbon that left the licence section empty on Airbnb
length(blank_listings$id)

# Percentage of Lisbon listings that self-declared as 'exempt' on Airbnb
percent((length(exempt_listings$id)/length(lisbon_listings$id)), accuracy = 0.01)

# Percentage of Lisbon listings that leave the licence section blank on Airbnb
percent((length(blank_listings$id)/length(lisbon_listings$id)), accuracy = 0.01)

```


There are **17428** Airbnb listings in Lisbon city. Of those, **3698** (21.22%) declare that they are 'exempt' from licensing, and a further **669** (3.84%) have nothing entered in the licence section on Airbnb.
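
The same counts and shares can also be produced in one step by labelling each listing and counting the labels. This is a minimal sketch on a made-up data frame: `toy_listings`, `licence_status` and the licence number are hypothetical, and any value other than "Exempt" or an empty string is assumed to be a licence number.

```{r}
library(dplyr)

# Hypothetical listings: two exempt, one blank, one with a licence number
toy_listings <- data.frame(
  id = 1:4,
  license = c("Exempt", "", "12345/AL", "Exempt")
)

licence_summary <- toy_listings %>%
  # Label each listing by what appears in its licence field
  mutate(licence_status = case_when(
    license == "Exempt" ~ "exempt",
    license == "" ~ "blank",
    TRUE ~ "licensed"
  )) %>%
  # Count listings per label and convert the counts to shares
  count(licence_status) %>%
  mutate(share = n / sum(n))

licence_summary
```

Counting labels in a single table keeps the totals and percentages together, which makes it harder for them to drift apart if the filters change.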

## How many are homes? 

Next, I want to check how many of those are entire apartments or houses, since it is these rentals that put the most pressure on city housing stock.

Starting with 'exempt' listings.

```{r}

table(exempt_listings$room_type)

exempt_listings %>% 
    ggplot(aes(room_type))+
  geom_bar()+
  labs(title = "Listings self-reporting as exempt", x = "Room type", y = "# of listings")

exempt_homes <- exempt_listings %>% 
  filter(room_type == "Entire home/apt")

```

We see that **2460** of those 'exempt' listings are entire homes. Most of the rest are private rooms (1219), alongside 18 hotel rooms and 1 shared room.

Doing the same for blank listings:

```{r}
table(blank_listings$room_type)

blank_listings %>% 
    ggplot(aes(room_type))+
  geom_bar()+
  labs(title = "Blank listings", x = "Room type", y = "# of listings")

blank_homes <- blank_listings %>% 
  filter(room_type == "Entire home/apt")

```

We can see that **498** of the blank listings are entire homes, with 153 private rooms, 16 hotel rooms and 2 shared rooms making up the rest.
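
To compare the two groups, it helps to express entire homes as a share of each group rather than as raw counts. The sketch below only reuses the counts reported above; the variable names are chosen for illustration.

```{r}
# Counts reported in the text above
exempt_total <- 3698
exempt_entire_homes <- 2460
blank_total <- 669
blank_entire_homes <- 498

# Share of each group that is an entire home, in percent
exempt_home_share <- round(100 * exempt_entire_homes / exempt_total, 1)
blank_home_share <- round(100 * blank_entire_homes / blank_total, 1)

c(exempt = exempt_home_share, blank = blank_home_share)
```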

## Medium- or short-term rentals?

In Portugal, you can offer your property as a medium-term let (normally a month or more), for example to students.

So next, I look at the minimum number of nights required by exempt homes listed on Airbnb, to see if this might be a reason for certain exemptions.

```{r}

ggplotly(exempt_homes %>% 
  filter(minimum_nights < 100) %>% 
  ggplot(aes(minimum_nights))+
  geom_bar(fill = "blue")+
  labs(x = "Minimum Nights", y = "Listings"), 
  dynamicTicks = "y")

```


There seem to be two clusters.

The majority of exempt homes seem to be operating as regular short-term rentals, with the minimum number of nights listed as a week or less. 
There is another group that has minimum nights starting around one month, most likely medium-term rentals. 

```{r}

# Entire homes that offer short-term stays (minimum of 1 to 7 nights)
short_exempt_homes <- exempt_homes %>%
  filter(minimum_nights >= 1, minimum_nights <= 7)
  
length(short_exempt_homes$id)


```

We have **1382** homes self-declared as exempt with a minimum stay of 7 nights or fewer.

This is very suspicious, but does not necessarily mean they are illegal.

Next, I look at the number of exempt homes offering stays with a minimum of at least 28 nights but less than one year (a minimum of a year or more would make a rental permanent).


```{r}

medium_term <- exempt_homes %>%
  filter(minimum_nights >=28, minimum_nights <365)

length(medium_term$id)

percent((length(medium_term$id)/length(exempt_homes$id)), accuracy = 0.01)

```

We can see that these "medium-term rentals" (**1003** listings, 40.77%) account for more than **40% of exempt homes listed on Airbnb**.
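
The two thresholds used above can also be expressed as a single banding step with `cut()`. This is a minimal sketch on made-up `minimum_nights` values with hypothetical band labels; the cut points mirror the filters above (1 to 7 nights, and 28 to 364 nights).

```{r}
# Hypothetical minimum-night values
toy_min_nights <- c(1, 3, 7, 14, 28, 30, 90, 365)

# right = FALSE gives left-closed bands: [1,8), [8,28), [28,365), [365,Inf)
stay_band <- cut(
  toy_min_nights,
  breaks = c(1, 8, 28, 365, Inf),
  right = FALSE,
  labels = c("short (1-7)", "in between (8-27)", "medium (28-364)", "year or more")
)

table(stay_band)
```

Banding every listing at once also shows how many homes fall between the two clusters, which the separate filters leave out.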

## What next?

From here, we have a few outcomes to follow up:

1. One in five Airbnb listings in Lisbon self-declare as exempt
    + The majority of these have no obvious reason for being exempt
    + Around 40% seem to be 'medium-term' rentals, which could offer a loophole to new legislation
2. Almost 4% of listings in Lisbon don't list any licence at all

## The companies are coming in

Next, I am grouping by host ID and host name. Host IDs allow us to differentiate between multiple hosts with the same name. 

```{r}

multi_usrs <- exempt_homes %>% 
  group_by(host_id, host_name) %>% 
  tally(sort = TRUE) 

reactable(multi_usrs)

```

We can see that two hosts stand out: Blueground and Ukio.

* Are these companies?
* What business are they operating?

```{r}
exempt_homes %>% 
  filter(host_name %in% c("Blueground","Ukio")) %>% 
  group_by(host_name) %>% 
  summarise(average_price = mean(price, na.rm = TRUE))

```

The average nightly price of listings by Blueground and Ukio is about 153 EUR and 151 EUR, respectively.

That is the equivalent of roughly 4,590 EUR and 4,530 EUR per month (30 days), considerably more than a standard rent in the city.
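
The monthly figures are the average nightly price multiplied by 30 nights. As a quick check, here is that arithmetic as a small helper; `monthly_equivalent()` is a hypothetical name used only for this illustration.

```{r}
# Convert an average nightly price into a 30-night equivalent
monthly_equivalent <- function(nightly_price, nights = 30) {
  nightly_price * nights
}

# Blueground's rounded average nightly price from the summary above
monthly_equivalent(153)
```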

It's also interesting to see that many 'exempt' hosts have multiple listings.

Doing the same for blank listings:

```{r}
reactable(blank_homes %>% group_by(host_id, host_name) %>% tally(sort = TRUE))
```

A few individuals stand out as having multiple listings but no licence listed. 

***It is important to remember that not having a licence displayed does not mean they don't have a licence.***

It is likely that some of these listings predate the requirement to add a licence.
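
Both host tables above rely on grouping by `host_id` as well as `host_name`. The sketch below shows on a made-up data frame why that matters and how the tallies could be narrowed to hosts with more than one listing; `toy_homes` and its host names are hypothetical.

```{r}
library(dplyr)

# Hypothetical listings: host 10 has three, host 20 has one, and
# hosts 30 and 40 share a name but are different hosts
toy_homes <- data.frame(
  id = 1:6,
  host_id = c(10, 10, 10, 20, 30, 40),
  host_name = c("Ana", "Ana", "Ana", "Rui", "Maria", "Maria")
)

# One row per host, largest first; the two "Maria" hosts stay separate
host_counts <- toy_homes %>%
  count(host_id, host_name, sort = TRUE, name = "listings")

# Keep only hosts with more than one listing
multi_listing_hosts <- host_counts %>%
  filter(listings > 1)

multi_listing_hosts
```

Grouping by name alone would merge the two "Maria" hosts into a single host with two listings, which would overstate how many hosts run multiple properties.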

## Map of exempt and "blank" homes in Lisbon

```{r}

map_df <- lisbon_listings %>%
  filter(license == "Exempt" | license == "") %>% 
  filter(room_type == "Entire home/apt") %>% 
  select(id, license, longitude, latitude, price, host_id, host_name)

pal <- colorFactor(
  palette = c("red", "blue"),
  domain = map_df$license)

map_df %>% 
  leaflet() %>% 
  addTiles() %>% 
  setView(lat = 38.736946, lng = -9.142685, zoom = 12) %>% 
  addCircles(lat = ~latitude, lng = ~longitude,
             color = ~pal(license),
             # map_df has no name column, so show the host name in the popup
             popup = ~host_name) %>% 
  addLegend(position = "bottomright", 
            colors = c("red", "blue"), 
            labels = c("Blank", "Exempt"))
  
```


