1 Introduction
2 Data exploration
3 Point pattern analysis
- 3.1 Data transofrmations
  - 3.1.1 Unmarked point pattern
  - 3.1.2 Marked point pattern
- 3.2 Analysis
4 Summary

1 Introduction

This project analyzes the spatial distribution of high-tech firms in Warsaw across two time points: 2012 and 2021. Using point pattern analysis and Poisson point process modeling, it is studies how firm specialization (based on PKD sector codes) and proximity to Rondo Daszyńskiego — now a central business hub — influence spatial intensity. The goal is to uncover sectoral clustering patterns and how spatial dynamics have evolved over the studied decade.

1.1 Loading packages

library(here)
library(dplyr)
library(sf)
library(ggplot2)
library(grid)
library(gridExtra)
library(spatstat)

1.2 Loading the data

At first, we will load the necessary data - registers with firms from 2012 and 2021 as well as the shapefile for Warsaw. When loading the firm data, we drop unnecessary variables and select only firms that are categorized as technological. In our dataset we differentiate between 3 groups of technological firms:

medium high-tech
high-tech
high-tech knowledge-intensive services

# REGON from 2012
regon.2012 <- st_read(here("data", "regon2012_clean.shp"))
regon.2012.tech <- regon.2012 %>%
  filter(IF_TECH == 1) %>% # tech firms
  select(-c(REGON, POWIAT, GMINA, STR)) # dropping unnecessary columns
glimpse(regon.2012.tech)

# REGON from 2021
regon.2021 <- st_read(here("data", "regon2021_clean.shp"))
regon.2021.tech <- regon.2021 %>%
  filter(IF_TECH == 1) %>% # tech firms
  select(-c(REGON, KOD_GM, STR)) # dropping unnecessary columns
glimpse(regon.2021.tech)

# Shapefile powiaty (Warsaw)
powiaty <- st_read(here("data", "powiaty.shp"))
powiaty <- st_transform(powiaty, 4326)
warszawa_sf <- powiaty %>% filter(JPT_NAZWA_ == "powiat Warszawa")

1.3 Filtering out data for Warsaw

With the use of the shapefile of Warsaw (polygon type) we can filter out points that are within that area, using function st_filter(). We do that for both datasets and set the type of tech firm as an ordered factor.

# 2012
reg.12.waw <- st_filter(regon.2012.tech, warszawa_sf)
glimpse(reg.12.waw) # 29 597 points

## Rows: 29,597
## Columns: 7
## $ SEK_PKD7 <chr> "C", "J", "J", "J", "J", "C", "J", "J", "J", "J", "J", "J", "…
## $ PKD7     <chr> "2899Z", "6209Z", "6312Z", "6202Z", "6202Z", "2630Z", "6201Z"…
## $ TECH     <chr> "medium-high-tech", "high-tech know-intens serv", "high-tech …
## $ IF_TECH  <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ SURVIVOR <dbl> 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1…
## $ LPRAC    <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ geometry <POINT [°]> POINT (20.98255 52.26879), POINT (20.90713 52.21079), P…

# 2021
reg.21.waw <- st_filter(regon.2021.tech, warszawa_sf)
glimpse(reg.21.waw) # 48 992 points

## Rows: 48,992
## Columns: 7
## $ SEK_PKD7 <chr> "J", "M", "J", "J", "J", "J", "J", "J", "C", "J", "J", "J", "…
## $ PKD7     <chr> "6399Z", "7211Z", "6312Z", "6312Z", "6209Z", "5920Z", "6311Z"…
## $ TECH     <chr> "high-tech know-intens serv", "high-tech know-intens serv", "…
## $ IF_TECH  <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ SURVIVOR <dbl> 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0…
## $ LPRAC    <int> 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ geometry <POINT [°]> POINT (21.01173 52.22983), POINT (21.01173 52.22983), P…

reg.12.waw$TECH <- factor(reg.12.waw$TECH,
                          levels = c("medium-high-tech",
                                     "high-tech",
                                     "high-tech know-intens serv"))

reg.21.waw$TECH <- factor(reg.21.waw$TECH,
                          levels = c("medium-high-tech",
                                     "high-tech",
                                     "high-tech know-intens serv"))

Above we observe what variables are contained within the datasets:

SEK_PKD7 - sector of the firm, overall we have 21 sectors (A:U), but in case of technological firms we will observe only 3:
- C: industrial processing (firms from medium high-tech and high-tech groups)
- J: information and communication (firms from high-tech knowledge-intensive services group)
- M: professional, scientific, and technical activities (firms from high-tech knowledge-intensive services group)
PKD7 - specialization of the firm (over 90 unique tech specializations in our dataset)
TECH - tech group of the firm (medium high-tech, high-tech or high-tech knowledge-instensive services)
IF-TECH - whether the firm is a technological one (all are 1’s in our dataset)
SURVIVOR - whether the firms existed both in the 2012 and 2021 register
LPRAC - size of the firm measured by the number of employees, in groups:
- 1 - 0-9 employees
- 2 - 10-49 employees
- 3 - 50-249 employees
- 4 - 250-999 employees
- 5 - 1000+ employees
geometry - stores the coordinates of the firm

2 Data exploration

2.1 Distribution of all tech firms

Let’s analyze how did the distribution of all tech firms looked like in 2012 and 2021.

alltech_2012 <- ggplot() +
  geom_sf(data = warszawa_sf, fill = "white") +
  geom_sf(data = reg.12.waw, color = "#3B528BFF", size = 0.1, alpha = 0.05) +
  theme_minimal() +
  labs(title = "2012",
       subtitle = "29 597 firms") +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        plot.title = element_text(hjust = 0.5, size = 12, face = 'bold'),
        plot.subtitle = element_text(hjust = 0.5, size = 10))

alltech_2021 <- ggplot() +
  geom_sf(data = warszawa_sf, fill = "white") +
  geom_sf(data = reg.21.waw, color = "#3B528BFF", size = 0.1, alpha = 0.05) +
  theme_minimal() +
  labs(title = "2021",
       subtitle = "48 992 firms") +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        plot.title = element_text(hjust = 0.5, size = 12, face = 'bold'),
        plot.subtitle = element_text(hjust = 0.5, size = 10))

grid.arrange(alltech_2012, alltech_2021,
             ncol = 2,
             top = textGrob("Distribution of technological firms in Warsaw",
                            gp = gpar(fontsize = 15, font = 3)))

We can see that over the years the number of technological firms in Warsaw went up by nearly 20 thousand firms. The highest concentration of them is observed in the central part of the city

2.2 Distribution of tech firms by tech groups

2.2.1 Change in numbers

Now let’s see how the distribution differs between different tech groups

# table(reg.12.waw$TECH)
label_firms_2012 <- c(
  "high-tech" = "high-tech\n1 467 firms",
  "medium-high-tech" = "medium\nhigh-tech\n2 882 firms",
  "high-tech know-intens serv" = "high-tech\nknow-intens serv\n25 248 firms")

# table(reg.21.waw$TECH)
label_firms_2021 <- c(
  "high-tech" = "high-tech\n1 361 firms",
  "medium-high-tech" = "medium\nhigh-tech\n3 122 firms",
  "high-tech know-intens serv" = "high-tech\nknow-intens serv\n44 509 firms")

tech_groups_2012 <- ggplot() +
  geom_sf(data = warszawa_sf, fill = "white") +
  geom_sf(data = reg.12.waw, color = "#3B528BFF", size = 0.1, alpha = 0.1) +
  facet_wrap(~ TECH, labeller = as_labeller(label_firms_2012)) +
  theme_minimal() +
  labs(title = "2012") +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        legend.position = "none",
        plot.title = element_text(hjust = 0.5, size = 12, face = 'bold'))

tech_groups_2021 <- ggplot() +
  geom_sf(data = warszawa_sf, fill = "white") +
  geom_sf(data = reg.21.waw, color = "#3B528BFF", size = 0.1, alpha = 0.1) +
  facet_wrap(~ TECH, labeller = as_labeller(label_firms_2021)) +
  theme_minimal() +
  labs(title = "2021") +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        legend.position = "none",
        plot.title = element_text(hjust = 0.5, size = 12, face = 'bold'))

grid.arrange(tech_groups_2012, tech_groups_2021,
             nrow = 2,
             top = textGrob("Distribution of technological firms in Warsaw by tech groups",
                            gp = gpar(fontsize = 15, font = 3)))

We see that the most frequently occuring group is the high-tech knowledge-intensive services. When we analyze the differences between the years, this is also the group that stood behind the big growth of the number of firms (~ 20 000 new firms). When it comes to the other groups, we see slight changes for the medium high-tech group (up by ~ 300 firms) and high tech (decline by ~ 100 firms).

From the maps we can also observe that the distribution of the firms definitely varies across space.

2.3 Most frequent specializations of firms

As a next step, let’s analyze what are the most frequently occuring tech specializations in Warsaw and how did they change over the years.

To do this we select top 10 most frequently occuring specializations

top_12 <- as.data.frame(sort(table(reg.12.waw$PKD7), decreasing = T)[1:10])
top_21 <- as.data.frame(sort(table(reg.21.waw$PKD7), decreasing = T)[1:10])

colnames(top_12) <- c("PKD7", "Firms_2012")
colnames(top_21) <- c("PKD7", "Firms_2021")

pkd_lookup <- data.frame(
  PKD7 = c("6201Z", "6202Z", "5911Z", "6312Z", "6209Z",
           "6311Z", "6110Z", "3250Z", "7219Z", "6203Z"),
  Description = c(
    "Software development",
    "IT consulting",
    "Film and video production",
    "Internet portals",
    "Other IT services",
    "Data processing & hosting",
    "Wired telecommunications",
    "Medical device manufacturing",
    "R&D in science/tech",
    "IT infrastructure management"))

top_combined <- full_join(top_12, top_21, by = "PKD7") %>% 
  mutate(Difference = Firms_2021 - Firms_2012,
         Percent_Change = round(100 * (Firms_2021 - Firms_2012) / Firms_2012, 1)) %>% 
  left_join(pkd_lookup, by = 'PKD7') %>% 
  select(PKD7, Description, Firms_2012, Firms_2021, Difference, Percent_Change)

knitr::kable(top_combined)

PKD7	Description	Firms_2012	Firms_2021	Difference	Percent_Change
6201Z	Software development	7964	18381	10417	130.8
6202Z	IT consulting	4118	8075	3957	96.1
5911Z	Film and video production	2812	4351	1539	54.7
6312Z	Internet portals	2244	2861	617	27.5
6209Z	Other IT services	2098	2692	594	28.3
6311Z	Data processing & hosting	1348	1501	153	11.4
6110Z	Wired telecommunications	774	759	-15	-1.9
3250Z	Medical device manufacturing	715	745	30	4.2
7219Z	R&D in science/tech	653	1010	357	54.7
6203Z	IT infrastructure management	578	1259	681	117.8

From the table above we observe that in both 2012 and 2021 we had the same firms in the top 10. When it comes to the order, specializations from the top 6 stayed the same.

Looking at the % growth (2012 as the base level), we see that the biggest rise is registered for Software development firms (+ 130.8%) and secondly for the IT infrastructure management firms (+ 117.8%). This means that in these specializations, number of firms more than doubled when comparing to the state in 2012.

The only observed decline is for the Wired telecommunication companies.

2.3.1 Distribution comparison

To analyze firms with most popular specializations, we will make subsets and save them as new datasets. At the same time this will help us minimize number of points to analyze further on.

top5.reg.12.waw <- reg.12.waw %>% 
  filter(PKD7 %in% c("6201Z", "6202Z", "5911Z", "6312Z", "6209Z")) %>% 
  mutate(PKD7 = factor(PKD7, levels = c("6201Z", "6202Z", "5911Z", "6312Z", "6209Z"))) %>% 
  glimpse() # 19 236 firms

## Rows: 19,236
## Columns: 7
## $ SEK_PKD7 <chr> "J", "J", "J", "J", "J", "J", "J", "J", "J", "J", "J", "J", "…
## $ PKD7     <fct> 6209Z, 6312Z, 6202Z, 6202Z, 6201Z, 6312Z, 6201Z, 5911Z, 6202Z…
## $ TECH     <fct> high-tech know-intens serv, high-tech know-intens serv, high-…
## $ IF_TECH  <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ SURVIVOR <dbl> 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1…
## $ LPRAC    <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1…
## $ geometry <POINT [°]> POINT (20.90713 52.21079), POINT (21.02519 52.17849), P…

top5.reg.21.waw <- reg.21.waw %>% 
  filter(PKD7 %in% c("6201Z", "6202Z", "5911Z", "6312Z", "6209Z")) %>% 
  mutate(PKD7 = factor(PKD7, levels = c("6201Z", "6202Z", "5911Z", "6312Z", "6209Z"))) %>% 
  glimpse() # 36 360 firms

## Rows: 36,360
## Columns: 7
## $ SEK_PKD7 <chr> "J", "J", "J", "J", "J", "J", "J", "J", "J", "J", "J", "J", "…
## $ PKD7     <fct> 6312Z, 6312Z, 6209Z, 6312Z, 6201Z, 6202Z, 6312Z, 6202Z, 6312Z…
## $ TECH     <fct> high-tech know-intens serv, high-tech know-intens serv, high-…
## $ IF_TECH  <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ SURVIVOR <dbl> 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1…
## $ LPRAC    <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ geometry <POINT [°]> POINT (21.01173 52.22983), POINT (21.01173 52.22983), P…

labels_top5_2012 <- c(
  "6201Z" = "Software\ndevelopment\n7 964 firms",
  "6202Z" = "IT consulting\n4 118 firms",
  "5911Z" = "Fim and video\nproduction\n2 812 firms",
  "6312Z" = "Internet\nportals\n2 244 firms",
  "6209Z" = "Other\nIT services\n2 098 firms")

labels_top5_2021 <- c(
  "6201Z" = "Software\ndevelopment\n18 381 firms",
  "6202Z" = "IT consulting\n8 075 firms",
  "5911Z" = "Fim and video\nproduction\n4 351 firms",
  "6312Z" = "Internet\nportals\n2 861 firms",
  "6209Z" = "Other\nIT services\n2 692 firms")

top_5_2012 <- ggplot() +
  geom_sf(data = warszawa_sf, fill = "white") +
  geom_sf(data = top5.reg.12.waw, color = "#3B528BFF", size = 0.1, alpha = 0.1) +
  facet_wrap(~ PKD7,
             labeller = as_labeller(labels_top5_2012),
             nrow = 5) +
  theme_minimal() +
  labs(title = "2012") +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        legend.position = "none",
        plot.title = element_text(hjust = 0.5, size = 12, face = 'bold')) +
  coord_sf(datum = NA)

top_5_2021 <- ggplot() +
  geom_sf(data = warszawa_sf, fill = "white") +
  geom_sf(data = top5.reg.21.waw, color = "#3B528BFF", size = 0.1, alpha = 0.1) +
  facet_wrap(~ PKD7,
             labeller = as_labeller(labels_top5_2021),
             nrow = 5) +
  theme_minimal() +
  labs(title = "2021") +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        legend.position = "none",
        plot.title = element_text(hjust = 0.5, size = 12, face = 'bold')) +
  coord_sf(datum = NA)

grid.arrange(top_5_2012, top_5_2021,
             ncol = 2,
             top = textGrob("Top 5\ntechnological firm types\nby frequency",
                            gp = gpar(fontsize = 15, font = 3)))

3 Point pattern analysis

In this step of the analysis we will look at the data using some methods designed specifically for point patterns.

3.1 Data transofrmations

First off, let’s transform Warsaw boundary to the planar projection and prepare the window.

# Warszawa polygon in planar projection
warszawa <- st_transform(warszawa_sf, 3857) 

# Making the window for the point pattern
W <- as.owin(warszawa) 
W

## window: polygonal boundary
## enclosing rectangle: [2321199.3, 2367893.7] x [6817837, 6866968] units

Secondly let’s transform our firm (point) data to the point pattern type. To do this, we have to also transform it to the planar projection.

# Tranforming points to the planar projection 
top5.spec.tech.12 <- st_transform(top5.reg.12.waw, 3857) # 2012 data
top5.spec.tech.21 <- st_transform(top5.reg.21.waw, 3857) # 2021 data

3.1.1 Unmarked point pattern

To prepare the unmarked point pattern we use the coordinates for the firms and the window for Warsaw (W). Then we perform some minimal reshuffling using jitter() to deal with the problem of duplicated points (different firms with the same localization).

top5.spec.tech.12.ppp.um <- ppp(x = st_coordinates(top5.spec.tech.12)[,1], 
                                y = st_coordinates(top5.spec.tech.12)[,2], 
                                window = W) 
top5.spec.tech.12.ppp.um <- rjitter(top5.spec.tech.12.ppp.um, 0.02)

top5.spec.tech.21.ppp.um <- ppp(x = st_coordinates(top5.spec.tech.21)[,1], 
                                y = st_coordinates(top5.spec.tech.21)[,2], 
                                window = W) 
top5.spec.tech.21.ppp.um <- rjitter(top5.spec.tech.21.ppp.um, 0.02)

summary(top5.spec.tech.21.ppp.um)

## Planar point pattern:  36360 points
## Average intensity 2.641824e-05 points per square unit
## 
## Coordinates are given to 1 decimal place
## i.e. rounded to the nearest multiple of 0.1 units
## 
## Window: polygonal boundary
## single connected closed polygon with 4259 vertices
## enclosing rectangle: [2321199.3, 2367893.7] x [6817837, 6866968] units
##                      (46690 x 49130 units)
## Window area = 1376320000 square units
## Fraction of frame area: 0.6

3.1.2 Marked point pattern

To prepare the marked point pattern we again use the coordinates for the firms and the window for Warsaw (W), but also add the marks, which in our case will be the specializations indicated by the PKD7 codes. Similarly, we use jitter() to deal with firms with the same localisations.

top5.spec.tech.12.ppp.pkd7 <- ppp(x = st_coordinates(top5.spec.tech.12)[,1], 
                                y = st_coordinates(top5.spec.tech.12)[,2], 
                                window = W,
                               marks = top5.spec.tech.12$PKD7) 
top5.spec.tech.12.ppp.pkd7 <- rjitter(top5.spec.tech.12.ppp.pkd7, 0.02)

top5.spec.tech.21.ppp.pkd7 <- ppp(x = st_coordinates(top5.spec.tech.21)[,1], 
                                y = st_coordinates(top5.spec.tech.21)[,2], 
                                window = W,
                               marks = top5.spec.tech.21$PKD7) 
top5.spec.tech.21.ppp.pkd7 <- rjitter(top5.spec.tech.21.ppp.pkd7, 0.02)

summary(top5.spec.tech.21.ppp.pkd7)

## Marked planar point pattern:  36360 points
## Average intensity 2.641824e-05 points per square unit
## 
## Coordinates are given to 1 decimal place
## i.e. rounded to the nearest multiple of 0.1 units
## 
## Multitype:
##       frequency proportion    intensity
## 6201Z     18381 0.50552810 1.335516e-05
## 6202Z      8075 0.22208470 5.867086e-06
## 5911Z      4351 0.11966450 3.161324e-06
## 6312Z      2861 0.07868537 2.078729e-06
## 6209Z      2692 0.07403740 1.955938e-06
## 
## Window: polygonal boundary
## single connected closed polygon with 4259 vertices
## enclosing rectangle: [2321199.3, 2367893.7] x [6817837, 6866968] units
##                      (46690 x 49130 units)
## Window area = 1376320000 square units
## Fraction of frame area: 0.6

Let’s visualize the marked point patterns.

par(mfrow = c(1,2))
plot(top5.spec.tech.12.ppp.pkd7, pch = 20, cex = 0.6,
     cols= c("red", "green", "blue", "pink", "orange"),
     main = "2012")
plot(top5.spec.tech.21.ppp.pkd7, pch = 20, cex = 0.6,
     cols= c("red", "green", "blue", "pink", "orange"),
     main = "2021")

par(mfrow = c(1,1))

Next on we will, we will rescale the data so it fits the real life dimensions. Warsaw has the surface of 517 sq km, so that is what we use to calculate the conversion factor and later on rescale the points using the rescale() function.

area_km2 <- 517 # area of warsaw
area_units <- area.owin(W) # area in the units
rsc <- sqrt(area_units/area_km2) # conversion factor

# rescalling
top5.spec.tech.12.ppp.um <- rescale(top5.spec.tech.12.ppp.um, rsc, "km")
summary(top5.spec.tech.12.ppp.um) # ok

## Planar point pattern:  19236 points
## Average intensity 37.20696 points per square km
## 
## Coordinates are given to 4 decimal places
## 
## Window: polygonal boundary
## single connected closed polygon with 4259 vertices
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
##                      (28.62 x 30.11 km)
## Window area = 517 square km
## Unit of length: 1 km
## Fraction of frame area: 0.6

top5.spec.tech.21.ppp.um <- rescale(top5.spec.tech.21.ppp.um, rsc, "km")
summary(top5.spec.tech.21.ppp.um)

## Planar point pattern:  36360 points
## Average intensity 70.32882 points per square km
## 
## Coordinates are given to 4 decimal places
## 
## Window: polygonal boundary
## single connected closed polygon with 4259 vertices
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
##                      (28.62 x 30.11 km)
## Window area = 517 square km
## Unit of length: 1 km
## Fraction of frame area: 0.6

top5.spec.tech.12.ppp.pkd7 <- rescale(top5.spec.tech.12.ppp.pkd7, rsc, "km")
summary(top5.spec.tech.12.ppp.pkd7)

## Marked planar point pattern:  19236 points
## Average intensity 37.20696 points per square km
## 
## Coordinates are given to 4 decimal places
## 
## Multitype:
##       frequency proportion intensity
## 6201Z      7964  0.4140154 15.404260
## 6202Z      4118  0.2140778  7.965184
## 5911Z      2812  0.1461842  5.439072
## 6312Z      2244  0.1166563  4.340426
## 6209Z      2098  0.1090663  4.058027
## 
## Window: polygonal boundary
## single connected closed polygon with 4259 vertices
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
##                      (28.62 x 30.11 km)
## Window area = 517 square km
## Unit of length: 1 km
## Fraction of frame area: 0.6

top5.spec.tech.21.ppp.pkd7 <- rescale(top5.spec.tech.21.ppp.pkd7, rsc, "km")
summary(top5.spec.tech.21.ppp.pkd7)

## Marked planar point pattern:  36360 points
## Average intensity 70.32882 points per square km
## 
## Coordinates are given to 4 decimal places
## 
## Multitype:
##       frequency proportion intensity
## 6201Z     18381 0.50552810 35.553190
## 6202Z      8075 0.22208470 15.618960
## 5911Z      4351 0.11966450  8.415861
## 6312Z      2861 0.07868537  5.533849
## 6209Z      2692 0.07403740  5.206963
## 
## Window: polygonal boundary
## single connected closed polygon with 4259 vertices
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
##                      (28.62 x 30.11 km)
## Window area = 517 square km
## Unit of length: 1 km
## Fraction of frame area: 0.6

3.2 Analysis

3.2.1 Distances to the nearest neighbours

With the use of the distfun() function we can analyze the distances to the nearest neighbors and look how does the distribution of these distances looks like on the map. We will look at the distances to the 5th nearest neighbour, to have a wider picture.

# distances to the neighbours
dist.fun.12 <- distfun(top5.spec.tech.12.ppp.um, k = 5)
dist.fun.21 <- distfun(top5.spec.tech.21.ppp.um, k = 5)

par(mfrow = c(1,2))
plot(dist.fun.12, main = "Distances in 2012\nk = 5")
plot(dist.fun.21, main = "Distances in 2021\nk = 5")

par(mfrow = c(1,1))

At the first look, the distribution looks very similar, but looking closer we see that the scale changed - with more firms in 2021 the maximum distances to the nearest neighbors possibly shrinked. Let’s look at some statistics.

summary(dist.fun.12)

## Distance function for point pattern
## Distance to 5th nearest point will be computed
## defined in a polygonal window inside the frame [1422.649, 1451.267] x [4178.61, 
## 4208.722] km
## 
## Distance function values:
##  range = [0.001351011, 3.037102]
##  mean = 0.5316533

summary(dist.fun.21)

## Distance function for point pattern
## Distance to 5th nearest point will be computed
## defined in a polygonal window inside the frame [1422.649, 1451.267] x [4178.61, 
## 4208.722] km
## 
## Distance function values:
##  range = [0.003327948, 2.747697]
##  mean = 0.4247781

So we can observe that maximum distance is smaller, same with the mean distance (from 0.53 km in 2012 to 0.42 km in 2021).

3.2.2 Intensity of points and clustering

Based on the previous findings, we know that the number of fimrs increased over the years. Let’s investigate how the intensity and number of firms in specific areas changed over the years.

intensity(top5.spec.tech.12.ppp.um)

## [1] 37.20696

intensity(top5.spec.tech.21.ppp.um)

## [1] 70.32882

Thanks to the intensity() function we can conclude that the number of firms per 1 km² went up from ~37 to ~70 firms!

How does the situation look for different parts of the city? At first, let’s see how the numbers differ over the years using the default settings in the quadratcount() function (division to 5x5 grid).

qdr_2012 <- quadratcount(top5.spec.tech.12.ppp.um)
qdr_2021 <- quadratcount(top5.spec.tech.21.ppp.um)
qdr_diff <- qdr_2021 - qdr_2012

par(mfrow = c(3,2))
plot(qdr_2012, main = "2012 (n)")
plot(intensity(qdr_2012, image=TRUE), main = "2012 (intensity)")
plot(qdr_2021, main = "2021 (n)")
plot(intensity(qdr_2021, image=TRUE), main = "2021 (intensity)")
plot(qdr_diff, main = "Difference (n)")
plot(intensity(qdr_diff, image=TRUE), main = "Difference (intensity)")
par(mfrow = c(1,1))

In the first column we can observe the number of firms in each tile in 2012 and 2021, as well as the difference between the years. In the right column, we observe the intensity, which is the number of firms per km² in 2012 and 2021 as well as the difference in the intensity between the years.

We can observe that the central part of the city was the one with the highest concetration of technological firms, and over the years that positioned became even stronger. When we look at the differences in number of firms, one central tile is definitely ahead of others.

How would the intensity look like when we would change the number of tiles?

qdr_2012_8 <- quadratcount(top5.spec.tech.12.ppp.um, nx = 8)
qdr_2021_8 <- quadratcount(top5.spec.tech.21.ppp.um, nx = 8)

qdr_2012_12 <- quadratcount(top5.spec.tech.12.ppp.um, nx = 12)
qdr_2021_12 <- quadratcount(top5.spec.tech.21.ppp.um, nx = 12)

qdr_2012_20 <- quadratcount(top5.spec.tech.12.ppp.um, nx = 20)
qdr_2021_20 <- quadratcount(top5.spec.tech.21.ppp.um, nx = 20)

par(mfrow = c(3,2))

plot(intensity(qdr_2012_8, image=TRUE), main = "2012 (8x8)")
plot(intensity(qdr_2021_8, image=TRUE), main = "2021 (8x8)")

plot(intensity(qdr_2012_12, image=TRUE), main = "2012 (12x12)")
plot(intensity(qdr_2021_12, image=TRUE), main = "2021 (12x12)")

plot(intensity(qdr_2012_20, image=TRUE), main = "2012 (20x20)")
plot(intensity(qdr_2021_20, image=TRUE), main = "2021 (20x20)")

par(mfrow = c(1,1))

The different intensities show us how there is a strong centralization when it comes to the location of the technological firms and that with a high likelihood, there is a clustering pattern there. Let’s perform a test to find out, usingn the data from 2012 for the 5x5 quadrants.

qdr.test.12 <- quadrat.test(top5.spec.tech.12.ppp.um,
                            alternative = "clustered")
qdr.test.12

## 
##  Chi-squared test of CSR using quadrat counts
## 
## data:  top5.spec.tech.12.ppp.um
## X2 = 15791, df = 21, p-value < 2.2e-16
## alternative hypothesis: clustered
## 
## Quadrats: 22 tiles (irregular windows)

By running a chi-squared test of Complete Spatial Randomness (CSR) using quadrant counts, we observe strong statistical evidence (p-value close to 0) to reject the null hypothesis of spatial randomness for the sake of the alternative, which is the clustering. So even at the lowest level of granulation (lowest amount of tiles) we observe strong clustering.

To explore how clustering behaves across different spatial scales, we can proceed with Ripley’s K-function analysis. The goal is to see how it behaves on different distance but also for different years.

rvals <- seq(0, 3, by = 0.05) # for same length of functions
K.12.um <- Kest(top5.spec.tech.12.ppp.um, r = rvals)
K.21.um <- Kest(top5.spec.tech.21.ppp.um, r = rvals)

plot(K.12.um, 
     col = "black", 
     lwd = 2, 
     main = "Ripley's K-function: 2012 vs 2021")

lines(K.21.um$r, K.21.um$theo, 
      col = "blue", 
      lty = 2)

lines(K.21.um$r, K.21.um$border, 
      col = "blue", 
      lwd = 2)

legend("topleft",
       legend = c("2012", "2021"),
       col = c("black", "blue"),
       lty = c(1, 1),
       lwd = 2,
       bty = "o",
       inset = c(0.05, 0.35))

In both years, the observed K-functions lie above the dashed line representing complete spatial randomness (CSR), indicating significant clustering. What’s more, the 2021 curve (blue) is consistently higher than the 2012 curve (black), suggesting that the spatial concentration of high-tech firms intensified over time.

Next on we can look how the clustering differs for different firm specializations. To do this we will split our marked point pattern from 2012 and analyze.

ppp.split.12 <- split(top5.spec.tech.12.ppp.pkd7)
ppp.split.12

## Point pattern split by factor 
## 
## 6201Z:
## Planar point pattern: 7964 points
## window: polygonal boundary
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
## 
## 6202Z:
## Planar point pattern: 4118 points
## window: polygonal boundary
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
## 
## 5911Z:
## Planar point pattern: 2812 points
## window: polygonal boundary
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
## 
## 6312Z:
## Planar point pattern: 2244 points
## window: polygonal boundary
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
## 
## 6209Z:
## Planar point pattern: 2098 points
## window: polygonal boundary
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km

K.6201Z <- Kest(ppp.split.12$'6201Z', r = rvals)
K.6202Z <- Kest(ppp.split.12$'6202Z', r = rvals)
K.5911Z <- Kest(ppp.split.12$'5911Z', r = rvals)
K.6312Z <- Kest(ppp.split.12$'6312Z', r = rvals)
K.6209Z <- Kest(ppp.split.12$'6209Z', r = rvals)

# K.6201Z
plot(K.6201Z, 
     col = "black", 
     lwd = 2, 
     main = "Ripley's K-function: tech specializations in 2012",
     ylim = c(0, 100))

# K.6202Z
lines(K.6202Z$r, K.6202Z$theo, 
      col = "blue", 
      lty = 2)

lines(K.6202Z$r, K.6202Z$border, 
      col = "blue", 
      lwd = 2)

# K.5911Z
lines(K.5911Z$r, K.5911Z$theo, 
      col = "red", 
      lty = 2)

lines(K.5911Z$r, K.5911Z$border, 
      col = "red", 
      lwd = 2)

# K.6312Z
lines(K.6312Z$r, K.6312Z$theo, 
      col = "pink", 
      lty = 2)

lines(K.6312Z$r, K.6312Z$border, 
      col = "pink", 
      lwd = 2)

# K.6209Z
lines(K.6209Z$r, K.6209Z$theo, 
      col = "orange", 
      lty = 2)

lines(K.6209Z$r, K.6209Z$border, 
      col = "orange", 
      lwd = 2)

# Legend
legend("topleft",
       legend = c("Software development",
                  "IT consulting",
                  "Film and video production",
                  "Internet portals",
                  "Other IT services"),
       col = c("black", "blue", "red", "pink", "orange"),
       lty = 1,
       lwd = 2,
       bty = "o",
       cex = 0.75,
       inset = c(0.05, 0.35))

The Ripley’s K-function plot illustrates the spatial distribution of high-tech firms across five major specializations in 2012. In all cases, the observed K-functions lie above the theoretical Poisson baseline (dashed line), indicating significant spatial clustering rather than random dispersion. Among the sectors, Internet portals (pink) show the strongest clustering across all spatial scales, followed by Film and video production (red) and other IT services (orange). In contrast, software development (black) and IT consulting (blue) exhibit comparatively lower, yet still evident, clustering patterns.

To assess how these spatial patterns have evolved over time, we now compare them to the clustering behavior observed in 2021.

ppp.split.21 <- split(top5.spec.tech.21.ppp.pkd7)
ppp.split.21

## Point pattern split by factor 
## 
## 6201Z:
## Planar point pattern: 18381 points
## window: polygonal boundary
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
## 
## 6202Z:
## Planar point pattern: 8075 points
## window: polygonal boundary
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
## 
## 5911Z:
## Planar point pattern: 4351 points
## window: polygonal boundary
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
## 
## 6312Z:
## Planar point pattern: 2861 points
## window: polygonal boundary
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
## 
## 6209Z:
## Planar point pattern: 2692 points
## window: polygonal boundary
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km

K.6201Z <- Kest(ppp.split.21$'6201Z', r = rvals)
K.6202Z <- Kest(ppp.split.21$'6202Z', r = rvals)
K.5911Z <- Kest(ppp.split.21$'5911Z', r = rvals)
K.6312Z <- Kest(ppp.split.21$'6312Z', r = rvals)
K.6209Z <- Kest(ppp.split.21$'6209Z', r = rvals)

K.6201Z <- Kest(ppp.split.12$'6201Z', r = rvals)
K.6202Z <- Kest(ppp.split.12$'6202Z', r = rvals)
K.5911Z <- Kest(ppp.split.12$'5911Z', r = rvals)
K.6312Z <- Kest(ppp.split.12$'6312Z', r = rvals)
K.6209Z <- Kest(ppp.split.12$'6209Z', r = rvals)

# K.6201Z
plot(K.6201Z, 
     col = "black", 
     lwd = 2, 
     main = "Ripley's K-function: tech specializations in 2021",
     ylim = c(0, 100))

# K.6202Z
lines(K.6202Z$r, K.6202Z$theo, 
      col = "blue", 
      lty = 2)

lines(K.6202Z$r, K.6202Z$border, 
      col = "blue", 
      lwd = 2)

# K.5911Z
lines(K.5911Z$r, K.5911Z$theo, 
      col = "red", 
      lty = 2)

lines(K.5911Z$r, K.5911Z$border, 
      col = "red", 
      lwd = 2)

# K.6312Z
lines(K.6312Z$r, K.6312Z$theo, 
      col = "pink", 
      lty = 2)

lines(K.6312Z$r, K.6312Z$border, 
      col = "pink", 
      lwd = 2)

# K.6209Z
lines(K.6209Z$r, K.6209Z$theo, 
      col = "orange", 
      lty = 2)

lines(K.6209Z$r, K.6209Z$border, 
      col = "orange", 
      lwd = 2)

# Legend
legend("topleft",
       legend = c("Software development",
                  "IT consulting",
                  "Film and video production",
                  "Internet portals",
                  "Other IT services"),
       col = c("black", "blue", "red", "pink", "orange"),
       lty = 1,
       lwd = 2,
       bty = "o",
       cex = 0.75,
       inset = c(0.05, 0.35))

Compared to 2012, the 2021 Ripley’s K-function plot shows less variation in clustering intensity across the five main tech specializations. While all sectors still exhibit clear clustering (curves remain above the CSR baseline), the differences between them are now narrower. Notably, Internet portals (pink) stand out as the only specialization with significantly higher clustering, especially at larger spatial scales. Other sectors, such as software development, IT consulting, and Film production, show more comparable and moderate clustering patterns, suggesting a possible trend toward a more spatially balanced distribution of firm types over time.

3.2.3 Poisson point process models for 2012 and 2021

To explore how sectoral specialization affects spatial intensity, we now fit a Poisson point process model using the ppm() function. We will start with a model for 2012.

In addition to firm specialization (represented by PKD7 codes), we include distance to Rondo Daszyńskiego — a central location in Warsaw that today serves as the core of the city’s business district. This allows us to test whether firms tend to locate closer to this emerging commercial hub and whether this tendency differs by sector.

Let’s calculate the distances to the Rondo Daszyńskiego for the city of Warsaw.

# Coordinates for the city
rondo.daszynskiego.sf <- st_sfc(st_point(c(20.9836657, 52.2305718)), crs = 4326)
rondo.daszynskiego.sf <- st_transform(rondo.daszynskiego.sf, crs = 3857)

rondo.coords <- st_coordinates(rondo.daszynskiego.sf)

# Point pattern
rondo.ppp <- ppp(x = rondo.coords[1], y = rondo.coords[2], window = W)

# Rescalling
rondo.ppp.km <- rescale(rondo.ppp, rsc, "km")

# Calculating distances
dist.rondo <- distfun(rondo.ppp.km)
plot(dist.rondo, main = "Distance to Rondo Daszyńskiego (in km)")

Above we see a map with calculated distances for the city of Warsaw. Now we transform it to pixel image to be able to use it in our model.

dist.im.rondo <- as.im(dist.rondo) # As pixel image
plot(dist.im.rondo, main = "Distance to Rondo Daszyńskiego (pixel image)")

Model for 2012

model.pkd7.12 <- ppm(top5.spec.tech.12.ppp.pkd7 ~ marks + dist.im.rondo)
summary(model.pkd7.12)

## Point process model
## Fitted to data: top5.spec.tech.12.ppp.pkd7
## Fitting method: maximum likelihood (Berman-Turner approximation)
## Model was fitted using glm()
## Algorithm converged
## Call:
## ppm.formula(Q = top5.spec.tech.12.ppp.pkd7 ~ marks + dist.im.rondo)
## Edge correction: "border"
##  [border correction distance r = 0 ]
## --------------------------------------------------------------------------------
## Quadrature scheme (Berman-Turner) = data + dummy + weights
## 
## Data pattern:
## Marked planar point pattern:  19236 points
## Average intensity 37.2 points per square km
## Multitype:
##       frequency proportion intensity
## 6201Z      7960      0.414     15.40
## 6202Z      4120      0.214      7.97
## 5911Z      2810      0.146      5.44
## 6312Z      2240      0.117      4.34
## 6209Z      2100      0.109      4.06
## 
## Window: polygonal boundary
## single connected closed polygon with 4259 vertices
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
##                      (28.62 x 30.11 km)
## Window area = 517 square km
## Unit of length: 1 km
## Fraction of frame area: 0.6
## 
## Dummy quadrature points:
##      280 x 280 grid of dummy points, plus 4 corner points
##      dummy spacing: 0.1022097 x 0.1075428 km
## 
## Original dummy parameters: =
## Marked planar point pattern:  312084 points
## Average intensity 604 points per square km
## Multitype:
##       frequency proportion intensity
## 6201Z     58300      0.187       113
## 6202Z     62100      0.199       120
## 5911Z     63500      0.203       123
## 6312Z     64000      0.205       124
## 6209Z     64200      0.206       124
## 
## Window: polygonal boundary
## single connected closed polygon with 4259 vertices
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
##                      (28.62 x 30.11 km)
## Window area = 517 square km
## Unit of length: 1 km
## Fraction of frame area: 0.6
## Quadrature weights:
##      (counting weights based on 280 x 280 array of rectangular tiles)
## All weights:
##  range: [7.58e-05, 0.011]    total: 2580
## Weights on data points:
##  range: [7.58e-05, 0.0055]   total: 48.5
## Weights on dummy points:
##  range: [7.58e-05, 0.011]    total: 2530
## --------------------------------------------------------------------------------
## FITTED :
## 
## Nonstationary multitype Poisson process
## Possible marks:
## 6201Z 6202Z 5911Z 6312Z 6209Z
## ---- Intensity: ----
## 
## Log intensity: ~marks + dist.im.rondo
## Model depends on external covariate 'dist.im.rondo'
## Covariates provided:
##  dist.im.rondo: im
## 
## Fitted trend coefficients:
##   (Intercept)    marks6202Z    marks5911Z    marks6312Z    marks6209Z 
##     4.4964635    -0.6595638    -1.0410354    -1.2666714    -1.3339469 
## dist.im.rondo 
##    -0.2263093 
## 
##                 Estimate        S.E.    CI95.lo    CI95.hi Ztest       Zval
## (Intercept)    4.4964635 0.016790674  4.4635544  4.5293726   ***  267.79530
## marks6202Z    -0.6595638 0.019193785 -0.6971829 -0.6219447   ***  -34.36340
## marks5911Z    -1.0410354 0.021935903 -1.0840290 -0.9980418   ***  -47.45806
## marks6312Z    -1.2666714 0.023899746 -1.3135140 -1.2198288   ***  -52.99937
## marks6209Z    -1.3339469 0.024539957 -1.3820443 -1.2858494   ***  -54.35816
## dist.im.rondo -0.2263093 0.002047842 -0.2303230 -0.2222956   *** -110.51113
## 
## ----------- gory details -----
## 
## Fitted regular parameters (theta):
##   (Intercept)    marks6202Z    marks5911Z    marks6312Z    marks6209Z 
##     4.4964635    -0.6595638    -1.0410354    -1.2666714    -1.3339469 
## dist.im.rondo 
##    -0.2263093 
## 
## Fitted exp(theta):
##   (Intercept)    marks6202Z    marks5911Z    marks6312Z    marks6209Z 
##    89.6993502     0.5170768     0.3530889     0.2817680     0.2634355 
## dist.im.rondo 
##     0.7974714

The fitted model for 2012 confirms that both sector type and distance to Rondo Daszyńskiego significantly influence firm intensity. Software development firms (6201Z) serve as the reference group, with the highest baseline intensity (~90 firms/km²). All other specializations show significantly lower intensities:

IT consulting: about 52% intensity of Software development
Film and video production: about 35% intensity of Software development
Internet portals: about 28% intensity of Software development
Other IT services: about 26% intensity of Software development

Importantly, the negative coefficient for distance (-0.226) indicates that firm density decreases as distance from Rondo Daszyńskiego increases, suggesting a strong centralizing effect around this business hub. Taking the exponential of the coefficient gives a value of approximately 0.79, meaning that the intensity drops by about 21% for each 1 km increase in distance from the center.

We now apply the same modeling approach to the 2021 data to examine whether these spatial patterns and sectoral dynamics have changed over time.

Model for 2021

model.pkd7.21 <- ppm(top5.spec.tech.21.ppp.pkd7 ~ marks + dist.im.rondo)
summary(model.pkd7.21)

## Point process model
## Fitted to data: top5.spec.tech.21.ppp.pkd7
## Fitting method: maximum likelihood (Berman-Turner approximation)
## Model was fitted using glm()
## Algorithm converged
## Call:
## ppm.formula(Q = top5.spec.tech.21.ppp.pkd7 ~ marks + dist.im.rondo)
## Edge correction: "border"
##  [border correction distance r = 0 ]
## --------------------------------------------------------------------------------
## Quadrature scheme (Berman-Turner) = data + dummy + weights
## 
## Data pattern:
## Marked planar point pattern:  36360 points
## Average intensity 70.3 points per square km
## Multitype:
##       frequency proportion intensity
## 6201Z     18400     0.5060     35.60
## 6202Z      8080     0.2220     15.60
## 5911Z      4350     0.1200      8.42
## 6312Z      2860     0.0787      5.53
## 6209Z      2690     0.0740      5.21
## 
## Window: polygonal boundary
## single connected closed polygon with 4259 vertices
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
##                      (28.62 x 30.11 km)
## Window area = 517 square km
## Unit of length: 1 km
## Fraction of frame area: 0.6
## 
## Dummy quadrature points:
##      390 x 390 grid of dummy points, plus 4 corner points
##      dummy spacing: 0.07338131 x 0.07721020 km
## 
## Original dummy parameters: =
## Marked planar point pattern:  601685 points
## Average intensity 1160 points per square km
## Multitype:
##       frequency proportion intensity
## 6201Z    109000      0.182       211
## 6202Z    120000      0.199       231
## 5911Z    123000      0.205       238
## 6312Z    125000      0.207       241
## 6209Z    125000      0.208       242
## 
## Window: polygonal boundary
## single connected closed polygon with 4259 vertices
## enclosing rectangle: [1422.6487, 1451.2675] x [4178.61, 4208.722] km
##                      (28.62 x 30.11 km)
## Window area = 517 square km
## Unit of length: 1 km
## Fraction of frame area: 0.6
## Quadrature weights:
##      (counting weights based on 390 x 390 array of rectangular tiles)
## All weights:
##  range: [1.24e-05, 0.00567]  total: 2580
## Weights on data points:
##  range: [1.24e-05, 0.00283]  total: 38.9
## Weights on dummy points:
##  range: [1.24e-05, 0.00567]  total: 2540
## --------------------------------------------------------------------------------
## FITTED :
## 
## Nonstationary multitype Poisson process
## Possible marks:
## 6201Z 6202Z 5911Z 6312Z 6209Z
## ---- Intensity: ----
## 
## Log intensity: ~marks + dist.im.rondo
## Model depends on external covariate 'dist.im.rondo'
## Covariates provided:
##  dist.im.rondo: im
## 
## Fitted trend coefficients:
##   (Intercept)    marks6202Z    marks5911Z    marks6312Z    marks6209Z 
##     5.4699963    -0.8225447    -1.4409118    -1.8601462    -1.9210331 
## dist.im.rondo 
##    -0.2492899 
## 
##                 Estimate        S.E.    CI95.lo    CI95.hi Ztest       Zval
## (Intercept)    5.4699963 0.011604181  5.4472525  5.4927401   ***  471.38152
## marks6202Z    -0.8225447 0.013350768 -0.8487117 -0.7963776   ***  -61.61029
## marks5911Z    -1.4409118 0.016859307 -1.4739555 -1.4078682   ***  -85.46685
## marks6312Z    -1.8601462 0.020098063 -1.8995377 -1.8207548   ***  -92.55351
## marks6209Z    -1.9210331 0.020636740 -1.9614804 -1.8805858   ***  -93.08801
## dist.im.rondo -0.2492899 0.001535435 -0.2522993 -0.2462805   *** -162.35783
## 
## ----------- gory details -----
## 
## Fitted regular parameters (theta):
##   (Intercept)    marks6202Z    marks5911Z    marks6312Z    marks6209Z 
##     5.4699963    -0.8225447    -1.4409118    -1.8601462    -1.9210331 
## dist.im.rondo 
##    -0.2492899 
## 
## Fitted exp(theta):
##   (Intercept)    marks6202Z    marks5911Z    marks6312Z    marks6209Z 
##   237.4593156     0.4393123     0.2367118     0.1556499     0.1464556 
## dist.im.rondo 
##     0.7793540

The 2021 model confirms that both sector type and distance to Rondo Daszyńskiego continue to significantly influence the spatial intensity of high-tech firms. The baseline intensity (for software development firms at the city center) increased to approximately 237 firms per km², nearly 2.6 times higher than in 2012. As in the earlier model, all other specializations show significantly lower densities.

The distance coefficient remains negative (−0.249), indicating a continued decline in firm intensity with distance — about 22% decrease per kilometer from the city center, slightly stronger than the 20% drop observed in 2012.

Overall, this suggests that central clustering around Rondo Daszyńskiego has persisted or even intensified, and that sectoral differences in spatial distribution have become more pronounced.

4 Summary

The analysis confirms strong spatial clustering of high-tech firms in both 2012 and 2021, with software development emerging as the dominant sector in terms of spatial intensity. Firms consistently show a preference for locations closer to Rondo Daszyńskiego, with model results indicating a 20–22% drop in intensity for each kilometer away. Over time, spatial inequalities between specialization appear to have widened, suggesting an increasing centralization of activity around core business zones.

Spatial patterns of technological specialization in Warsaw

Monika Kot

June 2025