Extract data from the POLLnet database

Giovanni Bonafè

2019-09-16

The R package pollnet gives access to pollen and spore data collected by the Italian monitoring network. More information about the network here.

In an R session, you can install the package as follows:

# install.packages("devtools")  # uncomment if devtools is not installed yet
devtools::install_github("jobonaf/pollnet")

Once the package is installed and loaded, we can list the available functions:

library(pollnet)
lsf.str("package:pollnet")
#> plot_pollnet_particles : function (p = pollnet_particles(), plot_others = F)  
#> pollnet_data : function (part_ids, from, to, ...)  
#> pollnet_data_station : function (part_id = NULL, from = NULL, to = NULL, stat_id, verbose = F)  
#> pollnet_overview : function (from = "2019-03-01", to = "2019-06-01")  
#> pollnet_particles : function ()  
#> pollnet_regions : function ()  
#> pollnet_stations : function (regi_id = NULL)

We can get a list of the available particles with the function pollnet_particles:

pp <- pollnet_particles()
knitr::kable(pp[,c("PART_ID", "PART_NAME_L")])
PART_ID PART_NAME_L
1 Polline
1320 Aceraceae
1372 Acer negundo
1370 Acer saccharinum
1371 Altri
1327 Amaranthaceae
1321 Anacardiaceae
1322 Araliaceae
1323 Betulaceae
1374 Alnus
1373 Betula
1324 Buxaceae
1325 Cannabaceae
1437 Cannabis
1375 Humulus
1326 Caprifoliaceae
1443 Sambucus
1328 Compositae
1377 Altri
1378 Ambrosia
1379 Artemisia
1329 Corylaceae
1381 Carpinus
1380 Corylus
1382 Ostrya carpinifolia
1330 Cupressaceae/Taxaceae
1331 Cyperaceae
1332 Ericaceae
1333 Euphorbiaceae
1334 Fabaceae
1335 Fagaceae
1386 Castanea sativa
1385 Fagus sylvatica
1384 Quercus
1336 Ginkgoaceae
1352 Gramineae
1338 Hippocastanaceae
1339 Juglandaceae
1340 Juncaceae
1341 Lauraceae
1342 Mimosaceae
1343 Moraceae
1444 Broussonetia
1445 Morus
1344 Myrtaceae
1346 Oleaceae
1393 Altri
1417 Fraxinus
1395 Fraxinus excelsior
1392 Fraxinus ornus
1394 Ligustrum
1391 Olea
1347 Palmae
1348 Papaveraceae
1349 Pinaceae
1406 Abies
1397 Altri
1409 Cedrus
1396 Larix
1408 Picea
1407 Pinus
1350 Plantaginaceae
1351 Platanaceae
1353 Polygonaceae
1354 Ranunculaceae
1355 Rosaceae
1356 Rubiaceae
1357 Salicaceae
1401 Populus
1400 Salix
1422 Saxifragaceae
1358 Simaroubaceae
1359 Tiliaceae
1360 Ulmaceae
1438 Celtis
1442 Ulmus
1361 Umbelliferae
1362 Urticaceae
1421 Parietaria/altre Urticaceae
1420 Urtica membranacea
1363 Vitaceae
1405 Parthenocissus
1404 Vitis
1436 Altri pollini
1345 Pollini non identificati
2 Spore
1426 Agrocybe
1364 Alternaria
1427 Arthrinium
1423 Botrytis
1428 Chaetomium
1424 Cladosporium
1429 Curvularia
1430 Entomophthora
1365 Epicoccum
1425 Helminthosporium
1431 Oidium
1432 Periconia
1433 Peronospora
1366 Pithomyces
1434 Pleospora
1367 Polythrincium
1435 Puccinia
1368 Stemphylium
1369 Torula
3 Alghe
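
Since pollnet_data and pollnet_data_station take numeric particle codes, it can be handy to look a code up by name. The sketch below works on a few rows copied by hand from the table above rather than on a live call to pollnet_particles, so it runs offline. The helper name find_part_id is hypothetical and is not part of the pollnet package.

```r
# A few rows copied by hand from the table above (not a live query)
pp_demo <- data.frame(
  PART_ID     = c(1373, 1374, 1378, 1391, 1364),
  PART_NAME_L = c("Betula", "Alnus", "Ambrosia", "Olea", "Alternaria"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: case-insensitive exact match on the Latin name
find_part_id <- function(particles, name) {
  hit <- tolower(particles$PART_NAME_L) == tolower(name)
  particles$PART_ID[hit]
}

betula_id <- find_part_id(pp_demo, "betula")
print(betula_id)
```

Some names, such as Altri, occur under several families in the full table, so a lookup like this can return more than one code.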

Italian regions are identified with numeric codes:

pr <- pollnet_regions()
knitr::kable(pr)
REGI_ID  REGI_NAME_I            REGI_NAME_D
3        Alto Adige             Südtirol
10       Veneto                 Venezien
11       Emilia Romagna         Emilia Romagna
13       Abruzzo                Abruzzen
14       Trentino               Trient
15       Basilicata             Basilicata
17       Campania               Kampanien
18       Friuli Venezia Giulia  Friaul Julisch Venetien
19       Calabria               Kalabrien
20       Liguria                Ligurien
21       Piemonte               Piemont
22       Marche                 Marken
23       Molise                 Molise
24       Puglia                 Apulien
25       Sardegna               Sardinien
27       Toscana                Toskana
28       Umbria                 Umbrien
29       Valle d’Aosta          Aosta

The function pollnet_stations returns the station metadata. Passing regi_id restricts the result to a single region, here 27 (Toscana):

ps <- pollnet_stations(regi_id = 27)
knitr::kable(ps)
STAT_ID  STAT_CODE  STAT_NAME_I       STAT_NAME_D       REGI_ID  REGI_NAME_I  REGI_NAME_D  LATITUDE  LONGITUDE
69       FI1        Firenze           Firenze           27       Toscana      Toskana      43.78220  11.23237
70       LU1        Lido di Camaiore  Lido di Camaiore  27       Toscana      Toskana      43.91429  10.22404
157      GR1        Grosseto          Grosseto          27       Toscana      Toskana      42.76611  11.11562
162      AR1        Arezzo            Arezzo            27       Toscana      Toskana      43.46178  11.86554
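
The LATITUDE and LONGITUDE columns make it possible to pick the station closest to a place of interest, whose STAT_ID can then be passed to pollnet_data_station. The following is a minimal sketch under a few assumptions: the four Tuscan stations are copied by hand from the table above, the helper haversine_km is hypothetical (not part of pollnet), the Earth is treated as a sphere of radius 6371 km, and the query point is arbitrary.

```r
# The four Tuscan stations from the table above, copied by hand
ps_demo <- data.frame(
  STAT_ID     = c(69, 70, 157, 162),
  STAT_CODE   = c("FI1", "LU1", "GR1", "AR1"),
  STAT_NAME_I = c("Firenze", "Lido di Camaiore", "Grosseto", "Arezzo"),
  LATITUDE    = c(43.78220, 43.91429, 42.76611, 43.46178),
  LONGITUDE   = c(11.23237, 10.22404, 11.11562, 11.86554),
  stringsAsFactors = FALSE
)

# Hypothetical helper: great-circle distance in km (haversine formula)
haversine_km <- function(lat1, lon1, lat2, lon2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(sqrt(a))
}

# Arbitrary query point, chosen only for illustration
my_lat <- 43.72
my_lon <- 10.40

ps_demo$DIST_KM <- haversine_km(my_lat, my_lon,
                                ps_demo$LATITUDE, ps_demo$LONGITUDE)

# Row of the station with the smallest distance
nearest <- ps_demo[which.min(ps_demo$DIST_KM), ]
print(nearest[, c("STAT_ID", "STAT_CODE", "STAT_NAME_I")])
```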

We can extract the metadata of all the Italian stations as well. The packages leaflet and htmltools help us plot them on an interactive map.

ps <- pollnet_stations()
library(leaflet)
library(htmltools)
leaflet(ps, width = 600, height = 400) %>% 
  addTiles() %>%
  addCircleMarkers(label = ~htmlEscape(STAT_NAME_I), radius = 5)
#> Assuming "LONGITUDE" and "LATITUDE" are longitude and latitude, respectively

Now we can extract the measured data, for example for birch pollen (Betula, code 1373). The package dplyr helps us summarize the results: here we display the highest concentration recorded in each region during the selected period.

dat <- pollnet_data(part_ids = 1373, 
                    from = "2019-03-01", 
                    to = "2019-03-07")
library(dplyr)
dat %>% 
  group_by(REGI_NAME_I) %>% 
  arrange(desc(REMA_CONCENTRATION)) %>% 
  slice(1) %>% 
  select(REMA_DATE, REMA_CONCENTRATION, REGI_NAME_I, STAT_NAME_I) %>%
  knitr::kable()
REMA_DATE   REMA_CONCENTRATION  REGI_NAME_I            STAT_NAME_I
2019-03-06  0.85                Abruzzo                L’Aquila
2019-03-01  0.00                Alto Adige             Bolzano
2019-03-01  0.00                Campania               Napoli - Via don Bosco
2019-03-03  2.15                Emilia Romagna         San Giovanni in Persiceto
2019-03-03  1.69                Friuli Venezia Giulia  Tolmezzo
2019-03-03  31.39               Liguria                La Spezia - Arpal Dipartimento di La Spezia
2019-03-04  3.98                Marche                 Castel di Lama
2019-03-03  NA                  Molise                 Termoli
2019-03-01  3.15                Piemonte               Alessandria
2019-03-04  3.39                Puglia                 Brindisi
2019-03-01  0.00                Toscana                Firenze
2019-03-01  0.49                Valle d’Aosta          Aosta - St. Christophe
2019-03-05  1.94                Veneto                 Vicenza
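
The summary above relies on a simple idiom: sort by descending concentration, then keep the first row of each region. Because missing values are sorted last, a region shows NA only when none of its rows has a value, which is presumably what happens for Molise. The sketch below reproduces the same idiom in base R on made-up numbers; the region labels and values are invented for illustration and are not POLLnet measurements.

```r
# Made-up values for illustration only; these are not POLLnet measurements
demo <- data.frame(
  REGI_NAME_I        = c("A", "A", "A", "B", "B", "C"),
  REMA_DATE          = as.Date(c("2019-03-01", "2019-03-02", "2019-03-03",
                                 "2019-03-01", "2019-03-02", "2019-03-01")),
  REMA_CONCENTRATION = c(1.2, 3.4, NA, 0, 0, NA),
  stringsAsFactors = FALSE
)

# Sort by region, then by descending concentration; NAs go last in each region
ord <- order(demo$REGI_NAME_I, -demo$REMA_CONCENTRATION, na.last = TRUE)
sorted <- demo[ord, ]

# Keep the first row of each region, i.e. its highest non-missing value
# (or an NA row when the region has no valid value at all)
top <- sorted[!duplicated(sorted$REGI_NAME_I), ]
print(top)
```

When several days tie for the maximum, as in region B here, only the first of them is kept.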

Extracting a single station and a single pollen type is faster. The package ggplot2 helps us plot the data.

dat <- pollnet_data_station(part_id = 1373,
                            stat_id = 84,
                            from = "2011-01-01",
                            to   = "2018-12-31")
library(ggplot2)
ggplot(dat) +
  geom_line(aes(x=REMA_DATE, 
                y=REMA_CONCENTRATION, 
                col=STAT_NAME_I))