DV3 homework #1

Using the journal R Markdown theme 😎

No need to replicate the paragraphs with a vertical line on the left (like this paragraph), as these are instructions/hints.

Note that commands are NOT echoed in this example document, but you should always echo the R commands and keep all warnings etc.

Load the regions and zones collected by the Spare Cores project from the below URLs:

You can use the fromJSON function from the jsonlite package to parse these JSON files. Make sure to convert to data.table for easier filtering and aggregation later.
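A minimal sketch of the parsing step, assuming the jsonlite and data.table packages are installed. The inline JSON string is a hypothetical stand-in for the real files (the values are copied from the first two records shown in the output below); in the homework you would pass the URL to fromJSON instead.

```r
library(jsonlite)
library(data.table)

# Hypothetical stand-in for the JSON served at the regions URL;
# fromJSON() accepts a URL, a file path, or a JSON string like this one.
regions_json <- '[
  {"region_id": "1000", "vendor_id": "gcp", "country_id": "US", "founding_year": 2009},
  {"region_id": "1100", "vendor_id": "gcp", "country_id": "BE", "founding_year": 2015}
]'

# fromJSON() simplifies an array of records into a data.frame;
# as.data.table() converts it for easier filtering and aggregation.
regions <- as.data.table(fromJSON(regions_json))
str(regions)
```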

The loaded data look like:

str(regions)

## Classes 'data.table' and 'data.frame':   148 obs. of  17 variables:
##  $ region_id    : chr  "1000" "1100" "1210" "1220" ...
##  $ api_reference: chr  "us-central1" "europe-west1" "us-west1" "asia-east1" ...
##  $ lon          : num  -95.8 3.87 -121.2 120.43 -80.04 ...
##  $ vendor_id    : chr  "gcp" "gcp" "gcp" "gcp" ...
##  $ display_name : chr  "Council Bluffs (US)" "St. Ghislain (BE)" "The Dalles (US)" "Changhua County (TW)" ...
##  $ lat          : num  41.2 50.5 45.6 24.1 33.1 ...
##  $ aliases      :List of 148
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr "nbg1"
##   ..$ : chr "hel1"
##   ..$ : chr "fsn1"
##   ..$ : chr "ash"
##   ..$ : chr "hil"
##   ..$ : chr "sin"
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr "EU (Frankfurt)"
##   ..$ : chr 
##   ..$ : chr "EU (Stockholm)"
##   ..$ : chr "EU (Milan)"
##   ..$ : chr 
##   ..$ : chr "EU (Ireland)"
##   ..$ : chr "EU (London)"
##   ..$ : chr "EU (Paris)"
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   ..$ : chr 
##   .. [list output truncated]
##  $ state        : chr  "Iowa" NA "Oregon" "Changhua County" ...
##  $ founding_year: int  2009 2015 2016 2013 2015 2016 2017 2017 2017 2017 ...
##  $ country_id   : chr  "US" "BE" "US" "TW" ...
##  $ green_energy : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ status       : chr  "active" "active" "active" "active" ...
##  $ city         : chr  "Council Bluffs" "St. Ghislain" "The Dalles" NA ...
##  $ observed_at  : chr  "2025-01-23T11:24:28.814269" "2025-01-23T11:24:28.813979" "2025-01-23T11:24:28.814355" "2025-01-23T11:24:28.813686" ...
##  $ address_line : logi  NA NA NA NA NA NA ...
##  $ name         : chr  "us-central1" "europe-west1" "us-west1" "asia-east1" ...
##  $ zip_code     : logi  NA NA NA NA NA NA ...
##  - attr(*, ".internal.selfref")=<externalptr>

str(zones)

## Classes 'data.table' and 'data.frame':   353 obs. of  8 variables:
##  $ vendor_id    : chr  "azure" "azure" "azure" "azure" ...
##  $ api_reference: chr  "0" "0" "0" "0" ...
##  $ status       : chr  "active" "active" "active" "active" ...
##  $ region_id    : chr  "australiacentral" "australiacentral2" "australiasoutheast" "brazilsoutheast" ...
##  $ zone_id      : chr  "0" "0" "0" "0" ...
##  $ name         : chr  "0" "0" "0" "0" ...
##  $ display_name : chr  "australiacentral-0" "australiacentral2-0" "australiasoutheast-0" "brazilsoutheast-0" ...
##  $ observed_at  : chr  "2025-01-23T11:27:39.099118" "2025-01-23T11:27:39.099167" "2025-01-23T11:27:39.099231" "2025-01-23T11:27:39.099282" ...
##  - attr(*, ".internal.selfref")=<externalptr>

Let’s count the number of regions per country, shown in descending order:

##     country_id     N
##         <char> <int>
##  1:         US    32
##  2:         AU     9
##  3:         IN     9
##  4:         DE     8
##  5:         JP     6
##  6:         CA     6
##  7:         SG     5
##  8:         GB     5
##  9:         ZA     5
## 10:         BR     4
## 11:         FI     4
## 12:         CH     4
## 13:         KR     4
## 14:         IT     4
## 15:         FR     4
## 16:         ES     4
## 17:         NL     3
## 18:         HK     3
## 19:         PL     3
## 20:         IL     3
## 21:         SE     3
## 22:         AE     3
## 23:         ID     2
## 24:         QA     2
## 25:         CN     2
## 26:         IE     2
## 27:         NO     2
## 28:         BE     1
## 29:         TW     1
## 30:         CL     1
## 31:         SA     1
## 32:         MX     1
## 33:         BH     1
## 34:         NZ     1
##     country_id     N
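One possible way to produce a count like the one above with data.table, sketched on a small made-up `regions` table so it runs on its own (the real table has 148 rows):

```r
library(data.table)

# Toy stand-in for the loaded regions (made-up rows, not the real data)
regions <- data.table(country_id = c("US", "US", "AU", "US", "DE", "AU"))

# .N counts the rows in each group; order(-N) sorts the counts in descending order
region_counts <- regions[, .N, by = country_id][order(-N)]
region_counts
```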

You can pass any data.frame or similar tabular data to pander::pander to render it as an HTML table instead of raw R console output.

A nicer table:

country_id	N
US	32
AU	9
IN	9
DE	8
JP	6
CA	6
SG	5
GB	5
ZA	5
BR	4
FI	4
CH	4
KR	4
IT	4
FR	4
ES	4
NL	3
HK	3
PL	3
IL	3
SE	3
AE	3
ID	2
QA	2
CN	2
IE	2
NO	2
BE	1
TW	1
CL	1
SA	1
MX	1
BH	1
NZ	1
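A short sketch of the pander call, assuming the pander package is installed. The three rows are copied from the top of the table above; in the homework you would pass the full count table.

```r
library(pander)

# First three rows of the count table shown above
region_counts <- data.frame(
  country_id = c("US", "AU", "IN"),
  N = c(32L, 9L, 9L)
)

# pander() prints a Pandoc markdown table, which knitr renders as an HTML table
pander(region_counts)
```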

Let’s show the distribution of the founding year of the regions:

Make sure to replicate the axis titles, legend position, theme, etc. in the plot below and in all future plots! Try to replicate the small details as well, such as the axis labels and the grid design.
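As a starting point, a histogram could be sketched with ggplot2 as follows. The data are the first ten founding years from the str output above; the bin width, axis titles, and theme are assumptions, so adjust them until they match the reference plot.

```r
library(ggplot2)

# First ten founding years from the str(regions) output above
regions <- data.frame(
  founding_year = c(2009, 2015, 2016, 2013, 2015, 2016, 2017, 2017, 2017, 2017)
)

# Assumed settings: one bar per year, hypothetical axis titles and theme
p <- ggplot(regions, aes(x = founding_year)) +
  geom_histogram(binwidth = 1, color = "white") +
  labs(x = "Founding year", y = "Number of regions") +
  theme_bw()
p
```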

Also showing the average founding year on the same plot:

Look into geom_vline.
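A possible sketch of the geom_vline hint, again using the first ten founding years from the str output above. The line color and type are assumptions, not taken from the reference plot.

```r
library(ggplot2)

# First ten founding years from the str(regions) output above
regions <- data.frame(
  founding_year = c(2009, 2015, 2016, 2013, 2015, 2016, 2017, 2017, 2017, 2017)
)

# Average founding year, ignoring any missing values
avg_year <- mean(regions$founding_year, na.rm = TRUE)

# geom_vline() draws a vertical line at the given x position
p <- ggplot(regions, aes(x = founding_year)) +
  geom_histogram(binwidth = 1, color = "white") +
  geom_vline(xintercept = avg_year, color = "red", linetype = "dashed") +
  theme_bw()
p
```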

Now let’s filter for the regions in Europe!

For this, you might need to look up the continent of the provided country IDs.

After filtering, there should be 47 regions left.
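One way to do the lookup is the countrycode package, which is an assumption here (the assignment does not prescribe a tool). The sketch below maps ISO 2-letter country codes to continents on a small made-up table. Definitions of Europe differ between sources, so check your result against the expected 47 regions.

```r
library(data.table)
library(countrycode)

# Toy stand-in for the loaded regions (made-up selection of country codes)
regions <- data.table(country_id = c("US", "BE", "TW", "DE", "FI"))

# Map ISO 3166-1 alpha-2 codes to continent names
regions[, continent := countrycode(country_id, origin = "iso2c", destination = "continent")]

# Keep only the European regions
europe_regions <- regions[continent == "Europe"]
europe_regions
```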

You can add inline R code by wrapping it in backticks and starting it with r, e.g. writing `r 2+2` will render as 4.

Let’s count the number of zones per region, and merge it to the regions dataset. There are 31 regions with 3 zones, and 16 regions with a single zone. Showing this visually for each vendor:
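The counting and merging step might look like the sketch below, shown on made-up toy tables. The column name `n_zones` is hypothetical, and joining on both `vendor_id` and `region_id` is a cautious assumption in case region IDs are not unique across vendors. The faceted bar chart is only one guess at how the per-vendor plot could be built.

```r
library(data.table)
library(ggplot2)

# Toy stand-ins for the loaded datasets (made-up values)
regions <- data.table(
  vendor_id = c("v1", "v1", "v2"),
  region_id = c("r1", "r2", "r1")
)
zones <- data.table(
  vendor_id = c("v1", "v1", "v1", "v1", "v2"),
  region_id = c("r1", "r1", "r1", "r2", "r1"),
  zone_id   = c("a", "b", "c", "a", "a")
)

# Count zones per region, then left-join the counts onto the regions
zone_counts <- zones[, .(n_zones = .N), by = .(vendor_id, region_id)]
regions <- merge(regions, zone_counts, by = c("vendor_id", "region_id"), all.x = TRUE)

# Number of regions by zone count, one panel per vendor
p <- ggplot(regions, aes(x = factor(n_zones))) +
  geom_bar() +
  facet_wrap(~vendor_id) +
  labs(x = "Number of zones", y = "Number of regions")
p
```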

Now let’s load a GeoJSON file on the boundaries of the European countries. You can use leakyMirror/map-of-europe’s europe.geojson.

## Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.4.0; sf_use_s2() is TRUE

## Reading layer `europe' from data source `/home/daroczig2/europe.json' using driver `GeoJSON'
## Simple feature collection with 51 features and 12 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -24.54222 ymin: 29.48671 xmax: 50.37499 ymax: 71.15471
## Geodetic CRS:  WGS 84
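Output like the above comes from reading the file with the sf package. As a self-contained sketch, the code below writes a tiny made-up GeoJSON file (a single square, not real country borders) to a temporary path and reads it back. In the homework you would point st_read at the downloaded europe.geojson instead.

```r
library(sf)

# Write a tiny made-up GeoJSON file so the example runs without a download
geojson_path <- tempfile(fileext = ".geojson")
writeLines(paste0(
  '{"type":"FeatureCollection","features":[{"type":"Feature",',
  '"properties":{"NAME":"Demo"},"geometry":{"type":"Polygon",',
  '"coordinates":[[[0,45],[10,45],[10,55],[0,55],[0,45]]]}}]}'
), geojson_path)

# st_read() parses the file into an sf object; quiet = TRUE hides the summary
europe <- st_read(geojson_path, quiet = TRUE)
europe
```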

Plotting the downloaded shapes:
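A minimal sketch of plotting sf shapes with ggplot2's geom_sf. The single square is a made-up stand-in for the country polygons, and the theme is an assumption to be matched against the reference plot.

```r
library(sf)
library(ggplot2)

# Made-up square polygon in WGS84 (EPSG:4326), standing in for the country shapes
square <- st_polygon(list(rbind(c(0, 45), c(10, 45), c(10, 55), c(0, 55), c(0, 45))))
europe <- st_sf(name = "Demo", geometry = st_sfc(square, crs = 4326))

# geom_sf() draws the geometries and sets up a matching map coordinate system
p <- ggplot(europe) +
  geom_sf() +
  theme_bw()
p
```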

Now let’s get a background tile for this area!

Note that we used ggmap’s get_stamenmap in class, but I suggest switching to get_tiles from the maptiles package (using the same API token), as it returns the tiles in the WGS84 projection, which is easier to use with the other loaded datasets. You can pass the loaded GeoJSON object to get_tiles, along with zoom=4 and the API key. Experiment with the other parameters as well to replicate the plot below using geom_spatraster_rgb from the tidyterra package!
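A rough sketch of how these pieces could fit together, wrapped in a function because it needs network access and a valid API key. The function name `plot_background` and the default provider name are assumptions (check the provider list of your maptiles version), as is the choice of `crop = TRUE`.

```r
library(maptiles)
library(tidyterra)
library(ggplot2)

# Hypothetical helper: download background tiles for the area covered by
# `shapes` (an sf object) and draw them as an RGB raster layer
plot_background <- function(shapes, api_key, provider = "Stadia.StamenTonerLite") {
  tiles <- get_tiles(
    shapes,
    provider = provider,  # assumed provider name, replace as needed
    zoom = 4,
    crop = TRUE,          # trim the tiles to the extent of `shapes`
    apikey = api_key
  )
  ggplot() +
    geom_spatraster_rgb(data = tiles) +
    theme_void()
}
```

Calling `plot_background(europe, api_key = "your-key")` with the shapes loaded earlier should return a ggplot object showing the background map.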

## 
## Attaching package: 'tidyterra'

## The following object is masked from 'package:stats':
## 
##     filter

Now let’s put together all the loaded layers: the background raster map, the polygons of the country borders, and the locations of the regions, with the point size weighted by the number of zones, the color showing whether the region is powered by green energy, and the shape showing the vendor!
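The layering could be sketched as below. The function name `plot_regions` and the `n_zones` column are hypothetical; `lon`, `lat`, `green_energy`, and `vendor_id` are the column names from the regions data shown earlier. Legend position, colors, and theme are assumptions to be tuned against the reference plot.

```r
library(ggplot2)
library(tidyterra)

# Hypothetical helper combining the three layers:
#   tiles   - SpatRaster with RGB background tiles
#   shapes  - sf object with the country borders
#   regions - table with lon, lat, n_zones, green_energy and vendor_id columns
plot_regions <- function(tiles, shapes, regions) {
  ggplot() +
    geom_spatraster_rgb(data = tiles) +
    geom_sf(data = shapes, fill = NA, color = "black") +
    geom_point(
      data = regions,
      aes(x = lon, y = lat, size = n_zones, color = green_energy, shape = vendor_id)
    ) +
    theme_void() +
    theme(legend.position = "top")
}
```

The order of the layers matters: the raster goes first so that the borders and points are drawn on top of it.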

Gergely Daroczi, CEU

2025-01-23
