As a first step, I put the data into a graph format using the igraph package. At this exploratory stage, I am working with global, country-level data.
#2013 Graph
#Load packages used below (dplyr for the pipeline, igraph for the graphs)
library(dplyr)
library(igraph)
#Create node dataset: take distinct (country, iso) pairs so names and codes stay aligned
#(calling unique() on each column separately can misalign them or give unequal lengths)
countries <- bind_rows(
  raw %>% distinct(country = sourcecountry, iso = sourceiso),
  raw %>% distinct(country = hostcountry, iso = hostiso)
) %>% distinct()
#One row per ISO code; vertex names must be unique for graph_from_data_frame
nodes <- countries %>% distinct(nodes = iso)
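#Links: creating edges by aggregation
#Each row of raw counts as one link: weight = number of links, value = total capital investment
raw <- raw %>% mutate(link = 1)
links <- raw %>%
  filter(year == 2013) %>%
  group_by(sourceiso, hostiso) %>%
  summarize(weight = sum(link), value = sum(capitalinvestment), .groups = "drop")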
##agnodes: adjust nodes for dropped countries
agnodes <- nodes %>% filter(nodes %in% links$sourceiso | nodes %in% links$hostiso)
#Make igraph
net <- graph_from_data_frame(d = links, vertices = agnodes, directed = TRUE)
#All time graph
allnodes <- nodes
#Links: creating edges by aggregation (the link column was already added to raw above)
alllinks <- raw %>%
  group_by(sourceiso, hostiso) %>%
  summarize(weight = sum(link), value = sum(capitalinvestment), .groups = "drop")
#No node adjustment here: nodes is built from the same rows of raw that are aggregated into alllinks
#Make igraph
allnet <- graph_from_data_frame(d = alllinks, vertices = allnodes, directed = TRUE)
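To make the role of the node list concrete, here is a minimal sketch on made-up data. The objects toy_links, toy_nodes and toy_net, and all the values in them, are hypothetical and are not part of the dataset. graph_from_data_frame reads the first two columns of d as source and target and stores the remaining columns as edge attributes. Any vertex listed in vertices but absent from d is kept as an isolated node, which is presumably why the single-year node list is filtered before the graph is built.

```r
library(igraph)

# Hypothetical edge list: first two columns are source and target,
# the remaining columns become edge attributes
toy_links <- data.frame(
  sourceiso = c("USA", "USA", "DEU"),
  hostiso   = c("DEU", "CHN", "CHN"),
  weight    = c(3, 1, 2),
  value     = c(120.5, 40, 75)
)

# Hypothetical node list: FRA does not appear in any link
toy_nodes <- data.frame(nodes = c("USA", "DEU", "CHN", "FRA"))

toy_net <- graph_from_data_frame(d = toy_links, vertices = toy_nodes, directed = TRUE)

vcount(toy_net)                     # number of vertices, including the isolate
ecount(toy_net)                     # number of directed edges
edge_attr_names(toy_net)            # edge attributes taken from the extra columns
names(which(degree(toy_net) == 0))  # vertices with no links at all
```

Isolated nodes count towards the size of the network and enter averages with a degree of zero, so whether to keep them is a choice that affects the summary statistics.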
Now I look at some key network statistics over the period. The following graphs show how measures of connectedness, such as degree and centrality, evolve over time.
While these measures fluctuate over time, the network is clearly tiered: the top tier dominates persistently throughout the period, with few breakouts.
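One way such yearly measures could be computed is sketched below. This is an illustration under assumptions, not necessarily the code behind the graphs: toy is a made-up stand-in for raw with the same column names, yearly_stats is a hypothetical helper, and PageRank is used only as one example of a centrality measure.

```r
library(dplyr)
library(igraph)

# Made-up stand-in for raw (hypothetical values, same column names)
toy <- data.frame(
  year              = c(2012, 2012, 2012, 2013, 2013),
  sourceiso         = c("USA", "USA", "DEU", "USA", "CHN"),
  hostiso           = c("DEU", "CHN", "CHN", "CHN", "DEU"),
  capitalinvestment = c(10, 5, 2, 7, 3)
)

# Hypothetical helper: build one year's network and return node-level measures
yearly_stats <- function(data, yr) {
  edges <- data %>%
    filter(year == yr) %>%
    group_by(sourceiso, hostiso) %>%
    summarize(weight = n(), value = sum(capitalinvestment), .groups = "drop")
  g <- graph_from_data_frame(edges, directed = TRUE)
  data.frame(
    year         = yr,
    iso          = V(g)$name,
    out_degree   = as.numeric(degree(g, mode = "out")),
    in_degree    = as.numeric(degree(g, mode = "in")),
    # Strength: total capital invested abroad, summed over outgoing edges
    out_strength = as.numeric(strength(g, mode = "out", weights = E(g)$value)),
    # PageRank weighted by invested value, as one possible centrality measure
    pagerank     = as.numeric(page_rank(g, weights = E(g)$value)$vector)
  )
}

# One row per country and year, ready for plotting over time
stats <- bind_rows(lapply(sort(unique(toy$year)), yearly_stats, data = toy))
```

Ranking countries within each year on a column of stats, for example out_strength, and checking how often the same countries stay at the top would be one way to examine the persistence of the top tier.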