For this week’s assignment, I made all the code and text output invisible. The basic setup of the US Airports data is based on the script provided by Professor Rolfe.
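The network has 755 nodes. As before, igraph and statnet disagree sharply on the number of edges: igraph counts 23473 and statnet counts 8228. The igraph printout repeats some routes (for example, BGR->JFK appears twice), so the extra edges appear to be mostly parallel edges, possibly one per carrier, rather than self-loops. The statnet object has loops = FALSE and multiple = FALSE, so it keeps at most one edge per ordered pair of airports.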
ls()
## [1] "network_edgelist" "network_igraph" "network_nodes" "network_statnet"
vcount(network_igraph)
## [1] 755
ecount(network_igraph)
## [1] 23473
print(network_statnet)
## Network attributes:
## vertices = 755
## directed = TRUE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 8228
## missing edges= 0
## non-missing edges= 8228
##
## Vertex attribute names:
## City Distance vertex.names
##
## Edge attribute names not shown
print(network_igraph)
## IGRAPH bf6202d DN-- 755 23473 -- US airports
## + attr: name (g/c), name (v/c), City (v/c), Position (v/c), Carrier
## | (e/c), Departures (e/n), Seats (e/n), Passengers (e/n), Aircraft
## | (e/n), Distance (e/n)
## + edges from bf6202d (vertex names):
## [1] BGR->JFK BGR->JFK BOS->EWR ANC->JFK JFK->ANC LAS->LAX MIA->JFK EWR->ANC
## [9] BJC->MIA MIA->BJC TEB->ANC JFK->LAX LAX->JFK LAX->SFO AEX->LAS BFI->SBA
## [17] ELM->PIT GEG->SUN ICT->PBI LAS->LAX LAS->PBI LAS->SFO LAX->LAS PBI->AEX
## [25] PBI->ICT PIT->VCT SFO->LAX VCT->DWH IAD->JFK ABE->CLT ABE->HPN AGS->CLT
## [33] AGS->CLT AVL->CLT AVL->CLT AVP->CLT AVP->PHL BDL->CLT BHM->CLT BHM->CLT
## [41] BNA->CLT BNA->CLT BNA->DCA BNA->PHL BTR->CLT BUF->CLT BUF->DCA BUF->PHL
## + ... omitted several edges
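One way to check whether extra edges in an edge list are self-loops or parallel edges is to count each kind directly. The following is a minimal base R sketch on made-up data: the data frame `toy_edges` and its columns `from` and `to` are hypothetical and are not the assignment's network_edgelist.

```r
# Toy edge list (hypothetical data, not the real network_edgelist):
# the same route can appear more than once, e.g. once per carrier.
toy_edges <- data.frame(
  from = c("BGR", "BGR", "BOS", "JFK", "LAS", "LAS", "JFK"),
  to   = c("JFK", "JFK", "EWR", "ANC", "LAX", "LAX", "JFK"),
  stringsAsFactors = FALSE
)

# Self-loops: rows where origin and destination coincide.
n_loops <- sum(toy_edges$from == toy_edges$to)

# Parallel edges: rows that repeat an earlier (from, to) pair.
n_parallel <- sum(duplicated(toy_edges[, c("from", "to")]))

# Edge count after dropping loops and collapsing parallel edges,
# which is what a network with loops = FALSE and multiple = FALSE keeps.
no_loops <- toy_edges[toy_edges$from != toy_edges$to, c("from", "to")]
n_simple <- nrow(unique(no_loops))

cat("rows:", nrow(toy_edges), "loops:", n_loops,
    "parallel:", n_parallel, "simple edges:", n_simple, "\n")
```

On the real igraph object, `which_loop()` and `which_multiple()` flag the same two kinds of edges, and `simplify()` removes them.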
The network is not bipartite. It is directed, and the igraph object is unweighted (it has no weight edge attribute, although the statnet object lists one).
is_bipartite(network_igraph)
## [1] FALSE
is_directed(network_igraph)
## [1] TRUE
is_weighted(network_igraph)
## [1] FALSE
print(network_statnet)
## Network attributes:
## vertices = 755
## directed = TRUE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 8228
## missing edges= 0
## non-missing edges= 8228
##
## Vertex attribute names:
## City Distance vertex.names
##
## Edge attribute names not shown
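Below are the network attributes of the US Airports data. The igraph object has three vertex (node) attributes: name (the abbreviated airport code), City (the city where each airport is located), and Position (the airport's coordinates). Its edge attributes are Aircraft, Carrier, Departures, Distance, Passengers and Seats. The statnet object also lists a weight edge attribute and spells the passenger attribute "Passangers", so looking up "Passengers" on the statnet object returns NULL.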
igraph::vertex_attr_names(network_igraph)
## [1] "name" "City" "Position"
igraph::edge_attr_names(network_igraph)
## [1] "Carrier" "Departures" "Seats" "Passengers" "Aircraft"
## [6] "Distance"
network::list.vertex.attributes(network_statnet)
## [1] "City" "Distance" "na" "vertex.names"
network::list.edge.attributes(network_statnet)
## [1] "Aircraft" "Carrier" "Departures" "Distance" "na"
## [6] "Passangers" "Seats" "weight"
summary(E(network_igraph)$Aircraft)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 10.0 612.0 629.0 571.6 655.0 873.0
summary(E(network_igraph)$Departures)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 2.00 14.00 30.19 41.00 769.00
summary(network_statnet %e% "Passengers")
## Length Class Mode
## 0 NULL NULL
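# Dyad Census
The two packages give different dyad censuses. igraph reports 10449 mutual, 2574 asymmetric and 271612 null dyads, while statnet reports 3605 mutual, 1018 asymmetric and 280012 null dyads. Both sets sum to 284635, the number of possible dyads among 755 nodes (755 × 754 / 2). The statnet counts are consistent with its 8228 edges (2 × 3605 + 1018 = 8228); the igraph counts appear to be inflated by parallel edges.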
igraph::dyad.census(network_igraph)
## $mut
## [1] 10449
##
## $asym
## [1] 2574
##
## $null
## [1] 271612
sna::dyad.census(network_statnet)
## Mut Asym Null
## [1,] 3605 1018 280012
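The two censuses can be sanity-checked with simple arithmetic. The sketch below uses only numbers copied from the outputs above; the helper name `implied_edges` is made up for illustration.

```r
n <- 755  # number of vertices reported by both packages

# Number of unordered vertex pairs (dyads).
total_dyads <- n * (n - 1) / 2

# Census values copied from the printed outputs.
igraph_census  <- c(mut = 10449, asym = 2574, null = 271612)
statnet_census <- c(mut = 3605,  asym = 1018, null = 280012)

# Each census should partition the full set of dyads.
sum(igraph_census)  == total_dyads
sum(statnet_census) == total_dyads

# In a simple directed graph (no loops, no parallel edges),
# the edge count equals 2 * mutual + asymmetric.
implied_edges <- function(census) unname(2 * census["mut"] + census["asym"])
implied_edges(statnet_census)  # compare with the 8228 edges in statnet
implied_edges(igraph_census)   # compare with the 23473 edges in igraph
```

The identity in the last step only holds for simple graphs, so it is a quick way to see which census was computed on a graph with repeated edges.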
# Triad Census
igraph::triad_census(network_igraph)
## [1] 68169544 665870 2427052 1445 1289 2465 15322 19171
## [9] 91 39 114868 202 376 558 6422 18671
sna::triad.census(network_statnet, mode="graph")
## 0 1 2 3
## [1,] 68527865 2755420 137878 22222
# Transitivity or Global Clustering
The transitivity (global clustering) of the network is about 0.33 (0.338 with igraph and 0.327 with statnet).
transitivity(network_igraph)
## [1] 0.3384609
gtrans(network_statnet)
## [1] 0.3266617
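# Local Transitivity and Clustering
The average local clustering (local transitivity) coefficient is about 0.65. Interestingly, the global clustering coefficient (0.33) is much smaller. The average local coefficient gives every node equal weight, so the many low-degree airports whose few neighbors are all connected to each other pull it up. The global coefficient is dominated by high-degree airports, whose many neighbors are mostly not connected to one another. In other words, individual nodes tend to sit in tightly knit neighborhoods even though the network as a whole is less clustered.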
transitivity(network_igraph, type="average")
## [1] 0.6452844
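A small worked example shows how the average local coefficient can exceed the global one. This is a base R sketch of the textbook definitions on a made-up undirected graph, not igraph's implementation; the graph and all variable names are hypothetical.

```r
# Toy undirected graph (hypothetical): vertex 1 is a hub linked to 2..6,
# and vertices 2 and 3 are also linked to each other.
edges <- rbind(c(1, 2), c(1, 3), c(1, 4), c(1, 5), c(1, 6), c(2, 3))
n <- 6
A <- matrix(0, n, n)
A[edges] <- 1
A[edges[, 2:1]] <- 1   # add the reverse direction to make A symmetric

deg <- rowSums(A)

# diag(A %*% A %*% A)[i] counts closed walks of length 3 from vertex i,
# which is twice the number of triangles through vertex i.
triangles_at <- diag(A %*% A %*% A) / 2

# Number of neighbor pairs (connected triples centered) at each vertex.
triples_at <- choose(deg, 2)

# Global clustering: closed triples divided by all connected triples.
global_cc <- sum(triangles_at) / sum(triples_at)

# Local clustering per vertex; undefined (NaN) when degree < 2.
local_cc <- triangles_at / triples_at

# Average local clustering over the vertices where it is defined.
avg_local_cc <- mean(local_cc, na.rm = TRUE)

c(global = global_cc, average_local = avg_local_cc)
```

The hub has ten neighbor pairs but only one of them is connected, so it drags the global value down, while vertices 2 and 3 each have a perfectly clustered neighborhood and lift the average.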
# Average path length
When edges are treated as undirected, the average path length of the network is 3.45.
average.path.length(network_igraph, directed=FALSE)
## [1] 3.447169
There are 6 components in this network. One component has 745 members, meaning that most airports can reach one another through some sequence of flights (igraph's default weak components ignore edge direction). The remaining components are small, with 1 to 3 members each. This suggests that the airports in the smaller components are probably not used by commercial airlines but serve other purposes, such as Air Force bases or cargo airports.
The isolates() call returns 166, which is the vertex ID of the isolate rather than a count, so there is a single isolate in the data.
That isolate is DET (Coleman A. Young International Airport), which is consistent with the one component of size 1 reported by igraph.
names(igraph::components(network_igraph))
## [1] "membership" "csize" "no"
igraph::components(network_igraph)$no
## [1] 6
igraph::components(network_igraph)$csize # 2 members or 3 members in the network
## [1] 745 2 2 3 2 1
isolates(network_statnet) # returns the vertex ID of each isolate, not a count
## [1] 166
as.vector(network_statnet %v% "vertex.names")[c(isolates(network_statnet))]
## [1] "DET"
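Network density is the number of ties as a proportion of the maximum possible number of ties.
The density output differs between the two packages. igraph returns 0.041 (23473 edges over 755 × 754 possible directed ties), whereas both statnet functions return 0.014 (8228 edges over the same denominator), so the gap comes from the edge counts rather than from the formula. Using the statnet figure, the US Airports network has a relatively low density.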
graph.density(network_igraph)
## [1] 0.04123351
graph.density(network_igraph, loops=TRUE)
## [1] 0.0411789
network.density(network_statnet)
## [1] 0.0144536
gden(network_statnet, diag=FALSE)
## [1] 0.0144536
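The density figures above can be reproduced by hand from the vertex and edge counts. This is a short base R sketch that uses only numbers printed earlier in this report; the variable names are made up for illustration.

```r
n <- 755                         # vertices
possible_ties <- n * (n - 1)     # ordered pairs of distinct vertices

edges_igraph  <- 23473           # ecount(network_igraph)
edges_statnet <- 8228            # total edges in network_statnet

density_igraph  <- edges_igraph  / possible_ties
density_statnet <- edges_statnet / possible_ties

# When self-loops are allowed, the denominator becomes n^2.
density_igraph_loops <- edges_igraph / n^2

round(c(igraph = density_igraph,
        igraph_loops = density_igraph_loops,
        statnet = density_statnet), 4)
```

Because the denominator is the same in the first and third values, the ratio of the two densities is simply the ratio of the two edge counts.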
Vertex degree describes an individual node: it is the number of links attached to that node, so each vertex in a network can have a different degree.
In the igraph object, the largest degree is 1700 and the smallest is 1, with a mean of about 62 per node. These values count every parallel edge, so they are much larger than the statnet degrees reported further down.
igraph::degree(network_igraph) #igraph includes the loops
igraph::degree(network_igraph, loops=FALSE) #igraph without loops
sna::degree(network_statnet)
summary(igraph::degree(network_igraph)) # qualified, because sna also exports degree()
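The mean in-degree and the mean out-degree are both 10.9. The two means are equal by construction, because every directed edge adds one to some node's out-degree and one to some node's in-degree (8228 / 755 ≈ 10.9). The more informative comparison is the rest of the distribution: the quartiles are identical (2, 4 and 9) and the maxima are close (161 and 163), which suggests that most airports have about the same number of inbound and outbound connections.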
#sna::degree(network_statnet, cmode = "indegree")
#sna::degree(network_statnet, cmode = "outdegree")
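#the mutate() version below does not work, apparently because sna::degree() is passed the network_nodes data frame instead of network_statnet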
#network_nodes <- network_nodes %>%
# mutate(indegree=sna::degree(network_nodes, cmode="indegree"),
# outdegree=sna::degree(network_nodes, cmode="outdegree"))
network_nodes$indegree <- sna::degree(network_statnet, cmode = "indegree")
network_nodes$outdegree <- sna::degree(network_statnet, cmode = "outdegree")
network_nodes$degree <- sna::degree(network_statnet, cmode = "freeman")
summary(network_nodes)
## name City Position indegree
## Length:755 Length:755 Length:755 Min. : 0.0
## Class :character Class :character Class :character 1st Qu.: 2.0
## Mode :character Mode :character Mode :character Median : 4.0
## Mean : 10.9
## 3rd Qu.: 9.0
## Max. :161.0
## outdegree degree
## Min. : 0.0 Min. : 0.0
## 1st Qu.: 2.0 1st Qu.: 4.0
## Median : 4.0 Median : 8.0
## Mean : 10.9 Mean : 21.8
## 3rd Qu.: 9.0 3rd Qu.: 18.0
## Max. :163.0 Max. :323.0
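Attaching a degree vector to a node table by position assumes that both are in the same order. The sketch below shows one way to compute in-degree and out-degree keyed by node name instead, so that ordering does not matter. It uses base R on made-up data; `toy_edges`, `toy_nodes` and the helper `count_by_name` are hypothetical and are not part of the assignment's objects.

```r
# Toy directed edge list (hypothetical data).
toy_edges <- data.frame(
  from = c("A", "A", "B", "C", "C"),
  to   = c("B", "C", "C", "A", "D"),
  stringsAsFactors = FALSE
)

# Node table deliberately listed in a different order than the edges.
toy_nodes <- data.frame(name = c("D", "C", "B", "A"),
                        stringsAsFactors = FALSE)

# Count occurrences of each node name; using the node names as factor
# levels returns the counts in node-table order, with zeros included.
count_by_name <- function(x, node_names) {
  as.integer(table(factor(x, levels = node_names)))
}

toy_nodes$outdegree <- count_by_name(toy_edges$from, toy_nodes$name)
toy_nodes$indegree  <- count_by_name(toy_edges$to,   toy_nodes$name)
toy_nodes$degree    <- toy_nodes$indegree + toy_nodes$outdegree

toy_nodes

# Every edge adds one in-degree and one out-degree, so the means match.
mean(toy_nodes$indegree) == mean(toy_nodes$outdegree)
```

This counts every row of the edge list, so parallel edges would need to be collapsed first to match degrees from a network with multiple = FALSE.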
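According to statnet, the degree centralization scores for both in-degree and out-degree in the US airports data are around 0.20. The igraph scores (1.08 and 1.10) exceed 1, presumably because the parallel edges push degrees beyond what the normalization assumes for a simple graph, so the statnet values are the ones to interpret.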
sna::centralization(network_statnet, sna::degree,cmode="indegree")
## [1] 0.1993383
sna::centralization(network_statnet,sna::degree,cmode="outdegree")
## [1] 0.2019943
centr_degree(network_igraph, loops=FALSE, mode = "in")$centralization
## [1] 1.075669
centr_degree(network_igraph, loops=FALSE, mode = "out")$centralization
## [1] 1.099573
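The statnet centralization values can be reproduced from numbers already printed in this report. The sketch below uses the textbook Freeman formula; it is an assumption, not something checked against the sna source code, that this is the normalization sna applies. The function name `centralization_from_summary` is made up for illustration.

```r
n       <- 755    # vertices
edges   <- 8228   # directed edges in network_statnet
max_in  <- 161    # largest in-degree from summary(network_nodes)
max_out <- 163    # largest out-degree from summary(network_nodes)

# Freeman degree centralization: the sum of (max degree - each degree),
# divided by the value that sum takes in a star graph. For in-degree or
# out-degree in a directed graph without loops, the star gives (n - 1)^2.
# The numerator sum(max - d_i) equals n * max - sum(d_i), and sum(d_i)
# is just the number of edges.
centralization_from_summary <- function(max_deg, n, edges) {
  (n * max_deg - edges) / (n - 1)^2
}

cent_in  <- centralization_from_summary(max_in,  n, edges)
cent_out <- centralization_from_summary(max_out, n, edges)

round(c(indegree = cent_in, outdegree = cent_out), 4)
```

The formula needs only the maximum degree, the number of nodes and the number of edges, which is why two networks with the same maximum but different edge counts get different scores.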
Both in-bound and out-bound connections are concentrated in a few airports: the median in-degree and out-degree are both 4, while the maxima are 161 and 163.
hist(network_nodes$indegree, main ="US Airport: In-degree Distribution", xlab = "In-bound Airports")
hist(network_nodes$outdegree, main="US Airport: Out-degree Distribution", xlab= "Out-bound Airports")
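Tucson (TUS), Greensboro/High Point (GSO), Rockford (RFD), Atlantic City (ACY) and Kalamazoo (AZO) appear as the five airports with the highest in-degree.
The same five airports, with GSO ahead of TUS, have the highest out-degree, so airports with a high in-degree also have a high out-degree. These labels should be treated with caution: the degree vectors were attached to network_nodes by row position, so they are correct only if the rows of network_nodes are in the same order as the vertices of network_statnet. The listing sorted by ascending in-degree shows ABY with a degree of 0 even though DET was the only isolate identified earlier, which suggests that the two orderings may differ.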
arrange(network_nodes, desc(indegree))%>%slice(1:5)
## name City Position indegree outdegree degree
## 1 TUS Tucson, AZ N320658 W1105628 161 161 322
## 2 GSO Greensboro/High Point, NC N360552 W0795614 160 163 323
## 3 RFD Rockford, IL N421143 W0890550 147 152 299
## 4 ACY Atlantic City, NJ N392727 W0743438 139 143 282
## 5 AZO Kalamazoo, MI N421406 W0853307 139 141 280
arrange(network_nodes, desc(outdegree))%>%slice(1:5)
## name City Position indegree outdegree degree
## 1 GSO Greensboro/High Point, NC N360552 W0795614 160 163 323
## 2 TUS Tucson, AZ N320658 W1105628 161 161 322
## 3 RFD Rockford, IL N421143 W0890550 147 152 299
## 4 ACY Atlantic City, NJ N392727 W0743438 139 143 282
## 5 AZO Kalamazoo, MI N421406 W0853307 139 141 280
arrange(network_nodes, indegree)%>%slice(1:5)
## name City Position indegree outdegree degree
## 1 CMH Columbus, OH N395953 W0825331 0 1 1
## 2 PWM Portland, ME N433846 W0701832 0 1 1
## 3 STL St. Louis, MO N384452 W0902136 0 1 1
## 4 ABY Albany, GA N313208 W0841140 0 0 0
## 5 RBY Ruby, AK N644338 W1552812 0 1 1