Network analysis is a powerful tool for analyzing network-related data, so why not apply it on a map? Past work that plotted network graphs on a map used ggplot and ggmap. In this publication we explore combining ggraph and ggmap instead.
For handling network data we will be using tidygraph.
The data for our analysis was obtained from the website: http://dgca.nic.in/dom_flt_schedule/flt_index.htm
The data was converted from PDF format into edge and node tables.
library(ggplot2)
library(igraph)
library(tidyverse)
library(ggmap)
library(tidygraph)
library(ggraph)
library(leaflet)
library(ggiraph)
We use ggmap to get the map data.
map <- get_map('India', zoom=5,source = "google",maptype = 'satellite')
p <- ggmap(map)
p
We read the routes data, which will act as our edge data.
routes <- read.csv("E:\\SMU Assignements\\Visual Analytics\\Project\\Nodes\\Data By Carrier\\DataSet\\airline_edge1.csv", header = TRUE)
# Keep an unfiltered copy with all carriers; it is used for the faceted plot later on
routes1 <- routes
# Count the Air India flights that share the same route and schedule
routes <- routes %>%
  group_by(source, target, Arrival, Departure, Carrier, Dept.Time, Arr.Time) %>%
  filter(Carrier == "AirIndia") %>%
  summarize(Frequency = n())
routes
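To make the grouping step concrete, here is a minimal sketch on a made-up edge list. It uses base R's aggregate() rather than dplyr so that it runs without extra packages. The data frame toy_routes and its values are invented for illustration; only the column names source, target and Carrier come from the data above.

```r
# Hypothetical toy edge list: one row per scheduled flight (values are made up).
toy_routes <- data.frame(
  source  = c(1, 1, 2, 1),
  target  = c(2, 2, 3, 2),
  Carrier = c("AirIndia", "AirIndia", "AirIndia", "OtherCarrier"),
  stringsAsFactors = FALSE
)

# Keep one carrier, then count how many flights share each source-target pair.
one_carrier <- toy_routes[toy_routes$Carrier == "AirIndia", ]
one_carrier$Frequency <- 1L
toy_freq <- aggregate(Frequency ~ source + target, data = one_carrier, FUN = sum)
print(toy_freq)
```

Each row of toy_freq is then one edge, with Frequency available as an edge weight.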
The airport data supplies our nodes; each airport comes with its latitude and longitude.
airports <- read.csv(file="E:\\SMU Assignements\\Visual Analytics\\Project\\Nodes\\Data By Carrier\\DataSet\\airline_nodes.csv",header=TRUE,sep=",")
airports
## Id Label Latitude Longitude
## 1 1 AGARTALA 23.891529 91.24065
## 2 2 AGATTI 10.824322 72.17606
## 3 3 AHMEDABAD 23.073426 72.62657
## 4 4 AIZAWAL 23.838848 92.62417
## 5 5 ALLAHABAD 25.440081 81.73407
## 6 6 AMRITSAR 31.707238 74.79890
## 7 7 AURANGABAD 19.863972 75.39605
## 8 8 BAGDOGRA 26.682113 88.32906
## 9 9 BANGALORE 13.198635 77.70659
## 10 10 BELGAUM 15.861327 74.61656
## 11 11 BHATINDA 30.268111 74.76645
## 12 12 BHAVNAGAR 21.753780 72.18343
## 13 13 BHOPAL 23.289182 77.33372
## 14 14 BHUBANESHWAR 20.250007 85.81721
## 15 15 BHUJ 23.287810 69.67094
## 16 16 CALICUT 11.136433 75.95333
## 17 17 CHANDIGARH 30.674029 76.78927
## 18 18 CHENNAI(MADRAS) 12.994112 80.17087
## 19 19 COCHIN 10.151783 76.39296
## 20 20 COIMBATORE 11.030434 77.03928
## 21 21 DEHRADUN 30.190860 78.18184
## 22 22 DELHI 28.556162 77.09996
## 23 23 DHARAMSHALA 32.164453 76.26312
## 24 24 DIBRUGARH 27.484037 95.02289
## 25 25 DIMAPUR 25.880152 93.77293
## 26 26 DIU 20.713320 70.92500
## 27 27 DURGAPUR 23.617635 87.23989
## 28 28 GAYA 24.748822 84.94373
## 29 29 GOA 15.380349 73.83500
## 30 30 GORAKHPUR 26.737748 83.45281
## 31 31 GUWAHATI 26.106473 91.58605
## 32 32 GWALIOR 26.285499 78.21717
## 33 33 HUBLI 15.360014 75.08620
## 34 34 HYDERABAD 17.240263 78.42938
## 35 35 IMPHAL 24.764401 93.89727
## 36 36 INDORE 22.721994 75.80281
## 37 37 JABALPUR 23.178753 80.05283
## 38 38 JAIPUR 26.825486 75.80533
## 39 39 JAMMU 32.687566 74.83881
## 40 40 JAMNAGAR 22.460711 70.01590
## 41 41 JODHPUR 26.264248 73.05057
## 42 42 JORHAT 26.734678 94.18475
## 43 43 KANPUR (CHAKERI) 26.409181 80.40953
## 44 44 KHAJURAHO 24.818791 79.91663
## 45 45 KOLKATA 22.652043 88.44633
## 46 46 KULLU 31.876485 77.15390
## 47 47 LEH 34.142505 77.55572
## 48 48 LILABARI 27.288744 94.09314
## 49 49 LUCKNOW 26.761732 80.88565
## 50 50 MADURAI 9.836831 78.08945
## 51 51 MANGALORE 12.954508 74.88565
## 52 52 MUMBAI (BOMBAY) 19.089560 72.86561
## 53 53 NAGPUR 21.088115 79.04510
## 54 54 NANDED 19.183297 77.33476
## 55 55 PANT NAGAR 29.031055 79.47334
## 56 56 PATNA 25.590693 85.08804
## 57 57 PORT BLAIR 11.650336 92.73278
## 58 58 PUNE 18.581550 73.91988
## 59 59 RAIPUR 21.184250 81.74172
## 60 60 RAJAHMUNDRY 17.107290 81.81778
## 61 61 RAJKOT 22.310911 70.77948
## 62 62 RANCHI 23.316049 85.32219
## 63 63 SHILLONG 25.706209 91.97508
## 64 64 SILCHAR 24.915985 92.97946
## 65 65 SRINAGAR 33.992277 74.77197
## 66 66 SURAT 21.118440 72.74342
## 67 67 TEZPUR 26.709348 92.77554
## 68 68 THOISE 34.142505 77.55572
## 69 69 TIRUPATHI 13.634490 79.54480
## 70 70 TRICHY 10.764305 78.71508
## 71 71 TRIVANDRUM 8.483420 76.91982
## 72 72 TUTICORIN 8.723088 78.02634
## 73 73 UDAIPUR 24.618157 73.89850
## 74 74 VADODARA 22.333173 73.22449
## 75 75 VARANASI 25.451857 82.86162
## 76 76 VIJAYAWADA 16.529554 80.79702
## 77 77 VIZAG 17.721987 83.23198
We rename the columns of the node data before passing it to tidygraph, which we will use to create our graph data. For more information on tidygraph visit: http://www.data-imaginist.com/2017/Introducing-tidygraph/
names(airports) <- c("Id","Label","lat", "lon")
airports
## Id Label lat lon
## 1 1 AGARTALA 23.891529 91.24065
## 2 2 AGATTI 10.824322 72.17606
## 3 3 AHMEDABAD 23.073426 72.62657
## 4 4 AIZAWAL 23.838848 92.62417
## 5 5 ALLAHABAD 25.440081 81.73407
## 6 6 AMRITSAR 31.707238 74.79890
## 7 7 AURANGABAD 19.863972 75.39605
## 8 8 BAGDOGRA 26.682113 88.32906
## 9 9 BANGALORE 13.198635 77.70659
## 10 10 BELGAUM 15.861327 74.61656
## 11 11 BHATINDA 30.268111 74.76645
## 12 12 BHAVNAGAR 21.753780 72.18343
## 13 13 BHOPAL 23.289182 77.33372
## 14 14 BHUBANESHWAR 20.250007 85.81721
## 15 15 BHUJ 23.287810 69.67094
## 16 16 CALICUT 11.136433 75.95333
## 17 17 CHANDIGARH 30.674029 76.78927
## 18 18 CHENNAI(MADRAS) 12.994112 80.17087
## 19 19 COCHIN 10.151783 76.39296
## 20 20 COIMBATORE 11.030434 77.03928
## 21 21 DEHRADUN 30.190860 78.18184
## 22 22 DELHI 28.556162 77.09996
## 23 23 DHARAMSHALA 32.164453 76.26312
## 24 24 DIBRUGARH 27.484037 95.02289
## 25 25 DIMAPUR 25.880152 93.77293
## 26 26 DIU 20.713320 70.92500
## 27 27 DURGAPUR 23.617635 87.23989
## 28 28 GAYA 24.748822 84.94373
## 29 29 GOA 15.380349 73.83500
## 30 30 GORAKHPUR 26.737748 83.45281
## 31 31 GUWAHATI 26.106473 91.58605
## 32 32 GWALIOR 26.285499 78.21717
## 33 33 HUBLI 15.360014 75.08620
## 34 34 HYDERABAD 17.240263 78.42938
## 35 35 IMPHAL 24.764401 93.89727
## 36 36 INDORE 22.721994 75.80281
## 37 37 JABALPUR 23.178753 80.05283
## 38 38 JAIPUR 26.825486 75.80533
## 39 39 JAMMU 32.687566 74.83881
## 40 40 JAMNAGAR 22.460711 70.01590
## 41 41 JODHPUR 26.264248 73.05057
## 42 42 JORHAT 26.734678 94.18475
## 43 43 KANPUR (CHAKERI) 26.409181 80.40953
## 44 44 KHAJURAHO 24.818791 79.91663
## 45 45 KOLKATA 22.652043 88.44633
## 46 46 KULLU 31.876485 77.15390
## 47 47 LEH 34.142505 77.55572
## 48 48 LILABARI 27.288744 94.09314
## 49 49 LUCKNOW 26.761732 80.88565
## 50 50 MADURAI 9.836831 78.08945
## 51 51 MANGALORE 12.954508 74.88565
## 52 52 MUMBAI (BOMBAY) 19.089560 72.86561
## 53 53 NAGPUR 21.088115 79.04510
## 54 54 NANDED 19.183297 77.33476
## 55 55 PANT NAGAR 29.031055 79.47334
## 56 56 PATNA 25.590693 85.08804
## 57 57 PORT BLAIR 11.650336 92.73278
## 58 58 PUNE 18.581550 73.91988
## 59 59 RAIPUR 21.184250 81.74172
## 60 60 RAJAHMUNDRY 17.107290 81.81778
## 61 61 RAJKOT 22.310911 70.77948
## 62 62 RANCHI 23.316049 85.32219
## 63 63 SHILLONG 25.706209 91.97508
## 64 64 SILCHAR 24.915985 92.97946
## 65 65 SRINAGAR 33.992277 74.77197
## 66 66 SURAT 21.118440 72.74342
## 67 67 TEZPUR 26.709348 92.77554
## 68 68 THOISE 34.142505 77.55572
## 69 69 TIRUPATHI 13.634490 79.54480
## 70 70 TRICHY 10.764305 78.71508
## 71 71 TRIVANDRUM 8.483420 76.91982
## 72 72 TUTICORIN 8.723088 78.02634
## 73 73 UDAIPUR 24.618157 73.89850
## 74 74 VADODARA 22.333173 73.22449
## 75 75 VARANASI 25.451857 82.86162
## 76 76 VIJAYAWADA 16.529554 80.79702
## 77 77 VIZAG 17.721987 83.23198
We use tidygraph to convert our tables into a graph of nodes and edges for our analysis.
flights = tbl_graph(nodes=airports,edges=routes,directed=TRUE)
flights
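Since numeric edge endpoints are matched against the rows of the node table, it can help to check beforehand that every source and target refers to an existing airport Id. The following is only a sketch on invented data; check_endpoints is a hypothetical helper, not part of tidygraph.

```r
# Hypothetical helper: returns the edges whose endpoints are not valid node ids.
check_endpoints <- function(nodes, edges) {
  bad <- !(edges$source %in% nodes$Id) | !(edges$target %in% nodes$Id)
  edges[bad, , drop = FALSE]
}

# Made-up nodes and edges; the last edge points at a node that does not exist.
toy_nodes <- data.frame(Id = 1:3, Label = c("A", "B", "C"))
toy_edges <- data.frame(source = c(1, 2, 3), target = c(2, 3, 4))

dangling <- check_endpoints(toy_nodes, toy_edges)
print(dangling)
```

An empty result means every edge can be attached to a node.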
Converting the data into tidygraph format lets us calculate the centralities and other network statistics easily, inside the same pipeline; with igraph alone they had to be calculated separately.
# Compute the centralities on the nodes. .E() exposes the edge data, so the eigenvector centrality can be weighted by Frequency.
flights <- flights %>%
  mutate(betweenness_centrality = centrality_betweenness(normalized = TRUE)) %>%
  mutate(closeness_centrality = centrality_closeness(normalized = TRUE)) %>%
  mutate(centrality_degree = centrality_degree(mode = 'out', normalized = TRUE)) %>%
  mutate(centrality_eigen = centrality_eigen(weights = .E()$Frequency, directed = TRUE))
flights
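As a sanity check on what these numbers mean, the normalised out-degree can be reproduced by hand: the count of outgoing edges per node divided by n - 1. The sketch below uses base R on invented data and assumes that normalisation. The other centralities are more involved and are best left to tidygraph.

```r
# Toy directed edge list over 4 nodes (values are made up).
n_nodes    <- 4
toy_source <- c(1, 1, 1, 2, 3)
toy_target <- c(2, 3, 4, 3, 1)

# Out-degree: how many edges leave each node.
out_degree <- tabulate(toy_source, nbins = n_nodes)

# Normalise by the largest possible number of distinct destinations, n - 1.
out_degree_norm <- out_degree / (n_nodes - 1)
print(out_degree_norm)
```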
Traditional method of plotting a map using ggmap and ggplot.
G=flights
# Normally we need to create two data frames in order to plot both the network data and the coordinates on a map.
# plot_vector holds the longitude and latitude to be plotted on the map.
# plot_vector1 holds the lon, lat and the network centrality values.
plot_vector<- as.data.frame(cbind(V(G)$lon,V(G)$lat))
plot_vector1<- as.data.frame(cbind(V(G)$lon,V(G)$lat,V(G)$betweenness_centrality,V(G)$closeness_centrality))
# The edge list gives the origin and destination airport of each route.
# edgelist[,1] holds the origin values and edgelist[,2] holds the destination values.
edgelist <- get.edgelist(G)
edgelist[,1]<-as.numeric(match(edgelist[,1],V(G)$Id))
edgelist[,2]<-as.numeric(match(edgelist[,2],V(G)$Id))
# edges now holds the origin and destination coordinates of every route
edges <- data.frame(plot_vector[edgelist[,1],], plot_vector[edgelist[,2],])
# Name the columns used for plotting
colnames(edges) <- c("X1","Y1","X2","Y2")
# ggmap is the base layer. On top of it we draw the edges with geom_segment, then a plain point for every airport, and finally points sized by betweenness centrality (V3) and coloured by closeness centrality (V4) from plot_vector1.
z <- p +
  geom_segment(aes(x = X1, y = Y1, xend = X2, yend = Y2), color = 'yellow', data = edges, arrow = arrow(length = unit(0.2, "cm"))) +
  geom_point(aes(V1, V2), data = plot_vector1) +
  geom_point(aes(x = V1, y = V2, size = V3, colour = V4), data = plot_vector1) +
  scale_colour_gradientn(colours = rainbow(3)) +
  scale_size_continuous(range = c(1, 10))
z
We have the map and the coordinates, but one thing is missing: curved lines. To draw curved edges we would have to use the gcIntermediate function from the geosphere package (http://kateto.net/network-visualization). I tried using geom_curve, which didn't work here.
So with the traditional method the number of lines of code increases, and the code becomes complicated as we try to refine the graph.
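To illustrate what the curved-line approach involves, here is a base R sketch that computes intermediate points along the great circle between two airports. It is a hand-rolled stand-in for the idea, not the geosphere implementation; great_circle_points is a hypothetical helper, and it assumes two distinct, non-antipodal points.

```r
# Hypothetical helper: n points along the great circle between two lon/lat pairs.
great_circle_points <- function(lon1, lat1, lon2, lat2, n = 20) {
  to_rad <- pi / 180
  p1 <- c(lon1, lat1) * to_rad
  p2 <- c(lon2, lat2) * to_rad
  # Angular distance between the two points (haversine formula).
  d <- 2 * asin(sqrt(sin((p2[2] - p1[2]) / 2)^2 +
                     cos(p1[2]) * cos(p2[2]) * sin((p2[1] - p1[1]) / 2)^2))
  # Fractions along the path, from 0 (origin) to 1 (destination).
  f <- seq(0, 1, length.out = n)
  a <- sin((1 - f) * d) / sin(d)
  b <- sin(f * d) / sin(d)
  # Interpolate in Cartesian coordinates on the unit sphere.
  x <- a * cos(p1[2]) * cos(p1[1]) + b * cos(p2[2]) * cos(p2[1])
  y <- a * cos(p1[2]) * sin(p1[1]) + b * cos(p2[2]) * sin(p2[1])
  z <- a * sin(p1[2]) + b * sin(p2[2])
  data.frame(lon = atan2(y, x) / to_rad,
             lat = atan2(z, sqrt(x^2 + y^2)) / to_rad)
}

# Delhi to Chennai, using the coordinates from the airport table above.
arc <- great_circle_points(77.09996, 28.556162, 80.17087, 12.994112, n = 10)
print(arc)
```

The resulting data frame could be drawn with geom_path, one path per route, which is exactly the kind of extra code the traditional method keeps accumulating.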
Plotting the network data using leaflet.
# node3 holds the origin and destination airport of every edge, for plotting on the map
node1=data.frame(plot_vector1[edgelist[,1],])
node2=data.frame(plot_vector1[edgelist[,2],])
node3=data.frame(cbind(node1,node2))
# A for loop then adds each origin and destination airport to the leaflet map
map3 = leaflet(node3) %>% addTiles()
for(i in 1:nrow(node3)){
  map3 <- addPolylines(map3, lat = as.numeric(node3[i, c(2, 6)]),
                       lng = as.numeric(node3[i, c(1, 5)]))
  #origin
  map3 <- addCircleMarkers(map3, lat = as.numeric(node3[i, 2]),
                           lng = as.numeric(node3[i, 1]), radius = (as.numeric(node3[i, 4]))*100, color = 'red')
  #destination
  map3 <- addCircleMarkers(map3, lat = as.numeric(node3[i, 6]),
                           lng = as.numeric(node3[i, 5]), radius = (as.numeric(node3[i, 8]))*100, color = 'red')
}
map3
We have now plotted the network with leaflet, but the issue is that leaflet interprets every marker size as a number of pixels and does not rescale the values when plotting them on the map. Because the marker size depends on the centrality values, which change every time they are recalculated, we have to adjust the values manually by multiplying or dividing by a power of 10. Without this manual adjustment the markers may appear very large or very small each time the centralities are calculated for a different air carrier. In the example above I multiplied the node sizes by a factor of 100.
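One way to avoid picking a new factor by hand each time would be to rescale the centrality values onto a fixed pixel range before drawing. This is a sketch of that idea, not what the code above does; rescale_radius and its default range of 3 to 20 pixels are assumptions chosen for illustration.

```r
# Hypothetical helper: map any numeric vector onto a fixed pixel range.
rescale_radius <- function(x, min_px = 3, max_px = 20) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    # All values identical: use the midpoint of the pixel range.
    return(rep((min_px + max_px) / 2, length(x)))
  }
  min_px + (x - rng[1]) / (rng[2] - rng[1]) * (max_px - min_px)
}

toy_centrality <- c(0.002, 0.010, 0.050)   # made-up centrality values
radii <- rescale_radius(toy_centrality)
print(radii)
```

The result could be passed as the radius argument of addCircleMarkers in place of the hard-coded factor, so the smallest and largest markers stay the same size whichever carrier is plotted.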
ggmap and ggraph:
Here our base layer will be ggraph instead of ggmap.
# Tell ggraph to use the longitude and latitude as the layout instead of the default "nicely" layout.
flights$layout=cbind(V(flights)$lon,V(flights)$lat)
# Arrow formatting
a <- arrow(type = "closed", length = unit(.09, "inches"))
# Tell ggmap to take ggraph as its base layer, with the coordinates as the layout.
# geom_edge_arc gives us the curved edges; edge_width is the ggraph aesthetic for line thickness.
g <- ggmap(map, fullpage = TRUE, base_layer = ggraph(flights)) +
  geom_edge_arc(aes(edge_width = Frequency), edge_colour = 'yellow', edge_alpha = 0.5, curvature = 0.2, arrow = a, end_cap = circle(.008, 'inches')) +
  geom_node_point(aes(colour = closeness_centrality, size = betweenness_centrality)) +
  scale_colour_gradientn(colours = rainbow(3)) +
  scale_size_continuous(range = c(1, 10))
plot(g)
The advantages of using ggmap and ggraph
A few lines of code are enough to plot the graph data on the map.
We can now use all the functionality of ggraph on a map, which was not possible previously using ggplot.
Combining tidygraph with ggmap and ggraph enables us to calculate the centralities and plot them on the map efficiently.
We can also use the facet_edges and facet_nodes functionality of ggraph.
Note: when integrating ggmap and ggplot into Shiny, we have to cache the data every time, using "<<-".
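The note above appears to refer to R's superassignment operator `<<-`. Assuming that reading, the following base R sketch shows how it can cache a result so that an expensive step runs only once. The names make_cached_loader and slow_load are hypothetical, and the real Shiny app may be organised differently.

```r
# Hypothetical example: cache an expensive result in the enclosing environment.
make_cached_loader <- function(load_fun) {
  cached <- NULL
  function() {
    if (is.null(cached)) {
      cached <<- load_fun()   # `<<-` assigns to `cached` in the enclosing scope
    }
    cached
  }
}

calls <- 0
slow_load <- function() {
  calls <<- calls + 1         # count how often the expensive step really runs
  data.frame(Id = 1:3)
}

get_data <- make_cached_loader(slow_load)
first  <- get_data()
second <- get_data()
print(calls)
```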
# flights_facet holds the data for all airlines/air carriers.
flights_facet <- tbl_graph(nodes = airports, edges = routes1, directed = TRUE)
# The eigenvector centrality is left unweighted: the original weights argument pointed at the flights graph, not at flights_facet.
flights_facet <- flights_facet %>%
  mutate(betweenness_centrality = centrality_betweenness()) %>%
  mutate(closeness_centrality = centrality_closeness()) %>%
  mutate(centrality_degree = centrality_degree(mode = 'out', normalized = FALSE)) %>%
  mutate(centrality_eigen = centrality_eigen(directed = FALSE))
flights_facet$layout=cbind(V(flights_facet)$lon,V(flights_facet)$lat)
# One panel per carrier via facet_edges
g <- ggmap(map, fullpage = TRUE, base_layer = ggraph(flights_facet)) +
  geom_edge_arc(aes(), edge_colour = 'yellow', edge_alpha = 0.5, curvature = 0.2, arrow = a, end_cap = circle(.008, 'inches')) +
  geom_node_point(aes(colour = closeness_centrality, size = betweenness_centrality)) +
  scale_colour_gradientn(colours = rainbow(3)) +
  scale_size_continuous(range = c(1, 10)) +
  facet_edges(~Carrier)
g
Adding tooltips to the map.
Library used: ggiraph
flights$tooltip=V(flights)$Label
g <- ggmap(map, fullpage = TRUE, base_layer = ggraph(flights), environment = environment()) +
  geom_edge_arc(aes(), edge_alpha = 0.5, curvature = 0.2, arrow = a, end_cap = circle(.008, 'inches')) +
  geom_node_point(aes(colour = closeness_centrality, size = betweenness_centrality)) +
  scale_colour_gradientn(colours = rainbow(3))
## Warning: fullpage and expand syntaxes deprecated, use extent.
## Using `nicely` as default layout
## Using `nicely` as default layout
## Warning: `panel.margin` is deprecated. Please use `panel.spacing` property
## instead
a=g + geom_point_interactive(x=V(flights)$lon,y=V(flights)$lat,tooltip = V(flights)$Label,size=3,alpha=0.01)
ggiraph(code = {print(a)})
## Warning: Removed 2079 rows containing missing values (geom_edge_path).
## Warning: Removed 8 rows containing missing values (geom_point).
The finished work can be seen in my Shiny app:
https://debasishb.shinyapps.io/shinynet/
https://wiki.smu.edu.sg/1617t3isss608g1/ShinyNET