R is a language and environment for statistical computing and graphics. R’s strengths are:
Let’s read and analyse a small social network of dolphins with R. Doubtful Sound is a fiord in New Zealand that is home to one of the southernmost populations of bottlenose dolphins. None of the dolphins was observed to leave or enter the fiord during a multi-year monitoring regime, which is partly attributed to the difficult and unusual features of their habitat. Their social grouping is thus extremely close.
David Lusseau observed the school of dolphins between 1995 and 2001. He then built a social network with 62 dolphins and 159 undirected ties representing preferred companionships, defined as pairs of individuals that were seen together more often than expected by chance.
To start, if not done already, install the igraph package for network analysis from an R environment, such as RStudio:
install.packages("igraph")
Load the igraph package:
library("igraph")
Read the social network from a copy on the Web:
g = read_graph(file="http://users.dimi.uniud.it/~massimo.franceschet/teaching/datascience/network/R/dolphin.gml", format="gml")
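If the remote copy is unavailable, a local GML file can be read in the same way. The following is a minimal sketch, assuming an igraph release that provides `read_graph`, `write_graph` and `graph_from_edgelist`; the toy graph and the temporary file are purely illustrative and are not part of the dolphin data.

```r
library(igraph)

# hypothetical toy graph: three nodes on a path, with a label attribute
toy = graph_from_edgelist(matrix(c(1, 2, 2, 3), ncol = 2, byrow = TRUE), directed = FALSE)
V(toy)$label = c("A", "B", "C")

# write the graph to a temporary GML file, then read it back
path = tempfile(fileext = ".gml")
write_graph(toy, path, format = "gml")
toy2 = read_graph(path, format = "gml")

# the node and edge counts survive the round trip
vcount(toy2)
ecount(toy2)
```

For the dolphin network, the `file` argument would simply point to a downloaded copy of dolphin.gml.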
Print a summary of the graph:
summary(g)
## IGRAPH U--- 62 159 --
## attr: id (v/n), label (v/c), sex (v/c)
The graph is undirected (U) and has 62 nodes and 159 edges. Notice that the nodes have three attributes: id (a numeric identifier), label (the name of the dolphin), and sex (the sex of the dolphin). In the summary, v/n marks a numeric vertex attribute and v/c a character one.
Print number of nodes:
vcount(g)
## [1] 62
Print number of edges:
ecount(g)
## [1] 159
Print nodes:
V(g)
## Vertex sequence:
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
## [24] 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46
## [47] 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62
Print a sample of 10 edges:
E(g)[sample(1:ecount(g), 10)]
## Edge sequence:
##
## [54] 35 -- 34
## [57] 37 -- 21
## [149] 58 -- 55
## [145] 58 -- 18
## [83] 43 -- 11
## [3] 10 -- 7
## [22] 21 -- 19
## [39] 30 -- 11
## [53] 35 -- 15
## [135] 55 -- 20
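Because `sample` draws at random, the ten edges shown will differ from run to run. Here is a minimal sketch of how fixing the random seed makes the draw repeatable; the seed value 42 is an arbitrary choice, and 159 is the edge count reported earlier.

```r
# fix the random seed so that the same sample is drawn each time
set.seed(42)
first = sample(1:159, 10)

set.seed(42)
second = sample(1:159, 10)

# the two draws are identical
identical(first, second)
```

In the session above, `E(g)[first]` would then always select the same ten edges.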
Visualize the graph (Fruchterman-Reingold layout):
coords = layout_with_fr(g)
plot(g, layout=coords, vertex.label=NA, vertex.size=5)
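The Fruchterman-Reingold layout starts from random positions, so the picture can change between runs. The sketch below suggests one way to make the coordinates repeatable, assuming an igraph release that provides `layout_with_fr` and `make_ring`; the ring graph and the seed value are illustrative stand-ins, not part of the dolphin data.

```r
library(igraph)

# hypothetical toy graph: a ring of 10 nodes
ring = make_ring(10)

# compute the layout twice, resetting the seed before each call
set.seed(1)
coords1 = layout_with_fr(ring)
set.seed(1)
coords2 = layout_with_fr(ring)

# one row per node, one column per coordinate
dim(coords1)

# the two layouts coincide
all.equal(coords1, coords2)
```

Storing the coordinates once and reusing them in every `plot` call, as done in this tutorial, has the same effect within a single session.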
View names of dolphins:
V(g)$label
## [1] "Beak" "Beescratch" "Bumper" "CCL" "Cross"
## [6] "DN16" "DN21" "DN63" "Double" "Feather"
## [11] "Fish" "Five" "Fork" "Gallatin" "Grin"
## [16] "Haecksel" "Hook" "Jet" "Jonah" "Knit"
## [21] "Kringel" "MN105" "MN23" "MN60" "MN83"
## [26] "Mus" "Notch" "Number1" "Oscar" "Patchback"
## [31] "PL" "Quasi" "Ripplefluke" "Scabs" "Shmuddel"
## [36] "SMN5" "SN100" "SN4" "SN63" "SN89"
## [41] "SN9" "SN90" "SN96" "Stripes" "Thumper"
## [46] "Topless" "TR120" "TR77" "TR82" "TR88"
## [51] "TR99" "Trigger" "TSN103" "TSN83" "Upbang"
## [56] "Vau" "Wave" "Web" "Whitetip" "Zap"
## [61] "Zig" "Zipfel"
View sex of dolphins:
V(g)$sex
## [1] "M" "M" "M" "F" "M" "F" "M" "M" "F" "M" "F" "F" "M" "M" "F" "M" "F"
## [18] "M" "M" "M" "F" "M" "M" "M" "M" "M" "M" "M" "M" "M" "M" "M" "U" "F"
## [35] "F" "M" "F" "F" "F" "F" "F" "M" "M" "F" "M" "M" "F" "F" "U" "F" "F"
## [52] "F" "F" "U" "M" "F" "F" "M" "F" "U" "M" "M"
Count male and female dolphins:
male = V(g)$sex == "M"
male
## [1] TRUE TRUE TRUE FALSE TRUE FALSE TRUE TRUE FALSE TRUE FALSE
## [12] FALSE TRUE TRUE FALSE TRUE FALSE TRUE TRUE TRUE FALSE TRUE
## [23] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE
## [34] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE
## [45] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
## [56] FALSE FALSE TRUE FALSE FALSE TRUE TRUE
sum(male)
## [1] 33
female = V(g)$sex == "F"
sum(female)
## [1] 25
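The same counts, including the dolphins of unknown sex, can be obtained in one step with `table`. The sketch below rebuilds a stand-in sex vector from the figures reported above (33 males and 25 females, hence 4 of the 62 dolphins marked "U") rather than from the network itself, so the ordering of its values is an assumption.

```r
# stand-in for V(g)$sex, rebuilt from the counts above
sex = c(rep("M", 33), rep("F", 25), rep("U", 4))

# frequency of each category, listed in alphabetical order
counts = table(sex)
counts

# proportion of each category
round(prop.table(counts), 2)
```

In the session above, the equivalent call would be `table(V(g)$sex)`.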
Visualize the graph with nodes colored by sex:
V(g)$color = "white"
V(g)[male]$color = "blue"
V(g)[female]$color = "pink"
plot(g, layout=coords, vertex.label=NA, vertex.size=5)
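An alternative to assigning colors group by group is to look them up in a named vector. This is only a sketch on a short stand-in vector; the names `sex.colors` and `node.color` are illustrative.

```r
# stand-in for V(g)$sex
sex = c("M", "F", "U", "M")

# one color per category, looked up by name
sex.colors = c(M = "blue", F = "pink", U = "white")
node.color = unname(sex.colors[sex])
node.color
```

In the session above, the same lookup could be written as `V(g)$color = unname(sex.colors[V(g)$sex])`, which colors every node in a single assignment.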
Compute the number of friends of each dolphin (the node degrees) and print some descriptive statistics:
d = degree(g)
summary(d)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 3.000 5.000 5.129 7.000 12.000
Or use single statistics:
min(d)
## [1] 1
max(d)
## [1] 12
mean(d)
## [1] 5.129032
median(d)
## [1] 5
sd(d)
## [1] 2.955871
quantile(d, seq(0, 1, 0.1))
## 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
## 1.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 7.8 9.0 12.0
Plot a histogram of the number of friends:
hist(d, xlab="Number of friends", main="Dolphin friendship", breaks=0:max(d))
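With `breaks=0:max(d)`, each bar of the histogram covers exactly one integer degree, provided that no node has degree 0 (a degree of 0 would fall into the first bar together with degree 1). The sketch below checks this on a small stand-in degree vector, not on the dolphin data.

```r
# stand-in for the degree vector d
deg = c(1, 1, 2, 3, 3, 3, 5)

# bar heights of the histogram, computed without drawing it
h = hist(deg, breaks = 0:max(deg), plot = FALSE)
h$counts

# number of nodes with degree 1, 2, ..., max(deg)
tabulate(deg, nbins = max(deg))
```

Both calls return the same counts, so the bars can be read as the degree distribution of the network.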
The top-10 most friendly dolphins:
names(d) = V(g)$label
sort(d, decreasing=TRUE)[1:10]
## Grin SN4 Topless Scabs Trigger Jet
## 12 11 11 10 10 9
## Kringel Patchback Web Beescratch
## 9 9 9 8
The average number of friends of males and females:
mean(d[male])
## [1] 5.060606
mean(d[female])
## [1] 5.6
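The group means can also be computed in a single call with `tapply`, which additionally reports the dolphins of unknown sex. This is a sketch on made-up stand-in values, not on the dolphin data.

```r
# stand-ins for the degree vector d and for V(g)$sex
deg = c(4, 6, 2, 8, 5)
sex = c("M", "F", "M", "F", "U")

# mean degree within each sex category
group.mean = tapply(deg, sex, mean)
group.mean
```

In the session above, the equivalent call would be `tapply(d, V(g)$sex, mean)`.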
Let’s check whether the network is assortative by sex, that is, whether dolphins prefer to tie with dolphins of the same sex.
# the subgraph of dolphins with known sex
h = induced_subgraph(g, V(g)[male | female])
# males and females of the new graph
male.h = V(h)$sex == "M"
female.h = V(h)$sex == "F"
# edges as a matrix
edge = as_edgelist(h)
# some edges
edge[1:10,]
## [,1] [,2]
## [1,] 4 9
## [2,] 6 10
## [3,] 7 10
## [4,] 1 11
## [5,] 3 11
## [6,] 6 14
## [7,] 7 14
## [8,] 10 14
## [9,] 1 15
## [10,] 4 15
# first nodes of the edges
edge[1:10, 1]
## [1] 4 6 7 1 3 6 7 10 1 4
# second nodes of the edges
edge[1:10, 2]
## [1] 9 10 10 11 11 14 14 14 15 15
# all first nodes of the edges
# each edge is considered twice since it is undirected
x = c(edge[,1], edge[,2])
# all second nodes of the edges
y = c(edge[,2], edge[,1])
# sex of all first nodes of the edges
sx = male.h[x]
# sex of all second nodes of the edges
sy = male.h[y]
# Pearson correlation between the sex indicators at the two ends of each edge
cor(sx, sy)
## [1] 0.4014706
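For an attribute with two categories, this correlation should coincide with the assortativity coefficient for categorical attributes. The sketch below compares the two on a hypothetical four-node path, assuming an igraph release that provides `assortativity_nominal`; the variable names are illustrative and chosen so as not to overwrite those used in this tutorial.

```r
library(igraph)

# hypothetical toy graph: a path M - M - F - F
toy = graph_from_edgelist(matrix(c(1, 2, 2, 3, 3, 4), ncol = 2, byrow = TRUE), directed = FALSE)
is.male = c(TRUE, TRUE, FALSE, FALSE)

# correlation over both orientations of every edge, as in the text
e = as_edgelist(toy)
u = c(e[, 1], e[, 2])
v = c(e[, 2], e[, 1])
r.cor = cor(is.male[u], is.male[v])

# igraph's coefficient for categorical attributes;
# categories are coded as positive integers (1 = male, 2 = female)
r.igraph = assortativity_nominal(toy, ifelse(is.male, 1, 2), directed = FALSE)

c(r.cor, r.igraph)
```

On this toy path both values equal 1/3: two of the three edges join nodes of the same sex.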
# number and proportion of male to male edges
m2m = sum(sx & sy) / 2
m2m
## [1] 58
m2m / ecount(h)
## [1] 0.3918919
# number and proportion of female to female edges
f2f = sum(!(sx | sy)) / 2
f2f
## [1] 46
f2f / ecount(h)
## [1] 0.3108108
# number and proportion of different sex edges
m2f = sum(xor(sx, sy)) / 2
m2f
## [1] 44
m2f / ecount(h)
## [1] 0.2972973
# also equal to:
ecount(h) - (m2m + f2f)
## [1] 44
# proportion of same sex edges
(m2m + f2f) / ecount(h)
## [1] 0.7027027
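The correlation of about 0.40 can be recovered from the three edge counts alone, which also shows what it measures: the excess of same-sex edges over what random pairing of edge ends would produce, rescaled so that 1 means only same-sex ties. The following is a worked check using only the counts printed above; the names `n`, `p`, `expected`, `observed` and `r` are illustrative.

```r
# edge counts reported above
m2m = 58   # male to male
f2f = 46   # female to female
m2f = 44   # different sex
n = m2m + f2f + m2f

# share of edge ends that are attached to males
p = (2 * m2m + m2f) / (2 * n)

# same-sex share expected if edge ends were paired at random
expected = p^2 + (1 - p)^2
# same-sex share actually observed
observed = (m2m + f2f) / n

# rescaled excess of same-sex edges over the random baseline
r = (observed - expected) / (1 - expected)
round(c(expected = expected, observed = observed, r = r), 4)
```

About 50% of the edges would join dolphins of the same sex under random pairing, against the 70% observed, and the rescaled difference reproduces the correlation computed earlier.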