#### Part 1: Replicate the Procedures
We split the author strings in the author-list data set to create an edge list and an adjacency matrix. The matrix records how often each pair of authors has collaborated.
a<-read.csv("~/Library/CloudStorage/Box-Box/Taurean/Year_1/SSNA/ID803.csv")
a<-as.data.frame(a[,1])
colnames(a)<-"AU"
a
## AU
## 1 LANGSTON, WJ: BEBIANNO, MJ: MINGJIANG, Z
## 2 BEBIANNO, MJ: LANGSTON, WJ
## 3 BEBIANNO, MJ: LANGSTON, WJ: SIMKISS, K
## 4 BEBIANNO, MJ: LANGSTON, WJ
## 5 NOTT, JA: BEBIANNO, MJ: LANGSTON, WJ: RYAN, KP
## 6 BEBIANNO, MJ: NOTT, JA: LANGSTON, WJ
## 7 BEBIANNO, MJ: LANGSTON, WJ
## 8 BEBIANNO, MJ: SERAFIM, MAP: RITA, MF
## 9 BEBIANNO, MJ: LANGSTON, WJ
## 10 BEBIANNO, MJ
## 11 Mudge, SM: Bebianno, MJ
## 12 Bebianno, MJ: Machado, LM
## 13 Caetano, M: Falcao, M: Vale, C: Bebianno, MJ
## 14 Gibbs, PE: Bebianno, MJ: Coelho, MR
## 15 Mudge, SM: East, JA: Bebianno, MJ: Barreira, LA
## 16 Bebianno, MJ: Langston, WJ
## 17 Bebianno, MJ: Serafim, MA
## 18 Gibbs, PE: Nott, JA: Nicolaidou, A: Bebianno, MJ
## 19 Bebianno, MJ: Langston, WJ
## 20 Cajaraville, MP: Bebianno, MJ: Blasco, J: Porte, C: Sarasquete, C: Viarengo, A
## 21 Bebianno, MJ: Serafim, MA: Simes, D
## 22 Serafim, MA: Bebianno, MJ
## 23 Coelho, MR: Fuentes, S: Bebianno, MJ
## 24 Barroso, CM: Moreira, MH: Bebianno, MJ
## 25 Dabrio, M: Rodriguez, AR: Bordin, G: Bebianno, MJ: De Ley, M: Sestakova, I: Vasak, M: Nordberg, M
## 26 Geret, F: Jouan, A: Turpin, V: Bebianno, MJ: Cosson, RP
## 27 Geret, F: Serafim, A: Barreira, L: Bebianno, MJ
## 28 Coelho, MR: Bebianno, MJ: Langston, WJ
## 29 Coelho, MR: Bebianno, MJ: Langston, WJ
## 30 Coelho, MR: Bebianno, MJ: Langston, WJ
## 31 Cravo, A: Foster, P: Bebianno, MJ
## 32 Serafim, MA: Company, RM: Bebianno, MJ: Langston, WJ
## 33 Geret, F: Serafim, A: Barreira, L: Bebianno, MJ
## 34 Bebianno, MJ: Serafim, MA
## 35 Bebianno, MJ: Cravo, A: Miguel, C: Morais, S
## 36 Simes, DC: Bebianno, MJ: Moura, JJG
## 37 Brown, CJ: Eaton, RA: Cragg, SM: Goulletquer, P: Nicolaidou, A: Bebianno, MJ: Icely, J: Daniel, G: Nilsson, T: Pitman, AJ: Sawyer, GS
## 38 Cravo, A: Madureira, M: Rita, F: Silva, JA: Bebianno, MJ
## 39 Geret, F: Serafim, A: Bebianno, MJ
## 40 Geret, F: Bebianno, MJ
## 41 Cravo, A: Bebianno, MJ: Foster, P
## 42 Company, R: Serafim, A: Bebianno, MJ: Cosson, R: Shillito, B: Fiala-Medioni, A
## 43 Geret, F: Manduzio, H: Company, R: Leboulenger, F: Bebianno, MJ: Danger, JM
## 44 Smaoui-Damak, W: Hamza-Chaffai, A: Bebianno, MJ: Amiard, JC
## 45 Martins, N: Lopes, I: Guilhermino, L: Bebianno, MJ: Ribeiro, R
## 46 Barreira, LA: Bebianno, MJ: Mudge, SM: Ferreira, AM: Albino, CI: Veriato, LM
## 47 Morgado, ME: Bebianno, MJ
## 48 Cravo, A: Bebianno, MJ
## 49 Bebianno, MJ: Company, R: Serafim, A: Camus, L: Cosson, RP: Fiala-Medoni, A
## 50 Gomes, TCM: Serafim, MA: Company, RS: Bebianno, MJ
## 51 Serafim, A: Company, R: Cravo, A: Bebianno, MJ
## 52 Company, R: Serafim, A: Cosson, R: Camus, L: Shillito, B: Fiala-Medioni, A: Bebianno, MJ
## 53 Coelho, MR: Langston, WJ: Bebianno, MJ
## 54 Hoarau, P: Damiens, G: Romeo, M: Gnassia-Barelli, M: Bebianno, MJ
## 55 Company, R: Serafim, A: Cosson, R: Fiala-Medioni, A: Dixon, D: Bebianno, MJ
## 56 Cravo, A: Madureira, M: Felicia, H: Rita, F: Bebianno, MJ
## 57 Guimaraes-Soares, L: Felicia, H: Bebianno, MJ: Cassio, F
## 58 Barreira, LA: Mudge, SM: Bebianno, MJ
## 59 Barreira, LA: Mudge, SM: Bebianno, MJ
## 60 Serafim, A: Bebianno, MJ
## 61 Fernandes, D: Porte, C: Bebianno, MJ
## 62 Company, R: Serafim, A: Cosson, R: Fiala-Medioni, A: Dixon, DR: Bebianno, MJ
## 63 Gonzalez-Rey, M: Serafim, A: Company, R: Bebianno, MJ
## 64 Barreira, LA: Mudge, SM: Bebianno, MJ
## 65 Bebianno, MJ: Lopes, B: Guerra, L: Hoarau, P: Ferreira, AM
## 66 Serafim, A: Bebianno, MJ
## 67 Cravo, A: Foster, P: Almeida, C: Company, R: Cosson, RP: Bebianno, MJ
## 68 Bebianno, MJ: Santos, C: Canario, J: Gouveia, N: Sena-Carvalho, D: Vale, C
## 69 Picado, A: Bebianno, MJ: Costa, MH: Ferreira, A: Vale, C
## 70 Peixoto, NC: Serafim, MA: Flores, EMM: Bebianno, MJ: Pereira, ME
## 71 Fernandes, D: Bebianno, MJ: Porte, C
## 72 Serafim, A: Lopes, B: Company, R: Ferreira, AM: Bebianno, MJ
## 73 Company, R: Serafim, A: Cosson, RP: Fiala-Medioni, A: Camus, L: Colaco, A: Serrao-Santos, R: Bebianno, MJ
## 74 Fernandes, D: Bebianno, MJ: Porte, C
## 75 Fernandes, D: Zanuy, S: Bebianno, MJ: Porte, C
## 76 Marques, CC: Gabriel, SI: Pinheiro, T: Viegas-Crespo, AM: Mathias, MD: Bebianno, MJ
## 77 Oliveira, M: Serafim, A: Bebianno, MJ: Pacheco, M: Santos, MA
## 78 Cravo, A: Foster, P: Almeida, C: Bebianno, MJ: Company, R
## 79 Cosson, RP: Thiebaut, E: Company, R: Castrec-Rouelle, M: Colaco, A: Martins, I: Sarradin, PM: Bebianno, MJ
## 80 Peixoto, NC: Rocha, LC: Moraes, DP: Bebianno, MJ: Dressler, VL: Flores, EMM: Pereira, ME
## 81 Gonzalez-Rey, M: Serafim, A: Company, R: Gomes, T: Bebianno, MJ
## 82 Chora, S: McDonagh, B: Sheehan, D: Starita-Geribaldi, M: Romeo, M: Bebianno, MJ
## 83 Fernandes, D: Andreu-Sanchez, O: Bebianno, MJ: Porte, C
## 84 Company, R: Serafim, A: Lopes, B: Cravo, A: Shepherd, TJ: Pearson, G: Bebianno, MJ
## 85 Ahmad, I: Maria, VL: Oliveira, M: Serafim, A: Bebianno, MJ: Pacheco, M: Santos, MA
## 86 Cravo, A: Lopes, B: Serafim, A: Company, R: Barreira, L: Gomes, T: Bebianno, MJ
## 87 Fonseca, V: Serafim, A: Company, R: Bebianno, MJ: Cabral, H
## 88 Oliveira, M: Maria, VL: Ahmad, I: Serafim, A: Bebianno, MJ: Pacheco, M: Santos, MA
## 89 Serafim, A: Bebianno, MJ
## 90 Maria, VL: Santos, MA: Bebianno, MJ
## 91 Bebianno, MJ: Barreira, LA
## 92 Maria, VL: Ahmad, I: Oliveira, M: Serafim, A: Bebianno, MJ: Pacheco, M: Santos, MA
## 93 Chora, S: Starita-Geribaldi, M: Guigonis, JM: Samson, M: Romeo, M: Bebianno, MJ
## 94 Azevedo, JS: Serafim, A: Company, R: Braga, ES: Favaro, DI: Bebianno, MJ
## 95 Maria, VL: Santos, MA: Bebianno, MJ
## 96 Fernandes, D: Bebianno, MJ: Porte, C
## 97 Gomes, T: Gonzalez-Rey, M: Bebianno, MJ
## 98 Oliveira, M: Ahmad, I: Maria, VL: Serafim, A: Bebianno, MJ: Pacheco, M: Santos, MA
## 99 Company, R: Felicia, H: Serafim, A: Almeida, AJ: Biscoito, M: Bebianno, MJ
## 100 Urena, R: Bebianno, MJ: del Ramo, J: Torreblanca, A
## 101 Company, R: Serafim, A: Cosson, RP: Fiala-Medioni, A: Camus, L: Serrao-Santos, R: Bebianno, MJ
## 102 Serafim, A: Bebianno, MJ
## 103 Chora, S: McDonagh, B: Sheehan, D: Starita-Geribaldi, M: Romeo, M: Bebianno, MJ
## 104 Oliveira, M: Maria, VL: Ahmad, I: Teles, M: Serafim, A: Bebianno, MJ: Pacheco, M: Santos, MA
## 105 Oliveira, M: Ahmad, I: Maria, VL: Ferreira, CSS: Serafim, A: Bebianno, MJ: Pacheco, M: Santos, MA
## 106 Gonzalez-Rey, M: Lau, TC: Gomes, T: Maria, VL: Bebianno, MJ: Wu, R
## 107 Company, R: Serafim, A: Lopes, B: Cravo, A: Kalman, J: Riba, I: DelValls, TA: Blasco, J: Delgado, J: Sarmiento, AM: Nieto, JM: Shepherd, TJ: Nowell, G: Bebianno, MJ
## 108 Fonseca, VF: Franca, S: Serafim, A: Company, R: Lopes, B: Bebianno, MJ: Cabral, HN
## 109 Maria, VL: Bebianno, MJ
## 110 Almeida, C: Pereira, C: Gomes, T: Bebianno, MJ: Cravo, A
## 111 Fonseca, VF: Franca, S: Vasconcelos, RP: Serafim, A: Company, R: Lopes, B: Bebianno, MJ: Cabral, HN
## 112 Gonzalez-Rey, M: Bebianno, MJ
## 113 Gomes, T: Pinheiro, JP: Cancio, I: Pereira, CG: Cardoso, C: Bebianno, MJ
## 114 Serafim, A: Lopes, B: Company, R: Cravo, A: Gomes, T: Sousa, V: Bebianno, MJ
## 115 Company, R: Antunez, O: Bebianno, MJ: Cajaraville, MP: Torreblanca, A
## 116 Gonzalez-Rey, M: Bebianno, MJ
## 117 Cravo, A: Pereira, C: Gomes, T: Cardoso, C: Serafim, A: Almeida, C: Rocha, T: Lopes, B: Company, R: Medeiros, A: Norberto, R: Pereira, R: Araujo, O: Bebianno, MJ
## 118 Lopes, B: Ferreira, AM: Bebianno, MJ
## 119 Serafim, A: Company, R: Lopes, B: Fonseca, VF: Franca, S: Vasconcelos, RP: Bebianno, MJ: Cabral, HN
## 120 Gomes, T: Pereira, CG: Cardoso, C: Pinheiro, JP: Cancio, I: Bebianno, MJ
#package for first split
# install.packages("splitstackshape")
library(splitstackshape)
#As can be seen in the file, the separator of interest is ":"
a1<-cSplit(a, splitCols = "AU", sep = ":", direction = "wide", drop = FALSE,
type.convert = TRUE) #drop = FALSE keeps the original AU column next to the split columns
## Warning in type.convert.default(X[[i]], ...): 'as.is' should be specified by the
## caller; using TRUE
#Here we just drop the original first column
a1<-a1[,-1]
#class(a1)
write.csv(a1, file = "~/Library/CloudStorage/Box-Box/Taurean/Year_1/SSNA/authorship.csv")
mat <- as.matrix(a1)
head(mat)
## AU_01 AU_02 AU_03 AU_04 AU_05 AU_06 AU_07
## [1,] "LANGSTON, WJ" "BEBIANNO, MJ" "MINGJIANG, Z" NA NA NA NA
## [2,] "BEBIANNO, MJ" "LANGSTON, WJ" NA NA NA NA NA
## [3,] "BEBIANNO, MJ" "LANGSTON, WJ" "SIMKISS, K" NA NA NA NA
## [4,] "BEBIANNO, MJ" "LANGSTON, WJ" NA NA NA NA NA
## [5,] "NOTT, JA" "BEBIANNO, MJ" "LANGSTON, WJ" "RYAN, KP" NA NA NA
## [6,] "BEBIANNO, MJ" "NOTT, JA" "LANGSTON, WJ" NA NA NA NA
## AU_08 AU_09 AU_10 AU_11 AU_12 AU_13 AU_14
## [1,] NA NA NA NA NA NA NA
## [2,] NA NA NA NA NA NA NA
## [3,] NA NA NA NA NA NA NA
## [4,] NA NA NA NA NA NA NA
## [5,] NA NA NA NA NA NA NA
## [6,] NA NA NA NA NA NA NA
mat<-tolower(mat)
dim(mat) # the pairing loop below runs once per column except the last, i.e. ncol(mat) - 1 times
## [1] 120 14
a1<-mat
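edgelist1<-matrix(NA, 1, 2) #placeholder row with two columns; the NA filter below removes it
for (i in 1:(ncol(a1)-1)) {
  #pair column i with every column to its right (a1[, i] is recycled to match)
  edgelist11 <- cbind(a1[, i], c(a1[, -c(1:i)]))
  edgelist1 <- rbind(edgelist1,edgelist11)
  #drop pairs whose second author is missing or empty
  edgelist1<-edgelist1[!is.na(edgelist1[,2]),]
  edgelist1<-edgelist1[edgelist1[,2]!="",]
}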
dim(edgelist1)
## [1] 1283 2
write.csv(edgelist1, file = "~/Library/CloudStorage/Box-Box/Taurean/Year_1/SSNA/authorship_edgelist.csv")
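As a rough check on the pairing logic, the sketch below runs the same column-pairing loop on a made-up three-paper matrix and compares it with a row-wise alternative built on `combn()`. The toy author names and the objects `toy`, `el_loop`, and `el_combn` are assumptions for illustration and are not part of the original analysis. A paper with k authors should contribute k(k-1)/2 pairs under either approach.

```r
# Toy author matrix in the same wide layout as `mat`: one row per paper,
# NA-padded on the right. The names are invented for illustration.
toy <- rbind(
  c("a", "b", "c"),
  c("a", "b", NA),
  c("d", NA,  NA)
)

# Column-pairing loop, following the same logic as the loop above.
el_loop <- matrix(NA_character_, 0, 2)
for (i in 1:(ncol(toy) - 1)) {
  pairs <- cbind(toy[, i], c(toy[, -c(1:i)]))
  el_loop <- rbind(el_loop, pairs[!is.na(pairs[, 2]), , drop = FALSE])
}

# Row-wise alternative: all unordered author pairs within each paper.
el_combn <- do.call(rbind, lapply(seq_len(nrow(toy)), function(r) {
  au <- toy[r, !is.na(toy[r, ])]
  if (length(au) < 2) return(NULL)  # single-author papers add no edges
  t(combn(au, 2))
}))

nrow(el_loop)   # number of pairs from the loop
nrow(el_combn)  # number of pairs from combn()
```

Both approaches should yield the same set of pairs on this toy input: three from the first paper, one from the second, and none from the single-author paper.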
Both Graph G and Graph G.C are undirected networks with 173 actor nodes. Graph G has 1283 edges because it keeps one edge for every co-authorship on every paper: if actor A and actor B wrote five papers together, Graph G contains five parallel A-B edges. Graph G.C is the simplified version: simplify() merges those parallel edges into a single edge whose weight is the number of joint papers, which leaves 734 distinct author pairs.
# install.packages("igraph")
library(igraph)
##
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
#### Graph G information
g<- graph.edgelist(edgelist1, directed = FALSE)
E(g)$weight <- 1 #required step: every edge starts at weight 1 so that simplify() can sum the weights of parallel edges
gsize(g) # number of edges
## [1] 1283
gorder(g) # number of nodes
## [1] 173
dim(g) # returns NULL: dim() is not defined for igraph objects; gorder(g) above gives the vertex count
## NULL
#plot(g, vertex.size = 10, vertex.label.cex=(0.5))
#### Graph G.C information
g.c <- simplify(g)
w.gc <- E(g.c)$weight
gsize(g.c) # number of edges
## [1] 734
gorder(g.c) # number of nodes
## [1] 173
dim(g.c) # returns NULL: dim() is not defined for igraph objects; gorder(g.c) above gives the vertex count
## NULL
print(g)
## IGRAPH d032e8f UNW- 173 1283 --
## + attr: name (v/c), weight (e/n)
## + edges from d032e8f (vertex names):
## [1] langston, wj--bebianno, mj langston, wj--bebianno, mj
## [3] langston, wj--bebianno, mj langston, wj--bebianno, mj
## [5] bebianno, mj--nott, ja bebianno, mj--nott, ja
## [7] langston, wj--bebianno, mj bebianno, mj--serafim, map
## [9] langston, wj--bebianno, mj bebianno, mj--mudge, sm
## [11] bebianno, mj--machado, lm caetano, m --falcao, m
## [13] bebianno, mj--gibbs, pe mudge, sm --east, ja
## [15] langston, wj--bebianno, mj bebianno, mj--serafim, ma
## + ... omitted several edges
print(g.c)
## IGRAPH 09278e4 UNW- 173 734 --
## + attr: name (v/c), weight (e/n)
## + edges from 09278e4 (vertex names):
## [1] langston, wj--bebianno, mj langston, wj--nott, ja
## [3] langston, wj--serafim, ma langston, wj--coelho, mr
## [5] langston, wj--company, rm langston, wj--mingjiang, z
## [7] langston, wj--simkiss, k langston, wj--ryan, kp
## [9] bebianno, mj--nott, ja bebianno, mj--serafim, map
## [11] bebianno, mj--mudge, sm bebianno, mj--machado, lm
## [13] bebianno, mj--caetano, m bebianno, mj--falcao, m
## [15] bebianno, mj--gibbs, pe bebianno, mj--east, ja
## + ... omitted several edges
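The effect of simplify() on the weights can be seen on a small example. The sketch below is only an illustration: the edge list `el` and the objects `g_toy` and `g_toy_s` are hypothetical, and the combination rule is passed explicitly through `edge.attr.comb` rather than relying on the igraph default.

```r
library(igraph)

# Hypothetical toy edge list: "a" and "b" co-author three papers,
# "a" and "c" co-author one paper.
el <- rbind(c("a", "b"), c("b", "a"), c("a", "b"), c("a", "c"))

g_toy <- graph_from_edgelist(el, directed = FALSE)
E(g_toy)$weight <- 1  # one unit of weight per co-authored paper

# Merge parallel edges and sum their weights explicitly.
g_toy_s <- simplify(g_toy, edge.attr.comb = list(weight = "sum"))

gsize(g_toy)       # 4 edges in the multigraph
gsize(g_toy_s)     # 2 distinct author pairs
E(g_toy_s)$weight  # summed weights: 3 for a--b, 1 for a--c
```

Because the graph is undirected, the row `c("b", "a")` counts toward the same pair as `c("a", "b")`.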
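The rendering of the full edge list shows arrowheads and curved parallel edges running back and forth between the same pairs of nodes. The arrows appear because the first plot call rebuilds the graph with graph.edgelist(edgelist1), whose default is directed = TRUE; g itself is undirected. The simplified rendering of G.C draws a single undirected edge per pair, with its width scaled by the edge weight.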
plot(graph.edgelist(edgelist1), vertex.size = 10, vertex.label.cex =.5)
plot(g.c, vertex.size = 10, vertex.label.cex=(0.5), edge.width = w.gc)
Using the edge list with the weights in the third column (V3), we can see that the three most frequent collaborating pairs are the following. Bebianno and Serafim (36 co-authored papers in this data set): their most frequent collaboration is on the synthesis of proteins in bivalves and undersea animal gills. Bebianno and Company (27): their research is also in marine biology, typically on mussels. Serafim and Company (22): their research is on marine responses to contaminants.
links<-as.data.frame(cbind(get.edgelist(g.c), E(g.c)$weight))
#head(links,20)
dim(links)
## [1] 734 3
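links$V3<-as.numeric(links$V3) #cbind() stored the weights as text; convert them back to numbers
links<- links[order(links$V3, decreasing=TRUE),] #sort pairs from most to least frequent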
head(links)
## V1 V2 V3
## 27 bebianno, mj serafim, a 36
## 35 bebianno, mj company, r 27
## 234 serafim, a company, r 22
## 1 langston, wj bebianno, mj 15
## 28 bebianno, mj cravo, a 15
## 51 bebianno, mj lopes, b 11
write.csv(links, file = "~/Library/CloudStorage/Box-Box/Taurean/Year_1/SSNA/weighted_authorship.csv")
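The introduction mentions an adjacency matrix, which the steps above do not build explicitly. One possible way to obtain it from a weighted, simplified graph is `as_adjacency_matrix()` with `attr = "weight"`. The sketch below uses a hypothetical toy graph (`el`, `g_toy`, `g_toy_s`, and `adj` are illustrative names); applying the same call to g.c is an assumption about how the full matrix could be produced, not a step from the original procedure.

```r
library(igraph)

# Hypothetical toy data: "a" and "b" share two papers, "a" and "c" share one.
el <- rbind(c("a", "b"), c("a", "b"), c("a", "c"))

g_toy <- graph_from_edgelist(el, directed = FALSE)
E(g_toy)$weight <- 1
g_toy_s <- simplify(g_toy, edge.attr.comb = list(weight = "sum"))

# Dense weighted adjacency matrix: cell [i, j] holds the number of
# papers that authors i and j wrote together.
adj <- as_adjacency_matrix(g_toy_s, attr = "weight", sparse = FALSE)
adj
```

The matrix is symmetric because the graph is undirected, and the diagonal is zero because simplify() also removes self-loops.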