Available from https://rpubs.com/staszkiewicz/cw6Net
A network (graph) consists of vertices (nodes) and the edges that connect them. Of the many packages available for network analysis, we will use `igraph`.
#rm(list = ls()) # clear the environment, should that be needed
#install.packages("igraph") # if we are installing the library for the first time
library(igraph) # load the package
##
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
igraph masks some names already present in the workspace (see the messages above), so I recommend using it on its own, without loading other packages that overload the same names.
As usual, we will use primary data from the article Staszkiewicz and Karkowska (2021), Audit fee and communication sentiment, Economic Research-Ekonomska Istraživanja, https://doi.org/10.1080/1331677X.2021.1985567. The data, as usual, are available in the Essential (“Public Materials”) in the file named Bank.csv. Please download it to your computer and load it into R. Please note that after class the data may be used only for non-commercial purposes, with attribution of the source.
# using base functions we read a classic csv file, but in such a way
# that we pick the location of the file on our own computer from a dialog window,
# hence we nest the command "file.choose()"
# bank <- read.csv(file.choose()) # not active for this part
igraph allows you to visualize a network (graph). Let’s name the vertices 1, 2 and 3 and define their relationships: 1 connects to 2, 2 connects to 3, and 3 connects to 1. Now let’s define such a network as an object S1 and inspect its structure:
S1<-graph( edges=c(1,2, 2,3, 3, 1), n=3, directed=F )
class(S1)
## [1] "igraph"
S1
## IGRAPH ba5b536 U--- 3 3 --
## + edges from ba5b536:
## [1] 1--2 2--3 1--3
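S1 is an “igraph” object (internally a list). Its printed header describes it with the following characteristics:
Four letter codes (here U---): D or U for a directed or undirected graph, N if the vertices are named, W if the edges are weighted and B if the graph is bipartite; a dash means that the feature is absent.
Two numbers (3 3): the number of vertices and edges, respectively.
A list of the attributes of the graph, vertices and edges, each with a letter designation of its kind (g, v or e) and type (c for character, n for numeric, l for logical); S1 has no attributes, so nothing follows the final “--”.
In our case we are dealing with an undirected network, because when creating the object we chose the option directed=F.
Let’s visualize our network with the plot function: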
plot(S1)
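The information in the printed header can also be queried directly. The following is a minimal sketch that rebuilds the same triangle with make_graph(), the name under which current igraph releases document this constructor (an assumption about the installed version), and reads off the counts and flags.

```r
library(igraph)

# Same triangle as S1, built with make_graph()
S1 <- make_graph(edges = c(1,2, 2,3, 3,1), n = 3, directed = FALSE)

vcount(S1)       # number of vertices
ecount(S1)       # number of edges
is_directed(S1)  # FALSE corresponds to the letter U in the header
is_named(S1)     # FALSE: the vertices have no 'name' attribute
is_weighted(S1)  # FALSE: the edges have no 'weight' attribute
```

These helpers are convenient in scripts, where the printed header is not inspected by eye.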
ABB Group was audited by KPMG in 2018, by Deloitte in 2019, 2016 and 2015, by PWC in 2020 and 2017, and by EY in 2021. On average, the audit firms employ 223, 198, 204 and 187 auditors per year (KPMG, Deloitte, PWC and EY respectively).
Let us create a network showing the direction of ABB’s flow between audit firms. Our nodes are the audit firms (W), while the edges are the changes of audit firm in a given year. Since it matters whether ABB moves from EY to KPMG or vice versa in a given year, our network will be directed.
W<-c("KPMG","Deloitte","PWC","EY") # we have defined the vertices
# We define the edges corresponding to the transfers between audit firms in particular years
# 2015->2016; 2016->2017; 2017->2018 2018->2019 2019->2020 2020->2021
K<-c("KPMG","KPMG", "KPMG","Deloitte", "Deloitte","KPMG", "KPMG","Deloitte", "Deloitte","PWC", "PWC","EY")
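The edge vector does not have to be typed by hand. The sketch below derives it from a yearly sequence of auditors; the vector auditor is a hypothetical helper that simply restates the sequence implied by K above, and K2 is a name introduced here for illustration.

```r
# Auditor in each year 2015..2021, as implied by the edge vector K above
auditor <- c("KPMG", "KPMG", "Deloitte", "KPMG", "Deloitte", "PWC", "EY")

from <- head(auditor, -1)  # auditor in year t
to   <- tail(auditor, -1)  # auditor in year t+1

# Interleave the pairs: from[1], to[1], from[2], to[2], ...
K2 <- as.vector(rbind(from, to))
K2
```

Because rbind() stacks the two vectors as rows and as.vector() reads the matrix column by column, each consecutive pair in K2 is one edge.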
Let’s visualize the network:
S2 <- graph( edges=K, n=4, directed=T )
## Warning in graph(edges = K, n = 4, directed = T): 'n' is ignored for edge list
## with vertex names
plot(S2)
Note that when we define the vertices by name we do not need to
specify their number with the parameter n (hence the warning above).
In 2015 and 2016 ABB used the services of KPMG, so we have a loop
edge at KPMG, while in subsequent years there were exchanges between
KPMG and Deloitte. We don’t always need such detailed information,
and sometimes we may want to omit repeated edges or loops between our
vertices; we can then simplify the graph by removing repeats and loops.
# simplify the S2 network by removing repeated edges and loops
S2b <- simplify(S2, remove.multiple = T, remove.loops = T)
# plot the simplified network
plot(S2b)
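To see what simplification removes, one can count loops and repeated edges before and after. This is a sketch that rebuilds the network with make_graph() so that it runs on its own.

```r
library(igraph)

K <- c("KPMG","KPMG", "KPMG","Deloitte", "Deloitte","KPMG",
       "KPMG","Deloitte", "Deloitte","PWC", "PWC","EY")
S2 <- make_graph(edges = K, directed = TRUE)

sum(which_loop(S2))      # edges that start and end at the same vertex
sum(which_multiple(S2))  # edges that repeat an earlier edge

S2b <- simplify(S2, remove.multiple = TRUE, remove.loops = TRUE)

ecount(S2)      # edges before simplification
ecount(S2b)     # edges after simplification
is_simple(S2b)  # TRUE when no loops or multiple edges remain
```

Note that the two opposite edges between KPMG and Deloitte both survive, because in a directed graph they are different edges.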
So far, we have operated on nodes and edges, but for both nodes and
edges we can create attributes (features). Let’s first see what the
current structure of the S2 network is and then assign attributes to it.
We can check the vertices with V(object_name):
V(S2)
## + 4/4 vertices, named, from 752f6e4:
## [1] KPMG Deloitte PWC EY
and the edges with E(object_name):
E(S2)
## + 6/6 edges from 752f6e4 (vertex names):
## [1] KPMG ->KPMG KPMG ->Deloitte Deloitte->KPMG KPMG ->Deloitte
## [5] Deloitte->PWC PWC ->EY
The relationship between vertices and edges can also be shown in matrix form (an adjacency matrix, generally sparse), as follows:
S2[]
## 4 x 4 sparse Matrix of class "dgCMatrix"
## KPMG Deloitte PWC EY
## KPMG 1 2 . .
## Deloitte 1 . 1 .
## PWC . . . 1
## EY . . . .
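The matrix can also be extracted as an ordinary dense matrix and used in calculations. The sketch below, which rebuilds S2 so that it runs on its own, checks that row sums equal the number of outgoing edges and column sums the number of incoming edges of each firm.

```r
library(igraph)

K <- c("KPMG","KPMG", "KPMG","Deloitte", "Deloitte","KPMG",
       "KPMG","Deloitte", "Deloitte","PWC", "PWC","EY")
S2 <- make_graph(edges = K, directed = TRUE)

# Dense adjacency matrix: entry [i, j] counts the edges from i to j
A <- as_adjacency_matrix(S2, sparse = FALSE)
A

rowSums(A)                # departures from each audit firm
degree(S2, mode = "out")  # the same numbers computed by igraph

colSums(A)                # arrivals at each audit firm
degree(S2, mode = "in")
```

The entry for KPMG and Deloitte is 2 because ABB moved from KPMG to Deloitte twice.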
So far, we have not depicted on the graph when each transfer of the audit firm occurred. Since this is an edge feature, let us build such a feature by using the function E and adding the years of transfer.
lata<- c("2015->2016","2016->2017","2017->2018","2018->2019","2019->2020","2020->2021")
E(S2)$lata<-lata
edge_attr(S2)
## $lata
## [1] "2015->2016" "2016->2017" "2017->2018" "2018->2019" "2019->2020"
## [6] "2020->2021"
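Edge attributes are easier to check when the edges are laid out as a table. A minimal sketch, assuming the igraph function as_data_frame() (called with the igraph:: prefix to avoid clashes with other packages); edges_df is a name introduced here for illustration.

```r
library(igraph)

K <- c("KPMG","KPMG", "KPMG","Deloitte", "Deloitte","KPMG",
       "KPMG","Deloitte", "Deloitte","PWC", "PWC","EY")
S2 <- make_graph(edges = K, directed = TRUE)
E(S2)$lata <- c("2015->2016","2016->2017","2017->2018",
                "2018->2019","2019->2020","2020->2021")

# One row per edge: source, target and every edge attribute
edges_df <- igraph::as_data_frame(S2, what = "edges")
edges_df

# Ordinary data frame filtering then selects a single transfer
edges_df[edges_df$lata == "2019->2020", ]
```

The attribute values are matched to edges by position, so the order of lata has to follow the order in which the edges were defined in K.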
Let’s visualize
plot(S2, edge.label=E(S2)$lata,edge.arrow.size = 0)
Let’s reduce the default font size of the edge labels in this
visualization with edge.label.cex=0.4 and show the arrows of the
edges with edge.arrow.size = 0.05:
plot(S2, edge.label=E(S2)$lata,edge.arrow.size = 0.05,edge.label.cex=0.4)
So far our graph shows the audit firms and the dynamics of change
(an edge characteristic); what we have not illustrated is employment
in the audit firms. Employment is a characteristic not of an edge but
of a vertex. Each vertex is currently described only by the name of
the audit firm; we will use two visual features of the node, i.e. the
fill color of the circle and the color of its border, to show the
employment data. Let’s start with the fill color and assign a color
according to the employment size. To do so, we first assign to the
vertices an attribute ‘staff’ describing the number of auditors
employed:
V(S2)$staff<-c(223,198,204,187) # assign the attribute
vertex_attr(S2) # inspect the assignment
## $name
## [1] "KPMG" "Deloitte" "PWC" "EY"
##
## $staff
## [1] 223 198 204 187
Let’s visualize the network, with the circles colored according to the number of auditors
# colors() returns the names of R's built-in colors; indexing it by headcount picks one color per firm
plot(S2, edge.label=E(S2)$lata,edge.arrow.size = 0.05,edge.label.cex=0.4,vertex.color=colors()[V(S2)$staff] )
We can also assign individual colors to specific audit firms e.g.:
plot(S2, edge.label=E(S2)$lata,edge.arrow.size = 0.05,edge.label.cex=0.4,vertex.color=c("blue","yellow","green","red") )
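If the fill color should reflect the headcount on a continuous scale, one possible approach is to map the numbers onto a color ramp. This is only a sketch: the palette end points ("lightyellow" and "red"), the 100 steps and the names pal, idx and vcol are all choices made for this illustration.

```r
# Headcount per audit firm, as given in the text
staff <- c(KPMG = 223, Deloitte = 198, PWC = 204, EY = 187)

# 100 colors running from light yellow (few auditors) to red (many)
pal <- colorRampPalette(c("lightyellow", "red"))(100)

# Rescale the headcount to an index between 1 and 100
idx <- round(1 + 99 * (staff - min(staff)) / (max(staff) - min(staff)))

vcol <- pal[idx]
names(vcol) <- names(staff)
vcol
```

The resulting vector can be passed to the plot function as vertex.color=vcol, provided its order matches the order of the vertices in the graph.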
With the color of the circle’s border we can additionally separate the
firms with fewer than 200 auditors (cyan border) from those with 200
or more (black border):
plot(S2, edge.label=E(S2)$lata,edge.arrow.size = 0.05,edge.label.cex=0.4,vertex.color=c("brown","yellow","green","red"),
vertex.frame.color=c( "black", "cyan")[1+(V(S2)$staff<200)])
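The expression used for vertex.frame.color relies on the fact that a logical value behaves as 0 or 1 in arithmetic. The sketch below evaluates it step by step on the headcounts from the text; small, idx, frame_col and frame_col2 are names introduced here for illustration.

```r
# Headcount per audit firm, as given in the text
staff <- c(KPMG = 223, Deloitte = 198, PWC = 204, EY = 187)

small <- staff < 200   # TRUE for firms with fewer than 200 auditors
idx   <- 1 + small     # FALSE becomes 1, TRUE becomes 2

# Position 1 selects "black", position 2 selects "cyan"
frame_col <- c("black", "cyan")[idx]
frame_col

# The same result written with ifelse(), which some readers find clearer
frame_col2 <- ifelse(staff < 200, "cyan", "black")
frame_col2
```

Both forms give one border color per vertex, in the order of the vertices.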
Let’s combine the fill color scaling and the circle borders in one plot:
# colors()[...] indexes R's built-in color names by headcount, one color per firm
plot(S2, edge.label=E(S2)$lata,edge.arrow.size = 0.05,edge.label.cex=0.4,vertex.color=colors()[V(S2)$staff],vertex.frame.color=c( "black", "cyan")[1+(V(S2)$staff<200)] )
Let us complement our network with vertex graphics. We would often like to depict vertices graphically, e.g. with the logos of the audit firms. To do so, we need to 1) build a collection of graphics, 2) translate the graphics into their digital representation (raster), and 3) create a graphic attribute in the igraph object.
rasters <- as.list(c(imgType1='',imgType2='',imgType3='',imgType4=''))
rasters is a list that will hold four image objects. In the folder
“path” I put four logo graphics, one for each firm; to convert them
into raster objects we need to load the jpeg library with
library(jpeg), or install it first.
library(jpeg)
rasters$imgType1 <- readJPEG("path/KPMG.jpg",native=TRUE)
rasters$imgType2 <- readJPEG("path/PWC.jpg",native=TRUE)
rasters$imgType3 <- readJPEG("path/D.jpg",native=TRUE)
rasters$imgType4 <- readJPEG("path/EY.jpg",native=TRUE)
We are building a data frame in which we have vertex names and corresponding rasterized image types:
lkp_mat <- data.frame(from=c('KPMG','PWC','Deloitte','EY'),type=c('imgType1','imgType2','imgType3','imgType4'))
and then associate the image types with the vertices of our igraph
object. To do this, we match the vertex attribute ‘name’ in S2 with
the column ‘from’ in lkp_mat, as follows:
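for(i in V(S2)$name){
  # image type assigned to vertex i in the lookup table
  imgtype <- lkp_mat$type[lkp_mat["from"]==i]
  # store the matching raster as the vertex attribute 'raster'
  V(S2)[name==i]$raster <- rasters[imgtype]
}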
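The lookup performed inside the loop can be checked without any image files. The sketch below uses match() to find, for each vertex name, the image type listed in the lookup table; vnames and imgtype_by_vertex are names introduced here for illustration, and stringsAsFactors = FALSE is set explicitly so that the result is a character vector in older R versions as well.

```r
# Lookup table from the text: firm name and image type
lkp_mat <- data.frame(from = c("KPMG", "PWC", "Deloitte", "EY"),
                      type = c("imgType1", "imgType2", "imgType3", "imgType4"),
                      stringsAsFactors = FALSE)

# Vertex names in the order in which they appear in the graph S2
vnames <- c("KPMG", "Deloitte", "PWC", "EY")

# For each vertex, the row of lkp_mat with the same name
imgtype_by_vertex <- lkp_mat$type[match(vnames, lkp_mat$from)]
names(imgtype_by_vertex) <- vnames
imgtype_by_vertex
```

The check matters because the order of the rows in lkp_mat differs from the order of the vertices, so assigning the images by position alone would swap the PWC and Deloitte logos.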
What is left is the visualization. We add 1) vertex.shape="raster"
to insert the images and 2) vertex.label="" to suppress the text
labels. Here is the effect:
# colors()[...] indexes R's built-in color names by headcount, one color per firm
plot(S2, edge.label=E(S2)$lata,edge.arrow.size = 0.05,edge.label.cex=0.4,vertex.color=colors()[V(S2)$staff],
vertex.frame.color=c( "black", "cyan")[1+(V(S2)$staff<200)],
vertex.shape="raster",vertex.size=20, vertex.size2=20,vertex.label="")
## Generating Subgraphs (Subnetworks)
Often we don’t need the whole network but just a part of it, so we can build a subnetwork based on the characteristics of vertices or edges. Let us analyze only two vertices, namely KPMG and Deloitte.
sel<-c("KPMG","Deloitte") # create the list of vertices
S3<- induced_subgraph(graph=S2,vids=sel) # build the subnetwork
plot(S3, main="ABB: changes of audit firms") # visualize
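Vertices can be selected by attribute as well as by name. The sketch below, which rebuilds S2 so that it runs on its own, compares the subnetwork chosen by name with one chosen by headcount; the threshold of 200 and the object name S4 are choices made for this illustration.

```r
library(igraph)

K <- c("KPMG","KPMG", "KPMG","Deloitte", "Deloitte","KPMG",
       "KPMG","Deloitte", "Deloitte","PWC", "PWC","EY")
S2 <- make_graph(edges = K, directed = TRUE)
V(S2)$staff <- c(223, 198, 204, 187)

# Selection by vertex name, as in the text
S3 <- induced_subgraph(S2, vids = c("KPMG", "Deloitte"))

# Selection by a vertex attribute: firms with more than 200 auditors
S4 <- induced_subgraph(S2, vids = V(S2)[staff > 200])

V(S4)$name  # vertices kept in S4
ecount(S3)  # edges running between KPMG and Deloitte, plus the KPMG loop
ecount(S4)  # only edges whose two ends were both kept
```

An induced subgraph keeps an edge only when both of its end points are selected, which is why S4 retains so few edges.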
## Parameters of the plot function
Vertices

| Parameter | Description |
| --- | --- |
| vertex.color | color of the vertex |
| vertex.frame.color | color of the vertex’s border |
| vertex.shape | one of “none”, “circle”, “square”, “csquare”, “rectangle”, “crectangle”, “vrectangle”, “pie”, “raster”, or “sphere” |
| vertex.size | size of the vertex (default is 15) |
| vertex.size2 | secondary vertex size (for example for a rectangle) |
| vertex.label | a string of characters used as the vertex label |
| vertex.label.family | font family (“Times”, “Helvetica”, etc.) |
| vertex.label.font | type of font: 1 plain, 2 bold, 3 italic, 4 bold and italic, 5 symbol |
| vertex.label.cex | font size (multiplication factor, device-dependent) |
| vertex.label.dist | distance between the label and the vertex |
| vertex.label.degree | position of the label in relation to the vertex, where 0 is right, “pi” is left, “pi/2” is below and “-pi/2” is above |

Edges

| Parameter | Description |
| --- | --- |
| edge.color | color of the edge |
| edge.width | width of the edge, default = 1 |
| edge.arrow.size | size of the arrow, default = 1 |
| edge.arrow.width | width of the arrow, default = 1 |
| edge.lty | line type: 0 for “none”, 1 for “continuous”, 2 for “dashed”, 3 for “dotted”, 4 for “dotted and dashed”, 5 for “long dashed”, 6 for “double dashed” |
| edge.label | a string of characters used as the edge label |
| edge.label.family | font family (“Times”, “Helvetica”, etc.) |
| edge.label.font | type of font: 1 plain, 2 bold, 3 italic, 4 bold and italic, 5 symbol |
| edge.label.cex | size of the font in the label |
| edge.curved | edge curvature, in the range 0-1 (FALSE sets it to 0, TRUE to 0.5) |
| edge.arrow.mode | a vector indicating whether there should be an arrow, possible values: 0 no arrow, 1 back, 2 forward, 3 both |

Other

| Parameter | Description |
| --- | --- |
| margin | empty space margins around the plot, a vector of length 4 |
| frame | if TRUE, the graph will be framed |
| main | if set, the title of the graph will be displayed |
| sub | subtitle of the graph |
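Instead of repeating these parameters in every call, they can be stored in the graph itself: igraph’s plot function also reads vertex and edge attributes whose names are the parameter names without the vertex. or edge. prefix. A minimal sketch, with colors and sizes chosen only for illustration:

```r
library(igraph)

K <- c("KPMG","KPMG", "KPMG","Deloitte", "Deloitte","KPMG",
       "KPMG","Deloitte", "Deloitte","PWC", "PWC","EY")
S2 <- make_graph(edges = K, directed = TRUE)

# Vertex attributes named after the parameters, without the "vertex." prefix
V(S2)$color     <- c("blue", "yellow", "green", "red")
V(S2)$size      <- 20
V(S2)$label.cex <- 0.8

# Edge attributes named after the parameters, without the "edge." prefix
E(S2)$color <- "grey40"
E(S2)$width <- 2
E(S2)$lty   <- 2   # dashed lines

vertex_attr_names(S2)
edge_attr_names(S2)
```

A plain plot(S2) then picks these values up, and any parameter given explicitly in the call takes precedence over the stored attribute.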
## Marco Van Fun
Marco Van Fun is the director of a cooperative bank in Gliwice, and today he has just received the monthly report on the performance of the loan portfolio. Among the top 10 exposures is a loan of PLN 70 million granted to the Agricultural Advisory Centre (ODR) in Dobrzeń Wielki. The loan was secured with ODR’s shares in RodzinaNaSwoim Sp. z o.o. At the time the loan was granted, i.e. 12 months ago, RodzinaNaSwoim was a capital group structured as follows:
[Figure: structure of the capital group, including Grandma sp. z o.o., Wnuczek sp. z o.o., Uncle s.a. and Cousin s.r.o.]
The arrows indicate control, while the percentages indicate the level of control and influence. Apart from Wnuczek sp. z o.o. (40%), the shareholders of RodzinaNaSwoim were also individuals: grandmother (10%), mother (20%), father (15%), child (10%) and granddaughter (5%). The report indicates the following transactions with related parties for the last 12 months (RnS stands for RodzinaNaSwoim).
Party | Counterparty | Transaction value | Fair value |
RnS | Mom | 100 | 90 |
Mummy | RnS | 300 | 350 |
Mum | Mum | 200 | |
RnS | Grandson | 500 | 550 |
Uncle | RnS | 111 | 100 |
Uncle | Dad | 120 | 220 |
Uncle | Grandma | 100 | 100 |
Grandma | RnS | 500 | 350 |
Grandma | Mommy | 702 | 560 |
Dad | RnS | 200 | 210 |
RnS | Dad | 300 | 100 |
In the reports submitted with the credit application, there was one transaction with related parties: Mama sp. z o.o. sold products to RodzinaNaSwoim for 100 million, with a fair value of 130 million.
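Related-party transactions of this kind can be represented as a directed network with the values stored as edge attributes. The sketch below is illustrative only: it uses five rows of the table above, assumes that the first column is the source and the second the target of each transaction, reads the second value column as the fair value, and introduces the names tx, G, value, fair and gap for this example.

```r
library(igraph)

# Five rows taken from the table of related-party transactions
tx <- data.frame(
  from  = c("RnS", "Uncle", "Uncle", "Uncle", "Grandma"),
  to    = c("Grandson", "RnS", "Dad", "Grandma", "RnS"),
  value = c(500, 111, 120, 100, 500),   # transaction value
  fair  = c(550, 100, 220, 100, 350)    # fair value (assumed meaning of the column)
)

# The first two columns define the edges, the rest become edge attributes
G <- graph_from_data_frame(tx, directed = TRUE)

# Difference between the transaction value and the fair value of each edge
E(G)$gap <- E(G)$value - E(G)$fair

igraph::as_data_frame(G, what = "edges")
```

Edges with a large absolute gap are the ones where the recorded price departs most from the fair value, and they can be highlighted in a plot, for example through edge.width or edge.label.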
**You are required to:**
Hint: Discuss the idea of the article: Staszkiewicz, P. (2011). Risk structures. A conceptual sketch. In K. Jajuga & W. Ronka-Chmielowiec (Eds.), Financial investments and insurance - global trends vs. Poland, 183, 378–384.
Interactive 3D network drawing: interactive
A well-written guide to igraph: here
How to attach graphics to vertices: image link
How to visualize a part of a network: subnet generation