Assignment 1

Author

Deborah E. Lucas

Published

February 8, 2025

Basic Network Measures and Visualization in R

Dataset - The dataset for this assignment is from Palazzolo, E. T. (2005). Organizing for information retrieval in transactive memory systems. Communication Research, 32(6), 726-761. Two files are explored for this assignment:

edgelist_retrieve.csv

att_expertise.csv

Preparing the Environment

Load required packages

TASK 1: Preparing the environment

Load data set

Load library for igraph


Attaching package: 'igraph'
The following objects are masked from 'package:stats':

    decompose, spectrum
The following object is masked from 'package:base':

    union
IGRAPH 741181d DN-- 15 41 -- 
+ attr: name (v/c)
+ edges from 741181d (vertex names):
 [1] 1 ->6  1 ->9  2 ->6  2 ->9  3 ->5  3 ->6  3 ->9  3 ->17 4 ->5  4 ->6 
[11] 4 ->9  4 ->16 4 ->17 5 ->6  5 ->9  5 ->16 5 ->17 6 ->5  6 ->9  6 ->16
[21] 7 ->16 12->8  12->16 13->6  13->9  13->17 14->4  14->6  14->9  14->16
[31] 15->3  15->6  15->9  15->16 15->17 16->5  16->6  16->9  17->5  17->9 
[41] 17->16

In edgelist_retrieve, there are 15 nodes and 41 directed edges. The nodes are the entities in the network, and the edges are the relationships between them.
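As a minimal sketch, the graph summarized above can be rebuilt from the 41 edges printed in the output. The edges are typed inline here so the example runs on its own; with the file available, something like read.csv("edgelist_retrieve.csv") would take their place, assuming the file has one sender column and one receiver column (the column names from and to below are illustrative, not taken from the file).

```r
library(igraph)

# The 41 directed edges, transcribed from the printed igraph summary.
from <- c(1, 1, 2, 2, 3, 3, 3, 3, 4, 4,
          4, 4, 4, 5, 5, 5, 5, 6, 6, 6,
          7, 12, 12, 13, 13, 13, 14, 14, 14, 14,
          15, 15, 15, 15, 15, 16, 16, 16, 17, 17,
          17)
to   <- c(6, 9, 6, 9, 5, 6, 9, 17, 5, 6,
          9, 16, 17, 6, 9, 16, 17, 5, 9, 16,
          16, 8, 16, 6, 9, 17, 4, 6, 9, 16,
          3, 6, 9, 16, 17, 5, 6, 9, 5, 9,
          16)

# Column names are assumptions chosen for this sketch.
edges <- data.frame(from = from, to = to)

# Build a directed graph; vertex names come from the node IDs.
g <- graph_from_data_frame(edges, directed = TRUE)

vcount(g)  # number of nodes
ecount(g)  # number of edges
```

The vertex and edge counts should match the "15 41" shown in the summary line of the printed graph.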

IGRAPH c6af2dd DN-- 25 17 -- 
+ attr: name (v/c)
+ edges from c6af2dd (vertex names):
 [1] 1 ->0.176470588 2 ->0.176470588 3 ->0.176470588 4 ->0.529411765
 [5] 5 ->0.529411765 6 ->0.588235294 7 ->0.058823529 8 ->0.294117647
 [9] 9 ->0.588235294 10->0.058823529 11->0.176470588 12->0.058823529
[13] 13->0.058823529 14->0.352941176 15->0.235294118 16->0.588235294
[17] 17->0.411764706

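att_expertise is a node attribute file rather than an edge list: each of its 17 rows pairs a node ID (1 to 17) with an expertise score. Reading it as a graph yields 25 nodes and 17 edges only because each of the 8 distinct scores is treated as an additional node, so the arrows above are not relationships and the values are not edge weights. For example, node 1 has an expertise score of 0.176470588, which is lower than node 15’s score of 0.235294118. Nodes 10 and 11 appear here but not in edgelist_retrieve.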
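The count of 25 nodes can be checked with a short base R sketch. The values are transcribed from the output above; the column names id and expertise are assumptions for illustration, since the file header is not shown. When a two-column table is read as an edge list, every distinct value in either column becomes a vertex.

```r
# Rows of att_expertise as printed above: node ID and expertise score.
att <- data.frame(
  id = 1:17,
  expertise = c(0.176470588, 0.176470588, 0.176470588, 0.529411765,
                0.529411765, 0.588235294, 0.058823529, 0.294117647,
                0.588235294, 0.058823529, 0.176470588, 0.058823529,
                0.058823529, 0.352941176, 0.235294118, 0.588235294,
                0.411764706)
)

# Distinct scores that would each be treated as a vertex name.
n_distinct_scores <- length(unique(att$expertise))

# Apparent vertices = node IDs + distinct scores; apparent edges = rows.
n_apparent_vertices <- nrow(att) + n_distinct_scores
n_apparent_edges <- nrow(att)

n_distinct_scores
n_apparent_vertices
n_apparent_edges
```

One common alternative is to pass a table like this through the vertices argument of graph_from_data_frame(), which stores the score as a vertex attribute instead of creating extra nodes.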

TASK 2: Network Descriptive Statistics with igraph

A. Calculate the density

density of the graph: 0.1952381 

The density value of 0.1952381 indicates that roughly 20% of the possible directed connections between nodes exist (41 of 15 × 14 = 210). The network is therefore fairly sparse, and there is still room for many more connections.
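The reported density can be reproduced by hand from the node and edge counts. The sketch below uses the standard density formulas; the variable names are illustrative. It also shows that treating the same 41 edges as undirected would give a value twice as large, 0.3904762.

```r
n_nodes <- 15
n_edges <- 41

# Directed graph: each ordered pair of distinct nodes is a possible edge.
directed_density <- n_edges / (n_nodes * (n_nodes - 1))

# Undirected graph: each unordered pair is a possible edge.
undirected_density <- n_edges / (n_nodes * (n_nodes - 1) / 2)

round(directed_density, 7)    # 0.1952381
round(undirected_density, 7)  # 0.3904762
```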

B. Calculate the degree (in and out)

In-degree of each node:
 1  2  3  4  5  6  7 12 13 14 15 16 17  9  8 
 0  0  1  1  5  9  0  0  0  0  0  8  5 11  1 
Out-degree of each node:
 1  2  3  4  5  6  7 12 13 14 15 16 17  9  8 
 2  2  4  5  4  3  1  2  3  4  5  3  3  0  0 

In-Degree

Node 9 has the highest in-degree (11), which means that it receives the most ties and is the “most popular” node in the network.

Nodes 6 (9), 16 (8), 5 (5), and 17 (5) also have relatively high in-degrees, which suggests that many other nodes direct ties to them.

Nodes 1, 2, 7, 12, 13, 14, and 15 have an in-degree of 0, which indicates that no other node directs a tie to them.

Out-Degree

Nodes 4 and 15 have the highest out-degree (5), which means that they send the most connections to other nodes. Node 9, with an in-degree of 11 (the highest in the table above), has an out-degree of 0, so no single node leads on both measures.

Nodes 3, 5, and 14 also have comparatively high out-degree (4), indicating that they are active senders.

Nodes 8 and 9 have an out-degree of 0, and node 7 has an out-degree of 1. Low out-degree does not track low in-degree here: node 9 sends no ties but receives the most, whereas nodes 7 and 8 are low on both measures and are the least active in the network.
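The rankings can be read off the degree tables programmatically rather than by eye. This base R sketch re-enters the printed in-degree and out-degree values (in igraph they would typically come from degree() with mode = "in" and mode = "out") and looks up the nodes at the extremes.

```r
# Node IDs in the order printed in the degree output.
nodes <- c(1, 2, 3, 4, 5, 6, 7, 12, 13, 14, 15, 16, 17, 9, 8)

# Degree values transcribed from the output above.
in_deg  <- c(0, 0, 1, 1, 5, 9, 0, 0, 0, 0, 0, 8, 5, 11, 1)
out_deg <- c(2, 2, 4, 5, 4, 3, 1, 2, 3, 4, 5, 3, 3, 0, 0)
names(in_deg)  <- nodes
names(out_deg) <- nodes

# Nodes with the highest in-degree and out-degree (ties are kept).
top_in  <- names(in_deg)[in_deg == max(in_deg)]
top_out <- names(out_deg)[out_deg == max(out_deg)]

# Nodes that receive no ties, and nodes that send none.
zero_in  <- names(in_deg)[in_deg == 0]
zero_out <- names(out_deg)[out_deg == 0]

# Sanity check: both totals must equal the number of edges.
total_in  <- sum(in_deg)
total_out <- sum(out_deg)
```

Comparing against max() rather than calling which.max() keeps tied nodes, which matters here because two nodes share the top out-degree.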

C. Calculate additional network measures

Betweenness centrality

betweenness centrality of each node:
Node with the highest betweenness centrality: 12 

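The output above reports 12 as the node with the highest betweenness centrality, which would make node 12 the main broker on shortest paths between other nodes. This reading should be treated with caution: node 12 has an in-degree of 0, so no directed path can pass through it and its directed betweenness must be 0. The printed 12 may instead be a position in the vertex sequence rather than a node name, for example if it comes from which.max(); position 12 in the vertex order shown in the degree output corresponds to node 16, which has an in-degree of 8 and an out-degree of 3.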
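One thing worth checking when reporting the top node is whether the printed value is a vertex name or a position in the vertex sequence. The sketch below uses a small hypothetical three-node graph (not the assignment data) whose node names deliberately differ from their positions.

```r
library(igraph)

# Hypothetical toy graph: 10 -> 30 -> 20. Vertex order is 10, 30, 20.
toy_edges <- data.frame(from = c("10", "30"), to = c("30", "20"))
toy <- graph_from_data_frame(toy_edges, directed = TRUE)

# Directed betweenness; the result is a vector named by vertex.
bt <- betweenness(toy, directed = TRUE)

# which.max() returns the position of the maximum, carrying the name along.
top_position <- unname(which.max(bt))  # position in the vertex sequence
top_name     <- names(which.max(bt))   # the vertex name at that position

top_position
top_name
```

In this toy graph the top node is named "30" but sits at position 2, so reporting the position would point readers to the wrong node. Node "10" has no incoming ties and therefore a betweenness of 0.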

Clustering Coefficient

Local clustering coefficients for each node:
        1         2         3         4         5         6         7        12 
1.0000000 1.0000000 0.8000000 0.8000000 0.8000000 0.3777778       NaN 0.0000000 
       13        14        15        16        17         9         8 
0.6666667 1.0000000 0.8000000 0.4166667 0.5714286 0.4181818       NaN 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
 0.0000  0.4182  0.8000  0.6654  0.8000  1.0000       2 
[1] TRUE

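A value of 1 means that all of a node’s neighbors are connected to one another (nodes 1, 2, and 14). A value of 0 means that none are: node 12’s two neighbors, 8 and 16, share no tie. Values in between, such as 0.378 for node 6, indicate partial connectedness, and the closer the value is to 1, the more interconnected the node’s neighbors are. Nodes 7 and 8 return NaN because each has only one neighbor, so the coefficient is undefined.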
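The three kinds of values in this output (1, a fraction, and NaN) can be reproduced on a small hypothetical graph. The sketch assumes igraph's transitivity() with type = "local", which ignores edge direction; the toy nodes a to d are illustrative and are not part of the assignment data.

```r
library(igraph)

# Hypothetical toy graph: a closed triad a, b, c plus d tied only to a.
toy_edges <- data.frame(from = c("a", "b", "a", "d"),
                        to   = c("b", "c", "c", "a"))
toy <- graph_from_data_frame(toy_edges, directed = TRUE)

# Local clustering coefficient per node, labelled with vertex names.
cc <- transitivity(toy, type = "local")
names(cc) <- V(toy)$name

cc[["b"]]  # both neighbors (a, c) are tied to each other
cc[["a"]]  # only one of three neighbor pairs (b-c) is tied
cc[["d"]]  # a single neighbor, so no pair exists to evaluate
```

Nodes with fewer than two neighbors have no neighbor pairs, which is why the coefficient is NaN for them; transitivity() also accepts isolates = "zero" to report 0 in those cases instead.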

TASK 3: Network Visualization