An Information Flow Model for Conflict and Fission in Small Groups

Overview

This file implements Programming Assignment 2: Visualize Network Data for the Coursera Data Visualization Class.

This assignment uses the R programming language to visualize the data.

Q1. What is the data that you chose? Why?

Data from a voluntary association are used to construct a new formal model for a traditional anthropological problem, fission in small groups. The process leading to fission is viewed as an unequal flow of sentiments and information across the ties in a social network. This flow is unequal because it is uniquely constrained by the contextual range and sensitivity of each relationship in the network. The subsequent differential sharing of sentiments leads to the formation of subgroups with more internal stability than the group as a whole and results in fission.

Q2. Did you use a subset of the data? If so, what was it?

Ans: No, I used the full data set as available from this link.

Q3. Are there any particular aspects of your visualization to which you would like to bring attention?

To make the visualization clearer:

  • Colors (hue): I’ve used colors to distinguish the nodes. Each node’s color is set by its group, which in the plot below is the node id. The colors correspond to those shown in the legend on the left.
  • Clustering: All nodes are clustered.

The chart is made interactive:

  • Hovering over a node highlights and colors the links to its linked nodes.
  • Hovering over a node temporarily enlarges it.

Q4. What do you think the data and your visualization show?

An edge is drawn if two individuals consistently were observed to interact outside the normal activities of the club (karate classes and club meetings). That is, an edge is drawn if the individuals could be said to be friends outside the club activities. All the edges in Figure 1 are non-directional (they represent interaction in both directions), and the graph is said to be symmetrical.
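The claim that the graph is non-directional and symmetrical can be checked directly. The following is a minimal sketch, assuming igraph's built-in copy of the network (`make_graph("Zachary")`) is an acceptable stand-in for the `karate.gml` file loaded later in this report:

```r
library(igraph)

# Built-in copy of Zachary's karate club network (an assumption: used here
# instead of karate.gml so the check is self-contained)
g <- make_graph("Zachary")

# An undirected graph has no edge direction
is_directed(g)        # FALSE

# For an undirected graph the adjacency matrix equals its transpose
adj <- as_adjacency_matrix(g, sparse = FALSE)
isSymmetric(adj)      # TRUE

# Size of the network: 34 members, 78 friendships
vcount(g)             # 34
ecount(g)             # 78
```

A symmetric adjacency matrix means that whenever member i is linked to member j, member j is also linked to member i, which is what "interaction in both directions" amounts to.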

Loading the Data

The data is obtained from University of Michigan Network Data: Zachary’s karate club, a social network of friendships between 34 members of a karate club at a US university in the 1970s. Please cite W. W. Zachary, An information flow model for conflict and fission in small groups, Journal of Anthropological Research 33, 452-473 (1977).

Loading the packages which will be used in analyzing the data

# setwd("D:/Coursera/Data Mining Specialization/Data Visualization")

library(igraph)
## 
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum
## The following object is masked from 'package:base':
## 
##     union
library(networkD3)

# Reading the file
karate <- read_graph(file = "karate.gml", format="gml")
summary(karate)
## IGRAPH bcc2113 U--- 34 78 -- 
## + attr: id (v/n)

Convert the igraph data into something more suitable for networkD3

karate.nodes <- get.data.frame(karate, what = "vertices")
karate.edges <- get.data.frame(karate, what = "edges")
str(karate.nodes)
## 'data.frame':    34 obs. of  1 variable:
##  $ id: num  1 2 3 4 5 6 7 8 9 10 ...
str(karate.edges)
## 'data.frame':    78 obs. of  2 variables:
##  $ from: num  1 1 2 1 2 3 1 1 1 5 ...
##  $ to  : num  2 3 3 4 4 4 5 6 7 7 ...

networkD3 requires edge references to nodes to be zero-based, i.e. the first row of the node table is node 0, so the 1-based ids are shifted down by one.

karate.edges$from <- karate.edges$from - 1
karate.edges$to <- karate.edges$to - 1
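Subtracting 1 works here because the ids run contiguously from 1 to 34 in row order. As a hedged sketch of a more general approach, the toy data frames below (hypothetical, not part of the karate data) use non-contiguous ids and look up each id's row position with `match()`:

```r
# Hypothetical node and edge tables with non-contiguous ids
nodes <- data.frame(id = c(10, 20, 30))
edges <- data.frame(from = c(10, 10, 20), to = c(20, 30, 30))

# match() gives the 1-based row position of each id in the node table;
# subtracting 1 turns it into the zero-based index networkD3 expects
links <- data.frame(
  from = match(edges$from, nodes$id) - 1,
  to   = match(edges$to,   nodes$id) - 1
)

# Sanity check: every edge points at an existing row of the node table
idx <- c(links$from, links$to)
stopifnot(!anyNA(idx), min(idx) >= 0, max(idx) <= nrow(nodes) - 1)
```

The final check is useful on its own: an out-of-range or missing index usually shows up as missing links in the rendered plot rather than as an error.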

Create the interactive D3 plot.

forceNetwork(Links = karate.edges, Nodes = karate.nodes,
             Source = "from", Target = "to",
             NodeID = "id",
             legend = TRUE,
             linkDistance = 100, 
             Group = "id", opacity = 1.0, zoom = TRUE, fontSize = 20)

Figure 1: This is the graphic representation of the social relationships among the 34 individuals in the karate club. A line is drawn between two points when the two individuals being represented consistently interacted in contexts outside those of karate classes, workouts, and club meetings. Each such line drawn is referred to as an edge.

Note: You can zoom with the scroll wheel.
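Because the plot passes Group = "id", every node forms its own color group. If the colors should instead reflect clusters, one option is to compute community membership first and use that as the group. The sketch below is an assumption about how this could be done, not what the code above does: it uses igraph's built-in copy of the network and `cluster_fast_greedy()`, one of several community detection functions in igraph, and the `group` column name is chosen for illustration.

```r
library(igraph)
library(networkD3)

# Built-in copy of the karate club network (stand-in for karate.gml)
g <- make_graph("Zachary")

# Greedy modularity optimisation; other igraph algorithms could be swapped in
comm <- cluster_fast_greedy(g)

# Node table: one row per member, with the detected community as the group
nodes <- data.frame(
  id    = as.character(seq_len(vcount(g))),
  group = as.integer(membership(comm))
)

# Edge table with zero-based indices, as networkD3 expects
links <- igraph::as_data_frame(g, what = "edges")
links$from <- links$from - 1
links$to   <- links$to - 1

# Same plot as before, but colored by community instead of by id
forceNetwork(Links = links, Nodes = nodes,
             Source = "from", Target = "to",
             NodeID = "id", Group = "group",
             legend = TRUE, opacity = 1.0, zoom = TRUE)
```

With this grouping the legend lists one entry per detected community rather than one per member.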