Network Analysis using igraph and Statistical Analysis using ggplot

igraph is used to visualize a network of hosts that are linked when the same user logs in on more than one of them, and ggplot2 is used to plot statistics from the data set for different categories of events, by host and by user.

Loading and Subsetting the data

x<-read.csv('july.csv')
y<-c("Target.Host.Name","Target.User.Name")
z<-x[y]
z_df <- as.data.frame.matrix(table(z))

Creation of Boolean Matrix

z2<-replace(z_df,z_df >= 1, 1)
z3<-as.matrix(z2)
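
As a small illustration of the two steps above, the sketch below cross-tabulates a made-up login log and collapses the counts to 0/1. The data frame `toy` and its host and user names are hypothetical and are not taken from july.csv; only the column names follow the ones used above.

```r
# Hypothetical toy log: one row per login event (not the july.csv data)
toy <- data.frame(
  Target.Host.Name = c("hostA", "hostA", "hostB", "hostB", "hostC"),
  Target.User.Name = c("alice", "alice", "alice", "bob", "bob")
)

# Cross-tabulate hosts (rows) against users (columns); each cell counts events
counts <- as.data.frame.matrix(table(toy))

# Collapse the counts to 0/1: did this user appear on this host at all?
incidence <- as.matrix(replace(counts, counts >= 1, 1))
print(incidence)
```

In this toy case alice appears twice on hostA, but the Boolean matrix records that pairing only once.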

Creation of Adjacency Matrix

terma<-z3 %*% t(z3)
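
To see what this product contains, here is a minimal sketch on a hypothetical 3-host, 2-user Boolean matrix (the names `incidence`, `adjacency`, and the host and user labels are made up for illustration). Multiplying the matrix by its transpose gives a host-by-host matrix in which each off-diagonal entry counts the users two hosts have in common.

```r
# Hypothetical host-by-user Boolean matrix (1 = the user appeared on the host)
incidence <- matrix(
  c(1, 0,
    1, 1,
    0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("hostA", "hostB", "hostC"), c("alice", "bob"))
)

# Entry [i, j] counts the users shared by host i and host j;
# the diagonal counts the users seen on each host
adjacency <- incidence %*% t(incidence)
print(adjacency)
```

Here hostA and hostB share one user, while hostA and hostC share none, so only the first pair would be joined by an edge.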

Loading igraph library for Network Analysis

library(igraph)
library(tcltk)

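Building the graph from the Adjacency Matrix

# Undirected, weighted graph: the edge weight is the number of users two hosts share
g<-graph.adjacency(terma,weighted=TRUE,mode="undirected")
# Remove the self-loops produced by the matrix diagonal (and any duplicate edges)
g<-simplify(g)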

Set Labels and Degrees of Vertices

V(g)$label<-V(g)$name
V(g)$degree<-degree(g)
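
For intuition about the degree stored above, the following base R sketch reproduces it by hand on a hypothetical adjacency matrix, assuming the graph has been simplified so that self-loops are gone. The matrix values and the names `no_loops` and `deg` are illustrative only; in such a graph the degree of a host should equal the number of other hosts it shares at least one user with.

```r
# Hypothetical host-by-host adjacency matrix (shared-user counts)
adjacency <- matrix(
  c(1, 1, 0,
    1, 2, 1,
    0, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("hostA", "hostB", "hostC"), c("hostA", "hostB", "hostC"))
)

# Dropping self-loops corresponds to zeroing the diagonal
no_loops <- adjacency
diag(no_loops) <- 0

# Degree = number of neighbours, regardless of edge weight
deg <- rowSums(no_loops > 0)
print(deg)
```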

Converting into an interactive graph

set.seed(3456)
tkplot(g, layout=layout.kamada.kawai)
## [1] 1
plot(g, layout=layout.kamada.kawai)

[Figure: the host network drawn with the Kamada-Kawai layout]

Loading ggplot2 and gridExtra for Statistical Analysis

library(ggplot2)
library(gridExtra)
## Warning: package 'gridExtra' was built under R version 3.1.1
## Loading required package: grid

Loading and modifying data for Statistical Analysis

y<-c("Target.Host.Name","Target.User.Name","Category.Behavior","Category.Outcome")
m<-x[y]
colnames(m)<-c("targethost","targetuser","categorybehavior","categoryoutcome")

Creating plot for Target Host and Target User Statistics

# targethost and targetuser are categorical, so geom_bar() counts the events per value
# (geom_histogram() requires a continuous x in current ggplot2 releases)
p1<-ggplot(aes(x=targethost),
           data =  subset(m,categorybehavior %in% c("/Modify/Attribute") & categoryoutcome%in% c("/Success") ))+
  geom_bar(color ='black',fill = '#099009')+
  theme(axis.text.x=element_text(angle=30,hjust=1,vjust=1))+
  ggtitle('Distribution of Hosts for Attribute Modification Success events')
p2<-ggplot(aes(x=targetuser),
           data =  subset(m,categorybehavior %in% c("/Authentication/Verify") & categoryoutcome%in% c("/Failure") ))+
  geom_bar(color ='black',fill = '#099009')+
  theme(axis.text.x=element_text(angle=30,hjust=1,vjust=1))+
  ggtitle('Distribution of Target Users for Authentication Failure events')
grid.arrange(p1,p2,ncol=1)

[Figure: p1 (target hosts) above p2 (target users), arranged in one column]
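
The numbers behind each plot can be checked without ggplot2. The sketch below is a rough base R equivalent of the filter-and-count step for the first plot, run on hypothetical events; `m_toy`, `ok`, and `host_counts` are illustrative names, and the rows are not taken from july.csv.

```r
# Hypothetical events using the renamed columns from above
m_toy <- data.frame(
  targethost       = c("hostA", "hostA", "hostB", "hostC"),
  categorybehavior = c("/Modify/Attribute", "/Modify/Attribute",
                       "/Modify/Attribute", "/Authentication/Verify"),
  categoryoutcome  = c("/Success", "/Success", "/Failure", "/Failure"),
  stringsAsFactors = FALSE
)

# Keep only successful attribute modifications, as in the subset() call for p1
ok <- subset(m_toy, categorybehavior %in% "/Modify/Attribute" &
                    categoryoutcome %in% "/Success")

# Count the remaining events per host; these counts are the bar heights
host_counts <- table(ok$targethost)
print(host_counts)
```

Comparing such a table with the plot is a quick way to confirm that the filter selects the intended category and outcome.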