CSC595 Data Visualization Final Project

Peter Truax

12/2/2022

Saint Martin’s University Network

The Data

I pulled this data from Saint Martin’s University’s main network firewall in April 2022. It contains detailed information about security threats, which makes it valuable for cybersecurity threat assessment and response.

I will use this dataset to show how to create two kinds of plot in RStudio: the donut plot and the heatmap.

Data Setup

Both charts use the same setup, and every question below is answered from the data loaded here.

library(ggplot2)
library(viridis)
library(dplyr)
library(tidyverse)


threat_raw <- read_csv("C:\\linux\\CSC550 Data Visualization\\Class\\AprilThreats.csv", col_types = cols(timestamp = col_character()))
## Warning in register(): Can't find generic `scale_type` in package ggplot2 to
## register S3 method.
## Warning: package 'viridis' was built under R version 4.1.3
## Loading required package: viridisLite
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## Warning: package 'tidyverse' was built under R version 4.1.3
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v tibble  3.1.6     v purrr   0.3.4
## v tidyr   1.2.0     v stringr 1.4.0
## v readr   2.1.1     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
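
Before filtering, it can help to confirm the column names and the exact spelling of the severity labels, because filter() compares strings case-sensitively. The sketch below is only an illustration: toy_threats is a made-up tibble standing in for threat_raw, its rows are invented, and only the column names Severity and Source_Country come from the code in this report.

```r
library(dplyr)

# Hypothetical stand-in for threat_raw; the rows are invented for illustration.
toy_threats <- tibble(
  Severity       = c("critical", "critical", "high", "medium", NA),
  Source_Country = c("China", "Russia", "China", "Brazil", "China")
)

# List the column names and types.
glimpse(toy_threats)

# Tabulate the severity labels to see their exact spelling and case.
severity_levels <- count(toy_threats, Severity, sort = TRUE)
print(severity_levels)

# filter() keeps only rows where the condition is TRUE, so the row with a
# missing severity is dropped without any extra argument.
critical_only <- filter(toy_threats, Severity == "critical")
print(critical_only)
```

Running the same count() on threat_raw should show which label to use in the filters that follow.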

Question 1. Which countries attacked SMU the most?

Use a donut plot to answer the question. Filter the dataset to keep only the “critical” severity threats. Use the viridis “turbo” color palette.

Answer:

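# Keep only the critical threats. filter() already drops rows where the
# condition is NA, so no na.rm argument is needed.
threats <- threat_raw %>%
  filter(Severity == "critical") %>%
  select(Source_Country)

# Count the threats per source country.
threatcount <- count(threats, Source_Country)

# x position of the single stacked bar; it also controls the hole size.
hsize <- 2

threatcount <- threatcount %>%
  mutate(x = hsize)

ggplot(threatcount, aes(x = hsize, y = n, fill = Source_Country)) +
  geom_col() +
  coord_polar(theta = "y") +
  scale_fill_viridis_d(option = "turbo") +
  xlim(c(0.2, hsize + 0.5))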

Notice that ggplot2 has no “geom_donut”. Instead, you draw a single stacked bar with geom_col() and bend it into a ring with coord_polar(). The lower limit passed to xlim() leaves empty space on the inner side of the bar, and that space becomes the hole of the donut.
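
The bend described above can be tried on a small example. This is a minimal sketch, not the report's actual result: toy_counts and its values are invented, and theme_void() is an optional extra that the original plots do not use.

```r
library(ggplot2)

# Hypothetical counts, invented for illustration only.
toy_counts <- data.frame(
  Source_Country = c("A", "B", "C"),
  n              = c(50, 30, 20)
)

hsize <- 2  # x position of the single stacked bar

donut_plot <- ggplot(toy_counts, aes(x = hsize, y = n, fill = Source_Country)) +
  geom_col() +                             # one stacked bar, default width 0.9
  coord_polar(theta = "y") +               # bar height becomes angle, x becomes radius
  scale_fill_viridis_d(option = "turbo") +
  xlim(c(0.2, hsize + 0.5)) +              # unused radius inside the bar forms the hole
  theme_void()                             # optional: hide polar axes and grid

print(donut_plot)
```

With the default bar width, the bar covers x from 1.55 to 2.45, so moving the lower xlim value closer to 1.55 shrinks the hole and moving it further away enlarges it.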

Question 2. What is the largest threat seen coming from China?

It seems that China is the source of the most attacks. Refine the filter to include only threats from China, and use a donut chart to display the Threat_IDs that came from China. Use the “turbo” viridis palette.

Answer:

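# Keep only critical threats from China. filter() already drops rows where
# the condition is NA, so no na.rm argument is needed.
threats <- threat_raw %>%
  filter(Severity == "critical" & Source_Country == "China") %>%
  select(Threat_ID)

# Count how often each threat ID occurs.
threatId <- count(threats, Threat_ID)

# x position of the single stacked bar; it also controls the hole size.
hsize <- 2

threatId <- threatId %>%
  mutate(x = hsize)

ggplot(threatId, aes(x = hsize, y = n, fill = Threat_ID)) +
  geom_col() +
  coord_polar(theta = "y") +
  scale_fill_viridis_d(option = "turbo") +
  xlim(c(0.2, hsize + 0.5))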
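
A donut chart shows proportions, but the question asks for a single largest threat, which can also be read directly from the counts. The following is a minimal sketch, assuming a count table shaped like threatId; toy_ids and its values are invented for illustration.

```r
library(dplyr)

# Hypothetical counts in the same shape as threatId; the values are invented.
toy_ids <- tibble(
  Threat_ID = c("T-100", "T-200", "T-300"),
  n         = c(12L, 45L, 7L)
)

# Sort from most to least frequent and add each threat's share of the total.
ranked <- toy_ids %>%
  arrange(desc(n)) %>%
  mutate(share = n / sum(n))
print(ranked)

# Keep only the most frequent threat; if several tie, all of them are returned.
top_threat <- slice_max(toy_ids, n, n = 1)
print(top_threat)
```

Applying the same two steps to threatId would give the ranking behind the donut segments.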

Question 3. What is a better graphic we can use to display these results?

Use both the Source_Country and Threat_ID values to create a heatmap. Use the default viridis palette.

Answer:

# Count critical threats for each country and threat ID pair.
threats <- threat_raw %>%
  filter(Severity == "critical") %>%
  select(Source_Country, Threat_ID) %>%
  count(Source_Country, Threat_ID)


# Draw one tile per pair, colored by the number of threats.
ggplot(threats, aes(x = Source_Country, y = Threat_ID)) +
  geom_tile(aes(fill = n), color = "white", size = 0.1) +
  scale_fill_viridis(name = "Count") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
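
geom_tile() draws one tile per row of the count table, so any country and threat pair that never occurred is simply left blank. A possible variation, which is not part of the original analysis, fills those gaps with explicit zeros using tidyr's complete(). The data below are invented, and the sketch uses ggplot2's built-in scale_fill_viridis_c() in place of scale_fill_viridis() from the viridis package so that it needs fewer packages.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical log entries; only the column names mirror the report.
toy_log <- tibble(
  Source_Country = c("A", "A", "A", "B", "B", "C"),
  Threat_ID      = c("T-1", "T-1", "T-2", "T-2", "T-3", "T-1")
)

# count() returns one row per observed pair; complete() adds the unobserved
# pairs with n = 0 so that every cell of the grid gets a tile.
toy_grid <- toy_log %>%
  count(Source_Country, Threat_ID) %>%
  complete(Source_Country, Threat_ID, fill = list(n = 0L))

heatmap_plot <- ggplot(toy_grid, aes(x = Source_Country, y = Threat_ID, fill = n)) +
  geom_tile(color = "white") +
  scale_fill_viridis_c(name = "Count") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))

print(heatmap_plot)
```

Whether zeros or blanks read better depends on the data: blanks make rare pairs stand out, while zeros make it clear that the pair was checked and not observed.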

Question 4. Which applications appeared in SMU traffic, and were they blocked?

Use the Application and Action values to create a heatmap. Use the default viridis palette.

Answer:

# Count log entries for each application and action pair.
threats <- threat_raw %>%
  select(Application, Action) %>%
  count(Application, Action)


# Draw one tile per pair, colored by the number of entries.
ggplot(threats, aes(x = Application, y = Action)) +
  geom_tile(aes(fill = n), color = "white", size = 0.1) +
  scale_fill_viridis(name = "Count") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1, size = 6),
        axis.text.y = element_text(size = 5))