Peter Truax
12/2/2022
Saint Martin’s University Network
I pulled data from Saint Martin’s University’s main network firewall for April 2022. The dataset contains detailed information about security threats, which makes it valuable for cybersecurity threat assessment and response.
I am going to use this dataset to teach you how to create two different plots in RStudio: the donut plot and the heatmap.
Both charts use the same setup, and we can answer several questions with variations of the same code.
library(ggplot2)
library(viridis)
library(dplyr)
library(tidyverse)
threat_raw <- read_csv("C:\\linux\\CSC550 Data Visualization\\Class\\AprilThreats.csv", col_types = cols(timestamp = col_character()))
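The CSV path above points to a local file, so the following is a minimal sketch for readers who want to try the later steps without it. It builds a tiny stand-in table that uses the same column names as this tutorial (Severity, Source_Country, Threat_ID, Application, Action). The name threat_demo and every value in it are made up for illustration and are not real firewall data. The sketch also lists the distinct Severity values, because filter() comparisons are case-sensitive and it is worth checking the exact spelling before filtering.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in rows, invented for illustration only (NOT real firewall data).
threat_demo <- tribble(
  ~Severity,  ~Source_Country, ~Threat_ID, ~Application,   ~Action,
  "critical", "China",         "T-1001",   "web-browsing", "reset-both",
  "critical", "China",         "T-1001",   "web-browsing", "reset-both",
  "critical", "China",         "T-2002",   "ssl",          "drop",
  "critical", "Russia",        "T-1001",   "web-browsing", "reset-both",
  "critical", "United States", "T-2002",   "ssl",          "drop",
  "high",     "United States", "T-3003",   "dns",          "alert",
  "high",     "Russia",        "T-3003",   "ssl",          "alert",
  "medium",   "China",         "T-3003",   "dns",          "alert"
)

# List each distinct severity value and how often it occurs, most frequent first.
severity_levels <- count(threat_demo, Severity, sort = TRUE)
print(severity_levels)
```

With the real data, running the same count() on threat_raw shows whether the severity is stored as "critical", "Critical", or something else.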
Use a donut plot to answer the question: which source countries account for the most critical threats? Filter the dataset for only the “critical” severity threats. Use the Viridis “turbo” color palette.
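# Keep only critical-severity rows. filter() already drops rows where the
# condition is NA, and it has no na.rm argument.
threats <- threat_raw %>%
  filter(Severity == "critical") %>%
  select(Source_Country)
# Count the threats per source country.
threatcount <- count(threats, Source_Country)
# hsize sets the x position of the ring; a larger value gives a bigger hole.
hsize <- 2
threatcount <- threatcount %>%
  mutate(x = hsize)
ggplot(threatcount, aes(x = hsize, y = n, fill = Source_Country)) +
  geom_col() +
  coord_polar(theta = "y") +
  scale_fill_viridis_d(option = "turbo") +
  xlim(c(0.2, hsize + .5))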
Notice there isn’t a “geom_donut” for ggplot2. Instead, you use the geom_col() function to draw a stacked column and the coord_polar() function to bend that column into a circle. The xlim() call leaves empty space on the inner side of the column, and that space becomes the hole.
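To make the bending step easier to see, here is a minimal sketch that builds the donut in three stages. The data frame demo_counts and its values are made up for illustration; the object names stacked, pie, and donut are also just placeholders.

```r
library(ggplot2)

# Hypothetical counts, invented for illustration only.
demo_counts <- data.frame(
  Source_Country = c("A", "B", "C"),
  n = c(5, 3, 2)
)
hsize <- 2

# Stage 1: a single stacked column sitting at x = hsize.
stacked <- ggplot(demo_counts, aes(x = hsize, y = n, fill = Source_Country)) +
  geom_col()

# Stage 2: map the column's height (y) to the angle, which turns it into a ring.
pie <- stacked + coord_polar(theta = "y")

# Stage 3: extend the x axis below the column so empty space is left at the
# center of the polar plot, which forms the hole.
donut <- pie + xlim(c(0.2, hsize + 0.5))
```

Printing stacked, pie, and donut one after another shows the same data as a stacked bar, then as a circle, then as a ring with a hole.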
It seems like China is the source of the most attacks. Refine the filter to include only threats from China. Use a donut chart to display the Threat_IDs that came from China. Use the “turbo” Viridis palette.
# Keep only critical-severity rows whose source country is China.
# filter() has no na.rm argument; rows where the condition is NA are dropped anyway.
threats <- threat_raw %>%
  filter(Severity == "critical" & Source_Country == "China") %>%
  select(Threat_ID)
# Count the occurrences of each threat ID.
threatId <- count(threats, Threat_ID)
hsize <- 2
threatId <- threatId %>%
  mutate(x = hsize)
ggplot(threatId, aes(x = hsize, y = n, fill = Threat_ID)) +
  geom_col() +
  coord_polar(theta = "y") +
  scale_fill_viridis_d(option = "turbo") +
  xlim(c(0.2, hsize + .5))
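A donut shows proportions only approximately, so a reading such as "one country leads" is easier to confirm from the counts themselves. The following is a minimal sketch of that check. The table demo_counts stands in for a result like threatcount, and its values are made up for illustration.

```r
library(dplyr)

# Hypothetical per-country counts, invented for illustration only.
demo_counts <- data.frame(
  Source_Country = c("Russia", "China", "United States"),
  n = c(1L, 3L, 1L)
)

# Sort from most to fewest threats and add each country's share of the total.
ranked <- demo_counts %>%
  arrange(desc(n)) %>%
  mutate(share = n / sum(n))

print(ranked)
```

The first row of ranked is the country with the most threats, and the share column gives the fraction of the donut that its slice covers.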
Use both the Source_Country and Threat_ID values to create a heatmap. Use the default Viridis palette.
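# Count critical threats for every country and threat ID combination.
# filter() has no na.rm argument; rows where the condition is NA are dropped anyway.
threats <- threat_raw %>%
  filter(Severity == "critical") %>%
  select(Source_Country, Threat_ID) %>%
  count(Source_Country, Threat_ID)
# One tile per combination, colored by the count n.
ggplot(threats, aes(x = Source_Country, y = Threat_ID)) +
  geom_tile(aes(fill = n), color = 'White', size = 0.1) +
  scale_fill_viridis(name = "Count") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))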
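The heatmap works because count() with two columns returns one row per combination that actually occurs, which is the long format geom_tile() expects. Combinations that never occur have no row and therefore no tile. The sketch below shows that shape on made-up data and, as an optional step that this tutorial does not use, fills the missing combinations with zero using tidyr's complete(). The names demo, pairs, and full_grid are placeholders.

```r
library(dplyr)
library(tidyr)

# Hypothetical rows, invented for illustration only.
demo <- data.frame(
  Source_Country = c("China", "China", "China", "Russia", "United States"),
  Threat_ID      = c("T-1001", "T-1001", "T-2002", "T-1001", "T-2002")
)

# One row per (country, threat ID) pair that occurs, with its count in n.
pairs <- count(demo, Source_Country, Threat_ID)

# Optional: add the pairs that never occur, with n = 0, so that every
# cell of the grid would get a tile.
full_grid <- complete(pairs, Source_Country, Threat_ID, fill = list(n = 0L))

print(pairs)
print(full_grid)
```

Whether to fill the gaps is a design choice: blank tiles make absent combinations stand out, while zero-filled tiles keep the grid visually complete.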
Use the Application and Action values to create a heatmap. Use the default Viridis palette.
# Count threats for every application and firewall action combination (all severities).
threats <- threat_raw %>%
  select(Application, Action) %>%
  count(Application, Action)
# One tile per combination, colored by the count n.
ggplot(threats, aes(x = Application, y = Action)) +
  geom_tile(aes(fill = n), color = 'White', size = 0.1) +
  scale_fill_viridis(name = "Count") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1, size = 6),
        axis.text.y = element_text(size = 5))
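Since both heatmaps follow the same count-then-tile pattern, the shared steps could be wrapped in a small helper. This is only a sketch: count_heatmap is a hypothetical function, not part of ggplot2 or viridis, and the demo data is made up. To keep the example's dependencies small it uses ggplot2's built-in scale_fill_viridis_c() in place of the viridis package's scale_fill_viridis().

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: count the combinations of two columns and draw them as tiles.
# x_col and y_col are passed as bare column names and forwarded with {{ }}.
count_heatmap <- function(data, x_col, y_col) {
  data %>%
    count({{ x_col }}, {{ y_col }}) %>%
    ggplot(aes(x = {{ x_col }}, y = {{ y_col }})) +
    geom_tile(aes(fill = n), color = "white") +
    scale_fill_viridis_c(name = "Count") +
    theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
}

# Hypothetical rows, invented for illustration only.
demo <- data.frame(
  Application = c("web-browsing", "web-browsing", "ssl", "dns"),
  Action      = c("reset-both", "reset-both", "drop", "alert")
)

p <- count_heatmap(demo, Application, Action)
```

With the real data, a call such as count_heatmap(threat_raw, Application, Action) would produce the second heatmap, apart from the axis text adjustments.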