---
title: "Cybercrime Went Pro: A Decade of Global Attacks and Why Australia’s On the Menu"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
    vertical_layout: fill
    theme: cosmo
    source_code: embed
    social: ["menu"]
---

```{r setup, include=FALSE}
# ---- Libraries ----
library(tidyverse)
library(flexdashboard)
library(ggplot2)
library(ggrepel)
library(forcats)
library(ggalluvial)
library(stringr)
library(scales)

# ---- Load data ----
raw <- read_csv("dataset.csv", locale = locale(encoding = "Latin1"))

# ---- Minimal cleaning / harmonising ----
df <- raw %>%
  mutate(
    year = as.integer(year),
    country = as.factor(country),
    actor_type = factor(actor_type),
    motive = factor(motive),
    event_subtype = factor(event_subtype),
    industry = factor(industry)
  ) %>%
  filter(!is.na(year), year >= 2014, year <= 2025)

# ---- Metadata ----
n_total_events <- nrow(df)

# ---- Colour palette ----
actor_palette <- c(
  "Criminal"      = "#D62728",
  "Nation-State"  = "#1F77B4",
  "Hacktivist"    = "#FF7F0E",
  "Terrorist"     = "#9467BD",
  "Hobbyist"      = "#8C564B",
  "Undetermined"  = "#7F7F7F",
  "Other"         = "#7F7F7F"
)

au_col  <- "#FFC300"  # Australia gold
oth_col <- "#BDBDBD"  # global grey

# ---- Dashboard-wide theme ----
# keep typography consistent everywhere

theme_cyber <- function(base_size = 14) {
  theme_minimal(base_size = base_size) +
    theme(
      text = element_text(family = "sans", color = "#111111"),
      plot.title = element_text(face = "bold", size = base_size + 2, color = "#111111"),
      plot.subtitle = element_text(size = base_size - 2, color = "#444444"),
      plot.caption = element_text(size = base_size - 5, color = "#777777", hjust = 0),
      axis.title.x = element_text(size = base_size - 2, color = "#222222"),
      axis.title.y = element_text(size = base_size - 2, color = "#222222"),
      axis.text.x  = element_text(size = base_size - 3, color = "#222222"),
      axis.text.y  = element_text(size = base_size - 3, color = "#222222"),
      panel.grid.major.x = element_blank(),
      panel.grid.minor = element_blank(),
      panel.grid.major.y = element_line(color = "#DDDDDD", linewidth = 0.3),
      plot.background = element_rect(fill = "#FAFAFA", color = NA),
      panel.background = element_rect(fill = "#FAFAFA", color = NA),
      legend.background = element_rect(fill = "#FAFAFA", color = NA),
      legend.title = element_text(size = base_size - 3),
      legend.text = element_text(size = base_size - 4),
      legend.position = "bottom"
    )
}

# ---- Derived datasets ----
# incidents per year
by_year <- df %>%
  group_by(year) %>%
  summarise(incidents = n(), .groups = "drop")

# actor breakdown per year (for proportions)
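by_year_actor <- df %>%
  group_by(year, actor_type) %>%
  summarise(incidents = n(), .groups = "drop") %>%
  filter(!is.na(actor_type)) %>%
  # explicit zero rows so the stacked area draws an absent actor type as 0
  complete(year, actor_type, fill = list(incidents = 0))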

by_year_actor_prop <- by_year_actor %>%
  group_by(year) %>%
  mutate(prop = incidents / sum(incidents)) %>%
  ungroup()

# global top industries
top_industries <- df %>%
  count(industry, sort = TRUE) %>%
  slice_head(n = 10) %>%
  mutate(industry = fct_reorder(industry, n))

# Australia timeline
au_timeline <- df %>%
  filter(country == "Australia") %>%
  group_by(year) %>%
  summarise(incidents = n(), .groups = "drop")

global_timeline <- df %>%
  group_by(year) %>%
  summarise(global_incidents = n(), .groups = "drop")

# AU vs global industries
au_industries <- df %>%
  filter(country == "Australia") %>%
  count(industry, sort = TRUE)

global_industries <- df %>%
  count(industry, sort = TRUE)

audiff <- au_industries %>%
  rename(au_n = n) %>%
  full_join(
    global_industries %>% rename(global_n = n),
    by = "industry"
  ) %>%
  slice_max(order_by = au_n, n = 8) %>%
  mutate(industry = fct_reorder(industry, au_n, .fun = identity))

# ---- Sankey prep (How they operate) ----
focus_actor_types <- c("Criminal", "Nation-State", "Hacktivist")

top_motives <- df %>%
  filter(actor_type %in% focus_actor_types) %>%
  count(motive, sort = TRUE) %>%
  slice_head(n = 4) %>%
  pull(motive)

top_techniques <- df %>%
  filter(actor_type %in% focus_actor_types) %>%
  count(event_subtype, sort = TRUE) %>%
  slice_head(n = 6) %>%
  pull(event_subtype)

clean_names <- function(x) {
  x <- str_replace_all(x, "Exploitation of Application Server", "Server access")
  x <- str_replace_all(x, "Exploitation of End Host", "End-user device")
  x <- str_replace_all(x, "External Denial of Service", "External DDoS")
  x <- str_replace_all(x, "Message Manipulation", "Defacement")
  x
}

sankey_data <- df %>%
  filter(
    actor_type %in% focus_actor_types,
    motive %in% top_motives,
    event_subtype %in% top_techniques,
    !is.na(actor_type),
    !is.na(motive),
    !is.na(event_subtype)
  ) %>%
  mutate(
    motive = clean_names(motive),
    event_subtype = clean_names(event_subtype)
  ) %>%
  count(actor_type, motive, event_subtype, sort = TRUE)

sankey_data_filtered <- sankey_data %>% filter(n >= 40)
```

The Threat Landscape
======================================================================

Row {data-height=400}
-----------------------------------------------------------------------

### Cyber incidents over time (2014–2025) {data-width=1000}

```{r}
# Time trend with bias disclaimer in caption

ggplot(by_year, aes(x = year, y = incidents)) +
  geom_area(fill = oth_col, alpha = 0.4) +
  geom_line(linewidth = 1.2, color = "#333333") +
  geom_point(size = 2.5, color = "#333333") +
  geom_text_repel(
    data = by_year %>% slice_max(incidents, n = 1),
    aes(label = paste0("Peak: ", incidents)),
    color = "#111111", size = 4,
    family = "sans"
  ) +
  labs(
    title = "Cyber activity accelerated sharply after 2020",
    subtitle = "Targeted espionage has turned into industrial-scale cybercrime.",
    x = "Year",
    y = "Recorded incidents",
    caption = paste0(
      "Data to Sept 2025, so 2025 is a partial year. n = ",
      n_total_events,
      ". Counts are events recorded in the UMD Cyber Events Database; part of the rise may reflect better reporting over time."
    )
  ) +
  theme_cyber()
```

Row {data-height=450}
-----------------------------------------------------------------------

### Who is behind the attacks? {data-width=500}

```{r}
# 100% stacked area by actor type (share per year) for clarity

ggplot(by_year_actor_prop,
       aes(x = year, y = prop, fill = actor_type)) +
  geom_area(alpha = 0.9, colour = "transparent") +
  scale_y_continuous(labels = percent_format(accuracy = 1)) +
  scale_fill_manual(values = actor_palette) +
  labs(
    title = "Who's attacking?",
    subtitle = "Most attacks are profit-driven criminal operations. Nation-states are now a smaller share.",
    x = "Year",
    y = "Share of recorded incidents",
    fill = "Actor Type",
    caption = "Share of incidents each year. Source: UMD Cyber Events Database."
  ) +
  theme_cyber()
```
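
The shares in this panel are each actor type's incident count divided by that year's total. The sketch below repeats that calculation on a small made-up table (the `toy` data and its counts are illustrative, not taken from the database). It first fills absent year/actor combinations with zero, so that a missing group is treated as 0 rather than interpolated or left as a gap.

```r
library(dplyr)
library(tidyr)

# Hypothetical counts: "Hacktivist" has no recorded incidents in 2015.
toy <- tibble(
  year       = c(2014, 2014, 2015),
  actor_type = c("Criminal", "Hacktivist", "Criminal"),
  incidents  = c(6, 2, 9)
)

toy_prop <- toy %>%
  # add an explicit zero row for every missing year/actor combination
  complete(year, actor_type, fill = list(incidents = 0)) %>%
  group_by(year) %>%
  # share of that year's incidents attributed to each actor type
  mutate(prop = incidents / sum(incidents)) %>%
  ungroup()

toy_prop
```

Within each year the `prop` column sums to 1, which is what lets the stacked areas fill the full height of the chart.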

### Who gets hurt the most? {data-width=500}

```{r}
# Global top impacted sectors (ranked bars)

ggplot(top_industries, aes(x = industry, y = n)) +
  geom_col(fill = "#555555") +
  coord_flip() +
  labs(
    title = "Who's getting hit?",
    subtitle = "Hospitals, government, finance, education.",
    x = NULL,
    y = "Incidents recorded",
    caption = "Top impacted sectors globally. Source: UMD Cyber Events Database."
  ) +
  theme_cyber()
```

How They Operate
======================================================================

Row {data-height=400}
-----------------------------------------------------------------------

### Attack fingerprints {data-width=700}

```{r}
# Sankey / alluvial: Actor -> Motive -> Technique

ggplot(
  sankey_data_filtered,
  aes(
    axis1 = actor_type,
    axis2 = motive,
    axis3 = event_subtype,
    y = n,
    fill = actor_type
  )
) +
  geom_alluvium(alpha = 0.9, knot.pos = 0.4) +
  geom_stratum(width = 0.3, fill = "grey90", colour = "grey70") +
  geom_text(
    stat = "stratum",
    aes(label = after_stat(stratum)),
    family = "sans",
    size = 3.5,
    lineheight = 1.05
  ) +
  scale_x_discrete(
    limits = c("Actor", "Motive", "Technique"),
    expand = c(.05, .05)
  ) +
  scale_fill_manual(values = actor_palette) +
  labs(
    title = "Attack fingerprints",
    subtitle = "Criminal → Financial → Ransomware / Data Attack\nNation-State → Espionage → Server / Network Access\nHacktivist → Protest → Defacement / DDoS",
    x = NULL,
    y = "Incident count",
    caption = paste0(
      "Most common flows only (top actors, motives and techniques; flows with 40+ incidents). ",
      "Source: UMD Cyber Events Database (2014–2025). Shown: ",
      sum(sankey_data_filtered$n), " of ", n_total_events, " incidents."
    )
  ) +
  theme_cyber() +
  theme(
    legend.position = "none",
    axis.text.y = element_blank(),
    axis.title.y = element_blank(),
    axis.ticks.y = element_blank()
  )
```
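
Each ribbon's width is the number of incidents that share one actor, motive and technique combination, and rare combinations are dropped to keep the diagram readable. A minimal sketch of that counting and thresholding on made-up rows follows. The `toy_events` table and the `min_flow` threshold of 2 are illustrative only; the dashboard itself keeps flows with at least 40 incidents.

```r
library(dplyr)

# Hypothetical incident rows, one per event.
toy_events <- tibble(
  actor_type    = c("Criminal", "Criminal", "Criminal", "Hacktivist"),
  motive        = c("Financial", "Financial", "Financial", "Protest"),
  event_subtype = c("Ransomware", "Ransomware", "Data Attack", "External DDoS")
)

# One row per distinct actor/motive/technique flow, with its incident count.
flows <- toy_events %>%
  count(actor_type, motive, event_subtype, sort = TRUE)

min_flow <- 2  # illustrative threshold, not the dashboard's value

# Keep only flows that are common enough to draw.
flows_kept <- flows %>% filter(n >= min_flow)

flows_kept
```

Raising the threshold trades completeness for legibility: fewer ribbons are drawn, and the incidents behind the dropped flows no longer appear in the diagram.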

### How to read this {data-width=300}

```{r}
# Accessibility helper panel
attacker_text <- tibble(
  y = c(3, 2, 1),
  label = c(
    "Criminal (red)\n— Business model.\n— Goal: money.\n— Pattern: Financial motive → Data Attack / Server Access (ransomware-as-a-service).\n— Impact: hospitals locked, citizen data leaked.",
    "Nation-State (blue)\n— Long game.\n— Goal: espionage & leverage.\n— Pattern: Espionage motive → Server / Network access.\n— Impact: steals sensitive data quietly.",
    "Hacktivist (orange)\n— Loud and fast.\n— Goal: protest / signal.\n— Pattern: Protest motive → Defacement / DDoS.\n— Impact: outage and embarrassment."
  )
)

ggplot(attacker_text, aes(x = 1, y = y, label = label)) +
  geom_text(
    hjust = 0,
    vjust = 1,
    size = 4,
    family = "sans",
    lineheight = 1.05
  ) +
  xlim(1, 2) +
  ylim(0.5, 3.5) +
  labs(
    title = "How to read this",
    subtitle = "Intent → technique → impact.\nEach actor type leaves a fingerprint.",
    caption = "Palette: Criminal = red; Nation-State = blue; Hacktivist = orange."
  ) +
  theme_void() +
  theme(
    plot.background = element_rect(fill = "#FAFAFA", color = NA),
    panel.background = element_rect(fill = "#FAFAFA", color = NA),
    plot.title = element_text(face = "bold", size = 18, color = "#111111"),
    plot.subtitle = element_text(size = 12, color = "#444444"),
    plot.caption = element_text(size = 9, color = "#777777", hjust = 0)
  )
```

Australia’s Story
======================================================================

Row {data-height=380}
-----------------------------------------------------------------------

### Australia vs global pressure {data-width=500}

```{r}
au_vs_global <- au_timeline %>%
  rename(au_incidents = incidents) %>%
  full_join(global_timeline, by = "year")

ggplot(au_vs_global, aes(x = year)) +
  geom_line(aes(y = global_incidents), linewidth = 1.2, color = oth_col) +
  geom_point(aes(y = global_incidents), size = 2.8, color = oth_col) +
  geom_line(aes(y = au_incidents), linewidth = 1.4, color = au_col) +
  geom_point(aes(y = au_incidents), size = 3.2, color = au_col) +
  # log scale keeps the much smaller Australian series readable next to the global total
  scale_y_log10(labels = label_comma()) +
  labs(
    title = "Australia moves with global cyber pressure",
    subtitle = "When cybercrime spikes worldwide, Australia spikes too.",
    x = "Year",
    y = "Recorded incidents (log scale)",
    caption = "Gold = Australia. Grey = global total. Source: UMD Cyber Events Database (2014–2025)."
  ) +
  theme_cyber()
```
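
A global total is usually much larger than any single country's count, so one way to check whether two series move together is to rescale each to its own first year (index = 100) before comparing. The sketch below does this on made-up numbers. `toy_counts` and its values are hypothetical, and the first-year base is an arbitrary choice.

```r
library(dplyr)

# Hypothetical yearly counts, not taken from the database.
toy_counts <- tibble(
  year             = 2019:2022,
  au_incidents     = c(10, 15, 30, 25),
  global_incidents = c(1000, 1400, 3100, 2600)
)

indexed <- toy_counts %>%
  arrange(year) %>%
  mutate(
    # each series relative to its own first year, scaled so the base is 100
    au_index     = 100 * au_incidents / first(au_incidents),
    global_index = 100 * global_incidents / first(global_incidents)
  )

indexed
```

Plotting `au_index` and `global_index` on one axis puts both lines on a comparable scale, so a shared spike shows up as two lines rising together.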

### Which Australian sectors get hit {data-width=500}

```{r}
# Visual compare AU vs global: AU bar vs global reference line+dot

ggplot(audiff, aes(x = industry)) +
  geom_segment(aes(y = 0, yend = global_n, xend = industry),
               color = oth_col, linewidth = 1) +
  geom_point(aes(y = global_n),
             color = oth_col, size = 3) +
  geom_col(aes(y = au_n),
           fill = au_col, alpha = 0.9, width = 0.6) +
  coord_flip() +
  labs(
    title = "Same pressure points as elsewhere",
    subtitle = "Gov, health, education get hammered here too.",
    x = NULL,
    y = "Incidents recorded",
    caption = "Bar = Australia. Grey line/dot = global baseline."
  ) +
  theme_cyber()
```
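
The gold bars and the grey baseline are raw counts, so they can sit on very different scales. One alternative, sketched here under stated assumptions, is to convert each side to a share of its own total, so the comparison reads as "what fraction of incidents hits this sector". The tables `toy_au` and `toy_global` and their numbers are made up for illustration.

```r
library(dplyr)

# Hypothetical sector counts for Australia and for the whole dataset.
toy_au <- tibble(
  industry = c("Health", "Education", "Finance"),
  au_n     = c(20, 10, 10)
)
toy_global <- tibble(
  industry = c("Health", "Education", "Finance"),
  global_n = c(3000, 1000, 1000)
)

shares <- toy_au %>%
  full_join(toy_global, by = "industry") %>%
  mutate(
    au_share     = au_n / sum(au_n, na.rm = TRUE),
    global_share = global_n / sum(global_n, na.rm = TRUE),
    # positive gap: the sector takes a larger share in Australia than globally
    share_gap    = au_share - global_share
  )

shares
```

Shares should be computed over all sectors before any top-n filter is applied, otherwise the denominators only cover the sectors that were kept.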

Row {data-height=500}
-----------------------------------------------------------------------

### Why this matters {data-width=1000}

<div style="
  font-size:16px;
  line-height:1.45;
  color:#111111;
  font-family:sans-serif;
  padding:16px;
  background-color:#FAFAFA;
  border:1px solid #DDDDDD;
  border-radius:6px;
  max-width:1600px;
  max-height:360px;           /* limit total height */
  overflow-y:auto;            /* vertical scroll */
  overflow-x:hidden;          /* hide horizontal scroll */
  scrollbar-width: thin;      /* modern browsers: slim scroll */
  scrollbar-color: #BDBDBD #FAFAFA; /* scrollbar color */
">

  <div style="font-size:20px; font-weight:bold; margin-bottom:16px;">
    In 2014, cyber attacks were mostly spies and sabotage. In 2025, they're just Tuesday.
  </div>

  <div style="margin-bottom:16px; color:#333333;">
    Cybercrime is now industrial: scalable, repeatable, profit-first. The same crews that ransom a hospital in Perth will hit a school district in the US the next week. 
    Australia isn't “uniquely targeted.” We're on the menu because everyone is.
  </div>

  <div style="margin-bottom:16px; color:#333333;">
    This isn't just a national security problem. It's an essential services problem — hospitals, public administration, finance, education. When they go down, real people get hurt.
  </div>

  <div style="margin-bottom:16px; color:#333333;">
    Policy takeaway: treat cyber like critical infrastructure resilience. Fund and harden hospitals, government services, finance and education the same way we fund physical infrastructure.
  </div>

  <div style="font-size:13px; color:#777777; line-height:1.3;">
    Reference: 
    UMD School of Public Policy. (2025). Cyber Events Database. Umd.edu. https://gotech.umd.edu/sites/default/files/2025-10/Cyber%20Events%20Database%20-%202014-2024%20%2B%20Jan_Aug_Sept%202025.xlsx.
  </div>
  
  <div style="font-size:13px; color:#777777; line-height:1.3;">
    Visuals: R + flexdashboard.
    Palette: Criminal = red, Nation-State = blue, Hacktivist = orange, Australia = gold.
  </div>

</div>