Part I: Statista Data - AI vs Total Software Market

Overview

This analysis compares:

  1. AI Market Revenue (Billions USD)
  2. AI Tool Users (Millions)
  3. Total Software Industry Size (Billions USD)

Important Note: The Growth Index shows percentage growth relative to 2020, NOT absolute market size. Small, emerging industries (like AI) tend to grow much faster in percentage terms than large, mature industries (like total software).

Load Libraries

library(tidyverse)
library(lubridate)
library(scales)

Data Preparation

1. AI Market Growth

ai_market <- read_csv("~/Downloads/Statista_AIMarketGrowth - Sheet1.csv")

ai_market_long <- ai_market %>%
  rename(Category = Year) %>%
  pivot_longer(
    cols = -Category,
    names_to = "Year",
    values_to = "AI_Market_Billions"
  ) %>%
  mutate(Year = as.numeric(Year)) %>%
  select(Year, AI_Market_Billions)

2. AI Tool Users

ai_users <- read_csv("~/Downloads/Statista_AIToolsUsers - Sheet1.csv")

ai_users_long <- ai_users %>%
  rename(Category = Year) %>%
  pivot_longer(
    cols = -Category,
    names_to = "Year",
    values_to = "AI_Users_Millions"
  ) %>%
  mutate(Year = as.numeric(Year)) %>%
  select(Year, AI_Users_Millions)

3. Software Market Size (TOTAL)

software_market <- read_csv("~/Downloads/Statista_SoftwareMarketSize - Sheet1-2.csv")

software_total <- software_market %>%
  filter(Year == "Total (billions USD)") %>%
  rename(Category = Year) %>%
  pivot_longer(
    cols = -Category,
    names_to = "Year",
    values_to = "Software_Total_Billions"
  ) %>%
  mutate(Year = as.numeric(Year)) %>%
  select(Year, Software_Total_Billions)

4. Combine All Datasets

combinedStatista_data <- ai_market_long %>%
  inner_join(ai_users_long, by = "Year") %>%
  inner_join(software_total, by = "Year") %>%
  arrange(Year)

knitr::kable(combinedStatista_data, caption = "Combined Statista Data (2020-2030)")

Combined Statista Data (2020-2030)

Year   AI_Market_Billions   AI_Users_Millions   Software_Total_Billions
2020                16.87               48.13                    270.86
2021                36.09               59.72                    286.85
2022                23.61               75.07                    313.56
2023                25.60               84.10                    338.22
2024                34.90              104.84                    363.39
2025                46.99              129.08                    379.29
2026                62.62              158.15                    395.00
2027                84.25              193.36                    410.14
2028               114.16              236.41                    427.24
2029               159.28              289.41                    445.40
2030               223.52              355.12                    462.04
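
Because the three tables are combined with inner joins, any year missing from one input is silently dropped from the combined data. The sketch below shows one way to list such years before joining. It uses toy stand-in data frames (the `*_toy` names and their values are hypothetical, not the Statista figures) and base R only, so it runs without the CSV files.

```r
# Toy stand-ins for the three long tables; years and values are illustrative only.
ai_market_toy <- data.frame(Year = 2020:2023, AI_Market_Billions = c(1, 2, 3, 4))
ai_users_toy  <- data.frame(Year = 2020:2022, AI_Users_Millions = c(10, 20, 30))
software_toy  <- data.frame(Year = 2021:2023, Software_Total_Billions = c(100, 110, 120))

year_sets <- list(ai_market_toy$Year, ai_users_toy$Year, software_toy$Year)

# Every year seen in at least one table, and the years present in all of them
all_years    <- sort(unique(unlist(year_sets)))
common_years <- Reduce(intersect, year_sets)

# Years that an inner join across all three tables would discard
dropped_years <- setdiff(all_years, common_years)
print(dropped_years)
```

An empty result would mean the inner joins lose nothing. With the real data, the same check could be run on `ai_market_long$Year`, `ai_users_long$Year`, and `software_total$Year`.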

Absolute Values Analysis

AI Market Growth

ggplot(combinedStatista_data, aes(x = Year, y = AI_Market_Billions)) +
  geom_line(linewidth = 1.2, color = "#2E86AB") +
  geom_point(size = 3, color = "#2E86AB") +
  labs(
    title = "AI Market Growth",
    y = "Billions USD",
    x = "Year"
  ) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", size = 14))

AI Tool Users

ggplot(combinedStatista_data, aes(x = Year, y = AI_Users_Millions)) +
  geom_line(linewidth = 1.2, color = "#A23B72") +
  geom_point(size = 3, color = "#A23B72") +
  labs(
    title = "AI Tool Users",
    y = "Millions of Users",
    x = "Year"
  ) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", size = 14))

Total Software Market Size

ggplot(combinedStatista_data, aes(x = Year, y = Software_Total_Billions)) +
  geom_line(linewidth = 1.2, color = "#F18F01") +
  geom_point(size = 3, color = "#F18F01") +
  labs(
    title = "Total Software Market Size",
    y = "Billions USD",
    x = "Year"
  ) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", size = 14))

Growth Index Analysis (Base Year 2020 = 100)

The Growth Index standardizes each variable so that 2020 = 100.

Formula: (Value in Year t / Value in 2020) × 100

Interpretation:

  - Index = 200 → the variable has doubled since 2020
  - Index = 150 → it has grown 50% since 2020

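Why the AI index rises so much: AI started from a small base (about $17B in 2020), so even moderate absolute increases translate into large percentage growth.

Why the software index rises slowly: The total software industry is already large (about $271B in 2020), so similar absolute increases produce much smaller percentage gains. Mature industries tend to grow incrementally rather than exponentially.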
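
As a worked instance of the formula, the sketch below applies it to the 2020 and 2030 rows of the combined Statista table. The `growth_index` helper and the variable names are illustrative, not part of the original analysis.

```r
# 2020 and 2030 values copied from the combined Statista table
base_2020  <- c(AI_Market = 16.87,  AI_Users = 48.13,  Software = 270.86)
value_2030 <- c(AI_Market = 223.52, AI_Users = 355.12, Software = 462.04)

# Growth Index: (value in year t / value in 2020) * 100
growth_index <- function(value, base) value / base * 100

index_2030 <- growth_index(value_2030, base_2020)
print(round(index_2030))
```

The rounded 2030 indices come out at roughly 1325 for AI market revenue, 738 for AI tool users, and 171 for total software. The absolute gains in AI revenue and total software over the decade are of similar size (about $207B and $191B), which illustrates how the same kind of increase looks very different once divided by the 2020 base.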

normalized_data <- combinedStatista_data %>%
  mutate(
    AI_Market_Index = AI_Market_Billions / first(AI_Market_Billions) * 100,
    AI_Users_Index = AI_Users_Millions / first(AI_Users_Millions) * 100,
    Software_Index = Software_Total_Billions / first(Software_Total_Billions) * 100
  )

normalized_long <- normalized_data %>%
  select(Year, AI_Market_Index, AI_Users_Index, Software_Index) %>%
  pivot_longer(
    cols = -Year,
    names_to = "Metric",
    values_to = "Index_Value"
  )

ggplot(normalized_long, aes(x = Year, y = Index_Value, color = Metric)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 2) +
  labs(
    title = "Growth Comparison (Index: 2020 = 100)",
    y = "Growth Index (Relative Growth)",
    x = "Year",
    color = "Metric"
  ) +
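  scale_color_manual(
    # Explicit breaks tie each color and label to its metric instead of relying on alphabetical order
    breaks = c("AI_Market_Index", "AI_Users_Index", "Software_Index"),
    values = c("#2E86AB", "#A23B72", "#F18F01"),
    labels = c("AI Market Revenue", "AI Tool Users", "Total Software Market")
  ) +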
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    legend.position = "bottom"
  )

Economic Interpretation Summary

  1. AI shows exponential-style growth because it is in an early adoption and expansion phase.

  2. The total software market grows steadily because it is a mature industry with widespread adoption.

  3. The Growth Index highlights relative change, not size. Therefore AI appears to “explode” compared to software.

  4. This pattern is typical of technological innovation cycles.
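
One way to put numbers on the contrast between "exponential-style" and "steady" growth is the compound annual growth rate (CAGR) between the first and last years of the table. This is a supplementary sketch rather than part of the original analysis. It uses only the 2020 and 2030 endpoints, so it ignores the path in between, including the dip in AI market revenue from 2021 to 2022 visible in the table.

```r
# Endpoints copied from the combined Statista table
start_2020 <- c(AI_Market = 16.87,  AI_Users = 48.13,  Software = 270.86)
end_2030   <- c(AI_Market = 223.52, AI_Users = 355.12, Software = 462.04)

n_years <- 2030 - 2020

# CAGR: the constant yearly growth rate that turns the start value into the end value
cagr <- (end_2030 / start_2020)^(1 / n_years) - 1

# Express as percentages with one decimal place
print(round(cagr * 100, 1))
```

Under these assumptions the implied annual rates are about 29.5% for AI market revenue, 22.1% for AI tool users, and 5.5% for the total software market.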


Part II: Indeed Data - AI Job Mentions vs Software Hiring

Data Loading

# Software sector job postings index
software_data <- read_csv("~/Downloads/job-postings-sector-index-2.csv")

# AI headline share data
ai_data <- read_csv("~/Downloads/ai-headline-share.csv")

Data Cleaning

# Filter Software Development for United States
software_clean <- software_data %>%
  filter(
    countryName == "United States",
    sectorName == "Software Development"
  ) %>%
  mutate(date = as.Date(dateString)) %>%
  select(date, software_index = value)

# Filter AI data for United States
ai_clean <- ai_data %>%
  filter(countryName == "United States") %>%
  mutate(date = as.Date(dateString)) %>%
  select(date, ai_share = value)

# Join datasets
combined_data <- inner_join(ai_clean, software_clean, by = "date")

# Normalize AI share to an index (first joined date = 100)
combined_data <- combined_data %>%
  arrange(date) %>%
  mutate(
    ai_index = (ai_share / first(ai_share)) * 100
  )
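
Note that `ai_index` is rebased to the first date in the joined data, while `software_index` keeps whatever base period the source file uses. If the two base periods differ, the series are not indexed to the same date. The sketch below shows one way to rebase both series to the first joined date. The `toy` data frame, its values, and the `rebase` helper are hypothetical and stand in for `combined_data`.

```r
# Hypothetical stand-in for combined_data; values are illustrative only
toy <- data.frame(
  date           = as.Date(c("2021-03-01", "2021-01-01", "2021-02-01")),
  ai_share       = c(0.75, 0.5, 0.6),
  software_index = c(90, 120, 150)
)

# Sort by date so the first row is the common base period
toy <- toy[order(toy$date), ]

# Rebase a series so that its first observation equals 100
rebase <- function(x) x / x[1] * 100

toy$ai_index         <- rebase(toy$ai_share)
toy$software_rebased <- rebase(toy$software_index)
print(toy)
```

Whether rebasing the software series is appropriate depends on how the source index is defined, so this is a design choice to check against the data documentation rather than a required step.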

AI Mentions vs Software Job Postings

ggplot(combined_data, aes(x = date)) +
  geom_line(aes(y = ai_index, color = "AI Mentions (Indexed)"), linewidth = 1) +
  geom_line(aes(y = software_index, color = "Software Job Postings Index"), linewidth = 1) +
  labs(
    title = "AI Job Mentions vs Software Development Hiring (U.S.)",
    subtitle = "Both Series Indexed to Base = 100",
    x = "Date",
    y = "Index (Base = 100)",
    color = ""
  ) +
  scale_y_continuous(labels = comma) +
  scale_color_manual(values = c("AI Mentions (Indexed)" = "#2E86AB", "Software Job Postings Index" = "#F18F01")) +
  theme_minimal() +
  theme(
    legend.position = "bottom",
    plot.title = element_text(face = "bold", size = 14)
  )

AI-Adjusted Hiring Measure

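This measure approximates the relative volume of AI-related software job postings by multiplying the Software Job Index by the AI share expressed as a fraction. The division by 100 in the code treats ai_share as a percentage.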

combined_data <- combined_data %>%
  mutate(
    ai_adjusted_hiring = software_index * (ai_share / 100)
  )
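
A small numeric sketch of this measure, using made-up values and assuming `ai_share` is a percentage between 0 and 100:

```r
# Hypothetical values for two dates; not taken from the Indeed data
software_index <- c(200, 100)  # software postings index halves
ai_share       <- c(2, 3)      # AI share of postings rises from 2% to 3%

# Same formula as in the analysis: index scaled by the AI share as a fraction
ai_adjusted_hiring <- software_index * (ai_share / 100)
print(ai_adjusted_hiring)
```

In this toy case the adjusted measure falls from 4 to 3 even though the AI share rises, because the drop in overall software postings outweighs it. The measure therefore tracks both series at once and should not be read as the AI share alone.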

ggplot(combined_data, aes(x = date)) +
  geom_line(aes(y = ai_adjusted_hiring), linewidth = 1, color = "#A23B72") +
  labs(
    title = "Approximate AI-Related Software Hiring (U.S.)",
    subtitle = "Software Job Index × AI Share",
    x = "Date",
    y = "AI-Adjusted Hiring Index"
  ) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", size = 14))


Conclusion

This analysis indicates that AI is growing much faster than the broader software industry, both in market revenue and in job-posting mentions. The indexed comparisons show that although AI remains smaller in absolute terms ($223.52B versus $462.04B in the 2030 Statista figures), its growth rate substantially outpaces that of the mature software industry.