Project 2

Author

Maisha Ann Subin

Project 2: GLOBAL AI IMPACT

(McClure, 2024)

Introducing:

Data Set Variables
Country
Year
Industry
AI adoption rate (%)
AI generated content vol. (TBs per year)
Job loss due to AI (%)
Revenue increase due to AI (%)
Human-AI collaboration rate (%)
Top AI tools used
Regulation status
Consumer trust in AI (%)
Market share of AI companies (%)

With my major in Information Science, I aim to pursue a career in technology and AI. This data set from Kaggle immediately caught my interest because it draws from several reputable sources like the Stanford AI Index Report, MIT Technology Review, and more, and includes variables such as job loss, AI adoption rates, and AI market share.

Out of the 12 variables in the “Global AI Impact” data set, I chose to focus mainly on the job loss rate due to AI (%) and how it relates to country, industry, and year. I began by cleaning the data: checking for NAs, renaming columns for clarity, then narrowing the data down to my key variables and working with average values to keep the analysis manageable and focused.

Load necessary libraries

library(tidyverse)
library(gganimate)
library(gapminder)
library(viridis)
library(ggplot2)
library(DataExplorer)
library(ggthemes)

Read in AI impact data

AI_Impact<- read_csv("/Users/maishasubin/Desktop/DATA110/Global_AI_Impact_Dataset.csv")

Rows: 200 Columns: 12
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): Country, Industry, Top AI Tools Used, Regulation Status
dbl (8): Year, AI Adoption Rate (%), AI-Generated Content Volume (TBs per ye...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

head(AI_Impact)

# A tibble: 6 × 12
  Country      Year Industry   `AI Adoption Rate (%)` AI-Generated Content Vol…¹
  <chr>       <dbl> <chr>                       <dbl>                      <dbl>
1 South Korea  2022 Media                        44.3                       33.1
2 China        2025 Legal                        34.8                       66.7
3 USA          2022 Automotive                   81.1                       96.1
4 France       2021 Legal                        85.2                       93.8
5 France       2021 Gaming                       79.0                       45.6
6 USA          2021 Retail                       67.0                       47.7
# ℹ abbreviated name: ¹`AI-Generated Content Volume (TBs per year)`
# ℹ 7 more variables: `Job Loss Due to AI (%)` <dbl>,
#   `Revenue Increase Due to AI (%)` <dbl>,
#   `Human-AI Collaboration Rate (%)` <dbl>, `Top AI Tools Used` <chr>,
#   `Regulation Status` <chr>, `Consumer Trust in AI (%)` <dbl>,
#   `Market Share of AI Companies (%)` <dbl>

Necessary cleaning:

Renaming variables to remove spaces
Removing rows with NAs in any column, checking each column individually

# Renaming  variables 
AI_Impact2 <- AI_Impact |> 
  rename(ai_adoption_rate = `AI Adoption Rate (%)`
         , ai_gen_content_vol = `AI-Generated Content Volume (TBs per year)`
         , job_loss_ai_rate = `Job Loss Due to AI (%)`
         , rev_increase_ai_rate = `Revenue Increase Due to AI (%)`
         , human_ai_collab_rate = `Human-AI Collaboration Rate (%)`
         , top_ai_tools = `Top AI Tools Used`
         , regulation_status = `Regulation Status`
         , consumer_trust_ai_rate = `Consumer Trust in AI (%)`
         , market_share_ai_rate = `Market Share of AI Companies (%)`
         , country = `Country`, year = `Year`, industry = `Industry`)


# Removing rows with missing values (NAs)
AI_Impact_nona <- AI_Impact2 %>%
  filter(!is.na(ai_adoption_rate), !is.na(ai_gen_content_vol)
         , !is.na(job_loss_ai_rate), !is.na(rev_increase_ai_rate)
         , !is.na(human_ai_collab_rate), !is.na(top_ai_tools)
         , !is.na(regulation_status), !is.na(consumer_trust_ai_rate)
         , !is.na(market_share_ai_rate), !is.na(country), !is.na(year)
         , !is.na(industry))
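Before filtering, it can help to see how many values are actually missing in each column. The sketch below uses a small made-up tibble (`toy` and its values are hypothetical, not the Kaggle data) to count NAs per column, and shows that `tidyr::drop_na()` with no arguments keeps the same rows as a column-by-column `filter()`.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for the real data set
toy <- tibble(
  country = c("USA", "Canada", NA, "India"),
  year = c(2020, 2021, 2022, NA),
  job_loss_ai_rate = c(10.5, NA, 30.2, 12.1)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Column-by-column filter: keep rows where no listed column is NA
by_filter <- toy %>%
  filter(!is.na(country), !is.na(year), !is.na(job_loss_ai_rate))

# drop_na() with no arguments drops rows that have an NA in any column
by_drop_na <- toy %>% drop_na()

# Both approaches keep the same number of rows
nrow(by_filter) == nrow(by_drop_na)
```

Either form works; the shorter one is simply less to maintain if columns are added or renamed later.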

Viewing my data:

# To view the list of countries I am working with
unique(AI_Impact_nona$country)

 [1] "South Korea" "China"       "USA"         "France"      "Australia"  
 [6] "UK"          "Canada"      "India"       "Japan"       "Germany"

# To view the list of industries I am working with
unique(AI_Impact_nona$industry)

 [1] "Media"         "Legal"         "Automotive"    "Gaming"       
 [5] "Retail"        "Education"     "Healthcare"    "Marketing"    
 [9] "Manufacturing" "Finance"

# To view years I am working with
unique(AI_Impact_nona$year)

[1] 2022 2025 2021 2023 2020 2024

Testing multiple linear regression models for the job loss due to AI rate

Exploration 1: with respect to all variables (for understanding purposes, regression analysis for exploration 2 coming up)

# Run linear regression model with all predictors
lm_model <- lm(
  job_loss_ai_rate ~ ai_adoption_rate + year + ai_gen_content_vol +
    rev_increase_ai_rate + consumer_trust_ai_rate + human_ai_collab_rate +
    regulation_status + market_share_ai_rate + industry + top_ai_tools + country,
  data = AI_Impact_nona
)
# Summary of model
summary(lm_model)


Call:
lm(formula = job_loss_ai_rate ~ ai_adoption_rate + year + ai_gen_content_vol + 
    rev_increase_ai_rate + consumer_trust_ai_rate + human_ai_collab_rate + 
    regulation_status + market_share_ai_rate + industry + top_ai_tools + 
    country, data = AI_Impact_nona)

Residuals:
     Min       1Q   Median       3Q      Max 
-31.0968  -8.8642  -0.2018   9.0328  29.8798 

Coefficients:
                               Estimate Std. Error t value Pr(>|t|)  
(Intercept)                  1389.07122 1180.20025   1.177   0.2409  
ai_adoption_rate               -0.01095    0.04194  -0.261   0.7944  
year                           -0.68092    0.58402  -1.166   0.2453  
ai_gen_content_vol              0.01115    0.03590   0.311   0.7564  
rev_increase_ai_rate            0.09966    0.04279   2.329   0.0211 *
consumer_trust_ai_rate          0.05837    0.06111   0.955   0.3409  
human_ai_collab_rate            0.02240    0.05448   0.411   0.6815  
regulation_statusModerate       3.32802    2.47297   1.346   0.1802  
regulation_statusStrict         0.26367    2.64207   0.100   0.9206  
market_share_ai_rate            0.07256    0.07425   0.977   0.3299  
industryEducation              -1.77904    4.90145  -0.363   0.7171  
industryFinance                 0.02543    5.09110   0.005   0.9960  
industryGaming                  0.14968    4.36277   0.034   0.9727  
industryHealthcare             -3.09022    4.84103  -0.638   0.5241  
industryLegal                  -2.35939    4.92809  -0.479   0.6327  
industryManufacturing           2.75090    4.84781   0.567   0.5712  
industryMarketing              -7.97446    4.63354  -1.721   0.0871 .
industryMedia                  -7.36095    4.17869  -1.762   0.0800 .
industryRetail                 -4.60081    4.57440  -1.006   0.3160  
top_ai_toolsChatGPT            -6.13845    4.00379  -1.533   0.1271  
top_ai_toolsClaude             -3.81987    4.02315  -0.949   0.3438  
top_ai_toolsDALL-E             -0.91187    4.17351  -0.218   0.8273  
top_ai_toolsMidjourney         -2.84421    3.86866  -0.735   0.4633  
top_ai_toolsStable Diffusion   -1.97316    4.08296  -0.483   0.6295  
top_ai_toolsSynthesia           1.30563    4.17744   0.313   0.7550  
countryCanada                  11.57367    5.36196   2.158   0.0323 *
countryChina                    8.49437    5.05364   1.681   0.0947 .
countryFrance                   8.48622    4.82318   1.759   0.0803 .
countryGermany                  8.35650    5.19316   1.609   0.1095  
countryIndia                    5.67046    4.95703   1.144   0.2543  
countryJapan                    9.30590    5.08581   1.830   0.0691 .
countrySouth Korea              8.00702    5.10165   1.569   0.1184  
countryUK                       7.38738    5.20701   1.419   0.1579  
countryUSA                      1.38714    4.91546   0.282   0.7781  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 13.65 on 166 degrees of freedom
Multiple R-squared:  0.1957,    Adjusted R-squared:  0.03586 
F-statistic: 1.224 on 33 and 166 DF,  p-value: 0.2046

Exploration 2: with respect to country, industry, AI adoption rate and year

# Run linear regression models
lm_model <- lm(job_loss_ai_rate ~ year + ai_adoption_rate + industry + country, data =  AI_Impact_nona )
# Summary of model
summary(lm_model)


Call:
lm(formula = job_loss_ai_rate ~ year + ai_adoption_rate + industry + 
    country, data = AI_Impact_nona)

Residuals:
     Min       1Q   Median       3Q      Max 
-30.9961 -10.0059   0.6298   9.3155  27.4944 

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)  
(Intercept)           1273.18734 1162.47210   1.095   0.2749  
year                    -0.61809    0.57506  -1.075   0.2839  
ai_adoption_rate        -0.01554    0.04140  -0.375   0.7078  
industryEducation       -4.01746    4.66596  -0.861   0.3904  
industryFinance         -0.90974    4.95570  -0.184   0.8546  
industryGaming          -1.08993    4.21329  -0.259   0.7962  
industryHealthcare      -4.34091    4.79464  -0.905   0.3665  
industryLegal           -2.46743    4.81411  -0.513   0.6089  
industryManufacturing    2.39900    4.74296   0.506   0.6136  
industryMarketing       -9.15181    4.59830  -1.990   0.0481 *
industryMedia           -7.30683    4.17454  -1.750   0.0818 .
industryRetail          -6.04044    4.53904  -1.331   0.1850  
countryCanada           12.05361    5.07190   2.377   0.0185 *
countryChina             8.91410    4.79659   1.858   0.0648 .
countryFrance            8.52955    4.62530   1.844   0.0668 .
countryGermany           8.69716    5.00203   1.739   0.0838 .
countryIndia             7.15881    4.65214   1.539   0.1256  
countryJapan             7.41173    4.80797   1.542   0.1249  
countrySouth Korea       7.27710    4.82130   1.509   0.1330  
countryUK                6.64916    4.82398   1.378   0.1698  
countryUSA               2.29563    4.72688   0.486   0.6278  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 13.76 on 179 degrees of freedom
Multiple R-squared:  0.1193,    Adjusted R-squared:  0.02087 
F-statistic: 1.212 on 20 and 179 DF,  p-value: 0.2488

Exploration 2: Linear Regression analysis:

Equation for my model:

job_loss_ai_rate = 1273.19 - 0.618(year) - 0.0155(ai_adoption_rate) - 4.017(industryEducation) - 0.910(industryFinance) - 1.090(industryGaming) - 4.341(industryHealthcare) - 2.467(industryLegal) + 2.399(industryManufacturing) - 9.152(industryMarketing) - 7.307(industryMedia) - 6.040(industryRetail) + 12.054(countryCanada) + 8.914(countryChina) + 8.530(countryFrance) + 8.697(countryGermany) + 7.159(countryIndia) + 7.412(countryJapan) + 7.277(countrySouth Korea) + 6.649(countryUK) + 2.296(countryUSA)

Analyzing model based on p-values:

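From the analysis above, industryMarketing (p = 0.0481) is statistically significant at the 0.05 level. Its coefficient is measured against the baseline industry, Automotive (the one industry without its own row in the table), so the model estimates job loss in Marketing to be about 9.2 percentage points lower than in Automotive. countryCanada (p = 0.0185) is also significant: job loss in Canada is estimated to be about 12.1 points higher than in the baseline country, Australia.

The rest of the variables have p > 0.05, meaning their estimated effects on job loss are not statistically distinguishable from zero in this model. The overall F-test (p = 0.2488) is also not significant, so the model as a whole does not clearly outperform an intercept-only model.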
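To illustrate how R picks the reference level for a categorical predictor such as industry (the category with no coefficient of its own, Automotive in the output above), here is a minimal sketch on made-up data. The data frame `toy` and its numbers are hypothetical and chosen only so the arithmetic is easy to follow.

```r
# Hypothetical toy data; values are made up for illustration only
toy <- data.frame(
  job_loss = c(10, 12, 20, 22, 30, 32),
  industry = c("Automotive", "Automotive", "Marketing",
               "Marketing", "Retail", "Retail")
)

# lm() treats a character column as a factor; the first level alphabetically
# ("Automotive") becomes the reference and gets no coefficient of its own.
# The intercept is the reference-group mean and each industry coefficient is
# that industry's mean minus the reference-group mean.
fit <- lm(job_loss ~ industry, data = toy)
coef(fit)

# Choosing a different reference level changes what each coefficient
# is compared against, not the fitted values
toy$industry <- relevel(factor(toy$industry), ref = "Retail")
fit_retail <- lm(job_loss ~ industry, data = toy)
coef(fit_retail)
```

A significant coefficient therefore describes a difference from the reference category, not an effect of that category in isolation.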

Adjusted R^2 values:

Adjusted R-squared = 0.02087 (low), indicating that the model explains only about 2.1% of the variation in job loss due to AI.
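As a check on where the adjusted value comes from, it can be recomputed from the Multiple R-squared with the standard formula 1 - (1 - R^2)(n - 1)/(n - p - 1). The sketch below plugs in the numbers printed in the Exploration 2 summary; the variable names are my own, and the result differs slightly from 0.02087 only because R-squared is rounded in the printout.

```r
# Values read from the Exploration 2 summary output
r_squared <- 0.1193  # Multiple R-squared (rounded as printed)
n <- 200             # observations: 179 residual df + 21 coefficients
p <- 20              # predictors, excluding the intercept

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - p - 1)
round(adj_r_squared, 3)
```

With 20 predictors and a small R-squared, the penalty removes most of the apparent fit.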

Exploration 3: Canada only, since its coefficient was statistically significant

# Filter data for Canada only
Canada_data <- AI_Impact_nona %>% filter(country == "Canada")

# Run linear regression model
lm_model <- lm(job_loss_ai_rate ~ year + industry, data = Canada_data )

# Summary of model
summary(lm_model)


Call:
lm(formula = job_loss_ai_rate ~ year + industry, data = Canada_data)

Residuals:
    Min      1Q  Median      3Q     Max 
-11.060  -3.912   0.000   3.912  11.060 

Coefficients:
                      Estimate Std. Error t value Pr(>|t|)  
(Intercept)           7109.292   2583.420   2.752   0.0332 *
year                    -3.495      1.278  -2.735   0.0340 *
industryEducation      -24.398     11.338  -2.152   0.0749 .
industryGaming         -10.944     10.707  -1.022   0.3461  
industryHealthcare     -16.270     11.392  -1.428   0.2032  
industryLegal           -6.562     10.681  -0.614   0.5615  
industryManufacturing   -2.085     13.133  -0.159   0.8791  
industryMarketing      -10.948     11.338  -0.966   0.3715  
industryMedia            8.635     13.133   0.657   0.5353  
industryRetail         -34.190     13.071  -2.616   0.0398 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 9.243 on 6 degrees of freedom
Multiple R-squared:  0.8118,    Adjusted R-squared:  0.5295 
F-statistic: 2.876 on 9 and 6 DF,  p-value: 0.1058

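In this Canada-only model, year (p = 0.0340) and industryRetail (p = 0.0398) are statistically significant, and the adjusted R^2 of 0.53 indicates that the model explains about 53% of the variation in job loss due to AI. However, the model is fitted on only 16 observations (6 residual degrees of freedom), and the overall F-test (p = 0.1058) is not significant, so these results should be read with caution.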

Exploration 3: testing visualization for Canada regression analysis

Correlation visualization for Canada to analyze the data…

# Selecting only the relevant columns for the correlation plot
selected_columns <- Canada_data[, c("job_loss_ai_rate", "year", "industry")]

plot_correlation(selected_columns)

Bar graph analysis over the years, according to industry…

Canada_data |>
  ggplot(aes(x = year, y = job_loss_ai_rate, fill = industry)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  labs(title = "Job loss Due to AI from 2020-2025 According to Industries")

The bar graph shows no Canadian data for 2024, so I did not go ahead with the Canada-only subset.
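A gap like this can also be confirmed without a plot by comparing the years present in a subset against the years expected. This is a minimal base R sketch on made-up data; `toy_canada` and its values are hypothetical, and the expected range 2020 to 2025 is taken from the years listed earlier.

```r
# Hypothetical toy data: no rows exist for 2024
toy_canada <- data.frame(
  year = c(2020, 2021, 2021, 2022, 2023, 2025),
  job_loss_ai_rate = c(35.1, 20.4, 41.0, 18.7, 48.2, 9.9)
)

expected_years <- 2020:2025

# Rows per year, including years with zero rows
rows_per_year <- table(factor(toy_canada$year, levels = expected_years))
print(rows_per_year)

# Years that are expected but never appear in the data
missing_years <- setdiff(expected_years, toy_canada$year)
missing_years
```

The same check could be run per country to see which subsets cover every year before fitting a model on them.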

Exploration 4: with respect to just country and year

# Run linear regression models
lm_model <- lm(job_loss_ai_rate ~ year + country, data =  AI_Impact_nona)
# Summary of model
summary(lm_model)


Call:
lm(formula = job_loss_ai_rate ~ year + country, data = AI_Impact_nona)

Residuals:
    Min      1Q  Median      3Q     Max 
-27.587 -10.044  -1.018  10.962  28.293 

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)  
(Intercept)        1455.8502  1113.6291   1.307   0.1927  
year                 -0.7106     0.5507  -1.290   0.1985  
countryCanada        12.2076     4.9642   2.459   0.0148 *
countryChina         10.1665     4.6699   2.177   0.0307 *
countryFrance         7.7135     4.5463   1.697   0.0914 .
countryGermany        8.8091     4.9065   1.795   0.0742 .
countryIndia          7.0528     4.5736   1.542   0.1247  
countryJapan          7.1399     4.6383   1.539   0.1254  
countrySouth Korea    6.9100     4.7200   1.464   0.1449  
countryUK             8.2868     4.7178   1.757   0.0806 .
countryUSA            2.1282     4.6732   0.455   0.6493  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 13.81 on 189 degrees of freedom
Multiple R-squared:  0.06236,   Adjusted R-squared:  0.01275 
F-statistic: 1.257 on 10 and 189 DF,  p-value: 0.2576

Going ahead with Explorations 2 and 4, and calculating averages for clearer visualization

AI_means <- AI_Impact_nona %>%  
  group_by(year, country, industry) %>% # grouping data sets based on these variables
  summarize(job_loss_ai_rate = mean(job_loss_ai_rate)) # calculating the mean for job_loss_ai_rate


AI_means2 <- AI_Impact_nona %>%
  group_by(year, country) %>% # grouping data sets based on these variables
  summarize(job_loss_ai_rate = mean(job_loss_ai_rate)) # calculating the mean again
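One detail worth knowing about `summarize()` after `group_by()`: by default it removes only the last grouping variable, so the result is still a grouped tibble and dplyr prints a message about it. The sketch below, on a made-up tibble (`toy` is hypothetical), shows the `.groups = "drop"` argument, which returns an ungrouped result instead.

```r
library(dplyr)

# Hypothetical toy data; values are made up for illustration only
toy <- tibble(
  year = c(2020, 2020, 2020, 2021),
  country = c("USA", "USA", "Canada", "USA"),
  job_loss_ai_rate = c(10, 20, 30, 40)
)

# Mean job loss per year and country, returned without any grouping
toy_means <- toy %>%
  group_by(year, country) %>%
  summarize(job_loss_ai_rate = mean(job_loss_ai_rate), .groups = "drop")

toy_means
is_grouped_df(toy_means)
```

Leaving the grouping in place is harmless for plotting and `lm()`, but dropping it avoids surprises if further dplyr verbs are applied to the summary.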

Extra regression analysis after calculating average job_loss_ai_rate for (plot1)…

# Run linear regression models
lm_model <- lm(job_loss_ai_rate ~ year + industry + country, data =  AI_means)

# Summary of model
summary(lm_model)


Call:
lm(formula = job_loss_ai_rate ~ year + industry + country, data = AI_means)

Residuals:
     Min       1Q   Median       3Q      Max 
-31.2145  -9.5189   0.7435   8.9062  27.1043 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)  
(Intercept)           1672.9500  1213.1147   1.379   0.1699  
year                    -0.8166     0.6001  -1.361   0.1756  
industryEducation       -4.9205     4.8882  -1.007   0.3157  
industryFinance         -0.2458     4.8958  -0.050   0.9600  
industryGaming           0.4682     4.2938   0.109   0.9133  
industryHealthcare      -3.0538     4.9351  -0.619   0.5370  
industryLegal           -3.6880     4.8936  -0.754   0.4522  
industryManufacturing    4.3189     4.9680   0.869   0.3860  
industryMarketing       -8.8418     4.6043  -1.920   0.0567 .
industryMedia           -7.4350     4.2789  -1.738   0.0843 .
industryRetail          -4.5661     4.7636  -0.959   0.3393  
countryCanada           12.4774     4.9138   2.539   0.0121 *
countryChina             8.6999     4.8176   1.806   0.0729 .
countryFrance            9.1434     4.5887   1.993   0.0481 *
countryGermany           8.0344     4.9729   1.616   0.1082  
countryIndia             8.3348     4.7443   1.757   0.0810 .
countryJapan             6.4985     4.9279   1.319   0.1892  
countrySouth Korea       7.9971     4.7397   1.687   0.0936 .
countryUK                7.1328     4.8261   1.478   0.1415  
countryUSA               1.4727     4.7165   0.312   0.7553  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 13.3 on 152 degrees of freedom
Multiple R-squared:  0.1487,    Adjusted R-squared:  0.04229 
F-statistic: 1.397 on 19 and 152 DF,  p-value: 0.1357

The adjusted R-squared is still low (0.0423), so the model explains only about 4.2% of the variation in average job loss. The overall p-value (0.1357) is also above 0.05.
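The overall p-value on the last line of a `summary()` printout comes from the F-statistic and its two degrees of freedom, and can be recomputed with `pf()`. The sketch below uses the values printed above (F = 1.397 on 19 and 152 DF); the variable names are my own.

```r
# Values read from the summary output above
f_stat <- 1.397   # F-statistic (rounded as printed)
df_model <- 19    # numerator degrees of freedom
df_resid <- 152   # denominator (residual) degrees of freedom

# Upper-tail probability of the F distribution: the chance of seeing an
# F-statistic at least this large if none of the predictors mattered
p_value <- pf(f_stat, df_model, df_resid, lower.tail = FALSE)
round(p_value, 3)
```

Because this value is above 0.05, the model as a whole is not distinguishable from one with no predictors, even though individual coefficients such as countryCanada fall below 0.05.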

Another extra regression analysis after calculating average job_loss_ai_rate for just country and year for (plot2) …

# Run linear regression models
lm_model <- lm(job_loss_ai_rate ~ year + country, data =  AI_means2)
# Summary of model
summary(lm_model)


Call:
lm(formula = job_loss_ai_rate ~ year + country, data = AI_means2)

Residuals:
     Min       1Q   Median       3Q      Max 
-18.3420  -6.1481   0.7986   5.1637  17.5168 

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)  
(Intercept)        1075.0872  1429.7581   0.752   0.4558  
year                 -0.5220     0.7069  -0.738   0.4639  
countryCanada        14.0157     5.4993   2.549   0.0141 *
countryChina         10.4709     5.2395   1.998   0.0515 .
countryFrance         6.2226     5.2395   1.188   0.2409  
countryGermany       10.3049     5.5066   1.871   0.0675 .
countryIndia          5.7824     5.2395   1.104   0.2754  
countryJapan          5.4882     5.2395   1.047   0.3002  
countrySouth Korea    4.8686     5.2395   0.929   0.3575  
countryUK             8.6471     5.2395   1.650   0.1055  
countryUSA           -0.1162     5.2395  -0.022   0.9824  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 9.075 on 47 degrees of freedom
Multiple R-squared:  0.219, Adjusted R-squared:  0.05285 
F-statistic: 1.318 on 10 and 47 DF,  p-value: 0.2489

Heat map analysis for filtered data set ‘AI_means’ to understand trends according to industry

ggplot(AI_means, aes(x = factor(year), y = industry, fill = job_loss_ai_rate)) +
  geom_tile(color = "white") +
  scale_fill_gradient(low = "lightyellow", high = "red") +
  labs(title = "Heatmap of AI-Driven Average Job Loss by Industry and Year",
       x = "Year", y = "Industry", fill = "Average Job Loss (%)") +
  theme_minimal()
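Since `AI_means` has one row per year, country, and industry, several countries can share the same year and industry tile in this heat map, and tiles drawn on top of each other hide the earlier ones. A possible way around this, sketched on made-up data (`toy_means` is hypothetical), is to count the rows per tile and then collapse to one mean per year and industry before plotting.

```r
library(dplyr)

# Hypothetical toy data shaped like AI_means; values are made up
toy_means <- tibble(
  year = c(2020, 2020, 2021, 2021),
  country = c("USA", "Canada", "USA", "Canada"),
  industry = c("Retail", "Retail", "Retail", "Media"),
  job_loss_ai_rate = c(10, 30, 20, 40)
)

# How many rows would land on each year-industry tile?
tile_counts <- toy_means %>%
  count(year, industry, name = "rows_per_tile")
tile_counts

# Collapse to a single mean per tile so each tile shows one value
toy_by_industry <- toy_means %>%
  group_by(year, industry) %>%
  summarize(job_loss_ai_rate = mean(job_loss_ai_rate), .groups = "drop")
toy_by_industry
```

The collapsed table could then be passed to the same `ggplot()` call in place of `AI_means`.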

# To see the year, country, and industry combinations with the highest average rates
AI_means %>%
  arrange(desc(job_loss_ai_rate))

# A tibble: 172 × 4
# Groups:   year, country [58]
    year country     industry      job_loss_ai_rate
   <dbl> <chr>       <chr>                    <dbl>
 1  2021 China       Gaming                    49.7
 2  2025 Japan       Manufacturing             49.6
 3  2024 South Korea Retail                    49.6
 4  2022 USA         Manufacturing             49.3
 5  2024 France      Gaming                    49.3
 6  2021 UK          Automotive                49.1
 7  2021 Germany     Automotive                48.5
 8  2020 India       Retail                    48.3
 9  2023 Canada      Media                     48.2
10  2021 India       Manufacturing             47.4
# ℹ 162 more rows

Heat map analysis for filtered data set ‘AI_means2’ to understand trends according to country

ggplot(AI_means2, aes(x = factor(year), y = country, fill = job_loss_ai_rate)) +
  geom_tile(color = "white") +
  scale_fill_gradient(low = "lightblue", high = "darkblue") +
  labs(title = "Heatmap of AI-Driven Average Job Loss by Country and Year",
       x = "Year", y = "Country", fill = "Average Job Loss (%)") +
  theme_minimal()

Final Visualizations…

Plot 1: AI-Driven Average Job Loss by Industry and Year

library(gifski) # For converting animation as GIF for rendering purposes

## Animated graph
graph1 <- AI_means %>% # using the AI_means data set
  ggplot(aes(x = year, y = job_loss_ai_rate, color = industry, group = industry)) +
  geom_line(linewidth = 1.2) + 
  geom_point(size = 3, alpha = 0.8) +
  theme_fivethirtyeight() + # Theme for the plot’s overall appearance
  scale_x_continuous(breaks = 2020:2025) + # Setting x-axis to display year values from 2020 to 2025.
  labs(
    title = "AI-Driven Average Job Loss by Industry and Year",
    x = "Year",
    y = "Average Job Loss Rate ",
    color = "Industry",
    caption = "Source: Stanford AI Index Report, MIT Technology Review, et al."
  ) +
   theme_dark(base_size = 13) +
  theme(
  plot.background = element_rect(fill = "black", color = NA),
  panel.background = element_rect(fill = "black", color = NA),
  legend.background = element_rect(fill = "black", color = NA),
  legend.key = element_rect(fill = "black"),
  
  plot.title = element_text(color = "white", face = "bold", size = 16),
  plot.subtitle = element_text(color = "white", size = 13),
  axis.title = element_text(color = "white"),
  axis.text = element_text(color = "gray90"),
  legend.text = element_text(color = "gray90"),
  legend.title = element_text(color = "white"),
  
  panel.grid.major = element_line(color = "gray30"),
  panel.grid.minor = element_line(color = "gray20"),
  plot.caption = element_text(color = "gray80", size = 10, hjust = 1.5) # For making the caption visible in final plot
  ) +
  scale_color_viridis_d(option = "viridis")

# Animate it with transition_reveal
graph1_animation <- graph1 +
  transition_reveal(year) + # Gradually reveals the graph over the years
  labs(subtitle = "Year: {round(frame_along)}") # A dynamic subtitle that updates for each frame

# Render the animation: 10 frames per second for 10 seconds, a 40-frame pause
# on the last frame, and an output resolution of 100 DPI
animate(
  graph1_animation,
  height = 500, width = 800,
  fps = 10, duration = 10, end_pause = 40, res = 100,
  renderer = gifski_renderer("ai_jobloss_industry.gif")
)

Plot 2: AI-Driven Average Job Loss by Country and Year

library(gifski)

# Create the base line graph
graph1 <- AI_means2 %>%
  ggplot(aes(x = year, y = job_loss_ai_rate, color = country, group = country)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 3, alpha = 0.8) +
  theme_fivethirtyeight() + # Theme for the plot’s overall appearance
  scale_x_continuous(breaks = 2020:2025) + # Setting x-axis to display year values from 2020 to 2025.
  labs(
    title = "AI-Driven Average Job Loss by Country and Year",
    x = "Year",
    y = "Average Job Loss Rate ",
    color = "Country",
    caption = "Source: Stanford AI Index Report, MIT Technology Review, et al."
  ) +
   theme_dark(base_size = 13) +
  theme(
  plot.background = element_rect(fill = "black", color = NA),
  panel.background = element_rect(fill = "black", color = NA),
  legend.background = element_rect(fill = "black", color = NA),
  legend.key = element_rect(fill = "black"),
  
  plot.title = element_text(color = "white", face = "bold", size = 16),
  plot.subtitle = element_text(color = "white", size = 13),
  axis.title = element_text(color = "white"),
  axis.text = element_text(color = "gray90"),
  legend.text = element_text(color = "gray90"),
  legend.title = element_text(color = "white"),
  
  panel.grid.major = element_line(color = "gray30"), # Customizing grid lines to a light gray color.
  panel.grid.minor = element_line(color = "gray20"),
  plot.caption = element_text(color = "gray80", size = 10, hjust = 1.5) # For making the caption visible in final plot (ChatGPT's help)
  ) +
  scale_color_viridis_d(option = "plasma")

# Animate it with transition_reveal
graph1_animation <- graph1 +
  transition_reveal(year) + # Gradually reveals the graph over the years
  labs(subtitle = "Year: {round(frame_along)}") # A dynamic subtitle that updates for each frame

# Render the animation: 10 frames per second for 10 seconds, a 40-frame pause
# on the last frame, and an output resolution of 100 DPI
animate(
  graph1_animation,
  height = 500, width = 800,
  fps = 10, duration = 10, end_pause = 40, res = 100,
  renderer = gifski_renderer("ai_jobloss_country.gif")
)

Essay in Continuation

Overall, Plot 1 shows that job loss rates across industries were, as expected, higher in 2020 than in other years, especially compared with 2023 and 2025, which saw recovery from COVID. The manufacturing, gaming, automotive, and retail sectors showed consistently higher values, in line with reports such as the one on Medium (Hewitt, 2024), which highlights jobs in customer service, manufacturing, transportation, finance, and retail as particularly vulnerable to replacement by robotics in the near future.

Looking at Plot 2 and its heat map, there was a noticeable increase in job loss between 2020 and 2021. What stood out to me was the drastic drop in job loss in the USA in 2024, compared with a peak of about 50% in Germany in the same year. However, I couldn’t find much data to support a rise in unemployment in Germany due to AI in 2024. Additionally, I wanted to explore an area chart for Plot 1 by industry, but I could not get it to work. I was also hoping to combine ‘highcharter’ with the animated graph for clarity, but I couldn’t figure out how to.
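For reference, one possible starting point for the area chart is `geom_area()`, which stacks one band per industry as long as each industry has exactly one value per year. This is only a sketch on made-up data (`toy` and its values are hypothetical), not a finished replacement for Plot 1.

```r
library(ggplot2)

# Hypothetical toy data: one value per year and industry
toy <- data.frame(
  year = rep(2020:2022, each = 2),
  industry = rep(c("Retail", "Media"), times = 3),
  job_loss_ai_rate = c(30, 20, 25, 22, 28, 18)
)

# Stacked area chart: each band's height is that industry's value
area_plot <- ggplot(toy, aes(x = year, y = job_loss_ai_rate, fill = industry)) +
  geom_area(alpha = 0.8) +
  scale_x_continuous(breaks = 2020:2022) +
  labs(
    title = "Average Job Loss by Industry (toy data)",
    x = "Year", y = "Average Job Loss (%)", fill = "Industry"
  ) +
  theme_minimal()

area_plot
```

With the real data, the averages would first need to be collapsed to one row per year and industry, since repeated year and industry pairs make the stacked bands hard to interpret.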

In summary, countries like the USA, Australia, Japan, and Germany are seeing a notable increase in job loss due to AI, with industries such as manufacturing, automotive, and retail bearing the most impact. However, the full significance of the data is still something I need to explore further.

Bibliography

https://medium.com/@generup22/15-industries-that-ai-will-severely-disrupt-by-2034-a3416e77b894
https://www.statista.com/statistics/227005/unemployment-rate-in-germany/
https://www.wsj.com/articles/it-unemployment-rises-to-5-7-as-ai-hits-tech-jobs-7726bb1b
ChatGPT: help with making the caption visible in my animated graphs, since it had disappeared.
Tutorial followed for animated graph: https://www.youtube.com/watch?v=SnCi0s0e4Io