The Effects of the 2020 Pandemic on the Tech Industry
Author
A Warsaw
The Global Pandemic
On March 11, 2020, the World Health Organization (WHO) declared COVID-19 a pandemic. A number of precautionary measures followed across the globe, from border closures and travel freezes to quarantine measures put in place to slow the spread of the virus (which had no vaccine at the time), while medical experts worked frantically on a vaccine to return things to normalcy as soon as possible.
Many people had to drastically change their lives to a more remote style of living, which drove the rise in popularity of remote jobs and remote operations (school, events, etc.). Unfortunately, not everyone was able to survive these changes. Many people lost their jobs, homes, loved ones, and even their own lives due to this unforeseen catastrophe. The ripple effects of the pandemic’s trying market still affect our current economy through waves of unemployment driven by mass layoffs in tech, an industry that saw massive profits during the peak of the pandemic era.
Review of Data Set
The data set that I will be observing here covers tech layoffs from March 11, 2020 to April 21, 2025. This gives us the opportunity to observe the ripple effects mentioned earlier and potentially draw conclusions about the possible causes of the mass tech layoffs. The data set includes the following variables:
Company
Location
Total Laid Off (Total # of workers laid off)
Date
Percentage Laid Off (Total Percentage of workers laid off)
Industry
Source
Stage (How far developed the company is, which will not be used in this document)
Funds Raised
Country
As you can see, the original author has done a thorough job of preparing this dataset, even including the source for the information gathered for each individual row as a variable in the set itself. If you would like to take a look at the data set as well, it was created by Roger Lee and the source is https://layoffs.fyi/
Getting Started
For starters, you always want to load your libraries at the beginning. I will be using functions found in the tidyverse package for the majority of this document. To be on the safe side, I will also load zoo for additional functions and a color palette package just in case I decide to jazz up the visualizations!
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.2 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(RColorBrewer)
library(zoo)
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
Now, I will need to grab the dataset, saved as layoffs.csv in my directory, and give the new tibble (in other words, a new data table) the name “layoffs”.
Rows: 4083 Columns: 11
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (10): company, location, date, percentage_laid_off, industry, source, st...
dbl (1): total_laid_off
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
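The chunk that produced the message above is not shown. As a minimal sketch of what such a load could look like, the example below reads a small inline sample with readr. The sample rows, the name `layoffs_demo`, and the explicit `col_types` are assumptions made so the sketch runs on its own; they are not taken from the original data.

```r
library(readr)

# Hypothetical stand-in rows so the sketch runs without the real file.
# With the real data, the first argument would be the path "layoffs.csv".
sample_csv <- I("company,total_laid_off,date,percentage_laid_off
ExampleCo,120,3/15/2021,10%
SampleSoft,,7/1/2022,25%
")

# Declaring the column types up front also quiets the
# "Column specification" message that readr prints by default.
layoffs_demo <- read_csv(
  sample_csv,
  col_types = cols(
    company = col_character(),
    total_laid_off = col_double(),
    date = col_character(),
    percentage_laid_off = col_character()
  )
)

layoffs_demo
```

Empty fields, such as the missing total for the second sample row, are read in as NA, which is what the later cleaning step filters on.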
Now that I have the dataset loaded into the global environment (in other words, I am now able to code using the data set), I will start cleaning the data. But first, let’s take a quick look at the data:
head(layoffs)
# A tibble: 6 × 11
company location total_laid_off date percentage_laid_off industry source
<chr> <chr> <dbl> <chr> <chr> <chr> <chr>
1 ZoomInfo Vancouv… 150 6/9/… 4% Sales https…
2 Playtika Tel Avi… 90 6/5/… <NA> Consumer https…
3 Airtime New Yor… 25 6/4/… 43% Consumer https…
4 Microsoft Seattle 305 6/2/… 3% Other https…
5 Hims & Hers SF Bay … 68 5/30… 4% Healthc… https…
6 Business In… New Yor… NA 5/29… 20% Media https…
# ℹ 4 more variables: stage <chr>, funds_raised <chr>, country <chr>,
# date_added <chr>
Cleaning the Data
Before we get started with any visualizations, we need to ensure that the data set is ready for use! For my observations, I will only be using the variables company, total_laid_off, date, and percentage_laid_off. Looking at these variables, I notice that there are missing values and that the percentage_laid_off variable needs the percent sign removed. The date also needs to be reformatted so that it is recognized as a date.
To do just that, I will create a new tibble from the layoffs tibble, removing the percent signs and filtering out rows with missing values. The date will be reformatted later, when I create the year variable.
layoffsnew <- layoffs |>
  select(company, total_laid_off, date, percentage_laid_off) # Creates a new tibble and only grabs the desired 4 variables
layoffsnew$percentage_laid_off <- gsub("%", "", layoffsnew$percentage_laid_off) # Removes the percent signs from percentage_laid_off
layoffsfinal <- layoffsnew |>
  filter(!is.na(total_laid_off), !is.na(percentage_laid_off)) # Removes all rows with missing values; the final tibble is prepared
One last thing before moving to the next step: I notice that the percentage_laid_off variable is being recognized as a character instead of a numerical value. This will be an issue when attempting to use this information for plotting, so let’s fix this before moving forward!
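The conversion chunk itself is not shown here. A minimal sketch of the step, assuming it relies on `as.numeric()` followed by a `str()` check; the data frame `demo` and its values are hypothetical stand-ins for layoffsfinal:

```r
# Hypothetical rows standing in for layoffsfinal after the "%" has been removed.
demo <- data.frame(
  company = c("ExampleCo", "SampleSoft"),
  total_laid_off = c(120, 45),
  percentage_laid_off = c("10", "25"),
  stringsAsFactors = FALSE
)

# Convert the character column to numeric so it can be used for plotting.
demo$percentage_laid_off <- as.numeric(demo$percentage_laid_off)

# str() lists each column with its type; percentage_laid_off is the last
# column here and is reported as "num" once the conversion has been applied.
str(demo)
```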
Now you can see that the last line says “num” instead of “chr”, which means that the values in percentage_laid_off are now recognized as numeric.
Global Pandemic and the Ripple Effects of Tech Mass Layoffs
Now that we have our final tibble prepared, we can discuss how we would like to observe the information. For a quick visualization of the different companies that experienced mass layoffs, I would recommend a tree map. To see trends across companies over time, an alluvial plot or even a bar chart could work. I will take this time to go through a number of ways to observe this data set.
Observing Tech Layoffs from 2020-2025
For the first visualization, I would like to see which companies laid off a higher percentage of their workforce during 2020-2025 in comparison to their counterparts, using a tree map.
To do so, I will use the information in my tibble layoffsfinal together with the treemap library.
library(treemap)
treemap(layoffsfinal,
        index = "company",
        vSize = "total_laid_off", # The size of the box is determined by the total laid off variable
        vColor = "percentage_laid_off", # The color code of the box is determined by percentage laid off
        type = "manual",
        palette = "RdYlBu", # Color palette used for this chart
        title = "Total layoffs and Percentage layoffs by Company from 2020-2025",
        title.legend = "Percentage of Layoffs") # Title for the legend
Source: Roger Lee, https://layoffs.fyi/
Now I know this is not necessarily the easiest visual display to read, but we can still pull very interesting information from it. Looking at this tree map, you can immediately spot some of the best-known big-name tech companies. They are known to employ a large number of people, but according to this visual they have also laid off a large number of people from 2020-2025 (the size of a box tells us how many more people were laid off at a given company in comparison to the others). By contrast, a few of the smaller boxes have very blue hues. From that you can infer that, although those companies laid off a higher percentage of their workforce than the bigger tech companies, they did not lay off nearly as many people.
Might I also state that I love that this visual depicts what feels like a black hole, re-imagining what it may feel like for someone who has been part of a mass layoff and is having a hard time getting back into the workforce. Mass tech layoffs are a serious issue that must be faced. But to get a better picture of this issue, let’s take a look at the data from another perspective.
Trend of Layoffs from 2020-2025
What if I would like to see the average number of layoffs for each year? For that, I will need to create one final tibble titled avg_layoffs, and also create a new variable that only shows the year.
First, let’s create the new variable:
layoffsfinal <- layoffsfinal |>
  mutate(year = format(as.Date(date, format = "%m/%d/%Y"), "%Y")) # Creates a new variable for year
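As a quick illustration of how this conversion behaves, the sketch below applies the same format string to a few hypothetical date strings (they are not taken from the dataset), including one that does not match:

```r
# Hypothetical date strings in the month/day/year layout used by the dataset.
dates <- c("3/15/2021", "12/1/2023", "not a date")

# Same two steps as in the mutate() call: parse the text, then keep only the year.
parsed <- as.Date(dates, format = "%m/%d/%Y")
years <- format(parsed, "%Y")

years

# Strings that do not match the format become NA rather than raising an error,
# so counting the NAs is a cheap sanity check after the conversion.
sum(is.na(parsed))
```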
Next, creating the new tibble:
avg_layoffs <- layoffsfinal |>
  group_by(year) |> # Groups all the data by the year
  summarise(avg_total_laid_off = mean(total_laid_off, na.rm = TRUE)) # Calculates the mean of total_laid_off for each year
Also, to ensure that the years display appropriately in order in the chart:
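The ordering chunk is not shown here. A minimal sketch of one way to do it, assuming the `factor()` approach with the years listed in chronological order; `avg_demo` and its values are hypothetical:

```r
library(dplyr)

# Hypothetical yearly averages, deliberately listed out of order.
avg_demo <- tibble(
  year = c("2023", "2020", "2022", "2021"),
  avg_total_laid_off = c(300, 150, 180, 200)
)

# Turn year into a factor whose levels are listed chronologically,
# so ggplot2 places them on the x axis in this order.
avg_demo <- avg_demo |>
  mutate(year = factor(year, levels = c("2020", "2021", "2022", "2023")))

levels(avg_demo$year)
```

Four-digit year strings already sort chronologically as text, so this step mainly acts as a safeguard and makes the intended order explicit.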
ggplot(avg_layoffs, aes(x = year, y = avg_total_laid_off, group = 1)) +
  geom_line(color = "firebrick") +
  geom_point(size = 2) +
  labs(title = "Trend of Avg Tech Layoffs from 2020-2025",
       x = "Year",
       y = "Average Total Laid Off",
       caption = "Source: Roger Lee, https://layoffs.fyi") +
  theme_minimal()
This shows a rising concern: the average size of a tech layoff has grown over the years. However, there appears to be an interesting dip from 2021 to 2022, which may reflect strong profits in the tech industry during that period (a possible explanation for the smaller layoffs). From then on, there has been a drastic upward trend. As of this year, we are still seeing a concerningly high level, with the yearly average rising above 800 workers per layoff among the companies in the data. This leads me to question what exactly is causing such a drastic increase. Could it still be related to a decline in profits, or are there other factors at hand?
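The line chart plots the mean of total_laid_off across the rows of each year, so it can help to separate that from how many layoffs were recorded and how many workers were affected in total. A minimal sketch with hypothetical numbers; the column names `events`, `total`, `mean_size`, and `median_size` are made up for illustration:

```r
library(dplyr)

# Hypothetical layoff records: two large ones in 2021, four small ones in 2022.
demo <- tibble(
  year = c("2021", "2021", "2022", "2022", "2022", "2022"),
  total_laid_off = c(100, 900, 50, 60, 70, 80)
)

yearly <- demo |>
  group_by(year) |>
  summarise(
    events = n(),                        # number of layoff records in the year
    total = sum(total_laid_off),         # workers laid off across all records
    mean_size = mean(total_laid_off),    # the quantity shown in the line chart
    median_size = median(total_laid_off) # less sensitive to a few huge layoffs
  )

yearly
```

In this toy data the year with more layoff records has the lower mean, which is why the average on its own does not say whether layoffs became more or less frequent.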
Before I conclude everything, I would like to observe one final display. To start off, I will need two final libraries to create the graph.
library(alluvial)
library(ggalluvial)
Next, for this graph I will have to create a range for the alluvial plot:
layoffs2 <- layoffsfinal |> # Creating a new tibble for this graph
  mutate(layoff_band = case_when( # Creates a new variable titled "layoff_band"
    total_laid_off < 50 ~ "Low", # If the layoff value is less than 50, band is low
    total_laid_off < 200 ~ "Medium", # If the layoff value is less than 200, medium
    total_laid_off < 800 ~ "High", # If the layoff value is less than 800, high
    total_laid_off >= 800 ~ "Massive", # If the layoff value is greater than or equal to 800, massive
    TRUE ~ "Unknown")) # Anything else (such as a missing value) is labeled unknown
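Because `case_when()` checks its conditions from top to bottom and uses the first one that is true, the exact cut-offs are easiest to see by running the same conditions on values that sit right at the boundaries. A small sketch, with hypothetical test values:

```r
library(dplyr)

# Hypothetical values placed on either side of each cut-off, plus a missing value.
demo <- tibble(total_laid_off = c(49, 50, 199, 200, 799, 800, NA))

demo <- demo |>
  mutate(layoff_band = case_when(
    total_laid_off < 50 ~ "Low",
    total_laid_off < 200 ~ "Medium",
    total_laid_off < 800 ~ "High",
    total_laid_off >= 800 ~ "Massive",
    TRUE ~ "Unknown" # reached only when no comparison is TRUE, e.g. for NA
  ))

demo$layoff_band
# "Low" "Medium" "Medium" "High" "High" "Massive" "Unknown"
```

So a layoff of exactly 50 lands in Medium, exactly 200 in High, and exactly 800 in Massive.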
And now that we are set up for the final visualization, let us prepare and run the final chunk for our display:
band_counts <- layoffs2 |>
  group_by(year, layoff_band) |>
  summarise(n = n(), .groups = "drop")
# Credit to ChatGPT for assistance with the lines above, which resolved errors in counting the companies that fall within each category

band_counts <- band_counts |>
  mutate(layoff_band = factor(layoff_band, levels = c("Low", "Medium", "High", "Massive", "Unknown"))) # Reorganizing the ranges to display in order

ggplot(band_counts, aes(x = year, stratum = layoff_band, alluvium = layoff_band,
                        y = n, fill = layoff_band, label = layoff_band)) +
  geom_flow(stat = "alluvium", lode.guidance = "forward", alpha = 0.7, color = "white") +
  scale_fill_manual(values = c("Low" = "#a8dadc", "Medium" = "#457b9d", "High" = "#1d3557", "Massive" = "#e63946", "Unknown" = "#bdbdbd")) +
  theme_dark() +
  labs(title = "Flow of Layoff Severity From 2020-2025",
       x = "Year",
       y = "Number of Companies",
       fill = "Layoff Severity",
       caption = "Source: Roger Lee, https://layoffs.fyi") +
  theme(legend.position = "bottom")
This gives a better visual of the severity of the layoffs in each year. For instance, a large number of companies had layoffs in the medium band between 2022 and 2023. The ranges I set were:
Low = fewer than 50 layoffs
Medium = 50 to 199 layoffs
High = 200 to 799 layoffs
Massive = 800 or more layoffs
Combining the line graph and the alluvial gives us a better opportunity to make sound judgements about the causes of tech layoffs between 2020 and 2025.
Conclusion
To give a rundown of what was done in this discussion, I have:
Created a tree map giving a quick display of tech company layoffs from 2020-2025, where the size of each box relates to the total number of layoffs and the color relates to the percentage of the company’s workers laid off.
Created a line graph showing the average number of layoffs in the tech industry for each year, to better track the trend of tech layoffs in our current economy.
Created an alluvial to display the severity of layoffs by showing the number of companies laying off workers at each level of severity (separated into 4 categories), which gives us even more information about the trend and how severe it is in our current economy.
Below I will give a breakdown of how this was accomplished:
Tree Map Breakdown
This graph was intended to be an initial display of the data set, showcasing the severity of mass layoffs in the tech industry over the years in a more visual and less statistical manner.
To create this tree map, I first focused on cleaning the original data set (the cleaned result, titled “layoffsfinal”, was originally intended to be my final tibble). In cleaning the data I:
Removed all rows with missing values so that they would not cause issues, using the “filter(!is.na())” function for both the total_laid_off and percentage_laid_off variables
Removed the percent sign (“%”) from the percentage_laid_off variable to prevent any obstructions when creating charts, using “gsub()” to replace the symbol with nothing
Converted the percentage_laid_off variable from character to numeric using the “as.numeric()” function. Then I used the “str()” function to double check that everything displayed as intended.
After this, I needed to load “library(treemap)” so that I could use the functions needed to create my tree map. Finally, for my parameters, I set the name to be the company, the size of each box to correspond to the total number of people laid off, and the color to be the percentage of the company that was laid off. Because all of these layoffs occurred between 2020 and 2025, the date was not a factor in this chart.
For the color palette of the tree map, I used the “RdYlBu” palette from library(RColorBrewer), as I felt these were the more appropriate colors to use in this scenario.
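If a company appears in more than one row of layoffsfinal (for example, after several rounds of layoffs), it may be clearer to collapse the data to one row per company before plotting, so that it is explicit how the size and color values are combined. A minimal sketch with hypothetical rows; summing total_laid_off and taking the maximum of percentage_laid_off are assumptions chosen for illustration, not the method used above:

```r
library(dplyr)

# Hypothetical rows: ExampleCo has two rounds of layoffs, SampleSoft has one.
demo <- tibble(
  company = c("ExampleCo", "ExampleCo", "SampleSoft"),
  total_laid_off = c(100, 300, 40),
  percentage_laid_off = c(5, 15, 50)
)

by_company <- demo |>
  group_by(company) |>
  summarise(
    total_laid_off = sum(total_laid_off),          # box size: all workers laid off
    percentage_laid_off = max(percentage_laid_off) # box color: largest single round
  )

by_company
```

The resulting tibble has the same column names as before, so it could be passed to `treemap()` with the same `index`, `vSize`, and `vColor` arguments.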
Line Graph Breakdown
For this graph, I first created the year variable in the “layoffsfinal” tibble using the “mutate()” function. I then created a new tibble, titled “avg_layoffs”, which did the following:
Grouped the data from “layoffsfinal” by the year variable using “group_by(year)”
Created a new variable titled “avg_total_laid_off” by summarizing the “total_laid_off” variable using “summarise([new variable name] = mean(total_laid_off, na.rm = TRUE))”
Ensured the years would show up in chronological order by using the “factor()” function for the specific variable and concatenating the years in order.
After creating the tibble, I could graph the average number of layoffs per year. Instead of using a bar graph, I wanted to see the trend over the years, so I created a line graph with a point at each year using “geom_line” and “geom_point”. I customized the color of the line and, within “ggplot()”, set the x axis to the year and the y axis to the average number of layoffs.
Alluvial Graph Breakdown
This graph was intended to show the severity of the layoffs for all companies over 2020-2025, using a color code to separate 4 different categories of severity. The y axis shows the number of companies, and the x axis shows the year.
For this graph I created two final tibbles, layoffs2 and band_counts. The purpose of “layoffs2” was to pull everything from the tibble “layoffsfinal” while also creating a new variable titled layoff_band, which was used to show the range of severity in the plot. I separated the ranges into five labels using “mutate([new variable name] = case_when())” to determine when a total layoff amount would be considered low, medium, high, massive, or unknown (the last one was set up as a safety precaution in case I made a mistake with missing data). The ranges were set up as follows:
Low = fewer than 50 layoffs
Medium = 50 to 199 layoffs
High = 200 to 799 layoffs
Massive = 800 or more layoffs
After creating the new variable “layoff_band” in the tibble “layoffs2”, I was prepared to move forward. For the next tibble, “band_counts”, I credit ChatGPT with assisting me. Prior to creating this tibble, I was running into consistent errors because I had set up my alluvial improperly: I was trying to ensure that the number of companies mapped to the y axis and that the fill reflected each company’s severity band (“layoff_band”). The AI recommended I create this tibble by grouping the data by both year and “layoff_band” using the function “group_by()”. It also recommended I add the “summarise(n = n(), .groups = "drop")” line, which admittedly I am not fully sure what it did to assist me.
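For reference, here is a small sketch of what that `summarise()` line does, using hypothetical rows: `n()` counts the rows in each year and band combination, and `.groups = "drop"` returns an ordinary, ungrouped tibble so later steps are not silently applied group by group.

```r
library(dplyr)

# Hypothetical rows: one row per layoff record, already labeled with a band.
demo <- tibble(
  year = c("2022", "2022", "2022", "2023"),
  layoff_band = c("Low", "Low", "High", "Low")
)

band_counts_demo <- demo |>
  group_by(year, layoff_band) |>
  summarise(n = n(), .groups = "drop") # one row per (year, band) with its count

band_counts_demo

# FALSE: the grouping has been removed from the result.
is_grouped_df(band_counts_demo)

# count() is a shorthand for the same group_by() + summarise(n = n()) pattern.
count(demo, year, layoff_band)
```

The count column `n` is what the alluvial plot maps to the y axis, which is why the plot needs this summarised tibble rather than the row-per-record data.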
Before graphing, I also converted the severity bands (“layoff_band”) to a factor so that the legend at the bottom of the graph would show them in the proper order.
For my graph I used a dark theme this time, to allow the contrasting bright colors to shine even more. I also manually changed the colors to a palette that I thought would look nice for this display. For the alluvium, stratum, fill, and legend I used the “layoff_band” variable. For the x axis I used the year, and for the y axis I used the number of companies.
I do notice a minor issue with this graph, that being the empty spaces at the exact years, and unfortunately I was not able to fix it in a timely manner. If I had more time, addressing this issue would be my first priority.
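One possible cause, offered as an assumption rather than a confirmed fix: in ggalluvial, `geom_flow()` draws only the ribbons between neighboring x positions, while the blocks at each x position are drawn by a separate layer, `geom_stratum()`. A minimal sketch with hypothetical counts (the data frame `demo_counts` is made up and has not been checked against the original data):

```r
library(ggplot2)
library(ggalluvial)

# Hypothetical counts in the same shape as band_counts.
demo_counts <- data.frame(
  year = rep(c("2021", "2022", "2023"), each = 2),
  layoff_band = factor(rep(c("Low", "High"), times = 3), levels = c("Low", "High")),
  n = c(10, 4, 12, 7, 8, 9)
)

p <- ggplot(demo_counts,
            aes(x = year, stratum = layoff_band, alluvium = layoff_band,
                y = n, fill = layoff_band)) +
  geom_flow(stat = "alluvium", lode.guidance = "forward",
            alpha = 0.7, color = "white") +   # ribbons between years
  geom_stratum(color = "white") +             # blocks at each year
  labs(x = "Year", y = "Number of Companies", fill = "Layoff Severity")

p
```

If the gaps in the original chart come from the missing strata, adding the `geom_stratum()` layer to the existing plot would be the first thing to try.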