getwd()
## [1] "C:/Users/leyla/Documents/DATA 101"
layoffs_data <- read.csv("layoffs.csv")
head(layoffs_data)
## company location total_laid_off date
## 1 Sonder SF Bay Area NA 11/10/2025
## 2 Axonius New York City 100 11/6/2025
## 3 MyBambu Memphis 141 11/5/2025
## 4 Hewlett Packard Enterprise SF Bay Area 52 11/5/2025
## 5 Indeed Austin NA 11/5/2025
## 6 TripAdvisor Boston NA 11/5/2025
## percentage_laid_off industry
## 1 1.00 Travel
## 2 0.11 Security
## 3 1.00 Finance
## 4 NA Hardware
## 5 NA HR
## 6 0.20 Travel
## source
## 1 https://skift.com/2025/11/10/sonder-shuts-down-after-marriott-termination-marking-the-end-of-a-hospitality-experiment/
## 2 https://www.calcalistech.com/ctechnews/article/sj0syry1wg
## 3 https://cbs12.com/news/local/new-west-palm-beach-fintech-firm-to-lay-off-141-employees-amid-funding-collapse-layoff-south-florida-palm-beach-county-downtown-west-palm-beach-news-mybambu-a-fintech-startup-2751-s-dixie-hwy-420-november-5-2025
## 4 https://www.sfchronicle.com/tech/article/layoffs-hpe-hitachi-vantara-21140900.php
## 5 https://www.businessinsider.com/indeed-layoffs-job-cuts-after-summer-reorg-2025-11
## 6 https://skift.com/2025/11/05/layoffs-hit-20-of-tripadvisor-viator-and-administrative-staff-scoop/
## stage funds_raised country date_added
## 1 Post-IPO 839 United States 11/10/2025
## 2 Series E 865 United States 11/7/2025
## 3 Unknown 15 United States 11/7/2025
## 4 Post-IPO 1400 United States 11/7/2025
## 5 Acquired 5 United States 11/8/2025
## 6 Post-IPO 3 United States 11/7/2025
Research Question: What factors influence the total number of layoffs among companies in 2024-2025?
This project analyzes a dataset of company layoffs compiled from publicly available news sources and published on Kaggle. The dataset consists of 4202 observations and 11 variables, where each observation represents a layoff event at a specific company. The main variables used in this analysis are the total number of employees laid off, company industry, company stage, total funds raised, country of operation, and the date of the layoff event.
The dataset comes from Kaggle, where it was originally titled the “Layoffs 2022” dataset and has been continuously updated to include more recent years. I chose this topic because of the widespread layoffs that have occurred recently and their impact on the country’s economy. Understanding which company characteristics are associated with larger layoffs can provide insight into economic conditions and organizational decision-making during periods of uncertainty.
To prepare the dataset for multiple regression analysis, some data wrangling is needed. First, the date variable needs to be converted from character format into a proper date format to allow filtering by year. Then, the data need to be filtered to include only observations from 2024 and 2025 so that they align with the research question. Next, only the variables relevant to the analysis are selected. Finally, observations with missing values in the dependent variable or in any of the selected independent variables are removed so that the regression model is estimated on complete cases.
Step 1
# Convert the date variable into date format
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.5.2
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
layoffs <- layoffs_data |>
  mutate(
    date = as.Date(date, format = "%m/%d/%Y"),
    year = format(date, "%Y")
  )
str(layoffs)
## 'data.frame': 4202 obs. of 12 variables:
## $ company : chr "Sonder" "Axonius" "MyBambu" "Hewlett Packard Enterprise" ...
## $ location : chr "SF Bay Area" "New York City" "Memphis" "SF Bay Area" ...
## $ total_laid_off : int NA 100 141 52 NA NA 350 165 14000 388 ...
## $ date : Date, format: "2025-11-10" "2025-11-06" ...
## $ percentage_laid_off: num 1 0.11 1 NA NA 0.2 0.18 0.1 0.01 0.45 ...
## $ industry : chr "Travel" "Security" "Finance" "Hardware" ...
## $ source : chr "https://skift.com/2025/11/10/sonder-shuts-down-after-marriott-termination-marking-the-end-of-a-hospitality-experiment/" "https://www.calcalistech.com/ctechnews/article/sj0syry1wg" "https://cbs12.com/news/local/new-west-palm-beach-fintech-firm-to-lay-off-141-employees-amid-funding-collapse-la"| __truncated__ "https://www.sfchronicle.com/tech/article/layoffs-hpe-hitachi-vantara-21140900.php" ...
## $ stage : chr "Post-IPO" "Series E" "Unknown" "Post-IPO" ...
## $ funds_raised : num 839 865 15 1400 5 3 357 724 8100 227 ...
## $ country : chr "United States" "United States" "United States" "United States" ...
## $ date_added : chr "11/10/2025" "11/7/2025" "11/7/2025" "11/7/2025" ...
## $ year : chr "2025" "2025" "2025" "2025" ...
Step 2
# Filter dataset to include only 2024 - 2025 layoffs
specific_layoffs <- layoffs |>
  filter(year %in% c("2024", "2025"))
head(specific_layoffs)
## company location total_laid_off date
## 1 Sonder SF Bay Area NA 2025-11-10
## 2 Axonius New York City 100 2025-11-06
## 3 MyBambu Memphis 141 2025-11-05
## 4 Hewlett Packard Enterprise SF Bay Area 52 2025-11-05
## 5 Indeed Austin NA 2025-11-05
## 6 TripAdvisor Boston NA 2025-11-05
## percentage_laid_off industry
## 1 1.00 Travel
## 2 0.11 Security
## 3 1.00 Finance
## 4 NA Hardware
## 5 NA HR
## 6 0.20 Travel
## source
## 1 https://skift.com/2025/11/10/sonder-shuts-down-after-marriott-termination-marking-the-end-of-a-hospitality-experiment/
## 2 https://www.calcalistech.com/ctechnews/article/sj0syry1wg
## 3 https://cbs12.com/news/local/new-west-palm-beach-fintech-firm-to-lay-off-141-employees-amid-funding-collapse-layoff-south-florida-palm-beach-county-downtown-west-palm-beach-news-mybambu-a-fintech-startup-2751-s-dixie-hwy-420-november-5-2025
## 4 https://www.sfchronicle.com/tech/article/layoffs-hpe-hitachi-vantara-21140900.php
## 5 https://www.businessinsider.com/indeed-layoffs-job-cuts-after-summer-reorg-2025-11
## 6 https://skift.com/2025/11/05/layoffs-hit-20-of-tripadvisor-viator-and-administrative-staff-scoop/
## stage funds_raised country date_added year
## 1 Post-IPO 839 United States 11/10/2025 2025
## 2 Series E 865 United States 11/7/2025 2025
## 3 Unknown 15 United States 11/7/2025 2025
## 4 Post-IPO 1400 United States 11/7/2025 2025
## 5 Acquired 5 United States 11/8/2025 2025
## 6 Post-IPO 3 United States 11/7/2025 2025
At this point, 907 of the original 4202 observations remain.
Step 3
# Select only variables needed
specific_layoffs <- specific_layoffs |>
  select(total_laid_off, stage, industry, funds_raised, country)
head(specific_layoffs)
## total_laid_off stage industry funds_raised country
## 1 NA Post-IPO Travel 839 United States
## 2 100 Series E Security 865 United States
## 3 141 Unknown Finance 15 United States
## 4 52 Post-IPO Hardware 1400 United States
## 5 NA Acquired HR 5 United States
## 6 NA Post-IPO Travel 3 United States
Step 4
# Remove observations with a missing value in any selected variable
specific_layoffs <- na.omit(specific_layoffs)
str(specific_layoffs)
## 'data.frame': 512 obs. of 5 variables:
## $ total_laid_off: int 100 141 52 350 165 14000 388 85 1400 600 ...
## $ stage : chr "Series E" "Unknown" "Post-IPO" "Series F" ...
## $ industry : chr "Security" "Finance" "Hardware" "Logistics" ...
## $ funds_raised : num 865 15 1400 357 724 8100 227 335 2100 10700 ...
## $ country : chr "United States" "United States" "United States" "India" ...
## - attr(*, "na.action")= 'omit' Named int [1:395] 1 5 6 12 13 19 20 21 22 26 ...
## ..- attr(*, "names")= chr [1:395] "1" "5" "6" "12" ...
The data are now clean and ready for modeling. Removing the rows with missing values reduces the sample from 907 to 512 observations, so the multiple regression is fitted on complete cases only.
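Before dropping rows, it can help to see which variables are responsible for the missing values. The following is a minimal sketch on a small made-up data frame (`toy` and its values are hypothetical, not taken from the layoffs data); the same two calls could be applied to `specific_layoffs` right before `na.omit()`.

```r
# Toy stand-in for the selected columns (values are made up for illustration)
toy <- data.frame(
  total_laid_off = c(NA, 100, 141, 52, NA),
  stage          = c("Post-IPO", "Series E", "Unknown", "Post-IPO", "Acquired"),
  funds_raised   = c(839, 865, 15, NA, 5)
)

# Number of missing values in each column
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# Number of rows that na.omit() would drop: rows with at least one NA
rows_dropped <- sum(!complete.cases(toy))
print(rows_dropped)
```

Note that `read.csv()` with its default settings keeps blank character fields as empty strings rather than `NA`, so a blank stage or industry would not be counted by this check.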
I am using multiple linear regression to examine how several factors influence the total number of layoffs in 2024-2025. The dependent variable is total_laid_off, and the independent variables are stage, industry, funds_raised, and country. Multiple regression is appropriate here because it allows us to assess the simultaneous effect of several predictors on a continuous outcome. After fitting the model, I will examine the regression coefficients, p-values, and R-squared to interpret the strength and significance of each predictor. I will also check the regression assumptions (linearity, independence of observations, homoscedasticity, normality of residuals, and absence of multicollinearity) using diagnostic plots to assess whether the model is valid.
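The standard diagnostic plots do not show multicollinearity directly, so it needs a separate check. One possible approach, sketched here on simulated data, is to compute a variance inflation factor (VIF) for each column of the design matrix as the diagonal of the inverse correlation matrix. The data frame `toy` and the variables `x1`, `x2`, and `g` are hypothetical, and `x2` is built to be nearly a copy of `x1` so that the check has something to find.

```r
set.seed(1)

# Simulated data: x2 is almost identical to x1, so the two are collinear by construction
toy <- data.frame(
  y  = rnorm(40),
  x1 = rnorm(40),
  g  = factor(rep(c("a", "b", "c", "d"), each = 10))
)
toy$x2 <- toy$x1 + rnorm(40, sd = 0.1)

fit <- lm(y ~ x1 + x2 + g, data = toy)

# Design matrix without the intercept column (factors appear as dummy columns)
X <- model.matrix(fit)[, -1, drop = FALSE]

# VIF of each column = corresponding diagonal entry of the inverse correlation matrix
vif_values <- diag(solve(cor(X)))
names(vif_values) <- colnames(X)
print(round(vif_values, 1))
```

A common rule of thumb treats values above roughly 5 to 10 as a sign of problematic collinearity. This sketch reports one value per dummy column; for factors with many levels, such as industry and country, the `vif()` function in the car package reports a generalized VIF per term instead, which may be easier to interpret.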
Multiple Regression Model
model <- lm(total_laid_off ~ stage + industry + funds_raised + country,
            data = specific_layoffs)
summary(model)
##
## Call:
## lm(formula = total_laid_off ~ stage + industry + funds_raised +
## country, data = specific_layoffs)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3457.3 -333.1 -43.9 127.0 18600.0
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -193.62688 1199.35883 -0.161 0.871820
## stagePost-IPO 332.06647 281.42846 1.180 0.238666
## stagePrivate Equity -153.93357 621.41338 -0.248 0.804471
## stageSeed -390.60042 1187.80963 -0.329 0.742433
## stageSeries A -226.39298 491.23518 -0.461 0.645124
## stageSeries B -245.64095 379.78677 -0.647 0.518108
## stageSeries C -382.37342 411.34474 -0.930 0.353105
## stageSeries D -111.95499 393.61076 -0.284 0.776214
## stageSeries E -131.43609 403.92369 -0.325 0.745035
## stageSeries F -187.37937 500.49810 -0.374 0.708298
## stageSeries G -153.73381 801.50550 -0.192 0.847983
## stageSeries H -78.83028 986.77944 -0.080 0.936364
## stageSeries I -668.51993 1664.39509 -0.402 0.688130
## stageSeries J -202.48817 1209.99436 -0.167 0.867175
## stageSubsidiary -431.60729 1724.03698 -0.250 0.802436
## stageUnknown -130.60218 330.58378 -0.395 0.692987
## industryAI -306.65303 938.03067 -0.327 0.743891
## industryConstruction -36.65998 1821.73233 -0.020 0.983954
## industryConsumer -319.37652 850.85427 -0.375 0.707574
## industryCrypto -234.28052 968.72470 -0.242 0.809014
## industryData -89.93035 955.00906 -0.094 0.925019
## industryEducation -236.64267 942.18161 -0.251 0.801804
## industryEnergy -20.78344 930.07812 -0.022 0.982182
## industryFinance -77.03607 855.86559 -0.090 0.928321
## industryFitness -247.21021 1279.16415 -0.193 0.846845
## industryFood -8.03330 882.76833 -0.009 0.992743
## industryHardware 2706.21615 928.59187 2.914 0.003747 **
## industryHealthcare -246.48571 868.23443 -0.284 0.776626
## industryHR 160.08133 1003.37938 0.160 0.873315
## industryInfrastructure 1338.86943 1054.64419 1.269 0.204936
## industryLegal 98.89755 1841.92741 0.054 0.957205
## industryLogistics -101.70596 978.81261 -0.104 0.917290
## industryManufacturing -1.53572 1048.74395 -0.001 0.998832
## industryMarketing -421.62460 883.19803 -0.477 0.633326
## industryMedia -263.24557 935.66912 -0.281 0.778579
## industryOther 363.89892 855.30386 0.425 0.670709
## industryProduct -129.47594 1106.80935 -0.117 0.906928
## industryReal Estate -411.60444 1060.60831 -0.388 0.698142
## industryRecruiting -288.14756 1017.96102 -0.283 0.777262
## industryRetail 150.97161 865.71495 0.174 0.861640
## industrySales -4.55169 923.12248 -0.005 0.996068
## industrySecurity -174.57084 918.40715 -0.190 0.849334
## industrySupport -405.52710 1053.74517 -0.385 0.700540
## industryTransportation -123.92019 878.48924 -0.141 0.887887
## industryTravel -284.22712 981.20904 -0.290 0.772205
## funds_raised 0.07871 0.02349 3.351 0.000875 ***
## countryAustria 588.23145 1893.79997 0.311 0.756245
## countryBelgium 764.69204 1865.46391 0.410 0.682064
## countryCanada 555.92338 919.10915 0.605 0.545591
## countryCayman Islands 583.01267 1884.24222 0.309 0.757153
## countryChile 399.14155 1856.88455 0.215 0.829904
## countryCyprus 613.68717 1879.55262 0.327 0.744196
## countryCzech Republic 816.03452 1895.08491 0.431 0.666966
## countryEstonia -159.47584 1834.01289 -0.087 0.930747
## countryFrance 715.13751 1303.14565 0.549 0.583437
## countryGermany 852.37733 939.85263 0.907 0.364943
## countryIndia 591.02907 860.05638 0.687 0.492321
## countryIndonesia 432.70185 1259.73526 0.343 0.731397
## countryIreland 6.59575 1835.29372 0.004 0.997134
## countryIsrael 311.75301 869.23519 0.359 0.720028
## countryKenya 1476.92186 1855.88014 0.796 0.426574
## countryNetherlands 240.42819 1071.15318 0.224 0.822506
## countryNigeria 482.80395 1086.94769 0.444 0.657129
## countryNorway 177.88947 1432.79390 0.124 0.901249
## countryPoland 234.85095 1841.40261 0.128 0.898572
## countryPortugal 1040.36506 1964.27119 0.530 0.596626
## countrySingapore 129.32846 1079.31649 0.120 0.904677
## countrySpain 230.42318 1274.64388 0.181 0.856628
## countrySweden 650.30445 1042.77904 0.624 0.533197
## countrySwitzerland 416.57286 1953.87725 0.213 0.831267
## countryUnited Kingdom 1050.39704 1026.80181 1.023 0.306881
## countryUnited States 554.43090 830.16976 0.668 0.504579
## countryUruguay -119.81140 1634.58957 -0.073 0.941603
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1613 on 439 degrees of freedom
## Multiple R-squared: 0.1797, Adjusted R-squared: 0.04521
## F-statistic: 1.336 on 72 and 439 DF, p-value: 0.04381
The model is statistically significant overall (F-test p = 0.044), meaning that at least one predictor contributes to explaining layoffs. However, the R-squared value is 0.18, indicating that the model explains only 18% of the variation in layoffs, and the adjusted R-squared is just 0.045 once the 72 estimated predictor coefficients are taken into account. These low values suggest that layoffs are influenced by other factors not captured in this dataset. Among the predictors in the model, funds_raised and the hardware industry indicator were statistically significant. Specifically, higher funding was associated with a small but significant increase in layoffs (about 0.08 additional layoffs per one-unit increase in funds_raised, holding the other predictors constant), while companies in the hardware industry had substantially more layoffs (about 2,706 more on average) than companies in the reference industry, which is the baseline category omitted from the coefficient table. The other variables, including company stage, the remaining industries, and country, were not statistically significant, so their effects on layoffs could not be reliably detected.
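Each coefficient for a categorical predictor is a difference from a baseline level, which `lm()` picks as the first factor level (alphabetical by default for character columns). The sketch below uses a small made-up data frame (`toy`, with hypothetical values) to show how to find that baseline and how to change it with `relevel()`.

```r
# Made-up data: two companies per industry
toy <- data.frame(
  laid_off = c(10, 12, 30, 34, 20, 22),
  industry = c("Retail", "Retail", "Hardware", "Hardware", "Finance", "Finance")
)

# lm() turns character predictors into factors; the first level is the baseline
baseline_default <- levels(factor(toy$industry))[1]
print(baseline_default)

# Coefficients here are differences from the default baseline
fit_default <- lm(laid_off ~ industry, data = toy)
print(coef(fit_default))

# Choose a different baseline and refit; the coefficients now compare against "Retail"
toy$industry <- relevel(factor(toy$industry), ref = "Retail")
fit_retail <- lm(laid_off ~ industry, data = toy)
print(coef(fit_retail))
```

The fitted values are identical in both versions; only the comparison that each coefficient expresses changes. Applying `levels(factor(specific_layoffs$industry))[1]` would show which industry the hardware coefficient is being compared against.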
Model Diagnostics
par(mfrow = c(2,2))
plot(model)
## Warning: not plotting observations with leverage one:
## 76, 150, 151, 220, 297, 308, 356, 364, 406, 412, 440, 448, 467, 480, 496
## Warning in sqrt(crit * p * (1 - hh)/hh): NaNs produced
## Warning in sqrt(crit * p * (1 - hh)/hh): NaNs produced
The Residuals vs Fitted plot shows how the errors (the differences between the actual layoffs and what the model predicts) change as the predicted layoffs increase. Ideally, these errors should be spread evenly, but in this case they get bigger as the predicted layoffs get larger. This means the model’s predictions are less reliable for companies with very high layoffs. The Q-Q plot checks whether the errors follow a normal distribution. Most points follow the expected line, showing that the errors are mostly normal. However, a few points at the top right do not follow the line, indicating some unusual or extreme values. The Scale-Location plot shows whether the spread of the errors stays consistent across all predicted values. Here, the spread increases with larger predicted layoffs, meaning the model’s accuracy decreases as layoffs grow. Finally, the Residuals vs Leverage plot highlights data points that have a strong influence on the model’s results. Some points stand out as more influential, which means those companies might affect the model’s findings more than others. R also warned that 15 observations have a leverage of one and were therefore not plotted. Overall, the diagnostic plots suggest that while the model fits most of the data reasonably well, it is less precise for very large layoffs and some companies have a strong impact on the results. These are common challenges when working with real-world data, so the model’s results should be interpreted with care.
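The warning about observations with leverage one deserves a closer look. One common cause, which I have not verified for this dataset, is a factor level that appears only once: the model then fits that single observation exactly. The sketch below reproduces the situation on made-up data (`toy` and its values are hypothetical) and shows how to list such observations and levels.

```r
# Made-up data: country "C" appears only once
toy <- data.frame(
  laid_off = c(100, 120, 90, 300, 310, 5000),
  country  = c("A", "A", "A", "B", "B", "C")
)

fit <- lm(laid_off ~ country, data = toy)

# Leverage (hat value) of each observation; a value of 1 means the
# fitted value is forced to equal the observed value
lev <- hatvalues(fit)
leverage_one <- which(lev > 1 - 1e-8)
print(leverage_one)

# Factor levels that occur exactly once
counts <- table(toy$country)
singletons <- names(counts)[counts == 1]
print(singletons)
```

If the same pattern holds in the layoffs data, grouping rare countries or industries into a combined category before fitting would be one way to avoid coefficients that rest on a single company.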
The analysis shows that funding and industry type (particularly hardware) are significantly associated with layoffs in 2024-2025. Companies with higher funding and those in the hardware industry tend to experience more layoffs. However, the model explains only about 18% of the variation, suggesting that many other factors also play a role. The diagnostic plots indicate that the model fits reasonably well but is less precise for very large layoffs, and some companies have a strong influence on the results.
While it might seem surprising that companies with higher funding would experience more layoffs, research and business reporting suggest that well-funded firms may expand rapidly during boom periods, hiring aggressively, and later adjust their workforce when growth slows, revenue falls short of expectations, or investors demand greater efficiency (Vedantam, 2023). This provides a real-world explanation for the results of our model.
These results highlight that layoffs are complex and influenced by multiple variables. For future research, it would be useful to incorporate more detailed financial indicators, company size, or market conditions. Exploring alternative modeling approaches may also improve prediction accuracy and provide deeper insights.
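As one illustration of an alternative modeling approach, the outcome could be log-transformed before fitting, which is a standard remedy when the spread of the errors grows with the predicted value. This is only a sketch on simulated data, not a result from the layoffs dataset: `toy`, `funds`, and the true coefficients 3 and 0.2 are assumptions chosen for the example, and the transformation requires a strictly positive outcome.

```r
set.seed(42)
n <- 200

# Simulated predictor and an outcome with multiplicative noise,
# so the spread of laid_off grows with its mean
toy <- data.frame(funds = runif(n, 0, 10))
toy$laid_off <- exp(3 + 0.2 * toy$funds + rnorm(n, sd = 0.5))

# Model on the original scale and on the log scale
fit_raw <- lm(laid_off ~ funds, data = toy)
fit_log <- lm(log(laid_off) ~ funds, data = toy)

# On the log scale, the slope is read as an approximate proportional
# change in the outcome per one-unit increase in the predictor
print(coef(fit_log))
```

Comparing `plot(fit_raw)` with `plot(fit_log)` shows how the diagnostic plots change under the transformation. For the layoffs data, the equivalent call would replace `total_laid_off` with `log(total_laid_off)` in the model formula.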
Layoffs Dataset. Kaggle. https://www.kaggle.com/datasets/swaptr/layoffs-2022?resource=download
Vedantam, Kerthee (2023). Crunchbase News. https://news.crunchbase.com/layoffs/companies-raised-funding-2021-gopuff-chime/?utm_source=chatgpt.com