Analyzing Deaths of Climbers on Mt. Everest

Can Data Analysis Provide Insights into the Deaths of Climbers?

John Karuitha

2024-01-02

1 Background

Mt. Everest, the highest mountain on the planet at 8,849 meters above sea level, attracts many climbers. The first people to conquer the mountain were Sir Edmund Percival Hillary and Tenzing Norgay, on 29 May 1953. Since then, thousands have successfully climbed the mountain. However, a substantial number of climbers succumb to accidents and altitude sickness. According to the records, over 330 climbers have died so far. In the recent past, only 1977 and 2020 (the latter due to the closure occasioned by the COVID-19 pandemic) have passed without a climber dying (R Core Team 2022).

In this analysis, I explore data from Wikipedia on the recorded number of deaths among climbers of Mt. Everest. In particular, I seek answers to the following questions.

  1. What are the major causes of death among climbers of Mt. Everest?
  2. What nationality has the highest number of deaths?
  3. Are there particular days of the year with a higher risk of death?
  4. Which months of the year have higher death rates?

2 Summary of Results

  1. Avalanches are the riskiest events, contributing to the bulk of deaths.
  2. Citizens of Nepal form the bulk of casualties, followed by Indians.
  3. The month of May has the highest casualties, probably because it is the month in which most people attempt the climb.
  4. Fridays and Saturdays have the highest numbers of deaths among climbers.

With more data, we could examine the rates of death more deeply.

3 Data

I start by reading in the data. I also fill in the one missing date of death. This record concerns Maurice Wilson, who attempted to climb the mountain in 1934. His last diary entry was dated 31 May 1934, so we can reasonably presume he died in early June. I therefore set his date of death to 1 June 1934, which fixes the year of death at 1934. His other missing details remain NA (Wikipedia 2024).

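# Packages used throughout the analysis
library(tidyverse)   # dplyr, ggplot2, readr, forcats, and the %>% pipe
library(rvest)       # read_html(), html_nodes(), html_table()
library(janitor)     # clean_names()
library(lubridate)   # mdy(), wday(), month(), year()
library(kableExtra)  # kbl(), kable_classic()

# Scrape every table on the page and keep the second one
read_html("https://en.wikipedia.org/wiki/List_of_people_who_died_climbing_Mount_Everest") %>%
  html_nodes("table") %>%
  html_table() %>%
  .[[2]] %>%
  clean_names() %>%
  # Fill in a placeholder date for Maurice Wilson, whose date of death is missing
  mutate(date = case_when(
    name == "Maurice Wilson" ~ "June 1, 1934",
    TRUE ~ date
  )) %>%
  # Parse the date strings, then derive the day of week, month, and year
  mutate(date = mdy(date)) %>%
  mutate(day_of_week = lubridate::wday(date, label = TRUE),
         month = lubridate::month(date, label = TRUE),
         year_d = lubridate::year(date)) %>%
  write_csv("everest.csv")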

#### ============
## Read in the CSV
everest_data <- read_csv("everest.csv", na = "")
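As a quick sanity check on the date handling, the sketch below applies the same lubridate helpers to a small vector of date strings and derives the day of the week, month, and year. The `toy_dates` vector is an illustrative stand-in for the scraped `date` column, not the real data; the three dates are taken from elsewhere in this article.

```r
library(lubridate)

# Illustrative stand-in for the scraped `date` column (month-day-year strings)
toy_dates <- c("June 1, 1934", "May 11, 1996", "April 25, 2015")

# mdy() returns Date values; strings it cannot parse become NA
parsed <- mdy(toy_dates)

derived <- data.frame(
  date        = parsed,
  day_of_week = wday(parsed, label = TRUE),
  month       = month(parsed, label = TRUE),
  year_d      = year(parsed)
)

# Number of strings that failed to parse; this should be zero before moving on
n_failed <- sum(is.na(parsed))

print(derived)
```

Running the same `sum(is.na(...))` check on the real `date` column would confirm that the Maurice Wilson record was the only one that needed a manual fix.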

The data consists of 332 rows and 12 variables. The table below describes ten of them; the remaining two, refs and remains_status, are carried over from the scraped table.

Variables Description

Variable        Data type   Description
Name            Character   Name of the victim.
Date            Date/Time   Date of death.
Age             Integer     Age of the victim in years at the time of death.
Expedition      Character   The climbing expedition to which the victim belonged, if any.
Nationality     Character   Nationality of the victim.
Cause of death  Character   The victim’s cause of death.
Location        Character   Approximate location of death.
Day of Week     Character   Day of the week on which the victim died.
Month           Character   Month in which the victim died.
Year_d          Integer     Year of the victim’s death.

4 Exploratory Data Analysis

4.1 Missing Data

The dataset has substantial missing data, especially for the status of the remains, the age of the climbers, and their expeditions.

everest_data %>% 
  sapply(., is.na) %>%
  colSums() %>%
  tibble(variables = names(everest_data), missing = .) %>%
  arrange(desc(missing)) %>%
  kbl(booktabs = TRUE, 
      caption = "Missing Data") %>%
  kable_classic(full_width = FALSE)

Missing Data

variables        missing
remains_status   256
age              128
expedition       35
location         17
cause_of_death   9
nationality      3
name             0
date             0
refs             0
day_of_week      0
month            0
year_d           0
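The counts above are easier to compare as shares of all rows. The following is a minimal base R sketch of that idea, assuming a small made-up data frame `toy` in place of `everest_data`; the column values are invented for illustration only.

```r
# Made-up stand-in for `everest_data` (values are illustrative, not real records)
toy <- data.frame(
  name       = c("A", "B", "C", "D"),
  age        = c(34, NA, NA, 51),
  expedition = c(NA, "X", "Y", "Z"),
  stringsAsFactors = FALSE
)

# is.na() gives a logical matrix; column sums are counts, column means are shares
missing_summary <- data.frame(
  variable  = names(toy),
  missing   = colSums(is.na(toy)),
  share     = colMeans(is.na(toy)),
  row.names = NULL
)

# Sort so the variables with the most missing values come first
missing_summary <- missing_summary[order(-missing_summary$missing), ]

print(missing_summary)
```

Applied to the real data, the same calculation would show, for example, that 256 of 332 rows (about 77%) lack a remains status and 128 of 332 (about 39%) lack an age.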

5 Exploratory Data Analysis

5.1 Deaths due to extreme events

Certain dates had high casualties due to extreme events. In this section, I examine these dates.

everest_data %>%
  group_by(date) %>%
  # One row per date: the number of deaths and the first recorded cause
  summarise(casualty_count = n(),
            cause = first(cause_of_death)) %>%
  arrange(desc(casualty_count)) %>%
  head(10) %>%
  kbl(booktabs = TRUE, 
      caption = "Deadly Days") %>%
  kable_classic(full_width = FALSE)

date         casualty_count   cause
2015-04-25   17               Base Camp avalanche following the April 2015 Nepal earthquake
2014-04-18   15               2014 Mount Everest Avalanche
1996-05-11   8                Suspected HACE (high-altitude cerebral edema), exhaustion, frostbite and exposure.
1922-06-07   7                Avalanche
1974-09-09   6                Avalanche
1970-04-05   5                Avalanche
2007-05-17   5                NA
1985-10-11   4                Exposure
1988-10-17   4                Disappearance (likely accidental death during descent after reaching South Summit with Jozef Just rejoining group after he summited Everest solo)
1989-05-27   4                Avalanche

Avalanches appear to be the leading killer of climbers on these days. But are avalanches, like earthquakes, unpredictable?
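Because the cause of death is recorded as free text, counting causes across the whole dataset first requires grouping similar descriptions. The sketch below shows one possible way to do that with keyword matching. The function `categorise_cause()`, the keyword patterns, and the category names are my own assumptions for illustration; the example strings come from the table above.

```r
# Example cause strings taken from the "Deadly Days" table, plus one missing value
toy_causes <- c(
  "Avalanche",
  "2014 Mount Everest Avalanche",
  "Base Camp avalanche following the April 2015 Nepal earthquake",
  "Exposure",
  "Suspected HACE (high-altitude cerebral edema), exhaustion, frostbite and exposure.",
  NA
)

# Hypothetical helper: assign a coarse category by keyword.
# Later rules overwrite earlier ones, so the order encodes priority.
categorise_cause <- function(x) {
  out <- rep("Other", length(x))
  out[grepl("exposure|frostbite", x, ignore.case = TRUE)] <- "Exposure"
  out[grepl("HACE|HAPE|edema|altitude", x, ignore.case = TRUE)] <- "Altitude sickness"
  out[grepl("avalanche", x, ignore.case = TRUE)] <- "Avalanche"
  out[is.na(x)] <- "Unknown"
  out
}

cause_group <- categorise_cause(toy_causes)

# Frequency of each coarse category
print(table(cause_group))
```

Any such grouping is a judgment call, since a record that mentions both altitude sickness and exposure has to land in a single category, so the rules should be stated alongside any resulting counts.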

In the rest of the analysis, I take these extreme events into account, as they may skew the observations.

5.2 Deaths by Nationality

Nationals of Nepal account for the highest number of casualties. This observation is not surprising, given that most of the mountain-climbing guides are from the country. Outside Nepal, India, Japan, the United Kingdom, the United States, and South Korea have the highest fatalities.

everest_data %>% 
  count(nationality) %>%
  mutate(nationality = factor(nationality)) %>%
  mutate(nationality = fct_reorder(nationality, n, max)) %>%
  ggplot(aes(x = nationality, y = n)) + 
  geom_col() +
  coord_flip()

However, in several cases a large number of climbers died in a single extreme event.
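One way to see how much such events drive the nationality counts is to tabulate deaths both with and without the mass-casualty days. The sketch below assumes a made-up table `toy` with anonymous country labels and a cut-off of three deaths per day; both the rows and the `threshold` value are illustrative assumptions, not figures from the dataset.

```r
library(dplyr)

# Made-up records: four deaths on one day, one on another, two on a third
toy <- tibble(
  date = as.Date(c("2015-04-25", "2015-04-25", "2015-04-25", "2015-04-25",
                   "1996-05-11", "1985-10-11", "1985-10-11")),
  nationality = c("Country A", "Country A", "Country A", "Country B",
                  "Country C", "Country B", "Country B")
)

threshold <- 3  # assumed cut-off: days with more deaths than this count as extreme

# Attach the number of deaths on each record's date and flag extreme days
flagged <- toy %>%
  add_count(date, name = "deaths_on_day") %>%
  mutate(extreme_day = deaths_on_day > threshold)

# Deaths per nationality, with and without the extreme days
by_nationality <- flagged %>%
  group_by(nationality) %>%
  summarise(all_deaths = n(),
            excluding_extreme = sum(!extreme_day),
            .groups = "drop") %>%
  arrange(desc(all_deaths))

print(by_nationality)
```

If a nationality's count falls sharply in the second column, its ranking owes more to one or two disasters than to a steady pattern of deaths.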

5.3 Deaths by Month

The deadliest months for climbers on Mt. Everest are, in descending order, May, April, September, October, and June. These months also correspond to the peak climbing seasons, since extreme weather makes climbing difficult during the rest of the year. Because the data lack the total number of climbers in a given month, it is not possible to compute the rate of death. However, it would appear that attempts to climb the mountain outside these months would result in a higher casualty rate.

everest_data %>%
  count(month) %>%
  mutate(month = fct_reorder(month, n, max)) %>%
  ggplot(mapping = aes(x = month, y = n)) + 
  geom_col() +
  coord_flip() + 
  labs(x = NULL, 
       y = NULL, 
       title = "Deaths Among Climbers by Month")
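If the number of climbing attempts per month were available, death rates could be computed by joining the two tables. The sketch below illustrates the calculation under that assumption. Both `deaths_by_month` and `attempts_by_month` contain invented numbers chosen only to show the mechanics; no such attempts data is part of this analysis.

```r
library(dplyr)

# Invented death counts per month (illustrative only)
deaths_by_month <- tibble(
  month  = c("Apr", "May", "Sep"),
  deaths = c(40, 120, 20)
)

# Invented numbers of climbing attempts per month (illustrative only)
attempts_by_month <- tibble(
  month    = c("Apr", "May", "Sep"),
  attempts = c(1000, 6000, 200)
)

# Join on month and express deaths per 100 attempts
rates <- deaths_by_month %>%
  left_join(attempts_by_month, by = "month") %>%
  mutate(deaths_per_100 = 100 * deaths / attempts) %>%
  arrange(desc(deaths_per_100))

print(rates)
```

In this made-up example the month with the most deaths has the lowest rate, which is why raw counts alone cannot tell us which months are most dangerous.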

5.4 Deaths by Day of the Week

Saturdays, followed by Fridays, have the highest fatalities, which could be because these are the most popular climbing days.

everest_data %>% 
  count(day_of_week) %>%
  mutate(day_of_week = fct_reorder(day_of_week, n, max)) %>%
  ggplot(mapping = aes(x = day_of_week, y = n)) + 
  geom_col()

Moreover, extreme weather and other natural events such as earthquakes could skew these numbers. Hence, I eliminate the days that had more than three deaths and plot the figures again.

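everest_data %>% 
  group_by(date) %>%
  # Keep only the dates with at most three deaths
  filter(n() <= 3) %>%
  # Drop the grouping so that deaths are counted per day of week, not per date
  ungroup() %>%
  count(day_of_week) %>%
  mutate(day_of_week = fct_reorder(day_of_week, n, max)) %>%
  ggplot(mapping = aes(x = day_of_week, y = n)) + 
  geom_col()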
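Whether the differences between days are larger than chance would produce can be checked with a chi-square goodness-of-fit test against equal daily shares. The sketch below uses invented counts in `day_counts` purely to show the mechanics; it is not a result from the dataset.

```r
# Invented day-of-week death counts (illustrative only)
day_counts <- c(Sun = 10, Mon = 12, Tue = 9, Wed = 11, Thu = 10, Fri = 18, Sat = 20)

# With a single vector, chisq.test() tests against equal probabilities by default
fit <- chisq.test(day_counts)

# Expected count per day under the null hypothesis of equal shares
expected_per_day <- sum(day_counts) / length(day_counts)

print(fit)
print(expected_per_day)
```

The test assumes independent deaths, which does not hold when one event kills several climbers on the same day, so it is better suited to the filtered data. Even then, it compares counts rather than rates, because the number of climbers on the mountain each day is unknown.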

6 Conclusion

In this article, I have explored the deaths of people attempting to scale Mt. Everest. Avalanches are the riskiest events, contributing to the bulk of deaths. Citizens of Nepal form the bulk of casualties, followed by Indians. The month of May has the highest casualties, probably because it is the month in which most people attempt the climb. Fridays and Saturdays have the highest numbers of deaths among climbers. With more data, we could examine the rates of death more deeply.

References

Wikipedia. 2024. List of People Who Died Climbing Mount Everest. Wikimedia Foundation. https://en.wikipedia.org/wiki/List_of_people_who_died_climbing_Mount_Everest.
R Core Team. 2022. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.