2025-03-30

Data Source

  • The data source I selected for this project is the Mount Everest Ascent Data (1953-2020) dataset from Kaggle.
  • This dataset covers climbers who attempted Mt. Everest and records the following for each climber:
    • name
    • date of ascent
    • time of ascent
    • citizenship
    • sex
    • age
    • oxygen usage
    • death (Y/N)
    • host country

Bar graph of # of deaths per year

This plot uses ggplot to show the number of deaths that occurred on the mountain each year, with the bars colored by sex.
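A minimal sketch of how a bar graph like this could be built. The toy data frame and the column names `yr`, `sex`, and `dth` are assumptions made for illustration; they are not taken from the actual Kaggle file, and the values are made up.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for the dataset (made-up rows, assumed column names)
toy <- data.frame(
  yr  = c(1996, 1996, 1996, 2014, 2014, 2015),
  sex = c("M", "F", "M", "M", "M", "F"),
  dth = c("Y", "Y", "N", "Y", "Y", "Y")
)

# Keep only climbers who died, then count them per year and sex
deaths_per_year <- toy %>%
  filter(dth == "Y") %>%
  count(yr, sex, name = "deaths")

# One bar per year, stacked and filled by sex
deaths_plot <- ggplot(deaths_per_year,
                      aes(x = factor(yr), y = deaths, fill = sex)) +
  geom_col() +
  labs(x = "Year", y = "Number of deaths", fill = "Sex")

print(deaths_plot)
```

Counting before plotting keeps the summary table available for inspection; passing the filtered rows straight to `geom_bar()` would likely give the same picture.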

R code to set up a new dataframe

The following R code shows how I manipulated the original dataset to separate climbers from Nepal from climbers from other countries. The plot of this data is shown on the next slide.

library(dplyr)

# label each climber as local (Nepalese) or not
country_data <- data %>%
  mutate(is_nepalese = ifelse(citizenship == "Nepal",
                              "Nepalese", "Foreigner"))

# count how many locals vs. non-locals there are
nepal_counts <- country_data %>%
  count(is_nepalese)

# count deaths and total climbers in each group
death_count_df <- country_data %>%
  group_by(is_nepalese) %>%
  summarize(deaths = sum(dth == "Y", na.rm = TRUE), total = n())

Pie chart of citizenship and # of deaths

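Is the death rate of Nepalese climbers lower than that of non-locals? Based on this chart, far fewer deaths occurred among climbers of Nepalese citizenship than among climbers from other countries. Because the chart compares death counts, the size of each group also matters when judging the rate.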
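Since the question is about rates, one way to compare the groups directly is to divide deaths by the number of climbers in each group. The sketch below uses a toy table shaped like `death_count_df`; the numbers are invented for illustration and are not results from the dataset.

```r
library(dplyr)

# Toy stand-in for death_count_df (made-up counts, not real results)
toy_death_counts <- data.frame(
  is_nepalese = c("Foreigner", "Nepalese"),
  deaths      = c(6, 2),
  total       = c(200, 100)
)

# Death rate = deaths / total climbers in each group
death_rates <- toy_death_counts %>%
  mutate(death_rate = deaths / total)

print(death_rates)
```

With the real `death_count_df`, the same `mutate()` call would add a `death_rate` column that can be compared across the two groups.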

Death status by age and time of ascent

The following 3D scatter plot shows the time of ascent (x-axis), age (y-axis), and whether or not the climber died (Y for Yes, N for No) (z-axis). There doesn’t appear to be any clear relationship between either age or time of ascent and whether a climber died.

Box plots of oxygen use vs age

The two box plots were created using ggplot and show the age distributions of climbers who used oxygen and climbers who did not.
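A minimal sketch of how box plots like these could be drawn. The toy data and the column names `age` and `oxy` are assumptions for illustration only.

```r
library(ggplot2)

# Toy stand-in for the dataset (made-up ages, assumed column names)
toy <- data.frame(
  age = c(25, 31, 35, 40, 28, 34, 42, 60),
  oxy = c("No", "No", "No", "No", "Yes", "Yes", "Yes", "Yes")
)

# One box per oxygen group, showing the age distribution in that group
age_boxplot <- ggplot(toy, aes(x = oxy, y = age)) +
  geom_boxplot() +
  labs(x = "Used oxygen", y = "Age (years)")

print(age_boxplot)
```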

Statistical Analysis of Previous Box Plot

Summary statistics (five-number summary plus the mean) for the ages of climbers who did NOT use oxygen:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   20.00   31.00   35.00   35.53   40.00   55.00

Summary statistics (five-number summary plus the mean) for the ages of climbers who DID use oxygen:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   13.00   28.00   34.00   35.57   42.00   80.00

According to these summaries, the median ages of climbers who did and did not use oxygen are very similar (34 and 35). However, the minimum and maximum ages differ much more: climbers who used oxygen ranged from 13 to 80 years old, while those who did not ranged from 20 to 55. Every climber over 55 in the data used oxygen, which is logical since the climb is more difficult for climbers who are much older or much younger than the average climber (around 34-35 years old).
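Per-group summaries in the format shown above can be produced with base R's `summary()`. The sketch below is one way to do it, using made-up ages and an assumed grouping column `oxy`; it does not reproduce the numbers reported above.

```r
# Toy stand-in for the dataset (made-up ages, assumed column names)
toy <- data.frame(
  age = c(25, 31, 35, 40, 28, 34, 42, 60),
  oxy = c("No", "No", "No", "No", "Yes", "Yes", "Yes", "Yes")
)

# Apply summary() to the ages within each oxygen group
age_summaries <- tapply(toy$age, toy$oxy, summary)

print(age_summaries$No)   # climbers who did not use oxygen
print(age_summaries$Yes)  # climbers who used oxygen
```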