Do not change anything in the following chunk

You will be working on olympic_gymnasts dataset. Do not change the code below:

olympics <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-07-27/olympics.csv')

olympic_gymnasts <- olympics %>% 
  filter(!is.na(age)) %>%             # only keep athletes with known age
  filter(sport == "Gymnastics") %>%   # keep only gymnasts
  mutate(
    medalist = case_when(             # add column for success in medaling
      is.na(medal) ~ FALSE,           # NA values go to FALSE
      !is.na(medal) ~ TRUE            # non-NA values (Gold, Silver, Bronze) go to TRUE
    )
  )

olympics

More information about the dataset can be found at

https://github.com/rfordatascience/tidytuesday/blob/master/data/2021/2021-07-27/readme.md

Question 1: Create a subset dataset with the following columns only: name, sex, age, team, year and medalist. Call it df.

df <- olympic_gymnasts |>
  select(name, sex, age, team, year, medalist)
df

Question 2: From df, create df2 that only has the years 2008, 2012, and 2016.

df2 <- df |>
  filter(year %in% c(2008, 2012, 2016))   # year is numeric, so compare against numbers
df2

Question 3 Group by these three years (2008,2012, and 2016) and summarize the mean of the age in each group.

df2 <- df |>
  filter(year %in% c(2008, 2012, 2016)) |>   # year is numeric, so compare against numbers
  group_by(year) |>
  summarise(mean_age = mean(age))            # name the summary column

df2

Question 4 Use olympic_gymnasts dataset, group by year, and find the mean of the age for each year, call this dataset oly_year. (optional after creating the dataset, find the minimum average age)

oly_year <- olympic_gymnasts |>
  group_by(year) |>
  summarise(mean_age = mean(age))   # name the summary column

oly_year
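
For the optional part, one way to find the minimum average age is to keep the row with the smallest yearly mean. The sketch below wraps this in a helper. The function name `youngest_year` and the column name `mean_age` are illustrative choices, not part of the assignment, and the helper assumes its input has `year` and `age` columns, as `olympic_gymnasts` does.

```r
library(dplyr)

# Hypothetical helper: return the year(s) with the lowest mean age.
# Assumes `data` has numeric `year` and `age` columns.
youngest_year <- function(data) {
  data |>
    group_by(year) |>
    summarise(mean_age = mean(age)) |>
    slice_min(mean_age, n = 1)   # keeps ties, if any
}
```

Calling `youngest_year(olympic_gymnasts)` should return a one-row tibble (more rows if several years tie) with the year and its mean age.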

Question 5 This question is open ended. Create a question that requires you to use at least two verbs. Create a code that answers your question. Then below the chunk, reflect on your question choice and coding procedure

# Your R code here
Olympic_Basketball <- olympics |>
  filter(sport == "Basketball") |>
  filter(medal %in% c("Gold", "Silver", "Bronze")) |>
  count(team) |>
  arrange(desc(n))

Olympic_Basketball

Discussion: Enter your discussion of results here.

How many players from each country have won an Olympic medal in Basketball?

This code counts how many medal-winning players each country has had in Olympic Basketball. In addition, I arranged the results in descending order, from the country with the most medal-winning players (USA) to the country with the fewest (Canada), to make the data easier to read.

To get these results I used the filter, count, and arrange functions. Filter was used twice: first to select Basketball from the olympics dataset, and second to keep only the rows with a value of Gold, Silver, or Bronze in the medal column. I then used count to find the number of medal-winning rows for each country, and finally used arrange to order the countries from the highest count to the lowest. Because the data has one row per athlete per event per Games, a player who won medals at more than one Games is counted once for each.
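
To count each player only once, one option is to drop duplicate team and name pairs before counting. The following is a minimal sketch, not the method used above: the helper name `count_medalists` is hypothetical, and it assumes that `name` is enough to identify a player within a team.

```r
library(dplyr)

# Hypothetical helper: count distinct medal-winning players per team.
# Assumes `data` has `sport`, `medal`, `team`, and `name` columns,
# and that `name` identifies a player within a team.
count_medalists <- function(data, sport_name) {
  data |>
    filter(sport == sport_name, !is.na(medal)) |>   # medal rows for one sport
    distinct(team, name) |>                         # one row per player per team
    count(team, sort = TRUE)                        # players per team, descending
}
```

Calling `count_medalists(olympics, "Basketball")` should give a table in the same shape as `Olympic_Basketball`, with repeat medalists counted once.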