In Week 10 we’re getting more practice with JSON in Assignment 10B. This assignment uses the Nobel Prize organization’s public APIs to access structured Nobel Prize data. The task is to use one or both of the APIs available at the Nobel Prize Developer Zone, and to use the JSON data they return to investigate and answer four interesting, data-driven questions.
Planned Workflow
I’ll navigate the Nobel Prize API by following the examples in its documentation. As in Week 9, I’ll make a request to the endpoint for the topic of interest, parse the JSON response, and transform the results into a clean R data frame using jsonlite, tidyr, and dplyr. Once I can see what data is available, it will be easier to settle on my four questions.
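A minimal sketch of the parse-and-flatten step, assuming a response shaped like the hand-written sample below. The sample JSON, the field names, and the URL in the comment are illustrative assumptions, not output copied from the API.

```r
library(jsonlite)
library(dplyr)

# Hypothetical sample shaped like a prizes response. A live request might
# look like the line below (URL and query parameter are assumptions; check
# the Developer Zone documentation before relying on them):
#   fromJSON("https://api.nobelprize.org/2.1/nobelPrizes?limit=1000", flatten = TRUE)
sample_json <- '{
  "nobelPrizes": [
    {"awardYear": "1901", "category": {"en": "Chemistry"}},
    {"awardYear": "1901", "category": {"en": "Physics"}},
    {"awardYear": "1969", "category": {"en": "Economic Sciences"}}
  ]
}'

# flatten = TRUE turns the nested "category" object into a "category.en" column
parsed <- fromJSON(sample_json, flatten = TRUE)

# Keep only the list of prizes and convert it to a tibble
prizes <- as_tibble(parsed$nobelPrizes)
prizes
```

Working from a small inline sample first makes it easier to see how nested fields are renamed before pointing the same code at the full response.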
Anticipated Challenges
The challenges I anticipate are different from Week 9. A potential security breach is not a concern this time, because the Nobel Prize API can be accessed without an account, unlike The New York Times API. The challenge I do anticipate is parsing the JSON and navigating it to extract the information I want about Nobel Prizes and laureates.
# A tibble: 6 × 2
  category.en            total_prizes
  <chr>                         <int>
1 Chemistry                       125
2 Literature                      125
3 Peace                           125
4 Physics                         125
5 Physiology or Medicine          125
6 Economic Sciences                57
Answer: We can see that the Nobel Prize that has been awarded the least is Economic Sciences. This is likely because its category was established later, in 1968, whereas the other prizes date back to 1901.
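A table like this one can be produced with a grouped count. The sketch below assumes a flattened data frame with one row per prize and a `category.en` column; the four sample rows are made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the flattened prizes data (one row per prize entry)
prizes <- tibble(
  awardYear   = c("1901", "1901", "1902", "1969"),
  category.en = c("Physics", "Peace", "Physics", "Economic Sciences")
)

# Count entries per category, then sort from most to fewest
prize_counts <- prizes %>%
  count(category.en, name = "total_prizes") %>%
  arrange(desc(total_prizes))

prize_counts
```

Note that a count like this tallies rows in the data, so if the source includes an entry for a year in which a prize was not awarded, that year is still counted.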
Question 2: How many women have won a Nobel Prize in the most recent 20 years (2006–2026) compared to the first 20 years of the prize (1901–1921)?
# A tibble: 2 × 2
  period                 n
  <chr>              <int>
1 Early (1901-1921)      4
2 Recent (2006-2026)    34
Answer: In the most recent period, women won 34 Nobel Prizes, compared with only 4 in the early period. That is a difference of 30.
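One way to build this comparison is to label each award year with its period and count the rows for women. This is a sketch only: the `gender` and `awardYear` column names and the placeholder laureates are assumptions.

```r
library(dplyr)

# Hypothetical sample: one row per laureate-prize pair
laureate_prizes <- tibble(
  fullName.en = c("Laureate A", "Laureate B", "Laureate C", "Laureate D"),
  gender      = c("female", "male", "female", "female"),
  awardYear   = c("1903", "1905", "2009", "1950")
)

women_by_period <- laureate_prizes %>%
  mutate(awardYear = as.integer(awardYear)) %>%   # years arrive as text
  filter(gender == "female") %>%
  mutate(period = case_when(
    between(awardYear, 1901, 1921) ~ "Early (1901-1921)",
    between(awardYear, 2006, 2026) ~ "Recent (2006-2026)"
  )) %>%
  filter(!is.na(period)) %>%                      # drop years outside both windows
  count(period)

women_by_period
```

Because `between()` is inclusive at both ends, the boundary years 1901, 1921, 2006, and 2026 all fall inside their windows.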
Question 3: Who are the top 5 youngest Nobel Prize winners in history, and what were the specific categories of their awards?
# A tibble: 5 × 4
  fullName.en          age_at_award nobelPrizes_category…¹ nobelPrizes_awardYear
  <chr>                       <dbl> <chr>                  <chr>
1 Malala Yousafzai               17 Peace                  2014
2 William Lawrence Br…           25 Physics                1915
3 Nadia Murad Basee T…           25 Peace                  2018
4 Carl David Anderson            31 Physics                1936
5 Paul Adrien Maurice…           31 Physics                1933
# ℹ abbreviated name: ¹nobelPrizes_category.en
Answer: The five youngest winners range from 17 to 31 years old at the time of their award. Their prizes fall into just two categories, Physics and Peace. The winners’ names are Malala Yousafzai, William Lawrence Bragg, Nadia Murad Basee Taha, Carl David Anderson, and Paul Adrien Maurice Dirac.
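Age at award can be approximated as the award year minus the birth year. The sketch below assumes a `birth.date` column in `YYYY-MM-DD` form and an `awardYear` column; both names are assumptions about the flattened data, and this is not necessarily the calculation behind the table above.

```r
library(dplyr)

# Three laureates from the table, with their birth dates and award years
laureates <- tibble(
  fullName.en = c("Paul Adrien Maurice Dirac", "Malala Yousafzai",
                  "William Lawrence Bragg"),
  birth.date  = c("1902-08-08", "1997-07-12", "1890-03-31"),
  awardYear   = c("1933", "2014", "1915")
)

youngest <- laureates %>%
  mutate(
    birth_year   = as.integer(substr(birth.date, 1, 4)),  # first four characters are the year
    age_at_award = as.integer(awardYear) - birth_year
  ) %>%
  arrange(age_at_award) %>%
  slice_head(n = 2)                                       # keep the two youngest

youngest
```

A year-based difference ignores whether the birthday had already passed when the prize was awarded, so it can be off by one from a laureate's exact age on the award date.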
Question 4: Which countries have the highest number of “International Laureates”, that is, individuals born in one country but affiliated with an institution in a different country at the time of their award?
# A tibble: 5 × 2
  Host_Country   International_Laureate_Count
  <chr>                                 <int>
1 USA                                     150
2 United Kingdom                           39
3 Germany                                  37
4 Switzerland                              13
5 France                                   12
Answer: This table shows that the USA has by far the highest number of recorded affiliations for international laureates. This may signal that the USA is viewed as a place that attracts talent through its resources and large number of opportunities. It doesn’t mean that other countries do not also attract talent at the global level, but they might not be as accessible.
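The “international laureate” definition can be expressed as a filter that keeps rows where the birth country differs from the affiliation country, followed by a count per host country. In this sketch the `birth_country` and `affiliation_country` columns and the placeholder rows are hypothetical.

```r
library(dplyr)

# Hypothetical sample: one row per laureate affiliation at the time of the award
laureates <- tibble(
  fullName.en         = c("Laureate A", "Laureate B", "Laureate C", "Laureate D"),
  birth_country       = c("Poland", "USA", "Germany", "Japan"),
  affiliation_country = c("France", "USA", "USA", "USA")
)

international <- laureates %>%
  # Keep rows with both countries known and different from each other
  filter(!is.na(birth_country), !is.na(affiliation_country),
         birth_country != affiliation_country) %>%
  count(affiliation_country, name = "International_Laureate_Count", sort = TRUE) %>%
  rename(Host_Country = affiliation_country)

international
```

The counts depend on how country names are matched. For example, comparing a historical birth-country name against a present-day affiliation name would flag a laureate as international even when both refer to the same place.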