Assignment 10B: Nobelprize.org

Author

Kristoff Oliphant

Introduction

In week 10 we’re doing more practice with JSON in Assignment 10B. This assignment uses the Nobel Prize organization’s public APIs to access structured Nobel Prize data. The task is to use one or both of the APIs available at the Nobel Prize Developer Zone and work with the JSON data to investigate and answer four interesting, data-driven questions.

Planned Workflow

I’ll navigate the Nobel Prize API by following the examples in its documentation. Similar to week 9, I’ll make a request to the endpoint for the topic of interest, parse the JSON response, and then transform the results into a clean R data frame using jsonlite, tidyr, and dplyr. Once I can see what data is available, it will be easier to decide on my four questions.
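The parse-then-flatten step in this workflow can be sketched on a small, hand-written payload. The JSON below is hypothetical and only imitates the nested shape of an API response; the field names `name` and `laureates` inside it are assumptions made for illustration, not real Nobel Prize data.

```r
library(jsonlite)
library(dplyr)
library(tidyr)

# Hypothetical payload that imitates a nested API response (not real data)
sample_json <- '{
  "nobelPrizes": [
    {"awardYear": "2001", "category": {"en": "Physics"},
     "laureates": [{"id": "1", "name": "Person A"},
                   {"id": "2", "name": "Person B"}]},
    {"awardYear": "2002", "category": {"en": "Peace"},
     "laureates": [{"id": "3", "name": "Person C"}]}
  ]
}'

# flatten = TRUE turns the nested "category" object into a category.en column
parsed <- fromJSON(sample_json, flatten = TRUE)
prizes_demo <- parsed$nobelPrizes

# "laureates" is a list column of data frames; unnest() gives one row per laureate
prizes_long <- prizes_demo %>%
  unnest(laureates, names_sep = "_") %>%
  select(awardYear, category.en, laureates_id, laureates_name)

prizes_long
```

The same two moves, `fromJSON(..., flatten = TRUE)` followed by `unnest()` on any remaining list columns, are what the rest of this assignment relies on.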

Anticipated Challenges

The challenges I anticipate are different from Week 9. Exposing credentials is not a risk this time, since the Nobel Prize API can be accessed without an account, unlike The New York Times API. The main challenge I anticipate is parsing the nested JSON and navigating it to extract the information I want about Nobel Prizes and laureates.
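A related thing that may be worth checking is whether a single request with a `limit` parameter actually returned every record. The sketch below uses a hypothetical helper, `check_complete()`, and assumes the parsed response carries a `meta$count` field with the total number of records; that field name is an assumption and should be confirmed against a real response. The mock lists stand in for `fromJSON()` output.

```r
# Hypothetical helper: compares rows received with the total the API reports.
# Assumes the parsed response has a meta$count field (unverified assumption).
check_complete <- function(parsed, records_field) {
  received <- NROW(parsed[[records_field]])
  reported <- parsed$meta$count
  if (is.null(reported)) {
    return(NA)  # no count available, so completeness is unknown
  }
  received >= reported
}

# Mock responses standing in for fromJSON() output (not real data)
full_response <- list(
  nobelPrizes = data.frame(awardYear = c("1901", "1902")),
  meta = list(count = 2)
)
cut_response <- list(
  laureates = data.frame(id = c("1", "2")),
  meta = list(count = 5)
)

check_complete(full_response, "nobelPrizes")
check_complete(cut_response, "laureates")
```

If the check fails for a real response, a larger `limit` or a second request with an offset would be needed before trusting any counts.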

Source: Developer zone. NobelPrize.org. Nobel Prize Outreach 2026. Sun. 5 Apr 2026. https://www.nobelprize.org/about/developer-zone-2/

library(jsonlite)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(tidyr)
library(ggplot2)
prizes_url <- "https://api.nobelprize.org/2.1/nobelPrizes?limit=1000"
prizes_raw <- fromJSON(prizes_url, flatten = TRUE)
prizes_df <- prizes_raw$nobelPrizes

Question 1: Which Nobel Prize category has been awarded the fewest number of times in history?

colnames(prizes_df)
 [1] "awardYear"           "dateAwarded"         "prizeAmount"        
 [4] "prizeAmountAdjusted" "links"               "laureates"          
 [7] "category.en"         "category.no"         "category.se"        
[10] "categoryFullName.en" "categoryFullName.no" "categoryFullName.se"
[13] "topMotivation.en"    "topMotivation.se"   
category_analysis <- prizes_df %>%
  group_by(category.en) %>%
  summarise(total_prizes = n()) %>%
  arrange(desc(total_prizes))
category_analysis
# A tibble: 6 × 2
  category.en            total_prizes
  <chr>                         <int>
1 Chemistry                       125
2 Literature                      125
3 Peace                           125
4 Physics                         125
5 Physiology or Medicine          125
6 Economic Sciences                57

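Answer: Economic Sciences has been awarded the fewest times, with 57 prizes compared with 125 entries for each of the other five categories. This is because its category was added later: the prize in Economic Sciences was established in 1968 and first awarded in 1969, whereas the other prizes date from 1901. The figure of 125 matches one entry per year from 1901 to 2025, so it likely includes years in which a prize was not awarded.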
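Since the question asks for the fewest, the smallest row can also be pulled out directly instead of being read off the bottom of a descending sort. This is a minimal sketch that rebuilds the counts from the category_analysis output above and uses `slice_min()` from dplyr; the object names `category_counts` and `fewest` are introduced here for illustration.

```r
library(dplyr)

# Counts copied from the category_analysis output above
category_counts <- tibble(
  category.en = c("Chemistry", "Literature", "Peace", "Physics",
                  "Physiology or Medicine", "Economic Sciences"),
  total_prizes = c(125L, 125L, 125L, 125L, 125L, 57L)
)

# Keep only the category (or categories, if tied) with the smallest count
fewest <- category_counts %>%
  slice_min(total_prizes, n = 1)

fewest
```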

Question 2: How many women have won a Nobel Prize in the most recent 20 years (2006–2026) compared to the first 20 years of the prize (1901–1921)?

laureates_url <- "https://api.nobelprize.org/2.1/laureates?limit=1000"
laureates_raw <- fromJSON(laureates_url, flatten = TRUE)
laureates_df <- laureates_raw$laureates
colnames(laureates_df)
  [1] "id"                               "fileName"                        
  [3] "gender"                           "sameAs"                          
  [5] "links"                            "nobelPrizes"                     
  [7] "acronym"                          "nativeName"                      
  [9] "penName"                          "knownName.en"                    
 [11] "knownName.se"                     "knownName.no"                    
 [13] "givenName.en"                     "givenName.se"                    
 [15] "givenName.no"                     "familyName.en"                   
 [17] "familyName.se"                    "familyName.no"                   
 [19] "fullName.en"                      "fullName.se"                     
 [21] "fullName.no"                      "birth.date"                      
 [23] "birth.year"                       "birth.place.city.en"             
 [25] "birth.place.city.no"              "birth.place.city.se"             
 [27] "birth.place.country.en"           "birth.place.country.no"          
 [29] "birth.place.country.se"           "birth.place.cityNow.en"          
 [31] "birth.place.cityNow.no"           "birth.place.cityNow.se"          
 [33] "birth.place.cityNow.sameAs"       "birth.place.cityNow.latitude"    
 [35] "birth.place.cityNow.longitude"    "birth.place.countryNow.en"       
 [37] "birth.place.countryNow.no"        "birth.place.countryNow.se"       
 [39] "birth.place.countryNow.sameAs"    "birth.place.countryNow.latitude" 
 [41] "birth.place.countryNow.longitude" "birth.place.continent.en"        
 [43] "birth.place.continent.no"         "birth.place.continent.se"        
 [45] "birth.place.locationString.en"    "birth.place.locationString.no"   
 [47] "birth.place.locationString.se"    "wikipedia.slug"                  
 [49] "wikipedia.english"                "wikidata.id"                     
 [51] "wikidata.url"                     "death.date"                      
 [53] "death.place.city.en"              "death.place.city.no"             
 [55] "death.place.city.se"              "death.place.country.en"          
 [57] "death.place.country.no"           "death.place.country.se"          
 [59] "death.place.country.sameAs"       "death.place.cityNow.en"          
 [61] "death.place.cityNow.no"           "death.place.cityNow.se"          
 [63] "death.place.cityNow.sameAs"       "death.place.cityNow.latitude"    
 [65] "death.place.cityNow.longitude"    "death.place.countryNow.en"       
 [67] "death.place.countryNow.no"        "death.place.countryNow.se"       
 [69] "death.place.countryNow.sameAs"    "death.place.countryNow.latitude" 
 [71] "death.place.countryNow.longitude" "death.place.continent.en"        
 [73] "death.place.continent.no"         "death.place.continent.se"        
 [75] "death.place.locationString.en"    "death.place.locationString.no"   
 [77] "death.place.locationString.se"    "orgName.en"                      
 [79] "orgName.no"                       "orgName.se"                      
 [81] "founded.date"                     "founded.place.city.en"           
 [83] "founded.place.city.no"            "founded.place.city.se"           
 [85] "founded.place.country.en"         "founded.place.country.no"        
 [87] "founded.place.country.se"         "founded.place.country.sameAs"    
 [89] "founded.place.cityNow.en"         "founded.place.cityNow.no"        
 [91] "founded.place.cityNow.se"         "founded.place.cityNow.sameAs"    
 [93] "founded.place.countryNow.en"      "founded.place.countryNow.no"     
 [95] "founded.place.countryNow.se"      "founded.place.countryNow.sameAs" 
 [97] "founded.place.continent.en"       "founded.place.continent.no"      
 [99] "founded.place.continent.se"       "founded.place.locationString.en" 
[101] "founded.place.locationString.no"  "founded.place.locationString.se" 
[103] "penNameOf.fullName"               "foundedCountry.en"               
[105] "foundedCountry.no"                "foundedCountry.se"               
[107] "foundedCountryNow.en"             "foundedCountryNow.no"            
[109] "foundedCountryNow.se"             "foundedContinent.en"             
laureates_tidy <- laureates_df %>%
  unnest(nobelPrizes, names_sep = "_")
colnames(laureates_tidy)
  [1] "id"                               "fileName"                        
  [3] "gender"                           "sameAs"                          
  [5] "links"                            "nobelPrizes_awardYear"           
  [7] "nobelPrizes_sortOrder"            "nobelPrizes_portion"             
  [9] "nobelPrizes_dateAwarded"          "nobelPrizes_prizeStatus"         
 [11] "nobelPrizes_prizeAmount"          "nobelPrizes_prizeAmountAdjusted" 
 [13] "nobelPrizes_affiliations"         "nobelPrizes_links"               
 [15] "nobelPrizes_category.en"          "nobelPrizes_category.no"         
 [17] "nobelPrizes_category.se"          "nobelPrizes_categoryFullName.en" 
 [19] "nobelPrizes_categoryFullName.no"  "nobelPrizes_categoryFullName.se" 
 [21] "nobelPrizes_motivation.en"        "nobelPrizes_motivation.se"       
 [23] "nobelPrizes_motivation.no"        "nobelPrizes_residences"          
 [25] "nobelPrizes_topMotivation.en"     "nobelPrizes_topMotivation.se"    
 [27] "acronym"                          "nativeName"                      
 [29] "penName"                          "knownName.en"                    
 [31] "knownName.se"                     "knownName.no"                    
 [33] "givenName.en"                     "givenName.se"                    
 [35] "givenName.no"                     "familyName.en"                   
 [37] "familyName.se"                    "familyName.no"                   
 [39] "fullName.en"                      "fullName.se"                     
 [41] "fullName.no"                      "birth.date"                      
 [43] "birth.year"                       "birth.place.city.en"             
 [45] "birth.place.city.no"              "birth.place.city.se"             
 [47] "birth.place.country.en"           "birth.place.country.no"          
 [49] "birth.place.country.se"           "birth.place.cityNow.en"          
 [51] "birth.place.cityNow.no"           "birth.place.cityNow.se"          
 [53] "birth.place.cityNow.sameAs"       "birth.place.cityNow.latitude"    
 [55] "birth.place.cityNow.longitude"    "birth.place.countryNow.en"       
 [57] "birth.place.countryNow.no"        "birth.place.countryNow.se"       
 [59] "birth.place.countryNow.sameAs"    "birth.place.countryNow.latitude" 
 [61] "birth.place.countryNow.longitude" "birth.place.continent.en"        
 [63] "birth.place.continent.no"         "birth.place.continent.se"        
 [65] "birth.place.locationString.en"    "birth.place.locationString.no"   
 [67] "birth.place.locationString.se"    "wikipedia.slug"                  
 [69] "wikipedia.english"                "wikidata.id"                     
 [71] "wikidata.url"                     "death.date"                      
 [73] "death.place.city.en"              "death.place.city.no"             
 [75] "death.place.city.se"              "death.place.country.en"          
 [77] "death.place.country.no"           "death.place.country.se"          
 [79] "death.place.country.sameAs"       "death.place.cityNow.en"          
 [81] "death.place.cityNow.no"           "death.place.cityNow.se"          
 [83] "death.place.cityNow.sameAs"       "death.place.cityNow.latitude"    
 [85] "death.place.cityNow.longitude"    "death.place.countryNow.en"       
 [87] "death.place.countryNow.no"        "death.place.countryNow.se"       
 [89] "death.place.countryNow.sameAs"    "death.place.countryNow.latitude" 
 [91] "death.place.countryNow.longitude" "death.place.continent.en"        
 [93] "death.place.continent.no"         "death.place.continent.se"        
 [95] "death.place.locationString.en"    "death.place.locationString.no"   
 [97] "death.place.locationString.se"    "orgName.en"                      
 [99] "orgName.no"                       "orgName.se"                      
[101] "founded.date"                     "founded.place.city.en"           
[103] "founded.place.city.no"            "founded.place.city.se"           
[105] "founded.place.country.en"         "founded.place.country.no"        
[107] "founded.place.country.se"         "founded.place.country.sameAs"    
[109] "founded.place.cityNow.en"         "founded.place.cityNow.no"        
[111] "founded.place.cityNow.se"         "founded.place.cityNow.sameAs"    
[113] "founded.place.countryNow.en"      "founded.place.countryNow.no"     
[115] "founded.place.countryNow.se"      "founded.place.countryNow.sameAs" 
[117] "founded.place.continent.en"       "founded.place.continent.no"      
[119] "founded.place.continent.se"       "founded.place.locationString.en" 
[121] "founded.place.locationString.no"  "founded.place.locationString.se" 
[123] "penNameOf.fullName"               "foundedCountry.en"               
[125] "foundedCountry.no"                "foundedCountry.se"               
[127] "foundedCountryNow.en"             "foundedCountryNow.no"            
[129] "foundedCountryNow.se"             "foundedContinent.en"             
gender_analysis <- laureates_tidy %>%
  filter(gender == "female") %>%
  # awardYear is stored as character, so convert it before comparing numerically
  mutate(award_year = as.integer(nobelPrizes_awardYear)) %>%
  mutate(period = case_when(
    award_year >= 1901 & award_year <= 1921 ~ "Early (1901-1921)",
    award_year >= 2006 & award_year <= 2026 ~ "Recent (2006-2026)",
    TRUE ~ "Other"
  )) %>%
  filter(period != "Other") %>%
  count(period)
gender_analysis
# A tibble: 2 × 2
  period                 n
  <chr>              <int>
1 Early (1901-1921)      4
2 Recent (2006-2026)    34

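Answer: Women received 34 Nobel Prizes in the most recent period (2006–2026) compared with only 4 in the early period (1901–1921), a difference of 30. Two caveats apply. Each range is inclusive, so it spans 21 award years rather than 20. The code also counts prizes rather than distinct people, so Marie Curie, who won in both 1903 and 1911, is counted twice in the early period.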
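The difference between counting prizes and counting distinct women can be sketched with a few made-up rows in the shape of laureates_tidy, where each row is one prize. The data and the names `demo_tidy` and `demo_counts` are hypothetical; only the column names `id`, `gender`, and `nobelPrizes_awardYear` come from the output above.

```r
library(dplyr)

# Hypothetical rows in the shape of laureates_tidy: one row per prize
demo_tidy <- tibble(
  id = c("a", "a", "b", "c"),
  gender = c("female", "female", "female", "male"),
  nobelPrizes_awardYear = c("1903", "1911", "1909", "1905")
)

demo_counts <- demo_tidy %>%
  mutate(award_year = as.integer(nobelPrizes_awardYear)) %>%
  filter(gender == "female", between(award_year, 1901, 1921)) %>%
  summarise(
    prizes = n(),                   # rows, i.e. prizes awarded to women
    distinct_women = n_distinct(id) # unique laureates behind those prizes
  )

demo_counts
```

Here laureate "a" holds two prizes, so the two summaries differ; which one to report depends on whether the question is about prizes or about people.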

Question 3: Who are the top 5 youngest Nobel Prize winners in history, and what were the specific categories of their awards?

youngest_analysis <- laureates_tidy %>%
  mutate(
    award_year = as.numeric(nobelPrizes_awardYear),
    birth_year = as.numeric(birth.year),
    # approximate age: year difference only, ignoring exact birth and award dates
    age_at_award = award_year - birth_year
  ) %>%
  # rows without a birth year (such as organizations) produce NA and are dropped
  filter(!is.na(age_at_award)) %>%
  arrange(age_at_award) %>%
  select(fullName.en, age_at_award, nobelPrizes_category.en, nobelPrizes_awardYear) %>%
  head(5)
youngest_analysis
# A tibble: 5 × 4
  fullName.en          age_at_award nobelPrizes_category…¹ nobelPrizes_awardYear
  <chr>                       <dbl> <chr>                  <chr>                
1 Malala Yousafzai               17 Peace                  2014                 
2 William Lawrence Br…           25 Physics                1915                 
3 Nadia Murad Basee T…           25 Peace                  2018                 
4 Carl David Anderson            31 Physics                1936                 
5 Paul Adrien Maurice…           31 Physics                1933                 
# ℹ abbreviated name: ¹​nobelPrizes_category.en

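Answer: The five youngest laureates range from 17 to 31 years old at the time of their award, and all of their prizes are in either Physics or Peace. They are Malala Yousafzai (17, Peace, 2014), William Lawrence Bragg (25, Physics, 1915), Nadia Murad Basee Taha (25, Peace, 2018), Carl David Anderson (31, Physics, 1936), and Paul Adrien Maurice Dirac (31, Physics, 1933). Ages are computed as award year minus birth year, so they are approximate, and other laureates tied at 31 may be cut off by head(5).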
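Because `head(5)` stops at exactly five rows, anyone tied with the fifth row is silently dropped. A minimal sketch of one alternative, `slice_min()` with `with_ties = TRUE`, is shown below on illustrative rows; the labels `P1` to `P7` and the object names are hypothetical, and the years are sample values chosen to produce a three-way tie.

```r
library(dplyr)

# Illustrative rows with placeholder names; three laureates tie at age 31
demo_ages <- tibble(
  name = c("P1", "P2", "P3", "P4", "P5", "P6", "P7"),
  birth_year = c(1997, 1890, 1993, 1905, 1902, 1901, 1880),
  award_year = c(2014, 1915, 2018, 1936, 1933, 1932, 1930)
) %>%
  mutate(age_at_award = award_year - birth_year)

# head(5) after sorting keeps exactly five rows, whatever the ties
youngest_head <- demo_ages %>%
  arrange(age_at_award) %>%
  head(5)

# slice_min() with with_ties = TRUE keeps every row tied with the fifth
youngest_with_ties <- demo_ages %>%
  slice_min(age_at_award, n = 5, with_ties = TRUE)

nrow(youngest_head)
nrow(youngest_with_ties)
```

Keeping ties makes the result independent of row order, at the cost of sometimes returning more than five rows.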

Question 4: Which countries host the highest number of “International Laureates,” that is, individuals born in one country but affiliated with an institution in a different country at the time of their award?

# one row per laureate, prize, and affiliation
laureates_international <- laureates_tidy %>%
  unnest(nobelPrizes_affiliations, names_sep = "_")
international_results <- laureates_international %>%
  # keep rows where both the birth country and the affiliation country are known
  filter(!is.na(birth.place.country.en), !is.na(nobelPrizes_affiliations_country.en)) %>%
  # "international" means the two countries differ
  filter(birth.place.country.en != nobelPrizes_affiliations_country.en) %>%
  count(nobelPrizes_affiliations_country.en, sort = TRUE) %>%
  rename(Host_Country = nobelPrizes_affiliations_country.en,
         International_Laureate_Count = n) %>%
  head(5)
international_results
# A tibble: 5 × 2
  Host_Country   International_Laureate_Count
  <chr>                                 <int>
1 USA                                     150
2 United Kingdom                           39
3 Germany                                  37
4 Switzerland                              13
5 France                                   12

Answer: This table shows that the USA hosts by far the most international laureates, with 150 recorded affiliations, followed by the United Kingdom (39) and Germany (37). This may signal that the USA attracts talent through its resources and large number of opportunities. It doesn’t mean that other countries do not also attract talent at the global level, but their institutions might not be as accessible to researchers from abroad. The counts are affiliations rather than people, so a laureate with more than one affiliation abroad is counted more than once.
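Two refinements might make this count stricter, and both are sketched below on made-up rows rather than the real data. The first compares against `birth.place.countryNow.en`, a column that appears in the output above and that presumably holds the present-day country name, instead of `birth.place.country.en`. The second counts each laureate once per host country with `distinct()`. The data and the names `demo_intl` and `demo_hosts` are hypothetical.

```r
library(dplyr)

# Hypothetical rows in the shape of laureates_international (not real data)
demo_intl <- tibble(
  id = c("a", "a", "b", "c", "d"),
  birth.place.countryNow.en = c("Poland", "Poland", "Germany", "USA", NA),
  nobelPrizes_affiliations_country.en = c("France", "France", "Germany",
                                          "USA", "USA")
)

demo_hosts <- demo_intl %>%
  filter(
    !is.na(birth.place.countryNow.en),
    !is.na(nobelPrizes_affiliations_country.en),
    birth.place.countryNow.en != nobelPrizes_affiliations_country.en
  ) %>%
  # count each laureate at most once per host country
  distinct(id, nobelPrizes_affiliations_country.en) %>%
  count(nobelPrizes_affiliations_country.en, sort = TRUE,
        name = "International_Laureate_Count")

demo_hosts
```

In this toy data, laureate "a" has two rows for France but contributes a single count, and the row with a missing birth country is excluded.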