1. Introduction

For this assignment, I worked with the Nobel Prize API, which provides open data about Nobel laureates and prizes in JSON format. The data includes information such as laureates’ birthplaces, nationalities, affiliations, prize categories, and award years.

The goal of this analysis was to retrieve data directly from the Nobel Prize API and answer four questions:
1. Which countries have produced the most Nobel laureates?
2. What is the average age of laureates when they received their Nobel Prize (by category)?
3. How many Nobel Prizes have been awarded per decade?
4. Which continents have produced the most Nobel laureates?


2. Load Libraries

library(httr)
library(jsonlite)
library(dplyr)
library(tidyr)
library(knitr)
library(purrr)
library(lubridate)

3. Access the API

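The Nobel Prize API provides two main endpoints, one for laureates and one for prizes. Called without query parameters, as here, each returns only the first page of results (25 records by default):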

laureates_response <- GET("https://api.nobelprize.org/2.1/laureates")
nobel_response <- GET("https://api.nobelprize.org/2.1/nobelPrizes")

# Check status (should be 200)
status_code(laureates_response)
## [1] 200
status_code(nobel_response)
## [1] 200
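
The counts later in this report depend on how many records each request returns. A minimal sketch of paging through an endpoint follows, assuming the API accepts `limit` and `offset` query parameters and reports the total number of records in `meta$count`. The helper names `page_offsets` and `fetch_all_pages` are hypothetical and are not part of httr or the API.

```r
# Offsets needed to page through `total` records, `limit` records at a time.
page_offsets <- function(total, limit = 25) {
  if (total <= 0) return(numeric(0))
  seq(0, total - 1, by = limit)
}

# Hypothetical helper: request every page of an endpoint and return the
# parsed pages as a list. Assumes the response carries meta$count.
fetch_all_pages <- function(url, limit = 25) {
  get_page <- function(offset) {
    resp <- httr::GET(url, query = list(limit = limit, offset = offset))
    httr::stop_for_status(resp)
    jsonlite::fromJSON(
      httr::content(resp, as = "text", encoding = "UTF-8"),
      flatten = TRUE
    )
  }
  first <- get_page(0)
  offsets <- page_offsets(first$meta$count, limit)
  # The first page is already fetched, so only the remaining offsets are requested.
  c(list(first), lapply(offsets[-1], get_page))
}

# Offline check of the paging arithmetic: 60 records, 25 per page.
page_offsets(60, 25)
```

Calling `fetch_all_pages("https://api.nobelprize.org/2.1/laureates")` would then return one parsed page per list element. Combining the pages into a single data frame is left out of this sketch.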

4. Parse JSON

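# Set the encoding explicitly so httr does not have to guess it
laureates_json <- content(laureates_response, as = "text", encoding = "UTF-8")
nobel_json <- content(nobel_response, as = "text", encoding = "UTF-8")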

laureates_data <- fromJSON(laureates_json, flatten = TRUE)
nobel_data <- fromJSON(nobel_json, flatten = TRUE)

names(laureates_data)
## [1] "laureates" "meta"      "links"
names(nobel_data)
## [1] "nobelPrizes" "meta"        "links"
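
Each parsed response carries a `meta` element next to the data, which can be used to check whether the page holds every record. The sketch below runs on a hand-built list that mimics the parsed structure. The `count` field inside `meta` and the helper name `is_truncated` are assumptions for illustration and are not confirmed by the output above.

```r
# Hypothetical helper: TRUE when the parsed response holds fewer rows
# than the total reported in meta$count (assumed field name).
is_truncated <- function(parsed, element) {
  nrow(parsed[[element]]) < parsed$meta$count
}

# Hand-built stand-in for a parsed response: 3 rows out of a reported 10.
toy_partial <- list(
  laureates = data.frame(id = c("1", "2", "3")),
  meta = list(offset = 0, limit = 3, count = 10)
)

# Stand-in for a response that already holds every record.
toy_complete <- list(
  laureates = data.frame(id = c("1", "2", "3")),
  meta = list(offset = 0, limit = 25, count = 3)
)

is_truncated(toy_partial, "laureates")
is_truncated(toy_complete, "laureates")
```

Under the same assumption, `is_truncated(laureates_data, "laureates")` would apply the check to the data parsed above.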

5. Explore the Laureates Data

names(laureates_data$laureates)
##  [1] "id"                               "fileName"                        
##  [3] "gender"                           "sameAs"                          
##  [5] "links"                            "nobelPrizes"                     
##  [7] "knownName.en"                     "knownName.se"                    
##  [9] "givenName.en"                     "givenName.se"                    
## [11] "familyName.en"                    "familyName.se"                   
## [13] "fullName.en"                      "fullName.se"                     
## [15] "birth.date"                       "birth.place.city.en"             
## [17] "birth.place.city.no"              "birth.place.city.se"             
## [19] "birth.place.country.en"           "birth.place.country.no"          
## [21] "birth.place.country.se"           "birth.place.cityNow.en"          
## [23] "birth.place.cityNow.no"           "birth.place.cityNow.se"          
## [25] "birth.place.cityNow.sameAs"       "birth.place.cityNow.latitude"    
## [27] "birth.place.cityNow.longitude"    "birth.place.countryNow.en"       
## [29] "birth.place.countryNow.no"        "birth.place.countryNow.se"       
## [31] "birth.place.countryNow.sameAs"    "birth.place.countryNow.latitude" 
## [33] "birth.place.countryNow.longitude" "birth.place.continent.en"        
## [35] "birth.place.continent.no"         "birth.place.continent.se"        
## [37] "birth.place.locationString.en"    "birth.place.locationString.no"   
## [39] "birth.place.locationString.se"    "wikipedia.slug"                  
## [41] "wikipedia.english"                "wikidata.id"                     
## [43] "wikidata.url"                     "death.date"                      
## [45] "death.place.city.en"              "death.place.city.no"             
## [47] "death.place.city.se"              "death.place.country.en"          
## [49] "death.place.country.no"           "death.place.country.se"          
## [51] "death.place.country.sameAs"       "death.place.cityNow.en"          
## [53] "death.place.cityNow.no"           "death.place.cityNow.se"          
## [55] "death.place.cityNow.sameAs"       "death.place.cityNow.latitude"    
## [57] "death.place.cityNow.longitude"    "death.place.countryNow.en"       
## [59] "death.place.countryNow.no"        "death.place.countryNow.se"       
## [61] "death.place.countryNow.sameAs"    "death.place.countryNow.latitude" 
## [63] "death.place.countryNow.longitude" "death.place.continent.en"        
## [65] "death.place.continent.no"         "death.place.continent.se"        
## [67] "death.place.locationString.en"    "death.place.locationString.no"   
## [69] "death.place.locationString.se"

6. Countries That Have Produced the Most Nobel Laureates

country_counts <- laureates_data$laureates %>%
  filter(!is.na(birth.place.countryNow.en)) %>%
  count(birth.place.countryNow.en, sort = TRUE)

kable(head(country_counts, 10),
      caption = "Top 10 Countries by Number of Nobel Laureates")
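Top 10 Countries by Number of Nobel Laureates

| birth.place.countryNow.en |  n |
|:--------------------------|---:|
| USA                       |  4 |
| Germany                   |  3 |
| Israel                    |  2 |
| Japan                     |  2 |
| Algeria                   |  1 |
| Argentina                 |  1 |
| Belgium                   |  1 |
| Denmark                   |  1 |
| Egypt                     |  1 |
| Ethiopia                  |  1 |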

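Observation: In this sample, the USA (4 laureates) and Germany (3) lead, followed by Israel and Japan (2 each). Countries are counted by present-day country of birth (birth.place.countryNow.en), so the table reflects birthplace rather than nationality or affiliation. Because the request appears to have returned only the API's default first page of 25 records, these counts are too small to rank countries across the full history of the prize, and any link to education or research investment would need the complete dataset to test.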


7. Average Age of Laureates at the Time of the Award

age_by_category <- laureates_data$laureates %>%
  tidyr::unnest(nobelPrizes, names_sep = "_") %>%
  filter(!is.na(birth.date), !is.na(nobelPrizes_awardYear)) %>%
  mutate(
    age_at_award = as.numeric(nobelPrizes_awardYear) -
      as.numeric(substr(birth.date, 1, 4))
  ) %>%
  group_by(nobelPrizes_category.en) %>%
  summarise(avg_age = round(mean(age_at_award, na.rm = TRUE), 1),
            .groups = "drop") %>%
  arrange(nobelPrizes_category.en)

kable(age_by_category, caption = "Average Age at Award by Category")
Average Age at Award by Category

| nobelPrizes_category.en | avg_age |
|:------------------------|--------:|
| Chemistry               |    62.0 |
| Economic Sciences       |    58.0 |
| Literature              |    58.5 |
| Peace                   |    50.3 |
| Physics                 |    55.6 |
| Physiology or Medicine  |    62.5 |

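Observation: In this sample, the average age at award ranges from 50.3 (Peace) to 62.5 (Physiology or Medicine). Age is approximated as award year minus birth year, so an individual value can overstate the true age by one year, and laureates without a recorded birth date are dropped. The pattern is consistent with major contributions being acknowledged after decades of research, leadership, or creative work, although each average rests on only a few laureates.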
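
The year-difference approximation can be compared with an age computed from full dates. The sketch below is one possible way to do that in base R; the helper name `age_on` is hypothetical, the birth dates are made-up illustration values, and using 10 December of the award year as the reference date is an assumption, not something taken from the API data.

```r
# Hypothetical helper: age in completed years on a given date.
age_on <- function(birth, on) {
  b <- as.POSIXlt(as.Date(birth))
  o <- as.POSIXlt(as.Date(on))
  # TRUE when the birthday has already occurred in the reference year.
  had_birthday <- (o$mon > b$mon) | (o$mon == b$mon & o$mday >= b$mday)
  (o$year - b$year) - ifelse(had_birthday, 0, 1)
}

# Made-up example: two people born in 1950, both "awarded" in 2010.
births <- c("1950-03-01", "1950-12-20")
award_date <- "2010-12-10"  # assumed reference date

year_difference <- 2010 - as.numeric(substr(births, 1, 4))
exact_age <- age_on(births, award_date)

data.frame(births, year_difference, exact_age)
```

The second person's birthday falls after the reference date, so the year difference is one higher than the age in completed years. A helper like this only works when the full birth date is known, which is one reason the simpler year difference may be preferable here.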


8. Nobel Prizes Awarded per Decade

prizes_by_decade <- nobel_data$nobelPrizes %>%
  mutate(decade = (as.numeric(awardYear) %/% 10) * 10) %>%
  count(decade) %>%
  arrange(decade)

kable(prizes_by_decade, caption = "Number of Nobel Prizes Awarded by Decade")
Number of Nobel Prizes Awarded by Decade

| decade |  n |
|-------:|---:|
|   1900 | 25 |

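Observation: The table contains a single row because the nobelPrizes request returned only the API's default first page of 25 records, all of which fall in the 1900s. The figure is therefore a limit of the request, not the number of prizes awarded in that decade. Answering the question for every decade requires retrieving the remaining records, for example with the API's limit and offset query parameters.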
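
The decade is obtained by integer division: dividing the year by 10 with `%/%` drops the last digit, and multiplying by 10 restores the scale. A small worked example in base R, using made-up years rather than API data:

```r
# Made-up award years chosen to sit on decade boundaries.
years <- c(1901, 1909, 1910, 1999, 2000)

# Integer division drops the final digit; multiplying by 10 gives the decade.
decades <- (years %/% 10) * 10

# Tally how many years fall into each decade.
decade_table <- table(decades)

data.frame(years, decades)
decade_table
```

Years 1901 and 1909 both map to 1900, while 1910 starts a new bin, so a year ending in 0 belongs to the decade it opens.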


9. Distribution of Nobel Laureates by Continent

continent_counts <- laureates_data$laureates %>%
  filter(!is.na(birth.place.continent.en)) %>%
  count(birth.place.continent.en, sort = TRUE)

kable(continent_counts, caption = "Number of Nobel Laureates by Continent")
Number of Nobel Laureates by Continent

| birth.place.continent.en |  n |
|:-------------------------|---:|
| Europe                   |  9 |
| Asia                     |  6 |
| North America            |  4 |
| Africa                   |  3 |
| Oceania                  |  1 |
| South America            |  1 |

Observation: In this sample, Europe leads with 9 laureates, followed by Asia (6) and North America (4). Only 24 laureates in the sample have a recorded birth continent, so the ranking is indicative at best and should be rechecked against the full dataset before drawing conclusions about regional trends.
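
Shares can make a small table like this easier to read than raw counts. The sketch below re-enters the counts from the table above by hand and converts them to percentages; the object name `continent_sample` and the column name `share_pct` are chosen for illustration only.

```r
# Counts copied from the continent table above.
continent_sample <- data.frame(
  continent = c("Europe", "Asia", "North America",
                "Africa", "Oceania", "South America"),
  n = c(9, 6, 4, 3, 1, 1)
)

# Share of each continent among laureates with a recorded birth continent.
continent_sample$share_pct <- round(
  100 * continent_sample$n / sum(continent_sample$n), 1
)

continent_sample
```

In the pipeline used in this report, the same column could be added with `mutate(share_pct = round(100 * n / sum(n), 1))` after the `count()` step.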


10. Conclusion

Through the Nobel Prize API, I extracted and analyzed JSON data about laureates and their awards. This analysis demonstrated how to connect to an open API, parse nested JSON data, and summarize it with R.

The country, age, and continent tables describe patterns in the first page of results only. The main lesson is that the API pages its responses, so the full dataset must be retrieved before drawing conclusions about the historical and global factors shaping Nobel recognition.