For this assignment, I worked with the Nobel Prize API, which provides open data about Nobel laureates and prizes in JSON format. The data includes information such as laureates’ birthplaces, nationalities, affiliations, prize categories, and award years.
The goal of this analysis was to retrieve data directly from the
Nobel Prize API and answer four questions:
1. Which countries have produced the most Nobel laureates?
2. What is the average age of laureates when they received their Nobel
Prize (by category)?
3. How many Nobel Prizes have been awarded per decade?
4. Which continents have produced the most Nobel laureates?
library(httr)
library(jsonlite)
library(dplyr)
library(tidyr)
library(knitr)
library(purrr)
library(lubridate)
The Nobel Prize API provides two main endpoints:
laureates_response <- GET("https://api.nobelprize.org/2.1/laureates")
nobel_response <- GET("https://api.nobelprize.org/2.1/nobelPrizes")
# Check status (should be 200)
status_code(laureates_response)
## [1] 200
status_code(nobel_response)
## [1] 200
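Both requests above are sent without query parameters, so each returns only the API's first page of records. The following is a minimal sketch of a request helper that asks for an explicit page and stops on an HTTP error. The helper names `nobel_url` and `fetch_nobel` are hypothetical, and the `limit` and `offset` query parameters are assumptions that should be checked against the API documentation.

```r
library(httr)
library(jsonlite)

# Build the URL for one page of an endpoint.
# `limit` and `offset` are assumed query parameter names.
nobel_url <- function(endpoint, limit = 25, offset = 0) {
  modify_url(
    "https://api.nobelprize.org",
    path  = paste0("2.1/", endpoint),
    query = list(limit = limit, offset = offset)
  )
}

# Fetch and parse one page; stop with an error on a 4xx or 5xx status.
fetch_nobel <- function(endpoint, limit = 25, offset = 0) {
  response <- GET(nobel_url(endpoint, limit, offset))
  stop_for_status(response)
  fromJSON(content(response, as = "text", encoding = "UTF-8"), flatten = TRUE)
}

# Inspect the URL without sending a request
nobel_url("laureates", limit = 100)
```

Calling `fetch_nobel("laureates", limit = 100)` would send the request. Keeping URL construction in its own function makes it possible to check the query string before any network traffic occurs.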
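# Read the response bodies as text; stating the encoding avoids httr's guessing message
laureates_json <- content(laureates_response, as = "text", encoding = "UTF-8")
nobel_json <- content(nobel_response, as = "text", encoding = "UTF-8")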
laureates_data <- fromJSON(laureates_json, flatten = TRUE)
nobel_data <- fromJSON(nobel_json, flatten = TRUE)
names(laureates_data)
## [1] "laureates" "meta" "links"
names(nobel_data)
## [1] "nobelPrizes" "meta" "links"
names(laureates_data$laureates)
## [1] "id" "fileName"
## [3] "gender" "sameAs"
## [5] "links" "nobelPrizes"
## [7] "knownName.en" "knownName.se"
## [9] "givenName.en" "givenName.se"
## [11] "familyName.en" "familyName.se"
## [13] "fullName.en" "fullName.se"
## [15] "birth.date" "birth.place.city.en"
## [17] "birth.place.city.no" "birth.place.city.se"
## [19] "birth.place.country.en" "birth.place.country.no"
## [21] "birth.place.country.se" "birth.place.cityNow.en"
## [23] "birth.place.cityNow.no" "birth.place.cityNow.se"
## [25] "birth.place.cityNow.sameAs" "birth.place.cityNow.latitude"
## [27] "birth.place.cityNow.longitude" "birth.place.countryNow.en"
## [29] "birth.place.countryNow.no" "birth.place.countryNow.se"
## [31] "birth.place.countryNow.sameAs" "birth.place.countryNow.latitude"
## [33] "birth.place.countryNow.longitude" "birth.place.continent.en"
## [35] "birth.place.continent.no" "birth.place.continent.se"
## [37] "birth.place.locationString.en" "birth.place.locationString.no"
## [39] "birth.place.locationString.se" "wikipedia.slug"
## [41] "wikipedia.english" "wikidata.id"
## [43] "wikidata.url" "death.date"
## [45] "death.place.city.en" "death.place.city.no"
## [47] "death.place.city.se" "death.place.country.en"
## [49] "death.place.country.no" "death.place.country.se"
## [51] "death.place.country.sameAs" "death.place.cityNow.en"
## [53] "death.place.cityNow.no" "death.place.cityNow.se"
## [55] "death.place.cityNow.sameAs" "death.place.cityNow.latitude"
## [57] "death.place.cityNow.longitude" "death.place.countryNow.en"
## [59] "death.place.countryNow.no" "death.place.countryNow.se"
## [61] "death.place.countryNow.sameAs" "death.place.countryNow.latitude"
## [63] "death.place.countryNow.longitude" "death.place.continent.en"
## [65] "death.place.continent.no" "death.place.continent.se"
## [67] "death.place.locationString.en" "death.place.locationString.no"
## [69] "death.place.locationString.se"
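The dotted column names listed above come from `flatten = TRUE`, which turns nested JSON objects into one column per leaf field. A small illustration on a made-up record follows; the values are hypothetical and only the shape mirrors the laureate data.

```r
library(jsonlite)

# A made-up record shaped like a laureate object (hypothetical values)
json <- '[
  {"id": "1",
   "knownName": {"en": "Example Person"},
   "birth": {"date": "1900-01-01",
             "place": {"countryNow": {"en": "Exampleland"}}}}
]'

nested <- fromJSON(json)                  # nested objects stay as data-frame columns
flat   <- fromJSON(json, flatten = TRUE)  # nested objects become dotted columns

names(nested)
names(flat)
```

In the flattened version, a field such as the birth country can be used directly as a column, which is why the later steps can refer to names like `birth.place.countryNow.en` without further unpacking.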
# Count laureates by the present-day country of their birthplace
country_counts <- laureates_data$laureates %>%
  filter(!is.na(birth.place.countryNow.en)) %>%
  count(birth.place.countryNow.en, sort = TRUE)
kable(head(country_counts, 10),
      caption = "Top 10 Countries by Number of Nobel Laureates")
| birth.place.countryNow.en | n |
|---|---|
| USA | 4 |
| Germany | 3 |
| Israel | 2 |
| Japan | 2 |
| Algeria | 1 |
| Argentina | 1 |
| Belgium | 1 |
| Denmark | 1 |
| Egypt | 1 |
| Ethiopia | 1 |
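Observation: In this result the USA (4 laureates) and Germany (3) rank highest, followed by Israel and Japan (2 each). The counts are small because the request returned only the API's first page (25 records by default), so the table describes that sample and not the full history of the prize. Countries are grouped by the present-day country of the birthplace (the countryNow field), which may differ from the country name at the time of birth.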
# One row per laureate-prize pair; age = award year minus birth year
age_by_category <- laureates_data$laureates %>%
  tidyr::unnest(nobelPrizes, names_sep = "_") %>%
  filter(!is.na(birth.date), !is.na(nobelPrizes_awardYear)) %>%
  mutate(
    age_at_award = as.numeric(nobelPrizes_awardYear) -
      as.numeric(substr(birth.date, 1, 4))
  ) %>%
  group_by(nobelPrizes_category.en) %>%
  summarise(avg_age = round(mean(age_at_award, na.rm = TRUE), 1),
            .groups = "drop") %>%
  arrange(nobelPrizes_category.en)
kable(age_by_category, caption = "Average Age at Award by Category")
| nobelPrizes_category.en | avg_age |
|---|---|
| Chemistry | 62.0 |
| Economic Sciences | 58.0 |
| Literature | 58.5 |
| Peace | 50.3 |
| Physics | 55.6 |
| Physiology or Medicine | 62.5 |
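Observation: In this sample the average age at award ranges from about 50 (Peace) to about 62 (Chemistry and Physiology or Medicine), with Physics, Economic Sciences, and Literature between 55 and 59. Age is computed as award year minus birth year, so each value can overstate the true age by up to one year. The pattern is consistent with contributions being recognized after decades of research, leadership, or creative work, though averages over so few laureates are unstable.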
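The age calculation relies on two steps: expanding the nested prize list into one row per laureate and prize, and then subtracting years. The sketch below repeats both steps on toy data and adds an exact age in completed years using lubridate. All values are hypothetical, and the `dateAwarded` column is an assumption about the prize records; any available award date field could be substituted.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Toy laureates with a list-column of prizes (hypothetical values)
laureates <- tibble(
  id = c("a", "b"),
  birth.date = c("1950-06-15", "1950-12-25"),
  nobelPrizes = list(
    tibble(awardYear = "2010", dateAwarded = "2010-10-05"),
    tibble(awardYear = "2010", dateAwarded = "2010-10-05")
  )
)

ages <- laureates %>%
  unnest(nobelPrizes, names_sep = "_") %>%  # one row per laureate-prize pair
  mutate(
    # year difference, as in the calculation above
    age_by_year = as.numeric(nobelPrizes_awardYear) -
      as.numeric(substr(birth.date, 1, 4)),
    # completed years between birth date and award date
    age_exact = floor(time_length(
      interval(ymd(birth.date), ymd(nobelPrizes_dateAwarded)), "years"
    ))
  )

ages %>% select(id, age_by_year, age_exact)
```

Both toy laureates have a year difference of 60, but laureate "b" has not yet had a birthday on the award date and is 59 in completed years. Dates that `ymd()` cannot parse become `NA`, so incomplete birth dates would need separate handling.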
# Floor each award year to its decade (for example 1905 becomes 1900), then count
prizes_by_decade <- nobel_data$nobelPrizes %>%
  mutate(decade = (as.numeric(awardYear) %/% 10) * 10) %>%
  count(decade) %>%
  arrange(decade)
kable(prizes_by_decade, caption = "Number of Nobel Prizes Awarded by Decade")
| decade | n |
|---|---|
| 1900 | 25 |
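Observation: The table contains a single decade with 25 prizes because the request to the nobelPrizes endpoint returned only the first 25 records, all from the 1900s. This is the API's default page size, not the number of prizes awarded in that decade, so the result does not yet answer the per-decade question; the remaining pages would have to be retrieved first.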
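A table with a single decade suggests that only one page of prizes was retrieved. The sketch below shows one way to collect every page before counting by decade. To stay runnable offline it pages through a made-up data frame; `fetch_all`, `fake_fetch`, and the 60 fake prizes are all hypothetical stand-ins, and a real fetcher would wrap the GET call with whatever paging parameters the API supports.

```r
library(dplyr)

# Collect pages until one comes back with fewer rows than requested.
# `fetch_page(offset, limit)` must return a data frame for that page.
fetch_all <- function(fetch_page, limit = 25) {
  pages  <- list()
  offset <- 0
  repeat {
    page <- fetch_page(offset, limit)
    pages[[length(pages) + 1]] <- page
    if (nrow(page) < limit) break
    offset <- offset + limit
  }
  bind_rows(pages)
}

# Stand-in for the API: 60 made-up prizes, one per year from 1901 (hypothetical)
fake_prizes <- data.frame(awardYear = as.character(1901:1960))
fake_fetch <- function(offset, limit) {
  rows <- seq_len(nrow(fake_prizes))
  fake_prizes[rows > offset & rows <= offset + limit, , drop = FALSE]
}

all_prizes <- fetch_all(fake_fetch, limit = 25)

# Same decade calculation as above, now over every page
decades <- all_prizes %>%
  mutate(decade = (as.numeric(awardYear) %/% 10) * 10) %>%
  count(decade)
decades
```

With the fake data the loop reads three pages (25, 25, and 10 rows) and the decade table has one row per decade from 1900 to 1960. Passing the fetcher as an argument keeps the paging logic testable without network access.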
# Count laureates by continent of birth
continent_counts <- laureates_data$laureates %>%
  filter(!is.na(birth.place.continent.en)) %>%
  count(birth.place.continent.en, sort = TRUE)
kable(continent_counts, caption = "Number of Nobel Laureates by Continent")
| birth.place.continent.en | n |
|---|---|
| Europe | 9 |
| Asia | 6 |
| North America | 4 |
| Africa | 3 |
| Oceania | 1 |
| South America | 1 |
Observation: In this sample Europe leads with 9 laureates, followed by Asia (6) and North America (4); Africa, Oceania, and South America account for the remaining 5. With only 24 laureates counted, the ranking is indicative at best and should be checked against the complete data set.
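Through the Nobel Prize API, I extracted and analyzed JSON data about laureates and their awards. This analysis demonstrated how to connect to an open API, parse nested JSON data, and derive summary statistics using R. Because the requests returned only the first page of each endpoint, the country, age, and continent figures describe a small sample, and the per-decade question remains open until all pages are retrieved.
Exploring country patterns, age at award, and regional distributions gave a first view of how Nobel recognition is distributed, to be confirmed on the complete data.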