Exoplanets

Author

Robert Gravatt

Artist’s conception of the surface of Proxima Centauri b. The Alpha Centauri AB binary system can be seen in the distance, to the upper right of Proxima, as two white dots.

Load NASA Exoplanet Archive

library(tidyverse)

exoplanets <- read_csv("Exoplanets.csv")
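The plots below depend on a handful of column names from the archive export. As a minimal sketch, assuming the CSV keeps the archive's default column names, the following reports which of those columns are missing. The helper check_columns is hypothetical and not part of any package.

```r
# Columns used by the plots in this document
required_cols <- c("pl_rade", "pl_orbper", "discoverymethod", "disc_year",
                   "st_spectype", "st_teff", "sy_pnum", "sy_snum")

# Hypothetical helper: returns the required columns that `df` lacks
check_columns <- function(df, required = required_cols) {
  setdiff(required, names(df))
}
```

Calling check_columns(exoplanets) should return an empty character vector when every column is present.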

Plot the Radii of Known Exoplanets

# Clean dataset
exoplanets_clean <- exoplanets |>
  filter(!is.na(pl_rade))

# Refined histogram
exoplanets_clean |>
  ggplot(aes(x = pl_rade)) +
  geom_histogram(bins = 50, fill = "forestgreen", color = "white") +
  scale_x_log10() +
  labs(
    title = "Distribution of Exoplanet Radii",
    x = "Planet Radius (Earth radii, log scale)",
    y = "Count"
  ) +
  theme_minimal()

Plot Interpretation

This histogram displays the distribution of exoplanet radii on a logarithmic scale, revealing a clear bimodal pattern. The first peak occurs around 2-3 Earth radii, corresponding to super-Earths and sub-Neptunes, while the second peak appears near 15 Earth radii, representing gas giants of roughly Jupiter's size or larger (Jupiter is about 11 Earth radii; Neptune, at about 4 Earth radii, sits closer to the first peak). The dip between these peaks reflects the relative scarcity of intermediate-sized planets. It should not be confused with the “radius gap” [1], a narrower deficit near 1.5-2 Earth radii between super-Earths and sub-Neptunes, which may result from atmospheric loss or formation dynamics. Overall, the plot highlights two dominant populations in the exoplanet census: small planets of a few Earth radii and large gaseous planets.

[1] Radius gap. Wikipedia, Wikimedia Foundation, 14 Nov. 2025, en.wikipedia.org/wiki/Radius_gap.
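Peak positions read off a histogram by eye can be checked numerically. The sketch below bins the radii on a log10 axis and returns the bins that are higher than both neighbours. The helper log_hist_peaks is hypothetical, and the result depends on the bin count, so a noisy histogram may report spurious minor peaks.

```r
# Hypothetical helper: histogram counts on a log10 radius axis and the
# interior bins that are local maxima (higher than both neighbours).
# Requires bins >= 3.
log_hist_peaks <- function(radius, bins = 50) {
  x <- log10(radius[!is.na(radius) & radius > 0])
  breaks <- seq(min(x), max(x), length.out = bins + 1)
  h <- hist(x, breaks = breaks, plot = FALSE)
  n <- h$counts
  k <- length(n)
  inner <- 2:(k - 1)
  is_peak <- c(FALSE, n[inner] > n[inner - 1] & n[inner] > n[inner + 1], FALSE)
  # Convert bin midpoints back from log10 to Earth radii
  data.frame(radius_mid = 10^h$mids[is_peak], count = n[is_peak])
}
```

Running log_hist_peaks(exoplanets_clean$pl_rade) and sorting by count would list the candidate peaks, with the two largest expected to correspond to the modes described above.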

Exoplanet Radii Faceted by Discovery Method

# Clean dataset
exoplanets_clean <- exoplanets |>
  filter(!is.na(pl_rade), !is.na(discoverymethod))

# a custom palette of 11 vivid colors
vivid_colors <- c(
  "forestgreen",
  "cyan",
  "royalblue",
  "darkorchid",
  "goldenrod",
  "dodgerblue",
  "darkorange",
  "deeppink",
  "steelblue",
  "mediumseagreen",
  "brown"
)

# Histogram faceted by discovery method with vivid colors
exoplanets_clean |>
  ggplot(aes(x = pl_rade, fill = discoverymethod)) +
  geom_histogram(bins = 40) +
  scale_x_log10() +
  scale_fill_manual(values = vivid_colors) +
  labs(
    title = "Distribution of Exoplanet Radii by Discovery Method",
    x = "Planet Radius (Earth radii, log scale)",
    y = "Count"
  ) +
  facet_wrap(~discoverymethod, scales = "free_y") +
  theme_minimal() +
  theme(
    legend.position = "none",              # remove legend
    strip.text = element_text(size = 7),   # small facet titles
    plot.title = element_text(size = 12, face = "bold")
  )

Interpretation of Plots

This faceted histogram reveals how different exoplanet discovery methods are biased toward detecting planets of specific sizes. Each panel shows the distribution of planet radii for a particular method, plotted on a logarithmic scale to accommodate the wide range of sizes—from Earth-like rocky planets to massive gas giants.

Transit and radial velocity methods dominate the dataset and show broad distributions. Transit detections peak around 1–3 Earth radii, reflecting the method's sensitivity to the small dips in starlight caused by planets that periodically cross their star. Radial velocity detections, on the other hand, favor larger planets (5–20 Earth radii), since massive planets induce stronger stellar wobbles that are easier to measure.

Microlensing and direct imaging methods show distinct patterns. Microlensing tends to detect planets at intermediate radii, often several Earth radii, due to its reliance on gravitational lensing events. Direct imaging skews heavily toward very large planets because only massive, widely separated planets are bright enough to be resolved against their host stars.

Other methods like astrometry and eclipse timing show narrower distributions, often with fewer detections, reflecting their more specialized or less mature status. Overall, the plot highlights how observational technique shapes our view of planetary populations, with each method contributing a unique slice of the exoplanet census.

[2] Methods of Detecting Exoplanets. Wikipedia, Wikimedia Foundation, 14 Nov. 2025, en.wikipedia.org/wiki/Methods_of_detecting_exoplanets.
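The size biases described above can also be summarised as a table, which makes small-sample methods easy to spot. This is a sketch only, and radius_by_method is a hypothetical helper name.

```r
library(dplyr)

# Hypothetical helper: per-method sample size and radius quartiles
radius_by_method <- function(df) {
  df |>
    filter(!is.na(pl_rade), !is.na(discoverymethod)) |>
    group_by(discoverymethod) |>
    summarise(
      n = n(),
      q25 = unname(quantile(pl_rade, 0.25)),
      median_rade = median(pl_rade),
      q75 = unname(quantile(pl_rade, 0.75)),
      .groups = "drop"
    ) |>
    arrange(desc(n))   # most productive methods first
}
```

Calling radius_by_method(exoplanets) gives one row per discovery method. Methods with only a handful of planets deserve caution when reading their facet.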

Exoplanet Radii by Host Star Luminosity

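# Clean dataset: keep planets with radius and spectral type, drop NA luminosity classes
exoplanets_clean <- exoplanets |>
  filter(!is.na(pl_rade), !is.na(st_spectype)) |>
  mutate(
    # Extract I, II, III, or V. Longer numerals are tried first, and the
    # lookarounds reject a numeral that touches another I or V, so
    # subgiants ("IV") are dropped rather than misread as class "I".
    lum_class = str_extract(st_spectype, "(?<![IV])(III|II|I|V)(?![IV])")
  ) |>
  filter(!is.na(lum_class))   # drop NA classes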

# Histogram of planet radius faceted by luminosity class
exoplanets_clean |>
  ggplot(aes(x = pl_rade, fill = lum_class)) +
  geom_histogram(bins = 40, alpha = 0.7) +
  scale_x_log10() +
  scale_fill_manual(values = c("I" = "firebrick",
                               "II" = "goldenrod",
                               "III" = "forestgreen",
                               "V" = "royalblue")) +
  labs(
    title = "Exoplanet Radius Distribution by Host Star Luminosity Class",
    x = "Planet Radius (Earth radii, log scale)",
    y = "Count",
    fill = "Luminosity Class"
  ) +
  facet_wrap(~lum_class, scales = "free_y") +
  theme_minimal() +
  theme(
    legend.position = "none",                 # remove legend
    strip.text = element_text(size = 9),
    plot.title = element_text(size = 12, face = "bold")
  )

Interpretation

This faceted histogram shows how the distribution of exoplanet sizes varies depending on the luminosity class of their host stars. Planets around main sequence stars (V) are the most numerous, with a broad spread from Earth‑sized worlds up to gas giants, reflecting the fact that most surveys target Sun‑like stars. Systems around giant stars (III) tend to host larger planets, since close‑in small planets may be engulfed as the star expands. Supergiants (I) and bright giants (II) appear less frequently in the dataset, but when present they are more often associated with massive planets, consistent with observational biases and stellar evolution effects. Overall, the plot highlights that stellar type strongly influences both the detectability and survival of planets, with dwarfs yielding the richest diversity and giants skewing toward larger companions.

[3] Planet-hosting star. Wikipedia, Wikimedia Foundation, 14 Nov. 2025, en.wikipedia.org/wiki/Planet-hosting_star.
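Pulling the luminosity class out of a spectral type string is easy to get wrong because the Roman numerals overlap. A pattern that tries "I" before "IV" reads a subgiant as a supergiant. The sketch below compares a naive pattern with a guarded one on a few illustrative strings, which are made up for the example and not taken from the dataset.

```r
library(stringr)

# Illustrative spectral type strings (not taken from the dataset)
spec <- c("G2 V", "K0 IV", "K2 III", "K1 II", "M2 Iab")

# Naive pattern: for "K0 IV" it matches the leading "I" and stops
naive <- str_extract(spec, "(I{1,3}|II|V)")

# Guarded pattern: longest numerals first, and no I or V directly
# before or after the match, so "IV" is left unmatched (NA)
guarded <- str_extract(spec, "(?<![IV])(III|II|I|V)(?![IV])")

data.frame(spec, naive, guarded)
```

With the naive pattern the subgiant "K0 IV" is labelled "I", while the guarded pattern returns NA for it and agrees with the naive one on the other four strings.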

Bar Chart of Number of Planets per Stellar System

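# Keep one row per host star so each system is counted once rather than
# once per planet; this assumes the CSV includes the archive's `hostname` column
exoplanets |>
  distinct(hostname, .keep_all = TRUE) |>
  ggplot(aes(x = sy_pnum)) +
  geom_bar(fill = "steelblue") +
  labs(title = "Number of Planets per System",
       x = "Planets in System", y = "Number of Systems") +
  theme_minimal()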

Interpretation

This bar chart shows that most known exoplanet systems contain only a single detected planet, with progressively fewer systems hosting multiple planets.
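Because the table has one row per planet, a statement about systems is best checked on a one-row-per-host summary. This is a sketch under the assumption that the data include a hostname column identifying the host star; systems_by_planet_count is a hypothetical helper.

```r
library(dplyr)

# Hypothetical helper: collapse to one row per host star, then count
# systems by their number of planets and compute each group's share.
# Assumes a `hostname` column identifying the host star.
systems_by_planet_count <- function(df) {
  df |>
    distinct(hostname, sy_pnum) |>
    count(sy_pnum, name = "n_systems") |>
    mutate(share = n_systems / sum(n_systems))
}
```

The share column from systems_by_planet_count(exoplanets) gives the fraction of systems with one, two, or more detected planets.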

Host Star Temperature by Year of Discovery

exoplanets |>
  group_by(disc_year) |>
  summarize(mean_teff = mean(st_teff, na.rm = TRUE)) |>
  ggplot(aes(x = disc_year, y = mean_teff)) +
  geom_line(color = "purple") +
  labs(title = "Average Host Star Temperature by Year of Discovery",
       x = "Discovery Year", y = "Mean Temperature (K)") +
  theme_minimal()

Plot Interpretation

Kepler launched in 2009, so by 2010 it had only just begun operations. It was designed to monitor over 150,000 stars, most of them Sun-like main-sequence stars, largely of spectral types F and G (G-type stars have effective temperatures of roughly 5,300–6,000 K, F-type stars roughly 6,000–7,500 K). These stars were chosen because they are bright enough for precise photometry and similar to the Sun, making them suitable targets for detecting Earth-sized planets via the transit method.

The spike in 2010 coincides with the first wave of Kepler discoveries, which began a rapid increase in the number of known exoplanets. Unlike earlier radial velocity surveys that favored nearby, cooler stars, Kepler’s wide-field approach focused on Sun-like and somewhat hotter stars in a distant patch of the Milky Way. This likely explains why the NASA dataset shows an average host star temperature of about 6500 K in 2010.

[4] Kepler Space Telescope. Wikipedia, Wikimedia Foundation, 14 Nov. 2025, en.wikipedia.org/wiki/Kepler_space_telescope.
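A yearly mean can swing sharply when a year contains few planets or a few very hot hosts, so it helps to look at the sample size and median alongside it. The following is a sketch only, and teff_by_year is a hypothetical helper name.

```r
library(dplyr)

# Hypothetical helper: per-year count, mean, and median host temperature
teff_by_year <- function(df) {
  df |>
    filter(!is.na(disc_year), !is.na(st_teff)) |>
    group_by(disc_year) |>
    summarise(
      n_planets = n(),
      mean_teff = mean(st_teff),
      median_teff = median(st_teff),
      .groups = "drop"
    )
}
```

If the 2010 row of teff_by_year(exoplanets) shows a mean well above the median, a small number of unusually hot host stars may be driving the spike.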

Diagram of Exoplanet Radius versus Orbital Period for Binary Star Systems

library(scales)   # label_comma() and other formatters

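# sy_snum is the number of stars in the system, so this keeps binaries and
# higher-order multiples, whether the planet orbits one star or both
binary_planets <- exoplanets |>
  filter(!is.na(pl_orbper), !is.na(pl_rade), sy_snum >= 2)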

ggplot(binary_planets, aes(x = pl_orbper, y = pl_rade, color = discoverymethod)) +
  geom_point(alpha = 0.6) +
  scale_x_log10(labels = label_comma()) +   # commas instead of 1e+ notation
  scale_y_log10(labels = label_comma()) +
  labs(
    title = "Exoplanet Radius vs Orbital Period (Binary Star Systems)",
    x = "Orbital Period (days, log scale)",
    y = "Planet Radius (Earth radii, log scale)",
    color = "Discovery Method"
  ) +
  theme_minimal()

Interpretation

Exoplanet discoveries in binary star systems reveal how stellar multiplicity influences planet formation and detection. When planet radius is plotted against orbital period for these systems, many of the planets turn out to be large, and most were detected through radial velocity or transit methods. This reflects observational biases and the difficulty of detecting smaller, longer-period planets in multi-star environments. The pattern highlights both the resilience of planet formation in complex gravitational settings and the limitations of current detection techniques, offering insight into how planetary systems differ when more than one star is present.

[5] Circumbinary Planet. Wikipedia, Wikimedia Foundation, 17 Nov. 2025, en.wikipedia.org/wiki/Circumbinary_planet.
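Whether planets in multi-star systems really tend to be larger can be checked against single-star systems directly. The sketch below compares sample size, median radius, and median period for the two groups; compare_by_multiplicity is a hypothetical helper, and a difference in medians does not by itself separate a physical effect from a detection bias.

```r
library(dplyr)

# Hypothetical helper: compare planets in single-star systems with
# planets in systems of two or more stars
compare_by_multiplicity <- function(df) {
  df |>
    filter(!is.na(pl_orbper), !is.na(pl_rade), !is.na(sy_snum)) |>
    mutate(multi_star = sy_snum >= 2) |>
    group_by(multi_star) |>
    summarise(
      n = n(),
      median_rade = median(pl_rade),
      median_orbper = median(pl_orbper),
      .groups = "drop"
    )
}
```

Calling compare_by_multiplicity(exoplanets) returns two rows, one for single-star systems and one for multi-star systems.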