Introduction

This report examines worldwide earthquakes recorded during the month of January, 2026, compiled by the USGS.

For each recorded earthquake, this dataset includes:

  • time
  • depth (kilometers)
  • magnitude
  • depth error (kilometers)
  • magnitude error
  • and more not used in this report

An analysis of the data through ggplot visualizations will shed light on whether the depth of an earthquake can predict its magnitude, and vice versa.

Data Preparation

Load necessary packages

# load libraries

library(tidyverse)
library(here)
library(ggbeeswarm)
library(RColorBrewer)

Read in necessary data

# assign csv data to variable eqs

eqs <- read_csv(here("data", "world_eqs_jan_26.csv"))

Clean the data

We will be focusing on earthquakes with a magnitude error of less than 0.5, and a depth error of less than 10 km.

# filter data by magnitude and depth

eqs %>%
  filter(magError < 0.5, depthError < 10)
## # A tibble: 9,649 × 22
##    time                latitude longitude depth   mag magType   nst   gap
##    <dttm>                 <dbl>     <dbl> <dbl> <dbl> <chr>   <dbl> <dbl>
##  1 2026-01-01 00:00:43     38.8    -123.   2.04  1.03 md         18    54
##  2 2026-01-01 00:03:02     60.4    -140.   5     3.2  ml         37    78
##  3 2026-01-01 00:05:12     34.0    -117.  14.1   0.74 ml         29    78
##  4 2026-01-01 00:06:41     28.7     -99.2  6.52  1.8  ml         21    59
##  5 2026-01-01 00:13:06     59.9    -142.  11.4   2.2  ml         16   203
##  6 2026-01-01 00:17:27     32.1    -101.   2.32  1    ml         13    84
##  7 2026-01-01 00:24:44     31.8    -102.   4.71  0.4  ml         13    72
##  8 2026-01-01 00:33:16     37.7    -122.   4.52  0.95 md         13    88
##  9 2026-01-01 00:39:57     38.8    -123.   1.70  1.61 md         26    42
## 10 2026-01-01 00:41:23     38.8    -123.   2.38  0.39 md         17    67
## # ℹ 9,639 more rows
## # ℹ 14 more variables: dmin <dbl>, rms <dbl>, net <chr>, id <chr>,
## #   updated <dttm>, place <chr>, type <chr>, horizontalError <dbl>,
## #   depthError <dbl>, magError <dbl>, magNst <dbl>, status <chr>,
## #   locationSource <chr>, magSource <chr>

The data contains a column with a timestamp for each earthquake occurrence. To plot the data by date alone, I use the mutate() function. This step is necessary in order to use the group_by() function in a meaningful way by grouping occurrences by date rather than by each individual timestamp.

# mutate time column as date

eqs_clean <- eqs %>%
  mutate(date = as.Date(as.character(time)))

Analysis

How many earthquakes occurred each day?

First, we need a basic understanding of how many earthquakes occurred each day in January 2016. We look at this with geom_col().

eqs_clean %>% # pipe the cleaned earthquake data into the following functions
  na.omit() %>% # omit records with any 'na' values
  group_by(date) %>% # group by newly mutated date
  summarize(obs = n()) %>% # gather the total number of observations for each date
  ggplot(aes(x = date, y = obs)) + # plot a bar chart with ggplot() and geom_col()
  geom_col(fill = "purple") +
  labs(
    title = "Number of earthquakes per day",
    subtitle = "Global seismic activity, January 2026",
    x = "Date",
    y = "Count",
    caption = "Data obtained from earthquake.usgs.gov/earthquakes/"
  ) +
  theme_classic()

The data shows a slightly left-skewed distribution of earthquakes occurring in January. Let’s dive deeper into the data.

Analyzing overall magnitude

To properly visualize the differences in earthquake depth, I categorize depth into the three zones provided by the USGS:

  • Shallow (0-700km)
  • Intermediate (70-300km)
  • Deep (300-700km)
eqs_clean %>% # pipe the earthquake data
  na.omit() %>% # omit 'na' values
  
  # use case_when() to define depth categories
  mutate(category = case_when (
         depth <= 70 ~ "Shallow (0-70 km)",
         depth <= 300 ~ "Intermediate (70-300 km)",
         depth > 300 ~ "Deep (300+ km)")) %>%
  ggplot(aes(x = mag)) + # plot histograms with ggplot() and geom_histogram()
  geom_histogram(binwidth = 0.5, fill = "white", color = "purple") +
  facet_wrap(~ category) + # facet_wrap() creates 3 histograms for each category
  labs(
    title = "Global Earthquake Magnitudes",
    subtitle = "Global seismic activity, January 2026",
    x = "Magnitude",
    y = "Count",
    caption = "Data obtained from earthquake.usgs.gov/earthquakes/"
  ) +
  theme_classic()

The majority of earthquakes in January 2026 were low in magnitude, with a few larger magnitude events.

We can also get a general sense that the majority of earthquakes occurred in the shallow depth category.

Are deeper earthquakes stronger in magnitude?

We’ll get a clearer picture of the potential relationship between magnitude and depth with a geom_jitter() plot focusing on these two variables.

eqs_clean %>% # pipe the earthquake data
  na.omit() %>% # omit 'na' values
  ggplot(aes(x = depth, y = mag, color = depth)) +
  geom_jitter() + # add more variation to point locations
  labs(
    title = "Does depth predict magnitude?",
    subtitle = "Global seismic activity, January 2026",
    x = "Depth (km)",
    y = "Magnitude",
    color = "Depth (km)",
    caption = "Data obtained from earthquake.usgs.gov/earthquakes/"
  ) +
  theme_classic() +
  scale_color_gradient(low = "blue", high = "red")

The plot shows more red-colored, deeper earthquakes appear to be of a higher magnitude. This would support our assertion that deeper earthquakes have generally higher magnitudes.

To strengthen this observation, we will next evaluate the data and model’s reliability.

Examining the magnitude vs. depth relationship

Next, we need to calculate the data reliability with the standard deviation and standard error of magnitude.

eqs_clean %>% # pipe the earthquake data
  na.omit() %>% # omit 'na' values
  
  # finding model reliability with standard deviation and standard error of magnitude
  group_by(depth) %>%
  summarise(mean = mean(mag), # find the mean magnitude
            sd = sd(mag), # create a standard deviation column
            n = n(), # create a count column
            stderr = sd / sqrt(n)) %>% # create a standard error column
  ggplot(aes(x = depth, y = mean)) +
  geom_point() +
  geom_smooth() +
  geom_errorbar(aes(ymin = mean - stderr, ymax = mean + stderr), # display thin error bars
                alpha = 0.2, width = 0) +
  labs(
    title = "Mean magnitude by depth with standard error",
    subtitle = "Error bar shows USGS-reported measurement uncertainty, January 2026",
    x = "Depth (km)",
    y = "Mean magnitude"
  ) +
  theme_classic()

ggsave("earthquakes.png") # save the visualization as a png file

The graph supports the assertion that deeper earthquakes occur at higher magnitudes. The curve exhibits a steep rise in magnitude at shallow depths, and levels off ~200 km depth. The deeper earthquakes seem steadily high in magnitude. However, there are notably fewer deep earthquake occurrences than shallow. Interestingly, the shallow earthquakes show high variability and unpredictability.

Although this analysis of earthquakes in January 2026 would support the assertion that deeper earthquakes are higher in magnitude, this is only a surface level (no pun intended) observation. Further analyses accounting for variables such as aftershocks, geographic location, seasonality, and many other confounders would need to be conducted in order to come to a firmer conclusion.