This report examines worldwide earthquakes recorded during the month of January, 2026, compiled by the USGS.
An analysis of the data through ggplot visualizations will shed light on whether the depth of an earthquake can predict its magnitude, and vice versa.
# load libraries
library(tidyverse)
library(here)
library(ggbeeswarm)
library(RColorBrewer)
# assign csv data to variable eqs
eqs <- read_csv(here("data", "world_eqs_jan_26.csv"))
We will be focusing on earthquakes with a magnitude error of less than 0.5, and a depth error of less than 10 km.
# filter data by magnitude and depth
eqs %>%
filter(magError < 0.5, depthError < 10)
## # A tibble: 9,649 × 22
## time latitude longitude depth mag magType nst gap
## <dttm> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 2026-01-01 00:00:43 38.8 -123. 2.04 1.03 md 18 54
## 2 2026-01-01 00:03:02 60.4 -140. 5 3.2 ml 37 78
## 3 2026-01-01 00:05:12 34.0 -117. 14.1 0.74 ml 29 78
## 4 2026-01-01 00:06:41 28.7 -99.2 6.52 1.8 ml 21 59
## 5 2026-01-01 00:13:06 59.9 -142. 11.4 2.2 ml 16 203
## 6 2026-01-01 00:17:27 32.1 -101. 2.32 1 ml 13 84
## 7 2026-01-01 00:24:44 31.8 -102. 4.71 0.4 ml 13 72
## 8 2026-01-01 00:33:16 37.7 -122. 4.52 0.95 md 13 88
## 9 2026-01-01 00:39:57 38.8 -123. 1.70 1.61 md 26 42
## 10 2026-01-01 00:41:23 38.8 -123. 2.38 0.39 md 17 67
## # ℹ 9,639 more rows
## # ℹ 14 more variables: dmin <dbl>, rms <dbl>, net <chr>, id <chr>,
## # updated <dttm>, place <chr>, type <chr>, horizontalError <dbl>,
## # depthError <dbl>, magError <dbl>, magNst <dbl>, status <chr>,
## # locationSource <chr>, magSource <chr>
The data contains a column with a timestamp for each earthquake occurrence. To plot the data by date alone, I use the mutate() function. This step is necessary in order to use the group_by() function in a meaningful way by grouping occurrences by date rather than by each individual timestamp.
# mutate time column as date
eqs_clean <- eqs %>%
mutate(date = as.Date(as.character(time)))
First, we need a basic understanding of how many earthquakes occurred each day in January 2016. We look at this with geom_col().
eqs_clean %>% # pipe the cleaned earthquake data into the following functions
na.omit() %>% # omit records with any 'na' values
group_by(date) %>% # group by newly mutated date
summarize(obs = n()) %>% # gather the total number of observations for each date
ggplot(aes(x = date, y = obs)) + # plot a bar chart with ggplot() and geom_col()
geom_col(fill = "purple") +
labs(
title = "Number of earthquakes per day",
subtitle = "Global seismic activity, January 2026",
x = "Date",
y = "Count",
caption = "Data obtained from earthquake.usgs.gov/earthquakes/"
) +
theme_classic()
The data shows a slightly left-skewed distribution of earthquakes occurring in January. Let’s dive deeper into the data.
To properly visualize the differences in earthquake depth, I categorize depth into the three zones provided by the USGS:
eqs_clean %>% # pipe the earthquake data
na.omit() %>% # omit 'na' values
# use case_when() to define depth categories
mutate(category = case_when (
depth <= 70 ~ "Shallow (0-70 km)",
depth <= 300 ~ "Intermediate (70-300 km)",
depth > 300 ~ "Deep (300+ km)")) %>%
ggplot(aes(x = mag)) + # plot histograms with ggplot() and geom_histogram()
geom_histogram(binwidth = 0.5, fill = "white", color = "purple") +
facet_wrap(~ category) + # facet_wrap() creates 3 histograms for each category
labs(
title = "Global Earthquake Magnitudes",
subtitle = "Global seismic activity, January 2026",
x = "Magnitude",
y = "Count",
caption = "Data obtained from earthquake.usgs.gov/earthquakes/"
) +
theme_classic()
The majority of earthquakes in January 2026 were low in magnitude, with a few larger magnitude events.
We can also get a general sense that the majority of earthquakes occurred in the shallow depth category.
We’ll get a clearer picture of the potential relationship between magnitude and depth with a geom_jitter() plot focusing on these two variables.
eqs_clean %>% # pipe the earthquake data
na.omit() %>% # omit 'na' values
ggplot(aes(x = depth, y = mag, color = depth)) +
geom_jitter() + # add more variation to point locations
labs(
title = "Does depth predict magnitude?",
subtitle = "Global seismic activity, January 2026",
x = "Depth (km)",
y = "Magnitude",
color = "Depth (km)",
caption = "Data obtained from earthquake.usgs.gov/earthquakes/"
) +
theme_classic() +
scale_color_gradient(low = "blue", high = "red")
The plot shows more red-colored, deeper earthquakes appear to be of a higher magnitude. This would support our assertion that deeper earthquakes have generally higher magnitudes.
To strengthen this observation, we will next evaluate the data and model’s reliability.
Next, we need to calculate the data reliability with the standard deviation and standard error of magnitude.
eqs_clean %>% # pipe the earthquake data
na.omit() %>% # omit 'na' values
# finding model reliability with standard deviation and standard error of magnitude
group_by(depth) %>%
summarise(mean = mean(mag), # find the mean magnitude
sd = sd(mag), # create a standard deviation column
n = n(), # create a count column
stderr = sd / sqrt(n)) %>% # create a standard error column
ggplot(aes(x = depth, y = mean)) +
geom_point() +
geom_smooth() +
geom_errorbar(aes(ymin = mean - stderr, ymax = mean + stderr), # display thin error bars
alpha = 0.2, width = 0) +
labs(
title = "Mean magnitude by depth with standard error",
subtitle = "Error bar shows USGS-reported measurement uncertainty, January 2026",
x = "Depth (km)",
y = "Mean magnitude"
) +
theme_classic()
ggsave("earthquakes.png") # save the visualization as a png file
The graph supports the assertion that deeper earthquakes occur at higher magnitudes. The curve exhibits a steep rise in magnitude at shallow depths, and levels off ~200 km depth. The deeper earthquakes seem steadily high in magnitude. However, there are notably fewer deep earthquake occurrences than shallow. Interestingly, the shallow earthquakes show high variability and unpredictability.
Although this analysis of earthquakes in January 2026 would support the assertion that deeper earthquakes are higher in magnitude, this is only a surface level (no pun intended) observation. Further analyses accounting for variables such as aftershocks, geographic location, seasonality, and many other confounders would need to be conducted in order to come to a firmer conclusion.