Introduction

Indonesia is one of the most seismically active countries in the world, experiencing thousands of earthquakes each year. This high level of seismic activity is primarily due to its location along the Pacific Ring of Fire, a region characterized by intense tectonic activity, frequent earthquakes, and numerous volcanic eruptions. The interaction of several major tectonic plates in this area makes Indonesia particularly vulnerable to recurring seismic events, highlighting the importance of understanding both the spatial and temporal patterns of earthquake occurrences.

This study utilizes a comprehensive dataset that records earthquake events in Indonesia over the period from 2008 to 2022. The dataset includes key variables such as the date and time of each event, geographic coordinates (latitude and longitude), depth of the earthquake hypocenter measured in kilometers, and magnitude measured on the Richter scale. These variables provide essential information for analyzing both the location and intensity of seismic activity.

The main objective of this study is to examine the spatial distribution and temporal dynamics of earthquakes in Indonesia using point pattern analysis techniques. By applying both descriptive and quantitative methods, the study aims to determine whether earthquake occurrences are randomly distributed or exhibit significant clustering patterns. In addition, the analysis explores how earthquake intensity varies across space and over time. Through this approach, the study seeks to provide a clearer understanding of earthquake behavior in Indonesia and demonstrate the usefulness of spatial statistical methods in analyzing geophysical phenomena.

Setup

In this section, we load all required libraries and set repository options. These packages support data manipulation, spatial analysis, and visualization.

options(repos = c(CRAN = "https://cloud.r-project.org"))

library(dplyr)
## Warning: package 'dplyr' was built under R version 4.4.3
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(sf)
## Warning: package 'sf' was built under R version 4.4.3
## Linking to GEOS 3.13.0, GDAL 3.10.1, PROJ 9.5.1; sf_use_s2() is TRUE
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.4.3
library(spatstat)
## Warning: package 'spatstat' was built under R version 4.4.3
## Loading required package: spatstat.data
## Warning: package 'spatstat.data' was built under R version 4.4.3
## Loading required package: spatstat.univar
## Warning: package 'spatstat.univar' was built under R version 4.4.3
## spatstat.univar 3.1-6
## Loading required package: spatstat.geom
## Warning: package 'spatstat.geom' was built under R version 4.4.3
## spatstat.geom 3.7-0
## Loading required package: spatstat.random
## Warning: package 'spatstat.random' was built under R version 4.4.3
## spatstat.random 3.4-4
## Loading required package: spatstat.explore
## Warning: package 'spatstat.explore' was built under R version 4.4.3
## Loading required package: nlme
## 
## Attaching package: 'nlme'
## The following object is masked from 'package:dplyr':
## 
##     collapse
## spatstat.explore 3.7-0
## Loading required package: spatstat.model
## Warning: package 'spatstat.model' was built under R version 4.4.3
## Loading required package: rpart
## spatstat.model 3.6-1
## Loading required package: spatstat.linnet
## Warning: package 'spatstat.linnet' was built under R version 4.4.3
## spatstat.linnet 3.4-1
## 
## spatstat 3.5-1 
## For an introduction to spatstat, type 'beginner'
library(spatstat.geom)
library(rnaturalearth)
## Warning: package 'rnaturalearth' was built under R version 4.4.3
library(rnaturalearthdata)
## Warning: package 'rnaturalearthdata' was built under R version 4.4.3
## 
## Attaching package: 'rnaturalearthdata'
## The following object is masked from 'package:rnaturalearth':
## 
##     countries110
library(zoo)
## Warning: package 'zoo' was built under R version 4.4.2
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

Data Loading and Preparation

We load the earthquake dataset and retain only the variables relevant for spatial and temporal analysis. This ensures a clean and efficient workflow.

# Load dataset
eq <- read.csv("C:\\Users\\MAGWALI\\Desktop\\data.csv")

# Keep only relevant variables
eq <- eq %>%
  select(date, time, latitude, longitude, depth, magnitude)

# Check structure
glimpse(eq)
## Rows: 87,372
## Columns: 6
## $ date      <chr> "2008-11-01", "2008-11-01", "2008-11-01", "2008-11-01", "200…
## $ time      <chr> "00:31:25", "01:34:29", "01:38:14", "02:20:05", "02:32:18", …
## $ latitude  <dbl> -0.60, -6.61, -3.65, -4.20, -4.09, -3.76, -3.89, 0.49, -4.26…
## $ longitude <dbl> 98.89553, 129.38722, 127.99068, 128.09700, 128.20047, 127.38…
## $ depth     <dbl> 20.0, 30.1, 5.0, 5.0, 10.0, 10.0, 10.0, 15.9, 10.0, 10.0, 9.…
## $ magnitude <dbl> 2.99, 5.51, 3.54, 2.42, 2.41, 2.94, 2.22, 3.85, 3.24, 2.32, …

Study Area Map (Indonesia)

The map of Indonesia provides essential geographic context for the study by outlining the spatial boundaries within which the earthquake data are analyzed. It makes it easier to identify the location of the study area and understand that all subsequent spatial analyses are confined to this region. While the map does not present analytical results, it is important for interpreting the spatial patterns observed in later plots.

# Get world map
world <- ne_countries(scale = "medium", returnclass = "sf")

# Filter Indonesia
indonesia <- world[world$name == "Indonesia", ]

# Plot
ggplot(data = indonesia) +
  geom_sf(fill = "darkgreen", color = "black") +
  labs(title = "Map of Indonesia") +
  theme_minimal()

Convert to spatial data

We convert the dataset into an `sf` object and transform it into a projected coordinate system. This step is necessary for spatial point pattern analysis.

eq_sf <- st_as_sf(eq, coords = c("longitude", "latitude"), crs = 4326)
eq_sf <- st_transform(eq_sf, 3857)

Spatial Distribution of Earthquakes



The spatial distribution plot shows the geographic locations of all recorded earthquake events across Indonesia. The points are unevenly distributed and form visible clusters in certain regions rather than being spread uniformly. This pattern suggests that earthquake occurrences are influenced by underlying geological structures, such as tectonic plate boundaries, and are therefore not randomly distributed in space.

ggplot() +
  geom_sf(data = eq_sf, color = "blue", size = 0.3, alpha = 0.5) +
  theme_minimal() +
  labs(title = "Spatial distribution of earthquakes")

Magnitude Distribution


The histogram of earthquake magnitudes shows the frequency distribution of seismic events. The distribution is strongly right-skewed, indicating that most earthquakes are of low magnitude, while high-magnitude events are relatively rare. This pattern is consistent with typical seismic activity, where small tremors occur frequently and large earthquakes occur infrequently.

ggplot(eq, aes(x = magnitude)) +geom_histogram(bins = 30, fill = "lightblue") +
  theme_classic() +
  labs(title = "Magnitude distribution")

Creating a Spatial Point Pattern

We convert the spatial data into a point pattern object (`ppp`) required for spatial statistical analysis.

W <- as.owin(st_bbox(eq_sf))
coords <- st_coordinates(eq_sf)

eq.ppp <- ppp(coords[,1], coords[,2], window = W)
## Warning: data contain duplicated points
# Transform to planar CRS (meters)
eq_sf_proj <- st_transform(eq_sf, 3857)
# Create window from bounding box
W <- as.owin(st_bbox(eq_sf_proj))
# Extract coordinates
coords <- st_coordinates(eq_sf_proj)

# Create point pattern
eq.ppp <- ppp(x = coords[,1],
              y = coords[,2],
              window = W)
## Warning: data contain duplicated points

Some earthquake events share identical geographic coordinates, resulting in duplicate points. To avoid computational issues in spatial point pattern analysis, a small random displacement (jitter) is applied to separate overlapping points while preserving the overall spatial structure.

# Count duplicates
coords_df <- as.data.frame(coords)

counts <- coords_df %>%
  group_by(X, Y) %>%
  summarise(n = n(), .groups = "drop")

# Create weighted point pattern
ppp_eq <- ppp(
  x = counts$X,
  y = counts$Y,
  window = W,
  marks = counts$n
)

Visualization of Point Pattern

The point pattern plot represents earthquake locations as a spatial point process, forming the basis for formal spatial statistical analysis. The visualization confirms the clustering observed in earlier plots and provides a structured representation of the data that can be used for further quantitative spatial analysis.

plot(eq.ppp, main = "Earthquake Locations (Point Pattern)")

First-order properties (Intensity)

Kernel density

The kernel density plot provides an estimate of earthquake intensity across the study region. Areas with higher density values represent locations where earthquakes occur more frequently, effectively identifying seismic hotspots. The results show that earthquake activity is concentrated in specific regions, reinforcing the presence of spatial clustering rather than a uniform distribution.

# Get bounding box
bbox <- st_bbox(eq_sf)

# Convert to spatstat window
W <- owin(
  xrange = c(bbox["xmin"], bbox["xmax"]),
  yrange = c(bbox["ymin"], bbox["ymax"])
)
coords <- st_coordinates(eq_sf)

ppp_eq <- ppp(
  x = coords[,1],
  y = coords[,2],
  window = W
)
## Warning: data contain duplicated points
# Kernel density (intensity)
density_eq <- density(ppp_eq)

plot(density_eq, main = "Earthquake Intensity (Kernel Density)")

Quadrat Analysis

The quadrat count plot divides the study area into a grid and displays the number of earthquake events within each cell. The counts vary considerably across the grid, with some cells containing many events and others very few. This uneven distribution indicates that earthquakes are clustered in certain areas rather than randomly distributed.

The quadrat test formally evaluates whether the observed spatial pattern deviates from complete spatial randomness. The extremely small p-value (p < 2.2e-16) leads to a strong rejection of the null hypothesis of spatial randomness. This provides statistical confirmation that the observed clustering is significant and not due to chance.

q <- quadratcount(ppp_eq, nx = 5, ny = 5)
plot(q, main = "Quadrat count")

quadrat.test(ppp_eq)
## 
##  Chi-squared test of CSR using quadrat counts
## 
## data:  ppp_eq
## X2 = 104981, df = 24, p-value < 2.2e-16
## alternative hypothesis: two.sided
## 
## Quadrats: 5 by 5 grid of tiles

Second-order properties

Nearest neighbour distance

The nearest neighbour distance histogram shows the distribution of distances between each earthquake and its closest neighbouring event. The concentration of small distances indicates that earthquakes tend to occur close to one another. This provides further evidence of clustering within the spatial pattern..

nnd <- nndist(ppp_eq)

hist(nnd, main = "Nearest neighbour distances", col = "lightblue")

mean(nnd)
## [1] 2636.074

Spatial Interaction Functions

F-function

The F-function describes the distribution of distances from randomly selected locations in the study area to the nearest earthquake event. The observed pattern indicates a higher probability of short distances than expected under spatial randomness, suggesting that earthquakes are spatially clustered.

G-function

The G-function examines the distribution of distances between each earthquake and its nearest neighbouring event. The results show that earthquakes are more closely spaced than would be expected under a random distribution, further supporting the presence of clustering.

Ripley’s K-function

Ripley’s K-function evaluates spatial dependence over a range of distances, allowing for the detection of clustering at multiple spatial scales. The results indicate that the observed values exceed those expected under spatial randomness, demonstrating that clustering occurs both locally and over broader spatial ranges.

L-function

The L-function, a transformation of Ripley’s K-function, provides a more interpretable measure of spatial structure. The observed values lie above the reference line for randomness, confirming a significant degree of clustering in the earthquake data.

F <- Fest(eq.ppp)
G <- Gest(eq.ppp)

plot(F, main = "F-function")

plot(G, main = "G-function")

K <- Kest(eq.ppp)
## number of data points exceeds 3000 - computing border correction estimate only
plot(K, main = "Ripley's K-function")

L <- Lest(eq.ppp)
## number of data points exceeds 3000 - computing border correction estimate only
plot(L, main = "L-function")
abline(h = 0, col = "red", lty = 2)

Marked Point Pattern Analysis


The marked point pattern plot incorporates earthquake magnitude as an attribute associated with each spatial point. This allows for the simultaneous visualization of both location and magnitude, making it possible to assess whether stronger earthquakes are concentrated in specific areas or follow the same spatial pattern as weaker events.

The categorized magnitude plot groups earthquakes into low, medium, and high magnitude categories. This classification enables comparison of their spatial distributions and helps identify whether higher-magnitude earthquakes are concentrated in particular regions or are distributed similarly to lower-magnitude events.

ppp_marked <- ppp(coords[,1], coords[,2], window = W, marks = eq$magnitude)
## Warning: data contain duplicated points
plot(ppp_marked, main = "Earthquakes with Magnitude")

eq$mag_cat <- cut(eq$magnitude,
                  breaks = c(-Inf, 4, 6, Inf),
                  labels = c("Low", "Medium", "High"))

ppp_marked_cat <- ppp(coords[,1], coords[,2], window = W, marks = eq$mag_cat)
## Warning: data contain duplicated points
plot(ppp_marked_cat, main = "Earthquakes by Magnitude Category")

Temporal Data Preparation

We combine date and time into a single datetime variable for time series analysis.

eq$datetime <- as.POSIXct(
  paste(eq$date, eq$time),
  format = "%Y-%m-%d %H:%M:%S",
  tz = "UTC"
)

Daily Earthquake Frequency

The daily earthquake frequency plot shows the number of recorded earthquake events over time. The time series exhibits fluctuations in daily counts, indicating that earthquake activity varies over time rather than remaining constant. Periods of increased activity may reflect episodic seismic events.

eq_daily <- eq %>%
  mutate(day = as.Date(datetime)) %>%
  group_by(day) %>%
  summarise(count = n())

ggplot(eq_daily, aes(day, count)) +
  geom_line() +
  labs(title = "Daily Number of Earthquakes",
       x = "Date",
       y = "Count") +
  theme_minimal()

Magnitude Over Time

The magnitude over time plot displays individual earthquake magnitudes along with a smoothed trend line. Although individual magnitudes vary considerably, the overall trend remains relatively stable, suggesting that there is no clear long-term increase or decrease in earthquake strength over the study period.

ggplot(eq, aes(datetime, magnitude)) +
  geom_point(alpha = 0.3) +
  geom_smooth() +
  labs(title = "Earthquake Magnitude Over Time",
       x = "Time",
       y = "Magnitude") +
  theme_minimal()
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'

Moving Average Analysis

The moving average plot smooths daily earthquake counts using a 7-day window to reduce short-term fluctuations. This approach highlights underlying trends in seismic activity, making it easier to identify periods of relatively higher or lower earthquake frequency over time.

eq_daily$moving_avg <- rollmean(eq_daily$count, k = 7, fill = NA)

ggplot(eq_daily, aes(day)) +
  geom_line(aes(y = count), alpha = 0.4) +
  geom_line(aes(y = moving_avg), size = 1) +
  labs(title = "Earthquake Frequency with 7-Day Moving Average",
       x = "Date",
       y = "Count") +
  theme_minimal()
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once per session.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: Removed 6 rows containing missing values or values outside the scale range
## (`geom_line()`).

Conclusion

This study investigated the spatial and temporal characteristics of earthquake occurrences in Indonesia using a comprehensive point pattern analysis framework. Drawing on a large dataset of recorded earthquake events, both descriptive and quantitative methods were applied to examine the distribution, intensity, and dynamics of seismic activity.

The exploratory data analysis revealed that earthquake magnitudes are predominantly low, with relatively few high-magnitude events, indicating a strongly skewed distribution typical of seismic processes. Spatial visualizations further showed that earthquake events are unevenly distributed across the study region, with clear concentrations in specific areas. Kernel density estimation reinforced these findings by identifying distinct hotspots of seismic intensity.

The quantitative spatial analysis provided strong statistical evidence that earthquake occurrences are not randomly distributed. The quadrat test decisively rejected the null hypothesis of complete spatial randomness, confirming significant clustering. This conclusion was further supported by second-order analyses, including nearest neighbour distances and spatial interaction functions. The F- and G-functions indicated a higher probability of short inter-event distances, while Ripley’s K-function and the L-function demonstrated that clustering persists across multiple spatial scales. Together, these results highlight the presence of structured spatial dependence in earthquake occurrences.

The marked point pattern analysis showed that incorporating earthquake magnitude does not substantially alter the overall spatial pattern, although it allows for clearer differentiation between levels of seismic intensity. Temporal analysis revealed that earthquake frequency varies over time, with observable fluctuations and short-term trends. However, there is no clear long-term trend in earthquake magnitude, suggesting stability in the overall strength of seismic events despite variations in frequency.

In conclusion, the results demonstrate that earthquake activity in Indonesia is characterized by significant spatial clustering and temporal variability, reflecting the influence of underlying tectonic processes. The integration of descriptive, spatial, and statistical analyses provides a coherent understanding of earthquake patterns in the region. These findings illustrate the effectiveness of point pattern analysis in uncovering complex spatial structures and contribute to a deeper understanding of seismic risk in tectonically active areas.

Literature Review

Previous research has highlighted the importance of spatial and statistical methods in understanding earthquake activity in Indonesia, a region characterized by complex tectonic interactions. Analysis of Earthquake Activity in Indonesia by Clustering Method demonstrates that earthquake occurrences tend to form clusters rather than being randomly distributed. Using clustering techniques, the study identifies regions of high seismic activity and emphasizes the role of tectonic structures in shaping these patterns. This supports the application of spatial point pattern methods in analyzing earthquake distributions.

In a more advanced analytical framework, Seismic Activity Analysis in Indonesia: Integrating Machine Learning, Geospatial Data, and Environmental Factors for Risk Assessment incorporates machine learning and geospatial data to assess seismic risk. The study highlights the value of combining spatial analysis with computational techniques to improve the prediction and understanding of earthquake-prone areas. It further reinforces the idea that earthquake patterns are influenced by multiple interacting factors, including environmental and geological conditions.

Additionally, Visualization of the Relationship Between Magnitude and Depth of Earthquakes in Southern Indonesia Through Curve Interpolation focuses on the relationship between earthquake magnitude and depth. The findings suggest that variations in magnitude are linked to underlying geophysical processes and can provide further insight into seismic behavior. This perspective complements spatial analyses by incorporating additional dimensions of earthquake characteristics.

Overall, the existing literature consistently shows that earthquake activity in Indonesia is spatially structured and influenced by geological processes. While previous studies have applied clustering methods, machine learning, and interpolation techniques, this study contributes by applying spatial point pattern analysis, including first-order and second-order properties, to provide a comprehensive statistical assessment of earthquake distribution and intensity.