In this assignment, I use tsunami data for 1920-2020 (100 years) retrieved from NOAA. The data was saved as a .tsv file to be used for the following tasks:
## Search.Parameters Year Mo Dy Hr Mn Sec Tsunami.Event.Validity
## 1 ["1920 <= Year >= 2020"] NA NA NA NA NA NA NA
## 2 1920 1 29 NA NA NA 1
## 3 1920 2 2 11 22 18 1
## 4 1920 8 20 16 15 38 3
## 5 1920 8 NA NA NA NA -1
## 6 1920 9 20 14 39 NA 3
## Tsunami.Cause.Code Earthquake.Magnitude Vol More.Info Deposits
## 1 NA NA NA NA NA
## 2 1 NA NA NA 0
## 3 1 7.7 NA NA 0
## 4 1 7.0 NA NA 0
## 5 1 NA NA NA 0
## 6 1 7.8 NA NA 0
## Country Location.Name Latitude Longitude
## 1 NA NA
## 2 INDONESIA N. MOLUCCAS ISLANDS 0.870 122.92
## 3 PAPUA NEW GUINEA SOLOMON SEA -6.500 150.00
## 4 CHILE CENTRAL CHILE -38.000 -73.50
## 5 USA TERRITORY PAGO PAGO, AMERICAN SAMOA -15.000 -170.00
## 6 VANUATU VANUATU ISLANDS -19.919 168.53
## Maximum.Water.Height..m. Number.of.Runups Tsunami.Magnitude..Abe.
## 1 NA NA NA
## 2 2.0 2 NA
## 3 NA 1 NA
## 4 1.4 1 NA
## 5 NA 1 NA
## 6 NA 2 NA
## Tsunami.Magnitude..Iida. Tsunami.Intensity Deaths Death.Description Missing
## 1 NA NA NA NA NA
## 2 1.0 1 NA NA NA
## 3 NA NA NA NA NA
## 4 0.5 1 NA NA NA
## 5 NA NA NA NA NA
## 6 1.0 NA NA NA NA
## Missing.Description Injuries Injuries.Description Damage...Mil.
## 1 NA NA NA NA
## 2 NA NA NA NA
## 3 NA NA NA NA
## 4 NA NA NA NA
## 5 NA NA NA NA
## 6 NA NA NA NA
## Damage.Description Houses.Destroyed Houses.Destroyed.Description
## 1 NA NA NA
## 2 NA NA NA
## 3 NA NA NA
## 4 NA NA NA
## 5 NA NA NA
## 6 1 NA NA
## Houses.Damaged Houses.Damaged.Description Total.Deaths
## 1 NA NA NA
## 2 NA NA NA
## 3 NA NA NA
## 4 NA NA NA
## 5 NA NA NA
## 6 NA NA NA
## Total.Death.Description Total.Missing Total.Missing.Description
## 1 NA NA NA
## 2 NA NA NA
## 3 NA NA NA
## 4 NA NA NA
## 5 NA NA NA
## 6 NA NA NA
## Total.Injuries Total.Injuries.Description Total.Damage...Mil.
## 1 NA NA NA
## 2 NA NA NA
## 3 NA NA NA
## 4 NA NA NA
## 5 NA NA NA
## 6 NA NA NA
## Total.Damage.Description Total.Houses.Destroyed
## 1 NA NA
## 2 NA NA
## 3 1 NA
## 4 2 NA
## 5 NA NA
## 6 1 NA
## Total.Houses.Destroyed.Description Total.Houses.Damaged
## 1 NA NA
## 2 NA NA
## 3 1 NA
## 4 2 NA
## 5 NA NA
## 6 NA NA
## Total.Houses.Damaged.Description
## 1 NA
## 2 NA
## 3 NA
## 4 2
## 5 NA
## 6 NA
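The dotted column names in the preview (for example `Maximum.Water.Height..m.`) are what R produces by default when a header contains spaces or parentheses. The loading step is not shown in this report, so the following is only a minimal sketch, assuming the file was read with `read.delim()` and its default `check.names = TRUE`. `tsv_text` and `toy` are hypothetical stand-ins, not the real NOAA file.

```r
# Tiny in-memory stand-in for a tab-separated export (values are illustrative only)
tsv_text <- paste(
  "Year\tTsunami Event Validity\tMaximum Water Height (m)",
  "1920\t1\t2.0",
  "1920\t3\t1.4",
  "1920\t-1\t",
  sep = "\n"
)

# read.delim() assumes tab separators and a header row;
# check.names = TRUE (the default) turns spaces and parentheses into dots
toy <- read.delim(text = tsv_text)
names(toy)

# An empty numeric field is read as NA
toy$Maximum.Water.Height..m.
```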
# Vector of tsunami causes
causes <- c(
"0 = Unknown",
"1 = Earthquake",
"2 = Questionable Earthquake",
"3 = Earthquake and Landslide",
"4 = Volcano and Earthquake",
"5 = Volcano, Earthquake, and Landslide",
"6 = Volcano",
"7 = Volcano and Landslide",
"8 = Landslide",
"9 = Meteorological",
"10 = Explosion",
"11 = Astronomical Tide"
)
causes
## [1] "0 = Unknown"
## [2] "1 = Earthquake"
## [3] "2 = Questionable Earthquake"
## [4] "3 = Earthquake and Landslide"
## [5] "4 = Volcano and Earthquake"
## [6] "5 = Volcano, Earthquake, and Landslide"
## [7] "6 = Volcano"
## [8] "7 = Volcano and Landslide"
## [9] "8 = Landslide"
## [10] "9 = Meteorological"
## [11] "10 = Explosion"
## [12] "11 = Astronomical Tide"
# Filter for Tsunami Event Validity >= 3
filtered_tsunami_data <- tsunami_data[tsunami_data$Tsunami.Event.Validity >= 3, ]
# Histogram of maximum water height
hist(
filtered_tsunami_data$Maximum.Water.Height..m.,
main = "Histogram of Maximum Water Height",
xlab = "Maximum Water Height (m)",
breaks = 30,
col = "lightpink"
)
# Max Water Height
summary(filtered_tsunami_data$Maximum.Water.Height..m.)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.010 0.160 0.600 5.907 3.000 524.600 115
Max Water Height Histogram
This distribution is right-skewed, with a long tail extending to the right: most tsunamis had low water heights, while a few had extremely high values (the mean of 5.907 m is far above the median of 0.600 m). This makes it difficult to analyze trends directly on the raw scale.
# New column for log10 of Maximum Water Height
filtered_tsunami_data$Log_Max_Height <- log10(filtered_tsunami_data$Maximum.Water.Height..m.)
# Histogram of the log10-transformed Maximum Water Height
hist(
filtered_tsunami_data$Log_Max_Height,
main = "Histogram of Log10(Maximum Water Height)",
xlab = "Log10(Maximum Water Height)",
breaks = 30,
col = "lightblue"
)
# log-transformed data
summary(filtered_tsunami_data$Log_Max_Height)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## -2.0000 -0.7959 -0.2218 -0.1142 0.4771 2.7198 115
Log Histogram
The histogram of Log10(Max Water Height) shows that the log transformation largely removed the right skew of the raw data. The resulting distribution is relatively symmetric, making it easier to analyze trends in tsunami water heights across events without a few extreme values compressing the rest of the data into the lowest bins.
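The effect of the transformation on the long right tail can also be checked numerically. A minimal sketch, assuming a simple moment-based skewness coefficient (positive values indicate a longer right tail); the `skewness()` helper and the `toy_heights` values are illustrative and are not taken from the NOAA data.

```r
# Hypothetical helper: moment-based sample skewness, ignoring NA values
skewness <- function(x) {
  x <- x[!is.na(x)]
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}

# Illustrative heights in metres with one extreme value
toy_heights <- c(0.01, 0.1, 0.2, 0.6, 1, 3, 10, 524.6)

skew_raw <- skewness(toy_heights)         # dominated by the extreme value
skew_log <- skewness(log10(toy_heights))  # much closer to zero

c(raw = skew_raw, log10 = skew_log)
```

On the real data the same helper could be applied to `Maximum.Water.Height..m.` and `Log_Max_Height` to compare the two scales.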
# Vector of causes
cause_labels <- c(
"0" = "Unknown",
"1" = "Earthquake",
"2" = "Questionable Earthquake",
"3" = "Earthquake and Landslide",
"4" = "Volcano and Earthquake",
"5" = "Volcano, Earthquake, and Landslide",
"6" = "Volcano",
"7" = "Volcano and Landslide",
"8" = "Landslide",
"9" = "Meteorological",
"10" = "Explosion",
"11" = "Astronomical Tide"
)
# Match causes with Tsunami Cause Code
filtered_tsunami_data$Cause <- cause_labels[as.character(filtered_tsunami_data$Tsunami.Cause.Code)]
# Preview
head(filtered_tsunami_data[, c("Tsunami.Cause.Code", "Cause")])
## Tsunami.Cause.Code Cause
## NA NA <NA>
## 4 1 Earthquake
## 6 1 Earthquake
## 10 1 Earthquake
## 11 1 Earthquake
## 12 1 Earthquake
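The all-NA first row in the preview above is consistent with how R handles NA in a logical row index: a comparison such as `NA >= 3` is NA, and indexing rows with NA returns a row of NAs rather than dropping it. A minimal sketch on a hypothetical toy data frame (`toy`, `validity`, and `height` are illustrative names), showing one common way to drop such rows with `which()`:

```r
# Toy data: the first row mimics a record with missing validity
toy <- data.frame(
  validity = c(NA, 1, 3, 4),
  height   = c(NA, 2.0, 1.4, 9.0)
)

# Logical indexing keeps an all-NA row where the comparison is NA
kept_with_na <- toy[toy$validity >= 3, ]

# which() returns only the positions where the comparison is TRUE
kept_clean <- toy[which(toy$validity >= 3), ]

nrow(kept_with_na)
nrow(kept_clean)
```

Applying this to the real data would likely change the row and NA counts reported elsewhere, so the outputs in this report correspond to the filter as written.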
# Increase the margins to allow more space for x-axis labels
par(mar = c(14, 5, 4, 2), mgp = c(7, 1, 0)) # Adjust bottom margin and move x-axis label down
# Create the box plot
boxplot(
filtered_tsunami_data$Log_Max_Height ~ filtered_tsunami_data$Cause,
main = "Box Plot of Log10(Maximum Water Height) by Cause",
xlab = "Tsunami Cause",
ylab = "Log10(Maximum Water Height)",
las = 2, # Rotate x-axis labels
col = "purple",
cex.axis = 0.8 # Adjust axis text size
)
Log Boxplot
The box plot shows the distribution of log-transformed maximum water height for tsunamis grouped by cause. Placing the groups side by side allows a direct comparison of the median, spread, and range for each cause.
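The medians read off the box plot can be tabulated alongside the number of usable observations per group, which helps when judging how much weight a small group deserves. A minimal sketch using `tapply()` on a hypothetical toy data frame; the values are illustrative, and only the column names `Cause` and `Log_Max_Height` follow the report.

```r
# Toy data with the same column names as the report (values are illustrative)
toy <- data.frame(
  Cause = c("Earthquake", "Earthquake", "Earthquake",
            "Landslide", "Landslide", "Volcano"),
  Log_Max_Height = c(-1.0, 0.15, 0.95, 0.5, 1.7, NA)
)

# Number of non-missing heights per cause
n_by_cause <- tapply(!is.na(toy$Log_Max_Height), toy$Cause, sum)

# Median log10 height per cause, ignoring missing values
median_by_cause <- tapply(toy$Log_Max_Height, toy$Cause, median, na.rm = TRUE)

data.frame(n = n_by_cause, median = median_by_cause)
```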
# ANOVA test
anova_result <- aov(Log_Max_Height ~ Cause, data = filtered_tsunami_data)
# Summarize ANOVA results
summary(anova_result)
## Df Sum Sq Mean Sq F value Pr(>F)
## Cause 9 124.0 13.778 25.14 <2e-16 ***
## Residuals 744 407.8 0.548
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 115 observations deleted due to missingness
# Tukey's HSD test
tukey_result <- TukeyHSD(anova_result, "Cause", ordered = TRUE)
# Print Tukey test results
tukey_result
## Tukey multiple comparisons of means
## 95% family-wise confidence level
## factor levels have been ordered
##
## Fit: aov(formula = Log_Max_Height ~ Cause, data = filtered_tsunami_data)
##
## $Cause
## diff
## Explosion-Volcano and Earthquake 0.12316721
## Earthquake-Volcano and Earthquake 0.14839653
## Meteorological-Volcano and Earthquake 0.56661936
## Volcano-Volcano and Earthquake 0.59999401
## Unknown-Volcano and Earthquake 0.62960374
## Earthquake and Landslide-Volcano and Earthquake 0.94144078
## Volcano, Earthquake, and Landslide-Volcano and Earthquake 1.15937938
## Landslide-Volcano and Earthquake 1.36760361
## Volcano and Landslide-Volcano and Earthquake 1.70054537
## Earthquake-Explosion 0.02522932
## Meteorological-Explosion 0.44345215
## Volcano-Explosion 0.47682680
## Unknown-Explosion 0.50643654
## Earthquake and Landslide-Explosion 0.81827357
## Volcano, Earthquake, and Landslide-Explosion 1.03621217
## Landslide-Explosion 1.24443640
## Volcano and Landslide-Explosion 1.57737816
## Meteorological-Earthquake 0.41822283
## Volcano-Earthquake 0.45159748
## Unknown-Earthquake 0.48120722
## Earthquake and Landslide-Earthquake 0.79304425
## Volcano, Earthquake, and Landslide-Earthquake 1.01098285
## Landslide-Earthquake 1.21920708
## Volcano and Landslide-Earthquake 1.55214884
## Volcano-Meteorological 0.03337465
## Unknown-Meteorological 0.06298439
## Earthquake and Landslide-Meteorological 0.37482142
## Volcano, Earthquake, and Landslide-Meteorological 0.59276002
## Landslide-Meteorological 0.80098425
## Volcano and Landslide-Meteorological 1.13392601
## Unknown-Volcano 0.02960973
## Earthquake and Landslide-Volcano 0.34144677
## Volcano, Earthquake, and Landslide-Volcano 0.55938537
## Landslide-Volcano 0.76760959
## Volcano and Landslide-Volcano 1.10055136
## Earthquake and Landslide-Unknown 0.31183704
## Volcano, Earthquake, and Landslide-Unknown 0.52977564
## Landslide-Unknown 0.73799986
## Volcano and Landslide-Unknown 1.07094162
## Volcano, Earthquake, and Landslide-Earthquake and Landslide 0.21793860
## Landslide-Earthquake and Landslide 0.42616282
## Volcano and Landslide-Earthquake and Landslide 0.75910459
## Landslide-Volcano, Earthquake, and Landslide 0.20822422
## Volcano and Landslide-Volcano, Earthquake, and Landslide 0.54116599
## Volcano and Landslide-Landslide 0.33294176
## lwr
## Explosion-Volcano and Earthquake -2.75422811
## Earthquake-Volcano and Earthquake -1.51572530
## Meteorological-Volcano and Earthquake -1.16248004
## Volcano-Volcano and Earthquake -1.13514065
## Unknown-Volcano and Earthquake -1.20699378
## Earthquake and Landslide-Volcano and Earthquake -0.75986041
## Volcano, Earthquake, and Landslide-Volcano and Earthquake -1.71801594
## Landslide-Volcano and Earthquake -0.31902565
## Volcano and Landslide-Volcano and Earthquake -0.15680533
## Earthquake-Explosion -2.32617510
## Meteorological-Explosion -1.95437729
## Volcano-Explosion -1.92535837
## Unknown-Explosion -1.97003106
## Earthquake and Landslide-Explosion -1.55958829
## Volcano, Earthquake, and Landslide-Explosion -2.28631776
## Landslide-Explosion -1.12295025
## Volcano and Landslide-Explosion -0.91451929
## Meteorological-Earthquake -0.07114780
## Volcano-Earthquake -0.05868793
## Unknown-Earthquake -0.30796282
## Earthquake and Landslide-Earthquake 0.41340675
## Volcano, Earthquake, and Landslide-Earthquake -1.34042157
## Landslide-Earthquake 0.91193336
## Volcano and Landslide-Earthquake 0.71581727
## Volcano-Meteorological -0.66007647
## Unknown-Meteorological -0.85531437
## Earthquake and Landslide-Meteorological -0.22900628
## Volcano, Earthquake, and Landslide-Meteorological -1.80506941
## Landslide-Meteorological 0.23982438
## Volcano and Landslide-Meteorological 0.17479424
## Unknown-Volcano -0.90000320
## Earthquake and Landslide-Volcano -0.27945216
## Volcano, Earthquake, and Landslide-Volcano -1.84279980
## Landslide-Volcano 0.18812017
## Volcano and Landslide-Volcano 0.13058159
## Earthquake and Landslide-Unknown -0.55298296
## Volcano, Earthquake, and Landslide-Unknown -1.94669196
## Landslide-Unknown -0.09758744
## Volcano and Landslide-Unknown -0.07065353
## Volcano, Earthquake, and Landslide-Earthquake and Landslide -2.15992326
## Landslide-Earthquake and Landslide -0.04238964
## Volcano and Landslide-Earthquake and Landslide -0.14895641
## Landslide-Volcano, Earthquake, and Landslide -2.15916242
## Volcano and Landslide-Volcano, Earthquake, and Landslide -1.95073146
## Volcano and Landslide-Landslide -0.54732370
## upr p adj
## Explosion-Volcano and Earthquake 3.0005625 1.0000000
## Earthquake-Volcano and Earthquake 1.8125184 0.9999998
## Meteorological-Volcano and Earthquake 2.2957188 0.9897058
## Volcano-Volcano and Earthquake 2.3351287 0.9849407
## Unknown-Volcano and Earthquake 2.4662013 0.9858260
## Earthquake and Landslide-Volcano and Earthquake 2.6427420 0.7625468
## Volcano, Earthquake, and Landslide-Volcano and Earthquake 4.0367747 0.9582102
## Landslide-Volcano and Earthquake 3.0542329 0.2323111
## Volcano and Landslide-Volcano and Earthquake 3.5578961 0.1058072
## Earthquake-Explosion 2.3766337 1.0000000
## Meteorological-Explosion 2.8412816 0.9998867
## Volcano-Explosion 2.8790120 0.9997955
## Unknown-Explosion 2.9829041 0.9997384
## Earthquake and Landslide-Explosion 3.1961354 0.9854416
## Volcano, Earthquake, and Landslide-Explosion 4.3587421 0.9928102
## Landslide-Explosion 3.6118230 0.8131433
## Volcano and Landslide-Explosion 4.0692756 0.5930960
## Meteorological-Earthquake 0.9075935 0.1706371
## Volcano-Earthquake 0.9618829 0.1354039
## Unknown-Earthquake 1.2703773 0.6450057
## Earthquake and Landslide-Earthquake 1.1726817 0.0000000
## Volcano, Earthquake, and Landslide-Earthquake 3.3623873 0.9375673
## Landslide-Earthquake 1.5264808 0.0000000
## Volcano and Landslide-Earthquake 2.3884804 0.0000003
## Volcano-Meteorological 0.7268258 1.0000000
## Unknown-Meteorological 0.9812831 1.0000000
## Earthquake and Landslide-Meteorological 0.9786491 0.6206249
## Volcano, Earthquake, and Landslide-Meteorological 2.9905895 0.9987844
## Landslide-Meteorological 1.3621441 0.0002933
## Volcano and Landslide-Meteorological 2.0930578 0.0072220
## Unknown-Volcano 0.9592227 1.0000000
## Earthquake and Landslide-Volcano 0.9623457 0.7691370
## Volcano, Earthquake, and Landslide-Volcano 2.9615705 0.9992457
## Landslide-Volcano 1.3470990 0.0012145
## Volcano and Landslide-Volcano 2.0705211 0.0124539
## Earthquake and Landslide-Unknown 1.1766570 0.9798930
## Volcano, Earthquake, and Landslide-Unknown 3.0062432 0.9996211
## Landslide-Unknown 1.5735872 0.1373025
## Volcano and Landslide-Unknown 2.2125368 0.0874678
## Volcano, Earthquake, and Landslide-Earthquake and Landslide 2.5958005 0.9999997
## Landslide-Earthquake and Landslide 0.8947153 0.1112128
## Volcano and Landslide-Earthquake and Landslide 1.6671656 0.1953206
## Landslide-Volcano, Earthquake, and Landslide 2.5756109 0.9999998
## Volcano and Landslide-Volcano, Earthquake, and Landslide 3.0330634 0.9995715
## Volcano and Landslide-Landslide 1.2132072 0.9722733
par(mar = c(5, 17, 5, 1))
# Plot the Tukey test
plot(tukey_result, las = 2, cex.axis = 0.7)
# Extract Tukey HSD results as a data frame
tukey_df <- as.data.frame(tukey_result$Cause)
# Add row names as a column for group comparisons
tukey_df$Comparison <- rownames(tukey_df)
# Filter significant results (p.adj < 0.05)
significant_results <- tukey_df[tukey_df$`p adj` < 0.05, ]
# Display significant results
significant_results
## diff lwr upr p adj
## Earthquake and Landslide-Earthquake 0.7930443 0.4134068 1.172682 0.000000e+00
## Landslide-Earthquake 1.2192071 0.9119334 1.526481 0.000000e+00
## Volcano and Landslide-Earthquake 1.5521488 0.7158173 2.388480 2.572946e-07
## Landslide-Meteorological 0.8009842 0.2398244 1.362144 2.933475e-04
## Volcano and Landslide-Meteorological 1.1339260 0.1747942 2.093058 7.221986e-03
## Landslide-Volcano 0.7676096 0.1881202 1.347099 1.214489e-03
## Volcano and Landslide-Volcano 1.1005514 0.1305816 2.070521 1.245394e-02
## Comparison
## Earthquake and Landslide-Earthquake Earthquake and Landslide-Earthquake
## Landslide-Earthquake Landslide-Earthquake
## Volcano and Landslide-Earthquake Volcano and Landslide-Earthquake
## Landslide-Meteorological Landslide-Meteorological
## Volcano and Landslide-Meteorological Volcano and Landslide-Meteorological
## Landslide-Volcano Landslide-Volcano
## Volcano and Landslide-Volcano Volcano and Landslide-Volcano
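Seven pairwise differences were statistically significant (adjusted p < 0.05): Earthquake and Landslide vs. Earthquake; Landslide vs. Earthquake; Volcano and Landslide vs. Earthquake; Landslide vs. Meteorological; Volcano and Landslide vs. Meteorological; Landslide vs. Volcano; and Volcano and Landslide vs. Volcano. In each comparison, the first-named (landslide-related) cause has the higher mean log10 maximum water height.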
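Because the response is on a log10 scale, each `diff` value can be back-transformed into a ratio of geometric-mean water heights. A minimal sketch, with three `diff` values copied from the significant results table; the vector names `log10_diff` and `height_ratio` are illustrative.

```r
# diff values copied from the significant Tukey comparisons
log10_diff <- c(
  "Earthquake and Landslide-Earthquake" = 0.7930443,
  "Landslide-Earthquake"                = 1.2192071,
  "Volcano and Landslide-Earthquake"    = 1.5521488
)

# A difference d in mean log10 height corresponds to a ratio of 10^d
# between the geometric means of the two groups
height_ratio <- 10^log10_diff
round(height_ratio, 1)
```

Read this way, a difference of about 1.2 for Landslide vs. Earthquake corresponds to typical landslide-generated heights more than ten times those of earthquake-generated tsunamis in this dataset.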
# Filter dataset to include only rows with non-missing Earthquake Magnitude
filtered_eq_data <- filtered_tsunami_data[!is.na(filtered_tsunami_data$Earthquake.Magnitude), ]
# Check the number of rows in the filtered dataset
nrow(filtered_eq_data)
## [1] 707
# Preview the filtered dataset
head(filtered_eq_data)
## Search.Parameters Year Mo Dy Hr Mn Sec Tsunami.Event.Validity
## 4 1920 8 20 16 15 38.0 3
## 6 1920 9 20 14 39 NA 3
## 11 1921 9 11 4 1 38.0 4
## 12 1921 11 11 18 36 8.0 3
## 15 1922 1 19 21 58 58.5 3
## 26 1922 11 11 4 32 51.0 4
## Tsunami.Cause.Code Earthquake.Magnitude Vol More.Info Deposits
## 4 1 7.0 NA NA 0
## 6 1 7.8 NA NA 0
## 11 1 7.5 NA NA 0
## 12 1 7.5 NA NA 0
## 15 1 6.9 NA NA 0
## 26 1 8.5 NA NA 0
## Country Location.Name Latitude Longitude
## 4 CHILE CENTRAL CHILE -38.000 -73.500
## 6 VANUATU VANUATU ISLANDS -19.919 168.530
## 11 INDONESIA JAVA -11.000 111.000
## 12 PHILIPPINES PHILIPPINE TRENCH 8.000 127.000
## 15 PAPUA NEW GUINEA BISMARCK SEA -7.111 143.530
## 26 CHILE NORTHERN CHILE -28.293 -69.852
## Maximum.Water.Height..m. Number.of.Runups Tsunami.Magnitude..Abe.
## 4 1.4 1 NA
## 6 NA 2 NA
## 11 0.1 3 NA
## 12 NA 4 NA
## 15 1.8 1 NA
## 26 9.0 36 NA
## Tsunami.Magnitude..Iida. Tsunami.Intensity Deaths Death.Description Missing
## 4 0.5 1.0 NA NA NA
## 6 1.0 NA NA NA NA
## 11 -2.3 -2.0 NA NA NA
## 12 0.5 1.0 NA NA NA
## 15 NA NA NA NA NA
## 26 3.2 2.5 200 3 NA
## Missing.Description Injuries Injuries.Description Damage...Mil.
## 4 NA NA NA NA
## 6 NA NA NA NA
## 11 NA NA NA NA
## 12 NA NA NA NA
## 15 NA NA NA NA
## 26 NA NA NA NA
## Damage.Description Houses.Destroyed Houses.Destroyed.Description
## 4 NA NA NA
## 6 1 NA NA
## 11 NA NA NA
## 12 1 NA 1
## 15 NA NA NA
## 26 3 NA 3
## Houses.Damaged Houses.Damaged.Description Total.Deaths
## 4 NA NA NA
## 6 NA NA NA
## 11 NA NA NA
## 12 NA NA NA
## 15 NA NA NA
## 26 NA NA 700
## Total.Death.Description Total.Missing Total.Missing.Description
## 4 NA NA NA
## 6 NA NA NA
## 11 NA NA NA
## 12 NA NA NA
## 15 NA NA NA
## 26 3 NA NA
## Total.Injuries Total.Injuries.Description Total.Damage...Mil.
## 4 NA NA NA
## 6 NA NA NA
## 11 NA NA NA
## 12 NA NA NA
## 15 NA NA NA
## 26 NA NA NA
## Total.Damage.Description Total.Houses.Destroyed
## 4 2 NA
## 6 1 NA
## 11 NA NA
## 12 1 NA
## 15 1 NA
## 26 3 NA
## Total.Houses.Destroyed.Description Total.Houses.Damaged
## 4 2 NA
## 6 NA NA
## 11 NA NA
## 12 1 NA
## 15 1 NA
## 26 3 NA
## Total.Houses.Damaged.Description Log_Max_Height Cause
## 4 2 0.1461280 Earthquake
## 6 NA NA Earthquake
## 11 NA -1.0000000 Earthquake
## 12 NA NA Earthquake
## 15 NA 0.2552725 Earthquake
## 26 NA 0.9542425 Earthquake
# Scatterplot of Log10(Maximum Water Height) vs Earthquake Magnitude
plot(
filtered_eq_data$Earthquake.Magnitude,
filtered_eq_data$Log_Max_Height,
main = "Log10(Maximum Water Height) vs Earthquake Magnitude",
xlab = "Earthquake Magnitude",
ylab = "Log10(Maximum Water Height)",
pch = 19, # Solid circles
col = "lightgreen"
)
Scatterplot
The scatterplot shows a positive relationship between earthquake magnitude and log10(maximum water height). As earthquake magnitude increases, the maximum water height generally tends to increase, though there is substantial variability.
# Fit a linear model
lm_eq <- lm(Log_Max_Height ~ Earthquake.Magnitude, data = filtered_eq_data)
# Summarize the model
summary(lm_eq)
##
## Call:
## lm(formula = Log_Max_Height ~ Earthquake.Magnitude, data = filtered_eq_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.8735 -0.5730 -0.1085 0.5272 2.7343
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.92770 0.28331 -10.334 <2e-16 ***
## Earthquake.Magnitude 0.37349 0.03946 9.466 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7267 on 622 degrees of freedom
## (83 observations deleted due to missingness)
## Multiple R-squared: 0.1259, Adjusted R-squared: 0.1245
## F-statistic: 89.6 on 1 and 622 DF, p-value: < 2.2e-16
# Extract and report the p-value
lm_p_value <- summary(lm_eq)$coefficients[2, 4]
lm_p_value
## [1] 5.932267e-20
The linear model suggests a statistically significant positive relationship:
Intercept: -2.92770
Slope: 0.37349. For each unit increase in earthquake magnitude, the log10 maximum water height increases by approximately 0.37.
P-Value: < 2e-16 (5.93e-20 for the slope), so the relationship is statistically significant.
R-Squared Value: 0.1259, meaning about 12.59% of the variability in log10 maximum water height is explained by earthquake magnitude.
Conclusion for Linear Model
Even though there is a statistically significant relationship between earthquake magnitude and maximum water height, the low R-squared value suggests that other factors contribute substantially to tsunami height variability. This analysis highlights the importance of considering additional variables when predicting maximum water height.
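The slope is easier to interpret once it is converted back from the log10 scale. A minimal sketch using the coefficients reported in the model summary; `predict_height_m()` is a hypothetical helper written for this illustration, not part of the original analysis.

```r
# Coefficients copied from the summary(lm_eq) output
intercept <- -2.92770
slope <- 0.37349

# Hypothetical helper: fitted maximum water height in metres
predict_height_m <- function(magnitude) 10^(intercept + slope * magnitude)

# Multiplicative change in fitted height per one-unit increase in magnitude
10^slope

# Fitted heights at a few example magnitudes
round(predict_height_m(c(7, 8, 9)), 2)
```

Because the model is fitted on the log scale, these back-transformed values describe typical (geometric-mean) heights rather than arithmetic means, and with an R-squared near 0.13 individual events scatter widely around them.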
# Load required libraries
library(ggplot2)
library(maps)
library(viridis)
# Get world map data
world_map <- map_data("world")
# Plot world map with tsunami data scaled by Log_Max_Height
ggplot() +
geom_polygon(data = world_map, aes(x = long, y = lat, group = group),
fill = "gray90", color = "gray60") +
geom_point(data = filtered_tsunami_data,
aes(x = Longitude, y = Latitude, color = Log_Max_Height),
size = 2, alpha = 0.7) +
scale_color_viridis(name = "Log10(Max Water Height)") +
labs(x = "Longitude", y = "Latitude", title = "Tsunamis by Log10(Max Water Height)") +
theme_minimal()
This map displays the location of each tsunami, colored by the log10 of its maximum water height.
Tsunamis with higher log (max water height) values are concentrated in tectonically active regions like the Pacific Ring of Fire, the Indian Ocean, and the Mediterranean. The Pacific coasts of Asia, North America, and South America exhibit the highest tsunami heights, corresponding to subduction zones. In contrast, the North Atlantic/Arctic region mostly features smaller tsunami heights due to the lack of major subduction zones, with tsunamis resulting from landslides or meteorological events.
# Plot world map with tsunami data colored by Cause
ggplot() +
geom_polygon(data = world_map, aes(x = long, y = lat, group = group),
fill = "gray90", color = "gray60") +
geom_point(data = filtered_tsunami_data,
aes(x = Longitude, y = Latitude, color = Cause),
size = 2, alpha = 0.7) +
labs(x = "Longitude", y = "Latitude", color = "Tsunami Cause",
title = "Tsunamis by Cause") +
theme_minimal()
This map displays the location of each tsunami, colored by cause.
Earthquakes (red) are the dominant cause of tsunamis globally, particularly along the Pacific and Indian Oceans, where tectonic activity is high. Other causes, such as landslides (green) and meteorological events (yellow), are more prevalent in regions with lower tectonic activity, such as the North Atlantic and Arctic. The diversity of causes reflects the varying geological and meteorological conditions influencing tsunami generation worldwide.