The topic I chose to analyze for this project is meteorites. I chose it because I’ve always found things that come from space really cool, but I’ve never had the time to look into them rigorously. The dataset covers all meteorite landings in recorded history up to the date of the dataset in 2013. The data was collected by Javier de la Torre in 2013 and was subsequently published by NASA. It can be found at https://catalog.data.gov/dataset/meteorite-landings. The fields include the name, ID, class, mass, year found or fell, how the meteorite was found (while falling or on the ground), and the coordinates of the landing. All of the data included was clean; the crux was everything that wasn’t included. The dataset has no group or subgroup columns, so to add them I had to filter manually for each category and label the rows accordingly. To do so, I split the dataset by subgroup, added the necessary columns, and then merged the pieces back together so I could analyze them.
library(tidyverse)
library(leaflet)
## Warning: package 'leaflet' was built under R version 4.3.2
setwd("C:\\Users\\Shea\\Documents\\data110\\csvs")
meteorites <- read.csv("Meteorite_Landings.csv")
During my research I found that meteorites can be split into three primary groups: stony, stony-iron, and iron. The vast majority (about 94.6%) of all meteorites are stony. Iron meteorites make up about 4.4% and stony-iron about 1%. As the names suggest, stony meteorites are mostly stone, iron ones are mostly iron, and stony-iron ones are a roughly even mix. Note that stony meteorites are not completely stone. A completely stone meteorite is so rare that if someone claimed to have found one without clear proof, the claim would more likely than not be rejected by The Meteoritical Society. The three groups can be split into further subgroups: stony into chondrites and achondrites, stony-iron into pallasites and mesosiderites, and iron into primitive and magmatic. Each of these is split into many classes and subclasses, but that is beyond the scope of this project. This information, as well as more about classes and subclasses, can be found at https://www.lpi.usra.edu/meteor/ and http://class.meteorites.com.au/.
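The hierarchy described above can also be written down as a small lookup table. This is only a sketch in base R: the name `subgroup_to_group` and the example labels are made up for illustration and are not part of the project code.

```r
# Taxonomy from the paragraph above, as a named character vector:
# names are subgroups, values are the primary group each one belongs to.
subgroup_to_group <- c(
  Chondrite    = "Stony",
  Achondrite   = "Stony",
  Pallasite    = "Stony-iron",
  Mesosiderite = "Stony-iron",
  Primitive    = "Iron",
  Magmatic     = "Iron"
)

# Hypothetical subgroup labels, for illustration only (not rows from the dataset).
example_subgroups <- c("Chondrite", "Pallasite", "Magmatic", "Achondrite")

# Indexing by name looks up the group for each subgroup; unname() drops the names.
example_groups <- unname(subgroup_to_group[example_subgroups])
print(example_groups)
```

With a table like this, the group of any labeled row follows directly from its subgroup, so the two labels cannot disagree.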
# Chondrites: keep rows whose recclass starts with one of the chondrite prefixes
chondrite <- filter(mutate(meteorites, class = case_when(startsWith(recclass, "L") ~ "Chondrite",
startsWith(recclass, "H") ~ "Chondrite",
startsWith(recclass, "E") ~ "Chondrite",
startsWith(recclass, "R") ~ "Chondrite",
startsWith(recclass, "C") ~ "Chondrite",
startsWith(recclass, "OC") ~ "Chondrite",
startsWith(recclass, "K") ~ "Chondrite")), class=="Chondrite")
# Drop classes that share a one-letter prefix with the chondrites but are labeled elsewhere
chondrite <- filter(chondrite, !startsWith(recclass, "Lunar"))
chondrite <- filter(chondrite, !startsWith(recclass, "How"))
chondrite <- filter(chondrite, !startsWith(recclass, "Lon"))
chondrite <- filter(chondrite, !startsWith(recclass, "Lod"))
chondrite <- filter(chondrite, !startsWith(recclass, "En"))
chondrite <- filter(chondrite, !startsWith(recclass, "Eu"))
chondrite <- filter(chondrite, !startsWith(recclass, "Relict i"))
stoneucl <- filter(mutate(meteorites, class = case_when(startsWith(recclass, "Ston") ~ "Stony")), class=="Stony")
achondrite <- filter(mutate(meteorites, class = case_when(startsWith(recclass, "Acapu") ~ "Achondrite",
startsWith(recclass, "Win") ~ "Achondrite",
startsWith(recclass, "How") ~ "Achondrite",
startsWith(recclass, "Eu") ~ "Achondrite",
startsWith(recclass, "Dio") ~ "Achondrite",
startsWith(recclass, "Ang") ~ "Achondrite",
startsWith(recclass, "Enst ach") ~ "Achondrite",
startsWith(recclass, "Aub") ~ "Achondrite",
startsWith(recclass, "Ure") ~ "Achondrite",
startsWith(recclass, "Bra") ~ "Achondrite",
startsWith(recclass, "Achon") ~ "Achondrite",
startsWith(recclass, "Lunar") ~ "Achondrite",
startsWith(recclass, "Mart") ~ "Achondrite",
startsWith(recclass, "Lod") ~ "Achondrite")), class=="Achondrite")
pallasite <- filter(mutate(meteorites, class = case_when(startsWith(recclass, "Pall") ~ "Pallasite")), class=="Pallasite")
mesosiderite <- filter(mutate(meteorites, class = case_when(startsWith(recclass, "Meso") ~ "Mesosiderite")), class=="Mesosiderite")
# Magmatic irons: every "Iron" class except IAB and IIE
magmatic <- filter(mutate(meteorites, class = case_when(startsWith(recclass, "Iron") ~ "Magmatic")), class=="Magmatic")
magmatic <- filter(magmatic, !startsWith(recclass, "Iron, IAB"))
magmatic <- filter(magmatic, !startsWith(recclass, "Iron, IIE"))
# Primitive irons: the IAB classes only
primitive <- filter(mutate(meteorites, class = case_when(startsWith(recclass, "Iron") ~ "Primitive")), class=="Primitive")
primitive <- filter(primitive, startsWith(recclass, "Iron, IAB"))
# IIE and relict iron stay in the Iron group with the generic subgroup label "Iron"
iron <- filter(mutate(meteorites, class = case_when(startsWith(recclass, "Iron, IIE") ~ "Iron",
startsWith(recclass, "Relict iron") ~ "Iron")), class=="Iron")
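# Rows that fit none of the groups. "Relict iron" is left out here because it is
# already labeled as Iron; listing it in both places would add those rows to allms twice.
other <- filter(meteorites, startsWith(recclass, "Unknown") | startsWith(recclass, "Fusion crust") | startsWith(recclass, "Impact melt breccia"))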
stony <- rbind(chondrite, stoneucl, achondrite)
stony <- mutate(stony, group = "Stony")
stonyiron <- rbind(pallasite, mesosiderite)
stonyiron <- mutate(stonyiron, group = "Stony-iron")
irons <- rbind(magmatic, primitive, iron)
irons <- mutate(irons, group = "Iron")
other <- mutate(other, class = NA)
other <- mutate(other, group = NA)
allms <- rbind(stony, stonyiron, irons, other)
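Because each piece is built by its own prefix filter, any class that matches the filters of two pieces ends up in the merged data twice. A quick sanity check after the merge is to look for repeated ids. The sketch below uses a toy data frame with hypothetical rows and filters; `toy`, `piece_a`, `piece_b`, and `dup_ids` are illustrative names only.

```r
# Toy stand-in for the dataset: three hypothetical rows.
toy <- data.frame(
  id = c(1, 2, 3),
  recclass = c("Iron, IIE", "Relict iron", "Unknown"),
  stringsAsFactors = FALSE
)

# Two pieces whose prefix filters overlap on "Relict iron".
piece_a <- toy[startsWith(toy$recclass, "Iron, IIE") |
                 startsWith(toy$recclass, "Relict iron"), ]
piece_b <- toy[startsWith(toy$recclass, "Unknown") |
                 startsWith(toy$recclass, "Relict iron"), ]

combined <- rbind(piece_a, piece_b)

# ids that occur more than once after the merge
dup_ids <- unique(combined$id[duplicated(combined$id)])
print(dup_ids)
```

Running the same `duplicated()` check on `allms$id` would show whether any meteorite was counted in more than one piece.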
# Excluding those that don't fit into a group
allmsbox <- filter(allms, !is.na(group))
box <- ggplot(allmsbox, aes(x = factor(group, c("Stony", "Stony-iron", "Iron")), y = mass..g., fill = group)) +
geom_boxplot() +
labs(title = "Groups of Meteorites and Their Masses",
x = "Meteorite Group",
y = "Mass in Grams (Log scale)",
fill = "Meteorite Group",
caption = "Source: NASA (https://catalog.data.gov/dataset/meteorite-landings)") +
scale_y_continuous(trans='log10') +
theme_bw()
box
## Warning: Transformation introduced infinite values in continuous y-axis
## Warning: Removed 143 rows containing non-finite values (`stat_boxplot()`).
I found this box plot very interesting, although not very surprising. I had to use a logarithmic scale because the vast majority of meteorites found are tiny; on a linear scale, all the boxes sit at 0 and every other point shows up as an outlier. The graph shows pretty much what I expected: the groups with more iron have much larger masses. I think this is because iron holds together much better when falling through the atmosphere. It was surprising to me that stony-iron didn’t have any outliers while both iron and stony had a bunch. My best guess is that stony-iron meteorites are much more uniform (being close to 50% iron and 50% stone), while the other categories span a much larger range.
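The two warnings printed with the plot come from the log scale: a mass of 0 becomes negative infinity under log10, and infinite or missing values cannot be drawn. The sketch below shows this on made-up masses. The values are hypothetical, and the output above does not say how the 143 removed rows split between zero and missing masses.

```r
# Hypothetical masses in grams, chosen to span several orders of magnitude.
mass_g <- c(0, 0.5, 10, 1000, 60000, NA)

# log10 turns 0 into -Inf and leaves NA as NA.
log_mass <- log10(mass_g)
print(log_mass)

# Only finite values can be placed on a log axis.
plottable <- is.finite(log_mass)
print(sum(!plottable))  # how many values a log-scale plot would have to drop
```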
popupmet <- paste0(
"<b>Name: </b>", allms$name, "<br>",
"<b>Year: </b>", allms$year, "<br>",
"<b>ID: </b>", allms$id, "<br>",
"<b>Group: </b>", allms$group, "<br>",
"<b>Subgroup: </b>", allms$class, "<br>",
"<b>Class: </b>", allms$recclass, "<br>",
"<b>Mass (Grams): </b>", allms$mass..g., "<br>",
"<b>Geo Location: </b>", allms$GeoLocation, "<br>"
)
pal <- colorFactor(
palette = c("#cc5500", "#C4A484", "#b2beb5"),
domain = factor(allms$group, c("Iron", "Stony-iron", "Stony"))
)
leaflet() |>
setView(lng = -98.5, lat = 40, zoom = 1) |>
addProviderTiles("Esri.WorldStreetMap") |>
addCircles(
data = allms,
color = ~pal(group),
lng = allms$reclong,
lat = allms$reclat,
radius = sqrt(allms$mass..g.),
popup = popupmet
)
## Warning in validateCoords(lng, lat, funcName): Data contains 7315 rows with
## either missing or invalid lat/lon values and will be ignored
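The warning means leaflet skipped the rows it could not place on the map. One way to make that explicit is to drop those rows before mapping. This is a minimal sketch on a toy data frame with hypothetical rows. The range check is my own assumption about what "invalid" could cover, not a statement of what leaflet checks internally.

```r
# Hypothetical rows; the column names follow the dataset (reclat, reclong).
toy <- data.frame(
  name = c("A", "B", "C", "D"),
  reclat = c(50.8, NA, 12.3, 95.0),
  reclong = c(6.1, 10.2, NA, 20.0),
  stringsAsFactors = FALSE
)

# Keep rows where both coordinates are present and inside the valid ranges.
has_coords <- !is.na(toy$reclat) & !is.na(toy$reclong) &
  abs(toy$reclat) <= 90 & abs(toy$reclong) <= 180

mappable <- toy[has_coords, ]
print(mappable$name)
print(sum(!has_coords))  # number of rows that would be left off the map
```

If rows were dropped like this in the project, the popup text would need to be built from the filtered data frame so that each popup still lines up with its circle.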