
---
title: "Meteorite Landings"
author: "480198815, 480393708, 480367983"
subtitle: "Project 2"
date: "University of Sydney | MATH1005 | 09/2018"
output:
  html_document:
    fig_caption: yes
    number_sections: yes
    self_contained: yes
    theme: flatly
    toc: true
    toc_depth: 3
    toc_float: true
    code_folding: hide
---



# Executive Summary

The aim of this report is to analyse a dataset that compiles the characteristics of all known meteorite landings. For each landing, the type of meteorite, mass, year of landing and position of landing are given. In addition, an identifier (a name and a number) is provided for each meteorite, along with whether the meteorite has weathered and become unclassifiable (relict) since landing, and whether it was observed falling or was found after it fell.

The main discoveries for our investigation were the distribution of meteorites overall, the distribution and average mass of meteorites classified as ‘relict’ and an analysis of the mass of all meteorites in the data set.

An analysis of the overall distribution of meteorites, intended to determine whether certain areas of the earth encounter strikes more often, found inconsistencies related to observability. Areas with higher populations or with research centres recorded a larger number of observed strikes. While this was not what the analysis initially set out to explore, it was the only compelling conclusion that could be drawn, owing to several confounding variables.

The investigation into the location and mass of meteorites classified as ‘relict’ produced varied findings. The small amount of data prevented reliable conclusions, as the variance that comes with small sample sizes is often too large to allow certainty. However, relict meteorites were often found in clusters, and their mass was, on average, lower than that of meteorites classified as valid.

Finally, through the graphical representation of the data, the skewed distribution of meteorite mass could be discerned. A large skew toward lower masses was found in the investigation. From this it was possible to suggest that meteorites may skew toward the lower end of the mass spectrum due to their almost inevitable break-up upon entering the earth’s atmosphere.

The analysis of the data set revealed just how difficult it is to find reliable sets of data. This seemingly sound dataset, once analysed, revealed multiple statistical hurdles. As the investigation proceeded, increasing numbers of confounding variables, incomplete rows of data and unclassifiable meteorites were discovered. The original hypotheses were repeatedly revised upon the discovery of factors that would undermine the reliability of any conclusions.



Full Report

Initial Data Analysis (IDA)

library(readr)
Meteorite_Landings <- read_csv("~/Desktop/Meteorite_Landings.csv")
## Parsed with column specification:
## cols(
##   name = col_character(),
##   id = col_integer(),
##   nametype = col_character(),
##   recclass = col_character(),
##   `mass (g)` = col_double(),
##   fall = col_character(),
##   year = col_character(),
##   reclat = col_double(),
##   reclong = col_double(),
##   GeoLocation = col_character()
## )

Here we can take a look at the first six rows of the dataset to get an idea of what the rest looks like.

head(Meteorite_Landings)
## # A tibble: 6 x 10
##   name     id nametype recclass `mass (g)` fall  year  reclat reclong
##   <chr> <int> <chr>    <chr>         <dbl> <chr> <chr>  <dbl>   <dbl>
## 1 Aach…     1 Valid    L5               21 Fell  01/0…   50.8    6.08
## 2 Aarh…     2 Valid    H6              720 Fell  01/0…   56.2   10.2 
## 3 Abee      6 Valid    EH4          107000 Fell  01/0…   54.2 -113   
## 4 Acap…    10 Valid    Acapulc…       1914 Fell  01/0…   16.9  -99.9 
## 5 Achi…   370 Valid    L6              780 Fell  01/0…  -33.2  -65.0 
## 6 Adhi…   379 Valid    EH4            4239 Fell  01/0…   32.1   71.8 
## # ... with 1 more variable: GeoLocation <chr>

This is a large dataset with 45716 observations and 10 variables.

dim(Meteorite_Landings)
## [1] 45716    10

R classifies most of the variables as characters and the rest as numeric values. Some of these classifications are not ideal: for example, R reads the year of each landing as a character string even though it represents a date. The mass, longitude and latitude of the meteorites are read as numeric values, which is correct since they are quantitative variables.

class(Meteorite_Landings)
## [1] "tbl_df"     "tbl"        "data.frame"
  • Produce a snapshot of the data
str(Meteorite_Landings)
## Classes 'tbl_df', 'tbl' and 'data.frame':    45716 obs. of  10 variables:
##  $ name       : chr  "Aachen" "Aarhus" "Abee" "Acapulco" ...
##  $ id         : int  1 2 6 10 370 379 390 392 398 417 ...
##  $ nametype   : chr  "Valid" "Valid" "Valid" "Valid" ...
##  $ recclass   : chr  "L5" "H6" "EH4" "Acapulcoite" ...
##  $ mass (g)   : num  21 720 107000 1914 780 ...
##  $ fall       : chr  "Fell" "Fell" "Fell" "Fell" ...
##  $ year       : chr  "01/01/1880 12:00:00 AM" "01/01/1951 12:00:00 AM" "01/01/1952 12:00:00 AM" "01/01/1976 12:00:00 AM" ...
##  $ reclat     : num  50.8 56.2 54.2 16.9 -33.2 ...
##  $ reclong    : num  6.08 10.23 -113 -99.9 -64.95 ...
##  $ GeoLocation: chr  "(50.775000, 6.083330)" "(56.183330, 10.233330)" "(54.216670, -113.000000)" "(16.883330, -99.900000)" ...
##  - attr(*, "spec")=List of 2
##   ..$ cols   :List of 10
##   .. ..$ name       : list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   .. ..$ id         : list()
##   .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
##   .. ..$ nametype   : list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   .. ..$ recclass   : list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   .. ..$ mass (g)   : list()
##   .. .. ..- attr(*, "class")= chr  "collector_double" "collector"
##   .. ..$ fall       : list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   .. ..$ year       : list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   .. ..$ reclat     : list()
##   .. .. ..- attr(*, "class")= chr  "collector_double" "collector"
##   .. ..$ reclong    : list()
##   .. .. ..- attr(*, "class")= chr  "collector_double" "collector"
##   .. ..$ GeoLocation: list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   ..$ default: list()
##   .. ..- attr(*, "class")= chr  "collector_guess" "collector"
##   ..- attr(*, "class")= chr "col_spec"
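
The column specification above shows year stored as a character string. A minimal sketch of one way to recover a numeric year is given here; it assumes every entry follows the "MM/DD/YYYY hh:mm:ss AM" layout seen in the str() output, and the helper name parse_year is hypothetical rather than part of the original analysis.

```r
# First four `year` values copied from the str() output above.
year_chr <- c("01/01/1880 12:00:00 AM", "01/01/1951 12:00:00 AM",
              "01/01/1952 12:00:00 AM", "01/01/1976 12:00:00 AM")

# Assumption: the four-digit year always sits in characters 7 to 10.
# parse_year is a hypothetical helper, not a function from readr or base R.
parse_year <- function(x) as.integer(substr(x, 7, 10))

landing_year <- parse_year(year_chr)
landing_year
```

Extracting the year by position avoids locale-dependent AM/PM parsing, at the cost of relying on a fixed layout; missing entries stay NA.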

Summary:

  • The data is sourced from NASA’s open data portal. NASA in turn obtained this data from the Meteoritical Society, which is a non-profit scholarly organisation founded in 1933 and focused on the study of meteorites.

  • The data is likely to be valid because both sources are reputable: the Meteoritical Society focuses on the study of meteorites to better understand the history and origin of the solar system, and NASA is the agency of the United States federal government responsible for aerospace research. NASA’s decision to publish the data in its open database lends it further credibility.

  • The stakeholders in this data can be categorised into those who produced it and those who use it. The Meteoritical Society is a vital stakeholder as it was central to compiling this large database, and the people who conducted the observations included members and affiliates of the society, to which just about every scientist or researcher involved in the study of meteorites belongs. The next most important stakeholder is NASA. Its role in publishing this data as part of its open-source initiative is extremely important in the dissemination of information on the subject. In addition, NASA, through its study of celestial bodies, has contributed to this data and also benefited from it. Other stakeholders include members of the public such as students and citizen scientists who make use of this data, and those in the amateur astronomy community for whom such data may be useful or interesting. These stakeholders are much less important as they were not involved in the compilation of the data, but they nonetheless play a role in supporting the publication of data that would otherwise have been inaccessible.

  • One possible issue is that, when analysing the susceptibility of an area to a meteorite landing, the data does not account for the confounding variable of observability. This introduces a bias towards areas where a falling or fallen meteorite is more likely to be observed (where humans live or have easy access). The bias is exemplified in our scatter plot of landings by longitude and latitude: there were close to zero reported landings in the ocean because there is nobody there to observe them. If we did not take this confounding variable into account, we might have concluded that almost no meteorites land in the ocean. Another issue is that technology in the past was not advanced enough to measure the mass of meteorites accurately. There are also instances where data is missing because the full information cannot be known. As a result the data does not fully represent reality, so the analysis may not be fully correct. A further issue is that the downloaded data is poorly formatted, which hinders analysis.

  • Each row represents a meteorite that has landed on earth

  • Each column represents a property of the meteorite: its name and identification number, whether it has weathered since landing, its type (composition), its mass in grams, whether it was observed falling or was found after it fell, its time/date of landing, and the coordinates of its landing.
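
The missing-data issue raised in the summary can be quantified before any analysis. The sketch below counts missing values per column on a small toy table; the first row uses values from the head() output, while the NA entries and the helper name count_missing are hypothetical and only illustrate the idea.

```r
# Hypothetical helper: number of missing values in each column.
count_missing <- function(df) colSums(is.na(df))

# Toy table: row 1 is taken from the head() output; the NA cells are made up.
toy <- data.frame(
  mass_g  = c(21, NA, 107000),
  reclat  = c(50.8, 56.2, NA),
  reclong = c(6.08, 10.2, NA)
)

missing_counts <- count_missing(toy)
missing_counts

# Rows with no missing value in any column.
complete_rows <- sum(complete.cases(toy))
complete_rows
```

Applied to the full data frame, the same two calls would show how many rows can actually be used for analyses that need both mass and coordinates.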


Research Question 1

Are there latitudes and longitudes that have a higher frequency of meteorite impact?

Before answering the question, we hypothesise that there will be a higher frequency of landings in locations near the equator. This is based on the assumption that most meteorites (and indeed most orbiting bodies) orbit on a plane that closely matches earth’s equatorial plane. We would then expect a random distribution of landings longitudinally, assuming that meteorites are equally likely to land at any time of day.

To begin to answer this question we plot the latitude of each landing site against the longitude to generate a ‘map’ of where these landings occur.

lat = Meteorite_Landings$reclat
long = Meteorite_Landings$reclong


plot(long,lat,main="Distribution of Meteorites across the surface of the Earth", 
    xlab="Longitude", ylab="Latitude")

The plot makes two things immediately obvious. Firstly, as mentioned earlier, observed and measured meteorite landings occur over land because meteorites that land at sea are unlikely to be detected. As a result the data displays a correlation between meteorite landings and landmass that is the result of an observation bias. Secondly, the data needs to be tidied: many entries have no coordinates, and one point has an invalid longitude. This aberrant data point is inconsistent with the convention of describing longitude as a number between -180 and 180. It is entered as 354.47 but likely refers to a longitude of about -5.5 degrees (354.47 - 360). The aberrant entry, found by inspecting the data, will be removed. The entries with coordinates given as (0,0) will be left on the ‘map’ but removed from later analysis.

Meteorite_Landings$reclong[22947] 
## [1] 354.4733
NewMeteoriteLandings <- Meteorite_Landings[-c(22947), ]

lat = NewMeteoriteLandings$reclat
long = NewMeteoriteLandings$reclong


plot(long, lat, main="Distribution of Meteorites across the surface of the Earth", xlab="Longitude", ylab="Latitude")
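
The aberrant entry was located by inspection. A range check is one way to find such entries without scanning the data by eye. The sketch below is an illustration only: the helper names find_bad_coords and wrap_longitude are hypothetical, the toy coordinates are made up apart from the longitude 354.4733 reported above, and wrapping the longitude instead of deleting the row is an alternative to what the report does.

```r
# Hypothetical helper: indices of rows whose coordinates fall outside
# the valid ranges (latitude -90..90, longitude -180..180).
find_bad_coords <- function(lat, long) {
  which(abs(lat) > 90 | abs(long) > 180)
}

# Hypothetical helper: map longitudes given on a 0..360 scale back to -180..180.
wrap_longitude <- function(long) {
  ifelse(!is.na(long) & long > 180, long - 360, long)
}

# Toy coordinates; only the longitude 354.4733 comes from the dataset.
toy_lat  <- c(50.775, NA, -6.0, 0)
toy_long <- c(6.08333, NA, 354.4733, 0)

bad <- find_bad_coords(toy_lat, toy_long)
bad

wrapped <- wrap_longitude(toy_long)
wrapped
```

Rows with missing coordinates are ignored by which(), so only genuinely out-of-range values are flagged.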

Having eliminated the outlier, the data visualisation is clearer. The data now reveals that meteorite landings are least common in remote, largely uninhabited areas. Of note is the lack of landings in the Amazon Rainforest, Siberia and the deserts of Africa, Australia and the Middle East. This further reinforces the previous discovery that meteorite landing data is driven more by human observation than by the actual occurrence of meteorite landings.

Generating histograms of the latitudes and longitudes, excluding outliers, provides further information on the distribution of landings, beyond what can be discerned in the plot.

newLat <- lat[ lat != 0 ]
hist(newLat, breaks = 180)

newLong <- long[ long != 0]
hist(newLong, breaks = 180)
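
The two filters above drop zeros from latitude and longitude separately, which would also discard any genuine landing lying exactly on the equator or the prime meridian. A hedged alternative, shown here on made-up toy coordinates, removes only rows where both values are zero or where either is missing. The helper name drop_null_coords is hypothetical.

```r
# Hypothetical helper: keep rows with both coordinates present,
# excluding only the placeholder position (0, 0).
drop_null_coords <- function(lat, long) {
  keep <- !is.na(lat) & !is.na(long) & !(lat == 0 & long == 0)
  data.frame(lat = lat[keep], long = long[keep])
}

# Toy coordinates (illustrative values only).
toy_lat  <- c(50.8, 0, 0,    NA, -33.2)
toy_long <- c(6.08, 0, 35.5, 10, -65.0)

clean <- drop_null_coords(toy_lat, toy_long)
clean
```

Filtering both vectors with the same mask also keeps latitude and longitude aligned row by row, which matters if the cleaned values are later plotted against each other.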

Firstly, there are extreme peaks of meteorite landings at approximately 165 degrees east and at far southern latitudes (strongly negative values). These coordinates indicate the position of several Antarctic research stations, most notably the American McMurdo Antarctic Research outpost. This indicates that there has been a disproportionately greater effort to find meteorites in this area. This is consistent not just with the location of the research stations and the level of scientific research conducted there, but also with the fact that meteorites are less likely to become weathered when frozen in ice. It suggests that scientists are actively searching for meteorites in areas where they are expected to be well preserved.

Other locations that stand out in the histograms are North America (20-30 North, 100 West), Europe (30-40 North, 0-50 East) and Australia (25 South, 125 East). Again this is consistent with the level of research conducted in this field and the higher population density of Europe and North America.
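
The peaks read off the histograms can also be checked numerically by counting landings per latitude band. The following sketch uses made-up toy latitudes and an arbitrary band width of 30 degrees, both of which are assumptions for illustration.

```r
# Toy latitudes (illustrative values only, not taken from the dataset).
toy_lat <- c(-77.8, -76.5, 35, 50.8, -25)

# Cut the range -90..90 into 30-degree bands and count entries per band.
bands <- cut(toy_lat, breaks = seq(-90, 90, by = 30), include.lowest = TRUE)
band_counts <- table(bands)
band_counts
```

Sorting the resulting table in decreasing order would list the most heavily sampled bands first, which makes a peak such as the Antarctic one easy to report as a count.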

Another likely reason that this dataset does not accurately reflect reality is that it may not include data from all sources and agencies investigating meteorite landings. This would skew the data towards locations in the US or locations that the US works in and collaborates with (e.g. McMurdo, Europe, Australia).

Summary: In conclusion, the question of whether meteorites more commonly impact certain locations cannot be accurately answered with this data, and the results obtained were not predicted by the initial assumptions. This data has been skewed by several confounding factors and biases. An observation bias exists toward observing and recovering meteorites on land in accessible, populated regions or in regions where meteorites are unusually well preserved, such as Antarctica. The spread of landmass across the globe and the accessibility of this land are confounding factors, along with how well the local climate preserves fallen meteorites.

Research Question 2

Do relict meteorites fall evenly across the globe and are they, on average, of a lower mass?

Of the 45,716 recorded meteorites, only 75 are recorded as relict.

library(MASS)
type = Meteorite_Landings$nametype
type.freq = table(type)
type.freq
## type
## Relict  Valid 
##     75  45641
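
To put the count in context, the relict share can be expressed as a percentage. This small sketch re-enters the two counts printed above instead of reading the file; on the real data, prop.table(type.freq) would give the same proportions.

```r
# Counts copied from the frequency table above.
type_counts <- c(Relict = 75, Valid = 45641)

# Share of each name type among all recorded meteorites.
type_share <- type_counts / sum(type_counts)
round(100 * type_share, 2)
```

Relict meteorites make up well under one percent of the entries, which is why the variance concerns raised in this section apply.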

From the plot of relict meteorites according to latitude and longitude, an uneven distribution is observable. In fact, the plot appears quite sparse. This is due, in part, to the low number of relict data points. It is also likely a result of several relict meteorites being found in the same area. A plausible explanation is that the process by which a meteorite becomes relict, through eroding and breaking apart, results in two or more meteorites being collected instead of one. While the data does suggest that there are specific places where relict meteorites are discovered, the degree of variance due to the low sample size does not allow any conclusive or compelling arguments to be drawn.

relict_Long = Meteorite_Landings$reclong[Meteorite_Landings$nametype == "Relict"]
relict_Lat = Meteorite_Landings$reclat[Meteorite_Landings$nametype == "Relict"]
plot(relict_Long, relict_Lat, main="Distribution of Relict Meteorites",
    xlab="Longitude", ylab="Latitude")

The average mass of relict meteorites was found to be:

relict_mass = Meteorite_Landings$`mass (g)`[Meteorite_Landings$nametype == "Relict"]
average = sum(relict_mass[2], relict_mass[5:11])/8
average
## [1] 0.394125
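
The calculation above picks eight entries by hard-coded index. If those are simply the entries with a recorded mass, the same average can be obtained without indices by skipping missing values. This is an assumption about why those indices were chosen, so the sketch below uses a made-up toy vector and may not reproduce the 0.394125 figure on the real data.

```r
# Toy relict masses in grams; NA marks an unrecorded mass (hypothetical values).
relict_mass_toy <- c(NA, 0.5, NA, NA, 0.3, 0.1)

# Mean over the recorded masses only.
mean_recorded <- mean(relict_mass_toy, na.rm = TRUE)
mean_recorded

# Number of entries that actually contributed to the mean.
n_recorded <- sum(!is.na(relict_mass_toy))
n_recorded
```

Reporting the number of contributing entries alongside the mean makes the small sample size explicit.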

It was found that relict meteorites were, on average, of lower mass than valid meteorites, with the eight relict entries used in the calculation averaging only 0.39 grams. This was as expected, as the classification of a meteorite as ‘relict’ implies that some form of erosion has occurred and therefore that the meteorite has lost mass.

From the analysis of the relict data two things could be determined. Firstly, relict meteorites are often found in clusters. This finding is attributed to the fact that a broken and eroded meteorite will be recorded and identified as more than one. Secondly, the average mass of relict meteorites was lower than that of valid meteorites. This was, quite predictably, due to the effect of erosion on the mass of an object. While these findings were evident, they are by no means conclusive. According to the Meteoritical Society, relict meteorites are those that either “cannot be assigned a class” or are “Highly altered object[s] that may have a meteoritic origin. These are dominantly (>95%) composed of secondary minerals formed on the body on which the object was found.” These classification criteria make relict meteorites difficult to identify. The result was a limited amount of data on the variable as well as incomplete information. For example, many of the relict meteorites did not have a recorded mass or position. Overall, however, the findings provided some insight into the nature of relict meteorites.
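
The comparison between relict and valid masses can be made explicit by summarising both groups side by side. The sketch below is a toy illustration: the three valid masses come from the head() output, the relict values are hypothetical, and mass_by_type is a made-up helper name. The median is included because the mass distribution is strongly skewed.

```r
# Hypothetical helper: count, mean and median of recorded mass per name type.
mass_by_type <- function(mass, type) {
  sapply(split(mass, type), function(x) {
    c(n      = sum(!is.na(x)),
      mean   = mean(x, na.rm = TRUE),
      median = median(x, na.rm = TRUE))
  })
}

# Toy data: valid masses from the head() output, relict masses made up.
toy_type <- c("Valid", "Valid", "Valid", "Relict", "Relict")
toy_mass <- c(21, 720, 107000, 0.5, NA)

mass_summary <- mass_by_type(toy_mass, toy_type)
mass_summary
```

On the real data the call would be mass_by_type applied to the mass and nametype columns, giving one column of summary statistics per group.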

Research Question 3


What were the masses of the heaviest meteorites, where were they located, when were they found and what is their composition? What is the mass distribution of the meteorites?

The ten heaviest meteorites that have landed on earth each have a mass of at least 22 tonnes, with the heaviest being the meteorite named Hoba, with a mass of 60 tonnes, which was found in Namibia. Of these ten meteorites, only one was observed falling to earth, in 1947 in Russia, making it the largest observed fall in the dataset. The other nine were found after they had landed, without their fall being observed. For these meteorites, the date given is the date the meteorite was found by humans, not necessarily the date it landed on earth, which could have been long before. For example, Hoba is estimated to have landed on earth within the past 80,000 years. These meteorites were found mostly during the past 200 years, the one exception being Campo del Cielo, which was found in 1575. The most recent entry among these ten is Sikhote-Alin, which was observed falling to earth. Looking up the locations given by the longitude and latitude shows that the ten heaviest meteorites are spread across the globe, with two in Namibia and two in Mexico. The others are in Greenland, Argentina, the United States (Arizona), China, Australia and Russia. All ten are classified as iron meteorites, so there may be a link between a meteorite’s mass and its composition.

Nmass = as.numeric(Meteorite_Landings$`mass (g)`)
head(Meteorite_Landings[order(Nmass, decreasing = T),][c(1,4,5,6,7,10)],10)
## # A tibble: 10 x 6
##    name       recclass     `mass (g)` fall  year         GeoLocation      
##    <chr>      <chr>             <dbl> <chr> <chr>        <chr>            
##  1 Hoba       Iron, IVB      60000000 Found 01/01/1920 … (-19.583330, 17.…
##  2 Cape York  Iron, IIIAB    58200000 Found 01/01/1818 … (76.133330, -64.…
##  3 Campo del… Iron, IAB-MG   50000000 Found 12/22/1575 … (-27.466670, -60…
##  4 Canyon Di… Iron, IAB-MG   30000000 Found 01/01/1891 … (35.050000, -111…
##  5 Armanty    Iron, IIIE     28000000 Found 01/01/1898 … (47.000000, 88.0…
##  6 Gibeon     Iron, IVA      26000000 Found 01/01/1836 … (-25.500000, 18.…
##  7 Chupaderos Iron, IIIAB    24300000 Found 01/01/1852 … (27.000000, -105…
##  8 Mundrabil… Iron, IAB-u…   24000000 Found 01/01/1911 … (-30.783330, 127…
##  9 Sikhote-A… Iron, IIAB     23000000 Fell  01/01/1947 … (46.160000, 134.…
## 10 Bacubirito Iron, ungro…   22000000 Found 01/01/1863 … (26.200000, -107…
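
Since the discussion quotes masses in tonnes while the table is in grams, a small conversion step can make the two easier to compare. The sketch below re-enters a few rows from the table above (only those whose names are not truncated) and uses a hypothetical helper called heaviest.

```r
# Rows copied from the table above (names shown in full there).
toy <- data.frame(
  name   = c("Gibeon", "Hoba", "Bacubirito", "Cape York"),
  mass_g = c(26000000, 60000000, 22000000, 58200000)
)

# Hypothetical helper: the n heaviest rows, with mass also given in tonnes.
heaviest <- function(df, n = 3) {
  ranked <- df[order(df$mass_g, decreasing = TRUE), ]
  ranked$mass_t <- ranked$mass_g / 1e6  # 1 tonne = 1,000,000 g
  head(ranked, n)
}

top2 <- heaviest(toy, n = 2)
top2
```

The same helper applied to the full data frame, with the mass column renamed accordingly, would reproduce the ranking above with an extra tonnes column.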

As the boxplot shows, not all meteorites that have landed on earth are as heavy as the ten shown above. The majority have a small mass, with 40762 meteorites having a mass of 1 kg or under.

boxplot(Nmass, horizontal = TRUE, xlab = "Mass of the meteorites (g)", main = "Boxplot of meteorite mass")

under = Nmass <= 1000
summary(under)
##    Mode   FALSE    TRUE    NA's 
## logical    4823   40762     131
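
The counts in this summary can be turned into a proportion of the meteorites with a recorded mass. The sketch below re-enters the three counts printed above; on the real data, mean(Nmass <= 1000, na.rm = TRUE) would give the same share directly.

```r
# Counts copied from the summary() output above.
mass_counts <- c(over_1kg = 4823, at_most_1kg = 40762, missing = 131)

# Meteorites with a recorded mass (missing values excluded).
recorded <- mass_counts[["over_1kg"]] + mass_counts[["at_most_1kg"]]

# Share of recorded masses that are 1 kg or under, as a percentage.
share_at_most_1kg <- 100 * mass_counts[["at_most_1kg"]] / recorded
round(share_at_most_1kg, 1)
```

Roughly nine in ten recorded masses are 1 kg or under, which quantifies the skew visible in the boxplot.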

In summary, meteorites heavier than 1 kg are relatively uncommon (4823 of the 45585 with a recorded mass, about 11%), and meteorites that come close to the top ten in mass are rarer still. The scarcity of heavy meteorites may be because meteorites break up into small pieces during their fall to earth. The composition of a meteorite may also be linked to whether or not it breaks up: since all of the heaviest meteorites are composed of iron, it is plausible that an iron composition allows a meteorite to land in one piece, or at least to break up less easily.
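
Because the masses range from a few grams to 60 tonnes, a linear boxplot compresses almost all of the data into a thin line. One common option, offered here as a suggestion and not as part of the original analysis, is to look at the distribution on a log10 scale. The toy masses below are taken from the head() output plus Hoba; the bin edges are an arbitrary choice.

```r
# Masses in grams from the first six rows of the dataset, plus Hoba.
toy_mass <- c(21, 720, 107000, 1914, 780, 4239, 60000000)

# Keep recorded, positive masses only, since log10 is undefined at zero.
log_mass <- log10(toy_mass[!is.na(toy_mass) & toy_mass > 0])

# Bin by order of magnitude; plot = FALSE returns the counts without drawing.
# Set plot = TRUE (the default) to draw the histogram itself.
h <- hist(log_mass, breaks = 0:8, plot = FALSE)
h$counts
```

Each bin covers one order of magnitude, so the counts show directly how many meteorites weigh between 10 and 100 g, 100 g and 1 kg, and so on.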


Conclusions

To conclude, the Meteorite Landings database, which records the mass and position of meteorites that have landed on Earth and when they landed, has proved to be an intriguing source of data to analyse. The first research question determined that the data does not accurately convey the distribution of meteorite landings across the world because of an inherent observation bias. Meteorite landings are unlikely to be detected if they occur in remote locations such as at sea or in rainforests and deserts. Furthermore, researchers tend to look in areas where meteorites are better preserved, such as Antarctica. With question 2, it was found that there is very little data on relict meteorites to draw on. They represent an extremely small subset of meteorites that tends to be of lower mass than valid meteorites. This is consistent with the understanding that relict meteorites are almost always heavily eroded, but the data is not strong enough to conclusively indicate whether this is really the case. It was also found that relict meteorites tend to be found in clusters, which may be due to a single meteorite breaking up over an extended period of time. Finally, with question 3, it was shown that the overwhelming majority of meteorites have a mass of 1 kg or less. Of the more than 45,000 data points, the ten heaviest each had a mass of at least 22 tonnes. Most of these meteorites likely fell in the distant past, and nine of the ten were not directly observed when they impacted. The analysis conducted on this dataset has revealed findings that were both intriguing and unexpected. As a result of this analysis, a deeper understanding of meteorite landings across the globe has been achieved.


References

Meteoritical Society. (2018). Retrieved from https://en.wikipedia.org/wiki/Meteoritical_Society

Meteoritical Bulletin: Recommended classifications. (2018). Retrieved from https://www.lpi.usra.edu/meteor/metbullclass.php?sea=Relict%20meteorite

About NASA. (2018). Retrieved from https://www.nasa.gov/about/index.html

Meteorite Landings. (2018). Retrieved from https://data.nasa.gov/Space-Science/Meteorite-Landings/gh4g-9sfh/data

Personal reflection on group work