Specialisation | Data Analysis and Interpretation |
Course | Data Management and Visualisation |
Education Institution | Wesleyan University |
Publisher | Coursera |
Assignment | Running Your Second Program |
The Mars Craters data set was made available by Wesleyan University/Coursera as part of the Data Management and Visualisation course of the Data Analysis and Interpretation Specialisation. It comes from the Ph.D. thesis "Planetary Surface Properties, Cratering Physics, and the Volcanic History of Mars from a New Global Martian Crater Database" (2011) by S.J. Robbins, University of Colorado at Boulder.
The data set has a total of 384343 observations and 10 variables.
The variables are: CRATER_ID, CRATER_NAME, LATITUDE_CIRCLE_IMAGE, LONGITUDE_CIRCLE_IMAGE, DIAM_CIRCLE_IMAGE, DEPTH_RIMFLOOR_TOPOG, MORPHOLOGY_EJECTA_1, MORPHOLOGY_EJECTA_2, MORPHOLOGY_EJECTA_3 and NUMBER_LAYERS.
Hemisphere is a variable derived from LATITUDE_CIRCLE_IMAGE that turns the continuous latitude into three categories (North, South and Equator), for the sake of brevity.
Seven craters lie exactly on the Equator (latitude equal to zero). Just over 60% of the observations are in the Southern Hemisphere. No observation is missing a value.
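The same categorisation can be written in vectorised form. The sketch below is only an illustration: the latitude values are made up, and `np.select` is used here as an alternative to the row-by-row function in the Python program further down.

```python
import numpy as np
import pandas as pd

# Hypothetical latitudes, chosen only to exercise the three branches
latitude = pd.Series([12.5, -47.3, 0.0, -3.1, 88.0])

# Positive latitude -> North, negative -> South, anything else -> Equator
hemisphere = pd.Series(
    np.select([latitude > 0, latitude < 0], ["North", "South"], default="Equator"),
    index=latitude.index,
)

# Proportion of observations in each category
print(hemisphere.value_counts(normalize=True))
```

Note that a missing latitude would fall through to the default label, so it is worth checking for nulls before relying on the "Equator" count.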
Quadrangle is a variable derived from both LATITUDE_CIRCLE_IMAGE and LONGITUDE_CIRCLE_IMAGE (see the Wikipedia definition below).
Each Quadrangle holds roughly one to five percent of the recorded craters. MC-16: Memnonia has the most observations (20455 = 5.32%) and MC-10: Lunae Palus the fewest (3478 = 0.90%).
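As a quick check of the two percentages quoted above, the shares can be recomputed from the counts and the total number of observations. This is just the arithmetic, using figures taken from the frequency tables in this post.

```python
# Counts taken from the Quadrangle frequency table in this post
total = 384343
memnonia = 20455
lunae_palus = 3478

# Share of all craters, as a percentage rounded to two decimals
memnonia_pct = round(100 * memnonia / total, 2)
lunae_palus_pct = round(100 * lunae_palus / total, 2)

print(memnonia_pct, lunae_palus_pct)
```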
List of quadrangles on Mars (Wikipedia):
The surface of Mars has been divided into 30 quadrangles by the United States Geological Survey, so named because their borders lie along lines of latitude and longitude and so maps appear rectangular. Martian quadrangles are named after local features and are numbered with the prefix “MC” for “Mars Chart”. West longitude is used.
The map of the planet Mars is divided into the 30 quadrangles. North is at the top; 0°N 180°W is at the far left on the equator. The map images were taken by the Mars Global Surveyor.
From Wikipedia, Source: http://photojournal.jpl.nasa.gov/catalog/PIA03467
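The quadrangle grid has a regular structure: one cap per pole, six 60° cells in each mid-latitude band and eight 45° cells in each band next to the equator. That makes a table-driven lookup possible instead of 30 separate conditions. The sketch below is an assumption-laden illustration, not the program used for the results: `BANDS` and `quadrangle_number` are hypothetical names, the longitude is shifted by 180 as in the programs in this post, and inputs are assumed to lie within -90..90 and -180..180.

```python
# Each band: (minimum latitude, longitude bin width, MC numbers ordered by shifted longitude)
BANDS = [
    (65, 360, [1]),
    (30, 60, [4, 3, 2, 7, 6, 5]),
    (0, 45, [11, 10, 9, 8, 15, 14, 13, 12]),
    (-30, 45, [19, 18, 17, 16, 23, 22, 21, 20]),
    (-65, 60, [26, 25, 24, 29, 28, 27]),
    (-90, 360, [30]),
]

def quadrangle_number(latitude, longitude):
    """Return the MC number for a coordinate, or None if the latitude is below -90."""
    lo = longitude + 180  # same shift as the programs in this post
    for lat_min, width, numbers in BANDS:
        if latitude >= lat_min:
            # Clamp so that lo == 360 falls into the last bin of the band
            idx = min(int(lo // width), len(numbers) - 1)
            return numbers[idx]
    return None

print(quadrangle_number(-10, -30))
```

The bands are listed from north to south, so the first band whose minimum latitude is not above the crater's latitude is the one that applies.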
The variable MORPHOLOGY_EJECTA_1 has 339718 of its 384343 values missing, or 88.4%. The recorded values fall into a large number of categories when the full morphology qualification is considered. Taking only the first classification (the text before the first "/") reduces the number of categories to 29.
Of the recorded values, considering just the first classification, 27069 (60.66%) are of the RD category. The only two other categories with more than 10% are SLERS (5111 = 11.45%) and SLEPS (4998 = 11.20%).
The variable MORPHOLOGY_EJECTA_2 has 364867 of its 384343 values missing, or 94.9%. The recorded values fall into a large number of categories when the full morphology qualification is considered. Taking only the first classification reduces the number of categories to 10.
Of the recorded values, considering just the first classification, 6540 (33.58%) are of the HUSL category. The only two other categories with more than 10% are HUBL (4459 = 22.89%) and SMSL (2797 = 14.36%).
The variable MORPHOLOGY_EJECTA_3 has 383050 of its 384343 values missing, or 99.7%. The recorded values fall into a large number of categories when the full morphology qualification is considered. Taking only the first classification reduces the number of categories to 24.
Of the recorded values, considering just the first classification, 351 (27.15%) are of the PIN-CUSHION category. The only other category with more than 10% is SMALL-CROWN (267 = 20.65%).
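Collapsing a morphology value to its first classification amounts to keeping the text before the first "/", trimming it and upper-casing it. A minimal pandas sketch of that idea follows; the sample strings are made up for illustration and are not taken from the data set.

```python
import pandas as pd

# Hypothetical values, made up to show a compound value, a simple value,
# a missing value and a value with stray spaces
morphology = pd.Series(["Rd/SLEPS", "slers", None, " Rd "])

# Keep the part before the first "/", then trim and upper-case it
main_feature = morphology.str.split("/").str[0].str.strip().str.upper()

missing = main_feature.isnull().sum()
print("Missing:", missing, "of", len(main_feature))
print(main_feature.value_counts())
```

Missing values stay missing through the string operations, so they can be counted separately from the categories.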
The NUMBER_LAYERS variable has six categories (0, 1, 2, 3, 4 and 5) and none of its observations are missing. The vast majority of craters are identified as having “0” layers, counting 364612, or 94.87% of the records.
/* Use Course's Library */
LIBNAME mydata "/courses/d1406ae5ba27fe300" ACCESS = readonly;
/* Configure the Data */
DATA NEW;
/* Data set */
SET mydata.marscrater_pds;
LABEL Hemisphere = "Hemisphere"
Quadrangle = "Quadrangle"
MorphoE1U = "Ejecta Morphology 1 (Group by Main Feature)"
MorphoE2U = "Ejecta Morphology 2 (Group by Main Feature)"
MorphoE3U = "Ejecta Morphology 3 (Group by Main Feature)"
NUMBER_LAYERS = "Maximum Number of Cohesive Layers";
/* Categorise the Latitude in Hemispheres */
IF (LATITUDE_CIRCLE_IMAGE < 0)
THEN Hemisphere = "South ";
ELSE IF (LATITUDE_CIRCLE_IMAGE > 0)
THEN Hemisphere = "North ";
ELSE Hemisphere = "Equator";
/* convert coordinates to Quadrangles: https://en.wikipedia.org/wiki/List_of_quadrangles_on_Mars */
LA = LATITUDE_CIRCLE_IMAGE;
LO = LONGITUDE_CIRCLE_IMAGE + 180;
IF LA >= 65 AND LA <= 90 AND LO >= 0 AND LO <= 360 THEN Quadrangle = "MC-01: Mare Boreum (North Pole)";
IF LA >= 30 AND LA < 65 AND LO >= 120 AND LO < 180 THEN Quadrangle = "MC-02: Diacria";
IF LA >= 30 AND LA < 65 AND LO >= 60 AND LO < 120 THEN Quadrangle = "MC-03: Arcadia";
IF LA >= 30 AND LA < 65 AND LO >= 0 AND LO < 60 THEN Quadrangle = "MC-04: Mare Acidalium";
IF LA >= 30 AND LA < 65 AND LO >= 300 AND LO <= 360 THEN Quadrangle = "MC-05: Ismenius Lacus";
IF LA >= 30 AND LA < 65 AND LO >= 240 AND LO < 300 THEN Quadrangle = "MC-06: Casius";
IF LA >= 30 AND LA < 65 AND LO >= 180 AND LO < 240 THEN Quadrangle = "MC-07: Cebrenia";
IF LA >= 0 AND LA < 30 AND LO >= 135 AND LO < 180 THEN Quadrangle = "MC-08: Amazonis";
IF LA >= 0 AND LA < 30 AND LO >= 90 AND LO < 135 THEN Quadrangle = "MC-09: Tharsis";
IF LA >= 0 AND LA < 30 AND LO >= 45 AND LO < 90 THEN Quadrangle = "MC-10: Lunae Palus";
IF LA >= 0 AND LA < 30 AND LO >= 0 AND LO < 45 THEN Quadrangle = "MC-11: Oxia Palus";
IF LA >= 0 AND LA < 30 AND LO >= 315 AND LO <= 360 THEN Quadrangle = "MC-12: Arabia";
IF LA >= 0 AND LA < 30 AND LO >= 270 AND LO < 315 THEN Quadrangle = "MC-13: Syrtis Major";
IF LA >= 0 AND LA < 30 AND LO >= 225 AND LO < 270 THEN Quadrangle = "MC-14: Amenthes";
IF LA >= 0 AND LA < 30 AND LO >= 180 AND LO < 225 THEN Quadrangle = "MC-15: Elysium";
IF LA >= -30 AND LA < 0 AND LO >= 135 AND LO < 180 THEN Quadrangle = "MC-16: Memnonia";
IF LA >= -30 AND LA < 0 AND LO >= 90 AND LO < 135 THEN Quadrangle = "MC-17: Phoenicis Lacus";
IF LA >= -30 AND LA < 0 AND LO >= 45 AND LO < 90 THEN Quadrangle = "MC-18: Coprates";
IF LA >= -30 AND LA < 0 AND LO >= 0 AND LO < 45 THEN Quadrangle = "MC-19: Margaritifer Sinus";
IF LA >= -30 AND LA < 0 AND LO >= 315 AND LO <= 360 THEN Quadrangle = "MC-20: Sinus Sabaeus";
IF LA >= -30 AND LA < 0 AND LO >= 270 AND LO < 315 THEN Quadrangle = "MC-21: Iapygia";
IF LA >= -30 AND LA < 0 AND LO >= 225 AND LO < 270 THEN Quadrangle = "MC-22: Mare Tyrrhenum";
IF LA >= -30 AND LA < 0 AND LO >= 180 AND LO < 225 THEN Quadrangle = "MC-23: Aeolis";
IF LA >= -65 AND LA < -30 AND LO >= 120 AND LO < 180 THEN Quadrangle = "MC-24: Phaethontis";
IF LA >= -65 AND LA < -30 AND LO >= 60 AND LO < 120 THEN Quadrangle = "MC-25: Thaumasia";
IF LA >= -65 AND LA < -30 AND LO >= 0 AND LO < 60 THEN Quadrangle = "MC-26: Argyre";
IF LA >= -65 AND LA < -30 AND LO >= 300 AND LO <= 360 THEN Quadrangle = "MC-27: Noachis";
IF LA >= -65 AND LA < -30 AND LO >= 240 AND LO < 300 THEN Quadrangle = "MC-28: Hellas";
IF LA >= -65 AND LA < -30 AND LO >= 180 AND LO < 240 THEN Quadrangle = "MC-29: Eridania";
IF LA >= -90 AND LA < -65 AND LO >= 0 AND LO <= 360 THEN Quadrangle = "MC-30: Mare Australe (South Pole)";
/* Collapse the Morphology of Ejecta 1 to its Main Feature, to reduce the output */
IF (INDEX(MORPHOLOGY_EJECTA_1, "/") = 0)
THEN MorphoE1 = MORPHOLOGY_EJECTA_1;
ELSE MorphoE1 = SUBSTR(MORPHOLOGY_EJECTA_1, 1, INDEX(MORPHOLOGY_EJECTA_1, "/") - 1);
MorphoE1U = UPCASE(TRIM(MorphoE1));
/* Collapse the Morphology of Ejecta 2 to its Main Feature, to reduce the output */
IF (INDEX(MORPHOLOGY_EJECTA_2, "/") = 0)
THEN MorphoE2 = MORPHOLOGY_EJECTA_2;
ELSE MorphoE2 = SUBSTR(MORPHOLOGY_EJECTA_2, 1, INDEX(MORPHOLOGY_EJECTA_2, "/") - 1);
MorphoE2U = UPCASE(TRIM(MorphoE2));
/* Collapse the Morphology of Ejecta 3 to its Main Feature, to reduce the output */
IF (INDEX(MORPHOLOGY_EJECTA_3, "/") = 0)
THEN MorphoE3 = MORPHOLOGY_EJECTA_3;
ELSE MorphoE3 = SUBSTR(MORPHOLOGY_EJECTA_3, 1, INDEX(MORPHOLOGY_EJECTA_3, "/") - 1);
MorphoE3U = UPCASE(TRIM(MorphoE3));
PROC SORT;
BY CRATER_ID;
/* Calculate Frequencies and Proportions */
PROC FREQ;
TABLE Hemisphere Quadrangle MorphoE1U MorphoE2U MorphoE3U NUMBER_LAYERS;
RUN;
Hemisphere | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
Equator | 7 | 0.00 | 7 | 0.00 |
North | 150887 | 39.26 | 150894 | 39.26 |
South | 233449 | 60.74 | 384343 | 100.00 |
Quadrangle | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
MC-01: Mare Boreum (North Pole) | 6405 | 1.67 | 6405 | 1.67 |
MC-02: Diacria | 9592 | 2.50 | 15997 | 4.16 |
MC-03: Arcadia | 7835 | 2.04 | 23832 | 6.20 |
MC-04: Mare Acidalium | 4436 | 1.15 | 28268 | 7.35 |
MC-05: Ismenius Lacus | 6762 | 1.76 | 35030 | 9.11 |
MC-06: Casius | 9068 | 2.36 | 44098 | 11.47 |
MC-07: Cebrenia | 15028 | 3.91 | 59126 | 15.38 |
MC-08: Amazonis | 16738 | 4.35 | 75864 | 19.74 |
MC-09: Tharsis | 12921 | 3.36 | 88785 | 23.10 |
MC-10: Lunae Palus | 3478 | 0.90 | 92263 | 24.01 |
MC-11: Oxia Palus | 4666 | 1.21 | 96929 | 25.22 |
MC-12: Arabia | 7640 | 1.99 | 104569 | 27.21 |
MC-13: Syrtis Major | 13522 | 3.52 | 118091 | 30.73 |
MC-14: Amenthes | 13937 | 3.63 | 132028 | 34.35 |
MC-15: Elysium | 18866 | 4.91 | 150894 | 39.26 |
MC-16: Memnonia | 20455 | 5.32 | 171349 | 44.58 |
MC-17: Phoenicis Lacus | 14561 | 3.79 | 185910 | 48.37 |
MC-18: Coprates | 5777 | 1.50 | 191687 | 49.87 |
MC-19: Margaritifer Sinus | 17022 | 4.43 | 208709 | 54.30 |
MC-20: Sinus Sabaeus | 17664 | 4.60 | 226373 | 58.90 |
MC-21: Iapygia | 19422 | 5.05 | 245795 | 63.95 |
MC-22: Mare Tyrrhenum | 19977 | 5.20 | 265772 | 69.15 |
MC-23: Aeolis | 18703 | 4.87 | 284475 | 74.02 |
MC-24: Phaethontis | 14786 | 3.85 | 299261 | 77.86 |
MC-25: Thaumasia | 16011 | 4.17 | 315272 | 82.03 |
MC-26: Argyre | 15775 | 4.10 | 331047 | 86.13 |
MC-27: Noachis | 16141 | 4.20 | 347188 | 90.33 |
MC-28: Hellas | 9535 | 2.48 | 356723 | 92.81 |
MC-29: Eridania | 13930 | 3.62 | 370653 | 96.44 |
MC-30: Mare Australe (South Pol | 13690 | 3.56 | 384343 | 100.00 |
MorphoE1U | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
DLEPC | 495 | 1.11 | 495 | 1.11 |
DLEPCPD | 10 | 0.02 | 505 | 1.13 |
DLEPD | 1 | 0.00 | 506 | 1.13 |
DLEPS | 631 | 1.41 | 1137 | 2.55 |
DLEPSPD | 2 | 0.00 | 1139 | 2.55 |
DLERC | 386 | 0.86 | 1525 | 3.42 |
DLERCPD | 7 | 0.02 | 1532 | 3.43 |
DLERS | 1242 | 2.78 | 2774 | 6.22 |
DLERSRD | 2 | 0.00 | 2776 | 6.22 |
DLSPC | 1 | 0.00 | 2777 | 6.22 |
MLEPC | 22 | 0.05 | 2799 | 6.27 |
MLEPS | 43 | 0.10 | 2842 | 6.37 |
MLERC | 24 | 0.05 | 2866 | 6.42 |
MLERS | 491 | 1.10 | 3357 | 7.52 |
MLERSRD | 1 | 0.00 | 3358 | 7.52 |
PD | 2 | 0.00 | 3360 | 7.53 |
RD | 27069 | 60.66 | 30429 | 68.19 |
SLEPC | 2601 | 5.83 | 33030 | 74.02 |
SLEPCPD | 75 | 0.17 | 33105 | 74.18 |
SLEPCRD | 2 | 0.00 | 33107 | 74.19 |
SLEPD | 44 | 0.10 | 33151 | 74.29 |
SLEPS | 4998 | 11.20 | 38149 | 85.49 |
SLEPSPD | 52 | 0.12 | 38201 | 85.60 |
SLEPSRD | 3 | 0.01 | 38204 | 85.61 |
SLERC | 1280 | 2.87 | 39484 | 88.48 |
SLERCPD | 10 | 0.02 | 39494 | 88.50 |
SLERS | 5111 | 11.45 | 44605 | 99.96 |
SLERSPD | 16 | 0.04 | 44621 | 99.99 |
SLERSRD | 4 | 0.01 | 44625 | 100.00 |
Frequency Missing = 339718
MorphoE2U | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
HU | 1466 | 7.53 | 1466 | 7.53 |
HUAM | 1373 | 7.05 | 2839 | 14.58 |
HUBL | 4459 | 22.89 | 7298 | 37.47 |
HUSL | 6540 | 33.58 | 13838 | 71.05 |
HUSP | 77 | 0.40 | 13915 | 71.45 |
SM | 805 | 4.13 | 14720 | 75.58 |
SMAM | 844 | 4.33 | 15564 | 79.91 |
SMBL | 1097 | 5.63 | 16661 | 85.55 |
SMSL | 2797 | 14.36 | 19458 | 99.91 |
SMSP | 18 | 0.09 | 19476 | 100.00 |
Frequency Missing = 364867
MorphoE3U | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
BUMBLEBEE | 18 | 1.39 | 18 | 1.39 |
BUTTERFLY | 76 | 5.88 | 94 | 7.27 |
INNER IS BUTTERFLY | 2 | 0.15 | 96 | 7.42 |
INNER IS PIN-CUSHION | 87 | 6.73 | 183 | 14.15 |
INNER IS PSEUDO-BUTTERFLY | 1 | 0.08 | 184 | 14.23 |
INNER IS PSEUDO-PIN-CUSHION | 1 | 0.08 | 185 | 14.31 |
INNER IS PSEUDO-SMALL-CROWN | 4 | 0.31 | 189 | 14.62 |
INNER IS SMALL-CROWN | 66 | 5.10 | 255 | 19.72 |
INNER-MOST IS SMALL-CROWN | 1 | 0.08 | 256 | 19.80 |
MIDDLE IS RECTANGULAR | 1 | 0.08 | 257 | 19.88 |
OUTER IS BUTTERFLY | 4 | 0.31 | 261 | 20.19 |
OUTER IS PSEUDO-BUTTERFLY | 4 | 0.31 | 265 | 20.49 |
OUTER IS RECTANGULAR | 1 | 0.08 | 266 | 20.57 |
OUTER IS SPLASH | 56 | 4.33 | 322 | 24.90 |
PIN-CUSHION | 351 | 27.15 | 673 | 52.05 |
PSEDUO-BUTTERFLY | 1 | 0.08 | 674 | 52.13 |
PSEUDO-BUTTERFLY | 118 | 9.13 | 792 | 61.25 |
PSEUDO-PIN-CUSHION | 1 | 0.08 | 793 | 61.33 |
PSEUDO-RECTANGULAR | 26 | 2.01 | 819 | 63.34 |
PSEUDO-SMALL-CROWN | 56 | 4.33 | 875 | 67.67 |
RECTANGULAR | 38 | 2.94 | 913 | 70.61 |
SANDBAR | 58 | 4.49 | 971 | 75.10 |
SMALL-CROWN | 267 | 20.65 | 1238 | 95.75 |
SPLASH | 55 | 4.25 | 1293 | 100.00 |
Frequency Missing = 383050
NUMBER_LAYERS | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
0 | 364612 | 94.87 | 364612 | 94.87 |
1 | 15467 | 4.02 | 380079 | 98.89 |
2 | 3435 | 0.89 | 383514 | 99.78 |
3 | 739 | 0.19 | 384253 | 99.98 |
4 | 85 | 0.02 | 384338 | 100.00 |
5 | 5 | 0.00 | 384343 | 100.00 |
"""
Created on Tue Oct 01 01:27:35 2015
@author: angeloklin
"""
# Import libraries
import pandas as pd
# load data
data = pd.read_csv("marscrater_pds.csv", na_values = [" "], low_memory = False)
# function to return hemisphere
def Hemisphere(Latitude):
    if Latitude > 0:
        return "North"
    elif Latitude < 0:
        return "South"
    else:
        return "Equator"
# function to return quadrangle
def Quadrangle(Coordinates):
    la = Coordinates["LATITUDE_CIRCLE_IMAGE"]
    lo = Coordinates["LONGITUDE_CIRCLE_IMAGE"] + 180
    if la >= 65 and la <= 90 and lo >= 0 and lo <= 360: return "MC-01: Mare Boreum (North Pole)"
    if la >= 30 and la < 65 and lo >= 120 and lo < 180: return "MC-02: Diacria"
    if la >= 30 and la < 65 and lo >= 60 and lo < 120: return "MC-03: Arcadia"
    if la >= 30 and la < 65 and lo >= 0 and lo < 60: return "MC-04: Mare Acidalium"
    if la >= 30 and la < 65 and lo >= 300 and lo <= 360: return "MC-05: Ismenius Lacus"
    if la >= 30 and la < 65 and lo >= 240 and lo < 300: return "MC-06: Casius"
    if la >= 30 and la < 65 and lo >= 180 and lo < 240: return "MC-07: Cebrenia"
    if la >= 0 and la < 30 and lo >= 135 and lo < 180: return "MC-08: Amazonis"
    if la >= 0 and la < 30 and lo >= 90 and lo < 135: return "MC-09: Tharsis"
    if la >= 0 and la < 30 and lo >= 45 and lo < 90: return "MC-10: Lunae Palus"
    if la >= 0 and la < 30 and lo >= 0 and lo < 45: return "MC-11: Oxia Palus"
    if la >= 0 and la < 30 and lo >= 315 and lo <= 360: return "MC-12: Arabia"
    if la >= 0 and la < 30 and lo >= 270 and lo < 315: return "MC-13: Syrtis Major"
    if la >= 0 and la < 30 and lo >= 225 and lo < 270: return "MC-14: Amenthes"
    if la >= 0 and la < 30 and lo >= 180 and lo < 225: return "MC-15: Elysium"
    if la >= -30 and la < 0 and lo >= 135 and lo < 180: return "MC-16: Memnonia"
    if la >= -30 and la < 0 and lo >= 90 and lo < 135: return "MC-17: Phoenicis Lacus"
    if la >= -30 and la < 0 and lo >= 45 and lo < 90: return "MC-18: Coprates"
    if la >= -30 and la < 0 and lo >= 0 and lo < 45: return "MC-19: Margaritifer Sinus"
    if la >= -30 and la < 0 and lo >= 315 and lo <= 360: return "MC-20: Sinus Sabaeus"
    if la >= -30 and la < 0 and lo >= 270 and lo < 315: return "MC-21: Iapygia"
    if la >= -30 and la < 0 and lo >= 225 and lo < 270: return "MC-22: Mare Tyrrhenum"
    if la >= -30 and la < 0 and lo >= 180 and lo < 225: return "MC-23: Aeolis"
    if la >= -65 and la < -30 and lo >= 120 and lo < 180: return "MC-24: Phaethontis"
    if la >= -65 and la < -30 and lo >= 60 and lo < 120: return "MC-25: Thaumasia"
    if la >= -65 and la < -30 and lo >= 0 and lo < 60: return "MC-26: Argyre"
    if la >= -65 and la < -30 and lo >= 300 and lo <= 360: return "MC-27: Noachis"
    if la >= -65 and la < -30 and lo >= 240 and lo < 300: return "MC-28: Hellas"
    if la >= -65 and la < -30 and lo >= 180 and lo < 240: return "MC-29: Eridania"
    if la >= -90 and la < -65 and lo >= 0 and lo <= 360: return "MC-30: Mare Australe (South Pole)"
# function to get the morphology's main feature
def MainMorpho(Morpho):
    if pd.isnull(Morpho):
        return Morpho
    foundAt = Morpho.find("/")
    if foundAt >= 0:
        m = Morpho[0:foundAt]
    else:
        m = Morpho
    return m.strip().upper()
print("Mars Craters' data set summary:")
print("- Number of observations(rows): ", len(data))
print("- Number of variables(columns): ", len(data.columns))
print("")
print("Hemispheres:")
Hemispheres = data["LATITUDE_CIRCLE_IMAGE"].map(lambda lat: Hemisphere(lat))
freq = Hemispheres.value_counts(sort = True)
prop = Hemispheres.value_counts(sort = True, normalize = True)
print("- Missing Values: ", Hemispheres.isnull().sum())
print("- Frequency Table: ")
print("| | Frequency | Proportion |")
for i in range(len(freq)):
    # .iloc selects by position; plain [i] on a label-indexed Series is ambiguous
    print("|", format(freq.index[i], "<10s"), "| ", format(freq.iloc[i], ">6d"), "| ", format(prop.iloc[i], ">.4f"), "|")
print("")
print("Quadrangles:")
Quadrangles = data.loc[:, "LATITUDE_CIRCLE_IMAGE":"LONGITUDE_CIRCLE_IMAGE"].apply(Quadrangle, axis = 1)
freq = Quadrangles.value_counts(sort = True)
prop = Quadrangles.value_counts(sort = True, normalize = True)
print("- Missing Values: ", Quadrangles.isnull().sum())
print("- Frequency Table: ")
print("| | Frequency | Proportion |")
for i in range(len(freq)):
    print("|", format(freq.index[i], "<35s"), "| ", format(freq.iloc[i], ">6d"), "| ", format(prop.iloc[i], ">.4f"), "|")
print("")
print("Ejecta Morphology 1 (Group by Main Feature):")
MorphoE1 = data["MORPHOLOGY_EJECTA_1"].map(lambda morpho: MainMorpho(morpho))
MorphoE1a = MorphoE1[MorphoE1.notnull()]
freq = MorphoE1a.value_counts(sort = True)
prop = MorphoE1a.value_counts(sort = True, normalize = True, dropna = False)
print("- Missing Values: ", MorphoE1.isnull().sum())
print("- Frequency Table: ")
print("| | Frequency | Proportion |")
for i in range(len(freq)):
    print("|", format(freq.index[i], "<10s"), "| ", format(freq.iloc[i], ">6d"), "| ", format(prop.iloc[i], ">.4f"), "|")
print("")
print("Ejecta Morphology 2 (Group by Main Feature):")
MorphoE2 = data["MORPHOLOGY_EJECTA_2"].map(lambda morpho: MainMorpho(morpho))
MorphoE2a = MorphoE2[MorphoE2.notnull()]
freq = MorphoE2a.value_counts(sort = True)
prop = MorphoE2a.value_counts(sort = True, normalize = True, dropna = False)
print("- Missing Values: ", MorphoE2.isnull().sum())
print("- Frequency Table: ")
print("| | Frequency | Proportion |")
for i in range(len(freq)):
    print("|", format(freq.index[i], "<10s"), "| ", format(freq.iloc[i], ">6d"), "| ", format(prop.iloc[i], ">.4f"), "|")
print("")
print("Ejecta Morphology 3 (Group by Main Feature):")
MorphoE3 = data["MORPHOLOGY_EJECTA_3"].map(lambda morpho: MainMorpho(morpho))
MorphoE3a = MorphoE3[MorphoE3.notnull()]
freq = MorphoE3a.value_counts(sort = True)
prop = MorphoE3a.value_counts(sort = True, normalize = True, dropna = False)
print("- Missing Values: ", MorphoE3.isnull().sum())
print("- Frequency Table: ")
print("| | Frequency | Proportion |")
for i in range(len(freq)):
    print("|", format(freq.index[i], "<28s"), "| ", format(freq.iloc[i], ">6d"), "| ", format(prop.iloc[i], ">.4f"), "|")
print("")
print("Maximum Number of Cohesive Layers:")
freq = data["NUMBER_LAYERS"].value_counts(sort = False)
prop = data["NUMBER_LAYERS"].value_counts(sort = False, normalize = True)
print("- Missing Values: ", data["NUMBER_LAYERS"].isnull().sum())
print("- Frequency Table: ")
print("| | Frequency | Proportion |")
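for i in range(len(freq)):
    # The index holds integer labels here, so position-based .iloc keeps label and count aligned
    print("|", format(freq.index[i], "<6d"), "| ", format(freq.iloc[i], ">6d"), "| ", format(prop.iloc[i], ">.4f"), "|")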
print("")
Mars Craters’ data set summary:
Hemispheres:
Missing Values: 0
Frequency Table:
Hemisphere | Frequency | Proportion |
---|---|---|
South | 233449 | 0.6074 |
North | 150887 | 0.3926 |
Equator | 7 | 0.0000 |
Quadrangles:
Missing Values: 0
Frequency Table:
Quadrangle | Frequency | Proportion |
---|---|---|
MC-16: Memnonia | 20455 | 0.0532 |
MC-22: Mare Tyrrhenum | 19977 | 0.0520 |
MC-21: Iapygia | 19422 | 0.0505 |
MC-15: Elysium | 18866 | 0.0491 |
MC-23: Aeolis | 18703 | 0.0487 |
MC-20: Sinus Sabaeus | 17664 | 0.0460 |
MC-19: Margaritifer Sinus | 17022 | 0.0443 |
MC-08: Amazonis | 16738 | 0.0435 |
MC-27: Noachis | 16141 | 0.0420 |
MC-25: Thaumasia | 16011 | 0.0417 |
MC-26: Argyre | 15775 | 0.0410 |
MC-07: Cebrenia | 15028 | 0.0391 |
MC-24: Phaethontis | 14786 | 0.0385 |
MC-17: Phoenicis Lacus | 14561 | 0.0379 |
MC-14: Amenthes | 13937 | 0.0363 |
MC-29: Eridania | 13930 | 0.0362 |
MC-30: Mare Australe (South Pole) | 13690 | 0.0356 |
MC-13: Syrtis Major | 13522 | 0.0352 |
MC-09: Tharsis | 12921 | 0.0336 |
MC-02: Diacria | 9592 | 0.0250 |
MC-28: Hellas | 9535 | 0.0248 |
MC-06: Casius | 9068 | 0.0236 |
MC-03: Arcadia | 7835 | 0.0204 |
MC-12: Arabia | 7640 | 0.0199 |
MC-05: Ismenius Lacus | 6762 | 0.0176 |
MC-01: Mare Boreum (North Pole) | 6405 | 0.0167 |
MC-18: Coprates | 5777 | 0.0150 |
MC-11: Oxia Palus | 4666 | 0.0121 |
MC-04: Mare Acidalium | 4436 | 0.0115 |
MC-10: Lunae Palus | 3478 | 0.0090 |
Ejecta Morphology 1 (Group by Main Feature):
Missing Values: 339718
Frequency Table:
Ejecta Morphology 1 | Frequency | Proportion |
---|---|---|
RD | 27069 | 0.6066 |
SLERS | 5111 | 0.1145 |
SLEPS | 4998 | 0.1120 |
SLEPC | 2601 | 0.0583 |
SLERC | 1280 | 0.0287 |
DLERS | 1242 | 0.0278 |
DLEPS | 631 | 0.0141 |
DLEPC | 495 | 0.0111 |
MLERS | 491 | 0.0110 |
DLERC | 386 | 0.0086 |
SLEPCPD | 75 | 0.0017 |
SLEPSPD | 52 | 0.0012 |
SLEPD | 44 | 0.0010 |
MLEPS | 43 | 0.0010 |
MLERC | 24 | 0.0005 |
MLEPC | 22 | 0.0005 |
SLERSPD | 16 | 0.0004 |
SLERCPD | 10 | 0.0002 |
DLEPCPD | 10 | 0.0002 |
DLERCPD | 7 | 0.0002 |
SLERSRD | 4 | 0.0001 |
SLEPSRD | 3 | 0.0001 |
SLEPCRD | 2 | 0.0000 |
DLEPSPD | 2 | 0.0000 |
DLERSRD | 2 | 0.0000 |
PD | 2 | 0.0000 |
DLEPD | 1 | 0.0000 |
DLSPC | 1 | 0.0000 |
MLERSRD | 1 | 0.0000 |
Ejecta Morphology 2 (Group by Main Feature):
Missing Values: 364867
Frequency Table:
Ejecta Morphology 2 | Frequency | Proportion |
---|---|---|
HUSL | 6540 | 0.3358 |
HUBL | 4459 | 0.2289 |
SMSL | 2797 | 0.1436 |
HU | 1466 | 0.0753 |
HUAM | 1373 | 0.0705 |
SMBL | 1097 | 0.0563 |
SMAM | 844 | 0.0433 |
SM | 805 | 0.0413 |
HUSP | 77 | 0.0040 |
SMSP | 18 | 0.0009 |
Ejecta Morphology 3 (Group by Main Feature):
Missing Values: 383050
Frequency Table:
Ejecta Morphology 3 | Frequency | Proportion |
---|---|---|
PIN-CUSHION | 351 | 0.2715 |
SMALL-CROWN | 267 | 0.2065 |
PSEUDO-BUTTERFLY | 118 | 0.0913 |
INNER IS PIN-CUSHION | 87 | 0.0673 |
BUTTERFLY | 76 | 0.0588 |
INNER IS SMALL-CROWN | 66 | 0.0510 |
SANDBAR | 58 | 0.0449 |
PSEUDO-SMALL-CROWN | 56 | 0.0433 |
OUTER IS SPLASH | 56 | 0.0433 |
SPLASH | 55 | 0.0425 |
RECTANGULAR | 38 | 0.0294 |
PSEUDO-RECTANGULAR | 26 | 0.0201 |
BUMBLEBEE | 18 | 0.0139 |
OUTER IS BUTTERFLY | 4 | 0.0031 |
OUTER IS PSEUDO-BUTTERFLY | 4 | 0.0031 |
INNER IS PSEUDO-SMALL-CROWN | 4 | 0.0031 |
INNER IS BUTTERFLY | 2 | 0.0015 |
INNER-MOST IS SMALL-CROWN | 1 | 0.0008 |
OUTER IS RECTANGULAR | 1 | 0.0008 |
INNER IS PSEUDO-BUTTERFLY | 1 | 0.0008 |
MIDDLE IS RECTANGULAR | 1 | 0.0008 |
PSEUDO-PIN-CUSHION | 1 | 0.0008 |
INNER IS PSEUDO-PIN-CUSHION | 1 | 0.0008 |
PSEDUO-BUTTERFLY | 1 | 0.0008 |
Maximum Number of Cohesive Layers:
Missing Values: 0
Frequency Table:
NUMBER_LAYERS | Frequency | Proportion |
---|---|---|
0 | 364612 | 0.9487 |
1 | 15467 | 0.0402 |
2 | 3435 | 0.0089 |
3 | 739 | 0.0019 |
4 | 85 | 0.0002 |
5 | 5 | 0.0000 |
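The SAS output also reports cumulative frequencies and percentages, which the Python output above does not. One possible way to get a similar layout in pandas is sketched below. The function name `frequency_table` and the sample values are hypothetical, and the result only approximates the PROC FREQ layout.

```python
import pandas as pd

def frequency_table(values):
    """Build a frequency table with cumulative columns, ignoring missing values."""
    freq = values.value_counts(dropna=True).sort_index()
    table = pd.DataFrame({"Frequency": freq})
    table["Percent"] = 100 * freq / freq.sum()
    table["Cumulative Frequency"] = freq.cumsum()
    # Accumulate the unrounded percentages, then round once at the end
    table["Cumulative Percent"] = table["Percent"].cumsum()
    return table.round(2)

# Hypothetical sample, standing in for a column such as NUMBER_LAYERS
layers = pd.Series([0, 0, 0, 1, 1, 2])
table = frequency_table(layers)
print(table)
```

Sorting by the index gives the category order used in the SAS tables; dropping `sort_index()` would instead order the rows by frequency, as in the Python output above.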