INTRODUCTION

In this report, I explore the “satellite database” data set, which comes from the Union of Concerned Scientists Satellite Database. I previously used this data set in a graduate course to predict satellite life expectancy, that is, the time in years that an artificial satellite is expected to remain operational. The data show each satellite’s life expectancy, as provided by the manufacturer, along with its orbital attributes, which mainly describe the satellite’s position as it orbits the Earth. In this report, I focus on the relationships between life expectancy and the other variables, looking closely at Low Earth Orbit (LEO) satellites.

Below is a preview of my previous project. If interested, right-click to download and view the full report.

knitr::include_graphics("KQuimzon_Final.pdf")

Load packages

These are the packages that will be utilized for this report.

library(tidyverse)
library(here)
library(skimr)
library(janitor)
library(psych)
library(cowplot)

DATA PREPARATION

First, the satellite data was loaded into RStudio. Rows with missing values were dropped with na.omit(), and the variable names were standardized with clean_names(). Though the data had been previously cleaned for another study, this ensured that all variables had consistent naming conventions. Then, three variables were renamed with a “type_” prefix to indicate that they belong to a set of categorical Boolean variables. Each satellite can have more than one type, hence the separate Boolean variables.

sats <- read_csv(here("Lab3", "data", "satellites.csv")) # The data is loaded.
## Rows: 3095 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): orbit_class
## dbl (8): life, geo_longitude, perigee, apogee, eccentricity, inclination, pe...
## lgl (3): gov, com, mil
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
clean_sats <- sats %>%      #This is the clean up step.
    na.omit() %>%           #Omit entries with NA values  
    clean_names() %>%       #Clean variable names
    rename(type_gov = gov, type_com = com, type_mil = mil)  #Rename the Boolean variables. 
head(clean_sats)     #This shows the sample of the cleaned data.
## # A tibble: 6 × 12
##    life orbit_class geo_longitude perigee apogee eccentricity inclination period
##   <dbl> <chr>               <dbl>   <dbl>  <dbl>        <dbl>       <dbl>  <dbl>
## 1     1 Ellip                   0     460  33200       0.706         31     580 
## 2     1 Ellip                   0     952   1155       0.0137        31     106.
## 3     2 Ellip                   0    6292 156833       0.856         54.0  4033.
## 4     2 Ellip                   0     461  87304       0.864         15.7  1869.
## 5     2 Ellip                   0     467  87260       0.864         15.7  1868.
## 6     2 Ellip                   0     474  87526       0.864         15.7  1876.
## # ℹ 4 more variables: launch_mass <dbl>, type_gov <lgl>, type_com <lgl>,
## #   type_mil <lgl>

The above table shows a subset of the data. “life” corresponds to the life expectancy in years, “orbit_class” to the type of orbit, “geo_longitude” is where a GEO satellite sits in relation to the Earth, “perigee” is the closest distance of the satellite to the Earth in its orbit, “apogee” is the farthest distance of the satellite from the Earth, “eccentricity” is how far the orbit deviates from a perfect circle, “inclination” is the angle of inclination of the orbit, “period” is the time in minutes that it takes a satellite to complete a full orbit, and “launch_mass” is the mass of the satellite in kilograms. The last three variables, “type_gov,” “type_com,” and “type_mil,” indicate whether the satellite is considered a government, commercial, or military satellite, respectively.

Summary statistics for the data, computed with the “psych” package, are shown below:

describe(clean_sats)
##               vars    n    mean       sd median trimmed    mad     min
## life             1 3066    6.21     4.20    4.0    5.57   1.48    0.25
## orbit_class*     2 3066    2.87     0.46    3.0    2.92   0.00    1.00
## geo_longitude    3 3066    2.36    36.05    0.0    0.00   0.00 -179.80
## perigee          4 3066 6874.57 12919.78  548.0 4089.07 189.77  170.00
## apogee           5 3066 7643.33 16451.28  561.0 4428.84 202.37  280.00
## eccentricity     6 3066    0.01     0.07    0.0    0.00   0.00    0.00
## inclination      7 3066   59.54    30.97   53.0   62.05  43.74    0.00
## period           8 3066  346.71   574.16   95.6  229.00   1.63   91.34
## launch_mass      9 3066 1008.73  1802.85  260.0  576.12 166.05    1.00
## type_gov        10 3066     NaN       NA     NA     NaN     NA     Inf
## type_com        11 3066     NaN       NA     NA     NaN     NA     Inf
## type_mil        12 3066     NaN       NA     NA     NaN     NA     Inf
##                     max     range  skew kurtosis     se
## life              30.00     29.75  1.42     0.77   0.08
## orbit_class*       4.00      3.00 -1.07     2.93   0.01
## geo_longitude    180.00    359.80  0.53     9.26   0.65
## perigee        37782.00  37612.00  1.67     0.92 233.33
## apogee        330000.00 329720.00  5.57    74.49 297.11
## eccentricity       0.96      0.96 11.26   128.44   0.00
## inclination      143.40    143.40 -0.54    -0.41   0.56
## period         11520.00  11428.66  4.84    63.08  10.37
## launch_mass    22500.00  22499.00  3.71    23.89  32.56
## type_gov           -Inf      -Inf    NA       NA     NA
## type_com           -Inf      -Inf    NA       NA     NA
## type_mil           -Inf      -Inf    NA       NA     NA

There are n = 3,066 observations with a total of 12 variables. In absolute terms, the largest standard deviations are in “apogee” and “perigee.”
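The NaN and Inf entries for the three “type_” variables in the summary table appear to arise because they are logical columns rather than numeric ones. Below is a minimal sketch, using a small made-up data frame (`toy` and its values are hypothetical and not taken from the satellite data), of how logical columns could be converted to 0/1 so that their mean reads as the share of TRUE values.

```r
# Hypothetical toy data: the values are invented for illustration only.
toy <- data.frame(
  life     = c(1, 5, 15, 7),
  type_gov = c(TRUE, FALSE, FALSE, TRUE),
  type_com = c(FALSE, TRUE, TRUE, TRUE)
)

# Convert every logical column to numeric 0/1 and leave the others unchanged.
toy_numeric <- toy
is_lgl <- vapply(toy_numeric, is.logical, logical(1))
toy_numeric[is_lgl] <- lapply(toy_numeric[is_lgl], as.numeric)

# The mean of a 0/1 column is the proportion of TRUE values.
type_shares <- colMeans(toy_numeric[is_lgl])
print(type_shares)
```

The same conversion could be applied to clean_sats before calling describe(), if proportions for the three type variables are wanted in the summary.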

ANALYSIS

Part 1: Eccentricity

How does eccentricity affect the lifespan of a satellite? Eccentricity is how “round” the orbit of the satellite is. An orbit that is a perfect circle has an eccentricity of zero, while an ellipse has an eccentricity between zero and one. Read more here: Eccentricity
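To make the definition concrete, eccentricity can be computed from the perigee and apogee distances measured from the Earth's center as e = (r_a - r_p) / (r_a + r_p). The sketch below assumes that “perigee” and “apogee” in the data are altitudes in kilometers above the Earth's surface and that a mean Earth radius of 6,371 km is appropriate; the function name `orbit_eccentricity` is introduced here for illustration only.

```r
# Mean Earth radius in km (an assumed constant, not part of the data set).
earth_radius_km <- 6371

# Eccentricity from perigee and apogee altitudes (km above the surface).
orbit_eccentricity <- function(perigee_km, apogee_km, radius_km = earth_radius_km) {
  r_p <- perigee_km + radius_km   # perigee distance from the Earth's center
  r_a <- apogee_km + radius_km    # apogee distance from the Earth's center
  (r_a - r_p) / (r_a + r_p)
}

# Perigee and apogee of the first two rows of the cleaned data shown earlier.
ecc_check <- orbit_eccentricity(c(460, 952), c(33200, 1155))
print(round(ecc_check, 4))
```

Under these assumptions the results land close to the eccentricity values listed for those two rows (0.706 and 0.0137), which suggests the column follows this definition.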

clean_sats %>%    # A plot of eccentricity vs. life expectancy, colored by the orbit class.
  na.omit() %>% 
  ggplot(aes(x = life, y = eccentricity, color = orbit_class)) + 
  geom_jitter() +
  labs(title = "Eccentricity vs. Life Expectancy",
       subtitle = "Coded by orbit class", 
       caption = "Data from https://www.ucsusa.org/resources/satellite-database", 
       color = "Orbit Class",    # The legend title must match the mapped aesthetic (color, not fill).
       y = "Eccentricity",
       x = "Satellite Life Expectancy")

ggsave("sat_lifespan.png") # Save the plot!
## Saving 7 x 5 in image

When the plot was first created without color, eccentricity didn’t seem to show a clear pattern. When colored by the orbit class, however, one can see that the LEO satellites, in light blue, which tend to have very low eccentricity (close to zero), have shorter lifespans. LEOs, or Low Earth Orbit satellites, orbit the Earth at an altitude of less than 1000 km. The other main types of orbits are GEO and MEO. Compared to GEO satellites, which orbit at about 35,000 km, and MEO satellites, which orbit at roughly half that distance, LEOs orbit the Earth at a much closer distance. Find out more about orbit types here: Orbit Types

Part 2: How is the data for LEO satellites distributed?

clean_sats %>%     #Histogram created for LEO satellite life expectancy. 
  na.omit() %>% 
  filter(orbit_class == "LEO") %>%     # Filter for LEO
  ggplot(aes(x=life)) + 
  geom_histogram(binwidth = 0.75) +    # Set bin width
  theme_bw() +    # Set theme to black and white
  labs(title = "Histogram of LEO satellites life expectancy",
       caption = "Data from https://www.ucsusa.org/resources/satellite-database", 
       y="Count",
       x="Satellite Life Expectancy")

Overall, the data for LEO satellites look as though they could have a positive skew, though quite a few values fall under 15 years of life expectancy. This could be a default or more generic life expectancy selected by aerospace companies for certain types of satellites.
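A visual impression of skew can be checked numerically. The following is a minimal sketch of a moment-based skewness measure in base R; the helper `sample_skewness` and the vector `toy_life` are hypothetical and the values are invented, so the output says nothing about the actual satellite data.

```r
# Moment-based sample skewness: the third central moment divided by
# the second central moment raised to the power 1.5.
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  deviations <- x - mean(x)
  m2 <- mean(deviations^2)
  m3 <- mean(deviations^3)
  m3 / m2^1.5
}

# Hypothetical life expectancies in years, invented for illustration.
toy_life <- c(1, 1, 2, 2, 3, 5, 5, 7, 15)
toy_skew <- sample_skewness(toy_life)
print(toy_skew > 0)   # a positive value indicates a longer right tail
```

For the real data, the same function could be called as sample_skewness(clean_sats$life[clean_sats$orbit_class == "LEO"]). The result may differ slightly from the skew column reported by describe(), which can apply a sample-size adjustment.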

Next, violin plots were created to compare the distribution of LEOs to the other orbit classes.

clean_sats %>%
  na.omit() %>%
  ggplot(aes(x = eccentricity, y = life, color = orbit_class)) + # Plots created for eccentricity vs. life expectancy, by orbit class.
  geom_violin() +
  facet_wrap(~orbit_class) +
  labs(title = "Violin plots of satellite eccentricity vs. life expectancy",
       caption = "Data from https://www.ucsusa.org/resources/satellite-database", 
       color = "Orbit Class",    # The legend title must match the mapped aesthetic (color, not fill).
       y = "Life expectancy",
       x = "Eccentricity")

The violin plots show that LEOs do indeed have smaller eccentricity, as shown by the bottom-left plot in the figure above. Their life expectancy is quite variable, but most of the data points seem to fall around the 5-year point, as indicated by the wide area near the bottom of the violin. In contrast, the typical value is higher for MEO and GEO, at around 10 years and 15 years, respectively. The catch-all category of “Ellip,” shown on the top-left, contains any satellite that has an elliptical orbit. This distribution seems to vary greatly, but most of its data also fall around a 5-year lifespan.

Part 3: How do other variables relate to the life expectancy of a LEO satellite?

Scatterplots were created plotting life expectancy against three different variables: “apogee,” “launch mass,” and “period.” A smoothed line was also drawn over each plot. Again, apogee is the point farthest away from Earth within the satellite’s orbit, launch mass is the mass of the satellite in kilograms, and period is the time it takes the satellite to complete one orbit around the Earth, measured in minutes.
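Apogee and period are not independent of each other: by Kepler's third law, the period follows from the size of the orbit, T = 2π √(a³ / μ), where a is the semi-major axis. The sketch below assumes that “perigee” and “apogee” are altitudes in kilometers above the surface and uses standard values for the Earth's mean radius and gravitational parameter; the function name `orbit_period_min` is introduced here for illustration only.

```r
# Assumed constants (not part of the data set).
earth_radius_km <- 6371        # mean Earth radius, km
mu_earth <- 398600.4418        # Earth's gravitational parameter, km^3 / s^2

# Orbital period in minutes from perigee and apogee altitudes in km.
orbit_period_min <- function(perigee_km, apogee_km) {
  # Semi-major axis: mean of the two altitudes plus the Earth's radius.
  semi_major_km <- (perigee_km + apogee_km) / 2 + earth_radius_km
  period_s <- 2 * pi * sqrt(semi_major_km^3 / mu_earth)
  period_s / 60
}

# Perigee and apogee of the first two rows of the cleaned data shown earlier.
period_check <- orbit_period_min(c(460, 952), c(33200, 1155))
print(round(period_check))
```

Under these assumptions the results come out near the periods listed for those rows (580 and about 106 minutes). For nearly circular LEO orbits this means the apogee and period plots are likely to carry overlapping information.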

p1 <- clean_sats %>%    # Scatterplot for apogee and life expectancy.
  na.omit() %>%
  filter(orbit_class == "LEO") %>%
  ggplot(aes(x = apogee, y = life, color = type_gov)) +
  geom_point() + 
  geom_smooth() + 
  labs(title = "Apogee vs life",
       color = "Government satellite?",    # Legend title for the color aesthetic
       y = "Life expectancy",
       x = "Apogee")
p2 <- clean_sats %>%     # Scatterplot for launch mass and life expectancy.
  na.omit() %>%
  filter(orbit_class == "LEO") %>%
  ggplot(aes(x = launch_mass, y = life, color = type_gov)) +
  geom_point() + 
  geom_smooth() +
  labs(title = "Launch mass vs life",
       color = "Government satellite?",
       y = "Life expectancy",
       x = "Launch mass")
p3 <- clean_sats %>%      # Scatterplot for period and life expectancy.
  na.omit() %>%
  filter(orbit_class == "LEO") %>%
  ggplot(aes(x = period, y = life, color = type_gov)) +
  geom_point() + 
  geom_smooth() +
  labs(title = "Period vs life",
       color = "Government satellite?",
       y = "Life expectancy",
       x = "Period")

A composite plot was created combining the three scatterplots, as shown below.

all_plot <- plot_grid(p1, p2, p3, labels = c('A', 'B', 'C'), label_size = 10) # Combine the three plots into one figure
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
all_plot # show the plot

The last figure shows the three scatterplots created. For each plot, the data points were colored based on whether or not the satellite was categorized as a “government”-operated one, using the variable “type_gov.” Though there are no clear trend lines for any of the variables, the smoothed lines seem to “spike” at one point. These spikes could potentially indicate “sweet spots” where life expectancy is highest for each of the variables. Additionally, it’s interesting to note that the smoothed line rises and then falls at high values for non-government satellites, but it tends to rise for launch mass and period for government satellites. This could indicate that government satellites may have more life expectancy built into them, since they are intended to operate for a long period of time.

CONCLUSION

LEOs, or Low Earth Orbit satellites, have an eccentricity that is small or close to zero, meaning that their orbits are almost perfect circles. They also seem to have the shortest life expectancy compared to the other satellite orbit classes. The life expectancy of LEOs does not seem to be normally distributed, and the violin plots show that it is generally lower than in the other classes. Finally, there do not seem to be direct relationships between “apogee,” “launch mass,” or “period” and life expectancy. However, government satellites may have slightly better life expectancy than non-government satellites.

Thank you for reading!
