Load Required Packages

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.5     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   2.0.1     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(ggthemes)
library(ggrepel)
library(dplyr)
library(psych)
## 
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha

DS Labs Datasets

Use the dslabs (Data Science Labs) package.

It includes a number of datasets for practicing data visualization.

# install.packages("dslabs")  # run once if dslabs (Data Science Labs) is not installed
library("dslabs")
data(package="dslabs")
list.files(system.file("script", package = "dslabs"))
##  [1] "make-admissions.R"                   
##  [2] "make-brca.R"                         
##  [3] "make-brexit_polls.R"                 
##  [4] "make-death_prob.R"                   
##  [5] "make-divorce_margarine.R"            
##  [6] "make-gapminder-rdas.R"               
##  [7] "make-greenhouse_gases.R"             
##  [8] "make-historic_co2.R"                 
##  [9] "make-mnist_27.R"                     
## [10] "make-movielens.R"                    
## [11] "make-murders-rda.R"                  
## [12] "make-na_example-rda.R"               
## [13] "make-nyc_regents_scores.R"           
## [14] "make-olive.R"                        
## [15] "make-outlier_example.R"              
## [16] "make-polls_2008.R"                   
## [17] "make-polls_us_election_2016.R"       
## [18] "make-reported_heights-rda.R"         
## [19] "make-research_funding_rates.R"       
## [20] "make-stars.R"                        
## [21] "make-temp_carbon.R"                  
## [22] "make-tissue-gene-expression.R"       
## [23] "make-trump_tweets.R"                 
## [24] "make-weekly_us_contagious_diseases.R"
## [25] "save-gapminder-example-csv.R"
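The listing above shows the scripts that build the datasets rather than the datasets themselves. As a minimal sketch, assuming dslabs is installed, the dataset names can be pulled from the object that `data(package = ...)` returns; `dataset_info` and `dataset_names` are illustrative names, not part of the original notebook.

```r
# data(package = ...) returns an object whose $results matrix
# has one row per dataset and an "Item" column with the names.
dataset_info <- data(package = "dslabs")$results

# Keep only the dataset names as a character vector.
dataset_names <- unname(dataset_info[, "Item"])

# Show the first few names.
head(dataset_names)
```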

View Dataset

data("stars")
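`data("stars")` only loads the dataset into the session. A short sketch of how it could be inspected before summarising it, using base R only:

```r
library(dslabs)
data("stars", package = "dslabs")

# Column names, column types, and the first few values of each column.
str(stars)

# First six rows of the dataset.
head(stars)
```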

Data Description

stars %>% describe()
##           vars  n    mean      sd median trimmed     mad  min   max range  skew
## star*        1 96   47.88   27.45   47.5   47.85   34.84    1    95    94  0.01
## magnitude    2 96    4.26    7.35    2.4    4.18   10.16   -8    17    25  0.12
## temp         3 96 8752.29 7727.86 5050.0 7337.31 3528.59 2500 33600 31100  1.45
## type*        4 96    5.85    3.28    8.0    6.04    1.48    1    10     9 -0.41
##           kurtosis     se
## star*        -1.23   2.80
## magnitude    -1.41   0.75
## temp          0.99 788.72
## type*        -1.62   0.33
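In the output above, `describe()` marks the non-numeric variables (`star` and `type`) with an asterisk and summarises their integer codes, so the mean and sd on those rows are not meaningful. One possible alternative, sketched here with illustrative object names, is to describe only the numeric columns and count the categories separately.

```r
library(dplyr)
library(psych)
library(dslabs)
data("stars", package = "dslabs")

# Descriptive statistics for the numeric columns only.
numeric_summary <- stars %>%
  select(magnitude, temp) %>%
  describe()
numeric_summary

# Number of stars in each spectral class, most common first.
type_counts <- stars %>%
  count(type, sort = TRUE)
type_counts
```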

Stars Dataset (Physical Properties of Stars)

The stars dataset provides data on the physical properties of selected stars, including luminosity, temperature, and spectral class.

Source: Compiled from multiple open-access references on VizieR. http://vizier.u-strasbg.fr/viz-bin/VizieR

Dataset Variable Description:

star. Name of star.

magnitude. Absolute magnitude of the star, a measure of its luminosity that does not depend on its distance from the observer; lower values are brighter.

temp. Surface temperature in kelvins (K).

type. Spectral class of star in the OBAFGKM system.

Prepare and Filter Data

The four dominant stars are Aldebaran, Regulus, Antares, and Fomalhaut. First, filter for these four stars to determine their spectral classes (types). Then filter the full dataset for those spectral classes to create a new dataset.

big4 <- stars %>%
  filter(star %in% c("Aldebaran", "Regulus", "Antares", "Fomalhaut")) %>%
  arrange(magnitude)
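The filter above builds `big4` but does not display it. A small sketch of how the spectral classes could be read off, with a check that every requested name was actually matched (a misspelled name is dropped silently by the filter); `wanted` and `missing_names` are illustrative names.

```r
library(dplyr)
library(dslabs)
data("stars", package = "dslabs")

wanted <- c("Aldebaran", "Regulus", "Antares", "Fomalhaut")

big4 <- stars %>%
  filter(star %in% wanted) %>%
  arrange(magnitude)

# Names that were requested but not found in the dataset.
# An empty result means all four were matched.
missing_names <- setdiff(wanted, as.character(big4$star))
missing_names

# Spectral class (type) of each matched star.
big4 %>% select(star, type)
```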

Filter for Big 4 Types (A, B, K, and M)

big4type <- stars %>%
  filter(type %in% c("A", "B", "K", "M")) %>%
  arrange(magnitude)
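The type letters in this filter are typed in by hand. As a sketch of an alternative, they could be derived from the four stars directly, so the second filter stays in sync with the first; `big4_types` and `big4type_derived` are illustrative names, not objects from the original analysis.

```r
library(dplyr)
library(dslabs)
data("stars", package = "dslabs")

# Spectral classes of the four dominant stars, as plain strings.
big4_types <- stars %>%
  filter(star %in% c("Aldebaran", "Regulus", "Antares", "Fomalhaut")) %>%
  pull(type) %>%
  as.character() %>%
  unique()
big4_types

# All stars that share a spectral class with one of the four.
big4type_derived <- stars %>%
  filter(type %in% big4_types) %>%
  arrange(magnitude)
```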

DS Labs: Spectral Class (Type) and Surface Temperature

big4type %>%
  ggplot(aes(log10(temp), magnitude, col = type)) +
  geom_point() +
  geom_text(aes(label = star)) +
  scale_x_reverse("Surface Temperature (Kelvin) - log10") +
  scale_y_reverse("Magnitude") +
  ggtitle("Spectral Class (Type) and Surface Temperature") +
  theme(legend.title = element_blank())
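With every star labelled, the names can overlap where points cluster. The ggrepel package is loaded at the top of this notebook but never used, so the following is a sketch of one way it might be applied. Labelling only the four dominant stars is a design choice made for this example, not part of the original chart, and `big4_names` and `p_repel` are illustrative names.

```r
library(dplyr)
library(ggplot2)
library(ggrepel)
library(dslabs)
data("stars", package = "dslabs")

big4_names <- c("Aldebaran", "Regulus", "Antares", "Fomalhaut")

big4type <- stars %>%
  filter(type %in% c("A", "B", "K", "M"))

p_repel <- ggplot(big4type, aes(log10(temp), magnitude, col = type)) +
  geom_point() +
  # Label only the four dominant stars; geom_text_repel() moves
  # the labels so they do not sit on top of each other or the points.
  geom_text_repel(
    data = filter(big4type, star %in% big4_names),
    aes(label = star),
    show.legend = FALSE
  ) +
  scale_x_reverse("Surface Temperature (Kelvin) - log10") +
  scale_y_reverse("Magnitude") +
  ggtitle("Spectral Class (Type) and Surface Temperature") +
  theme(legend.title = element_blank())

p_repel
```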

DS Labs Stars Dataset Review: Spectral Class (Type) and Temperature

I used the “stars” dataset from DS Labs, which contains the physical properties of selected stars, including luminosity, temperature, and spectral class. I wanted to look at the four stars known as the dominant four (Aldebaran, Regulus, Antares, and Fomalhaut) and determine their spectral class (type), so I created the “big4” dataset. I then created the “big4type” dataset by filtering the full dataset on the spectral classes (types) of those four stars: “A”, “B”, “K”, and “M”.

The DS Labs chart plots magnitude against surface temperature (Kelvin, log10 scale) for these spectral classes (types). Spectral class is related to the amount of light a star produces at various wavelengths. The classes are arranged by temperature and lettered from hot to cool in the order O, B, A, F, G, K, and M. Type O stars have the highest surface temperatures, which can reach about 30,000 K, and type M stars can be as cool as about 3,000 K.
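The hot-to-cool ordering can be checked against the data itself. The sketch below summarises temperature per spectral class, listed in O, B, A, F, G, K, M order, so the reader can see whether the medians fall down the table. It assumes the `type` column uses these single-letter codes for main-sequence classes, and `obafgkm` and `temp_by_type` are illustrative names.

```r
library(dplyr)
library(dslabs)
data("stars", package = "dslabs")

obafgkm <- c("O", "B", "A", "F", "G", "K", "M")

temp_by_type <- stars %>%
  # Keep only the seven classes named in the text.
  filter(type %in% obafgkm) %>%
  # Re-level type so the summary rows follow the O-to-M sequence.
  mutate(type = factor(as.character(type), levels = obafgkm)) %>%
  group_by(type) %>%
  summarise(
    n           = n(),
    min_temp    = min(temp),
    median_temp = median(temp),
    max_temp    = max(temp)
  )

temp_by_type
```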

The points are colored by star type. The visualization reflects the spectral class sequence described above: type “B” stars have the highest surface temperatures and type “M” stars the lowest, with types “A” and “K” in between and “A” hotter than “K”. Type “M” also contains about three stars that stand apart from the rest of their class.
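The type “M” outliers were identified by eye from the chart. As a rough sketch of how candidates might be flagged in code, the example below applies the common 1.5 × IQR rule to magnitude within type “M”. Both the choice of magnitude as the variable and the IQR rule are assumptions made for this example, not the method used in the review, and the rule may flag a different number of stars than the chart suggests.

```r
library(dplyr)
library(dslabs)
data("stars", package = "dslabs")

m_stars <- stars %>%
  filter(type == "M") %>%
  select(star, magnitude, temp) %>%
  arrange(magnitude)

# Quartiles of magnitude within type M, and the 1.5 * IQR fence width.
q <- unname(quantile(m_stars$magnitude, c(0.25, 0.75)))
fence <- 1.5 * (q[2] - q[1])

# Flag stars whose magnitude falls outside the fences.
m_flagged <- m_stars %>%
  mutate(outlier = magnitude < q[1] - fence | magnitude > q[2] + fence)

m_flagged
```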