DS Lab Assignment

Author

D Devkota

Published

May 5, 2026

Introduction

For this DS Lab assignment, I am using the stars dataset from the dslabs package. The dataset contains information about stars, including their temperature, magnitude, and type.

Installing package

if (!requireNamespace("dslabs")) {
  install.packages("dslabs", repos = "https://cran.r-project.org")
}
Loading required namespace: dslabs

Loading Libraries

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.0     ✔ readr     2.1.6
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dslabs)
# ggplot2 and dplyr are already attached by library(tidyverse) above

Loading the stars dataset

head(stars)
            star magnitude temp type
1            Sun       4.8 5840    G
2        SiriusA       1.4 9620    A
3        Canopus      -3.1 7400    F
4       Arcturus      -0.4 4590    K
5 AlphaCentauriA       4.3 5840    G
6           Vega       0.5 9900    A

Cleaning data

stars_clean <- stars |>
  filter(!is.na(temp), !is.na(magnitude), !is.na(type))
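
To see whether the filter above actually drops anything, it can help to count missing values per column and compare row counts before and after. The sketch below is only an illustration: it uses a small stand-in data frame (`stars_demo`, a hypothetical name) built from three rows of the head(stars) output plus one made-up row with a missing temperature, and it uses base R's `complete.cases()` in place of `filter()`. With dslabs loaded, the same calls could be pointed at `stars` instead.

```r
# Stand-in data: three rows copied from head(stars) above, plus one
# made-up row ("DemoStar", hypothetical) with a missing temperature.
stars_demo <- data.frame(
  star      = c("Sun", "SiriusA", "Canopus", "DemoStar"),
  magnitude = c(4.8, 1.4, -3.1, 2.0),
  temp      = c(5840, 9620, 7400, NA),
  type      = c("G", "A", "F", "K")
)

# Columns the plot depends on
plot_cols <- c("temp", "magnitude", "type")

# Count missing values in each of those columns
na_counts <- colSums(is.na(stars_demo[, plot_cols]))
print(na_counts)

# Base R equivalent of the filter() call: keep rows with no NA in plot_cols
demo_clean <- stars_demo[complete.cases(stars_demo[, plot_cols]), ]

# Number of rows removed by the cleaning step
rows_dropped <- nrow(stars_demo) - nrow(demo_clean)
print(rows_dropped)
```

If the number of rows dropped is zero on the real data, the cleaning step is a harmless safeguard and not a change to the dataset.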

Star Temperature VS Magnitude Scatterplot

# Building scatterplot using ggplot
ggplot(stars_clean, aes(x = log10(temp), y = magnitude, color = type)) +
  # Adding one dot per star
  geom_point(size = 2.5, alpha = 0.8) +
  # Reverse x axis so hotter stars appear on left
  scale_x_reverse() +
  # Reverse y axis so brighter stars appear on top
  scale_y_reverse() +
  # Manually assigning colors; name sets the legend title
  scale_color_manual(
    name = "Type",
    values = c(
      "O"  = "maroon",
      "B"  = "lightblue",
      "A"  = "white",
      "F"  = "gold",
      "G"  = "blue",
      "K"  = "limegreen",
      "M"  = "cyan",
      "DA" = "purple",
      "DB" = "black",
      "DF" = "brown"
    )
  ) +
  # Adding title and axis labels (x is plotted on a log10 scale)
  labs(
    title = "Star Temperature vs. Magnitude",
    x = "log10(Temperature)",
    y = "Magnitude"
  ) +
  # Applying theme
  theme_dark()

I created a multivariable scatterplot to examine the relationship between the temperature and magnitude of stars. Each point represents one star, and color represents the star's type. Both axes are reversed so that hotter stars appear on the left and brighter stars (lower magnitude) appear at the top. I used a non-default theme (theme_dark) to improve the appearance, and I assigned the colors manually to make the star types easier to tell apart.
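
The plot maps log10(temp) to the x-axis, not the raw temperature. As a rough illustration of what that transform does, the sketch below applies it to the temperatures from the head(stars) output above. The vector name `temp_demo` is hypothetical, and only the values already shown in this document are used.

```r
# Temperatures copied from the head(stars) output above
temp_demo <- c(Sun = 5840, SiriusA = 9620, Canopus = 7400,
               Arcturus = 4590, Vega = 9900)

# log10 compresses the range: a factor-of-10 change in temperature
# becomes a change of 1 on the plotted axis
log_temp <- log10(temp_demo)
print(round(log_temp, 2))

# Spread of the raw values versus the transformed values
raw_range <- diff(range(temp_demo))
log_range <- diff(range(log_temp))
print(c(raw = raw_range, log10 = round(log_range, 2)))
```

Because the plotted values are logarithms, an axis label such as "log10(Temperature)" describes them more accurately than "Temperature" alone.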