Numeric classes and storage modes

# I focused on numeric classes and storage modes in R, specifically the distinction between integers and doubles. These foundational concepts matter for memory usage and computational efficiency in data analysis.
# 
# I began by examining how R handles numeric values. Using typeof(), I distinguished doubles, which are R’s default numeric type, from integers, which are written with an L suffix. Both types pass the is.numeric() check, yet they differ in storage and range. A double is a 64-bit floating-point value that occupies 8 bytes per element, while an integer is a 32-bit value that occupies 4 bytes and can only hold whole numbers up to .Machine$integer.max (2147483647), so the appropriate type depends on the task.
# 
# I also experimented with coercion, converting logical values to numeric with as.numeric(). FALSE is converted to 0 and TRUE to 1, but the result is a double rather than an integer; as.integer() is needed to get integer storage. This distinction matters for memory use, since integers take half the space of doubles in large datasets where only whole numbers are needed.
# 
# To gauge the impact of numeric types on performance, I ran a benchmark with the microbenchmark package, comparing arithmetic with integer and double constants inside a loop of 200,000 iterations. The two versions ran in nearly the same time (medians of about 54.9 ms and 55.7 ms over 100 runs), so for scalar arithmetic in a loop the choice of type made little measurable difference. The benefit of integers shows up mainly in memory, which can become significant when handling large datasets and underscores the value of choosing data types deliberately in R.

# Numeric class exploration
a <- 25.7
b <- 25L

# Confirm types
typeof(a)  # Double

## [1] "double"

typeof(b)  # Integer

## [1] "integer"
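
# The L suffix is what makes a literal an integer; a whole number written without it is still a double. The sketch below uses illustrative variable names (x, y, z) that are not part of the original walkthrough.

```r
# Whole-number literals are doubles unless they carry the L suffix.
x <- 25      # double, even though it has no fractional part
y <- 25L     # integer
z <- 1:3     # the colon operator yields an integer vector

typeof(x)
typeof(y)
typeof(z)

# == compares by value, while identical() also compares the storage type.
x == y
identical(x, y)
```

# Here x == y is TRUE because the values match, while identical(x, y) is FALSE because one is stored as a double and the other as an integer.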

# Confirm both are numeric
is.numeric(a)

## [1] TRUE

is.numeric(b)

## [1] TRUE
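
# Class, mode and storage mode are related but not the same thing. As a small sketch, the helper describe() below (a hypothetical name introduced only for this illustration) collects what base R reports for the same two values.

```r
a <- 25.7
b <- 25L

# Hypothetical helper: gather the four base R descriptors for one value.
describe <- function(v) {
  c(class = class(v),
    typeof = typeof(v),
    mode = mode(v),
    storage.mode = storage.mode(v))
}

describe(a)
describe(b)
```

# Both values share the mode "numeric", which is why is.numeric() returns TRUE for each, while typeof() and storage.mode() separate "double" from "integer".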

# Logical to numeric conversion
as.numeric(FALSE)

## [1] 0

# Checking integer status of coerced logical
is.integer(as.numeric(FALSE))

## [1] FALSE
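
# When integer storage is wanted from a logical, as.integer() is the direct route. The following sketch contrasts the two conversions; the vector name flags is an assumption made for this example.

```r
flags <- c(TRUE, FALSE, TRUE)

as_dbl <- as.numeric(flags)   # double vector
as_int <- as.integer(flags)   # integer vector

typeof(as_dbl)
typeof(as_int)

# Arithmetic on logicals coerces them to integer, not double.
typeof(TRUE + TRUE)
typeof(sum(flags))
```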

# Testing whether a value is stored as a double
is.double(5.5)

## [1] TRUE

is.double(7L)

## [1] FALSE
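
# The 4-byte versus 8-byte difference can be checked with object.size(). This is a minimal sketch; the vector length n and the names ints and dbls are chosen only for illustration.

```r
n <- 1e6
ints <- integer(n)   # one million zeros stored as integers
dbls <- numeric(n)   # one million zeros stored as doubles

object.size(ints)
object.size(dbls)

# The ratio approaches 2 once the fixed per-vector overhead is negligible.
size_ratio <- as.numeric(object.size(dbls)) / as.numeric(object.size(ints))
size_ratio
```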

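# Benchmarking arithmetic operations
# Note: j is an integer in both loops because 1:200000 is an integer sequence,
# so the second loop measures mixed double/integer arithmetic.
# install.packages("microbenchmark")  # uncomment if the package is not installed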
library(microbenchmark)

microbenchmark(
  for(j in 1:200000){
    3L * j
    15L + j
  },
  for(j in 1:200000){
    3.0 * j
    15.0 + j
  }
)

## Unit: milliseconds
##                                           expr     min      lq     mean
##  for (j in 1:2e+05) {     3L * j     15L + j } 18.9785 27.6969 76.63727
##    for (j in 1:2e+05) {     3 * j     15 + j } 18.4144 30.8769 82.19029
##    median      uq      max neval
##  54.88110 84.1297 554.1568   100
##  55.71495 99.9772 487.6908   100
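
# In both loops the index j comes from 1:200000 and is therefore an integer, which means the second loop multiplies a double constant by an integer. The sketch below shows how R resolves the result type in each case, and what happens when integer arithmetic exceeds its range. The names j, big and overflowed are assumptions for this illustration.

```r
j <- 7L                    # same type as the loop index in 1:200000

typeof(3L * j)             # integer * integer stays integer
typeof(3.0 * j)            # double * integer is promoted to double
typeof(15 + as.numeric(j)) # explicit conversion gives double-only arithmetic

# Integer arithmetic past .Machine$integer.max yields NA with a warning;
# converting to double first avoids the overflow.
big <- .Machine$integer.max
overflowed <- suppressWarnings(big + 1L)
is.na(overflowed)
as.numeric(big) + 1
```

# Because mixed operations are promoted to double, a benchmark meant to isolate double arithmetic would need a double loop index or an explicit as.numeric() conversion.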

library(knitr)
library(kableExtra)

# Summary of each step and its code snippet
code_data <- data.frame(
  Step = c(
    "Define numeric values", 
    "Confirm types", 
    "Check if numeric", 
    "Logical to numeric", 
    "Check integer status of coerced logical", 
    "Test double precision", 
    "Benchmarking arithmetic"
  ),
  Code_Snippet = c(
    "a <- 25.7\nb <- 25L",
    "typeof(a)\ntypeof(b)",
    "is.numeric(a)\nis.numeric(b)",
    "as.numeric(FALSE)",
    "is.integer(as.numeric(FALSE))",
    "is.double(5.5)\nis.double(7L)",
    "library(microbenchmark)\nmicrobenchmark(\n  for(j in 1:200000){\n    3L * j\n    15L + j\n  },\n  for(j in 1:200000){\n    3.0 * j\n    15.0 + j\n  }\n)"
  )
)

# Render the summary as an HTML table with a different background color per row
code_data %>%
  kable("html", escape = FALSE, col.names = c("Step", "Code Snippet")) %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed")) %>%
  row_spec(1, background = "#D6EAF8") %>%  # Light blue for first row
  row_spec(2, background = "#F9E79F") %>%  # Light yellow for second row
  row_spec(3, background = "#D5F5E3") %>%  # Light green for third row
  row_spec(4, background = "#FADBD8") %>%  # Light red for fourth row
  row_spec(5, background = "#EBDEF0") %>%  # Light purple for fifth row
  row_spec(6, background = "#FCF3CF") %>%  # Light gold for sixth row
  row_spec(7, background = "#AED6F1") %>%  # Light sky blue for seventh row
  column_spec(1, background = "#F5B7B1")   # Light coral for the first column

Step	Code Snippet
Define numeric values	a <- 25.7 b <- 25L
Confirm types	typeof(a) typeof(b)
Check if numeric	is.numeric(a) is.numeric(b)
Logical to numeric	as.numeric(FALSE)
Check integer status of coerced logical	is.integer(as.numeric(FALSE))
Test double precision	is.double(5.5) is.double(7L)
Benchmarking arithmetic	library(microbenchmark) microbenchmark( for(j in 1:200000){ 3L * j 15L + j }, for(j in 1:200000){ 3.0 * j 15.0 + j } )

Avery Holloman

2024-11-03