Debugging Principles & Tools

Debugging

Debugging in programming refers to the process of identifying and fixing errors, or bugs, in the code.

Debugging is a crucial skill for developers as it ensures that programs run correctly and produce the expected results.

Why we Debug

Why do we spend time debugging?

Programming requires a large amount of precision
It’s not realistic to write a completely correct program on the first try
Instead, we write a program that works, simulate different user scenarios, and fix bugs along the way

Debugging R Functions

In this lesson, we find and correct bugs in functions that we wrote or that someone else wrote.

Principles of Debugging

The Essence of Debugging: The Principle of Confirmation

This principle emphasizes the importance of confirming the existence of a bug before attempting to fix it. Many times, what appears to be a bug may actually be a misunderstanding of how the code is supposed to work.

x <- 5
y <- 10
z <- x+y
print(z)

## [1] 15

In this example, before assuming there’s a bug, we confirm whether the result of z is what we expect by printing it.

Repeat It (Consistency to See the Bug)

This principle suggests repeating the process that leads to the bug consistently so that you can observe the bug each time. Using a small and consistent dataset or setting a random seed can help achieve this.

# Example 
set.seed(123)  # Set a random seed for reproducibility
data <- rnorm(10)  # Generate random data
print(mean(data))  # Check the mean of the data

## [1] 0.07462564

Start Small

When debugging, it’s often helpful to start with a small and simple version of the code to isolate the issue. Once the problem is identified, gradually expand to more complex scenarios.
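As a minimal sketch of starting small, the function below is checked on a tiny matrix whose answer is obvious by inspection before it is trusted on larger data. The function `which_max_row_mean()` and the matrix `small` are hypothetical names introduced only for this illustration.

```r
# Hypothetical function under test: index of the row with the largest mean
which_max_row_mean <- function(m) {
  best_mean <- -Inf
  best_row <- 0
  for (i in seq_len(nrow(m))) {
    row_mean <- mean(m[i, ])
    if (row_mean > best_mean) {
      best_mean <- row_mean
      best_row <- i
    }
  }
  best_row
}

# Tiny input (filled column by column): rows are (1, 1), (9, 9), (2, 2),
# so row 2 clearly has the largest mean
small <- matrix(c(1, 9, 2,
                  1, 9, 2), nrow = 3)
small_result <- which_max_row_mean(small)
print(small_result)
```

If the result on the small input is wrong, the bug can be traced by hand in a few iterations instead of thousands.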

Debug in a modular, top-down manner
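One way to read this principle, sketched under the assumption that the code is split into small helper functions: check the top-level function first, and step down into a helper only when the top-level result is wrong. The names `drop_missing()`, `center()`, and `prepare()` are hypothetical.

```r
# Hypothetical helpers, each small enough to check on its own
drop_missing <- function(x) x[!is.na(x)]
center <- function(x) x - mean(x)

# The top-level function reads as a sequence of steps
prepare <- function(x) {
  cleaned <- drop_missing(x)
  center(cleaned)
}

input <- c(1, NA, 3)

# Top level first: is the overall result what we expect?
prepared <- prepare(input)
print(prepared)

# Only if it is not, check each module separately
step1 <- drop_missing(input)   # should be 1 3
step2 <- center(c(1, 3))       # should be -1 1
```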

Antibugging

Use an if statement to check that an input or intermediate value is what you expect, and stop with an error if it is not

Or use stopifnot()
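Both antibugging styles can be sketched in one function. The function `safe_log_ratio()` and its arguments are hypothetical; the point is only to show an explicit `if` check with `stop()` next to the more compact `stopifnot()`.

```r
# Hypothetical function used only for illustration
safe_log_ratio <- function(a, b) {
  # Explicit check with a custom error message
  if (!is.numeric(a) || !is.numeric(b)) {
    stop("a and b must be numeric")
  }
  # Compact checks: every condition must be TRUE, otherwise an error is raised
  stopifnot(length(a) == length(b), all(a > 0), all(b > 0))
  log(a / b)
}

# Valid input passes the checks
ok <- safe_log_ratio(c(1, 4), c(1, 2))
print(ok)

# Invalid input fails fast; tryCatch() captures the message so the script continues
bad <- tryCatch(safe_log_ratio(-1, 2), error = function(e) conditionMessage(e))
print(bad)
```

Failing early with a clear message keeps a bad value from travelling further into the program, where its effect would be harder to trace.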

Don’t forget: Google it first

Debugging Tools in R and RStudio

Old school way

Use print()
Copy / paste a function into a new R script file

This refers to traditional methods of debugging, such as inserting print() statements or manually inspecting variables to understand the flow of the program and identify potential issues.

Printing the values of variables or expressions at various points in the code helps track their values and behavior, aiding in identifying discrepancies or unexpected outcomes.

Sometimes, copying the code into a new R script file can help identify issues that may be caused by external factors such as hidden characters, encoding problems, or interactions with other parts of the code.
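A minimal sketch of print-style debugging, using a hypothetical function `running_total()`: a temporary `cat()` call inside the loop shows how the intermediate values change at each iteration.

```r
# Hypothetical function used only for illustration
running_total <- function(x) {
  total <- 0
  for (i in seq_along(x)) {
    total <- total + x[i]
    # Temporary diagnostic output; remove it once the bug is found
    cat("i =", i, " x[i] =", x[i], " total =", total, "\n")
  }
  total
}

total <- running_total(c(2, 4, 6))
```

Comparing the printed values against what you expect at each step usually narrows the problem to a single line.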

Debugging Demo

Demo example:

Two groups of patients: drug VS control
Each group contains 5 subjects
Gene expression levels for 10,000 genes were measured for each subject

Goal: Perform a t test on each gene and report the gene that has the smallest p value

#Simulate
n_tests <- 10000
rep <- 5
n <- n_tests * rep

#set.seed(1)
ctrl <- matrix(rnorm(n, mean = 10), nrow = n_tests)
rownames(ctrl) <- paste("gene_", 1:n_tests, sep ="")
colnames(ctrl) <- paste("subject_", 1:rep, sep = "")

#set.seed(2)
trt <- matrix(rnorm(n, mean = 13), nrow = n_tests)
rownames(trt) <- paste("gene_", 1:n_tests, sep ="")
colnames(trt) <- paste("subject_", 1:rep, sep = "")

trt[n_tests]

## [1] 13.4528

#Initialize the running minimum p value at 1 and the gene index at 0
min.pval <- 1
gene.ind <- 0

for (i in 1:n_tests){
  
  #Perform a t test for two groups of data
  my.test <- (t.test(ctrl[i,], trt[i,]))
  
  #extract p value
  my.pval <- my.test$p.value
  if(my.pval < min.pval){
    min.pval <- my.pval
    gene.ind <- i
  }
}

print(min.pval)

## [1] 5.059427e-09

print(gene.ind)

## [1] 3580
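One way to confirm the loop's answer is to compute the same quantity a second way on a small, seeded dataset and compare. The sketch below is not the demo's method: it swaps the explicit loop for `vapply()` and `which.min()`, and the names `n_genes`, `n_subj`, `ctrl_small`, and `trt_small` are assumptions introduced for this check.

```r
set.seed(1)      # fixed seed so the check is repeatable
n_genes <- 20    # hypothetical small size, easy to inspect by eye
n_subj <- 5
ctrl_small <- matrix(rnorm(n_genes * n_subj, mean = 10), nrow = n_genes)
trt_small  <- matrix(rnorm(n_genes * n_subj, mean = 13), nrow = n_genes)

# One p value per gene (row)
pvals <- vapply(
  seq_len(n_genes),
  function(i) t.test(ctrl_small[i, ], trt_small[i, ])$p.value,
  numeric(1)
)

best_gene <- which.min(pvals)   # index of the smallest p value
best_pval <- pvals[best_gene]
print(best_gene)
print(best_pval)
```

If the loop version, run on the same small matrices, reports a different gene or p value, one of the two implementations has a bug.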

Demo Code

n_tests <- 10
rep <- 3
n <- n_tests * rep

set.seed(1)
ctrl <- matrix(rnorm(n), nrow = n_tests)

set.seed(2)
trt <- matrix(rnorm(n, mean = 10), nrow = n_tests)

trt[n_tests,]

## [1]  9.861213 10.432265 10.289637

debug() flags a function so that its next call opens the browser, which lets you step through the code line by line

You have to call the function after running debug() for the browser to open
You can exit the browser with a capital Q

undebug() removes the debug flag, so the function no longer enters debug mode when called

browser() lets you start the debugging process from a specified point

debugonce() enters debug mode only for the next call to the function. You only have to run it once; no undebug() is needed

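trace() inserts debugging code, such as a call to browser(), at a chosen place in a function without editing its source

As with debug(), the function must be called again after trace() for the tracing to take effect; it enters browser mode only if browser is supplied as the tracer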

untrace() removes the tracing added by trace()

traceback(), run immediately after an error, prints the sequence of function calls that led to the error, which shows where the error occurred
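traceback() is meant for interactive use after an uncaught error. As a rough, non-interactive sketch of the same idea, the code below records the call stack at the moment an error is signalled, using withCallingHandlers() and sys.calls() rather than traceback() itself. The functions `inner()`, `middle()`, and `outer_fn()` are hypothetical.

```r
# Hypothetical nested functions used only for illustration
inner <- function(x) {
  if (!is.numeric(x)) stop("x must be numeric")
  sqrt(x)
}
middle <- function(x) inner(x)
outer_fn <- function(x) middle(x)

captured_calls <- NULL
result <- tryCatch(
  withCallingHandlers(
    outer_fn("a"),
    error = function(e) {
      # Record the call stack while the failing call is still active
      captured_calls <<- sys.calls()
    }
  ),
  # After recording, turn the error into its message so the script continues
  error = function(e) conditionMessage(e)
)

# One line of text per call on the stack, outermost call first
stack_text <- vapply(
  as.list(captured_calls),
  function(cl) paste(deparse(cl), collapse = " "),
  character(1)
)
print(stack_text)
```

Reading the stack from `outer_fn("a")` down to `inner(x)` shows the path to the failing call, which is the information traceback() reports interactively.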

my_function <- function(x,y){
  z <- x * y + 5
  
  browser()
  
  w <- z / 0
  
  v <- log(w)
  
  return(v)
}

my_function(2,3)

## Called from: my_function(2, 3)
## debug: w <- z/0
## debug: v <- log(w)
## debug: return(v)

## [1] Inf

trace(my_function)

debug(my_function)

debugonce(my_function)

undebug(my_function)
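To check whether a function is currently flagged for debugging, base R provides isdebugged(). The short sketch below uses a hypothetical function `f()` and deliberately never calls it while flagged, so no browser session opens.

```r
f <- function(x) x + 1     # hypothetical function

debug(f)                   # flag f: the next call would open the browser
flag_on <- isdebugged(f)

undebug(f)                 # remove the flag
flag_off <- isdebugged(f)

print(c(flag_on, flag_off))
## [1]  TRUE FALSE
```

This is a quick way to find out why a function keeps opening the browser: a leftover debug() flag that was never cleared.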

Module-6-HW_KF

Kayla Foht

2025-06-18

Directory