Introduction

This document demonstrates how to retrieve the compounds for each pathway entry in the KEGG database using the KEGGREST R package. We get the list of pathway IDs for Gallus gallus (chicken, KEGG organism code gga), retrieve the compounds for each pathway, and collect the concatenated compound IDs into a single data frame.

Load Required Package

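First, we need to load the KEGGREST package. KEGGREST is distributed through Bioconductor rather than CRAN, so if it is not already installed, install it with BiocManager::install("KEGGREST") (run install.packages("BiocManager") first if BiocManager itself is missing).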

# Load the KEGGREST package
library(KEGGREST)

Get List of Pathway IDs

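We retrieve the list of pathways for a given organism, in this case Gallus gallus (KEGG organism code gga). keggList returns a named character vector whose names are the pathway IDs and whose values are the pathway descriptions, so the IDs are obtained with names().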

# Get the list of pathway IDs for Gallus gallus
pathways <- keggList("pathway", "gga")
pathway_ids <- names(pathways)
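
To make the structure of that result concrete without a network connection, the sketch below builds a small mock vector by hand. The two entries are illustrative stand-ins, not live KEGG output. Some versions of the KEGG REST service have returned pathway IDs with a "path:" prefix, so the hypothetical helper normalize_pathway_ids (not part of KEGGREST) strips that prefix when it is present.

```r
# Mock of the named character vector returned by keggList("pathway", "gga").
# These entries are illustrative assumptions, not fetched from KEGG.
pathways_mock <- c(
  "path:gga00010" = "Glycolysis / Gluconeogenesis - Gallus gallus (chicken)",
  "gga00020" = "Citrate cycle (TCA cycle) - Gallus gallus (chicken)"
)

# Hypothetical helper: drop a leading "path:" prefix if one is present
normalize_pathway_ids <- function(ids) {
  sub("^path:", "", ids)
}

# names() yields the IDs; the values are the human-readable descriptions
mock_ids <- normalize_pathway_ids(names(pathways_mock))
print(mock_ids)
```

keggGet should accept IDs in either form, so this normalization mainly matters when the IDs are later joined against other tables.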

Retrieve Compounds for Each Pathway

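Next, we retrieve the compounds for each pathway. We use keggGet to fetch the full entry for each pathway and read its COMPOUND field, a named character vector whose names are the compound IDs and whose values are the compound names. If a pathway has compounds, we concatenate their IDs into a single comma-separated string; otherwise we record NA. Because this issues one request per pathway, the loop can take a while to finish.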

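# Retrieve the compounds for each pathway and concatenate their IDs
pathway_compounds <- vapply(pathway_ids, function(pid) {
  # keggGet() returns a list of entries; a single ID yields a single entry
  pathway_info <- keggGet(pid)
  # COMPOUND is a named character vector: names are IDs, values are compound names
  compounds <- pathway_info[[1]]$COMPOUND
  if (is.null(compounds)) {
    # This pathway entry has no COMPOUND field
    return(NA_character_)
  }
  paste(names(compounds), collapse = ", ")
}, character(1), USE.NAMES = FALSE)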
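
Fetching one pathway per call means one HTTP request per ID. The KEGGREST documentation states that keggGet accepts up to 10 identifiers per call, so the requests can be batched. The following is a minimal sketch under that assumption. chunk_ids and extract_compound_ids are hypothetical helper names introduced here, and mock_entry is hand-written data standing in for one element of a keggGet result.

```r
# Hypothetical helper: split IDs into groups of at most `size`
# (10 is the per-call limit documented for keggGet)
chunk_ids <- function(ids, size = 10) {
  unname(split(ids, ceiling(seq_along(ids) / size)))
}

# Hypothetical helper: collapse the compound IDs of one pathway entry into a string
extract_compound_ids <- function(entry) {
  compounds <- entry$COMPOUND
  if (is.null(compounds)) return(NA_character_)
  paste(names(compounds), collapse = ", ")
}

# Offline check with mock data (not real KEGG output)
mock_entry <- list(
  ENTRY = "gga00010",
  COMPOUND = c(C00022 = "Pyruvate", C00031 = "D-Glucose")
)
chunks <- chunk_ids(sprintf("gga%05d", 1:23))
chunk_sizes <- lengths(chunks)
mock_compounds <- extract_compound_ids(mock_entry)

# With network access, the batched retrieval could look like this:
# entries <- unlist(lapply(chunk_ids(pathway_ids), keggGet), recursive = FALSE)
# batched <- vapply(entries, extract_compound_ids, character(1))
# names(batched) <- vapply(entries, function(e) e$ENTRY[[1]], character(1))
```

When batching, it is safer to key the results on each entry's ENTRY field, as in the commented lines, than to assume they align with the order of the input IDs.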

Combine Data into a Data Frame

We combine the pathway IDs and their corresponding concatenated compounds into a data frame.

# Combine into a data frame
pathway_compounds_df <- data.frame(
  PathwayID = pathway_ids,
  CompoundID = unlist(pathway_compounds),
  stringsAsFactors = FALSE
)
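
The comma-separated string is compact but awkward to filter or join on. An alternative layout, not part of the workflow above, is a long data frame with one row per pathway-compound pair. This is a minimal sketch using mock data. compounds_by_pathway is a hypothetical list holding one vector of compound IDs per pathway, and its contents are illustrative only.

```r
# Mock input: one character vector of compound IDs per pathway
# (illustrative values, not KEGG output)
compounds_by_pathway <- list(
  gga00010 = c("C00022", "C00031"),
  gga00020 = c("C00022", "C00024", "C00036"),
  gga99999 = character(0)  # made-up pathway ID with no compounds
)

# One row per pathway-compound pair; pathways without compounds drop out
long_df <- data.frame(
  PathwayID = rep(names(compounds_by_pathway), lengths(compounds_by_pathway)),
  CompoundID = unlist(compounds_by_pathway, use.names = FALSE),
  stringsAsFactors = FALSE
)
print(long_df)
```

In this layout, finding every pathway that contains a given compound is a plain subset, for example long_df[long_df$CompoundID == "C00022", ].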

Display the Resulting Data Frame

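Finally, we display the last rows of the resulting data frame to check the pathway IDs and their associated compounds. tail() shows the final six rows by default; use head() for the first rows or print the object to see all of it.

# Show the last six rows of the resulting data frame
tail(pathway_compounds_df)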

Conclusion

This document demonstrated how to use the KEGGREST package to retrieve and organize compounds for each pathway from the KEGG database. The resulting data frame provides a convenient way to view the compounds associated with each pathway ID.