This document demonstrates how to retrieve the compounds for each pathway
entry from the KEGG database using the KEGGREST R package.
We will get the list of pathway IDs for Gallus gallus (KEGG organism code
gga), then retrieve the compounds for each pathway and collect them in a
single data frame with one row per pathway.
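First, we need to load the KEGGREST package. KEGGREST is distributed
through Bioconductor rather than CRAN, so if it is not already installed,
install it with BiocManager::install("KEGGREST") (after
install.packages("BiocManager") if BiocManager itself is missing).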
# Load the KEGGREST package
library(KEGGREST)
We retrieve the list of pathway IDs for a given organism, in this
case, Gallus gallus (gga).
# Get the list of pathway IDs for Gallus gallus
pathways <- keggList("pathway", "gga")
pathway_ids <- names(pathways)
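To make the shape of this result concrete, here is a small offline sketch. It assumes that keggList returns a named character vector whose names are the pathway IDs and whose values are the pathway descriptions. The vector `toy_pathways` is a hand-written stand-in, not real query output, and the `path:` prefix on the first name is an assumption about what some KEGGREST versions may return, so check it against your own output.

```r
# Toy stand-in for the named character vector returned by
# keggList("pathway", "gga"); the entries are illustrative, not query output
toy_pathways <- c(
  "path:gga00010" = "Glycolysis / Gluconeogenesis - Gallus gallus (chicken)",
  "gga00020" = "Citrate cycle (TCA cycle) - Gallus gallus (chicken)"
)

# The names hold the pathway IDs; strip an optional "path:" prefix
# so that every ID has the same form
toy_ids <- sub("^path:", "", names(toy_pathways))
```

If your IDs already come back without a prefix, the `sub()` call leaves them unchanged.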
Next, we retrieve the compounds for each pathway. We call
keggGet once per pathway ID to get the detailed entry and
extract its COMPOUND field. If a pathway has compounds, we concatenate their
IDs into a single comma-separated string; otherwise we record NA. Because
this issues one request per pathway, the step can take a while.
# Retrieve compounds for each pathway and concatenate them
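pathway_compounds <- lapply(pathway_ids, function(pid) {
  pathway_info <- keggGet(pid)
  # COMPOUND is a named character vector; the names are the compound IDs
  compounds <- pathway_info[[1]]$COMPOUND
  if (!is.null(compounds)) {
    compound_ids <- paste(names(compounds), collapse = ", ")
  } else {
    # Typed NA keeps the CompoundID column a character vector
    compound_ids <- NA_character_
  }
  return(compound_ids)
})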
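The loop above sends one request per pathway. As far as the KEGGREST documentation describes it, keggGet accepts up to 10 entries per call, so the same work can be done with roughly a tenth of the requests. The sketch below is one possible way to batch the IDs, not the method used in this document. `collect_compounds` and its `fetch` argument are hypothetical names introduced here; `fetch` is assumed to return one entry per requested ID, in request order, in the same shape as keggGet output.

```r
# Hypothetical helper: fetch entries in batches and collapse compound IDs.
# `fetch` is any function that takes a character vector of IDs and returns
# a list of entries shaped like keggGet() output.
collect_compounds <- function(ids, fetch, batch_size = 10) {
  # Split the IDs into consecutive groups of at most `batch_size`
  batches <- split(ids, ceiling(seq_along(ids) / batch_size))

  # Fetch each batch and flatten the per-batch lists into one list of entries
  entries <- unlist(lapply(unname(batches), fetch), recursive = FALSE)

  # Assumption: one entry comes back per requested ID, in request order
  stopifnot(length(entries) == length(ids))

  # One comma-separated string per entry, NA when COMPOUND is absent
  vapply(entries, function(entry) {
    compounds <- entry$COMPOUND
    if (is.null(compounds)) {
      NA_character_
    } else {
      paste(names(compounds), collapse = ", ")
    }
  }, character(1))
}
```

With KEGGREST loaded, a call such as `collect_compounds(pathway_ids, keggGet)` should yield a character vector that can be used directly as the CompoundID column. Passing the fetch function as an argument also makes it easy to try the helper offline with a stub.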
We combine the pathway IDs and their corresponding concatenated compounds into a data frame.
# Combine into a data frame
pathway_compounds_df <- data.frame(
  PathwayID = pathway_ids,
  CompoundID = unlist(pathway_compounds),
  stringsAsFactors = FALSE
)
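A comma-separated column is easy to read but awkward to filter or join on. As an optional follow-up, the sketch below expands such a data frame into long format, with one row per pathway and compound pair. It runs on `toy_df`, a hand-written stand-in with the same two columns; its values are illustrative (the last pathway ID is made up) and are not real query results.

```r
# Toy data frame with the same columns as pathway_compounds_df
# (values are illustrative, not real query results)
toy_df <- data.frame(
  PathwayID = c("gga00010", "gga00020", "gga99999"),
  CompoundID = c("C00022, C00031", "C00022", NA),
  stringsAsFactors = FALSE
)

# Split each comma-separated string back into individual compound IDs;
# an NA entry stays a single NA
compound_list <- strsplit(toy_df$CompoundID, ", ", fixed = TRUE)

# Repeat each pathway ID once per compound to get one row per pair
long_df <- data.frame(
  PathwayID = rep(toy_df$PathwayID, lengths(compound_list)),
  CompoundID = unlist(compound_list),
  stringsAsFactors = FALSE
)
```

Pathways without compounds keep a single row with NA in CompoundID, so no pathway is silently dropped; filter with `!is.na(long_df$CompoundID)` if those rows are not wanted.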
Finally, we display the last few rows of the resulting data frame to check the pathway IDs and their associated compounds.
# Show the last six rows of the resulting data frame
tail(pathway_compounds_df)
This document demonstrated how to use the KEGGREST
package to retrieve and organize compounds for each pathway from the
KEGG database. The resulting data frame provides a convenient way to
view the compounds associated with each pathway ID.