%\VignetteIndexEntry{reactome.db: How to use the reactome.db package}
%\VignetteKeywords{annotation, database, reactome}
%\VignettePackage{reactome.db}
\documentclass[11pt]{article}

\usepackage{theorem}

\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}
\newcommand{\Rclass}[1]{{\textit{#1}}}

%% Exercises and Questions
\theoremstyle{break} \newtheorem{Ex}{Exercise}
\theoremstyle{break} \newtheorem{Q}{Question}
%% And solution or answer
\newenvironment{solution}{\color{blue}}{\bigskip}

\title{reactome.db: How to use the reactome.db package}
\author{Willem Ligtenberg}

\SweaveOpts{keep.source=TRUE}

\begin{document}

\maketitle

\section{Introduction}
The \Rpackage{reactome.db} package contains mappings from Gene Ontology (GO)
identifiers and Entrez Gene identifiers to Reactome identifiers of pathways.

Reactome is an open-source, open-access, manually curated and peer-reviewed
pathway database. Pathway annotations are authored by expert biologists, in
collaboration with Reactome editorial staff, and are cross-referenced to many
bioinformatics databases. The core unit of the Reactome data model is the
reaction. Entities (nucleic acids, proteins, complexes and small molecules)
participating in reactions form a network of biological interactions and are
grouped into pathways. Examples of biological pathways in Reactome include
signaling, innate and acquired immune function, transcriptional regulation,
translation, apoptosis and classical intermediary metabolism.

The \Rpackage{reactome.db} package also contains a mapping from Reactome
identifiers to pathway names. It can be used for gene set enrichment analysis
or simply to look up the biological context of a given gene.
The version distributed through Bioconductor contains the Reactome data that
were available at the time of the corresponding Bioconductor release.
Open Analytics also maintains its own repository, where we update the package
as soon as Reactome releases a new version. This repository can be found at:
http://repos.openanalytics.eu/

\section{Examples}
\subsection{Basic information}
The \Rpackage{reactome.db} package provides an interface to an SQLite
database, which contains a subset of the Reactome database.
The command \Rfunction{ls} can be used to get an overview of the objects that
the package provides.

<<>>=
options(continue=" ", prompt="R> ", width=72L)
@

<<>>=
library("reactome.db")
@

The package provides the same basic set of objects as the other annotation
(\texttt{.db}) packages:

<<>>=
ls("package:reactome.db")
@

\begin{Ex}
Start an R session and use the \Rfunction{library} function to load the
\Rpackage{reactome.db} software package.
<<>>=
library("reactome.db")
@
\end{Ex}

Calling the package name without the \texttt{.db} suffix as a function,
\Rfunction{reactome}, prints some quality-control (QC) information about the
package.

<<>>=
qcdata <- capture.output(reactome())
head(qcdata, 20)
@

Alternatively, \Robject{reactomeMAPCOUNTS} gives similar information on how
many items each of the provided maps contains:

<<>>=
reactomeMAPCOUNTS
@

To demonstrate the \Rclass{environment} API, we'll start with a random sample
of Reactome IDs.

<<>>=
all_reactomeIds <- ls(reactomePATHID2EXTID)
length(all_reactomeIds)
set.seed(0xa1beef)
reactomeIds <- sample(all_reactomeIds, 3)
reactomeIds
@

The usual ways of accessing annotation data are also available.

<<>>=
reactomePATHID2EXTID[[reactomeIds[1]]]
reactomePATHID2EXTID$"1163762"
pathwayNames <- unlist(mget(reactomeIds, reactomePATHID2NAME))
pathwayNames
@

Each mapping that the \Rpackage{reactome.db} package provides has a manual
page that describes the data in the mapping and where they came from.
<<>>=
?reactomeEXTID2PATHID
@

\subsection{Getting pathways for a specific species}
To get all pathways for a specific species, one can use the following code,
which keeps the pathways whose name contains the species label (here
\texttt{Homo sapiens: }):

<<>>=
pathways <- toTable(reactomePATHNAME2ID)
pathwaysSelectedSpecies <- pathways[grep("Homo sapiens: ", iconv(pathways$path_name)), ]
@

\begin{Ex}
For the Reactome IDs in \Robject{reactomeIds} above, use the annotation
mappings to find the GO annotations.
<<>>=
mget(reactomeIds, reactomeREACTOMEID2GO, ifnotfound=NA)[1:2]
@
\end{Ex}

\begin{Ex}
How many Reactome IDs do not have a GO mapping in the \Rpackage{reactome.db}
package? Find a Reactome ID that has a GO mapping. Now look at the GO mappings
for this Reactome ID in table form.
<<>>=
## Number of Reactome IDs with at least one GO mapping
count.mappedLkeys(reactomeREACTOMEID2GO)
## Number of Reactome IDs without a GO mapping
length(reactomeREACTOMEID2GO) - count.mappedLkeys(reactomeREACTOMEID2GO)
## A Reactome ID that has a GO mapping
mappedLkeys(reactomeREACTOMEID2GO)[1]
## GO mappings for one Reactome ID in table form
toTable(reactomeREACTOMEID2GO["1008200"])
@
\end{Ex}

The versions of R and of the packages loaded when generating this vignette
were:

<<>>=
sessionInfo()
@

\end{document}