---
title: "shinyDSP tutorial"
author:
  - name: Seung J. Kim
    affiliation:
    - Interstitial Lung Disease Lab, London Health Sciences Center
    email: skim823@uwo.ca
  - name: Marco Mura
    affiliation:
    - Interstitial Lung Disease Lab, London Health Sciences Center
    - Division of Respirology, Department of Medicine, Western University
    email: marco.mura@lhsc.on.ca
package: "`r pkg_ver('shinyDSP')`"
date: "`r doc_date()`"
output:
  BiocStyle::html_document:
    toc: true
    number_sections: true
    toc_float:
      smooth_scroll: true
    toc_depth: 2
    self_contained: yes
vignette: >
  %\VignetteIndexEntry{shinyDSP}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
bibliography: references.bib
link-citations: true
editor_options:
  markdown:
    wrap: 80
---

```{r, include = FALSE}
knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>"
)
```

## Introduction

shinyDSP: Analyzing and Visualizing Nanostring GeoMx DSP Data

shinyDSP is an intuitive Shiny application designed for the comprehensive
analysis and visualization of [Nanostring GeoMx
DSP](https://nanostring.com/products/geomx-digital-spatial-profiler/geomx-dsp-overview/)
data. Users can load the demo dataset or upload their own, each consisting of a
count table and a sample annotation table. The app prompts users to select
variables of interest, potential batch effects, and confounding factors,
allowing for customized exploration. With shinyDSP, users can create x-y scatter
plots of any combination of quantitative variables and apply user-defined
cutoffs to filter samples. The app uses the R package standR [@liu_standr_2023]
to perform normalization using methods such as CPM, upper quartile (Q3), or RUV4
(Remove Unwanted Variation). Users can visualize PCA plots generated for each
normalization method, colour-coded by chosen variables or batch. After selecting
a normalization scheme, users can identify differentially expressed genes
between specified biological groups using limma-voom [@ritchie_limma_2015].
The app provides "raw" output numbers in tables, generates volcano plots for all
pairwise comparisons, and displays heatmaps of the top differentially expressed
genes. shinyDSP aims to provide a robust, start-to-finish analysis of GeoMx
data, producing publication-ready outputs that are easily customizable to meet
individual aesthetic preferences.

## Installation

```{r installation, eval = FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

BiocManager::install("shinyDSP")

# To install the development version from GitHub:
if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")
}
devtools::install_github("kimsjune/shinyDSP")

library(shinyDSP)
```

## Usage

```{r usage, eval = FALSE}
library(shinyDSP)

app <- shinyDSP()

# This will open a new browser tab/window.
if (interactive()) shiny::runApp(app)
```

## User interface

![](ui_annotated.png){style="display: block; margin: 1em auto;" width="100%"}

There are four main UI components:\
1. The `nav bar` where `nav panels` will appear. `Setup` is a `nav panel`.\
2. The main `side bar`. This is where you can set global parameters.\
3. The main display within each `nav panel`. This is where outputs will appear.\
4. A `side bar` within the `setup` `nav panel`. This is where customization
options will appear.

## Loading data

shinyDSP requires count and annotation tables (in .csv or .txt format) as input.
These tables are generated with the DSP Data Analysis (DSPDA) software. To see
how these files should be formatted, click on "Use [demo
data](https://nanostring.com/products/geomx-digital-spatial-profiler/spatial-organ-atlas/human-kidney/)",
then "Load data". The top 10 rows of each table will be shown in the main
display. The main `side bar` will also be updated with various options.

![](main_sidebar_annotated.png){style="display: block; margin: 1em auto;" width="50%"}

1\. You can pick one or more variables of interest.
It's common to combine two variables into one grouped variable. For example,
"genotype" and "treatment" can be combined into "genotype_treatment". The
combined variable is added automatically as a new column in your annotation
table.\
2. A batch variable. "SlideName" is selected by default, but it could be any
categorical variable provided in the annotation table, such as "sample
preparation batch".\
3. Any confounding variable(s), such as the age or sex of your samples, that you
want to include in the design matrix for differential gene expression analysis.
None are selected by default.

For this demo, "disease_status" and "region" are selected as the variables of
interest. All four groups of interest are selected (DKD_glomerulus, DKD_tubule,
normal_tubule, normal_glomerulus).

After selecting "Variable(s) of interest", two new `nav panels` appear: "QC" and
"PCA".

## QC

Click on the "QC" `nav panel` to create scatter plots and (optionally) filter
out samples that do not meet cutoffs.

![](qc_sidebar_annotated.png){style="display: block; margin: 1em auto;" width="50%"}

1\. Pick **two** or more quantitative variables to plot and (optionally)
filter.\
2. If you select more than two variables, increase this number to show all
possible x-y plots.\
3. Pick a variable for the colour legend. "SlideName" is the default.\
4. Pick one of the five colour palettes. "glasbey" is the default.\
5. Click to open. You have the option of providing minimum threshold value(s)
for each variable from (1).

Lastly, click on "Show QC plots" to show/update the plots.

![](qc_main_annotated.png){width="100%"}

In the example above, "SequencingSaturation" and "DeduplicatedReads" were
selected. Samples with "SequencingSaturation" below 85 were then removed. No
filtering was applied based on "DeduplicatedReads". Each scatter plot can be
saved as .png, .tiff, .svg or .pdf by clicking the download buttons below it.

Now we move on to "PCA" (click on the `nav panel`).

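
The threshold filtering described above can be pictured with a minimal base R
sketch. The annotation table below is invented for illustration (the `sample_id`
column and all values are hypothetical), and this is not the code shinyDSP runs
internally. It only mirrors the demo choice of keeping samples with
"SequencingSaturation" of at least 85 while leaving "DeduplicatedReads"
unfiltered.

```r
# Hypothetical annotation table; the QC column names follow the demo,
# but sample_id and all values are invented.
annotation <- data.frame(
    sample_id = c("roi_1", "roi_2", "roi_3", "roi_4"),
    SequencingSaturation = c(92.1, 84.9, 88.3, 70.5),
    DeduplicatedReads = c(150000, 98000, 210000, 45000),
    stringsAsFactors = FALSE
)

# Minimum threshold per QC variable; NA means "do not filter on this variable".
cutoffs <- c(SequencingSaturation = 85, DeduplicatedReads = NA)

# Keep a sample only if it meets every threshold that was provided.
keep <- rep(TRUE, nrow(annotation))
for (variable in names(cutoffs)) {
    if (!is.na(cutoffs[[variable]])) {
        keep <- keep & annotation[[variable]] >= cutoffs[[variable]]
    }
}

filtered <- annotation[keep, ]
filtered$sample_id
```

In this toy table, only the two samples with saturation of 85 or higher remain
after filtering.
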
## PCA

![](pca_sidebar.png){style="display: block; margin: 1em auto;" width="50%"}

Click on "Run" to generate PCA plots. For each group of interest and batch
variable that you selected in the main `side bar`, you can pick its shape and
colour. For example, "DKD_glomerulus" will appear as black circles. You can pick
between five different shapes and any colour in `grDevices::colours()`.

Two sets of three PCA plots are generated in the main display area. Three
normalization schemes are shown: CPM, Q3 (upper quartile), and RUV4 (Remove
Unwanted Variation). The plots in the top and bottom rows are colour-coded by
"Variable(s) of interest" and the batch variable, respectively. Click on
"Download" in the `side bar` to find all the download options.

Select the smallest value of "k value for RUV4 norm." that removes the batch
effect. Click on "Run" to show updated plots.

## Normalization

![](main_sidebar_norm.png){style="display: block; margin: 1em auto;" width="50%"}

Now you can choose the normalization scheme to use for differential gene
expression testing and the log fold-change cutoff for
[limma::topTable](https://www.rdocumentation.org/packages/limma/versions/3.28.14/topics/toptable).
Selecting a normalization scheme will reveal three new `nav panels` in the
`nav bar`: "Table", "Volcano", and "Heatmap".

![](updated_nav_bar.png){style="display: block; margin: 1em auto;" width="100%"}

> If you choose "RUV4", you need to open the "PCA" `nav panel` to load the k
> value.

## Table

Clicking on the "Table" `nav panel` automatically starts differential gene
expression testing between all selected "Groups of interest". If more than two
groups were selected, all possible pairwise comparisons *and* an ANOVA-like test
between all groups are executed. This step can take about 3 minutes. Results are
separated into `tabs` highlighted in blue. Click on the "Download table" button
below each table to download it (in .csv).

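
To get a feel for how many comparisons this involves, the base R sketch below
enumerates every pair among the four demo groups. It only builds contrast
labels. The "A - B" label format is an assumption made for illustration and may
not match how shinyDSP names its `tabs` or sets up contrasts internally.

```r
# The four groups of interest selected in the demo.
groups <- c("DKD_glomerulus", "DKD_tubule", "normal_tubule", "normal_glomerulus")

# combn() returns a matrix with one column per unordered pair of groups.
pairs <- combn(groups, 2)

# Hypothetical label format: "first - second".
contrast_labels <- paste(pairs[1, ], pairs[2, ], sep = " - ")

# Number of pairwise comparisons; equals choose(length(groups), 2).
length(contrast_labels)
contrast_labels
```

With four groups there are six pairwise comparisons, in addition to the single
ANOVA-like test across all groups.
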
![](table.png){style="display: block; margin: 1em auto;" width="100%"}

## Volcano plots

Click on the "Volcano" `nav panel` and "Show/update".

![](volcano_main.png){style="display: block; margin: 1em auto;" width="100%"}

Like the tables, volcano plots are shown in individual `tabs` highlighted in
blue.

![](volcano_sidebar_annotated.png){style="display: block; margin: 1em auto;" width="50%"}

There are several customization options for tweaking the look and feel of these
plots.

1\. A higher number shows more gene names by allowing them to overlap each
other.\
2. A higher number makes the labels larger.\
3-4. Genes that do not meet these cutoffs are given the "Not DE colour" (7).\
5. Genes with logFC \>= "logFC cutoff" are given this colour. Must be one of the
names returned by `grDevices::colors()`.\
6. Genes with logFC \<= -"logFC cutoff" are given this colour. Must be one of
the names returned by `grDevices::colors()`.\
7. Must be one of the names returned by `grDevices::colors()`.\
8. Click to enable custom x and y ranges.

These settings are applied to all volcano plots. Below each volcano plot, you
have the option to save it as four different file types.

## Heatmap

For each "Table", a corresponding heatmap is generated. The heatmaps are also
organized into individual `tabs`. By default, the top 50 genes (sorted by
adjusted P value) are shown as rows, clustered based on Euclidean distances.
There are a few customization options:

![](heatmap_sidebar_annotated.png){style="display: block; margin: 1em auto;" width="50%"}

1.  Any number N of top genes can be plotted.
2.  The
    [viridis](https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html)
    colour map is available.
3.  A custom range of Z scores. It does not have to be symmetric around zero.
4.  This adjusts the overall size of the **downloaded** heatmap. Each "square"
    will become smaller/bigger.
5.  Font size for row/gene labels.

Below each heatmap, you have the option to save it as four different file types.

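
The heatmap defaults described in this section can be approximated in a few
lines of base R. The sketch below is only a guess at the general recipe: pick
the top N genes by adjusted P value, convert each row to Z scores, clip to a
chosen range, and cluster rows on Euclidean distance. The simulated matrix, the
`adj_p` vector, and all object names are hypothetical, and shinyDSP's own
implementation may differ.

```r
set.seed(1)

# Simulated log-expression matrix: 200 genes (rows) x 8 samples (columns).
expr <- matrix(
    rnorm(200 * 8),
    nrow = 200,
    dimnames = list(paste0("gene", 1:200), paste0("sample", 1:8))
)

# Simulated adjusted P values, one per gene.
adj_p <- setNames(runif(200), rownames(expr))

# 1. Keep the top N genes, sorted by adjusted P value.
top_n <- 50
top_genes <- names(sort(adj_p))[seq_len(top_n)]
top_expr <- expr[top_genes, ]

# 2. Row-wise Z scores: scale() works on columns, hence the double transpose.
z <- t(scale(t(top_expr)))

# 3. Clip Z scores to a custom range, which need not be symmetric.
z_range <- c(-2, 3)
z_clipped <- pmin(pmax(z, z_range[1]), z_range[2])

# 4. Order rows by hierarchical clustering on Euclidean distance.
row_order <- hclust(dist(z, method = "euclidean"))$order
z_ordered <- z_clipped[row_order, ]

dim(z_ordered)
```

Clipping affects only the colour scale in this sketch. The clustering is
computed on the unclipped Z scores, which is a design choice made here rather
than something the app is known to do.
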
## Data processing and analysis

For details on underlying functions, please check out the [secondary
vignette](shinyDSP_secondary.html).

## Session Info

```{r sessionInfo}
sessionInfo()
```