---
title: "SEtools"
author:
- name: Pierre-Luc Germain
  affiliation:
  - D-HEST Institute for Neurosciences, ETH Zürich
  - Laboratory of Statistical Bioinformatics, University Zürich
package: SEtools
output:
  BiocStyle::html_document:
        fig_height: 3.5
abstract: |
  Showcases the use of SEtools to merge objects of the SummarizedExperiment class.
vignette: |
  %\VignetteIndexEntry{SEtools}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include=FALSE}
library(BiocStyle)
```

# Getting started

The `r Rpackage("SEtools")` package is a set of convenience functions for the _Bioconductor_ class `r Biocpkg("SummarizedExperiment")`. It facilitates merging, melting, and plotting `SummarizedExperiment` objects.

**NOTE that the heatmap-related and melting functions have been moved to a standalone package, `r Biocpkg("sechm")`.**
The old `sehm` function of `SEtools` should be considered deprecated, and most `SEtools` functions are conserved for legacy/reproducibility reasons (or until they find a better home).

## Package installation

```{r, eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("SEtools")
```

Or, to install the latest development version:

```{r, eval=FALSE}
BiocManager::install("plger/SEtools")
```

## Example data

To showcase the main functions, we will use an example object which contains (a subset of) whole-hippocampus RNAseq of mice after different stressors:

```{r}
suppressPackageStartupMessages({
  library(SummarizedExperiment)
  library(SEtools)
})
data("SE", package="SEtools")
SE
```

This is taken from [Floriou-Servou et al., Biol Psychiatry 2018](https://doi.org/10.1016/j.biopsych.2018.02.003).

## Merging and aggregating SEs

```{r}
se1 <- SE[,1:10]
se2 <- SE[,11:20]
se3 <- mergeSEs( list(se1=se1, se2=se2) )
se3
```

All assays were merged, along with rowData and colData slots.

By default, row z-scores are calculated for each object when merging. This can be prevented with:
```{r}
se3 <- mergeSEs( list(se1=se1, se2=se2), do.scale=FALSE)
```

If more than one assay is present, one can specify a different scaling behavior for each assay:
```{r}
se3 <- mergeSEs( list(se1=se1, se2=se2), use.assays=c("counts", "logcpm"), do.scale=c(FALSE, TRUE))
```

Differences to the `cbind` method include prefixes added to column names, optional scaling, handling of metadata (e.g. for `sechm`)

### Merging by rowData columns

It is also possible to merge by rowData columns, which are specified through the `mergeBy` argument. 
In this case, one can have one-to-many and many-to-many mappings, in which case two behaviors are possible:

* By default, all combinations will be reported, which means that the same feature of one object might appear multiple times in the output because it matches multiple features of another object.
* If a function is passed through `aggFun`, the features of each object will by aggregated by `mergeBy` using this function before merging.

```{r merging}
rowData(se1)$metafeature <- sample(LETTERS,nrow(se1),replace = TRUE)
rowData(se2)$metafeature <- sample(LETTERS,nrow(se2),replace = TRUE)
se3 <- mergeSEs( list(se1=se1, se2=se2), do.scale=FALSE, mergeBy="metafeature", aggFun=median)
sechm::sechm(se3, features=row.names(se3))
```

### Aggregating a SE

A single SE can also be aggregated by using the `aggSE` function:

```{r aggregating}
se1b <- aggSE(se1, by = "metafeature")
se1b
```

If the aggregation function(s) are not specified, `aggSE` will try to guess decent aggregation functions from the assay names.

This is similar to `scuttle::sumCountsAcrossFeatures`, but preserves other SE slots.

***

## Other convenience functions

Calculate an assay of log-foldchanges to the controls:

```{r}
SE <- log2FC(SE, fromAssay="logcpm", controls=SE$Condition=="Homecage")
```

<br/><br/>

# Session info {.unnumbered}

```{r sessionInfo, echo=FALSE}
sessionInfo()
```