---
title: "Example Workflow For Processing a Single Pooled Screen"
author: "Russell Bainer"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Example_Workflow_gCrisprTools}
  %\VignetteEngine{knitr::knitr}
  %\VignetteEncoding{UTF-8}
---

### Example Workflow For Processing a Single Screen

This is an example workflow for processing a pooled screening eperiment using the provided sample data. See the various manpages for additional visualization options and algorithmic details. 

Note that what follows describes a very basic analysis. If you are considering integrating the results of many different screen contrasts or even different experiments and/or technologies, refer to the `Advanced Screen Analysis: Contrast Comparisons` vignette. 

Load dependencies and data
```{r setup, eval = TRUE}
suppressPackageStartupMessages(library(Biobase))
suppressPackageStartupMessages(library(limma))
suppressPackageStartupMessages(library(gCrisprTools))

data("es", package = "gCrisprTools")
data("ann", package = "gCrisprTools")
data("aln", package = "gCrisprTools")
knitr::opts_chunk$set(message = FALSE, fig.width = 8, fig.height = 8, warning = FALSE)
```

Make a sample key, structured as a factor with control samples in the first level
```{r, eval = TRUE}
sk <- relevel(as.factor(pData(es)$TREATMENT_NAME), "ControlReference")
names(sk) <- row.names(pData(es))
```

Generate a contrast of interest using voom/limma; pairing replicates is a good idea if that information is available. 
```{r, eval = TRUE}
design <- model.matrix(~ 0 + REPLICATE_POOL + TREATMENT_NAME, pData(es))
colnames(design) <- gsub('TREATMENT_NAME', '', colnames(design))
contrasts <-makeContrasts(DeathExpansion - ControlExpansion, levels = design)
```

Optionally, trim of trace reads from the unnormalized object (see man page for details)
```{r, eval = TRUE, fig.width = 8, fig.height = 10}
es <- ct.filterReads(es, trim = 1000, sampleKey = sk)
```

Normalize, convert to a voom object, and generate a contrast
```{r, eval = TRUE, fig.width = 6, fig.height = 12}
es <- ct.normalizeGuides(es, method = "scale", plot.it = TRUE) #See man page for other options
vm <- voom(exprs(es), design)

fit <- lmFit(vm, design)
fit <- contrasts.fit(fit, contrasts)
fit <- eBayes(fit)
```

Edit the annotation file if you used `ct.filterReads` above
```{r, eval = TRUE}
ann <- ct.prepareAnnotation(ann, fit, controls = "NoTarget")
```

Summarize gRNA signals to identify target genes of interest

```{r, eval = TRUE}
resultsDF <-
  ct.generateResults(
    fit,
    annotation = ann,
    RRAalphaCutoff = 0.1,
    permutations = 1000,
    scoring = "combined", 
    permutation.seed = 2
  )
```

#### Alternative Annotations

In some cases, reagents might target multiple known elements (e.g., gRNAs in a CRISPRi library that target multiple promoters of the same gene). In such cases, you can specify this via the `alt.annotation` argument to `ct.generateResults()`. Alternative annotations are supplied as a list of character vectors named for the reagents.

```{r, eval = TRUE}

# Create random alternative target associations 

altann <- sapply(ann$ID, 
                 function(x){
                   out <- as.character(ann$geneSymbol)[ann$ID %in% x]
                   if(runif(1) < 0.01){out <- c(out, sample(as.character(ann$geneSymbol), size = 1))}
                   return(out)
                 }, simplify = FALSE)

resultsDF <-
  ct.generateResults(
    fit,
    annotation = ann,
    RRAalphaCutoff = 0.1,
    permutations = 1000,
    scoring = "combined", 
    alt.annotation = altann,
    permutation.seed = 2
  )
```


Optionally, just load an example results object for testing purposes (trimming out reads as necessary)
```{r, eval = TRUE}
data("fit", package = "gCrisprTools")
data("resultsDF", package = "gCrisprTools")

fit <- fit[(row.names(fit) %in% row.names(ann)),]
resultsDF <- resultsDF[(row.names(resultsDF) %in% row.names(ann)),]

targetResultDF <- ct.simpleResult(resultsDF) #For a simpler target-level result object
```

### Quality Control

gCrisprTools contains a variety of pooled screen-specific quality control and visualization tools (see man pages for details):
```{r, eval = TRUE}
ct.alignmentChart(aln, sk)
ct.rawCountDensities(es, sk)
```

Visualize gRNA abundance distributions
```{r, eval = TRUE}
ct.gRNARankByReplicate(es, sk) 
ct.gRNARankByReplicate(es, sk, annotation = ann, geneSymb = "NoTarget")  #Show locations of NTC gRNAs
```

Visualize control guide behavior across conditions
```{r, eval = TRUE}
ct.viewControls(es, ann, sk, normalize = FALSE)
ct.viewControls(es, ann, sk, normalize = TRUE)
```

Visualize GC bias across samples, or within an experimental contrast
```{r, eval = TRUE, fig.width = 8, fig.height = 12}
ct.GCbias(es, ann, sk)
ct.GCbias(fit, ann, sk)
```

View most variable gRNAs/Genes (as % of sequencing library)
```{r, eval = TRUE}
ct.stackGuides(es,
               sk,
               plotType = "gRNA",
               annotation = ann,
               nguides = 40)
```

```{r, eval = TRUE}
ct.stackGuides(es, 
               sk, 
               plotType = "Target", 
               annotation = ann)
```

```{r, eval = TRUE}
ct.stackGuides(es,
               sk,
               plotType = "Target",
               annotation = ann,
               subset = names(sk)[grep('Expansion', sk)])
```
               
               
View a CDF of genes/guides
```{r, eval = TRUE}
ct.guideCDF(es, sk, plotType = "gRNA")
ct.guideCDF(es, sk, plotType = "Target", annotation = ann)
```

#### Target-Level Visualization and Analysis

View the overall enrichment and depletion trends identified in the screen: 
```{r, eval = TRUE}
ct.contrastBarchart(resultsDF)
```

View top enriched/depleted candidates
```{r, eval = TRUE}
ct.topTargets(fit,
              resultsDF,
              ann,
              targets = 10,
              enrich = TRUE)
ct.topTargets(fit,
              resultsDF,
              ann,
              targets = 10,
              enrich = FALSE)
```

View the behavior of reagents targeting a particular gene of interest
```{r, eval = TRUE, fig.width = 8, fig.height = 10}
ct.viewGuides("Target1633", fit, ann)
ct.gRNARankByReplicate(es, sk, annotation = ann, geneSymb = "Target1633")
```

Observe the effects detected for sets of targets within a screen contrast

```{r, eval = TRUE}
ct.signalSummary(resultsDF,
                 targets = list('TargetSetA' = c(sample(unique(resultsDF$geneSymbol), 3)),
                                'TargetSetB' = c(sample(unique(resultsDF$geneSymbol), 2))))
```

You could test a known gene set for enrichment within target candidates:

```{r, eval = TRUE}
data("essential.genes", package = "gCrisprTools")
ct.targetSetEnrichment(resultsDF, essential.genes)
```

Or optionally add a visualization:
```{r rocprc, eval = TRUE}
ROC <- ct.ROC(resultsDF, essential.genes, direction = "deplete")
PRC <- ct.PRC(resultsDF, essential.genes, direction = "deplete")
show(ROC) # show(PRC) is equivalent for the PRC analysis
```

Or alternatively you could test for ontological enrichment within the depleted/enriched targets via the `sparrow` package: 

```{r, eval = TRUE, warning=FALSE, message = FALSE}
#Create a geneset database using one of the many helper functions
genesetdb <- sparrow::getMSigGeneSetDb(collection = 'h', species = 'human', id.type = 'entrez')

ct.seas(resultsDF, gdb = genesetdb)

# If you have a library that targets elements unevenly (e.g., variable numbers of 
# elements/promoters per genes), you can conform it via `sparrow::convertIdentifiers()`

genesetdb.GREAT <- sparrow::convertIdentifiers(genesetdb, 
                                               from = 'geneID', 
                                               to = 'geneSymbol', 
                                               xref = ann)
ct.seas(resultsDF, gdb = genesetdb.GREAT)
```

See the `Contrast_Comparisons` vignette for more advanced use cases of gCrisprTools and extension to complex experiments and study designs. 


Finally, you can make reports in a directory of interest: 

```{r, eval = FALSE}
path2report <-      #Make a report of the whole experiment
  ct.makeReport(fit = fit, 
                eset = es, 
                sampleKey = sk, 
                annotation = ann, 
                results = resultsDF, 
                aln = aln, 
                outdir = ".") 

path2QC <-          #Or one focusing only on experiment QC
  ct.makeQCReport(es, 
                  trim = 1000, 
                  log2.ratio = 0.05, 
                  sampleKey = sk, 
                  annotation = ann, 
                  aln = aln, 
                  identifier = 'Crispr_QC_Report',
                  lib.size = NULL
                  )                

path2Contrast <-    #Or Contrast-specific one
  ct.makeContrastReport(eset = es, 
                        fit = fit, 
                        sampleKey = sk, 
                        results = resultsDF, 
                        annotation = ann, 
                        comparison.id = NULL, 
                        identifier = 'Crispr_Contrast_Report')            
```

```{r}
sessionInfo()
```