% -*- mode: noweb; noweb-default-code-mode: R-mode; -*-
%\VignetteIndexEntry{3. Custom Processing Methods}
%\VignetteKeywords{Preprocessing, Affymetrix}
%\VignetteDepends{affy}
%\VignettePackage{affy}
%documentclass[12pt, a4paper]{article}
\documentclass[12pt]{article}

\usepackage{amsmath}
\usepackage{hyperref}
\usepackage[authoryear,round]{natbib}

\textwidth=6.2in
\textheight=8.5in
%\parskip=.3cm
\oddsidemargin=.1in
\evensidemargin=.1in
\headheight=-.3in

\newcommand{\scscst}{\scriptscriptstyle}
\newcommand{\scst}{\scriptstyle}

\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}
\newcommand{\Rmethod}[1]{{\texttt{#1}}}
\newcommand{\Rfunarg}[1]{{\texttt{#1}}}
\newcommand{\Rclass}[1]{{\textit{#1}}}

\author{Laurent}
\begin{document}
\title{affy: Custom Processing Methods (HowTo)}

\maketitle
\tableofcontents
\section{Introduction}
This document briefly describes how to customize the \Rpackage{affy} package by
adding one's own processing methods. The types of processing methods
are background correction, normalization, perfect match ({\it pm}) correction,
and summary expression value computation.
We have tried to make this as easy as we could, but we are aware
that it is far from perfect, and we keep working on improvements.
Hopefully this document will let you extend the
package with supplementary processing methods easily.

As usual, loading the package in your \verb+R+ session is required. 
\begin{Sinput}
R> library(affy) ##load the affy package
\end{Sinput}
<<echo=F,results=hide>>=
library(affy)
@

\section{How-to}
For each processing step, labels for the methods known to the package
are stored in variables. 

<<>>=
normalize.AffyBatch.methods()
bgcorrect.methods()
pmcorrect.methods()
express.summary.stat.methods()
@
We recommend using the method \verb+normalize.methods+ to access
the list of available normalization methods (a scheme for normalization
methods that would go beyond \Rpackage{affy} is being considered).
<<>>=
library(affydata)
data(Dilution)
normalize.methods(Dilution)
@

For each processing step, a naming convention links the method label
to the corresponding function name in \verb+R+ (see table~\ref{table:summary.labels}).
Each processing method should be passed, and should return, objects
corresponding to its processing step (see table~\ref{table:summary.methods}).
\begin{table}
 \begin{tabular}{|c|c|}
\hline
variable for labels & naming convention \\
\hline
bgcorrect.methods & \verb+bg.correct.<label>+ \\
%\hline
normalize.AffyBatch.methods & \verb+normalize.AffyBatch.<label>+ \\
%\hline
pmcorrect.methods & \verb+pmcorrect.<label>+ \\
%\hline
express.summary.stat.methods & \verb+generateExprVal.method.<label>+ \\
\hline
 \end{tabular}
\caption{\label{table:summary.labels}Naming conventions for the processing methods.}
\end{table}

\begin{table}
 %\begin{tabular}{|c|c|p{0.5\textwidth}|}
 \begin{tabular}{|c|c|p{5cm}|}
\hline
step & argument(s) & returns \\
\hline
background correction & \verb+AffyBatch+ & \verb+AffyBatch+ \\
%\hline
normalization & \verb+AffyBatch+ & \verb+AffyBatch+ \\
%\hline
{\it pm} correction &  \verb+ProbeSet+ & a \verb+matrix+ of corrected PM values
(one probe per row, one column per chip).\\
%\hline
expression value & a \verb+matrix+ of PM values & a list of two elements
\verb+exprs+ and \verb+se.exprs+\\
\hline
 \end{tabular}
\caption{\label{table:summary.methods}Arguments and return values for the processing methods.}
\end{table}
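
One way to see these conventions at work is to print one of the methods already
included in the package; here we look at the \verb+pmonly+ {\it pm} correction
method (assuming it is among the labels returned by \verb+pmcorrect.methods()+
on your installation):
<<>>=
## an existing method: function name pmcorrect.<label>, takes a ProbeSet,
## returns a matrix of corrected PM values
pmcorrect.pmonly
@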

Practically, this means that to add a method one has to
\begin{enumerate}
 \item create a function whose name satisfies the naming convention, and
 \item register the method by adding its label to the corresponding
 variable (a minimal sketch is given right after this list).
\end{enumerate}
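
As a minimal sketch of these two steps, here is a background correction method
that leaves the intensities untouched (essentially what \verb+bg.correct.none+
already does). The label \verb+donothing+ is made up for illustration, and an
updater \verb+upDate.bgcorrect.methods+, analogous to the ones used in the
examples below, is assumed; the chunk is not evaluated here.
<<eval=FALSE>>=
## a 'do nothing' background correction: an AffyBatch goes in,
## an AffyBatch comes out unchanged (see table 2)
bg.correct.donothing <- function(object, ...) {
  return(object)
}

## register the label (updater assumed to exist, as for the other steps)
upDate.bgcorrect.methods(c(bgcorrect.methods(), "donothing"))
@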

\section{Examples}
As an example we show how to add two simple new methods. The first one does
{\it pm} correction: it subtracts the {\it mm} values from the {\it pm} values,
except when the result would not be positive; in that case the {\it pm} value
is left unchanged.

Following the naming convention, we create a function for the label \verb+subtractmmsometimes+.
<<>>=
pmcorrect.subtractmmsometimes <- function(object) {

  ## subtract mm
  mm.subtracted <- pm(object) - mm(object)

  ## find which ones are unwanted and fix them
  invalid <- which(mm.subtracted <= 0)
  mm.subtracted[invalid] <- pm(object)[invalid]

  return(mm.subtracted)
}
@ 
Once the method is defined, we simply register the \emph{label name} in the
corresponding variable.
<<>>=
upDate.pmcorrect.methods(c(pmcorrect.methods(), "subtractmmsometimes"))
@
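
As a quick check, the new method can be applied directly to a single
\Rclass{ProbeSet}; the calls to \verb+probeset+ and \verb+geneNames+ below are
just one (assumed) way of extracting such an object from the \verb+Dilution+
data, and the chunk is not evaluated here.
<<eval=FALSE>>=
## extract one ProbeSet from the Dilution data and apply the new method
ps <- probeset(Dilution, genenames = geneNames(Dilution)[1])[[1]]
corrected <- pmcorrect.subtractmmsometimes(ps)
dim(corrected)  ## one row per probe, one column per chip
@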

The second new method is intended as a robust alternative to the summary
expression value computation \verb+avgdiff+. The idea is to use the function
\verb+huber+ from the package \verb+MASS+.
<<>>=
huber <- function (y, k = 1.5, tol = 1e-06) {
    y <- y[!is.na(y)]
    n <- length(y)
    mu <- median(y)
    s <- mad(y)
    if (s == 0) 
        stop("cannot estimate scale: MAD is zero for this sample")
    repeat {
        yy <- pmin(pmax(mu - k * s, y), mu + k * s)
        mu1 <- sum(yy)/n
        if (abs(mu - mu1) < tol * s) 
            break
        mu <- mu1
    }
    list(mu = mu, s = s)
}
@
This function returns P.~J. Huber's location estimate together with the MAD
scale estimate. Suppose this is what you want to use to compute the summary
expression values from the probe intensities. All that is needed to turn it
into a processing method is a simple wrapper, named according to the
convention in table~\ref{table:summary.labels}:
<<>>=
generateExprVal.method.huber <- function(probes, ...) {
  ## one location (mu) and scale (s) estimate per chip (column)
  res <- apply(probes, 2, huber)
  mu <- unlist(lapply(res, function(x) x$mu))
  s <- unlist(lapply(res, function(x) x$s))
  return(list(exprs=mu, se.exprs=s))
}

upDate.generateExprSet.methods(c(generateExprSet.methods(), "huber"))
@ 
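
Both labels can then be used like any of the built-in methods, for example
through \verb+expresso+. The sketch below is not evaluated here, and the
choices of background correction and normalization methods are arbitrary:
<<eval=FALSE>>=
## compute expression values with the two newly registered methods
eset <- expresso(Dilution,
                 bgcorrect.method = "none",
                 normalize.method = "quantiles",
                 pmcorrect.method = "subtractmmsometimes",
                 summary.method   = "huber")
@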

From now on, the package is aware of the two new methods\ldots at least for the
duration of this session.
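This can be checked by listing the registered labels again:
<<>>=
pmcorrect.methods()
generateExprSet.methods()
@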

The code of the methods already included in the package can be informative if
you plan to develop your own. An example of how a normalization method can
be added is given by the function \Rfunction{normalize.AffyBatch.vsn} in the
package \Rpackage{vsn}; see its help file.
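
If the \Rpackage{vsn} package is installed, the definition and the
documentation of that function can be inspected directly (not evaluated here):
<<eval=FALSE>>=
library(vsn)
normalize.AffyBatch.vsn            ## print the function definition
help("normalize.AffyBatch.vsn")    ## and read its help page
@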

\end{document}