% pedigreeHandling.Rnw
% -------------------------------------------------------------------------
% What: Pedigree handling vignette
% $Id: pedigreeHandling.Rnw 1179 2007-04-03 14:13:02Z ggorjan $
% Time-stamp: <2007-04-01 01:09:08 ggorjan>
% -------------------------------------------------------------------------

% --- Vignette stuff ---
% -------------------------------------------------------------------------

%\VignetteIndexEntry{Pedigree handling}
%\VignettePackage{GeneticsPed}
%\VignetteDepends{gdata}
%\VignetteKeywords{genetics, pedigree, parents, children, progeny, ancestor, ascendant, descendant, generation, sex, birth date, family}

% --- Preamble ---
% -------------------------------------------------------------------------

% Document class and general packages
% -------------------------------------------------------------------------
\documentclass[fleqn,a4paper]{article}  % fleqn - alignment of equations
                                        % a4paper appropriate page size

\usepackage{graphicx}                   % For inclusion of pictures, ...
\usepackage[authoryear]{natbib}         % Bibliography - Natbib
\newcommand{\SortNoop}[1]{}             % Sort command for bib entries
\usepackage{hyperref}

\hypersetup{%
  pdftitle={Pedigree handling},
  pdfauthor={Gregor Gorjanc},
  pdfkeywords={genetics, pedigree, parents, children, progeny, ancestor,
    ascendant, descendant, generation, sex, birth date, family}
}

\newcommand{\email}[1]{\href{mailto:#1}{#1}} % Email command

\makeatletter                           % Allow the use of @ in command names

% Paragraph and page
% -------------------------------------------------------------------------

\usepackage{setspace}                   % Line spacing
\onehalfspacing
\setlength\parskip{\medskipamount}
\setlength\parindent{0pt}

% --- Lyx Tips&Tricks: better formatting, less hyphenation problems ---
\tolerance 1414
\hbadness 1414
\emergencystretch 1.5em
\hfuzz 0.3pt
\widowpenalty=10000
\vfuzz \hfuzz
\raggedbottom

% Page size
\usepackage{geometry}
\geometry{verbose,a4paper,tmargin=3.0cm,bmargin=3.0cm,lmargin=2.5cm,
  rmargin=2.5cm,headheight=20pt,headsep=0.7cm,footskip=12pt}

\makeatother                            % Cancel the effect of \makeatletter

% R and friends
% -------------------------------------------------------------------------

\newcommand{\program}[1]{{\textit{\textbf{#1}}}}
\newcommand{\code}[1]{{\texttt{#1}}}

% http://www.bioconductor.org/develPage/guidelines/vignettes/vignetteGuidelines.pdf
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}
\newcommand{\Rclass}[1]{{\textit{#1}}}
\newcommand{\Rmethod}[1]{{\textit{#1}}}
\newcommand{\Rfunarg}[1]{{\textit{#1}}}

\newcommand{\R}{\program{R}}

\usepackage{Sweave}                     % Sweave
\SweaveOpts{strip.white=all, keep.source=TRUE}

<<options, echo=FALSE, results=hide>>=
options(width=85)
@

% --- Document ---
% -------------------------------------------------------------------------

\begin{document}

\title{Pedigree handling}
\author{
  Gregor Gorjanc\\
  \email{gregor.gorjanc@bfro.uni-lj.si},\\
  David A. Henderson\\
  \email{dnadave@insightful.com}
}

\maketitle

\section*{Introduction}

Pedigrees are collections of related individuals.  We often represent a
pedigree as a linked list: a collection of trios, each consisting of an
individual and its two parents, that links everyone (or almost everyone)
together.  This simple representation allows the use of graph theory in
analysis.  The GeneticsPed package provides utilities for managing pedigrees
(inputting, sorting, and subsetting) and for computing on pedigrees, e.g.
calculating relationship coefficients and similar quantities.
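
The trio representation can be illustrated with a minimal base \R{} sketch.
The data frame, its column names, and the identifiers below are hypothetical
and are not tied to any GeneticsPed function; the sketch only shows how each
trio yields two parent-to-child edges of a graph.

\begin{verbatim}
## Hypothetical toy pedigree: one row per trio (individual, father, mother).
## NA marks an unknown parent.
ped <- data.frame(id     = c("A", "B", "C", "D"),
                  father = c(NA,  NA,  "A", "A"),
                  mother = c(NA,  NA,  "B", "C"),
                  stringsAsFactors = FALSE)

## Each trio contributes up to two directed edges: parent -> child.
edges <- rbind(data.frame(parent = ped$father, child = ped$id,
                          stringsAsFactors = FALSE),
               data.frame(parent = ped$mother, child = ped$id,
                          stringsAsFactors = FALSE))
edges <- edges[!is.na(edges$parent), ]

## Children of "A", read off the edge list.
childrenOfA <- sort(edges$child[edges$parent == "A"])
\end{verbatim}

Once the pedigree is an edge list, questions such as ``who are the progeny
of A'' become simple lookups on that list.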

Pedigrees are used, for example, in quantitative genetics
\cite{Falconer:1996}.

\section*{Pedigree class}

\subsection*{Identification}

A pedigree object identifies each individual (subject) and its ascendants.
Identifiers can be factor, character, or numeric, but all identifiers must
have the same class.
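
The requirement that all identifiers share one class can be checked with a
few lines of base \R{}.  This is only a sketch: the helper
\code{sameIdClass} and the column names are hypothetical and are not part of
the package.

\begin{verbatim}
## Hypothetical helper: TRUE if all given columns have the same class.
sameIdClass <- function(x, cols = c("id", "father", "mother")) {
  idClasses <- sapply(x[, cols], function(col) class(col)[1])
  length(unique(idClasses)) == 1
}

## Toy pedigree with integer identifiers in every column.
ped <- data.frame(id     = 1:3,
                  father = c(NA, NA, 1L),
                  mother = c(NA, NA, 2L))
before <- sameIdClass(ped)

## Converting only one column to factor breaks the requirement.
ped$id <- factor(ped$id)
after <- sameIdClass(ped)
\end{verbatim}

Here \code{before} is \code{TRUE} and \code{after} is \code{FALSE}, since
\code{id} is then a factor while the parent columns remain integer.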

\subsection*{Unknown individuals}

% FIXME
Pedigrees are never complete, because it is not possible to get data on all
ascendants; there are always some subjects with unknown ascendants. As with
pedigree form, applications differ in how they represent unknown
individuals: 0, a blank field, a particular string such as ``unknown'', etc.
GeneticsPed uses R's representation of unknown values, \code{NA}. Other
representations can be changed to \code{NA} before a pedigree object is
defined. To ease this process, \Rfunction{Pedigree} has the argument
\code{unknown}, which accepts multiple values in one call, e.g.
\code{unknown=c(0, "", "unknown")}. Internally, the change is done via the
generic function \Rfunction{unknownToNA}. To use some other representation,
for example for a special application in R or for export to an external
application, the function \Rfunction{NAToUnknown} is provided.
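
The conversion in both directions can be sketched in base \R{}.  The helpers
\code{toNA} and \code{fromNA} below are hypothetical stand-ins written for
illustration only; they are not the package's own functions and may differ
from them in detail.

\begin{verbatim}
## Hypothetical helper: replace any value listed in 'unknown' by NA.
toNA <- function(x, unknown) {
  x[x %in% unknown] <- NA
  x
}

## Hypothetical helper: replace NA by a single chosen code.
fromNA <- function(x, unknown) {
  x[is.na(x)] <- unknown
  x
}

## Father identifiers using three different codes for "unknown".
father <- c("0", "", "unknown", "A", "B")

## Several codes can be converted in one call.
fatherNA <- toNA(father, unknown = c(0, "", "unknown"))

## Convert back to a single code, e.g. for export.
fatherOut <- fromNA(fatherNA, unknown = "0")
\end{verbatim}

Note that \code{c(0, "", "unknown")} is coerced to a character vector, so
the numeric code 0 is matched as the string \code{"0"}.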

Open question: how should we handle a user who wants a representation other
than \code{NA} (for example 0) within R? Should we allow this, or convert
to \code{NA} each time in our functions?



\section{Check consistency of data in pedigree}

The functions \Rfunction{check}, \Rfunction{check.Pedigree}, and
\Rfunction{checkId} perform a series of checks on a pedigree object to
ensure consistency of data.

Usage: \code{check(x, $\ldots$)} and \code{checkId(x)}, with arguments:

\begin{itemize}
  \item[x] pedigree, object to be checked
  \item[$\ldots$] arguments to other methods, none for now
\end{itemize}

\Rfunction{checkId} performs various checks on subjects and their
ascendants. These checks are:
\begin{itemize}
    \item idClass: all ids must have the same class
    \item subjectIsNA: subject can not be NA
    \item subjectNotUnique: subject must be unique
    \item subjectEqualAscendant: subject can not be equal (in
      identification) to its ascendant
    \item ascendantEqualAscendant: ascendant can not be equal to another
      ascendant
    \item ascendantInAscendant: ascendant can not appear again as
      ascendant of the other sex, i.e. a father can not be a mother to
      someone else
    \item unusedLevels: in case factors are used for id representation,
      there might be unused levels for some ids; some functions rely on
      the number of levels, so a check is provided for this
\end{itemize}
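
Several of the checks listed above can be sketched in base \R{} on a plain
data frame.  This is an illustration under stated assumptions, not the
package's implementation: the column names and the deliberately faulty toy
data are hypothetical, and the variable names merely mirror the check names.

\begin{verbatim}
## Hypothetical pedigree with deliberate errors:
## "C" appears twice, "E" is its own father, "A" is father and mother.
ped <- data.frame(id     = c("A", "B", "C", "C", "E"),
                  father = c(NA,  NA,  "A", "A", "E"),
                  mother = c(NA,  NA,  "B", "B", "A"),
                  stringsAsFactors = FALSE)

## subjectIsNA: rows where the subject is NA.
subjectIsNA <- which(is.na(ped$id))

## subjectNotUnique: rows whose subject id occurs more than once.
subjectNotUnique <- which(ped$id %in% ped$id[duplicated(ped$id)])

## subjectEqualAscendant: rows where the subject equals one of its parents
## (which() drops comparisons that are NA because a parent is unknown).
subjectEqualAscendant <- which(ped$id == ped$father | ped$id == ped$mother)

## ascendantEqualAscendant: rows where father and mother are the same id.
ascendantEqualAscendant <- which(ped$father == ped$mother)

## ascendantInAscendant: ids that appear both as a father and as a mother.
ascendantInAscendant <- intersect(ped$father[!is.na(ped$father)],
                                  ped$mother[!is.na(ped$mother)])
\end{verbatim}

Each result acts as a ``pointer'' to the offending records: row numbers for
the first four checks and the offending identifiers for the last one.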

\Rfunction{checkAttributes} is intended primarily for internal use and
performs a series of checks on attribute values needed in various
functions. It stops with error messages for all given attribute checks.

The result is a list of more or less self-explanatory errors, together with
``pointers'' to these errors that ease further work, i.e. removing the
errors.

\begin{verbatim}
  ## EXAMPLES BELOW ARE ONLY FOR TESTING PURPOSES AND ARE NOT INTENDED
  ## FOR USERS, BUT RUNNING THEM CAN NOT DO ANY HARM.

  ## --- checkAttributes ---
  tmp <- generatePedigree(5)
  attr(tmp, "sorted") <- FALSE
  attr(tmp, "coded") <- FALSE
  GeneticsPed:::checkAttributes(tmp)
  try(GeneticsPed:::checkAttributes(tmp, sorted=TRUE, coded=TRUE))

  ## --- idClass ---
  tmp <- generatePedigree(5)
  tmp$id <- factor(tmp$id)
  class(tmp$id)
  class(tmp$father)
  try(GeneticsPed:::idClass(tmp))

  ## --- subjectIsNA ---
  tmp <- generatePedigree(2)
  tmp[1, 1] <- NA
  GeneticsPed:::subjectIsNA(tmp)

  ## --- subjectNotUnique ---
  tmp <- generatePedigree(2)
  tmp[2, 1] <- 1
  GeneticsPed:::subjectNotUnique(tmp)

  ## --- subjectEqualAscendant ---
  tmp <- generatePedigree(2)
  tmp[3, 2] <- tmp[3, 1]
  GeneticsPed:::subjectEqualAscendant(tmp)

  ## --- ascendantEqualAscendant ---
  tmp <- generatePedigree(2)
  tmp[3, 2] <- tmp[3, 3]
  GeneticsPed:::ascendantEqualAscendant(tmp)

  ## --- ascendantInAscendant ---
  tmp <- generatePedigree(2)
  tmp[3, 2] <- tmp[5, 3]
  GeneticsPed:::ascendantInAscendant(tmp)
  ## Example with multiple parents
  tmp <- data.frame(id=c("A", "B", "C", "D"),
                    father1=c("E", NA, "F", "H"),
                    father2=c("F", "E", "E", "I"),
                    mother=c("G", NA, "H", "E"))
  tmp <- Pedigree(tmp, ascendant=c("father1", "father2", "mother"),
                  ascendantSex=c(1, 1, 2),
                  ascendantLevel=c(1, 1, 1))
  GeneticsPed:::ascendantInAscendant(tmp)

  ## --- unusedLevels ---
  tmp <- generatePedigree(2, colClass="factor")
  tmp[3:4, 2] <- NA
  GeneticsPed:::unusedLevels(tmp)
\end{verbatim}

\bibliographystyle{apalike}
\bibliography{library}

\end{document}

% -------------------------------------------------------------------------
% pedigreeHandling.Rnw ends here