\name{getLikelihoods}
\alias{getLikelihoods}
\alias{getLikelihoods.Dirichlet}
\alias{getLikelihoods.NBboot}
\alias{getLikelihoods.Pois}
%- Also NEED an '\alias' for EACH other topic documented here.
\title{Finds posterior likelihoods for each count as belonging to some hypothesis.}
\description{
  These functions calculate posterior probabilities for each of the
  rows of a \code{countData} object belonging to each of the groups
  specified. The choice of function depends on the prior belief about the
  underlying distribution of the data. It is essential that the method
  used for calculating the priors matches the method used for calculating
  the posterior probabilities. For a comparison of the methods, see
  Hardcastle & Kelly (2009).
}
\usage{
getLikelihoods.Dirichlet(cDP, prs, estimatePriors = TRUE, subset = NULL, cl)
getLikelihoods.Pois(cDP, prs, estimatePriors = TRUE, subset = NULL,
                    distpriors = FALSE, cl)
getLikelihoods.NBboot(cDP, prs, estimatePriors = TRUE, subset = NULL,
                      bootStraps = 2, conv = 1e-4, cl)
}
%- maybe also 'usage' for other objects documented here.
\arguments{
  \item{cDP}{An object of class \code{\link{countData}}.}
  \item{prs}{(Initial) prior probabilities for each of the groups in the
    \code{countData} object.}
  \item{estimatePriors}{Should the prior probabilities on each of the
    groups be estimated by bootstrap from the data?
    Defaults to TRUE.}
  \item{subset}{A numeric vector giving the subset of counts for which
    posterior likelihoods should be estimated.}
  \item{distpriors}{Should the Poisson method use an empirically derived
    distribution on the prior parameters of the Poisson distribution, or
    the mean of the maximum likelihood estimates (default)?}
  \item{bootStraps}{How many iterations of bootstrapping should be used
    in the (re)estimation of priors in the negative binomial method.}
  \item{conv}{If not NULL, bootstrapping iterations will cease if the
    mean squared difference between the posterior likelihoods of
    consecutive bootstraps drops below this value.}
  \item{cl}{A SNOW cluster object.}
}
\details{
  These functions estimate, under the assumption of various
  distributions, the (log) posterior likelihoods that each count belongs
  to a group defined by the \code{@group} slot of the \code{countData}
  object. The posterior likelihoods are stored on the natural log scale
  in the \code{@posteriors} slot of the \code{\link{countDataPosterior}}
  object generated by this function. This is because the posterior
  likelihoods are calculated in this form, and ordering of the counts is
  more reliably done on these log-likelihoods than on the likelihoods
  themselves.

  The Dirichlet and Poisson methods produce almost identical results in
  simulation. The negative binomial method produces results with much
  lower false discovery rates, but takes considerably longer to run. The
  quality of the results of the negative binomial method is further
  improved by increasing the amount of bootstrapping; however, this
  further increases the run time.

  Filtering the data can reduce run time considerably. This can be done
  by passing a numeric vector to 'subset' defining the subset of the data
  for which posterior likelihoods are required.

  See Hardcastle & Kelly (2009) for a full comparison of the methods.
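  For example, posterior likelihoods might be estimated for only the
  first thousand rows of the data (an arbitrary subset, chosen here
  purely for illustration; 'CDP.Poi' is assumed to be a
  \code{countData} object for which Poisson priors have already been
  estimated):
  \preformatted{
# illustrative sketch only; subset values are arbitrary
CDsub <- getLikelihoods.Pois(CDP.Poi, prs = c(0.5, 0.5),
                             estimatePriors = TRUE,
                             subset = 1:1000, cl = NULL)
  }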
  A 'cluster' object (see the 'snow' package) is strongly recommended in
  order to parallelise the estimation of posterior likelihoods,
  particularly for the negative binomial method. However, passing NULL to
  the \code{cl} variable will allow the functions to run in non-parallel
  mode.
}
\value{
  A \code{\link{countDataPosterior}} object.
}
\references{Hardcastle, T.J., and Kelly, K. (2009). Empirical Bayesian
  methods for differential expression in count data. In submission.}
\author{Thomas J. Hardcastle}
\seealso{\code{\link{countData}}, \code{\link{getPriors}},
  \code{\link{topCounts}}, \code{\link{getTPs}}}
\examples{
library(baySeq)

# See the vignette for more examples.

# Create a countData object and estimate priors for the Poisson method.
data(simCount)
data(libsizes)
groups <- list(c(1,1,1,1,1,1,1,1,1,1), c(1,1,1,1,1,2,2,2,2,2))
CD <- new("countData", data = simCount, libsizes = libsizes, groups = groups)
CDP.Poi <- getPriors.Pois(CD, samplesize = 20, iterations = 1000,
                          takemean = TRUE)

# Get likelihoods for the data with the Poisson method.
CDPost.Poi <- getLikelihoods.Pois(CDP.Poi, prs = c(0.5, 0.5),
                                  estimatePriors = TRUE, cl = NULL)

\dontrun{
# Alternatively, estimate priors for the negative binomial method...
CDP.NBML <- getPriors.NB(CD, samplesize = 10^5, estimation = "ML", cl = NULL)

# ...and get likelihoods for the data with the negative binomial method,
# with bootstrapping.
CDPost.NBML <- getLikelihoods.NBboot(CDP.NBML, prs = c(0.5, 0.5),
                                     estimatePriors = TRUE,
                                     bootStraps = 2, cl = NULL)

# Alternatively, if the 'snow' package is installed, we can parallelise
# the functions. This will usually (though not always) offer a
# significant performance gain.
library(snow)
cl <- makeCluster(4, "SOCK")
CDP.NBML <- getPriors.NB(CD, samplesize = 10^5, estimation = "ML", cl = cl)
CDPost.NBML <- getLikelihoods.NBboot(CDP.NBML, prs = c(0.5, 0.5),
                                     estimatePriors = TRUE,
                                     bootStraps = 2, cl = cl)
}
}
\keyword{distribution}
\keyword{models}