%
% NOTE -- ONLY EDIT .Rnw!!!
% .tex file will get overwritten.
%
%\VignetteIndexEntry{@@PKNAME@@ Primer}
%\VignetteDepends{}
%\VignetteKeywords{Genomics}
%\VignettePackage{@@PKNAME@@}
%
\documentclass[12pt]{article}

\usepackage{amsmath,pstricks}
\usepackage[authoryear,round]{natbib}
\usepackage{hyperref}

\textwidth=6.2in
\textheight=8.5in
%\parskip=.3cm
\oddsidemargin=.1in
\evensidemargin=.1in
\headheight=-.3in

\newcommand{\scscst}{\scriptscriptstyle}
\newcommand{\scst}{\scriptstyle}

\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}
\newcommand{\Rmethod}[1]{{\texttt{#1}}}
\newcommand{\Rfunarg}[1]{{\texttt{#1}}}
\newcommand{\Rclass}[1]{{\textit{#1}}}

\bibliographystyle{plainnat}

\begin{document}

%\setkeys{Gin}{width=0.55\textwidth}

\section{LogitBoost}

Boosting is a well-established approach to machine learning.
The basic idea is that predictions from a series of
`weak learners' are combined, with cases reweighted at each
step so that `difficult' cases receive more weight.
As implemented by Dettling and B\"uhlmann, the procedure
requires a training set with predictors and responses,
and a test set for which responses should be available.

The interface to \Rfunction{logitboost} created by Dettling is mildly
inconvenient for use with \Rclass{exprSet}s. Records must
be sorted by response value. The following function
simplifies the task of boosting with \Rclass{exprSet}s. The second
argument is a character string naming the response
variable of interest (this will be converted to an
integer code in \texttt{0:(nclasses-1)}). Additional arguments to
\Rfunction{logitboost} must be named and are passed through.

<<>>=
library(LogitBoost)
library(golubEsets)
lboostTwoEsets <- function(esetTrain, pdNameTrain, esetTest, ...)
{
 # transpose so that rows are samples and columns are genes
 xlearn <- t(exprs(esetTrain))
 # recode the response as integers 0, 1, ...
 ylearn <- as.numeric(factor(esetTrain[[pdNameTrain]])) - 1
 # sort the training records by response value
 orec <- order(ylearn)
 xlearn <- xlearn[orec,]
 ylearn <- ylearn[orec]
 xtest <- t(exprs(esetTest))
 logitboost(xlearn, ylearn, xtest, ...)
}
@

We now illustrate the procedure:
<<>>=
run1 <- lboostTwoEsets( golubTrain[20:2600,], "ALL.AML",
     golubTest[20:2600,], mfinal=100 )
@
<<>>=
# test-set responses, recoded in the same way as the training responses
YTEST <- as.numeric(factor(golubTest$ALL.AML)) - 1
print(summarize(run1, YTEST, mout=100))
@
<<>>=
table( apply(run1$probs,1,function(x)mean(x)>.5), YTEST )
@

%Modifications of interest include substitution of
%test for training set

\section{GBM}

<<>>=
Y <- as.numeric(factor(golubTrain$ALL.AML))-1
X <- t(exprs(golubTrain)[1:100,])
dd <- data.frame(Y=Y, X)
library(gbm)
@

\end{document}