Making Complex Heatmaps ======================================== **Author**: Zuguang Gu ( z.gu@dkfz.de ) **Date**: `r Sys.Date()` ------------------------------------------------------------- ```{r global_settings, echo = FALSE, message = FALSE} library(markdown) options(markdown.HTML.options = c(options('markdown.HTML.options')[[1]], "toc")) library(knitr) knitr::opts_chunk$set( error = FALSE, tidy = FALSE, message = FALSE, fig.align = "center", fig.width = 5, fig.height = 5) options(markdown.HTML.stylesheet = "custom.css") options(width = 100) ``` Complex heatmaps are efficient to visualize associations between different sources of data sets and reveal potential features. Here the **ComplexHeatmap** package provides a highly flexible way to arrange multiple heatmaps and supports self-defined annotation graphics. ## General design Generally, a heatmap list contains several heatmaps and row annotations. ```{r design, echo = FALSE, fig.width = 10, fig.height = 5} source("design.R") ``` There are several classes that are abstracted from the heatmap. Surrounding the heatmap list, there are legends for heatmaps themselves and annotations, also there are titles which are placed on the four sides of the heatmap list. For each heatmap, there are also different components standing surrounding the heatmap body. In **ComplexHeatmap** package, components of heatmap lists are abstracted into several classes. - `Heatmap` class: a single heatmap containing heatmap body, row/column names, titles, dendrograms and column annotations. - `HeatmapList` class: a list of heatmaps and row annotations. - `HeatmapAnnotation` class: defines a list of row annotations and column annotations. There are also several internal classes: - `SingleAnnotation` class: defines a single row annotation or column annotation. - `ColorMapping` class: mapping from values to colors. **ComplexHeatmap** is implemented under **grid** system, so users should know basic **grid** functionality to get full use of **ComplexHeatmap** package. ## A single heatmap A single heatmap is mostly used for a quick view of the data. It is a special case of a heatmap list which only contains one heatmap. In following examples, we will demonstrate how to set parameters for visualization a single heatmap. First let's load packages and generate a random matrix: ```{r data} library(ComplexHeatmap) library(circlize) library(colorspace) library(GetoptLong) set.seed(123) mat = cbind(rbind(matrix(rnorm(16, -1), 4), matrix(rnorm(32, 1), 8)), rbind(matrix(rnorm(24, 1), 4), matrix(rnorm(48, -1), 8))) rownames(mat) = paste0("R", 1:12) colnames(mat) = paste0("C", 1:10) ``` Plot the heatmap with default settings. The default style of the heatmap is quite the same as that generated by other similar heatmap functions. ```{r default} Heatmap(mat) ``` ### Colors In most cases, the heatmap visualizes a matrix with continuous values. In this case, user should provide a color mapping function. A color mapping function should accept a vector of values and return a vector of corresponding colors. The `colorRamp2()` from the **circlize** package is helpful for generating such functions. The two arguments for `colorRamp2()` is a vector of breaks values and corresponding colors. ```{r color_fun} Heatmap(mat, col = colorRamp2(c(-3, 0, 3), c("green", "white", "red"))) ``` If the matrix contains discrete values (either numeric or character), colors should be specified as a named vector to make it possible for the mapping from discrete values to colors. ```{r discrete_matrix} discrete_mat = matrix(sample(1:4, 100, replace = TRUE), 10, 10) colors = structure(rainbow_hcl(4), names = c("1", "2", "3", "4")) Heatmap(discrete_mat, col = colors) ``` Or a character matrix: ```{r discrete_character_matrix} discrete_mat = matrix(sample(letters[1:4], 100, replace = TRUE), 10, 10) colors = structure(rainbow_hcl(4), names = letters[1:4]) Heatmap(discrete_mat, col = colors) ``` As you see, for the numeric matrix, by default clustering is applied on both dimensions while for character matrix, clustering is suppressed. ### Titles The name of the heatmap is used as the title of the heatmap legend. The name also plays as a unique id if you plot more than one heatmaps together (introduced in later sections). ```{r with_matrix_name} Heatmap(mat, name = "foo") ``` You can set heatmap titles either by the rows or by the columns. Note at a same time you can only put e.g. column title either on the top or at the bottom of the heatmap. The graphic parameters can be set by `row_title_gp` and `column_title_gp` respectively. Please remember you should use `gpar()` to specify graphic parameters. ```{r row_column_title} Heatmap(mat, name = "foo", column_title = "I am a column title", row_title = "I am a row title") Heatmap(mat, name = "foo", column_title = "I am a column title at the bottom", column_title_side = "bottom") Heatmap(mat, name = "foo", column_title = "I am a big column title", column_title_gp = gpar(fontsize = 20, fontface = "bold")) ``` ### Clustering Clustering may be the key feature of the heatmap visualization. In **ComplexHeatmap** package, clustering is supported with high flexibility. For different levels, you can specify the clustering either by a pre-defined method, or by a distance function, or by a object that already contains clustering, or directly by a clustering function. It is also possible to render your dendrograms with different colors and styles for different branches for better revealing structures of your data. First there are general settings for the clustering, e.g. whether show dendrograms, side of the dendrograms and size of the dendrograms. ```{r cluster_basic} Heatmap(mat, name = "foo", cluster_rows = FALSE) Heatmap(mat, name = "foo", show_column_hclust = FALSE) Heatmap(mat, name = "foo", row_hclust_side = "right") Heatmap(mat, name = "foo", column_hclust_height = unit(2, "cm")) ``` There are three ways to specify distance metric for clustering: - specify distance as a pre-defined option. The valid values are the supported methods in `dist()` function and within `pearson`, `spearman` and `kendall`. - a self-defined function which calculates distance from a matrix. The function should only contain one argument. Please note for clustering on columns, the matrix will be transposed automatically. - a self-defined function which calculates distance from two vectors. The function should only contain two arguments. ```{r cluster_distance} Heatmap(mat, name = "foo", clustering_distance_rows = "pearson") Heatmap(mat, name = "foo", clustering_distance_rows = function(m) dist(m)) Heatmap(mat, name = "foo", clustering_distance_rows = function(x, y) 1 - cor(x, y)) ``` Maybe we can apply a robust clustering based on the pair-wise distance. Also with self-defining `col` by `colorRamp2()`, we can make a outlier-unsensitive heatmap. ```{r cluster_distance_advanced} mat_with_outliers = mat for(i in 1:10) mat_with_outliers[i, i] = 1000 robust_dist = function(x, y) { qx = quantile(x, c(0.1, 0.9)) qy = quantile(y, c(0.1, 0.9)) l = x > qx[1] & x < qx[2] & y > qy[1] & y < qy[2] x = x[l] y = y[l] sqrt(sum((x - y)^2)) } Heatmap(mat_with_outliers, name = "foo", col = colorRamp2(c(-3, 0, 3), c("green", "white", "red")), clustering_distance_rows = robust_dist, clustering_distance_columns = robust_dist) ``` Method to make hierarchical clustering can be specified by `clustering_method_rows` and `clustering_method_columns`. Possible methods are those supported in `hclust()` function. ```{r cluster_method} Heatmap(mat, name = "foo", clustering_method_rows = "single") ``` By default, clustering are performed by `hclust()`. But you can also utilize clustering results which are generated by other methods by specifying `cluster_rows` or `cluster_columns` to a `hclust` or `dendrogram` object. In following examples, we use `diana()` and `agnes()` methods which are from the **cluster** package to perform clusterings. ```{r cluster_object} library(cluster) Heatmap(mat, name = "foo", cluster_rows = as.dendrogram(diana(mat)), cluster_columns = as.dendrogram(agnes(t(mat)))) ``` You can render your `dendrogram` object by the **dendextend** package and make a more customized visualization of the dendrogram. ```{r cluster_dendextend} library(dendextend) dend = hclust(dist(mat)) dend = color_branches(dend, k = 2) Heatmap(mat, name = "foo", cluster_rows = dend) ``` More generally, `cluster_rows` and `cluster_columns` can be functions which calculate the clusterings. The input argument should be a matrix and returned value of the function should be a `hclust` or `dendrogram` object. Please note, when `cluster_rows` is executed internally, the argument `m` is the input `mat` itself while `m` is the transpose of `mat` when executing `cluster_columns`. ```{r cluster_function} Heatmap(mat, name = "foo", cluster_rows = function(m) hclust(dist(m)), cluster_columns = function(m) hclust(dist(m))) ``` ### Dimension names Side, visibility and graphics parameter for dimension names can be set. ```{r dimension_name} Heatmap(mat, name = "foo", row_names_side = "left", row_hclust_side = "right", column_names_side = "top", column_hclust_side = "bottom") Heatmap(mat, name = "foo", show_row_names = FALSE) Heatmap(mat, name = "foo", row_names_gp = gpar(fontsize = 20)) Heatmap(mat, name = "foo", row_names_gp = gpar(col = c(rep("red", 4), rep("blue", 8)))) ``` ### Split heatmap by rows A heatmap can be split by rows. The `km` argument with a value larger than 1 means applying a k-means clustering on rows. ```{r k_means} Heatmap(mat, name = "foo", km = 2) ``` More generally, `split` can be set to a vector or a data frame in which different combination of levels split the rows of the heatmap. Actually, k-means clustering just generates a vector of row classes and appends `split` with one additional column. The combined row titles can be controlled by `combined_name_fun` argument. ```{r split} Heatmap(mat, name = "foo", split = rep(c("A", "B"), 6)) Heatmap(mat, name = "foo", split = data.frame(rep(c("A", "B"), 6), rep(c("C", "D"), each = 6))) Heatmap(mat, name = "foo", split = data.frame(rep(c("A", "B"), 6), rep(c("C", "D"), each = 6)), combined_name_fun = function(x) paste(x, collapse = "\n")) Heatmap(mat, name = "foo", km = 2, split = rep(c("A", "B"), 6), combined_name_fun = function(x) paste(x, collapse = "\n")) Heatmap(mat, name = "foo", km = 2, split = rep(c("A", "B"), 6), combined_name_fun = NULL) ``` If you are not happy with the default k-means partitioning method, it is easy to use other partitioning methods by just assigning the partitioning vector to `split`. ```{r pam} pa = pam(mat, k = 3) Heatmap(mat, name = "foo", split = paste0("pam", pa$clustering)) ``` Height of gaps between every row slices can be controlled by `gap`. ```{r split_gap} Heatmap(mat, name = "foo", split = paste0("pam", pa$clustering), gap = unit(5, "mm")) ``` Character matrix can only be split by `split` argument. ```{r split_discrete_matrix} Heatmap(discrete_mat, name = "foo", split = rep(letters[1:2], each = 5)) ``` ### Self define the heatmap body The heatmap body can also be self-defined. By default the heatmap body is composed by an array of rectangles with different filled colors. If `type` in `rect_gp` is set to `none`, the array for rectangles is initialized but no graphics are put in. Then, users can define their own graphic function by `cell_fun`. `cell_fun` is applied on every cell in the heatmap and provides following information on the 'current' cell: - `i`: row index in the matrix. - `j`: column index in the matrix. - `x`: x coordinate of middle point of the cell which is measured in the viewport of the heatmap body. - `y`: y coordinate of middle point of the cell which is measured in the viewport of the heatmap body. - `width`: width of the cell. - `height`: height of the cell. - `fill`: color of the cell. In following example, we draw round rectangles and add text. We assign the heatmap to a variable and make the plot explicitly by calling `draw()`, just in order to suppress the heatmap legend. This functionality will be introduced in later sections. ```{r cell_fun} mat2 = matrix(sample(1:10, 12, replace = TRUE), 4, 3) ht = Heatmap(mat2, rect_gp = gpar(col = "white", lwd = 2, type = "none"), cell_fun = function(j, i, x, y, width, height, fill) { grid.roundrect(x, y, width*0.8, height*0.8, gp = gpar(fill = fill)) if(mat2[i, j] <= 4) { grid.text("cold", x = x, y = y) } else if(mat2[i, j] >= 7) { grid.text("hot", x = x, y = y) } else { grid.text("normal", x = x, y = y) } }, cluster_rows = FALSE, cluster_columns = FALSE) draw(ht, show_heatmap_legend = FALSE) ``` ## The HeatmapAnnotation class In this section, we only describe how to generate a column annotations. Row annotations will be described in later sections. The annotation graphics actually are quite general. The only common characteristic for column annotation is that they are aligned to the columns of the heatmap. Here there is a `HeatmapAnnotation` class which is used to define annotations on columns. ### Simple annotation A simple annotation is defined as a vector which contains discrete classes or continuous values corresponding to the columns. Since the simple annotation is represented as a vector, multiple simple annotations can be specified as a data frame. Colors for the simple annotations can be specified by `col` with discrete values or color mapping functions, depending on whether the simple annotations are discrete or continuous. When plotting the heatmap, simple annotations will be represented as rows of grids. There is a `draw()` method for the `HeatmapAnnotation` class. `draw()` is used internally and here we just use it for illustration. ```{r heatmap_annotation, fig.width = 7, fig.height = 0.5} df = data.frame(type = c(rep("a", 5), rep("b", 5))) ha = HeatmapAnnotation(df = df) ha draw(ha, 1:10) ``` ```{r heatmap_annotation_col, fig.width = 7, fig.height = 0.5} ha = HeatmapAnnotation(df = df, col = list(type = c("a" = "red", "b" = "blue"))) ha draw(ha, 1:10) ``` ```{r heatmap_annotation_colfun, fig.width = 7, fig.height = 0.5} ha = HeatmapAnnotation(df = data.frame(age = sample(1:20, 10)), col = list(age = colorRamp2(c(0, 20), c("white", "red")))) ha draw(ha, 1:10) ``` Put more than one annotations together. ```{r heatmap_annotation_mixed, fig.width = 7, fig.height = 1} ha = HeatmapAnnotation(df = data.frame(type = c(rep("a", 5), rep("b", 5)), age = sample(1:20, 10)), col = list(type = c("a" = "red", "b" = "blue"), age = colorRamp2(c(0, 20), c("white", "red"))) ) ha draw(ha, 1:10) ``` ### Complex annotations Besides simple annotations, these are complex annotations. The complex annotations are always represented as self-defined functions. Actually, for each column annotation, there will be a viewport created waiting for graphics. The annotation function here defines how to put the graphics to this viewport. The only argument of the function is an index of column which may be adjusted by column clustering. In following example, an annotation of points is created. Please note how we define `xscale` so that positions of points correspond to middle points of the columns. ```{r heatmap_annotation_complex, fig.width = 7, fig.height = 1} value = rnorm(10) column_anno = function(index) { n = length(index) pushViewport(viewport(xscale = c(0.5, n + 0.5), yscale = range(value))) grid.points(index, value, pch = 16) upViewport() # this is very important in order not to mess up the layout } ha = HeatmapAnnotation(points = column_anno) ha draw(ha, 1:10) ``` For simple annotation graphics, you can create such annotation function by `anno_points()` or `anno_barplot()`. `anno_points()` will return an annotation function which just satisfy our needs. ```{r heatmap_annotation_points, fig.width = 7, fig.height = 1} ha = HeatmapAnnotation(points = anno_points(value)) draw(ha, 1:10) ``` ```{r heatmap_annotation_barplot, fig.width = 7, fig.height = 1} ha = HeatmapAnnotation(points = anno_barplot(value)) draw(ha, 1:10) ``` There is also an `anno_boxplot()` which generates boxplot for each column in the matrix. ```{r heatmap_annotation_boxplot, fig.width = 7, fig.height = 1} ha = HeatmapAnnotation(boxplot = anno_boxplot(mat)) draw(ha, 1:10) ``` You can combine more than one annotations into the object. ```{r heatmap_annotation_mixed_with_complex, fig.width = 7, fig.height = 2} ha = HeatmapAnnotation(df = df, points = anno_points(value)) ha draw(ha, 1:10) ``` If there are more than one annotations, you can control height of each annotation by `annotation_height`. The value of `annotation_height` can either be numeric values or `unit` objects. But when you specify the height to the `unit` objects, you should make sure the sum of heights does not exceed the height of the annotations shown in the heatmap. ```{r, fig.width = 7, fig.height = 3} ha = HeatmapAnnotation(df = df, points = anno_points(value), boxplot = anno_boxplot(mat), annotation_height = c(1, 2, 3)) draw(ha, 1:10) ``` ```{r, fig.width = 7, fig.height = 3} ha = HeatmapAnnotation(df = df, points = anno_points(value), boxplot = anno_boxplot(mat), annotation_height = unit.c(unit(1, "null"), unit(3, "cm"), unit(3, "cm"))) draw(ha, 1:10) ``` With the annotation, you can assign in to the heatmap either by `top_annotation` or `bottom_annotation`. Also you can control the size of total column annotations by `top_annotation_height` and `bottom_annotation_height`. ```{r add_annotation} ha = HeatmapAnnotation(df = df, points = anno_points(value)) ha_boxplot = HeatmapAnnotation(boxplot = anno_boxplot(mat)) Heatmap(mat, name = "foo", top_annotation = ha, bottom_annotation = ha_boxplot, bottom_annotation_height = unit(2, "cm")) ``` You can suppress some of the annotation legend by specifying `show_legend` to `FALSE` when creating the `HeatmapAnnotation` object. ```{r annotation_show} ha = HeatmapAnnotation(df = df, show_legend = FALSE) Heatmap(mat, name = "foo", top_annotation = ha) ``` More types of annotations which show data distribution in corresponding rows and columns can be supported by `anno_histogram()` and `anno_density()`. ```{r annotation_more, fig.height = 10, fig.width = 10} ha_mix_top = HeatmapAnnotation(histogram = anno_histogram(mat, gp = gpar(fill = rep(2:3, each = 5))), density_line = anno_density(mat, type = "line", gp = gpar(col = rep(2:3, each = 5))), violin = anno_density(mat, type = "violin", gp = gpar(fill = rep(2:3, each = 5))), heatmap = anno_density(mat, type = "heatmap")) ha_mix_right = HeatmapAnnotation(histogram = anno_histogram(mat, gp = gpar(fill = rep(2:3, each = 5)), which = "row"), density_line = anno_density(mat, type = "line", gp = gpar(col = rep(2:3, each = 5)), which = "row"), violin = anno_density(mat, type = "violin", gp = gpar(fill = rep(2:3, each = 5)), which = "row"), heatmap = anno_density(mat, type = "heatmap", which = "row"), which = "row", width = unit(8, "cm")) Heatmap(mat, name = "foo", top_annotation = ha_mix_top, top_annotation_height = unit(8, "cm")) + ha_mix_right ``` The `HeatmapAnnotation` is also used for construction of row annotations which we will introduce in later sections. ## A list of heatmaps You can arrange more than one heatmaps which are placed columns by columns. Actually, one single heatmap is just a special case of the heatmap list of length one. `Heatmap()` is actually a class constructor function for a single heatmap. If more than one heatmaps are to be combined, users can append one heatmap to the other by `+` operator. ```{r heatmap_list_default, fig.width = 10} mat = matrix(rnorm(80, 2), 8, 10) mat = rbind(mat, matrix(rnorm(40, -2), 4, 10)) rownames(mat) = paste0("R", 1:12) colnames(mat) = paste0("C", 1:10) ht1 = Heatmap(mat, name = "ht1") ht2 = Heatmap(mat, name = "ht2") class(ht1) class(ht2) ht1 + ht2 ``` Under default mode, dendrograms from the second heatmap will be removed and row orders will be same as the first one. The returned value of addition of two heatmaps is a `HeatmapList` object. Directly call `ht_list` object will call `draw()` method with default settings. With explicitly calling `draw()` method, you can have more control on the legend and titles. ```{r} ht_list = ht1 + ht2 class(ht_list) ``` You can also append any number of heatmaps to the heatmap list. Also you can append a heatmap list to a heatmap list. ```{r, eval = FALSE} ht1 + ht1 + ht1 ht1 + ht_list ht_list + ht1 ht_list + ht_list ``` ### Titles A heatmap list also has titles which are independent to the heatmap titles. ```{r heatmap_list_title, fig.width = 10} ht1 = Heatmap(mat, name = "ht1", row_title = "Heatmap 1", column_title = "Heatmap 1") ht2 = Heatmap(mat, name = "ht2", row_title = "Heatmap 2", column_title = "Heatmap 2") ht_list = ht1 + ht2 draw(ht_list, row_title = "Two heatmaps, row title", row_title_gp = gpar(col = "red"), column_title = "Two heatmaps, column title", column_title_side = "bottom") ``` ### Legends Legends for all heatmaps and all annotations will be drawn together. The legends for heatmaps and legends for annotations are put in independent viewports. ```{r legend, fig.width = 10, fig.keep = "all"} df = data.frame(type = c(rep("a", 5), rep("b", 5))) ha = HeatmapAnnotation(df = df, col = list(type = c("a" = "red", "b" = "blue"))) ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1", top_annotation = ha) ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2") ht_list = ht1 + ht2 draw(ht_list) draw(ht_list, heatmap_legend_side = "left", annotation_legend_side = "bottom") draw(ht_list, show_heatmap_legend = FALSE, show_annotation_legend = FALSE) ``` You can choose to only add some of the heatmap legends by setting `show_heatmap_legend` to a logical value. ```{r legend_show, fig.width = 10} ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1", top_annotation = ha) ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2", show_heatmap_legend = FALSE) ht1 + ht2 ``` **ComplexHeatmap** only generate legends for heatmaps and simple annotations. Self-defined legends can be passed by `annotation_legend_list` as a list of `grob` objects. ```{r self_defined_legend, fig.width = 10} ha = HeatmapAnnotation(points = anno_points(rnorm(10))) ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2", top_annotation = ha, show_heatmap_legend = FALSE) lgd = legendGrob(c("dots"), pch = 16) draw(ht1 + ht2, annotation_legend_list = list(lgd)) ``` Graphic parameters for legends for simple annotations can be set. Actually these arguments are passed to `color_mapping_legend()` on `ColorMapping` class. ```{r heatmap_list_advanced, fig.width = 10} draw(ht1 + ht2, legend_grid_width = unit(6, "mm"), legend_title_gp = gpar(fontsize = 14, fontface = "bold")) ``` ### Gaps between heatmaps The gaps between heatmaps can be set by `gap` argument by a `unit` object. ```{r heatmap_list_gap, fig.width = 10, fig.keep = "all"} draw(ht_list, gap = unit(1, "cm")) draw(ht_list + ht_list, gap = unit(c(3, 6, 9, 0), "mm")) ``` ### Size of heatmaps The width for some (not all) heatmaps can be set to a fixed width. ```{r heatmap_list_size, fig.width = 10, fig.keep = "all"} ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1") ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2", width = unit(5, "cm")) ht1 + ht2 ``` ### Auto adjustment There are some automatic adjustment if more than one heatmaps are plotted. There should be a main heatmap which by default is the first one. Some settings for the remaining heatmaps will be modified to the settings in the main heatmap. The adjustment are: - row clusters are removed. - row titles are removed. - if the main heatmap is split by rows, all remaining heatmaps will also be split by same levels as the main one. The main heatmap can be specified by `main_heatmap` argument. The value can be a numeric index or the name of the heatmap (of course, you need to set the heatmap name when you create the `Heatmap` object). ```{r heatmap_list_auto_adjust, fig.width = 10, fig.keep = "all"} ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1", km = 2) ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2") ht1 + ht2 draw(ht2 + ht1) draw(ht2 + ht1, main_heatmap = "ht1") ``` If there is no row clustering in the main heatmap, all other heatmaps have no row clustering neither. ```{r heatmap_list_auto_adjust_no_row_cluster, fig.width = 10} ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1", cluster_rows = FALSE) ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2") ht1 + ht2 ``` ## Heatmap list with row annotations ### Row annotations Row annotation is also defined by the `HeatmapAnnotation` class, but with specifying `which` to `row`. ```{r row_annotation, fig.width = 1, fig.height = 7} df = data.frame(type = c(rep("a", 6), rep("b", 6))) ha = HeatmapAnnotation(df = df, which = "row", width = unit(1, "cm")) draw(ha, 1:12) ``` ### Mix heatmaps and row annotations Essentially, row annotations and column annotations are identical graphics, but in applications, there is some difference. In **ComplexHeatmap** package, row annotations have the same place as the heatmap while column annotations are just like accessory components of heatmaps. For row annotations, similar as heatmaps, you can append the row annotations to heatmap or heatmap list or even row annotation object itself. ```{r heatmap_list_with_row_annotation, fig.width = 9} ht1 = Heatmap(mat, name = "ht1") ht1 + ha + ht1 ``` Also complex row annotations can be added. Pleast note in following code, the dendrogram and row title are put on the two sides of the heatmap list by specifying `row_hclust_side` and `row_sub_title_side`. ```{r heatmap_list_with_row_annotation_complex} ht1 = Heatmap(mat, name = "ht1", km = 2) ha_boxplot = HeatmapAnnotation(boxplot = anno_boxplot(mat, which = "row"), which = "row", width = unit(2, "cm")) draw(ha_boxplot + ht1, row_hclust_side = "left", row_sub_title_side = "right") ``` For row annotations, more complex annotation graphics can help to visualize the data: ```{r complex_heatmap, fig.width = 7} ht1 = Heatmap(mat, name = "ht1", km = 2) random_data = lapply(1:12, function(x) runif(20)) random_data[1:6] = lapply(random_data[1:6], function(x) x*0.5) ha = HeatmapAnnotation(distribution = function(index) { random_data = random_data[index] n = length(index) for(i in seq_len(n)) { y = random_data[[i]] pushViewport(viewport(x= 0, y = (i-0.5)/n, width = unit(1, "npc") - unit(1, "cm"), height = 1/n*0.8, just = "left", xscale = c(0, 20), yscale = c(0, 1))) if(index[i] %in% 1:6) fill = "blue" else fill = "red" grid.polygon(c(1:20, 20, 1), c(y, 0, 0), default.units = "native", gp = gpar(fill = fill, col = NA)) grid.yaxis(main = FALSE, gp = gpar(fontsize = 8)) upViewport() } }, which = "row", width = unit(10, "cm")) draw(ht1 + ha, row_hclust_side = "left", row_sub_title_side = "right") ``` ## Access components Each components of the heatmap/heatmap list has a name. You can go to any viewport by `seekViewport()`. Following figure almost contains all types of components. ```{r access_components, fig.width = 10, fig.height = 7} ha_column1 = HeatmapAnnotation(points = anno_points(rnorm(10))) ht1 = Heatmap(mat, name = "ht1", km = 2, row_title = "Heatmap 1", column_title = "Heatmap 1", top_annotation = ha_column1) ha_column2 = HeatmapAnnotation(df = data.frame(type = c(rep("a", 5), rep("b", 5)))) ht2 = Heatmap(mat, name = "ht2", row_title = "Heatmap 2", column_title = "Heatmap 2", bottom_annotation = ha_column2) ht_list = ht1 + ht2 draw(ht_list, row_title = "Heatmap list", column_title = "Heatmap list", heatmap_legend_side = "right", annotation_legend_side = "left") ``` The components that have names are: - `global`: the viewport which contains the whole figure. - `global_column_title`: the viewport which contains column title for the heatmap list. - `global_row_title`: the viewport which contains row title for the heatmap list. - `main_heatmap_list`: the viewport which contains a list of heatmaps and row annotations. - `heatmap_@{heatmap_name}`: the viewport which contains a single heatmap - `annotation_@{annotation_name}`: the viewport which contains an annotation either on columns or rows. - `@{heatmap_name}_heatmap_body_@{i}`: the heatmap body. - `@{heatmap_name}_column_title`: column title for a single heatmap. - `@{heatmap_name}_row_title_@{i}`: since a heatmap body may be splitted into several parts. `@{i}` is the index of the row slice. - `@{heatmap_name}_hclust_row_@{i}`: dendrogram for ith row slice. - `@{heatmap_name}_hclust_column`: dendrogram on columns - `@{heatmap_name}_row_names_@{i}`: the viewport which contains row names. - `@{heatmap_name}_column_names`: the viewport which contains column names. - `heatmap_legend`: the viewport which contains all heatmap legends. - `legend_@{heatmap_name}`: the viewport which contains a single heatmap legend. - `annotation_legend`: the viewport which contains all annotation legends. - `legend_@{annotation_name}`: the viewport which contains a single annotation legend. ```{r, fig.width = 10, fig.height = 7} ht_list seekViewport("annotation_points") grid.text("points", unit(0, "npc") - unit(2, "mm"), 0.5, default.units = "npc", just = "right") seekViewport("ht1_heatmap_body_2") grid.text("outlier", 1.5/10, 2.5/4, default.units = "npc") seekViewport("annotation_type") grid.text("type", unit(1, "npc") + unit(2, "mm"), 0.5, default.units = "npc", just = "left") ``` ## Real applications Data frame is an ideal data structure to store different subjects of information in which each row in the data frame focuses on the same element. Then, each subject can be visualized as a single heatmap and putting all subjects together can give you a full view of your data. In gene expression matrix, rows correspond to genes. More information about genes can be attached after the expression heatmap such as gene length and type of genes. ```{r expression_example, fig.width = 10, fig.height = 8} expr = readRDS(paste0(system.file(package = "ComplexHeatmap"), "/extdata/gene_expression.rds")) mat = as.matrix(expr[, grep("cell", colnames(expr))]) type = gsub("s\\d+_", "", colnames(mat)) ha = HeatmapAnnotation(df = data.frame(type = type)) Heatmap(mat, name = "expression", km = 5, top_annotation = ha, top_annotation_height = unit(4, "mm"), show_row_names = FALSE, show_column_names = FALSE) + Heatmap(expr$length, name = "length", col = colorRamp2(c(0, 100000), c("white", "orange")), width = unit(5, "mm")) + Heatmap(expr$type, name = "type", width = unit(5, "mm")) + Heatmap(expr$chr, name = "chr", col = rand_color(length(unique(expr$chr))), width = unit(5, "mm")) ``` Following examples is OncoPrint. The basic idea is to self define the heatmap body. Besides the default style which is provided by cBioPortal, there are additional barplots at both sides of the heatmap which show numbers of different alterations for each sample and for each gene. Source code is available here. ```{r, echo = FALSE, fig.width = 10, fig.height = 8} source("oncoprint.R") ``` Following examples visualizes correlation between methylation and expression, as well as other annotation information (data are randomly generated). In the heatmap, each row corresponds to a differentially methylated regions (DMRs). From left to right, heatmaps are: 1. methylation for each DMR (by rows) in samples. 2. direction of the methylation (one column heatmap), i.e. is methylation hyper in tumor or hypo? 3. expression for the genes that are associated with corresponding DMRs (e.g. closest gene). 4. significance for the correlation between methylation and expression (-log10(p-value)). 5. type of genes, i.e. is the gene a protein coding gene or a lincRNA? 6. annotation to gene models, i.e. is the DMR located in the intragenic region of the corresponding gene or the DMR is intergenic? 7. distance from the DMR to the TSS of the corresponding gene. 8. overlapping between DMRs and enhancers (Color shows how much the DMR is covered by the enhancers). Source code is available here. ```{r, echo = FALSE, fig.width = 10, fig.height = 8} source("genomic_regions.R") ``` ## Session info ```{r} sessionInfo() ```