Making Complex Heatmaps
========================================
**Author**: Zuguang Gu ( z.gu@dkfz.de )
**Date**: `r Sys.Date()`
-------------------------------------------------------------
```{r global_settings, echo = FALSE, message = FALSE}
library(markdown)
options(markdown.HTML.options = c(options('markdown.HTML.options')[[1]], "toc"))
library(knitr)
knitr::opts_chunk$set(
error = FALSE,
tidy = FALSE,
message = FALSE,
fig.align = "center",
fig.width = 5,
fig.height = 5)
options(markdown.HTML.stylesheet = "custom.css")
options(width = 100)
```
Complex heatmaps are efficient to visualize associations
between different sources of data sets and reveal potential features.
Here the **ComplexHeatmap** package provides a highly flexible way to arrange
multiple heatmaps and supports self-defined annotation graphics.
## General design
Generally, a heatmap list contains several heatmaps and row annotations.
```{r design, echo = FALSE, fig.width = 10, fig.height = 5}
source("design.R")
```
There are several classes that are abstracted from the heatmap. Surrounding the heatmap list,
there are legends for heatmaps themselves and annotations, also there are titles which are placed
on the four sides of the heatmap list. For each heatmap, there are also different components standing
surrounding the heatmap body.
In **ComplexHeatmap** package, components of heatmap lists are abstracted into several classes.
- `Heatmap` class: a single heatmap containing heatmap body, row/column names, titles, dendrograms and column annotations.
- `HeatmapList` class: a list of heatmaps and row annotations.
- `HeatmapAnnotation` class: defines a list of row annotations and column annotations.
There are also several internal classes:
- `SingleAnnotation` class: defines a single row annotation or column annotation.
- `ColorMapping` class: mapping from values to colors.
**ComplexHeatmap** is implemented under **grid** system, so users should know basic **grid** functionality
to get full use of **ComplexHeatmap** package.
## A single heatmap
A single heatmap is mostly used for a quick view of the data. It is a special case of
a heatmap list which only contains one heatmap. In following examples, we will demonstrate
how to set parameters for visualization a single heatmap.
First let's load packages and generate a random matrix:
```{r data}
library(ComplexHeatmap)
library(circlize)
library(colorspace)
library(GetoptLong)
set.seed(123)
mat = cbind(rbind(matrix(rnorm(16, -1), 4), matrix(rnorm(32, 1), 8)),
rbind(matrix(rnorm(24, 1), 4), matrix(rnorm(48, -1), 8)))
rownames(mat) = paste0("R", 1:12)
colnames(mat) = paste0("C", 1:10)
```
Plot the heatmap with default settings. The default style of the heatmap is quite the same
as that generated by other similar heatmap functions.
```{r default}
Heatmap(mat)
```
### Colors
In most cases, the heatmap visualizes a matrix with continuous values.
In this case, user should provide a color mapping function. A color mapping function
should accept a vector of values and return a vector of corresponding colors. The `colorRamp2()` from
the **circlize** package is helpful for generating such functions. The two arguments for `colorRamp2()`
is a vector of breaks values and corresponding colors.
```{r color_fun}
Heatmap(mat, col = colorRamp2(c(-3, 0, 3), c("green", "white", "red")))
```
If the matrix contains discrete values (either numeric or character), colors should be specified as
a named vector to make it possible for the mapping from discrete values to colors.
```{r discrete_matrix}
discrete_mat = matrix(sample(1:4, 100, replace = TRUE), 10, 10)
colors = structure(rainbow_hcl(4), names = c("1", "2", "3", "4"))
Heatmap(discrete_mat, col = colors)
```
Or a character matrix:
```{r discrete_character_matrix}
discrete_mat = matrix(sample(letters[1:4], 100, replace = TRUE), 10, 10)
colors = structure(rainbow_hcl(4), names = letters[1:4])
Heatmap(discrete_mat, col = colors)
```
As you see, for the numeric matrix, by default clustering is applied on both dimensions while
for character matrix, clustering is suppressed.
### Titles
The name of the heatmap is used as the title of the heatmap legend.
The name also plays as a unique id if you plot more than one heatmaps together (introduced in later sections).
```{r with_matrix_name}
Heatmap(mat, name = "foo")
```
You can set heatmap titles either by the rows or by the columns. Note at a same time
you can only put e.g. column title either on the top or at the bottom of the heatmap.
The graphic parameters can be set by `row_title_gp` and `column_title_gp` respectively.
Please remember you should use `gpar()` to specify graphic parameters.
```{r row_column_title}
Heatmap(mat, name = "foo", column_title = "I am a column title",
row_title = "I am a row title")
Heatmap(mat, name = "foo", column_title = "I am a column title at the bottom",
column_title_side = "bottom")
Heatmap(mat, name = "foo", column_title = "I am a big column title",
column_title_gp = gpar(fontsize = 20, fontface = "bold"))
```
### Clustering
Clustering may be the key feature of the heatmap visualization. In **ComplexHeatmap** package,
clustering is supported with high flexibility. For different levels, you can specify the
clustering either by a pre-defined method, or by a distance function, or by a object that already
contains clustering, or directly by a clustering function. It is also possible to render
your dendrograms with different colors and styles for different branches for better revealing
structures of your data.
First there are general settings for the clustering, e.g. whether show dendrograms, side
of the dendrograms and size of the dendrograms.
```{r cluster_basic}
Heatmap(mat, name = "foo", cluster_rows = FALSE)
Heatmap(mat, name = "foo", show_column_hclust = FALSE)
Heatmap(mat, name = "foo", row_hclust_side = "right")
Heatmap(mat, name = "foo", column_hclust_height = unit(2, "cm"))
```
There are three ways to specify distance metric for clustering:
- specify distance as a pre-defined option. The valid values are the supported methods
in `dist()` function and within `pearson`, `spearman` and `kendall`.
- a self-defined function which calculates distance from a matrix. The function should
only contain one argument. Please note for clustering on columns, the matrix will be transposed
automatically.
- a self-defined function which calculates distance from two vectors. The function should
only contain two arguments.
```{r cluster_distance}
Heatmap(mat, name = "foo", clustering_distance_rows = "pearson")
Heatmap(mat, name = "foo", clustering_distance_rows = function(m) dist(m))
Heatmap(mat, name = "foo", clustering_distance_rows = function(x, y) 1 - cor(x, y))
```
Maybe we can apply a robust clustering based on the pair-wise distance. Also with self-defining
`col` by `colorRamp2()`, we can make a outlier-unsensitive heatmap.
```{r cluster_distance_advanced}
mat_with_outliers = mat
for(i in 1:10) mat_with_outliers[i, i] = 1000
robust_dist = function(x, y) {
qx = quantile(x, c(0.1, 0.9))
qy = quantile(y, c(0.1, 0.9))
l = x > qx[1] & x < qx[2] & y > qy[1] & y < qy[2]
x = x[l]
y = y[l]
sqrt(sum((x - y)^2))
}
Heatmap(mat_with_outliers, name = "foo",
col = colorRamp2(c(-3, 0, 3), c("green", "white", "red")),
clustering_distance_rows = robust_dist,
clustering_distance_columns = robust_dist)
```
Method to make hierarchical clustering can be specified by `clustering_method_rows` and
`clustering_method_columns`. Possible methods are those supported in `hclust()` function.
```{r cluster_method}
Heatmap(mat, name = "foo", clustering_method_rows = "single")
```
By default, clustering are performed by `hclust()`. But you can also utilize clustering results
which are generated by other methods by specifying `cluster_rows` or `cluster_columns` to a
`hclust` or `dendrogram` object. In following examples, we use `diana()` and `agnes()` methods
which are from the **cluster** package to perform clusterings.
```{r cluster_object}
library(cluster)
Heatmap(mat, name = "foo", cluster_rows = as.dendrogram(diana(mat)),
cluster_columns = as.dendrogram(agnes(t(mat))))
```
You can render your `dendrogram` object by the **dendextend** package and make a more customized
visualization of the dendrogram.
```{r cluster_dendextend}
library(dendextend)
dend = hclust(dist(mat))
dend = color_branches(dend, k = 2)
Heatmap(mat, name = "foo", cluster_rows = dend)
```
More generally, `cluster_rows` and `cluster_columns` can be functions which calculate the clusterings.
The input argument should be a matrix and returned value of the function should be a `hclust` or `dendrogram`
object. Please note, when `cluster_rows` is executed internally, the argument `m` is the input `mat` itself
while `m` is the transpose of `mat` when executing `cluster_columns`.
```{r cluster_function}
Heatmap(mat, name = "foo", cluster_rows = function(m) hclust(dist(m)),
cluster_columns = function(m) hclust(dist(m)))
```
### Dimension names
Side, visibility and graphics parameter for dimension names can be set.
```{r dimension_name}
Heatmap(mat, name = "foo", row_names_side = "left", row_hclust_side = "right",
column_names_side = "top", column_hclust_side = "bottom")
Heatmap(mat, name = "foo", show_row_names = FALSE)
Heatmap(mat, name = "foo", row_names_gp = gpar(fontsize = 20))
Heatmap(mat, name = "foo", row_names_gp = gpar(col = c(rep("red", 4), rep("blue", 8))))
```
### Split heatmap by rows
A heatmap can be split by rows. The `km` argument with a value larger than 1 means applying a k-means clustering
on rows.
```{r k_means}
Heatmap(mat, name = "foo", km = 2)
```
More generally, `split` can be set to a vector or a data frame in which different combination of levels
split the rows of the heatmap. Actually, k-means clustering just generates a vector of row classes and appends
`split` with one additional column. The combined row titles can be controlled by `combined_name_fun` argument.
```{r split}
Heatmap(mat, name = "foo", split = rep(c("A", "B"), 6))
Heatmap(mat, name = "foo", split = data.frame(rep(c("A", "B"), 6), rep(c("C", "D"), each = 6)))
Heatmap(mat, name = "foo", split = data.frame(rep(c("A", "B"), 6), rep(c("C", "D"), each = 6)),
combined_name_fun = function(x) paste(x, collapse = "\n"))
Heatmap(mat, name = "foo", km = 2, split = rep(c("A", "B"), 6),
combined_name_fun = function(x) paste(x, collapse = "\n"))
Heatmap(mat, name = "foo", km = 2, split = rep(c("A", "B"), 6), combined_name_fun = NULL)
```
If you are not happy with the default k-means partitioning method, it is easy to use other partitioning methods
by just assigning the partitioning vector to `split`.
```{r pam}
pa = pam(mat, k = 3)
Heatmap(mat, name = "foo", split = paste0("pam", pa$clustering))
```
Height of gaps between every row slices can be controlled by `gap`.
```{r split_gap}
Heatmap(mat, name = "foo", split = paste0("pam", pa$clustering), gap = unit(5, "mm"))
```
Character matrix can only be split by `split` argument.
```{r split_discrete_matrix}
Heatmap(discrete_mat, name = "foo", split = rep(letters[1:2], each = 5))
```
### Self define the heatmap body
The heatmap body can also be self-defined. By default the heatmap body is composed by an array of rectangles
with different filled colors. If `type` in `rect_gp` is set to `none`, the array for rectangles is initialized but
no graphics are put in. Then, users can define their own graphic function by `cell_fun`. `cell_fun` is applied
on every cell in the heatmap and provides following information on the 'current' cell:
- `i`: row index in the matrix.
- `j`: column index in the matrix.
- `x`: x coordinate of middle point of the cell which is measured in the viewport of the heatmap body.
- `y`: y coordinate of middle point of the cell which is measured in the viewport of the heatmap body.
- `width`: width of the cell.
- `height`: height of the cell.
- `fill`: color of the cell.
In following example, we draw round rectangles and add text. We assign the heatmap to a variable
and make the plot explicitly by calling `draw()`, just in order to suppress the heatmap legend.
This functionality will be introduced in later sections.
```{r cell_fun}
mat2 = matrix(sample(1:10, 12, replace = TRUE), 4, 3)
ht = Heatmap(mat2, rect_gp = gpar(col = "white", lwd = 2, type = "none"),
cell_fun = function(j, i, x, y, width, height, fill) {
grid.roundrect(x, y, width*0.8, height*0.8, gp = gpar(fill = fill))
if(mat2[i, j] <= 4) {
grid.text("cold", x = x, y = y)
} else if(mat2[i, j] >= 7) {
grid.text("hot", x = x, y = y)
} else {
grid.text("normal", x = x, y = y)
}
},
cluster_rows = FALSE, cluster_columns = FALSE)
draw(ht, show_heatmap_legend = FALSE)
```
## The HeatmapAnnotation class
In this section, we only describe how to generate a column annotations. Row annotations
will be described in later sections.
The annotation graphics actually are quite general. The only common characteristic for column annotation
is that they are aligned to the columns of the heatmap. Here there is a `HeatmapAnnotation` class which is used to
define annotations on columns.
### Simple annotation
A simple annotation is defined as a vector which contains discrete classes or continuous values corresponding to the columns.
Since the simple annotation is represented as a vector, multiple simple annotations can be specified
as a data frame. Colors for the simple annotations can be specified by `col` with discrete values or
color mapping functions, depending on whether the simple annotations are discrete or continuous.
When plotting the heatmap, simple annotations will be represented as rows of grids.
There is a `draw()` method for the `HeatmapAnnotation` class. `draw()` is used internally and here
we just use it for illustration.
```{r heatmap_annotation, fig.width = 7, fig.height = 0.5}
df = data.frame(type = c(rep("a", 5), rep("b", 5)))
ha = HeatmapAnnotation(df = df)
ha
draw(ha, 1:10)
```
```{r heatmap_annotation_col, fig.width = 7, fig.height = 0.5}
ha = HeatmapAnnotation(df = df, col = list(type = c("a" = "red", "b" = "blue")))
ha
draw(ha, 1:10)
```
```{r heatmap_annotation_colfun, fig.width = 7, fig.height = 0.5}
ha = HeatmapAnnotation(df = data.frame(age = sample(1:20, 10)),
col = list(age = colorRamp2(c(0, 20), c("white", "red"))))
ha
draw(ha, 1:10)
```
Put more than one annotations together.
```{r heatmap_annotation_mixed, fig.width = 7, fig.height = 1}
ha = HeatmapAnnotation(df = data.frame(type = c(rep("a", 5), rep("b", 5)),
age = sample(1:20, 10)),
col = list(type = c("a" = "red", "b" = "blue"),
age = colorRamp2(c(0, 20), c("white", "red")))
)
ha
draw(ha, 1:10)
```
### Complex annotations
Besides simple annotations, these are complex annotations. The complex annotations are always
represented as self-defined functions. Actually, for each column annotation, there will be a viewport
created waiting for graphics. The annotation function here defines how to put the graphics to
this viewport. The only argument of the function is an index of column which may be adjusted by column clustering.
In following example, an annotation of points is created. Please note how we define `xscale` so that positions
of points correspond to middle points of the columns.
```{r heatmap_annotation_complex, fig.width = 7, fig.height = 1}
value = rnorm(10)
column_anno = function(index) {
n = length(index)
pushViewport(viewport(xscale = c(0.5, n + 0.5), yscale = range(value)))
grid.points(index, value, pch = 16)
upViewport() # this is very important in order not to mess up the layout
}
ha = HeatmapAnnotation(points = column_anno)
ha
draw(ha, 1:10)
```
For simple annotation graphics, you can create such annotation function by `anno_points()` or `anno_barplot()`.
`anno_points()` will return an annotation function which just satisfy our needs.
```{r heatmap_annotation_points, fig.width = 7, fig.height = 1}
ha = HeatmapAnnotation(points = anno_points(value))
draw(ha, 1:10)
```
```{r heatmap_annotation_barplot, fig.width = 7, fig.height = 1}
ha = HeatmapAnnotation(points = anno_barplot(value))
draw(ha, 1:10)
```
There is also an `anno_boxplot()` which generates boxplot for each column in the matrix.
```{r heatmap_annotation_boxplot, fig.width = 7, fig.height = 1}
ha = HeatmapAnnotation(boxplot = anno_boxplot(mat))
draw(ha, 1:10)
```
You can combine more than one annotations into the object.
```{r heatmap_annotation_mixed_with_complex, fig.width = 7, fig.height = 2}
ha = HeatmapAnnotation(df = df, points = anno_points(value))
ha
draw(ha, 1:10)
```
If there are more than one annotations, you can control height of each annotation by `annotation_height`.
The value of `annotation_height` can either be numeric values or `unit` objects. But when you specify the
height to the `unit` objects, you should make sure the sum of heights does not exceed the height of the
annotations shown in the heatmap.
```{r, fig.width = 7, fig.height = 3}
ha = HeatmapAnnotation(df = df, points = anno_points(value), boxplot = anno_boxplot(mat),
annotation_height = c(1, 2, 3))
draw(ha, 1:10)
```
```{r, fig.width = 7, fig.height = 3}
ha = HeatmapAnnotation(df = df, points = anno_points(value), boxplot = anno_boxplot(mat),
annotation_height = unit.c(unit(1, "null"), unit(3, "cm"), unit(3, "cm")))
draw(ha, 1:10)
```
With the annotation, you can assign in to the heatmap either by `top_annotation` or `bottom_annotation`.
Also you can control the size of total column annotations by `top_annotation_height` and `bottom_annotation_height`.
```{r add_annotation}
ha = HeatmapAnnotation(df = df, points = anno_points(value))
ha_boxplot = HeatmapAnnotation(boxplot = anno_boxplot(mat))
Heatmap(mat, name = "foo", top_annotation = ha, bottom_annotation = ha_boxplot,
bottom_annotation_height = unit(2, "cm"))
```
You can suppress some of the annotation legend by specifying `show_legend` to `FALSE` when creating the `HeatmapAnnotation` object.
```{r annotation_show}
ha = HeatmapAnnotation(df = df, show_legend = FALSE)
Heatmap(mat, name = "foo", top_annotation = ha)
```
More types of annotations which show data distribution in corresponding rows and columns can be supported
by `anno_histogram()` and `anno_density()`.
```{r annotation_more, fig.height = 10, fig.width = 10}
ha_mix_top = HeatmapAnnotation(histogram = anno_histogram(mat, gp = gpar(fill = rep(2:3, each = 5))),
density_line = anno_density(mat, type = "line", gp = gpar(col = rep(2:3, each = 5))),
violin = anno_density(mat, type = "violin", gp = gpar(fill = rep(2:3, each = 5))),
heatmap = anno_density(mat, type = "heatmap"))
ha_mix_right = HeatmapAnnotation(histogram = anno_histogram(mat, gp = gpar(fill = rep(2:3, each = 5)), which = "row"),
density_line = anno_density(mat, type = "line", gp = gpar(col = rep(2:3, each = 5)), which = "row"),
violin = anno_density(mat, type = "violin", gp = gpar(fill = rep(2:3, each = 5)), which = "row"),
heatmap = anno_density(mat, type = "heatmap", which = "row"),
which = "row", width = unit(8, "cm"))
Heatmap(mat, name = "foo", top_annotation = ha_mix_top, top_annotation_height = unit(8, "cm")) + ha_mix_right
```
The `HeatmapAnnotation` is also used for construction of row annotations which we will introduce in later sections.
## A list of heatmaps
You can arrange more than one heatmaps which are placed columns by columns. Actually, one single
heatmap is just a special case of the heatmap list of length one.
`Heatmap()` is actually a class constructor function for a single heatmap. If more than one heatmaps
are to be combined, users can append one heatmap to the other by `+` operator.
```{r heatmap_list_default, fig.width = 10}
mat = matrix(rnorm(80, 2), 8, 10)
mat = rbind(mat, matrix(rnorm(40, -2), 4, 10))
rownames(mat) = paste0("R", 1:12)
colnames(mat) = paste0("C", 1:10)
ht1 = Heatmap(mat, name = "ht1")
ht2 = Heatmap(mat, name = "ht2")
class(ht1)
class(ht2)
ht1 + ht2
```
Under default mode, dendrograms from the second heatmap will be removed and row orders will be same as the first one.
The returned value of addition of two heatmaps is a `HeatmapList` object. Directly call `ht_list` object
will call `draw()` method with default settings. With explicitly calling `draw()` method, you can have more control
on the legend and titles.
```{r}
ht_list = ht1 + ht2
class(ht_list)
```
You can also append any number of heatmaps to the heatmap list. Also you can append a heatmap list to a heatmap list.
```{r, eval = FALSE}
ht1 + ht1 + ht1
ht1 + ht_list
ht_list + ht1
ht_list + ht_list
```
### Titles
A heatmap list also has titles which are independent to the heatmap titles.
```{r heatmap_list_title, fig.width = 10}
ht1 = Heatmap(mat, name = "ht1", row_title = "Heatmap 1", column_title = "Heatmap 1")
ht2 = Heatmap(mat, name = "ht2", row_title = "Heatmap 2", column_title = "Heatmap 2")
ht_list = ht1 + ht2
draw(ht_list, row_title = "Two heatmaps, row title", row_title_gp = gpar(col = "red"),
column_title = "Two heatmaps, column title", column_title_side = "bottom")
```
### Legends
Legends for all heatmaps and all annotations will be drawn together. The legends for heatmaps
and legends for annotations are put in independent viewports.
```{r legend, fig.width = 10, fig.keep = "all"}
df = data.frame(type = c(rep("a", 5), rep("b", 5)))
ha = HeatmapAnnotation(df = df, col = list(type = c("a" = "red", "b" = "blue")))
ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1", top_annotation = ha)
ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2")
ht_list = ht1 + ht2
draw(ht_list)
draw(ht_list, heatmap_legend_side = "left", annotation_legend_side = "bottom")
draw(ht_list, show_heatmap_legend = FALSE, show_annotation_legend = FALSE)
```
You can choose to only add some of the heatmap legends by setting `show_heatmap_legend` to a logical value.
```{r legend_show, fig.width = 10}
ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1", top_annotation = ha)
ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2", show_heatmap_legend = FALSE)
ht1 + ht2
```
**ComplexHeatmap** only generate legends for heatmaps and simple annotations. Self-defined legends
can be passed by `annotation_legend_list` as a list of `grob` objects.
```{r self_defined_legend, fig.width = 10}
ha = HeatmapAnnotation(points = anno_points(rnorm(10)))
ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2", top_annotation = ha, show_heatmap_legend = FALSE)
lgd = legendGrob(c("dots"), pch = 16)
draw(ht1 + ht2, annotation_legend_list = list(lgd))
```
Graphic parameters for legends for simple annotations can be set. Actually these arguments are passed
to `color_mapping_legend()` on `ColorMapping` class.
```{r heatmap_list_advanced, fig.width = 10}
draw(ht1 + ht2, legend_grid_width = unit(6, "mm"),
legend_title_gp = gpar(fontsize = 14, fontface = "bold"))
```
### Gaps between heatmaps
The gaps between heatmaps can be set by `gap` argument by a `unit` object.
```{r heatmap_list_gap, fig.width = 10, fig.keep = "all"}
draw(ht_list, gap = unit(1, "cm"))
draw(ht_list + ht_list, gap = unit(c(3, 6, 9, 0), "mm"))
```
### Size of heatmaps
The width for some (not all) heatmaps can be set to a fixed width.
```{r heatmap_list_size, fig.width = 10, fig.keep = "all"}
ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1")
ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2", width = unit(5, "cm"))
ht1 + ht2
```
### Auto adjustment
There are some automatic adjustment if more than one heatmaps are plotted. There should be a main heatmap
which by default is the first one. Some settings for the remaining heatmaps will be modified to the settings
in the main heatmap. The adjustment are:
- row clusters are removed.
- row titles are removed.
- if the main heatmap is split by rows, all remaining heatmaps will also be split by same levels as the main one.
The main heatmap can be specified by `main_heatmap` argument. The value can be a numeric index or the name of the heatmap
(of course, you need to set the heatmap name when you create the `Heatmap` object).
```{r heatmap_list_auto_adjust, fig.width = 10, fig.keep = "all"}
ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1", km = 2)
ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2")
ht1 + ht2
draw(ht2 + ht1)
draw(ht2 + ht1, main_heatmap = "ht1")
```
If there is no row clustering in the main heatmap, all other heatmaps have no row clustering neither.
```{r heatmap_list_auto_adjust_no_row_cluster, fig.width = 10}
ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1", cluster_rows = FALSE)
ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2")
ht1 + ht2
```
## Heatmap list with row annotations
### Row annotations
Row annotation is also defined by the `HeatmapAnnotation` class, but with specifying
`which` to `row`.
```{r row_annotation, fig.width = 1, fig.height = 7}
df = data.frame(type = c(rep("a", 6), rep("b", 6)))
ha = HeatmapAnnotation(df = df, which = "row", width = unit(1, "cm"))
draw(ha, 1:12)
```
### Mix heatmaps and row annotations
Essentially, row annotations and column annotations are identical graphics, but in applications,
there is some difference. In **ComplexHeatmap** package, row annotations have the same place as the heatmap
while column annotations are just like accessory components of heatmaps. For row annotations, similar
as heatmaps, you can append the row annotations to heatmap or heatmap list or even row annotation object itself.
```{r heatmap_list_with_row_annotation, fig.width = 9}
ht1 = Heatmap(mat, name = "ht1")
ht1 + ha + ht1
```
Also complex row annotations can be added. Pleast note in following code, the dendrogram and row title are put
on the two sides of the heatmap list by specifying `row_hclust_side` and `row_sub_title_side`.
```{r heatmap_list_with_row_annotation_complex}
ht1 = Heatmap(mat, name = "ht1", km = 2)
ha_boxplot = HeatmapAnnotation(boxplot = anno_boxplot(mat, which = "row"),
which = "row", width = unit(2, "cm"))
draw(ha_boxplot + ht1, row_hclust_side = "left", row_sub_title_side = "right")
```
For row annotations, more complex annotation graphics can help to visualize the data:
```{r complex_heatmap, fig.width = 7}
ht1 = Heatmap(mat, name = "ht1", km = 2)
random_data = lapply(1:12, function(x) runif(20))
random_data[1:6] = lapply(random_data[1:6], function(x) x*0.5)
ha = HeatmapAnnotation(distribution = function(index) {
random_data = random_data[index]
n = length(index)
for(i in seq_len(n)) {
y = random_data[[i]]
pushViewport(viewport(x= 0, y = (i-0.5)/n, width = unit(1, "npc") - unit(1, "cm"),
height = 1/n*0.8, just = "left", xscale = c(0, 20), yscale = c(0, 1)))
if(index[i] %in% 1:6) fill = "blue" else fill = "red"
grid.polygon(c(1:20, 20, 1), c(y, 0, 0), default.units = "native",
gp = gpar(fill = fill, col = NA))
grid.yaxis(main = FALSE, gp = gpar(fontsize = 8))
upViewport()
}
}, which = "row", width = unit(10, "cm"))
draw(ht1 + ha, row_hclust_side = "left", row_sub_title_side = "right")
```
## Access components
Each components of the heatmap/heatmap list has a name. You can go to any viewport by `seekViewport()`.
Following figure almost contains all types of components.
```{r access_components, fig.width = 10, fig.height = 7}
ha_column1 = HeatmapAnnotation(points = anno_points(rnorm(10)))
ht1 = Heatmap(mat, name = "ht1", km = 2, row_title = "Heatmap 1", column_title = "Heatmap 1",
top_annotation = ha_column1)
ha_column2 = HeatmapAnnotation(df = data.frame(type = c(rep("a", 5), rep("b", 5))))
ht2 = Heatmap(mat, name = "ht2", row_title = "Heatmap 2", column_title = "Heatmap 2",
bottom_annotation = ha_column2)
ht_list = ht1 + ht2
draw(ht_list, row_title = "Heatmap list", column_title = "Heatmap list",
heatmap_legend_side = "right", annotation_legend_side = "left")
```
The components that have names are:
- `global`: the viewport which contains the whole figure.
- `global_column_title`: the viewport which contains column title for the heatmap list.
- `global_row_title`: the viewport which contains row title for the heatmap list.
- `main_heatmap_list`: the viewport which contains a list of heatmaps and row annotations.
- `heatmap_@{heatmap_name}`: the viewport which contains a single heatmap
- `annotation_@{annotation_name}`: the viewport which contains an annotation either on columns or rows.
- `@{heatmap_name}_heatmap_body_@{i}`: the heatmap body.
- `@{heatmap_name}_column_title`: column title for a single heatmap.
- `@{heatmap_name}_row_title_@{i}`: since a heatmap body may be splitted into several parts. `@{i}` is the index of the row slice.
- `@{heatmap_name}_hclust_row_@{i}`: dendrogram for ith row slice.
- `@{heatmap_name}_hclust_column`: dendrogram on columns
- `@{heatmap_name}_row_names_@{i}`: the viewport which contains row names.
- `@{heatmap_name}_column_names`: the viewport which contains column names.
- `heatmap_legend`: the viewport which contains all heatmap legends.
- `legend_@{heatmap_name}`: the viewport which contains a single heatmap legend.
- `annotation_legend`: the viewport which contains all annotation legends.
- `legend_@{annotation_name}`: the viewport which contains a single annotation legend.
```{r, fig.width = 10, fig.height = 7}
ht_list
seekViewport("annotation_points")
grid.text("points", unit(0, "npc") - unit(2, "mm"), 0.5, default.units = "npc", just = "right")
seekViewport("ht1_heatmap_body_2")
grid.text("outlier", 1.5/10, 2.5/4, default.units = "npc")
seekViewport("annotation_type")
grid.text("type", unit(1, "npc") + unit(2, "mm"), 0.5, default.units = "npc", just = "left")
```
## Real applications
Data frame is an ideal data structure to store different subjects of information
in which each row in the data frame focuses on the same element.
Then, each subject can be visualized as a single heatmap and putting all subjects
together can give you a full view of your data.
In gene expression matrix, rows correspond to genes. More information about genes can be attached after the expression
heatmap such as gene length and type of genes.
```{r expression_example, fig.width = 10, fig.height = 8}
expr = readRDS(paste0(system.file(package = "ComplexHeatmap"), "/extdata/gene_expression.rds"))
mat = as.matrix(expr[, grep("cell", colnames(expr))])
type = gsub("s\\d+_", "", colnames(mat))
ha = HeatmapAnnotation(df = data.frame(type = type))
Heatmap(mat, name = "expression", km = 5, top_annotation = ha,
top_annotation_height = unit(4, "mm"), show_row_names = FALSE,
show_column_names = FALSE) +
Heatmap(expr$length, name = "length", col = colorRamp2(c(0, 100000), c("white", "orange")),
width = unit(5, "mm")) +
Heatmap(expr$type, name = "type", width = unit(5, "mm")) +
Heatmap(expr$chr, name = "chr", col = rand_color(length(unique(expr$chr))),
width = unit(5, "mm"))
```
Following examples is OncoPrint.
The basic idea is to self define the heatmap body. Besides the default style which is
provided by cBioPortal, there are
additional barplots at both sides of the heatmap which show numbers of different alterations for
each sample and for each gene. Source code is available here.
```{r, echo = FALSE, fig.width = 10, fig.height = 8}
source("oncoprint.R")
```
Following examples visualizes correlation between methylation and expression, as well as other annotation information (data are randomly generated). In the heatmap, each row corresponds to a differentially methylated regions (DMRs).
From left to right, heatmaps are:
1. methylation for each DMR (by rows) in samples.
2. direction of the methylation (one column heatmap), i.e. is methylation hyper in tumor or hypo?
3. expression for the genes that are associated with corresponding DMRs (e.g. closest gene).
4. significance for the correlation between methylation and expression (-log10(p-value)).
5. type of genes, i.e. is the gene a protein coding gene or a lincRNA?
6. annotation to gene models, i.e. is the DMR located in the intragenic region of the corresponding gene or the DMR is intergenic?
7. distance from the DMR to the TSS of the corresponding gene.
8. overlapping between DMRs and enhancers (Color shows how much the DMR is covered by the enhancers).
Source code is available here.
```{r, echo = FALSE, fig.width = 10, fig.height = 8}
source("genomic_regions.R")
```
## Session info
```{r}
sessionInfo()
```