--- title: "Rendering tables with pandoc.table" author: "Roman Tsegelskyi, Gergely Daróczi" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Rendering tables with pandoc.table} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, message = FALSE} knitr::opts_chunk$set(collapse = T, comment = "#>") library(pander) ``` Core functionality of `pander` is centered around `pandoc.table`, which is aimed at rendering tables in `markdown`. In case of 2D tables, `pander` calls `pandoc.table` internally, thus in such cases `pander` and `pandoc.table` support the same argument and in this vignette will be used iterchangingly. `pandoc.table` has a wide variety of options ([highlighting](#highlighting-cells), [styles](#table-styles), etc.) and this vignette aims to give a more detailed overview of the most common options. `pander` comes with a variety of globally adjustable options, which have an effect on the result of your reports. You can query and update these options with the `panderOptions` function. ## Table styles Since `pander` aims at rendering R objects into [Pandoc](https://pandoc.org/)'s markdown all four (`multiline`, `simple`, `grid`, `rmarkdown`) of [Pandoc](https://johnmacfarlane.net/pandoc/README.html#tables)'s formats are supported. Users are advised to stick with the default `multiline` style, but if there is a need to change it either specify `style` argument when calling `pander/pandoc.table` or change the default `style` using `panderOptions`. ### multiline tables `multiline` tables allow headers and table rows to span multiple lines of text (but cells that span multiple columns or rows of the table are not supported). Also note that, for simplicity, line breaks are removed from cells by default, so multiline cells are typically the result of [splitting large cells](#table-and-cell-width) or setting `keep.line.breaks` to `TRUE`: ```{r} m <- data.frame('Value\n1', 'Value\n2') colnames(m) <- c('Multiline\nCol1', 'Multiline\nCol2') pandoc.table(m, keep.line.breaks = TRUE) m <- mtcars[1:3, 1:4] pandoc.table(m) ``` ### simple tables `simple` tables are have more compact syntax that all other styles, but don't they support multiline cells: ```{r, error=TRUE} m <- mtcars[1:3, 1:4] pandoc.table(m, style = 'simple') m <- data.frame('Value\n1', 'Value\n2') colnames(m) <- c('Multiline\nCol1', 'Multiline\nCol2') pandoc.table(m, keep.line.breaks = TRUE, style='simple') ``` ### grid tables `grid` format is really handy for `emacs` users ([Emacs table mod](http://table.sourceforge.net/)) and it does support block elements (multiple paragraphs, code blocks, lists, etc.) inside cells, but cells can't span multiple columns or rows. Alignments are not supported for grid tables by most parsers, meaning that even though `pander` will produce a table with alignment, it will be lost during conversion from `markdown` to `HTML/PDF/DOCX`. ```{r, error=TRUE} m <- mtcars[1:3, 1:4] pandoc.table(m, style = 'grid') m <- data.frame('Value\n1', 'Value\n2') colnames(m) <- c('Multiline\nCol1', 'Multiline\nCol2') pandoc.table(m, keep.line.breaks = TRUE, style='grid') ``` ### rmarkdown tables `rmarkdown` or pipe table format, is often used directly with `knitr`, since it was supported by the first versions of the `markdown` package. It is similar to `simple` table in that multiline cells are also not supported. The beginning and ending pipe characters are optional, but pipes are required between all columns: ```{r, error=TRUE} m <- mtcars[1:3, 1:4] pandoc.table(m, style = 'rmarkdown') m <- data.frame('Value\n1', 'Value\n2') colnames(m) <- c('Multiline\nCol1', 'Multiline\nCol2') pandoc.table(m, keep.line.breaks = TRUE, style='rmarkdown') ``` ## Cell alignment `pander` allows users to control cell alignment (`left`, `right` or `center/centre`) in a table directly by setting the `justify` parameter when calling `pander/pandoc.table`. Note that it is possible to specify alignment for each column separately by supplying a vector to justify: ```{r} pandoc.table(head(iris[,1:3], 2), justify = 'right') pandoc.table(head(iris[,1:3], 2), justify = c('right', 'center', 'left')) ``` Another way to define alignment is by using a permanent option `table.alignment.default/table.alignment.rownames` in `panderOptions` (preferred way) or by using `set.alignment` function (legacy way of setting alignment for next table or permanently) which support setting alignment separately for cells and rownames: ```{r} set.alignment('left', row.names = 'right') # set only for next table since permanent parameter is falce pandoc.table(mtcars[1:2, 1:5]) ``` Interesting application for this functionality is specifying a function that takes the R object as its argument to compute some unique alignment for your table based on e.g. column values or variable types: ```{r} panderOptions('table.alignment.default', function(df) ifelse(sapply(df, mean) > 2, 'left', 'right')) pandoc.table(head(iris[,1:3], 2)) panderOptions('table.alignment.default', 'center') ``` ## Highlighting cells One of the great features of `pander` is the ease of highlighting rows, columns and cells in a table. This is a native `markdown` feature without custom `HTML` or `LaTeX`-only tweaks, so all `HTML/PDF/MS Word/OpenOffice` etc. formats are supported. This can be achieved by specifying one of the arguments below when calling `pander`/`pandoc.table` or change default style using `panderOptions`: * emphasize.italics.rows * emphasize.italics.cols * emphasize.italics.cells * emphasize.strong.rows * emphasize.strong.cols * emphasize.strong.cells * emphasize.verbatim.rows * emphasize.verbatim.cols * emphasize.verbatim.cells The `emphasize.italics` helpers would turn the affected cells to *italic*, `emphasize.strong` would apply a **bold** style to the cell and `emphasize.verbatim` would apply a `verbatim` style to the cell. A cell can be also *italic*, **bold** and `verbatim` at the same time. Those functions and arguments ending in rows or cols take a vector (like which columns or rows to emphasize in a table), while the cells argument take either a vector (for one dimensional "tables") or an array-like data structure with two columns holding row and column indexes of cells to be emphasized -- just like what `which(..., arr.ind = TRUE)` returns: ```{r} t <- mtcars[1:3, 1:5] emphasize.italics.cols(1) emphasize.italics.rows(1) emphasize.strong.cells(which(t > 20, arr.ind = TRUE)) pandoc.table(t) pandoc.table(t, emphasize.verbatim.rows = 1, emphasize.strong.cells = which(t > 20, arr.ind = TRUE)) ``` For more elaborative examples, please see our blog post - [Highlight cells in markdown tables](http://blog.rapporter.net/2013/04/hihglight-cells-in-markdown-tables.html). ## Table and Cell width `pander/pandoc.table` is able to deal with wide tables. Ever had an issue in `LaTeX` or `MS Word` when trying to print a correlation matrix of 40 variables? This problem is carefully addressed with `split.table` parameter: ```{r} pandoc.table(mtcars[1:2, ], style = "grid", caption = "Wide table to be split!") ``` `split.table` defaults to 80 characters and to turn it off, set `split.table` to `Inf`: ```{r} pandoc.table(mtcars[1:2, ], style = "grid", caption = "Wide table to be split!", split.table = Inf) ``` Also, `pander` tries to split too wide cells into multiline cells. The maximum number of characters in a cell is specified by the `split.cells` parameter (defaults to 30), which can be a single value, vector (values for each column separately) and relative vector (percentages of `split.tables` parameter). Please not that this only works for `multiline` and `grid` tables: ```{r, error=TRUE} df <- data.frame(a = 'Lorem ipsum', b = 'dolor sit', c = 'amet') pandoc.table(df, split.cells = 5) pandoc.table(df, split.cells = c(5, 20, 5)) pandoc.table(df, split.cells = c("80%", "10%", "10%")) pandoc.table(df, split.cells = 5, style = 'simple') ``` In some cases it is also useful to split too long words with hyphens, and `pander` uses `sylly` functionality for that. Just specify `use.hyphening` argument and have `sylly` installed: ```{r} pandoc.table(data.frame(baz = 'foobar', foo='accoutrements'), use.hyphening = TRUE, split.cells = 3) ``` ## Rounding and number formatting `pander/pandoc.table` deals with formatting numbers by having 4 parameters: * `round` to the number of decimal places. * `digits` to specify how many significant digits are to be used for numeric * `decimal.mark/big.mark` to specify character for decimal point/orders of magnitude `round` and `digits` parameter can be a vector specifying values for each column (has to be the same length as number of columns). Values for non-numeric columns will be disregarded. Now let's get to some examples: ```{r} r <- matrix(c(283764.97430, 29.12345678901, -7.1234, -100.1), ncol = 2) pandoc.table(r, round = 2) pandoc.table(r, round = c(4,2)) # vector for each column pandoc.table(r, digits = 2) pandoc.table(r, digits = c(1, 5)) # vector for each column pandoc.table(r, big.mark = ',') pandoc.table(r, decimal.mark = ',') ``` ## Other options Functionality described in other sections is most notable, but `pander/pandoc.table` also has smaller nifty features that are worth mentioning: * `plain.ascii` - allows to have the output without `markdown` markup: ```{r} pandoc.table(mtcars[1:3, 1:4]) pandoc.table(mtcars[1:3, 1:4], plain.ascii = TRUE) ``` * `caption` - set caption (string) to be shown under the table: ```{r} pandoc.table(mtcars[1:3, 1:5], style = "grid", caption = "My caption!") ``` * `missing` - set a string to replace missing values: ```{r} m <- mtcars[1:3, 1:5] m$mpg <- NA pandoc.table(m, missing = '?') ```