Overview

This is a tutorial on R markdown and how to use Sheffield’s Florey Institute’s markdown template.

Although it is aimed at Florey Institute members I hope that it will be of broader interest, and anyone is welcome to use and adapt it. The penultimate section shows how to change the branding to another institution.

This page has three aims:

  1. To show what the Florey template looks like knitted
  2. To explain how to install, use and customise the template
  3. To provide a tutorial on how to use R Markdown for people who are familiar R but new to markdown

If you are new to markdown, I also aim to convert you to this powerful communication tool. This page will demonstrate how it could be used to easily communicate pre-publication research findings with colleague and collaborators.

This page was written with R Markdown using the Florey Template. The code used to make it can be found on my GitHub here. It might be useful to look at this in conjunction with this webpage to see how it has been put together, although this shouldn’t be necessary to understand it.

If you are already experienced with markdown then the section ‘Installing the template’ will be all you need to get going. However, the custom CSS section later on might be of interest, as this explains how to customise the logo and section breaks.

If you are new to markdown then I suggest following this page through in order.

If you need more detail or want to understand how to use markdown to create PDFs, Word documents or Shiny webpages I would highly recommend the R Markdown Cookbook.

In homage to the original markdown tutorial I have used the cars dataset1.

If you have any questions, suggestions for improvements or would like to edit the project please either email j.goodall@sheffield.ac.uk or open an issue on the GitHub page.

Installing the template

  1. In the R console install the devlopers’ tools packages by running: if (!require("devtools", quietly = TRUE)) { install.packages("devtools") library(devtools) }

  2. Then install the template package with:
    devtools::install_github("jackwgoodall/floreytemplate")

  3. Open a new file and select R Markdown Document as the document type

  4. On the left of the window select ‘From Template’

  5. Select ‘Florey Template’ and give it an appropriate name

  6. This should give you a minimal markdown document

  7. Press the ‘Knit’ button at the top of the screen

  8. If all has gone to plan an html file with the Florey logo and some intro text should appear

Anatomy of a Markdown

A markdown document is made up of:

  • The YAML heading
  • Formatted text
  • Code chunks

The YAML heading (‘YAML Ain’t Markup Language’ - recursive acroynms seem to be big in coding circles) tells R studio how to knit the document (knitting is the process of turning you markdown document into a human-readable form such as html).

The only essential thing for the YAML is the output: option. I would strongly suggest html for almost all tasks (although both PDF and Word documents have their uses).

However, there is much more that can be done with the YAML heading to make your markdown documents more accessible, as you will see.

Formatted text is really what makes markdown so great for sharing your pre-publication work with colleagues as this lets you add a narrative around the data you want to present.

The options open to you are shown in the headings and formatting sections.

Code chunks are sections of R code that are either just setting things up or are producing an output that we want to show. Code within these chunks operates pretty much the same as it does within ‘normal’ R - but plots or any text that would normally go to the console appears after the chunk that created it.

Chunks

Insert a chunk

All executable code in markdown needs to be contained within code chunks.

These chunks are framed by ```{r} at the start and ``` at the end.

This can either be done manually or with Code -> Insert Chunk.

After the ‘r’ you can add a name for the chunk (eg ```{r chunk_one}). This makes it easier to find specific chunks later as these names will appear at the bottom left of the editor.

Set the global options

You could choose how you want each chunk to behave one-by-one in the heading.

e.g. ```{r chunk_one, echo = FALSE} would tell R that you don’t want the code for this particular chunk to be shown.

Much easier though is to set the global options with knitr::opts_chunk$set like this:

knitr::opts_chunk$set(echo = TRUE, warning = FALSE, messages = FALSE)
options(scipen = 999, digits = 2)  # Turn off scientific notation, rounds numbers to 2 decimal places 

Click the ‘Show’ button here to see the settings for this document —->

I have selected:

  • echo = TRUE, meaning that all code will be shown. However as I have enabled code_folding: hide in the YAML these will be hidden by default unless you click the show button. I think this is a nice compromise as some collaborators will likely want to see you code, whilst others will just want the content.

  • warnings = FALSE to stop the warnings being printed. You will need to pay attention to these when you’re building the document as they won’t come out in the knit now.

  • message = FALSE prevents messages that are generated by code from being shown.

  • options(scipen = 999, digits = 2) this is an additional (non-standard) line I have added to force the output not to show real numbers (i.e. not scientific notation), and to round to two decimal places. These can obviously be modified to suit your needs

Other options are:

  • include = FALSE would prevent both the code AND it’s output from being shown. The code will still be run though.

  • eval = FALSE stops the code being run at all (useful if you want to show the code but don’t want it to do anything).

The setting in a code chunk will take precident over the global options. So, if you want all of the code chunks to be show apart from one you could set the global options to echo = TRUE and that chunk’s options to echo = FALSE.

Another useful thing to get your head around if you will be making documents with large data processing is caching. That is beyond the scope of this page but a good introduction can be found here

Packages

You will need some packages. If these are used for the knit they need to be explicitly loaded in the markdown document (not just loaded into your general environment).

I have gone for the default approach of hard coding pacman to be installed and then using this to install everything else.

I think ggplot2, tidyverse and kable_extra (see why kableExtra later) are all pretty much essentials but you can change these to suit.

I have also set a global ggplot theme - which is discussed in the plot themes section

if (!require("pacman", quietly = TRUE))
    install.packages("pacman")
pacman::p_load(ggplot2, tidyverse, kableExtra)
# Set global ggplot theme
theme_set(
  theme_minimal() +
    theme(
      panel.grid.major = element_blank(), # Remove the large grid lines
      panel.grid.minor = element_blank(), # Remove the small grid lines
      panel.background = element_blank(), # Remove the back ground
      axis.line = element_line(colour = "black") # Add a more solid axis line
    ))

Formatting

You can easily create headings using increasing numbers of hashtags (#). The more hashtags the smaller the heading.

I have set up the css file (this is explained later to give a bit of extra white space above and below the headings; these can be easily changed.

The setup of this markdown is such that anything up to heading level 4 (####) will feature in the folding table of content.

Click on “Formatting” then “Headings” in the table of content to see what this looks like
<———-

The heading of this section is heading 1 (one #)

Headings

This is heading 2 (two ##)

Heading 3

This is heading 3 (three ###)

Heading 4

This is heading 4 (four ####)

Heading 5

Heading 5 (five #####) will still create a heading - but this won’t be referenced in the table of content.

The depth of heading you want to be referenced can be easily changed in the YAML heading with toc_depth:

Text options

Input Output
**Two asterix** makes text bold
*One asterix* italicises text
***Three asterix*** does both
`Single backticks` formats as code
~~Two tilde~~ crosses out

Links can be made by putting the text in square brackets, followed by the website in round brackets, with no space in between.

[Link](https://github.com/jackwgoodall/floreytemplate/)

Makes:

Link

Equations can be made by wrapping the equation in $ for inline and $$ for displayed equations:

\[ \sum_{i = 1}^{n}{(this..is..an..equation)^2} \] A list of mathemetical notation can be found here

Dividing lines

A dividing line can be placed with three or more astrixes (***) like this:


Although a more convincing divide can be placed using some html code (done here to match the custom dividing lines I’ve put between sections in the css file).

<hr style="border: 1px solid black; width: 100%; margin-top: 1em; margin-bottom: 1em;"> makes:


(the ‘black’ here can be replaced by other colour names or hex codes to match you style)

Better eh?

Spaces

Markdown doesn’t really pay attention to spaces.

Adding two spaces to the end of a line generally starts a new line, but it you want to be more explict (or add multiple new lines) then inserting <br> will do this.

Adding &nbsp; will add whitespace.

Tables

Tables can be ‘freehanded’ (like I’ve done in the ‘Text Options’ section).

However, more elegant tables can be quickly made by piping the table into kableExtra. Like this:

table(pressure$temperature[1:5], pressure$pressure[1:5]) %>%
  kbl(caption = "Hover Table") %>%
  kable_paper("hover", full_width = F)
Hover Table
0.0002 0.0012 0.006 0.03 0.09
0 1 0 0 0 0
20 0 1 0 0 0
40 0 0 1 0 0
60 0 0 0 1 0
80 0 0 0 0 1

There are lots of options that are well documented on their CRAN page

R code in formatted text

Sometimes it can be useful to add data from R objects into the formatted text. This allows statements to update if you change the data source.

For instance I could find out that the median pressure from the cars dataset is 8.8 by checking this in the console. But this number won’t change if the cars dataset is updated.

Code can be run inline starting a phrase with “r” inside back ticks eg:

The median pressure of the cars dataset is `r median(pressure$pressure)` gives:

“The median pressure of the cars dataset is 8.8”

If the values in the cars dataset were to change, this would update accordingly.

This also works in the YAML header; I’ve used this to add the rendered date to the title.

Tabsets

One of the post powerful ways to make a data rich document more accessible is by using tabs.

These are easily inserted with the {.tabset} feature. This is added after a heading and then headings below that in the document that have a lower status will become tabs.

This ends when you add another heading with the higher level.

Standard

plot <- pressure %>%
  ggplot(aes(x = temperature, y = pressure)) + 
  geom_point()

plot

Red dots

plot +  
  geom_point(color = "red", # Adjusting colour names
             size = 4)  # Adjusting size

Blue trianges

plot +  
  geom_point(color = "#093daf", # Adjusting colour using hex colours 
             size = 4,   # Adjusting size
             shape = 17)  # Adjusting shape

Plots

Absolute plot size

The plotsize can be specified with fig.dim = c(x, y) where x is the width and y is the height of the image.

e.g. 2x3

plot +  
  geom_point(color = "#093daf", # Adjusting colour using hex colours 
             size = 4,   # Adjusting size
             shape = 17)  # Adjusting shape

Relative plot size

How much of the markdown space the plot occupies can be set with out.width = "X%" and out.height="X%".

e.g. 50%

plot +  
  geom_point(color = "#093daf", # Adjusting colour using hex colours 
             size = 4,   # Adjusting size
             shape = 17)  # Adjusting shape

Side-by-side

Two plots can be put side by side by changing the out.width to 50% and then adding the fig.show="hold" option.

eg. {r side-by-side plots, out.with="50%", fig.show="hold"} would give:

plot +  
  geom_point(color = "#093daf", # Adjusting colour using hex colours 
             size = 4,   # Adjusting size
             shape = 17)  + # Adjusting shape
  labs(title = "Figure 1")



plot +  
  geom_point(color = "#c21033", # Adjusting colour using hex colours 
             size = 4,   # Adjusting size
             shape = 17) + # Adjusting shape
labs(title = "Figure 2")

Plot themes

As with ggplot in R you can change the ggplot themes (theme_minimal() is very popular).

I have set a global ggplot theme which is applied to all the plots at the top of the markdown using ggplots’ theme_set:

theme_set(
  theme_minimal() +
    theme(
    panel.grid.major = element_blank(), # Remove the large grid lines
    panel.grid.minor = element_blank(), # Remove the small grid lines
    panel.background = element_blank(), # Remove the back ground
   axis.line = element_line(colour = "black") # Add a more solid axis line
  ))

This then applies this theme unless you specify another one. It means that plots look like:

plot +  
  geom_point(color = "#c21033", # Adjusting colour using hex colours 
             size = 4,   # Adjusting size
             shape = 17) + # Adjusting shape
  ggtitle("This...") + 
  theme(
       plot.title = element_text(size=24))

plot +  
  geom_point(color = "#c21033", # Adjusting colour using hex colours 
             size = 4,   # Adjusting size
             shape = 17) + # Adjusting shape
  ggtitle("...rather than this") + 
    theme_grey() + 
  theme(
      plot.title = element_text(size=24))

Add references

Style

References can be added to a markdown document by making a bibtex file and then referencing this in the markdown document. The style can be changed using a custom .csl file (they can be downloaded from Zotero’s website here). If you add this to the working directory and change the csl: section of the YAML this will update the referencing style of the inline citations and bibliography.

If you are using my template I have set this up as Nature’s style.

Adding reference files

References that are saved to the BibTex file can be referenced in the text.
The easiest way to make a BibTex file is to use a reference manager such as Zotero - this can be done for single files (right click), multiple files (select the ones you want and right click) or the whole library (File -> Export Library). There are also online tools that will make bibtex out of DOIs.

Adding references in the text

In line citations can be added using at @ sign followed by the name of the reference (this is mcneil1977interactive in our example).

This will create a reference like this1.

As I have added link-citations: true into the YAML this will hyperlink to the relevant part of the bibliography.

There are some more advanced ways to link Zotero and R studio, but unless you plan on writing whole dissertations this way I am not sure they are worth the faff.

Bibliography

Adding “# References” at the end of the document will automatically render the references in your chosen format.

Add crossreferences

Different parts of document can be cross-referenced by adding some text in square brackets, followed by the heading name, all in lower case with no special characters and with hyphens instead of spaces

Like this link to the formatting section

Custom CSS and HTML

This template has two extra twiddles. The first is a custom css file which sorts out some of the styling. The second is an html footer.

CSS

CSS is how HTML formatting is controlled. By referencing a custom CSS in the YAML we can add global options for the document styling.

If you open this file you can see that I have:

  • Changed the font sizes
  • Added more white space around the headers
  • Changed the colour of the text and background of the table of contents
  • Changed the font size of the table of contents
  • Added a logo

The YAML

Many of the YAML options have been explained already but this is a summary of all of the options I have added for this template with an explanation of what they do.
Unlike markdown, YAML is space sensitive and the indentations need to be maintained, with child lines indented relative to the parent lines - you can see this below:


title: "Florey R Markdown Template Tutorial" ## adds a title
author: "Jack Goodall" ## adds your name at the start
date: "Started 01-09-2024, last rendered 23-11-2024" ## adds the date rendered in UK format
output:
  html_document: ## tells the knit engine to turn this into html
    css: setup/style.css ## links to the custom css file
    includes:
      after_body: setup/footer.html ## adds the html footer (this is for the line break)
    code_folding: hide ## ensures the code chunks are printed but hides them by default
    toc: true ## adds a table of content
    toc_float: ## makes it float
      collapsed: true ## collapses the subheadings
      smooth_scroll: true ## smoothes the transitions
    toc_depth: 4 ## takes the heading down to level 4 (i.e. headings # top ####)
bibliography: references.bibtex ## adds a bibiography and points to the references filev csl: skeleton/nature.csl ## changes the referencing style to nature
link-citations: true ## hyperlinks citations to the references
editor_options:
  chunk_output_type: console ## sends code chunks to the ‘viewer’ rather than inline when editing

Conclusions

I hope this markdown tutorial has been useful. I would love to hear your feedback (good and bad!) either by email () or by opening an issue on the package page.

References

1.
McNeil, D. R. Interactive Data Analysis: A Practical Primer. (Wiley, 1977).