This is a tutorial on R Markdown and how to use the markdown template of Sheffield’s Florey Institute.
Although it is aimed at Florey Institute members I hope that it will be of broader interest, and anyone is welcome to use and adapt it. The penultimate section shows how to change the branding to another institution.
This page has three aims:
If you are new to markdown, I also aim to convert you to this powerful communication tool. This page will demonstrate how it can be used to easily communicate pre-publication research findings to colleagues and collaborators.
This page was written with R Markdown using the Florey Template. The code used to make it can be found on my GitHub here. It might be useful to look at this in conjunction with this webpage to see how it has been put together, although this shouldn’t be necessary to understand it.
If you are already experienced with markdown then the section ‘Installing the template’ will be all you need to get going. However, the custom CSS section later on might be of interest, as this explains how to customise the logo and section breaks.
If you are new to markdown then I suggest following this page through in order.
If you need more detail or want to understand how to use markdown to create PDFs, Word documents or Shiny webpages I would highly recommend the R Markdown Cookbook.
In homage to the original markdown tutorial I have used the cars dataset¹.
If you have any questions, suggestions for improvements or would like to edit the project please either email j.goodall@sheffield.ac.uk or open an issue on the GitHub page.
In the R console, install the developer tools package by running:
if (!require("devtools", quietly = TRUE)) {
  install.packages("devtools")
  library(devtools)
}
Then install the template package with:
devtools::install_github("jackwgoodall/floreytemplate")
Open a new file and select R Markdown Document as the document type
On the left of the window select ‘From Template’
Select ‘Florey Template’ and give it an appropriate name
This should give you a minimal markdown document
Press the ‘Knit’ button at the top of the screen
If all has gone to plan an html file with the Florey logo and some intro text should appear
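The ‘Knit’ button can also be reproduced from the R console, which is handy in scripts. This is a minimal sketch rather than part of the template: `my_report.Rmd` is a hypothetical file name, and the block assumes the rmarkdown package (installed alongside RStudio) is available.

```r
# Hypothetical file name: replace with the document you created from the template
rmd_file <- "my_report.Rmd"

if (requireNamespace("rmarkdown", quietly = TRUE) && file.exists(rmd_file)) {
  # render() uses the output format named in the YAML heading
  # and returns the path of the file it wrote
  html_file <- rmarkdown::render(rmd_file)
  message("Rendered: ", html_file)
} else {
  message("Create ", rmd_file, " from the template first, then re-run.")
}
```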
A markdown document is made up of:
The YAML heading (‘YAML Ain’t Markup Language’ - recursive acronyms seem to be big in coding circles) tells RStudio how to knit the document (knitting is the process of turning your markdown document into a human-readable form such as html).
The only essential thing for the YAML is the output:
option. I would strongly suggest html for almost all tasks (although
both PDF and Word documents have their uses).
However, there is much more that can be done with the YAML heading to
make your markdown documents more accessible, as you will see.
Formatted text is really what makes markdown so great for sharing your pre-publication work with colleagues as this lets you add a narrative around the data you want to present.
The options open to you are shown in the headings and formatting
sections.
Code chunks are sections of R code that either set things up or produce an output that we want to show. Code within these chunks operates pretty much the same as it does within ‘normal’ R - but plots or any text that would normally go to the console appear after the chunk that created them.
All executable code in markdown needs to be contained within code chunks.
These chunks are framed by ```{r} at the start and
``` at the end.
This can either be done manually or with Code -> Insert Chunk.
After the ‘r’ you can add a name for the chunk (eg
```{r chunk_one}). This makes it easier to find specific
chunks later as these names will appear at the bottom left of the
editor.
You could choose how you want each chunk to behave one-by-one in the heading.
e.g. ```{r chunk_one, echo = FALSE} would tell R that
you don’t want the code for this particular chunk to be shown.
Much easier though is to set the global options with knitr::opts_chunk$set like this:
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
options(scipen = 999, digits = 2) # Turn off scientific notation; print numbers to 2 significant digits
Click the ‘Show’ button here to see the settings for this document.
I have selected:
echo = TRUE, meaning that all code will be shown.
However as I have enabled code_folding: hide in the YAML
these will be hidden by default unless you click the show button. I
think this is a nice compromise as some collaborators will likely want
to see your code, whilst others will just want the content.
warning = FALSE to stop the warnings being printed.
You will need to pay attention to these when you’re building the
document as they won’t come out in the knit now.
message = FALSE prevents messages that are generated
by code from being shown.
options(scipen = 999, digits = 2) is an
additional (non-standard) line I have added to force the output to
show plain numbers (i.e. not scientific notation), and to print numbers
to two significant digits. These can obviously be modified to suit your
needs.
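To see what the two `options()` settings do to printed numbers, here is a small base R sketch. The example values are my own and are not taken from the template; note that `digits` controls significant digits rather than decimal places.

```r
# Save the current settings so they can be restored afterwards
old <- options(scipen = 999, digits = 2)

big   <- format(1e10)         # fixed notation instead of "1e+10"
small <- format(0.000012345)  # two significant digits are kept
pi_2  <- format(pi)           # two significant digits, so one decimal place here

print(c(big = big, small = small, pi = pi_2))

# Restore the previous settings
options(old)
```

Restoring the old options at the end keeps the change from leaking into the rest of your session.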
Other options are:
include = FALSE would prevent both the code AND its
output from being shown. The code will still be run though.
eval = FALSE stops the code being run at all (useful
if you want to show the code but don’t want it to do anything).
The setting in a code chunk will take precedence over the global
options. So, if you want all of the code chunks to be shown apart from
one you could set the global options to echo = TRUE and
that chunk’s options to echo = FALSE.
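The precedence rule can be checked from the console by knitting a tiny two-chunk document held in a character vector. This is only a sketch, assuming the knitr package is installed; the chunk names `shown` and `hidden` are made up for illustration.

```r
library(knitr)

fence <- strrep("`", 3)  # build the chunk fences without typing them literally

# Global default: show the code for every chunk
opts_chunk$set(echo = TRUE)

# A two-chunk document; the second chunk overrides the global echo setting
rmd <- c(
  paste0(fence, "{r shown}"),
  "x <- 1 + 1",
  "x",
  fence,
  "",
  paste0(fence, "{r hidden, echo = FALSE}"),
  "y <- 2 + 2",
  "y",
  fence
)

# knit(text = ...) returns the knitted markdown instead of writing a file
out <- knit(text = rmd, quiet = TRUE)
cat(out)

# Put the global chunk options back to their defaults
opts_chunk$restore()
```

In the printed markdown the code of the first chunk is present, while the second chunk contributes only its result.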
Another useful thing to get your head around if you will be making documents with large data processing is caching. That is beyond the scope of this page but a good introduction can be found here
You will need some packages. If these are used for the knit they need to be explicitly loaded in the markdown document (not just loaded into your general environment).
I have gone for the default approach of hard coding pacman to be installed and then using this to install everything else.
I think ggplot2, tidyverse and kableExtra (see why later) are all pretty much essentials but you can change these to suit.
I have also set a global ggplot theme - which is discussed in the plot themes section
if (!require("pacman", quietly = TRUE)) {
  install.packages("pacman")
}
pacman::p_load(ggplot2, tidyverse, kableExtra)
# Set global ggplot theme
theme_set(
  theme_minimal() +
    theme(
      panel.grid.major = element_blank(), # Remove the large grid lines
      panel.grid.minor = element_blank(), # Remove the small grid lines
      panel.background = element_blank(), # Remove the background
      axis.line = element_line(colour = "black") # Add a more solid axis line
    )
)
You can easily create headings using increasing numbers of hashtags (#). The more hashtags the smaller the heading.
I have set up the css file (this is explained later) to give a bit of extra white space above and below the headings; these can be easily changed.
The setup of this markdown is such that anything up to heading level 4 (####) will feature in the folding table of contents.
Click on “Formatting” then “Headings” in the table of contents to see
what this looks like.
The heading of this section is heading 1 (one #)
This is heading 2 (two ##)
This is heading 3 (three ###)
This is heading 4 (four ####)
Heading 5 (five #####) will still create a heading - but this won’t be referenced in the table of contents.
The depth of heading you want to be referenced can be easily changed
in the YAML heading with toc_depth:
| Input | Output |
|---|---|
| **Two asterisks** | makes text bold |
| *One asterisk* | italicises text |
| ***Three asterisks*** | does both |
| `Single backticks` | formats as code |
| ~~Two tildes~~ | strikes through text |
Links can be made by putting the text in square brackets, followed by the website in round brackets, with no space in between.
[Link](https://github.com/jackwgoodall/floreytemplate/)
Makes:
Equations can be made by wrapping the equation in $ for
inline and $$ for displayed equations:
\[ \sum_{i = 1}^{n}{(this..is..an..equation)^2} \]

A list of mathematical notation can be found here.
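As a further illustration (my own example, not one from the original page), the sample mean can be written inline with single dollar signs and as a displayed equation with double dollar signs:

```latex
The sample mean $\bar{x}$ of $n$ observations is

$$
\bar{x} = \frac{1}{n} \sum_{i = 1}^{n} x_i
$$
```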
A dividing line can be placed with three or more asterisks (***) like this:
Although a more convincing divide can be placed using some html code
(done here to match the custom dividing lines I’ve put between sections
in the css file).
<hr style="border: 1px solid black; width: 100%; margin-top: 1em; margin-bottom: 1em;">
makes:
(the ‘black’ here can be replaced by other colour names or hex codes to match your style)
Better eh?
Markdown doesn’t really pay attention to spaces.
Adding two spaces to the end of a line generally starts a new line,
but if you want to be more explicit (or add multiple new lines) then
inserting <br> will do this.
Adding `&nbsp;` (a non-breaking space) will add whitespace.
Tables can be ‘freehanded’ (like I’ve done in the ‘Text Options’ section).
However, more elegant tables can be quickly made by piping the table into kableExtra. Like this:
table(pressure$temperature[1:5], pressure$pressure[1:5]) %>%
kbl(caption = "Hover Table") %>%
kable_paper("hover", full_width = FALSE)
| | 0.0002 | 0.0012 | 0.006 | 0.03 | 0.09 |
|---|---|---|---|---|---|
| 0 | 1 | 0 | 0 | 0 | 0 |
| 20 | 0 | 1 | 0 | 0 | 0 |
| 40 | 0 | 0 | 1 | 0 | 0 |
| 60 | 0 | 0 | 0 | 1 | 0 |
| 80 | 0 | 0 | 0 | 0 | 1 |
There are lots of options that are well documented on their CRAN page
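The same pattern works when you pipe a data frame, rather than a `table()` result, into kableExtra. The following is a sketch that assumes kableExtra is installed; the caption and column labels are my own choices, with the units taken from the help page of the built-in `pressure` dataset.

```r
library(kableExtra)

# First five rows of the built-in pressure data frame, styled as a hover table
tbl <- head(pressure, 5) %>%
  kbl(
    format = "html",
    caption = "First five rows of the pressure dataset",
    col.names = c("Temperature (deg C)", "Pressure (mm Hg)")
  ) %>%
  kable_paper("hover", full_width = FALSE)

tbl
```

Inside a code chunk, the last line prints the styled table into the knitted document.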
Sometimes it can be useful to add data from R objects into the formatted text. This allows statements to update if you change the data source.
For instance I could find out that the median pressure from the pressure dataset is 8.8 by checking this in the console. But this number won’t change if the dataset is updated.
Code can be run inline by starting a phrase with “r” inside back ticks, e.g.:
The median pressure of the pressure dataset is `r median(pressure$pressure)` gives:
“The median pressure of the pressure dataset is 8.8”
If the values in the pressure dataset were to change, this would update accordingly.
This also works in the YAML header; I’ve used this to add the rendered date to the title.
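The values that inline code drops into the text are ordinary R expressions, so they can be tried in the console first. A minimal sketch using the built-in `pressure` data frame from the code above; the variable names are hypothetical, and the date format is an assumption based on the UK-style date shown in the title.

```r
# The value an inline expression would insert into the sentence
med_pressure <- median(pressure$pressure)

sentence <- sprintf(
  "The median pressure of the pressure dataset is %s",
  format(med_pressure, digits = 2)
)
print(sentence)

# A day-month-year date string of the kind that could be used in the YAML title
rendered_date <- format(Sys.Date(), "%d-%m-%Y")
print(rendered_date)
```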
One of the most powerful ways to make a data rich document more accessible is by using tabs.
These are easily inserted with the {.tabset} feature.
This is added after a heading, and the headings below it in the
document that are one level lower will become tabs.
The tabs end when you add another heading at the same (or a higher) level as the heading carrying {.tabset}.
plot <- pressure %>%
ggplot(aes(x = temperature, y = pressure)) +
geom_point()
plot
plot +
geom_point(color = "red", # Adjusting colour names
size = 4) # Adjusting size
plot +
geom_point(color = "#093daf", # Adjusting colour using hex colours
size = 4, # Adjusting size
shape = 17) # Adjusting shape
The plot size can be specified with fig.dim = c(x, y)
where x is the width and y is the height of the image.
e.g. 2x3
plot +
geom_point(color = "#093daf", # Adjusting colour using hex colours
size = 4, # Adjusting size
shape = 17) # Adjusting shape
How much of the markdown space the plot occupies can be set with
out.width = "X%" and out.height="X%".
e.g. 50%
plot +
geom_point(color = "#093daf", # Adjusting colour using hex colours
size = 4, # Adjusting size
shape = 17) # Adjusting shape
Two plots can be put side by side by changing the
out.width to 50% and then adding the
fig.show="hold" option.
e.g.
{r side-by-side-plots, out.width="50%", fig.show="hold"}
would give:
plot +
geom_point(color = "#093daf", # Adjusting colour using hex colours
size = 4, # Adjusting size
shape = 17) + # Adjusting shape
labs(title = "Figure 1")
plot +
geom_point(color = "#c21033", # Adjusting colour using hex colours
size = 4, # Adjusting size
shape = 17) + # Adjusting shape
labs(title = "Figure 2")
As with ggplot in R you can change the ggplot themes (theme_minimal() is very popular).
I have set a global ggplot theme, which is applied to all the plots, at the top of the markdown using ggplot2’s theme_set:
theme_set(
  theme_minimal() +
    theme(
      panel.grid.major = element_blank(), # Remove the large grid lines
      panel.grid.minor = element_blank(), # Remove the small grid lines
      panel.background = element_blank(), # Remove the background
      axis.line = element_line(colour = "black") # Add a more solid axis line
    )
)
This then applies this theme unless you specify another one. It means that plots look like:
plot +
geom_point(color = "#c21033", # Adjusting colour using hex colours
size = 4, # Adjusting size
shape = 17) + # Adjusting shape
ggtitle("This...") +
theme(
plot.title = element_text(size=24))
plot +
geom_point(color = "#c21033", # Adjusting colour using hex colours
size = 4, # Adjusting size
shape = 17) + # Adjusting shape
ggtitle("...rather than this") +
theme_grey() +
theme(
plot.title = element_text(size=24))
References can be added to a markdown document by making a BibTeX
file and then referencing this in the markdown document. The style can
be changed using a custom .csl file (they can be downloaded from
Zotero’s website here). If
you add this to the working directory and change the csl:
section of the YAML this will update the referencing style of the inline
citations and bibliography.
If you are using my template I have set this up as Nature’s style.
References that are saved to the BibTeX file can be referenced in the
text.
The easiest way to make a BibTeX file is to use a reference manager such
as Zotero - this can be done for single files (right click), multiple
files (select the ones you want and right click) or the whole library
(File -> Export Library). There are also online tools that will make
BibTeX out of DOIs.
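R itself can also produce BibTeX, which is useful for citing R or its packages. This is a sketch, not part of the template: it uses base R’s `citation()` and `toBibtex()`, writes to a temporary file so nothing is overwritten, and the key `rcore` is a made-up name that you would then cite as @rcore.

```r
# BibTeX entry for R itself, as a plain character vector (one element per line)
bib <- as.character(toBibtex(citation()))

# Give the entry a citation key (hypothetical name) so it can be cited as @rcore
bib[1] <- sub("\\{[^,]*,", "{rcore,", bib[1])

# Write to a temporary file; in a real project this would sit next to the .Rmd
bib_file <- file.path(tempdir(), "references.bibtex")
writeLines(bib, bib_file)

cat(readLines(bib_file), sep = "\n")
```

Swapping `citation()` for `citation("ggplot2")` would give the entry for a package instead, provided that package is installed.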
In-line citations can be added using an @ sign followed by the name of the reference (this is mcneil1977interactive in our example).
This will create a reference like this¹.
As I have added link-citations: true into the YAML this
will hyperlink to the relevant part of the bibliography.
There are some more advanced ways to link Zotero and RStudio, but unless you plan on writing whole dissertations this way I am not sure they are worth the faff.
Adding “# References” at the end of the document will automatically render the references in your chosen format.
Different parts of the document can be cross-referenced by adding some text in square brackets, followed by a hash and the heading name in round brackets, with the heading name all in lower case, with no special characters and with hyphens instead of spaces.
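The heading-to-anchor rule can be sketched as a small helper. `heading_to_anchor()` is a hypothetical function written for this illustration; it only approximates the identifiers that are generated when the document is knitted, so check the rendered page if a link does not resolve.

```r
# Hypothetical helper: approximate the anchor generated for a heading
heading_to_anchor <- function(heading) {
  slug <- tolower(heading)                 # all lower case
  slug <- gsub("[^a-z0-9 _.-]", "", slug)  # drop special characters
  slug <- trimws(slug)                     # no leading or trailing spaces
  gsub("\\s+", "-", slug)                  # hyphens instead of spaces
}

anchor <- heading_to_anchor("Installing the template")
print(anchor)  # "installing-the-template"

# Build the markdown for a cross-reference to that heading
link <- sprintf("[%s](#%s)", "see the installation steps", anchor)
print(link)
```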
This template has two extra twiddles. The first is a custom css file which sorts out some of the styling. The second is an html footer.
CSS is how HTML formatting is controlled. By referencing a custom CSS in the YAML we can add global options for the document styling.
If you open this file you can see that I have:
To add the logo I’ve added this chunk into the CSS file:
#TOC::before {
content: "";
display: block;
height: 110px;
background-image: url(link/to/image.png);
background-size: contain;
background-position: center center;
background-repeat: no-repeat;
background-color: #450099;
}
This creates an empty block before the TOC, adds an image into it and then colours in the background the same as the logo.
If you want to change the logo you can simply change the link in the
background-image: url(link/to/image.png); to a file path of
your choice.
The only downside of this is that you will always need to have this image in the same directory as your html file for it to show.
A workaround for this is to convert your file to base64 (like I have done). This creates a text string that describes the picture which will be rendered into the html file.
# Load required library
library(base64enc)
# Encode the image as a Base64 string
img_path <- "setup/main-logo.png"
img_base64 <- base64enc::dataURI(file = img_path, mime = "image/png")
# Print or save the Base64 string
cat(img_base64)
Copy the string that this creates into the relevant part of the css file and you’re away.
You may want to change the highlights I have chosen for the table of content and dividing lines to match your logo. This website lets you find the hex colour codes from an image you upload and might be helpful.
Many of the YAML options have been explained already but this is a
summary of all of the options I have added for this template with an
explanation of what they do.
Unlike markdown, YAML is space sensitive and the indentations need to be
maintained, with child lines indented relative to the parent lines - you
can see this below:
title: "Florey R Markdown Template Tutorial" ## adds a title
author: "Jack Goodall" ## adds your name at the start
date: "Started 01-09-2024, last rendered 23-11-2024" ## adds the date rendered in UK format
output:
  html_document: ## tells the knit engine to turn this into html
    css: setup/style.css ## links to the custom css file
    includes:
      after_body: setup/footer.html ## adds the html footer (this is for the line break)
    code_folding: hide ## ensures the code chunks are printed but hides them by default
    toc: true ## adds a table of contents
    toc_float: ## makes it float
      collapsed: true ## collapses the subheadings
      smooth_scroll: true ## smooths the transitions
    toc_depth: 4 ## takes the headings down to level 4 (i.e. headings # to ####)
bibliography: references.bibtex ## adds a bibliography and points to the references file
csl: skeleton/nature.csl ## changes the referencing style to nature
link-citations: true ## hyperlinks citations to the references
editor_options:
  chunk_output_type: console ## sends chunk output to the console rather than inline when editing
I hope this markdown tutorial has been useful. I would love to hear your feedback (good and bad!) either by email (j.goodall@sheffield.ac.uk) or by opening an issue on the package page.