If you go through this document with the source and the output side-by-side, you’ll know enough to start making totally customizable R Markdown documents.
Whenever you’re using R Markdown, there’s two documents involved:
.rmd
/.Rmd
,1 andA concrete example: I’m currently typing into the document week4_document.rmd
from the script editor in R Studio. If I use the keyboard shortcut cmd+shift+k
, the document will get “knitted,” and the output document will get created in the same directory as the source document is in.
What happens when you knit a document is approximately the following:
output
field says (e.g. html or docx),The output file in this case is week4_document.html
, because that’s what I specified in the header – speaking of which, …
.rmd
source fileIf you’re looking at the source file week4_document.rmd
, you’ll see that the first six lines use a special syntax to specify basic attributes of the document: the title, the author, the date, and the output type. All of these are optional except for the output type. There are other, less relevant options that are not shown here, too. The entire block enclosed by ---
…---
is called the YAML header, and tells the document-generating system how the file is meant to be compiled, who the author is, what the title is, what other files it should load (if any), etc.
After the header is the main body of the document, which mostly just contains text like this. We just type right into the file to add text to the output document. Here is where you use markdown formatting like asterisks for italics, like brackets for links, and ###’s for sections and subsections. See the R markdown cheatsheet for pretty much all you’ll need to know.
You can use backticks (the key above “tab”) to make text look like code
. However, if you use this backtick syntax but also include the letter “r” and then a space immediately after the “r”, then knitr will treat what’s between the backticks as actual R code. It will get executed and the resulting value will be displayed in the paragraph, as if nothing was going on behind the scenes (see the source document for illustration).
Interleaved throughout the main text of the document are code chunks. This first chunk is called the “setup” chunk, and it’s the only “special” one. The setup chunk gives instructions for how all the other chunks should be displayed in the output. Don’t worry about the setup chunk for now – it will start to make sense as you use R Markdown more.
knitr::opts_chunk$set(echo=TRUE, message=FALSE, warning=FALSE)
It’s a good idea to load packages you know you’ll need in the setup chunk, too.
exercise: it’s conventional to use knitr::opts_chunk...
instead of just opts_chunk...
in the setup chunk. Why? (hint: think about packages you might need…)
You can insert a new code chunk with the shortcut ctrl+alt+i
. The syntax for writing code chunks is:
```{r chunk_name, chunkoption1=VALUE, chunkoption2=VALUE, ...}
# any R code you want to type
line <- of("R", "code")
and <- another(line)
finally(the_last, line)
# a comment about the plot
plot(the_xvar, the_yvar)
```
Here’s an example with actual code (but note that it’s verbatim, meaning that it’s for display purposes only – not a “real” chunk).
```{r chunk_name, fig.width=7, fig.height=5}
carz <- head(mtcars, n=3)
carz$gear <- as.factor(carz$gear)
print(carz[, 6:10])
```
Note that anything you can put as a chunk option in the setup chunk can also be used to specify an option for a particular chunk. But to specify a chunk option for an individual chunk, you put it after the chunk name, like in this example. So for example we could have put fig.width=7
in the opts_chunk$set()
command in the setup chunk. That would have the effect of making the default figure width 7 for all chunks in the document (unless that’s overriden by an option in an individual chunk)
Without the special verbatim formatting used above, the default way that code chunks appear in output is like this (referring to output document): first the code itself in a nice little block, and then the console output in a separate block.
carz <- mtcars
carz$cyl <- as.factor(carz$cyl)
carz$am <- as.factor(carz$am)
print(carz[1:5, c("cyl","am","gear","mpg","hp","wt")])
## cyl am gear mpg hp wt
## Mazda RX4 6 1 4 21.0 110 2.620
## Mazda RX4 Wag 6 1 4 21.0 110 2.875
## Datsun 710 4 1 4 22.8 93 2.320
## Hornet 4 Drive 6 0 3 21.4 110 3.215
## Hornet Sportabout 8 0 3 18.7 175 3.440
Look through the R Markdown cheatsheet for more kinds of useful chunk options.
You can write chunks that will display plots of any kind. The package knitr::
also has nice functions like kable()
, which take many kinds of rectangular console output (like a printed data frame or summary table), and turn them into nice, pretty formatted tables in the output file.
Here’s an example of a chunk with a nice table as output (cf. the same material print()
ed in above chunk):
knitr::kable(carz[1:5, c("cyl","am","gear","mpg","hp","wt")])
cyl | am | gear | mpg | hp | wt | |
---|---|---|---|---|---|---|
Mazda RX4 | 6 | 1 | 4 | 21.0 | 110 | 2.620 |
Mazda RX4 Wag | 6 | 1 | 4 | 21.0 | 110 | 2.875 |
Datsun 710 | 4 | 1 | 4 | 22.8 | 93 | 2.320 |
Hornet 4 Drive | 6 | 0 | 3 | 21.4 | 110 | 3.215 |
Hornet Sportabout | 8 | 0 | 3 | 18.7 | 175 | 3.440 |
Here’s an example of a chunk with plot output:
# a quick plot of two variables from `mtcars`
ggplot2::qplot(carz$gear, carz$mpg)
You can even get fancy and wrap certain plot objects with ggplotly()
to make them interactive(ish):3
# make it interactive w plotly! :p
library("ggplot2"); library("plotly")
# here's a plot:
boosh <- ggplot(carz, aes(x=wt, y=hp, shape=am, color=cyl)) +
geom_point(size=3)
boosh
# and here it is "interactivized"/"plotly-ized"
ggplotly(boosh, width=900, height=400)
hover over an individual data point to see its values. menu on upper-right hand corner for, e.g. “download plot as .png”
Getting your figures to look right is a trial-and-error process. Try different values for the width, height, etc., to get the size/scale the way you want it. Also check out some examples on the knitr::
page, and you’ll get an idea of how things work.
The main chunk options that control figure size are:
fig.width
/fig.height
– width of the figure (units can be a tad confusing – trial and error)out.width
/out.height
– interacts with fig.width
/fig.height
, has to do with scaling and space alotted to output displayfig.show
– controls whether figures should (asis
) or shouldn’t (hide
) be shown; and if they are, should they all be shown together at the end of the chunk (hold
), or shown immediately after the line that makes them (asis
). There’s also animate
as an option.Note that fig.show
should be set to "hold"
if you want to display e.g. two figures from the same chunk side-by-side in the output.
You can use html wherever you want in R Markdown documents, as long as the output type is hmtl. That means you can also use css to specify how the document is styled. At the bottom of the source file week4_document.rmd
you’ll find some starter html/css that changes a couple of things and has some miscellaneous comments to get you started.
You can also type html directly in the source (when output is html), e.g. <hr>
for a horizontal line or <br>
for a line-break.
or in rare cases something else like .rnw
.↩
Though note that if you want pdf output, you’ll need to have \(\LaTeX\) installed; here’s a link to get you started. If you don’t want to mess with that or if you’re in a hurry, just use html output and then “print” as a pdf from Google Chrome (which has very nice pdf printing options for webpages). R Markdown html output can actually be made to look quite good when printed as pdf. That said, \(\LaTeX\) is awesome and well worth learning.↩
Note how the legends were handled differently. Both packages are under active development (or were recently), so chances are in a couple years this will be dealt with more uniformly.↩