berd_rmarkdown_project.Rproj
00-install.R
and click "Run" to run all lines of code.

"An article about computational results is advertising, not scholarship. The actual scholarship is the full software environment, code and data, that produced the result."
-- (Claerbout and Karrenbach 1992)
Your closest collaborator is you six months ago, but you don't reply to emails.
Computational reproducibility: detailed information is provided about
Empirical reproducibility: detailed information is provided about
Statistical reproducibility: detailed information is provided about
rOpenSci Reproducibility Guide
"These tools enable writing and publishing self-contained documents that include narrative and code used to generate both text and graphical results.
In the R ecosystem, knitr [R markdown] and its ancestor Sweave used with RStudio are the main tools for literate computing. Markdown or LaTeX are used for writing the narrative, with chunks of R code sprinkled throughout the narrative. IPython is a popular related system for the Python language, providing an interactive notebook for browser-based literate computing."
rOpenSci Reproducibility Guide
.Rmd file = Code + text
knitr is a package that converts .Rmd files containing code + markdown syntax to a plain text .md markdown file, and then to other formats (html, pdf, Word, etc.)
⇒ .Rmd -> .md (behind the scenes)
.Rmd -> .md -> .html
.Rmd -> .md -> .pdf
.Rmd -> .md -> .doc
.Rmd -> .md -> slides
Use projects (read this)
read_csv("data/mydata.csv")
read_csv("/home/yourname/Documents/stuff/mydata.csv")
Advantages of using projects
.Rmd)
Two options:
You should see the following text in your editor window:
Before knitting the .Rmd file, you must first save it.
To knit the .Rmd file, either click "Knit" or run the render() command in the Console (see the Extensions section for details).
A new window will open with the html output.
Remark:
.Rmd file
html output
Text in editor:
Output:
Text in editor:
Output:
You can easily navigate through your .Rmd file if you use headers to outline your text
Text in editor:
What is important is the spacing!
Text in editor:
Output:
Text in editor:
Output:
$ for inline equations: $y = \beta_0 + \beta_1 x + \varepsilon$
$$ for centered formulas: $$\hat{y} = 37 + 5\,\text{age} + 32 \cdot \text{height}$$
Text in editor:
Output:
Gauss and the normal distribution were
featured on the 10 Deutsch Mark (DM) bill.
You can also source an image on the internet instead:
Later we will use R code to create tables from data.
We can create tables using Markdown as well:
Text in editor:
Output:
Alas, there are no autmatik sepll chekc to katch you're tipos and grammR.
gramr package is an available RStudio Addin.
Create an .Rmd file with file name example1.Rmd that creates the html output to the right.
Can the flower species be determined by these variables?
Chunks of R code start with ```{r} and end with ```.
For example, the chunk produces the output
summary(iris)
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300
 Median :5.800   Median :3.000   Median :4.350   Median :1.300
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
       Species
 setosa    :50
 versicolor:50
 virginica :50
Code chunks can be created by either
Clicking on →
at top right of editor window, or
Keyboard shortcut
Text in editor:
No options specified: see both code and output
mean(iris$Sepal.Length)
[1] 5.843333
echo determines whether the R code is displayed or not. The default is TRUE. When set to FALSE, the code is not displayed in the output:
[1] 5.843333
eval determines whether the R code is run or not. The default is TRUE. When set to FALSE, the code is not run but is displayed in the output:
mean(iris$Sepal.Length)
Text in editor:
Output:
include determines whether to include the R chunk in the output or not. The default is TRUE. When set to FALSE, the chunk is run but we do not see the code or its output (note that nothing is displayed below):
Setting include=FALSE is useful when you have R code that you want to run, but do not want to display either the code or its output.
See the R Markdown cheatsheet for more chunk options.
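As a rough illustration of how echo and eval change what ends up in the output, the sketch below knits a tiny in-memory document with knitr::knit(text = ...) instead of a saved .Rmd file. The helper name knit_chunk is made up for this example.

```r
library(knitr)

# Hypothetical helper: knit a single chunk with the given options and
# return the resulting markdown as one string.
knit_chunk <- function(options) {
  rmd <- c(
    paste0("```{r, ", options, "}"),
    "mean(iris$Sepal.Length)",
    "```"
  )
  knit(text = rmd, quiet = TRUE)
}

out_default <- knit_chunk("echo=TRUE")   # code and result both appear
out_no_echo <- knit_chunk("echo=FALSE")  # only the result appears
out_no_eval <- knit_chunk("eval=FALSE")  # only the code appears

cat(out_no_echo)
```

Printing out_default and out_no_eval the same way shows the other two cases side by side.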
Text in editor:
Output:
The mean sepal length for all 3 species combined is 5.8 (SD = 0.8) cm.
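The numbers in a sentence like the one above would typically come from inline R code rather than being typed by hand. A minimal sketch, assuming the values are stored in objects first (mean_SepalLength follows the slide; sd_SepalLength is a name chosen for this example):

```r
# Compute the summary values once, e.g. in a chunk with include=FALSE
mean_SepalLength <- mean(iris$Sepal.Length)
sd_SepalLength   <- sd(iris$Sepal.Length)   # hypothetical object name

# In the .Rmd text, inline code such as `r round(mean_SepalLength, 1)`
# would insert the rounded value into the sentence.
sentence <- paste0(
  "The mean sepal length for all 3 species combined is ",
  round(mean_SepalLength, 1), " (SD = ", round(sd_SepalLength, 1), ") cm."
)
print(sentence)
```

Because the values are computed from the data, the sentence updates automatically if the data change.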
include=FALSE is used as a chunk option to evaluate the code but not show the code or its output.
mean_SepalLength, which can then be used later on.
Text in editor:
fig.width and fig.height
Sepal_WidthVsHeight-1.png
echo=FALSE was used to hide the code and only display the figure
Output:
library(dplyr)  # provides %>%, group_by(), and summarize()
table_sepal_length <- iris %>%
  group_by(Species) %>%
  summarize(mean = mean(Sepal.Length),
            SD = sd(Sepal.Length))
table_sepal_length
# A tibble: 3 x 3
  Species     mean    SD
  <fct>      <dbl> <dbl>
1 setosa      5.01 0.352
2 versicolor  5.94 0.516
3 virginica   6.59 0.636
kable
kable command from the knitr package has some basic formatting options
Text in editor:
Output:
kableExtra for more formatting options
Text in editor:
Output:
See Hao Zhu's webpage for many, many more kableExtra options.
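A small sketch of the basic kable options mentioned above, reusing the summary table by species. The digits, col.names, and caption arguments are standard kable arguments; the caption text and column labels are made up for illustration.

```r
library(dplyr)
library(knitr)

table_sepal_length <- iris %>%
  group_by(Species) %>%
  summarize(mean = mean(Sepal.Length), SD = sd(Sepal.Length))

# Round to 2 decimals, relabel the columns, and add a caption
sepal_kable <- kable(
  table_sepal_length,
  digits = 2,
  col.names = c("Species", "Mean (cm)", "SD (cm)"),
  caption = "Sepal length by species"
)
sepal_kable
```

Inside a knitted document the table is formatted for the chosen output type; in the console it prints as plain text.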
setup
knitr::opts_chunk$set(...) command
setup chunk
fig.path sets the folder name where figures generated by the .Rmd file will be saved
Edit the file example2/example2.Rmd to create html output that matches example2/example2_output.html shown below.
Create the table output shown below and at the end of example2/example2_output.html
(code link)
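For the setup chunk described earlier, a minimal sketch of what its body might contain. The specific option values and the figs/ folder name are assumptions for illustration, not the settings used in the exercise files.

```r
library(knitr)

# Typically placed in the first chunk, e.g. one labeled "setup" with include=FALSE.
# These defaults then apply to every later chunk unless a chunk overrides them.
opts_chunk$set(
  echo       = TRUE,     # show code by default
  fig.width  = 6,        # default figure width in inches
  fig.height = 4,        # default figure height in inches
  fig.path   = "figs/"   # hypothetical folder for saved figures
)

# Check what is currently set
opts_chunk$get("fig.path")
```

Setting the defaults once keeps individual chunk headers short, since only the exceptions need their own options.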
Many output options can be set in the YAML metadata, which is the first set of code in the file starting and ending with ---.
Set the title, author, and date that appear at the top of the output file
Text in editor:
Output:
Text in editor: (example3a.Rmd)
Try out collapsed: yes and smooth_scroll: no
Output: (example3a.html)
Text in editor: (example3b.Rmd)
Output: (example3b.html)
echo = TRUE
code_folding: hide - all R code hidden by default; user must click Code button to see R
code_folding: show - all R code shown by default; user must click Code button to hide R
Text in editor: (Word_example3.Rmd)
Output: (Word_example3.docx)
kableExtra package options
kable can be used
YAML with code to include style file:
Sample style file: (word-styles-reference.docx)
The Word doc created by RStudio will have the same formatting as the specified style file.
Producing pdf documents requires that LaTeX be installed on your computer.
See pdf_example3.Rmd for code and pdf_example3.pdf for output.
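The YAML output options above correspond to arguments of the output format functions in the rmarkdown package, so the same settings can be written in R and passed to render(). A sketch, assuming example3a.Rmd sits in the working directory (the call is skipped if it does not):

```r
library(rmarkdown)

# R equivalent of YAML such as:
#   output:
#     html_document:
#       toc: true
#       toc_float:
#         collapsed: yes
#         smooth_scroll: no
#       code_folding: hide
html_format <- html_document(
  toc = TRUE,
  toc_float = list(collapsed = TRUE, smooth_scroll = FALSE),
  code_folding = "hide"
)

input_file <- "example3a.Rmd"  # file name taken from the example above
if (file.exists(input_file)) {
  render(input_file, output_format = html_format)
}
```

This can be handy when the same .Rmd file needs different output settings without editing its YAML each time.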
Change the YAML of example2/example2.Rmd
to
xaringan::inf_mr()
Instead of clicking "Knit" every time to see your updated document output, try this:
After installing the xaringan package, .Rmd files can be rendered "live" as you type/save. Either run xaringan::inf_mr() in the console while your .Rmd file is open, or click on Addins (top of screen), scroll down to "Xaringan", and click on "Infinite Moon Reader".
This is a new feature, so you need the most recent version of xaringan and RStudio. It works well for html_document output.
Your files must make sense to yourself 6 months from now, and/or other collaborators.
setwd())
Absolute paths ≠ reproducible
Relative paths = reproducible (if done correctly)
Jenny Bryan's oft quoted opinion; see post on Project-oriented workflow
# Use a relative path, "relative to" the project folder
read_csv("mydata.csv") # looks in .Rproj folder
```{r data, eval=TRUE}
read_csv("mydata.csv") # looks in .Rmd's folder
```
These three facts together can cause a headache.
here::here()!
After knitting, this gives you (file 🥗)
After knitting, this gives you:
.. or ../
cd .. moves up one directory, cp ../myfile.txt newfile.txt copies a file from one level up into the current folder (working directory)
.Rmd when you want to source the data in the data/ folder, you could use .. to move up a folder into the main directory, and then back down into the data/ folder:
# From the .Rmd folder, move up one folder then down to the data folder
mydata <- read_csv("../data/report3_nhanes_data.csv")
here::here() → relative paths to the project directory
here package's here() function solves this issue of inconsistent working directories.
.Rproj file is.
here::here() returns the project directory as a string
/ (Mac) or \ (PC) or spaces etc.
here::here()
[1] "/Users/minnier/Google Drive/BERD R Classes/berd_r_courses_github"
here::here() with folders and filenames
here::here("folder","filename") returns the entire file path as a string
.Rmd file interactively like a notebook, when knitting it, when copying it to the console, wherever, whenever!!
here::here("data","mydatafile.csv")
[1] "/Users/minnier/Google Drive/BERD R Classes/berd_r_courses_github/data/mydatafile.csv"
here::here("data","raw-data","mydatafile.csv")
[1] "/Users/minnier/Google Drive/BERD R Classes/berd_r_courses_github/data/raw-data/mydatafile.csv"
We will explore how and when to use this in the exercises.
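A short sketch of how such a path might be used when reading data. The file name is the placeholder from the examples above, and base read.csv() stands in for read_csv() to keep the example dependency-light; read_csv(data_path) would be called the same way.

```r
library(here)

# Build the path from the project root, regardless of the current working directory
data_path <- here("data", "mydatafile.csv")
print(data_path)

# Read the file only if it is actually there (the file name is a placeholder)
if (file.exists(data_path)) {
  mydata <- read.csv(data_path)
}
```

The same line works whether it is run in the console or inside a knitted .Rmd stored in a subfolder.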
Within your project folder, open this file and follow the instructions:
example4/example4.Rmd
If you want to have separate .Rmd files that are sourced in one large document, you can have "child document chunks":
A file called report_prelim.Rmd in the analysis/ folder (No YAML):
# Details about experiment
Here are some details.
I can make a plot, too.
```{r plotstuff}
plot(x,y)
```
In the main doc main_doc.Rmd
---
title: "Main Report"
output: html_document
---
# Preliminary Analysis
```{r child = here("analysis","report_prelim.Rmd")}
```
# Conclusion
```{r}
kable(summarytable)
```
.Rmd file with the xaringan package!
File -> New File -> R Markdown -> Presentation
# Slide Header, or ---
Open example4/example4_pres.Rmd and follow instructions.
Bonus: Try using xaringan::inf_mr() to update the output in real time.
Tabbed sections are a nice way to show multiple images or sections:
## Results {.tabset}
### By Species
```{r}
ggplot(iris, aes(x=Sepal.Length, y=Sepal.Width, color=Species))+
  geom_point()
```
### Panel Species
```{r}
ggplot(iris, aes(x=Sepal.Length, y=Sepal.Width, color=Species))+
  geom_point()+
  facet_wrap(~Species)
```
.Rmd (if they are installed on the computer), including SAS, STATA, and python.
names(knitr::knit_engines$get())
 [1] "awk"       "bash"      "coffee"    "gawk"      "groovy"    "haskell"
 [7] "lein"      "mysql"     "node"      "octave"    "perl"      "psql"
[13] "Rscript"   "ruby"      "sas"       "scala"     "sed"       "sh"
[19] "stata"     "zsh"       "highlight" "Rcpp"      "tikz"      "dot"
[25] "c"         "fortran"   "fortran95" "asy"       "cat"       "asis"
[31] "stan"      "block"     "block2"    "js"        "css"       "sql"
[37] "go"        "python"    "julia"     "sass"      "scss"
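The engine list can also be checked programmatically before writing chunks in another language. A minimal sketch; the choice of engines to look for is arbitrary, and an engine being listed only means knitr knows how to call it, not that the software itself is installed.

```r
library(knitr)

# Names of all language engines this knitr installation knows about
engines <- names(knit_engines$get())

# Check for a few engines before writing chunks that depend on them
wanted <- c("python", "sas", "stata")
setNames(wanted %in% engines, wanted)
```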
collectcode=TRUE to save code output.
```{r setup}
library(SASmarkdown)
```
```{sas clean_data, collectcode=TRUE}
/* clean data with SAS code */
/* export to file */
```
```{sas analyze_data}
/* analyze data from above code */
```
```{r analyze_data_r}
# chunk labels must be unique, so the R chunk gets its own label
# source clean data file and run code
```
rmarkdown::render()
It can sometimes be easier to set options and change output files/locations when using the render() function in the rmarkdown package. This is also useful for rendering multiple documents in a batch, or using parameterized reports.
In a .R file, or in the console, run commands to knit the documents:
library(rmarkdown)
render("report1.Rmd")
# Render in a directory
render(here::here("report3","report3.Rmd"))
# Render a single format
render("report1.Rmd", output_format = "html_document")
# Render multiple formats
render("report1.Rmd", output_format = c("html_document", "pdf_document"))
# Render to a different file name or folder
render("report1.Rmd", output_format = "html_document", output_file = "output/report1_2019_07_18.html")
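For the batch case mentioned above, one possible pattern is a loop over file names with date-stamped outputs. This is only a sketch: the file names and the output/ folder are hypothetical, and missing files are skipped so the loop runs safely anywhere.

```r
library(rmarkdown)

# Hypothetical list of reports to render
rmd_files <- c("report1.Rmd", "report2.Rmd")

# Date-stamped output names, e.g. report1_2019_07_18.html
stamp     <- format(Sys.Date(), "%Y_%m_%d")
out_files <- paste0(tools::file_path_sans_ext(rmd_files), "_", stamp, ".html")

for (i in seq_along(rmd_files)) {
  if (!file.exists(rmd_files[i])) {
    message("Skipping missing file: ", rmd_files[i])
    next
  }
  render(rmd_files[i],
         output_format = "html_document",
         output_file   = out_files[i],
         output_dir    = "output")
}
```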
knitr::purl() → .R file
Run in the console or keep in a separate R file to extract all the R code into a .R file.
# makes an R file report1.R in same directory
knitr::purl("report1.Rmd")
# Can be more specific with output
knitr::purl(here::here("report3","report3.Rmd"),                   # Rmd location
            output = here::here("report3","report3_code_only.R")) # R output location
knitr::knit_exit(): End document early
.Rmd to end document there and ignore the rest.
```{r}
knitr::knit_exit()
```
---
title: My Report
output: html_document
params:
  data: file.csv
  printcode: TRUE
  year: 2018
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(
  echo = params$printcode
)
```
```{r}
mydata <- read_csv(params$data)
mydata <- mydata %>% filter(year==params$year)
```
rmarkdown::render (default values are set in YAML)
rmarkdown::render(
  "myreport.Rmd",
  params = list(data = "newfile.csv", year = "2019", printcode = FALSE),
  output_file = "report2019_newfile.html")
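Parameterized reports combine naturally with a loop, producing one output file per parameter value. A sketch under the assumption that myreport.Rmd (with the params shown above) is in the working directory; the set of years is made up, and nothing is rendered if the file is absent.

```r
library(rmarkdown)

years     <- c(2018, 2019)                     # hypothetical set of years
out_files <- paste0("report", years, ".html")  # one output file per year

for (i in seq_along(years)) {
  if (!file.exists("myreport.Rmd")) {
    message("myreport.Rmd not found; nothing rendered")
    break
  }
  render(
    "myreport.Rmd",
    params      = list(year = years[i]),  # other params keep their YAML defaults
    output_file = out_files[i]
  )
}
```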
.R files to .html with the notebook/compile button or knitr::spin()
.bib files
.Rmd are RStudio "notebooks" -- like an .Rmd but all the output is saved as it is run in the notebook.
.Rmd file that generated the slides is on GitHub and can be downloaded here, though you need to download the whole R project to knit the file.
solns/ folder.