My first R Markdown experience was a painstaking journey in DSI. I was completely unfamiliar with R and its packages, and spent hours learning the basic functionality of code chunks before my AT2 resubmission. With those memories in mind, I expect that current DSI students who have never used R Markdown will face the same situation while preparing their assignments. This blog post may come in handy for them when formatting and visualizing data in a Markdown file.
library(tidyverse)
## -- Attaching packages -------------------------------------------------------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.2.0 v purrr 0.3.2
## v tibble 2.1.3 v dplyr 0.8.3
## v tidyr 0.8.3 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.4.0
## -- Conflicts ----------------------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(namer)
The very first thing we notice in a Markdown file is the YAML header at the top of the document. YAML is a widely used configuration file standard. The header is a set of key-value pairs delimited by three dashes (---), with colons separating option names and values. The YAML header specifies how RStudio knits the file, and the output parameter specifies the rendering options.
---
title: "Example"
output:
  pdf_document:
    toc: true
---
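To see how the header's key-value pairs are read, here is a minimal sketch that writes a small .Rmd file to a temporary location and parses its header with rmarkdown::yaml_front_matter(). The file contents and the names rmd_path and header are made up for illustration, and the sketch assumes the rmarkdown package is installed.

```r
# Write a tiny R Markdown file to a temporary path (illustrative content only).
rmd_path <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Example\"",
  "output:",
  "  pdf_document:",
  "    toc: true",
  "---",
  "",
  "Some text."
), rmd_path)

# Parse the YAML header into a nested R list.
header <- rmarkdown::yaml_front_matter(rmd_path)

header$title                    # the document title
header$output$pdf_document$toc  # the table-of-contents option
```

The nesting under output is expressed purely by indentation, so the spaces in front of pdf_document and toc matter.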
Code chunks are probably the most complex area for a beginner, and they are where I spent most of my time. A code chunk starts and ends with three backticks ```.
A code chunk has four components.
Since a Markdown document may consist of a number of code chunks, it can be difficult to remember every chunk name and what each chunk does. It is therefore important to give each code chunk a sensible name. Letters, digits and dashes (-) are the characters most commonly used in chunk names, and avoiding special characters such as (_) and (/) makes things easier. There is a package called namer that can name code chunks automatically. It is recommended to run the namer add-in before knitting.
library(namer)
Some code chunks may not need a meaningful name and only require a label so that we can find them when an error is reported. Often there are multiple code chunks using the same CSV file. In the Addins menu there is an option called Labeled Rmd Chunks. By selecting this option, namer appends a label to each chunk.
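The naming advice above can be checked with a few lines of base R. This is only a sketch: is_safe_label() is a hypothetical helper written for this post, not part of namer or knitr, and the example labels are made up.

```r
# Hypothetical helper: TRUE when a label uses only letters, digits and dashes.
is_safe_label <- function(label) {
  grepl("^[A-Za-z0-9-]+$", label)
}

# Example labels: the last two contain characters the post suggests avoiding.
labels <- c("load-data", "plot1", "clean_data", "fig/height")
data.frame(label = labels, safe = is_safe_label(labels))
```

Labels with underscores or slashes are flagged as unsafe here, matching the suggestion to stick to letters, digits and dashes.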
There are a few options available for text formatting. We can make text bold or italic, format it as code, and include hyperlinks within the text.
**My name in bold**
*My name in Italic*
When there is code within the text, such as setwd(), we can format it using backticks ` ` so that the reader can easily tell the code apart from the surrounding text.
`setwd()`
Often we need to insert a hyperlink, or turn a specific word into a hyperlink, in our text. We can do this in an R Markdown file as well. For example, to create a link to the ABS website inside the text, we can use the code below:
[ABS Website] # This is the text of the link.
(http://abs.gov.au) # This is the address the link points to (the ABS website).
[ABS Website](http://abs.gov.au)
Sometimes we work on a script file that contains large and complex code, which later needs to be presented in a Markdown report. It is often easier to source the script than to embed all of the code directly in the .Rmd file. When sourcing code in an .Rmd file, the R script needs to be saved in the same directory or project folder. There are two main reasons why sourcing scripts is useful:
The code chunk below sources an R script file in R Markdown.
source("myfile")
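As a small illustration of what sourcing does, the sketch below creates a one-line script in a temporary folder and then sources it. The file name helpers.R and the function double_it() are invented for the example; in a real report the script would already exist in the project folder.

```r
# Create a small script in a temporary folder (a stand-in for a real project script).
script_path <- file.path(tempdir(), "helpers.R")
writeLines("double_it <- function(x) x * 2", script_path)

# Source the script; the objects it defines become available in the session.
source(script_path)

# Use a function that was defined inside the sourced script.
double_it(21)
```

After source() runs, later chunks in the same document can use anything the script defined, without the script's code appearing in the report.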
We may already know how to insert graphics into a Markdown document using knitr. First, we save the image in the R project folder, and then use the knitr code below to insert the image into the Markdown file.
knitr::include_graphics("Group.JPG")
What if the image is too large or too small and we need to resize it? The code below comes in handy for resizing the image.
```{r Group.png, out.width="30%"}
knitr::include_graphics("Group.JPG")
```
Note: This code works for HTML output only. When the output format is PDF, the code chunk is slightly different.
```{r Group.png, out.width="0.5\\textwidth"}
# Here 0.5 means the image takes 50% of the text width
knitr::include_graphics("Group.JPG")
```
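If the same document is knitted to both HTML and PDF, one possible approach is to pick the width in code rather than editing the chunk header each time. The sketch below uses knitr::is_latex_output(), which reports whether the document is being knitted to a LaTeX/PDF format. The variable name img_width and the 50% sizes are assumptions made for the example.

```r
# Choose an image width that suits the output format being knitted.
# knitr::is_latex_output() is TRUE when knitting to PDF and FALSE otherwise
# (including when this code is run interactively, outside of knitting).
img_width <- if (knitr::is_latex_output()) "0.5\\textwidth" else "50%"

img_width
```

Because chunk options are evaluated as R expressions, a later chunk could then use out.width=img_width in its header, provided this code ran in an earlier chunk.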
Often the tibble or data frame we load in R Markdown is not displayed as neatly as a table in Word or Excel. Let's have a look at the starwars dataset below.
knitr::include_graphics("Tibble.PNG")
Above is the default tibble format, which is hard to read. Both data frames and tibbles can be displayed better by using the df_print output option in the YAML header section.
First, let's insert an R code chunk and print a data frame by calling head(warpbreaks).
Next, go to the YAML header section and insert the df_print option to format the data frames and tibbles below.
title: "My R journey"
output:
  html_document:
    df_print: kable
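The same kable formatting can also be applied to a single table by calling knitr::kable() directly inside a chunk, which is handy when only one table needs it. This is a minimal sketch using the built-in warpbreaks data; the caption text and the name tbl are made up for the example.

```r
# Format the first rows of a built-in data frame as a Markdown table.
tbl <- knitr::kable(head(warpbreaks), caption = "First rows of warpbreaks")

# Printing the object shows the table markup that knitr places in the report.
tbl
```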
There is another df_print option called paged. It adds page navigation, which allows the user to browse through the table page by page.
title: "My R journey"
output:
  html_document:
    df_print: paged
head(warpbreaks)
starwars %>%
  filter(height > 100) %>%
  select(height, mass, birth_year, gender, homeworld, species) %>%
  arrange(desc(height)) %>%
  slice(1:50)