This is a tutorial for writing reports with the R package
papaja.
But why write in R in the first place? Why use
papaja?
Advantages:
Formatting looks beautiful without much effort needed.
Conform to APA style (6th version). papaja is short
for Preparing APA Journal Articles.
No need to go back and forth between R and Word to update figures and results.
Results are more reproducible with all data and analysis codes available in the document.
Disadvantages:
Learning curve is steeper at the start.
Need latex knowledge if you want to achieve something more complicated.
No grammar checker (Grammarly I’m looking at you), spelling corrector is not great.
With Github, collaboration and change tracking are possible. But, I haven’t found a nice way to share and receive feedback.
Hope this gives you some motivation to learn papaja!
Below are the learning objectives for this tutorial:
1. Install the package, and apply the template.
2. Know what YAML is, and apply basic changes.
3. Create in-text citations and bibliography with the help of Zotero/.bib file.
4. Understand how to print variable values in text.
To follow this tutorial, you’ll need:
R and R Studio, for sure. I recommend updating R Studio to a version greater than v1.4, as the visual markdown editor is very useful.
R Markdown
Zotero (optional)
Some basic knowledge of the R Studio interface
Let’s start by creating a new project and installing the required packages.
Create a new project: File –> New Project –> New Directory –> New Project.
Let’s name this project papaja_tutorial, and put it in a
place you know of.
To knit markdown files into PDFs, you’ll need the
tinytex package. Install tinytex by running
the following line in the R console:
if(!requireNamespace("tinytex", quietly = TRUE)) install.packages("tinytex")
Troubleshooting: if you see the error message “/usr/local/bin not writable” like I did, run these lines in the terminal, not in R console (reference):
[ -e /usr/local/bin ] || sudo mkdir -p /usr/local/bin
sudo chown -R `whoami`:admin /usr/local/bin
~/Library/TinyTeX/bin/*/tlmgr path add
Alternatively, you may also consider MikTeX for Windows, MacTeX for Mac, or TeX Live for Linux. See the papaja manual for details.
Run these lines in the console to install papaja:
if(!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")
remotes::install_github("crsh/papaja")
Now, let’s explore the APA-style template that papaja
has offered!
Create a new markdown using the template: File –> New File –> R Markdown –> From Template –> scroll down to find “APA-style manuscript” –> click “OK”
Let’s take a look at the markdown we’ve just created. If you are using the new version of R Studio with a visual markdown editor, switch to the source editor at this point.
At the top of the markdown, from Line 1 to 65, we can see the general information about this paper and some formatting parameters. This section is called YAML header and it controls the style of the paper. YAML header is always enclosed in triple dashes “—” on either side.
We’ll come back to YAML later! But now let’s move on to the next chunk.
library("papaja")
r_refs("r-references.bib")
The above chunk loads the package papaja, and generates
a reference file called r-references.bib for R and all R packages used
in this paper. Let’s add another package here too to see how it looks.
Here, I’m adding tidyverse:
library("papaja")
library("tidyverse")
r_refs("r-references.bib")
Then, the document is separated into 3 sections: Methods, Results and Discussion. The # symbol represents that the text in that line is a heading. One # is first-level heading, two #s is second-level heading, etc. The lowest, therefore the smallest heading, has five #s.
So, in this template, participants, material, procedure, and data analysis are all subheadings under Methods.
It seems like the template is missing Introduction and Background. Let’s add them to the document.
Let’s also add two subheadings, Data A and Data B under Material. Now, it will look like this:
Notice that there is one sentence under the data analysis section. It will automatically generate in-text citations for R and all R packages we’ve used in the document.
The final section is references, starting on a new page. There are
some weird-looking codes below, and don’t worry about them. You won’t
write anything here, because as we’ll later see, papaja
helps us auto-generate references.
Let’s knit the file and see what it looks like!
Go to Knit -> Knit to apa6_pdf. Then it will prompt you to save
this markdown before knitting. Name this file
papaja_testing. It should automatically place the document
inside the directory for this project. If not, you can manually find the
directory and save this document there.
After a while, a PDF reader window should pop up and show the output.
It does look like a paper! The first page is the title page, showing title, authors, affiliations, note, and running head. The next page dedicates to the abstract.
The body starts from the third page. Notice how the first-level
headings (e.g. Methods) look different from the second-level headings
(e.g. Participants) and third-level headings (e.g. Data A). The data
analysis section lists a lot of R packages, because
tidyverse is a collection of libraries.
Lastly, the References section auto-generates all the references used in the paper.
Also, your working directory should now have a new file: r-references.bib. Click on it to open it in R Studio. It consists of citation information for R and all packages used.
Let’s change a few things to make it look even better:
All of the above tasks happen in the YAML header, so let’s go back to Line 1.
Change the titles by replacing the text within the quotation marks.
Let’s put
Creating APA-style papers with the R package papaja as the
title, and Writing with papaja as the short title.
Similarly, replace “First author” with your name. You can remove
address, email and role. Change corresponding to no. Remove
the information of the second author. Change the institution to
Stanford University, and remove the second id and
institution.
Delete the whole author note section.
Scroll down until you see a list of parameters with yeses and nos.
Now, only the linenumbers parameter is turned on. Turn it
off by changing yes to no.
At the bottom of the YAML header, you should see
output: papaja::apa6_pdf. To create a word document output,
change it to output: papaja::apa6_docx.
Let’s create some more content in the body! We’ll learn how to create references and print variable values.
At this point, we only have references for the R packages. Now, let’s add citations and references for the literature.
If your R Studio is updated and you have a zotero, references couldn’t be easier! Here are the steps.
bibliography: references.bib. However, if you look
a few lines above, there’s already another bibliography:
bibliography: "r-references.bib" . Markdown will get
confused if there are two same parameters specifying different
things.bibliography: ["references.bib", "r-references.bib"] .Just now, we only covered one type of in-text citation: author(year). If you need other formats, here’s a reference from papaja manual:
| Citation type | Syntax | Rendered citation |
|---|---|---|
| Citation within parentheses | [@james_1890] |
(James, 1890) |
| Multiple citations | [@james_1890; @bem_2011] |
(Bem, 2011; James, 1890) |
| In-text citations | @james_1890 |
James (1890) |
| Year only | [-@bem_2011] |
(2011) |
If you don’t have Zotero, you can search the paper by using its DOI.
Try this:10.1093/jn/33.5.491.
Without visual markdown editor, you can manually create a .bib file to store your bibliography. Let’s create a .bib file in R Studio.
@article{isnandar2022effect,
title={The effect of an 8\% cocoa bean extract gel on the healing of alveolar osteitis following tooth extraction in Wistar rats},
author={Isnandar, I and Hanafiah, Olivia Avriyanti and Lubis, Muhammad Fauzan and Lubis, Lokot Donna and Pratiwi, Adzimatinur and Erlangga, Yeheskiel Satria Yoga},
journal={Dental Journal (Majalah Kedokteran Gigi)},
volume={55},
number={1},
pages={7--12},
year={2022}
}
@article{duailibi2004bioengineered,
title={Bioengineered teeth from cultured rat tooth bud cells},
author={Duailibi, MT and Duailibi, SE and Young, CS and Bartlett, JD and Vacanti, JP and Yelick, PC},
journal={Journal of dental research},
volume={83},
number={7},
pages={523--528},
year={2004},
publisher={SAGE Publications}
}
Save the file as manual-reference.bib.
Update YAML: in the bibliography parameter, add
`manual-reference.bib` to the list.
To cite a paper from the .bib file, type @ and then the short
name of the paper, which is on the first line of BibTex. For the first
paper, it’s isnandar2022effect; the name for the second
paper is duailibi2004bioengineered.
When you do calculations in R and want to put the results into the report, don’t manually enter the values! Because first, you can make typos. Also, if your data is modified and the calculated values change, you have to go back to the report and update the numbers there. Don’t do that!
Instead, you can print the values by referencing the variables in R.
Let’s start from the easiest. Say you have a single value x:
x <- 5
You can use `r x` to print the value of x without
actually writing down the number.
You can replace x in the expression with any
calculations. For example, if x is a vector and you want to report its
mean. Copy-paste the following line to create x:
x <- c(1, 4, 6, 7, 10)
The expression `r mean(x)` will automatically calculate
the mean of x and display the value.
Type this in your markdown: The mean of x is
`r mean(x)`.
Another example that could be useful is reporting the regression coefficients. Let’s first create an outcome variable. First, run this line of code:
y <- c(5, 8, 6, 10, 15)
Now let’s run a regression and obtain the coefficients.
mod <- lm(y ~ x)
coef_mod <- coefficients(mod) # extract coefficients
coef_mod
## (Intercept) x
## 3.026549 1.030973
Now, copy and paste this sentence into the markdown:
The predicted model is $y =
`r coef_mod[1]`+`r coef_mod[2]` x$.
You should get something like this:
\(y = 3.0265487 + 1.0309735x\)
And if you make changes to the data now, such as:
x <- x + 1 # add 1 to x
mod <- lm(y ~ x)
coef_mod <- coefficients(mod)
coef_mod
## (Intercept) x
## 1.995575 1.030973
The same expression will give you an updated result: \(y = 1.9955752 + 1.0309735x\).
This reporting feature is not limited to the papaja
template! You can use it in any R Markdown files.
The tutorial in .rmd can be found here: https://github.com/Shengqi-Iris-Zhong/papaja_tutorial.git
If you want to learn more about writing and formatting papers in R, here are some resources:
papaja manual: http://frederikaust.com/papaja_man/
visual markdown editor: https://rstudio.github.io/visual-markdown-editing/