If you use R markdown files in RStudio and have a GitHub account, you are already ready to publish your papers/projects online. You can share the link to your target audience or use the website to showcase what you know.
When you have your content ready, putting them on the internet takes less than half an hour if you have only one paper, or about one hour if you use the same structure as this website (with minor revision). I assume that you know how to knit your R Markdown file to html and have the files on a repository on your GitHub account. Here is how.
There are three ways to post an article online (so far that I know) using GitHub.
The first method is to use the html viewer.
For example, I wrote a R Markdown file and knitted into a html file, which were then stored in my repository on GitHub (see graph - html file highlighted in yellow).
When I click on it, it has the html format and is not readable.
To render it to the html file from a browser that you see on your local computer, you go to the address box on the top of your GitHub page to copy the link
and then paste it to the the html viewer .
Click on the Preview button next to the link. It is now live online and you can share the link with others.
If you find the link too long to remember, you can use the free web service Bitly to shorten it. Mine is changed to the concise bitly link: http://bit.ly/33lhsF1 from the lengthy URL: http://htmlpreview.github.io/?https://github.com/cathydatascience/showprojectsonline/blob/master/howtopublish.html
Alternatively, you can use the RPubs which is integrated in your knitted html file from RStudio and is free. After knitting the R Markdown file, you will see the html file on your local computer browser. On the top right corner, click on the triangle next to the Publish button.
It is a two-step process. First step, you need to register a RPubs account or sign in if you already have one. You write the title of your document and give a short description (so that other may find your paper from search). You can see the example below (the highlight is my RPubs account name).
Once you are done and click the Continue button at the bottom, you will be directly to the published paper. For example, this paper is available at http://rpubs.com/cathydatascience/518692
You can choose to update your content when you make changes to the paper on your computer by clicking the Publish button from your knitted html file.
The third method uses GitHub to post one article from the master branch of a repository. This guide is available at … Alternatively, you can publish the article from the master/doc branch as well. My project paper is online at … I will show you how to do it both ways. The former puts all the files under the master branch while the latter puts the R markdown files under the master branch and the html files and additional folders under the master/doc branch.
RStudio integrates Git commands and links to GitHub (assuming you have Git installed in your computer). You can click through buttons without using the command window (I use GitBash and the list of commands is enclose in the appendix). The advantage of typing in commands is that you can create a copy of your work in a separate branch making sure things work fine before merging the branch with the master. However, you can always start from the beginning since there are not many files in the project.
You log on to your GitHub account and click the green button New to create a new repository. Give a name to your repository and a brief description. Keep the repo to be Public. I usually check the box to initialize a README.md file (the content will be same as what you put down on the description box) but it does not matter. In the left pulldown box, type in R and leave the license box on the right alone (or, you can change to your liking). Then, click the green button Create repository at the bottom.
You will be directed to the repo. On the right side of the top, click open the green button Clone or download and copy the URL.
We turn to RStudio now. Under the File tab, select New Project ….
Then, choose Version Control and then Git. You shall see the following window. Paste the copied URL from GitHub (which is pointed to your newly created repo). And give a name to your project and you are ready to click Create Project to get started. If you want to pick a directory that is different from the working directory, you can change it by clicking Browse….
When you are done, you shall see the Git tab from the top right pane of RStudio. A connection is established between your R project and GitHub.
You can go ahead and write your R markdown file, or open your previously created R markdown file from other folders (make sure that you save a copy in the project folder) and knitted to html file. You will have two new files on the top right pane under the Git tab. For me, there are three since I have an additional folder for the images. We will have Git remember the changes we have made, that is to commit the changes. You check the boxes of the two new files (for me, three) and then click Commit on the top panel just below Git.
You will see the changes we have made. Type in some comments to describe the change (for example, add a graph, or first commit etc.). Press the Commit button.
It returns with a summary message. Just close the window. You are back to the previous window. Press the Push button on the top right corner of the commit message box. Your files will be pushed to the GitHub repository.
Close the message window. You can check your GitHub account. The local files that you just commited in the local computer are now pushed to the repo.
You are now ready to see your html live online. Scroll up to your GitHub repo and click on Settings.
Scroll down to the section GitHub Pages. Under Source, choose master branch from source.
It will take a while for GitHub to publish your html file online. In the meanwhile, it looks like below. When the link in blue turns green after refreshing the page, you can click on it.
Oops! It shows you the README.md file content. It is an easy fix. You can append the name of your html file in the repository. For example, my file’s name is “howtopublish”. I will add howtopublish.html to “http://cathydatascience.github.io/showprojectsonline” and use the link: http://cathydatascience.github.io/showprojectsonline/howtopublish.html
Alternatively, I can change my file’s name to “index”. I will just change my R markdown file and knit it again before pushing them to my GitHub account. Now, click the link and it will work like a charm. By default, GitHub looks for index.html first and if it is absent, it will use README.md file for publishing sites.
When we’d like to separate the R markdown file and the html file (maybe some associated graphs etc.), we choose to publish from the master/doc folder. It is also the way to publish a collection of articles with some structures (the next section). The steps are almost identical with the two places that are different.
I will use my class project to demonstrate the steps. The URL of the repo is: https://github.com/cathydatascience/PracticalMachineLearning_Project
Before you knit your R markdown file, add the following under the title section.
knit: (function(input_file, encoding) {
out_dir <- 'docs';
rmarkdown::render(input_file,
encoding=encoding,
output_file=file.path(dirname(input_file), out_dir, 'index.html'))})
image3_12
Your index.html file will be knitted to the doc sub-folder in your local computer. Repeat the process of commit and push the files to your GitHub account from RStudio.
When you change the publishing from source, instead of choose master branch, you need to pick master/doc branch. When you click on the link after a while, you will see your html file. It is live online!
The website uses rmarkdown
package that provides a simple site generator. It is for putting together a few R markdown files into a website. All the R mardown files are under one directory with no subdirectories. The key advantage is that you can publish the website online with a least amount of time invested.
Before I show you how to do it, I’d like to suggest that you read about the examples from the section of rmarkdown’s site generator in the book R Markdown: the definitive guide. They help you get an idea of the components of the website structure and how you can change them. Also, build a simple website to see how it works. Here is the guide that I used before building the website.
When you have the R markdown files (content ready to generate the html files) ready, you need to put togeher a list of instructions to tell rmarkdown
how you’d like to arrange the files. It is a **_site.yml** file, which can be written in a text editor. Feel free to make changes to content and see how those affect the layout.
If you want to publish from the master branch on GitHub, use . All the html files will be put togehter with the rmd files under the master branch. My website produced this way is available at: https://cathydatascience.github.io/DataScienceJHU_PracticalMachineLearning/
If you publish from the master/doc branch, use . The html files will be put separate in the /doc folder. The duplicate website published from master/doc branch is at: https://cathydatascience.github.io/duplicate/
It is important to have an index page in the **_site.yml** file. Otherwise, the rmarkdown
command will not work. Equally important is that you should not capitalize the Index.html file - use index.html. Otherwise, GitHub will not find the file for publication. The bottom line is if you are able to run the site in your local computer, you shall be able to publish online.
Last but not least, the GitBash command lines work faster when you have many files to be committed, and pushed to GitHub. The commands are enclosed in Appendix.
When you start a new project on GitHub, you will navigate to your folder which is to link to the GitHub repository and use the following commands on GitBash.
Start a new project, you will type: > git init
After you have your files ready (written in RStudio), you will commit them on your Git (local computer): > git add . #there are many options like -A
> git commit -m "msg"
Then, you will want to upload the files to your GitHub account. If you have not linked your GitHub account to the local folder, you will type: > git remote add origin URL-GitHub-repo
If you have created a new R Project from RStudio through Version Control and Git (as shown in the article), you can skip the step. You also see (master)
after you navigate to your project folder.
To store your files to GitHub, > git push -u origin master #there are options that follow and you can google
The above are the basics. If you want to create a copy of what you have so far, experiment with it and only put back the stuff you are happy with to the original, you can create a branch different from the master branch (you are typically at for a new repo).
>git branch yourpick #create yourpick branch
>git checkout yourpick #switch from master to yourpick branch
You can continue to create files in your local computer, commit to Git and then push to GitHub. Note that you are operating under the yourpick
branch. When you feel happy with what you have and are ready to bring the files to master
branch and then delete yourpick
branch. You type the following commands.
> git push -u origin yourpick #store files under yourpick branch
On GitHub, you can create a pull request. Since you own the repo (no collaborator yet), you will be able to just merge pull request by clicking the button and then confirm the merge. After see the files are now under master
branch, you are ready to delete the yourpick
branch.
> git checkout master #change to master branch in your local computer
> git pull #copy the files to your local folder
> git branch -D yourpick #delete yourpick branch
To check the status of your branch and your commits, here is the list of commands:
> git status
> git log --online #see the history of commits
> git log --online --graph --all # see commit history in a graph
> git diff --stat -summary master..yourpick #see changes between the two branches