Get the tools

GitHub First

- Download Git for Mac: https://git-scm.com/download/mac
- Download Git for Windows: https://git-scm.com/download/win
- Set up a free account here: https://github.com/
- For a private account you need $7 per month.
- Create a repository: go to the repositories menu tab at the top and choose “New”.
- Find the homework STA758 repository here: https://github.com/kvond/STATS/tree/master/studenthomework

R Markdown

This is an R Markdown presentation. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document.

What is R Markdown?

R Markdown allows you to create documents that serve as a neat record of your analysis. In the world of reproducible research, we want other researchers to easily understand what we did in our analysis; otherwise nobody can be certain that we analysed our data properly. You might create an R Markdown document as an appendix to a paper or project assignment, upload it to an online repository such as GitHub, or simply keep it as a personal record so you can quickly look back at your code and see what you did. R Markdown presents your code alongside its output (graphs, tables, etc.) with conventional text to explain it, a bit like a notebook.

Notes in this Rmd are from https://ourcodingclub.github.io/2016/11/24/rmarkdown-1.html

R Output

Put your cursor in the code chunk and select Run from the menu tab or use the green arrow within the code chunk on the upper right.

summary(cars)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00

Plot

plot(pressure)

Hiding code chunks

If you don’t want the code of a particular code chunk to appear in the final document, but still want to show the output (e.g. a plot), then you can include echo = FALSE in the code chunk instructions.

A <- c("a", "a", "b", "b")
B <- c(5, 10, 15, 20)
dataframe <- data.frame(A, B)

Here echo is set to TRUE, which is the default, so you should see the R code chunk. I did not add the command to plot the dataframe, so you won’t see the plot until the code runs again, this time with echo set to FALSE.

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot. You should see only the plot and not the code chunk.
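The effect of echo can also be checked from a script. The sketch below is one way to do it, assuming the knitr package is installed: it knits the same one-line chunk twice from a character vector and tests whether the source line survives in the output. The helper name knit_chunk and the use of summary(cars) are illustrative choices, not part of the original slides.

```r
library(knitr)

# Hypothetical helper: knit a one-chunk document with the given chunk options
# and return the resulting markdown as a single string.
knit_chunk <- function(options) {
  fence <- strrep("`", 3)  # three backticks, assembled here for readability
  doc <- c(
    paste0(fence, "{r, ", options, "}"),
    "summary(cars)",
    fence
  )
  knit(text = doc, quiet = TRUE)
}

shown  <- knit_chunk("echo = TRUE")
hidden <- knit_chunk("echo = FALSE")

# The source line appears only when echo = TRUE; the output appears either way.
code_shown  <- grepl("summary(cars)", shown,  fixed = TRUE)
code_hidden <- grepl("summary(cars)", hidden, fixed = TRUE)
output_kept <- grepl("Median", hidden, fixed = TRUE)
```

Printing shown and hidden with cat() makes the difference easy to compare by eye.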

Create your own Markdown file

To create a new R Markdown file (.Rmd), select File -> New File -> R Markdown… in RStudio, then choose the file type you want to create. For now we will focus on an .html document, which can be easily converted to other file types later.

For now, save the .Rmd file on your desktop. Later you will create a GitHub account and upload it there.

Conventions

- Title: At the top of any R Markdown script is a YAML header section enclosed by ---. By default this includes a title, author, date and the file type you want to output to. Many other options are available for different functions and formatting. See the references for more info.
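As an illustration of the header described above, the following sketch writes a minimal .Rmd file from R so the layout is visible line by line. The title, author, date, and file name are placeholders (assumptions for this example); html_document is the output type for an HTML file.

```r
# Lines of a minimal R Markdown file: YAML header first, then the body.
rmd_lines <- c(
  "---",
  'title: "My first R Markdown file"',  # placeholder title
  'author: "Your Name"',                # placeholder author
  'date: "2019-01-01"',                 # placeholder date
  "output: html_document",
  "---",
  "",
  "Text below the header is ordinary Markdown."
)

# Write to a temporary folder; swap in your own path, e.g. your desktop.
rmd_path <- file.path(tempdir(), "example.Rmd")
writeLines(rmd_lines, rmd_path)

# Print the file to check that the header is enclosed by the two --- lines.
cat(readLines(rmd_path), sep = "\n")
```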

Knitting your file

To compile your .Rmd file into a .html document, press the Knit button in the taskbar.

By default, RStudio opens a separate preview window to display the output of your .Rmd file. If you want the output to be displayed in the Viewer window in RStudio (the same window where you would see plotted figures / packages / file paths), select “View in Pane” from the drop down menu that appears when you click on the Knit button in the taskbar, or in the Settings gear icon drop down menu next to the Knit button.

A preview appears, and a .html file is also saved to the same folder where you saved your .Rmd file.
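The Knit button has a scripted counterpart. A minimal sketch, assuming the rmarkdown package and pandoc are installed: rmarkdown::render() compiles an .Rmd file and, by default, writes the output next to the input. The file name and contents here are placeholders.

```r
# Create a small placeholder .Rmd file in a temporary folder.
rmd_path <- file.path(tempdir(), "knit-example.Rmd")
writeLines(c(
  "---",
  'title: "Knit example"',
  "output: html_document",
  "---",
  "",
  "This file was compiled from a script."
), rmd_path)

# Render only if rmarkdown and pandoc can be found on this machine.
html_path <- NA_character_
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_path <- rmarkdown::render(rmd_path, quiet = TRUE)
}

# When rendering ran, this is the path of the .html file that was written.
print(html_path)
```

Scripting the step is mostly useful when several files need to be compiled at once; for a single document the Knit button does the same job.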

Code Chunks

Below the YAML header is the space where you will write your code, accompanying explanation and any outputs. Code that is included in your .Rmd document should be enclosed by three backticks ``` (grave accents). These enclosed blocks are known as code chunks and look like this:

norm <- rnorm(100, mean = 0, sd = 1)

In the .Rmd source, the opening backticks are followed by curly braces, as in {r}. Inside the curly braces is a space where you can assign rules for that code chunk. The r says that the code is R code. We’ll get to some other curly brace rules later.
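Because rendered output hides the chunk delimiters, it can help to see a chunk as it sits in the .Rmd source. The sketch below prints one from R; the label normal-sample is a made-up name for illustration, and the fence is assembled with strrep() only so that it prints cleanly.

```r
fence <- strrep("`", 3)  # the three backticks that open and close a chunk

chunk <- c(
  paste0(fence, "{r normal-sample}"),      # language (r) plus an optional label
  "norm <- rnorm(100, mean = 0, sd = 1)",  # the R code to run
  fence                                    # closing backticks
)

# Show the chunk exactly as it would be typed into the .Rmd file.
cat(chunk, sep = "\n")
```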

Data Manipulation Homework

All data manipulation assignments should be saved as R Markdown files and uploaded to the class homework GitHub repository. These links will be shared on the discussion board for peer review.

Tweaks

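- In some cases, when you load packages into RStudio, various warning messages such as “Warning: package ‘dplyr’ was built under R version 3.4.4” might appear. If you do not want these warning messages to appear, you can use warning = FALSE in the chunk instructions. Startup messages, such as the masking notes that dplyr prints when it is attached, are controlled separately by message = FALSE.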

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
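One common way to apply these settings to every chunk, rather than one at a time, is a setup chunk at the top of the file. A minimal sketch, assuming knitr is installed; whether to silence warnings for the whole document is a judgment call, since it can also hide warnings you would want to see. The dplyr call is guarded in case the package is not installed.

```r
# Set defaults for all later chunks: hide warnings and messages in the output.
knitr::opts_chunk$set(warning = FALSE, message = FALSE)

# For a single library() call, base R can silence the startup text directly.
if (requireNamespace("dplyr", quietly = TRUE)) {
  suppressPackageStartupMessages(library(dplyr))
}

# Confirm the defaults that are now in effect.
knitr::opts_chunk$get(c("warning", "message"))
```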

- R Markdown doesn’t pay attention to anything you have loaded in other R scripts. You MUST load all objects and packages in the R Markdown script.
- More formatting options are in the reference text.

Inserting graphs

By default, R Markdown will place graphs by maximising their height, while keeping them within the margins of the page and maintaining aspect ratio. If you have a particularly tall figure, this can mean a really huge graph. In the following example we modify the dimensions of the figure we created above. To manually set the figure dimensions, you can insert an instruction into the curly braces, using the fig.width and fig.height chunk options (both measured in inches):

# A is stored as a factor so that plot() draws one box of B values per group
A <- factor(c("a", "a", "b", "b"))
B <- c(5, 10, 15, 20)
dataframe <- data.frame(A, B)
plot(dataframe)
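Figure size can also be given a document-wide default instead of being repeated in each chunk header. The following is a sketch, assuming knitr is installed; the 4 by 3 inch size is an arbitrary example value.

```r
# Default size, in inches, for figures produced by later chunks.
# The per-chunk form goes in the chunk header: {r, fig.width = 4, fig.height = 3}
knitr::opts_chunk$set(fig.width = 4, fig.height = 3)

A <- factor(c("a", "a", "b", "b"))  # factor, so plot() draws one box per group
B <- c(5, 10, 15, 20)
dataframe <- data.frame(A, B)

plot(dataframe)
```

A value set in an individual chunk header overrides the document-wide default for that chunk only.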

Formatting dataframes

kable() function from knitr package

The most aesthetically pleasing and simple table formatting function I have found is kable() in the knitr package. The first argument tells kable to make a table out of the object dataframe, and digits = 2 says that numbers should be rounded to two decimal places. Remember to load the knitr package in your .Rmd file as well:

library(knitr)
kable(dataframe, digits = 2)
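A minimal runnable sketch of this call, assuming the knitr package is installed. The values in column B are changed from the earlier example (an assumption for illustration) so that the rounding has a visible effect.

```r
library(knitr)

A <- c("a", "a", "b", "b")
B <- c(5.123, 10.456, 15.789, 20)
dataframe <- data.frame(A, B)

# digits = 2 rounds numeric columns to two decimal places.
tbl <- kable(dataframe, digits = 2)
tbl
```

Run at the console, this prints the table as plain Markdown text; inside a knitted document the same call produces a formatted table.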

References for slides: Markdown

- Most of the R Markdown information comes from: https://ourcodingclub.github.io/2016/11/24/rmarkdown-1.html
- Code for this tutorial is found here: https://github.com/ourcodingclub/CC-2-RMarkdown
- GitHub sites are important for getting data and storing your R Markdown files.
- The cheat sheet is here: https://github.com/ourcodingclub/CC-2-RMarkdown/blob/master/rmarkdown-cheatsheet.pdf

References for GitHub

- Basic use instructions: https://git-scm.com/docs
- A more involved (and clearer) tutorial: https://ourcodingclub.github.io/2017/02/27/git.html Note that the Mac issues seem to have been resolved.
- A similar tutorial: https://product.hubspot.com/blog/git-and-github-tutorial-for-beginners Pick the one that resonates with you.
- A deeper tutorial group that you can join on GitHub: https://lab.github.com/githubtraining/communicating-using-markdown