R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When I click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. I can embed an R code chunk like this:

summary(cars)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00

Including Plots

I can also embed plots, for example:

par(mfrow = c(1,2))
hist(cars$speed, xlab = 'Speed', main = 'Histogram of Speed')
hist(cars$dist, xlab = 'Distance', main = 'Histogram of Distance')

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.

DataCamp Practice

This is my DataCamp practice. I am practicing in the hope of becoming great.
Notice that a single * before and after a word or sentence puts it in italics.
Also, notice that two stars (i.e. **) before and after a word or sentence put it in bold.
Observe that a backtick (`) before and after a word or sentence formats it as inline code (highlighted).

I can turn a word into a link by surrounding it in square brackets and then placing the link right after it in parentheses, with no space in between, like this:
[Jad’s website](http://chessthegameofkings.blogspot.com/)

To create titles and headers, use leading hash signs (#). The number of hash signs determines the header’s level:

First level header

(Make sure there is at least one blank line between headers and other content, because headers take up a lot of vertical space.)

Second level header

Third level header

Lists in R Markdown

I can make bullet points by adding a * followed by a space before the sentence or word.

I can also make an ordered list by placing each new item on a new line after a number followed by a period and a space, like this:

  1. ordered item 1
  2. ordered item 2
  3. item 3

If I wanted to make my first two points bold and my third italicized then

Leaving blank lines is usually a good idea between bullets. “Whatever I like!”

LaTeX Equations

Surrounding an equation with two $ signs on each side displays it on its own line:

\[E = M * C^2\]

To embed an equation inline, I can surround it with a single pair of dollar signs:

\(E = M * C^2\)

Also, I can use all the COOL LaTeX math symbols in my equations. Check out this website: [LaTeX Mathematics](https://en.wikibooks.org/wiki/LaTeX/Mathematics)
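
As a small illustration of those symbols, here is a sketch of how a fraction and a summation could be typed between double dollar signs. The formula is the ordinary sample mean, chosen only as an example; it does not come from the original notes.

```latex
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
```

Here \bar{x} draws the bar over x, \frac{1}{n} builds the fraction, and \sum_{i=1}^{n} places the limits below and above the summation sign.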

Cleaning my data

I will clean some data using the mtcars data set and the dplyr package.

# Load mtcars data
data(mtcars)
# Attach dplyr package
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Find the structure of the data, the variable names, the head and tail rows, and a summary of the variables (any useful function I can think of).

# data structure
str(mtcars)
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
# names of variables
names(mtcars)
##  [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear"
## [11] "carb"
# head up to 6 rows and tails up to 3 rows
# heads
head(mtcars, 6)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
# tails
tail(mtcars,3)
##                mpg cyl disp  hp drat   wt qsec vs am gear carb
## Ferrari Dino  19.7   6  145 175 3.62 2.77 15.5  0  1    5    6
## Maserati Bora 15.0   8  301 335 3.54 3.57 14.6  0  1    5    8
## Volvo 142E    21.4   4  121 109 4.11 2.78 18.6  1  1    4    2
# sapply: Find summary for all variables in data set
options(digits = 4)         # display numbers with up to 4 significant digits
sapply(mtcars[,1:11],summary)
##           mpg   cyl  disp    hp  drat    wt  qsec     vs     am  gear
## Min.    10.40 4.000  71.1  52.0 2.760 1.513 14.50 0.0000 0.0000 3.000
## 1st Qu. 15.43 4.000 120.8  96.5 3.080 2.581 16.89 0.0000 0.0000 3.000
## Median  19.20 6.000 196.3 123.0 3.695 3.325 17.71 0.0000 0.0000 4.000
## Mean    20.09 6.188 230.7 146.7 3.597 3.217 17.85 0.4375 0.4062 3.688
## 3rd Qu. 22.80 8.000 326.0 180.0 3.920 3.610 18.90 1.0000 1.0000 4.000
## Max.    33.90 8.000 472.0 335.0 4.930 5.424 22.90 1.0000 1.0000 5.000
##          carb
## Min.    1.000
## 1st Qu. 2.000
## Median  2.000
## Mean    2.812
## 3rd Qu. 4.000
## Max.    8.000
# Attach data set and draw visualizations
attach(mtcars)
# MPG vs Weight
# plot() is called for its side effect and returns NULL, so there is nothing to assign
plot(wt, mpg, xlab = "Weight",
     ylab = "Miles per Gallon",
     main = "MPG v Weight")
# Table of frequencies
# Convert gear variables to factors
gear <- factor(gear, levels = c(3,4,5)
                   , labels = c('3','4','5'))
# Create a frequency, relative frequency and cumulative relative frequency table for gear
gear.freq <- data.frame(table(gear))
gear_rel.freq <- (gear.freq$Freq / sum(gear.freq$Freq) * 100)
gear_cum.freq <- cumsum(gear_rel.freq)

table_gear <- cbind(gear.freq, gear_rel.freq, gear_cum.freq)
table_gear
##   gear Freq gear_rel.freq gear_cum.freq
## 1    3   15         46.88         46.88
## 2    4   12         37.50         84.38
## 3    5    5         15.62        100.00
# Use ggplot2 to draw a histogram
# Load the ggplot2 package
library(ggplot2)
## 
## Attaching package: 'ggplot2'
## The following object is masked from 'mtcars':
## 
##     mpg

# MPG Histogram
ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(breaks = seq(10, 40, by = 2), col = "black", fill = "red") +
  labs(title = "Histogram of MPG", x = "MPG", y = "Count")

# Filter by weight and group by gear using dplyr
gear_table <- mtcars %>%
              filter(wt > 2.581) %>%
              group_by(gear) %>%
              summarize(mpg = mean(mpg, na.rm = TRUE),
                       hp = mean(hp, na.rm = TRUE),
                        drat = mean(drat, na.rm = TRUE))
gear_table   
## # A tibble: 3 x 4
##    gear   mpg    hp  drat
##   <dbl> <dbl> <dbl> <dbl>
## 1     3  15.7  182.  3.09
## 2     4  21.1  105.  3.91
## 3     5  16.8  258   3.79
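
The masking message above is a side effect of attach(mtcars). As a hedged alternative, the same frequency table for gear can be built by referring to the column directly, so nothing needs to be attached. The object names gear_counts and gear_table_alt are made up for this sketch.

```r
# Sketch: frequency table for gear without attach().
# gear_counts and gear_table_alt are illustrative names, not from the notes above.
gear_counts <- table(gear = factor(mtcars$gear, levels = c(3, 4, 5)))

# data.frame() on a table gives one row per level, with columns gear and Freq
gear_table_alt <- data.frame(gear_counts)

# Relative frequency (in percent) and its running total
gear_table_alt$rel_freq <- gear_table_alt$Freq / sum(gear_table_alt$Freq) * 100
gear_table_alt$cum_freq <- cumsum(gear_table_alt$rel_freq)

gear_table_alt
```

Keeping everything in one data frame also avoids the separate cbind step.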

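Sometimes R will generate errors, warnings, and messages. To keep warnings and messages out of the report, I can set warning = FALSE and message = FALSE in the chunk header; error = FALSE makes knitting stop at an error instead of printing the error in the report. For example: ```{r warning = FALSE, error = FALSE, message = FALSE}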

library(ggvis)

The above sets warning = FALSE and error = FALSE. If we don’t set those to false, we get

I can use echo = FALSE if I do not want the code to show in the report. (The code will run, but only the results will show.)

## [1] "factor"

I can use eval = FALSE if I want the code not to run and the results not to show, but the code will show.

gear_factor <- factor(gear, levels = c(3,4,5)
                          , labels = c('3rd','4th','5th'))
class(gear_factor)

I can use results = 'hide' if I want the code to run and show but the results not to show.

gear_factor <- factor(gear, levels = c(3,4,5)
                          , labels = c('3rd', '4th', '5th'))
class(gear_factor)

The fig.height and fig.width chunk options control the size of figures, in inches.
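
Instead of repeating these options in every chunk header, they can be set once for the whole document. A minimal sketch, assuming it is placed in a chunk near the top of the .Rmd file; the particular figure size (6 by 4 inches) is only an illustrative value.

```r
library(knitr)

# Sketch: set document-wide defaults for the chunk options discussed above.
# Individual chunks can still override any of these in their own header.
opts_chunk$set(
  warning    = FALSE,  # do not print warnings in the report
  message    = FALSE,  # do not print messages (e.g. package start-up notes)
  fig.width  = 6,      # figure width in inches (illustrative value)
  fig.height = 4       # figure height in inches (illustrative value)
)

# Check what a chunk will now use by default
opts_chunk$get("fig.width")
```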

I can embed R code into the text of my document with inline code: a backtick, the letter r, a space, the R expression, and a closing backtick. For example, The factorial of 4 is 24.

Another example:

My name is Jad and my age is 33.
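
The values in those sentences are ordinary R results that knitr drops into the text. A sketch of the expressions that could sit inside the inline code; my_name and my_age are hypothetical variable names, since the original source is not shown here.

```r
# The expression behind "The factorial of 4 is 24."
factorial(4)

# Hypothetical variables that inline code could refer to
my_name <- "Jad"
my_age  <- 33L

# The same sentence assembled in plain R, to show what the inline code produces
sentence <- sprintf("My name is %s and my age is %d.", my_name, my_age)
sentence
```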

Labeling

I can label code snippets like so:

2 + 2

See that the output will appear in the .Rmd document where I am typing this
but will not show in the final report.

Why is defining labels great? Because knitr provides the ref.label option
to refer to previously defined and labeled code chunks. If used correctly, knitr
will copy the code of the chunk I referred to and repeat it in the current code chunk.
This feature enables me to separate R code and R output in the output document, without code duplication. Ex:

## [1] 4
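
To see ref.label at work outside a full report, here is a sketch that knits a tiny two-chunk document held in a character vector. The chunk labels add and add-result are made up for the illustration, and the fence string is built with strrep() only so that the example itself contains no literal chunk delimiters.

```r
library(knitr)

fence <- strrep("`", 3)  # three backticks, the chunk delimiter

# A tiny document: the first chunk shows its code without running it,
# the second chunk reuses that code via ref.label and shows only the result.
rmd <- c(
  paste0(fence, "{r add, eval = FALSE}"),
  "2 + 2",
  fence,
  "",
  paste0(fence, "{r add-result, ref.label = 'add', echo = FALSE}"),
  fence
)

# knit(text = ...) returns the knitted Markdown instead of writing a file
out <- knit(text = rmd, quiet = TRUE)
cat(out)
```

The code 2 + 2 is written once, in the chunk labeled add; the second chunk has an empty body and borrows it.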

Pandoc

I can transform an R Markdown file into a finished format using the Pandoc program. I can render R Markdown files in HTML, PDF, Word, or slideshow formats. I can control the render process by providing a YAML header, which is the block at the top of my R Markdown script. In it, I see title, author, date and output:

---
title:
author:
date:
output:
---

I can change the output to any of the following:

* html_document renders it as an HTML document
* pdf_document renders it as a PDF document
* word_document renders it as a Word document
* beamer_presentation renders it as a beamer slide show (PDF format for slides)
* slidy_presentation and ioslides_presentation render the document as an HTML slide show
* md_document will create a Markdown file

Although the preview depends on the output type I set, output files of different kinds will be saved in the working directory. So, if the output in my YAML is set to output: pdf_document, when I knit, the preview that pops up is a PDF document, but the files saved in my working directory will include an HTML file, a PDF file, and other files.

Render function

I can use rmarkdown::render. Here’s how:

rmarkdown::render("doc.rmd", "html_document")

I can also give render a vector of output formats:

rmarkdown::render("doc.rmd", c("html_document", "pdf_document"))
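
A slightly fuller sketch of rendering to several formats in one call. It writes a throwaway .Rmd file so it does not depend on doc.rmd existing, and it only renders when pandoc can be found. The helper name render_all and the choice of formats are assumptions made for this example.

```r
library(rmarkdown)

# render_all() is a hypothetical helper name for this sketch.
# It renders one input file to several formats and returns the output paths.
render_all <- function(input, formats = c("html_document", "md_document")) {
  render(input, output_format = formats, output_dir = tempdir(), quiet = TRUE)
}

# Build a small temporary .Rmd file to render
demo_rmd <- tempfile(fileext = ".Rmd")
writeLines(c("---", "title: \"Demo\"", "---", "", "Some text."), demo_rmd)

# Rendering needs pandoc, so only run it when pandoc is available
if (pandoc_available()) {
  outputs <- render_all(demo_rmd)
  print(basename(outputs))
}
```

The formats here avoid pdf_document on purpose, because PDF output additionally needs a LaTeX installation.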

Inserting Additional slides

When I am using presentation formats, R Markdown will start a new slide at each first or second level header in my document. I can insert additional slide breaks with Markdown’s horizontal rule syntax (i.e. *** will move what comes after it to a new slide). If the output file is a PDF or another non-slide format, I will get a horizontal line between what is before the three asterisks and what is after them.

Everywhere I add these three asterisks in my text, pandoc will create a new slide.

Specify Knitr and Pandoc Options

The link below gives great syntax for a lot of things I want to have in my report. In it, I can see how to create awesome slides.

[R Markdown: Reference Guide](http://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf)

Each R Markdown output template is a collection of knitr and pandoc options. I can customize my output by overwriting the default options that come with the template.

For example, the YAML header below overwrites the default code highlight style of the pdf_document template to create a document that uses the zenburn style:

---
title: "Demo"
output:
  pdf_document:
    highlight: zenburn
---

The YAML header below overwrites the default bootstrap CSS theme of the html_document template.

---
title: "Demo"
output:
  html_document:
    theme: spacelab
---

Pay close attention to the indentation of the options inside the YAML header; if I do not do this correctly, pandoc will not correctly understand my specifications. As an example, notice the difference between only specifying the output document to be HTML:

---
output: html_document
---

and specifying an HTML output document with a different theme:

---
output:
  html_document:
    theme: spacelab
---
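
The same options can also be supplied from R rather than from the YAML header, by building an output format object and handing it to rmarkdown::render. A minimal sketch; the object names are made up, and the file name in the commented-out call is a placeholder.

```r
library(rmarkdown)

# Format objects carrying the same options as the YAML headers above
html_fmt <- html_document(theme = "spacelab")
pdf_fmt  <- pdf_document(highlight = "zenburn")

# Either object can then be passed as the output format, for example:
# render("doc.rmd", output_format = html_fmt)

class(html_fmt)
```

Options given this way follow ordinary R argument rules, so the YAML indentation issue does not arise.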

To add a table of contents, I can set toc: true in the YAML at the beginning of the document. For this to work, replace the original YAML with the code below, filling in the right information:

---
title: ""
author: "Jad"
date: "Data Camp R MarkDown Class Practice"
output:
  html_document:
    toc: true
    number_sections: true
---

OR I can use any of the following.

Brand my reports with style sheets

Above, I showed a way to change the CSS style of my HTML output: I can set the theme option of html_document to one of default, cerulean, journal, flatly, readable, spacelab, united, or cosmo. (Try it out.)

But what if I want to customize my CSS in more specific ways? I can do this by writing a .css file for my report and saving it in the same directory as the .Rmd file. To have my report use the CSS, set the css option of html_document to the file name, like this:

---
title: "Demo"
output:
  html_document:
    css: styles.css
---

Custom CSS is an easy way to add branding to my reports.

The Shiny package

Shiny to make your reports interactive

Shiny is an R package that uses R to build interactive web apps such as data explorers and dashboards. You can add Shiny components to an R Markdown file to make an interactive document.

When you do this, you must ensure that:

* You use an HTML output format (like html_document, ioslides_presentation, or slidy_presentation). This doesn’t work with PDF, Word, or any other non-HTML output format.
* You add runtime: shiny to the top level of the file’s YAML header.

To learn more about interactivity with Shiny and R, visit [shiny package](http://shiny.rstudio.com)
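
As a hedged illustration of what a Shiny component in such a document could look like, the sketch below pairs a slider with a histogram that reacts to it. The input id bins, the labels, and the choice of the mpg variable are assumptions for this example; inside a real report the code would sit in an R chunk of a document that sets runtime: shiny.

```r
library(shiny)

# Input widget: a slider stored under the hypothetical input id "bins"
slider <- sliderInput("bins", label = "Number of bins",
                      min = 5, max = 30, value = 10)

# Output: a histogram that is redrawn whenever input$bins changes.
# The expression is only evaluated once the interactive document is running.
hist_plot <- renderPlot({
  hist(mtcars$mpg, breaks = input$bins,
       xlab = "MPG", main = "Histogram of MPG")
})

# In the .Rmd chunk, list the two objects so they are displayed
slider
hist_plot
```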

Interactive ggvis graphics

You can also use R Markdown to create reports that use interactive ggvis graphics. ggvis relies on the shiny framework to create interactivity, so you will need to prepare your interactive document in the same ways:

* You need to add runtime: shiny to the YAML header.
* You need to ensure that your output is an HTML format (like html_document, ioslides_presentation, or slidy_presentation).
* You do not need to wrap your interactive ggvis plots in a render function. They are ready to use as is in an R Markdown document.

The code below for the displacement variable will work only if the YAML sets runtime: shiny and the output is html_document, ioslides_presentation, or slidy_presentation. Don’t forget to put the code in an R chunk (``` {r } etc.).

mtcars %>%
  ggvis(x = ~disp) %>%
  layer_densities(
    adjust = input_slider(.1, 2, value = 1, step = .1, label = "Bandwidth adjustment"),
    kernel = input_select(
      c("Gaussian" = "gaussian", "Epanechnikov" = "epanechnikov",
        "Rectangular" = "rectangular", "Triangular" = "triangular",
        "Biweight" = "biweight", "Cosine" = "cosine",
        "Optcosine" = "optcosine"),
      label = "Kernel")
  )

Look at [rpubs.com](http://rpubs.com) to see great reports. Now you can use the code again and again.