This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When I click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. I can embed an R code chunk like this:
summary(cars)
## speed dist
## Min. : 4.0 Min. : 2.00
## 1st Qu.:12.0 1st Qu.: 26.00
## Median :15.0 Median : 36.00
## Mean :15.4 Mean : 42.98
## 3rd Qu.:19.0 3rd Qu.: 56.00
## Max. :25.0 Max. :120.00
I can also embed plots, for example:
par(mfrow = c(1,2))
hist(cars$speed, xlab = 'Speed', main = 'Histogram of Speed')
hist(cars$dist, xlab = 'Distance', main = 'Histogram of Distance')
Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.
This is my DataCamp practice. I am practicing this in hope of becoming great.
Notice that a * before and after a word or sentence puts it in italics.
Also, notice that two stars (i.e., **) before and after a word or sentence put it in bold.
Observe that a backtick (`) before and after a word or sentence makes it highlighted.
I can turn a word into a link by surrounding it in square brackets and then placing the link right after it in parentheses, like this:
[Jad’s website](http://chessthegameofkings.blogspot.com/)
To create titles and headers, use leading hashtags. The number of hashtags determines the header’s level:
(Make sure there is a blank line or more between the headers and other things, because headers take a lot of vertical space.)
I can make bullet points by adding a * before the sentence or word like this:
I can also make an ordered list by placing each new item on a new line after a number followed by a period and a space like that:
If I wanted to make my first two points bold and my third italicized, I would wrap them in the markers described above:
point 1
point 2
point 3
Leaving blank lines is usually a good idea between bullets. “Whatever I like!”
Using two $ signs before and after an equation displays it on its own line:
\[E = M * C^2\]
To embed an equation inline (within a line of text), I can surround it with a single pair of dollar signs:
\(E = M * C^2\)
Also, I can use all the cool LaTeX math symbols for my equations. Check out this website: [LaTeX Mathematics](https://en.wikibooks.org/wiki/LaTeX/Mathematics)
I will clean some data using the mtcars data set and the dplyr package.
# Load mtcars data
data(mtcars)
# Attach dplyr package
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
Find the structure of the data, names, heads, tails, summary of variables. (Any useful function I can think of)
# data structure
str(mtcars)
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
# names of variables
names(mtcars)
## [1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear"
## [11] "carb"
# head up to 6 rows and tails up to 3 rows
# heads
head(mtcars, 6)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
# tails
tail(mtcars,3)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Ferrari Dino 19.7 6 145 175 3.62 2.77 15.5 0 1 5 6
## Maserati Bora 15.0 8 301 335 3.54 3.57 14.6 0 1 5 8
## Volvo 142E 21.4 4 121 109 4.11 2.78 18.6 1 1 4 2
# sapply: Find summary for all variables in data set
options(digits = 4) ## will display numbers in summaries in terms of 4 digits
sapply(mtcars[,1:11],summary)
## mpg cyl disp hp drat wt qsec vs am gear
## Min. 10.40 4.000 71.1 52.0 2.760 1.513 14.50 0.0000 0.0000 3.000
## 1st Qu. 15.43 4.000 120.8 96.5 3.080 2.581 16.89 0.0000 0.0000 3.000
## Median 19.20 6.000 196.3 123.0 3.695 3.325 17.71 0.0000 0.0000 4.000
## Mean 20.09 6.188 230.7 146.7 3.597 3.217 17.85 0.4375 0.4062 3.688
## 3rd Qu. 22.80 8.000 326.0 180.0 3.920 3.610 18.90 1.0000 1.0000 4.000
## Max. 33.90 8.000 472.0 335.0 4.930 5.424 22.90 1.0000 1.0000 5.000
## carb
## Min. 1.000
## 1st Qu. 2.000
## Median 2.000
## Mean 2.812
## 3rd Qu. 4.000
## Max. 8.000
# Attach data set and draw visualizations
attach(mtcars)
# MPG vs Weight
# plot() draws the scatterplot as a side effect and returns NULL, so no assignment is needed
plot(wt, mpg, xlab = "Weight",
     ylab = "Miles per Gallon",
     main = "MPG v Weight")
# Table of frequencies
# Convert gear variables to factors
gear <- factor(gear, levels = c(3,4,5)
, labels = c('3','4','5'))
# Create a frequency, relative frequency and cumulative relative frequency table for gear
gear.freq <- data.frame(table(gear))
gear_rel.freq <- (gear.freq$Freq / sum(gear.freq$Freq) * 100)
gear_cum.freq <- cumsum(gear_rel.freq)
table_gear <- cbind(gear.freq, gear_rel.freq, gear_cum.freq)
table_gear
## gear Freq gear_rel.freq gear_cum.freq
## 1 3 15 46.88 46.88
## 2 4 12 37.50 84.38
## 3 5 5 15.62 100.00
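The same frequency table can also be built without attaching the data set. The lines below are a minimal sketch that assumes only base R and uses prop.table() for the relative frequencies; the names gear_counts, gear_pct and gear_summary are made up for illustration.

```r
# Counts of cars per number of forward gears (mtcars ships with base R)
gear_counts <- table(gear = mtcars$gear)

# prop.table() turns counts into proportions; multiply by 100 for percentages
gear_pct <- as.numeric(prop.table(gear_counts)) * 100

# Combine counts, relative and cumulative relative frequencies in one table
gear_summary <- data.frame(
  gear     = names(gear_counts),
  Freq     = as.vector(gear_counts),
  rel_freq = gear_pct,
  cum_freq = cumsum(gear_pct)
)
gear_summary
```

Reading mtcars$gear directly avoids the masking that attach() can cause when another object is also called gear.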
# Use ggplot2 to draw a histogram
# Get ggplot2 library
library(ggplot2)
##
## Attaching package: 'ggplot2'
## The following object is masked from 'mtcars':
##
## mpg
# MPG Histogram
ggplot(mtcars, aes(x = mpg)) + geom_histogram( breaks = seq(10,40,by =2)
,col = "black"
,fill = "red") + labs( title = "Histogram of MPG", x ="MPG", y ="Count")
# Filter by weight, then group by gear and summarize (dplyr)
gear_table <- mtcars %>%
filter(wt > 2.581) %>%
group_by(gear) %>%
summarize(mpg = mean(mpg, na.rm = TRUE),
hp = mean(hp, na.rm = TRUE),
drat = mean(drat, na.rm = TRUE))
gear_table
## # A tibble: 3 x 4
## gear mpg hp drat
## <dbl> <dbl> <dbl> <dbl>
## 1 3 15.7 182. 3.09
## 2 4 21.1 105. 3.91
## 3 5 16.8 258 3.79
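As a cross-check, the same grouped means can be sketched in base R. This is only an illustration that assumes subset() and aggregate() from base R; the names heavy_cars and gear_means are hypothetical.

```r
# Keep only the heavier cars, mirroring the filter(wt > 2.581) step
heavy_cars <- subset(mtcars, wt > 2.581)

# Average mpg, hp and drat within each gear group (a base R counterpart of
# group_by(gear) followed by summarize())
gear_means <- aggregate(cbind(mpg, hp, drat) ~ gear, data = heavy_cars, FUN = mean)
gear_means
```

The result is a plain data frame rather than a tibble, so it prints with more decimal places than the dplyr version.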
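Sometimes R will generate errors, warnings, and messages. To keep warnings and messages out of the report, I can use warning = FALSE and message = FALSE in the chunk header; error = FALSE makes knitting stop at an error instead of printing it in the report. For example: ```{r warning = FALSE, error = FALSE, message = FALSE}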
library(ggvis)
The above sets warning = FALSE and error = FALSE. If we don’t set those to false, we get
I can use echo = FALSE if I want the code not to show in the report. (The code will run, but only the results will show.)
## [1] "factor"
I can use eval = FALSE if I want the code not to run and the results not to show, but the code will show.
gear_factor <- factor(gear, levels = c(3,4,5)
, labels = c('3rd','4th','5th'))
class(gear_factor)
I can use results = 'hide' if I want the code to run and show but the results not to show.
gear_factor <- factor(gear, levels = c(3,4,5)
, labels = c('3rd', '4th', '5th'))
class(gear_factor)
The fig.height and fig.width chunk options control the height and width (in inches) of the figures a chunk produces.
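Figure sizes can be set per chunk in the chunk header, or once for the whole document. The sketch below shows the document-wide route; it assumes the knitr package is installed, and the 6 by 4 size is an arbitrary choice for illustration.

```r
# Assumes the knitr package is installed; this would normally live in a
# setup chunk near the top of the .Rmd file
library(knitr)

# Default figure size (in inches) for every chunk that follows
opts_chunk$set(fig.width = 6, fig.height = 4)

# Confirm the defaults that are now in effect
opts_chunk$get(c("fig.width", "fig.height"))
```

A single chunk can still override these defaults by putting its own fig.height and fig.width in its header.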
I can embed R code into the text of my document with the syntax below. For example, the factorial of 4 is 24.
Another example:
My name is Jad and my age is 33.
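Because knitting replaces inline code with its result, the syntax itself is not visible in the finished report. A sketch of what the factorial sentence plausibly looks like in the .Rmd source (the exact expression is an assumption):

```markdown
The factorial of 4 is `r factorial(4)`.
```

Everything between the opening backtick followed by r and the closing backtick is evaluated as R, and its value is placed into the sentence.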
I can label code snippets like so:
2 + 2
See that the output will appear in the .Rmd document where I am typing this
but will not show in the final report.
Why is defining labels great? Because knitr provides the ref.label option
to refer to previously defined and labeled code chunks. If used correctly, knitr
will copy the code of the chunk I referred to and repeat it in the current code chunk.
This feature enables me to separate R code and R output in the output document, without code duplication. Ex:
## [1] 4
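The chunk headers are not visible in the knitted report, so here is a sketch of how the labeled chunk and its ref.label reuse might be written in the .Rmd source. The label simple-sum is hypothetical, and the options are an assumption chosen to match the behavior described above (code shown first, result shown later).

````markdown
```{r simple-sum, results = 'hide'}
2 + 2
```

```{r ref.label = 'simple-sum', echo = FALSE}
```
````

The second chunk is left empty on purpose: knitr fills it with the code from simple-sum, runs it, and prints only the result because echo = FALSE.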
I can transform an R Markdown file into a finished format using the Pandoc program. I can render R Markdown files in HTML, PDF, Word, or slideshow formats. I can control the render process with a YAML header, which is the block at the very top of my R Markdown script. In it, I see title, author, date, and output:
---
title
author
date
output
---
I can change the output to any of the following:
html_document renders it as an HTML document
pdf_document renders it as a PDF document
word_document renders it as a Word document
beamer_presentation renders it as a beamer slide show (PDF format for slides)
slidy_presentation and ioslides_presentation render the document as an HTML slide show
md_document creates a Markdown file
Although the preview depends on the output type I set, output files of different kinds will be saved in the working directory. So, if output in my YAML header is set to output: pdf_document, then when I knit, the preview that pops up is a PDF document, but the files saved in my working directory will include an HTML file, a PDF file, and other files.
I can also render from the console with rmarkdown::render. Here’s how: rmarkdown::render("doc.rmd", "html_document")
I can also give render a vector of output formats: rmarkdown::render("doc.rmd", c("html_document", "pdf_document"))
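To try render end to end, the sketch below writes a tiny throwaway .Rmd file to a temporary directory and renders it to HTML. The file contents and the names rmd_path, can_render and out_file are made up for illustration; rendering only runs if the rmarkdown package and pandoc are available.

```r
# Write a tiny, hypothetical R Markdown file to a temporary directory
rmd_path <- file.path(tempdir(), "doc.Rmd")
writeLines(c(
  "---",
  "title: \"Demo\"",
  "---",
  "",
  "The factorial of 4 is `r factorial(4)`."
), rmd_path)

# Rendering needs the rmarkdown package and a pandoc installation,
# so check for both before calling render()
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  # render() returns the path of the file it created
  out_file <- rmarkdown::render(rmd_path, output_format = "html_document", quiet = TRUE)
  print(out_file)
}
```

Passing a character vector of formats instead of the single "html_document" produces one output file per format.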
When I am using presentation formats, R Markdown will start a new slide at each first or second level header in my document. I can insert additional slide breaks with Markdown’s horizontal rule syntax (i.e., *** will move what is after it to a new slide). If the output file is a PDF or another non-slide format, I will get a line __________________ between what is before the three asterisks and what is after them.
Everywhere I add these three asterisks in my text, pandoc will create a new slide.
The link below gives me great syntax for a lot of things I want to have in my report. In it, I can see how to create awesome slides.
[R Markdown: Reference Guide](http://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf)
Each R Markdown output template is a collection of knitr and pandoc options. I can customize my output by overwriting the default options that come with the template.
For example, the YAML header below overwrites the default code highlight style of the pdf_document template to create a document that uses the zenburn style:
---
title: "Demo"
output:
  pdf_document:
    highlight: zenburn
---
The YAML header below overwrites the default bootstrap CSS theme of the html_document template.
---
title: "Demo"
output:
  html_document:
    theme: spacelab
---
Pay close attention to the indentation of the options inside the YAML header; if I do not do this correctly, pandoc will not correctly understand my specifications. As an example, notice the difference between only specifying the output document to be HTML:
---
output: html_document
---
and specifying an HTML output document with a different theme:
---
output:
  html_document:
    theme: spacelab
---
To add a table of contents, I can set it in the YAML at the beginning of the document. For this to work, replace the original YAML with this code, filling in the right information.
---
title: ""
author: "Jad"
date: "Data Camp R MarkDown Class Practice"
output:
  html_document:
    toc: true
    number_sections: true
---
OR I can use any of the following.
Brand my reports with style sheets
In the last exercise, we showed a way to change the CSS style of my HTML output: I can set the theme option of html_document to one of default, cerulean, journal, flatly, readable, spacelab, united, or cosmo. (Try it out).
But what if I want to customize my CSS in more specific ways? I can do this by writing a .css file for my report and saving it in the same directory as the .Rmd file. To have my report use the CSS, set the css option of html_document to the file name, like this:
---
title: "Demo"
output:
  html_document:
    css: styles.css
---
Custom CSS is an easy way to add branding to my reports.
Shiny to make your reports interactive
Shiny is an R package that uses R to build interactive web apps such as data explorers and dashboards. You can add shiny components to an R Markdown file to make an interactive document.
When you do this, you must ensure that:
You use an HTML output format (like html_document, ioslides_presentation, or slidy_presentation). This doesn’t work with PDF, Word, or any other non-HTML output format.
You add runtime: shiny to the top level of the file’s YAML header.
To learn more about interactivity with Shiny and R, visit [shiny package](http://shiny.rstudio.com)
Interactive ggvis graphics
You can also use R Markdown to create reports that use interactive ggvis graphics. ggvis relies on the shiny framework to create interactivity, so you will need to prepare your interactive document in the same ways:
You need to add runtime: shiny to the YAML header.
You need to ensure that your output is an HTML format (like html_document, ioslides_presentation, or slidy_presentation).
You do not need to wrap your interactive ggvis plots in a render function. They are ready to use as is in an R Markdown document.
The code below for the displacement variable will work only if the YAML header contains runtime: shiny and the output is html_document, ioslides_presentation, or slidy_presentation. Don’t forget to put the code inside an R chunk (``` {r } and so on).
mtcars %>%
  ggvis(x = ~disp) %>%
  layer_densities(
    adjust = input_slider(.1, 2, value = 1, step = .1, label = "Bandwidth adjustment"),
    kernel = input_select(
      c("Gaussian" = "gaussian",
        "Epanechnikov" = "epanechnikov",
        "Rectangular" = "rectangular",
        "Triangular" = "triangular",
        "Biweight" = "biweight",
        "Cosine" = "cosine",
        "Optcosine" = "optcosine"),
      label = "Kernel")
  )
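For reference, a YAML header that meets both requirements could look like the sketch below. The title is a placeholder, and html_document could be swapped for either of the HTML slide formats.

```yaml
---
title: "Demo"
output: html_document
runtime: shiny
---
```

Note that runtime sits at the top level of the header, next to output, and is not indented under it.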
Look at [rpubs.com](http://rpubs.com) to see great reports. Now you can use the code again and again.