R is, fundamentally, a programming language designed for statisticians and data scientists to analyze data. One of the main benefits of analyzing data with a programming language is that, by saving every instruction given to the computer in code files, we make our analysis reproducible and easy to share with others.
Despite these advantages, analyzing data with code has a small drawback: code is not a particularly friendly interface for humans. Code files are helpful because they give us a way to communicate complex operations to the computer. However, code is not the best place to read explanations of each step, or to comment on the results and findings of an analysis.
Often, when analyzing data, practitioners perform a couple of steps and then decide on the next step based on the outputs obtained so far. Statisticians often choose how to proceed based on what the data looks like.
Data analysis is inherently carried out in incremental steps. In each step, we perform some operations or visualizations on our data. Then, we make choices on how to move forward based on the newly gathered information.
To give data scientists a better interface for working on their data analysis, and to support step-by-step work, R allows you to create notebooks with code inserted in them. These are known as R-Markdown documents. Notebook documents of this type are saved with the .Rmd extension, as in document.Rmd.
To clarify further, R allows programming in two types of files:

* “file.R” (source code): can only contain code and comments.
* “file.Rmd” (R-Markdown notebook): contains mostly text and runs “chunks of code.” Useful for step-by-step work.
In this document, we cover the basics of R-Markdown for writing, coding, and knitting documents, to the extent needed for our course. For a more comprehensive treatment of R-Markdown, check out other resources such as R Markdown: The Definitive Guide.
All classes this semester will be taught through R-Markdown documents, so it is important that you become familiar with this type of document.
As you can see, an R-Markdown document is like a text document in which you can write. While the format of this document might seem minimal, you can change the style, such as the font, size, titles, and orientation of the text, with simple commands. You will not see all of this formatting while you edit the document, but you can see a formatted version by knitting your R-Markdown document. Some examples of formatting are:
* Bold: **like this**.
* Italics: _like this_ and *like this*.
* Subscripts, with the ~ sign: for example, H~3~PO~4~ turns into H₃PO₄.
* Links, with the [text](link) syntax. For example, you can find an R-Markdown tutorial here.
* Inline code, with backticks: `this`.

In addition to writing text, you can create titles and subtitles for your text:
* ## (section title)
* ### (subsection title)
* #### (sub-subsection title)
In addition to writing formatted text, you can also write code. This is done by creating a “chunk” of code.
## Printing Hello world!
char <- "Hello world!"
print(char)
## [1] "Hello world!"
## Operating on numerical variables.
x <- 5
print(x + 5)
## [1] 10
print(x+5)
## [1] 10
As you can see, you can run these lines of code by copying and pasting them into the R console in the bottom-left pane. You can also run a chunk by clicking the green “play” button in its upper-right corner.
Note: To run multiple lines of code in RStudio, select the lines and press CTRL + ENTER. To run an entire R code file, or a chunk of code, press CTRL + SHIFT + ENTER.

Try using CTRL + ENTER and CTRL + SHIFT + ENTER in the following chunk.
## I ran this line with only my keyboard.
print("I ran this line with only my keyboard.")
## [1] "I ran this line with only my keyboard."
## I ran this chunk with my keyboard.
print("I ran this chunk with my keyboard.")
## [1] "I ran this chunk with my keyboard."
In addition to running code, you can create plots which are automatically included in your output documents! This helps a lot, since you can create plots in a chunk, and then subsequently discuss the findings of the plot below in text.
## Exponential graph is created here.
x <- seq(-5,5, length.out = 100)
y <- exp(x)  ## exp(x) computes e^x; 2.71^x only approximates it
plot(x,y, main = "Exponential Graph", type = "l", col= "red")
Without further instruction, chunks show both their code and the outputs of that code. When writing formal reports, this may not be desirable. To improve the appearance of your output, you can use chunk options to:

* hide the code with echo=FALSE;
* set the figure size with fig.height and fig.width (measured in inches);
* add a figure caption with fig.cap;
* set the figure alignment with fig.align.

## Parabola is created here using sequence function.
## Also used the Plot function to create a graph.
## Color is red and main is parabola
x <- seq(-5,5, length.out = 100)
y <- x^2
plot(x = x, y = y, type = "l", col = "red", main = "Parabola")
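As a rough sketch of how the chunk options listed above fit together, the chunk body below sets them with #| comment lines at the top of the chunk. This assumes a reasonably recent version of knitr (1.35 or later); the same options can instead be written in the chunk header, as in {r, echo=FALSE, fig.width=6}. The cubic curve, the caption, and the specific option values are illustrative choices, not part of the course material.

```r
#| echo = FALSE, fig.width = 6, fig.height = 4,
#| fig.align = "center", fig.cap = "A cubic curve."

## Cubic curve. With echo = FALSE, only the figure appears in the knitted output.
x <- seq(-5, 5, length.out = 100)
y <- x^3
plot(x = x, y = y, type = "l", col = "blue", main = "Cubic")
```

When run in the console, the #| lines are ordinary comments, so the code behaves like any other plotting code; the options only take effect when the document is knitted.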
A) Below this, create a chunk of code. In the chunk, create a plot of the exponential function y = e^x for x between -5 and 5. It must have the following features:

* The figure should be 4in width x 5in height.
* It should be a line plot (type = "l"), and
* the line should be of color "red".
B) Knit your file to both PDF and HTML. If you have knitting issues, ask for help. To knit to PDF, you need to have the R package tinytex installed on your computer. If your knitting is not working, and your code does not have any issues, do the following:

1. Go to the packages tab.
2. Check whether the tinytex package is on the list. If not, this means it is not yet installed.
3. Go to the Console on the bottom-left pane, and run the line of code install.packages("tinytex").
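The check in the packages tab can also be done from the Console. The lines below are a minimal sketch that only reports whether the tinytex package is available and does not install anything; the variable name has_tinytex is made up for this example.

```r
## Check whether the tinytex R package is installed, without installing anything.
has_tinytex <- requireNamespace("tinytex", quietly = TRUE)

if (has_tinytex) {
  message("tinytex is installed.")
} else {
  message('tinytex is missing; run install.packages("tinytex") in the Console.')
}
```

If the package is installed but knitting to PDF still fails, the LaTeX distribution itself may be missing; in that case, running tinytex::install_tinytex() once in the Console is the usual next step.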