An R Notebook is an interactive R Markdown document that displays the output from R code. It includes formatted text and R code chunks that can be executed independently and interactively, with output visible immediately beneath the input.
When you create a new R Notebook (File -> New File -> R Notebook), RStudio opens an R Notebook template with simple instructions and examples.
- Select the Source view to type in your text and code
  - I always work in Source mode
- Select the Visual view to see what your Notebook will look like
- Type text directly into the Notebook (we’ll discuss formatting later)
- Insert R code into a Notebook in a chunk
- Preview or Knit as you add elements to your Notebook to see the output in the Viewer pane
  - Preview quickly renders your code into a notebook and displays it in the Viewer
  - Knit renders your code into a notebook in the publication format and displays it in the Viewer
To test your R code, click the green arrow within a code chunk to Run it. The output will display below.
You can type directly into your R Notebook with regular text. All formatting uses the R Markdown syntax.
See the R Markdown cheat sheet for details and examples. These are the most common:
IMPORTANT: You must put a blank line after any header, or it won’t register.
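As a rough illustration of the formatting syntax, here is a small sketch of text as you would type it in Source view. The wording is made up for the example, and the cheat sheet remains the full reference. Note the blank line after each header.

```markdown
# Header 1

## Header 2

Plain text with *italic* and **bold** words.

- a bullet item
- another bullet item

1. a numbered item
2. another numbered item

[NYC Open Data](https://opendata.cityofnewyork.us/)
```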
As you have noticed, you don’t always want to display the output from your code. You can define how your code displays using chunk options in 2 ways: globally for the whole Notebook, or for an individual code chunk.
The most common chunk options are:
- include = FALSE prevents code and results from appearing in the finished file. R Markdown still runs the code in the chunk, and the results can be used by other chunks.
- echo = FALSE prevents code, but not the results, from appearing in the finished file. This is a useful way to embed figures.
- message = FALSE prevents messages that are generated by code from appearing in the finished file.
- warning = FALSE prevents warnings that are generated by code from appearing in the finished file.
- fig.cap = "..." adds a caption to graphical results.

See the R Markdown Reference Guide for a complete list of knitr chunk options.
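To make these options concrete, here is a minimal sketch of the body of one chunk, with the chunk header written as a comment. The chunk label `demo`, the caption text, and the use of the built-in `cars` dataset are assumptions for illustration only.

```r
# Chunk header in the Notebook (label "demo" is hypothetical):
# {r demo, echo=FALSE, message=FALSE, fig.cap="Speed vs. stopping distance"}

message("Loading data...")      # hidden in the output by message = FALSE
avg_speed <- mean(cars$speed)   # still runs; later chunks can use avg_speed
plot(cars$speed, cars$dist,     # the figure is shown, captioned by fig.cap
     xlab = "Speed", ylab = "Stopping distance")
```

With echo=FALSE the rendered file would show only the figure and its caption, not these lines of code.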
At the top of your R Notebook, under the title, insert a code chunk with global options using the knitr::opts_chunk$set function. These define the default display options for all code chunks in the Notebook. For example, a setup chunk can do the following:
- names the chunk “setup”
- uses include=FALSE: do not display this code chunk in the output
- echo = T: defines global default to show the code chunk
- quietly = T: defines global default to suppress messages
- message = F: defines global default to suppress messages (different and more robust method)
This code chunk should be alone - it should not include any other code.
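A setup chunk matching the list above could look roughly like this. It is a sketch reconstructed from the description, with the chunk header shown as a comment, and `quietly` is copied as-is from the list.

```r
# Chunk header in the Notebook: {r setup, include=FALSE}

# Set the default display options for every later chunk.
knitr::opts_chunk$set(
  echo = TRUE,      # show the code by default
  quietly = TRUE,   # taken as-is from the list above
  message = FALSE   # suppress messages by default
)
```

Any option set here can still be overridden in the header of an individual chunk.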
To change those global display options for one code chunk (like load_acs, which is very noisy), add any chunk options within the {r} of that code chunk. For example, setting echo = FALSE on a chunk that builds a table with the knitr package displays the formatted table, but doesn’t display the code.
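A chunk like that could be sketched as follows. The built-in `mtcars` dataset and the caption are stand-ins chosen for the example, not the table from the original lesson.

```r
# Chunk header in the Notebook: {r, echo=FALSE}
library(knitr)

# mtcars is a built-in dataset, used here only as a stand-in
tbl <- kable(head(mtcars[, 1:4]), caption = "First rows of mtcars")
tbl  # printing the object displays the formatted table
```

Because of echo=FALSE, the rendered Notebook would show the table but none of the lines above.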
You can also apply pre-designed themes, add a table of contents, and much more in the title section, called the YAML header (Yet Another Markup Language = stupid coding joke). The formatting is very finicky: when you are following an example, make it look EXACTLY the same. It is complicated, but we’ll learn a few options.
See this chapter for more details on adjusting the html document in the title section.
Theme
Table of Contents
- toc: true: create a table of contents
- toc_depth: 3: create entries in the Table of Contents for Header 3 and higher
- toc_float: true: toc sticks to the side so you can always see it
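Put together, a title section using these options might look like the sketch below. The title and the `flatly` theme are arbitrary choices for illustration. The indentation matters, so copy it exactly.

```yaml
---
title: "NYC Middle Schools"
output:
  html_notebook:
    theme: flatly
    toc: true
    toc_depth: 3
    toc_float: true
---
```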
NYC Open Data is free public data published by New York City agencies and other partners.
https://opendata.cityofnewyork.us/
There is a vast amount of data. You can download data from NYC Open Data, or use the RSocrata package to import the data directly into R.
We’ll use an example from Boyan Kostadinov at City Tech.
The goal of this activity is to explore the 2021 DOE Middle School Directory data from the New York City Open Data Portal. This activity is an introduction to exploratory data analysis and visualizations using R and RStudio.
- RSocrata: for loading the data from NYC Open Data
- knitr: for printing tables
- DT: for interactive tables in html format
nyc_middle_schools.Rmd
main_data/scripts
library(tidyverse)
library(RSocrata)
library(knitr)
library(DT)
# import the data directly into RStudio using url path
data <- read.socrata("https://data.cityofnewyork.us/resource/f6s7-vytj.csv")
## Warning in read.socrata("https://data.cityofnewyork.us/resource/f6s7-vytj.csv"):
## Dates and currency fields will be converted to character
Follow the instructions in Kostadinov’s R Notebook to select columns, create summary statistics of the number of math professors, and correct some missing values.
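The general pattern of those steps can be sketched with dplyr (part of the tidyverse). Everything here is hypothetical: the rows are made up, the column names are not the real ones from the directory, and replacing missing counts with 0 is just one possible way to handle them. Kostadinov’s Notebook gives the actual columns and corrections.

```r
library(dplyr)

# Made-up rows standing in for the school directory; column names are hypothetical
schools <- tibble(
  name = c("School A", "School B", "School C"),
  borough = c("Queens", "Queens", "Brooklyn"),
  math_staff = c(4, NA, 6)
)

schools_clean <- schools %>%
  select(name, borough, math_staff) %>%          # keep only the columns of interest
  mutate(math_staff = coalesce(math_staff, 0))   # assumption: treat missing counts as 0

# Summary statistics by borough
staff_summary <- schools_clean %>%
  group_by(borough) %>%
  summarise(total_math_staff = sum(math_staff), n_schools = n())
```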
Use this dataset to test out different ways to style and format an R Notebook.
Explore NYC Open Data to see what data is available. Select a dataset to import via RSocrata and answer a question about Corona or your final project. Create an R Notebook to share your analysis. Include:
- kable or datatable functions to create at least one formatted table
Some suggestions:
- st_as_sf())
Add the link to 2 Notebooks on CANVAS: for your in-class assignment and for this analysis.
Answer the “Beginning of the Project” framing questions for your final project:
Upload to the assignment on CANVAS.