R Markdown

What does R Markdown do

R Markdown is like a magic notebook for data work. You can write text, add code, and see the results of that code all in one place. When you’re done, you can turn your work into a report, presentation, or even a website. It’s a way to share your code, results, and explanations in a single document.

Header and formatting

Text formatting

Bold

**Meow Meow**

Italic

*Meow Meow*

Strikethrough

~~Meow~~

Blockquote

Why did the cat sit on the computer?
> Because it wanted to keep an eye on the mouse!

Endash

2023–08–20

Emdash

Cat---Felis catus

Superscript

x²

Subscript

H₂O

Horizontal lines

Use three or more dashes (---), asterisks (***), or underscores (___).

Lists

Domestic Cats: The ones you might have at home!

Domestic Cats
- Devon Rex
- Siamese
- Tuxedo

Unordered list

- item 2
  - sub-item 1
  - sub-item 2

Ordered list

1. item 1
2. item 2
   - sub-item 1
   - sub-item 2

Table

Header 1	Header 2
Row1Col1	Row1Col2
Row2Col1	Row2Col2

Cat breed	Average weight (lbs)
Devon Rex	5-10
Ragdoll	8-20
American Shorthair	10-15

Others

Insert link

Click here for R Markdown cheatsheet

Insert Image

Orca

Code Chunks

# This is a code chunk in R Markdown
data <- c(1, 2, 3, 4, 5)
mean(data)

## [1] 3

Output Formats

R Markdown can be rendered into various formats:
- HTML
- PDF
- Word
- Slides
- And more…

R Project

An RStudio Project automatically sets the working directory to the project’s root folder.

How to create an RStudio Project:
- Open RStudio
- Start a new project
- Choose a directory and a name for your project
- Then create an R Markdown file (or any other file); by default it is saved in the folder where you created the project
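The working directory can be checked from the console once a project is open. A minimal sketch, assuming a hypothetical `data/cats.csv` file inside the project folder (the folder and file names are illustrative only):

```r
# Print the current working directory; inside an RStudio Project this
# is the project's root folder.
project_root <- getwd()
project_root

# Build a path relative to the project root instead of hard-coding an
# absolute path. "data" and "cats.csv" are hypothetical names.
cats_path <- file.path("data", "cats.csv")
cats_path

# file.exists() reports whether the file is really there before reading it.
file.exists(cats_path)
```

Relative paths like this keep working when the whole project folder is moved or shared, which is the main reason to use a project.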

NA

NA stands for "not available": missing or undefined data, such as blanks in a dataset.

# create a vector of cat names
cat_name <- c("Orca", "Charlie", "Luna", "Paws", "Cookie")
# create a vector of recorded cat weights
cat_weight <- c(4.5, 6.2, NA, 3.8, NA)
# create a vector of detected cat potty activity
Cat_potty <- c(TRUE, FALSE, NA, FALSE, TRUE)

Data cleaning options for NA

In a vector

# check which values in the vector are NA
na_check_cat_weight <- is.na(cat_weight)
na_check_cat_weight

## [1] FALSE FALSE  TRUE FALSE  TRUE

# count the number of NA values in the vector
sum(is.na(cat_weight))

## [1] 2

# remove NA values from the vector
na_removal_cat_weight <- na.omit(cat_weight)
na_removal_cat_weight

## [1] 4.5 6.2 3.8
## attr(,"na.action")
## [1] 3 5
## attr(,"class")
## [1] "omit"
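The extra `attr` lines in the output above are bookkeeping that `na.omit()` attaches to its result. If only the plain values are wanted, logical indexing with `!is.na()` is a common alternative. A small sketch using the same weights:

```r
cat_weight <- c(4.5, 6.2, NA, 3.8, NA)

# Keep only the elements that are not NA; the result has no extra attributes
weight_no_na <- cat_weight[!is.na(cat_weight)]
weight_no_na

# na.omit() records which positions were dropped in the "na.action" attribute
dropped_positions <- as.integer(attr(na.omit(cat_weight), "na.action"))
dropped_positions
```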

# replace NA with another value, in this case 0
na_replace_cat_weight <- cat_weight
na_replace_cat_weight[is.na(cat_weight)] <- 0
na_replace_cat_weight

## [1] 4.5 6.2 0.0 3.8 0.0

Combine the vectors into a data frame

cat_health <- data.frame(name = cat_name, weight = cat_weight, potty_activity = Cat_potty)
cat_health

##      name weight potty_activity
## 1    Orca    4.5           TRUE
## 2 Charlie    6.2          FALSE
## 3    Luna     NA             NA
## 4    Paws    3.8          FALSE
## 5  Cookie     NA           TRUE

Operations with NA in the data

# use na.rm = TRUE to ignore NA values if any exist
mean_weight<-mean(cat_health$weight,na.rm=TRUE)
mean_weight

## [1] 4.833333
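Without `na.rm = TRUE`, summary functions such as `mean()` return `NA` whenever the input contains one, because the missing value could be anything. A short sketch with the same weights:

```r
cat_weight <- c(4.5, 6.2, NA, 3.8, NA)

# NA propagates: the result is unknown because two weights are unknown
mean_with_na <- mean(cat_weight)
mean_with_na

# Dropping the NA values first gives the mean of the three recorded weights
mean_without_na <- mean(cat_weight, na.rm = TRUE)
mean_without_na

# The same argument works for other summaries such as sum(), min() and max()
max_weight <- max(cat_weight, na.rm = TRUE)
max_weight
```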

# complete.cases() returns TRUE for rows with no NA and FALSE for rows that contain NA
luna_row <- cat_health[cat_health$name == "Luna", ]  # Subsetting the dataframe for Luna's row
luna_completion <- complete.cases(luna_row)  # Checking completeness for Luna's row
luna_completion

## [1] FALSE
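`complete.cases()` also works on a whole data frame, returning one TRUE/FALSE value per row. A sketch that rebuilds the `cat_health` data frame so it runs on its own, then keeps only the rows with no missing values:

```r
cat_health <- data.frame(
  name = c("Orca", "Charlie", "Luna", "Paws", "Cookie"),
  weight = c(4.5, 6.2, NA, 3.8, NA),
  potty_activity = c(TRUE, FALSE, NA, FALSE, TRUE)
)

# One logical value per row: TRUE when the row has no NA in any column
row_is_complete <- complete.cases(cat_health)
row_is_complete

# Keep only the complete rows (Orca, Charlie and Paws)
cat_health_complete <- cat_health[row_is_complete, ]
cat_health_complete
```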

When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks.

Assignment demo

Stats Team

For the function assigned
- create a data frame
- run your function on your data frame
- put down notes for the input and output of this function

Function: head()

# Create a data frame about anything you are interested in; you can also make it up (minimum 7 rows)
disney_movies <- data.frame(
  Movie = c("The Lion King", "Frozen", "Aladdin", "Mulan", "Beauty and the Beast", "Tangled", "Pocahontas"),
  Release_Year = c(1994, 2013, 1992, 1998, 1991, 2010, 1995),
  IMDb_Rating = c(8.5, 7.4, 8.0, 7.6, 8.0, 7.7, 6.7),
  Protagonist = c("Simba", "Elsa & Anna", "Aladdin", "Mulan", "Belle", "Rapunzel", "Pocahontas"),
  Antagonist = c("Scar", "Duke of Weselton", "Jafar", "Shan Yu", "Gaston", "Mother Gothel", "Governor Ratcliffe"),
  Main_Song = c("Circle of Life", "Let It Go", "A Whole New World", "Reflection", "Beauty and the Beast", "I See the Light", "Colors of the Wind")
)

# This displays your data frame
disney_movies

##                  Movie Release_Year IMDb_Rating Protagonist         Antagonist
## 1        The Lion King         1994         8.5       Simba               Scar
## 2               Frozen         2013         7.4 Elsa & Anna   Duke of Weselton
## 3              Aladdin         1992         8.0     Aladdin              Jafar
## 4                Mulan         1998         7.6       Mulan            Shan Yu
## 5 Beauty and the Beast         1991         8.0       Belle             Gaston
## 6              Tangled         2010         7.7    Rapunzel      Mother Gothel
## 7           Pocahontas         1995         6.7  Pocahontas Governor Ratcliffe
##              Main_Song
## 1       Circle of Life
## 2            Let It Go
## 3    A Whole New World
## 4           Reflection
## 5 Beauty and the Beast
## 6      I See the Light
## 7   Colors of the Wind

# run your function here
head(disney_movies)

##                  Movie Release_Year IMDb_Rating Protagonist       Antagonist
## 1        The Lion King         1994         8.5       Simba             Scar
## 2               Frozen         2013         7.4 Elsa & Anna Duke of Weselton
## 3              Aladdin         1992         8.0     Aladdin            Jafar
## 4                Mulan         1998         7.6       Mulan          Shan Yu
## 5 Beauty and the Beast         1991         8.0       Belle           Gaston
## 6              Tangled         2010         7.7    Rapunzel    Mother Gothel
##              Main_Song
## 1       Circle of Life
## 2            Let It Go
## 3    A Whole New World
## 4           Reflection
## 5 Beauty and the Beast
## 6      I See the Light

head(disney_movies,2)

##           Movie Release_Year IMDb_Rating Protagonist       Antagonist
## 1 The Lion King         1994         8.5       Simba             Scar
## 2        Frozen         2013         7.4 Elsa & Anna Duke of Weselton
##        Main_Song
## 1 Circle of Life
## 2      Let It Go

Notes: (here, write down what the input is, what the function does to the data frame, and what the output is)

Input: head(data_frame) or head(data_frame, n), where n is a number of rows

Output:

- By default, head() returns the first 6 rows of a data frame.
- head(data_frame, n) returns the first n rows of a data frame.
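`head()` also accepts a negative row count, which returns everything except that many rows at the end. A short sketch on a small made-up data frame (the `cats` data is illustrative and not part of the assignment):

```r
cats <- data.frame(
  name = c("Orca", "Charlie", "Luna", "Paws", "Cookie", "Mochi", "Tofu"),
  age = c(3, 5, 2, 7, 1, 4, 6)
)

# Default: the first 6 of the 7 rows
first_six <- head(cats)
nrow(first_six)

# First 2 rows
first_two <- head(cats, 2)
first_two

# Negative n: all rows except the last 2, so 5 of the 7 rows
all_but_last_two <- head(cats, -2)
all_but_last_two
```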

Name	Assigned Function
Christina Nguyen	`head()`
Marina Zhao	`tail()`
Gardenia Chang	`str()`
Kelly Zhen	`summary()`
Jack Oberdorfer	`length()`
Emily Zhu	`dim()`
Wesley Martinez	`class()`
Francesca Casimiro	`names()`

Theory team

Include everything in the list:
- Title
- Subtitle (name, date, etc.)
- Text formatting
  - Bold
  - Italic
  - Strikethrough
  - Blockquote
  - Endash
- List
- Table
- In the code chunks:
  - A string
  - A variable
  - A math equation of your choice
- In a separate CSV file:
  - Create a table of at least 3 columns and 5 rows
  - Import the CSV into RStudio
- Next, in the code chunk, create:
  - 5 vectors (holding the same information as your table)
  - A data frame that combines your vectors (it should look the same as the table in your CSV)
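The CSV and vector steps can be sketched as follows. This example writes a small CSV to a temporary file so that it runs on its own; in the assignment the CSV is created separately. The column names and values are made up, and only three vectors are used to keep the sketch short:

```r
# Build the vectors (hypothetical example data)
cat_name <- c("Orca", "Charlie", "Luna", "Paws", "Cookie")
cat_age <- c(3, 5, 2, 7, 1)
cat_indoor <- c(TRUE, TRUE, FALSE, TRUE, FALSE)

# Combine the vectors into a data frame
cats_from_vectors <- data.frame(name = cat_name, age = cat_age, indoor = cat_indoor)

# Stand-in for the separate CSV file: write it to a temporary location
csv_path <- tempfile(fileext = ".csv")
write.csv(cats_from_vectors, csv_path, row.names = FALSE)

# Import the CSV back into R
cats_from_csv <- read.csv(csv_path)
cats_from_csv

# Compare the imported table with the one built from vectors
all.equal(cats_from_vectors, cats_from_csv)
```

With a real file, `csv_path` would be replaced by the path to the CSV inside the project folder.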

What to upload in the end:

- RMD file: the original code
- HTML file: the output of your code
- Where? (https://drive.google.com/drive/folders/14IbsxXR-ogkwbbycHi-JdrnNEwUjLCaw)

Demo 1

Bethany

2023-08-14