An RStudio Project is like your personal workspace. Opening a project automatically sets the working directory to the project’s root folder, making it easier to manage files and run code.
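If you want to see what that means in practice, here is a small sketch. It only uses base R; the folder name "data" and the file name "cats.csv" are made up for illustration and are not part of this workshop's files.

```r
# Ask R where the working directory currently is.
# Inside an RStudio Project this is the project's root folder.
project_root <- getwd()
project_root

# Build a path relative to that folder.
# "data" and "cats.csv" are hypothetical names used only for illustration.
cats_file <- file.path("data", "cats.csv")
cats_file
## [1] "data/cats.csv"

# file.exists() tells you whether such a file is really there before you read it.
file.exists(cats_file)
```

Because the path is relative to the project root, the same code keeps working when the whole project folder is moved or shared.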
R Markdown is like a magic notebook for data work. You can write text, add code, and see the results of that code all in one place. When you’re done, you can turn your work into a report, presentation, or even a website. It’s a way to share your code, results, and explanations in a single document.
Pay attention to how the number of # symbols changes the size of the heading:
Okay, let’s dive into some basic text formatting.
For Bold, type: “Double Asterisk, Meow, Double Asterisk” or “Double Underscore, Meow, Double Underscore” (Pause and demonstrate)
**Meow** __Meow__
To make your text italic, it’s almost the same, but you’ll use just one asterisk or one underscore on each side of your text.
*Meow* _Meow_
There’s more! You can even do a strikethrough by wrapping your text with double tildes.
~~Meow~~
Want to create a quote? Use the “Greater Than” symbol (>) to make a blockquote. For nested quotes, just add another “Greater Than” symbol (>>).
> Why did the cat sit on the computer?
>> Because it wanted to keep an eye on the mouse!
Basically, more dashes create a longer line: one gives a hyphen, two give an en dash, and three give an em dash.
2023–08–20
2023-08-20
Cat—Felis catus
Lists
Lists are super easy. For bullet points, you use an asterisk followed by a space. To nest items, simply indent them and use a dash followed by a space.
For Bullet List, type: “Asterisk, Space, Your Item” For Nested Items, type: “Indent, Dash, Space, Your Sub-item”
Domestic Cats: The ones you might have at home!
For numbered lists, you use the number, a dot, and a space.
| Header 1 | Header 2 |
|---|---|
| Row1Col1 | Row1Col2 |
| Row2Col1 | Row2Col2 |
| Cat breed | Average weight (lbs) |
|---|---|
| Devon Rex | 5-10 |
| Ragdoll | 8-20 |
| American Shorthair | 10-15 |
Links and Images
To add a hyperlink, use square brackets for the text and parentheses for the URL. Like this:
For Hyperlink, type: “Open Square Bracket, Text, Close Square Bracket, Open Parenthesis, URL, Close Parenthesis”
Click here for R Markdown cheatsheet
Adding an image is similar but you’ll start with an exclamation mark.
For Image, type: “Exclamation Mark, Open Square Bracket, Text, Close Square Bracket, Open Parenthesis, Image Path, Close Parenthesis”
For this case, my image is already loaded into the working folder of this file, so I only need to refer to the name of the picture.
# This is a code chunk in R Markdown
data <- c(1, 2, 3, 4, 5)
mean(data)
## [1] 3
R Markdown can be rendered into various formats:
* HTML

For other outputs, you will need to install additional packages, but these are what you can create:
* PDF
* Word
* Slides
* And more…
Alright, folks, let’s move on to a critical topic in data analysis—handling missing or undefined data. In R, this is often represented by the term ‘NA,’ which stands for “Not Available.”
Why does this matter? Well, missing data can significantly impact your analyses and conclusions. So it’s essential to know how to deal with it.
NA stands for “Not Available” in a dataset.
> Missing or undefined data, like blanks in the data
First, let’s create some sample data to work with. I’m going to generate three vectors (R’s basic one-dimensional data structure, which I’ll loosely call lists here): cat names, their weights, and their potty activities. Note that I’ll intentionally include some ‘NA’ values.
# create a list of cat names
cat_name <- c("Orca", "Charlie", "Luna", "Paws", "Cookie")
# create a list of recorded cat weights (NA = weight not recorded)
cat_weight <- c(4.5, 6.2, NA, 3.8, NA)
# create a list of detected cat potty activity (NA = nothing recorded)
Cat_potty <- c(TRUE, FALSE, NA, FALSE, TRUE)
Checking for ‘NA’ in Lists
Now, what if you want to know which values in our weight list are missing? R provides a straightforward function called is.na() for that.
#check if a value in the list is NA
na_check_cat_weight<- is.na(cat_weight)
na_check_cat_weight
## [1] FALSE FALSE TRUE FALSE TRUE
See the output? The function returns a logical vector, indicating ‘TRUE’ where the data is ‘NA’ and ‘FALSE’ otherwise.
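A logical vector is handy, but sometimes you want the positions or the names instead. Here is a minimal sketch using base R's which() and logical indexing; the variable names missing_positions and cats_missing_weight are just ones I picked for this example.

```r
# Same sample data as above
cat_name <- c("Orca", "Charlie", "Luna", "Paws", "Cookie")
cat_weight <- c(4.5, 6.2, NA, 3.8, NA)

# which() turns the TRUE/FALSE vector into the positions that are TRUE
missing_positions <- which(is.na(cat_weight))
missing_positions
## [1] 3 5

# Use the same logical vector to pick the names of cats with no weight recorded
cats_missing_weight <- cat_name[is.na(cat_weight)]
cats_missing_weight
## [1] "Luna"   "Cookie"
```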
Next, you might want to know how many ‘NA’ values are there. For that, we can use sum() along with is.na().
#Count the number of NA in the list
sum(is.na(cat_weight))
## [1] 2
As you see, it shows the total count of ‘NA’ values in our list.
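This works because R treats TRUE as 1 and FALSE as 0 when it does arithmetic. The same trick gives you the share of missing values with mean(), and base R's anyNA() answers the quick yes/no question. A small sketch, with variable names chosen only for this example:

```r
cat_weight <- c(4.5, 6.2, NA, 3.8, NA)

# TRUE counts as 1 and FALSE as 0, so the mean is the share of missing values
share_missing <- mean(is.na(cat_weight))
share_missing
## [1] 0.4

# anyNA() only tells you whether there is at least one NA
has_missing <- anyNA(cat_weight)
has_missing
## [1] TRUE
```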
Now, what if you want to get rid of these pesky ‘NA’ values? Simple, you can use the na.omit() function.
#Remove NA from the list
na_removal_cat_weight <- na.omit(cat_weight)
na_removal_cat_weight
## [1] 4.5 6.2 3.8
## attr(,"na.action")
## [1] 3 5
## attr(,"class")
## [1] "omit"
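Notice how the ‘NA’ values are removed from the list. The extra attr lines in the output are bookkeeping from na.omit(): they record the positions (3 and 5) of the values that were dropped.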
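If you would rather get a plain vector back without any extra attributes, one common alternative is logical indexing with !is.na(). This is a sketch of that alternative, not a replacement for na.omit(); the name weight_without_na is mine.

```r
cat_weight <- c(4.5, 6.2, NA, 3.8, NA)

# Keep only the positions where the value is NOT missing
weight_without_na <- cat_weight[!is.na(cat_weight)]
weight_without_na
## [1] 4.5 6.2 3.8

# Unlike the na.omit() result, nothing extra is attached to this vector
attributes(weight_without_na)
## NULL
```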
Replacing ‘NA’ Values
Alternatively, you might want to replace ‘NA’ with a specific value. Let’s say, zero.
#replace NA with another value, in this case, 0
na_replace_cat_weight <- cat_weight
na_replace_cat_weight[is.na(cat_weight)] <- 0
na_replace_cat_weight
## [1] 4.5 6.2 0.0 3.8 0.0
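The same replacement can also be written in one step with base R's replace(), which returns a modified copy and leaves the original alone. A minimal sketch, using a variable name I made up for this example:

```r
cat_weight <- c(4.5, 6.2, NA, 3.8, NA)

# replace(x, positions, value) returns a copy of x with those positions changed
weight_zero_filled <- replace(cat_weight, is.na(cat_weight), 0)
weight_zero_filled
## [1] 4.5 6.2 0.0 3.8 0.0

# The original vector still contains its two NA values
sum(is.na(cat_weight))
## [1] 2
```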
Now, let’s combine our lists into a data frame, which is a more complex data structure.
cat_health <- data.frame(name = cat_name, weight = cat_weight, potty_activity = Cat_potty)
cat_health
##      name weight potty_activity
## 1    Orca    4.5           TRUE
## 2 Charlie    6.2          FALSE
## 3    Luna     NA             NA
## 4    Paws    3.8          FALSE
## 5  Cookie     NA           TRUE
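Once the data is in a data frame, the is.na() and sum() idea from earlier extends to every column at once. Here is a sketch using base R's colSums(); the lowercase column vector cat_potty and the result name na_per_column are choices made for this example only.

```r
cat_name <- c("Orca", "Charlie", "Luna", "Paws", "Cookie")
cat_weight <- c(4.5, 6.2, NA, 3.8, NA)
cat_potty <- c(TRUE, FALSE, NA, FALSE, TRUE)
cat_health <- data.frame(name = cat_name, weight = cat_weight, potty_activity = cat_potty)

# is.na() on a data frame gives a TRUE/FALSE matrix;
# colSums() then counts the TRUE values in each column
na_per_column <- colSums(is.na(cat_health))
na_per_column
##           name         weight potty_activity 
##              0              2              1 
```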
Let’s say you want to calculate the mean weight of these cats, ignoring the ‘NA’ values. You can do this by adding the argument na.rm = TRUE to the mean() function. Without it, mean() returns NA whenever the data contains an ‘NA’.
# use na.rm = TRUE to ignore NA values if they exist
mean_weight <- mean(cat_health$weight, na.rm = TRUE)
mean_weight
## [1] 4.833333
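To see why the way you handle ‘NA’ matters, compare three versions of the same calculation. This is only an illustration with the sample weights; the variable names are mine.

```r
cat_weight <- c(4.5, 6.2, NA, 3.8, NA)

# 1. Default behaviour: any NA makes the result NA
mean_default <- mean(cat_weight)
mean_default
## [1] NA

# 2. Ignore the NA values: average of the three recorded weights
mean_ignoring_na <- mean(cat_weight, na.rm = TRUE)
mean_ignoring_na
## [1] 4.833333

# 3. Replace NA with 0 first: the two zeros pull the average down
mean_zero_filled <- mean(replace(cat_weight, is.na(cat_weight), 0))
mean_zero_filled
## [1] 2.9
```

Replacing ‘NA’ with zero treats the two unweighed cats as if they weighed nothing, so think about whether a replacement value makes sense before you calculate with it.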
Finally, you can use complete.cases() to identify rows that have no ‘NA’ values: it returns TRUE for a complete row and FALSE for a row that contains at least one ‘NA’.
# use complete.cases() to check whether a row in a data frame is free of NA
luna_row <- cat_health[cat_health$name == "Luna", ] # Get Luna's information from the table
luna_completion <- complete.cases(luna_row) # Check if Luna's information is complete
luna_completion
## [1] FALSE
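The same function works on the whole data frame, not just one row. Here is a minimal sketch that checks every cat and then keeps only the complete rows; cat_potty, row_is_complete and complete_cats are names picked for this example.

```r
cat_name <- c("Orca", "Charlie", "Luna", "Paws", "Cookie")
cat_weight <- c(4.5, 6.2, NA, 3.8, NA)
cat_potty <- c(TRUE, FALSE, NA, FALSE, TRUE)
cat_health <- data.frame(name = cat_name, weight = cat_weight, potty_activity = cat_potty)

# One TRUE/FALSE per row: TRUE means the row has no NA at all
row_is_complete <- complete.cases(cat_health)
row_is_complete
## [1]  TRUE  TRUE FALSE  TRUE FALSE

# Keep only the complete rows
complete_cats <- cat_health[row_is_complete, ]
complete_cats$name
## [1] "Orca"    "Charlie" "Paws"
```

Cookie is dropped as well as Luna, because a single ‘NA’ in any column is enough to make a row incomplete.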
When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks.