Terhemen Hulugh
2024-07-28
The purpose of the first exercise of the day is to introduce you to R, RStudio and some basic concepts in R.
After completing the exercise you should be able to:
Install R from the R homepage: https://www.r-project.org/
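Install RStudio from the RStudio homepage: https://www.rstudio.com/
RStudio is divided into 4 quadrants (panes). In the default layout these are the source editor, the console, the Environment/History pane, and the Files/Plots/Packages/Help pane.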
## [1] "R version 4.3.0 (2023-04-21 ucrt)"
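The version string above is how R describes itself. A minimal sketch, assuming the original chunk simply printed the built-in `R.version.string` (the name `current_version` is only for illustration):

```r
# Built-in character string describing the running R version
current_version <- R.version.string
print(current_version)

# getRversion() returns a version object that can be compared directly
getRversion() >= "4.0.0"
```

Your own output will show whichever version is installed on your machine.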
To update R, run the code below:
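The update code itself is not shown in this copy of the handout. One common route on Windows is the `installr` package; treating that as the intended method is an assumption. The guard variable `can_update` is hypothetical and keeps the sketch from doing anything in a non-interactive or non-Windows session:

```r
# Assumption: updating through the installr package, which targets Windows only
can_update <- interactive() && .Platform$OS.type == "windows"

if (can_update) {
  # Install installr once if it is not already available
  if (!requireNamespace("installr", quietly = TRUE)) {
    install.packages("installr")
  }
  # Walks you through downloading and installing the latest R release
  installr::updateR()
}
```

On macOS or Linux, download the new version from the R homepage or use the system package manager instead.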
Difference between a Package and Function in R (Give examples)
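As one possible illustration of the distinction: a function is a single piece of reusable code, while a package is a collection of functions (plus documentation and data). The choice of `sd()` and the `stats` package here is just an example, and `stats_functions` is a name introduced for this sketch:

```r
# sd() is a function: one reusable piece of code
is.function(sd)

# It lives in the stats package, which ships with R
environmentName(environment(sd))

# A package bundles many functions; stats exports sd, median, lm and more
stats_functions <- getNamespaceExports("stats")
c("sd", "median", "lm") %in% stats_functions
```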
Install and load the following packages: tidyverse, dplyr and ggplot2 using the install.packages() and library() functions respectively.
Always ensure the package name is in double quotes (" ") when it is used in the install.packages() function.
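The install-and-load step above can be sketched as follows. This version is deliberately cautious: it only installs packages that are missing, and only in an interactive session. The names `pkgs` and `is_installed` are introduced for illustration.

```r
# Package names are character strings, so they are quoted
pkgs <- c("tidyverse", "dplyr", "ggplot2")

# Check which packages are already installed
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Install only what is missing, and only when working interactively
if (interactive() && any(!is_installed)) {
  install.packages(pkgs[!is_installed])
}

# Load each available package; character.only = TRUE lets library()
# accept the name stored in a variable
for (pkg in pkgs) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    library(pkg, character.only = TRUE)
  }
}
```

When typing at the console, the usual short form is `install.packages("tidyverse")` followed by `library(tidyverse)`.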
## [1] "C:/Users/TERHEMEN HULUGH/Documents/R Training"
It is recommended that you also organise your work by creating data, script, results and graphics folders inside your working directory. Note: replace the path with one that matches your setup, i.e. copy and paste it from the folder. On Windows, change the backslashes in a pasted path to forward slashes.
## [1] "C:/Users/TERHEMEN HULUGH/Desktop/R Training"
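The two paths above are the kind of output `getwd()` prints before and after the working directory is changed with `setwd()`. A minimal sketch of that workflow, using a temporary directory as a stand-in for your own project folder (`project_dir` is a hypothetical location; replace it with your real path):

```r
# Show the current working directory
getwd()

# Hypothetical project location; replace with your own path
project_dir <- file.path(tempdir(), "R Training")
dir.create(project_dir, showWarnings = FALSE, recursive = TRUE)

# Create the recommended sub-folders
for (folder in c("data", "script", "results", "graphics")) {
  dir.create(file.path(project_dir, folder), showWarnings = FALSE)
}

# Switch to the project folder; setwd() returns the previous directory
old_wd <- setwd(project_dir)
getwd()

# Switch back so the rest of the session is unaffected
setwd(old_wd)
```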
The hash (#) symbol marks a comment: R ignores everything to the right of it during evaluation.
The assignment operator (<-) assigns a value to a name, creating an object such as a vector.
The c() function combines values into a vector.
An object is auto-printed when you type its name, or printed explicitly with the print() function.
## [1] "John" "Peter" "Kingsley" "Andrew"
## [1] "John" "Peter" "Kingsley" "Andrew"
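The two identical outputs above correspond to auto-printing and explicit printing of the same vector. A short sketch, assuming the vector was stored under a name such as `students` (the original name is not shown):

```r
# Everything after the hash is a comment and is ignored by R
students <- c("John", "Peter", "Kingsley", "Andrew")  # create a character vector

students         # auto-printing: typing the name prints the object
print(students)  # explicit printing with print()
```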
A vector contains elements of a single class. The one exception is a list, which is a vector whose elements can belong to different classes.
The 5 basic atomic classes of objects in R are:
numeric
integer
character
logical
complex
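The classes listed above can be checked with `class()`. The sketch below also shows what happens when classes are mixed: `c()` coerces everything to one class, while `list()` keeps each element as it is. The names `mixed_vector` and `mixed_list` are illustrative.

```r
class(3.14)     # "numeric"
class(2L)       # "integer" (the L suffix requests an integer)
class("hello")  # "character"
class(TRUE)     # "logical"
class(1 + 2i)   # "complex"

# Mixing classes in c() coerces every element to a common class
mixed_vector <- c(1, "a", TRUE)
class(mixed_vector)        # "character"

# A list keeps each element's own class
mixed_list <- list(1, "a", TRUE)
sapply(mixed_list, class)  # "numeric" "character" "logical"
```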
A matrix is a vector with a dimension attribute (rows and columns). The matrix() function fills it column-wise by default.
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    3    5    7    9
## [2,]    2    4    6    8   10
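A call along the following lines reproduces the matrix printed above; the variable names `m` and `m_by_row` are assumptions. The second call shows the `byrow` argument, which fills row by row instead.

```r
# Fill a 2 x 5 matrix with 1..10, column by column (the default)
m <- matrix(1:10, nrow = 2, ncol = 5)
m
dim(m)  # 2 5

# byrow = TRUE fills row by row instead
m_by_row <- matrix(1:10, nrow = 2, ncol = 5, byrow = TRUE)
m_by_row
```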
Create a matrix by assigning a dimension attribute to a vector with the dim() function
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    3    5    7    9
## [2,]    2    4    6    8   10
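The same matrix can be obtained by starting from a plain vector and giving it dimensions. A minimal sketch, with `v` as an assumed variable name:

```r
# Start with an ordinary integer vector
v <- 1:10

# Assigning dimensions (rows, columns) turns it into a 2 x 5 matrix
dim(v) <- c(2, 5)
v
is.matrix(v)  # TRUE
```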
Create a matrix using the cbind() and rbind() functions
##      x  y
## [1,] 1 11
## [2,] 2 12
## [3,] 3 13
## [4,] 4 14
## [5,] 5 15
##   [,1] [,2] [,3] [,4] [,5]
## x    1    2    3    4    5
## y   11   12   13   14   15
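The two outputs above come from binding the same pair of vectors first as columns and then as rows. A sketch that reproduces them, with `by_col` and `by_row` as illustrative names:

```r
x <- 1:5
y <- 11:15

# cbind() places the vectors side by side as columns
by_col <- cbind(x, y)
by_col

# rbind() stacks the vectors as rows
by_row <- rbind(x, y)
by_row
```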
Factors represent categorical data and are either unordered or ordered. Modelling functions such as lm() and glm() treat factors specially.
## [1] yes no yes yes no yes
## Levels: no yes
## [1] "no" "yes"
## x
##  no yes
##   2   4
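The outputs above (the factor itself, its levels, and a frequency table) can be reproduced along these lines:

```r
# Create a factor from a character vector
x <- factor(c("yes", "no", "yes", "yes", "no", "yes"))
x

# Levels are sorted alphabetically by default
levels(x)

# Count how often each level occurs
table(x)
```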
By default R orders the levels of a factor alphabetically, but you can set your own order with the levels argument of the factor() function:
x <- factor(c("male", "female", "male", "male", "female", "male"),
            levels = c("male", "female"))
levels(x)
## [1] "male" "female"
table(x)
## x
##   male female
##      4      2
Missing values are represented by NA or NaN and can be identified using the is.na() and is.nan() functions. A NaN value also counts as NA, but an NA value is not NaN.
## [1] FALSE FALSE TRUE FALSE FALSE
## [1] FALSE FALSE FALSE FALSE FALSE
## [1] FALSE FALSE TRUE TRUE FALSE
## [1] FALSE FALSE FALSE TRUE FALSE
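The four logical vectors above are consistent with testing two vectors, one containing only `NA` and one containing both `NA` and `NaN`. The non-missing values below are assumptions chosen to match the pattern of `TRUE` and `FALSE`:

```r
# A vector with one NA
x <- c(1, 2, NA, 10, 3)
is.na(x)   # FALSE FALSE  TRUE FALSE FALSE
is.nan(x)  # FALSE FALSE FALSE FALSE FALSE

# A vector with both NA and NaN
y <- c(1, 2, NA, NaN, 4)
is.na(y)   # FALSE FALSE  TRUE  TRUE FALSE (NaN counts as NA)
is.nan(y)  # FALSE FALSE FALSE  TRUE FALSE (NA is not NaN)
```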
##   foo   bar    dro
## 1   1  TRUE   male
## 2   2 FALSE   male
## 3   3 FALSE female
## 4   4  TRUE   male
## [1] 4
## [1] 3
## 'data.frame':    4 obs. of  3 variables:
##  $ foo: int  1 2 3 4
##  $ bar: logi  TRUE FALSE FALSE TRUE
##  $ dro: chr  "male" "male" "female" "male"
##       foo          bar              dro
##  Min.   :1.00   Mode :logical   Length:4
##  1st Qu.:1.75   FALSE:2         Class :character
##  Median :2.50   TRUE :2         Mode  :character
##  Mean   :2.50
##  3rd Qu.:3.25
##  Max.   :4.00
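The outputs above describe a data frame with four rows and three columns, followed by its row count, column count, structure and summary. A sketch that reproduces them, assuming the object was called `df` (the original name is not shown):

```r
# Build a data frame column by column
df <- data.frame(
  foo = 1:4,
  bar = c(TRUE, FALSE, FALSE, TRUE),
  dro = c("male", "male", "female", "male"),
  stringsAsFactors = FALSE  # keep dro as character (the default since R 4.0.0)
)

df           # print the data frame
nrow(df)     # number of rows: 4
ncol(df)     # number of columns: 3
str(df)      # compact description of each column
summary(df)  # per-column summary statistics
```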
The names() function gets or sets the names of an object, such as the elements of a vector or the columns of a dataset
## [1] 1 2 3
## NULL
## foo boo bar
##   1   2   3
## [1] "foo" "boo" "bar"
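The outputs above show a vector before and after it is given names. A sketch that reproduces them:

```r
x <- 1:3
x         # an unnamed vector
names(x)  # NULL, because no names have been set yet

# Assign a name to each element
names(x) <- c("foo", "boo", "bar")
x         # values are now printed under their names
names(x)  # "foo" "boo" "bar"
```

For a data frame, `names()` returns or sets the column names in the same way.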
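There are several principal functions for reading data into R, for example read.table() and read.csv() for tabular text files, and load() and readRDS() for data saved from R.
Functions for saving data in R include write.table(), write.csv(), save() and saveRDS().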
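As a small illustration of reading and saving, the sketch below writes a data frame to a CSV file and to an RDS file, then reads both back. The data frame `scores` and the temporary file paths are made up for this example.

```r
# A small example data set (hypothetical values)
scores <- data.frame(id = 1:3, score = c(90.5, 72, 88))

# Save as CSV and read it back
csv_path <- tempfile(fileext = ".csv")
write.csv(scores, csv_path, row.names = FALSE)
scores_from_csv <- read.csv(csv_path)

# Save as RDS (R's own format for a single object) and read it back
rds_path <- tempfile(fileext = ".rds")
saveRDS(scores, rds_path)
scores_from_rds <- readRDS(rds_path)

scores_from_csv
scores_from_rds
```

In your own work, replace the temporary paths with files in your data or results folder.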
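Three operators extract subsets from an object: [ returns an object of the same class and can select several elements, [[ extracts a single element of a list or data frame, and $ extracts an element by name.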
## [1] "a"
## [1] "d"
## [1] "a" "b" "c" "b"
## [1] "b" "c" "b" "d"
## $foo
## [1] 1 2 3 4
## [1] 1 2 3 4
## [1] 1 2 3 4
## $foo
## [1] 1 2 3 4
##
## $goo
## [1] "hello"
## [1] 1 2 3 4
## [1] 1 2 3 4
## [1] 0.6
## [1] 0.6
## $bar
## [1] 0.6
## [1] 5
## [1] 4
##      [,1]
## [1,]    5
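The outputs above are consistent with subsetting a character vector, a named list and a matrix in turn. The sketch below uses assumed objects (`x`, `lst` and `m`) and indices chosen to give matching results; the original calls are not shown.

```r
# Vector: [ selects by position or by a logical condition
x <- c("a", "b", "c", "b", "d", "a")
x[1]         # "a"
x[5]         # "d"
x[1:4]       # "a" "b" "c" "b"
x[x != "a"]  # "b" "c" "b" "d"

# List: [ returns a list, [[ and $ return the element itself
lst <- list(foo = 1:4, bar = 0.6, goo = "hello")
lst[1]         # a list containing foo
lst[[1]]       # the vector 1 2 3 4
lst$foo        # same, by name
lst[c(1, 3)]   # a list containing foo and goo
lst[["bar"]]   # 0.6
lst$bar        # 0.6
lst["bar"]     # a list containing bar

# Matrix: [row, column]; drop = FALSE keeps the result a matrix
m <- matrix(1:6, nrow = 2, ncol = 3)
m[1, 3]                # 5
m[2, 2]                # 4
m[1, 3, drop = FALSE]  # 1 x 1 matrix holding 5
```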
If you’ve installed swirl in the past, make sure you have a recent version (2.2.21 or later). You can check your current version by typing this in the console: packageVersion("swirl")
Load swirl: every time you want to use swirl, you need to first load the package. From the R console: library(swirl)
Install the R Programming course: swirl offers a variety of interactive courses, but for our purposes you want the one called R Programming. Type the following at the R prompt to install this course: install_from_swirl("R Programming")
Type the following in the RStudio console to start swirl: swirl()
Then, follow the menus and select the R Programming course when given the option.
For the first part of this course you should complete the following lessons:
Basic Building Blocks
Workspace and Files
Sequences of Numbers
Vectors
Missing Values
Subsetting Vectors
Matrices and Data Frames