R is a programming language and free software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing. It was developed by Ross Ihaka and Robert Gentleman at the University of Auckland in 1993.
R can be downloaded for free from https://www.r-project.org/
RStudio lets you run R in a more user-friendly environment. It is open source and free to download from https://rstudio.com/products/rstudio/download/
# 1.1 R as a calculator
## 1.1.1 Performing various arithmetic operations: addition (+), subtraction (-), multiplication (*), exponentiation (^), etc.
25+3
## [1] 28
25-3
## [1] 22
25*3
## [1] 75
25/3 # division
## [1] 8.333333
25%/%3 # gives the integer quotient when 25 is divided by 3
## [1] 8
2**3 # ** is an alternative spelling of ^
## [1] 8
2^3
## [1] 8
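Integer division with `%/%` has a companion, the remainder operator `%%`. The sketch below shows how the two relate and how operator precedence affects a result. The variable names `quotient` and `remainder` are illustrative and not part of the original notes.

```r
# Integer quotient and remainder of 25 divided by 3
quotient  <- 25 %/% 3   # 8
remainder <- 25 %% 3    # 1

# The two pieces rebuild the original number
3 * quotient + remainder == 25   # TRUE

# Precedence: ^ is evaluated before * and /, which come before + and -
2 + 3 * 4^2     # 2 + 3 * 16 = 50
(2 + 3) * 4^2   # 5 * 16 = 80
```

Parentheses are the simplest way to make the intended order explicit.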
## 1.1.2 Built-in functions
Built-in functions are pre-defined functions that are available as soon as R starts.
exp(6)
## [1] 403.4288
sqrt(36) # square root of 36
## [1] 6
sum(2,3,4,5,6)
## [1] 20
log(64) # natural logarithm (base e)
## [1] 4.158883
log(10,2) # log of 10 to base 2
## [1] 3.321928
log(42,10) # log of 42 to base 10
## [1] 1.623249
log(5,3)
## [1] 1.464974
factorial(5) # 5! = 1*2*3*4*5
## [1] 120
abs(-8.5) # absolute value
## [1] 8.5
floor(3.8) # greatest integer less than or equal to 3.8
## [1] 3
ceiling(3.2) # smallest integer greater than or equal to 3.2
## [1] 4
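Base R also provides shortcuts for the most common logarithm bases, and `round()` complements `floor()` and `ceiling()`. A small sketch using only base R functions:

```r
# Dedicated helpers for common logarithm bases
log10(1000)   # 3
log2(8)       # 3
log(exp(1))   # 1, because log() uses base e unless a base is given

# round() goes to the nearest value at the requested number of decimals
round(3.14159, 2)   # 3.14
round(8.333333)     # 8 (zero decimal places by default)
```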
rep(35,times=10) # repeat 35 ten times
## [1] 35 35 35 35 35 35 35 35 35 35
rep("Happy", times=5)
## [1] "Happy" "Happy" "Happy" "Happy" "Happy"
5:9 # display numbers from 5 to 9
## [1] 5 6 7 8 9
1:100
##  [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
## [19]  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
## [37]  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
## [55]  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
## [73]  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
## [91]  91  92  93  94  95  96  97  98  99 100
seq(5,9) # generates a sequence of numbers from 5 to 9
## [1] 5 6 7 8 9
seq(5,10,0.5) # generates a sequence from 5 to 10 in steps of 0.5
## [1] 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0
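`seq()` and `rep()` accept a few more named arguments than the calls above use. The following sketch shows `length.out`, a negative `by`, and `each`, all of which are standard base R arguments:

```r
# Ask for a fixed number of evenly spaced values instead of a step size
seq(0, 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00

# Count downwards with a negative step
seq(10, 1, by = -3)         # 10 7 4 1

# rep() can repeat each element in turn, or the whole vector
rep(c(1, 2), each = 3)      # 1 1 1 2 2 2
rep(c(1, 2), times = 3)     # 1 2 1 2 1 2
```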
seq(1,50,5)
## [1] 1 6 11 16 21 26 31 36 41 46
## 1.1.3 Relational Operators
Relational operators in R compare two values and return TRUE or FALSE depending on the relationship between them.
2<5
## [1] TRUE
3>9
## [1] FALSE
2+3==6 # checks 'is equal to'
## [1] FALSE
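One caveat with `==` is that decimal fractions are stored approximately, so an exact comparison can give an unexpected answer. A short sketch of the issue and of the base R function `all.equal()`, which compares within a small tolerance:

```r
# Decimal fractions are stored approximately, so == can surprise
0.1 + 0.2 == 0.3                    # FALSE

# all.equal() allows a small tolerance; isTRUE() turns its result into a plain TRUE/FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE

# The remaining relational operators
5 <= 5   # TRUE  (less than or equal to)
7 >= 9   # FALSE (greater than or equal to)
```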
2!=3 # checks 'not equal to'
## [1] TRUE
# 1.2 Assigning a value to a variable
We can assign a value to a variable using the assignment operator <- or the equal sign =.
Syntax: variable_name <- value
x <- 48 # x is assigned the value 48
print(x)
## [1] 48
x/5
## [1] 9.6
x*2
## [1] 96
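Two details of assignment are easy to miss: a variable stores a value rather than a formula, and names are case-sensitive. The sketch below uses the hypothetical variable names `radius` and `area` to illustrate both.

```r
radius <- 3             # hypothetical variable name
area <- pi * radius^2   # pi is a built-in constant
area                    # about 28.27

radius <- 5             # reassigning replaces the old value of radius...
area                    # ...but area is not recomputed; it is still about 28.27

exists("radius")        # TRUE
exists("Radius")        # FALSE unless an object called Radius has been created
```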
y<-"Happy"
rm(x) # to remove x from memory
rm(y)
# 1.3 Vectors
In R, a vector is a fundamental data structure that holds a collection of values of the same data type. Vectors are used extensively in data analysis, statistics, and programming. The function c() creates a vector.
```r
a<-c(20,30,40,45,56) # easiest way to create a vector in R
str(a) # shows the data type of a and its contents
```
## num [1:5] 20 30 40 45 56
print(a)
## [1] 20 30 40 45 56
View(a) # opens a in a spreadsheet-style viewer (for example in RStudio)
# Subsetting (operator [ ])
Subsetting with the [ ] operator lets you extract specific elements, or subsets of elements, from an object.
a[3] # extract the third element of vector a
## [1] 40
a[5]
## [1] 56
a[c(1,3,5)] # extract several values
## [1] 20 40 56
a[-2] # drop the second value
## [1] 20 40 45 56
a[-3]
## [1] 20 30 45 56
a[c(-2,-5)] # drop the 2nd and 5th values
## [1] 20 40 45
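Besides positions, the [ ] operator also accepts a logical condition, which combines subsetting with the relational operators from section 1.1.3. A minimal sketch, with the vector `a` redefined so the block runs on its own:

```r
a <- c(20, 30, 40, 45, 56)   # same values as the vector a used in this section

a > 40                # FALSE FALSE FALSE TRUE TRUE
a[a > 40]             # 45 56
a[a >= 30 & a < 50]   # 30 40 45
which(a > 40)         # 4 5 (positions, not values)
```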
length(a) # number of elements
## [1] 5
# Find the class of a vector
a1<-c(20,30,40,45,56)
class(a1) # class of a vector
## [1] "numeric"
m <- c(5, 'a', -1, 2)
class(m) # mixing numbers and text turns every element into a character string
## [1] "character"
m<-c(TRUE,F,T,FALSE)
class(m)
## [1] "logical"
sapply(m, class) # display class of all elements
## [1] "logical" "logical" "logical" "logical"
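Because a vector holds a single data type, R silently converts mixed input to the most general type present (logical, then numeric, then character). The sketch below shows this coercion and the explicit conversion functions from base R:

```r
# Implicit coercion inside c()
class(c(TRUE, 2))     # "numeric": TRUE becomes 1
class(c(2, "a"))      # "character": 2 becomes "2"
c(TRUE, 2, "a")       # "TRUE" "2" "a"

# Explicit conversion
as.numeric("3.5") + 1   # 4.5
as.numeric(TRUE)        # 1
as.character(42)        # "42"
```

This is also why `sapply(m, class)` reports the same class for every element of a vector.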
# Operations on vectors
b<-c(1,2,3,4,5)
a+b # element-wise addition
## [1] 21 32 43 49 61
a*b
## [1] 20 60 120 180 280
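Arithmetic also works between a vector and a single number, or between vectors of different lengths, through R's recycling rule: the shorter operand is repeated to match the longer one. A brief sketch, redefining `a` so the block is self-contained:

```r
a <- c(20, 30, 40, 45, 56)   # same values as the vector a used in this section

a + 1     # 21 31 41 46 57 (the single value 1 is reused for every element)
a / 10    # 2.0 3.0 4.0 4.5 5.6
a^2       # 400 900 1600 2025 3136

# Lengths 4 and 2 divide evenly, so the shorter vector is recycled without a warning
c(1, 2, 3, 4) * c(10, 100)   # 10 200 30 400
```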
a-b
## [1] 19 28 37 41 51
# Question 1
Construct a vector with elements -2, -3, -6, 10, 7 and assign it as X, and another vector with elements 11, 23, 14, 52, 16 and assign it as Y.
a) Find the length of X and Y
b) Remove the 2nd element from Y
c) Find the 4th element of X
d) Find X+Y and X*Y
e) Find Y/X and round to 1 decimal place
X<-c(-2,-3,-6,10,7)
Y<-c(11,23,14,52,16)
length(X)
## [1] 5
length(Y)
## [1] 5
Y[-2]
## [1] 11 14 52 16
X[4]
## [1] 10
X+Y
## [1] 9 20 8 62 23
X*Y
## [1] -22 -69 -84 520 112
round(Y/X,1)
## [1] -5.5 -7.7 -2.3 5.2 2.3
data <- c("apple", 3.14, 42, TRUE)
class(data[2]) # 3.14 was coerced to the string "3.14"
## [1] "character"
# Question 2
# The weights of a group of 5 people before and after a diet plan are given.
# Find the weight loss and also its mean.
# before: 78,72,78,79,105
# after : 67,65,79,70,93
before<-c(78,72,78,79,105)
after<-c(67,65,79,70,93)
wtloss<-before-after
wtloss
## [1] 11 7 -1 9 12
mean(wtloss)
## [1] 7.6
max(wtloss)
## [1] 12
min(wtloss)
## [1] -1
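The same weight-loss vector can answer a few follow-up questions with logical conditions and position helpers. This is an optional sketch that goes beyond what the question asks; the data are repeated so the block runs on its own.

```r
before <- c(78, 72, 78, 79, 105)
after  <- c(67, 65, 79, 70, 93)
wtloss <- before - after

sum(wtloss > 0)      # 4 people lost weight (each TRUE counts as 1)
which(wtloss < 0)    # 3: the third person gained weight
which.max(wtloss)    # 5: position of the largest loss

# Loss as a percentage of the starting weight
round(100 * wtloss / before, 1)   # 14.1 9.7 -1.3 11.4 11.4
```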
# Practice Questions
## Question 1: Create a vector grades with elements "A", "B", "C", "D", "F". Change the third element to "B+".
## Question 2: Create a vector temperatures with elements 72, 68, 75, 80, 77. Convert the temperatures from Fahrenheit to Celsius using the formula (Fahrenheit - 32) * 5/9. Round the Celsius temperatures to one decimal place.
## Question 3: Create two vectors, vector1 with elements 1 to 5 and vector2 with elements 6 to 10. Calculate the product of these vectors.
## Question 4: Create a vector names with elements "John", "Jane", "Bob", "Alice", "Eve". Extract the first three elements of the vector and store them in a new vector called subset_names.
## Question 5: Create a vector data with the following values: "apple", 3.14, 42, TRUE. Find the class of each element in the vector and store the results in a new vector called data_classes.
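Question 1 relies on replacing an element in place, which combines the [ ] operator with assignment. A sketch of the idea on a hypothetical vector `colours` (deliberately not the vector from the question):

```r
colours <- c("red", "green", "blue")   # hypothetical example vector
colours[2] <- "yellow"                 # overwrite the second element
colours                                # "red" "yellow" "blue"

colours[c(1, 3)] <- c("black", "white")   # replace several elements at once
colours                                   # "black" "yellow" "white"
```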
In a dataframe, data is organized into rows and columns, where each column can contain data of a different data type.
# create a DATA FRAME
Applied <- data.frame(
Specialization = c("Bio","Chem","ES","Bio"),
Level = c("AD","Btech","Dip","Btech"),
GPA = c(3.21,3.63,2.15,2.01))
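# SUBSETTING: create a subset of dataframe 'Applied' containing the
# 2nd and 3rd columns only. Call it 'AS2'
AS2 <- Applied[,c(2,3)]
# SUBSETTING: create a subset of dataframe 'Applied' containing the
# last and first columns only. Call it 'AS3'
AS3 <- Applied[,c(3,1)]
# SUBSETTING: create a subset of dataframe 'Applied' containing the
# 2nd column and last three rows only. Call it 'AS4'
# (drop = FALSE keeps the result a data frame instead of a plain vector)
AS4 <- Applied[c(2:4),2, drop = FALSE]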
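Selecting a single column with `[rows, column]` returns a plain vector unless `drop = FALSE` is given. The sketch below contrasts the two results; `as_vector` and `as_frame` are hypothetical names used only for this illustration.

```r
Applied <- data.frame(
  Specialization = c("Bio", "Chem", "ES", "Bio"),
  Level = c("AD", "Btech", "Dip", "Btech"),
  GPA = c(3.21, 3.63, 2.15, 2.01)
)

as_vector <- Applied[2:4, 3]                 # single column: simplified to a vector
as_frame  <- Applied[2:4, 3, drop = FALSE]   # single column kept as a data frame

class(as_vector)   # "numeric"
class(as_frame)    # "data.frame"
dim(as_frame)      # 3 1

# Columns can also be selected by name
Applied[2:4, "GPA", drop = FALSE]
```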
# Create a dataframe
data <- data.frame(
Name = c("Alice", "Bob", "Charlie", "David", "Eva"),
Age = c(25, 30, 22, 35, 28),
Score = c(92, 85, 78, 96, 88)
)
# Print the dataframe
print(data)
##      Name Age Score
## 1   Alice  25    92
## 2     Bob  30    85
## 3 Charlie  22    78
## 4   David  35    96
## 5     Eva  28    88
# Access specific columns
names <- data$Name
ages <- data$Age
scores <- data$Score
# Print the values of specific columns
cat("Names: ", names, "\n")
## Names: Alice Bob Charlie David Eva
cat("Ages: ", ages, "\n")
## Ages: 25 30 22 35 28
cat("Scores: ", scores, "\n")
## Scores: 92 85 78 96 88
# Calculate summary statistics
mean_age <- mean(ages)
mean_score <- mean(scores)
cat("Mean Age: ", mean_age, "\n")
## Mean Age: 28
cat("Mean Score: ", mean_score, "\n")
## Mean Score: 87.8
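Summary statistics can also be computed for several columns at once instead of one extracted vector at a time. A sketch with base R functions, rebuilding the same data frame so the block is self-contained:

```r
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David", "Eva"),
  Age = c(25, 30, 22, 35, 28),
  Score = c(92, 85, 78, 96, 88)
)

numeric_cols <- data[, c("Age", "Score")]   # hypothetical helper name

colMeans(numeric_cols)     # Age 28.0, Score 87.8
sapply(numeric_cols, sd)   # standard deviation of each column
summary(data$Score)        # minimum, quartiles, median, mean, maximum
nrow(data)                 # 5
```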
# Filter data based on a condition (e.g., age greater than 25)
filtered_data <- data[data$Age > 25, ]
# Print the filtered dataframe
print(filtered_data)
##    Name Age Score
## 2   Bob  30    85
## 4 David  35    96
## 5   Eva  28    88
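Conditions can be combined with `&` (and) and `|` (or), and base R's `subset()` offers a shorter way to write the same kind of filter. A minimal sketch on the same data frame, rebuilt here so the block runs on its own:

```r
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David", "Eva"),
  Age = c(25, 30, 22, 35, 28),
  Score = c(92, 85, 78, 96, 88)
)

# Both conditions must hold
older_high <- data[data$Age > 25 & data$Score > 85, ]   # rows for David and Eva

# Either condition may hold; keep only the Name column
young_or_top <- data[data$Age < 25 | data$Score > 90, "Name"]   # Alice, Charlie, David

# subset() evaluates the condition inside the data frame and can pick columns
subset(data, Age > 25, select = c(Name, Score))   # Bob, David, Eva without Age
```

`older_high` and `young_or_top` are hypothetical names chosen for the illustration.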
# Sort the dataframe by Age in descending order
sorted_data <- data[order(data$Age, decreasing = TRUE), ]
# Print the sorted dataframe
print(sorted_data)
##      Name Age Score
## 4   David  35    96
## 2     Bob  30    85
## 5     Eva  28    88
## 1   Alice  25    92
## 3 Charlie  22    78
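Sorting works because `order()` returns row positions rather than sorted values, and those positions are then used as a row index. The sketch below makes that visible and shows two related helpers from base R; the data frame is rebuilt so the block is self-contained.

```r
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David", "Eva"),
  Age = c(25, 30, 22, 35, 28),
  Score = c(92, 85, 78, 96, 88)
)

sort(data$Score)     # 78 85 88 92 96 (values only, ascending)
order(data$Score)    # 3 2 5 1 4 (row positions that put Score in ascending order)

data[order(data$Score), ]             # whole rows, ascending by Score
data$Name[which.max(data$Score)]      # David, the top scorer
head(data[order(-data$Score), ], 3)   # three highest scores: David, Alice, Eva
```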