Numbers (which include integers and floats, e.g. 1 and 6.88)
Strings (anything in quotation marks). In R, a string is stored as a character vector.
Not sure what data type something is? Use the class() function.
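As a quick illustration of class(), here is a minimal sketch that checks a few values. The values are arbitrary examples chosen for illustration; the `1L` form is included only to show how R distinguishes integers from ordinary numbers.

```r
# Ask R for the class (data type) of a few example values
class(6.88)     # "numeric"
class(1)        # "numeric": a plain 1 is stored as a floating-point number
class(1L)       # "integer": the L suffix asks R for a true integer
class("Hello")  # "character"
class(TRUE)     # "logical"
```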
String Specific Functions:
Examples:
# vector of numbers
list1 <- c(1, 2, 3, 4, 5)
# vector of characters/strings
list2 <- c("Hello")
list3 <- c("H", "e", "l", "l", "o")
# Checking the data type:
class(list3)
[1] "character"
# Concatenating strings/lists
x <- 3
print("This is the number:", x) # does not concatenate
[1] "This is the number:"
cat("This is the number:", x) # does concatenate
This is the number: 3
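cat() prints its pieces but does not give you back a string to store. If you want the joined text as a value, base R's paste() family is one option. The sketch below is illustrative only; the variable names `msg`, `label`, `letters_vec`, and `word` are made up for this example.

```r
x <- 3

# paste() joins its arguments into one string, separated by a space by default
msg <- paste("This is the number:", x)
print(msg)                                  # [1] "This is the number: 3"

# paste0() joins without any separator
label <- paste0("beak_", x)                 # "beak_3"

# nchar() counts characters; toupper() converts to upper case
nchar("Hello")                              # 5
toupper("Hello")                            # "HELLO"

# collapse = "" glues a vector of single characters back into one string
letters_vec <- c("H", "e", "l", "l", "o")
word <- paste(letters_vec, collapse = "")   # "Hello"
```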
+ or - addition or subtraction
* multiplication
/ division
%% remainder (modulo)
** or ^ exponent
Booleans (true and false statements):
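== tests whether two values are equal (TRUE where they match). This is not the same as a single equals sign, =, which assigns a value.
!= tests whether two values are not equal (TRUE where they differ).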
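To see the operators above in action, here is a small sketch you can paste into the console. The numbers and the names `v` and `is_even` are arbitrary and only serve the illustration.

```r
# Arithmetic operators
7 + 2    # 9
7 - 2    # 5
7 * 2    # 14
7 / 2    # 3.5
7 %% 2   # 1, the remainder after dividing 7 by 2
2 ** 3   # 8, same result as 2 ^ 3

# Comparison operators return booleans (TRUE or FALSE)
3 == 3   # TRUE
3 != 3   # FALSE

# On a vector, the comparison is applied to every element
v <- c(1, 2, 3, 4)
is_even <- v %% 2 == 0
is_even  # FALSE TRUE FALSE TRUE
```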
# Given a vector, l1, print the even numbers in it.
l1 <- c(1, 2, 3, 4, 5, 6)
for (value in l1) {
  # print(value) # uncomment this line to check that you are iterating over the values you expect
  if (value %% 2 == 0) {
    # print(value)
    cat("Even numbers:", value, "\n")
  }
}
Even numbers: 2
Even numbers: 4
Even numbers: 6
# "\n" is equivalent to pressing 'enter' on the keyboard, so each print statement ends up on its own line
For loops allow you to loop over lists, vectors, or columns in order to perform a calculation for each value in the loop. These loops are extremely helpful when you want to perform iterative processes - processes that you need to do over and over again but don't want to type out each time. However, the biggest challenge of writing for loops is understanding where variables are set and which block each line of code belongs to. In R, blocks are marked by curly braces; indentation does not change how the code runs, but it makes the nesting much easier to read.
If statements allow you to write conditions so that you can, for example, filter your data for specific requirements.
Counters are great to set up when you want to keep track of certain calculations or check your code.
Creating empty lists, and knowing where to create and update them, is crucial when you want to store information or calculations generated by a for loop or if statement.
# How many beak lengths, and which beak lengths, are greater than or equal to 8 mm for female soapberry bugs?
i <- 0
j <- 0
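l <- vector(mode="list", length=0) # this is an empty list
for (beak_length in female_data$beak) {
  i <- i + 1                 # counts every beak length
  if (beak_length >= 8) {
    j <- j + 1               # counts only beak lengths >= 8 mm
    l <- c(l, beak_length)   # stores the beak lengths that pass the filter
  }
}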
cat("Total number of beak_lengths:", i)
Total number of beak_lengths: 94
cat("Total number of beak_lengths >= 8 mm:", j)
Total number of beak_lengths >= 8 mm: 38
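The loop above depends on female_data, which has to be loaded first. If you want to try the same pattern on its own, here is a self-contained sketch. The beak lengths below are made-up values, not the real soapberry bug data, and the names `beak`, `total`, `n_long`, and `long_beaks` are hypothetical.

```r
# Hypothetical beak lengths in mm (made-up values for illustration only)
beak <- c(7.52, 8.44, 6.90, 9.29, 8.00, 5.77)

total <- 0          # counter for every value seen
n_long <- 0         # counter for values that pass the filter
long_beaks <- c()   # empty vector to collect the values that pass

for (b in beak) {
  total <- total + 1
  if (b >= 8) {
    n_long <- n_long + 1
    long_beaks <- c(long_beaks, b)
  }
}

cat("Total number of beak lengths:", total, "\n")
cat("Number of beak lengths >= 8 mm:", n_long, "\n")

# The same answers without a loop, using R's vectorized operations
length(beak)       # total number of values
sum(beak >= 8)     # how many values are >= 8
beak[beak >= 8]    # which values are >= 8
```

The loop version makes each step visible, which is handy while learning or debugging. The vectorized version is usually shorter and is a common way to write this kind of filter in R.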
Reading a CSV file is important, but writing one is just as useful because it lets you store your new data and calculations in a file.
matrix <- matrix(l) # Trick: convert the list into a matrix so that you get 1 column of values
matrix
[,1]
[1,] 8.44
[2,] 8.44
[3,] 8.55
[4,] 8.42
[5,] 8.82
[6,] 8.87
[7,] 9.29
[8,] 9.34
[9,] 8.63
[10,] 8.58
[11,] 8.99
[12,] 8.94
[13,] 8.16
[14,] 8.04
[15,] 8.48
[16,] 8.49
[17,] 9.09
[18,] 9.07
[19,] 8.59
[20,] 8.49
[21,] 8.7
[22,] 8.86
[23,] 8.8
[24,] 8.56
[25,] 8.45
[26,] 8.9
[27,] 8.34
[28,] 8.4
[29,] 8.91
[30,] 8.54
[31,] 8.58
[32,] 8.77
[33,] 9.27
[34,] 8.34
[35,] 8.6
[36,] 9.24
[37,] 8.29
[38,] 8.07
dim(matrix) # You can check the dimensions of the matrix (rows, then columns) with the function dim()
[1] 38 1
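# sep="," makes the output comma-separated; write.table() returns nothing useful, so its result is not assigned
write.table(matrix, "new_data.csv", sep=",", row.names=FALSE,
col.names="beak lengths >= 8 mm")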
# If you don't convert the list into a matrix, you will get 38 columns with one value each! Not what we want. See for yourself:
write.table(l, "list_data.csv")
i <- 0
l2 <- vector(mode = "list", length = 0)
for (b in female_data$beak) {
  if (b < 6.00) {
    i <- i + 1
    l2 <- c(l2, b)
    cat("Beak length that's smaller than 6: ", b)
  }
}
Beak length that's smaller than 6: 5.77
print(i)
[1] 1
print(l2)
[[1]]
[1] 5.77
matrix2 <- matrix(l2)
matrix2
[,1]
[1,] 5.77
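# Writing to a CSV file using write.table
# Note: this overwrites any existing new_data.csv; use a different file name if you want to keep an earlier file
write.table(matrix2, "new_data.csv", sep=",", row.names=FALSE,
col.names=c("filtered_beak_length"))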
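To check that a file was written the way you expect, it can help to read it straight back in. The sketch below shows one possible round trip using write.csv() and read.csv(). The values, the file name `beak_sketch.csv`, and the names `long_beaks`, `out`, `csv_path`, and `check` are all made up for this example; the file goes into R's temporary directory so nothing in your project folder is touched.

```r
# Hypothetical filtered values (made up for illustration only)
long_beaks <- c(8.44, 9.29, 8.00)

# A data frame with one named column gives a clean one-column CSV
out <- data.frame(filtered_beak_length = long_beaks)

# Write to a temporary location (hypothetical file name)
csv_path <- file.path(tempdir(), "beak_sketch.csv")
write.csv(out, csv_path, row.names = FALSE)

# Read the file back in and inspect it
check <- read.csv(csv_path)
dim(check)     # number of rows, then number of columns
names(check)   # the column name from the header row
```

Using a data frame here is a design choice rather than a requirement: it keeps the column name attached to the data, so there is no need to pass col.names separately.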
Let’s say you had this list:
l <- c(1, 2, 3, 4, 5)
How would you go about printing only the odd numbers? Why would doing this be helpful? (e.g. indexing, rows)
Now, let’s add those even numbers back in to recreate the same list.
What if you needed to see which values in a column were outliers? How would you go about this? (Hint: Google very basic outlier tests that can be done in R, then see if you can read other people's code and adapt it for your own projects and programming.)