Software Tools for Earth and Environmental Science

–6th WEEK–

Emir Toker

27/11/2020

R Language - Part 1 & 2

Syllabus, Last Week and Book
This Week
- R Language Part-1 : Data Types (Repeat)
- R Language Part-2 : Data Structures
- Take Home
  - HMW-1 : Create a New R-Notebook
  - HMW-2 : Practice for Data Types and Data Structures
- QUIZ
Next Week
- R Language Part-3 : Read, Write and Plot Data
- Midterm Project

Syllabus, Last Week and Book

Syllabus

Extended Syllabus PDF

Last Week

LINK

Book

PDF - (Pg. 22-36 and 89-101)

R Language

R Language - Part 1

Getting Started
- Assignment
- Comment
- Basic and Scientific Math
Data Types - Classes
- Numeric
- Integer
- Logical
- Character
- Special

R Language - Part 2

Data Structures - Objects
- Vector
- Matrix
- Array
- Data Frame
- List

R Language - Part 3

Attributes
Read
Write
Plot
ggplot2 Package

R Language - Part 1

Summary

Assignment; <-
Comment; #
Help; ?func or help(func)
Install Packages; install.packages()
Call from Library; library()
Basic Math;
- addition; +
- subtraction; -
- multiplication; *
- division; /
- exponentiation; ^
- square root; sqrt()

Questions

What is a function?
What is a package?
What is the R library?

Answer

An R package is a collection of R functions, compiled code and sample data. Installed packages are stored under a directory called "library" in the R environment.

Codes –> Functions –> Packages –> Library

R Library
- Packages
  - Functions
    - Codes

Question

How can I install a package into my library?

Install Package - 1st Solution

Install Package - 1st Solution - ggplot2

Question

How can I load (call) a package in R?

Call Package - 1st Solution - ggplot2

Install and Call Package - 2nd Solution - ggplot2

Console

install.packages("ggplot2")
library(ggplot2)
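
The two steps above can be combined so that a package is only downloaded when it is missing. This is a minimal sketch; `ensure_package` is a hypothetical helper name chosen for illustration, not a function that ships with R.

```r
# Hypothetical helper: install a package only if it is not in the library yet.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)               # downloads from CRAN, needs internet
  }
  requireNamespace(pkg, quietly = TRUE) # TRUE if the package can now be loaded
}

# "stats" ships with R, so nothing is downloaded in this call.
has_stats <- ensure_package("stats")
has_stats

# For the course example one would run:
# ensure_package("ggplot2")
# library(ggplot2)
```

install.packages() is needed once per computer; library() is needed in every new R session.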

Basic Math

a <- 2.3
(6*a+42)/(3^(4.2-3.62))

## [1] 29.50556

Question

a <- 2.3
(6*a+42)/(3^(4.2-3.62))

## [1] 29.50556

isTRUE((6*a+42)/(3^(4.2-3.62))==29.50556)

## [1] FALSE

hint : print(x, digits=16)
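
A short sketch of why the comparison is FALSE: the console shows only 7 significant digits, while the stored value has many more. The tolerance value 1e-6 below is an assumption chosen to match the number of printed digits.

```r
a <- 2.3
result <- (6 * a + 42) / (3^(4.2 - 3.62))

print(result)               # rounded for display
print(result, digits = 16)  # more of the stored digits

# Exact comparison with the rounded number fails
exact_match <- result == 29.50556

# Comparison with a tolerance succeeds
close_match <- isTRUE(all.equal(result, 29.50556, tolerance = 1e-6))

exact_match
close_match
```

For decimal numbers, comparing with all.equal() (wrapped in isTRUE()) is usually safer than ==.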

Data Types - Classes

Numeric

# Any number with (or without) a decimal point.
a <- 3

Integer

# Sub-class of the numeric class. The suffix L tells R to store the value as an integer.
a <- 3L

Logical

# TRUE or FALSE - Logical Operators. < , > , == , >= , <= , != ... 
a <- 3<2

Character

# Text made of letters or words (a string), written in quotes: " … "
a <- "3"

is.XXX() and class()
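
A small sketch of how class() and the is.XXX() family can be used on the four types above. The variable names are chosen only for this example.

```r
a_numeric   <- 3
a_integer   <- 3L
a_logical   <- 3 < 2
a_character <- "3"

class(a_numeric)      # "numeric"
class(a_integer)      # "integer"
class(a_logical)      # "logical"
class(a_character)    # "character"

is.numeric(a_integer)     # TRUE, integer is a sub-class of numeric
is.integer(a_numeric)     # FALSE, 3 without L is stored as a double
is.character(a_character) # TRUE

# as.XXX() converts between types
converted <- as.numeric(a_character) + 1   # 4
converted
```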

Data Types - Classes

name1 <- emir          # error: object 'emir' not found (the quotes are missing)
name1 <- "emir"
name2 <- name1
name3 <- "name1"

number1 <- 32
number2 <- "32"
number3 <- 1:10
number4 <- seq(1,10)
number5 <- seq(1,10,2)

var1 <- TRUE
var2 <- "TRUE"

answer1 <- is.logical(var1)
answer2 <- var1 + answer1 / 3

surname1 <- "toker"
print(name1)
print(surname1)

print(name1,surname1)  # fails: the second argument of print() is 'digits', not another object

is.XXX() and class()
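
Two points from the code above, sketched under the assumption that the goal is to show both names together: print() displays one object, so several values are first joined with paste() (or shown with cat()). Logical values count as 1 (TRUE) and 0 (FALSE) in arithmetic.

```r
name1    <- "emir"
surname1 <- "toker"

full_name <- paste(name1, surname1)   # joins with a space: "emir toker"
print(full_name)
cat(name1, surname1, "\n")            # prints both values without quotes

var1    <- TRUE
answer1 <- is.logical(var1)           # TRUE
answer2 <- var1 + answer1 / 3         # 1 + 1/3
answer2
class(answer2)                        # "numeric"
```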

BONUS - Data Structures - Factor

Factors are a special variable type for storing categorical variables.
They sometimes behave like strings, and sometimes like integers.

gender <- c("male", "female", "male", "male", "female")
gender
class(gender)
str(gender)
gender[2]


gender_factor <- factor(c("male", "female", "male", "male", "female"))
gender_factor
class(gender_factor)
str(gender_factor)
gender_factor[2]
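
A brief sketch of the "sometimes strings, sometimes integers" behaviour: a factor stores integer codes plus a set of labels (levels).

```r
gender_factor <- factor(c("male", "female", "male", "male", "female"))

levels(gender_factor)          # "female" "male" (sorted alphabetically)
codes <- as.integer(gender_factor)
codes                          # 2 1 2 2 1, the position of each value in levels()
as.character(gender_factor)    # back to plain strings
table(gender_factor)           # counts per category
```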

R Language - Part 2

Data Structures - Objects
- Vector
- Matrix
- Array
- Data Frame
- List

Data Structures (R-Objects)

(Atomic) Vector

The simplest data structure in R

A vector is a list-like structure that contains items of the same data type.

spring_month <- "April"
spring_month
spring_months <- c("March", "April","May","June")
spring_months
class(spring_months)

c means “combine”

Data Structures - (Atomic) Vector

Vector : The simplest data structure in R

name <- "emir"
surname <- "toker"
name_surname <- c(name,surname)
class(name_surname)
length(name_surname)

vector1 <- c("21","21")  
vector2 <- c("21",21)  
vector3 <- c(21,21)

print(), class(), length()
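
Because a vector holds a single data type, c() silently converts mixed inputs to the most general type. A minimal sketch with the three vectors above plus one extra example (`mixed` is a name used only here):

```r
vector1 <- c("21", "21")
vector2 <- c("21", 21)
vector3 <- c(21, 21)

class(vector1)   # "character"
class(vector2)   # "character", the number 21 was converted to "21"
class(vector3)   # "numeric"

mixed <- c(TRUE, 2L, 3.5)   # logical + integer + numeric
class(mixed)                # "numeric"
mixed                       # 1.0 2.0 3.5
```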

Data Structures - (Atomic) Vector

spring_months <- c("March", "April","May","June")

spring_months

length(spring_months)

dim(spring_months)     # dimension (NULL for a plain vector)

[] for indexing

spring_months[1]

spring_months[3:4]

str(spring_months)     # structure

print(), class(), length(), dim(), str()

Data Structures - (Atomic) Vector

Basic string manipulations

substr(spring_months, start = 1, stop = 2)     # substrings

strsplit(spring_months,"")          # string split

gsub("a", "A", spring_months)       # matching and replacement

?substr , ?strsplit , ?gsub
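
The three functions above, sketched with their results stored in variables (the variable names are chosen only for this example):

```r
spring_months <- c("March", "April", "May", "June")

first_two <- substr(spring_months, start = 1, stop = 2)
first_two      # "Ma" "Ap" "Ma" "Ju"

letters_may <- strsplit("May", "")[[1]]   # strsplit() returns a list
letters_may    # "M" "a" "y"

replaced <- gsub("a", "A", spring_months) # only lowercase "a" is matched
replaced       # "MArch" "April" "MAy" "June"
```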

(Atomic) Vector

myvec <- c(1, 3, 1, 42)
a <- 35
myvec2 <- c(3L, 3.45, 1e+03, 64^0.5, 2+(3-1.1)/9.44, a)
myvec3 <- c(myvec, myvec2)
class(myvec3)

x <- c("all", "b", "olive")

Length of a vector, length(vector_name)

length(x)

Indexing element, vector_name[element_position]

x[2]

Modifying an element of a vector with the assignment arrow

x[2] <- "b_new"
x

Note: In R, indexing starts at position 1, not 0.

(Atomic) Vector

y <- c( 1.2, 5, "Rt", "2000", 20, 4905)
y[0]       # index 0 returns an empty vector: character(0)
class(y)
y

Sequences

7:16.4
a <- 7:16
a
seq(from=7,to=16,by=3)
seq(50,150,25)
seq(50,149,25)
seq(from=3,to=27,length.out=40)

Round

3/2
round(3/2)
round(5.1)
round(pi)
round(pi, digits = 6)

(Atomic) Vector

Repetition

rep(x=1, times=4)
rep(x=c(3, 62, 8),times=3)
rep(x=c(3, 62, 8),each=2,times=3)

Sorting

sort(x=c(2.5, -1, -10, 3.44))             # decreasing=FALSE (default)
sort(x=c(2.5, -1, -10, 3.44), decreasing=TRUE)

Random - Uniform Distribution

runif(15, min = 20, max = 45)
runif(15, 20, 45)
runif(25, 60, 50)     # min > max: returns NaN with a warning

Random numbers can be reproduced by setting a seed

set.seed(1)
runif(15, 20, 45)
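
A minimal sketch of what set.seed() does: the same seed followed by the same call gives the same "random" numbers. The variable names are chosen only for this example.

```r
set.seed(1)
first_draw <- runif(5, min = 20, max = 45)

set.seed(1)                              # reset the generator to the same state
second_draw <- runif(5, min = 20, max = 45)

identical(first_draw, second_draw)       # TRUE

third_draw <- runif(5, min = 20, max = 45)  # no reset: the sequence continues
third_draw
```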

Data Structures - Matrix

Vectors indexed using two indices instead of one.

[ row, col ]

a <- c(1:3)
# str(a), dim(a), length(a)
b <- matrix(1:3, nrow = 1, ncol = 3)
# str(b), dim(b), length(b)

Data Structures - Matrix

a <- c(1:3)
b <- matrix(1:3, nrow = 1, ncol = 3)

## [1] 1 2 3

##      [,1] [,2] [,3]
## [1,]    1    2    3

c <- matrix(1:9, nrow = 3, ncol = 3)

##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9

d <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)

##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9

Matrices

m_mycol <- matrix(c(1, 2, 3, 4, 5, 6),
                  nrow = 2,
                  ncol = 3,
                  byrow = FALSE)       # Default
m_mycol 

m_byrow <- matrix(c(1, 2, 3, 4, 5, 6),
                  nrow = 2,
                  ncol = 3,
                  byrow = TRUE)     
m_byrow      
t(m_byrow)

length(m_byrow)
dim(m_byrow)

Matrices

n <- runif(9,1,100)    
n
matrix(n, nrow = 3, ncol = 3)

n2 <- runif(10,1,100)    
matrix(n2, nrow = 3, ncol = 3)   # warning: 10 values do not fit a 3 x 3 matrix, only the first 9 are used

Matrices

x <- as.numeric(seq(from=10,to=120,by=10))
length(x)
mx <- matrix(x,3,4)                  # data, nrow, ncol
mx
class(mx)
class(mx[1])
typeof(mx)

mx[1,]
mx[,2]
mx[,2:4]
mx_new <- mx[,2:4]
mx
mx[2,3] <- "rose"
mx
class(mx)
typeof(mx)

mx_new <- as.numeric(mx)
mx_new
class(mx_new)
typeof(mx_new)
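
What happens in the code above, as a short sketch: a matrix is still an atomic vector, so one character value converts the whole matrix to character, and as.numeric() returns a plain vector without dimensions. The name `mx_num` and the use of suppressWarnings() are choices made only for this example.

```r
x  <- seq(from = 10, to = 120, by = 10)
mx <- matrix(x, nrow = 3, ncol = 4)
typeof(mx)                  # "double"

mx[2, 3] <- "rose"          # one string converts every element
typeof(mx)                  # "character"

# Converting back: "rose" cannot become a number, so it turns into NA
mx_num <- suppressWarnings(as.numeric(mx))
dim(mx_num)                 # NULL, the matrix shape is lost

dim(mx_num) <- c(3, 4)      # restore the 3 x 4 shape
mx_num
```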

Data Structures - Matrix

my_mat <- matrix(runif(n=20, min=0, max=100), nrow = 4,  ncol = 5)

##          [,1]     [,2]     [,3]     [,4]     [,5]
## [1,] 62.97103 63.32477 66.15347 36.81336 94.89551
## [2,] 80.45485 75.46393 11.02355 18.83055 95.70202
## [3,] 89.69659  3.65689 75.38308 62.67887 15.56819
## [4,] 16.83077 27.61296 97.97371 92.28101 78.88344

add <-  matrix(seq(from=10, to=60, by=10), nrow = 2, ncol = 3)

##      [,1] [,2] [,3]
## [1,]   10   30   50
## [2,]   20   40   60

my_mat[2:3,2:4] <- add

##          [,1]     [,2]     [,3]     [,4]     [,5]
## [1,] 62.97103 63.32477 66.15347 36.81336 94.89551
## [2,] 80.45485 10.00000 30.00000 50.00000 95.70202
## [3,] 89.69659 20.00000 40.00000 60.00000 15.56819
## [4,] 16.83077 27.61296 97.97371 92.28101 78.88344

Data Structures - Array

Lots of matrices stacked in levels…

array(x, dim = c(row,col,level))

Data Structures - Array

[ row, col, level ]

Data Structures - Array

x <- 1:24

##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

arr <- array(x, dim = c(3,4,2))   # row, col, level

## , , 1
## 
##      [,1] [,2] [,3] [,4]
## [1,]    1    4    7   10
## [2,]    2    5    8   11
## [3,]    3    6    9   12
## 
## , , 2
## 
##      [,1] [,2] [,3] [,4]
## [1,]   13   16   19   22
## [2,]   14   17   20   23
## [3,]   15   18   21   24

str(arr)

##  int [1:3, 1:4, 1:2] 1 2 3 4 5 6 7 8 9 10 ...

# dim(), class()
# length(), typeof()
# QUESTION : where is 17 ?
# arr[ ? , ? , ? ]
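
One possible way to locate a value such as 17 in the array is which() with arr.ind = TRUE, which reports the row, column and level instead of a single position. A minimal sketch (the name `position` is chosen only for this example):

```r
x   <- 1:24
arr <- array(x, dim = c(3, 4, 2))      # row, col, level

position <- which(arr == 17, arr.ind = TRUE)
position        # one row with the three indices: 2 2 2

arr[2, 2, 2]    # 17
```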

Data Structures - Array

[ row, col, level ]

arr <- array(data=10:29,dim=c(2,5,2))

## , , 1
## 
##      [,1] [,2] [,3] [,4] [,5]
## [1,]   10   12   14   16   18
## [2,]   11   13   15   17   19
## 
## , , 2
## 
##      [,1] [,2] [,3] [,4] [,5]
## [1,]   20   22   24   26   28
## [2,]   21   23   25   27   29

arr[2,2:4,]

##      [,1] [,2]
## [1,]   13   23
## [2,]   15   25
## [3,]   17   27

arr[1,1:5,2]

## [1] 20 22 24 26 28

Data Structures - Array

array <- array(data=seq(2,144,2),dim=c(3,6,4))

## , , 1
## 
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    2    8   14   20   26   32
## [2,]    4   10   16   22   28   34
## [3,]    6   12   18   24   30   36
## 
## , , 2
## 
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]   38   44   50   56   62   68
## [2,]   40   46   52   58   64   70
## [3,]   42   48   54   60   66   72
## 
## , , 3
## 
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]   74   80   86   92   98  104
## [2,]   76   82   88   94  100  106
## [3,]   78   84   90   96  102  108
## 
## , , 4
## 
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]  110  116  122  128  134  140
## [2,]  112  118  124  130  136  142
## [3,]  114  120  126  132  138  144

array[1,1,1:4]
array[1,1,]
array[1,1,4:1]
array[1,1,c(1:4!=2)]

array[1,1,2]
array[1,1,which(x==2)]
array[1,1,which(x<=2)]

array[1,1,2:4]
array[1,1,-1]
array[1,1,c(-1,-2)]

array[1,2:5,2]
array[1,c(2,3,4,5),2]
array[1,c(2,5),2]

array[1,c(2,5),2:3]
array[,c(2,5),2:3]

[ row, col, level ]

Data Structures (R-Objects)

Data Structures - Data Frame

Each column has the same length, like in a matrix.
Different columns can have different types.
BUT, all the elements within one column have the same type.

Data Structures - Data Frame

person  <- c("Peter","Lois","Meg","Chris","Stewie")
age     <- c(42,40,17,14,1)
gender  <- factor(c("M","F","F","M","M"))
married <- c(TRUE,TRUE,FALSE,FALSE,FALSE)

m1 <- matrix(c(person,married),5,2)
m2 <- matrix(c(age,married),5,2)
m3 <- matrix(c(person,age),5,2)
m4 <- matrix(c(person,gender),5,2)

With data.frame() there is no need to combine (and convert) the columns with c()

df <- data.frame(person,married)
df

class(df)
dim(df)
length(df)
str(df)
# df$   (in RStudio, typing df$ and pressing Tab lists the columns)
df$person
as.character(df$person)

df <- data.frame(person,age,gender,married)
df
str(df)
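
A short sketch of the difference between the matrix and the data frame versions above: the matrix converts everything to one type, while the data frame keeps one type per column. stringsAsFactors = FALSE is set explicitly here so the result does not depend on the R version; `column_classes` is a name used only in this example.

```r
person  <- c("Peter", "Lois", "Meg", "Chris", "Stewie")
age     <- c(42, 40, 17, 14, 1)
gender  <- factor(c("M", "F", "F", "M", "M"))
married <- c(TRUE, TRUE, FALSE, FALSE, FALSE)

m3 <- matrix(c(person, age), 5, 2)
typeof(m3)            # "character", the ages became text

df <- data.frame(person, age, gender, married, stringsAsFactors = FALSE)
column_classes <- sapply(df, class)
column_classes        # character, numeric, factor, logical

mean(df$age)          # 22.8, age is still numeric
```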

Data Structures - Data Frame

new_person  <- "Brian"
new_age     <- 7
new_gender  <- factor("M")
new_married <- FALSE

new_record_row <- data.frame(person=new_person,age=new_age,gender=new_gender,married=new_married)   # column names must match df for rbind()
new_df1 <- rbind(df, new_record_row)      # Combine R Objects by Rows
new_df1

surname <- c("Yilmaz","Zeki","Fena","Caliskan","Uslu","Guzel")
new_record_col <- data.frame(surname)
new_df2 <- cbind(new_df1, new_record_col) # Combine R Objects by Columns

new_df2
new_df2[c(5,6),]
new_df2[c(5,6),] <- new_df2[c(6,5),]
new_df2

new_df2[5]

rbind(), cbind()
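
A reduced sketch of the two rules behind rbind() and cbind() for data frames: a new row needs the same column names, and a new column needs the same number of rows. The small data frame below is a shortened version of the one above, built only for this example.

```r
df <- data.frame(person = c("Peter", "Lois"),
                 age    = c(42, 40),
                 stringsAsFactors = FALSE)

# rbind(): the new row must use the same column names as df
new_row <- data.frame(person = "Brian", age = 7, stringsAsFactors = FALSE)
df_rows <- rbind(df, new_row)

# cbind(): the new column must have one value per row (3 rows here)
new_col <- data.frame(surname = c("Yilmaz", "Zeki", "Guzel"),
                      stringsAsFactors = FALSE)
df_cols <- cbind(df_rows, new_col)

df_cols
dim(df_cols)    # 3 rows, 3 columns
```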

Data Structures - Data Frame

new_df2
new_df2[1]   # 1st column
new_df2[1,]    # 1st row, all columns
length(new_df2[1])
dim(new_df2[1])


new_df2[[1]]
length(new_df2[[1]])
dim(new_df2[[1]])

new_df2[[1]][2]
new_df2[[1]][2:5]

new_df2$person[2]
new_df2$person[2:5]

new_df2
new_df2[2:3,1:5]
new_df2[2:3,]

new_df2[2,1]
new_df2[2:2,1:1]

attributes(new_df2)
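
The indexing forms above differ in what they return. A minimal sketch on a small data frame built only for this example:

```r
df <- data.frame(person = c("Peter", "Lois", "Meg"),
                 age    = c(42, 40, 17),
                 stringsAsFactors = FALSE)

class(df[1])      # "data.frame", single brackets keep the data frame
dim(df[1])        # 3 1

class(df[[1]])    # "character", double brackets extract the column itself
dim(df[[1]])      # NULL, a plain vector has no dimensions

identical(df[[1]], df$person)   # TRUE, $ is a shortcut for [[ ]]

df[2, "age"]      # row 2 of the column named "age": 40
```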

Data Structures - List

Lists are like atomic vectors because they group data into a one-dimensional set.
Lists are like data frames because they can group different types of data.
BUT, the elements do NOT need to have the same length.

Data Structures - List

my_matrix <- matrix(data=1:4,nrow=2,ncol=2)
my_vector <- c(TRUE,FALSE,TRUE,TRUE)
my_var <- "hello"
my_data_frame <- new_df2     # created earlier

my_list  <- list(my_matrix,my_vector,my_var,my_data_frame)
my_list

Data Structures - List

my_list
class(my_list)
str(my_list)
dim(my_list)
length(my_list)

Data Structures - List

How can I select the first member (component) of my_list?

Data Structures - List

my_list

my_list[1]

my_list[1][1]

my_list[1][1][1]

my_list[[1]]
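
A short sketch of why only the last form above gives the matrix: single brackets always return a (shorter) list, however often they are repeated, while double brackets return the element itself. The list below leaves out the data frame so the example runs on its own.

```r
my_list <- list(matrix(data = 1:4, nrow = 2, ncol = 2),
                c(TRUE, FALSE, TRUE, TRUE),
                "hello")

is.list(my_list[1])                      # TRUE, still a list of length 1
identical(my_list[1][1], my_list[1])     # TRUE, repeating [1] changes nothing

is.matrix(my_list[[1]])                  # TRUE, the matrix itself
my_list[[1]][1, 2]                       # row 1, column 2 of that matrix: 3
my_list[[3]]                             # "hello"
```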

Data Structures - List

How can I select the element in the first row and first column of the first member (component) of my_list?

Data Structures - List

my_list

my_list[[1]][1,1]

QUIZ

R Language - Part - I & II

kahoot.it

Next Week

R Language - Part 3

Read
Write
Plot
Practice with the data
Create a function
Midterm Project

Read

library(help="datasets")

list.files("/Users/emirtoker/Desktop/Dersler/Memurluk/Software_Tools_for_Earth_&_Environmental_Science/Software_Tools_R_Github/Presentation")

file.choose()

read.table(file = "18397_Cekmekoy_Omerli_15dk.txt")

read.table(file = "18397_Cekmekoy_Omerli_15dk.txt", 
          header=TRUE, sep=";")
          
read.table(file = "18397_Cekmekoy_Omerli_15dk.txt", 
          header=TRUE, sep=";", na.strings="-9999")

mydata_txt <- read.table(file = "18397_Cekmekoy_Omerli_15dk.txt",
                        header=TRUE, 
                        sep=";",
                        na.strings="-9999")
              
str(mydata_txt)

mydata_csv <- read.csv(file="18397_Cekmekoy_Omerli.csv",
                      header=TRUE,
                      na.strings="-9999")
                      
str(mydata_csv)

Read and Write

url <- "https://web.itu.edu.tr/tokerem/18397_Cekmekoy_Omerli_15dk.txt"
urldata_txt <- read.table(url,
                          header=TRUE, 
                          sep=";",
                          na.strings="-9999")

Write .TXT and .CSV

write.table(x=urldata_txt,file="somenewfile.txt")

write.table(x=urldata_txt,file="somenewfile.txt",
           sep=";",na="-9999",quote=FALSE,row.names=FALSE)

new_df2
write.table(x=new_df2,file="dffile.txt",
            sep=";",na="-9999",quote=FALSE,row.names=FALSE)
            
write.table(x=new_df2,file="dffile.csv",
            sep=";",na="-9999",quote=FALSE,row.names=FALSE)
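
A minimal round-trip sketch of the options used above: na="-9999" writes missing values as -9999, and na.strings="-9999" turns them back into NA when reading. The small data frame and the temporary file are assumptions made only so the example runs anywhere.

```r
df <- data.frame(person = c("Peter", "Lois"),
                 age    = c(42, NA),
                 stringsAsFactors = FALSE)

tmp_file <- tempfile(fileext = ".txt")   # temporary file, removed with the session

write.table(x = df, file = tmp_file,
            sep = ";", na = "-9999", quote = FALSE, row.names = FALSE)

file_lines <- readLines(tmp_file)
file_lines       # "person;age" "Peter;42" "Lois;-9999"

df_back <- read.table(file = tmp_file, header = TRUE, sep = ";",
                      na.strings = "-9999", stringsAsFactors = FALSE)
df_back
```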

Plot

foo <- c(1.1,2,3.5,3.9,4.2)
bar <- c(2,2.2,-1.3,0,0.2)
plot(foo,bar)

- type : how to plot the supplied coordinates (for example, as stand-alone points, joined by lines, or both points and lines).
- main, xlab, ylab : the plot title, the horizontal axis label, and the vertical axis label, respectively.
- col : color (or colors) to use for plotting points and lines.
- lty : line type (for example, solid, dotted, or dashed).
- lwd : thickness of plotted lines.
- xlim, ylim : limits for the horizontal range and the vertical range, respectively.
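
A sketch that uses all of the options listed above in one call. The plot is written to a temporary PDF file so the example also runs without a graphics window; the file name, title, labels and limits are assumptions chosen only for illustration.

```r
foo <- c(1.1, 2, 3.5, 3.9, 4.2)
bar <- c(2, 2.2, -1.3, 0, 0.2)

plot_file <- tempfile(fileext = ".pdf")
pdf(file = plot_file)                  # open a PDF graphics device

plot(foo, bar,
     type = "b",                       # both points and lines
     main = "foo against bar",         # plot title
     xlab = "foo", ylab = "bar",       # axis labels
     col  = "blue",                    # color of points and lines
     lty  = 2,                         # dashed line
     lwd  = 2,                         # thicker line
     xlim = c(0, 5), ylim = c(-2, 3))  # axis ranges

invisible(dev.off())                   # close the device and write the file
file.exists(plot_file)
```

In RStudio the pdf() and dev.off() lines can be left out; the plot then appears in the Plots pane.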

Plot

plot(foo,bar)
plot(foo,bar,type="l")
plot(foo,bar,type="b",main="My lovely plot",xlab="x axis label", ylab="location y")
plot(foo,bar,type="b",main="My lovely plot",xlab="",ylab="",col="red")

x <- 1:20
y <- c(-1.49,3.37,2.59,-2.78,-3.94,-0.92,6.43,8.51,3.41,-8.23,
-12.01,-6.58,2.87,14.12,9.63,-4.58,-14.78,-11.67,1.17,15.62)
plot(x,y,type="n",main="")
abline(h=c(-5,5),col="red",lty=2,lwd=2)
segments(x0=c(5,15),y0=c(-5,-5),x1=c(5,15),y1=c(5,5),col="red",lty=3,
lwd=2)
points(x[y>=5],y[y>=5],pch=4,col="darkmagenta",cex=2)
points(x[y<=-5],y[y<=-5],pch=3,col="darkgreen",cex=2)
points(x[(x>=5&x<=15)&(y>-5&y<5)],y[(x>=5&x<=15)&(y>-5&y<5)],pch=19,
col="blue")
points(x[(x<5|x>15)&(y>-5&y<5)],y[(x<5|x>15)&(y>-5&y<5)])
lines(x,y,lty=4)
arrows(x0=8,y0=14,x1=11,y1=2.5)
text(x=8,y=15,labels="sweet spot")
legend("bottomleft",
legend=c("overall process","sweet","standard",
"too big","too small","sweet y range","sweet x range"),
pch=c(NA,19,1,4,3,NA,NA),lty=c(4,NA,NA,NA,NA,2,3),
col=c("black","blue","black","darkmagenta","darkgreen","red","red"),
lwd=c(1,NA,NA,NA,NA,2,2),pt.cex=c(NA,1,1,2,2,NA,NA))

Plot

mydata_txt <- read.table(file = "18397_Cekmekoy_Omerli_15dk.txt",
                        header=TRUE, 
                        sep=";",
                        na.strings="-9999")

mydata_txt

plot(mydata_txt$temp, type="l" )

Create a Function

Take Home - HMW

HMW-1 : Create a New R-Notebook
HMW-2 : Practice for Data Types and Data Structures

I will upload Homework-1 and Homework-2 to Ninova

Additional Course (in Turkish)

We can repeat it next week

After the 7th Lecture?

Data Types
Data Structure
Read, Write and Plot Data