DAY 1

Basics of R Programming

Hey Debishree! By now you must have downloaded base R and RStudio from the link I sent you on WhatsApp, so now we’ll get familiar with the interface and the basics of R: importing data sets, creating and exploring them, and running some basic mathematical functions. I’ll also show you how to plot a basic diagram, so stay tuned ;)


Using R console like a calculator

Basic Calculator

Interesting, isn’t it? It’s really as simple as using your dumb-ass calculator. You can use R for much more advanced things, but there’s no harm in using it for your basic calculations too. So let’s start with some very basic operations. Just type 8+8 and press “Ctrl+Shift+Enter” to execute the chunk or script (in RStudio, “Ctrl+Enter” runs only the current line, and in the console itself you just press Enter).

8+8
## [1] 16

Interesting, isn’t it? Let’s perform a slightly more complicated calculation, hmm, let’s try

987*10+130
## [1] 10000

It’s cool, isn’t it? R follows BODMAS, so the multiplication is done before the addition. Can your calculator do that? :D
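If you want to see the BODMAS rule in action, here is a small sketch. The names without_brackets and with_brackets are just made up for this example; the point is that brackets change the order of the calculation.

```r
# Multiplication is done before addition, just like BODMAS says
without_brackets <- 987 * 10 + 130

# Brackets force the addition to happen first
with_brackets <- 987 * (10 + 130)

print(without_brackets)
print(with_brackets)
```

The two results are different, so whenever you are unsure about the order, just add brackets.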

Creating Functions in R

As you must have studied in classes 8 to 10, functions and linear equations are really important for solving complicated problems. So let’s create some functions in R.

Suppose your age is ‘X’. How will you feed this value into R? To assign a value to a name (a variable), we use the “<-” sign in R. For example, suppose your age is 20 (yeah, you look so young ;), just kidding)

Debishreeage <- 20

So now I have assigned the name “Debishreeage” a value of 20. If I type Debishreeage and run the code with “Ctrl+Shift+Enter”, it will show the assigned value.

Debishreeage
## [1] 20

Now you must be thinking, “Do you take me for a dummy? Tell me something advanced.” So let’s create something more advanced. The keyword “function” is used to create a function of one or more variables. You must have seen functions like f(x, y) = x + y, which adds the two numbers given to it. To create that function in R, we write

sumof2numbers <- function(x,y){x+y}

So now we have created a function “sumof2numbers”; if we put values into this function, it will automatically add them. Let’s check

sumof2numbers(5,8)
## [1] 13
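A function body can also span several lines. Here is a rough sketch of that, using a made-up function averageof2numbers (it is not part of the lesson above, just an illustration). R returns the value of the last expression in the body.

```r
# averageof2numbers is a hypothetical example function
averageof2numbers <- function(x, y) {
  total <- x + y   # intermediate result stored in a local variable
  total / 2        # the last evaluated expression is what the function returns
}

result <- averageof2numbers(5, 8)
print(result)

# Arguments can also be passed by name, in any order
print(averageof2numbers(y = 10, x = 0))
```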

Removing Objects

Above we created an object and a function, “Debishreeage” and “sumof2numbers”. If we execute them, R shows what is stored in them, but what if we want to remove these objects from memory? We use the rm() function to remove saved objects

rm(Debishreeage)

So now if we run this name, R will show an error, as the object has been removed from memory.

Debishreeage
## Error in eval(expr, envir, enclos): object 'Debishreeage' not found
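If you would rather check whether an object is still there without triggering an error, the base function exists() can help. A small sketch, where demo_value is a made-up object used only for this illustration:

```r
demo_value <- 20                     # hypothetical object, created just to be removed

before_rm <- exists("demo_value")    # is the object in memory right now?
rm(demo_value)                       # remove it
after_rm <- exists("demo_value")     # check again after removal

print(before_rm)
print(after_rm)
```

Note that exists() takes the name in double quotes, while rm() takes it without quotes.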

Data classes

The three data classes you will meet most often are numeric, character, and logical (R has a few others too, such as integer)

Obviously, any number comes under the numeric class

12.6 #numeric
## [1] 12.6

To make something a character value, put it in double quotes. Otherwise R will think it is the name of an object or function and not just a simple word

"Male" #Character
## [1] "Male"

Logical values just denote whether something is TRUE or FALSE (must be in capitals); here you don’t have to use double quotes

TRUE
## [1] TRUE
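You don’t have to guess the class of a value, you can ask R with class(). A quick sketch (the variable names are made up for this example; the last line assumes you want a whole number stored as an integer, which is what the L suffix does):

```r
numeric_class   <- class(12.6)     # a number with decimals
character_class <- class("Male")   # anything in double quotes
logical_class   <- class(TRUE)     # TRUE or FALSE, in capitals
integer_class   <- class(5L)       # the L suffix stores a whole number as integer

print(numeric_class)
print(character_class)
print(logical_class)
print(integer_class)
```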

Data Structures

Above we discussed the types of data classes; as you will observe, most data comes in these types. For example, if you take an RNA expression data set, the gene name column will have “character” values in it and the gene count column will have “numeric” values.

A data structure, as the name suggests, is nothing but data arranged in a structured manner, be it in table form, where you’ll have multiple columns and rows, or in vector form, where the data values are stored in a linear sequence. Let’s start with vectors.

Vector

The c() function saves all the values you put inside it as a vector. For example, let’s save the numbers 1 to 10,
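To make the RNA expression example concrete, here is a tiny sketch of such a table as a data frame. The gene names and counts are invented purely for illustration, and stringsAsFactors = FALSE is set so the gene names stay as plain characters on older R versions too.

```r
# A made-up mini expression table: hypothetical gene names and counts
expression_data <- data.frame(
  gene  = c("GeneA", "GeneB", "GeneC"),
  count = c(120, 45, 300),
  stringsAsFactors = FALSE
)

# Ask R for the class of every column
column_classes <- sapply(expression_data, class)

print(expression_data)
print(column_classes)
```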

c(1,2,3,4,5,6,7,8,9,10)
##  [1]  1  2  3  4  5  6  7  8  9 10

Hmmm, let’s now save the numbers up to 1000; hahaha, just kidding, obviously that would be a very cumbersome process. Let me tell you a shortcut, shown here for 1 to 100

c(1:100)
##   [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
##  [19]  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
##  [37]  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
##  [55]  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
##  [73]  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
##  [91]  91  92  93  94  95  96  97  98  99 100

Much easier, isn’t it?

Great, now let’s name this vector ILBSrooms.
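A small side note on the shortcut: the colon already builds the vector by itself, so wrapping it in c() is optional. And if you need steps other than 1, the base function seq() does the job. A rough sketch, with made-up variable names:

```r
one_to_hundred <- 1:100                           # the colon alone creates the vector
same_as_c <- identical(one_to_hundred, c(1:100))  # compare with the c() version

even_numbers <- seq(2, 10, by = 2)                # from 2 to 10 in steps of 2

print(same_as_c)
print(even_numbers)
```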

ILBSrooms <- c(1:100)

Now create a vector having characters

ILBSfaculties <- c("Sukirti", "Vijay","Rajeev", "Vikrant", "Gayatri")

Now print ILBSfaculties

ILBSfaculties
## [1] "Sukirti" "Vijay"   "Rajeev"  "Vikrant" "Gayatri"

Can we add two vectors having numeric values? Of course, dude,

a <- c(2:12)
b <- c(3:13)
a+b
##  [1]  5  7  9 11 13 15 17 19 21 23 25

or you can just perform independent actions on each vector

a*2
##  [1]  4  6  8 10 12 14 16 18 20 22 24

or

b+3
##  [1]  6  7  8  9 10 11 12 13 14 15 16
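What is happening in these examples is element-wise arithmetic: the first value of a is added to the first value of b, the second to the second, and so on. A single number like 2 or 3 is simply applied to every element. A small sketch to check this, reusing the same a and b (the other names are made up):

```r
a <- c(2:12)
b <- c(3:13)

same_length <- length(a) == length(b)   # both vectors hold 11 values

pairwise_sum <- a + b                   # element 1 + element 1, element 2 + element 2, ...
doubled <- a * 2                        # the single number 2 is applied to every element

# The first element of the sum is just the two first elements added together
first_pair_matches <- pairwise_sum[1] == a[1] + b[1]

print(same_length)
print(first_pair_matches)
print(doubled)
```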

List

A list is a data structure that can hold values of different classes, including vectors and other lists. For example, a list can accommodate the vectors a and b we created above along with any other data values. Let’s see an example.

list(1,2,3,"Hello", FALSE)
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 2
## 
## [[3]]
## [1] 3
## 
## [[4]]
## [1] "Hello"
## 
## [[5]]
## [1] FALSE

Let’s add our vector “a” to a list

list(a, 1,2,3,4,"Debishree", TRUE)
## [[1]]
##  [1]  2  3  4  5  6  7  8  9 10 11 12
## 
## [[2]]
## [1] 1
## 
## [[3]]
## [1] 2
## 
## [[4]]
## [1] 3
## 
## [[5]]
## [1] 4
## 
## [[6]]
## [1] "Debishree"
## 
## [[7]]
## [1] TRUE
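List elements can also be given names, which makes them easier to pick out later with the $ sign. Here is a minimal sketch; sample_info and everything inside it is invented for illustration.

```r
# A hypothetical list mixing a character, a numeric vector and a logical
sample_info <- list(
  gene      = "GeneA",
  counts    = c(120, 45, 300),
  expressed = TRUE
)

print(names(sample_info))    # the names of all elements
print(sample_info$gene)      # pick one element by its name
print(sample_info$counts)    # this element is a whole vector
```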

Subsetting vectors

As our PIs are experts in manipulation, let’s learn how to manipulate data.

Suppose you need to extract specific items from a list “Days”

Days <- list("monday", "tuesday", "wednesday", "thursday", "friday")

So now if we want to see which day is in the 3rd position, we’ll simply do

Days[3]
## [[1]]
## [1] "wednesday"

or for the first three days

Days[c(1,2,3)] # wrap the positions in c() to select more than one item
## [[1]]
## [1] "monday"
## 
## [[2]]
## [1] "tuesday"
## 
## [[3]]
## [1] "wednesday"

or

Days[1:3]
## [[1]]
## [1] "monday"
## 
## [[2]]
## [1] "tuesday"
## 
## [[3]]
## [1] "wednesday"

or you can just exclude some values

Days[-4]
## [[1]]
## [1] "monday"
## 
## [[2]]
## [1] "tuesday"
## 
## [[3]]
## [1] "wednesday"
## 
## [[4]]
## [1] "friday"
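One thing worth noticing in the outputs above: single square brackets on a list give you back a smaller list. If you want the value itself, use double square brackets. The same bracket tricks also work on plain vectors. A short sketch, where day_vector and the other new names are made up for this example:

```r
Days <- list("monday", "tuesday", "wednesday", "thursday", "friday")

as_list  <- Days[3]     # single brackets: still a list, of length 1
as_value <- Days[[3]]   # double brackets: the character value itself

# The same positions work on an ordinary character vector
day_vector <- c("monday", "tuesday", "wednesday", "thursday", "friday")
first_three    <- day_vector[1:3]   # positions 1 to 3
without_fourth <- day_vector[-4]    # everything except position 4

print(class(as_list))
print(as_value)
print(first_three)
print(without_fourth)
```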

R Base functions

By now you have understood the basic nitty-gritty of R; now we are gonna learn some tools you’ll use in your data analysis work.

For example, we created a vector ILBSrooms earlier, so we’ll use it here

ILBSrooms
##   [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
##  [19]  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
##  [37]  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
##  [55]  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
##  [73]  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
##  [91]  91  92  93  94  95  96  97  98  99 100
mean(ILBSrooms)
## [1] 50.5
median(ILBSrooms)
## [1] 50.5
min(ILBSrooms)
## [1] 1
max(ILBSrooms)
## [1] 100
sum(ILBSrooms)
## [1] 5050
class(ILBSrooms)
## [1] "integer"
length(ILBSrooms)
## [1] 100
log(ILBSrooms)
##   [1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 1.7917595 1.9459101
##   [8] 2.0794415 2.1972246 2.3025851 2.3978953 2.4849066 2.5649494 2.6390573
##  [15] 2.7080502 2.7725887 2.8332133 2.8903718 2.9444390 2.9957323 3.0445224
##  [22] 3.0910425 3.1354942 3.1780538 3.2188758 3.2580965 3.2958369 3.3322045
##  [29] 3.3672958 3.4011974 3.4339872 3.4657359 3.4965076 3.5263605 3.5553481
##  [36] 3.5835189 3.6109179 3.6375862 3.6635616 3.6888795 3.7135721 3.7376696
##  [43] 3.7612001 3.7841896 3.8066625 3.8286414 3.8501476 3.8712010 3.8918203
##  [50] 3.9120230 3.9318256 3.9512437 3.9702919 3.9889840 4.0073332 4.0253517
##  [57] 4.0430513 4.0604430 4.0775374 4.0943446 4.1108739 4.1271344 4.1431347
##  [64] 4.1588831 4.1743873 4.1896547 4.2046926 4.2195077 4.2341065 4.2484952
##  [71] 4.2626799 4.2766661 4.2904594 4.3040651 4.3174881 4.3307333 4.3438054
##  [78] 4.3567088 4.3694479 4.3820266 4.3944492 4.4067192 4.4188406 4.4308168
##  [85] 4.4426513 4.4543473 4.4659081 4.4773368 4.4886364 4.4998097 4.5108595
##  [92] 4.5217886 4.5325995 4.5432948 4.5538769 4.5643482 4.5747110 4.5849675
##  [99] 4.5951199 4.6051702
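A note on log(): by default it gives the natural logarithm (base e), which is why log(ILBSrooms) ends at about 4.605 for 100 and not at 2. For other bases, base R has log10() and log2(), or you can pass a base argument. A quick sketch with made-up variable names:

```r
natural_log <- log(100)             # natural logarithm, base e
log_base_10 <- log10(100)           # base 10
log_base_2  <- log2(8)              # base 2
custom_base <- log(81, base = 3)    # any other base via the base argument

print(natural_log)
print(log_base_10)
print(log_base_2)
print(custom_base)
```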

Generating normally distributed values in R

rnorm(100, mean =5)
##   [1] 3.865785 5.256733 5.950543 5.075388 7.108440 4.535784 5.844372 5.051526
##   [9] 5.026750 5.362679 3.436547 3.137238 5.880930 5.063561 3.809988 5.438181
##  [17] 6.194283 4.149028 5.665885 6.484580 2.837622 5.059366 4.398256 3.147268
##  [25] 4.443238 5.259950 4.716774 4.693452 3.730403 6.142237 3.828713 5.953652
##  [33] 4.848831 6.182376 4.984736 4.549125 3.540682 4.901339 4.954108 5.882223
##  [41] 5.426617 5.206501 5.155253 3.843573 6.055464 5.026163 5.188766 4.716474
##  [49] 5.109551 4.683862 5.006118 4.571495 5.388725 5.046085 3.251669 6.093741
##  [57] 4.956033 3.878943 5.617791 4.602069 3.837023 6.682865 4.111236 5.096640
##  [65] 2.515224 6.289244 7.889476 3.984342 4.376988 4.015754 6.329794 5.081992
##  [73] 5.502331 6.384455 4.222846 3.346803 4.626667 2.784766 4.015780 3.919799
##  [81] 4.768666 6.443937 5.019382 3.407760 4.130200 4.164925 4.949764 6.275985
##  [89] 4.503774 4.300727 4.408166 4.863092 4.667608 4.736362 4.535371 4.005805
##  [97] 5.370105 5.848978 2.620931 3.209260
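Because rnorm() draws random numbers, you will get different values every time you run it, so your output will not match the one above. If you want the same numbers each time, one option is to fix the random seed with set.seed() first. A sketch, where the seed 42 is an arbitrary choice and sd = 1 is just the default written out:

```r
set.seed(42)                                   # 42 is an arbitrary seed, any number works
first_draw <- rnorm(100, mean = 5, sd = 1)     # 100 values around 5

set.seed(42)                                   # reset to the same seed
second_draw <- rnorm(100, mean = 5, sd = 1)    # the same 100 values again

same_values <- identical(first_draw, second_draw)
print(same_values)
print(length(first_draw))
```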

Now we can plot a histogram of values generated this way

hist(rnorm(100, mean = 5))
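Note that hist(rnorm(100, mean = 5)) draws a fresh set of random values, not the ones printed earlier. If you want to plot and then reuse the very same values, you can store them first. A rough sketch; the seed, the variable names and the plot labels are all my own choices here:

```r
set.seed(1)                           # arbitrary seed so the plot is repeatable
values <- rnorm(100, mean = 5)        # store the values instead of generating inline

# hist() draws the plot and also returns the bin information
h <- hist(values,
          main = "100 random normal values",
          xlab = "Value")

print(mean(values))                   # the stored values can be reused afterwards
print(sum(h$counts))                  # every value falls into exactly one bar
```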

Cliffhanger

##   speed dist
## 1     4    2
## 2     4   10
## 3     7    4
## 4     7   22
## 5     8   16
## 6     9   10
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00
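If you can’t wait for the next session: the output above looks like the first rows and the summary of cars, a small data set that ships with R. Assuming that is what was used, something like this should reproduce it (first_rows is a made-up name):

```r
# cars is a built-in R data set with two columns: speed and dist
first_rows <- head(cars)     # the first six rows
print(first_rows)

print(summary(cars))         # minimum, quartiles, mean and maximum of each column
```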