Coding Basics and Importing Data From Excel

Presentation Two - Delivered August 8th, 2019 - covers the basics of coding with mathematical principles. We will cover:

Arrays, Lists, Functions, Variables & Objects

Importing Data From Excel, the best form of data, and utilizing file paths.

In the code below, we will create an array (a vector, in R terms) and test functions with it. Read the comments to follow along.

#A scalar is a variable with one value

x = 3 
x
## [1] 3
# A vector holds multiple scalars. To create a vector, wrap the list of inputs in c(), which combines them into one object. 
MyVector = c(1,2,3,4,5,6,7,8,9)
MyVector
## [1] 1 2 3 4 5 6 7 8 9
length(MyVector)
## [1] 9
# Use seq() to create vectors with specifications you choose
seq(1,10,by = 2) # Create a sequence from 1 to 10 in steps of 2
## [1] 1 3 5 7 9
seq(1,20,length.out = 3) # A vector from 1 to 20 with 3 equally spaced values
## [1]  1.0 10.5 20.0

Quick sidenote - the anatomy of R-Functions

#Name_of_function(Input1, Input2, variable1 = "example")
# plot(x,y,main = "Title", xlab = "X Axis Label", ylab = "Y Axis Label")
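
As a small illustration of this anatomy, the sketch below calls seq() three ways. The variable names (by_position, by_name, reordered) are made up for this example; from, to, and by are the input names documented for seq().

```r
# Inputs matched by position: the first is from, the second is to
by_position = seq(1, 10, by = 2)

# The same call with every input named
by_name = seq(from = 1, to = 10, by = 2)

# Named inputs can be given in any order
reordered = seq(by = 2, to = 10, from = 1)

by_position
## [1] 1 3 5 7 9
identical(by_position, by_name)
## [1] TRUE
identical(by_name, reordered)
## [1] TRUE
```

Naming the inputs takes more typing, but it makes a call easier to read and removes any dependence on order.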

Back to Vectors

# You can pull individual values from your vectors 
y = MyVector[2];y # I am pulling the 2nd value in MyVector
## [1] 2
z = MyVector[c(1,3,7)];z
## [1] 1 3 7
q = MyVector[c(1.3, 4.6)];q # Decimal positions are truncated (not rounded), so this pulls the 1st and 4th values
## [1] 1 4
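
The last result can look surprising, so here is a minimal sketch of what happens. R drops the fractional part of a non-integer position rather than rounding it, so 4.6 selects the 4th value and not the 5th. The vector is re-created so the block runs on its own; DecimalIndex, WholeIndex, and RoundedIndex are names invented for this example.

```r
MyVector = c(1,2,3,4,5,6,7,8,9)

DecimalIndex = MyVector[c(1.3, 4.6)] # positions with decimals
WholeIndex = MyVector[c(1, 4)]       # the positions R actually uses

DecimalIndex
## [1] 1 4
identical(DecimalIndex, WholeIndex)
## [1] TRUE

# If rounding is what you want, round the positions yourself first
RoundedIndex = MyVector[round(c(1.3, 4.6))]
RoundedIndex
## [1] 1 5
```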

User Functions

#Creating a function that does what you want
MyFunction = function(x){return (x + 10)} #The syntax is important here
MyFunction2 = function(x,y){return (x + y + 25)} 
MyFunction(3) # Should return 13
## [1] 13
MyFunction2(3, 5) # Should return 33. Order matters with the inputs: the first is matched to x, the second to y
## [1] 33
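
In MyFunction2 the inputs are added together, so swapping them would give the same answer. The sketch below uses a hypothetical function, MySubtract (not part of the presentation), to show a case where the order of the inputs changes the result, and how naming the inputs avoids the problem.

```r
# Hypothetical function for illustration: subtracts the second input from the first
MySubtract = function(x, y){return (x - y)}

MySubtract(10, 3) # x = 10, y = 3
## [1] 7
MySubtract(3, 10) # x = 3, y = 10
## [1] -7
MySubtract(y = 3, x = 10) # named inputs can go in any order
## [1] 7
```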
Different data types
There are 3 types we will focus on today
# Numeric Data
NumericVector = c(1,2,3,4,5,6)
#Logical Data
LogicalVector = c(TRUE, FALSE, FALSE, TRUE, FALSE, TRUE)
#Character Data
CharacterVector = c("Schepens", "Eye", "Research", "Institute")
# You can combine vectors of different types, but it is not advised: R converts every value to a single common type. 
CombinedVector = c(NumericVector, LogicalVector) # This data has been turned numeric (TRUE becomes 1, FALSE becomes 0)
CombinedVector = c(NumericVector, CharacterVector) # This data has been turned character
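
One way to see the change in type is to ask R for the class of each combined vector. This is a small sketch using the base R function class(); the vectors are re-created (with a shortened character vector) so the block runs on its own, and NumericAndLogical and NumericAndCharacter are names chosen for this example.

```r
NumericVector = c(1,2,3,4,5,6)
LogicalVector = c(TRUE, FALSE, FALSE, TRUE, FALSE, TRUE)
CharacterVector = c("Eye", "Research", "Institute")

NumericAndLogical = c(NumericVector, LogicalVector)
class(NumericAndLogical)
## [1] "numeric"
NumericAndLogical # TRUE has become 1 and FALSE has become 0
## [1] 1 2 3 4 5 6 1 0 0 1 0 1

NumericAndCharacter = c(NumericVector, CharacterVector)
class(NumericAndCharacter) # the numbers are now stored as text
## [1] "character"
```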
Lists are able to hold different types of data in one object.
MyList = list(name = "Nick", age = 23, married = FALSE)
#All three ~attributes~ (important word) are different types of data and yet can be contained within the one object. 
MyBiggerList = list(name = c("Nick", "Jane", "Tommy"), age = c(23, 45, 21), married = c(FALSE, TRUE, FALSE))
#Putting vectors into the list attributes is how to build a larger list that still keeps the data types separate. 
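
Once a list is built, each attribute can be pulled back out by name. The following is a minimal sketch using the base R operators $ and [[ ]], with the list re-created so the block runs on its own.

```r
MyBiggerList = list(name = c("Nick", "Jane", "Tommy"), age = c(23, 45, 21), married = c(FALSE, TRUE, FALSE))

MyBiggerList$age # the whole age vector
## [1] 23 45 21
MyBiggerList[["married"]] # same idea, with the name written as a string
## [1] FALSE  TRUE FALSE
MyBiggerList$name[2] # the 2nd value of the name vector
## [1] "Jane"
class(MyBiggerList$age) # each attribute keeps its own type
## [1] "numeric"
```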
Tables in R. This is a very broad term in R; here we simply focus on the R function table(). There are also data tables, which we will get into in a more advanced class.
#table() allows us to count how many times each value appears in a vector. 
TestTableVector= c("Nick", "Nick", "Tony", "Rachel", "tony", "Bernie", "Ben")
table(TestTableVector)
## TestTableVector
##    Ben Bernie   Nick Rachel   tony   Tony 
##      1      1      2      1      1      1
#Notice it's case sensitive: "tony" and "Tony" are counted separately
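
If "tony" and "Tony" are meant to be the same person, one option is to put every name in the same case before counting. This is a sketch using the base R function tolower(); CleanedNames and NameCounts are names invented here.

```r
TestTableVector = c("Nick", "Nick", "Tony", "Rachel", "tony", "Bernie", "Ben")

# Convert every name to lower case so "Tony" and "tony" match
CleanedNames = tolower(TestTableVector)
NameCounts = table(CleanedNames)

NameCounts[["tony"]] # both spellings are now counted together
## [1] 2
NameCounts[["nick"]]
## [1] 2
```

The trade-off is that the original capitalization is lost in CleanedNames, so keep TestTableVector around if you still need it.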

Importing Data from Excel

# Let's focus on the most basic form of data importation:
# an Excel sheet in long form. Long form is when the data is arranged in columns, with a heading at the top of each column. 
# This is the easiest form of data importation; in the future we will get more involved.

#This portion of the markdown cannot include code, as we focused on the "Import Dataset" button in the upper right of RStudio
After clicking the Import Dataset button, a window opens that includes a code preview. The code preview is at the bottom right of the window. Here we will focus on what the code preview is, what it means, and how we can use it in our code.

Here's the code that was produced when I imported some data. The code was run in the CONSOLE and not the script window. This means the code is run once, and never again. If you want this code to run every time the script is run, you need to paste it into the script window.

### CODE IMPLEMENTED IN THE CONSOLE WHEN DATA IS IMPORTED 
### ---------------------------------------------------------
# library(readxl)
# AntibodyStudyData_Lesson1 = read_excel("Documents/R/ExpDesign/2018 Class/AntibodyStudyData_Lesson1.xlsx", sheet = "for_stata")
# View(AntibodyStudyData_Lesson1) # I commented this out; you don't have to view the data every time you import it

### ---------------------------------------------------------

# Let's go through this line by line 
# ----------------------
# LINE 1
# library(readxl)
  # The above line activates the package needed to read Excel sheets. RStudio knows to include it when you choose "From Excel".
# ----------------------
# LINE 2

# AntibodyStudyData_Lesson1 <- read_excel("Documents/R/ExpDesign/2018 Class/AntibodyStudyData_Lesson1.xlsx", sheet = "for_Stata")
  # The above line has a lot in it. Let's go through it piece by piece. 
  
  # The first part is the name of the variable that will be assigned ALL the data we import. The name is customizable, but the default is the file name.
  
  # Next is the read_excel() function, which is part of the readxl package we activated earlier in LINE 1.

  # Inside the read_excel() function are the inputs needed to import the data, and the most important is the PATH!! The path, the path, the path. This term is important to know in any coding language. 
# The path is a map for the computer to find my file. Here, on a Mac, I told R that the file I am interested in is in Documents, then in R, then in ExpDesign, then in 2018 Class, and so on. The computer knows which folders to open, one after another, to get to the file. Your path and your friend's path will look different because we all organize our files differently, with differently titled folders. If you send someone an R file to run on their computer, it will not work until they change the path to match where the file sits on their machine.

# Next is the sheet = part, where I specify which sheet of the workbook I want to import. 
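
To make the idea of a path concrete, the sketch below builds one folder by folder and asks R whether a file is there. The folder and file names are hypothetical and used only for illustration; file.path(), getwd(), file.exists(), basename(), and dirname() are base R functions.

```r
# Hypothetical folder and file names, used only for illustration
ExamplePath = file.path("Documents", "R", "ExampleFolder", "ExampleData.xlsx")
ExamplePath
## [1] "Documents/R/ExampleFolder/ExampleData.xlsx"

# A path that does not start from the top of the drive is followed
# starting from the working directory, which getwd() reports
getwd()

# file.exists() reports whether R can find a file at that path
file.exists(ExamplePath)

basename(ExamplePath) # just the file name
## [1] "ExampleData.xlsx"
dirname(ExamplePath) # just the folders
## [1] "Documents/R/ExampleFolder"
```

Checking file.exists() before importing is a quick way to tell a wrong path apart from a problem with the file itself.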

Finally, once we import the data, the View(data) line is run (we can choose not to include this). Do we want to re-import the file every time we run the script? Do we intend to change the Excel file in the future? How will R know about these updates if we do not re-import it every time?
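
One way to act on these questions is to keep the import inside the script, so the workbook is read again on every run and any edits to it are picked up. The sketch below is an assumption about how that might look, not the code from the presentation: ImportWorkbook, the file name, and the sheet name are all hypothetical, and the function simply returns NULL when the file or the readxl package is missing.

```r
# Hypothetical helper: reads one sheet of a workbook each time it is called
ImportWorkbook = function(path, sheet){
  # Stop early if the file is not where the path says it is
  if (!file.exists(path)) {
    message("No file found at: ", path)
    return (NULL)
  }
  # Stop early if the readxl package is not installed
  if (!requireNamespace("readxl", quietly = TRUE)) {
    message("The readxl package is not installed")
    return (NULL)
  }
  return (readxl::read_excel(path, sheet = sheet))
}

# Hypothetical path and sheet name; replace them with your own
MyData = ImportWorkbook("ExampleFolder/ExampleData.xlsx", sheet = "Sheet1")
```

Because this line sits in the script rather than the console, rerunning the script re-reads the Excel file and MyData reflects its current contents.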

We will get into the specifics of different types of data import later.