Lists

Creating a list:

alumno1=list(nombre="Patrick", apellido="Salazar", edad=27, casado=FALSE)

Viewing the list:

alumno1
## $nombre
## [1] "Patrick"
## 
## $apellido
## [1] "Salazar"
## 
## $edad
## [1] 27
## 
## $casado
## [1] FALSE

Checking the data types in my list:

str(alumno1)
## List of 4
##  $ nombre  : chr "Patrick"
##  $ apellido: chr "Salazar"
##  $ edad    : num 27
##  $ casado  : logi FALSE

chr stands for “character”, R's type for text. It doesn’t matter that each text contains more than one letter: R treats each whole string as a single element of a character vector.

num stands for “numeric”.

logi stands for “logical”, a value that is either TRUE or FALSE.
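
To double-check the types that str reports, here is a small sketch. The list alumno is a hypothetical copy of the one above, and tipos is a name I made up for the example; it only illustrates the idea.

```r
# Hypothetical list mirroring the one created above
alumno <- list(nombre = "Patrick", apellido = "Salazar", edad = 27, casado = FALSE)

# class() applied to every element, collected into a named character vector
tipos <- sapply(alumno, class)
print(tipos)

# A string with several letters is still ONE element of a character vector:
# length() counts strings, nchar() counts the letters inside a string
length(alumno$nombre)  # 1
nchar(alumno$nombre)   # 7
```

The full names ("character", "numeric", "logical") are what str abbreviates as chr, num and logi.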

Accessing individual elements of my list [three different ways of doing so]:

alumno1[2] #accessing by position
## $apellido
## [1] "Salazar"
alumno1["apellido"] #accessing by name
## $apellido
## [1] "Salazar"
alumno1$apellido #accessing with the $ sign [returns the value itself, not a one-element list]
## [1] "Salazar"
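
The outputs above differ slightly: single brackets print the name of the element ($apellido) because they return a smaller list, while $ returns the value itself. A minimal sketch of that difference, using a hypothetical list alumno and variable names of my own choosing:

```r
# Hypothetical list for illustration
alumno <- list(nombre = "Patrick", apellido = "Salazar", edad = 27)

sub_lista <- alumno[2]         # single brackets: a list of length 1
valor     <- alumno[[2]]       # double brackets: the element itself
por_signo <- alumno$apellido   # $ sign: also the element itself

class(sub_lista)               # "list"
class(valor)                   # "character"
identical(valor, por_signo)    # TRUE
```

When I want to use the value in a calculation or a comparison, [[ ]] or $ is usually what I need.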

Accessing more than one element of my list:

alumno1[c("nombre","apellido")] #c() builds the vector of names I want to select
## $nombre
## [1] "Patrick"
## 
## $apellido
## [1] "Salazar"

Replacing an element of my list:

alumno1$nombre="Atilio" #assigning to an existing element overwrites its old value, thus changing my name

Viewing the change:

alumno1
## $nombre
## [1] "Atilio"
## 
## $apellido
## [1] "Salazar"
## 
## $edad
## [1] 27
## 
## $casado
## [1] FALSE

Adding a new element to the list:

alumno1$distritoNace="Miraflores" #assigning to a name that doesn't exist yet adds it; here, the district I was born in

Viewing the change:

alumno1
## $nombre
## [1] "Atilio"
## 
## $apellido
## [1] "Salazar"
## 
## $edad
## [1] 27
## 
## $casado
## [1] FALSE
## 
## $distritoNace
## [1] "Miraflores"

Removing an element from the list:

alumno1 = alumno1[-2] #a negative index drops the second element, where I put my last name

Viewing the change:

alumno1
## $nombre
## [1] "Atilio"
## 
## $edad
## [1] 27
## 
## $casado
## [1] FALSE
## 
## $distritoNace
## [1] "Miraflores"
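
Removing by position works, but it breaks if the order of the list changes. A hedged alternative is to remove the element by name, assigning NULL to it. The list alumno below is a hypothetical copy used only for this sketch:

```r
# Hypothetical list for illustration
alumno <- list(nombre = "Atilio", apellido = "Salazar", edad = 27)

# Assigning NULL to an element removes it from the list
alumno$apellido <- NULL

names(alumno)   # "nombre" "edad"
length(alumno)  # 2
```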

Vectors

Creating some vectors:

nombre=c("Alejandro","Alan","Ollanta","Pedro Pablo")
edad=c(72,68,55,79)
terminoMandato=c(TRUE,TRUE,TRUE,FALSE)

Viewing the vectors:

nombre
## [1] "Alejandro"   "Alan"        "Ollanta"     "Pedro Pablo"
edad
## [1] 72 68 55 79
terminoMandato
## [1]  TRUE  TRUE  TRUE FALSE

Remember that a vector is always composed of elements of the same type. To access individual elements of a vector:

nombre[4]
## [1] "Pedro Pablo"
edad[4]
## [1] 79
terminoMandato[4]
## [1] FALSE
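
To see what "same type" means in practice, here is a short sketch of what happens when I mix types inside c(). The variable names are made up for the example:

```r
# Mixing types: R silently converts everything to one common type
mezcla_texto   <- c(72, "Alan", TRUE)   # one string turns every element into text
mezcla_numeros <- c(72, TRUE, FALSE)    # logicals become 1 and 0 next to numbers

class(mezcla_texto)     # "character"
mezcla_texto            # "72" "Alan" "TRUE"

class(mezcla_numeros)   # "numeric"
mezcla_numeros          # 72 1 0
```

This is why a list, and not a vector, was the right structure for a student's name, age and marital status.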

Changing an element of a vector:

nombre[4]="PPK"

Viewing the change:

nombre
## [1] "Alejandro" "Alan"      "Ollanta"   "PPK"

Adding a new element to the end of a vector:

nombre=c(nombre,"Martin")

Viewing the change:

nombre
## [1] "Alejandro" "Alan"      "Ollanta"   "PPK"       "Martin"
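
c() always adds at the end. If I need the new element somewhere in the middle, base R's append function takes an after argument. A small sketch with a hypothetical copy of the vector:

```r
# Hypothetical copy of the vector of names
nombres <- c("Alejandro", "Alan", "Ollanta", "PPK")

# Insert "Martin" right after the second position
con_martin <- append(nombres, "Martin", after = 2)

con_martin   # "Alejandro" "Alan" "Martin" "Ollanta" "PPK"
```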

Removing the element I just added, and viewing the result:

nombre=nombre[-5]
nombre
## [1] "Alejandro" "Alan"      "Ollanta"   "PPK"
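
A negative index requires knowing the position. As a sketch of two alternatives, I can remove by value with a logical condition, or drop several positions at once. The vector nombres is a hypothetical copy:

```r
# Hypothetical copy of the vector of names
nombres <- c("Alejandro", "Alan", "Ollanta", "PPK", "Martin")

# Keep only the elements that are NOT equal to "Martin"
sin_martin <- nombres[nombres != "Martin"]

# Drop the first and the fifth positions in one step
sin_extremos <- nombres[-c(1, 5)]

sin_martin     # "Alejandro" "Alan" "Ollanta" "PPK"
sin_extremos   # "Alan" "Ollanta" "PPK"
```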

Sorting the ages in decreasing order and storing the result in a new vector:

edad_ordenada=sort(edad, decreasing = TRUE)
edad_ordenada
## [1] 79 72 68 55
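
sort only rearranges the ages themselves. If I want the names in the same order as the sorted ages, one option is order, which returns the positions instead of the values. A minimal sketch, with hypothetical copies of the two vectors:

```r
# Hypothetical copies of the vectors used above
nombres <- c("Alejandro", "Alan", "Ollanta", "Pedro Pablo")
edades  <- c(72, 68, 55, 79)

# order() gives the positions that would sort the ages from oldest to youngest
posiciones <- order(edades, decreasing = TRUE)
posiciones            # 4 1 2 3

# Using those positions on the names keeps each name paired with its age
nombres[posiciones]   # "Pedro Pablo" "Alejandro" "Alan" "Ollanta"
```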

Checking the length of a vector [how many elements it has]:

length(edad_ordenada)
## [1] 4

Getting the maximum and minimum values of a vector:

max(edad)
## [1] 79
min(edad)
## [1] 55
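
max and min tell me the values but not who they belong to. As a sketch, which.max and which.min return the position of the extreme value, and range returns both extremes at once. The vectors are hypothetical copies:

```r
# Hypothetical copies of the vectors used above
nombres <- c("Alejandro", "Alan", "Ollanta", "Pedro Pablo")
edades  <- c(72, 68, 55, 79)

range(edades)                 # 55 79 (minimum first, then maximum)

mayor <- nombres[which.max(edades)]   # name at the position of the largest age
menor <- nombres[which.min(edades)]   # name at the position of the smallest age

mayor   # "Pedro Pablo"
menor   # "Ollanta"
```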

Data Frames

First I create some vectors:

nombre=c("Alejandro", "Alan", "Ollanta", "Pedro Pablo")
apellido=c("Toledo", "Garcia", "Humala", "Kuczynski")
edad=c(72,68,55,79)
partido=c("Peru Posible", "APRA", "Partido Nacionalista", "Peruanos por el Kambio")

Now I create a data frame, which R stores as a list of equal-length vectors [one per column]:

presidentes=data.frame(nombre, apellido, edad, partido, stringsAsFactors = FALSE) #stringsAsFactors = FALSE tells R not to convert the character vectors into factors [this is already the default since R 4.0.0]

Viewing the data:

presidentes

Checking data types with the str function:

str(presidentes)
## 'data.frame':    4 obs. of  4 variables:
##  $ nombre  : chr  "Alejandro" "Alan" "Ollanta" "Pedro Pablo"
##  $ apellido: chr  "Toledo" "Garcia" "Humala" "Kuczynski"
##  $ edad    : num  72 68 55 79
##  $ partido : chr  "Peru Posible" "APRA" "Partido Nacionalista" "Peruanos por el Kambio"

Exploring dimensions of my data frame:

dim(presidentes) #dimensions
## [1] 4 4
nrow(presidentes) #number of rows [how many cases do I have?]
## [1] 4
ncol(presidentes) #number of columns [how many variables do I have?]
## [1] 4

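How long is my data frame? Because a data frame is a list of columns, length returns the number of columns, not the number of rows: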

length(presidentes)
## [1] 4
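
The reason length counts columns can be checked directly. This sketch uses a smaller, hypothetical data frame called pres so the example stands on its own:

```r
# Hypothetical two-row data frame for illustration
pres <- data.frame(nombre = c("Alejandro", "Alan"),
                   edad   = c(72, 68),
                   stringsAsFactors = FALSE)

is.list(pres)                  # TRUE: a data frame is a special kind of list
length(pres) == ncol(pres)     # TRUE: each list element is one column
names(pres)                    # "nombre" "edad"

# So the list tools from the first section also work on columns
pres$edad                      # 72 68
pres[["nombre"]]               # "Alejandro" "Alan"
```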

Viewing only the first two rows:

head(presidentes,2)

Viewing only the last two rows:

tail(presidentes,2)
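
head and tail pick rows by position only. As a sketch of a more general approach, the [row, column] notation selects rows by number or by a condition. The data frame pres is a hypothetical copy with just two of the columns:

```r
# Hypothetical copy of the data frame, with fewer columns
pres <- data.frame(nombre = c("Alejandro", "Alan", "Ollanta", "Pedro Pablo"),
                   edad   = c(72, 68, 55, 79),
                   stringsAsFactors = FALSE)

# Row 2, all columns (leaving the column slot empty means "every column")
fila2 <- pres[2, ]

# Names of the presidents older than 70
mayores <- pres[pres$edad > 70, "nombre"]

fila2$nombre   # "Alan"
mayores        # "Alejandro" "Pedro Pablo"
```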