---
title: "Week 2 Part2 (Extraction)"
author: "Raul Roces"
output: html_notebook
---
In this class, you will learn how to:

* Extract individual values from a data set
* Change individual values within a data set

Let's load the data frame from a file on the computer.
```{r}
deck <- read.csv("C:/Users/raulr/OneDrive/Escritorio/STU/Prgram For Data Analytics/Archivos/deck.csv",stringsAsFactors=FALSE)
head(deck)
```

# Positive Numbers
R treats positive integers just like ij notation in linear algebra: deck[i, j] returns the value of deck that is in the ith row and the jth column.
In this example, the value in the first row and first column is "king". The last line stores the first row, a data frame with one row and three columns, in an object named new.
```{r}
deck[1, 1]
deck[1, c(1, 2, 3)]
new <- deck[1, c(1, 2, 3)]
```
Repetition.
Repeating an index returns that row more than once. This returns a data frame of 2 rows and 3 columns that contains the first row twice.
```{r}
deck[c(1, 1), c(1, 2, 3)]
```
Returns a data frame of 2 rows and 2 columns: the first and second rows and columns of the original data frame.
```{r}
deck[1:2, 1:2]
```
Returns a vector with the first two rows of the first column. When a single column is selected, R simplifies the result to a vector.
```{r}
deck[1:2, 1]
```

With drop = FALSE, the same selection stays a data frame of 2 rows and 1 column instead of being simplified to a vector.
```{r}
deck[1:2, 1, drop = FALSE]
```
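To check what each form of the index returns, the following sketch builds a small stand-in data frame (mini is a hypothetical name, not part of the class data) and compares the results with and without drop = FALSE.

```{r}
# mini is a hypothetical three-card stand-in for deck
mini <- data.frame(
  face = c("king", "queen", "jack"),
  suit = c("spades", "spades", "spades"),
  value = c(13, 12, 11),
  stringsAsFactors = FALSE
)

mini[1, 1]                              # a single value: "king"
one_col <- mini[1:2, 1]                 # one column is simplified to a vector
kept_df <- mini[1:2, 1, drop = FALSE]   # stays a data frame

is.data.frame(one_col)                  # FALSE
is.data.frame(kept_df)                  # TRUE
dim(kept_df)                            # 2 rows, 1 column
```

Keeping the data frame structure is useful when later code expects named columns rather than a bare vector.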

# Negative Numbers
Negative integers exclude elements: R returns every row or column except the ones at the negated positions. Here rows 2 to 52 are dropped, so only the first row remains.
```{r}
deck[-(2:52), 1:3]
```
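Positive and negative integers cannot be mixed in the same index. The following line raises an error, so it is left commented out.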
```{r}
#deck[c(-1, 1), 1]
```
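A minimal sketch of both behaviours, using a hypothetical vector named nums rather than deck: negative integers drop positions, and mixing signs raises an error, which tryCatch() captures here so the chunk keeps running.

```{r}
nums <- c(10, 20, 30, 40)   # hypothetical example vector

nums[-1]        # everything except the first element
nums[-(2:4)]    # everything except elements 2 to 4

# Mixing positive and negative indexes is an error; capture its message
msg <- tryCatch(nums[c(-1, 1)], error = function(e) conditionMessage(e))
msg
```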

# Zero
Zero selects nothing from a dimension, so the following instruction returns an empty data frame.
```{r}
deck[0, 0]
```

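# Blank Spaces
A blank index tells R to extract every value in that dimension. Here the whole first row is returned.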
```{r}
deck[1, ]
```

# Logical Values
With a vector of TRUEs and FALSEs, R returns only the rows or columns that are matched with TRUE. Here the first two columns of the first row are returned.
```{r}
deck[1, c(TRUE, TRUE, FALSE)]
```
A logical vector can select rows in the same way. This vector has one element per card, with TRUE only in the first position, so it selects just the first row.
```{r}
# TRUE for the first row, FALSE for the remaining 51
rows <- c(TRUE, rep(FALSE, 51))
deck[rows, ]
```
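A logical index is usually produced by a comparison rather than typed out. As a sketch, the example below uses a hypothetical data frame named hand and keeps the rows whose value (the second column) is greater than 11.

```{r}
# hand is a hypothetical four-card data frame
hand <- data.frame(
  face = c("king", "queen", "jack", "ten"),
  value = c(13, 12, 11, 10),
  stringsAsFactors = FALSE
)

high <- hand[ , 2] > 11   # TRUE TRUE FALSE FALSE
hand[high, ]              # the rows for king and queen
```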

# Names
You can extract elements by name when the object has names, such as the column names of a data frame.
```{r}
deck[1, c("face", "suit", "value")]
```
Combine a name with a blank space to extract every value in a named column.
```{r}
# the entire value column
deck[ , "value"]
```
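Selecting by name and selecting by position reach the same data. The sketch below, on a hypothetical data frame named cards3, confirms this and shows that the columns come back in the order the names are given.

```{r}
# cards3 is a hypothetical two-card data frame
cards3 <- data.frame(
  face = c("king", "queen"),
  suit = c("spades", "hearts"),
  value = c(13, 12),
  stringsAsFactors = FALSE
)

names(cards3)                       # "face" "suit" "value"
by_name <- cards3[ , "value"]
by_position <- cards3[ , 3]
identical(by_name, by_position)     # TRUE

reordered <- cards3[ , c("value", "face")]
names(reordered)                    # "value" "face"
```

Names keep working if the columns are later rearranged, whereas positions would then point at different columns.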
Complete the following code to make a function that returns the first row of a data frame.
Because the function works on its argument cards rather than on deck directly, passing the name of any data frame returns the first row of that data frame.
```{r}
deal <- function(cards) {cards[1, ]}
deal(deck)

```
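Shuffle the deck. Reordering the row indexes reorders the rows: deck2 keeps the original order, while deck3 swaps the first two cards.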
```{r}
deck2 <- deck[1:52, ]
head(deck2)
```

```{r}
deck3 <- deck[c(2, 1, 3:52), ]
head(deck3)
```

# How to Modify the Order
To change the order of the cards at random, index the rows with the numbers 1 to 52 in random order.
```{r}
#First create a vector of 52 numbers in random order and store it in an object named random.
random <- sample(1:52, size = 52)
random
```



```{r}
deck4 <- deck[random, ]
head(deck4)
```
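Because sample() draws at random, the order changes on every run. If a repeatable order is wanted, one option is to call set.seed() first. The seed value 1 below is an arbitrary choice, and the last line checks that the result is a permutation of 1 to 52.

```{r}
set.seed(1)                        # arbitrary seed, chosen for illustration
perm_a <- sample(1:52, size = 52)

set.seed(1)                        # same seed, same order
perm_b <- sample(1:52, size = 52)

identical(perm_a, perm_b)          # TRUE
identical(sort(perm_a), 1:52)      # TRUE: every row number appears exactly once
```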

# Exercise
Use the preceding ideas to write a shuffle function. shuffle should take a data frame
and return a shuffled copy of the data frame.

```{r}
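shuffle <- function(cards) {
  # draw the row numbers 1 to 52 in random order
  random <- sample(1:52, size = 52)
  cards[random, ]
}
deal(deck)
## face suit value
## king spades 13
deck2 <- shuffle(deck)
deal(deck2)
## the top card now varies from run to run, for example:
## face suit value
## jack clubs 11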
```
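The function above assumes a data frame with exactly 52 rows. As a sketch of one way to lift that assumption, the variant below (shuffle_any is a hypothetical name, not part of the exercise) asks the data frame for its row count with nrow().

```{r}
shuffle_any <- function(cards) {
  # seq_len(nrow(cards)) is 1, 2, ..., up to the number of rows
  random <- sample(seq_len(nrow(cards)))
  cards[random, , drop = FALSE]
}

# small is a hypothetical three-card data frame
small <- data.frame(
  face = c("king", "queen", "jack"),
  value = c(13, 12, 11),
  stringsAsFactors = FALSE
)

mixed <- shuffle_any(small)
nrow(mixed)                                              # 3
identical(sort(mixed[ , "value"]), c(11, 12, 13))        # TRUE: same cards, new order
```

The drop = FALSE argument keeps the result a data frame even when the input has a single column.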

# Dollar Signs and Double Brackets
Two types of object in R obey an optional second system of notation. You can extract
values from data frames and lists with the $ syntax. You will encounter the $ syntax again
and again as an R programmer, so let’s examine how it works.
To select a column from a data frame, write the data frame’s name and the column name
separated by a $. Notice that no quotes should go around the column name:
```{r}
deck$value
```

```{r}
mean(deck$value)
```

```{r}
median(deck$value)
```
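The $ notation also works with the named elements of a list. If you subset a list with single brackets, R returns a new list, even when you request a single element. To see this, make a list: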

```{r}
lst <- list(numbers = c(1, 2), logical = TRUE, strings = c("a", "b", "c"))
lst
```

And then subset it:
```{r}
lst[1]
```

The result is a smaller list with one element. That element is the vector c(1, 2). This
can be annoying because many R functions do not work with lists. For example,
sum(lst[1]) will return an error. It would be horrible if once you stored a vector in a
list, you could only ever get it back as a list:




```{r}
#sum(lst[1])

```
When you use the $ notation, R will return the selected values as they are, with no list
structure around them:


```{r}
lst$numbers
## 1 2

```
You can then immediately feed the results to a function:
```{r}
sum(lst$numbers)
## 3

```
If the elements in your list do not have names (or you do not wish to use the names),
you can use two brackets, instead of one, to subset the list. This notation will do the
same thing as the $ notation:
```{r}
lst[[1]]
## 1 2
```
In other words, if you subset a list with single-bracket notation, R will return a smaller
list. If you subset a list with double-bracket notation, R will return just the values that
were inside an element of the list. You can combine this feature with any of R’s indexing
methods:
```{r}
lst["numbers"]
lst[["numbers"]]
```
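The difference can be checked directly. The sketch below uses a hypothetical list named info and asks R what kind of object each notation returns.

```{r}
info <- list(numbers = c(1, 2), strings = c("a", "b"))   # hypothetical list

with_single <- info["numbers"]      # a list holding one element
with_double <- info[["numbers"]]    # the numeric vector itself

is.list(with_single)                        # TRUE
is.list(with_double)                        # FALSE
identical(with_double, info$numbers)        # TRUE
sum(with_double)                            # 3
```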



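# Changing Values in Place
To change a value in place, select it with any of the notations above and assign a new value with <-.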
```{r}
vec <- c(0, 0, 0, 0, 0, 0)
vec
vec[1] <- 1000
vec
```


You can change more than one value at the same time, as long as the number of new values matches the number of selected values.
```{r}
vec[c(1, 3, 5)] <- c(7, 9, 8)
vec
vec[4:6] <- vec[4:6] + 1
vec
```

Assigning to a position that does not exist yet forces the vector to grow.
```{r}
vec[7] <- 0
vec
```
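If the new position is more than one step past the end, R fills the gap with NA. A short sketch with a hypothetical vector named grow:

```{r}
grow <- c(1, 2, 3)   # hypothetical example vector
grow[6] <- 9         # positions 4 and 5 do not exist yet
grow                 # 1 2 3 NA NA 9
length(grow)         # 6
```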
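R could not open the connection for the original path, which is kept as a comment below. This error usually means the file was not found at that location, so I load the data frame the same way as before. Assigning to deck2$new then adds a new column named new that numbers the rows from 1 to 52.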
```{r}
#deck2 <- read.csv("~/Documents/Clases/sainthomas/deck.csv",stringsAsFactors=FALSE)
deck2 <- read.csv("C:/Users/raulr/OneDrive/Escritorio/STU/Prgram For Data Analytics/Archivos/deck.csv",stringsAsFactors=FALSE)
deck2$new <- 1:52
head(deck2)
```
To remove a column from a data frame, assign NULL to it. The column named new is removed.
```{r}
deck2$new <- NULL
head(deck2)
```
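The same add-and-remove cycle can be followed on a smaller object. This sketch uses a hypothetical data frame named tiny and records the column names at each step.

```{r}
tiny <- data.frame(face = c("king", "queen"), stringsAsFactors = FALSE)

tiny$order <- 1:2                 # assigning to a new name adds a column
names_with <- names(tiny)         # "face" "order"

tiny$order <- NULL                # assigning NULL removes it again
names_without <- names(tiny)      # "face"

names_with
names_without
```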

Select rows 13, 26, 39, and 52, which hold the four aces.
```{r}
deck2[c(13, 26, 39, 52), ]
```
Select the same rows and only column 3, the value column. The $ notation in the next chunk reaches the same values.
```{r}
deck2[c(13, 26, 39, 52), 3]
```

```{r}
deck2$value[c(13, 26, 39, 52)]
```
Assign 14 to those four positions to change the value from 1 to 14 in each row.
```{r}
deck2$value[c(13, 26, 39, 52)] <- c(14, 14, 14, 14)
```
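Row numbers such as 13, 26, 39, and 52 only point at the right cards while the deck is in its original order. As a sketch of an alternative, a logical test can locate the cards wherever they are. The data frame pile below is a hypothetical stand-in, and it assumes the face column labels these cards "ace".

```{r}
# pile is a hypothetical four-card data frame in no particular order
pile <- data.frame(
  face = c("ace", "king", "ace", "two"),
  value = c(1, 13, 1, 2),
  stringsAsFactors = FALSE
)

pile$value[pile$face == "ace"] <- 14   # change only the aces
pile$value                             # 14 13 14 2
```

This form keeps working after a shuffle, because the test is evaluated against the current contents of the face column.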
