Activity 12

Michael Marx

In this class, you will learn how to:

  • Extract individual values from a data set
  • Change individual values within a data set

Import the deck CSV file and store it as a data frame, then display its first six rows (head() shows six rows by default):

deck <- read.csv("deck.csv",stringsAsFactors=FALSE)
head(deck)

Positive Numbers

R treats positive integers just like ij notation in linear algebra: deck[i, j] will return the value of deck that is in the ith row and the jth column.

Access the first element in the first row -> king. Access the first, second, and third elements of the first row -> king spades 13. Then assign those three values to a new variable:

deck[1, 1]
## [1] "king"
deck[1, c(1, 2, 3)]
new <- deck[1, c(1, 2, 3)]

Repetition: repeating an index returns the same row twice:

deck[c(1, 1), c(1, 2, 3)]

Returns a data frame: the first and second elements of the first two rows:

deck[1:2, 1:2]

Returns a vector: the first element of the first two rows (a single column is simplified to a vector):

deck[1:2, 1]
## [1] "king"  "queen"

Returns a data frame: the first element of the first two rows, kept as a data frame with its header by drop = FALSE:

deck[1:2, 1, drop = FALSE]
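The difference between the last three calls can be checked with class(). The following is a minimal sketch, assuming a three-row stand-in named cards instead of the real deck.csv (the stand-in and its variable names are illustrative, not part of the activity):

```r
# Three-row stand-in for deck (hypothetical; the real deck has 52 rows)
cards <- data.frame(
  face = c("king", "queen", "jack"),
  suit = c("spades", "spades", "spades"),
  value = c(13, 12, 11),
  stringsAsFactors = FALSE
)

two_cols <- cards[1:2, 1:2]              # two columns: stays a data frame
one_col  <- cards[1:2, 1]                # one column: simplified to a vector
kept     <- cards[1:2, 1, drop = FALSE]  # one column, kept as a data frame

class(two_cols)  # "data.frame"
class(one_col)   # "character"
class(kept)      # "data.frame"
```

drop = FALSE is useful inside functions that expect a data frame no matter how many columns are selected.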

Negative Numbers

Negative integers exclude rows. -(2:52) drops rows 2 through 52, so only the first row is returned:

deck[-(2:52), 1:3]

Illegal instruction: mixing negative and positive integers in the same index, as in c(-1, 1), results in an error:

#deck[c(-1, 1), 1]
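To see both behaviours without stopping the script, the error can be captured with tryCatch. This is a sketch on a three-row stand-in (cards is a hypothetical name, not the real deck):

```r
# Three-row stand-in for deck (hypothetical)
cards <- data.frame(
  face = c("king", "queen", "jack"),
  value = c(13, 12, 11),
  stringsAsFactors = FALSE
)

only_first    <- cards[-(2:3), ]  # drop rows 2 and 3, keep row 1
without_first <- cards[-1, ]      # drop row 1, keep rows 2 and 3

# Mixing signs in one index is an error; tryCatch returns the error object
mixed <- tryCatch(cards[c(-1, 1), 1], error = function(e) e)
inherits(mixed, "error")  # TRUE
```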

Zero

The following instruction creates an empty object:

deck[0, 0]

-> returned dataframe is empty
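dim() makes the emptiness visible. A small sketch, again assuming a three-row stand-in called cards rather than the real deck:

```r
# Three-row stand-in for deck (hypothetical)
cards <- data.frame(
  face = c("king", "queen", "jack"),
  suit = c("spades", "spades", "spades"),
  value = c(13, 12, 11),
  stringsAsFactors = FALSE
)

empty_both <- cards[0, 0]  # no rows and no columns
empty_rows <- cards[0, ]   # no rows, but all three columns are kept

dim(empty_both)    # 0 0
dim(empty_rows)    # 0 3
names(empty_rows)  # "face" "suit" "value"
```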

Blank Spaces

A blank column index returns the full first row:

deck[1, ]

Logical Values

A logical index returns the columns marked TRUE, here the first two elements of the first row:

deck[1, c(TRUE, TRUE, FALSE)]

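Declare a logical vector with one entry per row of deck, TRUE for the first row and FALSE for the other 51. Used as a row index, as in deck[rows, ], it selects only the first row: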

# TRUE for row 1, FALSE for the remaining 51 rows
rows <- c(TRUE, rep(FALSE, 51))
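A shorter version of the same idea, sketched on a three-row stand-in (cards and first are hypothetical names). The logical vector needs one entry per row, and which() shows the positions it selects:

```r
# Three-row stand-in for deck (hypothetical)
cards <- data.frame(
  face = c("king", "queen", "jack"),
  suit = c("spades", "spades", "spades"),
  value = c(13, 12, 11),
  stringsAsFactors = FALSE
)

rows <- c(TRUE, FALSE, FALSE)  # one entry per row
first <- cards[rows, ]         # keeps only the rows marked TRUE

which(rows)  # 1
nrow(first)  # 1
```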

Names

Extract elements by name. This selects the face, suit, and value columns of the first row:

deck[1, c("face", "suit", "value")]

You can use a blank space to tell R to extract every value in a dimension. The following returns the whole value column:

# the entire value column
deck[ , "value"]
##  [1] 13 12 11 10  9  8  7  6  5  4  3  2  1 13 12 11 10  9  8  7  6  5  4  3  2
## [26]  1 13 12 11 10  9  8  7  6  5  4  3  2  1 13 12 11 10  9  8  7  6  5  4  3
## [51]  2  1

Complete the following code to make a function that returns the first row of a data frame:

deal <- function(cards) {
 cards[1,]
}
deal(deck)

Shuffle deck: deck2 is composed of rows 1 to 52 of deck in their original order; head() returns the first six rows of deck2:

deck2 <- deck[1:52, ]
head(deck2)

deck3 is composed of the second row, then the first row, then rows 3 to 52 of deck, so the first two cards swap places:

deck3 <- deck[c(2, 1, 3:52), ]
deck3

How to Modify the Order

Create a random ordering of the numbers 1 to 52:

# First, create a vector of 52 numbers in random order and store it in an object named random.
random <- sample(1:52, size = 52)
random
##  [1] 15 30 22 14 43 50 41 27 25 31 51 13 46 34 29 45 20 18 42 32  2 19 40 12 39
## [26] 24 49 48  7 37 38 36 44 10 21  6  9  1 11 33  8 35  5 16 28  3 52 26 47 17
## [51] 23  4

Use random as a row index to reorder deck. head() returns the first six rows of the shuffled deck4:

deck4 <- deck[random, ]
head(deck4)

Exercise

Use the preceding ideas to write a shuffle function. shuffle should take a data frame and return a shuffled copy of the data frame.

shuffle <- function(cards) {
  # draw the row numbers 1 to 52 in random order, then reorder the rows
  random <- sample(1:52, size = 52)
  cards[random, ]
}
deal(deck)
## face suit value
## king spades 13
deck2 <- shuffle(deck)
deal(deck2)
## face suit value
## jack clubs 11

Explanation: shuffle draws the row numbers 1 to 52 in random order and uses them to reorder the rows of the data frame. deal(deck) ‘deals’ the top card of the unshuffled deck. deck2 is then assigned the shuffled copy returned by shuffle, and deal(deck2) deals the top card of that shuffled deck.
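The shuffle above assumes exactly 52 rows. One possible generalization, shown here only as a sketch, sizes the sample from nrow(cards). The name shuffle_any and the deck built in code are assumptions for illustration (the face and suit order stands in for deck.csv):

```r
# Deck rebuilt in code as a stand-in for deck.csv (names and order assumed)
faces <- c("king", "queen", "jack", "ten", "nine", "eight", "seven",
           "six", "five", "four", "three", "two", "ace")
suits <- c("spades", "clubs", "diamonds", "hearts")
deck <- data.frame(
  face = rep(faces, times = 4),
  suit = rep(suits, each = 13),
  value = rep(13:1, times = 4),
  stringsAsFactors = FALSE
)

# Hypothetical variant of shuffle that works for any number of rows
shuffle_any <- function(cards) {
  random <- sample(seq_len(nrow(cards)))  # row numbers 1..n in random order
  cards[random, , drop = FALSE]
}

set.seed(1)  # only to make the shuffle repeatable
shuffled <- shuffle_any(deck)
nrow(shuffled)  # 52
```

Every card still appears exactly once; only the row order changes.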

Dollar Signs and Double Brackets

Two types of object in R obey an optional second system of notation. You can extract values from data frames and lists with the $ syntax. You will encounter the $ syntax again and again as an R programmer, so let’s examine how it works. To select a column from a data frame, write the data frame’s name and the column name separated by a $. Notice that no quotes should go around the column name:

deck$value
##  [1] 13 12 11 10  9  8  7  6  5  4  3  2  1 13 12 11 10  9  8  7  6  5  4  3  2
## [26]  1 13 12 11 10  9  8  7  6  5  4  3  2  1 13 12 11 10  9  8  7  6  5  4  3
## [51]  2  1

-> deck’s value column is directly accessed by ‘$value’

mean(deck$value)
## [1] 7

-> mean of value column is calculated

median(deck$value)
## [1] 7

-> same with median
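The $ syntax, the name index, and double brackets all reach the same column. A quick check, assuming a deck rebuilt in code (the face and suit names are stand-ins for deck.csv; the values match the output above):

```r
# Deck rebuilt in code as a stand-in for deck.csv (names and order assumed)
faces <- c("king", "queen", "jack", "ten", "nine", "eight", "seven",
           "six", "five", "four", "three", "two", "ace")
suits <- c("spades", "clubs", "diamonds", "hearts")
deck <- data.frame(
  face = rep(faces, times = 4),
  suit = rep(suits, each = 13),
  value = rep(13:1, times = 4),
  stringsAsFactors = FALSE
)

by_dollar  <- deck$value
by_name    <- deck[, "value"]
by_double  <- deck[["value"]]

identical(by_dollar, by_name)    # TRUE
identical(by_dollar, by_double)  # TRUE
mean(by_dollar)                  # 7
```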

The $ notation also works with the named elements of a list. To see this, make a list:

lst <- list(numbers = c(1, 2), logical = TRUE, strings = c("a", "b", "c"))
lst
## $numbers
## [1] 1 2
## 
## $logical
## [1] TRUE
## 
## $strings
## [1] "a" "b" "c"

-> the list contains three separate ‘elements’: numbers, logical, and strings; each element is assigned its own values

And then subset it:

lst[1]
## $numbers
## [1] 1 2

-> the first element numbers is accessed by the brackets [1]

The result is a smaller list with one element. That element is the vector c(1, 2). This can be annoying because many R functions do not work with lists. For example, sum(lst[1]) will return an error. It would be horrible if once you stored a vector in a list, you could only ever get it back as a list:

-> error is returned:

#sum(lst[1])
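The error can be observed without halting the script by wrapping the call in tryCatch. A minimal sketch (piece and result are illustrative names):

```r
lst <- list(numbers = c(1, 2), logical = TRUE, strings = c("a", "b", "c"))

piece <- lst[1]
is.list(piece)  # TRUE: still a list, not a numeric vector

# tryCatch returns the error object instead of stopping
result <- tryCatch(sum(piece), error = function(e) e)
inherits(result, "error")  # TRUE
```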

When you use the $ notation, R will return the selected values as they are, with no list structure around them:

lst$numbers
## [1] 1 2

-> the numbers element is directly accessed and the values are returned

You can then immediately feed the results to a function:

sum(lst$numbers)
## [1] 3

-> as the values are directly returned, they can be passed to a function

If the elements in your list do not have names (or you do not wish to use the names), you can use two brackets, instead of one, to subset the list. This notation will do the same thing as the $ notation:

lst[[1]]
## [1] 1 2

-> the double brackets also return the values of the first element (numbers)

In other words, if you subset a list with single-bracket notation, R will return a smaller list. If you subset a list with double-bracket notation, R will return just the values that were inside an element of the list. You can combine this feature with any of R’s indexing methods:

lst["numbers"]
## $numbers
## [1] 1 2
lst[["numbers"]]
## [1] 1 2

-> single brackets return a list containing the numbers element -> double brackets return the raw values
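Because double brackets return the bare vector, a further index can be chained directly onto the result. A short sketch of that combination (second and same are illustrative names):

```r
lst <- list(numbers = c(1, 2), logical = TRUE, strings = c("a", "b", "c"))

is.list(lst["numbers"])    # TRUE: single brackets keep the list wrapper
is.list(lst[["numbers"]])  # FALSE: double brackets return the vector itself

# $, [[1]] and [["numbers"]] all give the same vector
same <- identical(lst$numbers, lst[[1]]) &&
  identical(lst[[1]], lst[["numbers"]])

second <- lst[["numbers"]][2]  # second value of the numbers element
second  # 2
```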

Changing Values in Place

vec <- c(0, 0, 0, 0, 0, 0)
vec
## [1] 0 0 0 0 0 0
vec[1] <- 1000
vec
## [1] 1000    0    0    0    0    0

-> directly assigning 1000 to the first element of the vector

Changing more than one value at the same time.

vec[c(1, 3, 5)] <- c(7, 9, 8)
vec
## [1] 7 0 9 0 8 0
vec[4:6] <- vec[4:6] + 1
vec
## [1] 7 0 9 1 9 1

-> adding 1 to the 4th through 6th values of the vector

Force the vector to grow.

vec[7] <- 0
vec
## [1] 7 0 9 1 9 1 0

-> appending 0 to the vector (at the 7th position)
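Growth also works when the new position is not the next one; R fills any skipped positions with NA. A small sketch starting from the vector above:

```r
vec <- c(7, 0, 9, 1, 9, 1, 0)

vec[10] <- 5  # positions 8 and 9 do not exist yet

length(vec)     # 10
is.na(vec[8:9]) # TRUE TRUE: the gap is filled with NA
vec[10]         # 5
```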

deck2 <- read.csv("deck.csv",stringsAsFactors=FALSE)
deck2$new <- 1:52
head(deck2)

-> adding a new column (new) to deck2, holding the sequence 1 to 52

Remove a column from a dataframe.

deck2$new <- NULL
head(deck2)

-> deleting the new column by setting it to NULL; head() shows that the column is gone
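names() gives a compact way to confirm that a column was added and then removed. A sketch on a three-row stand-in (cards is a hypothetical name):

```r
# Three-row stand-in for deck (hypothetical)
cards <- data.frame(
  face = c("king", "queen", "jack"),
  suit = c("spades", "spades", "spades"),
  value = c(13, 12, 11),
  stringsAsFactors = FALSE
)

cards$new <- seq_len(nrow(cards))  # one value per row
with_new <- names(cards)           # "face" "suit" "value" "new"

cards$new <- NULL                  # remove the column again
without_new <- names(cards)        # "face" "suit" "value"
```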

deck2[c(13, 26, 39, 52), ]

-> accessing the 13th, 26th, 39th, and 52nd row of deck2

deck2[c(13, 26, 39, 52), 3]
## [1] 1 1 1 1

-> same rows but only third column

deck2$value[c(13, 26, 39, 52)]
## [1] 1 1 1 1

-> different method same result: value column of 13th, 26th, 39th, and 52nd row

deck2$value[c(13, 26, 39, 52)] <- c(14, 14, 14, 14)
deck2

-> assigning new values to the specified rows of the value column
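The same replacement can be written more compactly, because R recycles a single value across all selected positions. The sketch below assumes a deck rebuilt in code (the face and suit names stand in for deck.csv) and also shows that the original deck is left untouched, since deck2 is a copy:

```r
# Deck rebuilt in code as a stand-in for deck.csv (names and order assumed)
faces <- c("king", "queen", "jack", "ten", "nine", "eight", "seven",
           "six", "five", "four", "three", "two", "ace")
suits <- c("spades", "clubs", "diamonds", "hearts")
deck <- data.frame(
  face = rep(faces, times = 4),
  suit = rep(suits, each = 13),
  value = rep(13:1, times = 4),
  stringsAsFactors = FALSE
)

deck2 <- deck
aces <- c(13, 26, 39, 52)  # row numbers used in the text above

deck2$value[aces] <- 14    # the single 14 is recycled to all four rows

deck2$value[aces]  # 14 14 14 14
deck$value[aces]   # 1 1 1 1 (the original is unchanged)
```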