HW1: Basic Data Objects

1) Vectors [8 pts]

Create vectors and factors for the columns in the data table displayed above, according to the following data types. If there are missing values, codify them as NA.

number: integer vector

number <- c(30L, 35L, 23L, 9L, 11L, 27L, 34L, 6L, 3L, 0L)

player: character vector

player <- c('Stephen Curry', 'Kevin Durant', 'Draymond Green', 'Andre Iguodala', 'Klay Thompson', 'Zaza Pachulia', 'Shaun Livingston', 'Nick Young', 'David West', 'Patrick McCaw')

position: factor

position <- c("PG", "PF", "PF", "SF", "SG", "C", "PG", "SG", "C", "SG")
position <- as.factor(position)

height: character vector

height <- c('6-3', '6-9', '6-7', '6-6', '6-7', '6-11', '6-7', '6-7', '6-9', '6-7')

weight: double (i.e. real) vector

weight <- c(190, 240, 230, 215, 215, 270, 192, 210, 250, 185)

birthdate: character vector

birthdate <- c('March 14, 1988', 'September 29, 1988', 'March 4, 1990', 'January 28, 1984', 'February 8, 1990', 'February 10, 1984', 'September 11, 1985', 'June 1, 1985', 'August 29, 1980', 'October 25, 1995')

experience: integer vector

experience <- c(8L,10L,5L,13L,6L,14L,12L,10L,14L,1L)

college: character vector

college <- c('Davidson College', 'University of Texas at Austin', 'Michigan State University', 'University of Arizona', 'Washington State University', NA, NA, 'University of Southern California', 'Xavier University', 'University of Nevada, Las Vegas')
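The difference between a quoted 'NA' and a true missing value can be checked with is.na(). A minimal sketch, using a shortened two-element vector rather than the full college column (the names as_text and as_missing are illustrative only):

```r
# 'NA' in quotes is an ordinary two-letter string, not a missing value
as_text <- c('Davidson College', 'NA')

# An unquoted NA inside a character vector is coerced to NA_character_
as_missing <- c('Davidson College', NA)

is.na(as_text)           # no element is flagged as missing
is.na(as_missing)        # the second element is flagged as missing
typeof(as_missing)       # still a character vector
sum(is.na(as_missing))   # count of missing values
```

Only the unquoted form is picked up by is.na() and by arguments such as na.rm = TRUE.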

2) Vector Subsetting (i.e. indexing) [10 pts]

Write R commands (a single one-line command for each), displaying the output, that answer the following questions:

What is the weight of the tallest player?
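One way to approach this, sketched under the assumption that heights should be compared numerically: height is stored as 'feet-inches' text, so comparing the strings directly would rank '6-9' above '6-11'. The helper to_inches() below is hypothetical (not part of the assignment), and the sketch uses more than one line for readability.

```r
height <- c('6-3', '6-9', '6-7', '6-6', '6-7', '6-11', '6-7', '6-7', '6-9', '6-7')
weight <- c(190, 240, 230, 215, 215, 270, 192, 210, 250, 185)

# Hypothetical helper: convert 'feet-inches' strings to total inches
to_inches <- function(h) {
  parts <- strsplit(h, '-', fixed = TRUE)
  sapply(parts, function(p) 12 * as.numeric(p[1]) + as.numeric(p[2]))
}

# Weight of the player whose height in inches is largest
tallest_weight <- weight[which.max(to_inches(height))]
tallest_weight
```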
What is the college of the player that has a height of 6-6?

college[height == '6-6']

## [1] "University of Arizona"

What is the position of the player with the most years of experience? Hint: the which.max() function is your friend.

position[which.max(experience)]

## [1] C
## Levels: C PF PG SF SG
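Note that which.max() returns only the first position of the maximum. A small sketch of how ties could be checked, using the experience and position values from section 1 (the names first_max and all_max are illustrative):

```r
experience <- c(8L, 10L, 5L, 13L, 6L, 14L, 12L, 10L, 14L, 1L)
position <- factor(c("PG", "PF", "PF", "SF", "SG", "C", "PG", "SG", "C", "SG"))

# which.max() reports the first index that attains the maximum
first_max <- which.max(experience)

# which() with a comparison reports every index that attains it
all_max <- which(experience == max(experience))

first_max
all_max
position[all_max]
```

Here two players share the maximum and both are centers, so the answer above is unaffected by the tie.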

What is the number of the lightest player? Hint: the which.min() function is your friend.

number[which.min(weight)]

## [1] 0

What is the average height for those players with more than 5 years of experience?
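A possible approach, sketched under the assumption that the 'feet-inches' strings are first converted to total inches (the name inches is illustrative, and the conversion takes more than one line):

```r
height <- c('6-3', '6-9', '6-7', '6-6', '6-7', '6-11', '6-7', '6-7', '6-9', '6-7')
experience <- c(8L, 10L, 5L, 13L, 6L, 14L, 12L, 10L, 14L, 1L)

# Split each 'feet-inches' string and convert it to total inches
inches <- sapply(strsplit(height, '-', fixed = TRUE),
                 function(p) 12 * as.numeric(p[1]) + as.numeric(p[2]))

# Mean height, in inches, of players with more than 5 years of experience
avg_height <- mean(inches[experience > 5])
avg_height
```

The result is expressed in inches; dividing by 12 gives feet.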
How many players have a weight larger than the average (i.e. mean) weight?

sum(weight > mean(weight))

## [1] 4

How many players have between 9 and 12 years of experience (inclusive)?

sum(experience >= 9 & experience <= 12)

## [1] 3

What is the mean years of experience of Shooting Guard (SG) players?

mean(experience[position == 'SG'])

## [1] 5.666667

What is the median weight of those players with a position different from Center (C)?

median(weight[position != 'C'])

## [1] 212.5

What is the first quartile (i.e. bottom 25th percentile) of years of experience among Power Forwards (PF) and Shooting Guards (SG)? Hint: the quantile() function is your friend.

quantile(experience[position == 'PF' | position == 'SG'], .25)

## 25% 
##   5

3) List for GSW [5 pts]

Use the vectors created in the previous section to create the following list gsw:

gsw <- list(
  player = player,
  number = number,
  position = position,
  weight = weight,
  experience = experience
)

Use the list gsw to write R commands—displaying the output—that answer the following questions (use only the list gsw, NOT the individual vectors):

What is the number of the heaviest player?

gsw$number[which.max(gsw$weight)]

## [1] 27

What is the position of the player with the least amount of experience?

gsw$position[which.min(gsw$experience)]

## [1] SG
## Levels: C PF PG SF SG

How many players have less than 8 or more than 11 years of experience?

sum(gsw$experience < 8 | gsw$experience > 11)

## [1] 7

What is the third quartile (i.e. bottom 75th percentile) of years of experience among Power Forwards (PF) and Shooting Guards (SG)? Tip: the function quantile() is your friend.

quantile(gsw$experience[gsw$position == 'PF' | gsw$position == 'SG'], .75)

## 75% 
##  10

What is the name of the player whose weight is furthest from the average weight (of all players)? Tip: the function which.max() is your friend.

gsw$player[which.max(abs(gsw$weight - mean(gsw$weight)))]

## [1] "Zaza Pachulia"
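The ways of reaching into a list differ in what they return. A minimal sketch, using a shortened two-player list (gsw_small is a hypothetical stand-in for gsw):

```r
gsw_small <- list(
  player = c('Stephen Curry', 'Kevin Durant'),
  weight = c(190, 240)
)

# $ and [[ ]] both extract the element itself (here, a numeric vector)
identical(gsw_small$weight, gsw_small[['weight']])

# Single brackets return a sub-list that still wraps the element
class(gsw_small['weight'])

# with() evaluates an expression using the list's elements as variables,
# which avoids repeating the list name
heaviest <- with(gsw_small, player[which.max(weight)])
heaviest
```

Referring to every element through the list, rather than through the free-standing vectors, keeps the commands valid even if those vectors are later changed or removed.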

4) More lists [6 pts]

Consider the following list:

hp <- list(
  first = 'Harry',
  last = 'Potter',
  courses = c('Potions', 'Enchantments', 'Spells'),
  sport = 'quidditch',
  age = 18L,
  gpa = 3.9
)

Write R commands—displaying the output—to answer the following questions:

What is the class of hp?

class(hp)

## [1] "list"

How many elements are in hp?

length(hp)

## [1] 6

What is the length of courses?

length(hp$courses)

## [1] 3

What is the data type of the element age?

typeof(hp$age)

## [1] "integer"

What is the data type of the element gpa?

typeof(hp$gpa)

## [1] "double"

If you combine age and gpa in a new vector, what is the data type of this vector?

typeof(c(hp$age, hp$gpa))

## [1] "double"
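Combining values with c() and adding them with sum() are different operations, even though both yield a double here. A short sketch of the coercion involved, with age and gpa copied out of hp as standalone variables for illustration:

```r
age <- 18L   # integer
gpa <- 3.9   # double

# c() builds a vector of length 2; the integer is promoted to double
combined <- c(age, gpa)
length(combined)
typeof(combined)

# sum() collapses the values into a single number
total <- sum(age, gpa)
length(total)

# The common type is the most general one present:
# logical < integer < double < character
typeof(c(age, TRUE))
typeof(c(age, gpa, 'Harry'))
```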

5) Technical Questions [4 pts]

Explain why the following command returns 2?

1 + TRUE

TRUE is coerced to the number 1 in arithmetic, so the command evaluates 1 + 1 = 2.

Explain why the following command returns TRUE?

"1" == 1

== compares the two values after coercing them to a common type. The number 1 is coerced to the character string "1", which is identical to the "1" on the left, so the result is TRUE.

Explain why the following command returns FALSE?

"-2" > 0

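The number 0 is coerced to the string "0", so the two values are compared as strings rather than as numbers. The string "-2" sorts before "0" (the character "-" comes before the digits in the usual collation order), so the comparison returns FALSE.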

Explain why the following command returns TRUE?

(10 <= 5) >= 0

(10 <= 5) evaluates to FALSE, which is coerced to 0 in the comparison, so the command becomes 0 >= 0, which is TRUE.
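The coercion steps behind these four answers can be made explicit. The following sketch spells out the conversions R performs implicitly; the extra comparisons are illustrative additions, not part of the assignment:

```r
# Logical to numeric: TRUE becomes 1, FALSE becomes 0
as.numeric(TRUE)
as.numeric(10 <= 5)

# Number versus string: the number is converted to a string first
as.character(1)
"1" == as.character(1)   # TRUE, the same test as "1" == 1
"1.0" == 1               # FALSE, because "1.0" and "1" are different strings

# String comparison is not numeric comparison
"10" > 9                 # FALSE, because "10" sorts before "9" as text

# Converting explicitly gives a numeric comparison instead
as.numeric("-2") > 0     # FALSE, this time because -2 is less than 0
```

When a numeric comparison is intended, converting with as.numeric() first avoids relying on string ordering.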

6) Explanations [10 pts]

Consider the following vector lord:

lord <- c('v', 'o', 'l', 'd', 'e', 'm', 'o', 'r', 't')

Run the following commands and explain what’s happening in each of them (in terms of subsetting, coercion, recycling, vectorization, etc):

lord[TRUE]

The logical index TRUE is recycled to the length of lord, so every position is selected and the whole vector is returned.

lord[length(lord) + 1]

This asks for the element one position past the end of lord (index 10 in a vector of length 9). No element exists there, so R returns NA rather than raising an error.

lord[seq(from = length(lord), to = 1, by = -2)]

seq() builds the index vector 9, 7, 5, 3, 1: it starts at the last position (from = length(lord)), steps back two positions at a time (by = -2), and stops at position 1 (to = 1). Subsetting with it returns "t" "o" "e" "l" "v".

lord[lord == "o"]

The comparison lord == "o" is vectorized and yields a logical vector that is TRUE at positions 2 and 7. Subsetting with it returns the two matching elements, "o" "o".

lord[lord != "e" & lord != "o"]

Both comparisons are vectorized, and & combines them element by element, so only the elements that are neither 'e' nor 'o' are kept: 'v' 'l' 'd' 'm' 'r' 't'.

lord[lord %in% c('a', 'e', 'i', 'o', 'u')]

%in% tests each element of lord for membership in the vowel vector c('a', 'e', 'i', 'o', 'u') and returns a logical vector. Subsetting with it keeps the vowels: 'o' 'e' 'o'.

toupper(lord[!(lord %in% c('a', 'e', 'i', 'o', 'u'))])

The reverse of the previous command: ! negates the logical vector, so the subset keeps the elements that are not vowels, and toupper() converts them to upper case: 'V' 'L' 'D' 'M' 'R' 'T'.

paste(lord, collapse = '')

paste() with collapse = '' joins all elements of lord into a single string, using the empty string as the separator between them. The result is 'voldemort'.

lord[is.na(lord)]

is.na(lord) tests each element for NA. lord has no missing values, so the result is nine FALSE values. Subsetting with an all-FALSE index selects nothing and returns character(0).

sum(!is.na(lord))

The reverse of the previous command: !is.na(lord) is TRUE for each of the nine non-missing elements. sum() coerces each TRUE to 1, so the total is 9.
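Several of the mechanisms named above (recycling, out-of-range indexing, vectorized comparison, collapsing) can be observed directly. A minimal sketch; every_other and the '-' separator are illustrative additions:

```r
lord <- c('v', 'o', 'l', 'd', 'e', 'm', 'o', 'r', 't')

# Recycling: a short logical index is repeated to the length of the vector
identical(lord[TRUE], lord)
every_other <- lord[c(TRUE, FALSE)]   # keeps positions 1, 3, 5, 7, 9
every_other

# Out-of-range index: NA instead of an error
lord[length(lord) + 1]

# The index vector generated by seq()
seq(from = length(lord), to = 1, by = -2)

# Vectorized comparison, and the positions where it is TRUE
lord == 'o'
which(lord == 'o')

# collapse sets the separator placed between the joined elements
paste(lord, collapse = '')
paste(lord, collapse = '-')
```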

7) Matrices [7 pts]

You can take the vector lord and pass it to matrix() to create various kinds of matrices. For example:

vol <- matrix(lord, nrow = 3, ncol = 3)
vol

##      [,1] [,2] [,3]
## [1,] "v"  "d"  "o" 
## [2,] "o"  "e"  "r" 
## [3,] "l"  "m"  "t"

and

##      [,1] [,2]
## [1,] "d"  "o" 
## [2,] "e"  "r" 
## [3,] "m"  "t"

Manipulate the matrix vol with bracket notation (i.e. subscripting) to write R commands—displaying the output—in order to obtain the following matrices.

obtain the following output

vol[1,]

## [1] "v" "d" "o"

obtain the following output

vol[1:2, 2:1]

##      [,1] [,2]
## [1,] "d"  "v" 
## [2,] "e"  "o"

obtain the following output

vol[3:1, ]

##      [,1] [,2] [,3]
## [1,] "l"  "m"  "t" 
## [2,] "o"  "e"  "r" 
## [3,] "v"  "d"  "o"

obtain the following output

vol[, c(1, 2, 2)]

##      [,1] [,2] [,3]
## [1,] "v"  "d"  "d" 
## [2,] "o"  "e"  "e" 
## [3,] "l"  "m"  "m"

obtain the following output

vol[3:1, 3:1]

##      [,1] [,2] [,3]
## [1,] "t"  "m"  "l" 
## [2,] "r"  "e"  "o" 
## [3,] "o"  "d"  "v"

obtain the following output

vol[c(3:1, 1:3), c(3, 2, 2, 3)]

##      [,1] [,2] [,3] [,4]
## [1,] "t"  "m"  "m"  "t" 
## [2,] "r"  "e"  "e"  "r" 
## [3,] "o"  "d"  "d"  "o" 
## [4,] "o"  "d"  "d"  "o" 
## [5,] "r"  "e"  "e"  "r" 
## [6,] "t"  "m"  "m"  "t"

obtain the following output

vol[c(3:1, 1:3), c(1:3, 3:1)]

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] "l"  "m"  "t"  "t"  "m"  "l" 
## [2,] "o"  "e"  "r"  "r"  "e"  "o" 
## [3,] "v"  "d"  "o"  "o"  "d"  "v" 
## [4,] "v"  "d"  "o"  "o"  "d"  "v" 
## [5,] "o"  "e"  "r"  "r"  "e"  "o" 
## [6,] "l"  "m"  "t"  "t"  "m"  "l"
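All of the outputs in this section can be produced with bracket notation alone, because row and column indices may be reordered and repeated. A minimal sketch of the idea (the names flipped, first_row, first_row_mat and mirrored are illustrative):

```r
lord <- c('v', 'o', 'l', 'd', 'e', 'm', 'o', 'r', 't')
vol <- matrix(lord, nrow = 3, ncol = 3)   # filled column by column

# Reordering: rows in reverse order, all columns
flipped <- vol[3:1, ]

# Selecting a single row drops the result to a plain vector by default
first_row <- vol[1, ]
is.matrix(first_row)

# drop = FALSE keeps the 1 x 3 matrix shape
first_row_mat <- vol[1, , drop = FALSE]
dim(first_row_mat)

# Repeating indices: rows 3,2,1,1,2,3 and columns 1,2,3,3,2,1
mirrored <- vol[c(3:1, 1:3), c(1:3, 3:1)]
dim(mirrored)
mirrored[1, ]
```

Indexing with repeated positions avoids nesting cbind() and rbind(), at the cost of being slightly less explicit about how the blocks fit together.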