To complete this assignment, follow these steps:
  1. Download the homework1.Rmd file from Canvas.

  2. Open homework1.Rmd in RStudio.

  3. Replace the “Your Name Here” text in the author: field with your own name.

  4. Supply your solutions to the homework by editing homework1.Rmd.

  5. When you have completed the homework and have checked that your code both runs in the Console and knits correctly when you click Knit HTML, rename the R Markdown file to homework1_YourNameHere.Rmd, and submit on Canvas.
    (YourNameHere should be changed to your own name.)

Tips:
Important criteria!

You must follow the R Style Guide. This means, amongst other things:

You will be graded on this. Unreadable code will earn a 0.


Problem 1: (3pts)

Consider the object COUNT printed below. What type of object is COUNT?

(a) Matrix, data frame, or not enough information? (1pt)

matrix

(b) Explain your choice. (2pts)

It is a combination of 2 vectors with all the variables in the same mode.
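The distinction can also be checked directly in R. The sketch below uses hypothetical stand-ins (`count_matrix` and `count_df` are assumptions, since the real COUNT is not reproduced here) to show how `is.matrix()` and `is.data.frame()` tell the two object types apart.

```r
# Hypothetical stand-ins; the real COUNT object is not shown in this document.
count_matrix <- cbind(a = c(1, 2, 3), b = c(4, 5, 6))
count_df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

is.matrix(count_matrix)      # TRUE
is.data.frame(count_matrix)  # FALSE
is.matrix(count_df)          # FALSE
is.data.frame(count_df)      # TRUE

# A matrix stores a single type for all entries;
# a data frame can hold a different type in each column.
typeof(count_matrix)         # "double"
sapply(count_df, typeof)     # type reported per column
```

Running the same two checks on COUNT itself would settle the question without relying on how the object prints.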


Problem 2: (6pts)

Create the matrix below using the following approaches. Everything should match the below matrix, including the column and row names.

(a) Using the c() and matrix() functions (3pts)
M0 <- matrix(data = c(-2, -2, -2, -1, -1, -1, 0, 0, 0, 1, 1, 1, 2, 2, 2),
             nrow = 3, ncol = 5, byrow = FALSE,
             dimnames = list(c("[1,]", "[2,]", "[3,]"),
                             c("[,1]", "[,2]", "[,3]", "[,4]", "[,5]")))
(b) Using the cbind() or rbind() function (3pts)
v1 <- rep(-2, 3)
v2 <- rep(-1, 3)
v3 <- rep(0, 3)
v4 <- rep(1, 3)
v5 <- rep(2, 3)
M1 <- cbind(v1, v2, v3, v4, v5)
M1.2 <- matrix(data = M1, nrow = 3, ncol = 5, byrow = FALSE,
               dimnames = list(c("[1,]", "[2,]", "[3,]"),
                               c("[,1]", "[,2]", "[,3]", "[,4]", "[,5]")))
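
As a cross-check, the two construction routes can be compared with `identical()`. This is a minimal sketch, not the required solution: the names `m_a`, `m_b`, `row_labels`, and `col_labels` are hypothetical, and the labels are copied from the answers above rather than from the target matrix, which is not reproduced here.

```r
# Labels copied from the solutions above (assumed to match the target matrix).
row_labels <- c("[1,]", "[2,]", "[3,]")
col_labels <- c("[,1]", "[,2]", "[,3]", "[,4]", "[,5]")

# Route (a): fill a 3 x 5 matrix column by column with matrix().
m_a <- matrix(rep(-2:2, each = 3), nrow = 3, ncol = 5,
              dimnames = list(row_labels, col_labels))

# Route (b): stack three identical rows with rbind(), then attach the labels.
m_b <- rbind(-2:2, -2:2, -2:2)
dimnames(m_b) <- list(row_labels, col_labels)

# Both are integer matrices here, so identical() compares like with like.
identical(m_a, m_b)
```

Note that `identical()` also compares storage type, so an integer matrix and a double matrix with the same values would not be reported as identical; `all(m_a == m_b)` compares values only.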

Problem 3: Vectors (12 pts)

(a) Using seq() (3pts)

Read the help page for seq. Using what you have learned so far, write three different R expressions to generate the vector (2, 4, 6, 8). Although you will be writing three different expressions/ways to generate the vector, each of the three ways should utilize the seq function. The examples on the help page will be useful.

?seq
seq(2, 8, by = 2)
seq(2, 8, length.out = 4)
seq(2, 9, by = 2)
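
The expressions can be verified against the target vector. The sketch below also includes a further form, `seq(from, by, length.out)`, that omits the endpoint entirely; the variable names are hypothetical. Values are compared with `==` rather than `identical()` so the check does not depend on whether `seq()` returns integer or double storage.

```r
target <- c(2, 4, 6, 8)

by_step <- seq(2, 8, by = 2)                       # start, end, step
by_length <- seq(2, 8, length.out = 4)             # start, end, number of values
no_endpoint <- seq(from = 2, by = 2, length.out = 4)  # start, step, number of values

all(by_step == target)
all(by_length == target)
all(no_endpoint == target)
```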
(b) Using rep() (3pts)

Consider the following set of measurements:

x.2b <- c(1, 1, 2, 3, 4, 4, 4, 5, 9, 11, 11, 12,
          17, 20, 25, 42, 49, 209, 390,  420)
x.2b
##  [1]   1   1   2   3   4   4   4   5   9  11  11  12  17  20  25  42  49 209 390
## [20] 420
?rep()

Read the help page for the rep function. Using rep() and c(), write an R expression to generate those values in any order and assign them to y, WITHOUT directly using x.2b. Show y after the assignment.

y <- c(rep(1:4, c(2, 1, 1, 3)), 5, 9, rep(11:12, c(2, 1)),
       17, 20, 25, 42, 49, 209, 390, 420)
y
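
The answer above relies on passing a vector as the second argument of `rep()`, which repeats each element a different number of times. A short illustration of how that differs from a single `times` value and from `each` (variable names are hypothetical):

```r
# A vector of counts: element i is repeated times[i] times.
per_element <- rep(1:4, times = c(2, 1, 1, 3))  # 1 1 2 3 4 4 4

# A single count: the whole vector is repeated.
whole_vector <- rep(1:2, times = 3)             # 1 2 1 2 1 2

# each: every element is repeated the same number of times.
each_element <- rep(1:2, each = 3)              # 1 1 1 2 2 2
```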
(c) Read the help pages for any and all, and briefly describe what they do. (3pts)
?any
?all

any() returns TRUE if at least one element of the logical vector(s) passed to it is TRUE. all() returns TRUE only if every element is TRUE.
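
A small illustration of both functions, using a hypothetical vector `scores` that is not part of the assignment:

```r
checks <- c(FALSE, TRUE, FALSE)
any(checks)      # TRUE: at least one element is TRUE
all(checks)      # FALSE: not every element is TRUE

# In practice the logical vector usually comes from a comparison.
scores <- c(3, 7, 10)  # hypothetical data
any(scores > 5)  # TRUE
all(scores > 5)  # FALSE
all(scores > 0)  # TRUE
```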

(d) What do you expect to get from all(y==x.2b), and why? Check your intuition in R; was it correct? Why/why not? (3pts)

I expect TRUE because y contains the same values as x.2b in the same order, so every element-wise comparison in y == x.2b is TRUE. Checking in R confirmed this.

x.2b <- c(1, 1, 2, 3, 4, 4, 4, 5, 9, 11, 11, 12,
          17, 20, 25, 42, 49, 209, 390,  420)
y <- c(rep(1:4, c(2, 1, 1, 3)), 5, 9, rep(11:12, c(2, 1)),
       17, 20, 25, 42, 49, 209, 390, 420)
all(y==x.2b)
## [1] TRUE
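
Because part (b) allows the values in any order, it is worth noting that `==` compares position by position. The sketch below uses a hypothetical `y_reordered` holding the same values in a different order: the direct comparison fails, while comparing sorted copies succeeds.

```r
x.2b <- c(1, 1, 2, 3, 4, 4, 4, 5, 9, 11, 11, 12,
          17, 20, 25, 42, 49, 209, 390, 420)

# Hypothetical alternative answer: same values, different order.
y_reordered <- c(rep(c(4, 11, 1), c(3, 2, 2)), 2, 3, 5, 9, 12,
                 17, 20, 25, 42, 49, 209, 390, 420)

all(y_reordered == x.2b)              # FALSE: positions differ
all(sort(y_reordered) == sort(x.2b))  # TRUE: same values overall
```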

Problem 4: Missing values (14pts)

Recall learning about finding documentation and information on functions in R. Let’s explore the is.na() function and apply it to age. Let age be the vector defined below. Start by running ?is.na()

age <- c(18, NA, 25, 71, NA, 45, NA, NA, 18)
?is.na()
(a) Finding NA’s. (4pts)

Write a Boolean expression that checks whether each entry of age is missing (recall missing values are denoted by NA). Your expression should return a Boolean vector having the same length as age.

is.na(age)
## [1] FALSE  TRUE FALSE FALSE  TRUE FALSE  TRUE  TRUE FALSE
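
The Boolean vector from `is.na()` is often used as input to other functions. A brief sketch of three common follow-ups (the variable names are hypothetical):

```r
age <- c(18, NA, 25, 71, NA, 45, NA, NA, 18)

n_missing <- sum(is.na(age))        # TRUE counts as 1, so this counts the NAs
missing_at <- which(is.na(age))     # positions of the missing entries
observed <- age[!is.na(age)]        # keep only the non-missing values

n_missing   # 4
missing_at  # 2 5 7 8
observed    # 18 25 71 45 18
```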
(practice for part b) all() and any() when NA’s are present.

Given your experience thus far with R, what do you expect all(age>20) to give? Try it. Were your expectations correct?

I expect FALSE because 18 is not greater than 20; a single definite FALSE is enough, even though age also contains NAs. The answer was FALSE, which matches my expectation.

all(age>20)
## [1] FALSE

Given your experience thus far with R, what do you expect all(age>0) to give? Try it. Were your expectations correct?

I expected FALSE because of the NAs: a missing value cannot be shown to be greater than 0, so I thought it would disprove the claim that all values are greater than 0. The answer was NA instead. That is close to what I expected, in that R cannot decide the comparison for the missing entries.

all(age>0)
## [1] NA

How would you reconcile the different behavior of R in these two cases? [Hint: think about what the missing value NA does to your ability to answer “all elements of age are …”]

Comparing NA with a number gives NA, because the missing value is unknown. For age > 20, the entry 18 gives a definite FALSE, so all() returns FALSE regardless of the NAs. For age > 0, every non-missing entry gives TRUE, but the NAs could be either TRUE or FALSE, so all() cannot return TRUE and returns NA instead.
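
Looking at the intermediate logical vectors makes the difference visible. The sketch below also uses the `na.rm` argument of `all()`, which drops missing values before the check; the variable names are hypothetical.

```r
age <- c(18, NA, 25, 71, NA, 45, NA, NA, 18)

over_20 <- age > 20  # FALSE NA TRUE TRUE NA TRUE NA NA FALSE
over_0 <- age > 0    # TRUE  NA TRUE TRUE NA TRUE NA NA TRUE

all(over_20)               # FALSE: a definite FALSE settles the answer
all(over_0)                # NA: no FALSE, but the NAs are undecided
all(over_0, na.rm = TRUE)  # TRUE: only the observed values are checked
```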

Given your experience thus far with R, what do you expect any(age<20) to give? Try it. Were your expectations correct?

I expect TRUE because at least one entry, 18, is less than 20. The answer was TRUE, which matches my expectation.

any(age<20)
## [1] TRUE

Given your experience thus far with R, what do you expect any(age<0) to give? Try it. Were your expectations correct?

I expect NA: no entry is less than 0, and comparing an NA with 0 gives NA, so any() cannot rule out that a missing value is negative. My expectation was correct.

any(age<0)
## [1] NA

How would you reconcile the different behavior of R in these two cases? [Hint: think about what the missing value NA does to your ability to answer “any elements of age are …”]

As with all(), comparing NA with a number gives NA. For age < 20, the entry 18 gives a definite TRUE, so any() returns TRUE regardless of the NAs. For age < 0, every non-missing entry gives FALSE and the NAs remain undecided, so any() returns NA rather than FALSE.
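
The same rule shows up in R's element-wise logical operators, which treat NA as "unknown". A short sketch (the names `or_true`, `or_false`, `and_false`, and `and_true` are hypothetical):

```r
age <- c(18, NA, 25, 71, NA, 45, NA, NA, 18)

# OR: a definite TRUE decides the result; otherwise NA stays undecided.
or_true <- NA | TRUE     # TRUE
or_false <- NA | FALSE   # NA

# AND: a definite FALSE decides the result; otherwise NA stays undecided.
and_false <- NA & FALSE  # FALSE
and_true <- NA & TRUE    # NA

any(age < 20)               # TRUE
any(age < 0)                # NA
any(age < 0, na.rm = TRUE)  # FALSE: only the observed values are checked
```

In this sense any() behaves like a chain of OR operations and all() like a chain of AND operations.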

(b) Summarizing the behavior of all() and any(): (10pts)

Fill out the table below, giving the conditions for x under which all(x) and any(x) would output TRUE, FALSE, or NA. One cell is filled in as an example.

| command | output = TRUE | output = FALSE | output = NA |
|---|---|---|---|
| all(x) | all TRUE, no NAs | at least one FALSE, can have NAs | no FALSE, at least one NA |
| any(x) | at least one TRUE, can have NAs | all FALSE, no NAs | no TRUE, at least one NA |
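
One way to check each cell empirically is to run all() and any() over a small set of representative inputs. This is a sketch under the assumption that these six hypothetical cases cover the relevant combinations of TRUE, FALSE, and NA; `cases` and `summary_table` are illustrative names.

```r
# Hypothetical test inputs covering the combinations of TRUE, FALSE, and NA.
cases <- list(
  all_true       = c(TRUE, TRUE),
  true_and_na    = c(TRUE, NA),
  true_and_false = c(TRUE, FALSE),
  all_false      = c(FALSE, FALSE),
  false_and_na   = c(FALSE, NA),
  mixed_with_na  = c(TRUE, FALSE, NA)
)

# One row per case, with the result of all() and any() side by side.
summary_table <- data.frame(
  all = sapply(cases, all),
  any = sapply(cases, any)
)
summary_table
```

Each row shows which of the two functions is settled by a definite value and which is left as NA by the missing entries.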