1. The following loop can be done as a vectorized calculation. What is the equivalent vectorized calculation?
vec = 1:50
i = 1
while(i < 50) {
vec[i] <- i^2
i <- i + 1
}
Equivalent vectorized calculation:
vec <- c(1:50)
vec[1:49] <- vec[1:49]^2
vec
## [1] 1 4 9 16 25 36 49 64 81 100 121 144 169 196
## [15] 225 256 289 324 361 400 441 484 529 576 625 676 729 784
## [29] 841 900 961 1024 1089 1156 1225 1296 1369 1444 1521 1600 1681 1764
## [43] 1849 1936 2025 2116 2209 2304 2401 50
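As a quick sanity check, the loop and the vectorized version can be run side by side and compared. This is only a sketch; the names vec_loop and vec_vectorized are introduced here for illustration and are not part of the lab.

```r
# Loop version from the question: squares elements 1 to 49, leaves element 50 alone
vec_loop <- 1:50
i <- 1
while (i < 50) {
  vec_loop[i] <- i^2
  i <- i + 1
}

# Vectorized version: one indexed assignment instead of 49 iterations
vec_vectorized <- 1:50
vec_vectorized[1:49] <- vec_vectorized[1:49]^2

identical(vec_loop, vec_vectorized)
## [1] TRUE
```

The last element stays at 50 in both versions because the loop condition i < 50 stops before index 50.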
2. Write down what the vector x will contain after each line of R code, if they are executed sequentially.
x <- seq(from = 0 , to = 8, by = 2)
x will contain
## [1] 0 2 4 6 8
x[x < 4] <- NA
if x is the same as after the first line, then x will contain
y <- c(NA, NA, 4, 6, 8)
y
## [1] NA NA 4 6 8
x[] <- 0
x will contain
z <- c(0,0,0,0,0)
z
## [1] 0 0 0 0 0
x <- 0
x will contain
0
## [1] 0
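The four statements can also be run in order while recording the type and length of x at each step. The sketch below does that; the helper names (still_numeric, n_missing, len_after_fill, len_after_rebind) are hypothetical and exist only to hold the checks.

```r
x <- seq(from = 0, to = 8, by = 2)

x[x < 4] <- NA                    # unquoted NA marks missing values
still_numeric <- is.numeric(x)    # TRUE: the vector's type is unchanged
n_missing <- sum(is.na(x))        # 2: the elements that were 0 and 2

x[] <- 0                          # overwrite every element in place
len_after_fill <- length(x)       # 5: the length is preserved

x <- 0                            # rebind x to a new length-one vector
len_after_rebind <- length(x)     # 1
```

The contrast between the last two assignments is the point of the question: x[] <- 0 keeps the existing vector and fills it, while x <- 0 replaces it.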
**3. Fill in the appropriate types of R objects in the sentence: The lapply function in R operates on ____ and returns ___. There may be more than one correct answer.**
lapply in R operates on vectors and lists (including data frames, which are lists of columns; a matrix is treated as a plain vector of its elements) and always returns a list.
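A small sketch can make this concrete by calling lapply on each kind of input and checking what comes back. The helper sq and the object names are made up for this illustration.

```r
sq <- function(v) v^2   # hypothetical helper used only in this example

from_vector <- lapply(1:3, sq)                          # one call per element
from_list   <- lapply(list(a = 1, b = 2:3), sq)         # one call per list item
from_df     <- lapply(data.frame(u = 1:2, v = 3:4), sq) # one call per column
from_matrix <- lapply(matrix(1:4, nrow = 2), sq)        # one call per cell

# Every result is a list, whatever the input was
all(sapply(list(from_vector, from_list, from_df, from_matrix), is.list))
## [1] TRUE

length(from_df)      # 2, one entry per column
length(from_matrix)  # 4, one entry per cell
```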
4. Suppose we have a 5000 by 3 matrix m
set.seed(1337)
m <- matrix(runif(15000, -3, 3), ncol = 3)
head(m)
## [,1] [,2] [,3]
## [1,] 0.4579293 -1.5301484 2.7691707
## [2,] 0.3884528 1.1452789 -1.8301459
## [3,] -2.5560586 1.3678571 -1.1991257
## [4,] -0.2768063 -2.5663416 -0.7060263
## [5,] -0.7603244 -0.5228916 -2.2818951
## [6,] -1.0120953 -2.1568870 2.9346426
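We want the sum of squares of each row of m. Write R code to do this in two different ways:
Using a for loop and saving the result in an object called m.ssq.loop:
# Preallocate the result, then fill in one row's sum of squares per iteration
m.ssq.loop <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  m.ssq.loop[i] <- sum(m[i, ]^2)
}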
head(m.ssq.loop)
## [1] 10.219360 4.811994 9.842371 7.161204 6.058554 14.288626
Using the apply function and saving the result in an object called m.ssq.apply (see ?apply):
m.ssq.apply <- apply(m, 1, function(x) {sum(x^2)})
head(m.ssq.apply)
## [1] 10.219360 4.811994 9.842371 7.161204 6.058554 14.288626
Check that the two results agree. The call below returns TRUE if the two vectors are exactly the same and FALSE otherwise. This is called a sanity check, and you should do them often when coding.
identical(m.ssq.loop, m.ssq.apply)
## [1] TRUE
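A third, fully vectorized route is to square the whole matrix and use rowSums. This is a sketch, not part of the lab prompt, and the name m.ssq.rowsums is introduced here. Because rowSums may accumulate in a different precision than sum, the comparison uses all.equal (equality within a small tolerance) rather than identical.

```r
set.seed(1337)
m <- matrix(runif(15000, -3, 3), ncol = 3)

# Row-wise sum of squares via apply, as in the answer above
m.ssq.apply <- apply(m, 1, function(x) sum(x^2))

# Same quantity without an explicit per-row function call
m.ssq.rowsums <- rowSums(m^2)

# Compare within floating-point tolerance
isTRUE(all.equal(m.ssq.apply, m.ssq.rowsums))
## [1] TRUE
```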
5. table1 is given below:
Year <- c(2000, 2001)
Algeria <- c(7, 9)
Brazil <- c(12, "NA")
Columbia <- c(16, 18)
table1 <- data.frame(Year, Algeria, Brazil, Columbia)
table1
## Year Algeria Brazil Columbia
## 1 2000 7 12 16
## 2 2001 9 NA 18
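Reshape table1 into the object table2.
library(dplyr)  # provides %>%, group_by, and mutate
library(tidyr)  # provides gather and spread
table2 <- table1 %>%
  gather(key = Country, value = Value, Algeria, Brazil, Columbia)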
## Warning: attributes are not identical across measure variables; they will
## be dropped
table2
## Year Country Value
## 1 2000 Algeria 7
## 2 2001 Algeria 9
## 3 2000 Brazil 12
## 4 2001 Brazil NA
## 5 2000 Columbia 16
## 6 2001 Columbia 18
Use table2 to produce table3.
table3 <- table2 %>%
  group_by(Year) %>%
  mutate(Average = mean(as.numeric(Value), na.rm = TRUE)) %>%
  spread(key = Country, value = Value)
## Warning: NAs introduced by coercion
## Warning: NAs introduced by coercion
table3
## Source: local data frame [2 x 5]
## Groups: Year [2]
##
## Year Average Algeria Brazil Columbia
## (dbl) (dbl) (chr) (chr) (chr)
## 1 2000 11.66667 7 12 16
## 2 2001 13.50000 9 NA 18
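The warnings in this question trace back to Brazil <- c(12, "NA"): the quoted "NA" is an ordinary string, so the whole Brazil column stops being numeric and the values have to be coerced back later. The sketch below is a base R illustration, not the tidyr solution the prompt asks for. It uses the unquoted missing-value marker NA and computes the same per-year averages; table1_num is a hypothetical name.

```r
Year     <- c(2000, 2001)
Algeria  <- c(7, 9)
Brazil   <- c(12, NA)     # unquoted NA keeps the column numeric
Columbia <- c(16, 18)
table1_num <- data.frame(Year, Algeria, Brazil, Columbia)

# Every column is numeric, so no coercion is needed later
all(sapply(table1_num, is.numeric))
## [1] TRUE

# Per-year average across the three countries, skipping the missing value
Average <- rowMeans(table1_num[, c("Algeria", "Brazil", "Columbia")], na.rm = TRUE)
```

Here Average holds 35/3 (about 11.67) for 2000 and 13.5 for 2001, matching the Average column in table3.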
6. Consider the graphic in the lab prompt.
Answer: The glyphs in this graph are the points on the graph and the smooth lines approximating the points.
Answer:
Answer:
X-axis: educ variable
Y-axis: wage variable
color: sex variable, with levels “F” and “M”
Answer:
wage: quantitative
educ: quantitative
sex: qualitative
Answer:
A guide is the display of a scale, and a scale is the relationship between a variable and the aesthetic to which it is mapped. A guide therefore displays the relationship between a variable and an aesthetic.
The guide for the variable educ is the set of tick marks on the x-axis, which show years of education. Only the tick marks 5, 10, and 15 are shown, corresponding to 5, 10, and 15 years of education.
The guide for wage is the set of tick marks on the y-axis, each denoting an hourly wage in dollars.
The guide for the sex variable is conveyed through the coloring of the glyphs. For both the lines and the points, green represents the “M” level (males) and red represents the “F” level (females).
Recreate the graphic using ggplot and the CPS85 data table. (Hint: the font size of the plot title is 20)
library(ggplot2)
library(mosaicData)  # assumption: CPS85 is loaded from the mosaicData package
Bernie <- CPS85 %>%
  ggplot(aes(x = educ, y = wage, col = sex)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE) +
  theme(plot.title = element_text(size = 20), axis.title = element_text(size = 13)) +
  labs(title = "Wage vs. Education in CPS85") +
  ylim(0, 15)
Bernie
## Warning: Removed 55 rows containing non-finite values (stat_smooth).
## Warning: Removed 55 rows containing missing values (geom_point).
7. Consider the string my.string given below:
my.string <- "ggplot2 is a data visualization package f or the statistical programming language R"
Write a function called SplitChars that takes a character string as input and splits it into a vector of single character elements. Hint: the function strsplit() may be useful. The result of calling the function on my.string is shown in the prompt.
SplitChars <- function(x) {
if (!(is.character(x))) {
stop("x must be a character vector")
}
strsplit(x, "")
}
SplitChars(my.string)
## [[1]]
## [1] "g" "g" "p" "l" "o" "t" "2" " " "i" "s" " " "a" " " "d" "a" "t" "a"
## [18] " " "v" "i" "s" "u" "a" "l" "i" "z" "a" "t" "i" "o" "n" " " "p" "a"
## [35] "c" "k" "a" "g" "e" " " "f" " " "o" "r" " " "t" "h" "e" " " "s" "t"
## [52] "a" "t" "i" "s" "t" "i" "c" "a" "l" " " "p" "r" "o" "g" "r" "a" "m"
## [69] "m" "i" "n" "g" " " "l" "a" "n" "g" "u" "a" "g" "e" " " "R"
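The [[1]] at the top of this output shows that strsplit returns a list with one element per input string, not a bare vector. The sketch below shows one way to get the plain character vector; the names pieces and chars are introduced here for illustration.

```r
my.string <- "ggplot2 is a data visualization package f or the statistical programming language R"

pieces <- strsplit(my.string, "")   # a list of length 1
is.list(pieces)
## [1] TRUE

chars <- pieces[[1]]                # extract the character vector itself
is.character(chars)
## [1] TRUE

# One element per character of the input
length(chars) == nchar(my.string)
## [1] TRUE
```

This is why the later answers index the result with [[1]] before looping over it.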
Write code that counts how many characters of my.string are "a", "s", "r", or "R" and saves the count in an object called count.
count <- 0
for (i in SplitChars(my.string)[[1]]) {
if (i == "a" || i == "s" || i == "R" || i == "r") {
count <- count + 1
}
}
print(count)
## [1] 20
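In the spirit of question 1, the same count can be computed without a loop. This is a sketch with hypothetical names (chars, count_vec): %in% tests every character against the target set at once, and sum counts the TRUE values.

```r
my.string <- "ggplot2 is a data visualization package f or the statistical programming language R"

chars <- strsplit(my.string, "")[[1]]

# TRUE for each character that is one of the four targets; sum counts them
count_vec <- sum(chars %in% c("a", "s", "r", "R"))
count_vec
## [1] 20
```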
Write a function called Reverse that reverses the order of single characters. The result of using it on SplitChars(my.string) is shown in the prompt.
Reverse <- function(chars) {
  unlist(lapply(chars, rev))
}
Reverse(SplitChars(my.string))
## [1] "R" " " "e" "g" "a" "u" "g" "n" "a" "l" " " "g" "n" "i" "m" "m" "a"
## [18] "r" "g" "o" "r" "p" " " "l" "a" "c" "i" "t" "s" "i" "t" "a" "t" "s"
## [35] " " "e" "h" "t" " " "r" "o" " " "f" " " "e" "g" "a" "k" "c" "a" "p"
## [52] " " "n" "o" "i" "t" "a" "z" "i" "l" "a" "u" "s" "i" "v" " " "a" "t"
## [69] "a" "d" " " "a" " " "s" "i" " " "2" "t" "o" "l" "p" "g" "g"
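If a single reversed string is wanted rather than a vector of characters, the pieces can be glued back together with paste. This goes one step beyond what the prompt asks, and the names chars, reversed, and round_trip are introduced here only for the sketch.

```r
my.string <- "ggplot2 is a data visualization package f or the statistical programming language R"

chars <- strsplit(my.string, "")[[1]]

# Reverse the characters and collapse them into one string
reversed <- paste(rev(chars), collapse = "")

# Reversing twice should give back the original string
round_trip <- paste(rev(strsplit(reversed, "")[[1]]), collapse = "")
identical(round_trip, my.string)
## [1] TRUE
```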