Write a function that calculates the mean of any numeric vector you give it, without using the built-in mean() or sum() functions.
# Test sum function, for debugging
v <- seq(2, 45, 3)
sm <- function(x) {
  total <- 0  # running total, kept local to the function
  for (i in seq_along(x)) {
    total <- total + x[i]
    print(paste("x[i]:", x[i]))
    print(paste("total:", total))
  }
  total
}
sum(v)  # built-in, used only as a reference value
## [1] 345
# Mean function
mn <- function(x) {
  total <- 0
  for (xi in x) {
    total <- total + xi
  }
  total / length(x)  # return the mean so callers can use it, rather than printing it
}
(outcome <- c(mn(v), mean(v)))
## [1] 23 23
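The exercise leaves open what should happen with missing values or an empty vector. The sketch below is one possible extension, not part of the original answer: the name `mean_loop` and its `na.rm` argument are hypothetical, chosen to mirror the behaviour of the built-in `mean()`.

```r
# Loop-based mean that avoids mean() and sum().
# mean_loop and its na.rm argument are illustrative names, not from the exercise.
mean_loop <- function(x, na.rm = FALSE) {
  if (na.rm) {
    x <- x[!is.na(x)]  # drop missing values before averaging
  }
  if (length(x) == 0) {
    return(NaN)  # same result the built-in mean() gives for an empty vector
  }
  total <- 0
  for (xi in x) {
    total <- total + xi
  }
  total / length(x)
}

mean_loop(seq(2, 45, 3))
mean_loop(c(1, NA, 3))                # NA propagates, as with mean()
mean_loop(c(1, NA, 3), na.rm = TRUE)
```

Without `na.rm = TRUE`, a single `NA` makes the running total `NA`, which matches how `mean()` treats missing values by default.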
Write a function that takes as its input a vector with four elements. If the sum of the first two elements is greater than the sum of the second two, the function returns the vector; otherwise it returns 0.
v <- 4:1
b <- function(x) {
  # compare the pairs of the argument x, not the global v
  if (x[1] + x[2] > x[3] + x[4]) {
    x
  } else {
    0
  }
}
b(v)
## [1] 4 3 2 1
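The prompt assumes the input always has four elements. A minimal sketch of a version that checks this assumption up front follows; the name `compare_pairs` is hypothetical, and stopping with an error on bad input is a design choice rather than something the exercise asks for.

```r
# Returns x if the first pair sums to more than the second pair, otherwise 0.
# compare_pairs is an illustrative name; the input check is an added assumption.
compare_pairs <- function(x) {
  stopifnot(is.numeric(x), length(x) == 4)
  if (x[1] + x[2] > x[3] + x[4]) {
    x
  } else {
    0
  }
}

compare_pairs(4:1)  # first pair is larger, so the vector comes back
compare_pairs(1:4)  # second pair is larger, so the result is 0
```

Returning the value, instead of printing it, lets the result be assigned or passed on to another function.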
Write a function that calculates the Fibonacci sequence up to the nth element, where n is any number input into your function (its argument). The Fibonacci sequence is: 1, 1, 2, 3, 5, 8, 13, 21, ...; i.e., each element is the sum of the previous two elements. One way to do this is to start off with the first two elements, c(1,1), and set an internal variable to this sequence. Then write a loop that counts up to n, where for each new element, you first calculate it by adding the last two elements of the growing sequence, and then stick that new number onto the growing sequence using c(). When the loop is finished, the function should return the final vector of Fibonacci numbers.
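fibseq <- function(n) {
  fib <- c(1, 1)  # first two elements, kept internal to the function
  if (n <= 2) {
    return(fib[seq_len(n)])
  }
  for (i in 3:n) {
    # each new element is the sum of the previous two
    fib <- c(fib, fib[i - 1] + fib[i - 2])
  }
  fib  # return the whole sequence, not only the nth element
}
fibseq(6)
## [1] 1 1 2 3 5 8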
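The prompt suggests growing the sequence with `c()`, which copies the vector on every step. As an alternative sketch, the result can be preallocated and filled by index. The name `fib_prealloc` is hypothetical, and the function assumes `n` is a whole number of at least 1.

```r
# Fibonacci sequence up to the nth element, filling a preallocated vector.
# fib_prealloc is an illustrative name; n is assumed to be a whole number >= 1.
fib_prealloc <- function(n) {
  stopifnot(n >= 1)
  fib <- numeric(n)                # reserve space for all n elements
  fib[seq_len(min(n, 2))] <- 1     # the first one or two elements are 1
  if (n > 2) {
    for (i in 3:n) {
      fib[i] <- fib[i - 1] + fib[i - 2]
    }
  }
  fib
}

fib_prealloc(8)  # 1 1 2 3 5 8 13 21
```

The `if (n > 2)` guard matters because `3:n` counts downwards when `n` is less than 3.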
Create a 4x4 matrix of the numbers 1 through 16. Use apply to apply your function from (a) to each of the rows in your matrix.
(fibm <- matrix(data = c(1:16), nrow = 4, ncol = 4))
## [,1] [,2] [,3] [,4]
## [1,] 1 5 9 13
## [2,] 2 6 10 14
## [3,] 3 7 11 15
## [4,] 4 8 12 16
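# apply() collects each call's return value, so the row function must return
# the mean rather than print it; row_mean is a loop-based mean as in part (a)
row_mean <- function(r) {
  total <- 0
  for (val in r) total <- total + val
  total / length(r)
}
(fio <- apply(fibm, 1, row_mean))
## [1]  7  8  9 10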
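The second argument of `apply()` selects the margin: 1 for rows, 2 for columns. The shape of the result depends on what the applied function returns. A short sketch using the same 1 to 16 matrix and the built-ins `max()` and `range()`, chosen here only for illustration:

```r
m <- matrix(1:16, nrow = 4, ncol = 4)  # filled column by column

row_max <- apply(m, 1, max)    # one value per row:    13 14 15 16
col_max <- apply(m, 2, max)    # one value per column:  4  8 12 16

# When the function returns a vector, apply() returns a matrix with
# one column per row (or column) of the input.
row_range <- apply(m, 1, range)
dim(row_range)                 # 2 4
```

A function that only prints and returns nothing useful gives `apply()` nothing to collect, which is why the applied function needs an explicit return value.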
Using the airquality dataset, construct an aggregated dataset which shows the maximum wind and ozone by month.
(wioz_by_mo <- aggregate(cbind(Wind, Ozone) ~ Month, data = airquality, "max"))
## Month Wind Ozone
## 1 5 20.1 115
## 2 6 20.7 71
## 3 7 14.9 135
## 4 8 15.5 168
## 5 9 16.6 96
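One caveat with the formula interface of `aggregate()`: its default `na.action = na.omit` drops every row that has a missing value in any listed variable, so the Wind maximum is taken only over days on which Ozone was also recorded. The sketch below shows a variant that takes each column's maximum over its own non-missing values, assuming that is the intended reading of the question. The name `by_month` is illustrative.

```r
# Keep rows with missing values (na.pass) and let max() skip them per column.
by_month <- aggregate(cbind(Wind, Ozone) ~ Month, data = airquality,
                      FUN = max, na.rm = TRUE, na.action = na.pass)
by_month

# Cross-check against a per-column computation with tapply()
wind_check  <- tapply(airquality$Wind,  airquality$Month, max)
ozone_check <- tapply(airquality$Ozone, airquality$Month, max, na.rm = TRUE)
```

Whether the two approaches give different numbers depends on where the missing Ozone values fall, so it is worth comparing them on the data at hand.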
Create the authors and books datasets following the example and data in the lecture, and then create a new data set by merging these two datasets by author, preserving all rows.
(authors <- data.frame(surname = c("Tukey", "Venables", "Tierney", "Ripley", "McNeil"),
nationality = c("US", "Australia", "US", "UK", "Australia"), stringsAsFactors = FALSE))
## surname nationality
## 1 Tukey US
## 2 Venables Australia
## 3 Tierney US
## 4 Ripley UK
## 5 McNeil Australia
(books <- data.frame(name = c("Tukey", "Venables", "Tierney", "Ripley", "Ripley",
"McNeil", "R Core"), title = c("Exploratory Data Analysis", "Modern Applied Statistics ...",
"LISP-STAT", "Spatial Statistics", "Stochastic Simulation", "Interactive Data Analysis",
"An Introduction to R"), stringsAsFactors = FALSE))
## name title
## 1 Tukey Exploratory Data Analysis
## 2 Venables Modern Applied Statistics ...
## 3 Tierney LISP-STAT
## 4 Ripley Spatial Statistics
## 5 Ripley Stochastic Simulation
## 6 McNeil Interactive Data Analysis
## 7 R Core An Introduction to R
# all = TRUE keeps unmatched rows from both data frames (here, the "R Core" book)
merge(authors, books, by.x = "surname", by.y = "name", all = TRUE)
## surname nationality title
## 1 McNeil Australia Interactive Data Analysis
## 2 R Core <NA> An Introduction to R
## 3 Ripley UK Spatial Statistics
## 4 Ripley UK Stochastic Simulation
## 5 Tierney US LISP-STAT
## 6 Tukey US Exploratory Data Analysis
## 7 Venables Australia Modern Applied Statistics ...
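Which rows survive a merge is controlled by the `all`, `all.x` and `all.y` arguments of `merge()`. The sketch below rebuilds the two data frames from this exercise and counts the rows each setting keeps. The variable names `inner`, `left`, `right` and `full` are illustrative labels borrowed from SQL join terminology.

```r
authors <- data.frame(
  surname = c("Tukey", "Venables", "Tierney", "Ripley", "McNeil"),
  nationality = c("US", "Australia", "US", "UK", "Australia"),
  stringsAsFactors = FALSE
)
books <- data.frame(
  name = c("Tukey", "Venables", "Tierney", "Ripley", "Ripley", "McNeil", "R Core"),
  title = c("Exploratory Data Analysis", "Modern Applied Statistics ...",
            "LISP-STAT", "Spatial Statistics", "Stochastic Simulation",
            "Interactive Data Analysis", "An Introduction to R"),
  stringsAsFactors = FALSE
)

inner <- merge(authors, books, by.x = "surname", by.y = "name")                # matches only
left  <- merge(authors, books, by.x = "surname", by.y = "name", all.x = TRUE)  # every author
right <- merge(authors, books, by.x = "surname", by.y = "name", all.y = TRUE)  # every book
full  <- merge(authors, books, by.x = "surname", by.y = "name", all = TRUE)    # everything

c(inner = nrow(inner), left = nrow(left), right = nrow(right), full = nrow(full))
```

"R Core" has a book but no entry in `authors`, so it appears only when rows from `books` are preserved, with `NA` for nationality.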
Take the following string and replace every instance of “to” or “To” with “2”:
to_2 <- "To be, or not to be -- that is the question: Whether 'tis nobler in the mind to suffer The slings and arrows of outrageous fortune, Or to take arms against a sea of troubles, And by opposing end them. To die -- to sleep -- No more..."
# "[Tt]o" matches "To" or "to"; inside brackets "|" would be a literal pipe, not "or"
gsub("[Tt]o", "2", to_2)
## [1] "2 be, or not 2 be -- that is the question: Whether 'tis nobler in the mind 2 suffer The slings and arrows of outrageous fortune, Or 2 take arms against a sea of troubles, And by opposing end them. 2 die -- 2 sleep -- No more..."
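Two details of the pattern are easy to miss. Inside a bracket expression, `|` is an ordinary character, and an unanchored pattern also matches "to" inside longer words. The sketch below uses short made-up strings to show both effects. Restricting the match to whole words with `\\b` is an assumption about what the exercise intends, not something it states.

```r
s <- "To stop, go to Toronto"   # made-up example string

# Unanchored: also rewrites "to" inside "stop" and "Toronto"
anywhere <- gsub("[Tt]o", "2", s)
anywhere      # "2 s2p, go 2 2ron2"

# Whole words only, using word boundaries
whole_words <- gsub("\\b[Tt]o\\b", "2", s, perl = TRUE)
whole_words   # "2 stop, go 2 Toronto"

# "|" inside brackets is literal, so "[T|t]o" also matches "|o"
pipe_case <- gsub("[T|t]o", "2", "a|o")
pipe_case     # "a2"
```

For the Hamlet passage in this exercise the two patterns give the same result, because no other word in it contains "to".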
Create a histogram using the base R graphics using some dataset or variable other than the one in the lessons. Always make sure your graph has well-labeled x and y axes and an explanatory title.
library(ggplot2)  # provides the mpg dataset
hist(mpg$hwy, main = "Frequencies of hwy mpg for 38 popular models of car", xlab = "Highway MPG", ylab = "Frequency")
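To see what a base R histogram is counting, `hist()` can be called with `plot = FALSE`, which returns the bin edges and counts without drawing. The sketch below uses the built-in `mtcars` data rather than `mpg`, so that it runs without extra packages. The dataset and the break points are assumptions chosen for illustration.

```r
# Compute the histogram bins without drawing the plot
h <- hist(mtcars$mpg, breaks = seq(10, 35, by = 5), plot = FALSE)

h$breaks        # bin edges: 10 15 20 25 30 35
h$counts        # number of cars falling in each bin
sum(h$counts)   # every observation lands in exactly one bin
```

Passing explicit `breaks` makes the bin width a deliberate choice instead of leaving it to the default rule.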
Create a scatter plot using the base R graphics, again with some variable other than the one in the lessons.
plot(mpg$cyl, y = mpg$hwy, main = "# of Cylinders v Highway MPG", xlab = "# of Cylinders",
ylab = "HWY MPG")
Create a histogram using ggplot, using some new data. In this and the later plots, please tinker with the settings using the examples in http://www.cookbook-r.com/Graphs/ to make it prettier.
head(diamonds)
## # A tibble: 6 x 10
## carat cut color clarity depth table price x y z
## <dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>
## 1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43
## 2 0.21 Premium E SI1 59.8 61 326 3.89 3.84 2.31
## 3 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31
## 4 0.29 Premium I VS2 62.4 58 334 4.20 4.23 2.63
## 5 0.31 Good J SI2 63.3 58 335 4.34 4.35 2.75
## 6 0.24 Very Good J VVS2 62.8 57 336 3.94 3.96 2.48
palette <- c("#4545FF", "#B6FF9C", "#84FFC9", "#7EEA90", "#BADB69")
library(dplyr)  # provides %>%, group_by(), summarize(), and n()
(dia_1 <- diamonds %>% group_by(cut) %>% summarize(mn.price = mean(price), mn.crt = mean(carat),
n = n()))
## # A tibble: 5 x 4
## cut mn.price mn.crt n
## <ord> <dbl> <dbl> <int>
## 1 Fair 4358.758 1.0461366 1610
## 2 Good 3928.864 0.8491847 4906
## 3 Very Good 3981.760 0.8063814 12082
## 4 Premium 4584.258 0.8919549 13791
## 5 Ideal 3457.542 0.7028370 21551
# geom_col() draws bars at the precomputed means (geom_histogram() is for binning raw data)
ggplot(data = dia_1, mapping = aes(x = cut, y = mn.price)) +
  geom_col(mapping = aes(fill = mn.crt, color = n), size = 1) +
  scale_fill_gradient("mn.carat", low = "#FDE4FF", high = "#A065A3") +
  xlab("Cut") + ylab("Mean Price") +
  ggtitle("Cut v Avg Price, Fill: Avg Carat, Border: No. in Cut")
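The chart above plots one precomputed bar per cut. For comparison, here is a sketch of a histogram in the binning sense, where `geom_histogram()` counts raw observations per bin. The choice of `price` as the variable, the bin width of 500 and the colours are assumptions made for illustration.

```r
library(ggplot2)  # provides ggplot() and the diamonds dataset

# Histogram of raw prices: each bar counts the diamonds in a 500-dollar bin
p <- ggplot(data = diamonds, mapping = aes(x = price)) +
  geom_histogram(binwidth = 500, fill = "#A065A3", color = "white") +
  labs(x = "Price (USD)", y = "Number of diamonds",
       title = "Distribution of Diamond Prices")
p
```

Setting `binwidth` explicitly avoids relying on the default number of bins, which rarely suits the data.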
Create a box plot (with multiple categories) using ggplot, using some new data.
ggplot(data = mpg, mapping = aes(x = class, y = hwy)) + geom_boxplot(color = "#4545FF",
fill = "#B6FF9C") + xlab("Class of Car") + ylab("Hwy MPG") + ggtitle("Class of Car v Hwy MPG") +
theme(plot.title = element_text(hjust = 0.5))
Create a scatter plot using ggplot, using some new data.
ggplot(data = mpg, mapping = aes(x = cyl, y = cty, color = class)) + geom_point() +
xlab("No. of Cylinders") + ylab("City MPG") + ggtitle("Cylinders v City MPG, Class of Vehicle by Color")