Question 2.1
# Suppose you keep track of your mileage each time you fill up. At your last 8 fill-ups the mileage was:
# 65311 65624 65908 66219 66499 66821 67145 67447
# Enter these numbers into R.
miles <- c(65311, 65624, 65908, 66219, 66499, 66821, 67145, 67447)
miles
## [1] 65311 65624 65908 66219 66499 66821 67145 67447
# Use the function diff on the data. What does it give?
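# Ans: diff() returns the differences between consecutive elements, miles[n+1] - miles[n] for n from 1 to length(miles) - 1, so the result has one fewer element than the input. Here that is the number of miles driven between fill-ups.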
fillup_delta <- diff(miles)
fillup_delta
## [1] 313 284 311 280 322 324 302
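# To make the description of diff() concrete, here is a minimal sketch that rebuilds the same result by hand.
# The name manual_delta is only illustrative and is not part of the exercise.

```r
# Same odometer readings as above
miles <- c(65311, 65624, 65908, 66219, 66499, 66821, 67145, 67447)

# Drop the first reading, drop the last reading, and subtract element-wise
manual_delta <- miles[-1] - miles[-length(miles)]

# Compare with the built-in diff()
same_as_diff <- isTRUE(all.equal(manual_delta, diff(miles)))
same_as_diff
## [1] TRUE
```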
# Use the max function to find the maximum number of miles between fill-ups, the mean function to find the average number of miles, and the min function to get the minimum number of miles.
max(fillup_delta)
## [1] 324
min(fillup_delta)
## [1] 280
mean(fillup_delta)
## [1] 305.1429
Question 2.2
# Suppose you track your commute times for two weeks (10 days) and you find the following times in minutes: 17 16 20 24 22 15 21 15 17 22
# Enter this into R.
commute <- c(17, 16, 20, 24, 22, 15, 21, 15, 17, 22)
commute
## [1] 17 16 20 24 22 15 21 15 17 22
# Use the function max to find the longest commute time, the function mean to find the average and the function min to find the minimum.
max(commute)
## [1] 24
min(commute)
## [1] 15
mean(commute)
## [1] 18.9
# Oops, the 24 was a mistake. It should have been 18. How can you fix this? Do so, and then find the new average.
commute[4] <- 18
mean(commute)
## [1] 18.3
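# The fix above hard-codes position 4. As a hedged alternative, the position can be looked up with which().
# This sketch assumes 24 occurs only as the mistaken entry; the name wrong is illustrative.

```r
# Original (uncorrected) commute times
commute <- c(17, 16, 20, 24, 22, 15, 21, 15, 17, 22)

# Position(s) holding the mistaken value
wrong <- which(commute == 24)
wrong
## [1] 4

# Overwrite the mistaken value and recompute the average
commute[wrong] <- 18
mean(commute)
## [1] 18.3
```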
# How many times was your commute 20 minutes or more?
sum(commute >= 20)
## [1] 4
# What percent of your commutes are less than 17 minutes? How can you answer this with R?
sum(commute < 17) / length(commute)
## [1] 0.3
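# The value 0.3 is a proportion. A small sketch of turning it into a percentage: the mean of a logical vector is the share of TRUE values, so multiplying by 100 gives the percent.
# The name pct_under_17 is illustrative.

```r
# Commute times after correcting the 24 to 18
commute <- c(17, 16, 20, 18, 22, 15, 21, 15, 17, 22)

# Share of TRUE values, expressed as a percent
pct_under_17 <- mean(commute < 17) * 100
pct_under_17
## [1] 30
```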
Question 2.3
# Your cell phone bill varies from month to month. Suppose your year has the following monthly amounts: 46 33 39 37 46 30 48 32 49 35 30 48
# Enter this data into a variable called bill.
bill <- c(46, 33, 39, 37, 46, 30, 48, 32, 49, 35, 30, 48)
bill
## [1] 46 33 39 37 46 30 48 32 49 35 30 48
# Use the sum command to find the amount you spent this year on the cell phone.
sum(bill)
## [1] 473
# What is the smallest amount you spent in a month?
min(bill)
## [1] 30
# What is the largest?
max(bill)
## [1] 49
# How many months was the amount greater than $40?
sum(bill > 40)
## [1] 5
# What percentage was this?
sum(bill > 40) / length(bill)
## [1] 0.4166667
# Ans: about 41.7% of the months.
Question 2.4
# You want to buy a used car and find that over 3 months of watching the classifieds you see the following prices (suppose the cars are all similar): 9000 9500 9400 9400 10000 9500 10300 10200
price <- c(9000, 9500, 9400, 9400, 10000, 9500, 10300, 10200)
price
## [1] 9000 9500 9400 9400 10000 9500 10300 10200
# Use R to find the average value and compare it to the Edmunds (http://www.edmunds.com) estimate of $9500.
avg <- mean(price)
edmunds <- 9500
delta <- abs(avg - edmunds)
delta
## [1] 162.5
# Ans: the average price (9662.5) is $162.50 above the $9500 estimate.
# Use R to find the minimum value and the maximum value. Which price would you like to pay?
min(price)
## [1] 9000
max(price)
## [1] 10300
# Ans: I would like to pay the minimum, $9000.
Question 2.5
# Try to guess the results of these R commands. Remember, the way to access entries in a vector is with []. Suppose we assume
x <- c(1, 3, 5, 7, 9)
y <- c(2, 3, 5, 7, 11, 13)
# 1. x+1 -> all elements in x are incremented by 1 > 2 4 6 8 10
x + 1
## [1] 2 4 6 8 10
# 2. y*2 -> all elements in y are multiplied by 2 > 4 6 10 14 22 26
y*2
## [1] 4 6 10 14 22 26
# 3a. length(x) -> the number of elements within x > 5
length(x)
## [1] 5
# 3b. length(y) -> the number of elements within y > 6
length(y)
## [1] 6
# 4. x + y -> The 5 elements of x are added to the first 5 elements of y; x is then recycled, so its first element is added to the sixth (last) element of y. R also warns that the longer length is not a multiple of the shorter one > 3 6 10 14 20 14
x + y
## [1] 3 6 10 14 20 14
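# A minimal sketch of the recycling described in item 4: x is stretched to the length of y with rep() before adding.
# The names x_recycled and recycled_sum are illustrative.

```r
x <- c(1, 3, 5, 7, 9)
y <- c(2, 3, 5, 7, 11, 13)

# Repeat x from the start until it is as long as y
x_recycled <- rep(x, length.out = length(y))
x_recycled
## [1] 1 3 5 7 9 1

# Adding the stretched vector gives the same values as x + y,
# without the length-mismatch warning
recycled_sum <- x_recycled + y
matches <- all(recycled_sum == suppressWarnings(x + y))
matches
## [1] TRUE
```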
# 5a. sum(x>5) -> the number of elements of x that are greater than 5 (each TRUE counts as 1) > 2
sum(x>5)
## [1] 2
# 5b. sum(x[x>5]) -> the sum of the elements of x that are greater than 5 > 16
sum(x[x>5])
## [1] 16
# 6. sum(x>5 | x< 3) -> the number of elements of x that are greater than 5 or less than 3 > 3
sum(x>5 | x< 3)
## [1] 3
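# Items 5a, 5b and 6 rely on how R treats logical vectors. The sketch below spells out the intermediate steps.
# The name big is illustrative.

```r
x <- c(1, 3, 5, 7, 9)

# A comparison returns one TRUE/FALSE per element
big <- x > 5
big
## [1] FALSE FALSE FALSE  TRUE  TRUE

# In arithmetic, TRUE is treated as 1 and FALSE as 0
as.integer(big)
## [1] 0 0 0 1 1

# So sum(big) counts the matches, while sum(x[big]) adds up the matching values
sum(big)
## [1] 2
sum(x[big])
## [1] 16
```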
# 7. y[3] -> the third element of y > 5
y[3]
## [1] 5
# 8. y[-3] -> all elements except the third element of y > 2 3 7 11 13
y[-3]
## [1] 2 3 7 11 13
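# 9. y[x] (What is NA?) -> A vector with the same length as x, where the elements of x are used as indices into y, i.e. the 1st, 3rd, 5th, 7th and 9th elements of y. y has only 6 elements, so positions 7 and 9 do not exist and R returns NA ("not available", a missing value) for them > 2 5 11 NA NA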
y[x]
## [1] 2 5 11 NA NA
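# A short sketch of where the NA values come from and one way to avoid them, assuming the goal is to keep only indices that exist in y.
# The names picked and valid are illustrative.

```r
x <- c(1, 3, 5, 7, 9)
y <- c(2, 3, 5, 7, 11, 13)

# Indices 7 and 9 are beyond length(y), so those positions come back as NA
picked <- y[x]
is.na(picked)
## [1] FALSE FALSE FALSE  TRUE  TRUE

# Keep only the indices that fall inside y
valid <- x[x <= length(y)]
y[valid]
## [1]  2  5 11
```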
# 10. y[y>=7] -> a vector of elements within y which are greater than or equal to 7 > 7 11 13
y[y>=7]
## [1] 7 11 13
Question 2.6
# Let the data x be given by
x <- c(1, 8, 2, 6, 3, 8, 5, 5, 5, 5)
# Use R to compute the following functions. Note, we use X1 to denote the first element of x (which is 1) etc.
# 1. (X1 + X2 + · · · + X10)/10 (use sum)
sum(x) / 10
## [1] 4.8
# 2. Find log10(Xi) for each i. (Use the log function, which by default is base e)
log(x, base = 10)
## [1] 0.0000000 0.9030900 0.3010300 0.7781513 0.4771213 0.9030900 0.6989700
## [8] 0.6989700 0.6989700 0.6989700
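# As a cross-check, the same values can be obtained with the built-in log10() or with the change-of-base formula log(x) / log(10). A minimal sketch:

```r
x <- c(1, 8, 2, 6, 3, 8, 5, 5, 5, 5)

# log10() is a shorthand for log(x, base = 10)
same_builtin <- isTRUE(all.equal(log10(x), log(x, base = 10)))
same_builtin
## [1] TRUE

# Change of base: divide the natural log by log(10)
same_formula <- isTRUE(all.equal(log(x) / log(10), log(x, base = 10)))
same_formula
## [1] TRUE
```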
# 3. Find (Xi - 4.4)/2.875 for each i. (Do it all at once)
(x - 4.4) / 2.875
## [1] -1.1826087 1.2521739 -0.8347826 0.5565217 -0.4869565 1.2521739
## [7] 0.2086957 0.2086957 0.2086957 0.2086957
# 4. Find the difference between the largest and smallest values of x. (This is the range. You can use max and min or guess a built in command.)
diff(range(x))
## [1] 7
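# The exercise also suggests max and min. A minimal sketch showing that this gives the same answer as diff(range(x)); the name spread is illustrative.

```r
x <- c(1, 8, 2, 6, 3, 8, 5, 5, 5, 5)

# range() returns the smallest and largest values as a length-2 vector
range(x)
## [1] 1 8

# Subtracting them directly matches diff(range(x))
spread <- max(x) - min(x)
spread
## [1] 7
```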