R contains built-in functions for your favorite math operations and, of course, for statistical distributions.
#Visualize the cars dataset
plot(cars)
Extended Example: Calculating a Probability
As our first example, we’ll work through calculating a probability using the prod() function. Suppose we have n independent events, and the ith event has probability p_i of occurring. What is the probability of exactly one of these events occurring? Suppose first that n = 3 and our events are named A, B, and C. Then we break down the computation as follows:
P(exactly one event occurs) = P(A and not B and not C) + P(not A and B and not C) + P(not A and not B and C)
Here’s code to compute this, with our probabilities p_i contained in the vector p:
#probabilities must lie between 0 and 1
p <- c(0.2,0.4,0.6)
exactlyone <- function(p) {
  notp <- 1 - p
  tot <- 0.0
  #add P(event i occurs and all the others do not) for each i
  for (i in seq_along(p))
    tot <- tot + p[i] * prod(notp[-i])
  return(tot)
}
exactlyone(p)
## [1] 0.464
How does it work? The assignment notp <- 1 - p creates a vector of all the “not occur” probabilities 1 − p_j, using recycling. The expression notp[-i] drops the ith element of notp, so prod(notp[-i]) computes the product of all the elements of notp except the ith, which is exactly what we need.
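As a sanity check on the formula, the same probability can be obtained by brute-force enumeration of all 2^n occur/not-occur patterns. This is only a sketch: the helper name exactlyone_enum and the sample probabilities are assumptions made for illustration, and exactlyone() is restated so the block runs on its own.

```r
# Restated from the example above so this block is self-contained.
exactlyone <- function(p) {
  notp <- 1 - p
  tot <- 0.0
  for (i in seq_along(p))
    tot <- tot + p[i] * prod(notp[-i])
  tot
}

# Hypothetical helper: enumerate every occur/not-occur pattern and add up
# the probabilities of the patterns in which exactly one event occurs.
exactlyone_enum <- function(p) {
  n <- length(p)
  # Each row is one outcome: 1 = event occurs, 0 = it does not.
  outcomes <- as.matrix(expand.grid(rep(list(0:1), n)))
  tot <- 0
  for (r in seq_len(nrow(outcomes))) {
    occ <- outcomes[r, ]
    if (sum(occ) == 1)
      tot <- tot + prod(ifelse(occ == 1, p, 1 - p))
  }
  tot
}

p <- c(0.2, 0.4, 0.6)  # illustrative probabilities (an assumption)
agree <- all.equal(exactlyone(p), exactlyone_enum(p))
agree
```

The enumeration grows as 2^n, so it is useful only as a check for small n; the loop in exactlyone() needs just n products.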
As mentioned, the functions cumsum() and cumprod() return cumulative sums and products.
#cumsum() returns the cumulative (running) sums of the vector passed as its argument
#cumprod() returns the cumulative (running) products of the vector passed as its argument
x <- c(12,5,13)
cumsum(x)
## [1] 12 17 30
cumprod(x)
## [1] 12 60 780
In x, the sum of the first element is 12, the sum of the first two elements is 17, and the sum of the first three elements is 30. The function cumprod() works the same way as cumsum(), but with the product instead of the sum.
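One common use of cumsum() is a running mean. The short sketch below reuses the vector x from the example; the variable name running_mean is an assumption chosen for illustration.

```r
x <- c(12, 5, 13)

# Running mean: cumulative sum divided by the number of elements so far.
running_mean <- cumsum(x) / seq_along(x)
running_mean

# The last cumulative value equals the overall sum or product.
last_sum_ok <- tail(cumsum(x), 1) == sum(x)
last_prod_ok <- tail(cumprod(x), 1) == prod(x)
```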
Minima and Maxima
There is quite a difference between min() and pmin(). The former simply combines all its arguments into one long vector and returns the minimum value in that vector. In contrast, if pmin() is applied to two or more vectors, it returns a vector of the pair-wise minima, hence the name pmin.
Here’s an example:
z
##      [,1] [,2]
## [1,]    1    2
## [2,]    5    3
## [3,]    6    2
min(z[,1],z[,2])
## [1] 1
pmin(z[,1],z[,2])
## [1] 1 3 2
In the first case, min() computed the smallest value in (1,5,6,2,3,2). But the call to pmin() computed the smaller of 1 and 2, yielding 1; then the smaller of 5 and 3, which is 3; then finally the minimum of 6 and 2, giving 2. Thus, the call returned the vector (1,3,2).
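The text does not show how z was created. The sketch below rebuilds it with matrix(), which is one plausible construction rather than necessarily the original one, and stores the results under illustrative variable names.

```r
# Rebuild the matrix shown above; matrix() fills column by column.
z <- matrix(c(1, 5, 6, 2, 3, 2), nrow = 3)

overall_min  <- min(z[, 1], z[, 2])   # one number across all six values
pairwise_min <- pmin(z[, 1], z[, 2])  # one minimum per pair of elements
pairwise_max <- pmax(z[, 1], z[, 2])  # pairwise maxima, by analogy
```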
The max() and pmax() functions act analogously to min() and pmin(). Function minimization/maximization can be done via nlm() and optim(). For example, let’s find the smallest value of f(x) = x^2 − sin(x).
#nlm() carries out a nonlinear minimization of the function f using a Newton-type algorithm.
nlm(function(x) return(x^2-sin(x)),8)
## $minimum
## [1] -0.2324656
##
## $estimate
## [1] 0.4501831
##
## $gradient
## [1] 4.024558e-09
##
## $code
## [1] 1
##
## $iterations
## [1] 5
Here, the minimum value was found to be approximately −0.23, occurring at x = 0.45. A Newton-Raphson method (a technique from numerical analysis for approximating roots) is used, running five iterations in this case. The second argument specifies the initial guess, which we set to be 8. (This second argument was picked pretty arbitrarily here, but in some problems, you may need to experiment to find a value that will lead to convergence.)
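To see how much the initial guess matters for this particular function, one can run nlm() from several starting points and compare the estimates. The sketch below also cross-checks the result with optimize(), a different base R routine that searches a bounded interval; the starting values and the interval are arbitrary choices made for illustration.

```r
f <- function(x) x^2 - sin(x)

# Try several initial guesses (chosen arbitrarily).
starts <- c(-5, 1, 8)
estimates <- sapply(starts, function(s) nlm(f, s)$estimate)
estimates

# Cross-check with optimize(), which takes an interval instead of a guess.
# Note the naming difference: optimize()'s $minimum is the location x,
# whereas nlm()'s $minimum is the function value at the minimum.
check <- optimize(f, interval = c(-10, 10))
check$minimum    # location of the minimum
check$objective  # function value there
```

Because f''(x) = 2 + sin(x) is always positive, this function has a single minimum, so every start should lead to the same place. Functions with several local minima are where the choice of initial guess becomes important.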
Recreate the scenario above, this time using the function f(x) = 3x^2 − cos(x).
#Enter Answer Here
#needed to fix some parts of the function to make it compute the solution.
nlm(function(x) return(3*x^2-cos(x)),12)
## $minimum
## [1] -1
##
## $estimate
## [1] -7.282059e-13
##
## $gradient
## [1] -4.996004e-12
##
## $code
## [1] 1
##
## $iterations
## [1] 4
Calculus
R also has some calculus capabilities, including symbolic differentiation and numerical integration, as you can see in the following example.
# derivative
D(expression(exp(x^2)),"x")
## exp(x^2) * (2 * x)
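D() returns an unevaluated expression, so getting a number out of it takes a call to eval(). The sketch below evaluates the derivative at x = 1 and compares it with a central finite difference; the evaluation point and the step size h are arbitrary choices for illustration.

```r
dfdx <- D(expression(exp(x^2)), "x")

# Evaluate the symbolic derivative at x = 1, i.e. exp(1) * 2.
symbolic <- eval(dfdx, list(x = 1))

# Compare with a central finite-difference approximation.
f <- function(x) exp(x^2)
h <- 1e-6
numeric_approx <- (f(1 + h) - f(1 - h)) / (2 * h)

close_enough <- all.equal(symbolic, numeric_approx, tolerance = 1e-6)
close_enough
```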
Find the derivative of exp(x^3)+x
#Enter Answer Here
D(expression(exp(x^3) + x),"x")
## exp(x^3) * (3 * x^2) + 1
#integrate() performs adaptive quadrature of a function of one variable over a finite or infinite interval.
#here "quadrature" means numerically approximating a definite integral, i.e., the area under the curve.
integrate(function(x) x^2,0,1)
## 0.3333333 with absolute error < 3.7e-15
Integrate exp(x^2) between 0 and 1
#Enter Answer Here
integrate(function(x) exp(x^2),0,1)
## 1.462652 with absolute error < 1.6e-14
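integrate() also accepts -Inf and Inf as limits. As a small sketch, the Gaussian integral of exp(-x^2) over the whole real line has the known closed-form value sqrt(pi), which gives something to compare against; the choice of integrand is an assumption made for illustration.

```r
# The integrand decays quickly, so the integral over the real line exists.
res <- integrate(function(x) exp(-x^2), -Inf, Inf)

res$value      # numerical estimate
res$abs.error  # estimated absolute error reported by integrate()
sqrt(pi)       # closed-form value for comparison
```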
Set Operations
R includes some handy set operations, including these:
• union(x,y): Union of the sets x and y
• intersect(x,y): Intersection of the sets x and y
• setdiff(x,y): Set difference between x and y, consisting of all elements of x that are not in y
• setequal(x,y): Test for equality between x and y
• c %in% y: Membership, testing whether c is an element of the set y
• choose(n,k): Number of possible subsets of size k chosen from a set of size n
#define the vectors x and y and give them elements
x <- c(1,2,5)
y <- c(5,1,8,9)
#combines the elements of both vectors, in order of first appearance, with no repetitions
union(x,y)
## [1] 1 2 5 8 9
#shows the elements that appear in both vectors
intersect(x,y)
## [1] 1 5
#shows the elements of x that are not in y
setdiff(x,y)
## [1] 2
#shows the elements of y that are not in x
setdiff(y,x)
## [1] 8 9
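The remaining operations from the list, setequal(), %in%, and choose(), can be tried on the same vectors. This is a brief sketch; the result variable names are assumptions chosen for illustration.

```r
x <- c(1, 2, 5)
y <- c(5, 1, 8, 9)

same_xy   <- setequal(x, y)              # FALSE: the sets differ
same_perm <- setequal(x, c(5, 2, 1, 1))  # TRUE: order and repeats are ignored
two_in_y  <- 2 %in% y                    # FALSE: 2 is not an element of y
which_in  <- c(1, 2) %in% y              # TRUE FALSE: tested element by element
n_subsets <- choose(5, 2)                # 10 subsets of size 2 from a set of size 5
```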
Create two vectors, x1 and y1, and apply the same set operations.
#Enter Answer Here
peru <- c(10,9,5,8,6,16)
cuba <- c(2,4,6,8,10,14,7)
#combines the elements of both vectors, in order of first appearance, with no repetitions
union(peru,cuba)
## [1] 10 9 5 8 6 16 2 4 14 7
#shows the elements that appear in both vectors
intersect(peru,cuba)
## [1] 10 8 6
#shows the elements of peru that are not in cuba
setdiff(peru,cuba)
## [1] 9 5 16
#shows the elements of cuba that are not in peru
setdiff(cuba,peru)
## [1] 2 4 14 7