Ch1.1.3 Efficiency

Efficiency

Programmers work hard to design efficient methods.
Every gain in efficiency enables more interesting problems to be solved.
Programs are run so many times that increased efficiency decreases the overall runtime as well.
Accuracy and speed depend on fewer steps along the way.

Two Areas of Efficiency

Time complexity focuses on how long it takes for a program to execute.
Programs that require more steps for larger input have higher time complexity.
Space complexity measures how much storage, even intermediate, is necessary to execute a program.
Programs that require more memory for larger inputs are said to have more space complexity.
Counting floating point operations (flops) is a simple way to address some of these ideas.

Example 1: Count1

This program counts number of integers between 1 and m.
Is the output what we expect?
Write down the values of i and the values of count for each i in the for loop. How many flops are performed?

count1 <- function(m){
  count <- 0
  for (i in 1:m)
    count <- count + 1
  return(count)}

count1(10)

[1] 10

Example 1: Count2

This version of the previous program shows the values of i in the for loop along with the value of count for each i.
The cat command provides a format for listing the output.
The \n provides a new line for the listing.
Useful for debugging and determining flop count.

count2 <- function(m) {
  count <- 0
  for (i in 1:m) {
    count <- count + 1
    cat("i = ",count,",","count = ",count,"\n")
    }
  }

Example 1: Count2

Is end result correct?
Are intermediate steps correct?
What is flop count?

count2 <- function(m){
count <- 0
for (i in 1:m){
count <- count + 1
cat("i = ",count,",","count = ",count,"\n")}
}

count2(10)

i =  1 , count =  1 
i =  2 , count =  2 
i =  3 , count =  3 
i =  4 , count =  4 
i =  5 , count =  5 
i =  6 , count =  6 
i =  7 , count =  7 
i =  8 , count =  8 
i =  9 , count =  9 
i =  10 , count =  10

Example 2: Count3

What is computed here? Is the output correct?
Write down the intermediate values of the indices m, n, i, j.
How many flops are performed?

count3 <- function(m,n) {
  count <- 0
  for (i in 1:m) {
    for (j in 1:n) {
    count <- count + 1  }}
  return(count)}

count3(3,2)

[1] 6

Example 2: Count4

This version of the previous program shows the intermediate values of the indices m, n, i, j as they appear when running through the for loops.

count4 <- function(m,n) {
  count <- 0
  cat("i = ", 0,",","j = ", 0,",", "count = ", count, "\n")
  for (i in 1:m) {
    for (j in 1:n) {
      count <- count + 1
  cat("i = ", i,",","j = ", j,",", "count = ", count, "\n")}}
}

Example 2: Count4

Is end result correct?
Are intermediate steps correct?
What is flop count?

count4(3,2)

i =  0 , j =  0 , count =  0 
i =  1 , j =  1 , count =  1 
i =  1 , j =  2 , count =  2 
i =  2 , j =  1 , count =  3 
i =  2 , j =  2 , count =  4 
i =  3 , j =  1 , count =  5 
i =  3 , j =  2 , count =  6

Example 2: Count4

count4(4,3)

i =  0 , j =  0 , count =  0 
i =  1 , j =  1 , count =  1 
i =  1 , j =  2 , count =  2 
i =  1 , j =  3 , count =  3 
i =  2 , j =  1 , count =  4 
i =  2 , j =  2 , count =  5 
i =  2 , j =  3 , count =  6 
i =  3 , j =  1 , count =  7 
i =  3 , j =  2 , count =  8 
i =  3 , j =  3 , count =  9 
i =  4 , j =  1 , count =  10 
i =  4 , j =  2 , count =  11 
i =  4 , j =  3 , count =  12

Example 3: Count5

What is computed here? Is the output correct?
Write down the intermediate values of the indices m, n, i, j.
How many flops are performed?

count5 <- function(m) {
  count <- 0
  for (i in 1:m) {
    for (j in 1:i) {
    count <- count + 1  }}
  return(count)}

count5(3)

[1] 6

Example 3: Count6

This version of the previous program shows the intermediate values of the indices m, n, i, j as they appear when running through the for loops.

count6 <- function(m) {
  count <- 0
  cat("i = ", 0,",","j = ", 0,",", "count = ", count, "\n")
  for (i in 1:m) {
    for (j in 1:i) {
      count <- count + 1
  cat("i = ", i,",","j = ", j,",", "count = ", count, "\n")}}
}

Example 3: Count6

Is end result correct?
Are intermediate steps correct?
What is flop count?

count6(3)

i =  0 , j =  0 , count =  0 
i =  1 , j =  1 , count =  1 
i =  2 , j =  1 , count =  2 
i =  2 , j =  2 , count =  3 
i =  3 , j =  1 , count =  4 
i =  3 , j =  2 , count =  5 
i =  3 , j =  3 , count =  6

Example 3: Count6

count6(4)

i =  0 , j =  0 , count =  0 
i =  1 , j =  1 , count =  1 
i =  2 , j =  1 , count =  2 
i =  2 , j =  2 , count =  3 
i =  3 , j =  1 , count =  4 
i =  3 , j =  2 , count =  5 
i =  3 , j =  3 , count =  6 
i =  4 , j =  1 , count =  7 
i =  4 , j =  2 , count =  8 
i =  4 , j =  3 , count =  9 
i =  4 , j =  4 , count =  10

Discussion

Flops have been going up with \( n \).
Not always the case.
In next example, flops go up with \( \sqrt{n} \).
Recall a prime is a natural number \( n \) that is only divisible by 1 and \( n \).
Here are the first several primes:

\[ 2,3,5,7,11,13,17,... \]

Example 4: IsPrime

What is computed here? Is output correct?
Write down intermediate values for steps.

IsPrime <-function(n){
  if (n == 2)
    return(TRUE)
  for (i in 2:sqrt(n)) 
    if (n %% i == 0) 
      return(FALSE)
  return(TRUE) }

c(IsPrime(2), IsPrime(3), IsPrime(4))

[1]  TRUE  TRUE FALSE

Example 4: IsPrime

What is computed here?
Is output correct?
Write down intermediate values for steps.

IsPrime <-function(n){
  if (n == 2)
    return(TRUE)
  for (i in 2:sqrt(n)) 
    if (n %% i == 0) 
      return(FALSE)
  return(TRUE) }

sqrt(3)

[1] 1.732051

3 %% 2

[1] 1

IsPrime(3)

[1] TRUE

Example 4: IsPrime

What is computed here?
Is output correct?
Write down intermediate values for steps.

IsPrime <-function(n){
  if (n == 2)
    return(TRUE)
  for (i in 2:sqrt(n)) 
    if (n %% i == 0) 
      return(FALSE)
  return(TRUE) }

sqrt(11)

[1] 3.316625

c(11 %% 2, 11 %% 3)

[1] 1 2

IsPrime(11)

[1] TRUE

Example 4: IsPrime

What is computed here?
Is output correct?
Write down intermediate values for steps.

IsPrime <-function(n){
  if (n == 2)
    return(TRUE)
  for (i in 2:sqrt(n)) 
    if (n %% i == 0) 
      return(FALSE)
  return(TRUE) }

sqrt(12)

[1] 3.464102

c(12 %% 2, 12 %% 3)

[1] 0 0

IsPrime(12)

[1] FALSE

Example 4: IsPrime

What is computed here?
Is output correct?
Write down intermediate values for steps.

IsPrime <-function(n){
  if (n == 2)
    return(TRUE)
  for (i in 2:sqrt(n)) 
    if (n %% i == 0) 
      return(FALSE)
  return(TRUE) }

sqrt(15)

[1] 3.872983

c(15 %% 2, 15 %% 3)

[1] 1 0

IsPrime(15)

[1] FALSE

Example 5: isPrime

Use cmna program:

library(cmna)

isPrime(3)

[1] TRUE

isPrime(4)

[1] FALSE

Erroneous result:
Only natural numbers can be prime.

pi

[1] 3.141593

isPrime(pi)

[1] TRUE

Discussion

A prime is a natural number \( n \) that is only divisible by 1 and \( n \).
For IsPrime, flops go up with \( \sqrt{n} \).
Big \( O \) notation: flop count is \( O(\sqrt{n}) \)

IsPrime <-function(n){
  if (n == 2)
    return(TRUE)
  for (i in 2:sqrt(n)) 
    if (n %% i == 0) 
      return(FALSE)
  return(TRUE) }

Example 5: Vectorized Count

What is computed here? Is output correct? See next slide.
Write down intermediate values. What is flop count?

VecCount <- function(m) {
  count <- 0
  x <- 1:100
  y <- rep(1,100)
  for (i in 1:m) {
    y <- y*x
    count <- count + 1}
  return(count)}

VecCount(10)

[1] 10

Example 5: Vectorized Count

Intermediate commands:

(x <- 1:5)

[1] 1 2 3 4 5

(y <- rep(1,5))

[1] 1 1 1 1 1

(y*x)

[1] 1 2 3 4 5

Example 5: Vectorized Count

Both vectors x,y have length 100.
The y*x is performed by a built-in loop.
100 multiplications performed when computing each y*x.
y*x is computed 10 times.
The flop count is \( O(m^2) \), or 1000 for \( m=10 \).

VecCount <- function(m) {
  count <- 0
  x <- 1:100
  y <- rep(1,100)
  for (i in 1:m) {
    y <- y*x
    count <-count+1}
  return(count)}

VecCount(10)

[1] 10

Discussion

Understanding what is happening inside a loop is critical to estimating how long it will take to execute.
If a loop calls other external functions, like y*x, each of those will have their operations to perform.
It is often the case that we do not know the coding for these external functions, but for commericial packages these are often efficient and accurate.

VecCount <- function(m) {
  count <- 0
  c(x <- 1:100, y <- rep(1,100))
  for (i in 1:m) {
    y <- y*x
    count <- count + 1}
  return(count)}

Discussion

R is not a fast programming language.
Like MATLAB and Python, it is an interpreted programming language that does not run directly on hardware, and instead is translated and executed by the language's runtime enginge.
FORTRAN and C are faster languages, as they execute as machine code directly on the hardware.
The convenience of R, Python and MATLAB in terms of packages and libraries make them versatile and useful for most applications.
Estimating the number of operations needed is language neutral, so flop counts are independent of language used.