Ch1.1.3 Efficiency

Efficiency

  • Programmers work hard to design efficient methods.
  • Every gain in efficiency enables more interesting problems to be solved.
  • Programs are run so many times that increased efficiency decreases the overall runtime as well.
  • Accuracy and speed depend on fewer steps along the way.

Two Areas of Efficiency

  • Time complexity focuses on how long it takes for a program to execute.
  • Programs that require more steps for larger input have higher time complexity.
  • Space complexity measures how much storage, even intermediate, is necessary to execute a program.
  • Programs that require more memory for larger inputs are said to have more space complexity.
  • Counting floating point operations (flops) is a simple way to address some of these ideas.

Example 1: Count1

  • This program counts number of integers between 1 and m.
  • Is the output what we expect?
  • Write down the values of i and the values of count for each i in the for loop. How many flops are performed?
count1 <- function(m){
  count <- 0
  for (i in 1:m)
    count <- count + 1
  return(count)}
count1(10)
[1] 10

Example 1: Count2

  • This version of the previous program shows the values of i in the for loop along with the value of count for each i.
  • The cat command provides a format for listing the output.
  • The \n provides a new line for the listing.
  • Useful for debugging and determining flop count.
count2 <- function(m) {
  count <- 0
  for (i in 1:m) {
    count <- count + 1
    cat("i = ",count,",","count = ",count,"\n")
    }
  }

Example 1: Count2

  • Is end result correct?
  • Are intermediate steps correct?
  • What is flop count?
count2 <- function(m){
count <- 0
for (i in 1:m){
count <- count + 1
cat("i = ",count,",","count = ",count,"\n")}
}
count2(10)
i =  1 , count =  1 
i =  2 , count =  2 
i =  3 , count =  3 
i =  4 , count =  4 
i =  5 , count =  5 
i =  6 , count =  6 
i =  7 , count =  7 
i =  8 , count =  8 
i =  9 , count =  9 
i =  10 , count =  10 

Example 2: Count3

  • What is computed here? Is the output correct?
  • Write down the intermediate values of the indices m, n, i, j.
  • How many flops are performed?
count3 <- function(m,n) {
  count <- 0
  for (i in 1:m) {
    for (j in 1:n) {
    count <- count + 1  }}
  return(count)}
count3(3,2)
[1] 6

Example 2: Count4

  • This version of the previous program shows the intermediate values of the indices m, n, i, j as they appear when running through the for loops.
count4 <- function(m,n) {
  count <- 0
  cat("i = ", 0,",","j = ", 0,",", "count = ", count, "\n")
  for (i in 1:m) {
    for (j in 1:n) {
      count <- count + 1
  cat("i = ", i,",","j = ", j,",", "count = ", count, "\n")}}
}

Example 2: Count4

  • Is end result correct?
  • Are intermediate steps correct?
  • What is flop count?
count4(3,2)
i =  0 , j =  0 , count =  0 
i =  1 , j =  1 , count =  1 
i =  1 , j =  2 , count =  2 
i =  2 , j =  1 , count =  3 
i =  2 , j =  2 , count =  4 
i =  3 , j =  1 , count =  5 
i =  3 , j =  2 , count =  6 

Example 2: Count4

count4(4,3)
i =  0 , j =  0 , count =  0 
i =  1 , j =  1 , count =  1 
i =  1 , j =  2 , count =  2 
i =  1 , j =  3 , count =  3 
i =  2 , j =  1 , count =  4 
i =  2 , j =  2 , count =  5 
i =  2 , j =  3 , count =  6 
i =  3 , j =  1 , count =  7 
i =  3 , j =  2 , count =  8 
i =  3 , j =  3 , count =  9 
i =  4 , j =  1 , count =  10 
i =  4 , j =  2 , count =  11 
i =  4 , j =  3 , count =  12 

Example 3: Count5

  • What is computed here? Is the output correct?
  • Write down the intermediate values of the indices m, n, i, j.
  • How many flops are performed?
count5 <- function(m) {
  count <- 0
  for (i in 1:m) {
    for (j in 1:i) {
    count <- count + 1  }}
  return(count)}
count5(3)
[1] 6

Example 3: Count6

  • This version of the previous program shows the intermediate values of the indices m, n, i, j as they appear when running through the for loops.
count6 <- function(m) {
  count <- 0
  cat("i = ", 0,",","j = ", 0,",", "count = ", count, "\n")
  for (i in 1:m) {
    for (j in 1:i) {
      count <- count + 1
  cat("i = ", i,",","j = ", j,",", "count = ", count, "\n")}}
}

Example 3: Count6

  • Is end result correct?
  • Are intermediate steps correct?
  • What is flop count?
count6(3)
i =  0 , j =  0 , count =  0 
i =  1 , j =  1 , count =  1 
i =  2 , j =  1 , count =  2 
i =  2 , j =  2 , count =  3 
i =  3 , j =  1 , count =  4 
i =  3 , j =  2 , count =  5 
i =  3 , j =  3 , count =  6 

Example 3: Count6

count6(4)
i =  0 , j =  0 , count =  0 
i =  1 , j =  1 , count =  1 
i =  2 , j =  1 , count =  2 
i =  2 , j =  2 , count =  3 
i =  3 , j =  1 , count =  4 
i =  3 , j =  2 , count =  5 
i =  3 , j =  3 , count =  6 
i =  4 , j =  1 , count =  7 
i =  4 , j =  2 , count =  8 
i =  4 , j =  3 , count =  9 
i =  4 , j =  4 , count =  10 

Discussion

  • Flops have been going up with \( n \).
  • Not always the case.
  • In next example, flops go up with \( \sqrt{n} \).
  • Recall a prime is a natural number \( n \) that is only divisible by 1 and \( n \).
  • Here are the first several primes:

\[ 2,3,5,7,11,13,17,... \]

Example 4: IsPrime

  • What is computed here? Is output correct?
  • Write down intermediate values for steps.
IsPrime <-function(n){
  if (n == 2)
    return(TRUE)
  for (i in 2:sqrt(n)) 
    if (n %% i == 0) 
      return(FALSE)
  return(TRUE) }
c(IsPrime(2), IsPrime(3), IsPrime(4))
[1]  TRUE  TRUE FALSE

Example 4: IsPrime

  • What is computed here?
  • Is output correct?
  • Write down intermediate values for steps.
IsPrime <-function(n){
  if (n == 2)
    return(TRUE)
  for (i in 2:sqrt(n)) 
    if (n %% i == 0) 
      return(FALSE)
  return(TRUE) }
sqrt(3)
[1] 1.732051
3 %% 2
[1] 1
IsPrime(3)
[1] TRUE

Example 4: IsPrime

  • What is computed here?
  • Is output correct?
  • Write down intermediate values for steps.
IsPrime <-function(n){
  if (n == 2)
    return(TRUE)
  for (i in 2:sqrt(n)) 
    if (n %% i == 0) 
      return(FALSE)
  return(TRUE) }
sqrt(11)
[1] 3.316625
c(11 %% 2, 11 %% 3)
[1] 1 2
IsPrime(11)
[1] TRUE

Example 4: IsPrime

  • What is computed here?
  • Is output correct?
  • Write down intermediate values for steps.
IsPrime <-function(n){
  if (n == 2)
    return(TRUE)
  for (i in 2:sqrt(n)) 
    if (n %% i == 0) 
      return(FALSE)
  return(TRUE) }
sqrt(12)
[1] 3.464102
c(12 %% 2, 12 %% 3)
[1] 0 0
IsPrime(12)
[1] FALSE

Example 4: IsPrime

  • What is computed here?
  • Is output correct?
  • Write down intermediate values for steps.
IsPrime <-function(n){
  if (n == 2)
    return(TRUE)
  for (i in 2:sqrt(n)) 
    if (n %% i == 0) 
      return(FALSE)
  return(TRUE) }
sqrt(15)
[1] 3.872983
c(15 %% 2, 15 %% 3)
[1] 1 0
IsPrime(15)
[1] FALSE

Example 5: isPrime

  • Use cmna program:
library(cmna)
isPrime(3)
[1] TRUE
isPrime(4)
[1] FALSE
  • Erroneous result:
  • Only natural numbers can be prime.
pi
[1] 3.141593
isPrime(pi)
[1] TRUE

Discussion

  • A prime is a natural number \( n \) that is only divisible by 1 and \( n \).
  • For IsPrime, flops go up with \( \sqrt{n} \).
  • Big \( O \) notation: flop count is \( O(\sqrt{n}) \)
IsPrime <-function(n){
  if (n == 2)
    return(TRUE)
  for (i in 2:sqrt(n)) 
    if (n %% i == 0) 
      return(FALSE)
  return(TRUE) }

Example 5: Vectorized Count

  • What is computed here? Is output correct? See next slide.
  • Write down intermediate values. What is flop count?
VecCount <- function(m) {
  count <- 0
  x <- 1:100
  y <- rep(1,100)
  for (i in 1:m) {
    y <- y*x
    count <- count + 1}
  return(count)}
VecCount(10)
[1] 10

Example 5: Vectorized Count

  • Intermediate commands:
(x <- 1:5)
[1] 1 2 3 4 5
(y <- rep(1,5))
[1] 1 1 1 1 1
(y*x)
[1] 1 2 3 4 5

Example 5: Vectorized Count

  • Both vectors x,y have length 100.
  • The y*x is performed by a built-in loop.
  • 100 multiplications performed when computing each y*x.
  • y*x is computed 10 times.
  • The flop count is \( O(m^2) \), or 1000 for \( m=10 \).
VecCount <- function(m) {
  count <- 0
  x <- 1:100
  y <- rep(1,100)
  for (i in 1:m) {
    y <- y*x
    count <-count+1}
  return(count)}
VecCount(10)
[1] 10

Discussion

  • Understanding what is happening inside a loop is critical to estimating how long it will take to execute.
  • If a loop calls other external functions, like y*x, each of those will have their operations to perform.
  • It is often the case that we do not know the coding for these external functions, but for commericial packages these are often efficient and accurate.
VecCount <- function(m) {
  count <- 0
  c(x <- 1:100, y <- rep(1,100))
  for (i in 1:m) {
    y <- y*x
    count <- count + 1}
  return(count)}

Discussion

  • R is not a fast programming language.
  • Like MATLAB and Python, it is an interpreted programming language that does not run directly on hardware, and instead is translated and executed by the language's runtime enginge.
  • FORTRAN and C are faster languages, as they execute as machine code directly on the hardware.
  • The convenience of R, Python and MATLAB in terms of packages and libraries make them versatile and useful for most applications.
  • Estimating the number of operations needed is language neutral, so flop counts are independent of language used.