Week 2

Intro to Algo Lecture 2-1

Sorting
Insertion sort \(O\)(n²)
Merge sort \(O\)(n log n) by divide and conquer

Analyze complexity of recursions
By expansion: the recursion tree method
By induction: the substitution method

In-place sort - the sorted items occupy the same storage as the original input
Out-of-place sort - the sorting algo uses extra space to do the sorting

INSERTIONSORT(A)
START
  FOR i = 1 to n - 1
    j <-- i // Separate index, so the FOR counter is not modified
    WHILE j > 0 AND A[j] < A[j-1]
        swap A[j] with A[j-1]
        j -= 1
    ENDWHILE
  ENDFOR
END

SWAP(A, i)
START
  temp <-- A[i]
  A[i] <-- A[i-1]
  A[i-1] <-- temp
END
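
The pseudocode above can be sketched in Python as follows. This is a minimal sketch, assuming a 0-indexed list; the function name `insertion_sort` and the returned swap count are additions for illustration and are not part of the lecture. The swap count makes the worst case visible: a reversed input of size n needs n(n-1)/2 swaps.

```python
def insertion_sort(items):
    """Sort `items` in place (ascending) and return the number of swaps made."""
    swaps = 0
    for i in range(1, len(items)):
        j = i  # separate index so the outer loop counter is not disturbed
        # Move the new element left until it is no smaller than its left neighbor.
        while j > 0 and items[j] < items[j - 1]:
            items[j], items[j - 1] = items[j - 1], items[j]
            swaps += 1
            j -= 1
    return swaps


data = [5, 4, 3, 2, 1]          # reversed input: the worst case
swap_count = insertion_sort(data)
print(data, swap_count)
```

For the reversed list of 5 elements the swap count equals 5 * 4 / 2 = 10, matching the worst-case formula.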

Worst-case running time \(T\)(n) on an input of size n

\(T\)(n) = \(\frac{n(n-1)}{2}\) = \(\Theta\)(n²)

Divide and conquer solution
- Divide input into parts (smaller problems)
- Conquer (solve) each part recursively
- Combine results to obtain solution of original

\(T\)(n) = divide time [\(\Theta\)(n)]
+ \(T\)(n₁) + \(T\)(n₂) + … + \(T\)(n_k) [2\(T\)(n/2)]
+ combine time [\(\Theta\)(n)]

\(T\)(n) = 2\(T\)(n/2) + \(\Theta\)(n)

MergeSort(A)
START
IF n = 1 THEN
  done -> return
ENDIF
RECURSIVE_SORT A[:n/2] --> L
RECURSIVE_SORT A[n/2:] --> R
merge L & R --> output A'
END

MERGESORT(a)
START
  IF n <= 1 THEN
    RETURN a
  ENDIF

  l1 <-- a[0] ... a[n/2 - 1]
  l2 <-- a[n/2] ... a[n - 1]

  l1 <-- call: MERGESORT(l1)
  l2 <-- call: MERGESORT(l2)

  RETURN MERGE(l1, l2)
END

MERGE(a, b)
START
  c <-- []
  WHILE a and b have elements
    IF a[0] > b[0] THEN
      add b[0] to the end of c
      remove b[0] from b
    ELSE
      add a[0] to the end of c
      remove a[0] from a
    ENDIF
  ENDWHILE
     
  WHILE a has elements
    add a[0] to the end of c
    remove a[0] from a
  ENDWHILE
  
  WHILE b has elements
    add b[0] to the end of c
    remove b[0] from b
  ENDWHILE
  
  RETURN c
END
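
A minimal Python sketch of MERGESORT and MERGE follows. It assumes 0-indexed lists and returns a new list (an out-of-place sort). One deliberate difference from the pseudocode: instead of removing the first element of `a` or `b`, it advances two indices, because removing from the front of a Python list costs linear time. The names `merge` and `merge_sort` are chosen here for illustration.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    # Take the smaller head element until one list is used up.
    while i < len(left) and j < len(right):
        if left[i] > right[j]:
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    # At most one of these two slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a sorted copy of `items` using divide and conquer."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))
```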

Intro to Algo Lecture 2-2

The recursion tree: \(T\)(n) = sum of the costs of all nodes

\(T\)(n) = 2\(T\)(n/2) + cn
Sum each level, then sum all levels: \(T\)(n) = n\(T\)(1) + cn log n = \(\Theta\)(n log n)

\(T\)(n) = \(T\)(n/2) + cn
\(T\)(n) = cn + cn/2 + cn/4 + … + 1 = cn (1 + ½ + ¼ + … + 1/2^(log n)) <= cn (2) = \(\Theta\)(n)
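
The two closed forms can be checked numerically. The sketch below assumes c = 1, \(T\)(1) = 1 and n a power of two; these constants are assumptions made only for the check, as are the function names.

```python
def t_two_halves(n):
    """T(n) = 2 T(n/2) + n with T(1) = 1, for n a power of two."""
    if n <= 1:
        return 1
    return 2 * t_two_halves(n // 2) + n


def t_one_half(n):
    """T(n) = T(n/2) + n with T(1) = 1, for n a power of two."""
    if n <= 1:
        return 1
    return t_one_half(n // 2) + n


for k in (4, 8, 10):
    n = 2 ** k
    # The first grows like n log n, the second stays below 2n.
    print(n, t_two_halves(n), n + n * k, t_one_half(n), 2 * n)
```

Under these assumptions the first recurrence equals n + n log₂(n) exactly, and the second equals 2n - 1, which agrees with the \(\Theta\)(n log n) and \(\Theta\)(n) results.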

Master Theorem

If the sum of the values in each level is the same, total value = \(\Theta\)(no. of levels * sum of each lvl)
If each level’s value is geometrically decreasing, total value = \(\Theta\)(root node’s value)
If each level’s value is geometrically increasing, total value = \(\Theta\)(no. of leaves)

\(T\)(n) = a \(T\)(n / b) + f(n)

No. of lvl = log_b(n) = \(\Theta\)(log n)
No. of leaves L = a^log_b(n) = n^log_b(a)

Case 1: value of each lvl is geometrically increasing down the tree

If f(n) = \(O\)(L^{1-\(\epsilon\)}) = \(O\)(n^{log_b(a)-\(\epsilon\)}) for some \(\epsilon\) > 0, then
\(T\)(n) = \(\Theta\)(L) = \(\Theta\)(n^log_b(a))
Note: the no. of leaves is dominant

Example with f(n) = log n; cost per node: log(n) –> log(n/2) –> log(n/4) –> … –> log(n / 2^k) –> c

The no. of leaves increases faster down the tree (dominant)
f(n) = \(O\)(L^{1-\(\epsilon\)}) = \(O\)(n^{log_b(a)-\(\epsilon\)})
T(n) = \(\Theta\)(n^log_b(a)) = \(\Theta\)(n) when log_b(a) = 1 (a = b)
Case 2: value of each lvl is equal, so total = Height \(\times\) Leaves

If f(n) = \(\Theta\)(L) = \(\Theta\)(n^log_b(a)), then
T(n) = \(\Theta\)(L log n) = \(\Theta\)(n^log_b(a) log n), which is \(\Theta\)(n log n) when log_b(a) = 1 (a = b)

Example with f(n) = n; cost per node: n –> n/2 –> n/4 –> … –> n/2^k –> c
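Case 3: value of each lvl is geometrically decreasing down the tree

If f(n) = \(\Omega\)(L^{1+\(\epsilon\)}) = \(\Omega\)(n^{log_b(a)+\(\epsilon\)}) for some \(\epsilon\) > 0, and af(n/b) <= cf(n) for some c < 1 (regularity condition), then
\(T\)(n) = \(\Theta\)(f(n))
Note: the root node is dominant

Example with f(n) = n²; cost per node: n² –> (n/2)² –> (n/4)² –> … –> (n/2^k)² –> c
n² is significantly bigger than n log n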
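
The three cases can be sketched as a small classifier. This is only an illustration under a stated restriction: it assumes f(n) = \(\Theta\)(n^k) for a constant k, so it does not cover driving functions such as n log n or 2ⁿ. The function name and the return format are hypothetical choices for this sketch.

```python
import math


def master_theorem_poly(a, b, k):
    """Classify T(n) = a*T(n/b) + Theta(n^k).

    Returns (case, exponent, has_log_factor), meaning
    T(n) = Theta(n^exponent), times log n if has_log_factor is True.
    """
    if a < 1 or b <= 1:
        raise ValueError("need a >= 1 and b > 1")
    critical = math.log(a, b)  # log_b(a), the exponent of the no. of leaves
    if math.isclose(k, critical, abs_tol=1e-12):
        return 2, k, True          # every level has the same value
    if k < critical:
        return 1, critical, False  # leaves dominate
    # Root dominates; for f(n) = n^k the regularity condition holds
    # because a * (n/b)^k = (a / b^k) * n^k with a / b^k < 1.
    return 3, k, False


print(master_theorem_poly(2, 2, 1))  # merge sort recurrence
print(master_theorem_poly(7, 3, 2))
```

For merge sort (a = 2, b = 2, k = 1) this reports Case 2 with exponent 1 and a log factor, i.e. \(\Theta\)(n log n).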

Exercise 1 (Case 1)

\(T\)(n) = 9\(T\)(n/3) + n

a = 9, b = 3, f(n) = n
height = log₃(n); leaves = 9^log₃(n) = n^log₃(9) = \(\Theta\)(n²)
Note: height = log_b(n) (b^height = n), #leaves = a^log_b(n) = n^log_b(a)

Since f(n) = n = \(O\)(n^{log₃(9)-\(\epsilon\)}) where \(\epsilon\) = 1, \(T\)(n) = \(\Theta\)(#leaves) = \(\Theta\)(n²)

Exercise 2 (Case 2)

\(T\)(n) = \(T\)(2n/3) + \(\Theta\)(1)

a = 1, b = 3/2, f(n) = 1
height = log_(3/2)(n); leaves = 1^log_(3/2)(n) = n^log_(3/2)(1) = n⁰ = \(\Theta\)(1)

f(n) = \(\Theta\)(n^log_(3/2)(1)) = \(\Theta\)(1)

\(T\)(n) = \(\Theta\)(n^log_(3/2)(1) log n) = \(\Theta\)(log n)

Exercise 3 (Case 3)

\(T\)(n) = 3\(T\)(n/4) + n log n

a = 3, b = 4, f(n) = n log n
height = log₄(n); #leaves = 3^{log₄(n)} = n^log₄(3) \(\approx\) n^0.793

f(n) = \(\Omega\)(n^{log₄(3)+\(\epsilon\)}) where \(\epsilon\) \(\approx\) 0.2
AND af(n/b) = 3f(n/4) = (3/4)n log(n/4) <= cn log n, for c = 3/4 (regularity condition)
The regularity condition checks that the value is decreasing down the tree: it holds since c < 1 and cn log n dominates (3/4)n log(n/4)
\(T\)(n) = \(\Theta\)(f(n)) = \(\Theta\)(n log n)

Exercise 4 (Case 2)

\(T\)(n) = 4\(T\)(n/2) + n²
a = 4, b = 2, f(n) = n²
height = log₂(n), #leaves = 4^log₂(n) = n^log₂(4) = n²

f(n) = \(\Theta\)(n²)
\(T\)(n) = \(\Theta\)(n² log n)

Exercise 5 (Case 3)

\(T\)(n) = 7\(T\)(n/3) + \(\Theta\)(n²)
a = 7, b = 3, f(n) = n²
height = log₃(n), #leaves = 7^log₃(n) = n^log₃(7)

f(n) = \(\Omega\)(n^{log₃(7)+\(\epsilon\)}), where \(\epsilon\) \(\approx\) 0.229

Since af(n/b) = 7f(n/3) = 7(n/3)² = (7/9)n² <= cn², for c = 7/9 (regularity condition)

\(T\)(n) = \(\Theta\)(n²)

Exercise 6 (Unable to resolve using Master Theorem, use the recursion tree)

\(T\)(n) = \(T\)(n - 1) + 2
Cost per level: 2 –> 2 –> … –> 2
Problem size: n –> n-1 –> … –> 1, so there are n levels and the total is 2(n); \(T\)(n) = \(\Theta\)(n)
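
The same result follows by unrolling the recurrence step by step. The derivation below assumes \(T\)(1) is a constant, which the notes do not state explicitly.

```latex
\begin{align*}
T(n) &= T(n-1) + 2 \\
     &= T(n-2) + 2 + 2 \\
     &= T(n-3) + 2 + 2 + 2 \\
     &= \dots \\
     &= T(1) + 2(n-1) \\
     &= \Theta(n)
\end{align*}
```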

Intro to Algo Cohort 2-3

Peak Finding Problem

Neighbors <= Peak element

10, 13, 5, 8, 3, 2, 1

PEAKFINDER(A)
START
  IF A.length == 0 THEN // If the list is empty, there is no peak
    RETURN NIL
  ENDIF

  IF A.length == 1 THEN // If list has been reduced to 1 element, it's a peak
    RETURN A[0]
  ENDIF

  mid = floor(A.length / 2) // mid >= 1 here, so A[mid - 1] always exists

  left = mid - 1
  right = mid + 1

  IF A[mid] < A[left] THEN
    RETURN PEAKFINDER(A[:mid])
  ELSE IF right < A.length AND A[mid] < A[right] THEN
    RETURN PEAKFINDER(A[mid+1:])
  ELSE // A[mid] is not smaller than any neighbor it has
    RETURN A[mid]
  ENDIF
END

An element A[i] is a peak if it is not smaller than any of its neighbor(s)

IF i = 1 THEN
  A[1] >= A[2]
ELSE IF i = n THEN
  A[n] >= A[n-1]
ELSE
  A[i] >= A[i-1] and A[i] >= A[i+1]
ENDIF

13 & 8 are peaks

Algo 1
Scan the array from left to right
Compare A[i] with its neighbors
Exit when found a peak

1, 2, 4, 8, 9, 12, 21

Complexity:
In the worst case (e.g. the increasing array above) all elements are scanned, so \(T\)(n) = \(\Theta\)(n)
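
Algo 1 can be sketched in Python as below. It assumes a 0-indexed list and returns the index of the first peak found (or `None` for an empty list); the function name is chosen for illustration.

```python
def find_peak_linear(values):
    """Return the index of the first peak, scanning left to right."""
    n = len(values)
    for i in range(n):
        # A missing neighbor (at either end) never disqualifies an element.
        left_ok = i == 0 or values[i] >= values[i - 1]
        right_ok = i == n - 1 or values[i] >= values[i + 1]
        if left_ok and right_ok:
            return i
    return None  # only reached for an empty list


print(find_peak_linear([10, 13, 5, 8, 3, 2, 1]))
print(find_peak_linear([1, 2, 4, 8, 9, 12, 21]))
```

On the increasing array the scan only stops at the last element, which is the worst case for this approach.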

Algo 2
Consider the middle element of the array & compare with neighbors

IF A[n/2-1] > A[n/2] THEN
  search for a peak among A[1]... A[n/2  - 1]
ELSE IF A[n/2] < A[n/2 + 1]
  search for a peak among A[n/2 + 1]...A[n]
ELSE
  A[n/2] is a peak! //since A[n/2 -1] <= A[n/2] and A[n/2] >= A[n/2 + 1]
ENDIF

Complexity
\(T\)(n) = \(T\)(n/2) + \(O\)(1) //Time for comparing A[n/2] with 2 neighbors
\(T\)(n) <= \(T\)(n/2) + c <= (\(T\)(n/2²) + c) + c <= … <= \(T\)(n/2^log₂(n)) + c + c + … + c = \(T\)(1) + c log₂(n) = \(O\)(log n)

\(T\)(n) is the time to find a peak in an array of length n: \(O\)(1) time to compare the middle element with its neighbors, plus \(T\)(n/2) to search the half of the array that is kept
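
A minimal Python sketch of Algo 2 follows. It assumes a 0-indexed list and, rather than slicing the array, keeps a window `lo..hi` of candidate indices; that windowing is a choice made here, not something the lecture prescribes. A neighbor outside the window is not compared again, because the step that excluded it already showed it is smaller than the adjacent element inside the window.

```python
def find_peak(values):
    """Return the index of some peak in O(log n) comparisons, or None if empty."""
    if not values:
        return None
    lo, hi = 0, len(values) - 1
    while True:
        mid = (lo + hi) // 2
        if mid > lo and values[mid - 1] > values[mid]:
            hi = mid - 1   # a peak exists among values[lo..mid-1]
        elif mid < hi and values[mid] < values[mid + 1]:
            lo = mid + 1   # a peak exists among values[mid+1..hi]
        else:
            return mid     # not smaller than either neighbor


print(find_peak([10, 13, 5, 8, 3, 2, 1]))
print(find_peak([1, 2, 4, 8, 9, 12, 21]))
```

Note that it may return a different peak than the linear scan: on the first array it lands on index 3 (value 8) instead of index 1 (value 13), which is fine because the problem asks for any peak.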

Divide and Conquer
Very powerful design tool:
Divide input into multiple disjoint parts
Conquer each of the parts separately (using recursive call)

Peak finding: 2D
Consider a square 2D array A[1…n, 1…n]:
An element A[i, j] is a 2D peak if it is not smaller than any of its (at most 4) neighbors.
Problem: find any 2D peak

Algo 1: brute-force method (i.e. search all squares)
Complexity = \(\Theta\)(n²)

Algo 2:
a) For each col j, find the global maximum B[j]: complexity = \(\Theta\)(n) per col, \(\Theta\)(n²) for all n cols
b) Apply the 1D peak finder to find a peak of B[1…n] (say B[j]): complexity = \(O\)(log n)

Complexity = \(\Theta\)(n² + log n) = \(\Theta\)(n²)
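
Steps a) and b) can be sketched as below, assuming a non-empty rectangular grid stored as a list of rows. A peak of B is a 2D peak because B[j] is the largest value in its own col and is not smaller than B[j-1] and B[j+1], which bound every value in the adjacent cols. The helper names are hypothetical.

```python
def peak_index_1d(values):
    """Index of some peak of a non-empty list (divide and conquer)."""
    lo, hi = 0, len(values) - 1
    while True:
        mid = (lo + hi) // 2
        if mid > lo and values[mid - 1] > values[mid]:
            hi = mid - 1
        elif mid < hi and values[mid] < values[mid + 1]:
            lo = mid + 1
        else:
            return mid


def find_peak_2d_column_max(grid):
    """Return (row, col) of a 2D peak using the column maxima B[j]."""
    rows, cols = len(grid), len(grid[0])
    # a) For each col, remember the row holding its maximum: Theta(n^2) in total.
    best_row = [max(range(rows), key=lambda r: grid[r][c]) for c in range(cols)]
    col_max = [grid[best_row[c]][c] for c in range(cols)]
    # b) Any 1D peak of the column maxima is a 2D peak of the grid.
    j = peak_index_1d(col_max)
    return best_row[j], j


grid = [
    [1, 2, 3],
    [4, 9, 5],
    [6, 7, 8],
]
print(find_peak_2d_column_max(grid))
```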

Algo 3:

Pick the middle col j = n/2
Find the global max on col j, at A[i, j]
Compare A[i, n/2] to A[i, n/2 - 1] and A[i, n/2 + 1]

IF A[i, n/2] < A[i, n/2-1] THEN
  recurse on the left cols j = [1...n/2-1]
ELSE IF A[i, n/2] < A[i, n/2+1]
  recurse on the right cols j = [n/2+1...n]
ELSE // A[i, n/2] >= A[i, n/2-1] AND A[i, n/2] >= A[i, n/2+1]
  A[i, n/2] is a 2D peak
ENDIF

PEAKFINDER(A, rows, lo, hi) // Search cols lo..hi; first call: PEAKFINDER(A, rows, 0, cols - 1)
START
  mid <-- floor((lo + hi) / 2)
  max, max_index <-- call: FINDMAX(A, rows, mid)

  IF mid > lo AND max < A[max_index][mid - 1] THEN // If max is less than left
    RETURN PEAKFINDER(A, rows, lo, mid - 1)
  ELSE IF mid < hi AND max < A[max_index][mid + 1] THEN // If max is less than right
    RETURN PEAKFINDER(A, rows, mid + 1, hi)
  ELSE // max is not less than left or right (a col outside lo..hi was already beaten by its neighbor inside)
    RETURN max
  ENDIF
END

FINDMAX(A, rows, mid)
START
  max <-- A[0][mid]
  max_index <-- 0
  FOR i = 1 to rows - 1
    IF max < A[i][mid] THEN // Find global max on col mid, at [i, mid]
      max <-- A[i][mid]
      max_index <-- i
    ENDIF
  ENDFOR
  RETURN max, max_index
END

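Complexity: \(T\)(n, m) = \(T\)(n, m/2) + \(\Theta\)(n) for n rows and m cols, where \(\Theta\)(n) is the cost of finding the maximum in a col
The \(\Theta\)(n) term does not shrink, since only the no. of cols is halved:
\(T\)(n, m) = \(T\)(n, m/2) + \(\Theta\)(n) = \(T\)(n, m/4) + 2 * \(\Theta\)(n) = \(T\)(n, m/8) + 3 * \(\Theta\)(n)
= … = log(m) \(\Theta\)(n) = \(\Theta\)(n log m)

For a square array (m = n): \(\Theta\)(n log n)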
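
A minimal Python sketch of Algo 3 follows, assuming a non-empty rectangular grid stored as a list of rows. It keeps a window `lo..hi` of candidate cols and halves it on every step; the function name and the returned (row, col) pair are choices made for this sketch.

```python
def find_peak_2d(grid):
    """Return (row, col) of some 2D peak in O(n log m) for n rows, m cols."""
    rows = len(grid)
    lo, hi = 0, len(grid[0]) - 1
    while True:
        mid = (lo + hi) // 2
        # Theta(n): row of the global maximum of the middle col.
        i = max(range(rows), key=lambda r: grid[r][mid])
        value = grid[i][mid]
        if mid > lo and value < grid[i][mid - 1]:
            hi = mid - 1   # keep the left cols
        elif mid < hi and value < grid[i][mid + 1]:
            lo = mid + 1   # keep the right cols
        else:
            return i, mid  # not smaller than left, right, up or down


grid = [
    [10, 8, 10, 10],
    [14, 13, 12, 11],
    [15, 9, 11, 21],
    [16, 17, 19, 20],
]
print(find_peak_2d(grid))
```

On this grid the search visits cols 1, 2 and 3 in turn and stops at the value 21, whose neighbors are 11, 20 and 11.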

\(T\)(n) = 3\(T\)(n/2) + n²
a = 3, b = 2, f(n) = n²
height = log₂(n), #leaves = 3^log₂(n) = n^log₂(3) \(\approx\) n^1.585

Compare n^log₂(3) and n²: n² dominates (Case 3)
Complexity = \(\Theta\)(n²)

\(T\)(n) = 9\(T\)(n/3) + n²
a = 9, b = 3, f(n) = n²
height = log₃(n), #leaves = 9^log₃(n) = n^log₃(9) = n²

f(n) = \(\Theta\)(n²) matches #leaves (Case 2)
Complexity = \(\Theta\)(n² log n)

\(T\)(n) = \(T\)(n/2) + 2ⁿ
a = 1, b = 2, f(n) = 2ⁿ
height = log₂(n), #leaves = n^log₂(1) = 1

Compare 2ⁿ and 1: 2ⁿ dominates (Case 3)
Complexity = \(\Theta\)(2ⁿ)

Check by expansion:
\(T\)(n) = 2ⁿ + 2^(n/2) + 2^(n/4) + … + 2 + 1
Since 2^k > 2^(k-1) + 2^(k-2) + … + 2 + 1 = (2^k) - 1, the terms after 2ⁿ sum to less than 2ⁿ
2ⁿ <= \(T\)(n) < 2ⁿ + 2ⁿ = 2 * 2ⁿ, so \(T\)(n) = \(\Theta\)(2ⁿ)