Recursion can become very inefficient if a recursive call itself spawns more than one recursive call
Divide and Conquer Power Function::: Efficient
X^N = (X^(N/2))^2       N even, N > 0
X^N = X * (X^(N/2))^2   N odd,  N > 0  (integer division N/2)
X^N = 1                 N = 0
Run time stack depth is lg N
1 - 2 multiplications per level: one for the square (always), one for the odd N factor (sometimes)
Spawns 1 recursive call, which spawns 1 recursive call, and so on
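A minimal Java sketch of the efficient version (method name and types are illustrative, not from the notes):

```java
// Efficient divide-and-conquer power: one recursive call per level.
public static long pow(long x, int n) {
    if (n == 0) return 1;                 // base case: X^0 = 1
    long half = pow(x, n / 2);            // the single recursive call (integer division)
    long square = half * half;            // one multiplication for the square (always)
    return (n % 2 == 0) ? square : x * square;  // one extra multiplication when N is odd
}
```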
Divide and Conquer Power Function::: Inefficient
X^N = (X^(N/2)) * (X^(N/2))       N even, N > 0
X^N = X * (X^(N/2)) * (X^(N/2))   N odd,  N > 0
X^N = 1                           N = 0
Mathematically similar, but programming-wise different
2 - 3 multiplications per level, but X^(N/2) must be recursively computed twice
2 recursive calls for each original call: 2 calls spawn 2 calls spawn 2 calls and so on
O(N) multiplications total — same as the non divide and conquer version
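For contrast, a sketch of the inefficient version (same math, but with the recursive call written out twice):

```java
// Inefficient version: the repeated recursive call turns the lgN-level
// chain into a full execution tree with O(N) multiplications.
public static long powSlow(long x, int n) {
    if (n == 0) return 1;
    long product = powSlow(x, n / 2) * powSlow(x, n / 2);  // two recursive calls
    return (n % 2 == 0) ? product : x * product;
}
```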
Efficient Version::: Each recursive call leads to 1 recursive call… one recursive call at each level
Inefficient Version::: Spawns 2 recursive calls per recursive call, which increases the number of recursive calls exponentially
Stack depth is lg N in both versions; in the efficient version the total number of calls is also just lg N
We defined runtime in terms of multiplications… but multiplying integers depends on the number of bits in the integers. For integers with an arbitrary number of bits, a multiplication is O(N) as a function of the number of bits, and as we multiply, the results may need more bits
Often we take each multiplication mod some value and keep the remainder to save space. Known as power mod… this also bounds the answer by the int you modulo it with
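A sketch of power mod built on the same recursion (assumes the modulus is small enough that (m-1)^2 fits in a long):

```java
// Modular exponentiation: reduce mod m after every multiplication so
// intermediate results stay bounded by m.
public static long powMod(long x, int n, long m) {
    if (n == 0) return 1 % m;             // base case (1 % m handles m == 1)
    long half = powMod(x, n / 2, m);
    long square = (half * half) % m;      // square, then take the remainder
    return (n % 2 == 0) ? square : ((x % m) * square) % m;
}
```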
Binary search - data in sorted order
How to implement recursion::: 1) How to divide the problem to make a recursive call - simply divide the array in half (a logical division) 2) How to reassemble the data - trivial… just return
Base Case(s)::: 1) Base case, not found - logical array size is down to 0 2) Base case, found - key matches current item
Recursive Case::: Consider the middle item M in each recursive call, compared to the search key S 1) M == S, item has been found - second base case, found the key 2) M < S, return BS of right side of array 3) M > S, return BS of left side of array
Returning the recursive binary search call on the left or right half propagates back the position where the key was found on that side
Cuts problem in half with each recursive call
Recursive and Iterative Binary Search will always have the same number of comparisons
Base Case::: low > high, nothing left in the array, it is empty, return -1
Not Base Case::: Find the middle item and compare it to the key If middle < key, recurse on the right side If middle > key, recurse on the left side If middle == key, base case found
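Putting the cases together, a minimal recursive binary search sketch (names are illustrative):

```java
// Returns the index of key in the sorted array a, or -1 if not found.
public static int binarySearch(int[] a, int key, int low, int high) {
    if (low > high) return -1;                       // base case: empty range, not found
    int mid = (low + high) / 2;                      // middle of the logical array
    if (a[mid] == key) return mid;                   // base case: key found
    if (a[mid] < key)                                // key can only be on the right side
        return binarySearch(a, key, mid + 1, high);
    return binarySearch(a, key, low, mid - 1);       // key can only be on the left side
}
```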
Tail recursion: a recursive algorithm that makes the recursive call the last thing in the method This indicates we can do the same thing iteratively
Conversion to iterative can be favorable because it saves on overhead: saves on space and saves on time
Converting tail recursion to iteration is generally in our best interest
When is recursion beneficial::: 1) For some problems, recursion is simply better 2) Some problems are difficult to do iteratively
Go down a path to a solution until you realize that path won't go anywhere Backtrack to a point where we can go forward again i.e. a maze
Each move forward is a recursive call
Sometimes have to make a choice how to proceed
If stuck, backtrack to the previous call, change the choice, and try to move forward again
Either find a solution or run out of choices
Problem: Place 8 Queens on a chess board such that no queen can take another in the next turn
Can be solved using recursion and backtracking
Each queen must be in a different row and different column - each row and each column must have exactly 1 queen
Step 1: Place 1 queen
Step 2: Recursively place the remaining 7 queens on the rest of the board
Backtracking is used if the initial choice cannot lead to a satisfiable solution
One queen in each column One queen in each row At most one queen in each diagonal
Place a queen, and if it doesn't lead to a solution, perform backtracking.
Each recursive call attempts to place one queen in a specific column… then iteratively tries each row in that column
Need variables to keep track of the board… which squares are free and which would threaten the placement of the new queen
Recursively… initially try column 0… if we legally place a queen in column 7, then we have solved the problem, because this indicates we've placed one queen in each column
If all rows in column K have been tried and failed, the call terminates and backtracks to the previous call (K-1). The queen is removed from where it was placed in column K-1 and placed in a different row of that column…. if this is still unsuccessful, we continue backtracking until we can move forward to a new column again
See JRQueens.java for reference on how to do this – the trycol method Note the variables needed for this method
Why is this difficult to do iteratively – we need to know info about the previous column … in recursion this is handled with local variables, which the program maintains for itself within the activation record on the runtime stack Col parameter and row parameter… each activation record has its own column and row, so we don't have to mutate the variables by hand every time like we would with iteration
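A hedged sketch in the spirit of trycol (this is not the actual JRQueens.java code; the rowInCol board representation and the safe helper are assumptions):

```java
public class QueensSketch {
    static final int N = 8;
    static int[] rowInCol = new int[N];  // rowInCol[c] = row of the queen placed in column c

    // Try to place a queen in column col and, recursively, in every later column.
    static boolean tryCol(int col) {
        if (col == N) return true;                // queens sit legally in columns 0..7: solved
        for (int row = 0; row < N; row++) {       // iteratively try each row in this column
            if (safe(row, col)) {
                rowInCol[col] = row;              // place the queen
                if (tryCol(col + 1)) return true; // recurse on the next column
                // that placement failed downstream: loop tries the next row (backtracking)
            }
        }
        return false;  // all rows failed: return to the caller, i.e. backtrack to col-1
    }

    // A square is safe if no queen in an earlier column shares its row or diagonal.
    static boolean safe(int row, int col) {
        for (int c = 0; c < col; c++) {
            int r = rowInCol[c];
            if (r == row || Math.abs(r - row) == Math.abs(c - col)) return false;
        }
        return true;
    }
}
```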
Ex. find a word in a 2D grid of letters - 16 letters … 4x4 grid … given a word, can the word be found in the grid Rules: each letter must touch the previous letter up, down, left, or right (not diagonally) – can only use each letter once… cannot use the same letter twice
Each recursive call deals with one location on the board; the sequence of activation records on the stack records the path taken so far. If all K letters matched, there are K activation records and we are done
If we reach a position on the board that does not match the letter we are looking for, we backtrack a letter and try a different route
Backtracking is used if none of the options (left, right, up, down) give the next letter
public boolean findWord(int r, int c, String word, int loc, char[][] bo)

findWord(r, c+1, word, loc+1, bo)  // right
findWord(r-1, c, word, loc+1, bo)  // up
findWord(r, c-1, word, loc+1, bo)  // left
findWord(r+1, c, word, loc+1, bo)  // down

The results of the four calls are combined and returned as a boolean
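One way the body might be completed (a sketch; the '#' marker for in-use letters is an assumption, not from the notes):

```java
// Returns true if word.substring(loc) can be traced starting at (r, c).
public boolean findWord(int r, int c, String word, int loc, char[][] bo) {
    if (loc == word.length()) return true;                     // all letters matched
    if (r < 0 || r >= bo.length || c < 0 || c >= bo[0].length) return false; // off the grid
    if (bo[r][c] != word.charAt(loc)) return false;            // wrong letter: backtrack
    char saved = bo[r][c];
    bo[r][c] = '#';                                            // mark used: no letter reuse
    boolean found = findWord(r, c + 1, word, loc + 1, bo)      // right
                 || findWord(r - 1, c, word, loc + 1, bo)      // up
                 || findWord(r, c - 1, word, loc + 1, bo)      // left
                 || findWord(r + 1, c, word, loc + 1, bo);     // down
    bo[r][c] = saved;                                          // unmark on the way back
    return found;
}
```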
Problem: 3 towers – on the first tower we have N disks of increasing size from top to bottom Can only move 1 disk at a time, and can never put a larger disk on a smaller disk The middle tower allows this to be solved; it is the temporary holder
Very difficult to do iteratively
3 steps per call (two of them recursive): Move N-1 disks from start to mid Move the last disk from start to end Move N-1 disks from mid to end
We move the N-1 items from the start to the middle, then put the Nth item from the start to the end We then move the N-1 items from the mid to the end, and this repeats recursively until there is nothing left to move
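A minimal Java sketch of the three steps (tower names are illustrative):

```java
// Move n disks from start to end, using mid as the temporary holder.
public static void hanoi(int n, char start, char mid, char end) {
    if (n == 0) return;                          // base case: nothing to move
    hanoi(n - 1, start, end, mid);               // move the N-1 smaller disks out of the way
    System.out.println("disk " + n + ": " + start + " -> " + end);  // move the largest disk
    hanoi(n - 1, mid, start, end);               // move the N-1 disks on top of it
}
```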
Difficult to do iteratively because a loop alone doesn't give us a stack of pending moves, whereas recursion builds them on the activation stack for free As the stack increases in size we get closer to the base case, then we start popping items off the stack, decreasing the stack as we go
Each recursive call makes 2 recursive calls – forming a binary tree – without recursion the programmer needs to maintain all of these variables manually – maintaining the info the stack stores for us is much more difficult and makes our code very prone to errors and bugs
## Towers of Hanoi Run Time
Recursive algorithms have fewer lines of code, but this doesn't necessarily change the run time.. it could even make the runtime longer than an iterative version
Key instruction for the algorithm – the move of a disk – every time we go down a level, we decrease the problem size by 1 To get to the base case, we need N levels in the execution trace
Each call of size X spawns 2 calls of size X-1
Need to move the N-1 disks from start to middle and then move them again to the end, so 1 call spawns two calls of size N-1
Asymptotic Runtime – geometric sum – 2^0 + 2^1 + 2^2 + … + 2^(N-1) = 2^N - 1 = O(2^N)
Runtime is poor, exponential, takes a long time for large N
Execution is not done level by level; it goes down and up and down and up toward the base cases Down to a base case and back up, repeat
Sorting generally is done in increasing order ex) if i < j then A[i] <= A[j] If we want decreasing order then we just reverse this
## Insertion Sort
This works by iterating through the initial set of items, removing them one at a time and inserting them into a new set, keeping that new set sorted as each item is added
This works but would use double the memory –> more efficient to sort in place –> constant amount of extra memory
Instead, we can think of the array in two parts, the sorted part and the unsorted part With each iteration, we take an item from the unsorted section and insert it into the sorted section; this works the same as before, but the second "array" is just the beginning of our original array
We insert each item into its correct spot, searching for that spot from the back of the sorted part to the front
Starts off with only the array and the length as parameters Each iteration takes an item from the unsorted part, brings it to the sorted part, and calls another method to move the value to its correct spot Values are shifted from left to right (this leaves a hole where the item used to be)
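A minimal in-place insertion sort sketch matching this description:

```java
public static void insertionSort(int[] a, int n) {
    for (int i = 1; i < n; i++) {       // a[0..i-1] is the sorted part
        int item = a[i];                // take the next item from the unsorted part
        int j = i;                      // the hole starts where the item was
        while (j > 0 && a[j - 1] > item) {
            a[j] = a[j - 1];            // shift larger values right, moving the hole left
            j--;
        }
        a[j] = item;                    // drop the item into its correct spot
    }
}
```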
Run Time of Insertion Sort
Key instruction – comparisons
Worst case scenario – reverse sorted data - each item has to travel through the entire sorted part (up to N iterations) to reach its proper spot
N(N-1)/2 comparisons –> O(N^2) worst case
On average it is better, but still O(N^2)
Selection sort – on iteration i, find the ith smallest item and swap it into location i Simple with nested loops – get the next smallest and swap it into the location Assume a[first] is smallest, iterate through the rest of the array looking for the actual smallest, and return that index
This returned index is used by the outer loop to swap the actual smallest with the item at location i
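A minimal selection sort sketch with the nested loops described above:

```java
public static void selectionSort(int[] a) {
    for (int i = 0; i <= a.length - 2; i++) {          // outer loop: i from 0 to N-2
        int smallest = i;                              // assume a[i] is the smallest
        for (int j = i + 1; j <= a.length - 1; j++) {  // inner loop: j from i+1 to N-1
            if (a[j] < a[smallest]) smallest = j;      // remember the actual smallest
        }
        int tmp = a[i]; a[i] = a[smallest]; a[smallest] = tmp;  // swap it into location i
    }
}
```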
Run time - key instruction is comparisons
Runtime difference between insertion sort and selection sort: both have two for loops
Insertion sort – the inner for loop stops early, when the item reaches its correct insertion point
Selection sort – no data comparison in the loop headers; iterations of both for loops depend only on index values
Outer loop - i goes from 0 to N-2
Inner loop - j goes from i+1 to N-1
It doesn't matter how the data is organized; there is no worst case or best case
All cases iterate the same number of times and do the same number of comparisons
Number of comparisons per pass: N minus the start position, i.e. starting at 3? N-3 comparisons
N(N-1)/2 –> O(N^2)
Selection sort has the same asymptotic runtime as insertion sort
Bubble sort – item j is compared to item j+1 If sorted, a[j] < a[j+1] – in that case, do nothing If a[j] > a[j+1], they're out of order, in which case we swap them Continue from beginning to end until completely sorted
This may have to be done more than once, as only two items are compared at a time
Worst case run time – O(N^2)
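A minimal bubble sort sketch of the repeated passes described above:

```java
public static void bubbleSort(int[] a) {
    boolean swapped = true;
    while (swapped) {                        // may take more than one full pass
        swapped = false;
        for (int j = 0; j < a.length - 1; j++) {
            if (a[j] > a[j + 1]) {           // j and j+1 out of order: swap them
                int tmp = a[j]; a[j] = a[j + 1]; a[j + 1] = tmp;
                swapped = true;
            }
        }
    }                                        // a pass with no swaps means completely sorted
}
```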
Insertion sort can be used on a linked list Remove front node and insert in order into a second, new list Not creating any new nodes, just moving them around Insert from front to back of sorted list, compare data as you go
Worst case: every new node ends up in the last position, indicating the data was already in sorted order in the previous linked list
Run time is the same O(N^2)
Simple sorts are not the greatest Insertion sort – a relatively poor algorithm – each step requires either no movement or a data move of 1 location – really out-of-order data therefore requires many more comparisons and moves
Shellsort moves data more than one position at a time, which makes it more efficient Shellsort doesn't compare items that are close together, but rather items that are farther apart Shellsort breaks the array into subarrays that we run insertion sort on Break into subarrays of every Kth item, sort those, and reduce K until the whole array is sorted
Shellsort breaks up the array as such:
1 2 3 4 1 2 3 4   k = 4
The 1s are compared and sorted first, then the 2s, the 3s, and the 4s
Now that we've gone through one round of that, the k value is reduced:
1 2 1 2 1 2 1 2   k = 2
The 1s are compared and sorted, then the 2s
1 1 1 1 1 1 1 1   k = 1
Once k = 1, we do a regular insertion sort, but the data is already nearly sorted, which greatly reduces the runtime
Shellsort does insertion sorts of subarrays, and at the end a final insertion sort of the whole array Shellsort has better runtime despite performing several insertion sorts Insertion sort runs faster the more sorted the initial data is, and shellsort moves the data toward a more sorted state
Shellsort: Best Case - O(N), Good Case - O(N^(3/2)) Compare Insertion Sort - O(N^2)
Input – the array, int first, int last
k = N/2 is a simple way to determine the subarrays (the initial gap)
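A minimal shellsort sketch using that simple k = N/2 gap sequence (other gap sequences exist; this one matches the notes):

```java
public static void shellSort(int[] a) {
    for (int k = a.length / 2; k >= 1; k /= 2) {  // gaps: N/2, N/4, ..., 1
        for (int i = k; i < a.length; i++) {      // gap-k insertion sort of the subarrays
            int item = a[i];
            int j = i;
            while (j >= k && a[j - k] > item) {
                a[j] = a[j - k];                  // shift within the k-spaced subarray
                j -= k;
            }
            a[j] = item;
        }
    }                                             // the final k=1 pass is a regular insertion sort
}
```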
Lower Bound - No comparison-based sorting algorithm can do better than N log N in the worst case We don't yet know if we can actually achieve this
Divide and Conquer can help us do this. Ex. Binary Search implemented with recursion
How do we divide the problem into subproblems?
How do we use the solutions of the subproblems to determine the overall solution?
These questions are answered differently in MergeSort and QuickSort
How do we divide the problem? – Just break in half, and continue breaking in half until there are subarrays containing 1 item each (0-7) –> (0-3) & (4-7) –> (0-1) & (2-3) & (4-5) & (6-7) –> (0) & (1) & (2) & (3) & (4) & (5) & (6) & (7) Once we have 1 item each, we've reached our base case and the recursive breaking is complete Each 1-item subarray is trivially sorted, since there is nothing to compare it with – the sorting work is done when putting the pieces back together
How do we use subproblem solutions to solve the overall problem? – look at the first item of each subarray and repeatedly take the lower value into a merged array i.e. 20, 50, 60, 80 and 10, 30, 40, 70 Merged - 10, 20, 30, 40, 50, 60, 70, 80 Merge the two subarrays together into the overall merged array by always comparing their first remaining values
Data is copied from the subarrays into a sorted temp array and then back to the original MergeSort DOES NOT sort in place This adds linear overhead to the run time – but does not make it asymptotically worse We only have one temp array, PASSED INTO the recursive call, rather than making a new temp array at each recursive call
There is a case where we don't need to merge: if the largest value on the sorted left is less than the smallest value on the sorted right – could potentially save us a merge
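A sketch pulling these pieces together, with one temp array passed in and the optional merge-skipping check (index arithmetic is illustrative):

```java
// Sorts a[first..last]; temp is allocated once by the caller and reused.
public static void mergeSort(int[] a, int[] temp, int first, int last) {
    if (first >= last) return;               // base case: logical size 0 or 1
    int mid = (first + last) / 2;
    mergeSort(a, temp, first, mid);          // sort the left half
    mergeSort(a, temp, mid + 1, last);       // sort the right half
    if (a[mid] <= a[mid + 1]) return;        // largest of left <= smallest of right: skip merge
    int i = first, j = mid + 1, k = first;
    while (i <= mid && j <= last)            // repeatedly take the smaller front item
        temp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid) temp[k++] = a[i++];     // copy leftovers from the left half
    while (j <= last) temp[k++] = a[j++];    // copy leftovers from the right half
    for (k = first; k <= last; k++) a[k] = temp[k];  // copy back to the original
}
```

Note the `a[i] <= a[j]` comparison: taking from the left on ties is what makes mergesort stable (see the stability discussion later).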
Comparisons are the key instruction The analysis is more difficult because recursion is used instead of loops
Analyze level by level – at each level, each call gives rise to two calls that are each 1/2 the size of the original – the total work is the same at each level because there are more subarrays but of smaller sizes, which cancels out to O(N) work per level – with log2 N levels, multiplying gives O(N log2 N) total
We recursively go down the left side, breaking further and further until the base case is reached… once the base case is reached we begin merging and going back up, but we don't begin merging until we've recursed all the way down Once we've hit the base case at the left, we go up a level, go back down the other side, merge, and continue
lg N levels - each call of size X spawns 2 calls of size X/2 For each call of size X, the worst case for merging is about X comparisons
Sum – N + 2(N/2) + 4(N/4) + … this is the total amount of merging work
Formally expressed with N = 2^k:
(2^0)(2^k) + (2^1)(2^(k-1)) + (2^2)(2^(k-2)) + … + (2^k)(2^0) = (k+1)(2^k)
In each term, the first factor is the number of problems / calls and the second factor is the size of each problem / call
(k+1)(2^k) with k = lg2 N –> (lg2 N + 1)(N) = O(N lg2 N) Binary search has lg N levels but only 1 operation per level; MergeSort has lg N levels with N operations per level
Towers of Hanoi also makes an execution tree with N levels, because each call of size X makes two calls of size X-1… one move per call, with overall runtime determined by the number of calls made: O(2^N) – geometric sum
MergeSort is asymptotically optimal, achieving the lower bound asymptotically It is not an in-place sort; it requires memory overhead from copying into the temp array and then back to the original This slows down the execution in real time
QuickSort avoids this by sorting in place while still using Divide and Conquer
How to divide the problem: instead of dividing by index, divide into sections based on a pivot value
Gives 3 groups of data:
Data <= pivot, on the left of the pivot
The pivot itself
Data >= pivot, on the right of the pivot
The division will not generally be equal on either side of the pivot
The pivot can be any value… for simplicity we can use the last item
Ex… 50, 80, 60, 20, 30, 10, 70, 40 Pivot is 40
10, 30, 20  |  40   |  50, 80, 60, 70
 <= pivot     pivot    >= pivot
The pivot is now in its correct spot
QuickSort(A):
  if (size of A > 1)  [not at base case]
    Choose a pivot value
    Partition A into left and right sides based on the pivot
    Recursively sort the left side
    Recursively sort the right side
Once the recursive calls complete, we are done… sorting is done in place, so no merging is required We have gotten rid of the merge overhead
Partitioning – start with counters at the leftmost and rightmost indices of the array (excluding the pivot)
If the data at the left counter is less than the pivot, leave it where it is and advance
If the data at the right counter is greater than the pivot, leave it where it is and advance
If there is a value on either side that does not abide by these rules, swap the two values at the counters – then continue with the counters from where they are
Special case when the counters meet or cross… this tells us the partition algorithm is complete – we now have to move the pivot into the correct spot between the two counters
Swap the counter position closest to the pivot with the pivot to get the pivot into its correct spot
Now that we’ve partitioned, we need to recursively sort the items on the left side and then the right side
Partition breaks the array into left and right sides, and the return value is the index where the pivot is located First call quicksort from the first value to pivot - 1 Then call quicksort from pivot + 1 to the end Inputs are the array, first index, and last index
Partition algorithm – input is the array, int first, and int last
pivot index = last, indexFromLeft = first, indexFromRight = last - 1
Outer while loop – until done
Inner while loop – increment indexFromLeft as long as its value is less than the pivot
Inner while loop – decrement indexFromRight as long as its value is greater than the pivot
This second loop contains the condition indexFromRight > first – needed in case all of the values are greater than the pivot, which would otherwise make the loop run past the array and cause an IndexOutOfBounds exception
The condition is not needed in the first while loop because even if all values on the left are less than the pivot, we would reach the pivot itself before going past the array
Now swap the values at indexFromLeft and indexFromRight – after making sure the counters haven't crossed, i.e. we haven't gone through the entire array
If we haven't gone through the entire array, we loop all over again
After the loop completes, swap the pivot with the value where the counters met to put the pivot in its correct spot
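A sketch of this simple last-index-pivot partition plus the quicksort driver (variable names are illustrative):

```java
// Partitions a[first..last] around the pivot a[last]; returns the pivot's final index.
public static int partition(int[] a, int first, int last) {
    int pivot = a[last];
    int fromLeft = first, fromRight = last - 1;
    while (true) {
        while (a[fromLeft] < pivot) fromLeft++;   // the pivot itself stops this loop
        while (fromRight > first && a[fromRight] > pivot) fromRight--;  // guarded on the right
        if (fromLeft >= fromRight) break;         // counters met or crossed: partition done
        int tmp = a[fromLeft]; a[fromLeft] = a[fromRight]; a[fromRight] = tmp;
        fromLeft++; fromRight--;                  // continue from the same counters
    }
    int tmp = a[fromLeft]; a[fromLeft] = a[last]; a[last] = tmp;  // pivot to its correct spot
    return fromLeft;
}

public static void quickSort(int[] a, int first, int last) {
    if (first < last) {                    // base case: size <= 1 needs no work
        int p = partition(a, first, last);
        quickSort(a, first, p - 1);        // recursively sort the left side
        quickSort(a, p + 1, last);         // recursively sort the right side
    }
}
```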
Mergesort – O(N log2 N) in best case and average case
Quicksort – doesn't always divide evenly, so there is more variability in the asymptotic analysis
Optimal case – the pivot value is the middle value – an equal number of items on both sides In this case it behaves like mergesort and we get O(N log2 N), but without the overhead of a temp array
Worst case – the pivot is an extreme value – everything is greater or everything is less Comparisons are done in the partition algorithm: compare N-1 items to the pivot – all N-1 items fall on one side – so instead of the usual 2 recursive calls, we get a single recursive call of size N-1 That call makes N-2 comparisons to its pivot with a recursive call of size N-2, then a third call of size N-3, and this continues until we reach the base case This is the worst case because the most comparisons are being made Results in N(N-1)/2 –> O(N^2) – same as the primitive sorts The worst case is bad because it violates divide and conquer; we never actually divided the problem
Which runtime case is more likely? Depends on the original data distribution and how the pivot is chosen
Worst case for quicksort – data already sorted and the pivot is the rightmost value The same worst case occurs with reverse-sorted data Quicksort has an average / expected runtime of O(N lg N) but has the potential for a worst case of O(N^2) Need to rethink how to choose the pivot
Median of three optimizes quicksort by choosing the pivot better It is better to choose the pivot by looking at multiple locations, to reduce the probability of hitting the worst case O(N^2)
Instead of choosing the pivot at a single index, look at 3 values from the array: A[first], A[mid], A[last] – examined prior to the partition Sort the 3 candidate values relative to each other… i.e. Original: A[first] = 50, A[mid] = 20, A[last] = 40 Rearranged – sorted: A[first] = 20, A[mid] = 40, A[last] = 50
Now we choose A[mid] as the pivot, it is the median of the 3 values that we checked
Best Case: already sorted data, formerly the worst case, now yields the best case with median of three, since the middle value is the true median and the splits are even
Input – array, int first, int last First sort the 3 indices we picked A[mid] is now the pivot Need to move the pivot toward the right side of the array – move it to A[last - 1], because we know that A[last] is greater than A[mid] We start the partition algorithm at A[first + 1], because we know A[first] is smaller than A[mid]
The simple partition has an extra condition in its while loop (for the case where all values are greater than the pivot, to prevent an index out of bounds exception), but with median of three this condition does not have to be checked, because there is no way for that case to occur: we already have a value smaller than the pivot in A[first]
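A sketch of median-of-three pivot selection (assumes the subarray has at least 3 items; helper names are illustrative):

```java
// Sorts a[first], a[mid], a[last] in place, then parks the median (the pivot) at last-1.
public static int medianOfThree(int[] a, int first, int last) {
    int mid = (first + last) / 2;
    if (a[mid] < a[first])  swap(a, mid, first);
    if (a[last] < a[first]) swap(a, last, first);
    if (a[last] < a[mid])   swap(a, last, mid);   // now a[first] <= a[mid] <= a[last]
    swap(a, mid, last - 1);                       // pivot moves next to the end
    return last - 1;                              // partition can then run first+1 .. last-2
}

private static void swap(int[] a, int i, int j) {
    int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
}
```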
Median of three removes the specific worst case of sorted data with the last value as pivot, but it doesn't remove the possibility of O(N^2) All median of three guarantees is that there will be at least one item less than the pivot and one item greater than the pivot Median of three leaves some likelihood of the O(N^2) worst case, but makes it much less likely to occur
Expected run time O(N log2 N) Worst Case - O(N^2) - less likely, and fewer scenarios where this would come up using median of three
Pivot index is chosen randomly
Choose random index between 1st and last and swap it to the end of the array
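A sketch of the random-pivot step (reuses the partition sketch from above):

```java
static final java.util.Random RNG = new java.util.Random();

// Pick a random index in [first, last], swap it to the end, then partition as usual.
public static int randomPartition(int[] a, int first, int last) {
    int r = first + RNG.nextInt(last - first + 1);  // random pivot index
    int tmp = a[r]; a[r] = a[last]; a[last] = tmp;  // move it to the usual pivot position
    return partition(a, first, last);               // simple last-index partition
}
```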
Worst Case – just as bad as simple quicksort – but very unlikely
Average Case – Nlog2N
Why is worst case so unlikely ???
Probability that the pivot we choose is an extreme value — 2/N.. then the same on the next call – 2/(N-1), then the next – 2/(N-2), etc. What is the probability that we choose an extreme in every call of partition? — roughly 2^N / N! – this is highly unlikely even for small arrays, and becomes incredibly unlikely the larger the array gets
Worst case is still O(N^2), but its probability is infinitesimally small and it will almost never occur Best Case - O(N lg N)
This is because N! grows much faster than 2^N … the factorial is hyperexponential rather than exponential
Downside – there is some overhead in having to pick random numbers Need to run empirical analysis to know for sure
Dual-pivot quicksort: use two pivots rather than one This gives 3 regions in the partitioned array: values less than the first pivot, values between the first and second pivots, values greater than the second pivot
[ Less than first | P[first] | Mid values | P[second] | Greater than second ]
This is very efficient, and it is what Java uses (for primitive types)
Stop recursing when the logical size is 1 (the base case) However, this is not always efficient… at some point the cost of the recursive calls outweighs the benefit of Divide and Conquer
Instead we can stop at some base case > 1 and, rather than recursing further, use another sorting algorithm with better runtime than quicksort for small sizes Insertion sort is a good choice – for very small and partially sorted arrays, it has a best case of N
Recurse with quicksort until a predefined base case > 1, then call insertion sort to finish sorting that piece
ALTERNATIVELY – do nothing at the base case, leaving the array in its partially sorted fashion.. after all the recursive calls finish, call insertion sort once on the mostly sorted array Requires empirical testing to find out which option is better
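A sketch of the first strategy (the cutoff value is a placeholder to be tuned empirically; reuses the partition sketch from above):

```java
static final int CUTOFF = 10;  // assumed cutoff; the right value comes from empirical testing

// Quicksort down to small subarrays, then let insertion sort finish them.
public static void hybridQuickSort(int[] a, int first, int last) {
    if (last - first + 1 <= CUTOFF) {
        insertionSortRange(a, first, last);   // small piece: switch algorithms
    } else {
        int p = partition(a, first, last);
        hybridQuickSort(a, first, p - 1);
        hybridQuickSort(a, p + 1, last);
    }
}

private static void insertionSortRange(int[] a, int first, int last) {
    for (int i = first + 1; i <= last; i++) {
        int item = a[i];
        int j = i;
        while (j > first && a[j - 1] > item) { a[j] = a[j - 1]; j--; }
        a[j] = item;
    }
}
```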
MergeSort – more consistent runtime (N log N) Quicksort – in the normal case, quicksort outperforms mergesort (its worst case is N^2, but quicksort is done in place and in practice outperforms mergesort)
Quicksort is not stable Given X1 = X2 with X1 before X2, in a stable algorithm X1 will still be before X2 in the sorted data (MergeSort is stable) This is not guaranteed in quicksort; X2 may end up before X1
For complex objects, it may be better to use MergeSort to preserve an existing ordering on data independent of the sort key, since quicksort may lose that ordering
Java uses MergeSort for objects and Quicksort for primitive types
Why is mergesort stable? – merge compares the pre-sorted left data with the pre-sorted right data, and if they are equal, takes the left data before the right
Quicksort does not do this – comparisons are done during the partition method It looks for values on the wrong sides of the array and swaps them – this swapping can move equal values past each other, placing them in a different relative order
Can do mergesort on a linked list, but with more overhead On an array we divide by index; this is not possible on a linked list, which requires traversal to find the middle (this is the overhead) – the overhead is linear, hence trivial Maintains the asymptotic runtime of N log2 N Rather than merging into a temp array, we merge into a new linked list
Quicksort can be done with a linked list, but it loses a lot of its speed and would require a doubly linked list
Mergesort better for linked list