Recursion can become very inefficient if a recursive call itself spawns more than one recursive call
Divide and Conquer Power Function::: Efficient
X^N = (X^(N/2))^2       N even, N > 0
X^N = X * (X^(N/2))^2   N odd,  N > 0  (integer division N/2)
X^N = 1                 N = 0
Run time stack depth is lg N
1 - 2 multiplications per level: one for the square (always), one for the odd N factor (sometimes)
Spawns 1 recursive call, which spawns 1 recursive call, and so on
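A minimal Java sketch of the efficient version (method name and types are illustrative, not from the notes):

```java
// Efficient divide-and-conquer power: one recursive call per level.
public static long pow(long x, int n) {
    if (n == 0) return 1;                 // base case: X^0 = 1
    long half = pow(x, n / 2);            // the single recursive call (integer division)
    long square = half * half;            // one multiplication for the square (always)
    return (n % 2 == 0) ? square : x * square;  // one extra multiplication when N is odd
}
```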
Divide and Conquer Power Function::: Inefficient
X^N = (X^(N/2)) * (X^(N/2))       N even, N > 0
X^N = X * (X^(N/2)) * (X^(N/2))   N odd,  N > 0
X^N = 1                           N = 0
Mathematically similar, but programming-wise different
2 - 3 multiplications per level, but X^(N/2) must be recursively computed twice
2 recursive calls for each original call: 2 calls spawn 2 calls spawn 2 calls and so on
O(N) multiplications total — same as the non divide and conquer version
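For contrast, a sketch of the inefficient version (same math, but with the recursive call written out twice):

```java
// Inefficient version: the repeated recursive call turns the lgN-level
// chain into a full execution tree with O(N) multiplications.
public static long powSlow(long x, int n) {
    if (n == 0) return 1;
    long product = powSlow(x, n / 2) * powSlow(x, n / 2);  // two recursive calls
    return (n % 2 == 0) ? product : x * product;
}
```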
Efficient Version::: Each recursive call leads to 1 recursive call… one recursive call at each level
Inefficient Version::: Spawns 2 recursive calls per recursive call, which increases the number of recursive calls exponentially
Stack depth is lg N in both versions; in the efficient version the total number of calls is also just lg N
We defined runtime in terms of multiplications… but multiplying integers depends on the number of bits in the integers. For integers with an arbitrary number of bits, a multiplication is O(N) as a function of the number of bits, and as we multiply, the results may need more bits
Often we take each multiplication mod some value and keep the remainder to save space. Known as power mod… this also bounds the answer by the int you modulo it with
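A sketch of power mod built on the same recursion (assumes the modulus is small enough that (m-1)^2 fits in a long):

```java
// Modular exponentiation: reduce mod m after every multiplication so
// intermediate results stay bounded by m.
public static long powMod(long x, int n, long m) {
    if (n == 0) return 1 % m;             // base case (1 % m handles m == 1)
    long half = powMod(x, n / 2, m);
    long square = (half * half) % m;      // square, then take the remainder
    return (n % 2 == 0) ? square : ((x % m) * square) % m;
}
```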
Binary search - data in sorted order
How to implement recursion::: 1) How to divide the problem to make a recursive call - simply divide the array in half (a logical division) 2) How to reassemble the data - trivial… just return
Base Case(s)::: 1) Base case, not found - logical array size is down to 0 2) Base case, found - key matches current item
Recursive Case::: Consider the middle item M in each recursive call, compared to the search key S 1) M == S, item has been found - second base case, found the key 2) M < S, return BS of right side of array 3) M > S, return BS of left side of array
Returning the recursive binary search call on the left or right half propagates back the position where the key was found on that side
Cuts problem in half with each recursive call
Recursive and Iterative Binary Search will always have the same number of comparisons
Base Case::: low > high, nothing left in the array, it is empty, return -1
Not Base Case::: Find the middle item and compare it to the key If middle < key, recurse on the right side If middle > key, recurse on the left side If middle == key, base case found
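Putting the cases together, a minimal recursive binary search sketch (names are illustrative):

```java
// Returns the index of key in the sorted array a, or -1 if not found.
public static int binarySearch(int[] a, int key, int low, int high) {
    if (low > high) return -1;                       // base case: empty range, not found
    int mid = (low + high) / 2;                      // middle of the logical array
    if (a[mid] == key) return mid;                   // base case: key found
    if (a[mid] < key)                                // key can only be on the right side
        return binarySearch(a, key, mid + 1, high);
    return binarySearch(a, key, low, mid - 1);       // key can only be on the left side
}
```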
Tail recursion: a recursive algorithm that makes the recursive call the last thing in the method This indicates we can do the same thing iteratively
Conversion to iterative can be favorable because it saves on overhead: saves on space and saves on time
Converting tail recursion to iteration is generally in our best interest
When is recursion beneficial::: 1) For some problems, recursion is simply better 2) Some problems are difficult to do iteratively
Go down a path to a solution until you realize that path won't go anywhere Backtrack to a point where we can go forward again i.e. a maze
Each move forward is a recursive call
Sometimes have to make a choice how to proceed
If stuck, backtrack to the previous call, change the choice, and try to move forward again
Either find a solution or run out of choices
Problem: Place 8 Queens on a chess board such that no queen can take another in the next turn
Can be solved using recursion and backtracking
Each queen must be in a different row and different column - each row and each column must have exactly 1 queen
Step 1: Place 1 queen
Step 2: Recursively place the remaining 7 queens on the rest of the board
Backtracking is used if the initial choice cannot lead to a satisfiable solution
One queen in each column One queen in each row At most one queen in each diagonal
Place a queen, and if it doesn't lead to a solution, perform backtracking.
Each recursive call attempts to place one queen in a specific column… then iteratively tries each row in that column
Need variables to keep track of the board… which squares are free and which would threaten the placement of the new queen
Recursively… initially try column 0… if we legally place a queen in column 7, then we have solved the problem, because this indicates we've placed one queen in each column
If all rows in column K have been tried and failed, the call terminates and backtracks to the previous call (K-1). The queen is removed from where it was placed in column K-1 and placed in a different row of that column…. if this is still unsuccessful, we continue backtracking until we can move forward to a new column again
See JRQueens.java for reference on how to do this – the trycol method Note the variables needed for this method
Why is this difficult to do iteratively – we need to know info about the previous column … in recursion this is handled with local variables, which the program maintains for itself within the activation record on the runtime stack Col parameter and row parameter… each activation record has its own column and row, so we don't have to mutate the variables by hand every time like we would with iteration
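A hedged sketch in the spirit of trycol (this is not the actual JRQueens.java code; the rowInCol board representation and the safe helper are assumptions):

```java
public class QueensSketch {
    static final int N = 8;
    static int[] rowInCol = new int[N];  // rowInCol[c] = row of the queen placed in column c

    // Try to place a queen in column col and, recursively, in every later column.
    static boolean tryCol(int col) {
        if (col == N) return true;                // queens sit legally in columns 0..7: solved
        for (int row = 0; row < N; row++) {       // iteratively try each row in this column
            if (safe(row, col)) {
                rowInCol[col] = row;              // place the queen
                if (tryCol(col + 1)) return true; // recurse on the next column
                // that placement failed downstream: loop tries the next row (backtracking)
            }
        }
        return false;  // all rows failed: return to the caller, i.e. backtrack to col-1
    }

    // A square is safe if no queen in an earlier column shares its row or diagonal.
    static boolean safe(int row, int col) {
        for (int c = 0; c < col; c++) {
            int r = rowInCol[c];
            if (r == row || Math.abs(r - row) == Math.abs(c - col)) return false;
        }
        return true;
    }
}
```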
Ex. find a word in a 2D grid of letters - 16 letters … 4x4 grid … given a word, can the word be found in the grid Rules: each letter must touch the previous letter up, down, left, or right (not diagonally) – can only use each letter once… cannot use the same letter twice
Each recursive call deals with one location on the board; the sequence of activation records on the stack records the path taken so far. If all K letters matched, there are K activation records and we are done
If we reach a position on the board that does not match the letter we are looking for, we backtrack a letter and try a different route
Backtracking is used if none of the options (left, right, up, down) give the next letter
public boolean findWord(int r, int c, String word, int loc, char[][] bo)

findWord(r, c+1, word, loc+1, bo)  // right
findWord(r-1, c, word, loc+1, bo)  // up
findWord(r, c-1, word, loc+1, bo)  // left
findWord(r+1, c, word, loc+1, bo)  // down

The results of the four calls are combined and returned as a boolean
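One way the body might be completed (a sketch; the '#' marker for in-use letters is an assumption, not from the notes):

```java
// Returns true if word.substring(loc) can be traced starting at (r, c).
public boolean findWord(int r, int c, String word, int loc, char[][] bo) {
    if (loc == word.length()) return true;                     // all letters matched
    if (r < 0 || r >= bo.length || c < 0 || c >= bo[0].length) return false; // off the grid
    if (bo[r][c] != word.charAt(loc)) return false;            // wrong letter: backtrack
    char saved = bo[r][c];
    bo[r][c] = '#';                                            // mark used: no letter reuse
    boolean found = findWord(r, c + 1, word, loc + 1, bo)      // right
                 || findWord(r - 1, c, word, loc + 1, bo)      // up
                 || findWord(r, c - 1, word, loc + 1, bo)      // left
                 || findWord(r + 1, c, word, loc + 1, bo);     // down
    bo[r][c] = saved;                                          // unmark on the way back
    return found;
}
```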
Problem: 3 towers – on the first tower we have N disks of increasing size from top to bottom Can only move 1 disk at a time, and can never put a larger disk on a smaller disk The middle tower allows this to be solved; it is the temporary holder
Very difficult to do iteratively
3 steps per call (two of them recursive): Move N-1 disks from start to mid Move the last disk from start to end Move N-1 disks from mid to end
We move the N-1 items from the start to the middle, then put the Nth item from the start to the end We then move the N-1 items from the mid to the end, and this repeats recursively until there is nothing left to move
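A minimal Java sketch of the three steps (tower names are illustrative):

```java
// Move n disks from start to end, using mid as the temporary holder.
public static void hanoi(int n, char start, char mid, char end) {
    if (n == 0) return;                          // base case: nothing to move
    hanoi(n - 1, start, end, mid);               // move the N-1 smaller disks out of the way
    System.out.println("disk " + n + ": " + start + " -> " + end);  // move the largest disk
    hanoi(n - 1, mid, start, end);               // move the N-1 disks on top of it
}
```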
Difficult to do iteratively because a loop alone doesn't give us a stack of pending moves, whereas recursion builds them on the activation stack for free As the stack increases in size we get closer to the base case, then we start popping items off the stack, decreasing the stack as we go
Each recursive call makes 2 recursive calls – forming a binary tree – without recursion the programmer needs to maintain all of these variables manually – maintaining the info the stack stores for us is much more difficult and makes our code very prone to errors and bugs
## Towers of Hanoi Run Time
Recursive algorithms have fewer lines of code, but this doesn't necessarily change the run time.. it could even make the runtime longer than an iterative version
Key instruction for the algorithm – the move of a disk – every time we go down a level, we decrease the problem size by 1 To get to the base case, we need N levels in the execution trace
Each call of size X spawns 2 calls of size X-1
Need to move the N-1 disks from start to middle and then move them again to the end, so 1 call spawns two calls of size N-1
Asymptotic Runtime – geometric sum – 2^0 + 2^1 + 2^2 + … + 2^(N-1) = 2^N - 1 = O(2^N)
Runtime is poor, exponential, takes a long time for large N
Execution is not done level by level; it goes down and up and down and up toward the base cases Down to a base case and back up, repeat
Sorting generally is done in increasing order ex) if i < j then A[i] <= A[j] If we want decreasing order then we just reverse this
## Insertion Sort
This works by iterating through the initial set of items, removing them one at a time and inserting them into a new set, keeping that new set sorted as each item is added
This works but would use double the memory –> more efficient to sort in place –> constant amount of extra memory
Instead, we can think of the array in two parts, the sorted part and the unsorted part With each iteration, we take an item from the unsorted section and insert it into the sorted section; this works the same as before, but the second "array" is just the beginning of our original array
We insert each item into its correct spot, searching for that spot from the back of the sorted part to the front
Starts off with only the array and the length as parameters Each iteration takes an item from the unsorted part, brings it to the sorted part, and calls another method to move the value to its correct spot Values are shifted from left to right (this leaves a hole where the item used to be)
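A minimal in-place insertion sort sketch matching this description:

```java
public static void insertionSort(int[] a, int n) {
    for (int i = 1; i < n; i++) {       // a[0..i-1] is the sorted part
        int item = a[i];                // take the next item from the unsorted part
        int j = i;                      // the hole starts where the item was
        while (j > 0 && a[j - 1] > item) {
            a[j] = a[j - 1];            // shift larger values right, moving the hole left
            j--;
        }
        a[j] = item;                    // drop the item into its correct spot
    }
}
```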
Run Time of Insertion Sort
Key instruction – comparisons
Worst case scenario – reverse sorted data - each item has to travel through the entire sorted part (up to N iterations) to reach its proper spot
N(N-1)/2 comparisons –> O(N^2) worst case
On average it is better, but still O(N^2)
Selection sort – on iteration i, find the ith smallest item and swap it into location i Simple with nested loops – get the next smallest and swap it into the location Assume a[first] is smallest, iterate through the rest of the array looking for the actual smallest, and return that index
This returned index is used by the outer loop to swap the actual smallest with the item at location i
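A minimal selection sort sketch with the nested loops described above:

```java
public static void selectionSort(int[] a) {
    for (int i = 0; i <= a.length - 2; i++) {          // outer loop: i from 0 to N-2
        int smallest = i;                              // assume a[i] is the smallest
        for (int j = i + 1; j <= a.length - 1; j++) {  // inner loop: j from i+1 to N-1
            if (a[j] < a[smallest]) smallest = j;      // remember the actual smallest
        }
        int tmp = a[i]; a[i] = a[smallest]; a[smallest] = tmp;  // swap it into location i
    }
}
```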
Run time - key instruction is comparisons
Runtime difference between insertion sort and selection sort: both have two for loops
Insertion sort – the inner for loop stops early, when the item reaches its correct insertion point
Selection sort – no data comparison in the loop headers; iterations of both for loops depend only on index values
Outer loop - i goes from 0 to N-2
Inner loop - j goes from i+1 to N-1
It doesn't matter how the data is organized; there is no worst case or best case
All cases iterate the same number of times and do the same number of comparisons
Number of comparisons per pass: N minus the start position, i.e. starting at 3? N-3 comparisons
N(N-1)/2 –> O(N^2)
Selection sort has the same asymptotic runtime as insertion sort
Bubble sort – item j is compared to item j+1 If sorted, a[j] < a[j+1] – in that case, do nothing If a[j] > a[j+1], they're out of order, in which case we swap them Continue from beginning to end until completely sorted
This may have to be done more than once, as only two items are compared at a time
Worst case run time – O(N^2)
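A minimal bubble sort sketch of the repeated passes described above:

```java
public static void bubbleSort(int[] a) {
    boolean swapped = true;
    while (swapped) {                        // may take more than one full pass
        swapped = false;
        for (int j = 0; j < a.length - 1; j++) {
            if (a[j] > a[j + 1]) {           // j and j+1 out of order: swap them
                int tmp = a[j]; a[j] = a[j + 1]; a[j + 1] = tmp;
                swapped = true;
            }
        }
    }                                        // a pass with no swaps means completely sorted
}
```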
Insertion sort can be used on a linked list Remove front node and insert in order into a second, new list Not creating any new nodes, just moving them around Insert from front to back of sorted list, compare data as you go
Worst case: every new node ends up in the last position, indicating the data was already in sorted order in the previous linked list
Run time is the same O(N^2)
Simple sorts are not the greatest Insertion sort – a relatively poor algorithm – each step requires either no movement or a data move of 1 location – really out-of-order data therefore requires many more comparisons and moves
Shellsort moves data more than one position at a time, which makes it more efficient Shellsort doesn't compare items that are close together, but rather items that are farther apart Shellsort breaks the array into subarrays that we run insertion sort on Break into subarrays of every Kth item, sort those, and reduce K until the whole array is sorted
Shellsort breaks up the array as such:
1 2 3 4 1 2 3 4   k = 4
The 1s are compared and sorted first, then the 2s, the 3s, and the 4s
Now that we've gone through one round of that, the k value is reduced:
1 2 1 2 1 2 1 2   k = 2
The 1s are compared and sorted, then the 2s
1 1 1 1 1 1 1 1   k = 1
Once k = 1, we do a regular insertion sort, but the data is already nearly sorted, which greatly reduces the runtime
Shellsort does insertion sorts of subarrays, and at the end a final insertion sort of the whole array Shellsort has better runtime despite performing several insertion sorts Insertion sort runs faster the more sorted the initial data is, and shellsort moves the data toward a more sorted state
Shellsort: Best Case - O(N), Good Case - O(N^(3/2)) Compare Insertion Sort - O(N^2)
Input – the array, int first, int last
k = N/2 is a simple way to determine the subarrays (the initial gap)
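A minimal shellsort sketch using that simple k = N/2 gap sequence (other gap sequences exist; this one matches the notes):

```java
public static void shellSort(int[] a) {
    for (int k = a.length / 2; k >= 1; k /= 2) {  // gaps: N/2, N/4, ..., 1
        for (int i = k; i < a.length; i++) {      // gap-k insertion sort of the subarrays
            int item = a[i];
            int j = i;
            while (j >= k && a[j - k] > item) {
                a[j] = a[j - k];                  // shift within the k-spaced subarray
                j -= k;
            }
            a[j] = item;
        }
    }                                             // the final k=1 pass is a regular insertion sort
}
```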
Lower Bound - No comparison-based sorting algorithm can do better than N log N in the worst case We don't yet know if we can actually achieve this
Divide and Conquer can help us do this. Ex. Binary Search implemented with recursion
How do we divide the problem into subproblems?
How do we use the solutions of the subproblems to determine the overall solution?
These questions are answered differently in MergeSort and QuickSort
How do we divide the problem? – Just break in half, and continue breaking in half until there are subarrays containing 1 item each (0-7) –> (0-3) & (4-7) –> (0-1) & (2-3) & (4-5) & (6-7) –> (0) & (1) & (2) & (3) & (4) & (5) & (6) & (7) Once we have 1 item each, we've reached our base case and the recursive breaking is complete Each 1-item subarray is trivially sorted, since there is nothing to compare it with – the sorting work is done when putting the pieces back together
How do we use subproblem solutions to solve the overall problem? – look at the first item of each subarray and repeatedly take the lower value into a merged array i.e. 20, 50, 60, 80 and 10, 30, 40, 70 Merged - 10, 20, 30, 40, 50, 60, 70, 80 Merge the two subarrays together into the overall merged array by always comparing their first remaining values
Data is copied from the subarrays into a sorted temp array and then back to the original MergeSort DOES NOT sort in place This adds linear overhead to the run time – but does not make it asymptotically worse We only have one temp array, PASSED INTO the recursive call, rather than making a new temp array at each recursive call
There is a case where we don't need to merge: if the largest value on the sorted left is less than the smallest value on the sorted right – could potentially save us a merge
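A sketch pulling these pieces together, with one temp array passed in and the optional merge-skipping check (index arithmetic is illustrative):

```java
// Sorts a[first..last]; temp is allocated once by the caller and reused.
public static void mergeSort(int[] a, int[] temp, int first, int last) {
    if (first >= last) return;               // base case: logical size 0 or 1
    int mid = (first + last) / 2;
    mergeSort(a, temp, first, mid);          // sort the left half
    mergeSort(a, temp, mid + 1, last);       // sort the right half
    if (a[mid] <= a[mid + 1]) return;        // largest of left <= smallest of right: skip merge
    int i = first, j = mid + 1, k = first;
    while (i <= mid && j <= last)            // repeatedly take the smaller front item
        temp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid) temp[k++] = a[i++];     // copy leftovers from the left half
    while (j <= last) temp[k++] = a[j++];    // copy leftovers from the right half
    for (k = first; k <= last; k++) a[k] = temp[k];  // copy back to the original
}
```

Note the `a[i] <= a[j]` comparison: taking from the left on ties is what makes mergesort stable (see the stability discussion later).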
Comparisons are the key instruction The analysis is more difficult because recursion is used instead of loops
Analyze level by level – at each level, each call gives rise to two calls that are each 1/2 the size of the original – the total work is the same at each level because there are more subarrays but of smaller sizes, which cancels out to O(N) work per level – with log2 N levels, multiplying gives O(N log2 N) total
We recursively go down the left side, breaking further and further until the base case is reached… once the base case is reached we begin merging and going back up, but we don't begin merging until we've recursed all the way down Once we've hit the base case at the left, we go up a level, go back down the other side, merge, and continue
lg N levels - each call of size X spawns 2 calls of size X/2 For each call of size X, the worst case for merging is about X comparisons
Sum – N + 2(N/2) + 4(N/4) + … this is the total amount of merging work
Formally expressed with N = 2^k:
(2^0)(2^k) + (2^1)(2^(k-1)) + (2^2)(2^(k-2)) + … + (2^k)(2^0) = (k+1)(2^k)
In each term, the first factor is the number of problems / calls and the second factor is the size of each problem / call
(k+1)(2^k) with k = lg2 N –> (lg2 N + 1)(N) = O(N lg2 N) Binary search has lg N levels but only 1 operation per level; MergeSort has lg N levels with N operations per level
Towers of Hanoi also makes an execution tree with N levels, because each call of size X makes two calls of size X-1… one move per call, with overall runtime determined by the number of calls made: O(2^N) – geometric sum
MergeSort is asymptotically optimal, achieving the lower bound asymptotically It is not an in-place sort; it requires memory overhead from copying into the temp array and then back to the original This slows down the execution in real time
QuickSort avoids this by sorting in place while still using Divide and Conquer
How to divide the problem: instead of dividing by index, divide into sections based on a pivot value
Gives 3 groups of data:
Data <= pivot, on the left of the pivot
The pivot itself
Data >= pivot, on the right of the pivot
The division will not generally be equal on either side of the pivot
The pivot can be any value… for simplicity we can use the last item
Ex… 50, 80, 60, 20, 30, 10, 70, 40 Pivot is 40
10, 30, 20  |  40   |  50, 80, 60, 70
 <= pivot     pivot    >= pivot
The pivot is now in its correct spot
QuickSort(A):
  if (size of A > 1)  [not at base case]
    Choose a pivot value
    Partition A into left and right sides based on the pivot
    Recursively sort the left side
    Recursively sort the right side
Once the recursive calls complete, we are done… sorting is done in place, so no merging is required We have gotten rid of the merge overhead
Partitioning – start with counters at the leftmost and rightmost indices of the array (excluding the pivot)
If the data at the left counter is less than the pivot, leave it where it is and advance
If the data at the right counter is greater than the pivot, leave it where it is and advance
If there is a value on either side that does not abide by these rules, swap the two values at the counters – then continue with the counters from where they are
Special case when the counters meet or cross… this tells us the partition algorithm is complete – we now have to move the pivot into the correct spot between the two counters
Swap the counter position closest to the pivot with the pivot to get the pivot into its correct spot
Now that we’ve partitioned, we need to recursively sort the items on the left side and then the right side
Partition breaks the array into left and right sides, and the return value is the index where the pivot is located First call quicksort from the first value to pivot - 1 Then call quicksort from pivot + 1 to the end Inputs are the array, first index, and last index
Partition algorithm – input is the array, int first, and int last
pivot index = last, indexFromLeft = first, indexFromRight = last - 1
Outer while loop – until done
Inner while loop – increment indexFromLeft as long as its value is less than the pivot
Inner while loop – decrement indexFromRight as long as its value is greater than the pivot
This second loop contains the condition indexFromRight > first – needed in case all of the values are greater than the pivot, which would otherwise make the loop run past the array and cause an IndexOutOfBounds exception
The condition is not needed in the first while loop because even if all values on the left are less than the pivot, we would reach the pivot itself before going past the array
Now swap the values at indexFromLeft and indexFromRight – after making sure the counters haven't crossed, i.e. we haven't gone through the entire array
If we haven't gone through the entire array, we loop all over again
After the loop completes, swap the pivot with the value where the counters met to put the pivot in its correct spot
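A sketch of this simple last-index-pivot partition plus the quicksort driver (variable names are illustrative):

```java
// Partitions a[first..last] around the pivot a[last]; returns the pivot's final index.
public static int partition(int[] a, int first, int last) {
    int pivot = a[last];
    int fromLeft = first, fromRight = last - 1;
    while (true) {
        while (a[fromLeft] < pivot) fromLeft++;   // the pivot itself stops this loop
        while (fromRight > first && a[fromRight] > pivot) fromRight--;  // guarded on the right
        if (fromLeft >= fromRight) break;         // counters met or crossed: partition done
        int tmp = a[fromLeft]; a[fromLeft] = a[fromRight]; a[fromRight] = tmp;
        fromLeft++; fromRight--;                  // continue from the same counters
    }
    int tmp = a[fromLeft]; a[fromLeft] = a[last]; a[last] = tmp;  // pivot to its correct spot
    return fromLeft;
}

public static void quickSort(int[] a, int first, int last) {
    if (first < last) {                    // base case: size <= 1 needs no work
        int p = partition(a, first, last);
        quickSort(a, first, p - 1);        // recursively sort the left side
        quickSort(a, p + 1, last);         // recursively sort the right side
    }
}
```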
Mergesort – O(N log2 N) in best case and average case
Quicksort – doesn't always divide evenly, so there is more variability in the asymptotic analysis
Optimal case – the pivot value is the middle value – an equal number of items on both sides In this case it behaves like mergesort and we get O(N log2 N), but without the overhead of a temp array
Worst case – the pivot is an extreme value – everything is greater or everything is less Comparisons are done in the partition algorithm: compare N-1 items to the pivot – all N-1 items fall on one side – so instead of the usual 2 recursive calls, we get a single recursive call of size N-1 That call makes N-2 comparisons to its pivot with a recursive call of size N-2, then a third call of size N-3, and this continues until we reach the base case This is the worst case because the most comparisons are being made Results in N(N-1)/2 –> O(N^2) – same as the primitive sorts The worst case is bad because it violates divide and conquer; we never actually divided the problem
Which runtime case is more likely? Depends on the original data distribution and how the pivot is chosen
Worst case for quicksort – data already sorted and the pivot is the rightmost value The same worst case occurs with reverse-sorted data Quicksort has an average / expected runtime of O(N lg N) but has the potential for a worst case of O(N^2) Need to rethink how to choose the pivot
Median of three optimizes quicksort by choosing the pivot better It is better to choose the pivot by looking at multiple locations, to reduce the probability of hitting the worst case O(N^2)
Instead of choosing the pivot at a single index, look at 3 values from the array: A[first], A[mid], A[last] – examined prior to the partition Sort the 3 candidate values relative to each other… i.e. Original: A[first] = 50, A[mid] = 20, A[last] = 40 Rearranged – sorted: A[first] = 20, A[mid] = 40, A[last] = 50
Now we choose A[mid] as the pivot, it is the median of the 3 values that we checked
Best Case: already sorted data, formerly the worst case, now yields the best case with median of three, since the middle value is the true median and the splits are even
Input – array, int first, int last First sort the 3 indices we picked A[mid] is now the pivot Need to move the pivot toward the right side of the array – move it to A[last - 1], because we know that A[last] is greater than A[mid] We start the partition algorithm at A[first + 1], because we know A[first] is smaller than A[mid]
The simple partition has an extra condition in its while loop (for the case where all values are greater than the pivot, to prevent an index out of bounds exception), but with median of three this condition does not have to be checked, because there is no way for that case to occur: we already have a value smaller than the pivot in A[first]
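A sketch of median-of-three pivot selection (assumes the subarray has at least 3 items; helper names are illustrative):

```java
// Sorts a[first], a[mid], a[last] in place, then parks the median (the pivot) at last-1.
public static int medianOfThree(int[] a, int first, int last) {
    int mid = (first + last) / 2;
    if (a[mid] < a[first])  swap(a, mid, first);
    if (a[last] < a[first]) swap(a, last, first);
    if (a[last] < a[mid])   swap(a, last, mid);   // now a[first] <= a[mid] <= a[last]
    swap(a, mid, last - 1);                       // pivot moves next to the end
    return last - 1;                              // partition can then run first+1 .. last-2
}

private static void swap(int[] a, int i, int j) {
    int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
}
```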
Median of three removes the specific worst case of sorted data with the last value as pivot, but it doesn't remove the possibility of O(N^2) All median of three guarantees is that there will be at least one item less than the pivot and one item greater than the pivot Median of three leaves some likelihood of the O(N^2) worst case, but makes it much less likely to occur
Expected run time O(N log2 N) Worst Case - O(N^2) - less likely, and fewer scenarios where this would come up using median of three
Pivot index is chosen randomly
Choose random index between 1st and last and swap it to the end of the array
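A sketch of the random-pivot step (reuses the partition sketch from above):

```java
static final java.util.Random RNG = new java.util.Random();

// Pick a random index in [first, last], swap it to the end, then partition as usual.
public static int randomPartition(int[] a, int first, int last) {
    int r = first + RNG.nextInt(last - first + 1);  // random pivot index
    int tmp = a[r]; a[r] = a[last]; a[last] = tmp;  // move it to the usual pivot position
    return partition(a, first, last);               // simple last-index partition
}
```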
Worst Case – just as bad as simple quicksort – but very unlikely
Average Case – Nlog2N
Why is worst case so unlikely ???
Probability that the pivot we choose is an extreme value — 2/N.. then the same on the next call – 2/(N-1), then the next – 2/(N-2), etc. What is the probability that we choose an extreme in every call of partition? — roughly 2^N / N! – this is highly unlikely even for small arrays, and becomes incredibly unlikely the larger the array gets
Worst case is still O(N^2), but its probability is infinitesimally small and it will almost never occur Best Case - O(N lg N)
This is because N! grows much faster than 2^N … the factorial is hyperexponential rather than exponential
Downside – there is some overhead in having to pick random numbers Need to run empirical analysis to know for sure
Dual-pivot quicksort: use two pivots rather than one This gives 3 regions in the partitioned array: values less than the first pivot, values between the first and second pivots, values greater than the second pivot
[ Less than first | P[first] | Mid values | P[second] | Greater than second ]
This is very efficient, and it is what Java uses (for primitive types)
Stop recursing when the logical size is 1 (the base case) However, this is not always efficient… at some point the cost of the recursive calls outweighs the benefit of Divide and Conquer
Instead we can stop at some base case > 1 and, rather than recursing further, use another sorting algorithm with better runtime than quicksort for small sizes Insertion sort is a good choice – for very small and partially sorted arrays, it has a best case of N
Recurse with quicksort until a predefined base case > 1, then call insertion sort to finish sorting that piece
ALTERNATIVELY – do nothing at the base case, leaving the array in its partially sorted fashion.. after all the recursive calls finish, call insertion sort once on the mostly sorted array Requires empirical testing to find out which option is better
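A sketch of the first strategy (the cutoff value is a placeholder to be tuned empirically; reuses the partition sketch from above):

```java
static final int CUTOFF = 10;  // assumed cutoff; the right value comes from empirical testing

// Quicksort down to small subarrays, then let insertion sort finish them.
public static void hybridQuickSort(int[] a, int first, int last) {
    if (last - first + 1 <= CUTOFF) {
        insertionSortRange(a, first, last);   // small piece: switch algorithms
    } else {
        int p = partition(a, first, last);
        hybridQuickSort(a, first, p - 1);
        hybridQuickSort(a, p + 1, last);
    }
}

private static void insertionSortRange(int[] a, int first, int last) {
    for (int i = first + 1; i <= last; i++) {
        int item = a[i];
        int j = i;
        while (j > first && a[j - 1] > item) { a[j] = a[j - 1]; j--; }
        a[j] = item;
    }
}
```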
MergeSort – more consistent runtime (N log N) Quicksort – in the normal case, quicksort outperforms mergesort (its worst case is N^2, but quicksort is done in place and in practice outperforms mergesort)
Quicksort is not stable Given X1 = X2 with X1 before X2, in a stable algorithm X1 will still be before X2 in the sorted data (MergeSort is stable) This is not guaranteed in quicksort; X2 may end up before X1
For complex objects, it may be better to use MergeSort to preserve an existing ordering on data independent of the sort key, since quicksort may lose that ordering
Java uses MergeSort for objects and Quicksort for primitive types
Why is mergesort stable? – merge compares the pre-sorted left data with the pre-sorted right data, and if they are equal, takes the left data before the right
Quicksort does not do this – comparisons are done during the partition method It looks for values on the wrong sides of the array and swaps them – this swapping can move equal values past each other, placing them in a different relative order
Can do mergesort on a linked list, but with more overhead On an array we divide by index; this is not possible on a linked list, which requires traversal to find the middle (this is the overhead) – the overhead is linear, hence trivial Maintains the asymptotic runtime of N log2 N Rather than merging into a temp array, we merge into a new linked list
Quicksort can be done with a linked list, but it loses a lot of its speed and would require a doubly linked list
Mergesort better for linked list