CISC-365 October 2, 2009

20091002

We did a quick study of Binary Search, as an illustration of how we analyze D&C algorithms

Assume A[0 .. n-1] is a sorted array of n values

BinSearch(a, b, k)    // a and b are the lower and upper indices of the part of the array we are searching, and k is the value we are searching for
    if (b < a) return -1
    else
        m = (a+b)/2
        if A[m] == k    return m
        else
            if A[m] < k    return BinSearch(m+1,b, k)
            else return BinSearch(a,m-1, k)

With any algorithm, there are two questions that must be answered
    1.    Does it always give the correct answer?
    2.    How long does it take?

With Divide and Conquer algorithms, correctness is usually established with Proof by Induction.

Base Case: If n (the number of elements being searched) = 1, then by examination the BinSearch algorithm works correctly.

I.H. Suppose that for all n <= n*, where n* is some value >= 1, the BinSearch algorithm always works correctly.

Now let n = n*+1. If the algorithm finds the target value immediately, it returns the correct value. If it doesn't find the target value immediately, it recurses on a smaller set of values (i.e. a set of size <= n*). By the IH, the algorithm returns the correct value on any such set. The algorithm then returns this value, which is the correct action.

Thus the algorithm works correctly on all sets of size n*+1, and by induction it works on all sets of any size.

The computational complexity of Divide & Conquer algorithms is usually established by solving a recurrence relation. For BinSearch, the recurrence relation looks like this:

T(1) = c            i.e. the algorithm takes constant time on a set containing just one value

T(n) = c + T(n/2) in general

Solving this by expansion yields the familiar result that for Binary Search, T(n) is O(log n)

We moved on to a quick review of Quicksort, and we focussed on the importance of the selection of the pivot value and the partitioning step. If the pivot is either the smallest or the largest value in the set, one of the two sides of the partition will be empty and the other will contain n-1 values. If this happens every time we choose a pivot, the algorithm will run in O(n^2) time. However the percentage of permutations that will produce this pathological behaviour is extremely small - in practice Quicksort virtually always runs in O(n * log n) time.

However if that is not sufficient, there IS a way to guarantee that we can get a good pivot value, one which will guarantee that Quicksort will run in O(n * log n) time. Stay turned ...