The Flying Dutchman: QuickSort

QuickSort is a divide-and-conquer algorithm that splits up an array into two halves, and recursively sorts each half. In other words, to sort an array, we pick a random pivot, split the array in three section, being smaller, equal, or larger than the pivot, and recursively sort the array:

  def qsort(array):
      if len(array) < 2: return array
      pivot = array[0]
      less = filter(lambda n: n<pivot, array)
      equal = filter(lambda n: n==pivot, array)
      larger = filter(lambda n: n>pivot, array)
      return qsort(less) + equal + qsort(larger)

In the original implementation of QuickSort, the first element in the array was picked as pivot, as is done on line 3 above. Below, the red bar indicates the choice of pivot. At each step of the algorithm, it is at the left of the array being sorted:

Choosing the first element as pivot can generate extra work when the array is already (partially) sorted. Rather than picking the first element, the middle element can be chosen, behaving better on already sorted arrays:

def qsort(array):
    if len(array) < 2: return array
    pivot = array[len(array)/2]
    less = filter(lambda n: n<pivot, array)
    equal = filter(lambda n: n==pivot, array)
    larger = filter(lambda n: n>pivot, array)
    return qsort(less) + equal + qsort(larger)

When choosing the value in the middle of the array as pivot, yields the following result, with again the red bar indicating the pivot:

Rather than pick the middle element as pivot, a random value can be picked from the array:

def qsort(array):
    if len(array) < 2: return array
    pivot = array[random.randint(0, len(array)-1)]
    less = filter(lambda n: n<pivot, array)
    equal = filter(lambda n: n==pivot, array)
    larger = filter(lambda n: n>pivot, array)
    return qsort(less) + equal + qsort(larger)

The pivot is now picked randomly as shown in the visualization below:

The final approach to pivot selection we will look at was invented by Robert Sedgewick and involves picking the median of three values:

def qsort(array):
    if len(array) < 2: return array
    pivot = (array[0] + array[len(array)/2] + array[-1])/3
    less = filter(lambda n: n<pivot, array)
    equal = filter(lambda n: n==pivot, array)
    larger = filter(lambda n: n>pivot, array)
    return qsort(less) + equal + qsort(larger)

The pivot is now picked by looking at three different values as shown in the visualization below:

Aside from how we pick the pivot, all versions of QuickSort shown above have the same thing in common that they use recursion for implementing the divide and conquer tactic. Recursion has overhead in that new stackframes need to be constructed for each call. Recursion also limits the maximum size of the array we can sort, but that is more of a theoretical challenge.

QuickSort can also be implement without using recursion, by using a user-defined stack to keep track of the work to be done. A possible implementation:

def qsort(array):
    work = [array]
    result = []
    while work:
        array = work.pop(0)
        if len(array) < 2:
            result.extend(array)
        else:
            pivot = array[0]
            work = [
                filter(lambda n: n<pivot, array),
                filter(lambda n: n==pivot, array),
                filter(lambda n: n>pivot, array),
            ] + work
    return result

And the execution (this time using the first element as pivot again):

In general, QuickSort is not a stable sort. Elements of equal value are not guaranteed to remain in their original sort order for most implementations. However, the implementations above keep the elements that are equal to the pivot in the original order as they were found. That makes the above implementation stable.

Check out more algorithm visualizations at PyAlgoViz.

Python code styled as tango by hilite.me with border:1px solid #ddd;padding:11px 7px;

The Flying Dutchman

Sunday, December 18, 2016

QuickSort

No comments:

Post a Comment