## Friday, January 6, 2017

### The 3Sum Problem

For a given list of numbers, the 3Sum problem asks whether there is at least one combination of three numbers in the list that sums to zero. The same problem is also listed on LeetCode, with a small twist: all unique solutions are to be returned, with duplicates removed.

A few examples:

```
[0, 0, 0] ==> [[0, 0, 0]]
[-1, 0, 1, 2, -1, -4] ==> [[-1, -1, 2], [-1, 0, 1]]
[-2, 0, -2, 1, 0, 4, 0, -1, -2, 0, -2, 1, 0, 4, 0, -1] ==> [[-1, 0, 1], [-2, -2, 4], [0, 0, 0], [-2, 1, 1]]
```

The naive, brute-force, yet Pythonic approach is to use itertools.combinations to generate all possible triples. From those triples, we filter out the ones that sum to zero. We then sort each triple and add it to a set to remove the duplicates, temporarily converting each triple to a tuple, as lists are not hashable in Python. Finally, we convert the set of unique results into a list of lists, as requested by LeetCode:

```
import itertools

def threesum_brute_force(L):
    triples = itertools.combinations(L, 3)  # O(n^3)
    zeroes = [t for t in triples if sum(t) == 0]
    return map(list, set([tuple(sorted(t)) for t in zeroes]))
```
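For readers on Python 3, where `xrange` is gone and `map` returns a lazy iterator, here is a self-contained sketch of the same brute-force idea, run against the second example from the top of the post:

```python
import itertools

def threesum_brute_force(L):
    # All 3-element combinations, filtered down to the zero-sum ones.
    triples = itertools.combinations(L, 3)
    zeroes = [t for t in triples if sum(t) == 0]
    # Sort each triple and deduplicate via a set of tuples.
    return [list(t) for t in set(tuple(sorted(t)) for t in zeroes)]

result = threesum_brute_force([-1, 0, 1, 2, -1, -4])
# Two unique triples: [-1, -1, 2] and [-1, 0, 1]
```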

That approach is O(n^3) because we compare all possible combinations. It may look like we actually create them ahead of time before doing any further analysis, but itertools.combinations is implemented as a generator function, so we only produce one triplet at a time, when needed. For a list with 800 elements, we end up generating and comparing 510,081,600 triples.
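The lazy behavior is easy to verify: building the generator costs nothing up front, and triples only materialize as we pull them out.

```python
import itertools

# Creating the generator does no work yet; triples appear on demand.
triples = itertools.combinations(range(800), 3)
first = next(triples)   # only now is the first triple produced
print(first)            # (0, 1, 2)

# For a small input we can afford to exhaust the generator and count:
count = sum(1 for _ in itertools.combinations(range(10), 3))
print(count)            # 120, i.e. 10 choose 3
```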

A similar implementation uses three nested loops to create each of the combinations. This is still O(n^3), now showing clearly as we iterate over i, j, and k. This version runs much faster than the previous one, as we avoid creating all the intermediate tuples. We are still doing a lot of repetitive work, though.

```
def threesum_triple_loop(L):
    result = set()
    for i in xrange(len(L) - 2):
        for j in xrange(i + 1, len(L) - 1):
            for k in xrange(j + 1, len(L)):  # O(n^3)
                if L[i] + L[j] + L[k] == 0:
                    result.add(tuple(sorted((L[i], L[j], L[k]))))
    return map(list, result)
```
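The same triple loop in Python 3 compatible form, as a self-contained sketch that can be run directly:

```python
def threesum_triple_loop(L):
    result = set()
    n = len(L)
    for i in range(n - 2):
        for j in range(i + 1, n - 1):
            for k in range(j + 1, n):  # O(n^3) index triples
                if L[i] + L[j] + L[k] == 0:
                    # Sort by value so duplicates collapse in the set.
                    result.add(tuple(sorted((L[i], L[j], L[k]))))
    return [list(t) for t in result]

triples = threesum_triple_loop([-1, 0, 1, 2, -1, -4])
```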

Rather than comparing all triples, at O(n^3) time complexity, this problem can be solved in O(n^2). The insight is to first sort all the numbers. Then we loop from left to right, so at each step the next number is at least as large as the one before. Rather than using two nested loops to find the remaining two elements of the triple, we maintain a region bounded by two indices, j and k.

For each number at i, we sum it with the two numbers at the edges of the region. If the sum is greater than zero, we are too far to the right and we move k to the left, shrinking the region. If the sum is less than zero, we are too far to the left and we move j to the right instead. This region-shrinking strategy is commonly known as the two-pointer technique, and it avoids the repetitive work of the brute-force versions.
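To make the pointer movement concrete, here is a short trace over the sorted LeetCode example, fixing i at the first -1 and recording the pair of edge values with their sum at each step:

```python
L = sorted([-1, 0, 1, 2, -1, -4])   # -> [-4, -1, -1, 0, 1, 2]
i, j, k = 1, 2, 5                   # L[i] == -1; the region spans j..k
steps = []
while j < k:
    s = L[i] + L[j] + L[k]
    steps.append((L[j], L[k], s))
    if s == 0:
        j += 1                      # zero found: shrink from both sides
        k -= 1
    elif s > 0:
        k -= 1                      # sum too large: move the right edge in
    else:
        j += 1                      # sum too small: move the left edge in

print(steps)  # [(-1, 2, 0), (0, 1, 0)] -- both triples containing this -1
```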

A final optimization is to stop once the number at i reaches zero. After that point all three numbers are non-negative, so the only remaining zero-sum triple is [0, 0, 0], and that case is already covered while processing the first zero.

```
def threesum(L):
    L = sorted(L)
    result = set()
    for i in xrange(len(L) - 2):
        j = i + 1
        k = len(L) - 1
        while j < k:
            s = L[i] + L[j] + L[k]
            if s == 0:
                result.add((L[i], L[j], L[k]))  # already sorted, as i < j < k
                j += 1
                k -= 1
            elif s > 0:
                k -= 1
            else:
                j += 1
        if L[i] >= 0:
            break
    return map(list, result)
```
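A Python 3 compatible version of the same two-pointer sketch, run against the longest example from the top of the post:

```python
def threesum(L):
    L = sorted(L)
    result = set()
    for i in range(len(L) - 2):
        j, k = i + 1, len(L) - 1
        while j < k:
            s = L[i] + L[j] + L[k]
            if s == 0:
                result.add((L[i], L[j], L[k]))  # already sorted, as i < j < k
                j += 1
                k -= 1
            elif s > 0:
                k -= 1  # sum too large: pull the right edge in
            else:
                j += 1  # sum too small: pull the left edge in
        if L[i] >= 0:
            break       # only non-negative numbers remain
    return [list(t) for t in result]

triples = threesum([-2, 0, -2, 1, 0, 4, 0, -1, -2, 0, -2, 1, 0, 4, 0, -1])
```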

With the two pointers, we narrowed down the search space dramatically. However, in the above solution, each time we increment i, we pick the next k by selecting the last number in the list, which is also the maximum number. As i gets closer to zero, this k becomes less appropriate, and we end up doing a linear search from k towards i to find a smaller positive number that yields a zero sum. The optimal starting k can be found more efficiently with a binary search, which makes the overall algorithm faster still.

A similar argument applies to j. Considering the maximum number in the list, it may make little sense to start j at i + 1. Namely, if L[i] + L[j] + L[k] == 0, then L[j] must equal -(L[i] + L[k]). For example, assume the number at i is -30 and the maximum of the list is 24. In that case, rather than pointing j at a number such as -29, we can make j skip all the way ahead to 6, as that is the first value that could possibly yield a zero sum. Again, a binary search gives us this starting point for j more efficiently than a linear search, if we are dealing with a large number of elements:

```
def threesum_binarysearch(L):
    L = sorted(L)
    n = len(L)
    result = set()
    i = 0
    while i < n - 2:
        # Seed j and k near the first indices that can yield a zero sum.
        j = binarySearch(L, i + 1, n - 2, -(L[i] + L[-1]))
        k = binarySearch(L, j + 1, n - 1, -(L[i] + L[j]))
        while j < k and k < n:
            s = L[i] + L[j] + L[k]
            if s == 0:
                result.add((L[i], L[j], L[k]))
                j += 1
                k -= 1
            elif s > 0:
                k -= 1
            else:
                j += 1
        if L[i] >= 0:
            break
        i += 1
    return map(list, result)
```

The binary search finds these starting points for j and k in O(log n), rather than O(n). When the target value is absent from the list, it returns the insertion point, which still serves as a good index to start from:

```
def binarySearch(L, min, max, target):
    while True:
        if max < min:
            return min  # target absent: return the insertion point
        mid = (min + max) // 2
        if L[mid] < target:
            min = mid + 1
        elif L[mid] > target:
            max = mid - 1
        else:
            return mid
```
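Putting the two pieces together, here is a self-contained, Python 3 compatible sketch of the binary-search variant. The helper is renamed binary_search with lo/hi parameters to avoid shadowing the built-ins min and max; it should return the same unique triples as the simpler versions:

```python
def binary_search(L, lo, hi, target):
    # Returns an index of target within L[lo..hi], or the insertion point.
    while lo <= hi:
        mid = (lo + hi) // 2
        if L[mid] < target:
            lo = mid + 1
        elif L[mid] > target:
            hi = mid - 1
        else:
            return mid
    return lo

def threesum_binarysearch(L):
    L = sorted(L)
    n = len(L)
    result = set()
    i = 0
    while i < n - 2:
        # Seed j and k near the first indices that can yield a zero sum.
        j = binary_search(L, i + 1, n - 2, -(L[i] + L[-1]))
        k = binary_search(L, j + 1, n - 1, -(L[i] + L[j]))
        while j < k and k < n:
            s = L[i] + L[j] + L[k]
            if s == 0:
                result.add((L[i], L[j], L[k]))
                j += 1
                k -= 1
            elif s > 0:
                k -= 1
            else:
                j += 1
        if L[i] >= 0:
            break
        i += 1
    return [list(t) for t in result]

big = [-2, 0, -2, 1, 0, 4, 0, -1, -2, 0, -2, 1, 0, 4, 0, -1]
found = threesum_binarysearch(big)
```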

To give you an idea of the performance of each algorithm, here is a test run with 800 numbers, showing the number of L[i] + L[j] + L[k] == 0 comparisons performed by each algorithm and the total time needed:

| Algorithm | Comparisons | Time (s) |
| --- | ---: | ---: |
| Brute force | 510,081,600 | 21.098 |
| Three loops | 636,804 | 0.109 |
| Two pointers | 195,048 | 0.071 |
| Two pointers + binary search | 133,383 | 0.036 |

In short, when solving problems like 3Sum, avoid comparing all elements. This can be done by reducing the search space and ignoring candidates that cannot lead to a solution anyway. And if we can order the search space in some form, we have even more optimization opportunities: we can typically replace linear search with binary search.

Check out many more algorithms with visualizations at PyAlgoViz.
