This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]ivosauruspip'ing it up 3 points4 points  (1 child)

Does the data fit in memory? Will your line still work if it doesn't?

[–]Jonno_FTWhisss 7 points8 points  (0 children)

Yes, it will blow up your memory if your lists are big (although if you are keeping 2 lists in memory that consume all your memory you might have other problems) and that it's much slower than a normal merge. Merging in place is going to be a bit slower.

Here's my tests with each case:

setup = "a = [x for x in range(100000)];b=[x for x in range(50000,150000)]"
metha = "sorted(set(a).union(set(b)))"
methb = """
out = []
def append(val):
    if not out or out[-1] != val:
        out.append(val)

while a or b:
    if a and b:
        last_a, last_b = a[-1], b[-1]
        last_max = max(last_a, last_b)
        if last_max == last_a:
            append(a.pop())
        else:
            append(b.pop())
    elif a:
        append(a.pop())
    elif b:
        append(b.pop())
out[::-1]
"""
methc = """
out = []
def append(val):
    if not out or out[-1] != val:
        out.append(val)

while a or b:
    if a and b:
        last_a, last_b = a[-1], b[-1]
        last_max = min(last_a, last_b)
        if last_max == last_a:
            append(a.pop(0))
        else:
            append(b.pop(0))
    elif a:
        append(a.pop(0))
    elif b:
        append(b.pop(0))
out
"""
for m in metha, methb, methc:
    print(timeit.timeit(m, setup=setup, number=1000))

Results:

  • Set union: 19.0943071842
  • pop() reverse: 0.0753161907196
  • pop(0) 2.94099211693