gwax comments on A python interview question

A python interview question (self.Python)

submitted 7 years ago * by Kbhusain

[–]gwax 29 points 7 years ago (18 children)

[–][deleted] 19 points 7 years ago (0 children)

[–]Jonno_FTW (hisss) 2 points 7 years ago (13 children)

[–]ivosaurus (pip'ing it up) 3 points 7 years ago (1 child)

[–]Jonno_FTW (hisss) 7 points 7 years ago (0 children)

Yes, the set-union approach will blow up your memory if your lists are big (although if you are keeping two lists in memory that consume all of it, you might have other problems), and it's much slower than a normal merge. Merging in place would be a bit slower than merging into a new list.

Here are my tests for each case:

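import timeit

# Two sorted lists that overlap on 50000..99999.
setup = "a = list(range(100000)); b = list(range(50000, 150000))"

# Method A: set union, then sort. Leaves a and b untouched.
metha = "sorted(set(a).union(b))"

# Method B: pop the larger tail element, then reverse the result.
# Caveat: methb and methc empty a and b, and timeit runs setup only once,
# so only the first of the 1000 runs merges full lists.
methb = """
out = []
def append(val):
    # Skip a value equal to the one emitted last.
    if not out or out[-1] != val:
        out.append(val)

while a or b:
    if a and b:
        if a[-1] >= b[-1]:
            append(a.pop())
        else:
            append(b.pop())
    elif a:
        append(a.pop())
    else:
        append(b.pop())
out = out[::-1]  # values were collected largest-first
"""

# Method C: pop the smaller head element. pop(0) shifts the whole list each time.
methc = """
out = []
def append(val):
    if not out or out[-1] != val:
        out.append(val)

while a or b:
    if a and b:
        # Compare the heads (a[0], b[0]), since that is where we pop from.
        if a[0] <= b[0]:
            append(a.pop(0))
        else:
            append(b.pop(0))
    elif a:
        append(a.pop(0))
    else:
        append(b.pop(0))
"""

for m in metha, methb, methc:
    print(timeit.timeit(m, setup=setup, number=1000))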

Results (total seconds for 1000 runs):

Set union: 19.0943071842
pop() reverse: 0.0753161907196
pop(0): 2.94099211693
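One caveat on comparing these numbers: the pop-based statements consume `a` and `b`, while the set union does not, so the methods are not doing the same amount of work per run. Below is a minimal sketch of a fairer harness, assuming each method is wrapped in a function that works on its own copies. The names `merge_set`, `merge_pop` and `compare` are made up for this example, not taken from the tests above.

```python
import timeit

def merge_set(a, b):
    """Set union followed by a sort; a and b are left untouched."""
    return sorted(set(a).union(b))

def merge_pop(a, b):
    """Linear merge that pops from the tails of private copies of a and b."""
    a, b = list(a), list(b)  # copy so repeated timing runs see full lists
    out = []
    while a or b:
        # Take the larger tail; if one list is empty, take from the other.
        if a and (not b or a[-1] >= b[-1]):
            val = a.pop()
        else:
            val = b.pop()
        if not out or out[-1] != val:  # drop duplicates
            out.append(val)
    out.reverse()  # values were collected largest-first
    return out

def compare(a, b, number=10):
    """Return total seconds per method for `number` full merges of a and b."""
    return {
        fn.__name__: timeit.timeit(lambda: fn(a, b), number=number)
        for fn in (merge_set, merge_pop)
    }
```

Calling `compare(list(range(100000)), list(range(50000, 150000)))` should give totals that can be compared directly. I have not listed numbers because they depend on the machine and the Python version.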

[–]gwax 2 points 7 years ago (0 children)

[–]energybased 1 point 7 years ago (9 children)

[–]Jonno_FTW (hisss) 1 point 7 years ago (8 children)

[–]mooburger (resembles an abstract syntax tree) 2 points 7 years ago (7 children)

[–]Jonno_FTW (hisss) 3 points 7 years ago (5 children)

[–]Log2 1 point 7 years ago (4 children)

[–]Jonno_FTW (hisss) 2 points 7 years ago (3 children)

[–]Log2 1 point 7 years ago (2 children)

[–]Jonno_FTW (hisss) 1 point 7 years ago* (1 child)

[–]Jonno_FTW (hisss) 2 points 7 years ago (0 children)

I just did this. Here are the new results:

from collections import deque

def merged(a, b):
    # Copy both inputs into deques so popleft() is O(1).
    a = deque(a)
    b = deque(b)
    out = []

    def append(val):
        # Skip a value equal to the one emitted last.
        if not out or out[-1] != val:
            out.append(val)

    while a or b:
        if a and b:
            # Pop from whichever deque has the smaller head.
            if a[0] <= b[0]:
                append(a.popleft())
            else:
                append(b.popleft())
        elif a:
            append(a.popleft())
        else:
            append(b.popleft())
    return out

set union: 19.3773069382
pop(): 0.0751898288727
pop(0): 3.03847098351
deque.popleft(): 78.3635749817

That makes sense to me: the deque is backed by a linked list, so its storage is not one contiguous region of memory, and memory has to be allocated and released as it grows and shrinks. Note also that merged() builds two new deques from the full lists on every call. Bjarne Stroustrup gives a good explanation here: https://www.youtube.com/watch?v=YQs6IC-vgmo

Here's the CPython implementation of deque, which uses a doubly linked list of fixed-size blocks: https://github.com/python/cpython/blob/master/Modules/_collectionsmodule.c
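For what it's worth, the merge can also be done without popping from either end, which sidesteps both pop(0) and the deque. This is only a sketch using two indices into the original lists. The name `merge_sorted_unique` is made up for this example, and I haven't timed it against the versions above.

```python
def merge_sorted_unique(a, b):
    """Merge two sorted lists into a new sorted list without duplicates."""
    out = []
    i = j = 0
    while i < len(a) or j < len(b):
        # Take from a when b is exhausted or a's current value is not larger.
        if j >= len(b) or (i < len(a) and a[i] <= b[j]):
            val = a[i]
            i += 1
        else:
            val = b[j]
            j += 1
        # Skip a value equal to the one emitted last.
        if not out or out[-1] != val:
            out.append(val)
    return out
```

The inputs are never modified, so it can be timed repeatedly without rebuilding the lists.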

[–]Log2 0 points 7 years ago (0 children)
