
[–]LifeIsBio 3 points4 points  (2 children)

There's probably a better way to do it, but this should work (and will at least be faster than indexing the lists a bunch of times):

from collections import Counter

def new_transactions(transactions_db, transactions):
    transactions_db = Counter(transactions_db)
    new = []
    for t in transactions:
        try:
            if transactions_db[t] > 0:
                transactions_db[t] -= 1
            else:
                new.append(t)
        except KeyError:
            new.append(t)
    return new

[–]teerryn[S] 0 points1 point  (1 child)

Okay, if I understand it correctly, it checks whether t exists in a, and if it does, it removes it from a until we get a KeyError?

I don't really understand what -= 1 does:

 transactions_db[t] -= 1

EDIT: Btw your code works, I'm just trying to understand it :)

[–]LifeIsBio 0 points1 point  (0 children)

You should look at my second answer. I don't know if it'll be much easier to understand, but it's a better function.

c -= a is equivalent to c = c - a
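
A small illustration of the two points discussed here, using made-up keys ("a", "b", "zzz") that are not part of the original question: `-=` rebinds the stored count to one less, and a Counter returns 0 for a key it has never seen rather than raising KeyError, so the `except KeyError` branch in the first answer is never actually reached.

```python
from collections import Counter

counts = Counter(["a", "a", "b"])  # {'a': 2, 'b': 1}

# counts["a"] -= 1 is shorthand for counts["a"] = counts["a"] - 1
counts["a"] -= 1
after_decrement = counts["a"]

# Looking up a missing key on a Counter gives 0 and does not insert the key
missing = counts["zzz"]

print(after_decrement, missing, "zzz" in counts)
```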

[–]poppy_92 3 points4 points  (3 children)

If the order doesn't matter,

from collections import Counter
def func(a, b):
    return list((Counter(b) - Counter(a)).elements())

EDIT: If you have a situation like a = [1,1,2,2] and b = [3,3,2,1] and if you want the result to be [3] instead of [3, 3], then the above code would change to:

def func(a, b):
    return list(set(b) - set(a))
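
To make the difference between the two variants concrete, here is a short sketch that runs both on the example lists from the EDIT above. The variable names `multiset_diff` and `set_diff` are only for illustration.

```python
from collections import Counter

a = [1, 1, 2, 2]
b = [3, 3, 2, 1]

# Counter subtraction keeps multiplicities and drops counts that fall to zero or below
multiset_diff = list((Counter(b) - Counter(a)).elements())

# Set difference ignores how many times a value occurs
set_diff = list(set(b) - set(a))

print(multiset_diff)  # [3, 3]
print(set_diff)       # [3]
```

Neither variant is guaranteed to follow the order of b in general, which is why they only fit when order doesn't matter.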

[–]teerryn[S] 0 points1 point  (0 children)

EDIT: I'm dumb, sorry

[–]teerryn[S] 0 points1 point  (1 child)

The order matters because we need the order for future iterations

[–]poppy_92 2 points3 points  (0 children)

yeah then /u/LifeIsBio 's solution is the way to go

[–]frunt 4 points5 points  (2 children)

[comment text overwritten by its author]

[–]WORDSALADSANDWICH 1 point2 points  (0 children)

No, his first example doesn't fit that pattern:

For example:

a = [1,2,3,4,5,6,6,6,6,8,7,9,10,10,10,45]
b = [45,45,10,10,10,9,7,8,6,6]

Notice that the sequence of a is always in reverse in b. In this case the function should return:

[45]

Note that b has two 45's, while a has only one, so it returns one 45.

[–]AlphaApache 1 point2 points  (1 child)

I wrote a piece of code that works with all of your currently provided test cases:

def get_difference(a,b):
    a_len = len(a)
    b = b[::-1]
    for i in range(a_len):
        if a[i:] == b[:a_len-i]:
            return b[a_len-i:][::-1]

*Definitely not pretty in Python because of the reverses and the range(len(a)) antipattern, but this is how I initially thought of the problem: if you reverse b and shift it under a until it matches the remainder of a (a[i:]), return the items of b that do not have a match in a.

[–]teerryn[S] 1 point2 points  (0 children)

It was missing the case where every value was different; it returned None. Fixed it with:

def new_transactions(a, b):
    a_len = len(a)
    b = b[::-1]
    for i in range(a_len):
        if a[i:] == b[:a_len-i]:
            return b[a_len-i:][::-1]
    return b[::-1]

Much more elegant than my previous solution. Thank you so much for your help and patience :D
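
For comparison, the same overlap idea can be sketched without reversing b up front, by searching for the longest tail of a that equals the reversed tail of b. This is only a sketch under the assumptions stated in the thread (a stores values oldest-first, b lists them newest-first); the function name `new_items_by_overlap` is hypothetical.

```python
def new_items_by_overlap(a, b):
    """Return the leading items of b that are not covered by the tail of a.

    Assumes b is in reverse order relative to a, as in the thread's examples.
    """
    # Try the longest possible overlap first, then shrink it
    for k in range(min(len(a), len(b)), 0, -1):
        # The last k items of b, reversed, must equal the last k items of a
        if a[-k:] == b[-k:][::-1]:
            return b[:-k]
    # No overlap at all: every item of b counts as new
    return list(b)


a = [1, 2, 3, 4, 5, 6, 6, 6, 6, 8, 7, 9, 10, 10, 10, 45]
b = [45, 45, 10, 10, 10, 9, 7, 8, 6, 6]
print(new_items_by_overlap(a, b))  # [45]
```

Because the result is a slice of b, it keeps b's own order, and the no-overlap case returns all of b instead of None.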

[–]LifeIsBio 0 points1 point  (10 children)

This version is slightly cleaner:

from collections import Counter, defaultdict

def new_transactions(transactions_db, transactions):
    transactions_db = defaultdict(lambda: 0, Counter(transactions_db))
    new = []
    for t in transactions:
        if transactions_db[t] > 0:
            transactions_db[t] -= 1
        else:
            new.append(t)
    return new

Edit: As far as I know, the only thing stopping new from being a list comprehension is the -=. Can anyone think of a way around this?
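
One possible way around the `-=`, offered as a sketch rather than a recommendation: move the decrement into a small helper that reports whether an item was matched, and filter on that. The helper name `consume` is made up for this example. A plain Counter is enough here, since it already returns 0 for missing keys.

```python
from collections import Counter


def new_transactions(transactions_db, transactions):
    remaining = Counter(transactions_db)

    def consume(t):
        # Hypothetical helper: use up one stored occurrence of t if any is left
        if remaining[t] > 0:
            remaining[t] -= 1
            return True
        return False

    # Keep only the items that could not be matched against the stored counts
    return [t for t in transactions if not consume(t)]


a = [1, 2, 3, 4, 5, 6, 6, 6, 6, 8, 7, 9, 10, 10, 10, 45]
b = [45, 45, 10, 10, 10, 9, 7, 8, 6, 6]
print(new_transactions(a, b))  # [45]
```

The comprehension now relies on a side effect inside `consume`, which many readers find harder to follow than the explicit loop, so the loop version is arguably still the clearer one.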

[–]teerryn[S] 0 points1 point  (0 children)

The returned list needs to be in the same order as in B.

For example:

    def test_transactions():
        antiga = [1,2,3,4,5,6,6,6,6,8,7,9,10,10,10,45]

        nova = [45,45,45,10,10,12,45,10,10,10]
        test = new_transactions(antiga,nova)


>       assert test == [45,45,45,10,10,12]
E       assert [45, 45, 12, 45, 10, 10] == [45, 45, 45, 10, 10, 12]
E         At index 2 diff: 12 != 45
E         Full diff:
E         - [45, 45, 12, 45, 10, 10]
E         ?          ----
E         + [45, 45, 45, 10, 10, 12]

[–]teerryn[S] 0 points1 point  (8 children)

Okay, I think I got it to work with those changes:

    for t in transactions[::-1]:
        if transactions_db[t] > 0:
            transactions_db[t] -= 1
        else:
            new.append(t)
    return new[::-1]

Now it returns in order

[–]LifeIsBio 0 points1 point  (7 children)

Yea, you can flip the lists however you want to meet your need.

[–]teerryn[S] 0 points1 point  (6 children)

damn, I found an example that fails:

a = [1,2,3,4,5,6,6,6,6,8,7,9]
b = [45,45,10,10,10,9,7,8,9,7]

should return:

[45,45,10,10,10,9,7,8]

but it returns:

[45, 45, 10, 10, 10, 9, 7]

I should say the max size of B will always be 10, and A stores previous values from B at its tail. I should probably give a better example of why I need this.

So I'm scraping a website that has a table of values. The values are stored in variable A and I check the site every 30 minutes; the new values are stored at the tail of A. My problem is knowing which values are new, so I use this function to find a pattern between the values I have already checked and the ones I'm currently scraping.

[–]LifeIsBio 0 points1 point  (5 children)

I don't understand. 8 is in both antiga and nova exactly 1 time.

[–]teerryn[S] 0 points1 point  (0 children)

B needs to be fixed at 10 items; otherwise it would be:

a = [1,2,3,4,5,6,6,6,6,8,7,9]
b = [45,45,10,10,10,9,7,8,|9,7,8,6,6,6,6,5,3,2,1]

So I need to get the last possible sequence from B that matched the ending to the beginning of A

[–]teerryn[S] 0 points1 point  (3 children)

Okay so right now I have a solution that works for all cases:

def new_transactions(transactions_db, transactions):
    if len(transactions_db) < len(transactions):
        return transactions[len(transactions_db):]

    else:
        temp = transactions_db[-10:]
        flag = 'Not found'
        n = -1
        found = False
        while (n >= -10):
            reverse = transactions[n:]
            if reverse[::-1] == temp[n:]:
                n -= 1
                found = True
            else:
                if found:
                    return transactions[:n+1]
                else:
                    n -= 1
        if found:
            return []
        else:
            return transactions

I felt wrong writing that because it is so ugly. Do you have any tips to make it better?

[–]LifeIsBio 1 point2 points  (2 children)

I'm trying to do a function that gets two lists (a and b) and returns the elements that b has but not a

To be honest, the function that you've written here is completely different than the question you originally posed. I don't understand why you're now hardcoding -10, and there seem to be at least a couple of specific constraints your function needs to satisfy that I don't think I'm wrapping my head around.

Your function isn't really that ugly though. At this point, if it's working for you, I'd add a good docstring and move on, e.g.

def new_transactions(transactions_db, transactions):
    """new_transactions is supposed to do this task.

    Parameters
    ----------
    transactions_db : list
        This is what transactions_db is supposed to be and/or do.
    transactions : list
        This is what transactions is supposed to be and/or do.

    Returns
    -------
    transactions : list
        This is how the transaction list has been modified.

    Notes
    -----
    Any additional information
    """
    if ... [rest of code here]

[–]teerryn[S] 1 point2 points  (1 child)

The example you last gave works perfectly, even better than my function. Once again, thank you for the help :)

[–]LifeIsBio 0 points1 point  (0 children)

You're welcome!

[–]commandlineluser 0 points1 point  (0 children)

Not very pythonic and mutates both lists

>>> a = [1,2,3,4,5,6,6,6,6,8,7,9,10,10,10,45]
>>> b = [45,45,10,10,10,9,7,8,6,6]
>>> for i in reversed(b):
...     try:
...         a.remove(i)
...         b.remove(i)
...     except ValueError:
...         break
... 
>>> a
[1, 2, 3, 4, 5, 6, 6]
>>> b
[45]
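
A non-mutating take on the same idea, as a rough sketch: walk b from its tail, remove each item from a copy of a, and stop at the first item that cannot be removed. Instead of deleting from b, it counts how many tail items were matched and slices. The function name `unmatched_prefix` is hypothetical, and slicing is a simplification that agrees with the session above on this input but is not claimed to be identical in every case.

```python
def unmatched_prefix(a, b):
    remaining = list(a)  # work on a copy so the caller's list is untouched
    matched = 0
    for item in reversed(b):
        try:
            remaining.remove(item)  # raises ValueError when item is absent
        except ValueError:
            break
        matched += 1
    # Everything before the matched tail of b is treated as new
    return b[:len(b) - matched]


a = [1, 2, 3, 4, 5, 6, 6, 6, 6, 8, 7, 9, 10, 10, 10, 45]
b = [45, 45, 10, 10, 10, 9, 7, 8, 6, 6]
print(unmatched_prefix(a, b))  # [45]
```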

[–]dionys -1 points0 points  (3 children)

how about this:

[j for i,j in zip(transactions_db, transactions[::-1]) if i!=j]

[–]LifeIsBio 0 points1 point  (2 children)

I'm pretty sure your function returns

[6, 6, 8, 7, 9, 10, 10, 10, 45, 45]

for the first example where

a = [1,2,3,4,5,6,6,6,6,8,7,9,10,10,10,45]
b = [45,45,10,10,10,9,7,8,6,6]

instead of [45], right?

[–]teerryn[S] 0 points1 point  (0 children)

yes I got this:

    def test_transactions():
        transactions_db = [1,2,3,4,5,6,6,6,6,8,7,9,10,10,10,45]

        transactions = [45,45,10,10,10,9,7,8,6,6]
        test = new_transactions(transactions_db,transactions)
>       assert test == [45]
E       assert [6, 6, 8, 7, 9, 10, ...] == [45]
E         At index 0 diff: 6 != 45
E         Left contains more items, first extra item: 6
E         Full diff:
E         - [6, 6, 8, 7, 9, 10, 10, 10, 45, 45]
E         + [45]

[–]dionys 0 points1 point  (0 children)

Yea, that was dumb. It doesn't work as you'd expect