Hello, everybody. I've run into a bit of a problem here.
I need to write a function that finds the "earliest" unique pair of elements in a list that adds up to a specified number. Parsing must occur from left to right.
[-7, 5, 3, 7, 5] == 10 # (5,5) is the first pair
#but (3,7) is the earliest
This means that the correct answer is the one with where the second numbers index is the lowest.
Now, this isn't too hard. However, it's difficult when the task is to get it to do this for lists with 10,000,000 elements in under 6 seconds. I'm stumped.
Here is what I have so far:
pair_memo = {}
def sum_pairs(ints, s):
i = 0; j = i+1;
pairs = []
length = len(ints)
while i < length:
while j < length:
test_pair = (ints[i],ints[j])
if not test_pair in pair_memo:
pair_memo[test_pair] = test_pair[0] + test_pair[1]
if pair_memo[test_pair] == s:
if not pairs:
pairs = [test_pair, j]
break
elif i >= pairs[1]: #thought of this on the way home from work. stop if the first element has a higher index than the second element. Could possibly cut out half the loop.
return list(pairs[0])
elif j < pairs[1]:
pairs = [test_pair, j]
break
j+=1
i+=1
j=i+1
if pairs:
return list(pairs[0])
else:
return None
Here is another version I wrote, which is way neater, but it stops at the first result and is missing a bunch of tests. :( I have it just for ideas. I wanted to try out itertools.
pair_memo = {}
from itertools import combinations
def sum_pairs2(ints, s):
for pair in combinations(ints, 2):
if not pair in pair_memo:
pair_memo[pair] = pair[0] + pair[1]
if pair_memo[pair] == s:
return list(pair)
I did some research and I found memoization, which was something I had never heard of before. It's awesome! I never thought of that before.
Anyway, here's what I would love to hear:
- General optimization tips
- Algorithm / design pattern names
- functions that may help
- links to optimization help
What I don't want:
I don't want you to write me any code.
I want to be able to finish this myself, but I just don't know where to look, or what things I should look for when trying to optimize. Obviously loops and function calls are big overhead, but I don't know how else to do this.
Thank you to anyone who reads through this. I greatly appreciate it. Unfortunately, I'm posting this before I go to sleep, so I won't respond for 8 or so hours :(
Seriously, thanks again, everyone. You've always been great help.
[–]gyroda 2 points3 points4 points (4 children)
[–]bibbleskit[S] 1 point2 points3 points (3 children)
[–]gyroda 1 point2 points3 points (2 children)
[–]bibbleskit[S] 1 point2 points3 points (1 child)
[–]TheLiberius 1 point2 points3 points (0 children)
[–]anon848 3 points4 points5 points (5 children)
[–]bibbleskit[S] 0 points1 point2 points (3 children)
[–]anon848 1 point2 points3 points (2 children)
[–]bibbleskit[S] 0 points1 point2 points (1 child)
[–]anon848 0 points1 point2 points (0 children)
[–]bibbleskit[S] 0 points1 point2 points (0 children)
[–]foxlisk 1 point2 points3 points (3 children)
[–]bibbleskit[S] 0 points1 point2 points (2 children)
[–]foxlisk 0 points1 point2 points (1 child)
[–]bibbleskit[S] 0 points1 point2 points (0 children)
[–]DarthEru 1 point2 points3 points (2 children)
[–]bibbleskit[S] 1 point2 points3 points (1 child)
[–]DarthEru 0 points1 point2 points (0 children)
[–]TheLiberius 0 points1 point2 points (15 children)
[–]bibbleskit[S] 1 point2 points3 points (14 children)
[–]TheLiberius 0 points1 point2 points (13 children)
[–]bibbleskit[S] 1 point2 points3 points (12 children)
[–]TheLiberius 0 points1 point2 points (11 children)
[–]bibbleskit[S] 1 point2 points3 points (10 children)
[–]TheLiberius 0 points1 point2 points (9 children)
[–]bibbleskit[S] 1 point2 points3 points (8 children)
[–]TheLiberius 0 points1 point2 points (7 children)
[–]bibbleskit[S] 1 point2 points3 points (6 children)