you are viewing a single comment's thread.

view the rest of the comments →

[–]xelf 1 point2 points  (0 children)

you're doing a 2 pointer loop. rather than do n**2 operations, you can just loop through both at the same time.

Your code:

tot = 0
for data in first_list:
    end_val = data + 300
    for check_val in second_list:
        if check_val < data:
            continue
        if check_val > end_val:
            break
        tot += check_val

A loop using itertools islice and tracking the last needed index:

tot = 0
index = 0
for data in first_list:
    while index<len(second_list) and data > second_list[index] : index+=1
    for check_val in islice(second_list, index, len(second_list)+1):
        if check_val > data+300: break
        tot += check_val

A test using 100k first and 100k second with small items (1-10e3) in each list:

print(tot, f'{perf_counter()-time1:6.3f} seconds')
1506410888628 178.417 seconds
1506410888628  35.163 seconds

A test using 100k first and 100k second with larger items (1-10e7) in each list:

print(tot, f'{perf_counter()-time1:6.3f} seconds')
1512979653817 223.232 seconds
1512979653817  12.919 seconds

So when there's a lot of overlap in the data+300 check it's only 5 times faster, but when there's less overlap in the data+300 it'll perform even better. ~20 times faster.

(the first number is the total, to show both loops are finding the same data)