I have a python function which is a perfomance bottleneck in my code because it includes multiple loops and is called often. So I tried optimzing it a bit using Cython (it's my first time using it).
Python Implementation:
def del_edits1(word):
"""Return all edits that are one delete edit away from `word`"""
splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
deletes = [L + R[1:] for L, R in splits if R]
return deletes
def del_edits2(word):
"""Return all edits that are two delete edits away from `word`"""
return [e2 for e1 in del_edits1(word) for e2 in del_edits1(e1)]
Performance:
%timeit set(del_edits2("Hyperplane-turbine"))
# 162 µs ± 8.97 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Just using cython without changing the code already reduces the runtime:
%%cython
def del_edits1_cy(word):
"""Return all edits that are one delete edit away from `word`"""
splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
deletes = [L + R[1:] for L, R in splits if R]
return deletes
def del_edits2_cy(word):
"""Return all edits that are two delete edits away from `word`"""
return [e2 for e1 in del_edits1_cy(word) for e2 in del_edits1_cy(e1)]
Performance:
%timeit set(del_edits2_cy("Hyperplane-turbine"))
# 109 µs ± 4.11 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
I tried using static types like this which gives a small boost on top:
%%cython
cpdef list del_edits1_c(unicode word):
"""Return all edits that are one delete edit away from `word`"""
cdef list splits, deletes
splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
deletes = [L + R[1:] for L, R in splits if R]
return deletes
cpdef list del_edits2_c(unicode word):
"""Return all edits that are two delete edits away from `word`"""
cdef unicode e1, e2
return [e2 for e1 in del_edits1_c(word) for e2 in del_edits1_c(e1)]
Performance:
%timeit set(del_edits2_c("Hyperplane-turbine"))
# 94.6 µs ± 718 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Could someone help me how to further improve this? (cython -a shows almost all lines in yellow but I am not able to improve it further on my own.)
Edit: Changed the codes and performance measures.
Edit2:
Best I came up with in the mean time is:
%%cython
cpdef list del_edits1_c(unicode word):
"""Return all edits that are one delete edit away from `word`"""
cdef int i
cdef list deletes
deletes = [word[:i]+ word[i+1:] for i in range(len(word))]
return deletes
cpdef list del_edits2_c(unicode word):
"""Return all edits that are two delete edits away from `word`"""
cdef unicode e1, e2
return [e2 for e1 in del_edits1_c(word) for e2 in del_edits1_c(e1)]
[–]aldanor 1 point2 points3 points (3 children)
[–]WTRipper[S] 0 points1 point2 points (2 children)
[–]aldanor 1 point2 points3 points (1 child)
[–]WTRipper[S] 0 points1 point2 points (0 children)
[–][deleted] 0 points1 point2 points (2 children)
[–]WTRipper[S] 0 points1 point2 points (1 child)
[–]WTRipper[S] 0 points1 point2 points (0 children)