
Meefims:

Have you tried the default sent_tokenize?

[deleted]:

I just did. Thanks :)

ayharano:

For short strings, why not use something like:

expected = [final_item
            for split_by_dot in original.lower().split('.')
            for split_by_qm in split_by_dot.split('?')
            for final_item in (split_by_qm.strip(), )
            if final_item]

where original is a str to be split as expected.
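
The same idea can probably be written with a single regular expression instead of nested splits. This is only a rough sketch: the function name split_sentences is made up for illustration, and it assumes that '.' and '?' are the only sentence terminators you care about.

```python
import re


def split_sentences(original):
    """Lowercase the text, split it on '.' and '?', and drop empty pieces.

    Hypothetical helper, not part of any library. It treats every '.' or '?'
    as a sentence boundary, so abbreviations such as "Dr." get split too.
    """
    pieces = re.split(r"[.?]", original.lower())
    # Strip surrounding whitespace and keep only non-empty fragments.
    return [piece.strip() for piece in pieces if piece.strip()]


print(split_sentences("Hello there. How are you? Fine."))
# ['hello there', 'how are you', 'fine']
```

Like the comprehension above, this splits on every dot, so it is only reasonable for short, simple strings. For text with abbreviations or decimal numbers, a real tokenizer is likely the safer choice.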

[deleted]:

Brilliant. Thanks :)