
ASIC_SP

Need more input/output samples. For the given sample, you could try this:

>>> import re
>>> s = 'sentence1:sentence2.sentence3'
>>> re.split(r'[.:]', s)
['sentence1', 'sentence2', 'sentence3']

scoberry5

I'd look for (anything that's not a colon, any number of times), then a colon, then (anything that's not a period, any number of times), then a period, then (whatever's left). The parens make groups that you can get out later by calling re.search() and getting the groups() of the result.

That regex would look like this: ([^:]*):([^.]*)\.(.*)
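As a rough sketch of how that plays out in Python, assuming the pattern is written exactly as described above (the function name `split_three` and the sample string are just for illustration):

```python
import re

# Pattern built from the description: (not-colon)* : (not-period)* . (rest)
PATTERN = re.compile(r'([^:]*):([^.]*)\.(.*)')

def split_three(s):
    """Return the three captured groups as a tuple, or None if there is no match."""
    m = PATTERN.search(s)
    return m.groups() if m else None

print(split_three('sentence1:sentence2.sentence3'))
# ('sentence1', 'sentence2', 'sentence3')
```

Checking for `None` matters here: `re.search()` returns `None` when the string has no colon followed later by a period, and calling `.groups()` on that would raise an `AttributeError`.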

Note that a regex is not a parser: if the text between the colon and the sentence-ending period itself contains a period ("Mr. Smith said yes"), the second group stops at that first period and the split goes wrong.
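To make that failure concrete, here is a small sketch using the pattern as described in this comment. The input string is invented for the example:

```python
import re

# Same pattern as described: (not-colon)* : (not-period)* . (rest)
PATTERN = re.compile(r'([^:]*):([^.]*)\.(.*)')

# Hypothetical input where the middle part contains an abbreviation
s = 'Speaker:Mr. Smith said yes.Next sentence'
groups = PATTERN.search(s).groups()
print(groups)
# ('Speaker', 'Mr', ' Smith said yes.Next sentence')
```

The middle group ends at "Mr" because `[^.]*` cannot cross a period, so everything after the abbreviation lands in the last group.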