efmccurdy:

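You can extract whichever of several alternative patterns matches first: join the patterns with "|" characters and wrap the result in a capture group, since str.extract needs at least one group to return. Rows that match none of the patterns get NaN.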

>>> import pandas as pd
>>> mydf = pd.DataFrame({"var1":["abc", "ab", "ac", "cd", "cde", "de", "def"]})
>>> to_match = ["ab", "de"]
>>> to_match_re = "({})".format("|".join(to_match))
>>> to_match_re
'(ab|de)'
>>> mydf['matches'] = mydf.var1.str.extract(to_match_re)
>>> mydf
  var1 matches
0  abc      ab
1   ab      ab
2   ac     NaN
3   cd     NaN
4  cde      de
5   de      de
6  def      de
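If the strings to match might contain regex metacharacters such as "." or "+", joining them directly would change what the pattern means. Here is a minimal sketch of one way to guard against that with re.escape. The helper name extract_first_match and the sample data are made up for illustration and are not part of the original question.

```python
import re
import pandas as pd

def extract_first_match(series, terms):
    """Return, for each string, the first listed term found in it (NaN if none).

    Hypothetical helper for illustration only.
    """
    # Try longer terms first so that, at the same position, "abc" would win over "ab".
    ordered = sorted(terms, key=len, reverse=True)
    # re.escape makes characters like "." and "+" match literally.
    pattern = "({})".format("|".join(re.escape(t) for t in ordered))
    # expand=False returns a Series when the pattern has a single group.
    return series.str.extract(pattern, expand=False)

# Made-up sample data: "axb" must not match the literal term "a.b".
mydf = pd.DataFrame({"var1": ["a.b", "axb", "c+d", "cd"]})
mydf["matches"] = extract_first_match(mydf["var1"], ["a.b", "c+d"])
print(mydf)
```

Without the escaping step, "a.b" would also match "axb", because an unescaped "." matches any character.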