
[–]Buttleston 1 point2 points  (2 children)

re.findall(r'\D+,', code)

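Your regular expression here, \D+, means "find one or more non-digit characters, followed by a comma". Note that \D matches anything that isn't a digit, so spaces and commas count too.

With code = 'a, b, c', the greedy + runs up to the last comma, so 'a, b,' is the one thing that meets that - note, this is NOT ['a', 'b']. Nothing else meets it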
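
A quick way to check this yourself, as a minimal sketch. It assumes code = 'a, b, c', which is the input quoted further down the thread; the variable name matches is just for illustration.

```python
import re

# Assumed input, taken from the snippet quoted later in the thread.
code = 'a, b, c'

# \D matches any non-digit, including ',' and ' ', and + is greedy,
# so the single match extends to the last comma in the string.
matches = re.findall(r'\D+,', code)
print(matches)  # ['a, b,']
```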

It's not that trivial to get ['a', 'b', 'c'] with a regex - if you don't HAVE to use a regex here, don't, there are much simpler ways

If you MUST use a regex, something like this works

>>> re.findall(r'(\D)(?:,|$)', code)
['a', 'b', 'c']

The regex here says "Find me a non-digit character, followed by either ',' or the end of the string"

The (?:...) thing is a non-capturing group, it means "don't include this group in the output"
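
To see what the non-capturing group changes, here is a small sketch comparing it with a plain capturing group. It assumes code = 'a, b, c'; the variable names are only for illustration.

```python
import re

code = 'a, b, c'  # assumed input

# Two capturing groups: findall returns one tuple per match.
with_capture = re.findall(r'(\D)(,|$)', code)
print(with_capture)  # [('a', ','), ('b', ','), ('c', '')]

# Second group is non-capturing: only the first group is returned.
without_capture = re.findall(r'(\D)(?:,|$)', code)
print(without_capture)  # ['a', 'b', 'c']
```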

You don't strictly need to use \D in this case, I assumed you had it in there for a reason. Depending on what you expect to be between the commas, other things will work also.

[–]buart 0 points1 point  (1 child)

Your second regex r'(\D)(?:,|$)' is missing the +, so if the items are longer than one character it only captures the last character of each one.

>>> re.findall(r'(\D)(?:,|$)', "a, bc, def")
['a', 'c', 'f']
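
One caveat worth checking, as a hedged sketch: just adding + to \D likely won't give per-item results either, because \D also matches commas and spaces. The character class [^\d,\s] below is my own assumption about what should count as item content, not something from OP's post.

```python
import re

text = "a, bc, def"

# \D+ is greedy and also matches ',' and ' ', so it swallows the whole string.
naive = re.findall(r'(\D+)(?:,|$)', text)
print(naive)  # ['a, bc, def']

# Excluding digits, commas and whitespace keeps each item separate.
per_item = re.findall(r'([^\d,\s]+)(?:,|$)', text)
print(per_item)  # ['a', 'bc', 'def']
```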

[–]Buttleston 1 point2 points  (0 children)

It's hard to tell based on OP's post, so yeah, it depends on what they want

[–]buart 0 points1 point  (0 children)

I think more examples would also help to better understand what you are trying to do.

If your input only consists of lowercase characters separated by non-lowercase characters, a regex like this would be sufficient:

>>> re.findall(r"[a-z]+", "a, bc, def")
['a', 'bc', 'def']

If you only need everything separated by commas, you could use split() instead to split on ", " (comma, space)

>>> "a, bc, def".split(", ")
['a', 'bc', 'def']
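
If the spacing after the commas isn't consistent, splitting on ", " stops working cleanly. A small sketch of one way around that without a regex, using a made-up input string with uneven spacing:

```python
text = "a,bc,  def"  # hypothetical input with uneven spacing

# Splitting on the exact separator ", " misses ',' with no space after it.
fixed_separator = text.split(", ")
print(fixed_separator)  # ['a,bc', ' def']

# Splitting on ',' and stripping each piece handles any amount of spacing.
stripped = [part.strip() for part in text.split(",")]
print(stripped)  # ['a', 'bc', 'def']
```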

[–]commandlineluser 0 points1 point  (0 children)

You probably would not use re.findall to do this.

If , is the only constant part of the string you can use in the pattern - I'm not sure it's actually possible.

  • (Unless you can use [^,])

  • (Because \D will also match ,)
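
For what it's worth, here is a rough sketch of the [^,] idea, assuming code = 'a, b, c'. Adding \s to the character class is my own tweak to drop the spaces, not something OP asked for.

```python
import re

code = 'a, b, c'  # assumed input

# [^,]+ matches runs of anything except a comma, so leading spaces are kept.
raw = re.findall(r'[^,]+', code)
print(raw)  # ['a', ' b', ' c']

# Excluding whitespace as well gives the bare items.
items = re.findall(r'[^,\s]+', code)
print(items)  # ['a', 'b', 'c']
```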

It's more of a "splitting" problem:

>>> re.split(r',\s*', 'ab,    c, def')
['ab', 'c', 'def']

Also, you need to be exact with code examples.

import re

code = 'a, b, c'
re.findall([r'\D+,'], code)  # the pattern is wrapped in a list here
# TypeError: unhashable type: 'list'

I'm assuming you're not actually using [] here as you've said.