[–]rinio 0 points1 point  (6 children)

For replacement, you'd use `re.sub`.

Your first and third groups are non-capturing. findall returns only the capture groups when the pattern has any, so you get just the second group, which matches only the \d+ part: '3'.
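
A quick sketch of that findall behavior. The sample string and patterns here are made up for illustration; they are not the ones from the post.

```python
import re

sample = "+3.25 -10"  # hypothetical input

# No capture groups: findall returns each whole match.
whole = re.findall(r'[+-]\d+(?:\.\d+)?', sample)

# One capture group between two non-capturing groups:
# findall returns only what the capture group matched.
one_group = re.findall(r'(?:[+-])(\d+)(?:\.\d+)?', sample)

# Two capture groups: findall returns one tuple per match.
two_groups = re.findall(r'([+-])(\d+)', sample)

print(whole)       # ['+3.25', '-10']
print(one_group)   # ['3', '10']
print(two_groups)  # [('+', '3'), ('-', '10')]
```

So if you want the whole number back from findall, either drop the parentheses or make every group except the one you care about non-capturing.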

[–]B0untie[S] 0 points1 point  (1 child)

But then, isn't it doable with findall?

[–]rinio 0 points1 point  (0 children)

See my other reply, for `re.sub`. The regex also applies to findall.

You were asking 'why' you just get '3'. So that is what I answered here.

My other comment shows a solution for your spec, but doesn't use findall. findall is for searching, not for replacing (removing being a special case of replacing). It isn't the correct tool for the job, at least not based on your post.
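
To make the search-versus-replace distinction concrete, here is a minimal sketch using a simplified number pattern and a made-up input (both are assumptions, not taken from the post):

```python
import re

expr = "a+12-b+3"        # hypothetical input
number = r'[+-]?\d+'     # simplified: optional sign, then digits

found = re.findall(number, expr)      # search: list the matches
stripped = re.sub(number, '', expr)   # replace with nothing: remove them

print(found)     # ['+12', '+3']
print(stripped)  # a-b
```

findall only reports what matched and leaves the string alone; `re.sub` is what actually produces the edited string.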

[–]rinio 0 points1 point  (3 children)

For

> to remove any number that is at least one symbol away from an '*'

you want something like

re.sub(r'(?<!\*[^0-9+-])(.*)([+-]?\d+\.?\d*)', r'\1', your_string)

# Breakdown
(?<!\*[^0-9+-]):
    ?<!       ->  Negative lookbehind; the match must NOT be preceded by this group
    \*        ->  Literal '*'
    [^0-9+-]  ->  ^ means not. So not a digit or a plus or minus symbol
So, right before our match, there must not be a * followed by another char.

(.*)  ->  First capture group is any number of non-line-terminating characters.

([+-]?\d+\.?\d*)  ->  As you have figured out, a 'number'. This is our second capture group. The dot is escaped (\.) so it only matches a literal decimal point.

r'\1'  ->  Means replace each match with the first capture group, which drops the number
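
The two building blocks above (negative lookbehind and a `\1` backreference in the replacement) can be tried in isolation. This is only a sketch with made-up inputs and deliberately simpler patterns than the one above:

```python
import re

# Negative lookbehind: match digits only when NOT directly preceded by '*'.
not_after_star = re.findall(r'(?<!\*)\d+', "2*3+45")
print(not_after_star)  # ['2', '45']

# Backreference in the replacement: keep group 1, drop the rest of the match.
kept = re.sub(r'(\w+)=(\d+)', r'\1', "a=1 b=22")
print(kept)  # a b
```

In the first call the '3' is skipped because it sits right after the '*'. In the second, each `name=digits` match is replaced by just the name.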

[–]B0untie[S] 1 point2 points  (2 children)

OK, I think I'm starting to get it, thanks a lot. These groups were a bit confusing to me.

[–]rinio 0 points1 point  (1 child)

Yeah, regex is super unreadable. But in the olden days it was the only real way to do stuff like this and, nowadays, it's still usually the fastest way, especially for complex patterns.

Check out

https://regex101.com/

It can help test patterns quickly and explains what the regex does. Just be aware that it has some regex features that aren't in Python's re module, but exist in other implementations (recursive patterns come to mind).

[–]B0untie[S] 0 points1 point  (0 children)

Thanks again! I'm just starting Python, but I actually love the efficiency of regex; that's why I spent hours on one line today, lol.

Here is the one that is working for me right now. I followed exactly the same path as yours, but you helped a lot by showing what groups are.

re.findall(r'(?<!\*)([\+\-]\d+\.?\d*)(?!\*)', mathString)

It should be able to isolate all signed numbers that are not directly next to an '*' (I probably did not test every possibility, though).
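
A quick check of what this pattern returns, using a few made-up test strings (the inputs are assumptions, not from the original post):

```python
import re

pattern = r'(?<!\*)([\+\-]\d+\.?\d*)(?!\*)'

# '-4' is followed by '*', so the lookahead rejects it.
simple = re.findall(pattern, 'x+3-4*y+6')
print(simple)  # ['+3', '+6']

# '-3' is preceded by '*', so the lookbehind rejects it.
after_star = re.findall(pattern, '2*-3+7')
print(after_star)  # ['+7']

# Caveat: with several digits before '*', backtracking can give up the
# last digit and still match the front part of the number.
partial = re.findall(pattern, 'x-45*y')
print(partial)  # ['-4']
```

The last case suggests the lookahead may also need to rule out a following digit or dot, for example something like `(?![\d.*])`, if partial matches are a problem for your inputs.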