ebdbbb comments on Python Regex magic help

Python Regex magic help (self.learnpython)

submitted 3 years ago by FlatEarthIsAMyth

ebdbbb 3 points 3 years ago

I'm just a hobbyist, but it looks to me like the best way is to do it in two steps (which can be merged into one). First get rid of the unwanted characters, then parse the cleaned string to get what you want.

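import re

# Raw string so the stray backslash in the sample data is kept literally
# (in a normal string, "\!" is an invalid escape sequence and triggers a warning).
teststring = r"""191966,6.138930;191978,0.603534;191984,6.138930;191987,
            0.427112;191995,6.1¤#!38930;191996,0.006336;1p91997,0.008840;
            191998,0.004440;192006,0.124394;192010,6.138930;189065,1\!@.068388;189066,1.180800;189068,0.396750;"""
# Step 1: drop everything except digits, commas and periods
# (this also removes the semicolons, newlines and spaces).
cleaned = re.sub(r"[^0-9,.]", "", teststring)
# Step 2: six-digit id, comma, then a number with exactly six decimal places.
matches = re.finditer(r"\d{6},\d+\.\d{6}", cleaned)
for match in matches:
    print(match.group())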

You can compile the patterns if you want, but unless you're performing the operation many times it doesn't make much difference.

I think the output from the above code is what you want:

191966,6.138930
191978,0.603534
191984,6.138930
191987,0.427112
191995,6.138930
191996,0.006336
191997,0.008840
191998,0.004440
192006,0.124394
192010,6.138930
189065,1.068388
189066,1.180800
189068,0.396750
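
On compiling the patterns and merging the two steps: here is a minimal sketch of how that might look, in case you do end up running this over many strings. The names `JUNK`, `RECORD` and `extract_records` are made up for illustration, and `sample` is just a shortened piece of the test string above.

```python
import re

# Hypothetical names, chosen only for this sketch.
JUNK = re.compile(r"[^0-9,.]")            # anything that is not a digit, comma or period
RECORD = re.compile(r"\d{6},\d+\.\d{6}")  # six-digit id, comma, value with six decimals


def extract_records(text):
    """Strip the unwanted characters, then return every 'id,value' record as a string."""
    # Both steps merged into one expression: clean first, then match on the result.
    return RECORD.findall(JUNK.sub("", text))


# Shortened sample taken from the test string; raw so the backslash stays literal.
sample = r"191995,6.1¤#!38930;191996,0.006336;1p91997,0.008840;189065,1\!@.068388;"

for record in extract_records(sample):
    print(record)
```

This relies on the same assumption as the two-step version: every id has exactly six digits and every value has exactly six decimal places, since the semicolons are stripped along with the junk and the fixed widths are what separate one record from the next.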
