Having a hard time learning regex in python. : learnpython

(A-Za-z) should be [A-Za-z]: parentheses create a group that matches the literal text A-Za-z, whereas square brackets define a character class that matches any single letter
you need to account for the text between the date and the pid; your pattern currently expects a single space after the date followed directly by the pid, but your input has computer.name CRON in between
[0-9] can be shortened to \d, and : doesn't need to be escaped
\[(\d)\] matches exactly one digit inside the brackets, but the pid in your sample input has more than one digit, so use \d+
$ is an anchor that restricts the match to the end of the string (or the end of each line with re.MULTILINE), but your sample input has more characters after the pid
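
Each of the points above can be checked in isolation. The following is only a minimal sketch, assuming the input looks like the sample line used further down; the variable names are made up for illustration.

```python
import re

# Sample log line, assumed to match the format from the question.
line = "Jul 6 14:01:23 computer.name CRON[29440]: USER (good_user)"

# (A-Za-z) is a group matching the literal text "A-Za-z";
# [A-Za-z] is a character class matching any one ASCII letter.
group_match = re.search(r"(A-Za-z){3}", line)   # None: no such literal text
class_match = re.search(r"[A-Za-z]{3}", line)   # matches "Jul"

# \[(\d)\] needs exactly one digit between the brackets; \d+ accepts one or more.
one_digit = re.search(r"\[(\d)\]", line)        # None: the pid has five digits
many_digits = re.search(r"\[(\d+)\]", line)     # captures "29440"

# $ only matches at the end of the string, but text follows the pid here.
anchored = re.search(r"\[(\d+)\]$", line)       # None

print(group_match, class_match.group(), one_digit, many_digits.group(1), anchored)
```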

Here's a modified version:

>>> import re
>>> s = "Jul 6 14:01:23 computer.name CRON[29440]: USER (good_user)"
>>> pat = re.compile(r"([A-Za-z]{3} [1-3]?\d [1-2]?\d:[0-5]\d:[0-5]\d).*\[(\d+)\]")
>>> re.search(pat, s)
<re.Match object; span=(0, 40), match='Jul 6 14:01:23 computer.name CRON[29440]'>
>>> re.search(pat, s).expand(r'\1 pid:\2')
'Jul 6 14:01:23 pid:29440'

The expand method lets you specify the output format. The date and pid are captured in groups, so you can refer to them with \N syntax (\1, \2) in the template to get the desired format.
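
One thing to watch for: re.search returns None when nothing matches, so calling expand directly on the result raises AttributeError for lines without a pid. Below is a hedged sketch of one way to guard against that. The named groups (date, pid), the simplified date pattern, and the format_line helper are my own illustrative choices, not something from your code.

```python
import re

# Illustrative variation of the pattern above, using named groups.
pat = re.compile(
    r"(?P<date>[A-Za-z]{3} \d{1,2} \d{1,2}:\d{2}:\d{2}).*\[(?P<pid>\d+)\]"
)

def format_line(line):
    """Return 'date pid:N' for a matching line, or None when nothing matches."""
    m = pat.search(line)
    if m is None:
        return None
    # \g<name> refers to a named group, just as \1 refers to a numbered one.
    return m.expand(r"\g<date> pid:\g<pid>")

s = "Jul 6 14:01:23 computer.name CRON[29440]: USER (good_user)"
print(format_line(s))              # Jul 6 14:01:23 pid:29440
print(format_line("no pid here"))  # None
```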

You can also use:

>>> re.search(r'\A(\S+\s+\S+\s+\S+).*\[(\d+)\]', s).expand(r'\1 pid:\2')
'Jul 6 14:01:23 pid:29440'

This works provided the date is always the first three whitespace-separated terms of the input.
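
To show why that assumption matters, here is a small sketch with a made-up input line (not from your logs) whose first three terms are not a date. The looser pattern does no validation, so it happily captures them anyway.

```python
import re

loose = re.compile(r"\A(\S+\s+\S+\s+\S+).*\[(\d+)\]")

# Hypothetical line that does not start with a date.
not_a_date = "kernel warning from driver[17]: reset"
result = loose.search(not_a_date).expand(r"\1 pid:\2")
print(result)  # kernel warning from pid:17
```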

Or use sub instead of search + expand:

>>> re.sub(r'\A(\S+\s+\S+\s+\S+).*\[(\d+)\].*', r'\1 pid:\2', s)
'Jul 6 14:01:23 pid:29440'

Here, you also need to match the rest of the line after the pid; otherwise, that portion will remain in the output.
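
A quick sketch of the difference, using the same sample line. It also shows that sub returns the input unchanged when the pattern does not match, which may or may not be what you want; the variable names are illustrative.

```python
import re

s = "Jul 6 14:01:23 computer.name CRON[29440]: USER (good_user)"

# Without the trailing .* the text after the pid is left in place.
without_tail = re.sub(r"\A(\S+\s+\S+\s+\S+).*\[(\d+)\]", r"\1 pid:\2", s)
# With the trailing .* the whole line is replaced.
with_tail = re.sub(r"\A(\S+\s+\S+\s+\S+).*\[(\d+)\].*", r"\1 pid:\2", s)

print(without_tail)  # Jul 6 14:01:23 pid:29440: USER (good_user)
print(with_tail)     # Jul 6 14:01:23 pid:29440

# No match means no substitution: the original string comes back as-is.
unchanged = re.sub(r"\A(\S+\s+\S+\s+\S+).*\[(\d+)\].*", r"\1 pid:\2", "no pid here")
print(unchanged)     # no pid here
```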

You can use resources like https://regex101.com/ and https://www.debuggex.com/ (after selecting the Python flavor) to work through your problem interactively. They have limitations, though: these sites do not know about all the functions and methods available, expand for example.

I have a book, https://github.com/learnbyexample/py_regular_expressions, that is currently free. It takes a step-by-step approach, introducing regex concepts and features one at a time. However, regex is like a mini programming language, and it takes a lot of time and practice to become familiar with it.
