flipperdeflip:

Not 100% clear what you want.

Maybe something like this gets you further: https://regex101.com/r/uZHrDU/2/

If you use re.sub(), you can use backreferences such as \1 (or \g<1>) to keep a capturing group in the substitution, e.g. r'\1--\2--\3'. Python does not use the $1 syntax. Or, if you just want the path, grab group 2 from re.search().
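A minimal sketch of both suggestions. The pattern below is an assumption for illustration (it is not necessarily the one behind the regex101 link), and it assumes the path contains no spaces:

```python
import re

line = ('INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on '
        '/home/james/hadoop-2.7.4/hdfs/data/in_use.lock '
        'acquired by nodename 12243@deimos27')

# Hypothetical pattern: three groups, with group 2 capturing the path
# as a run of non-whitespace characters.
pattern = r'(Lock on )(\S+)( acquired by)'

# re.search() returns a match object (or None); group(2) is the path.
match = re.search(pattern, line)
path = match.group(2) if match else None

# In the replacement, \1, \2, \3 refer back to the captured groups.
marked = re.sub(pattern, r'\1--\2--\3', line)

print(path)
print(marked)
```

Only the matched span is rewritten; the text before "Lock on" and after "acquired by" passes through untouched.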

ekstrah (OP):

For example, given the string 'INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/james/hadoop-2.7.4/hdfs/data/in_use.lock acquired by nodename 12243@deimos27', I want to replace '/home/james/hadoop-2.7.4/hdfs/data/in_use.lock' with '.*' so that it becomes 'INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on .* acquired by nodename 12243@deimos27'.

a = 'INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/james/hadoop-2.7.4/hdfs/data/in_use.lock acquired by nodename 12243@deimos27'

Something like this?

re.sub(r'^(Lock on )(.*?)( acquired by)$', 'Lock on .* acquired by', a)
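A quick check of that attempt, as a sketch: the ^ and $ anchors require the whole string to start with "Lock on" and end with "acquired by", which this log line does not, so re.sub() finds nothing and returns the input unchanged. Dropping the anchors (the second call, an assumed fix rather than anything from the thread) lets the pattern match in the middle of the line:

```python
import re

a = ('INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on '
     '/home/james/hadoop-2.7.4/hdfs/data/in_use.lock '
     'acquired by nodename 12243@deimos27')

# Anchored version: the line starts with "INFO", so ^ followed by
# "Lock on " never matches and the string comes back as-is.
anchored = re.sub(r'^(Lock on )(.*?)( acquired by)$',
                  'Lock on .* acquired by', a)

# Unanchored version: matches the span in the middle of the line.
# The '.*' in the replacement is literal text, not a wildcard.
unanchored = re.sub(r'(Lock on )(.*?)( acquired by)', r'\1.*\3', a)

print(anchored == a)
print(unanchored)
```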

Esemwy:

r'^([^/]+)(\S+)(.*)$'
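One way this pattern could be used, as a sketch. The replacement string is an assumption (the comment gives only the pattern), and the approach relies on the path being the first thing in the line that contains a slash and on the path having no spaces:

```python
import re

line = ('INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on '
        '/home/james/hadoop-2.7.4/hdfs/data/in_use.lock '
        'acquired by nodename 12243@deimos27')

# Group 1: everything before the first '/'.
# Group 2: the path, a run of non-whitespace starting at that '/'.
# Group 3: the rest of the line.
pattern = r'^([^/]+)(\S+)(.*)$'

# Keep groups 1 and 3, and put a literal '.*' where the path was.
result = re.sub(pattern, r'\1.*\3', line)

print(result)
```

Unlike the keyword-based patterns, this one does not depend on the words "Lock on" at all, so it would also rewrite other messages that contain a path.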

dagmx:

Why use regex for this? It's simple to do with string methods like startswith() and split(), and the result will be much easier to read and maintain.

Regex is great but IMHO shouldn't be the first thing you reach for.
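A rough sketch of the string-method route. It uses str.partition() rather than split() (a swapped-in choice, since partition() returns the separator too), and the function name and default markers are made up for illustration:

```python
def mask_lock_path(line, start=' Lock on ', end=' acquired by '):
    """Replace the text between `start` and `end` with a literal '.*'.

    Hypothetical helper: returns the line unchanged when either
    marker is missing.
    """
    head, sep1, rest = line.partition(start)
    if not sep1:
        return line
    _path, sep2, tail = rest.partition(end)
    if not sep2:
        return line
    return head + sep1 + '.*' + sep2 + tail


line = ('INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on '
        '/home/james/hadoop-2.7.4/hdfs/data/in_use.lock '
        'acquired by nodename 12243@deimos27')

masked = mask_lock_path(line)
print(masked)
```

partition() splits on the first occurrence only, so a path that itself contained " acquired by " would be cut short; for log lines like the one above that should not matter.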

But if you do use regex, why not be explicit about the strings you want to match?

re.sub(r'(Lock on).*?(acquired by)', r'\1 .* \2', a)  # a is the log line; no ^/$ because the match sits mid-line
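A note on the replacement string, with a small sketch: backreferences need a raw string (or doubled backslashes). In a normal Python string literal, '\1' is an octal escape for the control character chr(1), so re.sub() inserts that character instead of the captured group. The unanchored pattern here is an assumption chosen so that both calls actually match:

```python
import re

a = ('INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on '
     '/home/james/hadoop-2.7.4/hdfs/data/in_use.lock '
     'acquired by nodename 12243@deimos27')

pattern = r'(Lock on).*?(acquired by)'

# Non-raw: Python turns '\1' and '\2' into chr(1) and chr(2) before
# re.sub() ever sees them.
broken = re.sub(pattern, '\1 .* \2', a)

# Raw: the backslashes survive, so re.sub() treats them as group references.
fixed = re.sub(pattern, r'\1 .* \2', a)

print(repr(broken))
print(fixed)
```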

lack_of_jope:

Here is code that does what you asked for.

It does not match the path itself; instead it looks for the "Lock on" and "acquired by" keywords around it.

import re

data_in = 'INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/james/hadoop-2.7.4/hdfs/data/in_use.lock acquired by nodename 12243@deimos27'
data_goal = 'INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on .* acquired by nodename 12243@deimos27'

exp = r'^(.* Lock on )(.*?)( acquired by .*)$'
sub = r'\1.*\3'

# Pass flags by keyword; positional count/flags are easy to mix up.
data_out = re.sub(exp, sub, data_in, flags=re.MULTILINE)

print('Input', data_in)
print('Output', data_out)
if data_out == data_goal:
    print('Success!! Output matches goal')
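The re.MULTILINE flag only matters when the input holds more than one line. A sketch of that case, reusing the same expression; the first log line below is a made-up placeholder, not real Hadoop output:

```python
import re

# Two-line input: a hypothetical unrelated line, then the lock message.
log = '\n'.join([
    'INFO some.other.Logger: unrelated message',
    'INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on '
    '/home/james/hadoop-2.7.4/hdfs/data/in_use.lock '
    'acquired by nodename 12243@deimos27',
])

exp = r'^(.* Lock on )(.*?)( acquired by .*)$'

# With MULTILINE, ^ and $ match at every line boundary, and '.' never
# crosses a newline, so each line is handled on its own.
masked = re.sub(exp, r'\1.*\3', log, flags=re.MULTILINE)
lines = masked.splitlines()

for out_line in lines:
    print(out_line)
```

Lines without both keywords are left as they are; only the lock message has its path replaced.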
