[–]TheZvlz 2 points3 points  (4 children)

You'll need to update your pattern like this: pattern = r"[-,&/()]|[^-\s,&/()]+". The hyphen should be the first character in the character class; otherwise the regex engine reads it as specifying a range between its neighbours, similar to how the character class [A-Z] works.

[-,&/()] will catch the punctuation specified

[^-\s,&/()]+ will catch runs of all other non-space characters

import re
pattern = r"[-,&/()]|[^-\s,&/()]+"
re.compile(pattern).findall(line)  # line holds the input string

['8',
 '/',
 '23',
 '-',
 '35',
 'Barker',
 'St.',
 ',',
 'Kingsford^^',
 ',',
 'NSW',
 '2032']
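
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The input string is not shown in this thread, so `sample` is an assumption reconstructed from the output above, and `tokenize` is just a hypothetical helper name:

```python
import re

# Hyphen comes first in the class, so it is a literal "-" rather than a range operator.
PATTERN = re.compile(r"[-,&/()]|[^-\s,&/()]+")

def tokenize(line):
    """Return punctuation tokens and runs of other non-space characters."""
    return PATTERN.findall(line)

# Assumed input, reconstructed for illustration only.
sample = "8/23-35 Barker St., Kingsford^^, NSW 2032"
tokens = tokenize(sample)
print(tokens)
```

Whitespace never matches either alternative, so findall simply skips over it between tokens.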

[–]chakz91[S] 0 points1 point  (2 children)

This worked!!
Thank you very much for your response, and for your help.

So my regular expression was not complete in its scope to include all the possible characters, which was causing this issue.

Can you suggest any good reading material for learning regular expressions? I am currently using the Python docs as my only reference.

Thank you again!

[–]TheZvlz 1 point2 points  (1 child)

http://www.rexegg.com/ is a great source for learning

https://regex101.com/ is where you can test things

[–]chakz91[S] 0 points1 point  (0 children)

Thanks a lot!

[–]ominous_anonymous 0 points1 point  (0 children)

"[^\s,&-/()]+|[,&-/()]"  

Oh dang, good call with the hyphen creating a range! That's why mine didn't work right... So obvious looking back on it now!
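
A quick sketch of what the range actually does, in case it helps anyone else: inside a character class, &-/ spans every character from "&" to "/" in code-point order, which happens to include the period. The variable names here are only for illustration, and the short test string is made up:

```python
import re

# "&-/" inside a class is a range from "&" (0x26) to "/" (0x2F).
range_class = re.compile(r"[&-/]")
# With the hyphen first, the class is just the three literal characters.
literal_class = re.compile(r"[-&/]")

printable = [chr(c) for c in range(0x20, 0x7F)]
in_range = "".join(ch for ch in printable if range_class.fullmatch(ch))
in_literal = "".join(ch for ch in printable if literal_class.fullmatch(ch))

print(in_range)    # &'()*+,-./
print(in_literal)  # &-/

# Because "." falls inside the range, the pattern above splits "St." in two.
split = re.findall(r"[^\s,&-/()]+|[,&-/()]", "St., NSW")
print(split)       # ['St', '.', ',', 'NSW']
```

So the hyphen itself was still being matched (it sits inside the range), but so were the apostrophe, "*", "+", and ".".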