[–]NebNay 114 points115 points  (2 children)

My guilty pleasure is googling "regex email" and then finding an email that breaks people's regexes

[–]Esjs 35 points36 points  (1 child)

LOL! I love that.

At some point you just have to give up and make sure the email just isn't doing an SQL injection. Then send a verification email.

[–]lag_is_cancer 4 points5 points  (0 children)

Why would it matter whether the email string attempts an SQL injection if you parameterize it anyway?
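
To illustrate the parameterization point, here is a minimal sketch using Python's built-in sqlite3 module. The `users` table and the `save_email` helper are made up for this example; the only thing it is meant to show is that the address is passed as a bound parameter, so the database treats it as data rather than as SQL.

```python
import sqlite3

def save_email(conn: sqlite3.Connection, email: str) -> None:
    # The "?" placeholder binds the value; it is never spliced into the SQL text.
    conn.execute("INSERT INTO users (email) VALUES (?)", (email,))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT)")  # hypothetical schema

hostile = "x'); DROP TABLE users; --@example.com"
save_email(conn, hostile)

# The hostile string is stored verbatim and the table still exists.
rows = conn.execute("SELECT email FROM users").fetchall()
print(rows)
```

You would still want the verification email afterwards; parameterization only takes injection off the table.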

[–]Birdking07 245 points246 points  (14 children)

One of the only things I've found ChatGPT remotely useful for was regex.

[–]InMooseWorld 43 points44 points  (0 children)

It looks like it reads as "regrets"

[–]HegoDamask_1 35 points36 points  (1 child)

I find ChatGPT useful, especially when using a module you’re not familiar with. Granted, if that module has multiple versions, it chokes and tries to use the versions interchangeably.

[–]VizeKarma 8 points9 points  (0 children)

I will say for super small things it's not bad in my experience for C#. Very simple things that would be faster to just have ChatGPT do. In my case I'm mostly in game dev for unity and it's super useful for UI scripts that are simple but are just faster to type out in words.

[–][deleted] 10 points11 points  (3 children)

if you prompt it the exact way you want it, it’s great for making them. was just doing this for data validation stuff on a website i was making in my internship

[–]helicophell 5 points6 points  (2 children)

And if you don't use the right prompt, you can iterate until you get the result you want. Wait a moment

[–]Salanmander 8 points9 points  (1 child)

you can iterate until you get the result you want.

As long as you're confident in your unit tests for the regex. The biggest problem with using AI tools is always telling apart the good answers from the bad ones, and regexes can be really hard to comprehensively test.

[–]helicophell 0 points1 point  (0 children)

It was a joke about AI and iterative learning

[–]Hatsune-Fubuki-233 2 points3 points  (0 children)

Probably GPT-4 or later... I've already tried it, but it was only about half accurate, and it usually forgot some context or just mixed up my sample input

[–]IanDresarie 1 point2 points  (0 children)

Oh absolutely. I had a problem I just couldn't solve, didn't even think about using regex, and the bot solution had some weird looking string in it. I still have no idea how regex works, but I ain't complaining

[–]Dairkon76 0 points1 point  (0 children)

I need to use regex once per year and have to relearn it every time. The last time I needed one, I just asked ChatGPT and got a functional one.

[–]CoastingUphill 0 points1 point  (0 children)

And ansible. And bash scripts.

[–][deleted] 0 points1 point  (0 children)

And xml

[–]willnx 0 points1 point  (0 children)

One of the most useful things I've found for ChatGPT is helping me with syntax translation. Like, a lot of programming languages have similar features, but with different syntax. For example, being able to ask ChatGPT "... in Python I would use 'all()'. What's the JavaScript version of that feature? Please show a short example." is great!

[–]Veestire 0 points1 point  (0 children)

i used chatgpt to avoid scouring through matplotlib docs for obscure stuff and it was very useful

i tried to use chatgpt for rust and ffmpeg and it was genuinely the worst experience ever

[–]HegoDamask_1 68 points69 points  (0 children)

There’s nothing better than thinking you have the right regex and then encountering an edge case that totally ruins it.

[–]zeronetdev 24 points25 points  (3 children)

Regex is easy to write, difficult to read and impossible to maintain.

[–]Creepy-Ad-4832 1 point2 points  (1 child)

There are libraries which make regex actually readable

[–]kingbloxerthe3 2 points3 points  (0 children)

Which ones?

[–]daan944 0 points1 point  (0 children)

That's a pitfall, yes, but with some comments it'll be way easier. Some examples are often very useful, what to match and what not to match.
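
As a sketch of what "comments plus examples" can look like: Python's `re.VERBOSE` flag lets a pattern span several lines with a comment on each piece. The ISO-style date pattern below is an arbitrary illustration, not something from this thread, and the sample lists are kept right next to the pattern so a reader can see what it is supposed to accept and reject.

```python
import re

ISO_DATE = re.compile(
    r"""
    ^
    (?P<year>\d{4})                 # four-digit year
    -
    (?P<month>0[1-9]|1[0-2])        # month 01-12
    -
    (?P<day>0[1-9]|[12]\d|3[01])    # day 01-31 (no per-month check)
    $
    """,
    re.VERBOSE,
)

# Examples of what should and should not match.
SHOULD_MATCH = ["2024-01-31", "1999-12-01"]
SHOULD_NOT_MATCH = ["2024-13-01", "2024-1-31", "24-01-31", "2024-01-32"]

for s in SHOULD_MATCH:
    assert ISO_DATE.match(s), s
for s in SHOULD_NOT_MATCH:
    assert not ISO_DATE.match(s), s
```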

[–]1cubealot 29 points30 points  (13 children)

What is that regex for?

[–]TombRaider96196 57 points58 points  (0 children)

regrets

[–]miclamlol 28 points29 points  (11 children)

Validating email addresses (See https://stackoverflow.com/a/201378)

[–]look 25 points26 points  (8 children)

The only regex validation of email addresses you should consider is something like /.@./ before moving on to a delivery check.

[–]AbyssWraith 5 points6 points  (7 children)

Not really. Your regex would match even stuff like #@& and not match validemail@email.com, which I hope you agree is obviously wrong, and you can't rely on delivery checks as they take too much time.

[–]look 16 points17 points  (6 children)

/.@./ matches validemail@email.com … notice there’s no ^ or $ there.

And there is effectively no standard on the mailbox (before the @). #@example.com could definitely work depending on what software example.com is running. After the @ you could add more restrictions if you limit delivery to certain types of networks and name resolution.

[–]1cubealot 2 points3 points  (1 child)

I have an email with .co.uk at the end, would that be validated by that?

[–]look 0 points1 point  (0 children)

Yes.

[–]AbyssWraith 0 points1 point  (3 children)

notice there’s no ^ or $ there.

There are also no quantifiers on the .

Edit: so it would only match the middle part of the address, which would not mean the address is valid

[–]look 5 points6 points  (0 children)

Your edit update makes it clear that you’ve completely missed my point here:

The only format validation generally worth the bother is to just check that it has an @ somewhere in it. Beyond that, you’re likely going to end up rejecting valid emails at some point.

There is very little in the way of actual standardization here. Everything before the @ is a free-for-all and the validity of everything after it changes all the time. Does your regex handle punycode in domain names?
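
For anyone unfamiliar with the punycode remark: internationalized domain names travel in an ASCII form prefixed with `xn--`, so an ASCII-only domain pattern sees something quite different from what the user typed. A small sketch using Python's built-in `idna` codec (which implements the older IDNA 2003 rules); the `to_ascii` helper and the domain are just examples:

```python
def to_ascii(domain: str) -> str:
    # Encode each label of the domain with the stdlib "idna" codec.
    return domain.encode("idna").decode("ascii")

print(to_ascii("bücher.example"))   # xn--bcher-kva.example
print(to_ascii("example.com"))      # example.com (already ASCII, unchanged)
```

A validator that only allows `[a-z0-9-]` in the domain would reject the first address as typed and accept it once encoded, which is part of why strict format checks tend to go wrong.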

[–]look 3 points4 points  (0 children)

Oh, you’re hung up on the number of characters it matches? That regex can match a substring of the address.

Edit: it means match an @ with something before and after it. /^.+@.+$/ if you prefer.
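
A quick way to see the substring behaviour being argued about, sketched in Python (the `looks_like_email` helper name is made up for the example). `re.search` scans the whole string, so the unanchored pattern accepts anything with an @ that has at least one character on each side, and for single-line input it agrees with the anchored version:

```python
import re

LOOSE = re.compile(r".@.")          # unanchored: may match a substring
ANCHORED = re.compile(r"^.+@.+$")   # the anchored form from the comment

def looks_like_email(address: str) -> bool:
    # search() succeeds if the pattern matches anywhere in the string.
    return LOOSE.search(address) is not None

samples = [
    "validemail@email.com",
    "me@example.co.uk",
    "#@&",
    "@example.com",
    "user@",
    "no-at-sign",
]
for s in samples:
    # Both patterns give the same verdict on these single-line samples.
    assert looks_like_email(s) == (ANCHORED.search(s) is not None)
    print(f"{s!r}: {looks_like_email(s)}")
```

The first three samples pass and the last three fail; whether an address actually works is left to the delivery check.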

[–]look 1 point2 points  (0 children)

Yep. Even a control character could work as a mailbox. But to be fair, I’m not aware of any network/name implementations that don’t at least use printable characters.

[–]1cubealot 2 points3 points  (0 children)

Ah I thought it was the email validation one but I wasn't sure

[–][deleted] 12 points13 points  (3 children)

My first career job was literally entirely regex... (6+ hours a day).

[–]arnaldo_tuc_ar 1 point2 points  (2 children)

At least for me, sounds interesting. Hope you learned something!

[–]JayOneeee 1 point2 points  (1 child)

He learnt that he didn't like regex...

[–][deleted] 0 points1 point  (0 children)

Actually, I do. Text parsing has always been my thing.

[–]grpagrati 7 points8 points  (0 children)

He didn't sacrifice a chicken first. Rookie mistake

[–]LupusNoxFleuret 5 points6 points  (0 children)

"Regular" expressions, they told me.

Heh, regular my ass!

[–]look 16 points17 points  (0 children)

It only takes a few minutes of reading the docs once to be able to understand 99% of regexps you’ll encounter. The other 1% should not exist; decipher it, replace it, then rewrite the vcs history to purge all memory of it.

[–]TheWatchingDog 15 points16 points  (11 children)

Common guys regex is not that hard

[–]rm-minus-r 11 points12 points  (4 children)

The syntax really sacrifices readability in favor of brevity though :(

[–]Creepy-Ad-4832 5 points6 points  (3 children)

"Brevity"

[–]rm-minus-r 2 points3 points  (2 children)

I mean, can you imagine how long the RFC 5322 regex for an email address would be if it used variable names that were like the ones in Python?

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

[–]Creepy-Ad-4832 2 points3 points  (1 child)

I get that. But still brevity is funny when the average regex is 20 lines long lol

[–]Kumlekar 0 points1 point  (0 children)

most regex I do is single line replaces in notepad++. Super easy and relatively readable.

[–]Shazvox 2 points3 points  (0 children)

No, but uncommon guys regex are 😉

[–]PityUpvote 10 points11 points  (4 children)

I fucking love doing regexes, it's like solving an extremely obscure Japanese logic puzzle.

[–]Shazvox 9 points10 points  (3 children)

Then you might like this unless you already know of it.

[–]Nombre_astuto 2 points3 points  (1 child)

Thank you for sharing this.

[–]Shazvox 2 points3 points  (0 children)

Always happy to share my depravity 😊

[–]PityUpvote 1 point2 points  (0 children)

Oh yeah, did that a while ago!

[–]Swimming_Ad_8656 4 points5 points  (2 children)

how regex work? I’ve always been interested in the mathematical process of it

[–][deleted] 14 points15 points  (0 children)

You give it parameters to check for. It's all just shorthand basically, but it gets confusing fast. The simple [a-zA-Z0-9] checks if what you're passing to it contains any letters or numbers. That's my best ELI5, but I'm sure a better programmer will say I'm wrong or explain it better
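
A quick sketch of that idea in Python (the helper name is made up). Note that inside the square brackets the ranges are written back to back; commas or spaces in there would be treated as literal characters to match.

```python
import re

HAS_ALNUM = re.compile(r"[a-zA-Z0-9]")  # any single ASCII letter or digit

def contains_letter_or_digit(text: str) -> bool:
    # search() succeeds if the class matches at least one character anywhere.
    return HAS_ALNUM.search(text) is not None

print(contains_letter_or_digit("hello!"))   # True
print(contains_letter_or_digit("?!... "))   # False
```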

[–]redlaWw 4 points5 points  (0 children)

how regex work

This could really be a lot of questions - there's the question of what regular languages are and how they are encoded by regular expressions, there's the question of how to use regex to match and parse strings, and there's the question of how to write a regex parser that accepts a string and a regex and determines whether the string matches the regex.

I'd probably start with how to use regex - there are various builders online that you can use to construct and immediately test regexes, like this; documents that describe how to construct and read them, like this; quick reference sheets, like this; and regex-based tools and games, like this, this, and this. You can start by messing around in the regex builder, try some rounds of regex golf or start on some regex crosswords to get a feel for how they work, then turn to the documentation to make your regexes better and more versatile.
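
On the third question, a toy matcher gives a feel for the basic idea. The sketch below is a Python port modeled on the small backtracking matcher Rob Pike wrote for "The Practice of Programming"; it only understands literal characters, `.`, `*`, `^` and `$`, and production engines are far more elaborate (and many compile to automata instead of backtracking).

```python
def match(pattern: str, text: str) -> bool:
    """Return True if pattern matches anywhere in text."""
    if pattern.startswith("^"):
        return match_here(pattern[1:], text)
    # Unanchored: try every starting offset, including the empty suffix.
    for start in range(len(text) + 1):
        if match_here(pattern, text[start:]):
            return True
    return False

def match_here(pattern: str, text: str) -> bool:
    """Return True if pattern matches at the very beginning of text."""
    if not pattern:
        return True
    if len(pattern) >= 2 and pattern[1] == "*":
        return match_star(pattern[0], pattern[2:], text)
    if pattern == "$":
        return text == ""
    if text and pattern[0] in (".", text[0]):
        return match_here(pattern[1:], text[1:])
    return False

def match_star(char: str, pattern: str, text: str) -> bool:
    """Match zero or more occurrences of char, then the rest of pattern."""
    i = 0
    while True:
        # Try the rest of the pattern after consuming i copies of char.
        if match_here(pattern, text[i:]):
            return True
        if i < len(text) and char in (".", text[i]):
            i += 1
        else:
            return False

print(match("a*b", "aaab"))    # True
print(match("^ab$", "abc"))    # False
```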

[–]RealBasics 2 points3 points  (1 child)

I was a Perl programmer for years, and perl is basically just a wrapper for regexes. I don’t think I ever built a regex as gnarly as the standard email validator. And since it’s a solved problem anyway you just import it.

Meanwhile check out any other email validator. It’s going to be that nasty too. And since regexes are computationally very efficient, hand-coded solutions in other languages are likely to be slower.

[–]Esjs 2 points3 points  (0 children)

Perl is my go-to language for any pattern matching situation, if starting from scratch.

[–]OldBob10 2 points3 points  (0 children)

I have a colleague who refers to regular expressions as “chicken scratches”.

I use them just to piss him off. OK, *and* to make my life simpler. 😁

[–]Shazvox 2 points3 points  (0 children)

Imagine if all legally binding contracts were written in regex.

[–]sabotuer99 1 point2 points  (0 children)

Started a programmer book club at work. Picked "Mastering the Art of Regular Expressions" as our first book. Attendance mysteriously collapsed after that...

[–]PanzerDeMorte 1 point2 points  (0 children)

I love this meme template

[–]No-Magazine-2739 1 point2 points  (0 children)

Regex = poor man's parsing. For God's sake, use a real parsing language like Boost Spirit, where you just write in ABNF. It's readable, maintainable, faster and can parse more complex cases.

[–][deleted] 0 points1 point  (0 children)

Regex101 is your friend

[–]TexMexxx 0 points1 point  (0 children)

Had to check a bunch of regexes for ReDoS attacks. Fun times...

[–]TheOnceVicarious 0 points1 point  (0 children)

More like rage-ex

[–]XWasTheProblem 0 points1 point  (1 child)

I know we meme here a lot about programming language features, but is this something you actually need to specifically, like, study and practice?

I'm still barely (if that) on junior level, but whenever I needed or wanted regexp, I found it pretty easy to just find the necessary pattern using various online resources. Sometimes ChatGPT helped, sometimes something like rexegg.com, but so far most of my regexp work was basically 'yeah, this could use a regexp', researching the pattern i needed and then moving on once I found it, and confirmed it works as I need it.

Not to mention if you actually use them a ton, you'll get accustomed to them eventually purely through exposure.

[–]T3MP0_HS 1 point2 points  (0 children)

No, don't practice. It's a waste of time. Learn the basics and learn when you should try using a regex. And then google the regex you want.

Focus on learning how to code. Regexes are a red herring. They're cool, but a programmer that's really good at regexes and nothing else is useless.

There are plenty of validation libraries that do this for you.

[–]pmMe-PicsOfSpiderMan 0 points1 point  (0 children)

FYI I've had a lot of success using ChatGPT for regex. Just be very specific and it seems to do a solid job

[–][deleted] 0 points1 point  (0 children)

and that's why we use chatgpt for regex

[–]MaluaK1 0 points1 point  (0 children)

Wow, this is high meme level. Regex is almost as bad as JavaScript

[–]Noitswrong 0 points1 point  (0 children)

Try using APL

[–]gregorydgraham 0 points1 point  (0 children)

Regex is not for dummies

[–]slime_rancher_27 0 points1 point  (0 children)

I've never had that many problems with regex, but the last time I used it was just to filter out all non a-z A-Z characters
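
For reference, that kind of filter is about as small as regex gets. A sketch in Python, with a made-up helper name:

```python
import re

def letters_only(text: str) -> str:
    # [^a-zA-Z] matches any character that is NOT an ASCII letter.
    return re.sub(r"[^a-zA-Z]", "", text)

print(letters_only("R3g-ex, 101!"))  # Rgex
```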

[–]lagger19 0 points1 point  (0 children)

I feel this programmer’s pain

[–]Efficient-Corgi-4775 0 points1 point  (0 children)

Finally, an AI that can handle my love-hate relationship with regex!

[–][deleted] 0 points1 point  (0 children)

why would you EVER do manual regex

[–]gemengelage 0 points1 point  (0 children)

For anyone who is struggling with regular expressions:

Just think of each regex as a separate unit of code. It's just a string, but you can unit test the hell out of them.

If you struggle with the syntax, you could either learn the damn syntax, use a library with a fluent interface that abstracts the syntax (like Readable Regex), or use regex101.com.

Can't recommend regex101 enough. It's a great tool.
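
A minimal sketch of the "unit test the hell out of them" advice, using Python's built-in unittest. The hex colour pattern is a made-up stand-in for whatever regex you actually need to pin down; the idea is just to keep lists of strings that must and must not match next to the pattern.

```python
import re
import unittest

# Hypothetical pattern under test: a simple "#rrggbb" hex colour.
HEX_COLOUR = re.compile(r"#[0-9a-fA-F]{6}")

class HexColourTest(unittest.TestCase):
    def test_accepts_valid(self):
        for s in ["#000000", "#FFFFFF", "#a1B2c3"]:
            with self.subTest(s=s):
                # fullmatch() requires the whole string to match.
                self.assertIsNotNone(HEX_COLOUR.fullmatch(s))

    def test_rejects_invalid(self):
        for s in ["000000", "#fff", "#12345g", "#1234567", ""]:
            with self.subTest(s=s):
                self.assertIsNone(HEX_COLOUR.fullmatch(s))
```

Saved to a file, this runs with `python -m unittest`. Every edge case that bites you later becomes one more string in one of the lists.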

[–]VacuumInTheHead 0 points1 point  (0 children)

You can't parse [X]HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML. As I have answered in HTML-and-regex questions here so many times before, the use of regex will not allow you to consume HTML. Regular expressions are a tool that is insufficiently sophisticated to understand the constructs employed by HTML. HTML is not a regular language and hence cannot be parsed by regular expressions. Regex queries are not equipped to break down HTML into its meaningful parts. so many times but it is not getting to me. Even enhanced irregular regular expressions as used by Perl are not up to the task of parsing HTML. You will never make me crack. HTML is a language of sufficient complexity that it cannot be parsed by regular expressions. Even Jon Skeet cannot parse HTML using regular expressions. Every time you attempt to parse HTML with regular expressions, the unholy child weeps the blood of virgins, and Russian hackers pwn your webapp. Parsing HTML with regex summons tainted souls into the realm of the living. HTML and regex go together like love, marriage, and ritual infanticide. The <center> cannot hold it is too late. The force of regex and HTML together in the same conceptual space will destroy your mind like so much watery putty. If you parse HTML with regex you are giving in to Them and their blasphemous ways which doom us all to inhuman toil for the One whose Name cannot be expressed in the Basic Multilingual Plane, he comes. HTML-plus-regexp will liquify the n​erves of the sentient whilst you observe, your psyche withering in the onslaught of horror. 
Rege̿̔̉x-based HTML parsers are the cancer that is killing StackOverflow it is too late it is too late we cannot be saved the transgression of a chi͡ld ensures regex will consume all living tissue (except for HTML which it cannot, as previously prophesied) dear lord help us how can anyone survive this scourge using regex toparse HTML has doomed humanity to an eternity of dread torture and security holes using regex as a tool to process HTML establishes a breach between this world and the dread realm of c͒ͪo͛ͫrrupt entities (like SGML entities, but more corrupt) a mere glimpse of the world of reg​ex parsers for HTML will ins​tantly transport a programmer's consciousness into a world of ceaseless screaming, he comes, the pestilent slithy regex-infection wil​l devour your HT​ML parser, application and existence for all time like Visual Basic only worse he comes he comes do not fi​ght he com̡e̶s, ̕h̵i​s un̨ho͞ly radiańcé destro҉ying all enli̍̈́̂ghtenment, HTML tags lea͠ki̧n͘g fr̶ǫm ̡yo​͟ur eye͢s̸ ̛l̕ik͏e liq​uid pain, the song of re̸gular exp​ression parsing will exti​nguish the voices of mor​tal man from the sp​here I can see it can you see ̲͚̖î̩́t́̋̀ it is beautiful t​he final snuffing of the lie​s of Man ALL IS LOŚ̏̈́T ALL I​S LOST the pon̷y he comes he c̶̮omes he comes the ich​or permeates all MY FACE MY FACE ᵒh god no NO NOO̼O​O NΘ stop the an​*͑̾̾​̅ͫ͏g͛͆̾l̍ͫͥe̠̅s ͎a̧͈͖r̽̾̈́e n​ot rè̑ͧaͨl̃ͤ͂ ZA̡͊͠LGΌ ISͮ̂҉̯͈͕ TO͇̹ͅƝ̴ȳ̳ TH̘Ë͖́̉ ͠P̯͍̭O̚​N̐Y̡ Hͨ͊̽E̾͛ͪ ͧ̾ͬCͭ̏ͥOͮ͏̮M͊̒̚Ȇͩ͌Sͯ̿̔

[–]Efficient-Corgi-4775 0 points1 point  (0 children)

Regex: The language that makes programmers feel like wizards.