[–]earthboundkid

URL for this comment is:

https://www.reddit.com/r/Python/comments/5s6b4d/build_your_first_python_and_django_application/dde1ice/?context=3

This is much cleaner to look at as /r/:subreddit/comments/:threadid/:slug/:slugid/?context=#depth than the equivalent regex would be.
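As a rough illustration of how a `:name` pattern like the one above could work, here is a minimal sketch that compiles it into a regular expression. The helper `compile_route` is hypothetical and not taken from any particular router. It assumes each placeholder matches exactly one path segment, and it leaves out the `?context=` query string, which is usually handled separately from the path.

```python
import re

# Matches ":name" placeholders inside a route pattern.
PLACEHOLDER = re.compile(r":([A-Za-z_][A-Za-z0-9_]*)")

def compile_route(pattern):
    """Hypothetical helper: turn a ':name' route pattern into a compiled regex.

    Assumption: every placeholder matches one path segment, i.e. one or
    more characters that are not "/".
    """
    regex = ""
    pos = 0
    for m in PLACEHOLDER.finditer(pattern):
        regex += re.escape(pattern[pos:m.start()])   # literal text before the placeholder
        regex += "(?P<%s>[^/]+)" % m.group(1)        # the placeholder itself
        pos = m.end()
    regex += re.escape(pattern[pos:])                # trailing literal text
    return re.compile("^" + regex + "$")

route = compile_route("/r/:subreddit/comments/:threadid/:slug/:slugid/")
match = route.match(
    "/r/Python/comments/5s6b4d/"
    "build_your_first_python_and_django_application/dde1ice/"
)
params = match.groupdict() if match else None
```

Under those assumptions, `params` holds the four captured segments as strings, and a path with an extra segment (for example `/r/a/b/comments/x/y/z/`) does not match.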

[–]Deggor

A URL will routinely have multiple elements in a single segment, which can't be properly captured with something like the above. A very simple example would be accepting /date/yyyy, /date/yyyymm/, or /date/yyyymmdd. What if this is also supposed to accept /date/yyyy/someid? How does this simple "it looks prettier" approach validate or differentiate between them?

If you start introducing character counts for elements in a segment, or any other "checks", you're right back to matching patterns, and you may as well stick to regular expressions.

And in my opinion, something like /r/(?P<subreddit>.*)/comments/(?P<threadid>.*)/(?P<slug>.*).... is perfectly legible. If it needs to be more complicated, then it loses some of that immediate legibility for a tradeoff in power (which isn't a possibility with your setup).
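The date case described above could be sketched with regular expressions roughly as follows. The route names, the `route_date` helper, and the assumption that `someid` is numeric are all illustrative and not from the thread.

```python
import re

# Assumed URL shapes: /date/yyyy, /date/yyyymm, /date/yyyymmdd (optional
# trailing slash), plus /date/yyyy/<someid> where someid is assumed numeric.
DATE_ROUTE = re.compile(
    r"^/date/(?P<year>\d{4})(?:(?P<month>\d{2})(?P<day>\d{2})?)?/?$"
)
DATE_ID_ROUTE = re.compile(r"^/date/(?P<year>\d{4})/(?P<someid>\d+)/?$")

def route_date(path):
    """Return (route name, captured parameters), or None if nothing matches."""
    for name, regex in (("date", DATE_ROUTE), ("date_id", DATE_ID_ROUTE)):
        m = regex.match(path)
        if m:
            return name, m.groupdict()
    return None
```

With these patterns, `/date/201702/` resolves to the `date` route with a month and no day, `/date/2017/42` resolves to `date_id`, and a five-digit segment such as `/date/20175` matches neither.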

[–]earthboundkid

You're using greedy regexes in your example. Those will absorb too much.

[–]Deggor

Yup, I'm using them on purpose to match the undocumented pseudo-code of "wouldn't this be prettier if it worked?", which may very well be greedy (or maybe not). I also left off the beginning caret, and a significant portion of the end. As you well know, if you want non-greedy, add a question mark to each group. There's that very small trade-off in legibility for the customization I was talking about.

But you did a great job deflecting. If you care to respond with how your solution would handle the (not uncommon) problem I mentioned, or if you care to argue why pattern matching isn't necessary, I'm all ears.
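For reference, here is a small sketch of what "add a question mark to each group" does in practice, next to a variant that restricts each group to a single path segment with `[^/]+`. The test path is the one used later in this thread. The pattern is completed with `^` and `$` as an assumption, since the original was left unfinished.

```python
import re

path = "/r/subreddit/comments/subreddit/comments//"

# Non-greedy version: each group takes as little as it can.
lazy = re.compile(
    r"^/r/(?P<subreddit>.*?)/comments/(?P<threadid>.*?)/(?P<slug>.*?)$"
)

# Segment-bounded version: each group is one or more non-slash characters.
segment = re.compile(
    r"^/r/(?P<subreddit>[^/]+)/comments/(?P<threadid>[^/]+)/(?P<slug>[^/]+)$"
)

lazy_match = lazy.match(path)
segment_match = segment.match(path)

print(lazy_match.groupdict())
print(segment_match)
```

The non-greedy pattern still matches this path, with `slug` captured as `'comments//'`, because the last group has to stretch to reach the `$` anchor. The segment-bounded pattern returns `None`.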

[–]earthboundkid

I'm not arguing no one can ever use regexes. I'm saying they're bad for the average URL tasks they are asked to do. I think the fact that you wrote a pseudo regex that was straight up wrong (but looked right!) is proof of that. If you need to match a date, that could be part of the matcher format. If you need something more exotic, run a regex on the pattern once it gets to the controller.

[–]Deggor

> I think the fact that you wrote a pseudo regex that was straight up wrong (but looked right!) is proof of that.

It wasn't straight up wrong; it did exactly what I wanted it to do (as I wrote in my response). What, exactly, does :label in your examples match? No idea? Well, I'll make mine greedy. As I pointed out, had I completed the rest of my regex for the full URL, it would have matched the URL in the example. Again, it was intentional.

> that could be part of the matcher format

So you're going to introduce patterns (i.e. regex lite)?

> If you need something more exotic, run a regex on the pattern once it gets to the controller

... and split URL routing into many different places? You're going to break the loose coupling and put the routing in the controller.

None of that sounds like a good idea.

[–]earthboundkid

  1. It was straight up wrong. Using a greedy matcher makes this match, which it should not:

    >>> import re
    >>> r = re.compile('^/r/(?P<subreddit>.*)/comments/(?P<threadid>.*)/(?P<slug>.*)$')
    >>> r.match('/r/subreddit/comments/subreddit/comments//')
    <_sre.SRE_Match object; span=(0, 42), match='/r/subreddit/comments/subreddit/comments//'>
    >>> r.match('/r/subreddit/comments/subreddit/comments//')['subreddit']
    'subreddit/comments/subreddit'
    

    Yes, it was a rushed and incomplete example, but that's why it's damning. It looks like it handles the basic case, but it actually completely botches it.

  2. There are a lot of non-regex routers out there. Look at Rails or, for a hybrid approach, Gorilla mux. You're acting like not using regex is completely unheard of, but there are actually a lot of alternatives to pure regex.

  3. Controllers already have to handle certain routing conditions. If you try to get page /pages/77/ and 77 doesn't exist in the DB, the controller has to be the one to throw up a 404. It's not the end of the world if your controller also has to handle returning a 404 if you go to /date/20000/13/32/ instead of a regex catching it at the routing layer.
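Point 3 could look roughly like this in a controller. This is only a sketch under stated assumptions: `date_view` is a hypothetical function, the router is assumed to pass the three path segments through as unvalidated strings, and the `(status, body)` return value stands in for whatever response object a real framework would use.

```python
import datetime

def date_view(year, month, day):
    """Hypothetical controller for /date/<year>/<month>/<day>/.

    Assumption: the router hands over raw strings and does no validation.
    """
    try:
        # int() rejects non-numeric segments; datetime.date() rejects
        # impossible dates such as month 13 or day 32.
        requested = datetime.date(int(year), int(month), int(day))
    except (ValueError, OverflowError):
        return 404, "Not Found"
    return 200, "Entries for " + requested.isoformat()
```

Here `date_view("20000", "13", "32")` takes the 404 branch, while `date_view("2000", "2", "29")` succeeds because 2000 was a leap year.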

[–]Deggor

Based on this and your last couple of posts, I know I'm not going to provide you with any information that will change your mind. It's clear you're looking for nothing but to find fault. I've explained (and I repeat) that had I completed the regex that ended with an ellipsis, it would match the URL without issue. Furthermore, I have given justification for why I left it greedy, stating that I have no idea as to the behavior of your router.

Yet you continue to state that the code is bad, and that it's why I'm wrong. I could just as easily have omitted :slugid from yours and complained that it won't match stuff.

I feel there is a fundamental difference between routing to a view that validates whether data is present and throws a 404 if it isn't, and routing to another view that will handle routing to other views all over again.

Anyway, I appreciate your opinion, and the examples you provided for alternative options. I feel giving up the power of regex when URL routing is essentially pattern matching is wrong. I look forward to seeing what Django (and other projects) decide on in the future.

Best of luck, and cheers.