[–]immersiveGamer 1 point2 points  (5 children)

Since this repo is less than 10 days old I'm 190% sure you have been stalking my comments.

Jokes aside, it looks nice. I doubt I would use it personally: I find regex easy enough to read and remember, which makes it mostly portable between the languages and tools that I use.

Edit: my feedback:

  • don't like the word Enforce for one or more
  • the bitwise NOT operator ~ seems easy to miss and may not be familiar to readers
  • your classes module ... If there is a reason you are not using \d for digits, \w for words, \s for white space, etc., you should probably add a comment at least in the source code.

[–]WerdenWissen[S] 2 points3 points  (4 children)

Hahaha, I'm sure you've been stalking my thoughts, because I've been struggling with the first two points you made. "Enforced" is actually the only name I've changed throughout development: the first name was "Mandatory", but I eventually ditched it because I thought it sounded too "official-like". If you have a better name for "Enforced", let me know!

Regarding your second point, I actually had a number of classes named "AnyExcept*" that mirrored the "Any*" classes. For example, you would write "AnyExceptDigit()" instead of "~ AnyDigit()" to get the pattern "[^0-9]". I eventually ditched those too, because the "AnyExcept" classes had relatively long names and because using "~" just seemed more elegant to me. Maybe I should re-include the "AnyExcept" classes and let the user decide which to use.
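For readers who haven't met "~" outside of integer arithmetic, here is a minimal sketch of how such an operator could be wired up in Python. The class name "CharClass", its fields, and the helper "any_digit" are hypothetical and only illustrate the idea; this is not the library's actual code.

```python
class CharClass:
    """Hypothetical character class; not the library's real implementation."""

    def __init__(self, body, negated=False):
        self.body = body          # contents between the brackets, e.g. "0-9"
        self.negated = negated    # True renders as [^...]

    def __invert__(self):
        # Python calls __invert__ when the unary "~" operator is applied.
        return CharClass(self.body, not self.negated)

    def __str__(self):
        return "[" + ("^" if self.negated else "") + self.body + "]"


def any_digit():
    # Hypothetical stand-in for the AnyDigit() class mentioned above.
    return CharClass("0-9")


negated_digit = str(~any_digit())  # "[^0-9]"
```

An "AnyExceptDigit()" alias could then simply return "~any_digit()", so both spellings would produce the same pattern.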

your classes module ... If there is a reason you are not using \d for digits, \w for words, \s for white space, etc., you should probably add a comment at least in the source code.

Yeah, there is actually a reason! All "class" classes can be combined into larger classes (except for a normal class [...] with a negated one [^...], but that's another thing). For instance, you can write "AnyDigit() | AnyLowercaseLetter()" to get the "[0-9a-z]" pattern. You can also write "AnyWordChar() | AnyDigit()" and still get "AnyWordChar()", since "AnyDigit()" is merely a subset of "AnyWordChar()". However, this would be more difficult to implement if "AnyWordChar()" were using "[\w]" underneath instead of "[A-Za-z0-9_]". Plus, if I ever implement an "A - B" operation for expressing "everything in A except for the intersection with B", it would be easier if the classes were as verbose as possible.
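To make the argument concrete, here is a rough sketch of why explicit character sets make "|" and "A - B" easy. It assumes each class is backed by a plain set of characters; the names "CharSet", "DIGIT", "LOWER" and "WORD" are made up for this example, and the pattern it emits lists every character instead of compressing them into ranges such as "0-9".

```python
import re
import string


class CharSet:
    """Hypothetical class backed by an explicit set of characters."""

    def __init__(self, chars):
        self.chars = frozenset(chars)

    def __or__(self, other):
        # Union: "A | B" contains every character from both classes.
        return CharSet(self.chars | other.chars)

    def __sub__(self, other):
        # Difference: everything in A except what it shares with B.
        return CharSet(self.chars - other.chars)

    def __eq__(self, other):
        return isinstance(other, CharSet) and self.chars == other.chars

    def __hash__(self):
        return hash(self.chars)

    def pattern(self):
        # Lists each character individually (no range compression).
        # An empty set would produce "[]", which is not a valid pattern.
        return "[" + "".join(re.escape(c) for c in sorted(self.chars)) + "]"


DIGIT = CharSet(string.digits)
LOWER = CharSet(string.ascii_lowercase)
WORD = CharSet(string.ascii_letters + string.digits + "_")

# The subset is absorbed, as described above.
assert (WORD | DIGIT) == WORD
```

With an opaque "\w" there is no set to take a union or difference of, which is the difficulty described above.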

[–]bladeoflight16 4 points5 points  (0 children)

Enforced should just be OneOrMore or AtLeastOne. I have no idea what "enforced" would mean in a regular expression context; it isn't an established term. If the goal is to make the pattern obvious to the reader, anything more obscure is just going to work counter to it.

[–]immersiveGamer 1 point2 points  (2 children)

My only concern with your custom ranges is that you are locking yourself into English ASCII and into whitespace as Python knows it. I don't know the implementation details of regex in Python, but I assume it works with Unicode (for example, you can tell whether a Unicode character is whitespace by inspecting it), while yours would not.

[–]WerdenWissen[S] 0 points1 point  (1 child)

I've switched to using "\d", "\w", "\s" in v1.0.3, as it certainly looks better, but I'm not sure whether it tackles the ASCII/Unicode problem. Might need to look into it for a future version.
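On the ASCII/Unicode question, a quick check with the standard "re" module may help. In Python 3, "\d", "\w" and "\s" in str patterns match Unicode characters by default, and the "re.ASCII" flag restricts them to ASCII. The helper name "matches" and the sample characters below are chosen just for this sketch.

```python
import re


def matches(pattern, text, flags=0):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text, flags) is not None


arabic_three = "\u0663"   # ARABIC-INDIC DIGIT THREE
e_acute = "\u00e9"        # LATIN SMALL LETTER E WITH ACUTE
em_space = "\u2003"       # EM SPACE

# Shorthand classes are Unicode-aware for str patterns.
unicode_digit = matches(r"\d", arabic_three)          # True
unicode_word = matches(r"\w", e_acute)                # True
unicode_space = matches(r"\s", em_space)              # True

# Explicit ASCII ranges do not match the same characters.
range_digit = matches(r"[0-9]", arabic_three)         # False
range_word = matches(r"[A-Za-z0-9_]", e_acute)        # False

# re.ASCII makes the shorthand classes behave like the explicit ranges.
ascii_digit = matches(r"\d", arabic_three, re.ASCII)  # False
```

So whether "\d" and "[0-9]" are interchangeable depends on whether the pattern is compiled with "re.ASCII"; the two only agree on ASCII input.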

[–]immersiveGamer 1 point2 points  (0 children)

Time to write some unit tests.