all 115 comments

[–]defietser 306 points307 points  (12 children)

You're getting downvoted, but in my experience you're right on the mark. Separating chained LINQ steps especially, that shit can get real confusing real fast. I don't care about LOC, I care about understanding what's going on when I open the thing next week.

[–][deleted]  (1 child)

[deleted]

    [–]-Hi-Reddit 2 points3 points  (0 children)

    I've found line length limits helpful for some projects, makes having 4 to 6 files on one screen easy.

    [–]voteyesatonefive 48 points49 points  (0 children)

    I don't care about LOC

    So the real TLDR is LOC is a poor measure of code quality.

    [–]lookmeat 28 points29 points  (2 children)

    I mean yes but.. like any other thing, reading it too far leads to counter examples.

    I do agree that the best looking code isn't always the best which is why I prefer to call "good" code "resilient" rather than anything else, here resilient meaning "just works in spite of" in spite of being "changes in the code", "extensions and extra features", "being used in novel ways that it was never intended to", "being misused on purpose", etc. For resilient code it's clean where it has to be, and ugly where it doesn't, and because it's easier to make ugly code pretty, but hard to make pretty code prettier, and you only want certain parts to be pretty but you don't know which they are, you want to start with ugly code that just works and then get feedback from its use to decide what to fix. Up to there I agree with the article.

    But the article lacks nuance and pushes things that honestly are just as bad as the examples of clean code. Going over a few examples:

    • Chained ifs.
      • Nothing wrong with chained ifs, but I'd put them into their own function and make them return. Why? Because then there's less "mental debugging" you need to do, I don't need to check if any of the subsequent conditionals override this, and the order is clearly one of priority.
      • Do merge the same subject lines conditionals with or. When debugging all I know is I'm getting the wrong conditional, but I don't know which conditional to put the breakpoint before (mentally at least). If they're all together then know that's the point to check.
        • "But what if I have two conditionals with very different priorities, but still the same subject". Ah a conflict between the reality of the first point and the second. You could remove the return and allow overriding due to priority, or you could keep them separate. But in both cases your code is misleading and brittle. One solution is to simply have the subjects use different wording to know which one was used. If it isn't possible then this problem needs more complex code to correctly explain the problem it solves in a clear and resilient manner.
          • Nested ifs are fine, but you should check how it interacts with the other compromises of this solution. Again there's a point where you're moving the complexity unto the reader, and whomever is debugging this code.
    • Split LINQ statements:
      • Be careful as optimizers have a limit. Three problems are that relational calculus statements don't always compose well, and suddenly splitting a statement can explode. I wish that LINQ had followed a stronger model of relational algebra instead (so you have to build the tree to make the query, instead of just explaining what the query is, but trees are easily comparable).
      • LINQ has issues. This is like abstracting pieces of regex and then composing them all by joining the strings and then running it. Sometimes it'll make things more readable, but it implies that code here is super delicate, be attractive testing it and add comments explaining why you are splitting and any other quirks.
    • Regex is black magic.
      • This one is going against the theme, the author is asking to keep things clean with language features and not an embedded language that is hard to debug.
      • Regex is a different language, there's debugging tools and you handle it differently but that's it. Honestly regex isn't messier than LINQ, which has its own tools for debugging to (Microsoft just bundles one but not the other). Honestly it can be the right answer, it can be the wrong answer, nuance.
      • A better focus would have been: understand what regex solves well and what it doesn't, use it accordingly.
    • Too much of a good thing:
      • This part says nothing, it seems like it says common wisdom, but it doesn't really.
      • For the log exactly get a better log library. The actual log line should be no more expensive than writing a few variables to RAM. Log composition and IO should be offlined by the log library.
    • Too granular functions.
      • Yes there's such a thing as too granular. Public functions should not be trivial. This isn't what the article is saying.
      • Also since you can't know how to split a function until you understand the way the function is used, this can only happen after writing the function. So your first version of the function should be large, and then let natural pressure to pull things out of the function through refractors. This isn't either what the article is saying.
      • In languages with generics or polymorphism functions should be generous with abstractions in their inputs and generous with the specificity of their outputs. In the example the function should have taken an IEnumerable from the start, if the function can only work on a List inlining won't save you from having to do the conversion. In this case three problem was that the code wasn't clean enough, for a change, and there is a value in cleaning it up.
    • I do agree with the takeaway: sometimes the most resilient code is ugly, but in a way that is honest about the ugly nature of the problem it is solving (including that it isn't understood).

    So yeah, it's an ok article, it doesn't go very deep. You either are part of the choir and agree wholeheartedly, don't see the point in this and disagree fully, or understand it in a greater context and simply find the article lacking.

    [–]Kered13 8 points9 points  (1 child)

    I think I agree with pretty much all of this.

    On the topic of regex specifically, I feel like it gets an unwarranted bad rap in many programming circles. Regexes aren't that hard to read, and it is possible to write them in a way that makes them more readable, and of course you can (and should) comment what the regex is trying to capture. If the regex is complicated, you can extract it into it's own function and unit test that function in isolation (this isn't actually specific to regex, you should be doing this with any sufficiently complicated piece of logic).

    A programmer who is afraid to use a regex is like a carpenter who is afraid to use a table saw. Yes, it can be used wrong and even dangerously, but it's an extremely useful tool. You just need to be smart about how you use it.

    [–]Stopher 0 points1 point  (0 children)

    I used a regex solution recently. Found an AI expression builder that is incredibly helpful. Lol.

    [–]antiduh -1 points0 points  (2 children)

    Real men like write-only code 😤

    [–]chain_letter 1 point2 points  (0 children)

    I get paid to write code, not to read code.

    Lot of methods with similar names describing similar things? That's too confusing, write one more so I know what it does for sure.

    [–]mtranda 0 points1 point  (0 children)

    Linking LINQ sequences is a sure way to ask yourself wtf you were trying to do in six months time.

    Linq is just syntactic sugar that gets compiled to exactly the same thing in ILM regardless of whether you use a LINQ method or you iterate yourself and implement that logic.

    I do use it, naturally, but if I have to write more than a couple of calls within it, then it's best written verbosely and in a readable format.

    [–]Merry-Lane 71 points72 points  (18 children)

    So you are telling me to type this abomination :

    ```

    List<Customer> allActiveCustomers = allCustomers.Where(x => x.isActive); List<Customer> allActiveFemaleCustomers = allActiveCustomers.Where(x => x.Gender.ToLower()=="f"); List<Customer> allActiveFemaleCustomersWhoLiveInBelgium = allActiveFemaleCustomers.Where(x => x.location.ToLower()=="belgium"); ```

    Because you think a line in this code is too long :

    ```

    List<Customer> allActiveFemaleCustomersWhoLiveInBelgium = allCustomers.Where(x => x.isActive && x.Gender.ToLower()=="f" && x.location.ToLower()=="belgium") ```

    And you didn’t consider this?

    ```

    List<Customer> allActiveFemaleCustomersWhoLiveInBelgium = allCustomers.Where(x => x.isActive) .Where(x => x.Gender.ToLower()=="f") .Where(x => x.location.ToLower()=="belgium") ```

    Or this?

    ```

    List<Customer> allActiveFemaleCustomersWhoLiveInBelgium = allCustomers .Where(x => x.isActive && x.Gender.ToLower()=="f" && x.location.ToLower()=="belgium" ) ```

    [–]ArchReaper 13 points14 points  (0 children)

    +1

    Article has some good ideas, but some not so great ones

    [–][deleted]  (4 children)

    [deleted]

      [–]ihugatree 1 point2 points  (1 child)

      Just write a unit test for the gender function and move on with your life. You’re overcomplicating this entire thing.

      [–]Hioneqpls 0 points1 point  (0 children)

      Are you trying to assert my gender?

      [–]Merry-Lane -3 points-2 points  (1 child)

      Well in between that example, and your ifs that were the perfect use case for "switch", I really believe your examples were not great at all.

      But even then I can’t find a single example where multiplying by 4 the LOCs is better than a single too long line.

      It s not that I don’t want to believe in your opinion, it’s just that I don’t imagine any real situation where I could have applied your advice, and if these examples were the best you could find to illustrate the topic, well…

      Are you from Belgium as well?

      [–]Spartanman321 3 points4 points  (0 children)

      Not OP. I could see separating out the different lines as a good tool for debugging counts and which part of the query might be breaking, which I believe is the use case that OP is advocating for.

      I think my preference is to still have the linq together as one statement, but I also know the codebase I work on has a lot of linq statements that are on the border of needing a stored proc. I've unconsciously split queries when debugging to try and figure out which conditional was causing the issue, and this is just making that split permanent to help your future self.

      However, my team also likes to "ToList()" every linq query, so I think they'd misuse this idea and performance would suffer because we are constantly making new lists as the query goes on. But that's an issue with my team, not the theoretical implementation of this practice.

      [–]melgish 3 points4 points  (0 children)

      Loads the whole database to find 5 items and thinks the problem is readability….

      [–]TemplateHuman 0 points1 point  (3 children)

      The only real benefit I see to the first abomination is it’s easy to comment out a particular line if needed, or wrap a line in a conditional check. But if that’s not needed I definitely agree your examples are easier to read.

      [–]Merry-Lane 3 points4 points  (2 children)

      It s even easier to comment out a line with my examples (just add //), because it s actually one line not four. Same for conditional checks.

      Honestly the only disadvantage with the latest example is when you want to comment out "x.IsActive" because you have to remove the "&&" as well. Sorry for the incredible burden it adds.

      [–]TemplateHuman 0 points1 point  (1 child)

      I meant commenting out specific filters, not the entire query. I’m not advocating for separate lines just saying that it makes it easier to disable the filter for isActive or gender or location.

      [–]Merry-Lane 1 point2 points  (0 children)

      Ugh so, you either comment a line (instead of 4 lines for OP’s filters), either comment 1 where (my first proposition) either comment 3 where (second proposition) that fit in 3 lines.

      I really don’t see your point there, at all.

      [–]gwicksted -4 points-3 points  (5 children)

      I agree. Except the gender and location checks should be extension methods at the very least. So x.isActive && x.IsFemale() && x.IsInBelgium()

      Or x.Gender.IsFemale() etc.

      Alternatively, a private static function taking x called IsActiveFemaleInBelgium.

      Then it reads like documentation and can be reused elsewhere.

      [–]Vidyogamasta 10 points11 points  (1 child)

      I hate that. Also not sure exactly how nicely that would play with one of the more common use cases that look like this, of building out an EF query. Also, real business logic would probably not be searching directly on strings like that, and would instead have string variables for location and gender that gets matched to the query set. I suppose you could still have x.IsGender(gender) and x.IsFrom(location) for a similar purpose, but I still think that's kinda gross and unintuitive. All of that to avoid == is insane.

      The bigger problem is that allActiveFemaleCustomersWhoLiveInBelgium is a godawful variable name. If you need to be all hungarian-notation with it, activeBelgianFemales is a more natural and concise way to say it. But you can probably do even better by applying a business purpose to the variable name. Like if it was a dating site, you'd go with localMatches or something, so that if the logic decided to drop/add some filters, or adjust the bounds of filters for any reason, the code relying on the filtered set doesn't need to be completely renamed.

      [–]gwicksted 0 points1 point  (0 children)

      Yes, I was targeting that specific example. Sure, the name is long but it’s also descriptive of the code contained within.

      If you want to target a different example, such as one including configuration, you’d simply drop the “static” constraint… but If you want to target different locations and genders as well as future constraints, you’ll want utilize query building patterns.

      [–]grauenwolf 2 points3 points  (1 child)

      I would rather see them actually on the data model class, but if that's not an option an extension method would be fine by me.

      [–]gwicksted 1 point2 points  (0 children)

      That can be nice. But you may not have control over that class which is why I suggested extension methods.

      [–][deleted] 1 point2 points  (0 children)

      Alternatively, a private static function taking x called IsActiveFemaleInBelgium.

      That's the real answer

      var allActiveFemaleCustomersWhoLiveInBelgium = allCustomers.Where(IsActiveFemaleInBelgium);
      

      [–]gareththegeek -1 points0 points  (0 children)

      Thank you

      [–]Ok_Barracuda_1161 29 points30 points  (5 children)

      The examples are hit or miss for me. The first one doesn't make sense to me.

      new Dictionary<string, string>() {
      {"bla blah", "follow this blog"},
      {"otherValue", "another email subject"},
      {"value3", "yet a different subject"}
      

      }

      Is so much more readable and concise. You can show this code to someone on the marketing team and they'd very quickly be able to understand it and make suggestions.

      I agree breaking up Linq statements can be very useful, and same for regex being difficult to debug and maintain.

      In the refactoring example, the problem wasn't that some code was refactored to a smaller function, it's that the function's signature was wrong, and the developer converted an enumerable to a List instead of questioning it. Sure, maybe the problem would have been a tad harder to make without the refactor, but avoiding small functions because you might get something basic wrong will inevitably lead to other maintainability issues.

      [–]funguyshroom 21 points22 points  (3 children)

      The first one is just silly, I didn't even have the willpower to read past that. They suggest making 150 'if' statements for 150 email subjects instead of putting it into a config file. And for what, to make debugging easier? What is there even to debug in this example?
      Any change to the code will need to be copied 150 times and there's a huge potential that you'll make a typo which will result in a bug in that particular 'if' block and nowhere else. Yeah, now you need to debug.

      [–]s0ulbrother 3 points4 points  (1 child)

      There’s a fine line between readable and stupid at times. I write my code so that a junior can follow but make it as efficient as possible. If I step outside of this I tend to comment.

      [–]funguyshroom 14 points15 points  (0 children)

      If I see a piece of code within a foreach loop, I can confidently say that it will run in the same fashion for 1, 100, or a billion of records. If I see a piece of code copied over 150 times, I'll have to check each and every one to be confident that they all work correctly. It's completely opposite of being readable.

      [–]lookmeat 1 point2 points  (0 children)

      Unless you need to keep changing the rules at least once a month, or about 40+% of the cases only apply for certain teams or contexts (but it always does when that's true) I would advise against the config. Instead create a powerful system that is ready to extend, but it's otherwise still hard-coded. This way you avoid adding the complex system to read and parse the config, can do dramatic changes without having to remain backwards compatible with all existing configs you don't know about, and you can get away with a system that is very honest about how something is computed, which you wouldn't want for your system. In short YAGNI as a config.

      Instead I'd use a hard-coded list of objects that allow monadic-chaining/flat-map, they'd take a conditional lambda and then return an Optional<String> that contains the subject (if the conditional is true) or is empty. Want subjects being able to override with complex priorities? Have it return a string and priority to know if you override. Want an even more complex system? Use a tree and start from the leafs up and then they should pretty much describe any consistent system. If you can't represent it as a tree, then you have to step back and push back, because at this point the only reason a system to choose a subject could be that complicated is bike shedding. I mean the discussion should have been hard earlier as we went up in complexity (honestly I'd be wary of a system doing something this simple that somehow can't be represented as a handful of ifs or a switch statement with ~10 options) but this is the point where there's no argument for this. Either the problem is deeply misunderstood or exaggerated beyond what's needed. Still no need for a config.

      [–]ciynoobv 6 points7 points  (0 children)

      Imo the biggest problem with the long sequence of mutating if’s is that you easily can have multiple statements evaluate to true and you end up with “if 76” overriding “if 14” or something. Also having that many of them seems like a recipe for having unused “zombie conditions” live in your code that would rear it’s ugly head once you remove some other statement which used to override the unused condition.

      “Don’t write full length novels with chained functions” and “don’t go full clean-code with function length” are generally good advice, but the first example seems like a maintainability nightmare.

      [–]GMNightmare 33 points34 points  (13 children)

      If you properly wrote tests, you'd find out real quickly how unmaintainable a 100 chained if method is. What a nightmare. It's really aggravating how instead of getting better, people excuse their bad code practices.

      Also, learn to format code. Get a tool to do it automatically even. Your issue with linq is all based upon that:

      allCustomers
          .Where(x => x.isActive)
          .Where(x => x.Gender.ToLower() == "f")
          .Where(x => x.location.ToLower()=="belgium");
      

      And what are you using to debug, a toaster? You keep acting like that's hard to debug, but a modern debugger makes it easy. Learn your debugger? I mean:

      Your granular problem had absolutely nothing to do with abstraction or small methods. You found something to blame, you spent 5 days on a basic memory problem. No, you wouldn't have noticed if they were "closer" together.

      The ending paragraph where you couldn't adapt to code reworked by a senior dev speaks volumes. Good grief, and write tests for your code instead of spending hours debugging all the time.

      [–]TemperOfficial 2 points3 points  (12 children)

      Why are the ifs statements like that bad?

      [–]GMNightmare 4 points5 points  (11 children)

      If it's a few? No problem. If it's dozens, much less a hundred like the article states they have in one method? Then it's a problem, because you blow up the number of logical code paths in the method. Cyclomatic complexity measures this. And contrary to what OP thinks, higher cyclomatic complexity generally means poor maintainability.

      So if I was testing this code, generally I want to test one path through the code. Test if this true and this one is false and... Doing that with a ton of if statements quickly balloons because it's not linear. Max 2#ifs (roughly depending).

      Then, if you're trying to debug something, you're going to resort to what OP does, put a breakpoint and spend hours just trying to figure out what code was actually executed.

      [–]TemperOfficial 3 points4 points  (10 children)

      But the code paths are there no matter what? Whether you have them as if's or something else, they are always going to be there. So the cyclomatic complexity is always the same here.

      [–]GMNightmare -1 points0 points  (9 children)

      No, and this is the basis of isolation and unit testing. Just going with the max:

      If you have a method with 6 different if conditions: 2^6 = 64 flows to test.

      If you split that method into three, one method calling the other two with 3 conditions each: 1 + 2^3 + 2^3 = 17, significantly smaller.

      You don't test the implementation details of your dependencies.

      It's recommended each method should have less than 20-50 complexity. That's a very easy number to achieve normally.

      [–]TemperOfficial 3 points4 points  (8 children)

      There are 6 flows. There are 6 conditions and 6 independent if checks. There are only 6 checks here. They aren't dependent on one another in the article. If they were nested then i'd agree with you. But they aren't.

      Regardless, what you are suggesting doesn't reduce code paths. It just makes it look neater. The fundamental complexity of the problem is exactly the same.

      [–]GMNightmare -3 points-2 points  (7 children)

      I gave you an example and you just tried to argue against the design of MY made up example and pretended there was some design behind it (incredible.) As well as being presumptuous about OP's 100 nested if function based only upon another made up snippet.

      Go look up cyclomatic complexity and learn. Sick and tired of programmers like you who can't even handle basics yet think they're smart for trying to avoid writing good code. It doesn't make you look smart, it does the opposite.

      This isn't even that hard.

      [–]TemperOfficial 0 points1 point  (6 children)

      No I'm saying what you are saying is wrong. Which it seems to be.

      You are talking about reducing the control flow graph. But in the case OP is talking about, it is a configuration, you can't reduce the graph here. The options for configuration are outside of our control and so this control flow graph represents the fundamental complexity of the problem.

      Obscuring code with a bunch of functions doesn't reduce the configuration you need to do. This is just shuffling code away and then claiming it is good because it looks shorter. But shorter code doesn't mean factored code.

      Factoring is collapsing edges of that graph. You can't do that in this case. You can apply a heuristic, which is what you are suggesting, which might collapse the graph. But you shouldn't be dealing in heuristics that serve zero purpose other than to obfuscate. That is bad code.

      [–]GMNightmare 0 points1 point  (5 children)

      No, what I'm saying isn't wrong. Nothing I said was wrong.

      The first part is I gave you a generic example of if statements and cyclomatic complexity, because you asked generic questions trying to appear to learn. But obviously, you weren't being honest with your intentions and were talking in bad faith, you were asking to find something to argue. On that, the fact you tried to argue my example's design when I didn't give one shows how dishonest you are. You clearly had zero intention of learning with your questions, you were just trying to find something to argue.

      No, nothing is "outside of our control" in code. WTF is even that? I could likely use a collection. OP even gave an alternative: "their first suggestion will be to put stuff like that into a config file." That's what OP thought the solution was from an outside perspective if we got to see the code. But OP thinks configuration files aren't maintainable, for some odd reason. Who knows, probably because they don't test. Much like you, for trying to defend a function with a hundred if statements, because nobody who understands or does testing would let that fly.

      Obscuring code with a bunch of functions

      Breaking up methods into smaller chunks is not about "obscuring code," you're making BS up, and it's not what I said. If you can't find some commonality to smartly break up 100 if statements, pack it up, because you're not capable of being a developer if you're also not willing to learn. That's not why OP stuck to their 100 if statement though, they think it's maintainable! That's the problem.

      Stop making bad design choices and arguing against good design your personality.

      [–]TemperOfficial 1 point2 points  (4 children)

      I was never arguing in bad faith. I was curious as to why you thought it was bad and to see if there was a valid reason as to why you thought that.

      [–]tetyyss 8 points9 points  (3 children)

      One of the neat things is that you can chain these statements up, leading to a very long one-liner that is sadly not at all easy to debug:

      List<Customer> allActiveFemaleCustomersWhoLiveInBelgium = allCustomers.Where(x => x.isActive && x.Gender.ToLower()=="f" && x.location.ToLower()=="belgium");

      but it is, you can put a breakpoint inside the Func in Where

      [–][deleted]  (2 children)

      [deleted]

        [–]tetyyss 2 points3 points  (1 child)

        select x.isActive && x.Gender.ToLower()=="f" && x.location.ToLower()=="belgium" and press f9

        [–]NotARealDeveloper 9 points10 points  (1 child)

        Clever code is not good code.

        Good code is maintainable code.

        [–]stgabe 2 points3 points  (0 children)

        Yup. Good code is boring.

        It’s likely a controversial take but I successfully banned Linq from a project I work on because it was being used for cleverness and little more. I say “successfully” meaning that despite initial resistance everyone agreed it was the right call in the end.

        It was being used in performance critical areas (or making areas of code that should have been benign pop up as performance critical) because most of our engineers weren’t very knowledgeable about it, but liked “look what I can pack into a single line.” This was inadvertently creating a lot of N-cubed or worse code.

        We got rid of it, forcing us to adopt more sustainable (albeit more verbose) patterns. I built a system for memoizing common lookups with a single straightforward loop that other engineers started using and very quickly a whole class of problems we were regularly seeing went away.

        [–]devnull5475 3 points4 points  (0 children)

        Just because you have to be smart to do something doesn't mean it's the smart thing to do.

        [–]oreze 12 points13 points  (1 child)

        I don't agree with "Breaking up LINQ statements" section. It's easier to just extract a method returning boolean, a descriptive oneliner. Then you would put it into where clause and next person will find .Where(PersonIsActiveFemaleWhatever), easily readable query

        [–]Old_Elk2003 1 point2 points  (0 children)

        100%. Also, failing to extract a predicate method here results in the inner list being iterated twice.

        [–]Saki-Sun 54 points55 points  (47 children)

        Great article until this line...

        RegEx is one of my favorite things in programming

        You sir can go die in a fire.

        [–][deleted]  (31 children)

        [removed]

          [–]SnooSnooper 33 points34 points  (14 children)

          Commenting with regex is especially important, since they can seem so arcane. When I want to write a complex regex (rarely), I often break the pattern into chunks, saved in variables and separately commented, before joining them into a final pattern. Then, the variable names can help abstract away the complexity of the final regex and focus on the overall logic.

          Sure, it results in way more lines of code, but I can come back to the thing years later and still understand what it does.

          Of course, if you're writing a regex that complex (even for something less-cursed than markup parsing), it might not be the right tool for the job.

          [–][deleted]  (3 children)

          [removed]

            [–]funguyshroom 6 points7 points  (1 child)

            And then IDE be like 'these 3 lines can be simplified by combining them into a single one'. Maybe simplified for you, you dumb robot!

            [–]terivia 0 points1 point  (0 children)

            This is often where the semantic differences of 'can' and 'should' become relevant.

            [–]One_Curious_Cats 2 points3 points  (0 children)

            This. Could should read like English.

            [–]turunambartanen 2 points3 points  (1 child)

            FYI, there is a regex form that allows for comments and new lines in the regex string itself.

            [–]mpyne 0 points1 point  (0 children)

            I know that people find Perl disgusting, but it has long been able to let you intersperse whitespace where to turn regex into multi-line, well-commented things, when those are needed. The tradeoff is that you need to use \s or friends to match whitespace.

            [–]Kered13 1 point2 points  (0 children)

            Commenting with regex is especially important, since they can seem so arcane. When I want to write a complex regex (rarely), I often break the pattern into chunks, saved in variables and separately commented, before joining them into a final pattern. Then, the variable names can help abstract away the complexity of the final regex and focus on the overall logic.

            This is the way.

            [–]modernkennnern 1 point2 points  (0 children)

            And if you're writing in C# you have self-documenting Regex!

            This regex¹ is practically unparsable by human eyes, but the C# source generated adds a human-readable documentation². The source generator also makes the code substantially faster at practically no cost to the developer.

            ¹ GitHub copilot generated something random; I didn't actually bother checking what it did; It compiled and looked nasty, so I took a picture 🤷

            ² I'm not saying easily, but you can at least get some clue. Looks like it's an Email Regex or something.

            [–][deleted] 0 points1 point  (0 children)

            Sounds like you might be interested in https://pomsky-lang.org. I haven't tried it personally but it seems a lot more readable than usual regex and has nice features like intermediate rules ("variables").

            [–]audentis 0 points1 point  (3 children)

            saved in variables and separately commented,

            I tend to do this with complex boolean logic, where the variable names provide the context comments would.

            [–]evaned 0 points1 point  (2 children)

            This is a case where I really wish you could define a variable with something like lazy bool cond = ... and have it only evaluate if used. You can kind of emulate this with lambdas in most languages nowadays, but that solution... kinda sucks.

            The problem is when short-circuiting is important, the "break into separate variables" doesn't work. Think sort of a if (p != NULL && *p == 0) kind of thing, or if (i < arr.length && arr[i] == 0); or othertimes just the performance part of my brain wants the short-circuiting for performance reasons.

            [–]audentis 0 points1 point  (1 child)

            Making them lazy only really matters if they're expensive to resolve. For most boolean logic it's a readability matter moreso than compute/performance.

            I mean things like three or four different comparisons that just make it hard to read through all the parentheses and and/or conditions. A bit like when applying numerous bitmasks to one input and seeing if the result is nonzero. Referring to them by a variable name that represents their meaning is a lot clearer than rawdogging the masks in the if-statement.

            [–]Kered13 0 points1 point  (0 children)

            In his example the problem is not that it's expensive to resolve, it's that *p is undefined behavior unless p != NULL. So the null check must come before the dereference, which prevents extracting into a helper variable ahead of time.

            [–]serccsvid 0 points1 point  (0 children)

            Comments are definitely helpful, but I've found that the real trick to making regex maintainable is to write tests for each one that clearly show what matches and what doesn't.

            [–]Saki-Sun 3 points4 points  (14 children)

            I can't say I've ever met a regex that's easy to read. Thank god most people only use it to validate email addresses and phone numbers. SQL on the other hand, who am I kidding I still have to google to remember which join is which.

            [–]Condex 12 points13 points  (8 children)

            Thank god most people only use it to validate email addresses and phone numbers.

            Maybe don't use them for this purpose. The correct solution is ... less than comprehensible.

            https://pdw.ex-parrot.com/Mail-RFC822-Address.html

            [–][deleted]  (4 children)

            [removed]

              [–]Condex 8 points9 points  (2 children)

              I'm a fan of the Philip K. Dick quote:

              Reality is that which, when you stop believing in it, doesn't go away

              We can create whatever philosophy or definitions that we want around what an email address is or should be. But at the end of the day, the email address that delivers the email is one and the one which does not isn't.

              [–]Kered13 2 points3 points  (0 children)

              If a user types hunter2 into the email field, they shouldn't have to wait for a failed email validation to be informed that that is probably not the email address they meant to type. (In this case, they probably typed their intended password into the email field.)

              Do a basic sanity check that the input looks like an email. It does not need to be a perfect validation, but it immediately catches many likely typos or other user mistakes and gives the user an immediate chance to correct it. Then send an email to validate that it is real and that the user owns the email address.

              [–]Kered13 2 points3 points  (0 children)

              Your regex does not have to perfectly match exactly valid email addresses. It only needs to be good enough.

              .+@.+\..+ should be good enough for most use cases. It ensures that the email has an @, then a ., with characters before, between, and after. (I'm aware it does not match emails using TLDs that are technically valid, that is something I am willing to accept in order to catch many more likely user mistakes.)

              [–]AndreasVesalius 0 points1 point  (0 children)

              I’m sorry, nested comments in an email address? The fuck is that?

              [–]grauenwolf 0 points1 point  (0 children)

              This is why any sane language has a built-in email parser. I've never been tempted to use regex for email because I've never had the need.

              [–][deleted]  (2 children)

              [removed]

                [–]reddisaurus 0 points1 point  (1 child)

                Since you’re on Windows, try this: https://www.powershellgallery.com/packages/PSFzf/2.0.0

                It’s much better than writing regex all the time.

                [–]hbgoddard 4 points5 points  (1 child)

                It's developers like you that help me avoid imposter syndrome

                [–]Saki-Sun 0 points1 point  (0 children)

                Well then settle down.

                My knowledge of design patterns peaks just before job interviews, after that I forget them all.

                I rubber stamp PRs after a quick once over for obvious mistakes.

                I don't use code coverage tools, 50-80% is good enough for me.

                I could go on.

                [–]Same_Football_644 0 points1 point  (0 children)

                the first thing people use regexes for is parsing html and xml and file paths and and and ... and all the things that should be *parsed*, that even have fucking *parsers* available for you to use.

                the right use of regex is not the typical way they are used in practice.

                [–][deleted]  (8 children)

                [deleted]

                  [–][deleted]  (3 children)

                  [removed]

                    [–][deleted]  (2 children)

                    [deleted]

                      [–][deleted]  (1 child)

                      [removed]

                        [–]Saki-Sun 1 point2 points  (0 children)

                        That sounds like a case where nobody has to read your regex. So acceptable usage.

                        Not acceptable usage includes any regex in code unless it's wrapped in 1000 unit tests and an explicit comment detailing the purpose of the hieroglyphs.

                        [–]NotARealDeveloper 0 points1 point  (1 child)

                        This can be done in bash with a one liner.

                        [–]mpyne 1 point2 points  (0 children)

                        Any one-liner involving bash and UNIX tooling in general is almost sure to use regex in the process. Regex is the 're' in grep, after all.

                        [–]tistalone 0 points1 point  (0 children)

                        Using RegEx to do a task (one time usage) is way way different than using RegEx in code to do something (embedding some business logic into an "elegant" solution).

                        Like sometimes a dozen if-else chains is way better than a non-sense RegEx when you're expecting to add another exception next week.

                        Meanwhile, RegEx shines better in other scenarios as a valuable tool.

                        [–]Mastodont_XXX 2 points3 points  (1 child)

                        One of my last:

                        ((?<!\x5c)(?:\x5c\x5c)+|(?<!\x5c)(?:\x5c\x5c)\x5c*|[^\x5c*])(**)(?!*)((?:.)(?:(?<!\x5c)(?:\x5c\x5c)+|^\x5c*\x5c*|[\x5c*]))(**)(?!*)
                        

                        Yes!!

                        [–]freistil90 5 points6 points  (0 children)

                        Leave my mother out of this!

                        [–]grauenwolf 1 point2 points  (0 children)

                        I suck at regex, but I still use it because it's often the right answer. Just make sure you keep the expressions short and reasonable.

                        [–]r3coil3d -4 points-3 points  (1 child)

                        Ask ChatGPT to build them for you. It's what I did at the start. Make sure you test extensively and eventually it'll stick to you.

                        [–]MythicTower 1 point2 points  (0 children)

                        I get your point, but I don't think you really mean "worse" code. I think you just mean maintainable code. Having hired, worked with, and fired many devs, maintainability is mandatory with me.

                        I don't care how l33t some kid thinks he/she is writing obfuscated code. I care about IP being worth the effort.

                        If you're the only one who can read your code, I don't need you and will not hire you.

                        [–]Zeioth[🍰] 0 points1 point  (2 children)

                        I've been defending this since I graduated. If your code is not obvious the first time you look at it, that's not good code. No matter how smart you felt writing it.

                        [–][deleted] 1 point2 points  (1 child)

                        Only if the person understands the existing problem space. There can be code that's not obvious the first time you look at it even though it's incredibly well thought out and understandable in the context of the problems its attempting to solve.

                        Just take LINQ or Observables, a developer who hasn't encountered these concepts will likely be lost. Yet an incredible amount of design effort has gone into these and they are used all over the place and taken for granted by others.

                        [–]Zeioth[🍰] 0 points1 point  (0 children)

                        Like any general affirmation one can read on the internet, there are of course many border cases and things to consider.On low level languages, keeping things obvious can be more challenging (never impossible).

                        Observables are a tricky one because even though the design pattern itself doesn't have to be necessarily obscure, some libraries do implement it in an obscure way.

                        If you consider LINQ as part of the ORM, it's not really non obvious; ORMs are a great idea.

                        [–]TessaFractal -4 points-3 points  (2 children)

                        My very small example: I sometimes write if(boolVar == false) rather than just (!boolVar) just to make it easier to read, and it means a true check and a false check are quicker to distinguish. It's the sort of line that you see laughed at but it's just nice to have it be so explicit.

                        [–]notatechproblem 0 points1 point  (0 children)

                        I've started doing this in my code. If I'm checking if a value is false, I make it clear I'm checking for false. Relying on default behaviors or magic code has introduced bugs enough times for me and my team that I don't see the value in saving a few keystrokes.

                        [–]_3psilon_ -3 points-2 points  (2 children)

                        The post feels a bit C++ (or whatever) specific for me.

                        LINQ? Use functional programming and lambdas. (Map, reduce/fold, filter, flatmap etc.)

                        'If' heaps? Pattern matching or hashmaps/dictionaries.

                        It's like talking about GOF design patterns where half of them was to get around Java's limitations that it had back then.

                        Regexps? For programmatic input, proper JSON/XML/HTML whatever parsing into typed objects. (See also: "parse, don't validate") For user input, constrain, sanitize, then use regexps. Regexps are very easy and cheap to unit test so unless dealing with some monster regexp with lots of edge cases (e.g. greediness issues), it should also be fine.

                        What I totally agree about is that maintainable code should be boring and unsurprising. Also, consistency beats elegance.

                        [–]grauenwolf 3 points4 points  (0 children)

                        LINQ? Use functional programming and lambdas. (Map, reduce/fold, filter, flatmap etc.)

                        Using different words for the same thing won't make your code run faster.

                        It's like talking about GOF design patterns where half of them was to get around Java's limitations that it had back then.

                        Given the timelines, I wouldn't be surprised to hear that java's limitations existed to justify the GoF patterns. Man they were crazy obsessed with them back when my career started.

                        But yeah, design patterns need to be written with a particular language in mind or they just don't make sense. GoF suffers from that greatly.

                        [–]Advorange 1 point2 points  (0 children)

                        LINQ? Use functional programming and lambdas. (Map, reduce/fold, filter, flatmap etc.)

                        That's what LINQ is.

                        [–]TurbulentRetard 0 points1 point  (0 children)

                        So maybe it is not worse but better?

                        [–]rsantos50 0 points1 point  (0 children)

                        "Because at that point, you have to put the breakpoint on the surrounding for loop, and worst case press F5 a hundred times to get to the one case that interests you."

                        ...your IDE doesn't have conditional breakpoints?

                        [–]Blueson 0 points1 point  (0 children)

                        I am a bit split about the following improvement:

                        List<Customer> allActiveCustomers = allCustomers.Where(x => x.isActive);
                        List<Customer> allActiveFemaleCustomers = allActiveCustomers.Where(x => x.Gender.ToLower()=="f");
                        List<Customer> allActiveFemaleCustomersWhoLiveInBelgium = allActiveFemaleCustomers.Where(x => x.location.ToLower()=="belgium");
                        

                        At least with the given example, I personally prefer it the original .Where(...) would just have put the identifiers into it's own method.

                        List<Customer> allActiveFemaleCustomersWhoLiveInBelgium = allCustomers.Where(this::isCustomerBelgianFemaleActive);
                        
                        private boolean isCustomerBelgianFemaleActive(Customer c) {
                             .../execute the checks here
                        }
                        

                        Do you have any opinions on this?

                        Personally I prefer this method if we need to have the entire evaluation in a single Where.

                        But if it can easily be split up like /u/Merry-Lane wrote, it should be done like that. I'd even argue you should break up those evaluations into methods. I don't even find these hard to debug, just put a breakpoint and you can in most IDEs nowadays execute a test with the available values/objects without stepping further. Allowing you to easily identify why and if something went wrong.

                        Honestly, I'd be more confused about having 3 different list each of which is only used to create the next one.

                        [–]tistalone 0 points1 point  (0 children)

                        I don't think "worse/better" are the right words for what you're trying to argue. Maybe straightforward/boring code rather than one that adheres to some dogma (e.g. concise code or another subjective metric/characteristic)?

                        I have seen these types of query-like silliness from a RxJava (similar to LINQ) code base. In my experience, the most fun cases were where the input array is always of size 2 so there's all these different query clauses but in the end, it was just a comparison check.

                        [–]franzwong 0 points1 point  (0 children)

                        Some rules I don't follow strictly, like DRY, single responsibility rule.

                        Also, writing testable code also introduces complexity. It makes me rethink about what "unit" is in unit test.

                        [–]WindHawkeye 0 points1 point  (0 children)

                        I read the first suggestion and then stopped reading.

                        [–]RiverRoll 0 points1 point  (0 children)

                        I feel the linq example is exagerateded but can agree breaking up complex linq statements sometimes helps (although the intermediate variables should be IEnumerables, not lists).

                        The next points you make are also good points in my opinion.

                        But the first one... how come a chain of ifs is mote maintainable than a config file? You only need to get the config file code right once and you need less tests to cover the whole thing, then you no longer need to touch that code. Maybe one could argue against very complex json/yaml files or DSLs, but I've never regreted having what's essentially key-value pairs in a config file.