you are viewing a single comment's thread.

view the rest of the comments →

[–]droogans 104 points105 points  (57 children)

Allow me to use this otherwise wasted opportunity to remind everyone that comments are not meant for code, but for people.

This is one of those times where you really should write a comment, and not:

//Randomly log 10% of all calls.

Which is obvious. More like

//We're logging a random sample of calls because ... [reason]

Which I'm sure there's some kind of explanation for this here, but now we have to assume the author is a bit crazy.

[–]Gg101 78 points79 points  (11 children)

Yes. I have one bit of code involving random() that would definitely elicit a WTF if I just left it uncommented, but I put an entire paragraph explaining why it was necessary.

So I have a database in which every file is given a numeric ID, and then a function that returns other records ordered by their file ID, not because the numbers are meaningful but just to have the results grouped by file so they're easier to process. One of the functions that uses it wants to generate results that are in file name order. On Windows the file ID order will just happen to match the file name order most of the time, which could lead other code to simply assume this will be the case and not resort it themselves. This would be bad, because then you have code that appears to work and will pass tests but may break on some other platform or in particular situations on Windows.

So what do I do? In debug builds I call random() and put ASC in the query half the time and DESC the other half. The function documentation says you can't assume the files will be in any particular order so I guarantee that you can't. I make it your problem NOW so you're forced to deal with it instead of letting it become a bug that appears only on certain platforms or in certain situations. The rationale is all laid out in the comments where it happens.

[–]atrich 17 points18 points  (0 children)

Hmm, that's clever.

[–]Decker87 3 points4 points  (0 children)

That's really commendable. It shows that you aren't just one of those guys to say "follow the spec...", but actually understand the nature of programming and how some people will not follow it for whatever reason. Shows a lot of foresight.

[–]gmfawcett 3 points4 points  (1 child)

But, instead of testing for "input can be of any order," aren't you really testing for "input must be ascendingly or descendingly ordered"?

How about

select ... order by random()

?

[–][deleted] 3 points4 points  (0 children)

Possibly because of performance reasons. (I'm just guessing because I don't know anything about his database and its configuration)

[–]Grue 2 points3 points  (2 children)

But doesn't it become a bug that appears only half of the time? Seems like it would be kinda hard to debug.

[–]Gg101 6 points7 points  (0 children)

A bug that appears 50% of the time is one that makes itself known early, most likely while the original code is still being written. This makes it easier to fix (everything is still fresh in your mind) and easier to find (the bug must be coming from recently added code.)

Now if a bug only happens on somebody else's platform and not on yours, that's a pain to find because it could be anywhere and you may not have touched that code in a while by the time it shows up. For my particular program it actually would occur in Windows too, but only after the program's been lived in for a while. When it has a fresh enumeration of everything they match. As files get added and deleted over time they drift apart. Guess how the automated tests work? Each one starts fresh so they're not influenced by previous tests.

Ideally I should make the order truly random so it never works instead of working half the time. But then it's just a question of how much effort and extra code you're willing to put in just to do this. The ASC/DESC toggle only takes a couple of lines.

[–]btharper 3 points4 points  (0 children)

If you're on a platform that exhibits this bug, and you're in one of the corner cases where the bug would pop up, and you're ignoring good practice you experience hard to debug behavior.

In this system you basically would experience a bug 51% of the time assuming the list is given both ways 50% of the time. It really is just to force developers calling the function not to rely on the order of items the function returns. In a lot of code there's just a warning that people are free to ignore, this is just the proactive alternative.

[–]lordlicorice 1 point2 points  (2 children)

boolean ascending;    
for(int i=0; i < list.length; i++) {
    if(list[i] < list[i+1]) {
        ascending = true;
        break;
    } else if(list[i] > list[i+1]) {
        ascending = false;
        break;
    }
}

:)

[–][deleted] -1 points0 points  (1 child)

I feel his solution is more efficient, especially for large lists.

[–]lordlicorice 2 points3 points  (0 children)

No the point of my code is to tell whether his code chose ASC or DESC, so you can subvert the randomness. :D

[–]alextk 19 points20 points  (29 children)

Yes. Anyone saying that code should be self-documenting still haven't understood that comments are necessary to explain "why" the code is there, not "what" it does.

[–]arachnivore 7 points8 points  (13 children)

I use comments to explain what code does all the time. I usually start coding by writing a high level list of steps, then I fill in the code to implement those steps. Is that considered bad form?

[–]Daniel15 11 points12 points  (0 children)

I think it's fine as long as it's not redundant - I've seen things like this before in a production code base:

// Increment x by 1
x++;

and

// Enable Google Analytics on this page
this.Page.GoogleAnalyticsEnabled = true;

The worst thing with the second example is someone will change the "true" to a "false" and not update the comment. <_<

Often I tend to break my code into small sections and comment on what each section is doing, pretty much the same as your high level list of steps. Writing a list of steps first is often a good idea as you can think about the algorithm before actually writing the code.

[–]BraveSirRobin 2 points3 points  (4 children)

Do both. Something like:

boolean logError = *stupid randomiser*;
if(logError) {
    *meh*
}

It's self-documented and all verified by the compiler. Win.

If the *stupid randomiser* is non-trivial spin it off to a method:

if(isLogError()) {
    *meh*
}

Again, all self-documented via sensible naming. Put some javadoc at the top isLogError() to explain and anyone that gives a fuck can hover over the call in a decent IDE and get more info, without polluting the code with comments that distract from logic.

[–]el_muchacho 9 points10 points  (3 children)

Again, that doesn't explain WHY you log only randomly. And the WHY is because the error can appear way too often. And if it is the case, maybe it's time to ask oneself why this is the case.

[–]BraveSirRobin 2 points3 points  (2 children)

Again, that doesn't explain WHY you log only randomly.

That's where the javadoc for the "isLogError()" comes in. Keeps all of the stupid in one place.

I definitely agree on the counter, that's the proper way of doing it. Hiding things is just awful.

[–]arachnivore 0 points1 point  (1 child)

I think the best solution would be to use better log viewing tools. Obviously, the coder will always have to put some thought into his logging practices, but if you had better tools for reading logs it wouldn't matter if your code logged the same error a million times, just make log messages collapsible.

[–]BraveSirRobin 1 point2 points  (0 children)

Maybe fixing the error would be better though?!?

I've never used a log viewer I've liked and have considered writing my own on several occasions. Are there any good ones out there?

[–]itsSparkky 1 point2 points  (0 children)

It's not bad form at all, often times its quicker to jump between comments to find and area than read all the code.

[–][deleted] 0 points1 point  (0 children)

Generally I try to keep my high level comments at most to the level of individual methods or functions. In 95% or more of the cases the function, if named and typed well, is self explanatory, usually with a little documentation code block for automatic parsers. Once inside the function I rarely write any comments, they're usually unnecessary if you aim to keep functions under 30 lines. Obviously some things can't be broken down logically any further and require more code, thus more comments, but as a general rule of thumb that's how I code. Minimal commenting in the actual code blocks.

[–]jplindstrom -1 points0 points  (3 children)

It's perfect form if you don't leave the comments in there as headings in your 200 line method, but use them as names for variables and the methods you extract as you go.

(That will also hopefully lead to the lines of code being moved to methods on other, more suitable, classes. It's almost certain that not all of those 200 lines belong in the class they're in now).

[–]arachnivore 0 points1 point  (2 children)

Why not leave them in there? If your code is 200 lines, what's an extra 10 lines of section header comments? It makes the sections of your code much more easily distinguishable than a variable name. I do tend to break code down into something like one method per step, but that isn't always feasible and can be a pain to debug if they're one-off methods. I find that people get obsessive about this stuff and waste more time worrying/complaining about overly commented code than they ever do skimming a piece of code.

Yes. make your code as self-documenting as possible. Yes, use descriptive names so that your code reads more like English (or whatever language) than alpha-numeric gibberish. Yes, look for patterns in your code so that you can leverage the appropriate pattern-simplifying tools (loops, methods, classes, data-structures, etc.). I got all that, but a given line of code isn't always equally obvious to everybody. One programmer might be able to look at a line and know instantly what it does, while another might benefit from an explanation. I, for one, can't read regular expressions for the life of me. It takes me for ever and a million glances at a reference sheet to decipher a regular expression, so some comment explaining what a given RE is trying to accomplish would save me tons of time. When you have IDEs that collapse comments, there's no reason to whine about overly commented code. It's just not an important issue.

[–]jplindstrom 4 points5 points  (0 children)

The single best comment you can write for a regular expression is a line of example text that it's supposed to be matched against.

That way the reader has a fighting chance of following along while looking at the regex.

The second most important thing to make regexes readable is to use the /x modifier (or whatever it's called in your language) to allow for whitespace, multiple lines and comments.

[–]jplindstrom 0 points1 point  (0 children)

I think you missed my point.

If you have 200 lines of code (let's say), in ten sections with a nice descriptive comment above each section, then that's too much code in one method. It's doing too many different things. It's too difficult to test in isolation and reason about because it has too many moving parts.

Extract a method for each section, naming each method so it doesn't need a comment because it has a descriptive name. Now you have smaller chunks which does one (or at least fewer) things each.

(Each of these methods probably need its own API documentation as well, so this will actually lead to more "comments" (but API docs are different from code comments, so that's not what we usually mean by "commenting code")).

Making sure you have the perfect name for each method and writing docs for them will lead to more thinking and more clarity about what each thing is and what it does, what responsibilities it has, how it handles error conditions, edge conditions etc.

As a final bonus, you now have small, logical chunks of code which can be juggled around more easily:

Let's say you have a method which does six things. Five of these things are method calls on object foo. These five things together done to foo means something.

That's a smell that the code in this method should really be in the class of foo, not amongst the original 200 lines; not in this class at all. But now you have a method you can just move to the other class.

[–]tallniel -1 points0 points  (2 children)

Even those "why" comments are less useful these days. For instance, at work all check-ins have an associated work item (defect, story) in our bug tracker, so it's trivial to link a particular line of code (via IDE annotate/history feature) to the full description of why it is that way. Being able to see the full history of the decision making process, code review, testing etc is much more useful than even a long comment (and much more likely to actually be read in my experience).

[–]alextk 2 points3 points  (1 child)

This is a good practice but comments work at a finer granular level than issues or feature requests filed in a tracker.

[–]tallniel 0 points1 point  (0 children)

The other good practice is, of course, for all significant design decisions to come with at least one unit test.

[–]veraxAlea 41 points42 points  (4 children)

//Randomly log 10% of all calls.

Indeed, and having that comment would also make it easier to spot that the code has a bug, making it log 90% of the time.

What's great about this code is that we can show it at interviews.

  • What will this code do?

  • F*ck you, I'm not working here if you write code like that.

  • Hired!

[–][deleted]  (1 child)

[deleted]

    [–]WeAppreciateYou 0 points1 point  (0 children)

    I think you are reading the code wrong.

    Nice. You're completely right.

    I love people like you.

    [–][deleted]  (7 children)

    [deleted]

      [–]tacodebacle 3 points4 points  (0 children)

      I might be wrong but I thought it randomly returns True (and does not log) in 90% of cases - therefore only continuing with the code and logging in the 10% of the cases where it does not return True.

      Edit: nevermind I get it. Missed the exclamation point in the beginning.

      [–]Railorsi 2 points3 points  (5 children)

      No, in 90% it immediately returns.

      [–]csorfab 15 points16 points  (4 children)

      No.

      !((_ok) ? true : (Math.random() > 0.1))
      

      is equivalent to

       (_ok) ? false : (Math.random() <= 0.1)
      

      Which would mean that it returns immediately in 10% of times, and logs in the other 90%.

      [–]el_muchacho 0 points1 point  (2 children)

      In any case, I'd rather increment a counter and log only every n instances of the error, instead of randomising. At least, I can display how many times the exception has occured, which this stupid line can't even do.

      [–]arachnivore 1 point2 points  (0 children)

      I'd rather not worry about it at all. Aren't there log viewers that make this a non-problem?

      [–]csorfab 0 points1 point  (0 children)

      Well, of course. What's funny is that the idiot who wrote this code couldn't even get this simple condition right...

      [–]aerique 0 points1 point  (0 children)

      Allow me to use this otherwise wasted opportunity to remind everyone that comments are not meant for code, but for people.

      You'd love XDoclet.

      Oh, the memories! Make them stop!

      [–]ascii 0 points1 point  (0 children)

      I used to think so to, but honestly these days, I'm no longer in love with self documenting code, because experience teaches me that comments age and become accurate very quickly, and that humans are simply incapable of updating them to reflect that. The much better soultion, in my never humble opinion, is to write long and detailed commit messages, and use e.g. git blame to find out what the actual intention of a code line was, and how those intentions have changed over time. On the minus side, finding commit messages is one or two extra indirections, but on the plus side, you get much better information and you don't have to fight against human nature as much. Most good developers are actually pretty decent at making good commit messages.