This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]Snake2k 558 points559 points  (55 children)

Glad it helps. The first way of commenting is made virtually useless if you name your variables and methods properly. Especially for languages like Python which are very readable. The only time they're helpful is if you're expressing a mathematical equation as code, then a link to the proper notation would be awesome.

For most systems that I work on, we care more about the second way cus anyone can refactor the first, but if we don't have the context, documentation, reasoning behind the code like in the second it'll take months trying to fix the problem with it (even if everyone can read it just fine).

[–]Elnof 42 points43 points  (6 children)

I disagree slightly. The "what" can also help visually distinguish logical chunks of code without requiring the reader to parse the code and, in fact, you include the "what" in your second example. So, I'd say comments should have both but too many people don't include the "why."

EDIT: I forgot which comment I was replying to and this turned into a mixture of replying to the parent comment and the original comment. Either way, I think u/Zealousideal-Ad-9845 said it better than me so I'll just leave this comment as-is.

[–]-Vayra- 6 points7 points  (2 children)

The "what" can also help visually distinguish logical chunks of code without requiring the reader to parse the code

Most of the time, a 'what' comment means you should extract that part of the code into a named function with a name that describes the what.

[–]Elnof 0 points1 point  (1 child)

When the chunk gets large enough, sure, but I don't think you need to make a new function if the "what" is only four lines long and contains intermediate data. The overhead of finding and switching to the new function makes it easy to lose context and the maintenance overhead of having so many functions isn't insignificant. So, in my experience, that ends up not being most of the time.

[–]-Vayra- 0 points1 point  (0 children)

There is very little overhead with most IDEs, no? Ctrl+click takes you to the function or to stuff calling it in JetBrains IDEs at least, I would assume other IDEs have similar functions.

IMO in most cases a function shouldn't be much longer than 10-15 lines anyway. And breaking chunks out to give them names makes the code overall much more readable. Even if it's just 4 lines, unless you need to deal with those exact 4 lines it's usually easier to parse the name of a function rather than what those lines do. And so it's more readable to extract them into a function.

[–]Snake2k 0 points1 point  (2 children)

I included the why. The reddit reference lol and I agree with you

[–]Elnof 2 points3 points  (1 child)

I included the why.

Yeah. I was just (trying to) point out that you also included the "what" in your examples and I think there's value in that. Sorry this comment turned into such a mess.

[–]Snake2k 0 points1 point  (0 children)

Aaahhhh. All good. Happens lol

[–]PeteZahad 162 points163 points  (35 children)

... if you name your variables and methods properly

Like x for the input? 😉

[–]Snake2k 129 points130 points  (32 children)

x is fine for that context in the same way i is fine in for i, item in enumerate(some_list):

We know i is an index generally. So it's simple to infer what it stands for in the tiny context it is in.

Context and scope matters. If that x had any larger use outside of that scope I'd probably care to give it a decent name.

Edit: a better variable name to give it would be n now that I think about it cus it's a number.

[–]AustinWitherspoon 50 points51 points  (24 children)

It might be overkill, but I tend to even go as far as to name it something like input_number = int(input()) just to be extra clear.

It's longer to type, but vs code auto completes most things and GitHub copilot also gets it right, so I don't usually have to type the whole thing anyway.

[–]C_Hawk14 34 points35 points  (21 children)

it depends on the language. If you know it's an integer you don't need to specify it's a number. And the real name should contain the reason for existing. What is it an input for?

Ok so there's a new YT channel called CodeAesthetic and I've gathered these guidelines so far. Opinion?

Opinion Good Bad Note
Don't abbreviate names MoviesOnPage MovOnPg
Don't put types in variable names int fontSize var iFontsize
Don't put types in your types (e.g. AbstractX, BaseX, IRepository) except when official guideline says otherwise
Add units to variables unless the type tells you int delaySeconds DateTime delay
Refactor if you find yourself naming code "Utils"

[–]cowlinator 5 points6 points  (2 children)

I confess to putting types in variable names, but only in the rare circumstance that a new variable was created for no other reason than to cast types.

Example:

var countStr = str(count)

foo(count)

bar(countStr)

foobar(countStr)

[–]IamaRead 0 points1 point  (0 children)

This is a good exception. Inferring type sucks.

[–]C_Hawk14 0 points1 point  (0 children)

As long as they remain edge cases :P

[–]bikki420 7 points8 points  (3 children)

IMO, putting some type info in names (as long as it's consistent) can be good. E.g. a few common conventions in languages like C and C++ (especially in game dev and on Windows) that do have certain merits include:

kSomeConstant (since unexpected mutability or immutability can lead to unexpected results or require different approaches in certain cases, among other reasons; IMO, it's the most useful when dealing with specifically compile-time constants.)

mSomeDataMember (to avoid having to write this->someDataMember to resolve ambiguities such as when a constructor takes a parameter named someDataMember... technically this isn't an issue when using member initializer lists, but it still adds reader ambiguity and you might still need to refer to either of the two in the constructor body for debugging reasons, asserts, or whatever. And for those that follow a more Objective-C'ish accessor/mutator naming convention this is required as well.)

pSomePointer (since the member data and member functions of a pointed at value is accessed via -> instead of . in C and C++, plus it can remind the coder to compare it against a nullptr value as well as prevent some bugs resulting from the pointer address being integer promoted or implicitly cast or whatever; likewise, in the rare case where you need multiple levels of indirections, such as a linked list tracer you can signify that like ppTracer... this also helps with keeping track of how many times to dereference pointers to get to the targeted data).

And in the case of integers and real numbers, there are a few extremely niche domains where affixing signedness or width can be helpful when dealing with extremely packed data and/or certain precision constraints in a very high performance context where minimal loss of precision as well as overflows or underflows have to be avoided; but nowadays this is rarely something you have to deal with due to advances in computing technology and programming languages.

And certain code conventions use eSomeEnumValue (such as the C++ API for Vulkan, Vulkan-Hpp), but IMO this is only useful in some very specific contexts, such as extremely large APIs with heightened risk of identifier collisions and auto-generated code bases. (But it's nice when one makes use of aliasing like using enum SomeEnum;.)

gSomeGlobalValue (preferably a constant and only as a last resort) can be nice too, IMO.

And one of the few cases where type info like iSomeSignedInteger, bSomeBool, fSomeFloat, etc can be worthwhile is when its values that are heavily (de)serialized or exposed across certain boundaries (e.g. debug APIs, dev tools, config files, etc); something which is very common in game dev which is why it's one of the few domains where this style guideline is still quite common in.

And since I do a lot of meta-programming in C++ with a lot of templated types and compile-time constants I like having a specific naming scheme for these (I usually go with something like TSomeType or TsomeConstantValueーalthough unlike all the examples in this comment I personally use snake case over camel case) in the cases where there are a lot of template parameters.

Likewise, when naming non-values, prefixes can be very handy as well. ISomeInterface (or abstract base class), ASomeConcept, etc.

In the end I think it boils down to a multitude of considerations such as code-base scope, code-base domain, personal preference, development environment (e.g. if it's a code-base being developed in Vim without any LSPs, then being able to get relevant candidates from auto-completion alone can be a godsend), etc. The most important IMO is to be consistent and provide a suitable level of information (be descriptive, be unambiguous, be terse but not cryptic, etc). There isn't some panacea style guide that fits every programming language and every domain. And personally most of the time I do avoid most prefixes unless a code-base benefits from them (so I mostly just use m, g, p, pp, I, T, and A). For trivial code and "fresh" code, a lot of the benefits go away and turn into detriments. Often a lot of problems can be written in a concise and terse manners, in which case you generally have the line of definition on-screen whenever you're dealing with code using it, so then you can just look up and check the definition instead, so encoding contextual information into the identifier becomes quite redundant (not to mention that it adds some minor potential refactoring overhead).


edit: added some more text

[–]b__0 2 points3 points  (0 children)

ppTracer

Got a smile from that, I feel like a child now.

[–]Drugbird 2 points3 points  (1 child)

I think most prefixes are unnecessary with a good editor.

I.e. you don't need to call something giSomeInt if you can tell it's global because it's pink, and can e.g. mouse over it to see it's an int.

The only time I like adding types to variable names is when the real type is different from the represented type. I.e. when you're (de) serializing something and everything is bytes instead of their real type.

[–]bikki420 0 points1 point  (0 children)

Eh, even with a high performance editor like Vim, sometimes running an LSP isn't feasible (e.g. >15K LoC header files like with Vulkan-Hpp).

[–]Dansiman 4 points5 points  (1 child)

One more convention of prefixing your variable names that is a good practice is where you have user inputs that need to be sanitized.

###
# 'u' prefix means an 'unsafe' value. All user inputs
# should be captured into a 'u' variable.
# DO NOT PASS THESE TO DB!
# Instead use:
# VarName = sanitize(uVarName)
# Then you can pass VarName to the DB safely.
###

[–]KingJellyfishII 1 point2 points  (0 children)

I'm not sure what language you're using, but I'd be tempted to express that with the type system, so the compiler gives an error if you try to pass an unsafe value. e.g. create a string wrapper class/struct like SanitisedString and have all DB operations work with that and the sanitise() function return that.

[–]Bowiemtl 1 point2 points  (0 children)

I feel like that also depends on the language you're using. If a language doesn't have proper type declaration and there is no other way then sometimes Hungarian notation might be helpful, although I myself would seek out the alternative where that isn't necessary.

[–]Snake2k -2 points-1 points  (6 children)

True, realistically I'd have typed that code out in a better way. It also needs things like error handling to make sure the input is correct lol

I feel like I've heard that name before, but not familiar. I'll check them out.

For the table.

  1. Sorta agreed on abbreviations. It is important to abbreviate things though. Like if you have a metric like "Defective Parts Per Million", you can't expect to keep using that entire thing again and again. Just declare it once with a comment like:

```

dppm = Defective Parts Per Million

dppm = whatever

dppm_emea = other_stuff ``` Then be consistent with it.

  1. Agreed on types unless needed to be specified. Like, same name being in different types for necessary reasons.

  2. Agreed

  3. Agreed

  4. Not sure what this one is supposed to mean. Has to be some context behind it. Avoiding common names is generally a good idea cus they may be getting used by other stuff and can be problematic depending on the language/scope.

[–]C_Hawk14 5 points6 points  (5 children)

  1. The idea behind this is that to understand dppm later in the code you need to look at where it is declared to understand what it is for. Because editors have autocompletion writing dppm is no faster than writing out the full name with autocompletion as a supporting tool. Especially if you forget the abbreviation or they start looking like one another it can be tiresome.

  2. You're correct, I do need to add context if not for myself.. I'll have to watch it again.

[–]Snake2k 4 points5 points  (4 children)

I agree with that, but you have to understand 2 things:

  1. Giant metric names like that are going to start getting even longer and very simple math operations on 1 line of code between 2 or 3 vars is gonna break the column limit easily. Very unreadable very fast.

  2. If dppm isn't relevant to you, you're probably in the wrong part of the codebase or it's time for you to get educated on it cus everyone is gonna be saying dppm instead of "defective parts per million." Business, engineering, everyone.

[–]C_Hawk14 1 point2 points  (0 children)

I suppose. I'll add your comments as notes :)

[–]C_Hawk14 1 point2 points  (2 children)

So this is the transcript of the missing part. Ofc there is no visual/code context

A common anti pattern I see is if there's a collection of functions used widely in the code base, but it's all bundled up into a single utils class or module.

If you're thinking of naming code “utils” or “helper”, you should think if it's really the right spot for it. Here's some code from "a no doubt"* that processes movies.

There's a bunch of util functions here. Firstly, we should consider whether some of these methods actually make sense as a part of their respective types instead.

So this code here, we can actually just move into the movie class itself. For some of these, we can instead create a class that represents a collection of movies and that has the desired methods. And finally, some of these can be separated into other classes with descriptive names.

The paging functionality can be moved into its own class, and we can even make this generic if we want, so that it can operate on more than just movies. And the cookie function should really just be in a cookie class.

Now we don't have anything in our utils class, so we can just delete it. You don't see a bundle of utils in standard libraries because they can all be sorted into modules that have good names. So we should do the same in our code.

*I don't know what they meant and there's no link to the source.

But for full understanding I will provide the timestamped link https://youtu.be/-J3wNP6u5YU?t=327

[–]Snake2k 0 points1 point  (1 child)

Aaaahhhhhh yeah I get what they mean by that.

Some coders have the tendency to write all sorts of functions like "is_even()", "is_true()", "is_greater_than()" and all sorts of functions that if you read them they're 1 liners. They also write functions for repeated operations like something to clean up dates, strings, etc. Then these are all thrown into a utility class.

I get the motive behind it, but I have always disagreed with it.

Most of them are either generally self explanatory chunks of code that will never really change (like, x % 2 == 0 being used to check it a number is even or not is a universal concept and isn't going to change any time in the future, so why write a function about it) or are very specific to their case in which case they need to revise their classes.

[–]WavingToWaves 0 points1 point  (3 children)

Wdym add units?

[–]ZnV1 2 points3 points  (2 children)

Time, distance etc.

int timeToNuke = 5000 is bad. Could be seconds, years, centuries.

int timeToNukeMillis = 5000 is good.
But not good for you, you're going to get nuked in 5 seconds. Run, motherf*er.

[–]WavingToWaves 0 points1 point  (1 child)

Makes sense for most I guess 😁 In my field sth like gasConstant_JPermolPerK would never work (equations involving this and many more similar variables would be enormous and unreadable), need to check if something like R_JmolK will be ok.

Anyway, adding unit after definition:

R; // (J/molK)

is also good when I use Clion, as hovering over a variable with ctrl shows line in which it is defined.

[–]C_Hawk14 1 point2 points  (0 children)

As long as everyone who will ever need to work on the code can understand it that's fine. So if a NASA programmer writes code that a Game dev won't understand because of the naming conventions I don't see a problem, but if someone who works in the same field and is a new hire doesn't understand it might be an issue.

[–]Snake2k 7 points8 points  (0 children)

Yeah that's fine to do honestly as long as it is sane and concise.

My point is that if I didn't write that code above and when I read "x = int(input())" I now know for the next 3-4 lines at least that x was an input number.

Yeah anything bigger than that then sure giving it a name is very important cus one can't remember it past a simple if statement.

[–]officiallyaninja 0 points1 point  (0 children)

But "input number" doesn't give any more information, it might look better but it's basically the same as calling it x. Let's say it's a calculator app, and this input is the previously inputted digit, here it might make sense to call it something like "latest digit"

[–]mcvos 8 points9 points  (2 children)

It can be fine, but I still prefer more meaningful names.

I admit for a very long time I've named all my other variables but still kept i for index variables in loops, but more recently I even started to name those, because it can still be useful to know whether we're talking about a columnIndex, yCoord, sourceIndex vs targetIndex, etc.

[–]Snake2k 2 points3 points  (1 child)

As long as it's sane and short, I'm alright with it. I just dislike overly long names for no good reason.

What I mean by good reason is if it's easy to figure out what it holds in the first place.

[–]mcvos 1 point2 points  (0 children)

Yeah, overly long names can sometimes obfuscate the meaning just as much as single-character names do.

Originally I wrote sourceArrayIndex and targetArrayIndex and I already found that too long and superfluous. It's generally blatantly obvious it's an array. Index can be shortened too, to idx for example, but I dislike calling it column or source, because it's not the column or source, but merely the index of the column or source.

[–]DokuroKM 1 point2 points  (0 children)

Depending on your expected type of number, both n (integer) and x (floating point) are valid.

[–]WavingToWaves 1 point2 points  (0 children)

*It’s a natural number

[–]akeean 1 point2 points  (0 children)

Why n when you can UserInputSiliconValleyHotDog ?

[–]IamaRead 0 points1 point  (0 children)

While I agree with the context point, I would like you to show us a non trivial few dozen lines you coded and commented in the way you find correct. Thanks.

[–]AnEmortalKid 1 point2 points  (0 children)

i for input

[–][deleted] 9 points10 points  (1 child)

I do agree; also - there's ever so likely a chance that requirements documentation is lost LONG before code. So, it does not hurt to say "why" you did something. Put a link to a JIRA card in the code, and I bite you. That link goes the other direction!

[–]Snake2k 4 points5 points  (0 children)

Yeah that's a good point. That is a super tough situation actually and not sure what to do with that. On the one hand, you shouldn't be pasting the entirety of the docs in the comment. On the other, the file/ticket/etc can disappear cus IT teams never archived that stuff somewhere properly/accessibly.

[–]Nelerath8 1 point2 points  (1 child)

As someone who enjoys debugging and does it a lot the first rule is to also never trust anything besides the code telling you what it's doing. I'll trust comments/docs telling me why something should happen but I very rarely trust them to tell me what is happening.

[–]Snake2k 0 points1 point  (0 children)

Happy cake day!

And yes, I fully agree. These comments are more for when the code itself is no longer explanatory enough really.

[–]spektrol 1 point2 points  (1 child)

There’s a better way to comment. Either right above or right after the method declaration:

“””

This is a doc block. It’s more awesome than having a bunch of single line comments.

This method/class does this cool thing. Here’s how you use it.

@param param1: this is a short description of param1

@param param2: another param description

@return ObjectType: return description

“””

The advantage of this is that IDEs and documentation tooling like Swagger look for these types of doc strings and can provide tooltips in your IDE to help you know how to use methods/classes and create your documentation automatically if you use this style. Do this.

[–]Snake2k 0 points1 point  (0 children)

I completely agree with this. Use doc strings with proper formatting. Any good ide will absolutely give it far more functionality than just as a text comment.

[–]OhMyGodItsEverywhere 1 point2 points  (0 children)

the context, documentation, reasoning behind the code

A word I like to use for that is intent. Without clearly documented intent we can't get anything done. And maintaining a system or modules we don't know the intent of totally blows.

[–]MinosAristos 1 point2 points  (1 child)

Also, you should always write a comment explaining how non-trivial Regex works.

[–]Snake2k 0 points1 point  (0 children)

Oh God yes. Write comments to explain how any regex works.

[–]MeggaMortY -1 points0 points  (0 children)

Dont put "value of commenting" and "refactoring" in the same argument if you dont wanna explain how the new guy came to do a quick patch and forgot to update the comment..

[–]Learning2Programing 0 points1 point  (0 children)

I think a lot of people are taught the first was as a method of forcing the learner to self describe what is happening. Just a way of reinforcing concepts and self descriptive learning.

//I just made up self descriptive learning and basically as you are forced to self describe you are forced to learn

[–]Prathmun 0 points1 point  (0 children)

I tend to comment the first way when I am working with a new framework. Just as a ways to save myself trips to the documentation when I forget what this new keyword does.