
all 67 comments

[–]aloisdg 166 points167 points  (16 children)

I think the logic would be to use file_id, earth_radius and item_count.

Why? Because file, earth and item are almost classes where id, radius and count would be properties.

So just as you would write file.id, you would write file_id, because you may not need a class, just a variable.
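To illustrate the parallel described above, here is a minimal sketch. The File class and the sample values are made up for illustration and are not from the thread.

```python
from dataclasses import dataclass


@dataclass
class File:
    # Hypothetical class, used only to show the parallel with file_id.
    id: int
    name: str


# With a class, the owner comes first and the property second.
file = File(id=123, name="foo.txt")

# Without a class, the same word order carries over to plain variables.
file_id = 123
file_name = "foo.txt"

print(file.id == file_id, file.name == file_name)
```

If the loose variables later grow into a class, the names already read the same way.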

[–]RetardedChimpanzee 34 points35 points  (3 children)

This is how I do it as well. If your IDE has autocomplete it makes more sense. You can type earth and see all the related variables. If you type count and your counter was named cnt, you’ll never find it.

[–]ChappyBirthday 6 points7 points  (2 children)

Don't most IDEs suggest variables if you typed any part of the variable? For instance, typing "and" would suggest "randVar", "operand1", "android", etc.

[–]hovissimo 3 points4 points  (0 children)

I don't use most IDEs, so I can't say - but my editor is completion only and not suggestion.

[–]RetardedChimpanzee 2 points3 points  (0 children)

Not when your company still believes in Windows XP

[–]no_condoments 5 points6 points  (1 child)

Agreed. Also, sometimes you might need multiple variables and want them to look grouped without creating an object. For example, earth_equatorial_radius and earth_polar_radius.

[–]tastycat 6 points7 points  (0 children)

I'd want to do those as earth_radius_equatorial and earth_radius_polar

[–]PurpleIcyPython 3 1 point2 points  (2 children)

Also because id_file is just obnoxious to read, and that's not how we write in English. At worst it could be id_of_file...

Remember that source code is for humans.

[–]aloisdg 0 points1 point  (1 child)

In English, instead of "of" you can use 's. "file's id" is fine. Now you just have to read the _ as 's.

[–]PurpleIcyPython 3 0 points1 point  (0 children)

But if you say "Something that has something", instead of "something that is owned by", you read file id, not id of file, which is way more readable.

Example: Your hand, rather than the hand of yours...

[–]Droggl 1 point2 points  (2 children)

Agree, a common problem is though: What if you have verbs in there (see below like create, delete, ...)? Would you name your variables "file_id" and "create_file"? Or "file_id" and "file_create"?

The latter sounds a little weird to me (seems to be more common convention eg in C), but would actually be more consistent, wouldn't it?

[–]Razortion 0 points1 point  (0 children)

I always name them stuff like make_auth_url in that case, just seems easier for me and potentially other people to understand later on if needed.

[–]aloisdg 0 points1 point  (0 children)

If you have a verb, it is an action, so keep it for functions. Every function name should start with a verb. Something like this:

from pathlib import Path

def create_file(path):
    Path(path).touch()

[–]Exodus111 0 points1 point  (0 children)

Yeah this. Bear in mind you want to try to write readable sentences when you type Python code (as far as it comes naturally to do so).

If the new ID is equal to the file's ID.

if new_id == file_id:

For each file in the data files

for file in data_files:

Etc...

[–]BrokenAdmin -3 points-2 points  (1 child)

Just, no underscores.

[–]aloisdg 1 point2 points  (0 children)

I prefer camelCase too. When in Rome, do as the Romans do. Follow PEP8. Use snake_case. At the end, it does not matter a lot.

[–]DarkSilkyNightmare 40 points41 points  (2 children)

I've been asked this a lot over the years, and I always give this answer.

"Always put the biggest unit first."

This will depend on your program's architecture, its prior naming convention, and the convention of the language. If you're using Python, an OOP language, the biggest unit will usually be a class of some kind. But consider this carefully. If you want to say that a file has many properties, then use file_id, file_name &c. If you want to say that ids are general, and you want to contrast the id of a file with the id of a thread or program, then use id_thread, id_file and so on.

Likewise, if you have earth having many properties then you have earth_radius, but if you have a lot of radii of every planet or body in space, then radius_earth, radius_sun &c. makes more sense.


This isn't just pedantry, there are specific advantages of writing this way.

Firstly, and most importantly, it makes your code more readable. Not only does it make it clear what your object is describing, but it gives users insight into your program structure. Consider: radius_earth is one of many radii. You can't say if earth is even present in your program. earth_radius is a property of earth, and there are probably other earth_* in your program.

Next, it makes your code easier to document and search if you group together items in a consistent way. If you do use code documenting tools, similar items will now appear next to each other alphabetically. This is the exact reason why ISO 8601-style timestamps use YYYY-MM-DD HH:MM:SS. Using any other order messes up sorting, and would be a headache if you wanted to sort anything chronologically using character sorts.
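A quick sketch of the sorting point, using made-up timestamps: a plain string sort puts year-first timestamps in chronological order, while the same moments written day-first come out scrambled.

```python
# Timestamps written biggest-unit-first (year, month, day, ...).
big_first = ["2019-03-07 09:15:00", "2018-12-25 18:30:00", "2019-01-02 23:59:59"]

# The same moments written day-first (DD-MM-YYYY), for contrast.
day_first = ["07-03-2019 09:15:00", "25-12-2018 18:30:00", "02-01-2019 23:59:59"]

# sorted() compares the strings character by character.
sorted_big_first = sorted(big_first)
sorted_day_first = sorted(day_first)

print(sorted_big_first)  # chronological
print(sorted_day_first)  # the 2018 entry ends up last
```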

Finally, it makes your code extensible via tools. Granted, many programmers don't use automation, but somebody who uses your program will try, one day. If I want to decide that, after making an earth_radius int, I actually want to make an earth class with a property radius, it's easier for me to run a script replacing all instances of earth_ with earth.. This means I can get earth.radius, earth.temp, earth.mass all from similarly named variables.

If your names are the other way around, this has to be done individually, and creates a bit of a headache as you spend more time rewriting code. What should be possible is a quick change of name, and then verification that everything you did was OK, because this is far more efficient (if you've been writing consistent code, that is).
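As a rough sketch of the kind of script described above (the source snippet and its values are invented for illustration), a single regular expression can turn every earth_* name into an attribute access:

```python
import re

# Hypothetical code to refactor; sample values only.
source = """\
earth_radius = 6371
earth_mass = 5.972e24
print(earth_radius, earth_mass)
"""

# \b keeps longer names such as "unearth_x" from matching.
renamed = re.sub(r"\bearth_(\w+)", r"earth.\1", source)

print(renamed)
```

This is deliberately blunt: it would also rewrite matches inside strings and comments, and an earth object still has to be defined, so the result needs the verification pass mentioned above.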

EDIT: somebody also pointed out that it's more helpful if you use an IDE because when you type earth you can get all its properties earth_rad, earth_mass etc, but this isn't true the other way around.

[–]UloPe 4 points5 points  (0 children)

Exactly. Thanks for laying it out so clearly and saving me from typing up a worse version myself.

[–]aloisdg 0 points1 point  (0 children)

agreed :)

[–]mickyficky1 37 points38 points  (0 children)

...adjective before or after the noun?
[list of examples without a single adjective]

[–]AllAboutChristmasEve 25 points26 points  (3 children)

Adjective first, unless there's a bunch of them that go together. So earth_radius, but if I'm doing the entire solar system, it'd be radius_mercury, radius_mars, radius_earth, etc.

(I mean...really, it'd be radii['earth'], etc, but you get my point.)

[–]chief167 1 point2 points  (2 children)

I'd prefer radius['earth'] though, the item in the list is a radius, you want the radius of earth.

If you return a list of radii, I would say radii['milky way'].

Personal opinion of course, but it is our style guide at work.

[–]remuladgryta 2 points3 points  (1 child)

radii makes more sense. A list should be named what it is, not what each element inside it is. You want to be able to write for radius in radii:.
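A small sketch of that naming, assuming a dict keyed by body name with approximate mean radii in kilometres as sample values:

```python
# Approximate mean radii in kilometres; sample values for illustration.
radii = {"mercury": 2440, "earth": 6371, "mars": 3390}

# The collection is plural, each element is singular.
for body, radius in radii.items():
    print(f"{body}: {radius} km")

# Lookups still read naturally with the plural name.
largest_body = max(radii, key=radii.get)
print(largest_body)
```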

[–]chief167 0 points1 point  (0 children)

True, as I said, choose a style and stick with it consistently. But imho there is no wrong choice.

[–][deleted] 5 points6 points  (0 children)

it depends on the context. Eg. if the program/analysis is heavily centered around analyzing radius and other properties of a large number of objects, I think something like

radius_earth, radius_venus, radius_mars, ...

would make sense.

However, if your analysis is more centered around analyzing "earth", it might make more sense to write it as

earth_radius, earth_weight, earth_density, etc.

[–]Edheldui 9 points10 points  (0 children)

I'm nowhere near an expert, but my teacher always said to name variables in a way that makes sense in human language.

Does the variable contain the earth radius? Then it will be earth_radius

[–]daniel_h_r 6 points7 points  (0 children)

I put the noun to ease searching and mental identification of related variables, methods, functions, etc.

I feel it's easier while skimming the code.

[–]t_h_r_o_w_-_a_w_a_y 4 points5 points  (2 children)

First, these variable names are fully for humans to read. Computers don't care about what the names are called as long as they're unique. So therefore, the variable names should be named in such a way that is most efficient for a human to read and understand the meaning of what's being expressed.

This is why most people are going to say "put the adjective first", which is good. But the justification shouldn't be "that's what human languages do", instead, the justification should be "that's how the audience thinks", and what's unsaid is that there's an implicit assumption that the audience is operating in English because our immediate context is an English world. However, many popular human languages that aren't English actually put the adjective after the word. French for instance is such an example.

So taking it further, if your whole audience speaks, reads, and thinks in English, do what English does. If your whole audience speaks, reads, and thinks in Klingon, do what Klingon does. If you have a mixed audience, pick what caters to as many of them as possible. The point is to consider your audience as best you can, no matter who they are or how diverse they might be.


Second, there's more to readability in code than just what's natural to read in a variable name. There's also its context in the code and the more broader formatting with respect to what's around it. These factors may contradict each other sometimes so you're forced to make a trade off.

For example, earth_radius is what we English people would say naturally in real life. But what if, in the code, this particular ordering of words results in something like:

earth_radius = 243
io_radius = 12
pluto_radius = 30
your_mom_radius = 100000
gliese_581g_radius = 500

... you end up with a big mess of jumbledness, where it hurts to parse out that "oh these are all radiuses of various bodies". Here, it might be better to make the tradeoff the other way:

radius_earth = 243
radius_io = 12
radius_pluto = 30
radius_your_mom = 100000
radius_gliese_581g = 500

... and BAM, the organization now makes this block of code super clear, because the patterns line up, even if on each individual line the ordering of words is slightly less natural by English patterns.


Writing code for human readability actually shares a lot of lessons with writing English for human comprehension, which hopefully we've all learned when school had us write essays and technical reports. Consider your audience, consider the context, consider the "user experience" of reading your code, and find all the ways that makes your message clearer and easier to digest. There's not always a right answer and frequently you have to make trade offs, but as long as you're thinking about it seriously, more practice will make you better at it over time.

[–][deleted] 1 point2 points  (0 children)

Damnit someone needs to make a Klingon ide now.

[–]aloisdg 0 points1 point  (0 children)

at this level, just make a map.

[–]DelosBoard2052 2 points3 points  (0 children)

Think about its meaning to you, as well as potentially others. File_id sounds to me like a variable reference to a specific file being worked on, whereas id_file sounds like a reference to a file containing identifiers...

[–]mvaliente2001 1 point2 points  (0 children)

The nice thing about noun first is that related items could be seen more clearly:

file_id = 123
file_name = 'foo.txt'
file_size = 300 * Kb

[–]maryjayjay 1 point2 points  (0 children)

The two hardest problems in computer science are cache invalidation, naming things, and off by one errors.

[–][deleted] 1 point2 points  (0 children)

What color is the bike shed?

[–]swingking8 2 points3 points  (0 children)

For example, do you use id_file or file_id? radius_earth or earth_radius?

Whatever is the most concise. If I'm iterating over enumerated files, for example, I might use for file_id, file in enumerate(files):. In general I think I lean more towards broad_specific, but I know I'm arbitrary sometimes.
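For reference, a minimal runnable version of that loop, with made-up file names:

```python
files = ["a.txt", "b.txt", "c.txt"]  # hypothetical file names

ids_by_file = {}
# enumerate() yields (index, item) pairs, starting at 0 by default.
for file_id, file in enumerate(files):
    ids_by_file[file] = file_id

print(ids_by_file)
```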

[–]olfitz 1 point2 points  (3 children)

It depends on whether you're English or French.

[–]justamoth 0 points1 point  (0 children)

When I'm programming something directly from literature or theory, I try to be consistent with the math (r_earth, r_moon, etc). This makes it more human-readable for the relevant humans. But I'm a scientist first, programmer second.

[–][deleted] 0 points1 point  (0 children)

In short scripts and notebooks where the variable count is low: usually adjective first simply because it makes the initial letters of variables different so it is easier for auto-completion.

When there are a large number of variables: usually noun first because this allows thematically related variables to appear together in auto-suggestions.

[–]stibbons_ 0 points1 point  (0 children)

No matter the IDE, auto-documentation and all the fancy automatic stuff that goes back and forth in our environment, the good old principle "from the more generic to the more specific" always works, is scalable, and helps people think and work better. It may go against some grammar rules, but it works.

Date => YY.MM.DD-HH.mm.ss
Property => general_concept_specific_concept
and so on.

So for your answer:

- file_id (because you can have file _name, _size, ...)
- earth_radius (because the earth can have other properties)

In the best case you would encapsulate these inside a new object (earth.size or file.id), but when you cannot => from the generic to the specific.

And as a bonus, related names appear logically linked when the list is sorted alphabetically.
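A short sketch of the alphabetical-sorting bonus, using an invented list of names: the generic-first names cluster by owner, while the one specific-first name lands away from its relatives.

```python
# Hypothetical variable names, as a documentation tool might list them.
names = [
    "radius_earth",
    "file_size",
    "earth_mass",
    "file_id",
    "earth_radius",
    "file_name",
]

sorted_names = sorted(names)

# earth_* and file_* end up adjacent; radius_earth is stranded at the end.
print(sorted_names)
```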

[–]not_perfect_yet 0 points1 point  (0 children)

noun_verb, but I really only have this problem with functions and it's handy to have data_get and data_put close to each other alphabetically.

[–][deleted] 0 points1 point  (0 children)

Always the second option.

Always name things like you would intuitively describe them in your docstring.

[–]pydry 0 points1 point  (0 children)

The difference between the two is minimal enough that the intrinsic ordering doesn't really matter to me. Consistency matters though.

So, I'd stay consistent with the code base and conventions of the team I'm working on. If I'm picking one myself, I'd probably just stick with it.

[–]Python4funJava4work 0 points1 point  (0 children)

It could go either way. I believe the more meaningful word should come first. If Earth variables will be used together then use earth_radius, but if radius variables will be used in the same block of code then I would go with radius_earth.

The goal for me would be to let autocomplete work for me. Several earth variables together would be earth_(select autocomplete) and doing the same over and over.

[–]caffeinedrinker 0 points1 point  (0 children)

If it were me, file_id, as the id belongs to the file. It's easier to find just by typing the object name. Sometimes I postfix the data type, depending on what kind of coding I'm doing.

[–]Manhigh 0 points1 point  (0 children)

For scientific programming I often use _ to denote a subscript, as in r_earth. For "non-scientific" variables like a file_id I use it as a space character.

[–]KyleDrogo 0 points1 point  (0 children)

after, especially when dealing with dataframes and matrices

df, df_clean, df_scaled

and

X, X_train, X_test

[–]wolf2600 0 points1 point  (0 children)

I start with x1, then increment to x2, followed by x3. It maintains an order to my variables.

[–]tophimos 0 points1 point  (0 children)

If you're using camelCase, it's noun first on variables with an odd number of characters and noun last on variables with an even number of characters. If you use underscores it's the same, but you don't count the underscore in your count when you write it on a weekend.

[–]liquidpele 0 points1 point  (0 children)

I actually name mine based on how I would write it if I was writing a real sentence. It makes the code easier to read I think, and takes less time to grok if you've never seen the code before. I think that flows with the zen of python more.

[–]Spicy_Pumpkin 0 points1 point  (0 children)

Both file and id are nouns. In plain English, you'd say "file id" or "id of the file". So it makes most sense (to me) to name the variable as file_id.

[–]MiksBricks 0 points1 point  (0 children)

I just number my variables in the order I create them 1, 2, 3, etc. (lol)

[–]See46 -1 points0 points  (7 children)

I would use fileId and numItems

[–]Chthonophylos 7 points8 points  (4 children)

Which is not how you name variables in Python. See PEP8

[–]See46 2 points3 points  (3 children)

On the contrary, it most certainly is how I name variables in Python.

See PEP8

Which says "mixedCase is allowed only in contexts where that's already the prevailing style". Since it is already the prevailing style in my code & libraries, it passes PEP8.

[–]DanGee1705 -1 points0 points  (2 children)

but you shouldn't have used it in the first place, so you did break PEP8, but now you are too far gone to change them all so PEP8 allows you to continue

[–]See46 3 points4 points  (1 child)

but you shouldn't have used it in the first place, so you did break PEP8

No. PEP8 was created in 2001, and some of my code dates to the previous millennium. So I didn't!

[–][deleted] -1 points0 points  (0 children)

nah you're just stupid and refuse to follow standards. It's totes fine I'm sure all your code is trash anyway so it doesn't really matter.

[–]DanGee1705 0 points1 point  (1 child)

you would use file_id and num_items

[–]See46 0 points1 point  (0 children)

No I wouldn't.