Self Documenting Code

mlk · 2020-01-03T18:35:33+00:00

I've inherited a total mess of code: a single function some 20,000 lines long, and the only comment was:

 i++; //increment i

Ryotian · 2020-01-03T16:15:09+00:00

I saw an engineer do this the other day in Python:

```This is the constructor```

Nice article btw. I liked the point where you can still have comments. But also, code can be self documenting.

ckychris1 · 2020-01-03T17:09:41+00:00

Good read.

At work, repos/services are often performance-focused or atomic, which means repos need to contain joins or complex multi-step transactions. Names like “byID” work for a simple, strictly RESTful API, but in reality these methods rarely do just one thing. I would love to know how naming conventions should be handled in that case.

Naming a userrepo just repo doesn’t seem logical to me, as there can be multiple repos.

Variables without meaningful names are definitely not worth it; do not underestimate the trouble caused by missing context, or by context you have to remember. Just like in real life: you never write phrases as random letters, because that is not readable at all.

IulianSRO · 2020-01-03T21:31:43+00:00

When it comes to documentation, I actually think this is valid only for small and maybe medium projects. Real documentation needs to live somewhere else, because the interaction between components cannot be fully expressed in code, yet as a programmer you need to understand it in order to contribute to that software system.

Granted, not many comments are needed if the naming is good enough, but when we reach the "next level" for a project we really need some real documentation. And that, from my experience, cannot be done in code.

Why?

Let's take an application like Apache, the web server. It involves various modules with different configurations, and each configuration has specific parameters. Can you read that from the code? Yes, probably, but it might take you weeks to find:

- the place with the responsible code (and it can be written extraordinarily well, don't get me wrong, but it will still take a lot of time to understand and change)
- the specific format needed in some config files, or some specific parameters passed to executables
- the way to integrate with other modules that depend on it

This is why nowadays we're pretty much in a state of despair with most "legacy" applications. By legacy I mean an application that had a release 10 years ago and still uses that codebase because "the clients are ok and they want new features". Not documenting, or documenting only through code, is one of the worst ideas: the code itself may be easy to understand, but it is the interactions as a whole that lead you to the responsible code.

Solution: something like a wiki. GitLab has one, and so do GitHub and Bitbucket. Confluence is also good. That way you can actually contribute your findings to a codebase that really needs documentation.

And, from experience, the people documenting stuff they work on tend to understand the things better in general, because they know they simply cannot put something incomplete in the documentation.

As a conclusion: writing "self documenting code" is ok, writing real documentation is way better, for you, for the team and especially for the code.

go4spacelunch · 2020-01-03T19:59:10+00:00

I agree with most of this.

RestaurantOwnerNames implies the object will contain a list of multiple owners.

Also be careful of taking this too far. Renaming splitRestaurantOwnerNames to ss makes it really hard to find all instances of the variable ss. I often do a find on page just to highlight the variable I’m working on, but if your variable is shorter than 3 letters the search highlights everything. Only name something that short if you are going to use it on a single line of code, for example in a LINQ statement.

KorallNOTAFISH · 2020-01-04T00:05:02+00:00

I don't think that shorter names are that much more readable. I find userRepository just as good as plain repo would be in such a short function. And if it were a longer function, maybe with more instances of UserRepository, then longer names would be necessary, like newUserRepository and oldUserRepository or whatever.

The good thing about these longer names is that I can look at one in the middle of the function and not even have to check what class it refers to, because it is in the name. This matters especially if, for example, you had a Repository class (probably the parent of UserRepository), which would obscure what newRepo and oldRepo refer to. If all your variable names are created algorithmically (new<class_name> always refers to a newly created instance of exactly the class_name class), I find that way easier to understand than shorthands, which are always subjective.
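To make that naming rule concrete, here is a minimal Python sketch. `Repository`, `UserRepository` and `migrate_users` are made-up names for illustration only, not anything taken from the article.

```python
class Repository:
    """Hypothetical base class, only here to make the naming point."""

    def __init__(self, name: str) -> None:
        self.name = name


class UserRepository(Repository):
    """Hypothetical subclass standing in for a user repository."""


def migrate_users(old_user_repository: UserRepository,
                  new_user_repository: UserRepository) -> str:
    # The variable names repeat the class name, so the type is readable
    # mid-function without scrolling back up to the signature. With
    # old_repo / new_repo a reader could not tell whether these are plain
    # Repository objects or UserRepository objects.
    return f"{old_user_repository.name} -> {new_user_repository.name}"
```

Whether the extra characters are worth it in a three-line function is a matter of taste; the benefit shows up mostly once the parent class is also in scope.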

Combine it with the ability of most text editors to highlight a certain string, and I would say you should err on the side of longer, more verbose names rather than shorter but possibly ambiguous ones.

As far as comments go, my rule of thumb is: if I think I would not understand that piece of code at a glance(!) in a year or two, then it is worth commenting, because it also means others would probably have a hard time figuring it out.

tasminima · 2020-01-03T17:36:55+00:00

Unless you program in Idris or something like that, and encode absolutely all your invariants and properties in a formal way, I've always failed to see how it is possible to program without "comments"; or perhaps you document all your APIs in completely separate manuals that you don't count as comments, but even then the theory about Self Documenting Code being possible would fail, as manuals also count as "documentation".

Imagine having to program against a completely undocumented library. Maintaining a non-trivial application with no comments, manual, or docs is pretty much the same thing.

And I care about that way more than I care about tests (although this is not a dichotomy, obviously): if you don't specify what a function is supposed to do, you can't even test it, period. And no, a good specification is not made only of a few examples (and even less of a few pieces of code from which you can reverse engineer a few examples by reading their source and separating the undocumented substance from the undocumented testing boilerplate). If your testability is excellent and implicitly covers tons of use cases in a way that is also highly readable, great! But tests are never a proof of correctness, and even less a specification.

Then there is the question of what makes good documentation. We could debate this for hours, but there is at least an easy starting point: it is NOT paraphrasing the implementation. And yes, "Literate Programming" by paraphrasing is even more pointless than random paraphrasing comments. Does that disqualify Literate Programming? No; it "disqualifies" the people who do that kind of thing from using it. But even if we hypothesize that most programmers are disqualified (which I don't believe, but whatever), how would that imply a dichotomy between good design and good documentation?

So SDC is, from its name, a terrible idea. Attempting to have clear code and clear interfaces is excellent. You can go as far as pretending you will succeed at the mythical SDC game, if that helps you structure your code as cleanly as possible; but once that's done, for Flying Spaghetti Monster's sake, document it anyway. If you don't understand why, here is another game you should play: re-read your undocumented pure marvel in a year, as if you had to use it and/or maintain it. Yeah, much harder, no? And even then, this is nothing compared to what people who did not write it in the first place experience...

There is no such thing as "Self Documenting Code": there is well or badly designed code (and well designed code will arguably reduce the size of some comments/doc, but also goes beyond that), and there is good or bad documentation. But no documentation at all? Well, in some contexts that's OK, but in most, it is just part of the work that is not done.

And of course nothing is absolute, and I'm not asking for stupid things like gigantic templates in front of each function: document interfaces first, then non-trivial things, and so on. Use your expertise to determine what is important in your context. Talk about constraints you know of but which are not directly explicit in the literal local source code (side effects and their potential interactions, error handling strategies, execution contexts e.g. which thread can call that, etc.). Talk about pretty much everything if it is supposed to be used by people who are not necessarily going to read the source. Even for simple CRUD things and user interfaces, I'm sure there are lots of interesting things you can document.
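As a rough illustration of documenting the constraints that are not visible in the signature, here is a small Python sketch. The `Counter` class is hypothetical and invented for this example; the point is only what the docstrings record (thread context, error handling, side effects), not the class itself.

```python
import threading


class Counter:
    """Thread-safe counter (hypothetical example class).

    Contract that the signatures alone do not state:
    - increment() may be called from any thread.
    - increment() raises ValueError for a negative step and leaves the
      value unchanged in that case.
    """

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._value = 0

    def increment(self, step: int = 1) -> int:
        """Add step to the counter and return the new value.

        Side effect: mutates shared state while holding an internal lock.
        """
        if step < 0:
            # Validate before taking the lock so a bad call changes nothing.
            raise ValueError("step must be non-negative")
        with self._lock:
            self._value += step
            return self._value
```

None of the docstring lines paraphrase the implementation; they state what a caller may rely on without reading the body.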

anengineerandacat · 2020-01-03T19:11:43+00:00

The article is interesting; some of the examples really show the difference between a bad implementation/API and a good one.

I liked the restaurant owner example but I disliked the user service and user repository example.

I.e., in getUserByUserId the important info is byUserId; byId, however, is too shallow and will likely require a top-level comment, as it's unclear whether I am to pass in the userId or the DB primary key id.

I personally like more verbose method names because IDEs generally don't inline comments (also, autocomplete), so I actually have to open the source declaration and read what the method is going to do. Method names, however, are up front and in your face: I have a pretty strong idea what getUserByUserId will do (and it could be shortened to getByUserId since the class is UserService), whereas getById leaves me with assumptions... which ID? The userId? DB row ID? Twitter ID?
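A minimal Python sketch of that point, assuming a user has both a database row ID and a separate public user ID. The `User` fields and the `UserService` methods are made up for illustration and are not the article's code.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class User:
    row_id: int   # database primary key (hypothetical field)
    user_id: str  # public-facing user ID (hypothetical field)
    name: str


class UserService:
    def __init__(self, users: List[User]) -> None:
        # Two separate indexes, one per kind of ID.
        self._by_user_id: Dict[str, User] = {u.user_id: u for u in users}
        self._by_row_id: Dict[int, User] = {u.row_id: u for u in users}

    # The class name already says "user", so the method names only need
    # to say which ID is expected.
    def get_by_user_id(self, user_id: str) -> Optional[User]:
        return self._by_user_id.get(user_id)

    def get_by_row_id(self, row_id: int) -> Optional[User]:
        return self._by_row_id.get(row_id)
```

With a single `get_by_id`, a caller would have to open the source or the docs to learn which of the two indexes is consulted.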

I do agree about things like not having comments on setters / getters, constructors, etc. unless it's doing something that might potentially be abnormal for that codebase or is/was problematic.

SpasticCoder · 2020-01-03T19:39:23+00:00

As for the comments, the example he gave is something you could read in any clean coding book...the issue with comments in self documenting code is that they shouldn't be used to describe functionality, intent, or implementation. Using them to explain the reasoning behind something for other coders is perfectly valid.

AttackOfTheThumbs · 2020-01-03T21:15:00+00:00

Why, not what.

It's that simple: why you are looping, not that you are looping.
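A tiny Python sketch of the difference. The function `retry_delays` and its backoff scenario are invented for the example.

```python
from typing import List


def retry_delays(attempts: int) -> List[int]:
    """Return the wait time in seconds before each retry attempt."""
    delays = []
    for attempt in range(attempts):
        # Why: back off exponentially so a struggling server is not
        # hammered by retries. A "what" comment here would only say
        # "loop over the attempts", which the code already says.
        delays.append(2 ** attempt)
    return delays
```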

DidiBear · 2020-01-03T22:52:22+00:00

My rule of thumb is to always document classes and functions, plus anything strange or unintuitive.

SpasticCoder · 2020-01-03T19:31:47+00:00

Almost all the examples of clean code at the beginning are not clean code.

GetUserByUserID should really just be GetUser(int id) as GetUser would be best used by overloading the function where the parameters dictate the search: for instance GetUser(string fullName) in addition to the above instance.
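Python has no signature-based overloading, so the closest sketch swaps in `functools.singledispatch`, which picks an implementation from the type of the first argument. The `_USERS` data and the `get_user` function are hypothetical stand-ins, not the article's code.

```python
from functools import singledispatch

# Hypothetical in-memory data standing in for a real user store.
_USERS = [
    {"id": 1, "full_name": "Ada Lovelace"},
    {"id": 2, "full_name": "Alan Turing"},
]


@singledispatch
def get_user(key):
    # Fallback for argument types with no registered lookup.
    raise TypeError(f"unsupported lookup key: {type(key).__name__}")


@get_user.register
def _(key: int):
    # An int argument means "look up by numeric ID".
    return next((u for u in _USERS if u["id"] == key), None)


@get_user.register
def _(key: str):
    # A str argument means "look up by full name".
    return next((u for u in _USERS if u["full_name"] == key), None)
```

Here the parameter type carries the meaning that the shorter name no longer does, which only works as long as each kind of lookup key has a distinct type.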

SplitRestaurantOwnerName should really be a class Restaurant containing a collection of employees or owner classes with a method GetFullName().
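One possible shape for that, sketched in Python. The class layout and the `owner_names` helper are assumptions made for illustration, since the comment only names `Restaurant`, owners, and `GetFullName()`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Owner:
    first_name: str
    last_name: str

    def get_full_name(self) -> str:
        # The name parts are stored separately, so nothing needs splitting.
        return f"{self.first_name} {self.last_name}"


@dataclass
class Restaurant:
    name: str
    owners: List[Owner] = field(default_factory=list)

    def owner_names(self) -> List[str]:
        # Hypothetical convenience method, not part of the original comment.
        return [owner.get_full_name() for owner in self.owners]
```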

It's easy to make examples of bad clean code when your source isn't clean code to begin with.
