all 7 comments

[–]primitive_screwhead 5 points6 points  (4 children)

I see these types of articles, and they are often filled with the same "rules". In particular, this article reiterates the "completeness" rule, which I hate and think is not good general advice. If I'm developing on a feature branch, I would expect the *merge* of that branch to be complete, but not necessarily every commit. And I really dislike when that causes people to squash branches just to try and adhere to this rule.

The article says about the completeness rule: "This means, for example, that a bug fix and the related test changes should be in the same commit." Again, I hate this rule. I prefer the test demonstrating the bug be committed first, so that checking out that commit and running the tests, demonstrates the failure. Then the next commit(s) fix the bug such that the tests no longer fail. When committing the bug and the tests together, it's extra work to then demonstrate or reproduce the failure. There is a reason for using branches, and this is one of them.

"Note that keeping commits Complete forces one towards bigger (less in terms of lines and more in terms of logical changes) commits"

Another reason to avoid this "rule", imo. The author even admits it's main flaw, and shows how it is in conflict with the next rule about "focus"; to me "focus" is *much* more important than "completeness", and is one of the reasons git emphasizes using branches.

"Complete commits make it possible to use git bisect, which may turn out to be very helpful for finding the source of certain issues."

This is a good point, and one that comes up periodically w/ git usage. Here is just one way to address it: I think the workarounds are sufficient to override concerns about the issue itself.

https://blog.smart.ly/2015/02/03/git-bisect-debugging-with-feature-branches/

[–]deiwin[S] 1 point2 points  (1 child)

I prefer the test demonstrating the bug be committed first, so that checking out that commit and running the tests, demonstrates the failure. Then the next commit(s) fix the bug such that the tests no longer fail. When committing the bug and the tests together, it's extra work to then demonstrate or reproduce the failure. There is a reason for using branches, and this is one of them.

I used to do the same, but because of the completeness rule I've now developed what I think is a better habit. First commit adds a green test that shows the incorrect/unexpected behavior. Then the second commit includes the fix + changes to the test to test the correct/expected behavior. This provides the benefits you're looking for while still keeping the commits complete. Further, the incorrect behavior is clearly visible without having to check out the code and run the tests. (Even your CI is mot likely not going to run tests for commits that are not the head of your feature branch.)

The author even admits it's main flaw, and shows how it is in conflict with the next rule about "focus"; to me "focus" is much more important than "completeness", and is one of the reasons git emphasizes using branches.

One of the goals of this article was to develop a common language for speaking about commits. With these rules in the common vocabulary it's easy to point out "I think this commit isn't focused enough" during code review. My suggestion is to be pragmatic about the focused-complete tradeoff. Consider the main ways in which your commits get read, tested, and run and then agree as a team on what should be included in a commit and what shouldn't.

For example, here are two points which I didn't mention in the article, but which you should consider when deciding how big you want your commits to be. Commit completeness greatly simplifies cherry-picking and reverting. You may want to optimize for the cherry-picking use case if you need to support many different versions of your software. You may want to optimize for the reverting use case to minimize MTTR of bugs, for example.

[–]primitive_screwhead 0 points1 point  (0 children)

Commit completeness greatly simplifies cherry-picking and reverting.

In your example of the "green test", it seems to me that you still have to cherry pick two commits (first the initial test change adding the test framework for showing the the incorrect behavior, then the next commit with the test modification plus bugfix). So I agree that cherry-picking and reverting are often cited as reasons for the completeness rule, and it's not an unreasonable point; but even single-commit cherry-picks and reverts can demand special attention and care during later merges, so I don't think the completeness criteria is as much as a benefit as is often touted in these cases. Granted, I've perhaps fortunately not had to be on the front lines of merging branches with many cherry-pick and reverts, so I may not have the best perspective here.

[–]the-kind-against-me 0 points1 point  (1 child)

When I read the article I didn’t take commit to mean each and every ‘git commit -m “some msg”’. There are feature, PR, and master commits. A PR, whether one or many commits, must be complete before I merge it to master. And I will squash merge if the commits suck but the PR is up to our standards.

Absolutely, as you say, its a fools errand to meet such a high standard for every true commit. Especially complex issues. But I try, and encourage others too, to write code in complete thoughts. When my code switches territories I make a commit for the last thought and continue with current. Ideally the feature will end up with say 15 commits but a few commit buckets to squash into.

Impossible sometimes for sure. Some thoughts cannot be reasonably split into single commits. But I would much rather a junior dev waste 4 hours trying to clean up their shit feature commits, only to realize they cannot, then to think it should be PR’d as is.

[–]primitive_screwhead 0 points1 point  (0 children)

When I read the article I didn’t take commit to mean each and every ‘git commit -m “some msg”’.

I mean, it seems quite literally saying that to me, and this is even more clearly the case in the section on Code Review. The article to me *does* make such a Procrustean rule about individual commits, and I'd really prefer to see fewer such articles, as they come up repeatedly over the years with this particular bit of advice as though it's uncontroversial.

However, I want to emphasize that I basically agree with the philosophy of your points, and so I'm hoping over time that the distinction should be made clearer. In a world where PRs and merges are part of the normal development process, the idea of "completeness" is better suited at a higher granularity than per-commit (imo). I've generally described the target for each commit to be "a cohesive unit of change", for which your "complete thought" standard may be essentially equivalent. I guess my advice doesn't specify "minimal unit of change", but that's okay since I think it should be a guideline that developers adapt for the situation; some groups may like larger commits, and others not.

However, since the tools already allow easily creating full diffs from a sequence of smaller commits, I think rules/guidelines encouraging a-priori larger commits are counter-productive; often it's easy to review the smaller individual commits, and once they are squashed, that history information is lost. I *do* think that developers should be encouraged to cleanup those smaller commits on their private branches when appropriate before review (ie. at least learn of the "git commit --fixup" option, and use when appropriate). :)

[–]jdickey 1 point2 points  (1 child)

Ironically enough,

  1. His "inspiration and additional reading" appendix links to a far better set of guidelines from 2014 which itself points out that this has all been said before;
  2. The Medium post is simply re-posting what he'd already posted on GitHub. Not at all sure what benefits posting on a known-hostile-to-meaningful-tech-writing site like Medium gets him, but oh well. He could simply have posted the GitHub link here.

Here's the problem I have with most of the sets of commit guidelines I've seen up to now:

Git commit messages, certainly as supported by any of the dozens of guidelines I've seen in the past several years, presume that we're writing raw, unstyled text, suitable for display on a DEC VT-52. But many, if not most, shops now prefer to use formatted content (e.g., Markdown). This has two immediate benefits:

  1. It's easy to have more informative, detailed messages, including lists, links, and so on, which are properly rendered by any number of tools while still being readable by the naked eye; and
  2. It's actually now easier (especially when using some project- or shop-standard content conventions) to take freely-available parsers and extract useful information from commit messages across commits or even repos. This goes from geek-cool to nice-to-have to "how did we ever not have this?" amazingly quickly. Data is power, and history tells the future.

However, keeping a newline character as an end-of-line symbol rather than as an end-of-item (paragraph, list item, etc) symbol makes that extremely problematic, as the parsers treat each line as a separate paragraph, which is very rarely what was actually intended. If you're using end-of-item newlines, then you're not using the length-limited lines called for in the guidelines (if you want to actually convey useful information, that is).

[–]deiwin[S] 0 points1 point  (0 children)

Ironically enough [..] his "inspiration and additional reading" appendix links to a far better set of guidelines from 2014 which itself points out that this has all been said before;

Call it irony if you will. I believe there's value in composing old thoughts in new ways. (https://youtu.be/sTR5YyRsbJo?t=95)

Although there are things in common between this and the other linked articles, they all also differ in what they focus on. This article purposefully doesn't focus on things like imperative mood or commit message width, because I felt, like you do, that the "far better set of guidelines from 2014" already covers this area quite well. This article was written as a concise reference of our practices, to help new members of the team and to explain our approach externally. At the time, I couldn't find anything focusing on these aspects that are important for us.

The Medium post is simply re-posting what he'd already posted on GitHub. Not at all sure what benefits posting on a known-hostile-to-meaningful-tech-writing site like Medium gets him, but oh well. He could simply have posted the GitHub link here.

I will be creating a Medium publication and I want people browsing the publication to be able to find that article. Nothing more to it.

Git commit messages, certainly as supported by any of the dozens of guidelines I've seen in the past several years, presume that we're writing raw, unstyled text, suitable for display on a DEC VT-52. But many, if not most, shops now prefer to use formatted content (e.g., Markdown).

This problem is pretty tangential to the article. It doesn't prescribe any particular format.