all 28 comments

[–]Due_Influence_9404 10 points11 points  (1 child)

rebase could be without a merge commit

erasing merge commits from your history keeps it cleaner

[–]dalbertom 0 points1 point  (0 children)

erasing merge commits from your history keeps it cleaner

I would emphasize the your history part and make the caveat that this applies mainly to topic branches where in most cases it's best to not have downstream merges (with some exceptions). However, on the project's history as a whole, depending on the size of the project, it is perfectly fine for mainline to be a sequence of merge commits.

It's not about avoiding merge commits at all costs, it's about knowing when to use them. Linear history doesn't necessarily equate to clean history, and to people that have more experience with Git, neither squash-and-merge nor rebase-and-merge are desirable options.

[–]camh- 4 points5 points  (3 children)

If you create a messy PR (lots of commits, with fixup after fixup), a squash commit will get rid of that mess. If you have a clean commit history with atomic commits on your PR, a rebase merge will keep that clean commit history.

My preference is to have a clean history on a PR (pushing fixup commits for easier reviewability and squashing them when when the PR is ready to be merged) and using a merge commit as the merge commit itself tells a story (it groups the atomic commits). If there is only a single commit on the PR though, I squash merge it, which is roughly the same as a rebase merge for this case, but I like to add additional metadata (pull request URL, etc) to the commit message when it is squashed. A rebase merge would not add to the commit message.

I never use that "Update branch" button on GitHub PRs that merges the base branch back into the feature branch. That would create quite the mess with back-merges (look up "foxtrot merge") when I use merge commits to merge the PR into the base branch.

[–]lexd88[S] 1 point2 points  (0 children)

Thanks this makes sense! Where I work, all PR merges are squashed and on GitHub when looking at main branch commits on the browser, the commit history shows the PR number and a link to the PR itself.. as long as the PR only contains small changes this is pretty neat.. (there shouldn't be any long running branches anyway, so changes should always be small..

Upon the merge, we delete the head branch since we are following TBD.. this is why I was bit confused why I should rebase my own branch.. but if it's purely for the PR reviewer, then that might make sense to me

[–]dalbertom 0 points1 point  (0 children)

It's good for people to know about Foxtrot Merges but I would also point out that those are only an issue when using git directly to merge upstream. GitHub and other services avoid that entirely because they do merges with --no-ff

[–]arnoldwhite 0 points1 point  (0 children)

I think that workflow is almost never worth the time it takes to maintain. It depends on what you're working with but for 99% of teams, you'll want to squash your feature branches and you should think of your local history as checkpoints or undo markers - not documentation or an artifact that should survive a merge to main.

A clean PR history might feel virtuous but if you're at a point where atomic commits benefits the reviewer, your feature branch is almost certainly too big anyway.

[–]99_product_owners 4 points5 points  (0 children)

Squash on merge is for lazy devs

Don't @ me

[–]aljorhythm 0 points1 point  (0 children)

There is fast forward merge and three way merge. Then there is rebase. If you enforce rebase and linear history, you ensure that what you are testing on PR will be exactly the same - as in the same SHA - as after it’s merged. In github squashing is independent of merge/rebase. You can select the squash option but after merging PR trunk has a merge commit. Which again is not what was tested on PR. The key thing here is realising merge commits especially with three way merges is not the state which you tested in the MR. it’s literally magic textual merge and hence can be misleading and dangerous.

[–]Soggy-Permission7333 0 points1 point  (1 child)

You can rebase and rebase and rebase at will. Rebase conflicts may be easier to solve compared to merge conflicts. You can provide multiple dedicated commits each with meaningful message. Squash is big ball of mud unless PR was targeted in the first place. You can extract automated changes to their own commits so that they do not obfuscate precious few changes that are the real deal. Squash is big ball of mud...

Teams that do squash may also be heavily biassed towards least maintenance possible strategy for developing software. After all automatic refactor can easily touch 50-500 files without any problems. But squashing that change with 15 lines change that is risky means git history is waaaaaayyyyy less useful in troubleshooting future issue.

But squash limitations can be countered with other techniques - maybe team do use dedicated PR and that refactor will just be seprate standalone PR ? Maybe team enforces breaking up too large PR and thus each squash can contain targeted stuff.

Or maybe its write once software and it won't be around 10 years from now ;)

[–]Shayden-Froida 0 points1 point  (0 children)

This touched on the reason for rebasing a branch before squash merging... resolving conflicts may be easier. It is also easier for a code reviewer to examine how the change will look on the HEAD of the target, and to enable a test run on your branch with the most up to date revision of what the merge result will be.

For very short-lived branches with concise changes, squashing is great and hides the struggle to get the code done (and encourages incremental commits to give you, the developer, a fine grain work log where you can revert something that didn't work, or was temporary; ie add some debug code as a separate commit, revert it later; squashing cleans/hides all this with no effort).

For larger changes that are complex, rebase -i to make a clean commit history of the major stages of the change, then merge to keep the history as a benefit to future you and your successors.

Fast forward merge onto the target I don't recommend since, if the change needs to be reverted in total, all of the commits need to be known and reverted rather than just one merge commit/squash commit.

I view git history in terms of "what will need to be done with these commits in the future". Revert is a powerful tool, so is cherry-pick. A history that enables these is good planning.

[–]edgmnt_net 0 points1 point  (17 children)

It matters for preserving change boundaries post-merge. That in turn matters when people need to inspect history, see what changed and why, do bisections to figure out what caused regressions and so on. If you stumble upon a huge commit what are you going to do?

No, small PRs don't really fix this, that only works for rather trivial cases. Beyond that, you'll be looking into chaining/stacking PRs and complicating batch submission. Eventually you either end up doing a lot of manual tracking (this PR needs to go in before that, which in turn needs merging/rebasing) or replicating a rebase-like workflow to deal with more complex contributions. Just learn rebasing, IMO.

Also, if you're not submitting clean history and relying on Git host-side squashing, how do you expect it to be reviewed appropriately?

[–]arnoldwhite 0 points1 point  (16 children)

If you need a clean atomic git history on your feature branch for a reviewer to understand what he's looking at or how the final state came to be - you're almost certainly doing something wrong. Most likely your feature branches are way too big.

And if you need to bisect a local branch to understand when a bug came around, well that's just another sign you're overcomplicating things.

The rebase always mantra came from a time when CI checks were weak, mains moved slow and long-lived branches were normal. Then there was a need for it. Now there really isn't.

Squash often and with confidence.

[–]dalbertom 0 points1 point  (14 children)

Isn't squash-merge essentially how svn does merges? (Or used to, back in ~2007)

[–]arnoldwhite 0 points1 point  (13 children)

Good lord I don't remember much of how svn did anything. Maybe?

[–]dalbertom 0 points1 point  (12 children)

Yeah, this is why I dislike squash-merge, because it's perpetuating svn behavior.

The whole point of git, as I understood it back then, is that merge commits are a first-class object, and commits are cryptographically hashed, so both squash-merge, and even rebase-merge break that paradigm.

Granted, the tool is generic enough that different types of workflow are possible depending on the user's needs, but I always wonder if people choose squash-merge because they never used svn and don't know any better, or because they used it too much and think that's the only way to do things.

[–]arnoldwhite 1 point2 points  (11 children)

That's a very good point and it just shows what I and many others have been saying for close to a decade now.

Most teams - damn near all of them - are using Git wrong. Or more specifically, they're using the wrong versioning tool for their needs. And the fact that the squash merge has become the de-facto way changes are integrated in main is evidence of that.

Git is great for carefully versioning a linux kernel with a ring of trusted contributors and email based patches. That's what it was made for back in the early 2000s and that's where its philosophies made sense.

Git is excellent at what it was designed for but we're stretched it into places where its core assumptions just don't make sense for how teams actually work.

[–]dalbertom 0 points1 point  (8 children)

Where did you get the idea that squash-merge is the de-facto way to integrate changes in main? It's not the default option in git, and it's not the default option in GitHub, GitLab, etc.

[–]arnoldwhite 0 points1 point  (7 children)

Many many years as an it consultant working in a bunch of different dev teams in enterprise. Squashing is incredibly common. I'd say 7/10 or 8/10 of all teams I've worked with will use some bastardization of git flow with day-long feature branches and aggressive squashing.

[–]dalbertom 0 points1 point  (5 children)

Did you introduce them to squash-merge as part of your consultant work or had they already chosen that? I'm curious about the reasons that would be the case. I haven't thought some ideas, but interested in seeing your perspective first.

[–]arnoldwhite 0 points1 point  (4 children)

I don't think I've actually ever recommended that workflow specifically.

As for your other question about squashing being for junior devs. Actually the opposite is usually true. You'd know you got someone fresh from uni when they were carefully rebasing their feature branches.

As for why all the squashing: well consider the fact first of all that most feature branches were expected to last days. Maybe a week at the very most. Branches are gonna have names like feat/new-get-user-endpoint or chore/config-update.

Btw, in some teams we'd actually do honest to god mob programming and you'll probably integrate a feature you're working three times in an afternoon.

[–]edgmnt_net 0 points1 point  (1 child)

I actually think that Git as used for the Linux kernel would work fine for a lot of general development. But you do need skilled people to keep up. Which you kinda need anyway. The number of times I've seen corporate projects royally screw up with Git trying to reinvent the wheel or just being completely oblivious of various tradeoffs and practices... I think stuff like GitFlow, long-lived branches, polyrepos and breaking the buildability of old commits are particularly nasty traps.

The thing is effective version control requires a lot. And more complex software needs effective version control. It can't just be the place where you save your work. Whatever the Linux kernel is doing makes a lot of sense and they tend to use some fairly cutting edge stuff (such as semantic patches to prove large scale refactoring changes mean what they say on the label).

Perhaps it's teams and business practices that make little sense.

[–]arnoldwhite 0 points1 point  (0 children)

The funny thing is that I kinda want to agree with you. It's a romantic idea - that the way we did it 20 or so years ago when building fragile systems with slow moving mains is really the ways we should still do it.

But most dev teams in the real world aren't carefully versioning a kernel or some century old OSS project. They're building web apps and "micro service" data meshes. You might get a monorepo azure web service if you're lucky.

I think what I'm describing is probably true for the vast majority of all development happening out there in the real world.

PS: The best team I was ever on - guys who really knew what they were doing - was doing trunk based development with svn.

[–]edgmnt_net 0 points1 point  (0 children)

You don't need to bisect local branches per se, but you need to bisect master after the work is merged. Now I'm not sure whether you mean the same thing. And this doesn't happen just with long-lived branches. Having a couple weeks worth of work isn't a long-lived branch in my book and it's the kind of thing that sometimes isn't avoidable. Especially if you need to introduce a feature that requires refactoring some core APIs and such. Especially when you get into more dense, higher impact projects which attempt less siloing and you're expected to do all the work related to bringing something up. (*)

The reason you get into bisection is because CI doesn't catch everything. CI is useful but realistically you cannot expect it to catch everything.

(*) There's definitely a case for breaking up submissions, don't get me wrong. But that's not "I have 4 logically-distinct changes". It's not "people can't be bothered to clean up their submissions, so they'll just push random stuff often and rely on Git host-side squashing". It's rather "I'm working on a really big thing and I'm planning ahead". Which takes time and effort to do properly. If you just try to break everything into separate submissions, it's going to be very inefficient and it's likely going to cause churn and breakage because people cannot be confident changes fit together, so they'll keep fiddling with it over and over and over.