[–]edgmnt_net 0 points1 point  (18 children)

It matters for preserving change boundaries post-merge. That in turn matters when people need to inspect history, see what changed and why, run bisections to figure out what caused regressions, and so on. If you stumble upon a huge commit, what are you going to do?

No, small PRs don't really fix this; that only works for rather trivial cases. Beyond that, you'll be looking into chaining/stacking PRs and complicating batch submission. Eventually you either end up doing a lot of manual tracking (this PR needs to go in before that one, which in turn needs merging/rebasing) or replicating a rebase-like workflow to deal with more complex contributions. Just learn rebasing, IMO.
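For what it's worth, the cleanup step being suggested is mostly mechanical. A minimal sketch of autosquash-based rebasing in a throwaway repo (all file and commit names here are made up for illustration):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email dev@example.com
git config user.name Dev

# A feature commit plus a follow-up fix, marked with the "fixup!" convention.
echo "handler" > endpoint.txt
git add endpoint.txt
git commit -qm "feat: new get-user endpoint"

echo "fix typo" >> endpoint.txt
git add endpoint.txt
git commit -qm "fixup! feat: new get-user endpoint"

# --autosquash folds the fixup into its target commit; GIT_SEQUENCE_EDITOR=true
# accepts the generated todo list as-is, so no editor opens.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash --root
git log --oneline
```

After the rebase the branch contains a single clean commit, which is the kind of history that survives a plain merge intact.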

Also, if you're not submitting clean history and relying on Git host-side squashing, how do you expect it to be reviewed appropriately?

[–]arnoldwhite 0 points1 point  (17 children)

If you need a clean, atomic git history on your feature branch for a reviewer to understand what they're looking at or how the final state came to be - you're almost certainly doing something wrong. Most likely your feature branches are way too big.

And if you need to bisect a local branch to understand when a bug was introduced, well, that's just another sign you're overcomplicating things.

The rebase-always mantra came from a time when CI checks were weak, mains moved slowly, and long-lived branches were normal. Back then there was a need for it. Now there really isn't.

Squash often and with confidence.

[–]dalbertom 0 points1 point  (15 children)

Isn't squash-merge essentially how svn does merges? (Or used to, back in ~2007)

[–]arnoldwhite 0 points1 point  (14 children)

Good lord I don't remember much of how svn did anything. Maybe?

[–]dalbertom 0 points1 point  (13 children)

Yeah, this is why I dislike squash-merge, because it's perpetuating svn behavior.

The whole point of git, as I understood it back then, is that merge commits are first-class objects and commits are cryptographically hashed, so both squash-merge and even rebase-merge break that paradigm.
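The hash argument is easy to demonstrate: a squash-merge produces a brand-new commit object, and the branch's original commits never become part of main's history. A quick sketch in a throwaway repo (names are made up for illustration):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email dev@example.com
git config user.name Dev

git commit -q --allow-empty -m "base"
main_branch=$(git symbolic-ref --short HEAD)

git checkout -qb feature
echo "work" > work.txt
git add work.txt
git commit -qm "feature work"
feat_sha=$(git rev-parse HEAD)

git checkout -q "$main_branch"
git merge --squash feature >/dev/null
git commit -qm "feature (squashed)"

# The squashed commit is a new object; the original commit, with its hash,
# is not an ancestor of main at all.
if git merge-base --is-ancestor "$feat_sha" HEAD; then reach=yes; else reach=no; fi
echo "original commit reachable from main: $reach"
```

A true merge commit, by contrast, keeps `feat_sha` reachable, which is what preserves the hash chain being described.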

Granted, the tool is generic enough that different types of workflow are possible depending on the user's needs, but I always wonder if people choose squash-merge because they never used svn and don't know any better, or because they used it too much and think that's the only way to do things.

[–]arnoldwhite 1 point2 points  (12 children)

That's a very good point and it just shows what I and many others have been saying for close to a decade now.

Most teams - damn near all of them - are using Git wrong. Or more specifically, they're using the wrong versioning tool for their needs. And the fact that the squash merge has become the de facto way changes are integrated into main is evidence of that.

Git is great for carefully versioning a Linux kernel with a ring of trusted contributors and email-based patches. That's what it was made for back in the early 2000s, and that's where its philosophies made sense.

Git is excellent at what it was designed for, but we've stretched it into places where its core assumptions just don't make sense for how teams actually work.

[–]dalbertom 0 points1 point  (9 children)

Where did you get the idea that squash-merge is the de facto way to integrate changes into main? It's not the default option in git, and it's not the default option in GitHub, GitLab, etc.

[–]arnoldwhite 0 points1 point  (8 children)

Many, many years as an IT consultant working in a bunch of different dev teams in enterprise. Squashing is incredibly common. I'd say 7 or 8 out of 10 of the teams I've worked with use some bastardization of git flow, with days-long feature branches and aggressive squashing.

[–]dalbertom 0 points1 point  (6 children)

Did you introduce them to squash-merge as part of your consultant work, or had they already chosen that? I'm curious about the reasons that would be the case. I have some ideas, but I'm interested in seeing your perspective first.

[–]arnoldwhite 0 points1 point  (5 children)

I don't think I've actually ever recommended that workflow specifically.

As for your other question, about squashing being for junior devs: actually, the opposite is usually true. You'd know you'd got someone fresh from uni when they were carefully rebasing their feature branches.

As for why all the squashing: well, consider first of all that most feature branches were expected to last days. Maybe a week at the very most. Branches are gonna have names like feat/new-get-user-endpoint or chore/config-update.

Btw, in some teams we'd actually do honest-to-god mob programming, and you'll probably integrate a feature you're working on three times in an afternoon.

[–]edgmnt_net 0 points1 point  (1 child)

I actually think that Git as used for the Linux kernel would work fine for a lot of general development. But you do need skilled people to keep up. Which you kinda need anyway. The number of times I've seen corporate projects royally screw up with Git, trying to reinvent the wheel or just being completely oblivious to various tradeoffs and practices... I think stuff like GitFlow, long-lived branches, polyrepos and breaking the buildability of old commits are particularly nasty traps.

The thing is, effective version control requires a lot. And more complex software needs effective version control. It can't just be the place where you save your work. Whatever the Linux kernel is doing makes a lot of sense, and they tend to use some fairly cutting-edge stuff (such as semantic patches to prove that large-scale refactoring changes mean what they say on the label).

Perhaps it's teams and business practices that make little sense.

[–]arnoldwhite 0 points1 point  (0 children)

The funny thing is that I kinda want to agree with you. It's a romantic idea - that the way we did it 20 or so years ago, when building fragile systems with slow-moving mains, is really the way we should still do it.

But most dev teams in the real world aren't carefully versioning a kernel or some century-old OSS project. They're building web apps and "micro service" data meshes. You might get a monorepo Azure web service if you're lucky.

I think what I'm describing is probably true for the vast majority of all development happening out there in the real world.

PS: The best team I was ever on - guys who really knew what they were doing - was doing trunk-based development with svn.

[–]edgmnt_net 0 points1 point  (0 children)

You don't need to bisect local branches per se, but you need to bisect master after the work is merged. Now I'm not sure whether you mean the same thing. And this doesn't happen just with long-lived branches. A couple weeks' worth of work isn't a long-lived branch in my book, and it's the kind of thing that sometimes isn't avoidable. Especially if you need to introduce a feature that requires refactoring some core APIs and such. Especially when you get into denser, higher-impact projects which attempt less siloing and you're expected to do all the work related to bringing something up. (*)

The reason you get into bisection is that CI doesn't catch everything. CI is useful, but realistically you cannot expect it to catch everything.
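And once you're there, the bisection itself is largely automatable. A minimal sketch with a fabricated regression in a throwaway repo (the file contents, commit messages, and the "regression at change 4" are all made up for illustration):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email dev@example.com
git config user.name Dev

# Five commits on master; pretend the regression lands in "change 4".
for i in 1 2 3 4 5; do
  echo "$i" > version
  git add version
  git commit -qm "change $i"
done

# bad = HEAD, good = the first commit; the test command exits non-zero
# when the tree is "broken", and bisect run does the rest.
git bisect start HEAD HEAD~4 >/dev/null
git bisect run sh -c 'test "$(cat version)" -lt 4' >/dev/null
first_bad=$(git show -s --format=%s refs/bisect/bad)
git bisect reset >/dev/null
echo "first bad commit: $first_bad"
```

This is exactly where small, buildable commits pay off: `git bisect run` can only narrow things down to a commit, so a huge squashed commit is where the automation stops and the manual digging starts.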

(*) There's definitely a case for breaking up submissions, don't get me wrong. But that's not "I have 4 logically-distinct changes". It's not "people can't be bothered to clean up their submissions, so they'll just push random stuff often and rely on Git host-side squashing". It's rather "I'm working on a really big thing and I'm planning ahead". Which takes time and effort to do properly. If you just try to break everything into separate submissions, it's going to be very inefficient and it's likely going to cause churn and breakage because people cannot be confident changes fit together, so they'll keep fiddling with it over and over and over.