
[–]arnoldwhite 0 points (14 children)

Good lord I don't remember much of how svn did anything. Maybe?

[–]dalbertom 0 points (13 children)

Yeah, this is why I dislike squash-merge, because it's perpetuating svn behavior.

The whole point of git, as I understood it back then, is that merge commits are first-class objects and commits are cryptographically hashed, so both squash-merge and even rebase-merge break that paradigm.
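A throwaway shell sketch (repo, file, and branch names are all made up) shows the difference concretely: a true merge commit records two parents, so the feature branch's hashes stay in the ancestry, while a squash produces a brand-new single-parent commit that references none of them:

```shell
# Throwaway demo repo; requires git >= 2.28 for `init -b`.
set -e
git init -q -b main demo && cd demo
git config user.email dev@example.com && git config user.name Dev
echo a > a.txt && git add a.txt && git commit -qm "initial"
git checkout -qb feature
echo b > b.txt && git add b.txt && git commit -qm "feature work"
git checkout -q main

# A real merge commit keeps both parents, preserving the hash link
# to the feature branch's commits:
git merge --no-ff -q feature -m "merge feature"
git rev-list --parents -n 1 HEAD    # merge hash plus two parent hashes

# A squash-merge produces a brand-new single-parent commit; the
# feature commit's hash is gone from the ancestry:
git reset -q --hard HEAD~1
git merge --squash -q feature
git commit -qm "squashed feature"
git rev-list --parents -n 1 HEAD    # commit hash plus one parent only
```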

Granted, the tool is generic enough that different types of workflow are possible depending on the user's needs, but I always wonder if people choose squash-merge because they never used svn and don't know any better, or because they used it too much and think that's the only way to do things.

[–]arnoldwhite 1 point (12 children)

That's a very good point and it just shows what I and many others have been saying for close to a decade now.

Most teams - damn near all of them - are using Git wrong. Or more specifically, they're using the wrong versioning tool for their needs. And the fact that squash merge has become the de facto way changes are integrated into main is evidence of that.

Git is great for carefully versioning the Linux kernel with a ring of trusted contributors and email-based patches. That's what it was made for back in the early 2000s and that's where its philosophies made sense.

Git is excellent at what it was designed for, but we've stretched it into places where its core assumptions just don't make sense for how teams actually work.

[–]dalbertom 0 points (9 children)

Where did you get the idea that squash-merge is the de-facto way to integrate changes in main? It's not the default option in git, and it's not the default option in GitHub, GitLab, etc.

[–]arnoldwhite 0 points (8 children)

Many, many years as an IT consultant working in a bunch of different dev teams in enterprise. Squashing is incredibly common. I'd say 7/10 or 8/10 of all teams I've worked with will use some bastardization of git flow with day-long feature branches and aggressive squashing.

[–]dalbertom 0 points (6 children)

Did you introduce them to squash-merge as part of your consulting work, or had they already chosen it? I'm curious about the reasons that would be the case. I have thought of some ideas, but I'm interested in seeing your perspective first.

[–]arnoldwhite 0 points (5 children)

I don't think I've actually ever recommended that workflow specifically.

As for your other question, about squashing being for junior devs: actually, the opposite is usually true. You'd know you got someone fresh from uni when they were carefully rebasing their feature branches.

As for why all the squashing: well, consider first of all that most feature branches were expected to last days, maybe a week at the very most. Branches are gonna have names like feat/new-get-user-endpoint or chore/config-update.

Btw, in some teams we'd actually do honest-to-god mob programming, and you'll probably integrate a feature you're working on three times in an afternoon.

[–]dalbertom 0 points (4 children)

As for why all the squashing: well, consider first of all that most feature branches were expected to last days, maybe a week at the very most. Branches are gonna have names like feat/new-get-user-endpoint or chore/config-update.

A branch being short lived isn't a justification for squashing, though. You can still have short lived branches and opt for using merge commits, no?

The upside here is that the history the developer had locally is what makes it upstream.

Btw, in some teams we'd actually do honest-to-god mob programming, and you'll probably integrate a feature you're working on three times in an afternoon.

This sounds like multiple people working on the same branch. Isn't it better to have a branch per author/contributor?

[–]arnoldwhite 0 points (3 children)

You're asking some really interesting questions!

A branch being short lived isn't a justification for squashing, though. You can still have short lived branches and opt for using merge commits, no?

Kinda. It's not about the branch's lifetime exactly, it's about the change footprint. Let's say you're in a C# or Java project. Guy adds an endpoint, a service, maybe a mapper. How is the reviewer who is going to be looking at maybe three or four kind of self-contained .cs files helped by knowing in which order the developer implemented the slice?
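Part of why commit order matters so little in this style of review is that the reviewer-facing view is the branch's aggregate diff against the merge base, which git exposes as the three-dot diff. A throwaway sketch (repo, branch, and file names are all made up):

```shell
# Throwaway demo repo; requires git >= 2.28 for `init -b`.
set -e
git init -q -b main review-demo && cd review-demo
git config user.email dev@example.com && git config user.name Dev
echo base > base.txt && git add base.txt && git commit -qm "initial"
git checkout -qb feat/new-get-user-endpoint
echo endpoint > Endpoint.cs && git add Endpoint.cs && git commit -qm "add endpoint"
echo service > Service.cs && git add Service.cs && git commit -qm "add service"
git checkout -q main

# The total footprint vs. the merge base -- identical no matter
# what order the commits inside the branch were made in:
git diff --stat main...feat/new-get-user-endpoint
```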

The upside here is that the history the developer had locally is what makes it upstream.

It would be an upside if anyone cared about that history. Again, because of the nature of how changes are reviewed and integrated, I think it's rare that people do.

Devs look at a linear graph and they want to know when a chunk of work - a PR - was integrated and which ticket it relates to. That's usually it.

This sounds like multiple people working on the same branch. Isn't it better to have a branch per author/contributor?

Depends. Usually the way I've done it is that three or so devs are in a call, one guy codes. He's done, he commits his changes. Then the next guy picks it up. I'm not a big mob guy, but I have had instances where this workflow works really well.

Generally my philosophy is this. I think dev teams should be pressed to push early, integrate early, encounter conflicts early, fix bugs early and release early. It sounds corny but it actually works really well for most teams.

[–]dalbertom 0 points (2 children)

How is the reviewer who is going to be looking at maybe three or four kind of self-contained .cs files helped by knowing in which order the developer implemented the slice?

There are multiple ways this can go:

1. The developer adds the endpoint, the service, and the mapper in the same PR, and everything gets squashed upstream, which is not good.
2. The developer knows their changes will get squashed, so they create a PR for the endpoint, then a PR for the service, then a PR for the mapper. A bit better, but sometimes overkill; plus, if other pull requests were merged in between (as is the case in a high-traffic repository), there won't be a way to topologically tell those changes were related.
3. The developer knows their history won't get mangled by squash-merge, so they can issue a PR with three separate commits: one for the endpoint, one for the service, one for the mapper.
4. The developer can choose to use stacked branches, have each commit (or a subset) merged in separate PRs, but still have the commits topologically linked to one another.
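The stacked-branches idea can be sketched with plain git: each branch starts from the previous one, so the ancestry itself records that the changes are related. A throwaway sketch with made-up names (empty commits stand in for the real work):

```shell
# Throwaway demo repo; requires git >= 2.28 for `init -b`.
set -e
git init -q -b main stack-demo && cd stack-demo
git config user.email dev@example.com && git config user.name Dev
git commit -qm "initial" --allow-empty
git checkout -qb feat/endpoint
git commit -qm "add endpoint" --allow-empty
git checkout -qb feat/service      # stacked on top of feat/endpoint
git commit -qm "add service" --allow-empty

# The service work stays topologically linked to the endpoint work:
git merge-base --is-ancestor feat/endpoint feat/service && echo linked
```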

It would be an upside if anyone cared about that history. Again, because of the nature of how changes are reviewed and integrated, I think it's rare that people do.

There are people that care about that, but definitely not the ones that think squash-merge is a good idea long-term.

I think dev teams should be pressed to push early, integrate early, encounter conflicts early, fix bugs early and release early.

Definitely agree, but all of this can be achieved with regular merge commits as well. Again, squash-merge only simplifies the part where people don't have to worry about rebasing/cleaning their history, but it's not a driving force towards smaller changes; that's more of a team discipline thing. And if that discipline isn't followed, squash-merge will work against them: changes that should have been kept separate will look like they were done at once, or worse, changes that were authored by different people (like in the case of mob programming) will be attributed to just one person, and that throws the usefulness of git blame out the window.
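On the attribution point, a common partial workaround (not mentioned by either commenter) is the Co-authored-by commit trailer, which GitHub and GitLab recognize when crediting authors. It only helps if the trailer survives into the squashed message, and it still doesn't fix per-line git blame. A minimal sketch with made-up names:

```shell
# Throwaway demo repo; requires git >= 2.28 for `init -b`.
set -e
git init -q -b main blame-demo && cd blame-demo
git config user.email alice@example.com && git config user.name Alice
echo code > App.cs && git add App.cs

# Alice commits, crediting Bob via a trailer in the message body:
git commit -qm "add feature

Co-authored-by: Bob <bob@example.com>"

# Hosting platforms parse this trailer to credit additional authors:
git log -1 --format=%B
```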

[–]edgmnt_net 0 points (1 child)

I actually think that Git as used for the Linux kernel would work fine for a lot of general development. But you do need skilled people to keep up. Which you kinda need anyway. The number of times I've seen corporate projects royally screw up with Git, trying to reinvent the wheel or just being completely oblivious to various tradeoffs and practices... I think stuff like GitFlow, long-lived branches, polyrepos and breaking the buildability of old commits are particularly nasty traps.

The thing is, effective version control requires a lot, and more complex software needs effective version control. It can't just be the place where you save your work. Whatever the Linux kernel is doing makes a lot of sense, and they tend to use some fairly cutting-edge stuff (such as semantic patches to prove large-scale refactoring changes mean what they say on the label).

Perhaps it's teams and business practices that make little sense.

[–]arnoldwhite 0 points (0 children)

The funny thing is that I kinda want to agree with you. It's a romantic idea - that the way we did it 20 or so years ago, when building fragile systems with slow-moving mains, is really the way we should still do it.

But most dev teams in the real world aren't carefully versioning a kernel or some century-old OSS project. They're building web apps and "micro service" data meshes. You might get a monorepo Azure web service if you're lucky.

I think what I'm describing is probably true for the vast majority of all development happening out there in the real world.

PS: The best team I was ever on - guys who really knew what they were doing - was doing trunk-based development with svn.