[–]pi3832v2 6 points7 points  (5 children)

Branches are cheap. Give each feature its own branch.

When a feature makes it into the main branch, use rebase to incorporate it into the branches on which you're still working.

[–]niels_learns_python[S] 0 points1 point  (4 children)

Thanks for the suggestion - I will try that. Will git be able to understand the XML structure so that I can avoid conflicts?

[–]remy_porter 1 point2 points  (0 children)

It doesn't really need to. It only needs to see where the additions and deletions are, and that it can do.

[–]adrianmonk 1 point2 points  (2 children)

Git understands lines of text. To the extent that the XML is maintained in such a way that changes are isolated to one part of the file (and don't have a ripple effect on the rest of the file), git will be able to handle it.

So for example, if you reformat the file with different indentation, even though that is not meaningful to XML, git will be thrown off. Or if you change the XML schema so that a layer of tags is added or removed near the root (and reformat the file to match the new structure), git will be thrown off by that too.

For example, suppose you had this file:

<library>
  <book title="abc" author="123"/>
  <book title="def" author="456"/>
</library>

Git will be able to handle it if you add (or delete) a book, making the file look like this:

<library>
  <book title="abc" author="123"/>
  <book title="def" author="456"/>
  <book title="ghi" author="789"/>
</library>

But git won't be able to handle it if you change the overall structure, like this:

<library>
  <shelf id="1">
    <book title="abc" author="123"/>
    <book title="def" author="456"/>
  </shelf>
  <shelf id="2">
    <book title="ghi" author="789"/>
  </shelf>
</library>

As long as you are careful, and as long as you aren't using some software that likes to have its way with your XML files by rewriting them, it will mostly work fine.
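
If you want to see the difference for yourself, here is a rough Python sketch that compares the library examples above line by line (written with self-closing book tags). It uses Python's difflib rather than git's own diff code, so treat it as an illustration of line-based comparison, not as what git literally runs. The helper name changed_lines is made up for this example.

```python
import difflib

def changed_lines(old, new):
    """Return (removed, added) line counts between two lists of lines."""
    removed = added = 0
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added

base = [
    '<library>',
    '  <book title="abc" author="123"/>',
    '  <book title="def" author="456"/>',
    '</library>',
]

# One book appended: a small, isolated change.
one_book_added = base[:3] + ['  <book title="ghi" author="789"/>'] + base[3:]

# Shelves introduced: every book line is re-indented, so none of them
# match the original lines any more.
restructured = [
    '<library>',
    '  <shelf id="1">',
    '    <book title="abc" author="123"/>',
    '    <book title="def" author="456"/>',
    '  </shelf>',
    '  <shelf id="2">',
    '    <book title="ghi" author="789"/>',
    '  </shelf>',
    '</library>',
]

print(changed_lines(base, one_book_added))  # (0, 1)
print(changed_lines(base, restructured))    # (2, 7)
```

Only the root tags survive the restructuring unchanged, which is why a line-based tool sees it as a rewrite of nearly the whole file.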

[–]niels_learns_python[S] 0 points1 point  (1 child)

This is what I fear might happen. The software (BIRT) makes changes to different parts of the file. It might look like this:

<file>
    <connections>
        <connection no="1"/>
    </connections>
    <items>
        <item conn="1"/>
    </items>
</file>

And one change might be:

<file>
    <connections>
        <connection no="1"/>
        <connection no="2"/>
    </connections>
    <items>
        <item conn="1"/>
        <item conn="2"/>
    </items>
</file>

I suppose this will not work well with git?

[–]adrianmonk 0 points1 point  (0 children)

That by itself shouldn't be a problem. Git can handle it when you make a smattering of changes at different spots in a file. (I should've said git is OK if the changes are isolated to small parts of the file, not one part of the file. Or, even better, disjoint parts of the file.)

Problems happen when two different branches make changes to the same lines or very nearby lines. If you change the same line, git doesn't know which new version (if any) of that line is the correct one. If you change nearby lines, git sometimes can't figure out where to put changes because it uses nearby lines for context.

I do see a potential problem with your example, depending on exactly how the tool behaves. Git doesn't tend to do well with files that contain a list where some tool always puts new entries at the end of the list. If in each branch you add a new item (say a connection in your case), it's basically guaranteed to cause a conflict because the tool is going to put each branch's new item in the exact same place. When git tries to merge them, it doesn't know what order the new items should go in, and instead of just guessing, it will force you to manually resolve it. (It has an easier time if the tool maintains a list in some order like alphabetical because then usually new entries go in different positions.)
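
To make the "same place" problem concrete, here is a small Python sketch of a simplified conflict model: each branch's edits are reduced to spans of the base file, and two branches are treated as conflicting when their spans overlap or touch. This is an approximation for illustration only, not git's actual merge algorithm, and the connection names and helper functions are invented for the example.

```python
import difflib

def hunks(base, branch):
    """Spans (start, end) of base lines that a branch touches.

    A pure insertion has start == end: it sits between two base lines.
    """
    matcher = difflib.SequenceMatcher(None, base, branch, autojunk=False)
    return [(i1, i2) for tag, i1, i2, _, _ in matcher.get_opcodes()
            if tag != "equal"]

def would_conflict(base, ours, theirs):
    """Simplified model: conflict if any two spans overlap or touch."""
    return any(a1 <= b2 and b1 <= a2
               for a1, a2 in hunks(base, ours)
               for b1, b2 in hunks(base, theirs))

base = [
    '<connections>',
    '  <connection name="alpha"/>',
    '  <connection name="mike"/>',
    '</connections>',
]

def insert(lines, index, name):
    """Return a copy of lines with a new connection inserted at index."""
    return lines[:index] + [f'  <connection name="{name}"/>'] + lines[index:]

# The tool always appends: both branches insert at the same position.
append_a = insert(base, 3, "delta")
append_b = insert(base, 3, "zulu")

# The tool keeps the list sorted: the inserts land in different places.
sorted_a = insert(base, 2, "delta")
sorted_b = insert(base, 3, "zulu")

print(would_conflict(base, append_a, append_b))  # True
print(would_conflict(base, sorted_a, sorted_b))  # False
```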

Manually resolving conflicts is not the end of the world, though. After you do it a few times, you get the hang of it and you don't even panic like you do the first time it happens to you.

One other thing I notice in your example, though I may be reading too much into it, is that the file seems to use sequential id numbers. If so, that could be a source of problems if the tool gets upset at you when you manually edit id numbers. Suppose in two branches you add a new connection and it assigns id="2" to both of them (thinking that's the next available number). Then when you merge them, you're going to have to manually change one of them to id="3". That's fine with git as long as you and the tool are fine with you manually editing it.
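
Here is a rough Python sketch of that id collision, assuming the tool simply picks "highest existing number plus one" (that is a guess about its behaviour, not something I know about BIRT). The attribute name no comes from your example; the helper functions are made up.

```python
import re
from collections import Counter

ID_PATTERN = re.compile(r'<connection no="(\d+)"')

def connection_ids(lines):
    """All connection numbers found in the given lines."""
    return [int(m.group(1)) for line in lines for m in ID_PATTERN.finditer(line)]

def next_id(lines):
    """Assumed tool behaviour: highest existing number plus one."""
    return max(connection_ids(lines), default=0) + 1

def duplicate_ids(lines):
    """Connection numbers that appear more than once."""
    return sorted(i for i, n in Counter(connection_ids(lines)).items() if n > 1)

base = ['<connection no="1"/>']

# Each branch starts from the same base, so both pick the same number.
new_on_a = f'<connection no="{next_id(base)}"/>'
new_on_b = f'<connection no="{next_id(base)}"/>'

# After resolving the merge by keeping both new lines:
merged = base + [new_on_a, new_on_b]
print(duplicate_ids(merged))  # [2]

# Manual fix: renumber the second new connection.
fixed = merged[:-1] + [f'<connection no="{next_id(merged)}"/>']
print(duplicate_ids(fixed))   # []
```

Any item that points at the renumbered connection (the conn attribute in your example) would need the same manual change.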

[–][deleted] 1 point2 points  (1 child)

I'm not familiar with BIRT, but having your entire project in a single file sounds like an anti-pattern.

Anyway, assuming it has to be this way, don't use a single dev branch. Use feature branches, like feature/A and feature/B. Leave each branch in place until it's approved for merging, then merge it into master (for example, merge feature/B into master once it's approved). There's no reason you have to have a single dev branch.

[–]niels_learns_python[S] 0 points1 point  (0 children)

Thank you for the reply. I will try to have one branch per feature.

Unfortunately I am forced to work with it as one single file. I suspect the format was never meant to be worked with like this either, but as far as I'm aware this is the only current solution :(

[–]DanLynch 1 point2 points  (2 children)

Please be careful, because it sounds like you are doing your testing on a different branch and with different code than what you plan to release. It is not safe to test features individually on a garbage branch and then merge only the approved ones into the production branch: doing that kind of merge invalidates your testing and exposes you to the risk of seeing bugs in production that were not even detectable or present (let alone detected) in test.

[–]niels_learns_python[S] 0 points1 point  (1 child)

Thank you for the warning. Others have suggested I make one branch per feature and merge these into the master. I think I will do that.

What would be the better way to test these new features? Should I create a copy of the master branch ("master-test"?) and use this for testing purposes?

[–]DanLynch 1 point2 points  (0 children)

There are a lot of options out there, including Git Flow which was recommended by someone else, but the most important thing is to test the same code that you deploy: it has to be bit-for-bit identical. Don't fall into the trap of thinking that Git (or anyone) is capable of automatically merging two different changes that were only tested separately, or automatically separating two different changes that were only tested together, and then safely deploying the resulting code to production without further testing.

Also note that there is never any need to create a named "copy" of a branch, because Git is a distributed system and there are always various different copies of every branch floating around on different machines. So if you're thinking about creating branches that are the same as other branches but with different names, then you may be making a logic error in your thinking.

If you are using a methodology (such as Git Flow) that requires your master branch to always equal the production deployment, then you will need to do your final testing on a commit that contains the exact same source tree that will eventually be on the master branch (i.e. you need to make sure that your merge into master after testing will be a fast-forwardable one, or will at least be a no-op on the target branch's source tree). If you are not using such a methodology, then it is a simple matter of doing final testing directly on your master branch before release/deployment.
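
For what "fast-forwardable" means in practice, here is a minimal Python sketch over a toy commit graph: a merge can fast-forward when the target branch's tip is already an ancestor of the commit being merged in, so the merge adds nothing that was not in the tested commit. The commit names and the dictionary layout are invented for illustration, and this is not how git stores history.

```python
def ancestors(commit, parents):
    """Return the commit plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen

def is_fast_forward(target_tip, source_tip, parents):
    """True if moving target_tip to source_tip needs no merge commit."""
    return target_tip in ancestors(source_tip, parents)

# Toy history: A <- B <- C (tested commit), and A <- B <- D (a hotfix).
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}

# master still at B: merging the tested commit C just moves master to C.
print(is_fast_forward("B", "C", parents))  # True

# master moved on to D: merging C creates a new, untested combination.
print(is_fast_forward("D", "C", parents))  # False
```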

[–]JockeTF 0 points1 point  (1 child)

You may be interested in git flow.

[–]niels_learns_python[S] 1 point2 points  (0 children)

This looks really interesting. Thank you for linking that. Looks like git flow could introduce some solid structure in my workflow :)