What is a Merge Conflict? Understanding the Difference Between Semantic and Code Conflicts

Modern software development teams have been using Version Control Systems (VCS) and Continuous Integration (CI) pipelines for ages. Using a VCS to grow a codebase has become essential: it helps us contribute, discuss changes, understand the evolution of the code, and even revert mistakes. In turn, the CI guarantees that the code is tested before it is integrated. However, in this never-ending ballet of new contributions, conflicts will be frequent, and resolving them is part of the developer's job.

This article will differentiate two types of conflicts. First, it will start with the basic notions of a Merge Conflict in Git. Second, it will deep-dive into the other, often overlooked type of conflict, which arises when an added contribution breaks the code unexpectedly. That conflict may be called a "Semantic Conflict" because, when it happens, the change does not integrate properly in its context and the code does not make sense anymore. Finally, the article will give you some recommendations to mitigate the first type, resolve the second, and explain what is being explored at Mergify to go further than that.

The Modern Developer's Landscape

To set the scene, let us just remind you what a standard developer's environment would look like today. Consider team Back-End, which is made up of a couple of developers working on a GitHub Repository to build the API of their SaaS.

Developers regularly push their feature branches in Pull Requests targeting the main branch, as part of their Trunk-Based Development strategy. They set up a CI/CD pipeline that runs their unit tests, some code linters, and a security scanner, among other things.

Developers will review each other's code and wait for the green light (all pipeline steps have passed) to merge the different Pull Requests. That is when the first form of conflict may happen.

Understanding Git Merge Conflicts AKA Code Conflicts

Consider Developers A and B, working respectively on feature branches featureA and featureB (both created from main) and embarking on a new day of coding their awesome SaaS. They both submitted their changes in Pull Requests (PRs) targeting main in the morning. Developer A is lucky because their PR got merged before noon. Developer B, who was already working on other matters in the meantime, came back in the afternoon and noticed that the GitHub interface signals a conflict in the PR.

At Mergify we like to ping the author and label a conflict automatically to make it stand out

One of the scenarios that might have happened here is that they have modified the same lines of code in the same file. To understand why this happens, let's delve into a brief explanation of how Git works. Git is based on commits, each of which represents a snapshot of the code at the time it is made. When our two developers started their work, they both created their own branch from the same commit, known as the common ancestor (i.e., the starting point was the same code before diverging). Poor Git attempts to merge featureB but notices that, since the branch was created from main, there have been alterations coming from featureA to some of the lines that Developer B is also modifying in their feature branch...

Should Developer B integrate the changes of Developer A? Or rather overwrite them? Well, Git is a robot. It has no clue what is happening in the changes and does not know their value for the product. But Git and GitHub are well designed, and that's why they will detect that conflict and notify us humans. To sum up, it is entirely up to the developers to decide how to "resolve" the Merge Conflict. On that topic, Mergify already published a great article on resolving a merge conflict as well as another one leveraging Mergify to draw the developer's attention to the problematic PRs and treat the merge conflicts more efficiently.
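To make the role of the common ancestor more concrete, here is a minimal sketch of a line-by-line three-way comparison. It is not how Git is actually implemented: it assumes all three versions have the same number of lines (edits only, no insertions or deletions), and the function name `three_way_merge` and the sample file contents are made up for illustration.

```python
def three_way_merge(base, ours, theirs):
    """Merge two edited versions of a file against their common ancestor.

    Simplifying assumption: all three versions have the same number of
    lines (edits only). Real Git handles insertions and deletions too.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles same-length versions")
    merged, conflicts = [], []
    for index, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (or neither touched the line)
            merged.append(o)
        elif o == b:      # only "theirs" changed the line
            merged.append(t)
        elif t == b:      # only "ours" changed the line
            merged.append(o)
        else:             # both changed the same line differently
            merged.append(None)
            conflicts.append(index)
    return merged, conflicts


# Hypothetical file contents, one list item per line
base = ["timeout = 30", "retries = 3", "debug = False"]       # common ancestor
feature_a = ["timeout = 60", "retries = 3", "debug = False"]  # already in main
feature_b = ["timeout = 45", "retries = 5", "debug = False"]  # still in its PR

merged, conflicts = three_way_merge(base, feature_a, feature_b)
print(merged)     # [None, 'retries = 5', 'debug = False']
print(conflicts)  # [0]
```

The second line merges cleanly because only one side touched it. The first line was changed by both sides, so the sketch cannot pick a winner and reports it, which is the point where a human has to step in.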

The Semantic Conflicts

Another very well-known conflict is the Semantic Conflict, as described years ago by Martin Fowler. In this situation, a breaking change makes its way to production without anyone noticing it. This kind of conflict is time-consuming to track down and can have bad consequences for your business. Unfortunately, it is often accepted by a lot of teams as being a part of the job, although solutions exist nowadays, as Mr. Fowler expected when he wrote "Maybe some day tools will be able to tackle some of them"!

Let's talk about Developers A and B again to tell more about this sad story. New day of coding, new features X and Y. Like the other day, featureX and featureY PRs are submitted, and they both have a green light from the CI. Let us take a practical example:

# ** An original function in branch main (the common ancestor) **

def remove_money(account_id, amount):
    """
    Removes the specified amount of money from the account.
    """
    # Implementation to remove money from the account
    print(f"Removing {amount} from account {account_id}")


# ** The change in featureX **

def remove_money(account_id, amount, currency='USD'):
    """
    Removes the specified amount of money from the account in the specified currency.
    """
    # Modified implementation to handle currency
    print(f"Removing {amount} {currency} from account {account_id}")


# ** The change in featureY **

def collect_mensual_fees_in_yuan(account_id):
    """
    Function in featureY that uses the remove_money function from main.
    """
    amount = 1000
    # The developer in featureY is unaware of the change in featureX
    # They continue to use the remove_money function as it was originally defined in main
    remove_money(account_id, amount)

# get_account_ids() is assumed to be defined elsewhere in the codebase
for account_id in get_account_ids():
    collect_mensual_fees_in_yuan(account_id)

featureX is merged, and this time, no merge conflict is detected. However, the featureY PR is put aside for a moment (which can be relatively short, depending on the activity on the repository) because there are discussions about other hot topics in the development of the SaaS. Unfortunately, featureX and a few other PRs were merged in the meantime, modifying some files and part of the code logic on which featureY depends. At some point, the team merges featureY without further ado. After all, everything is green.

The change is integrated and deployed by the team's Continuous Delivery pipeline. Bang! The team now has to face a production incident, because the key feature modified by the featureY branch severely degrades the way things work on the SaaS. In our example, the fees used to be collected in yuan, but they are now collected in USD by default. The fintech collected an amount which is about 7 times higher than expected at current exchange rates!
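The incident can be reproduced with a small sketch of what main looks like once both branches are merged. Two things here are assumptions made for illustration: `remove_money` returns the operation instead of printing it so that it can be inspected, and the exchange rate constant is a placeholder rather than a real quote.

```python
def remove_money(account_id, amount, currency="USD"):
    """featureX version, as it exists on main after the merge.

    Returning the operation (instead of printing) is an assumption
    made for this sketch so the result can be inspected.
    """
    return {"account_id": account_id, "amount": amount, "currency": currency}


def collect_mensual_fees_in_yuan(account_id):
    """featureY version, written against the old two-argument signature."""
    amount = 1000
    return remove_money(account_id, amount)


# Placeholder rate for illustration only, not a real quote
ASSUMED_CNY_PER_USD = 7.2

operation = collect_mensual_fees_in_yuan("acc-1")  # "acc-1" is a made-up id
print(operation["currency"])  # USD

# How much more was charged than the 1000 yuan that was intended
charged_in_cny = operation["amount"] * ASSUMED_CNY_PER_USD
overcharge_factor = charged_in_cny / 1000
print(overcharge_factor)
```

Nothing crashes and no exception is raised: the call is still valid Python, which is why neither Git nor a CI run on the outdated branch complains. Only a test that checks the currency, executed against the combined code, would flag the problem.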

This scenario highlights the importance of having the changes in feature branches sit on top of an up-to-date main branch and be re-tested there. The go-to solution to make sure this is the case consists of rebasing the feature branch onto main or merging main into the feature branch. The topic and the conceptual differences between those strategies have already been discussed in this article. At that point, the CI must absolutely be re-triggered just before merging to main, thus catching the "hidden" bugs (provided that the software is well-tested!).
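One way to picture this rule is as a freshness check on the CI result: a green light only counts if it was obtained on top of the current tip of main. The sketch below is a simplified model, not GitHub's or Mergify's API; `CiRun`, `safe_to_merge`, and the commit identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CiRun:
    """Hypothetical record of one CI run on a pull request."""
    passed: bool
    tested_on_main_sha: str  # commit of main the branch sat on when tested


def safe_to_merge(ci_run: CiRun, current_main_sha: str) -> bool:
    """A green CI only counts if it ran on top of the current tip of main."""
    return ci_run.passed and ci_run.tested_on_main_sha == current_main_sha


# featureY was tested when main pointed at the made-up commit "abc123"
feature_y_run = CiRun(passed=True, tested_on_main_sha="abc123")

print(safe_to_merge(feature_y_run, current_main_sha="abc123"))  # True
print(safe_to_merge(feature_y_run, current_main_sha="def456"))  # False
```

In the second call, main has moved (featureX was merged), so the earlier green result is stale and the branch needs to be updated and re-tested before merging.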

Conclusion

As explored in the article, the two types of conflicts arise from simultaneous code modifications and the nuanced complexities of merging divergent branches.

The first form of conflict, namely the merge conflict (or code conflict), is something that the developer has to deal with. Fortunately, this form of conflict is predictable and shows up before any really bad merge breaks your software. It is time-consuming though, and it increases the time to market of a feature. A few tips to avoid it as much as possible: keep your changes as small as possible, communicate with your team about the changes you undertake, and adopt common formatting for the team.

The second form of conflict, the semantic conflict, is related to the Continuous Integration of changes and how things usually work in this area. Even with the most efficient CI system, its green light on a Pull Request can be deceptive when incompatible changes have been introduced before the PR is effectively merged. Without an update and a CI re-run, production may break. Mergify offers an amazing solution to tackle this problem with our Merge Queue. There is no need to bother about keeping your feature branch up-to-date: Mergify will do it for you before any merge, thus letting the CI catch the potential error.
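The general idea behind a merge queue can be sketched in a few lines. This is a toy model under strong assumptions, not Mergify's implementation: changes are represented by plain names, "updating the branch" is modeled as appending the change to the current main, and `process_merge_queue` and `run_ci` are hypothetical helpers.

```python
from collections import deque


def process_merge_queue(main, queued_prs, run_ci):
    """Merge queued PRs one at a time, re-testing each on the latest main.

    main:       list of change names already merged
    queued_prs: change names waiting to be merged, in order
    run_ci:     callable taking the candidate list of changes and
                returning True if the tests pass
    """
    queue = deque(queued_prs)
    rejected = []
    while queue:
        pr = queue.popleft()
        candidate = main + [pr]   # the PR updated on top of the current main
        if run_ci(candidate):
            main = candidate      # merged: the next PR is tested against this
        else:
            rejected.append(pr)   # main is left untouched
    return main, rejected


def run_ci(changes):
    """Toy CI: featureY is only broken when combined with featureX."""
    return not {"featureX", "featureY"}.issubset(changes)


main, rejected = process_merge_queue(["initial"], ["featureX", "featureY"], run_ci)
print(main)      # ['initial', 'featureX']
print(rejected)  # ['featureY']
```

Because featureY is tested on top of a main that already contains featureX, the incompatibility surfaces in CI instead of in production.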

In this ever-evolving world of code, where collaboration is the key to success, having a deep understanding of conflicts and the methods to solve or mitigate them is very important to keep working safely and efficiently. This entry-level article hopefully gave you the keys to a better understanding of the problem and a few interesting hints to improve your work. In the next article of this series, I will detail a technical challenge taken on by Mergify to automatically resolve the first form of conflict mentioned (namely the Merge Conflict), in a particular use case that arises weekly in our repositories.