Speculative Checks and Batch: Under the Hood

If this title does not ring a bell, you might need to read first what a merge queue is, what problem speculative checks solve, and how mixing speculative checks and batching can save you a lot of time.

Go ahead. We'll wait.

Now that you have an idea of what we're going to talk about, we will dive into how Mergify solves this whole pull request check problem internally.

πŸ“œ Checking Hypothesis

To understand the problem, we need to lay the current situation out. We have a base branch that works fine 🀞 and a few pull requests placed in a merge queue, ready to be merged.

We want to make sure they all work with one another and don't break the base branch's CI once merged. Those pull requests are not, by definition, up to date with each other since they are independent.

Now, let's say we have three pull requests ready to be merged, named PR1, PR2 and PR3. To speculatively test the future of the main branch, the merge queue needs to check that all those combinations work:

  1. latest commit of main + PR1
  2. latest commit of main + PR1 + PR2
  3. latest commit of main + PR1 + PR2 + PR3

All that needs to be prepared for each of these cases is a new Git branch including all those commits. We then need the continuous integration system to be triggered on these test branches and get the result to determine what works and what does not.

βš™οΈ Triggering CIs

There are multiple options to trigger a CI, and they all depend on which CI is being used. In Mergify's cases, the supported CI list is infinite, and being compatible with all of them is our ultimate goal.

Let's dig a bit into the different options.

Option 1: Ask users to configure their CIs tooling to run automatically on branches created by Mergify and report the branches' status to GitHub.

Pros:

  • This is a solution used by some popular basic merge queue bots (e.g., borg-ng).

Cons:

  • There's no easy way to find the CI reports on the GitHub user interface;
  • Additional CI configuration needs to be done;
  • The required CI list from the branch protections configuration can't be used.

Option 2: Open pull requests with those hypothetical future branches.

Pros:

  • No additional CI configuration;
  • GitHub branch protection works out of the box;
  • Merge queue status/result can be reported on the pull request and is easy to find.

Cons:

  • Opens extra pull requests.

Mergify chose option 2 and mitigated its sole downside by clearly signaling the extra pull requests (mark as a draft to avoid notifications, title prefixed with an understandable message, etc.). If you ever used our speculative checks feature, you have noticed it.

Mergify Running Speculative Checks

🧱 Where the problems begin

While it seems straightforward to implement, multiple corner cases need to be handled.

πŸ‘‡ Manual Merges

First, even if the CI reports success on a temporary pull request, that does not mean Mergify can merge the pull requests. Indeed, the base branch may have moved in between the time the queue checks were scheduled and when they finished, as users are still free to merge a pull request manually. In such a case, the queue needs to be reset and recomputed, and the temporary branches and pull requests re-created from scratch.

To ensure this integrity, Mergify stores the base commit SHA1 used to create a temporary pull request. Mergify has multiple event-based mechanisms to be notified if it ever changes.

πŸ€¦β€β™‚οΈ CI failing

Another possible scenario is the CIs failing for PR2. In that case, the speculated branch is no longer valid as Mergify will not merge PR2. All speculated branches that include PR2 need to be rebuilt without PR2 being in the queue.

PR #2 does not work. It's going to be removed from the queue.

Everything before PR2 is valid, though, and PR1 can be merged as long as it passes the CI. Therefore, Mergify only needs to discard PR2 from the queue, marking it as failed, and create a new speculative branch: latest commit of main + PR1 + PR3.

PR #2 has been removed and the queue continues to be processed.

πŸ§‘β€πŸš’ Hotfixes

Mergify's merge queues also handle pull request priorities. When a high-priority pull request is queued, the regular merge queue gets interrupted, and Mergify schedules and checks the hotfix.

Imagine that a pull request PR4 enters the merge queue and is placed between PR1 and PR2 (the former having a higher priority than PR4). In that case, the behavior resembles what would happen in a CI failure case: the speculation testings of PR2 and PR3 are no longer valid and need to be rescheduled.

πŸš†CI failure and batch

Another optimization that Mergify can make to increase the merge queue throughput is merging pull requests by batches. If you have three pull requests in queue and are happy to merge with a batch of 3, Mergify can test the combination of those three pull requests together and merge at the same time if they pass.

As we already explained in our blog post about the RCV theorem, this is a trade-off between velocity and consistency.

Illustration of the RCV theorem.

The tricky case with batching is what to do when the CI reports a failure. If you're testing a combination of latest commit of main + PR1 + PR2 + PR3 and it fails. What should be your following action?

Dropping the three pull requests from the queue is the most straightforward idea but a bit harsh. They might not be all broken. The ultimate goal would be to know which one is broken or incompatible with the others.

A naive but straightforward option would be to fall back to a model where they get tested one by one. That would also be suboptimal and CI-time-consuming.

The best option is to use an n-ary search algorithm, where n is the number of speculative checks configured, plus 1.

If you're not using speculative checks, Mergify will split the batch in half (n=2) and test the batch using a simple dichotomy method.

If you have eight pull requests in your batch and the number of speculative checks set to 3, in case of a batch failing, Β Mergify will use a quaternary search (n=4) and be able to test all the following combinations:

  • latest commit of main + PR1 + PR2
  • latest commit of main + PR1 + PR2 + PR3 + PR4
  • latest commit of main + PR1 + PR2 + PR3 + PR4 + PR5 + PR6

(The latest combination being the original failing one, latest commit of main + all PR up to PR8, which we know fails already).

Once all the tests are run, depending on the result, Mergify will continue iterating over the remaining valid batches and split them until it finds the failing pull requests. It'll then merge what is passing and reschedule testing of what needs to be retested.

🀏 A Tiny Optimization

There is a small optimization that Mergify does and is worth mentioning. When Mergify selects the first pull request in the queue, it may get merged right away. Indeed, when the pull request is already up-to-date with its base branch and the CI already run, all the conditions to be merged match.

In that case, Mergify will not create a temporary pull request but instead will merge the pull request right away to save CI time.

➑️ What's next?

As the number of pull requests grows in the queue, the change of functional incompatibility between grows exponentially. The current algorithm used in merge queues helps find those early in the process and way before reaching the base branch, avoiding bad deployments.

If a lot of CI time is available, it would be possible to build multiple trees of what the future of the base branch could become. Doing so, once a pull request is deemed broken, every edge containing the guilty commits could be entirely dropped, while the rest of the tree could still be tested and potentially merged.

That would allow Mergify to guarantee pull request merge time, even in the case of failure. Again, this would be a trade-off between cost and velocity, in favor of the latter.

Who knows, that might be a feature available soon enough. Stay tuned!