Limiting Pull Request Size


Code review is a great way to get quality code into the repository and to your customers. One of the mechanisms that facilitates code review is the pull request (PR). Together, they help make sure the code is well checked and that bugs are caught before they ship. Despite these benefits, one thing that defeats the purpose of a pull request is its massive size.

No one wants to look at pull requests that are too bulky, contain many changes, and do multiple unrelated things at the same time. Let's see what happens when someone on your team opens a PR that is too big to review.

Problems

  • Mental fatigue because of the breadth and depth of changes
  • Time commitment
  • Lack of context on all parts of the code for a single person
  • Possibility of bugs or bad practices falling through the cracks
  • Lack of enthusiasm for reviewing a big PR
  • The Git UI becomes slow because it cannot load all the changes at once
  • Merge conflicts, because the code may touch parts of the app that someone else is editing too
  • A risk that the big PR breaks something in production and that the breakage goes unnoticed
  • The PR going stale because few people are interested in reviewing it
  • Too many review comments and continuous back and forth

Can you do something to alleviate the PR size problem? Yes, of course. There are two ways to cut down PR size by dividing the work into smaller portions, depending on whether the changes are unrelated or related.

When the changes are unrelated

When you're working on a feature, bug fix, or refactor that involves independent changes, you can make those changes on parallel branches, each created from the same fixed parent branch (for example, main or master).

You can split your changes into separate PRs and have each one reviewed by people who are familiar with that part of the code. Since these changes are unrelated, there is less chance that they will run into conflicts.
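One rough way to find candidate splits is to group the changed files by the area of the code they belong to. The sketch below is only an illustration under a simplifying assumption: that a file's top-level directory is a reasonable stand-in for "the part of the code" a reviewer knows. The function name `group_by_area` and the file paths are hypothetical, and real changes still need a human check that the groups are truly independent.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_area(changed_files: list[str]) -> dict[str, list[str]]:
    """Group changed file paths by their top-level directory.

    Assumption: the top-level directory approximates an area of ownership.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for path in changed_files:
        parts = PurePosixPath(path).parts
        # Files at the repository root go into a catch-all "root" group.
        area = parts[0] if len(parts) > 1 else "root"
        groups[area].append(path)
    return dict(groups)


# Hypothetical list of files touched by one oversized change.
changed = [
    "billing/invoice.py",
    "billing/tests/test_invoice.py",
    "search/index.py",
    "README.md",
]

candidate_prs = group_by_area(changed)
for area, files in candidate_prs.items():
    print(f"{area}: {files}")
```

Each group is a candidate for its own branch off main and its own PR, assigned to reviewers who know that area.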

Managing PRs while doing unrelated changes

Things become a bit more complicated when you want to divide a PR into smaller segments but the changes are related. For this case, I strongly advise the stack-based approach that I used at my last job.

This approach is based on the idea that the first PR you make is branched off main/master. The second PR is based on the first PR, and so on. Since each PR is branched from the previous one, all the previous changes are automatically carried into the next PR and related changes stay in sync.

The benefits are the same as with the first approach: smaller PR size and less review overhead.
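To make the stacking idea concrete, here is a minimal sketch of how the base branch of each PR in a stack is chosen. It only models the relationship in plain Python and does not call any Git platform; the branch names and the `stack_bases` helper are hypothetical.

```python
def stack_bases(branches: list[str], trunk: str = "main") -> list[tuple[str, str]]:
    """Return (head, base) pairs for a stack of PR branches.

    The first PR targets the trunk; every later PR targets the branch
    of the PR directly before it.
    """
    bases = [trunk] + branches[:-1]
    return list(zip(branches, bases))


# Hypothetical branch names for a feature split into three related PRs.
stack = ["feature-part-1", "feature-part-2", "feature-part-3"]

pairs = stack_bases(stack)
for head, base in pairs:
    print(f"{head} -> {base}")
```

Because each PR's diff is computed against the previous branch rather than against main, reviewers see only the changes that PR adds.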

Managing PRs while doing related changes

Problems:

This approach comes with problems of its own. Let's take a look at them and see how we can alleviate some of them.

  1. PRs become difficult to track when you have a long chain of dependent PRs. One way to keep track is to start each PR title with (1/x), (2/x), (3/x), and so on, so that you know where each PR stands and can merge them in sequence.
  2. This approach is also prone to merge conflicts, which can easily cascade into the following PRs. Unfortunately, this happens irrespective of the size of each PR. If you make a change in the first PR and later modify it, the PRs built on top of it still carry the old version and will conflict. The good news? Once you resolve the conflicts for the next immediate PR, it becomes easier to fix them for all the following PRs.
  3. At my previous company, we had a tool that allowed us to merge ALL the PRs on the stack (PR-1 through PR-4) with a single click once all the prerequisites were met (code approval, a passing CI/CD pipeline, and tests). Unfortunately, not all Git platforms support this. On those platforms, even if all the PRs on the stack are approved, you still need to merge them one by one.

Each PR, irrespective of its size, should contain correct and stable code. That means it should not introduce any bugs, compiler errors, or broken unit tests. Small PRs do not mean you can break the feature. It's like dividing your work into smaller milestones and addressing them in individual PRs while making sure that each one, as a single unit, is still working code.
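The title numbering and the merge-in-sequence rule can be sketched as follows. This is an illustrative model only, not the tool mentioned above: the `StackedPR` class, its fields, and the example titles are all hypothetical, and it assumes a PR is ready once it is approved and its CI has passed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StackedPR:
    title: str
    approved: bool = False
    ci_passed: bool = False
    merged: bool = False


def numbered_titles(titles: list[str]) -> list[str]:
    """Prefix each title with its position in the stack, e.g. "(2/3)"."""
    total = len(titles)
    return [f"({i}/{total}) {title}" for i, title in enumerate(titles, start=1)]


def next_mergeable(stack: list[StackedPR]) -> Optional[int]:
    """Return the index of the next PR to merge, or None.

    PRs merge strictly in order, so only the first unmerged PR is a
    candidate, and only if it is approved and its CI has passed.
    """
    for index, pr in enumerate(stack):
        if pr.merged:
            continue
        return index if pr.approved and pr.ci_passed else None
    return None  # every PR in the stack is already merged


titles = numbered_titles(["Add schema", "Add API endpoint", "Wire up UI"])
prs = [
    StackedPR(titles[0], approved=True, ci_passed=True, merged=True),
    StackedPR(titles[1], approved=True, ci_passed=True),
    StackedPR(titles[2], approved=True, ci_passed=False),
]

print(titles)
print(next_mergeable(prs))
```

Here the second PR is next in line; the third stays blocked until its CI passes, even after the second is merged.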

Having seen two ways of reducing PR size, let's look at some of the benefits.

Benefits:

  1. Small PRs encourage faster and more thorough reviews
  2. Each part of the change can be divided into smaller PRs and can be assigned to a person or group of people with expertise in that area of code
  3. Less chance of merge conflicts due to small size
  4. People are more willing to review the code because of its small size
  5. In case things go wrong, you only revert the small commit that broke the feature without affecting the rest of the change
  6. With small changes, you don't feel intimidated to write unit tests. You can also do exhaustive testing for any change you make
  7. Improved developer velocity

Pro tip 1: You can also experiment with how review wait time varies by PR size. For example, big PRs tend to take longer to review, while smaller PRs are often reviewed within a few hours.

Pro tip 2: To keep track of PR sizes, you can use a bot that runs on each newly created PR and assigns a label such as xs, s, m, l, or xl based on the PR's size.
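Both tips can be prototyped in a few lines before you wire anything into your Git platform. The sketch below assigns a size label from the number of changed lines and then computes the median review wait per label. The thresholds are assumptions chosen for illustration, not a standard, and the sample numbers are made-up placeholders rather than measurements; substitute data exported from your own repository.

```python
from collections import defaultdict
from statistics import median

# Hypothetical thresholds: (upper bound of changed lines, label).
SIZE_THRESHOLDS = [(10, "xs"), (50, "s"), (200, "m"), (500, "l")]


def size_label(lines_changed: int) -> str:
    """Map a PR's changed-line count to a size label."""
    for upper_bound, label in SIZE_THRESHOLDS:
        if lines_changed <= upper_bound:
            return label
    return "xl"  # anything above the largest threshold


def median_wait_by_label(prs: list[tuple[int, float]]) -> dict[str, float]:
    """Compute the median wait (in hours) per size label.

    Each entry in prs is (lines_changed, hours_until_first_review).
    """
    waits: dict[str, list[float]] = defaultdict(list)
    for lines_changed, wait_hours in prs:
        waits[size_label(lines_changed)].append(wait_hours)
    return {label: median(hours) for label, hours in waits.items()}


# Placeholder data for illustration only.
sample = [(8, 1.0), (40, 2.5), (45, 3.5), (900, 30.0)]

result = median_wait_by_label(sample)
print(result)
```

A labeling bot would call something like `size_label` on each new PR, and the per-label medians give you a simple way to check whether large PRs really wait longer on your team.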

Summary

I hope this post was useful in highlighting the importance of small PRs. We all strive for better code, excellent organization, fewer bugs, and faster feature releases. Small PRs help us achieve these goals by encouraging thorough reviews and faster response times. Large PRs cause mental fatigue and prolong the work because no one is excited to look at them.

If your team is facing problems due to PR size, I hope these tips for reducing it help. On my team, I always strive for better and more detailed code reviews while making sure reviewers also enjoy the process of sharing knowledge with PR authors without extra burden.

Does your team perform code reviews? What problems do you face on a day-to-day basis? Have you ever run into the problem of large PRs? If so, how did you tackle it? I would love to hear your thoughts and experiences on this topic.


If you have any thoughts, concerns, or feedback to share, you can reach me on Twitter.