The goal of this post is to share why teams I worked with used pull requests and how this was practically organized. Please note that I’m strongly in favor of using trunk-based development over pull requests, and I collected references to get you started in another post. But I still share experience, knowing it’s not a perfect example, as it might provide inspiration on the advantages and disadvantages of such an approach.
The context
The context of this experience is a multi-team scrum setting with cross functional teams working on a single midsize product, within a mid-sized company that had little prior experience with developing software. All code for this application was situated in a single GIT repository. We started off with 4 developers from the same team working on this repository (team of 6). Within the course of the next year the development capacity grew to 6 teams, from which 16 developers spread over 4 teams contributing to the same application.
What is a pull request?
For people who do not know what a pull request is, let’s briefly explain.
Writing code works best if you do that in small steps. To ensure nothing gets lost, all changes are stored in a code history system (in this case git). In this context a commit is a set of consistent changes submitted to the history.
To ensure that several developers can work fluently on the same system, teams often make use of branches. You could understand a branch with a copy of the original, where you can work on a specific aspect (a feature or fix) until it is finished. Once finished, it must be merged back in the current working version.
A pull requests is a mechanism for a developer to notify team members that a feature or fix, developed on a separate branch, is ready. This lets everybody involved know that they can review the code, providing a forum discussing the implementation of the proposed feature. If there are any problems with the changes, teammates can post feedback in the pull request and even tweak the feature by pushing follow-up commits. If the feature is deemed good enough by your colleagues they will give their approval and the feature can be merged back in the integrated working version (often called the develop branch).
More details on how to work with pull requests can be found here
https://www.atlassian.com/git/tutorials/making-a-pull-request.
Why pull requests?
Within the teams, the level of experience of the developers was very mixed: ranging from people who with a lot of business knowledge who hardly touched a line of code before (let alone in the language we used) to well seasoned developers.
Before starting to look we had a lot of issues. So many things went wrong, that it was rather exceptional that the application under development actually worked for a few hours and there were lots of discussions amongst the developers. There were good intentions to start with testing and some attempts were made, but it did not seem to stick and the focus was mainly on building new functionality (although that went slower and slower over time). As the situation was getting worse every day, we needed a solution that would put a quick stop to this.
We focused on solving the following underlying causes:
- A lack of knowledge of the application code: structure of the code, architecture, dependencies;
- A lack of experience in developing a mid-sized application;
- A lack of knowledge and discipline to start testing.
There are several possible solutions to such problems like: training, knowledge sharing sessions, communities of practice, architectural sessions, good documentation and tutorials, automated tests or (executable) specifications, mob programming, pair programming, code reviews, pull requests …
We settled for a mix of several solutions:
- Code reviews through pull request
- A community of practice to share knowledge and come to better technical and non-technical working agreements among the PHP developers
- Better automated testing on unit, API and functional level.
This article focuses on code reviews with pull requests only. If you want to read about the other techniques we used, feel free to reach out on twitter.
Yeah, but why pull requests?
We actually experimented with several techniques. We settled for the one the team felt most comfortable with and had experience in, and for which there were clear advocates willing to spend a lot of effort to make this work.
Personally I would have liked pair programming or mob programming even more, and I realised the drawback of pull requests also. Of course I attempted to convince people and to show the differences, but as I promote self-management and self-steering it is important to put your money where your mouth is. So I did the only sensible thing, put my ego aside and supported setting up what the team came up with.
Working agreements around pull requests
Below are the concrete agreements we made around pull requests, as they evolved over the course of about 1 year. The text below is a nearly identical copy of what is on the wiki, slightly edited to remove some very company-specific terms.
Before making a pull request
We expect the following to be done before / while making the pull request
- Perform a functional test yourself
- Run all the automated tests
- Provide a description on what you did and how to functionally test it
- Ensure that there is a ticket number in your branch name (for everything!).
- Please do not bypass our code standard scanner (PHPCS).
Once a pull requests is made
The teams I’m coaching use bitbucket to store their code, and thus we also use the system of bitbucket to do pull requests. This system has support to automatically trigger a test run on a buildserver like Jenkins. We run the following automated checks on every pull request:
- style checks (phpcs)
- unit tests, API tests, functional tests
- the sonar rules
Bitbucket nicely indicates on a pull request if the tests are still running, have failed or passed. It was also configured in a way that the code cannot be merged until all tests have run successfully.
Before merge
On the pull request, an in-depth code review and test needs to happen. The following working agreements are in place for that:
- There must be a +1 on functional tests (meaning at least one person must indicate that he performed a manual functional test).
- There must be at least two approvals by other developers (also enforced by bitbucket configuration) after they reviewed the code.
- Look if comments are still open in the pull request (even with two approvals), and process them nicely.
- Ensure that there is a ticket number in the branch name.
Experiences: a bumpy road
Iteration 1: Cold turkey?
We decided to go cold turkey and follow the above working agreements strictly from the beginning. At the time of introducing the working agreements, tests could not yet automatically run so an additional rule was that the tests had to be run manually (+1 on run tests).
The effects were astonishing:
- The number of new bugs dropped
- The number of stories delivered also dropped drastically :-). It typically took a few days before a story was reviewed, and the cycle of reviewing and fixing often crossed the border of a week. It took a while for everyone to get used to this, but it eventually improved again.
- The number of discussions between developers went through the roof. People were clearly not used to give and receive feedback. It also highlighted a lot of other problems like inter-team disagreements, a lack of clear technical vision or architecture, a lack of technical design discussions, a lack of knowledge on testing, …
- The working agreements were sometimes interpreted so strictly that there was no place for sensible exceptions and common sense. Due to the long timings, the high number of discussions and lack of common sense for exceptions, some people started to search for shortcuts and did not follow the process anymore.
To fix the second point above we agreed that everyone would do several pull request reviews a day, and that we would stop starting new stuff. Before picking up a new story, work in progress should be finished first! We also focussed on getting the pull requests smaller.
To fix the third point we set up a Community of Practice (CoP) to allow resolving discussions faster, which started to work after a while. We also tried to reboot the architecture Community of Practice and more design discussions, both did not really work (for reasons not so relevant for this article).
To fix the fourth point (and some of the third) we slightly altered our approach as indicated in iteration two.
Iteration 2: Stricter but more relaxed.
Several brainstorm sessions happened, and we came up with the following solution. A group of representatives from the developers (called quality assurance (QA) leads) would take stronger ownership and:
- Assist their team members in following the working agreements, help to remove technical hurdles, unclarities, impediments;
- Perform the second review on a pull request, to ensure the bar was set high enough;
- Discuss and resolve open issues or exceptions amongst each other.
So the following pull request agreement was added:
-
One of the approvals must come from a PHP QA person.
- The QA cannot approve unless they sees that all three parts (approval of code, manual functional test and automated tests) have been run successfully. Only then they knows the pull request has quality assured.
The effects:
- Quality improved even more.
- Architectural changes and test automation were applied more consistently.
- Discussion issues were resolved faster, but there were still a lot of fundamental disagreements and heavy standoffs between developers. Often, people were really convinced of their truth with little respect for other people’s opinions, leading to all kinds of problems.
The QA people became a bottleneck, preventing a fast cycle time for pull request.
Iteration 3: Stabilization
As knowledge of the team started to align and the Community of Practice started to work better, the need for the QA check was reduced. So the QA leads decided themselves to remove the QA lead role and scratch the working agreement a QA approval. As far as I remember, this took about 2.5 months (5 sprints).
Conclusions
Let’s make up the balance sheet after about 1,5 years. The things that worked out really well:
- Introducing pull requests can be a bumpy road, as it makes certain problems and disagreements more transparent. The discussions that happen on pull requests are probably one of the best indicators about the health of your teams. Pull request itself do not help to increase trust or collaboration, it even seem to encourage the reverse.
- Introducing pull requests drastically increased our quality, in combination with the additional discipline pull-requests brought to actually implement good automated tests. As quality improved, the team also gets more trust from stakeholders and thus more time to fix even more quality issues.
- Pull requests help for knowledge sharing and to align parts of the design. A lot of people know more parts of the code than before, and learned some tricks from each other.
- (Some) people learned to slice problems into smaller chunks, as the review of a pull request becomes a lot harder if it is larger. In the beginning a pull request contained a whole sprint of work from a team member or even more, after a few months this was reduced to an average of about 2 days or less. This also led to smaller stories and a more predictable delivery.
The challenges we are still facing:
-
Pull-requests do not help to solve technical dept or design problems. They come too late in the process to fix more fundamental problems, as the functionality is already built and there is a tendency to not fundamentally change it anymore. This leads to a lot of discussions, standoffs between developers, pushes from the product owner to get the damm thing that works merged and by doing that technical debt is introduced in the code. Other techniques are needed for this, i.e. healthy design discussions, pairing, mobbing or something alike.
-
Developing a story has a higher cycle time and delays integrations, as an additional queue (the waiting pull requests) is introduced. Developers also have to do a lot of context switches due to several iterations of working on something else while waiting for a review and the fix the remarks. In this specific context it actually payed of in combination with introducing structured unit testing, but the tradeoff might not always be positive.
Many thanks to Jeremy, Alexander and Erik for the help on improving this article, and special thanks to Jeremy for proving the illustrations. This blogpost is an improved version that originally appeared on https://co-learning.be/.