How to choose the right code repository solution
As a developer, you have many options when choosing the development strategy, methodologies, and frameworks for your next project. The same is true of how you store your code. Should you use a monorepo, which stores code for many projects within a single repository? Or would it be better for you and your team to store code in separate, isolated repositories?
In this blog, you will learn about monorepos, polyrepos, and what you and your team should consider when choosing which is right for your next project.
What is a monorepo and a polyrepo?
A monorepo is a software repository that contains multiple software components, applications, or services. The alternative approach to a monorepo is called a polyrepo (also known as many-repo or multi-repo), where each component, application, or service is in a separate isolated repository. So, how do you decide which one to use?
Choosing between the two
There are many advantages and disadvantages to each method, but we’ll save that for a later discussion. If given a choice between a monorepo and a polyrepo, I believe the following questions reign supreme: is it more important to integrate with code from the same enterprise stored in the same repo? Or is it more important to integrate with code outside of the enterprise, which is stored in separate repos?
We can approach these questions in a number of ways. To begin, figure out whether it is easier to integrate code from inside or outside the same repo. If you are developing in a monorepo and code is written according to the same standards, integration can occur seamlessly. Still, when planning to build a monorepo, you need to think about how much code you want to integrate together. For example, you might decide to put the code of all application developers writing Java in one monorepo and the code of all data scientists writing Python in a second repository. Or, it might make sense to encourage deep collaboration within a business unit or organization and have only one monorepo for that unit. At the largest scale, a single monorepo can serve an entire enterprise. Bundling multiple services that are written in the same language for the same organization might be a good starting point for trying a monorepo.
In Figure 1, you can see that the separation between a monorepo and a polyrepo is technically made based on having one or more than one project in the same repo. But, practically, the separation between the two can be blurred, yielding multiple combinations of polyrepos and monorepos for your organization.
Figure 1. A continuum between a polyrepo and a monorepo
A monorepo also brings a few more benefits:
- Faster time-to-market based on increased code reuse
- Higher-quality code
- Improved security through full openness of code and common tooling or standards
- A fluid workforce that can easily move from project to project
By contrast, if you’re considering using a polyrepo, the first external integration (for example, pulling in an external library) is usually trivial. Each additional integration can get progressively more difficult as conflicts can occur. For example, the first integration might dictate the use of token-based authentication. A token has no value on its own, but when it is issued and verified by the right system it becomes a critical piece in securing your application. Token-based authentication requires each request to a server to be accompanied by a signed token that the server can verify.
However, the second integration might only support password authentication. So, what next? Do you build different authentication layers? Do you regress the token-based authentication to use passwords as a lowest common denominator? Do you accept that your code is fragmented due to its integration into two different systems? With so many outcomes that create more questions than answers, choosing a polyrepo might leave you more confused than not.
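To make the "different authentication layers" option concrete, here is a minimal Python sketch. It assumes the two integrations can be hidden behind one shared interface, so application code never needs to know which scheme is in use. Every name in it (Authenticator, TokenAuthenticator, PasswordAuthenticator, call_integration) is hypothetical and invented for illustration. The token is a bare HMAC signature over the user name, which is far simpler than what a real token scheme would carry (claims, expiry, and so on).

```python
import hashlib
import hmac
from typing import Protocol


class Authenticator(Protocol):
    """Hypothetical common interface that application code depends on."""

    def authenticate(self, principal: str, credential: str) -> bool: ...


class TokenAuthenticator:
    """First integration: verifies an HMAC-SHA256 signature over the principal."""

    def __init__(self, secret: bytes) -> None:
        self._secret = secret

    def issue(self, principal: str) -> str:
        # Sign the principal name with the shared secret.
        return hmac.new(self._secret, principal.encode(), hashlib.sha256).hexdigest()

    def authenticate(self, principal: str, credential: str) -> bool:
        expected = self.issue(principal).encode()
        # compare_digest avoids leaking information through timing.
        return hmac.compare_digest(expected, credential.encode())


class PasswordAuthenticator:
    """Second integration: checks a password against a salted PBKDF2 hash."""

    def __init__(self, salt: bytes, iterations: int = 100_000) -> None:
        self._salt = salt
        self._iterations = iterations
        self._hashes: dict[str, bytes] = {}

    def _hash(self, password: str) -> bytes:
        return hashlib.pbkdf2_hmac(
            "sha256", password.encode(), self._salt, self._iterations
        )

    def register(self, principal: str, password: str) -> None:
        self._hashes[principal] = self._hash(password)

    def authenticate(self, principal: str, credential: str) -> bool:
        stored = self._hashes.get(principal)
        if stored is None:
            return False
        return hmac.compare_digest(stored, self._hash(credential))


def call_integration(auth: Authenticator, principal: str, credential: str) -> str:
    """Application code sees only the shared interface, not the scheme behind it."""
    return "accepted" if auth.authenticate(principal, credential) else "rejected"
```

The trade-off in this sketch is the one raised above: the adapter layer keeps both integrations working, but it is extra code that someone must own and keep consistent across repositories.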
Although integration can be a little more difficult, there are still a few benefits that come with choosing a polyrepo, which include:
- Better support from existing DevOps tooling
- Looser coupling between teams
- More experimentation around tooling and processes
- A closer match to Agile squad design
What’s your end goal?
When you are choosing between a monorepo and a polyrepo, you might also want to consider your purpose and mission. Is the goal of your team to quickly build business value (regardless of the technologies that are used) or is it to build technical value as a differentiator in the marketplace? In the former case, I suggest you use polyrepos to quickly build isolated business value, generating applications that use the most readily available tools. If it is the latter case, it might be worthwhile to invest the time in building your own technology for integration into offerings with competitive advantages.
Assessing your team’s abilities
It’s also important to consider the confidence you have in your team’s ability to deliver a solution. One could look at externally developed software and think that it is beyond their own team’s capabilities to replicate. Others can look at externally developed software and instantly identify its limitations, using that insight to formulate the ways that a homegrown solution can leapfrog over the alternatives. A monorepo, in this case, might serve you well by helping you develop a solution that is customized for your enterprise and your unique environment.
Hopefully you now have a better understanding of the monorepo and polyrepo approaches and recognize that there are many factors you and your team should consider when choosing between code storage strategies.