Building early microservice boundaries that you won't regret later
Over the years I’ve seen developer enthusiasm for microservices rise and fall. Some developers become staunch advocates of microservice architecture, while others become disillusioned with distributed systems and now favor monolithic architecture.
My personal opinion on microservices and monoliths is still similar to my statement from 2017:
Too many startups (the HN crowd) are just rewriting back and forth between bad monolith and bad microservices without ever turning either into a clean implementation, because high startup engineer turnover makes rewriting easier than refactoring existing cruft into something good
— Nathan Peck (@nathankpeck) December 11, 2017
With that said I want to share my main technique for building early microservice boundaries that you won’t regret later on. If you follow this strategy then not only can you feel good about your microservices, but also when a future developer inherits these microservices from you, perhaps they won’t feel the need to rewrite everything.
I’m going to assume that you are familiar with what microservices are, and some of the patterns and challenges of distributed systems. It is also important to note that I’m writing this from the perspective of a team that is early in the engineering effort and still doing research on how they want to design their system. This isn’t advice for the large organization with hundreds of developers working on an already established project. Some aspects of this can apply to large companies as well, but those companies also have their own unique factors to consider. I write this primarily for the small team of 10-20 developers that wants to build something new and big, and build it well, so that it lasts.
What is the purpose of this microservice boundary?
The most important technique for defining the boundaries between your microservices is to determine what the purpose of the boundary is. Dividing your software into microservices without a legitimate reason for doing so is guaranteed to lead to regrets later on.
Starting from this line of thinking often leads to advice like “just start with a monolith and split out into microservices later on”. I don’t agree with that. Starting out with a monolith is not necessarily going to minimize your future regrets. In fact it is much more likely to lead to some very frustrating regrets.
To explain what I mean let’s look at a few examples and apply this approach of identifying the purpose of the microservice boundary.
Example One: Authentication microservice
Authentication is a problem that almost every piece of software needs to solve. It also has a few unique characteristics that make it an ideal target for splitting out into a day one microservice, rather than including it inside a monolith.
The first factor that makes authentication unique is passwords. If you are accepting passwords you had better be hashing them before storing them. Password hashing is a very different type of workload than anything else that your software needs to do, and that is by design. It is designed to take a lot of CPU cycles in order to prevent brute force attacks on the hash, but this means a modern CPU core can typically only complete 10 or fewer authentications per second. Usually your API code is fairly light, and the heaviest operation it needs to do is serializing and deserializing JSON. A single CPU core often handles hundreds of API requests per second. So the resource requirements of authentication are radically different from the rest of your system.
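To get a feel for how different this workload is, here is a minimal sketch that hashes a password with PBKDF2 from Python’s standard library and measures how many hashes one core completes per second. The iteration count, the function names, and the choice of PBKDF2 are assumptions made for illustration; your own service may well use bcrypt, scrypt, or Argon2 with a different work factor.

```python
import hashlib
import hmac
import os
import time
from typing import Optional, Tuple

# Assumed work factor, for illustration only. Tune it on your own hardware.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) computed with PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each stored password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the candidate password and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


def hashes_per_second(samples: int = 3) -> float:
    """Rough single-core throughput of hash_password on this machine."""
    start = time.perf_counter()
    for _ in range(samples):
        hash_password("correct horse battery staple")
    elapsed = time.perf_counter() - start
    return samples / elapsed


if __name__ == "__main__":
    print(f"about {hashes_per_second():.1f} password hashes per second on one core")
```

Running this next to a quick benchmark of one of your ordinary JSON endpoints should make the gap between the two workloads visible on your own hardware.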
Additionally authentication has special needs due to the fact that it involves personally identifying information (PII). The authentication code has email addresses for sending email verifications and password resets, as well as phone numbers for sending two factor authentication SMS to the user. It is crucially important that this information be kept secure against external attacks and internal misuse. You do not want your organization to show up in the news as having leaked PII of your users, then get hit with a GDPR violation.
What is the purpose of this microservice boundary?
- If there is a burst of authentications from lots of users signing up or signing in then the authentication service can use its own pool of CPU resources that it does not share with the rest of the service. This maintains availability for existing authenticated users by preventing a monolithic “brown out” from high authentication CPU usage causing high latency in the rest of the API.
- The authentication service can be given its own database, with its own database auth, so that we can keep users’ PII separate, and safe from both external attackers and internal misuse. This protects user safety as well as the business reputation.
Example Two: Social media link preview
When you post a link on Twitter, Slack, or any number of other social sites you will see a preview card appear. The code for generating this preview card is a good thing to split out into its own microservice.
Figuring out what the preview card should contain requires your code to fetch content from an arbitrary user submitted URL, and parse that content. This is an incredibly scary thing to do. That URL may serve back a zip bomb that expands to gigabytes in size. Or maybe it serves a carefully crafted piece of HTML that crashes your parser, causes it to consume large amounts of CPU and memory, or even gives the attacker remote code execution. A user might even send URLs pointing at IP addresses inside of your private network, to try to fetch private content such as the EC2 metadata endpoint. This piece of code has to fetch potentially malicious external content, so it absolutely must be extremely isolated from the rest of your system.
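As one illustration of the zip bomb risk, here is a minimal sketch of decompressing a gzip response body with a hard cap on the output size. The function name `gunzip_capped`, the `BodyTooLarge` exception, and the limit value are hypothetical. A real fetcher would also want to cap the download size and set timeouts.

```python
import gzip
import zlib


class BodyTooLarge(Exception):
    """Raised when the decompressed body would exceed the allowed size."""


def gunzip_capped(data: bytes, limit: int) -> bytes:
    """Decompress gzip data, refusing to produce more than `limit` bytes."""
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
    # Ask for one byte more than the limit so an oversized body is detectable
    # without ever expanding the whole payload in memory.
    output = decompressor.decompress(data, limit + 1)
    if len(output) > limit:
        raise BodyTooLarge(f"decompressed body exceeds {limit} bytes")
    return output


if __name__ == "__main__":
    small = gzip.compress(b"<html><title>hello</title></html>")
    print(gunzip_capped(small, limit=1_000_000))

    bomb = gzip.compress(b"a" * 5_000_000)  # a few KB that expand to 5 MB
    try:
        gunzip_capped(bomb, limit=1_000_000)
    except BodyTooLarge as error:
        print("rejected:", error)
```

A cap like this only limits one failure mode. It does not replace isolating the service, since a parser bug can still be triggered by a small payload.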
What is the purpose of this microservice boundary?
- If this service crashes or is otherwise exploited by a malicious link then there is a limited blast radius that does not impact the rest of the system. The only thing that breaks is the link preview.
- This service can be isolated from the rest of the network so that it can’t be exploited to make requests to other private internal services.
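Network isolation can be complemented by a check inside the preview service itself. The sketch below refuses URLs whose host resolves to a private, loopback, or link-local address (the EC2 metadata endpoint at 169.254.169.254 is link-local). The function names and the injectable `resolver` parameter are assumptions for illustration. This check alone does not cover redirects or a DNS answer that changes between the check and the fetch, which is one more reason to keep the network-level isolation.

```python
import ipaddress
import socket
from typing import Callable, List
from urllib.parse import urlsplit


def is_public_ip(ip: str) -> bool:
    """True only for addresses that are not internal or special purpose."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # not a parseable IP address, so treat it as unsafe
    return not (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_multicast
        or addr.is_reserved
        or addr.is_unspecified
    )


def resolve_host(host: str) -> List[str]:
    """Resolve a hostname to the list of IP addresses it points at."""
    infos = socket.getaddrinfo(host, None)
    return sorted({info[4][0] for info in infos})


def url_is_safe_to_fetch(
    url: str, resolver: Callable[[str], List[str]] = resolve_host
) -> bool:
    """Allow only http(s) URLs whose host resolves solely to public addresses."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        addresses = resolver(parts.hostname)
    except OSError:  # includes socket.gaierror for failed lookups
        return False
    return bool(addresses) and all(is_public_ip(a) for a in addresses)


if __name__ == "__main__":
    # IP literals resolve to themselves, so no external DNS lookup is needed here.
    print(url_is_safe_to_fetch("http://169.254.169.254/latest/meta-data/"))
    print(url_is_safe_to_fetch("http://8.8.8.8/"))
```

Passing the resolver in as a parameter keeps the check easy to unit test without touching real DNS.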
Example Three: Image Resizing
Your application allows users to upload a profile image, and you need to generate a few different size variations of this image. This example has aspects that are similar to both the password hashing problem above, and the social media link preview. Resizing an image is a heavy operation that is orders of magnitude more complex than serving a simple API request. A user might upload a gigantic full resolution iPhone camera image that takes a second of CPU time to process. And don’t underestimate the danger of parsing a complex file format, where the user may be able to exploit bugs in the parsing algorithm to get remote code execution.
What is the purpose of this microservice boundary?
- If users upload a bunch of large images at the same time we can constrain the CPU consumption and prevent a service “brown out” from resource contention between the image resizing workload and the rest of the API.
- If a user uploads a malicious image file that exploits the image parser to get remote code execution there will be limited blast radius because they will only gain access to that specific service, not the entire system.
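Inside the resizing service itself, one possible way to constrain CPU consumption is to cap how many resize jobs run at once and reject the rest so that callers can retry later. The sketch below is an assumption about how that could look, not a prescribed design. `WorkLimiter` and `BusyError` are hypothetical names, and the actual resize work is represented by an arbitrary callable.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")


class BusyError(RuntimeError):
    """Raised when every resize slot is already in use."""


class WorkLimiter:
    """Allow at most `max_concurrent` jobs to run at the same time."""

    def __init__(self, max_concurrent: int) -> None:
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, job: Callable[[], T]) -> T:
        # Fail fast instead of queueing, so a burst of uploads cannot pile up
        # unbounded work behind the CPU.
        if not self._slots.acquire(blocking=False):
            raise BusyError("resize capacity exhausted, retry later")
        try:
            return job()
        finally:
            self._slots.release()  # always free the slot, even if the job fails


if __name__ == "__main__":
    limiter = WorkLimiter(max_concurrent=2)
    # The lambda stands in for a real image resize call.
    print(limiter.run(lambda: "resized avatar.png"))
```

A request handler could translate `BusyError` into an HTTP 429 or 503 response. The container or VM level CPU limits that come with running this as a separate service still do the heavier lifting.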
A microservice with a strong purpose is one you won’t regret
My goal here is not to exhaustively document every example of microservice design, but rather to provide a few examples to help you think about the purpose of microservice boundaries in your overall system. When you identify your own microservice boundary that has a strong purpose you will feel good about it both initially, and long into the future.
One word of caution: if you are in the position of designing microservice boundaries, one of the key things you need to do is document why you chose each microservice boundary. Every microservice codebase should include a README file with a clear mission statement about why this bit of code needs to be its own microservice, and what the benefits are to users of your system and the business as a whole. This will serve as a reminder to yourself, as well as a warning to others in the future who might take over maintainership. Without such documentation it is easy to drift away from the original purpose. For example, maybe someone starts adding endpoints for fetching PII out of the authentication service, not realizing that this microservice was intended to protect PII in a GDPR compliant manner. One of the fundamental purposes of the authentication microservice boundary is broken, and it becomes something you regret. Properly documenting your microservice boundaries and the “why” behind each one helps prevent this type of drift.
I firmly believe that anyone can succeed with microservice architecture as long as they approach it with the right intentions. Don’t create microservices just for the sake of having microservices. Build microservice boundaries with purpose, and you won’t regret it!
If you are interested in chatting more about microservices, or have a question for me, then please reach out to me on Twitter.