Dissolving the Monolith: Patterns
The monolith is not bad per se. Here are the benefits of the monolithic architecture:
- simple to develop
- easy to make radical changes to the application
- straightforward to test
- straightforward to deploy
- easy to scale
On the contrary, these are some of its biggest limitations:
- complexity
- slow development
- the path to deployment is arduous
- delivering a reliable monolith is hard
- lock-in to an (increasingly obsolete) tech stack
To escape monolithic hell, you need to migrate to a different architecture. One of the newer architecture styles is the microservice architecture. Here are the benefits of using this style:
- it enables continuous delivery and deployment of large, complex applications
- services are small and easily maintained
- services are independently deployable
- it enables teams to be autonomous
- it allows easy experimenting and adoption of new technologies
- it has better fault isolation
But it does not come without drawbacks. These are some of them:
- finding the right set of services is challenging
- distributed systems are complex (this makes development, testing and deployment difficult)
- deploying features that span multiple services requires careful coordination
- deciding when to adopt this kind of architecture is difficult
That being said, you must keep in mind that the microservice architecture is not a silver bullet. Depending on the situation, it is sometimes better to go with a different architecture style (even the monolithic one).
When migrating to microservices, or when creating or integrating a new service, you should always consult best practices. Just as the Gang of Four patterns exist for object-oriented software development, there are also patterns for the microservice architecture. The patterns fall into several groups depending on the problem that needs to be solved:
- application patterns — solve problems faced by developers
- application infrastructure patterns — cover infrastructure issues that also impact development
- infrastructure patterns — solve problems that are mostly infrastructure issues outside of development
The first step in defining an application's architecture is to define the system operations. Start from the system requirements and sketch out a high-level domain model. Once this is done, identify the requests the application must handle. There are two types of operations: commands (create, update, delete) and queries (read). Lastly, identify the services.
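To make the split concrete, here is a minimal sketch (using a hypothetical Order domain that is not part of the original text) of how the identified system operations can be catalogued as commands and queries before they are assigned to services:

```java
import java.util.List;

// Hypothetical Order domain: system operations split into commands, which
// create/update/delete state, and queries, which only read it. Each operation
// is later assigned to the service that owns the relevant data.
interface OrderCommands {
    long createOrder(long customerId, List<String> items); // command: creates an Order
    void cancelOrder(long orderId);                        // command: updates an Order
}

interface OrderQueries {
    String findOrderById(long orderId);                    // query: reads data, no side effects
}
```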
If we are breaking the monolith apart, there are two decomposition patterns:
- decompose by business capability
- decompose by subdomain (DDD decomposition)
Keep in mind that these strategies can also be used when you are starting development of a new microservice-based application.
On the surface decomposition looks straightforward; however, you may encounter several obstacles along the way:
- network latency
- reduced availability (due to synchronous communication)
- maintaining data consistency across services
- obtaining a consistent view of the data
- god classes preventing decomposition
There are many different IPC (interprocess communication) technologies to choose from. Services can communicate synchronously using request/response-based mechanisms (e.g. HTTP REST or gRPC) or asynchronously (e.g. AMQP or STOMP). Additionally, interaction between services can be one-to-one or one-to-many. During communication services exchange messages, which can come in various formats: text-based (JSON, XML) or binary (Protocol Buffers).
Let's look at the trade-offs of the most commonly used communication types:
REST
pros:
- simple and familiar
- easy to test (browser, Postman, curl)
- directly supports request/response communication style
- HTTP is firewall friendly
- doesn't require an intermediate broker (simplifies architecture)
cons:
- only supports request/response communication style
- reduced availability
- clients must know exact URL of the service
- fetching multiple resources in a request is challenging
- can be difficult to map multiple update operations to HTTP verbs
gRPC
pros:
- straightforward to design an API that has a rich set of update operations
- has an efficient, compact IPC mechanism, especially when exchanging large messages
- bidirectional streaming
- enables interoperability between clients and servers
cons:
- more work for JavaScript clients to consume a gRPC-based API
- older firewalls might not support HTTP/2
Broker-based messaging
pros:
- loose coupling
- message buffering
- flexible communication
- explicit interprocess communication
cons:
- potential performance bottleneck
- potential single point of failure
- additional operational complexity
In order to prevent failures from cascading through the system, services that use synchronous communication must be designed to handle partial failures (the invoked service being down or exhibiting high latency). To do so, a service should use timeouts, limit the number of outstanding requests, and apply the circuit breaker pattern. Each of these techniques can be used independently or in combination.
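As an illustration, here is a minimal, hand-rolled circuit breaker sketch (illustrative only; in practice you would typically reach for an existing resilience library such as Resilience4j). After a configurable number of consecutive failures the circuit opens and calls fail fast until a cool-down period has elapsed:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Simplified circuit breaker: open after N consecutive failures, then fail fast
// for a cool-down period so a struggling downstream service can recover.
class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final Duration openDuration;
    private int consecutiveFailures = 0;
    private Instant openedAt = null;

    SimpleCircuitBreaker(int failureThreshold, Duration openDuration) {
        this.failureThreshold = failureThreshold;
        this.openDuration = openDuration;
    }

    synchronized <T> T call(Supplier<T> remoteCall) {
        if (openedAt != null && Instant.now().isBefore(openedAt.plus(openDuration))) {
            throw new IllegalStateException("Circuit open: failing fast");
        }
        try {
            T result = remoteCall.get();   // the actual remote invocation (should itself use a timeout)
            consecutiveFailures = 0;       // a success closes the circuit again
            openedAt = null;
            return result;
        } catch (RuntimeException e) {
            if (++consecutiveFailures >= failureThreshold) {
                openedAt = Instant.now();  // too many failures: open the circuit
            }
            throw e;
        }
    }
}
```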
Whenever synchronous communication is used, a service discovery mechanism should be in place to enable clients to determine the network location of a service instance. This mechanism can be implemented either by the deployment platform (server side) or in the application itself (client side). The simplest approach is to rely on the deployment platform (the server-side discovery and 3rd party registration patterns), but it is not overly complex to implement it on the application side (the client-side discovery and self-registration patterns).
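A toy, in-memory sketch of the application-side approach (names are assumptions; a real registry would be a separate, highly available service or the platform's DNS): instances self-register their location, and clients look one up before each call.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ThreadLocalRandom;

// In-memory service registry: self-registration plus client-side discovery
// with naive random load balancing across the known instances.
class ServiceRegistry {
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();

    void register(String serviceName, String url) {   // self-registration pattern
        instances.computeIfAbsent(serviceName, k -> new CopyOnWriteArrayList<>()).add(url);
    }

    String lookup(String serviceName) {                // client-side discovery pattern
        List<String> urls = instances.get(serviceName);
        if (urls == null || urls.isEmpty()) {
            throw new IllegalStateException("No instance of " + serviceName);
        }
        return urls.get(ThreadLocalRandom.current().nextInt(urls.size()));
    }
}
```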
There are several impediments to using asynchronous communication with messaging. One challenge is scaling out competing consumers while preserving message ordering. To overcome this you can use sharded (partitioned) channels (Apache Kafka and AWS Kinesis provide them out of the box). Another challenge is handling duplicate messages. Most message brokers promise to deliver a message at least once, because guaranteeing exactly-once delivery is too costly. The ways to handle duplicate messages are:
- idempotent message handlers
- track messages and discard duplicates
But the key challenge when using asynchronous communication is atomically updating the database and publishing a message. A good solution is the transactional outbox pattern (write the message to the database as part of the same transaction). The process that retrieves the message can then use either the polling publisher pattern or the transaction log tailing pattern.
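Here is a minimal transactional outbox sketch using plain JDBC (the ORDERS and OUTBOX table names and columns are assumptions). The business change and the outgoing message are written in the same local transaction; a separate relay later publishes rows from the OUTBOX table to the broker.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Transactional outbox sketch: the order row and the message row are committed
// atomically, so the event is never lost and never published for a rolled-back change.
class OrderService {
    void createOrder(Connection conn, long orderId, String payloadJson) throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement insertOrder = conn.prepareStatement(
                 "INSERT INTO ORDERS (id, state) VALUES (?, 'PENDING')");
             PreparedStatement insertOutbox = conn.prepareStatement(
                 "INSERT INTO OUTBOX (aggregate_id, event_type, payload) VALUES (?, 'OrderCreated', ?)")) {
            insertOrder.setLong(1, orderId);
            insertOrder.executeUpdate();
            insertOutbox.setLong(1, orderId);
            insertOutbox.setString(2, payloadJson);
            insertOutbox.executeUpdate();
            conn.commit();                 // both rows become visible atomically
        } catch (SQLException e) {
            conn.rollback();               // neither the order nor the message is persisted
            throw e;
        }
    }
}
```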
In monolithic systems every request is typically executed within a single database transaction. Transaction management is more challenging in complex monolithic applications that use multiple databases and/or message brokers. And in a microservice architecture, transactions span multiple services, each with its own database. In this situation we need a specific mechanism to manage distributed transactions.
Traditional XA/2PC-based distributed transactions aren't a good fit for modern applications. A better approach is the saga pattern. A saga is a sequence of local transactions coordinated using messages, where each local transaction belongs to a specific service. If a step of the saga fails, compensating transactions must be executed to explicitly undo the changes made so far.
There are two types of sagas: choreography-based and orchestration-based. In a choreography-based saga, each local transaction publishes events that trigger other participants to execute their own local transactions; decision making and sequencing are distributed among the participants. This approach works better for simple sagas. In an orchestration-based saga, a centralized orchestrator sends command messages to the participants. Modeling the saga orchestrator as a state machine drastically simplifies its development. This type is usually better for complex sagas.
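A small sketch of an orchestration-based saga modeled as a state machine (a hypothetical Create Order saga; participant names, commands and replies are assumptions): the orchestrator reacts to each reply by sending either the next command or a compensating one.

```java
// States of a hypothetical Create Order saga orchestrator.
enum CreateOrderSagaState { RESERVING_CREDIT, APPROVING_ORDER, REJECTING_ORDER, COMPLETED, FAILED }

class CreateOrderSagaOrchestrator {
    private CreateOrderSagaState state = CreateOrderSagaState.RESERVING_CREDIT;

    // Called when a reply message arrives from a saga participant.
    void onReply(String replyType) {
        switch (state) {
            case RESERVING_CREDIT:
                if ("CreditReserved".equals(replyType)) {
                    send("orderService", "ApproveOrder");          // next local transaction
                    state = CreateOrderSagaState.APPROVING_ORDER;
                } else {                                           // e.g. "CreditLimitExceeded"
                    send("orderService", "RejectOrder");           // compensating transaction
                    state = CreateOrderSagaState.REJECTING_ORDER;
                }
                break;
            case APPROVING_ORDER:
                state = CreateOrderSagaState.COMPLETED;
                break;
            case REJECTING_ORDER:
                state = CreateOrderSagaState.FAILED;
                break;
            default:
                break;                                             // terminal states ignore further replies
        }
    }

    private void send(String participant, String command) {
        // In a real system this would publish a command message to the participant's channel.
        System.out.println("send " + command + " to " + participant);
    }
}
```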
Let’s look at the trade-offs between these two types of sagas.
Choreography based saga
pros:
- simple to develop
- provides loose coupling
cons:
- more difficult to understand
- can introduce cyclic dependencies between services
- risks of tight coupling
Orchestration based saga
pros:
- simpler dependencies (no cyclic dependencies)
- less coupling
- improves separation of concerns and simplifies the business logic
cons:
- risk of centralizing too much business logic in the orchestrator
Designing sagas can be challenging because, unlike ACID transactions, sagas aren't isolated from one another. The anomalies this can introduce are lost updates, dirty reads and non-repeatable reads. To reduce the impact of this lack of isolation, one or more of the following countermeasures need to be implemented:
- semantic lock: set a flag on any record that the saga creates or alters. The flag indicates that the record is not yet committed and could still change (see the sketch after this list)
- commutative updates: make saga operations commutative to eliminate lost updates
- pessimistic view: reorder the steps of a saga to minimize the business risk caused by dirty reads
- reread value: reread a record, verify that it is unchanged and only then update it. This prevents lost updates
- version file: record the operations performed on a record so that non-commutative operations can be turned into commutative ones
- by value: select the concurrency mechanism based on business risk, using the properties of each request to decide between sagas and distributed transactions
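As an example of the first countermeasure, here is a semantic lock sketch (hypothetical Order record and state names): while the saga that created the record is still running, the record stays in a *_PENDING state and other transactions treat it as locked.

```java
// Semantic lock countermeasure: APPROVAL_PENDING marks a record that a running
// saga has created but not yet committed; other sagas must wait or fail.
enum OrderState { APPROVAL_PENDING, APPROVED, REJECTED }

class Order {
    private OrderState state = OrderState.APPROVAL_PENDING;   // semantic lock set on creation

    void approve() {                                           // the saga completed successfully
        if (state != OrderState.APPROVAL_PENDING) throw new IllegalStateException();
        state = OrderState.APPROVED;                           // lock released
    }

    void cancel() {                                            // requested by some other transaction
        if (state == OrderState.APPROVAL_PENDING) {            // record is "locked" by a running saga
            throw new IllegalStateException("Order is pending; retry later");
        }
        state = OrderState.REJECTED;
    }
}
```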
The procedural Transaction script pattern is a good way to implement simple business logic: the logic is organized as a collection of procedural transaction scripts, one for each type of request. Don't be ashamed to use procedural programming where it fits.
Complex business logic is a totally different story. It is a really challenging task to implement in a microservice architecture, where the logic spans multiple services. The best approach here is the object-oriented Domain model pattern, in which the business logic is organized as an object model consisting of classes that have both state and behavior.
A good way to organize a service's business logic is to group it into a collection of DDD aggregates. Each aggregate is a graph of objects that is treated as a unit. DDD aggregates are quite useful because they modularize the domain model, eliminate the possibility of object references crossing service boundaries and ensure that each ACID transaction is executed within a single service.
Identifying aggregates is the key (and most complex) task in the process, as it requires designing the domain model as a set of aggregates and identifying their boundaries and roots. A few rules ensure that an aggregate is a self-contained unit that can enforce its invariants:
- reference only the aggregate root
- inter-aggregate references must use primary keys
- one transaction creates or updates one aggregate
An aggregate should publish domain events when it is created or updated. Domain event subscribers can notify users and other applications, and publish WebSocket messages to users' browsers.
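Here is an illustrative aggregate sketch (a hypothetical Order aggregate root): it references other aggregates only by primary key, enforces its invariants internally, and records domain events that the service publishes after the transaction commits.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Order aggregate root: invariants are enforced inside the aggregate,
// other aggregates are referenced by primary key, and state changes record domain events.
class OrderAggregate {
    private final long id;                                   // aggregate root identity
    private final long customerId;                           // reference to another aggregate by primary key
    private String state = "CREATED";
    private final List<String> pendingEvents = new ArrayList<>();

    OrderAggregate(long id, long customerId) {
        this.id = id;
        this.customerId = customerId;
        pendingEvents.add("OrderCreated");                   // domain event recorded on creation
    }

    void approve() {
        if (!state.equals("CREATED")) throw new IllegalStateException("invariant violated");
        state = "APPROVED";
        pendingEvents.add("OrderApproved");                  // domain event recorded on update
    }

    List<String> drainPendingEvents() {                      // published by the service after the transaction
        List<String> events = new ArrayList<>(pendingEvents);
        pendingEvents.clear();
        return events;
    }
}
```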
The traditional approach to persistence maps classes to database tables (using an ORM). Composite entities (entities with children) are usually mapped to two different tables, so persisting a parent entity together with its children persists rows in both tables. This approach is familiar and works well, and most enterprise applications store data this way, but it has several limitations:
- object-relational impedance mismatch
- lack of aggregate history
- auditing is tedious and error prone
- event publishing is bound to the business logic
As always, there is a solution for this problem: event sourcing.
Event sourcing is a way of structuring the business logic and persisting aggregates. The aggregate is persisted as a sequence of events, where each event represents either the creation or a state change of the aggregate. An application can recreate the current state of the aggregate by replaying the events. Event sourcing preserves the history of domain objects, provides an accurate audit log and reliably publishes domain events.
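A minimal event sourcing sketch (hypothetical Account aggregate and event names): the aggregate is stored as a sequence of events, and its current state is recreated by replaying them in order.

```java
import java.util.List;

// A persisted event: its type and the amount it carries (hypothetical names).
record AccountEvent(String type, long amount) {}

// Event-sourced Account aggregate: state is derived entirely from its event history.
class Account {
    private long balance = 0;

    private void apply(AccountEvent event) {                  // each event is a state transition
        switch (event.type()) {
            case "MoneyDeposited" -> balance += event.amount();
            case "MoneyWithdrawn" -> balance -= event.amount();
            default -> throw new IllegalArgumentException("Unknown event: " + event.type());
        }
    }

    static Account recreateFrom(List<AccountEvent> history) { // replay the event stream
        Account account = new Account();
        history.forEach(account::apply);
        return account;
    }

    long balance() { return balance; }
}
```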
During the application's lifecycle there can be more than one request simultaneously updating the same aggregate. With the traditional persistence approach, optimistic locking is used to prevent one transaction from overwriting another's changes, and an event sourcing-based implementation can use optimistic locking in the same way, based on the version of the aggregate's event stream.
To improve performance of long-lived aggregates, we can use snapshots to reduce the number of events which need to be replayed.
Most services consume messages from other services, so it's important to build idempotent message consumers. Event sourcing-based business logic must implement the same idempotent behavior; the concrete implementation depends on whether the event store uses an RDBMS or a NoSQL database. Events are stored in an event store, which is a hybrid of a database and a message broker: once a service saves an event in the event store, the store automatically delivers it to subscribers.
One of the challenges with the event sourcing approach is the evolution of events: the application must be capable of handling multiple event versions when replaying them. A good solution is upcasting, which upgrades events to the latest version as they are loaded from the event store. The other tricky problem is data deletion; encryption and pseudonymization must be used to comply with regulations such as the GDPR.
Event sourcing is a simple way to implement choreography-based sagas: services have event handlers that listen for the events published by event sourcing-based aggregates. This approach is also a good way to implement orchestration-based sagas.
Let's compare the pros and cons of event sourcing:
pros:
- reliably publishes domain events
- preserves the history of aggregates
- (mostly) avoids O/R impedance mismatch problem
- provides a time machine to developers
cons:
- different programming model with a learning curve
- evolving events can be challenging
- deleting data can be tricky
- querying the event store is challenging
In a monolithic application querying for data is an easy task, as the data resides in one place. In contrast, in a microservice architecture each service owns its own data, and querying is challenging because the data is scattered. There are two patterns for implementing query operations:
- API composition pattern: gathers data from multiple services. A composer invokes the services that own the data and combines the results (see the sketch after this list). This is the simplest approach and should be used whenever possible
- Command query responsibility segregation (CQRS) pattern: maintains one or more view databases whose sole purpose is to support queries. It is more powerful than the API composition pattern, but also more complex.
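Here is an illustrative API composer sketch (service URLs and endpoints are assumptions): the composer calls the services that own the data and combines their responses into a single result for the client.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// API composition sketch: fetch partial results from the owning services
// and stitch them together into one response document.
class FindOrderComposer {
    private final HttpClient http = HttpClient.newHttpClient();

    String findOrderDetails(long orderId) throws Exception {
        String order    = get("http://order-service/orders/" + orderId);
        String delivery = get("http://delivery-service/deliveries?orderId=" + orderId);
        String ticket   = get("http://kitchen-service/tickets?orderId=" + orderId);
        // Combine the partial results into a single JSON document for the client.
        return "{ \"order\": " + order + ", \"delivery\": " + delivery + ", \"ticket\": " + ticket + " }";
    }

    private String get(String url) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```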
Each of the above-mentioned patterns has its own issues that should be taken into consideration:
API composition pattern:
- which component should be API composer?
- how to write efficient aggregation logic?
- increases overhead
- introduces risk of reduced availability
- lack of transactional data consistency
CQRS pattern:
- more complex architecture
- dealing with replication lags
- dealing with concurrent updates as well as detecting and dismissing duplicate events
- handling the eventual consistency of CQRS views (often pushed to the client)
All that being said, you will choose the appropriate pattern depending on the problem at hand. The API composition pattern should be the first choice whenever possible due to its simplicity, while CQRS is the more robust solution (efficient implementation of queries that span services, efficient implementation of diverse queries, improved separation of concerns…).
Designing an external API is challenging when there is more than one client, as different client types require different data. Consequently, exposing a one-size-fits-all API doesn't make sense.
There are numerous drawbacks to a one-size-fits-all API; let's mention some of them:
- poor user experience due to numerous requests (mobile)
- lack of encapsulation requires frontend developers to change their code in parallel with backend developers (mobile)
- services might use client-unfriendly protocols (mobile)
- network latency (web)
- firewall (web)
- API stability and maintainability for third-party consumers (web, mobile, other)
To overcome these problems, the API gateway pattern can be used. An API gateway is a service that acts as the entry point into the application from the outside world. It provides each client with a custom API and is responsible for request routing, API composition, protocol translation and edge functions (such as authentication, authorization, rate limiting, caching, metrics collection and request logging).
An API gateway has a layered, modular architecture. It can be designed as a single API gateway, or it can use the backends for frontends (BFF) pattern, which defines a separate gateway for each client type. The BFF pattern gives client teams greater autonomy, because they can develop, deploy and operate their own API gateway.
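A minimal request-routing sketch for an API gateway (paths and backend locations are assumptions); a real gateway would additionally perform API composition, protocol translation and the edge functions listed above.

```java
import java.util.Map;

// Request routing sketch: map an incoming path prefix to the backend service
// that should handle it.
class ApiGatewayRouter {
    private final Map<String, String> routes = Map.of(
        "/orders",    "http://order-service:8080",
        "/customers", "http://customer-service:8080"
    );

    String resolveBackend(String requestPath) {
        return routes.entrySet().stream()
            .filter(route -> requestPath.startsWith(route.getKey()))
            .map(Map.Entry::getValue)
            .findFirst()
            .orElseThrow(() -> new IllegalArgumentException("No route for " + requestPath));
    }
}
```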
Let's compare the pros and cons of the API gateway:
pros:
- encapsulates internal structure of the application
- clients talk to the gateway instead of invoking specific service(s)
- provides client-specific API
- simplifies client code
cons:
- another component which needs to be developed, deployed and managed
- risk of becoming a bottleneck
- must be updated every time a new API (from the internal services) needs to be exposed
There are numerous technologies that can be used to implement an API gateway: you can use an off-the-shelf product or develop your own using a framework. If you decide on a custom implementation, bear in mind the following when designing it: (a) performance and scalability, (b) writing maintainable code by using reactive programming abstractions, (c) handling partial failures and (d) being a good citizen in the microservice architecture (discovery and observability).
It is of utmost importance to have automated tests in the system; they are the key to rapid and safe delivery of software. The purpose of a test is to verify the behavior of the system under test (SUT), where "system" can refer to any software element being tested, from a class method to the entire application.
To speed up and simplify tests you can use test doubles. A test double is an object that simulates the behavior of one of the SUT's dependencies. There are two types of test doubles: stubs (which return values to the SUT) and mocks (which verify that the SUT invokes its dependencies correctly).
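A short illustrative unit test using Mockito (OrderRepository and OrderService are hypothetical): the same test double acts as a stub when it returns a canned value, and as a mock when the interaction is verified.

```java
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import org.junit.jupiter.api.Test;

// Unit test with a test double replacing the SUT's repository dependency.
class OrderServiceTest {

    interface OrderRepository { boolean save(String order); }

    static class OrderService {
        private final OrderRepository repository;
        OrderService(OrderRepository repository) { this.repository = repository; }
        boolean placeOrder(String order) { return repository.save(order); }
    }

    @Test
    void placesOrderThroughRepository() {
        OrderRepository repository = mock(OrderRepository.class);
        when(repository.save("pizza")).thenReturn(true);   // stub: returns a value to the SUT

        OrderService sut = new OrderService(repository);
        assertTrue(sut.placeOrder("pizza"));

        verify(repository).save("pizza");                  // mock: verifies the interaction
    }
}
```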
Use the test pyramid to guide where you focus your testing efforts.
Use contracts to drive the testing of interactions between services. This way we can write tests that verify that the adapters on both sides of an interaction conform to the contract. With this approach the tests run faster and don't drag in transitive dependencies.
To verify the behavior of a service via its API, component tests should be written. As mentioned earlier, for this type of test stubs should be used for the service's dependencies.
Finally, you can write end-to-end tests. To minimize the number of end-to-end tests, which are slow, brittle and time-consuming, they can be combined into user journey tests. A user journey test simulates a user's journey through the application and verifies the high-level behavior of a relatively large part of its functionality. Because fewer tests are needed, the per-test overhead (setup) is minimized, which speeds up testing. A few such tests should cover a wide range of the application's functionality.
It's not enough for a service to implement only functional requirements; it must also satisfy three critically important quality attributes: security, configurability and observability.
Security in a microservice architecture is largely the same as in a monolithic architecture. Every application developer is responsible for implementing four aspects of security: authentication, authorization, auditing and secure interprocess communication. Most of these carry over easily to a microservice architecture, but some aspects are necessarily different:
- how user identity is passed between API gateway and the services
- who is responsible for authentication and authorization
A commonly used approach is for the API gateway to authenticate clients and include a transparent token (e.g. a JWT) in each request to a service. The token contains the identity of the principal and their roles, and the services use this information to authorize access to resources. OAuth 2.0 is a good foundation for security in a microservice architecture.
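A tiny sketch of token-based authorization inside a service (role names and the method signature are assumptions): authentication has already happened at the gateway, so the service only evaluates the verified claims it received.

```java
import java.util.Set;

// Authorization check based on claims extracted from a token that the API gateway
// has already verified: the principal's identity and roles decide access.
class OrderAccessPolicy {
    boolean canCancelOrder(String principalId, Set<String> roles, String orderOwnerId) {
        boolean isOwner = principalId.equals(orderOwnerId);
        return roles.contains("ADMIN") || (roles.contains("CUSTOMER") && isOwner);
    }
}
```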
A service typically uses one or more external services (e.g. a database, a message broker…). Their locations and credentials almost always depend on the environment the service is running in. The externalized configuration pattern must be applied: a mechanism that supplies a service with its configuration properties at runtime. There are two main approaches:
- push model, where the deployment infrastructure passes the configuration properties to the service instance via environment variables or a configuration file (see the sketch after this list)
- pull model, where the service instance reads its configuration properties from a configuration server
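A small sketch of the push model (the environment variable names are assumptions): the deployment infrastructure supplies the database location and credentials, and the service reads them at startup and fails fast if something is missing.

```java
// Push-model externalized configuration: properties arrive through environment
// variables set by the deployment infrastructure, not from hard-coded values.
class DataSourceConfig {
    final String url      = requireEnv("DATABASE_URL");
    final String username = requireEnv("DATABASE_USER");
    final String password = requireEnv("DATABASE_PASSWORD");

    private static String requireEnv(String name) {
        String value = System.getenv(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing required configuration property: " + name);
        }
        return value;
    }
}
```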
Developers and operations share responsibility for implementing observability in a system. Operations is responsible for the observability infrastructure (log aggregation, metrics, exception tracking and distributed tracing), and developers are responsible for making their services observable. A service must expose a health check API endpoint, generate log entries, collect and expose metrics, report exceptions to an exception tracking service and participate in distributed tracing.
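As one concrete piece of this, here is a minimal health check endpoint sketch using the JDK's built-in HTTP server (port and path are assumptions); the deployment platform polls it to decide whether the instance can receive traffic.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Health check API endpoint: returns HTTP 200 with a small JSON body while the
// service instance considers itself able to handle requests.
class HealthCheckEndpoint {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8081), 0);
        server.createContext("/health", exchange -> {
            byte[] body = "{\"status\":\"UP\"}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (var out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
    }
}
```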
To simplify and accelerate development, you can use a microservice chassis and develop services on top of it. A chassis is a framework, or set of frameworks, that handles various cross-cutting concerns. Over time, the majority of the network-related functions of a microservice chassis will most likely migrate into a service mesh: a layer of infrastructure software through which all of a service's network traffic flows.
When it comes to deploying microservices, you should choose the most lightweight pattern that supports your service's requirements. To pick the best one, evaluate the options in the following order: serverless, containers, virtual machines and language-specific packages.
A serverless deployment is often the best option, as it eliminates the need to administer operating systems and runtimes and provides automated elastic provisioning and request-based pricing. But this deployment pattern is not a good fit for every service, because of long-tail latencies and the requirement to use an event/request-based programming model.
Docker containers are more flexible than a serverless deployment and have more predictable latency. It's best to use a container orchestration framework (e.g. Kubernetes) that manages containers on a cluster of machines. The benefits of this approach are encapsulation of the technology stack, isolated service instances and constrained service instance resources; the drawback is that you must administer the container infrastructure, the runtimes and (most likely) the VM infrastructure it all runs on.
Virtual machines are a heavyweight deployment option, which affects both deployment time and resource usage. On the other hand, modern clouds (e.g. Amazon EC2) are highly automated and provide a rich set of features, so it can be easier to deploy a simple application using VMs than to set up a container orchestration framework. The benefits of this approach are that the VM image encapsulates the technology stack, service instances are isolated and it builds on mature cloud infrastructure; the drawbacks are less efficient resource utilization, relatively slow deployments and system administration overhead.
Deploying services as language-specific packages is generally best avoided unless you only have a small number of services. This approach is most likely already used to deploy the monolithic application, but beyond that you should consider setting up a more sophisticated deployment infrastructure. The benefits of this approach are fast deployment and efficient resource utilization (when running multiple instances on the same machine or within the same process); the drawbacks are lack of encapsulation of the technology stack, no ability to constrain the resources consumed by a service instance, lack of isolation when running multiple service instances on the same machine, and the difficulty of automatically determining where to place service instances.
By using a service mesh you can deploy a service to production, test it, and only then route production traffic to it. Separating deployment from release improves the reliability of rolling out new versions of services.
Now we know which patterns to use when dissolving the monolith, but is it the right time to move to a microservice architecture? To answer that, ask whether the delivery process is slow, whether software releases are buggy, whether scalability is poor, and so on. If the answer is yes to several of these, it's the right time to move away from monolithic hell.
It's important to migrate to microservices incrementally by developing a strangler application: a new microservice-based application built around the existing monolithic application. As you make progress, demonstrating value early and often will keep the business supporting the migration effort. Changes to the monolithic application should be minimized, as they tend to be time-consuming, costly and risky, and in the end we want to migrate that functionality to microservices anyway. A full deployment infrastructure is not a prerequisite: invest as little as possible in it up front, because only once the migration is under way will it be clear what is actually needed to support the process.
There are three main strategies for strangling the monolith and incrementally replacing it with microservices:
- implement new features as services: this enables you to quickly and easily develop a feature using modern technology (or a different framework). It's a good way to demonstrate the value of migrating to microservices
- separate the presentation tier from the backend: this results in two "smaller" applications. Although it's not a huge improvement, it enables the two to be deployed independently
- break up the monolith by extracting functionality into services: here it's important to focus on the services that will provide the most benefit (e.g. the most heavily used functionality, functionality with few bugs, …)
A newly developed service almost always needs to interact with the monolith (e.g. the service needs the monolith's data and functionality, and vice versa). Developing integration glue, which consists of inbound and outbound adapters, enables this collaboration. But we must beware of domain model pollution; to prevent it, the integration glue must implement an anti-corruption layer, a layer of software that translates between the two domain models.
One way to minimize the impact of extracting monolithic functionality into a microservice is to replicate the data that was moved to the service back to the monolith's database. Because the monolith's schema remains unchanged, this eliminates the need to make potentially widespread changes to the monolith's code base.
Extracting a microservice or developing a new one can require implementing sagas in which the monolith needs to take part. It can be challenging to implement a compensatable transaction that requires widespread changes in the monolith. Consequently, be careful when extracting functionality into microservices so that you avoid having to implement compensatable transactions in the monolith. Always check whether the sequence of steps can be reordered, or whether something else makes more sense to extract first.
When refactoring to a microservice architecture you need to simultaneously support the monolithic application's existing security mechanism. A simple solution is to modify the monolith's login handler to generate a cookie containing a security token, which the API gateway then forwards to the services.
Now that we are aware of all the patterns and how to migrate to a microservice architecture, it's your turn to take this path. Bear in mind that it can be a bumpy road with lots of challenges, but it will be rewarding. Before making a definitive decision to go for a microservice architecture, stop and ask yourself: is this the best way to go, and will it accelerate the delivery of my company's software?