Cloud Design Patterns - Saga distributed transactions pattern
The Saga design pattern is a way to manage data consistency across microservices in distributed transaction scenarios.
A saga is a sequence of local transactions. Each transaction updates a single service and publishes a message or event to trigger the next step.
If a step fails, the saga executes compensating transactions that counteract the preceding transactions.
Reading material
- https://learn.microsoft.com/en-us/azure/architecture/reference-architectures/saga/saga
- https://microservices.io/patterns/data/saga.html
Problem
How to implement transactions that span services?
Solution
The Saga pattern provides transaction management using a sequence of local transactions. A local transaction is the atomic work effort performed by a saga participant. Each local transaction updates the database and publishes a message or event to trigger the next local transaction in the saga. If a local transaction fails, the saga executes a series of compensating transactions that undo the changes that were made by the preceding local transactions.
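The pairing of each local transaction with a compensating transaction can be sketched in a few lines of Python. This is a minimal in-process illustration, not a production implementation: the names SagaStep, run_saga, and the stock/payment steps are hypothetical, and a real saga would run each step in a separate service with its own database.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]      # the local transaction
    compensate: Callable[[], None]  # undoes the local transaction


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            # The failed step never committed, so only earlier steps are undone.
            for done in reversed(completed):
                done.compensate()
            return False
        completed.append(step)
    return True


log: List[str] = []


def fail_payment() -> None:
    # Hypothetical failure used to trigger compensation.
    raise RuntimeError("payment declined")


steps = [
    SagaStep(
        "reserve_stock",
        lambda: log.append("stock reserved"),
        lambda: log.append("stock released"),
    ),
    SagaStep(
        "charge_payment",
        fail_payment,
        lambda: log.append("payment refunded"),
    ),
]

succeeded = run_saga(steps)
```

In this run the second step fails, so only the first step's compensation executes; the failed step has nothing to undo.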
There are two common saga implementation approaches: choreography and orchestration. Each has its own challenges and its own technologies for coordinating the workflow.
Choreography
https://learn.microsoft.com/en-us/azure/architecture/reference-architectures/saga/saga#choreography
Choreography is a way to coordinate sagas where participants exchange events without a centralized point of control. With choreography, each local transaction publishes domain events that trigger local transactions in other services.
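As a rough sketch of choreography, the example below wires two hypothetical services together through an in-process event bus. The EventBus class, the event names (OrderCreated, PaymentCompleted, PaymentFailed), and the CREDIT_LIMIT value are all assumptions made for illustration; a real system would use a message broker and separate processes.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Event = Dict[str, Any]
Handler = Callable[[Event], None]


class EventBus:
    """Hypothetical in-process stand-in for a message broker."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self.history: List[str] = []

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, event: Event) -> None:
        self.history.append(event_type)
        for handler in list(self._handlers[event_type]):
            handler(event)


bus = EventBus()
orders: Dict[str, str] = {}
CREDIT_LIMIT = 100  # assumed business rule for the example


def create_order(order_id: str, amount: int) -> None:
    # Order service: local transaction, then a domain event.
    orders[order_id] = "PENDING"
    bus.publish("OrderCreated", {"order_id": order_id, "amount": amount})


def on_order_created(event: Event) -> None:
    # Payment service: reacts to the event with its own local transaction.
    if event["amount"] <= CREDIT_LIMIT:
        bus.publish("PaymentCompleted", {"order_id": event["order_id"]})
    else:
        bus.publish("PaymentFailed", {"order_id": event["order_id"]})


def on_payment_completed(event: Event) -> None:
    orders[event["order_id"]] = "APPROVED"


def on_payment_failed(event: Event) -> None:
    # Order service: compensating transaction for the pending order.
    orders[event["order_id"]] = "CANCELLED"


bus.subscribe("OrderCreated", on_order_created)
bus.subscribe("PaymentCompleted", on_payment_completed)
bus.subscribe("PaymentFailed", on_payment_failed)

create_order("o-1", 40)
create_order("o-2", 500)
```

No component here knows the whole workflow: each handler only knows which events it consumes and which it emits, which is the defining trait of choreography.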
Orchestration
https://learn.microsoft.com/en-us/azure/architecture/reference-architectures/saga/saga#orchestration
Orchestration is a way to coordinate sagas where a centralized controller tells the saga participants which local transactions to execute. The saga orchestrator handles all the transactions and tells the participants which operation to perform based on events. The orchestrator executes saga requests, stores and interprets the state of each task, and handles failure recovery with compensating transactions.
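A minimal sketch of an orchestrator, assuming synchronous in-process participants, might look like the following. SagaOrchestrator, FakeParticipant, StepState, and the inventory/payment/shipping names are hypothetical; a real orchestrator would persist its state and talk to remote services through commands and replies.

```python
from enum import Enum
from typing import Dict, List, Protocol, Sequence


class StepState(Enum):
    PENDING = "pending"
    COMPLETED = "completed"
    FAILED = "failed"
    COMPENSATED = "compensated"


class Participant(Protocol):
    name: str

    def execute(self, order_id: str) -> bool: ...

    def compensate(self, order_id: str) -> None: ...


class FakeParticipant:
    """Hypothetical participant; a real one would be a remote service."""

    def __init__(self, name: str, succeeds: bool = True) -> None:
        self.name = name
        self.succeeds = succeeds
        self.calls: List[str] = []

    def execute(self, order_id: str) -> bool:
        self.calls.append(f"execute:{order_id}")
        return self.succeeds

    def compensate(self, order_id: str) -> None:
        self.calls.append(f"compensate:{order_id}")


class SagaOrchestrator:
    """Central controller that drives each step and records its state."""

    def __init__(self, participants: Sequence[Participant]) -> None:
        self.participants = list(participants)
        self.states: Dict[str, StepState] = {
            p.name: StepState.PENDING for p in self.participants
        }

    def run(self, order_id: str) -> bool:
        for index, participant in enumerate(self.participants):
            if participant.execute(order_id):
                self.states[participant.name] = StepState.COMPLETED
                continue
            # Failure: mark the step, then undo completed steps in reverse.
            self.states[participant.name] = StepState.FAILED
            for done in reversed(self.participants[:index]):
                done.compensate(order_id)
                self.states[done.name] = StepState.COMPENSATED
            return False
        return True


inventory = FakeParticipant("inventory")
payment = FakeParticipant("payment", succeeds=False)
shipping = FakeParticipant("shipping")

orchestrator = SagaOrchestrator([inventory, payment, shipping])
ok = orchestrator.run("o-1")
```

Unlike the choreography approach, the workflow order and the recovery logic live in one place, and the states dictionary gives a single view of how far the saga progressed.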