In this post I want to talk about different variants of database transactions, I assume you already know about database transactions. So, let's cut to the chase.
Flat Transaction
Flat transaction are those regular transactions we know about and are the most common transactions in the DBMSs Flat transactions are simple but they can not address two problems:
- Multi stage transactions
For example, Let's say we want to book flight from City C1 to C2. Since there is no direct fly we
have to book 4 flights from, C1 -> CA -> CB -> C2
. The process of booking each of these flights
is a transaction by itself and the whole process is a transaction too.
- Bulk updates
Let's say we want to update billion tuples. What if the very last tuple fails to update and cause the transaction to abort. Then we need to revert the changes made by the transaction and revert a billion tuples which obviously is a huge task.
Transaction Savepoints
These transactions are similar to flat transaction with addition of one extra thing which is save points. So any where in there transaction users case ask for a save point and again they can rollback to a save point or rollback the entire transaction.
BEGIN READ(A) WRITE(A) SAVEPOINT 1 WRITE(B) SAVEPOINT 2 ROLLBACK TO 1 COMMIT
Note: These transactions only solve the multi stage transaction problem.
Nested transactions
Nested transactions are similar to save points transactions, but instead save points these transactions break down to smaller flat transactions. Each transaction commits separately from other transactions. But result of the parent transaction rule them all, so if the parent transaction fails all the nested transactions have to rollback.
BEGIN BEGIN T1 ... COMMIT T1 BEGIN T2 ... COMMIT T2 COMMIT
Note: These transactions only solve the multi stage transaction problem.
Chained transactions
In these kind of transactions, smaller flat transaction can be applied in a chain in the way that the result of each of them is not visible to the outside world until the end of the chain.
In theory chained transactions should be applied in sequence but in practice in some cases we can interleave their operations. Also Between T1 and T2 of a chained transaction, No other thread of code shouldn't be able to make changes to those resources which T2 will operates on.
If any transaction in the chain fails, it has nothing to do with the previous transactions in the chain. For example:
T1 -> T2 -> T3 S -> S -> F
In the chained transaction above only T3 failed and T1 and T2 are successfully committed to storage. We don't have to roll them back.
While chained transactions can break big transactions into smaller pieces for better parallelism but they only solve the multi stage transaction problem Not the bulk update problem.
Compensating transactions
This type of transactions are special transactions which are designed to semantically reverse the effect of another transaction which already committed.
One important thing to remember about compensating transactions is that they know how to revert the logical effect of other transactions NOT the physical effect. For example, If a transaction increase a counter by one. The physical revert would be to changes the binary data that inserted for that counter to what it was before the transaction, but the logical revert would be to decrease the counter by one when needed.
So basically these kind of transactions know about how to put the database in correct state before the other transaction.
Saga Transactions
A saga transaction is a sequence of chained transactions T1 - Tn
and compensating transaction
C1 - C(n-1)
where the following guaranteed:
- The transactions will commit in the order
T1...Tj
,Cj...C1
(wherej < n
).
So basically this means that a saga transaction is a seq of chained transactions which applies the smaller transactions in order with their corresponding compensating transactions.
In a chained transaction when ever transaction Tn aborts, the transactions before Tn stay committed, but in saga transactions they will be rollback using compensating transactions that know how to roll them back logically.
So Saga transactions can be fix both multi-staging and bulk update problems. But the issue here is that the compensating transactions are something that requires application level understanding of the use case so most of the time they are implemented in the application frameworks instead of DBMSs.