ACID Transactions

Article Info

Contributed by
2 authors

Last updated on
2021-06-28 16:26:21

Improve this article

CAP Theorem
Two-Phase Commit Protocol
Distributed Computing
Distributed Database
Types of Databases
Relational Database Management System

Article Versions

6 2021-06-28 16:26:21
2610,2604 6,2610

By arvindpdmn

Spelling correction.
5 2021-06-28 16:18:40
2604,1549 5,2604

By arvindpdmn

Spelling correction.
4 2019-08-18 13:27:58
1549,1545 4,1549

By lokesh.rawat

Added more info on distributed ACID.
3 2019-08-18 07:21:13
1545,1544 3,1545

By arvindpdmn

Publishing.
2 2019-08-18 07:19:26
1544,1543 2,1544

By lokesh.rawat

Adding content in all sections. Could be improved with X/Open DTP, 2PC and Saga.

Chat Room

Submitting ...

You are editing an existing chat message.
2021-06-28 16:19:08
-

By devbot5S

[URL Check] The following URLs in this article are outdated. Please update.

Redirected URLs:
References: https://www.ibm.com/support/knowledgecenter/en/SSGMCP_5.4.0/product-overview/acid.html → https://www.ibm.com/docs/en/cics-ts/5.4?topic=processing-acid-properties-transactions

Computer systems are rarely perfect and this applies to databases as well. To ensure that data is updated in a consistent manner, a sequence of operations are often grouped together into what's called a transaction. We may say that a transaction is something that changes the system state.

In case something should go wrong when executing these operations, the transaction as a whole can be aborted. This means that the database can be safely rolled back to the consistent state before the transaction. In database design and implementation, a transaction must fulfil four essential properties: Atomicity, Consistency, Isolation, Durability (ACID).

Monolithic or SQL databases adopted ACID in their transactions but distributed or NoSQL databases didn't. However, distributed databases are starting to support ACID without impacting performance.

Discussion

What's the meaning of the acronym ACID?
ACID transactions. Source: Yaseen 2019.
ACID can be explained as follows:
- Atomicity: All operations of a transaction happen as if they're a single operation. All changes are performed or none at all. If one operation in a transaction fails, the entire transaction fails and completed operations are rolled back.
- Consistency: Data starts with a consistent state and ends up in a consistent state. During the transaction, it may be inconsistent, but at the end, data is left in a consistent state.
- Isolation: Intermediate state of a transaction is not visible to other concurrent transactions. Concurrent transactions are effectively serialized.
- Durability: When a transaction completes, changes are stored in a persistent manner. Even if there's a power failure or other system errors, the effect of the completed transaction remains.
Could you explain ACID with an example?
A transaction showing debit and credit. Source: Steffensen 2009, slide 36.
Consider a money transfer transaction. At the minimum, it would involve two operations: debiting from one account, then crediting to another account. What happens if money is debited from an account, then some failure happens (disk crash or power outage), and the receiver's account is never credited? This scenario leaves the database in an inconsistent state. To guard against this, we wrap both these operations into a single transaction that's expected to have ACID properties.
So now, even if there's a failure, atomicity ensures that the transaction will rollback the debit operation. In a normal successful scenario, both accounts will be updated correctly. It's not an issue if there's a failure after the transaction, since accounts have been updated in some persistent storage.
Suppose another transaction T2 queries the database when the money transfer transaction T1 is in progress. It will see original balance in the sender's account since T1 hasn't completed, even when the debit operation has happened within T1. This is the concept of isolation.
How do databases meet the ACID requirements?
Transaction states and their transitions. Source: TutorialsPoint 2019.
A transaction moves through various states. In the end, it can either succeed (committed) or fail (aborted). When committed, all changes made by the transaction become persistent. When aborted, changes made by the transaction are rolled back so that the starting state of the database is returned. This is usually implemented by making changes in a temporary space and then committing the entire transaction as a final step.
Consistency is ensured by having a strongly consistent core involving a single operation on a single row. Integrity checks are built on top of this for multiple operations across multiple rows.
For isolation, read-write locking and multiversion concurrency control are two approaches. We also use globally ordered timestamps so that transactions can be organized serially. Partial ordering is possible when sub-transactions belonging to multiple transactions interleave. This lowers latency but doesn't give full isolation. Instead, a mechanism to detect write conflicts is used.
What are the some problems when isolation is not proper?
Isolation levels in SQL standard. Source: Mihalcea 2014.
Dirty read happens when full isolation is not present across transactions. A transaction updates a value but it's not yet committed. Another transaction reads the uncommitted value but the original transaction rolls back the changes due to an error.
What happens if consecutive reads give different values? This problem is called non-repeatable read. This can be solved by locking the record until the current write transaction has completed.
Another problem is phantom read that happens when a new record is inserted, and this record matches the selection criterion issued by another concurrent transaction. The reading transaction will end up using stale data. Range locking or predicate locking is the solution to this problem.
Are there scenarios where we can relax ACID requirements?
Modern databases offer distributed ACID. Source: Choudhury 2018b.
Database operations are either write-intensive transactional (OLTP) or read-intensive analytical (OLAP). ACID is essential for OLTP but can be relaxed for OLAP.
Likewise, ACID has been traditionally relaxed in NoSQL or distributed databases where the focus has been on scale and availability. Strong consistency is sacrificed for eventual consistency. This is because in a distributed database it's inefficient to globally to acquire and release locks. However, modern databases of the late 2010s are bringing back ACID even in a globally distributed environment. Examples include Google Cloud Spanner, CockroachDB, TiDB, YugaByte DB, FoundationDB and FaunaDB.
The point is that where ACID is not guaranteed by the database, developers should take of consistency and isolation in the application logic. This may end up being more expensive in terms of development effort. ACID may be essential in the banking and finance industry but not every application requires ACID. Developers should evaluate if eventual consistency or "ACID in practice" is adequate for their application.
What do you mean by Distributed ACID?
The simplest transaction in a distributed database is single row ACID where a transaction impacts only one row. Single shard ACID is where the transaction impacts multiple records but they're all on the same shard, which is present on a single node. A true Distributed ACID is when the transaction impacts records across shards and nodes.
Achieving distributed ACID is not trivial. A transaction manager that uses Two-Phase Commit (2PC) protocol can be used for atomicity. Paxos or Raft consensus protocols can be used for consistency. To synchronize times for isolation, Network Time Protocol (NTP) can be used.
Not all distributed databases support distributed ACID. For example, since version 4.0, MongoDB supports single shard ACID but not distributed ACID.
Consider Amazon Aurora, Amazon DynamoDB or Azure Cosmos DB. These have multiple masters with data replication. These are not compliant to distributed ACID. Each master has the entire data. Write conflicts are resolved using Last-Write-Wins (LWW) or Conflict-Free Replicated Data Types (CRDT).

Milestones

1981

Jim Gray at Tandem Computers defines a transaction and notes that it must have three properties: consistency, atomicity and durability. At this point, neither isolation as a property nor ACID as an acronym is mentioned. He explains how the transaction concept can be implemented as time-domain addressing or logging plus locking. Tandem itself implemented these ideas in its NonStop approach, thus making their systems highly fault-tolerant.

1983

Based on Gray's concept of a transaction, Theo Haerder and Andreas Reuter identify isolation property. They also coin the term ACID. They state that "transaction is the only unit of recovery in a database system".

1985

Evolution of transaction management since ACID. Source: Wang et al. 2008, fig. 1.

From the mid-1980s, as business applications get more complex, new transaction models appear: nested transactions, chained transactions, distributed transactions. Based on application semantics, a transaction is divided into sub-transactions. The concept of save points (invented in the 1970s) becomes useful to rollback to specific sub-transactions. However, ACID guarantees are only at the transaction level. Thus, ACID requirements are relaxed for these complex applications.

2000

With the growth of NoSQL and distributed databases since the early 2000s, CAP Theorem becomes the blueprint for designing a database system. Likewise, database system designers prefer to adopt BASE (Basic Availability, Soft-state, Eventual Consistency). In short, scale, resilience and availability are considered more important than immediate consistency.

2012

Google publishes a paper on Spanner, a globally-distributed database that gives high performance without sacrificing strong consistency. It's only in May 2017 that it becomes production ready under the name of Cloud Spanner. Likewise, researchers at Yale University publish details of Calvin for distributed transactions on partitioned database systems. These are couple of examples showing the trend towards distributed ACID from the mid-2010s.

References

Article Stats

1394

Words

Authors

Edits

Chats

Likes

10K

Hits

Cite As

Devopedia. 2021. "ACID Transactions." Version 6, June 28. Accessed 2023-11-12. https://devopedia.org/acid-transactions

Contributed by
2 authors

Last updated on
2021-06-28 16:26:21

Improve this article

data distributed computing database

CAP Theorem
Two-Phase Commit Protocol
Distributed Computing
Distributed Database
Types of Databases
Relational Database Management System

ACID Transactions

Discussion

Milestones

References

Further Reading

Article Stats

Cite As

See Also

ACID Transactions

Discussion

Milestones

References

Further Reading

Article Stats

Author-wise Stats for Article Edits

Cite As

See Also

Article Warnings

Login