
Distributed Transactions

Transaction concepts
Centralized/distributed transaction architecture
Schedule concepts
Locking schemes
Deadlocks
Distributed commit protocol (2PC)


An Updating Transaction

Updating a master tape is fault tolerant: if a run fails for any reason, the tape can be rewound and the job restarted with no harm done.

Transaction concepts

What processes are to an OS, transactions are to a DBMS.
A transaction is a collection of actions that makes a consistent transformation of system states while preserving consistency.
Termination of transactions: commit vs. abort
Example:
  T: x = x + y
  Read(x); Read(y); Write(x); Commit
  (abbreviated: R(x) R(y) W(x) C)


Properties of Transactions: ACID

Atomicity
All or Nothing

Consistency
No violation of integrity constraints, i.e., transformation from one consistent state to another consistent state

Isolation
Concurrent changes invisible, i.e. serializable

Durability
Committed updates persist (saved in permanent storage)

Transaction Processing Issues

Transaction structure
  Flat vs. nested vs. distributed
Internal database consistency
  Semantic data control and integrity enforcement
Reliability protocols
  Atomicity and durability; local recovery protocols; global commit protocols
Concurrency control algorithms
Replica control protocols


Basic Transaction Primitives


Primitive            Description
BEGIN_TRANSACTION    Mark the start of a transaction
END_TRANSACTION      Terminate the transaction and try to commit
ABORT_TRANSACTION    Kill the transaction and restore the old values
READ                 Read data from a file or a table
WRITE                Write data to a file or a table


A Transaction Example
(a)                          (b)
BEGIN_TRANSACTION            BEGIN_TRANSACTION
  reserve BJ -> JFK;           reserve BJ -> JFK;
  reserve JFK -> TTY;          reserve JFK -> TTY;
  reserve TTY -> MON;          reserve TTY -> MON;  full => ABORT_TRANSACTION
END_TRANSACTION

a) Transaction to reserve three flights commits
b) Transaction aborts when the third flight is unavailable
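As a sketch of how the primitives compose, here is the reservation example in Python; FlightDB, its seat table, and the Aborted exception are hypothetical stand-ins for a real transactional API, not part of the slides.

```python
class Aborted(Exception):
    """Raised to signal ABORT_TRANSACTION."""

class FlightDB:
    """Hypothetical in-memory stand-in for a transactional reservation system."""
    def __init__(self, seats):
        self.seats = seats          # committed state: leg -> free seats
        self.pending = None         # uncommitted working copy

    def begin_transaction(self):
        self.pending = dict(self.seats)

    def reserve(self, leg):
        if self.pending[leg] == 0:  # flight full
            raise Aborted(leg)
        self.pending[leg] -= 1

    def end_transaction(self):      # commit: make the updates the new state
        self.seats, self.pending = self.pending, None

    def abort_transaction(self):    # kill the transaction, restore old values
        self.pending = None

db = FlightDB({"BJ->JFK": 5, "JFK->TTY": 2, "TTY->MON": 0})
db.begin_transaction()
try:
    for leg in ("BJ->JFK", "JFK->TTY", "TTY->MON"):
        db.reserve(leg)
    db.end_transaction()            # case (a): all three legs reserved
except Aborted:
    db.abort_transaction()          # case (b): third flight full, nothing kept
```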


Transaction execution


Nested vs Distributed Transactions


Flat/nested Distributed Transactions


(figure: (a) Distributed flat — transaction T splits into sub-transactions T1 and T2 across servers S0–S7; (b) Distributed nested — T1 spawns T11, T12 and T2 spawns T21, T22 on further servers)

A circle (Si) denotes a server, and a square (Tj) represents a sub-transaction.



Distributed Transaction

A distributed transaction accesses resource managers distributed across a network.
When the resource managers are DBMSs, we refer to the system as a distributed database system.
Each DBMS might export stored procedures or an SQL interface. In either case, operations at a site are grouped together as a subtransaction, and the site is referred to as a cohort of the distributed transaction.
The coordinator module plays the major role in supporting the ACID properties of a distributed transaction; the transaction manager acts as coordinator.



Distributed Transaction execution


Distributed ACID

Global atomicity: all subtransactions of a distributed transaction must commit, or all must abort.
  An atomic commit protocol, initiated by a coordinator (e.g., the transaction manager), ensures this.
  The coordinator must poll the cohorts to determine whether they are all willing to commit.
Global deadlocks: there must be no deadlocks involving multiple sites.
Global serialization: a distributed transaction must be globally serializable.

Schedule

Synchronizing concurrent transactions:
  the database remains consistent
  maximum degree of concurrency
Transactions execute concurrently, but the net effect of the resulting history is equivalent to some serial history.
Conflict equivalence: the relative order of execution of the conflicting operations belonging to unaborted transactions is the same in the two schedules (see the sketch below).
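Conflict equivalence to a serial history can be tested with a conflict (precedence) graph: two operations conflict if they come from different transactions, touch the same item, and at least one is a write; the schedule is conflict-serializable iff the graph is acyclic. A minimal sketch, assuming a schedule is given as an ordered list of (transaction, operation, item) triples:

```python
from itertools import combinations

def conflict_serializable(schedule):
    """schedule: ordered list of (txn, op, item), op in {'r', 'w'}.
    Build the conflict graph and report whether it is acyclic."""
    edges = set()
    for (t1, op1, x1), (t2, op2, x2) in combinations(schedule, 2):
        if t1 != t2 and x1 == x2 and 'w' in (op1, op2):
            edges.add((t1, t2))              # t1's conflicting op executed first
    remaining = {t for t, _, _ in schedule}
    changed = True
    while changed:                           # peel off txns with no predecessor
        changed = False
        for t in list(remaining):
            if not any(u in remaining and v == t for u, v in edges):
                remaining.discard(t)
                changed = True
    return not remaining                     # leftover nodes form a cycle

# r1(x) w2(x) w1(x): edges T1 -> T2 and T2 -> T1 -- not serializable
print(conflict_serializable([("T1", "r", "x"), ("T2", "w", "x"),
                             ("T1", "w", "x")]))                    # False
# r1(x) w1(x) r2(x) w2(x): equivalent to the serial history T1; T2
print(conflict_serializable([("T1", "r", "x"), ("T1", "w", "x"),
                             ("T2", "r", "x"), ("T2", "w", "x")]))  # True
```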


Lost Update Problem


Transaction T:                    Transaction U:
BEGIN_TRANSACTION(T)              BEGIN_TRANSACTION(U)
  K.withdraw(A, 40)                 K.withdraw(C, 30)
  K.deposit(B, 40)                  K.deposit(B, 30)
END_TRANSACTION(T)                END_TRANSACTION(U)

T operations                balance     U operations                balance
A.balance <- A.read()       (A) 100
A.write(A.balance - 40)     (A) 60
                                        C.balance <- C.read()       (C) 300
                                        C.write(C.balance - 30)     (C) 270
B.balance <- B.read()       (B) 200
                                        B.balance <- B.read()       (B) 200
                                        B.write(B.balance + 30)     (B) 230
B.write(B.balance + 40)     (B) 240

Transactions T and U both work on the same bank branch K. U's update of B is lost: T overwrites it, leaving a final balance of 240 instead of the correct 270.



Inconsistent Retrieval Problem


Transaction T:                    Transaction U:
BEGIN_TRANSACTION(T)              BEGIN_TRANSACTION(U)
  K.withdraw(A, 100)                K.Total_balance(A, B, C)
  K.deposit(B, 100)               END_TRANSACTION(U)
END_TRANSACTION(T)

T operations                 balance    U operations                                  total_balance
A.balance <- A.read()        (A) 200
A.write(A.balance - 100)     (A) 100
                                        total_balance <- A.read()                     100
                                        total_balance <- total_balance + B.read()     100 + 200
                                        total_balance <- total_balance + C.read()     300 + 200
B.balance <- B.read()        (B) 200
B.write(B.balance + 100)     (B) 300

Transactions T and U both work on the same bank branch K. U retrieves an inconsistent total of 500 instead of 600: it sees A after the withdrawal but B before the matching deposit.



Serial Equivalent
Transaction T:                    Transaction U:
BEGIN_TRANSACTION(T)              BEGIN_TRANSACTION(U)
  K.withdraw(A, 40)                 K.withdraw(C, 30)
  K.deposit(B, 40)                  K.deposit(B, 30)
END_TRANSACTION(T)                END_TRANSACTION(U)

T operations                balance     U operations                balance
A.balance <- A.read()       (A) 100
A.write(A.balance - 40)     (A) 60
                                        C.balance <- C.read()       (C) 300
                                        C.write(C.balance - 30)     (C) 270
B.balance <- B.read()       (B) 200
B.write(B.balance + 40)     (B) 240
                                        B.balance <- B.read()       (B) 240
                                        B.write(B.balance + 30)     (B) 270

Transactions T and U both work on the same bank branch K. This interleaving is equivalent to the serial history T followed by U: B ends at the correct 270.



Serializability
BEGIN_TRANSACTION      BEGIN_TRANSACTION      BEGIN_TRANSACTION
  x = 0;                 x = 0;                 x = 0;
  x = x + 1;             x = x + 2;             x = x + 3;
END_TRANSACTION        END_TRANSACTION        END_TRANSACTION
(a)                    (b)                    (c)

Schedule 1: x = 0; x = x + 1; x = 0; x = x + 2; x = 0; x = x + 3;   Legal
Schedule 2: x = 0; x = 0; x = x + 1; x = x + 2; x = 0; x = x + 3;   Legal
Schedule 3: x = 0; x = 0; x = x + 1; x = 0; x = x + 2; x = x + 3;   Illegal
(d)

a)-c) Three transactions T1, T2, and T3
d) Possible schedules



Concurrency Control Algorithms

Pessimistic
  Two-phase locking based (2PL): centralized 2PL, distributed 2PL
  Timestamp ordering (TO): basic TO, multiversion TO, conservative TO
  Hybrid
Optimistic
  Locking and TO based


Two-Phase Locking (1)


A transaction locks an object before using it.
When an object is locked by another transaction, the requesting transaction must wait.
Once a transaction releases a lock, it may not request another lock.
Strict 2PL holds all locks until the end of the transaction.
The scheduler first acquires all the locks it needs during the growing phase, then releases them during the shrinking phase (a minimal sketch follows).
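A minimal Python sketch of a strict 2PL lock manager, assuming exclusive locks only and one thread per transaction; deadlock handling (discussed later) is deliberately left out, and all names are illustrative.

```python
import threading
from collections import defaultdict

class StrictTwoPhaseLocker:
    """Strict 2PL sketch: locks are acquired on demand (growing phase)
    and all released together at commit/abort, so the shrinking phase
    collapses to a single point at transaction end."""
    def __init__(self):
        self._locks = defaultdict(threading.Lock)   # item -> exclusive lock
        self._held = defaultdict(list)              # txn -> items it locked

    def lock(self, txn, item):
        # Growing phase: block until the item's lock is available.
        self._locks[item].acquire()
        self._held[txn].append(item)

    def release_all(self, txn):
        # Strict 2PL: release everything only at END/ABORT_TRANSACTION.
        for item in self._held.pop(txn, []):
            self._locks[item].release()
```

A transaction thread calls lock(...) before each read or write and release_all(...) at END_TRANSACTION; because no lock is released early, the resulting schedules are strict.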

Two-Phase Locking (2)


Two-Phase Locking (3)


Transaction T:                    Transaction U:
BEGIN_TRANSACTION(T)              BEGIN_TRANSACTION(U)
  K.withdraw(A, 40)                 K.withdraw(C, 30)
  K.deposit(B, 40)                  K.deposit(B, 30)
END_TRANSACTION(T)                END_TRANSACTION(U)

T operations                           balance     U operations                           balance
BEGIN_TRANSACTION(T)                               BEGIN_TRANSACTION(U)
A.balance <- A.read()      lock (A)    100
A.write(A.balance - 40)                (A) 60
                                                   C.balance <- C.read()      lock (C)    300
                                                   C.write(C.balance - 30)                (C) 270
B.balance <- B.read()      lock (B)    200
                                                   B.balance <- B.read()      ...waiting for B's lock
B.write(B.balance + 40)                (B) 240
END_TRANSACTION(T)         release (A) (B)
                                                   B.balance <- B.read()      lock (B)    240
                                                   B.write(B.balance + 30)                (B) 270
                                                   END_TRANSACTION(U)

Under 2PL, U's read of B blocks until T commits and releases its locks, so the lost update is avoided.


Two-Phase Locking (4)

Centralized 2PL
  One 2PL scheduler in the distributed system; lock requests are issued to the central scheduler.
Primary 2PL
  Each data item is assigned a primary copy; the lock manager at that copy is responsible for lock/release.
  Like centralized 2PL, but locking is distributed.
Distributed 2PL
  2PL schedulers are placed at each site, and each scheduler handles the lock requests at that site.
  A transaction may read any of the replicated copies by obtaining a read lock on one of the copies; writing into x requires obtaining a write lock on all of the copies (see the sketch below).
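As a sketch of the read-one/write-all rule under distributed 2PL: ReplicaSite and its lock methods are hypothetical stubs standing in for per-site lock managers, not part of the slides.

```python
class ReplicaSite:
    """Hypothetical stub: a site holding a replica, with its own 2PL scheduler."""
    def __init__(self, name):
        self.name = name
        self.read_locks, self.write_locks = set(), set()
    def read_lock(self, txn, item):   self.read_locks.add((txn, item))
    def write_lock(self, txn, item):  self.write_locks.add((txn, item))

def read_any(txn, item, replicas):
    """Read-one: a read lock at a single replica suffices."""
    site = replicas[0]                # e.g., the nearest copy
    site.read_lock(txn, item)
    return site

def write_all(txn, item, replicas):
    """Write-all: a write lock must be obtained at every replica of the item."""
    for site in replicas:
        site.write_lock(txn, item)

sites = [ReplicaSite(s) for s in ("S1", "S2", "S3")]
read_any("T1", "x", sites)            # one read lock, at S1
write_all("T2", "x", sites)           # write locks at S1, S2, and S3
```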

Timestamp ordering (1)

Each transaction Ti is assigned a globally unique timestamp ts(Ti).
Using Lamport's algorithm, we can ensure that the timestamps are unique (important); a sketch follows.
The transaction manager attaches the timestamp to all of the transaction's operations.
Each data item x is assigned a write timestamp wts(x) and a read timestamp rts(x) such that:
  rts(x) = largest timestamp of any read on x
  wts(x) = largest timestamp of any write on x
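Since the TO rules below hinge on globally unique timestamps, here is a minimal sketch of one standard construction: a Lamport counter paired with a unique site id as tiebreaker. The class and method names are illustrative, not from the slides.

```python
class LamportTxnClock:
    """Globally unique transaction timestamps: a Lamport counter gives the
    order, and the unique site id breaks ties between sites."""
    def __init__(self, site_id):
        self.site_id = site_id              # unique per site
        self.clock = 0

    def next_ts(self):
        self.clock += 1
        return (self.clock, self.site_id)   # compared lexicographically

    def observe(self, remote_clock):
        # On receiving any message, jump past the sender's counter.
        self.clock = max(self.clock, remote_clock)

s1, s2 = LamportTxnClock(1), LamportTxnClock(2)
t1 = s1.next_ts()           # (1, 1)
s2.observe(t1[0])
t2 = s2.next_ts()           # (2, 2): ordered after t1, never equal to it
```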


Timestamp ordering (2)

Conflicting operations are resolved by timestamp order. Let ts(Ti) be the timestamp of transaction Ti, and Ri(x), Wi(x) a read/write operation from Ti.

For Wi(x):
  if ts(Ti) < rts(x) or ts(Ti) < wts(x)
    then reject Wi(x) (abort Ti)
    else accept Wi(x); wts(x) <- ts(Ti)

For Ri(x):
  if ts(Ti) < wts(x)
    then reject Ri(x) (abort Ti)
    else accept Ri(x); rts(x) <- max(rts(x), ts(Ti))
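A minimal sketch of the two rules above as a single-site scheduler, assuming integer timestamps and signalling rejection (i.e., transaction abort) with an exception; all names are illustrative.

```python
class Rejected(Exception):
    """Operation rejected: the issuing transaction must abort and restart."""

class BasicTO:
    """Basic timestamp ordering for one site: track rts/wts per item and
    reject operations that arrive 'too late' in timestamp order."""
    def __init__(self):
        self.rts = {}   # item -> largest timestamp of any accepted read
        self.wts = {}   # item -> largest timestamp of any accepted write

    def read(self, ts, x):
        if ts < self.wts.get(x, 0):
            raise Rejected(f"R({x}) too late: a younger txn already wrote {x}")
        self.rts[x] = max(self.rts.get(x, 0), ts)

    def write(self, ts, x):
        if ts < self.rts.get(x, 0) or ts < self.wts.get(x, 0):
            raise Rejected(f"W({x}) too late: a younger txn already accessed {x}")
        self.wts[x] = ts

sched = BasicTO()
sched.read(ts=5, x="a")      # accepted: rts(a) = 5
sched.write(ts=7, x="a")     # accepted: wts(a) = 7
try:
    sched.write(ts=6, x="a") # rejected: ts 6 < wts(a) = 7
except Rejected as e:
    print(e)
```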


Distributed commit protocols


How is commit executed for distributed transactions? Issue: how to ensure atomicity and durability.
One-phase commit (1PC): the coordinator communicates with all servers to commit. Problem: a server cannot abort a transaction.
Two-phase commit (2PC): allows any server to abort its part of a transaction. Commonly used.
Three-phase commit (3PC): avoids blocking servers in the presence of coordinator failure. Mostly referred to in the literature, not used in practice.

Two-phase commit (2PC)

Consider a distributed transaction involving the participation of a number of processes, each running on a different machine, and assume no failures occur.
Phase 1: the coordinator gets the participants ready to write the results into the database.
Phase 2: everybody writes the results into the database.
Coordinator: the process at the site where the transaction originates and which controls the execution.
Participant: a process at one of the other sites that participate in executing the transaction.
Global commit rule (a failure-free sketch follows):
  The coordinator aborts iff at least one participant votes to abort.
  The coordinator commits iff all the participants vote to commit.
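A failure-free sketch of the two phases and the global commit rule. Participant here is a hypothetical cohort stub; a real coordinator would force-write the decision record to stable storage before starting phase 2, and a real cohort would log its vote before replying.

```python
from enum import Enum

class Vote(Enum):
    COMMIT = "commit"
    ABORT = "abort"

class Participant:
    """Hypothetical cohort stub; a real one force-writes a log record
    before voting so it can keep its promise after a crash."""
    def __init__(self, can_commit=True):
        self.can_commit = can_commit
    def prepare(self):
        return Vote.COMMIT if self.can_commit else Vote.ABORT
    def commit(self):  pass
    def abort(self):   pass

def two_phase_commit(participants, log):
    # Phase 1 (voting): poll every cohort to see if it is ready.
    votes = [p.prepare() for p in participants]
    # Global commit rule: commit iff all vote commit, abort otherwise.
    decision = Vote.COMMIT if all(v is Vote.COMMIT for v in votes) else Vote.ABORT
    log.append(decision)                       # decision is logged first
    # Phase 2 (decision): everybody applies the outcome.
    for p in participants:
        (p.commit if decision is Vote.COMMIT else p.abort)()
    return decision

print(two_phase_commit([Participant(), Participant(can_commit=False)], []))  # Vote.ABORT
```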

2PC phases


2PC actions


Distributed 2PC

The coordinator initiates 2PC. The participants run a distributed algorithm to reach agreement on global commit or abort.


Problems with 2PC

Blocking
  Ready implies that the participant waits for the coordinator.
  If the coordinator fails, the site is blocked until recovery.
  Blocking reduces availability.
Independent recovery not possible
  Independent recovery protocols do not exist for multiple site failures.


Deadlocks

If transactions follow 2PL, deadlocks may occur. Consider the following scenarios:
  w1(x) w2(y) r1(y) r2(x)
  r1(x) r2(x) w1(x) w2(x)

Deadlock management:
  Ignore: let the application programmers handle it
  Prevention: no run-time support
  Avoidance: run-time support
  Detection and recovery: find it at your own leisure!


Deadlock Conditions
(figure: wait-for graphs; (a) simple circle, (b) complex circle)

Four conditions must hold:
  Mutual exclusion
  Hold and wait
  No preemption
  Circular chain, e.g., (a) T -> U -> V -> W -> T; (b) V -> W -> T -> V and V -> W -> V


Deadlock Avoidance

WAIT-DIE rule
  if ts(Ti) < ts(Tj) then Ti waits, else Ti dies
  Non-preemptive: Ti never preempts Tj
  Prefers younger transactions

WOUND-WAIT rule
  if ts(Ti) < ts(Tj) then Tj is wounded, else Ti waits
  Preemptive: Ti preempts Tj if it is younger
  Prefers older transactions

Problem: very expensive. Deadlocks occur rarely and at random, so aborting or restarting transactions on every potential conflict imposes needless system overhead. (A sketch of both rules follows.)
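A sketch of the two rules as one decision function, with Ti the requesting transaction and Tj the current lock holder; the return strings are illustrative.

```python
def resolve_conflict(ts_requester, ts_holder, rule):
    """Ti (the requester) conflicts with Tj (the lock holder).
    Smaller timestamp means older transaction."""
    if rule == "wait-die":                  # non-preemptive
        return "Ti waits" if ts_requester < ts_holder else "Ti dies"
    if rule == "wound-wait":                # preemptive
        return "Tj is wounded" if ts_requester < ts_holder else "Ti waits"
    raise ValueError(rule)

# An older transaction (smaller timestamp) meets a younger lock holder:
print(resolve_conflict(3, 7, "wait-die"))    # Ti waits
print(resolve_conflict(3, 7, "wound-wait"))  # Tj is wounded (aborted, restarts)
```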

Deadlock Detection
Deadlock is a stable property. With distributed transactions, a deadlock might not be detectable at any single site, but a deadlock still shows up as a cycle in a wait-for graph (see the sketch below). Topologies for deadlock detection algorithms:

Centralized: periodically collect waiting states
Distributed: path pushing
Hierarchical: build a hierarchy of detectors
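As a sketch of the detection step itself: a depth-first search for a cycle in the collected wait-for graph. The example edge set anticipates the three-site scenario on the next slide.

```python
def has_deadlock(wait_for):
    """wait_for: dict txn -> set of txns it waits for (the union of the
    local waiting states collected from all sites). DFS cycle check."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {t: WHITE for t in wait_for}

    def dfs(t):
        color[t] = GREY
        for u in wait_for.get(t, ()):
            if color.get(u, WHITE) == GREY:      # back edge: cycle found
                return True
            if color.get(u, WHITE) == WHITE and dfs(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and dfs(t) for t in list(color))

# Local edges collected from S1, S2, and S3 on the next slide:
print(has_deadlock({"U": {"V"}, "V": {"W"}, "W": {"U"}}))   # True
```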


Centralized periodically collecting waiting states


(figure: objects A at server S1, B at S2, and C, D at S3; each transaction holds a lock at one site and waits for a lock at another)

Local wait-for edges collected by the central detector:
  S1: U -> V
  S2: V -> W
  S3: W -> U

No single site observes a cycle, but the combined graph contains the global cycle U -> V -> W -> U.