
Transaction processing (Chapter 21)

What is a transaction?
A transaction is an executing program that forms a logical
unit of database processing.

• Insertion, update, deletion, and retrieval operations are all performed within transactions.
• A transaction that only retrieves data is a read-only transaction.
• Otherwise, it is a read-write transaction (write transaction).
• A transaction can be embedded in an application program, or
• specified interactively in SQL (or another high-level query language).

1
Example of Transaction processing system

• Airline reservation systems,
• Railway reservation systems,
• Banking systems,
• Credit card processing systems,
• Online retail purchasing systems,
• Stock market database systems,
• Supermarket checkout systems,
• and the like.

2
Features & Requirements of
Transaction processing system

Features
Large database &
Hundreds of concurrent users!

Requirements
High availability &
Fast response time

3
Transaction on a bank database

Transaction T1
Transfer Rs.50000/- from a/c no. 1259 to a/c no. 3245.

Transaction T2
Checking balance in a/c no. 3245.

Transaction T3
Deposit Rs.70000/- into a/c no. 3245.

4
Transactions on university database

Transaction T4
Register courses for the next semester

Transaction T5
View attendance status

Transaction T6
View semester results

Transaction T7
Withdraw from a course

5
ACID properties of transaction

A Atomicity

C Consistency

I Isolation

D Durability

6
Atomicity (All or Nothing)

• Either all operations of a transaction are executed, or none of them is executed.
• A transaction should either be performed in its entirety or not performed at all.
• It is the responsibility of the recovery manager (a module of the DBMS) to ensure atomicity.

7
Consistency

• A transaction should take a database from one legal state to another legal state.
• Consistency is defined by the database programmer and is enforced by the DBMS software.

8
Isolation

• Isolation means that a transaction cannot see intermediate results produced by some other transaction.
• Isolation is ensured by the concurrency control module of the DBMS.

9
Durability

• Changes made to a database cannot be undone after they have been committed, and
• these changes should not be lost because of a failure.
• Durability is ensured by the recovery manager of the DBMS.

10
Data model used in transaction processing

• A database is viewed as a collection of named data items.
• A data item can be
  • a database record,
  • the value of an attribute in a database record,
  • an entire table,
  • the entire database,
  • a disk block (disk page).
• The size of a data item is called its granularity.
• A data item is also called a database item.

11
Disk blocks and memory buffers

Data transfer between disk and memory takes place in units of disk blocks.

12
Basic database access operations

• read_item(X)
Reads the database item named X into the program variable X.

• write_item(X)
Writes the value of the program variable X into the database item named X.

13
read_item(X) involves the following steps
1. Find the address of the disk block that contains
data item X.
2. Copy that disk block into a buffer in main
memory (if that disk block is not already in
some main memory buffer).
3. Copy data item X from the buffer to the
program variable named X.

14
write_item(X) involves the following steps
1. Find the address of the disk block that contains
data item X.
2. Copy that disk block into a buffer in main
memory (if that disk block is not already in
some main memory buffer).
3. Copy data item X from the program variable
named X into its correct location in the buffer.
4. Store the updated block from the buffer back
to disk (either immediately or at some later
point in time).
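
The steps of read_item(X) and write_item(X) listed above can be sketched in Python. This is only a minimal illustration, not how any particular DBMS implements its buffer manager: the dictionaries `disk`, `block_of` and `buffers`, the helpers `fetch_block` and `flush`, the `flush_now` flag, and the starting balances are all hypothetical names and values chosen for the example.

```python
# Hypothetical "disk": block address -> {item name: value}.
disk = {0: {"X": 80000, "Y": 10000}, 1: {"Z": 5}}
block_of = {"X": 0, "Y": 0, "Z": 1}   # step 1: item name -> disk block address
buffers = {}                           # main-memory buffers: address -> block copy
dirty = set()                          # buffers changed but not yet written back


def fetch_block(addr):
    """Step 2: copy the block into a buffer unless it is already buffered."""
    if addr not in buffers:
        buffers[addr] = dict(disk[addr])
    return buffers[addr]


def read_item(name):
    """Step 3 of read_item: return the item's value from the buffer."""
    return fetch_block(block_of[name])[name]


def flush(addr):
    """Step 4 of write_item: store the updated block back on disk."""
    disk[addr] = dict(buffers[addr])
    dirty.discard(addr)


def write_item(name, value, flush_now=False):
    """Step 3 of write_item: place the value at its location in the buffer."""
    addr = block_of[name]
    fetch_block(addr)[name] = value
    dirty.add(addr)
    if flush_now:          # otherwise the block is written at some later point
        flush(addr)


x = read_item("X")
write_item("X", x - 50000)   # the change lives only in the buffer for now
```

Until `flush(0)` is called, the disk copy of X still holds the old value. This is why the moment at which step 4 happens matters for recovery.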

15
Representation of a transaction in terms of read
and write
Transaction T1
Transfer Rs.50000/- from a/c no. 1259 to a/c no. 3245.

X = Balance in account number 1259
Y = Balance in account number 3245

read_item(X);
X := X - 50000;
write_item(X);
read_item(Y);
Y := Y + 50000;
write_item(Y);

16
State transition diagram illustrating the states for
transaction execution

17
Operations tracked during transaction execution
• BEGIN_TRANSACTION. This marks the beginning of
transaction execution.
• READ or WRITE. These specify read or write operations
on the database items that are executed as part of a
transaction.
• END_TRANSACTION. This specifies that READ and
WRITE transaction operations have ended and marks the
end of transaction execution. However, at this point it
may be necessary to check whether the changes
introduced by the transaction can be permanently
applied to the database (committed) or whether the
transaction has to be aborted because it violates
serializability or for some other reason.

18
Operations tracked during transaction execution (continued)
• COMMIT. This signals a successful end of the transaction
so that any changes (updates) executed by the
transaction can be safely committed to the database
and will not be undone.
• ROLLBACK (or ABORT). This signals that the transaction
has ended unsuccessfully, so that any changes or effects
that the transaction may have applied to the database
must be undone.

19
System log

• The system maintains a log (a sequential, append-only
file kept on disk).
• The last part of the log resides in the DBMS cache;
this part of the DBMS cache is called the log buffer.
• Records of the log file look like:
[start_transaction, T]
[write_item, T, X, old_value, new_value]
[read_item, T, X]
[commit, T]
[abort, T]
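
The record formats above can be illustrated with a small Python sketch. It assumes a plain list standing in for the log file and a dictionary `db` standing in for the database; the function names and starting values are hypothetical. The rollback shown here (restoring `old_value` from the write records in reverse order) is a simplification that ignores other concurrent transactions.

```python
db = {"X": 80000, "Y": 10000}   # hypothetical database items
log = []                         # append-only list standing in for the log file


def start_transaction(t):
    log.append(("start_transaction", t))


def read_item(t, x):
    log.append(("read_item", t, x))
    return db[x]


def write_item(t, x, new_value):
    # Record the old value first so that the write can be undone later.
    log.append(("write_item", t, x, db[x], new_value))
    db[x] = new_value


def commit(t):
    log.append(("commit", t))


def abort(t):
    # Undo T's writes newest-first using the old_value field of each record.
    for rec in reversed(log):
        if rec[0] == "write_item" and rec[1] == t:
            db[rec[2]] = rec[3]
    log.append(("abort", t))


start_transaction("T1")
x = read_item("T1", "X")
write_item("T1", "X", x - 50000)
abort("T1")            # X is restored to 80000
```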

20
Commit point of a transaction

A transaction reaches its commit point when
• it has successfully executed all its operations, and
• records of all its database operations have been written to
the log.

Beyond the commit point a transaction is said to have been
committed, and its effect must be permanently recorded in
the database.

The transaction then writes a commit record [commit, T] to the
log.

21
Database cache
(also known as DBMS cache)

• Consists of a number of memory buffers that hold data fetched from disk for display/update.
• Each buffer is the same size as a disk block.
• The DBMS cache also holds the indexes in use.
• The DBMS cache holds a part of the system log; this area of the DBMS cache is called the log buffer.

22
Main memory & DBMS cache
[Figure: layout of main memory. Labels: DBMS cache (data buffers, log buffers, indexes in use); accessed by OS.]

23
Two transactions T1 & T2

24
A serial schedule consisting of T1 & T2

r1(X); w1(X); r1(Y); w1(Y); r2(X); w2(X);


25
Another serial schedule consisting of T1 & T2

r2(X); w2(X); r1(X); w1(X); r1(Y); w1(Y);


26
Drawback of serial schedule?

Low system throughput.

Throughput = number of transactions completed successfully / unit time

27
Concurrent execution of transactions of a schedule

Executing the transactions of a schedule in an interleaved
fashion is called concurrent execution of the transactions in
the schedule.

Concurrent execution improves the efficiency of the database system.

28
Concurrent execution of T1 & T2 in an interleaved
fashion. (A non-serial schedule)

r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);


29
Another non-serial schedule

r1(X); w1(X); r2(X); w2(X); r1(Y); w1(Y);

30
Concurrent execution of multiple transactions
leads to several problems

1) Lost update problem
2) Dirty read problem
3) Incorrect summary problem
4) Unrepeatable read problem

31
Concurrent execution of multiple transactions
leads to several problems

1) Lost update problem

r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
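
The effect of this schedule can be replayed with a short Python sketch. The slide showing T1 and T2 is not reproduced here, so the sketch assumes that T1 moves N units from X to Y and T2 adds M units to X; `N`, `M`, `UPDATE`, `parse`, `run` and the starting values are hypothetical.

```python
import re

N, M = 5, 4  # hypothetical amounts: T1 moves N from X to Y, T2 adds M to X

# What each transaction computes between reading an item and writing it back.
UPDATE = {
    (1, "X"): lambda v: v - N,
    (1, "Y"): lambda v: v + N,
    (2, "X"): lambda v: v + M,
}


def parse(text):
    """Turn 'r1(X); w2(Y);' into [('r', 1, 'X'), ('w', 2, 'Y')]."""
    return [(op, int(t), x)
            for op, t, x in re.findall(r"([rw])(\d+)\((\w+)\)", text)]


def run(text, db):
    db = dict(db)   # work on a copy of the initial database state
    local = {}      # (transaction, item) -> value held in a program variable
    for op, t, x in parse(text):
        if op == "r":
            local[(t, x)] = UPDATE[(t, x)](db[x])   # read, then compute
        else:
            db[x] = local[(t, x)]                   # write the computed value
    return db


start = {"X": 80, "Y": 10}
serial = run("r1(X); w1(X); r1(Y); w1(Y); r2(X); w2(X);", start)
interleaved = run("r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);", start)
print(serial)       # {'X': 79, 'Y': 15}
print(interleaved)  # {'X': 84, 'Y': 15}
```

In the interleaved run T2 read X before T1 wrote it, so w2(X) overwrites T1's change and the deduction of N from X is lost.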


32
Problems with concurrent execution of multiple
transactions

2) Dirty read problem (Temporary update problem)

r1(X); w1(X); r2(X); w2(X); r1(Y);


33
Problems with concurrent execution of multiple
transactions

3) Incorrect summary problem

r3(A); … r1(X); w1(X); … r3(X); r3(Y); r1(Y); w1(Y);

34
Problems with concurrent execution of multiple
transactions

4) Unrepeatable read

Transaction T
read_item(X) (X = no. of seats available in flight 1)
read_item(Y) (Y = no. of seats available in flight 2)
read_item(Z) (Z = no. of seats available in flight 3)
.
.
.
--- Book N tickets on flight 1
read_item(X)
X := X - N (may fail: another transaction may have changed X since the first read)

35
Schedule of transactions (formal definition)

A schedule (or history) S of n transactions T1, T2, ..., Tn is an
ordering of the operations of the transactions.

Operations from different transactions can be interleaved in
a schedule S.

For each transaction Ti that participates in the schedule S, the
operations of Ti in S must appear in the same order in which
they occur in Ti.

36
Serial and non-serial schedule (formal definition)

A schedule S is serial if, for every transaction T participating
in the schedule, all the operations of T are executed
consecutively in the schedule;
otherwise, the schedule is called non-serial.

A serial schedule has no interleaving of operations of
different transactions inside the schedule.

Every serial schedule is a correct schedule.
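
The definition of a serial schedule translates into a short check. The sketch below assumes the r/w notation used on these slides; `parse` and `is_serial` are hypothetical helper names.

```python
import re


def parse(text):
    """Turn 'r1(X); w2(Y);' into [('r', 1, 'X'), ('w', 2, 'Y')]."""
    return [(op, int(t), x)
            for op, t, x in re.findall(r"([rw])(\d+)\((\w+)\)", text)]


def is_serial(schedule):
    """True if every transaction's operations are consecutive."""
    finished = set()   # transactions whose block of operations has ended
    current = None     # transaction whose operations are running now
    for _, t, _ in schedule:
        if t != current:
            if t in finished:      # t resumes after another one ran: interleaved
                return False
            if current is not None:
                finished.add(current)
            current = t
    return True


print(is_serial(parse("r1(X); w1(X); r1(Y); w1(Y); r2(X); w2(X);")))  # True
print(is_serial(parse("r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);")))  # False
```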

37
Serial schedule

Sa: r1(X); w1(X); r1(Y); w1(Y); r2(X); w2(X);
Sb: r2(X); w2(X); r1(X); w1(X); r1(Y); w1(Y);

38
Non-serial schedule

Sc: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
Sd: r1(X); w1(X); r2(X); w2(X); r1(Y);

39
Serializable schedule
• A non-serial schedule is called SERIALIZABLE if it is equivalent to a serial schedule.
• A SERIALIZABLE schedule can be converted into an equivalent serial schedule.
• Every SERIALIZABLE schedule is a correct schedule.

40
Equivalence of schedules

Two schedules are said to be equivalent
• if they consist of the same transactions and
• produce the same final state of the database.

41
Two types of equivalence

• Conflict equivalence
• View equivalence

Schedules are said to be conflict equivalent if
conflicting operations appear in the schedules in
the same order.

42
Conflict (conflicting operations)
Two operations in a schedule are said to conflict if they
satisfy all three of the following conditions:

1. The operations belong to different transactions.
2. The operations operate on the same data item.
3. At least one of the operations is a write operation.

r1(X); w2(X) read-write conflict
w1(X); w2(X) write-write conflict
w1(X); r2(X) write-read conflict

Intuitively, two operations are conflicting if changing their
order may result in a different outcome.
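
The three conditions can be checked mechanically. The following sketch lists every conflicting pair of a schedule written in the r/w notation of these slides; `parse` and `conflicts` are hypothetical helper names.

```python
import re


def parse(text):
    """Turn 'r1(X); w2(Y);' into [('r', 1, 'X'), ('w', 2, 'Y')]."""
    return [(op, int(t), x)
            for op, t, x in re.findall(r"([rw])(\d+)\((\w+)\)", text)]


def conflicts(schedule):
    """Return all pairs (earlier, later) of conflicting operations."""
    pairs = []
    for i, (op1, t1, x1) in enumerate(schedule):
        for op2, t2, x2 in schedule[i + 1:]:
            different_txn = t1 != t2          # condition 1
            same_item = x1 == x2              # condition 2
            has_write = "w" in (op1, op2)     # condition 3
            if different_txn and same_item and has_write:
                pairs.append(((op1, t1, x1), (op2, t2, x2)))
    return pairs


for earlier, later in conflicts(parse("r1(X); w1(X); r2(X); w2(X); r1(Y); w1(Y);")):
    print(earlier, later)
```

For this schedule the function reports the read-write pair r1(X), w2(X), the write-read pair w1(X), r2(X), and the write-write pair w1(X), w2(X). The operations on Y do not conflict because only T1 touches Y.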

43
Conflict equivalent schedules

Two schedules are said to be conflict equivalent if the
conflicting operations appear in the same order in the
schedules.

S1: r1(X);…; w2(X);…
S2: …w2(X);…; r1(X);…
S1 & S2 are not conflict equivalent

S3: w1(X);…; w2(X);…
S4: …w2(X);…; w1(X);…
S3 & S4 are not conflict equivalent

44
Conflict serializable schedule

A schedule is conflict serializable if it is conflict equivalent to
a serial schedule.

Sd: r1(X); w1(X); r2(X); w2(X); r1(Y); w1(Y);

Sd is conflict equivalent to the serial schedule Sa, so Sd is conflict serializable.

45
Testing conflict serializability of schedule

Sd:r1(X);w1(X);r2(X);w2(X);r1(Y);w1(Y);

Conflicting pairs in Sd:
r1(X); w2(X)
w1(X); r2(X)
w1(X); w2(X)

[Figure: precedence graph of Sd, with the single edge T1 → T2]

Sd is conflict serializable.

The precedence graph is also called the serialization graph.

46
Testing conflict serializability of schedule

Sc: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);

Conflicting pairs in Sc:
r1(X); w2(X)
r2(X); w1(X)
w1(X); w2(X)

[Figure: precedence graph of Sc, with edges T1 → T2 and T2 → T1]

Sc is NOT conflict serializable.

47
Algorithm to test conflict serializability

1. For each transaction Ti participating in schedule S, create
a node labeled Ti in the precedence graph.
2. For each case in S where Ti executes a read_item(X)
before Tj executes a write_item(X), create an edge
(Ti → Tj) in the precedence graph.
3. For each case in S where Ti executes a write_item(X)
before Tj executes a read_item(X), create an edge
(Ti → Tj) in the precedence graph.
4. For each case in S where Ti executes a write_item(X)
before Tj executes a write_item(X), create an edge
(Ti → Tj) in the precedence graph.
5. The schedule S is conflict serializable if and only if the
precedence graph has no cycle.
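
The algorithm above can be sketched in Python as follows. Steps 1 to 4 build the graph; for step 5 the sketch uses Kahn's topological sort as the cycle test, which is one possible choice and not something the slides prescribe. `parse`, `precedence_graph` and `serial_order` are hypothetical helper names.

```python
import re


def parse(text):
    """Turn 'r1(X); w2(Y);' into [('r', 1, 'X'), ('w', 2, 'Y')]."""
    return [(op, int(t), x)
            for op, t, x in re.findall(r"([rw])(\d+)\((\w+)\)", text)]


def precedence_graph(schedule):
    nodes = {t for _, t, _ in schedule}                      # step 1
    edges = set()
    for i, (op1, t1, x1) in enumerate(schedule):
        for op2, t2, x2 in schedule[i + 1:]:
            # Steps 2-4: read-write, write-read and write-write pairs.
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return nodes, edges


def serial_order(nodes, edges):
    """Step 5: return an equivalent serial order, or None if there is a cycle."""
    indegree = {n: 0 for n in nodes}
    for _, v in edges:
        indegree[v] += 1
    ready = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.pop(0)
        order.append(n)
        for u, v in sorted(edges):
            if u == n:
                indegree[v] -= 1
                if indegree[v] == 0:
                    ready.append(v)
    return order if len(order) == len(nodes) else None


sd = parse("r1(X); w1(X); r2(X); w2(X); r1(Y); w1(Y);")
sc = parse("r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);")
print(serial_order(*precedence_graph(sd)))  # [1, 2]
print(serial_order(*precedence_graph(sc)))  # None
```

A returned list such as `[1, 2]` means the schedule is conflict equivalent to running T1 and then T2 serially; `None` means the precedence graph has a cycle.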
48
49
50
Three transactions T1, T2 & T3

51
A non-serial schedule of T1, T2 & T3

r2(Z);r2(Y);w2(Y);r3(Y);r3(Z);r1(X);w1(X);w3(Y);w3(Z);r2(X);r1(Y);w1(Y);w2(X)

52
Precedence graph

There is no equivalent serial schedule,
because the precedence graph contains cycles.
The cycles in the precedence graph are
T1 → T2 → T1 and T1 → T2 → T3 → T1.

53
Another non-serial schedule of T1, T2 & T3

r3(Y);r3(Z);r1(X);w1(X);w3(Y);w3(Z);r2(Z);r1(Y);w1(Y);r2(Y);w2(Y);r2(X);w2(X)
54
Precedence graph

55
View equivalence

S: …wj(X);…wj(Y);…ri(Y);…ri(X);…wk(Z); …
S’: …wj(Y);…ri(Y);…wj(X);…ri(X);…wk(Z); …

S & S’ are view equivalent

56
View equivalence of schedule

Two schedules S and S’ are view equivalent if the following
three conditions hold.
1. The same set of transactions participates in S and S’, and
S and S’ include the same operations of those transactions.
2. For any operation ri(X) of Ti in S, if the value of X read by
the operation has been written by an operation wj(X) of Tj
(or if it is the original value of X before the schedule
started), the same condition must hold for the value of X
read by operation ri(X) of Ti in S’.
3. If the operation wk(Y) of Tk is the last operation to write
item Y in S, then wk(Y) of Tk must also be the last operation
to write item Y in S’.

57
View serializable

• A schedule is said to be view serializable if it is view
equivalent to a serial schedule.

• The schedule Sg is view serializable because it is view
equivalent to the serial schedule S.

Sg: r1(X); w2(X); w1(X); w3(X); c1; c2; c3; (non-serial)
S: r1(X); w1(X); w2(X); w3(X); c1; c2; c3; (serial)

• Every conflict serializable schedule is view serializable,
but the converse is not true.
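
The three conditions for view equivalence can be tested with a small Python sketch. It records, for every read, which write it reads from, plus the final write of each item. `parse`, `view_profile` and `view_equivalent` are hypothetical helper names, and commit operations such as c1 are simply ignored by the parser.

```python
import re


def parse(text):
    """Keep only read/write operations: 'r1(X); c1;' -> [('r', 1, 'X')]."""
    return [(op, int(t), x)
            for op, t, x in re.findall(r"([rw])(\d+)\((\w+)\)", text)]


def view_profile(schedule):
    writes_seen = {}   # (txn, item) -> number of writes seen so far
    reads_seen = {}    # (txn, item) -> number of reads seen so far
    last_write = {}    # item -> (txn, k): latest write; absent = original value
    reads_from = set()
    for op, t, x in schedule:
        if op == "w":
            k = writes_seen.get((t, x), 0)
            writes_seen[(t, x)] = k + 1
            last_write[x] = (t, k)
        else:
            k = reads_seen.get((t, x), 0)
            reads_seen[(t, x)] = k + 1
            # Condition 2: remember which write (or None) this read sees.
            reads_from.add((t, x, k, last_write.get(x)))
    # After the loop, last_write holds the final writer of each item (condition 3).
    return reads_from, last_write


def view_equivalent(s1, s2):
    same_operations = sorted(s1) == sorted(s2)        # condition 1
    return same_operations and view_profile(s1) == view_profile(s2)


sg = parse("r1(X); w2(X); w1(X); w3(X); c1; c2; c3;")
s = parse("r1(X); w1(X); w2(X); w3(X); c1; c2; c3;")
print(view_equivalent(sg, s))  # True
```

In both Sg and S the read r1(X) sees the original value of X and w3(X) is the final write, so the two schedules are view equivalent even though the order of w1(X) and w2(X) differs.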

58
Characterizing schedules based on serializability

• In the preceding slides we have characterized schedules based on serializability.
• We have considered two kinds of serializability, namely,
  • conflict serializability and
  • view serializability.
• A schedule is called conflict serializable if it is conflict equivalent to a serial schedule.
• A schedule is called view serializable if it is view equivalent to a serial schedule.

59
Characterizing schedule based on recoverability

• A schedule is recoverable if no transaction T in
the schedule commits until all transactions T’
that have written some item X that T reads
have committed.
• A transaction T reads from transaction T’ in a
schedule if some item X is first written by T’
and later read by T. In addition, T’ should not
have been aborted before T reads item X, and
there should be no transactions that write X
after T’ writes it and before T reads it.

60
Characterizing schedule based on recoverability

• Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
  Recoverable
• Sb: r1(X); w1(X); r2(X); w2(X); r1(Y); a1; a2;
  Recoverable
• Sa’: r1(X); r2(X); w1(X); r1(Y); w2(X); c2; w1(Y); c1;
  Recoverable
• Sc: r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1;
  Not recoverable
• Sd: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2;
  Recoverable

61
Characterizing schedule based on recoverability

• In a recoverable schedule, no committed
transaction ever needs to be rolled back, and
so the definition of committed transaction as
durable is not violated.
• However, it is possible for a phenomenon
known as cascading rollback (or cascading
abort) to occur in some recoverable schedules,
where an uncommitted transaction has to be
rolled back because it read an item from a
transaction that failed.

62
Characterizing schedule based on recoverability

• This is illustrated in schedule Se, where
transaction T2 has to be rolled back because it
read item X from T1, and T1 then aborted.

Se: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); a1; a2;

• The schedule is recoverable but it suffers from
cascading rollback.

63
Cascadeless schedule

A schedule is called a cascadeless schedule if no
transaction T involved in the schedule reads a
database item X modified by some other
transaction T’ in the same schedule until T’ has
committed or aborted.

64
Cascadeless schedule

To make the schedule Se

Se: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); a1; a2;

cascadeless, we should modify it as Se’ given by

Se’: r1(X); w1(X); r1(Y); w1(Y); a1; r2(X); w2(X); c2;

65
Strict schedule

A schedule is called a strict schedule if no
transaction T involved in the schedule reads or
writes a database item X modified by some other
transaction T’ in the same schedule until T’ has
committed or aborted.

66
Strict, Cascadeless, Recoverable
• Any strict schedule is also cascadeless, and any
cascadeless schedule is also recoverable.
• The cascadeless schedules will be a subset of
the recoverable schedules, and the strict
schedules will be a subset of the cascadeless
schedules.
• Thus, all strict schedules are cascadeless, and
all cascadeless schedules are recoverable.
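
The three definitions can be combined into one classifier that reports the strictest class a schedule belongs to. This is a minimal sketch under simplifying assumptions: schedules use the r/w/c/a notation of these slides, an aborted transaction's writes are treated as removed, and only the most recent surviving writer of an item is examined. `parse` and `classify` are hypothetical helper names.

```python
import re


def parse(text):
    """'w1(X); c1; a2;' -> [('w', 1, 'X'), ('c', 1, None), ('a', 2, None)]."""
    return [(op, int(t), x or None)
            for op, t, x in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", text)]


def classify(schedule):
    committed = set()
    writers = {}     # item -> non-aborted transactions that wrote it, in order
    read_from = {}   # txn -> transactions whose writes it has read
    recoverable = cascadeless = strict = True
    for op, t, x in schedule:
        if op in "rw":
            last = writers.get(x, [])[-1:]          # [] or [latest writer]
            other = bool(last) and last[0] != t
            if other and last[0] not in committed:  # item is still "dirty"
                strict = False
                if op == "r":
                    cascadeless = False
            if op == "r" and other:
                read_from.setdefault(t, set()).add(last[0])
            if op == "w":
                writers.setdefault(x, []).append(t)
        elif op == "c":
            # Recoverable: everyone t read from must already be committed.
            if not read_from.get(t, set()) <= committed:
                recoverable = False
            committed.add(t)
        else:  # abort: t's writes no longer count as the latest values
            for ws in writers.values():
                ws[:] = [w for w in ws if w != t]
    if not recoverable:
        return "non-recoverable"
    if strict:
        return "strict"
    return "cascadeless" if cascadeless else "recoverable"


print(classify(parse("r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1;")))
print(classify(parse("r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2;")))
print(classify(parse("r1(X); w1(X); r1(Y); w1(Y); a1; r2(X); w2(X); c2;")))
```

The three calls print `non-recoverable`, `recoverable` and `strict` respectively. Because the classes are nested, a result of `strict` also means the schedule is cascadeless and recoverable.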

67
Exercise problem
• Which of the following schedules is (conflict)
serializable? For each serializable schedule,
determine the equivalent serial schedules.
a) r1(X); r3(X); w1(X); r2(X); w3(X);
b) r1(X); r3(X); w3(X); w1(X); r2(X);
c) r3(X); r2(X); w3(X); r1(X); w1(X);
d) r3(X); r2(X); r1(X); w3(X); w1(X);

68
Exercise problem
• Consider the three transactions T1, T2, and T3, and the
schedules S1 and S2 given below. Draw the serializability
(precedence) graphs for S1 and S2, and state whether
each schedule is serializable. If a schedule is serializable,
write down the equivalent serial schedule(s).
• T1: r1(X); r1(Z); w1(X);
• T2: r2(Z); r2(Y); w2(Z); w2(Y);
• T3: r3(X); r3(Y); w3(Y);
• S1: r1(X); r2(Z); r1(Z); r3(X); r3(Y); w1(X); w3(Y);
r2(Y); w2(Z); w2(Y);
• S2: r1(X); r2(Z); r3(X); r1(Z); r2(Y); r3(Y); w1(X);
w2(Z); w3(Y); w2(Y);

69
Exercise problem
• Consider schedules S3, S4, and S5 below.
Determine whether each schedule is strict,
cascadeless, recoverable, or non-recoverable.
• S3: r1(X); r2(Z); r1(Z); r3(X); r3(Y); w1(X); c1;
w3(Y); c3; r2(Y); w2(Z); w2(Y); c2;
• S4: r1(X); r2(Z); r1(Z); r3(X); r3(Y); w1(X);
w3(Y); r2(Y); w2(Z); w2(Y); c1; c2; c3;
• S5: r1(X); r2(Z); r3(X); r1(Z); r2(Y); r3(Y);
w1(X); c1; w2(Z); w3(Y); w2(Y); c3; c2;

70
