
International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.6, November 2014


Alireza Souri1 and Saeid Pashazadeh2*

Department of Computer Engineering, College of Engineering, East Azarbaijan Science and Research Branch, Islamic Azad University, Tabriz, Iran
Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran

Data replication is generally used to increase the accessibility, availability, performance and
scalability of database systems. Implementing data replication mechanisms, however, raises
consistency problems. In this paper, the performance trade-offs of consistency models for the
semi-active data replication protocol in distributed systems are analyzed. A brief discussion of
consistency models in data replication is given, and research on how client-centric guarantees
relate to data-centric models is discussed. How the guarantee conditions of data-centric and
client-centric consistency models can be provided is also analyzed. An analysis of the guarantees
of the consistency models is presented for the semi-active data replication protocol with single
and multiple clients, in the absence of failures and leader death. The experimental results show
that the semi-active data replication protocol is appropriate for distributed systems with
multi-client replication, such as web services.

Distributed systems, Consistency, Semi-active replication protocol, Data-centric model, Client-centric

Today, there is considerable interest in geographically distributed database systems, in which
the databases are distributed among different processing systems and a request shipping
mechanism is provided to support access to non-local data [1]. Because access to remote data is
expensive in terms of communication overhead and delay, data can be replicated either fully or
partially. Data replication reduces the number of remote accesses and greatly decreases the
access overhead for read-only transactions. However, the overhead of maintaining consistency of
the replicated copies in the presence of updates increases. The focus of database systems in
maintaining consistency is on the relationships between data items and the overall correctness of
the entire database [2]. Guaranteeing consistency in distributed databases is expensive because
consistency and isolation are typically guaranteed via locking mechanisms, which require
extensive communication.
Researchers in the distributed systems community investigate state shared by multiple replicas,
i.e., several copies of a datum exist which may or may not be identical. Executing operations on
these replicas may change the state of one or more replicas. Essentially, a consistency criterion
(or consistency model) defines which executions of a distributed system are considered correct
[3], i.e., which orders of operations leave the data in a correct state under the consistency
model in use. In the following parts of the paper, consistency of a distributed database is
defined as follows: a system is in a consistent state if all replicas are identical and the
operation ordering guarantees required by the specific consistency model are not violated [4].
In this paper, we analyze the trade-offs of consistency models for the semi-active data
replication protocol in distributed systems. Research on how client-centric guarantees relate to
data-centric models is discussed. How the guarantee conditions of data-centric and client-centric
consistency models can be provided is also analyzed. In section 2, the analytical model of the
semi-active data replication protocol is described. In section 3, the analysis of client-centric
and data-centric consistency models is presented for the semi-active data replication protocol.
Finally, we draw conclusions and describe future work in section 4.


In this part of the paper, the semi-active replication protocol is briefly described according to
[5, 6]. Semi-active replication is a middle approach between active replication and passive
replication. It does not require replicated service requests to be processed deterministically.
Semi-active replication allows non-determinism in computation and, therefore, its recovery
overhead is low (in comparison with active replication). It makes it possible to take into
account the timing requirements of real-time applications. In the semi-active replication
protocol, one replica of the group is always the leader and the others are the followers. In the
absence of faults, only the leader replica produces output messages. The follower replicas
autonomously carry out the same computation as the leader, except when a non-deterministic
decision must be made. In that case, they follow the instructions sent by their leader so that
they reach the same decision. A semi-active replication protocol has five generic stages [7].
These stages represent the procedure of an update operation in the protocol and are used to
characterize the different methods [8]. Figure 1 shows the procedure of the semi-active data
replication protocol for a single-client process.

Request Phase (RE): In this phase, a client submits a request to all replicas by using
Atomic Broadcast (AB).
Server Coordination Phase (SC): During the server coordination phase, the replicas
coordinate by using the order given by the AB protocol.
Execution Phase (EX): In this phase, all replicas execute the submitted requests in the
order in which they are delivered.
Agreement Coordination Phase (AC): This phase handles the case of a non-deterministic
choice in order to guarantee atomicity. The leader first informs the other replicas (the
followers) by using the VSB (View Synchronous Broadcast) protocol. This method is similar to
the two-phase commit protocol in eager update everywhere with the distributed locking approach.
Client Response Phase (CL): In the client response phase, the replicas send the response
back to the client.
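
The five phases can be illustrated with a minimal single-process sketch. It is an illustration under simplifying assumptions, not the protocol implementation of [5, 6]: Atomic Broadcast is reduced to sorting by a sequence number, View Synchronous Broadcast to passing the leader's choice directly, and the names `Replica`, `atomic_broadcast`, `run_update` and the request fields are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    store: dict = field(default_factory=dict)

    def execute(self, key, value):
        # EX: apply one delivered request to the local copy.
        self.store[key] = value


def atomic_broadcast(requests):
    # RE + SC: stand-in for Atomic Broadcast. Every replica is handed the
    # same sorted list, so all replicas agree on one delivery order.
    return sorted(requests, key=lambda r: r["seq"])


def run_update(leader, followers, requests, leader_choice):
    replicas = [leader] + followers
    responses = []
    for req in atomic_broadcast(requests):
        value = req["value"]
        if value is None:
            # AC: a non-deterministic decision. Only the leader decides;
            # handing its choice to the followers stands in for VSB.
            value = leader_choice(req)
        for replica in replicas:
            replica.execute(req["key"], value)
        # CL: the response for the client is recorded.
        responses.append((req["client"], req["key"], value))
    return responses


leader = Replica("R1")
followers = [Replica("R2"), Replica("R3")]
requests = [
    {"seq": 2, "client": "C1", "key": "x", "value": None},
    {"seq": 1, "client": "C1", "key": "x", "value": 10},
]
responses = run_update(leader, followers, requests, lambda req: 42)
```

Because every replica applies the same values in the same order, the followers end up with the same store as the leader, which is the failure-free behaviour assumed throughout this paper.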



Figure 1. Semi-active replication protocol by single-client process.

Figure 2 shows the procedure of the semi-active data replication protocol for processing requests
of multiple independent clients. When two clients request updates of data items, two main
scenarios can be distinguished. In the first scenario, both clients update the same data item x.
In this scenario, the semi-active protocol executes the requests in order of earlier submission
time and lower client ID. Guaranteeing the consistency models is important in this scenario.
Write-write and read-write conflicts mean that some consistency models, such as the causal and
sequential consistency models, cannot guarantee the client-centric consistency models in the
semi-active replication protocol.
In the second scenario, each client updates a separate data item, such as x and z, in the
procedure of the update operation. In this scenario, the client requests can execute
simultaneously [9]. The semi-active protocol still handles both client requests by using the
atomic broadcast and view synchronous broadcast protocols. The leader manages sending and
receiving requests for each replica separately. In the following section, we analyze how
consistency models can be guaranteed for the semi-active replication protocol.
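
The ordering rule of the first scenario and the distinction between the two scenarios can be sketched as follows. This is a sketch under the assumption that the client ID is used only to break ties between equal submission times; the field names `submitted_at`, `client_id`, `item` and `op` are hypothetical.

```python
def execution_order(requests):
    # Earlier submission first; equal times fall back to the lower client ID.
    return sorted(requests, key=lambda r: (r["submitted_at"], r["client_id"]))


def conflicts(a, b):
    # Two requests conflict if they touch the same item and at least one writes.
    return a["item"] == b["item"] and "w" in (a["op"], b["op"])


requests = [
    {"client_id": 2, "submitted_at": 5, "item": "x", "op": "w"},
    {"client_id": 1, "submitted_at": 5, "item": "x", "op": "w"},
    {"client_id": 3, "submitted_at": 4, "item": "z", "op": "w"},
]
order = [r["client_id"] for r in execution_order(requests)]
```

The two writes to x conflict and must be ordered, whereas the write to z does not conflict with either of them and could run at the same time.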

Figure 2. Semi-active replication protocol in processing requests of multiple clients.



An analysis of how consistency models can be guaranteed for the semi-active replication protocol
is presented in this part of the paper. Both data-centric and client-centric consistency
guarantees have two scopes: ordering and staleness. Staleness describes how far a given replica
is lagging behind, expressed either in terms of time or in terms of versions [10]. Small bounded
staleness values can often be tolerated by applications as long as the corresponding real-world
events would have the same or higher staleness values without a database system. In general,
apart from the context of the semi-active protocol, when replica R1 sends a request to replica R2,
the storage of replica R1 is updated right away, whereas replica R2 might not be consistent with
replica R1 for some time. Such small staleness values occur often but usually go unnoticed.
Ordering is more critical than staleness. In a setting with strict consistency, all requests must
be executed on all replicas in the same chronological order. This consistency model is hard to
implement in distributed databases because clock synchronization issues and communication delays
can cause replicated servers to disagree on the chronological order of events [11]. The standard
database mechanism of locking offers poor performance in a distributed setting. For this reason,
data-centric consistency models exist that relax certain ordering requirements while keeping
those that are essential to applications. These models can be ordered by the strictness of their
guarantees.
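
The two ways of expressing staleness mentioned above, in versions and in time, can be illustrated with a small sketch. The function names and the sample version counters are hypothetical and serve only as an example.

```python
def version_staleness(latest_version, replica_version):
    # How many versions the replica is lagging behind the newest one.
    return max(0, latest_version - replica_version)


def time_staleness(now, last_applied_at):
    # How long ago the replica last applied an update (same time unit as `now`).
    return max(0.0, now - last_applied_at)


replica_versions = {"R1": 7, "R2": 5}
latest = max(replica_versions.values())
lag = {name: version_staleness(latest, v) for name, v in replica_versions.items()}
```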
Definition 1 (View side): Let the relation $\xrightarrow{C_i}$ denote the client view and the
relation $\xrightarrow{S_i}$ denote the server view, which fulfill the following conditions for a
server $S_i$ and a client $C_i$ in their views of operations on shared values. $O_R$, $O_W$ and
$O_{C_i}$ denote the read operations, the write operations and the operations of client $C_i$,
respectively.

For w(x)v, w(x)u OW and r(x)v OR , v u : (w(x)v r(x)v) ( w(x)v w(x)u


For r(x)v OR , o(y)u Oci : (r(x)v o(y)u r(x)v o(y)u).

3.1 Client-centric Consistency

Client-centric consistency models take a different approach. Cross effects between the models
exist, and the guarantees themselves are disjoint in their promises and complement each other
[13]. The client-centric consistency models are described first; the data-centric models and how
the two are related are presented afterwards.
Monotonic Read Consistency (MRC) guarantees that a process that has read version n of a data
item x will thereafter always read versions ≥ n [8]. This is helpful because an application may
not expect data to become visible instantaneously, but versions at least become visible in
chronological order, i.e., the system never goes backward in time [14]. Consider Client 1 in the
semi-active replication shown in figure 2. If this client first sees the value of a data item in
the current state of the system and then tries to transfer the value of this version of the data
item to a replica R3, the transfer may fail due to deficient assets; this will at least cause
severe customer irritation, if not more. The MRC is formally expressed with the following
condition:

For all $C_i$ and $S_j$: $r(x)v \xrightarrow{C_i} r(y_j)u \Rightarrow w(x)v \xrightarrow{S_j} r(y)u$
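
As an informal illustration of MRC, a session trace can be checked with a few lines of code. This is a sketch under the assumption that versions are integers that grow with every update and that a session is recorded as a list of hypothetical `(op, item, version)` tuples; it is not part of the formal condition.

```python
def satisfies_monotonic_reads(session):
    newest_read = {}
    for op, item, version in session:
        if op != "r":
            continue
        # A read must never return a version older than one already read.
        if version < newest_read.get(item, version):
            return False
        newest_read[item] = version
    return True


ok_session = [("r", "x", 1), ("r", "x", 2), ("r", "x", 2)]
bad_session = [("r", "x", 2), ("r", "x", 1)]
```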

Read Your Writes Consistency (RYWC) guarantees that a process that has written version n of a
data item x will thereafter always read a version that is at least as new as n, i.e., ≥ n [3].
This helps to avoid user irritation when replica R1 checks the state of its storage, sees only
the old value of the data item and consequently sends the same amount of its version again.
Generally, RYWC avoids the impression of a failed request, which leads a client to issue the same
request several times. For idempotent operations, reissuing requests causes only additional load
on the system, while reissuing other requests creates severe inconsistencies. The RYWC is
formally expressed with the following condition:

For all $C_i$ and $S_j$: $w(x)v \xrightarrow{C_i} r(y_j)u \Rightarrow w(x)v \xrightarrow{S_j} r(y)u$.
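
RYWC can be checked on a session trace in a similar informal way. The sketch assumes integer versions and a hypothetical trace format of `(op, item, version)` tuples, where a write records the version it created.

```python
def satisfies_read_your_writes(session):
    last_written = {}
    for op, item, version in session:
        if op == "w":
            # Remember the newest version this client has written.
            last_written[item] = max(version, last_written.get(item, version))
        elif version < last_written.get(item, version):
            # A read returned something older than the client's own write.
            return False
    return True


ok_session = [("w", "x", 3), ("r", "x", 3)]
bad_session = [("w", "x", 3), ("r", "x", 2)]
```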

Monotonic Writes Consistency (MWC) guarantees that two updates by the same client are serialized
in the order in which they arrive at the storage system in a semi-active replication protocol
[15]. This is useful to avoid seemingly lost updates, where an application first writes and then
updates a datum but the update is executed before the initial write and is thus overwritten. For
example, in the semi-active replication protocol a replica R1 might have corrected the version of
replica R2 before finalizing the transfer. If MWC is not guaranteed, version n might end up with
the wrong value. The consistency condition is formulated as follows:

For all $C_i$: $w(x)v \xrightarrow{C_i} w(y)u \Rightarrow$ for all $S_j$: $w(x)v \xrightarrow{S_j} w(y)u$.
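
A replica log can be checked against MWC as sketched below. The sketch assumes that every write has a unique identifier and that `client_writes` lists one client's writes in issue order; both names and the log format are hypothetical.

```python
def satisfies_monotonic_writes(client_writes, server_log):
    position = {write: i for i, write in enumerate(server_log)}
    last = -1
    missing_earlier = False
    for write in client_writes:
        if write not in position:
            # Not applied yet; no later write of this client may be applied.
            missing_earlier = True
            continue
        if missing_earlier or position[write] < last:
            return False
        last = position[write]
    return True


in_order = satisfies_monotonic_writes(["w1", "w2"], ["w1", "other", "w2"])
swapped = satisfies_monotonic_writes(["w1", "w2"], ["w2", "w1"])
skipped = satisfies_monotonic_writes(["w1", "w2"], ["w2"])
```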

Write Follows Read Consistency (WFRC) guarantees that an update following a read of version n of
a data item will only execute on replicas that have seen at least version n of that data item,
i.e., a version ≥ n [4]. This also helps against seemingly lost updates, where the update is
overwritten by a delayed update request for versions ≤ n. This model essentially extends the
guarantee of MWC to updates by other clients that have at least been seen. In figure 2, these
client-centric properties are typically guaranteed explicitly. Benchmarks can be used to
determine the probability of violations or to measure the second dimension, staleness [16]. The
formal definition of WFRC is as follows:

For all $C_i$: $r(x)v \xrightarrow{C_i} w(y)u \Rightarrow$ for all $S_j$: $w(x)v \xrightarrow{S_j} w(y)u$.
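
Read operationally, WFRC restricts which replicas may execute a write that follows a read. The sketch below assumes that each replica tracks an integer version per data item (0 if the item is unknown); the function name and the data layout are hypothetical.

```python
def eligible_replicas(replica_versions, item, read_version):
    # A write issued after reading `read_version` may only run on replicas
    # that have already reached at least that version of the item.
    return sorted(
        name
        for name, versions in replica_versions.items()
        if versions.get(item, 0) >= read_version
    )


replica_versions = {"R1": {"x": 3}, "R2": {"x": 2}, "R3": {}}
after_reading_v3 = eligible_replicas(replica_versions, "x", 3)
after_reading_v2 = eligible_replicas(replica_versions, "x", 2)
```

Replicas that are not yet eligible would have to defer the write until the missing versions have arrived.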

3.2 Data-centric Consistency

In this section, we present data-centric consistency models ordered by the strictness of their
guarantees in the semi-active data replication protocol. For each model, we discuss how it can be
translated into a client-centric consistency model. As already discussed, there are two
consistency scopes: staleness and ordering. The following consistency models (apart from
Linearizability) do not consider staleness. In fact, increasing the strictness of ordering
guarantees often leads to higher staleness values, as updates may not be applied directly but
must first fulfill their dependencies [14].
The lowest possible ordering guarantee is typically described as Weak Consistency [17]. As the
name states, the guarantees are so weak that they do not really exist. Essentially, weak
consistency translates to a colloquial "replicas might by chance become consistent". While an
implementation may or may not have a protocol to synchronize replicas, a typical use case can be
found in the context of a browser cache: it is updated from time to time, but replicas will
rarely be consistent. As Weak Consistency does not provide any ordering guarantees at all, there
is no relation to client-centric consistency models in the semi-active data replication protocol.
Synchronization is taken into account in ordering read and write operations if:

All accesses to synchronization variables are seen by all processes (or nodes, processors)
in the same order (sequentially); these are synchronization operations. Accesses to
critical sections are seen sequentially.
All other accesses may be seen in a different order on different processes (or nodes,
processors).
The set of both read and write operations in between different synchronization operations
is the same in each process.

Eventual Consistency (EC) is a little stricter. It requires convergence of replicas, i.e., in the
absence of updates and failures the system converges towards a consistent state. Updates may be
reordered in any way possible, and a consistent state is simply defined as all replicas being
identical. EC is very vague in terms of concrete guarantees but is very popular for web-based
services. In terms of client-centric consistency guarantees, EC often fulfills these guarantees
for a majority of requests but does not guarantee to do so. As an example, Amazon S3 currently
delivers MRC for about 95% of all requests, whereas it still violated MRC in about 12% of all
requests in 2011 [18]. While there are certainly some use cases where EC cannot be applied, it
often suffices, as the real world itself is inherently eventually consistent. The difference is
that more conflict resolution is necessary at the application layer [19], which requires a higher
skill set from application developers. Instead of pessimistically locking data items, guesses and
apologies are used [20]. Data stores that are eventually consistent thus have the property that,
in the absence of updates, all replicas converge toward identical copies of each other. Eventual
consistency essentially requires only that updates are guaranteed to propagate to all replicas.
Write-write conflicts are often relatively easy to solve when assuming that only a small group of
processes can perform updates. Eventual consistency is therefore often cheap to implement.
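
Convergence in the absence of new updates can be illustrated with a small anti-entropy sketch. The last-writer-wins rule used here is an assumption chosen for the example, not a mechanism taken from the paper; it further assumes that every write to an item carries a unique timestamp. The names `merge` and `anti_entropy` are hypothetical.

```python
def merge(a, b):
    # Last-writer-wins: for every item keep the (timestamp, value) pair
    # with the larger timestamp.
    merged = dict(a)
    for item, (timestamp, value) in b.items():
        if item not in merged or timestamp > merged[item][0]:
            merged[item] = (timestamp, value)
    return merged


def anti_entropy(replicas):
    # Propagate all updates to all replicas once no new updates arrive.
    state = {}
    for replica in replicas:
        state = merge(state, replica)
    return [dict(state) for _ in replicas]


r1 = {"x": (1, "a")}
r2 = {"x": (2, "b"), "y": (1, "c")}
r3 = {}
converged = anti_entropy([r1, r2, r3])
```

After the exchange all three copies are identical, which is all that EC requires; nothing is said about the order in which clients observed the intermediate values.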
Causal Consistency (CC) is the strictest level of consistency that can be achieved in an always
available storage system [21] based on the tradeoffs of the CAP theorem [22]. In a causally
consistent storage system, all requests that have a causal relationship to another request must be
serialized (i.e., executed) in the same order on all replicas while unrelated requests may be
serialized in arbitrary order. A client request cr2 causally depends on a client request cr1:

If both requests are issued by the same client and cr1 was received to the storage system
before cr2,
If cr2 is a read that returns the result of cr1 which is an update request or
If there is a transitive relation between both requests [3].

Of course, CC captures potential causality so that systems like COPS have to evaluate large
dependency trees before applying an update [23]. This adds an overhead and increases staleness
as updates cannot become visible right away. Bailis et al. [2] proposed minimizing this impact by
having the application explicitly define dependencies that need to be considered. A typical
implementation uses vector clocks to identify (potential) causal dependencies. CC can also be
defined via the client-centric guarantees discussed above: if all four are fulfilled, the system is
causally consistent [3]. It is also possible to create the client-side illusion of CC with the
combination of version caching and vector clocks [24]. As Guerraoui and Hari pointed out, CC
does not require replica convergence [4]. Convergence is only asserted when the latest update is
causally dependent on all previous writes since the last idempotent replace-update and staleness is
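
The use of vector clocks to identify (potential) causal dependencies can be sketched as follows. The representation of a clock as a dictionary from client name to counter and the function name `compare` are assumptions made for this example.

```python
def compare(a, b):
    # Missing entries count as 0.
    clients = set(a) | set(b)
    a_le_b = all(a.get(c, 0) <= b.get(c, 0) for c in clients)
    b_le_a = all(b.get(c, 0) <= a.get(c, 0) for c in clients)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a potentially caused b
    if b_le_a:
        return "after"       # b potentially caused a
    return "concurrent"      # no causal relation; any order is allowed


dependent = compare({"A": 1}, {"A": 1, "B": 1})
unrelated = compare({"A": 2}, {"B": 1})
```

Requests whose clocks compare as "before" or "after" must be serialized in that order on every replica, whereas "concurrent" requests may be serialized differently on different replicas.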
Sequential Consistency (SC) is a very strict consistency model and cannot be achieved in always
available systems. It requires that all requests are serialized in the same order on all replicas
and that requests by the same client are executed in the order in which they are received by the
storage system [13]. While this model does not guarantee anything about the recentness of values
read by clients, it requires that all updates become visible to clients in the same order. Often,
SC is described as strict consistency, which is not entirely true as staleness is not addressed.
But since real-world staleness values are often very small, SC usually suffices even for
applications that seemingly require strict consistency. Generally, vector clocks that define
causal relationships can be in conflict (e.g., for unrelated concurrent updates). If vector
clocks are used for request ordering and an approach exists that defines a transitive,
Multi-Client order for all conflicting vector clocks, then a causally consistent system becomes
sequentially consistent. Focusing on client-centric consistency guarantees, the main difference
between CC and SC is that WFRC becomes Multi-Client in so far as reads by all clients are
considered. This means that as soon as a client has seen a particular version n, all updates by
other clients will only be executed on replicas that have already processed the update to version
n. This guarantee can be provided because SC guarantees that all replicas execute all updates in
the same order. So, once a version n has been read, it is guaranteed to have been finally
serialized as that version, so that any later updates will be serialized with a higher version
number such as version n+1.
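
One possible way to define a transitive order for conflicting vector clocks is sketched below. It is only an example of such a rule and an assumption of this sketch, not the approach of a particular system: requests are sorted by the sum of their clock entries and then by client name. The sum grows strictly along every causal dependency, so the resulting total order never contradicts causality. The field names `id`, `client` and `clock` are hypothetical.

```python
def total_order(events):
    # Sort key: (sum of clock entries, client name). Causally later requests
    # always have a larger sum; the client name orders concurrent ones.
    return sorted(events, key=lambda e: (sum(e["clock"].values()), e["client"]))


events = [
    {"id": "c", "client": "B", "clock": {"A": 1, "B": 2}},  # depends on a and b
    {"id": "b", "client": "B", "clock": {"B": 1}},          # concurrent with a
    {"id": "a", "client": "A", "clock": {"A": 1}},
]
serialized = [e["id"] for e in total_order(events)]
```

Every replica that applies the same rule serializes the concurrent requests a and b identically, which is the step from a causally consistent to a sequentially consistent execution described above.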
Linearizability (LIN) describes what is typically meant by strict consistency. It considers not
only ordering but also staleness, i.e., it requires that all requests are ordered chronologically
by their arrival time in the system and that all requests always see the effects of the preceding
requests. This can be visualized as all operations happening instantaneously at a single point in
time and not during an interval of time [25]. LIN is hard to implement in distributed systems
because it always needs clock synchronization (which is necessary to determine a chronological
order of requests). In practice, however, a sufficiently high clock precision is required so that
violations are highly improbable. Furthermore, in case of violations LIN downgrades to SC.
While consensus protocols can guarantee that requests are serialized in the same order on all
replicas, they cannot guarantee that all replicas execute requests in the actual chronological
order of arrival in the system. An implementation using distributed locking is likely to show
poor performance. Expressed in terms of client-centric consistency guarantees, the difference
between SC and LIN is that both RYWC and MWC become Multi-Client guarantees. This means that a
client will always see all updates and that all writes will be executed in the (Multi-Client)
chronological order. MRC then also becomes Multi-Client as a side effect.
Beyond the data-centric consistency models discussed here, there are a few other models, which we
leave out as no implementations exist [26]. Table 1 gives an overview of the relationship between
the different client-centric and data-centric consistency models in the semi-active replication
protocol. Entries N/A mean that the guarantee may be reached for single requests from time to
time, but only by chance. In contrast, Frequently specifies that such a guarantee is fulfilled
for a large number of requests. Single-Client describes that the guarantees from section 3.1 are
fulfilled, whereas we use Multi-Client to describe that such a guarantee is extended to all
clients at the same time. LIN satisfies all of the client-centric consistency models for both
single-client and multi-client. For the EC model, the semi-active replication protocol can
satisfy all of the client-centric consistency models from time to time. If the semi-active
replication protocol can satisfy all of the client-centric consistency models in single-client
status, then the CC model can be satisfied by the semi-active replication protocol. In the SC
model, only WFRC can be satisfied in multi-client status. If a timed atomic broadcast protocol
can be used, then the semi-active replication protocol can guarantee all of the client-centric
consistency models in multi-client status [10].
Table 1. Association of Data-centric and Client-centric Consistency Models in Semi-active data replication
protocol according to Single-client and Multi-client.
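Data-centric model        MR consistency   RYW consistency   MW consistency   WFR consistency
Weak consistency          N/A              N/A               N/A              N/A
Eventual consistency      Frequently       Frequently        Frequently       Frequently
Causal consistency        Single-Client    Single-Client     Single-Client    Single-Client
Sequential consistency    Single-Client    Single-Client     Single-Client    Multi-Client
Linearizability           Multi-Client     Multi-Client      Multi-Client     Multi-Client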

However, the relationship between client-centric and data-centric consistency models depends on
conditions such as failures, faults, atomic broadcast time-outs and leader death [27]. When a
failure occurs, the semi-active data replication protocol cannot guarantee the consistency
models. The semi-active replication protocol therefore has problems establishing consistency
models in times of failure and leader death.

In this paper, a functional model was presented for the semi-active data replication protocol for
single-client and multi-client applications. Then, an analysis of client-centric and data-centric
consistency models was presented for the semi-active data replication protocol. The guarantees of
the consistency models were analyzed for single and multiple clients, for the semi-active data
replication protocol without failure and leader death. The experimental results show that the
semi-active data replication protocol is appropriate for distributed systems with multi-client
replication, such as web services. In future work, we are interested in researching a new variant
of the semi-active data replication protocol that can satisfy all of the client-centric
consistency models when leader death and failures occur.


[1] J. García-García, C. Ordonez, and P. Tosic, "Efficiently repairing and measuring replica consistency
in distributed databases," Distributed and Parallel Databases, vol. 31, pp. 377-411, 2013.
[2] P. Bailis, A. Fekete, A. Ghodsi, J. M. Hellerstein, and I. Stoica, "The potential dangers of causal
consistency and an explicit solution," presented at the Proceedings of the Third ACM Symposium on
Cloud Computing, San Jose, California, 2012.
[3] J. Brzezinski, C. Sobaniec, and D. Wawrzyniak, "From session causality to causal consistency," in
Parallel, Distributed and Network-Based Processing, 2004. Proceedings. 12th Euromicro Conference
on, 2004, pp. 152-158.
[4] R. Guerraoui and C. Hari, "On the consistency problem in mobile distributed computing," presented
at the Proceedings of the second ACM international workshop on Principles of mobile computing,
Toulouse, France, 2002.
[5] F. B. Schneider, "Implementing fault-tolerant services using the state machine approach: a tutorial,"
ACM Comput. Surv., vol. 22, pp. 299-319, 1990.
[6] D. Powell, "Distributed Fault Tolerance - Lessons Learned from Delta-4," presented at the Revised
Papers from a Workshop on Hardware and Software Architectures for Fault Tolerance, 1994.
[7] A. Déplanche, P. Y. Theaudiere, and Y. Trinquet, "Implementing a semi-active replication
strategy in CHORUS/ClassiX, a distributed real-time executive," in Reliable Distributed Systems,
1999. Proceedings of the 18th IEEE Symposium on, 1999, pp. 90-101.
[8] M. Wiesmann, F. Pedone, A. Schiper, B. Kemme, and G. Alonso, "Understanding replication in
databases and distributed systems," in Distributed Computing Systems, 2000. Proceedings. 20th
International Conference on, 2000, pp. 464-474.
[9] M. Almulla, K. Abrougui, and A. Boukerche, "LEADMesh: Design and analysis of an efficient leader
election protocol for wireless mesh networks," Simulation Modelling Practice and Theory, vol. 36,
pp. 22-32, 2013.
[10] D. Bermbach and J. Kuhlenkamp, "Consistency in distributed storage systems," in Networked
Systems. vol. 7853, V. Gramoli and R. Guerraoui, Eds., ed: Springer Berlin Heidelberg, 2013, pp.
[11] M. T. Ozsu, Principles of Distributed Database Systems: Prentice Hall Press, 2007.
[12] A. S. Tanenbaum and R. V. Renesse, "Distributed operating systems," ACM Comput. Surv., vol. 17,
pp. 419-470, 1985.


[13] A. S. Tanenbaum and M. v. Steen, Distributed Systems: Principles and Paradigms (2nd Edition):
Prentice-Hall, Inc., 2006.
[14] Y. Zhu and J. Wang, "Client-centric consistency formalization and verification for system with
large-scale distributed data storage," Future Generation Computer Systems, vol. 26, pp. 1180-1188.
[15] D. B. Terry, A. J. Demers, K. Petersen, M. Spreitzer, M. Theimer, and B. W. Welch, "Session
guarantees for weakly consistent replicated data," presented at the Proceedings of the Third
International Conference on Parallel and Distributed Information Systems, 1994.
[16] M. R. Rahman, W. Golab, A. AuYoung, K. Keeton, and J. J. Wylie, "Toward a principled framework
for benchmarking consistency," presented at the Proceedings of the Eighth USENIX conference on
Hot Topics in System Dependability, Hollywood, CA, 2012.
[17] W. Golab, X. Li, and M. A. Shah, "Analyzing consistency properties for fun and profit," presented at
the Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed
computing, San Jose, California, USA, 2011.
[18] D. Bermbach and S. Tai, "Eventual consistency: How soon is eventual? An evaluation of Amazon
S3's consistency behavior," presented at the Proceedings of the 6th Workshop on Middleware for
Service Oriented Computing, Lisbon, Portugal, 2011.
[19] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, et al., "Dynamo:
amazon's highly available key-value store," SIGOPS Oper. Syst. Rev., vol. 41, pp. 205-220, 2007.
[20] P. Bailis and A. Ghodsi, "Eventual consistency today: limitations, extensions, and beyond," Queue,
vol. 11, pp. 20-32, 2013.
[21] S. Vazhkudai, S. Tuecke, and I. Foster, "Replica selection in the Globus data grid," in Cluster
Computing and the Grid, 2001. Proceedings. First IEEE/ACM International Symposium on, 2001, pp.
[22] S. Gilbert and N. Lynch, "Brewer's conjecture and the feasibility of consistent, available,
partition-tolerant web services," SIGACT News, vol. 33, pp. 51-59, 2002.
[23] W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen, "Don't settle for eventual: scalable
causal consistency for wide-area storage with COPS," presented at the Proceedings of the
Twenty-Third ACM Symposium on Operating Systems Principles, Cascais, Portugal, 2011.
[24] D. Bermbach, J. Kuhlenkamp, B. Derre, M. Klems, and S. Tai, "A middleware guaranteeing
client-centric consistency on top of eventually consistent datastores," in Cloud Engineering (IC2E), 2013
IEEE International Conference on, 2013, pp. 114-123.
[25] M. P. Herlihy and J. M. Wing, "Linearizability: a correctness condition for concurrent objects," ACM
Trans. Program. Lang. Syst., vol. 12, pp. 463-492, 1990.
[26] J. Brzezinski, C. Sobaniec, and D. Wawrzyniak, "Session guarantees to achieve PRAM consistency of
replicated shared objects," in Parallel Processing and Applied Mathematics. vol. 3019, R.
Wyrzykowski, J. Dongarra, M. Paprzycki, and J. Waniewski, Eds., ed: Springer Berlin Heidelberg,
2004, pp. 1-8.
[27] V. Hadzilacos and S. Toueg, "Fault-tolerant broadcasts and related problems," in Distributed systems
(2nd Ed.), ed: ACM Press/Addison-Wesley Publishing Co., 1993, pp. 97-145.

