
Oracle Real Application Clusters
on Extended Distance Clusters

Updated for Oracle RAC 10g Release 2

An Oracle White Paper
October 2006




Oracle Real Application Clusters on Extended
Distance Clusters

Executive Overview
Introduction
Benefits of RAC on Extended Distance Clusters
    Full Utilization of Resources
    Extreme Rapid Recovery
Components & Design Considerations
    Connectivity
    Storage
    Cluster Quorums, or Ensuring Survival of One Part of the Cluster
Hardware Vendor Specifics
    Sun
    HP
    IBM
Full Oracle Stack
    Oracle Clusterware
    ASM
Comparison with a Local RAC and Data Guard Remote Site
    Comparison Summary
    Strengths of RAC on Extended Distance Clusters
    Strengths of Local RAC + Data Guard at a Remote Site
Conclusion
Appendix A: Detailed Quorum Examples
Appendix B: Customers Using RAC on Extended Distance Clusters
References


Oracle Real Application Clusters on Extended
Distance Clusters

EXECUTIVE OVERVIEW
Oracle Real Application Clusters (RAC) is a proven mechanism for local high
availability (HA) for database applications. It was designed to support clusters that
reside in a single physical datacenter. As technology advances, customers are
looking at the viability of using RAC over a distance.

Can RAC be used over a distance, and what does this imply? RAC on Extended
Distance Clusters is an architecture that provides extremely fast recovery from a
site failure and allows all nodes, at all sites, to actively process transactions as
part of a single database cluster. While this architecture creates great interest and has
been successfully implemented, it is critical to understand where this architecture
best fits, especially in regards to distance, latency, and the degree of protection it
provides.
The high impact of latency, and therefore distance, creates some practical
limitations as to where this architecture can be deployed. This architecture fits best
where the two datacenters are located relatively close (~100 km) and where the
extremely expensive cost of setting up direct cables with dedicated channels
between the sites has already been incurred.
RAC on Extended Distance Clusters provides greater high availability than local
RAC, but it may not fit the full Disaster Recovery requirements of your
organization. Feasible separation is great protection for some disasters (local power
outage, airplane crash, server room flooding) but not all. Disasters such as
earthquakes, hurricanes, and regional floods may affect a greater area. Customers
should do an analysis to determine if both sites are likely to be affected by the
same disaster.

For comprehensive protection against disasters, including protection against
corruptions and regional disasters, Oracle recommends the use of Data Guard with
RAC, as described in the Maximum Availability Architecture (MAA). Data Guard
also provides additional benefits such as support for full rolling upgrades across
Oracle versions.
Configuring an extended distance cluster is more complex than a local cluster.
Specific focus needs to go into node layout, quorum disks, data disk placement,
and other factors discussed in this paper.

Implemented properly, this architecture can provide greater HA than a local RAC
database. This paper will address the necessary components, the benefits and
limitations of this architecture, and will highlight some actual customer examples.
INTRODUCTION
Oracle's Real Application Clusters (RAC) is designed primarily as a scalability and
availability solution that resides in a single data center. It is possible, under certain
circumstances, to build and deploy a RAC system where the nodes in the cluster
are separated by greater distances. For example, if a customer has a corporate
campus, they might want to place the individual RAC nodes in separate buildings.
This configuration provides a degree of disaster tolerance, in addition to the
normal RAC high availability, since a fire in one building would not, if properly set
up, stop database processing.

This paper discusses the potential benefits that attract customers to this type of
architecture, covers the required components and the design considerations that
should be weighed when implementing, reviews empirical performance data
over various distances, and covers the additional advantages that are provided by
an Oracle Data Guard solution. Finally, it looks at several case studies from actual
production customer implementations.

Clusters where all the nodes are not local have been referred to by many names,
including campus clusters, metro clusters, geo clusters, stretch clusters and
extended clusters. Some of these names imply a vague notion of distance range.
Throughout this paper, this type of configuration will be referred to as RAC on
Extended Distance Clusters.

This paper is intended to provide a deeper understanding of the topic and to allow
one to determine if this type of configuration is applicable and appropriate.

BENEFITS OF RAC ON EXTENDED DISTANCE CLUSTERS
Implementing a RAC database on a cluster where some of the nodes are located at a
different site is attractive to customers for the two main advantages it provides:

Full utilization of resources

Being able to distribute any and all work across all nodes, including running a
single workload across the whole cluster, allows for the greatest flexibility in the usage
of resources.
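As an illustration of connect-time load balancing across the sites (a minimal sketch; the alias, host, and service names below are hypothetical assumptions, not from the original paper), an Oracle Net alias can spread sessions over the nodes at both sites and fail them over if one site is down:

    SALES =
      (DESCRIPTION =
        (ADDRESS_LIST =
          (LOAD_BALANCE = ON)
          (FAILOVER = ON)
          (ADDRESS = (PROTOCOL = TCP)(HOST = siteA-node1-vip)(PORT = 1521))
          (ADDRESS = (PROTOCOL = TCP)(HOST = siteB-node1-vip)(PORT = 1521))
        )
        (CONNECT_DATA = (SERVICE_NAME = sales))
      )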







Extreme Rapid Recovery

Should one site fail, for example because of a fire at that site, all work can be routed
to the remaining site, which can very rapidly (in 1-2 minutes) take over the processing.








[Figure: All work gets distributed to all nodes at Site A and Site B — one physical database.]






[Figure: After a site failure, work continues on the remaining site — one physical database.]






COMPONENTS & DESIGN CONSIDERATIONS
RAC on an Extended Distance Cluster is very similar to a RAC implementation at
a single site.
To build a RAC database on an Extended Distance Cluster environment you will
need to:

o Place one set of nodes at Site A
o Place the other set of nodes at Site B
o Use fast dedicated connectivity between the nodes/buildings for RAC
  cross-instance communication (Dense Wavelength Division Multiplexing
  (DWDM) or Dark Fiber is optional)
o Use host- or array-based mirroring to host all the data on both
  sites and keep it synchronously mirrored

[Figure: Extended distance cluster architecture — one or more database servers
(nodes) at each physical location; a single database mirrored physically across the
two locations; Fibre Channel switches for SAN disk access (FC-SW over DWDM);
a dedicated Gigabit Ethernet switch at each site for the memory interconnect; a
redundant public network; DWDM links connecting Site A and Site B.]

Details of the components, and design considerations, follow.
Connectivity

Networking requirements for an extended distance cluster are much greater than
those of a normal Wide Area Network (WAN) used for Disaster Recovery. This plays
out in two aspects: necessary connections and latency.
Necessary Connections
The Interconnect, SAN, and IP networking need to be kept on separate dedicated
channels, each with the required redundancy. Redundant connections must not share
the same Dark Fiber (if used), switch, path, or even building entrances. Keep in
mind that cables can be cut.







The SAN and Interconnect connections need to be on direct point-to-point cables
(see the effects of latency in the next section). Traditional networks are limited to
about 10 km if you are to avoid using repeaters. Dark Fiber networks allow the
communication to occur without these repeaters. Since latency is limited, Dark
Fiber networks allow for a greater separation between the nodes. The
disadvantage of Dark Fiber networks is that they can cost hundreds of thousands of
dollars, so generally they are only an option if they already exist between the two
sites.

Latency effects and the performance implications of distance are discussed in the
Latency & Empirical Performance Results section.
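Before deploying, it is worth verifying the round-trip latency and path of each candidate link. A simple hedged check (the hostnames are hypothetical):

    # 100 round trips over the private interconnect; the avg value should be
    # in the low single-digit milliseconds for a viable extended cluster
    ping -c 100 -q nodeB-priv

    # confirm the path is direct, with no unexpected switch/router hops
    traceroute nodeB-priv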
[Figure: Redundant connectivity between Site A and Site B — dual public network,
dual private interconnect, and dual SAN links.]
Storage

RAC on Extended Distance Clusters by definition has multiple active instances on
nodes at different locations. For availability reasons the data needs to be located
at both sites, and therefore one needs to look at alternatives for mirroring the
storage.
Host Based Mirroring (Active/Active Storage)

o Use two SAN/FC storage subsystems, one co-located with each node.
o Standard, cluster-aware, host-based (OS level) mirroring software is
  implemented across both disk systems. With this, writes are propagated at the
  OS level to both sets of disks, making them appear as a single set of disks
  independent of location. These Logical Volume Managers (LVMs) need to be
  tied closely to the clusterware. Examples include Veritas CVM, HP-UX
  MirrorDisk/UX, and Oracle's Automatic Storage Management (ASM).
o While there may be a performance impact[1] from doing host-based versus
  array-based mirroring, this is the preferred configuration from an availability
  perspective. When we refer to RAC on Extended Distance Clusters in this
  paper, it generally refers to this active/active storage configuration.


[1] Host-based mirroring requires CPU cycles from the host machines. Array-based
mirroring offloads this work to the storage layer. Whether that is an advantage or a
disadvantage depends on which layer has spare cycles, or where it is more
cost-effective to add them.
[Figure: Host-based mirroring — both storage subsystems are active (Primary/Primary).]
Real Application Clusters on Extended Distance Clusters - Page 9
Array Based Mirroring (Active/Failover Storage)








o Use two SAN/FC storage subsystems, one co-located with each node, with
  each cross-cabled to both nodes.
o One storage subsystem has all the live database files on it; all writes are
  sent to this system.
o The second storage subsystem is kept in sync by an array-based mirroring
  mechanism (e.g. EMC's SRDF, HP's CA, etc.) from the first storage subsystem.
o Performance impacts in this case come partly from the additional work the
  storage array does for the mirroring, but more importantly from I/Os from
  the secondary site having to cross the distance four times[2] before they
  return control.
o In this case additional cycles are consumed in the storage arrays to do the
  mirroring, and additional latency is introduced for the "secondary" site I/O,
  as it needs to first go to the primary storage and then be mirrored to the
  secondary storage.
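To make the four crossings concrete, consider an illustrative back-of-the-envelope calculation (not from the original tests): light in fiber covers roughly 200 km per millisecond, so at a 50 km site separation each crossing adds about 0.25 ms, and a synchronous write issued from the secondary site pays approximately

    4 crossings x 0.25 ms/crossing = 1 ms

of distance-induced latency on top of the base I/O service time before control returns.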
Why not have just a single storage location?

While it is possible to implement RAC on Extended Distance Clusters with storage
at only one site, should the site with the storage fail, storage is no longer available
to any surviving nodes, and the whole cluster becomes unavailable. This defeats
the purpose of having the RAC nodes at different locations.

[2] Secondary host to primary storage, primary storage to secondary storage,
secondary storage back to primary storage, and primary storage back to the
secondary host. All legs must be synchronous to ensure no data loss.
CAUTION: Array Based Mirroring
generally implies a primary/secondary
storage site solution. Should the primary
storage location fail, all instances will crash
and need to be restarted once the
secondary storage is made active. Array
based mirroring requires a switch be made
from receiving changes at the remote side
to functioning as local disk. From an HA
viewpoint it is recommended to instead do
Host Based mirroring as it does not require
a manual restart.
[Figure: Array-based mirroring — a Primary storage site and a Secondary storage site.]
Cluster Quorums, or Ensuring Survival of One Part of the Cluster:



Cluster quorum mechanisms have a bigger impact on the design of an extended
distance cluster than they would on a local cluster.

When a local cluster is being built, one need not worry much about how quorum
mechanisms work. Cluster software is designed to make the process foolproof,
both for avoiding split brains[3] and for giving the best odds for a portion of the
cluster to survive when a communication failure between the nodes occurs.

Once one takes the nodes of the cluster and separates them, things are no longer
so simple. Clusters have a tie-breaking mechanism that must be located someplace.
Alternatively, all cluster software supports putting the tie-breaker at a third site. This
allows both sites to be equal, and the third site can act as an arbitrator should either
site fail or connectivity between the sites be lost. Because of the HA implications, the
3-site implementation is highly recommended.








Setting up voting disks across sites should only be done directly via the clusterware
software. They should not be mirrored remotely by other means, as this could
potentially result in a dual active database scenario.

Depending on the clusterware provider, the third site may not have the same
connectivity requirements and may be connectable via a WAN. Some quorum
mechanisms may also require a balanced number of nodes across the sites. More
detailed discussion and examples of quorum mechanisms, and the alternatives for
implementing the third site, are given in Appendix A.

[3] A split brain is when two portions of the cluster stop coordinating and start doing
actions on their own. This would usually lead to a database corruption, so clustering
software and Oracle are both carefully written to avoid a split brain situation from
occurring.
CAUTION: Extended RAC
implementations without a third site for a
tie-breaking quorum require making one site
a primary site and the other a secondary.
Then, should the primary site fail, the
secondary site will require a manual restart.
[Figure: Two main sites plus a third site hosting the tie-breaking quorum device.]
Latency & Empirical Performance Results
Oracle Real Application Clusters requires that the cluster interconnect (and thus
Cache Fusion) have a dedicated low-latency network. A dedicated network is
required to ensure consistent response times and to avoid the loss of the cluster
heartbeat, which can cause nodes to be kicked out of the cluster. Interconnect
latency directly affects the time it takes to access blocks in the cache of remote
nodes, and thus directly affects application scalability and performance. Local
interconnect traffic is generally in the 1-2 ms range, and improvements (or
degradations) can have a big effect on the scalability of the application.
I/O latencies tend to be in the 8-15 ms range, and are also affected by the
additional latencies introduced with distance.
Various partners have tested RAC on Extended Distance Clusters. These tests
include ones done by Mai Cutler (HP) and Stefan Pommerenk (Oracle) at 0, 25, 50,
and 100 km; tests done by Paul Bramy (Oracle), Christine O'Sullivan (IBM), and
Thierry Plumeau (IBM) at the EMEA Joint Solutions Center Oracle/IBM at 0, 5,
and 20 km; and tests done by Veritas at 0, 20, 40 and 80 km. All included a full
OLTP application test, and some included unit tests of the individual components.
The unit test results from the HP/Oracle testing will be used to illustrate what
happens at each component level.

[Figure 2: I/O Latency Increase Over Distance — I/O latency increase of 0-50%
plotted at Local, 50 km, and 100 km.]

This figure shows the effects of distance on I/O latency with SAN Buffer Credits.
SAN Buffer Credits allow a greater number of unacknowledged packets on the
wire, thus allowing greater parallelism in the mirroring process. As distances increase,
especially with high traffic volumes, these SAN Buffer Credits can make a huge
difference. For example, when the tests above were run without the additional
SAN Buffer Credits, I/O latency at 100 km was 120-270% greater than local,
instead of the 43% in the chart above. The folks at the IBM/Oracle EMEA Joint
Solutions Center recommend 1 SAN Buffer Credit for each 2 kilometers.[4] These
numbers are consistent with the results from the Oracle/IBM testing, which saw a
20-24% throughput degradation on I/O unit tests at 20 km when SAN Buffer
Credits were not set.
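As a worked example of that rule of thumb (illustrative only): a 100 km inter-site link would call for roughly

    100 km / 2 km per credit = 50 SAN Buffer Credits

on the inter-site ports, so that enough unacknowledged frames can remain in flight to keep the longer pipe full.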
Interconnect Traffic Unit Test Results
Tests at both high and low load levels, and with one or two interconnects, show
that there is an increase of about 1 ms at 100 km. While Cache Fusion traffic is
not as sensitive to distance as I/O latency, the effect of this latency increase can be
just as significant.

[4] Bramy, O'Sullivan, Plumeau & the EMEA Joint Solutions Center, Oracle9i RAC
Metropolitan Area Network implementation in an IBM pSeries environment.
[Figure 3: Interconnect latency (0-6 ms scale) at Local, 25 km, 50 km, and 100 km —
low load with 1 or 2 interconnects, high load with 1 interconnect, and high load
with 2 interconnects.]
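The roughly 1 ms increase at 100 km matches simple propagation physics (an illustrative calculation, not from the original tests): light in optical fiber travels at about two-thirds the speed of light in vacuum, roughly 200 km/ms, so

    round trip over 100 km = (2 x 100 km) / (200 km/ms) = 1 ms

Switches, routers, and protocol overheads on the path only add to this floor.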
[Figure 4: Overall application performance as a percentage of local performance at
Local, 20 km, 40 km, and 80 km — Veritas RAC test and IBM/Oracle RAC test,
tuned examples with buffer credits.]
Overall Application Impact

Unit tests are useful, but the final real impact comes down to how a full application
reacts to the increased latencies induced by distance. Having three independent
sets of tests provides a more complete picture than any individual test. A
summary of each test is provided here; full details can be found in the paper from
each respective vendor listed in the references.

The IBM/Oracle tests ran a representative workload, accomplished by running the
SwingBench workload with proper use of SAN Buffer Credits. These tests at 20 km
showed 1% degradation for read transactions and 2-8% degradation for most write
transactions. The average single transaction saw 2% degradation.

Veritas used another well-known OLTP workload, and set it up in a manner in
which it was highly scalable. These tests, done at 0, 20, 40, and 80 km, showed that
the application suffered minimal performance loss (4% in their worst case at
80 km).

Other tests were done without having SAN Buffer Credits set. Combined with a very
contentious application, this resulted in minimal impact at 25 km (10%), but
significant degradation at 50-100 km. Further testing is needed to determine why
the 50 and 100 km numbers are similar, but the 0, 25, and 100 km numbers form a
very nice linear slope. With appropriate SAN Buffer Credits these numbers would
be expected to improve significantly and be closer to the Veritas and Oracle/IBM
numbers.
CAUTION: Not using SAN Buffer
Credits can cause serious application
performance degradation for greater
distances
[Figure 5: Application performance as a percentage of local performance at Local,
25 km, 50 km, and 100 km, with no SAN Buffer Credits.]
Real-life applications are expected at best to follow the IBM/Oracle and Veritas
examples demonstrated earlier. In reality they will probably have more
interconnect traffic and thus suffer slightly more from the distance. One real-life
example was done at Comic Relief in the United Kingdom by Mike Hallas and
Rob Smyth from Oracle. Those results showed that a cluster with an 8 km distance
between nodes had roughly 10% degradation in service versus running the
application on a local cluster.[5]

Each of these results is for a particular application with a particular setup. Other
applications will be affected differently, but the basic idea is that as distance
increases, I/O and Cache Fusion message latency increases. The limitations
come from a combination of the ideal network speed, minus inefficiencies, and the
additional latency added each time the network goes through a switch, router or
hub.[6] As was previously stated, Dark Fiber can be used to achieve connections
greater than 10 km without repeaters.

While there is no magic barrier to how far RAC on an Extended Distance Cluster
can function, it will have the least impact on performance at campus or metro
distances. Write-intensive applications are generally more affected than
read-intensive applications. If a desire exists to deploy RAC at a greater distance,
performance tests using the specific application are recommended.

From these numbers I am generally comfortable with RAC on Extended
Distance Clusters at distances under 25 km, concerned about performance at 50 km,
and skeptical at 100 km or more. There is no magic barrier for the distance; latency
just keeps getting worse.

[5] Hallas & Smyth, Comic Relief Red Nose Day 2003 (RND03): Installing a Three-Node
RAC Cluster in a Dual-Site Configuration using an 8Km DWDM Link, Issue 1, April 2003.

[6] If a direct connection does not exist, over 1 km switches should be used instead of
hubs, as hubs experience an exponential degradation over distance (80% already at
1 km). (Algieri & Dahan, page 44)
HARDWARE VENDOR SPECIFICS
The hardware vendor should approve the cluster configuration that is
implemented, including the disk mirroring mechanisms used, quorum device
placement, and interconnect redundancy.

Below we provide examples of support by HP, Sun and IBM for a general
extended distance cluster environment. Other configurations may also be
supported; please contact the vendors for specifics.
Sun
Campus cluster configurations started being supported with Sun Cluster 3.0.
In Sun Cluster 3.1, support has been expanded to a wide variety of the newer Sun
storage devices, Fast Ethernet or Gigabit Ethernet, and media converters for extending
the interconnect up to 10 kilometers, and the supported node count has been
increased to 8. Details are covered in the Sun Blueprints document in the
references section.

That paper is not RAC-specific, but there are many customers in production on
this platform, listed in the customers section.
HP
Tru64: Support for extended distance clusters has existed for a good many years,
and this is the environment being used by some of the production customers
referenced below.

HP-UX: HP's offering on HP-UX is called Extended Serviceguard clusters, and is
offered in either 2- or 3-site configurations. The number of nodes supported varies
from 2-16, and is dependent on storage technology and distance. HP has tested,
and can therefore support, a single cluster whose nodes (and disks) are separated by
a distance of up to 100 kilometers using CFS, CVM, and SLVM. Joint Oracle and
HP test results at 25, 50, and 100 kilometer distances are included in this paper and
were presented at OracleWorld San Francisco in 2003 (see references).
IBM
The Oracle/IBM Joint Solution Center has successfully tested RAC with pSeries,
iSeries and xSeries servers. Tests have been done on both AIX and Linux, and have
used either AIX mirroring (on AIX) or ASM (on Linux or AIX) to keep the disks
in sync. They have also built practical experience with real-world production
customers. Their tests included both detailed high availability and performance
components. Their research goes into great detail on DWDM and fiber networking,
and details the configuration that was used in the testing (see references).
FULL ORACLE STACK
Starting in Oracle Database 10g Release 2, one is now able to create an extended
cluster on any OS using standard Oracle components. Oracle Clusterware
can be used for integrity and Automatic Storage Management (ASM) for mirroring.

Oracle Clusterware
Starting with the version of Oracle Clusterware released with Oracle Database 10g
Release 2, Oracle provides direct support for mirroring of the Oracle Cluster
Registry (OCR), as well as support for multiple voting disks.

To set up an extended RAC with Oracle Clusterware:

1. The OCR must be mirrored across both sites using Oracle-provided
   mechanisms.
2. For the voting disks, preferably have 2 voting disks, one at each site, and a
   tie-breaking voting disk at a third site. This third site need only be a
   supported NFS device over a WAN. On most platforms this currently
   carries the same requirements as full RAC support (i.e. using something like a
   NetApp filer), but support for generic NFS is in progress and is currently
   available on Linux.[7]
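As a sketch of what this looks like with the Oracle Database 10g Release 2 tooling (the device paths, hostnames, and mount options below are illustrative assumptions, not from the original paper; see the referenced NFS white paper for the supported options):

    # mirror the OCR using the Oracle-provided mechanism
    ocrconfig -replace ocrmirror /dev/raw/ocr_mirror

    # on every node, mount the NFS export from the third site
    mount -t nfs -o rw,hard,tcp,vers=3,timeo=600 third-site:/vote /voting3

    # with Oracle Clusterware stopped, add the tie-breaking voting disk
    crsctl add css votedisk /voting3/vote_3 -force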


ASM
ASM's built-in mirroring can be used to efficiently mirror the rest of the database
files across both sites. Storage at each site must be set up as separate failure
groups and use ASM mirroring, to ensure at least one copy of the data at each site.
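For example, a disk group for such a setup might be created as follows (a minimal sketch; the disk group name and disk paths are hypothetical). With NORMAL redundancy, ASM mirrors each extent across the two failure groups, guaranteeing one copy of the data at each site:

    CREATE DISKGROUP data NORMAL REDUNDANCY
      FAILGROUP site_a DISK '/dev/rdsk/siteA_disk1', '/dev/rdsk/siteA_disk2'
      FAILGROUP site_b DISK '/dev/rdsk/siteB_disk1', '/dev/rdsk/siteB_disk2';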

[7] Roland Knapp, Daniel Dibbets, Amit Das, Using standard NFS to support a third
voting disk on a stretch cluster configuration on Linux, September 2006.
[Figure: Full Oracle stack extended cluster — two primary sites, each holding the
database files (ASM), the OCR, and a voting disk; ASM is used to mirror the
database files between the sites; a third site, reachable over a WAN, hosts the
tie-breaking voting disk (mounted via NFS or iSCSI).]
Two minor limitations do exist with ASM mirroring which are not necessarily
present when using other cluster software:

1. ASM does not currently provide partial resilvering. Should a loss of
   connectivity between the sites occur, one of the failure groups will be
   marked invalid. When the site rejoins the cluster, the failure group's disks will
   need to be manually added back. This will not impact normal operations.
2. ASM does not currently provide a mechanism for local reads. I/O read
   requests to an ASM group will be satisfied from any available mirror.
   Some other cluster software (Veritas, Sun Cluster, etc.) does provide reads
   from the local mirror. Except for extended clusters that are very far apart,
   this should not have much of an impact.

Solutions for both of these are planned for a future release of ASM.
COMPARISON WITH A LOCAL RAC AND DATA GUARD REMOTE SITE
Here is a comparison of RAC on an Extended Distance Cluster versus a local
RAC cluster for HA combined with Data Guard for DR.
Comparison Summary

                        RAC on Extended Distance Clusters   RAC + Data Guard
Needed Nodes            2                                   3
Active Nodes            All                                 One side only (the DG site can
                                                            be used for reporting purposes)
Recovery from           Seconds, no intervention            Seconds, no intervention
Site Failure            required                            required[8]
Performance Hit         Minor to crippling                  Insignificant to minor in the
(see charts)                                                same cases
Network                 High-cost direct dedicated          DG Sync: high-cost direct
Requirements            network with lowest latency;        dedicated network with lowest
                        much greater network bandwidth      latency.
                                                            DG Async: shared commercially
                                                            available network; no low
                                                            latency requirements.
Effective Distance      Campus & metro                      Country- and continental-wide
                                                            distances
Disaster Protection     Host, building, and localized       Host, building, and localized
                        site failures                       site failures; database
                                                            corruptions; local and
                                                            wider-area site disasters
Costs                   Very high network costs             Additional nodes
Strengths of RAC on Extended Distance Clusters
All Nodes Active

One of the main attractions of an Extended Distance Cluster environment is that
all nodes can be active, and dedicated nodes are not required for disaster recovery.
Thus, instead of the minimum of 2 RAC clusters required in a full RAC + Data Guard
architecture, 1 RAC cluster can be used. One note: in a RAC + Data Guard
architecture, the DR site can be used for other purposes, including reporting and
decision support activities.

In environments with a larger number of nodes, some advantage is still gained from
having all nodes able to be an active part of the same cluster.

[8] Assuming you are using Fast-Start Failover, available in Oracle Database 10g
Release 2 onwards.
Fast Recovery

Prior to Oracle 10g Release 2, the biggest advantage of RAC on Extended Distance
Clusters was that when a site fails, it is possible to recover quickly with no manual
intervention needed. With Data Guard, when the primary site failed, failover was
generally manually initiated. In Oracle Database 10g Release 2, Fast-Start
Failover was introduced as an Oracle Data Guard feature that automatically, quickly,
and reliably fails over to a designated, synchronized standby database in the event
of loss of the primary database, without requiring manual intervention to execute
the failover. This also requires a third arbitrating site.

Now, in the event of server failures, both RAC and Data Guard with Fast-Start
Failover can accomplish the failover in a few seconds, requiring no manual
intervention.
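With the Data Guard broker, enabling this looks roughly as follows (a sketch; the threshold value is illustrative, and the observer process runs at the third, arbitrating site):

    DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold = 30;
    DGMGRL> ENABLE FAST_START FAILOVER;
    DGMGRL> START OBSERVER;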
Costs

The biggest attraction of Extended Distance Clusters is their potential to reduce
costs. By being able to have all nodes active, it is possible to get scalability, very
high availability and DR with just 2 nodes.

While one could get away with just one mirror copy of the data at each site, this
would be a risky proposition when one site becomes unavailable. Two mirrors
should be kept at each site, totaling 4 copies of the data (the same as with RAC +
Data Guard).

Cost increments can be incurred by the higher bandwidth and specialized
communication needs of an extended distance cluster environment. Dark Fiber,
for example, can easily cost hundreds of thousands of dollars. Additional costs can
come from reduced performance, and from the potential need to implement a third
site[9] for the quorum disk.
Strengths of local RAC + Data Guard at a remote site
No Performance Hit

A Data Guard environment can be set up to be asynchronous, which allows data to
be transferred across a great distance and still have no more than a minimal impact
on the performance of the primary environment. Of course, in an asynchronous
configuration you no longer have the guarantee of zero data loss.

With RAC, the sites are much more tightly coupled, so any latencies involved
have a greater effect because of the separation of the two sites. Details of this
were discussed in the latency section. Furthermore, the latency affects the
data transfer between caches. Data Guard only sends redo data, and thus is less
sensitive to network latency.
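The transport mode is chosen per destination; a hedged Oracle 10g-style sketch (the standby service name is hypothetical):

    -- synchronous: zero data loss, pays the round-trip latency on each commit
    ALTER SYSTEM SET log_archive_dest_2 =
      'SERVICE=standby_db LGWR SYNC AFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)';

    -- asynchronous alternative for greater distances, minimal primary impact:
    -- 'SERVICE=standby_db LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)'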

[9] This can be negligible for large corporations with multiple locations. For example,
with the HP quorum server this can be any site with IP access, running a very small
server.
To show the difference in performance and bandwidth impact of Data Guard versus
full mirroring, it is useful to look at an internal analysis of Oracle's corporate e-mail
systems. Here it was demonstrated that 7 times more data was transmitted over
the network, and 27 times more I/O operations were performed, using a remote
mirroring solution compared to using Data Guard.[10]

Keeping in mind the performance impact caused by distance with RAC on an
Extended Distance Cluster for a well-known OLTP workload (Figure 4), it is
interesting to compare this to some other performance impact tests for
synchronous Data Guard[11] with another well-known OLTP workload. These tests
show that the impact of distance on Data Guard is much less, even allowing the
distance to be taken to thousands of km, something that would be impossible with
RAC.










Now why is there such a difference? The impacts of latency are actually quite
different and occur in different layers: with Data Guard, the impact is on the
synchronous I/O from LGWR and the network I/O for redo transmission, whereas
with RAC on an Extended Cluster, the impact is on the synchronous I/O from
DBWR and LGWR and on the network.
Greater Disaster Protection

An Extended Distance Cluster scenario does not provide full disaster recovery
(DR), as the distance between the sites is limited. In reality it is more of an extended
HA solution, as it does get you some degree of separation between the sites.

In DR design it is important to avoid common utility failures (water, electricity),
being on the same flood plain, or being part of a larger location that can all be
damaged by the same jumbo jet. In an earthquake zone the general
recommendation is 300 km at right angles to the main fault line.

[10] From Oracle Data Guard and Remote Mirroring Solutions, Oracle Technology
Network (OTN).

[11] From Oracle9i Data Guard Log Transport Services and Performance Characterization,
by Rabah Mediouni and Rick Anderson.
[Figure: Performance degradation for Data Guard Sync — performance as a
percentage of local performance at Local, 25, 50, 100, 550, 1400, 2760, 3650, and
5000 km.]
Hurricanes and wars can take out very large areas. Terrorism brings more
unpredictable effects. So, for example, if the two sites are in a non-flooding,
non-earthquake zone, not under a flight path, and each has independent automatic
standby generators and self-contained cooling, then 1 km may be ample, except
perhaps in times of war, terrorism, hurricanes, etc.

Data Guard is able to function more efficiently and at a much greater distance, and
is in general a more complete DR solution. Many of its advantages come from the
fact that Data Guard does not depend upon remote mirroring to synchronize the
replica at the remote site. Specifically:

o Data Guard Redo Transport Services provide the database performance
  and network utilization advantages described above.
o Unlike remote mirroring, Data Guard Apply Services validate redo data
  before it is applied to data files at the remote site. This validation isolates
  the remote database from hardware-induced data file corruptions that can
  occur at the primary location or during the transmission of data to the
  remote site.
o Data Guard can provide a delayed copy to protect against user errors
  (important in Oracle9i, but not in Oracle Database 10g when Flashback
  Database is used).
o Oracle Database 10g rolling upgrades with Data Guard also provide the
  ability to reduce downtime during planned outages.
Costs
A RAC approach with Data Guard at a remote site requires less network
bandwidth, and these networks do not need to be as redundant, or have the
extreme low latencies, that they would need for a RAC environment on Extended
Distance Clusters.
Other Limitations of RAC on Extended Distance Clusters
o Not all vendors offer these configurations, and the amount of testing
  they have done varies.
o Quorum implementations on some platforms (HP-UX) require that there
  be an equal number of nodes at each site.
CONCLUSION
RAC on Extended Distance Clusters is an attractive alternative architecture that
allows scalability, rapid availability, and even some very limited disaster recovery
protection, with all nodes fully active.

This architecture can provide great value when used properly, but it is critical
that the limitations are well understood. Distance can have a huge effect on
performance, so keeping the distance short and using costly dedicated direct
networks are critical.

While this is a greater HA solution compared to local RAC, it is not a full Disaster
Recovery solution. The distance cannot be great enough to protect against major
disasters, nor does one get the extra protection against corruptions and the flexibility
for planned outages that a RAC and Data Guard combination provides.

While this configuration has been deployed by a small number of customers,
thorough planning and testing are recommended before attempting to implement it.


APPENDIX A: DETAILED QUORUM EXAMPLES
Clusters are designed so that in the case of a failure of communication between
any 2 subsets of nodes of the cluster, at most one sub-cluster will survive, thus
avoiding corruption of the database.

The "at most" in the last phrase is key. If the clusterware cannot guarantee after
a failure that only one sub-cluster will survive, then all sub-clusters go down. You
cannot assume that sub-clusters will be able to talk to each other (a
communication failure could be the cause of needing to reform the cluster).

How the clusterware handles quorum affects how one should lay out an extended
cluster. Some cases require a balanced number of nodes at each of the 2 main
sites, while all cases require a third site to locate the tie-breaking device for higher
availability.

The following examples will help you to understand the details of these
restrictions, as well as get a better understanding of how quorum works.
HP Serviceguard / Sun Cluster example

Quorum is achieved here by giving each node a vote, and a quorum device
(normally a disk or server) acts as a tiebreaker to make sure only one side gets the
majority.
Veritas Storage Foundation for RAC (formerly DBE/AC) example:

With Veritas SFRAC, nodes don't get votes; instead all nodes race for access to
3 coordinator disks. Because of the algorithm used, larger subsets of nodes will
get to the coordinator disks quicker and thus are more likely to survive.
In a 2-site environment, one would not want both sides to survive, as this would
quickly cause corruptions. Therefore one side must be able to form a quorum,
and the tie-breaking vote must exist on one side or the other. This ends up
creating a primary and a secondary site. Should the primary site fail, the secondary
site will not have a quorum and will shut down. In this case a manual
reconfiguration is required, and this should be practiced and well rehearsed.

In a 3-site implementation, quorum can be distributed so that any 2 surviving sites
can have a majority of votes or coordinator disks, ensuring that the cluster survives.
Oracle Clusterware example:
The following example applies when only Oracle Clusterware is used (i.e. on
Linux and Windows in Oracle9i, and on all platforms in Oracle 10g when a third-party
clusterware is not used in conjunction with Oracle Clusterware).

By design, shared disk cluster nodes have 2 ways to communicate with each other:
through the interconnect network and through the shared disk subsystem. Many vendors'
clusterware monitors cluster availability based only upon the network heartbeat,
but depends upon SCSI timeouts for detecting disk failures on one or all nodes;
these timeouts can take up to 15 minutes.

Oracle Clusterware uses the concept of a voting disk and a heartbeat to monitor
the cluster through both the disk subsystem and the interconnect. This helps
Oracle Clusterware to resolve asymmetric failures very effectively without resorting
to SCSI timeout mechanisms.

This method of actively using the voting disk helps protect against heterogeneous
failures (where one node sees the cluster as being fine but others do not), but it
also means that the voting disk must be accessible at all times from all nodes or
the cluster will fail, and the location of the voting disk will make that site primary.

The voting disk file should be mirrored locally for high availability purposes.
Multiple voting disks set up via Oracle Clusterware are not mirrors, but members of
a group for which you need to achieve a quorum to continue; thus a local mirror
is good. They should not be mirrored remotely as part of an extended cluster, as
this could allow two sub-clusters to continue working after a failure and potentially
lead to a split-brain or diverging database situation.
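A worked example of the majority rule (illustrative): with one voting disk at Site A, one at Site B, and the tie-breaker at a third site, a surviving sub-cluster must reach a majority of the disks:

    majority needed = floor(3/2) + 1 = 2 of 3 voting disks

If Site A is lost, the Site B nodes still reach their local disk plus the third-site disk (2 of 3) and continue. If only the link between Site A and Site B is cut, whichever sub-cluster reaches the third-site disk along with its local disk survives, and the other evicts itself.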



APPENDIX B: CUSTOMERS USING RAC ON EXTENDED DISTANCE CLUSTERS
The Rover Group completed the first known implementation with a similar architecture in the mid-1990s, using Oracle
Parallel Server. Since then, other clients have implemented it with Real Application Clusters, including the following
examples.

The list below includes many of the known production customers running RAC on an extended cluster. Because Oracle9i
has been around for a longer period, it has a larger share of the production customers. Today the majority of new customers
implementing are doing so using Oracle 10g, Oracle Clusterware and ASM to mirror the data between the sites.
Generic names (showing only country or region and industry) indicate customers whose names have been withheld.
Name                                    Release  Nodes  Platform  OS        Clusterware       Distance (km)
Italian Financial Services firm         10g      2      IBM       AIX       HACMP             0.2
Groupe Diffusion Plus                   10g      2      IBM       AIX       Oracle            0.5
Austrian IT Services Provider           10g      2      IBM       AIX       HACMP             1
European Electronics firm               9i       2      IBM       AIX       HACMP             8
US Police Department                    9i       2      IBM       AIX       HACMP             3
European Government                     9i       2      IBM       AIX       HACMP             8
US Broadcaster                          9i       2      IBM       AIX       HACMP             0.2
Austrian Hospital                       9i       2      IBM       AIX       HACMP             0.6
Brazilian Credit Union Network          9i       3      IBM       AIX       HACMP             10
UzPromStroyBank                         9i       2      IBM       AIX       HACMP             1.
Daiso Sangyo                            10g      2      HP        HP-UX     Oracle            10
US Fortune 100 firm                     9i       2      HP        HP-UX     HP Serviceguard   2
Brazilian Hospital                      9i       2      HP        HP-UX     HP Serviceguard   0.5
Italian Manufacturer                    10g      4      HP        Linux     Oracle            0.8
Swedish Automotive Parts firm           10g      2      IBM       Linux     Oracle            2
Austrian Health Provider                10g      2      IBM       Linux     Oracle            0.3
Thomson Legal                           10g      8      Sun       Linux     Oracle            1
North American Lottery                  9i       4      HP        OpenVMS                     10
German Telecom                          10g      4      Sun       Solaris   Sun Cluster       5
European Bank                           10g      2      Sun       Solaris   Oracle            5
European Mobile Operator                9i       3      Sun       Solaris   Veritas Cluster   48
Comic Relief                            9i       3      Sun       Solaris   Sun Cluster       8
German Bank                             9i       2      Sun       Solaris                     12
European Mail firm                      9i       2      Sun       Solaris   Veritas Cluster   12
European Government                     9i       2      Sun       Solaris   Sun Cluster       0.4
UK University                           9i       2      Sun       Solaris   Sun Cluster       0.8
Italian Telco                           9i       2      Sun       Solaris   Sun Cluster       2
Austrian Railways                       9i       2      HP        Tru64     TruCluster        1.5
Nordac, Draeger                         9i       4      HP        Tru64     TruCluster        0.3
University of Melbourne                 9i       3      HP        Tru64     TruCluster        0.8
European Electronics Components firm    10g      2      IBM       Windows   Oracle            0.5
Danish Health firm                      9i       6      Dell      Windows   Oracle            25
REFERENCES
Roland Knapp, Daniel Dibbets, Amit Das, Using standard NFS to support a
third voting disk on a stretch cluster configuration on Linux, September 2006

EMEA Joint Solutions Center Oracle/IBM, 10g RAC Release 2 High
Availability Test Over 2 Distant Sites on xSeries, July 2005

Paul Bramy (Oracle), Christine O'Sullivan (IBM), Thierry Plumeau (IBM)
at the EMEA Joint Solutions Center Oracle/IBM, Oracle9i RAC Metropolitan
Area Network Implementation in an IBM pSeries Environment, July 2003

Veritas, VERITAS Volume Manager for Solaris: Performance Brief - Remote Mirroring
Using VVM, December 2003

HP Oracle CTC, Extended Serviceguard Cluster Configurations: Detailed
Configuration Information for Extended RAC on HP-UX Clusters, November 2003

Mai Cutler (HP), Sandy Gruver (HP), Stefan Pommerenk (Oracle), Eliminate
the Current Physical Restrictions of a Single Oracle Cluster, OracleWorld San Francisco,
2003

Joseph Algieri & Xavier Dahan (HP), Extended MC/ServiceGuard Cluster
Configurations (Metro Clusters), Version 1.4, January 2002

Michael Hallas and Robert Smyth, Comic Relief Red Nose Day 2003 (RND03):
Installing a Three-Node RAC Cluster in a Dual-Site Configuration using an 8Km DWDM
Link, Issue 1, April 2003

Lawrence To, Oracle Database 10g Release 2: Roadmap to Maximum Availability
Architecture (MAA), April 2006

Michael T. Smith, Oracle Database 10g Release 2 Best Practices: Data Guard Redo
Transport & Network Configuration, August 2006

Oracle Technology Network, Oracle Data Guard and Remote Mirroring Solutions

Joseph Meeks, Michael T. Smith, Ashish Ray, Sadhana Kyathappala, Fast-Start
Failover Best Practices: Oracle Data Guard 10g Release 2, November 2005

Tim Read, Architecting Availability & Disaster Recovery Solutions, Sun
BluePrints OnLine, April 2006

Oracle Real Application Clusters on Extended Distance Clusters
October 2006
Author: Erik Peterson
Reviewers: Daniel Dibbets, Bill Bridge, Joseph Meeks

Oracle Corporation
World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065
U.S.A.

Worldwide Inquiries:
Phone: +1.650.506.7000
Fax: +1.650.506.7200
oracle.com

Copyright 2006, Oracle. All rights reserved.
This document is provided for information purposes only and the
contents hereof are subject to change without notice.
This document is not warranted to be error-free, nor subject to any
other warranties or conditions, whether expressed orally or implied
in law, including implied warranties and conditions of merchantability
or fitness for a particular purpose. We specifically disclaim any
liability with respect to this document and no contractual obligations
are formed either directly or indirectly by this document. This document
may not be reproduced or transmitted in any form or by any means,
electronic or mechanical, for any purpose, without our prior written permission.
Oracle, JD Edwards, PeopleSoft, and Siebel are registered trademarks of Oracle
Corporation and/or its affiliates. Other names may be trademarks
of their respective owners.
