ASM
ASM's built-in mirroring can be used to efficiently mirror the rest of the database
files across both sites. Storage at each site must be set up as separate failure
groups and use ASM mirroring, to ensure at least one copy of the data at each site.
Roland Knapp, Daniel Dibbets, Amit Das, Using standard NFS to support a third
voting disk on a stretch cluster configuration on Linux, September 2006
[Figure: Two primary sites connected over the WAN, each holding the DB files (ASM), the OCR, and a voting disk, with ASM used to mirror the DB files between them; a third site hosts an additional voting disk mounted via NFS or iSCSI.]
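As a rough illustration of the failure-group idea above, the following Python sketch (a toy model with hypothetical names, not ASM's actual extent-placement algorithm) shows how normal-redundancy mirroring with one failure group per site leaves a copy of every extent at each site:

```python
# Illustrative sketch (not ASM itself): normal-redundancy mirroring places the
# two copies of each extent in different failure groups. With one failure
# group per site, every extent ends up with a copy at each site.
def place_extents(extents, failure_groups):
    """Return {extent: set of failure groups holding a copy of it}."""
    placement = {}
    for i, ext in enumerate(extents):
        # The primary copy rotates across groups; the mirror goes elsewhere.
        primary = failure_groups[i % len(failure_groups)]
        mirror = failure_groups[(i + 1) % len(failure_groups)]
        placement[ext] = {primary, mirror}
    return placement

sites = ["SITE_A", "SITE_B"]                # one failure group per site
extents = [f"extent_{n}" for n in range(8)]
placement = place_extents(extents, sites)

# Every extent survives the loss of either site: each has a copy at both.
assert all(copies == {"SITE_A", "SITE_B"} for copies in placement.values())
```

With only two failure groups, the rotation is trivial; the point is that ASM enforces the "different failure group" constraint for the mirror copy, which is what makes one-failure-group-per-site a whole-site protection scheme.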
Real Application Clusters on Extended Distance Clusters - Page 17
Two minor limitations do exist with ASM mirroring which are not necessarily
present when using other cluster software:
1. ASM does not currently provide partial resilvering. Should a loss of
connectivity between the sites occur, one of the failure groups will be
marked invalid. When the site rejoins the cluster, the failure groups will
need to be manually added. This will not impact normal operations.
2. ASM does not currently provide a mechanism for local reads. I/O read
requests to an ASM group will be satisfied from any available mirror.
Some other cluster software (Veritas, Sun Cluster, etc.) does provide reads
from the local mirror. Except for extended clusters that are very far apart,
this should not have much of an impact.
Solutions for both of these are planned for a future release of ASM.
COMPARISON WITH A LOCAL RAC AND DATA GUARD REMOTE SITE
Here is a comparison of RAC over an Extended Distance Cluster versus a local
RAC cluster for HA combined with Data Guard for DR.

Comparison Summary

                        RAC on Extended             RAC + DG
                        Distance Clusters
Needed Nodes            2                           3
Active Nodes            All                         One side only (the DG site
                                                    can be used for reporting
                                                    purposes)
Recovery from           Seconds, no intervention    Seconds, no intervention
Site Failure            required                    required (8)
Performance Hit         Minor to crippling          Insignificant to minor in
(see charts)                                        some cases
Network                 High-cost direct dedicated  DG Sync: high-cost direct
Requirements            network with lowest         dedicated network with
                        latency; much greater       lowest latency.
                        network bandwidth           DG Async: shared
                                                    commercially available
                                                    network; no low-latency
                                                    requirements.
Effective Distance      Campus & metro              Country and continent-wide
                                                    distances
Disaster Protection     Host, building, and         Host, building, localized
                        localized site failures     site failures, database
                                                    corruptions, local and
                                                    wider-area site disasters
Costs                   Very high network costs     Additional nodes
Strengths of RAC on Extended Distance Clusters
All Nodes Active
One of the main attractions of an Extended Distance Cluster environment is that
all nodes can be active, and dedicated nodes are not required for disaster recovery.
Thus, instead of the minimum of 2 RAC clusters required in a full RAC + DG
architecture, 1 RAC cluster can be used. One caveat: in a RAC + DG
architecture, the DR site can be used for other purposes, including reporting and
decision-support activities.
In environments with a larger number of nodes, some advantage is still gained from
having all nodes able to be an active part of the same cluster.
8 Assuming you are using Fast-Start Failover in Oracle Database 10g Release 2 onwards.
Fast Recovery
Prior to Oracle 10g Release 2, the biggest advantage of RAC on Extended Distance
Clusters was that when a site fails, it is possible to recover quickly with no manual
intervention needed. With Data Guard, when the primary site fails, failover is
generally manually instantiated. In Oracle Database 10g Release 2, Fast-Start
Failover was introduced as an Oracle Data Guard feature that automatically, quickly,
and reliably fails over to a designated, synchronized standby database in the event
of loss of the primary database, without requiring manual intervention to execute
the failover. This also requires a third arbitrating site.
Now, in the event of server failures, both RAC and Data Guard with Fast-Start
Failover can accomplish the failover in a few seconds, requiring no manual
intervention.
Costs
The biggest attraction of Extended Distance Clusters is in their potential to reduce
costs. By being able to have all nodes active, it is possible to get scalability, very
high availability, and DR with just 2 nodes.
While one could get away with just one mirror copy of the data at each site, this
would be a risky proposition when one site becomes unavailable. Two mirrors
should be kept at each site, totaling 4 copies of the data (same as with RAC + Data
Guard).
Cost increments can be incurred by the higher bandwidth and specialized
communication needs of an extended distance cluster environment. Dark fiber,
for example, can easily cost hundreds of thousands of dollars. Additional costs can
come from reduced performance, and the potential need to implement a third site (9)
for the quorum disk.
Strength of local RAC + Data Guard at a remote site
No Performance Hit
A Data Guard environment can be set up to be asynchronous, which allows data to
be transferred across a great distance and still have anywhere from no impact to a
minimal impact on the performance of the primary environment. Of course, in an
asynchronous configuration you no longer have the guarantee of zero data loss.
With RAC, the sites are much more tightly coupled, thus any latencies involved
have a greater effect because of the separation of the two sites. Details of this
were discussed in the latency section (Page 6). Furthermore, the latency affects the
data transfer between caches. Data Guard only sends redo data, and thus is less
sensitive to network latency.
9 This can be negligible for large corporations with multiple locations. For example,
with the HP quorum server this can be any site with IP access, running a very small
server.
To show the difference in performance and bandwidth impact of Data Guard versus
full mirroring, it is useful to look at internal analysis of Oracle's corporate e-mail
systems. Here it was demonstrated that substantially more data was transmitted over
the network and more I/O operations were performed using a remote
mirroring solution, compared to using Data Guard. (10)
Keeping in mind the performance impact caused by distance with RAC on an
Extended Distance Cluster for a well-known OLTP workload (Figure 4), it is
interesting to compare this to some other performance impact tests for
synchronous Data Guard (11) with another well-known OLTP workload. These tests
show that the impact of distance on Data Guard is much less, even allowing
distance to be taken to thousands of km, something that would be impossible with
RAC.
Now why is there such a difference? The impacts of latency are actually quite
different and occur in different layers: with Data Guard, the impact is on the
synchronous I/O from LGWR and network I/O for redo transmission, whereas with
RAC on an Extended Cluster, the impact is on the synchronous I/O from DBWR, LGWR
and the network.
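To make the difference concrete, the following back-of-the-envelope Python sketch assumes the common rule of thumb of roughly 5 microseconds of one-way fiber propagation per km (a figure not taken from this paper) and an illustrative count of synchronous round trips per commit for each approach; the numbers are assumptions, not measurements:

```python
# Back-of-the-envelope sketch (assumed figures, not measurements): one-way
# fiber propagation is roughly 5 microseconds per km, so a network round trip
# costs about 10 microseconds per km of separation.
US_PER_KM_ROUND_TRIP = 10.0

def added_commit_latency_us(distance_km, sync_round_trips):
    """Extra latency per commit from distance alone, ignoring device time."""
    return distance_km * US_PER_KM_ROUND_TRIP * sync_round_trips

# Data Guard sync: roughly one network round trip (redo transport) per commit.
# RAC on an extended cluster: mirrored LGWR and DBWR writes plus cache-fusion
# traffic mean several synchronous round trips (3 here is purely illustrative).
for km in (25, 100, 550):
    dg = added_commit_latency_us(km, sync_round_trips=1)
    rac = added_commit_latency_us(km, sync_round_trips=3)
    print(f"{km:4d} km: Data Guard sync ~{dg:.0f} us, extended RAC ~{rac:.0f} us")
```

The multiplier on the round-trip count is the point: Data Guard pays the distance penalty once per commit in one layer, while an extended RAC cluster pays it in several layers at once, so the same kilometers hurt it more.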
Greater Disaster Protection
An Extended Distance Cluster scenario does not provide full disaster recovery
(DR), as the distance between the sites is limited. In reality it is more of an extended
HA solution, as it does get you some degree of separation between the sites.
In DR design it is important to avoid common utility failures (water, electricity),
being on the same flood plain, or being part of a larger location that can all be
damaged by the same jumbo jet. In an earthquake zone the general
recommendation is 300 km at right angles to the main fault line. Hurricanes and
wars can take out very large areas. Terrorism brings more unpredictable effects.
So, for example, if the two sites are in a non-flooding, non-earthquake zone, not
under a flight path, and each has independent automatic standby generators and
self-contained cooling, then 1 km may be ample, except perhaps in times of war,
terrorism, hurricanes, etc.

10 From Oracle Data Guard and Remote Mirroring Solutions, Oracle Technology
Network (OTN)
11 From Oracle9i Data Guard Log Transport Services and Performance Characterization by
Rabah Mediouni and Rick Anderson

[Figure: Performance Degradation for Data Guard Sync — % of local performance (0-100%) plotted against distance: Local, 25km, 50km, 100km, 550km, 1400km, 2760km, 3650km, 5000km]
Data Guard is able to function more efficiently and at a much greater distance, and
is in general a more complete DR solution. Many of its advantages come from the
fact that Data Guard does not depend upon remote mirroring to synchronize the
replica at the remote site. Specifically:
o Data Guard Redo Transport Services provide the database performance
and network utilization advantages described above.
o Unlike remote mirroring, Data Guard Apply Services validate redo data
before it is applied to data files at the remote site. This validation isolates
the remote database from hardware-induced data file corruptions that can
occur at the primary location or during the transmission of data to the
remote site.
o Data Guard can provide a delayed copy to protect against user errors
(important in Oracle9i but not in Oracle Database 10g when Flashback
Database is used).
o Oracle Database 10g Rolling Upgrades with Data Guard also provide the
ability to reduce downtime during planned outages.
Costs
A RAC approach with only Data Guard at the remote site requires less network
bandwidth, and these networks do not need to be as redundant or have the
extreme low latencies they would need for a RAC environment on Extended
Distance Clusters.
Other Limitations of RAC on Extended Distance Clusters
o Not all vendors offer these configurations, and the amount of testing
they have done varies.
o Quorum implementations on some platforms (HP-UX) require that there
be an equal number of nodes at each site.
CONCLUSION
RAC on Extended Distance Clusters is an attractive alternative architecture that
allows scalability, rapid availability, and even some very limited disaster recovery
protection with all nodes fully active.
This architecture can provide great value when used properly, but it is critical that
the limitations are well understood. Distance can have a huge effect on
performance, so keeping the distance short and using costly dedicated direct
networks are critical.
While this is a greater HA solution compared to local RAC, it is not a full Disaster
Recovery solution. The distance cannot be great enough to protect against major
disasters, nor does one get the extra protection against corruptions and flexibility
for planned outages that a RAC and Data Guard combination provides.
While this configuration has been deployed by a small number of customers,
thorough planning and testing is recommended before attempting to implement it.
APPENDIX A: DETAILED QUORUM EXAMPLES
Clusters are designed so that in the case of a failure of communication between
any 2 subsets of nodes of the cluster, at most one sub-cluster will survive and thus
avoid corrupting the database.
The "at most" in the last phrase is key. If the clusterware cannot guarantee after
a failure that only one sub-cluster will survive, then all sub-clusters go down. You
cannot assume that sub-clusters will be able to talk to each other (a
communication failure could be the cause of needing to reform the cluster).
How the clusterware handles quorum affects how one should lay out an extended
cluster. Some cases require a balanced number of nodes at each of the 2 main
sites, while all cases require a third site to locate the tie-breaking device for higher
availability.
The following examples will help you to understand the details of these
restrictions, as well as get a better understanding of how quorum works.
HP Serviceguard / Sun Cluster example
Quorum is achieved here by giving each node a vote, and a quorum device
(normally a disk or server) acts as a tiebreaker to make sure only one side gets the
majority.
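A minimal Python sketch of this vote-counting scheme (a simplified model; the node and site names are hypothetical) shows why at most one side can ever reach a majority:

```python
# Sketch of vote-based quorum (Serviceguard / Sun Cluster style, simplified):
# each node contributes one vote, and a quorum device at a third location adds
# one tie-breaking vote to whichever side can still reach it.
def surviving_subcluster(subclusters, total_votes, quorum_device_holder):
    """Return the one sub-cluster allowed to continue, or None."""
    survivors = []
    for name, nodes in subclusters.items():
        votes = len(nodes) + (1 if name == quorum_device_holder else 0)
        if 2 * votes > total_votes:        # a strict majority is required
            survivors.append(name)
    assert len(survivors) <= 1             # at most one side can hold a majority
    return survivors[0] if survivors else None

# A 2+2 stretch cluster splits down the middle; total votes = 4 nodes + device.
split = {"site_a": ["n1", "n2"], "site_b": ["n3", "n4"]}
winner = surviving_subcluster(split, total_votes=5, quorum_device_holder="site_a")
print(winner)  # site_a: 2 node votes + the device = 3 of 5
```

Because a strict majority of a fixed vote total can belong to only one side, the design goal from the appendix intro (at most one survivor) falls out automatically; without the tie-breaking device, neither half of an even split would survive.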
Veritas Storage Foundation for RAC (formerly DBE/AC) example:
With Veritas SFRAC, nodes don't get votes; instead all nodes race for access to
3 coordinator disks. Because of the algorithm used, larger subsets of nodes will
get to the coordinator disks quicker and thus are more likely to survive.
In a 2-site environment, one would not want both sides to survive, as this would
quickly cause corruptions. Therefore one side must be able to form a quorum,
and the tie-breaking vote must exist on one side or the other. This ends up
creating a primary and a secondary site. Should the primary site fail, the secondary
site will not have a quorum and will shut down. In this case a manual
reconfiguration is required, and this should be practiced and well rehearsed.
In a 3-site implementation, quorum can be redistributed so that any 2 sites left can
have a majority of votes or coordinator disks to ensure that the cluster survives.
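The racing behavior can be caricatured in Python (a toy model, not Veritas's actual fencing algorithm) by weighting each coordinator-disk race by sub-cluster size:

```python
import random

# Toy model of coordinator-disk fencing (SFRAC style, heavily simplified):
# each sub-cluster races to grab the 3 coordinator disks; whoever wins a
# majority (2 of 3) survives. Larger sub-clusters are assumed to race faster,
# modeled here as a size-weighted random draw per disk.
def race_for_coordinator_disks(subclusters, n_disks=3, rng=random):
    """Return the name of the sub-cluster that wins a majority of disks."""
    names = list(subclusters)
    weights = [len(subclusters[n]) for n in names]   # bigger side races faster
    wins = {n: 0 for n in names}
    for _ in range(n_disks):
        wins[rng.choices(names, weights=weights)[0]] += 1
    # The side holding a majority of coordinator disks continues; the other
    # side sees itself fenced off and shuts down.
    return max(wins, key=wins.get)

rng = random.Random(42)  # deterministic draw for the example
split = {"primary": ["n1", "n2", "n3"], "secondary": ["n4"]}
print(race_for_coordinator_disks(split, rng=rng))
```

An odd number of coordinator disks guarantees one side always ends up with a majority, which is why 3 disks (not 2 or 4) are used; the weighting captures, crudely, why the larger sub-cluster usually survives.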
Oracle Clusterware example:
The following example applies when only Oracle Clusterware is used (i.e. on
Linux and Windows in Oracle9i, and on all platforms in Oracle 10g when a third-
party clusterware is not used in conjunction with Oracle Clusterware).
By design, shared-disk cluster nodes have 2 ways to communicate with each other:
through the interconnect network and through the shared disk subsystem. Many vendors'
clusterware monitors cluster availability based only upon the network heartbeat,
but depends upon SCSI timeouts for detecting disk failures on one or all nodes;
these timeouts can take up to 15 minutes.
Oracle Clusterware uses the concept of a voting disk and a heartbeat to monitor
the cluster through both the disk subsystem and the interconnect. This helps
Oracle Clusterware to resolve asymmetric failures very effectively without resorting
to SCSI timeout mechanisms.
This method of actively using the voting disk helps protect against heterogeneous
failures (where one node sees the cluster as being fine but others do not), but it
also means that the voting disk must be accessible at all times, from all nodes, or
the cluster will fail, and the location of the voting disk will make that site primary.
The voting disk file should be mirrored locally for high availability purposes.
Multiple voting disks set up via Oracle Clusterware are not mirrors, but members of
a group for which you need to achieve a quorum to continue. Thus a local mirror
is good. They should not be mirrored remotely as part of an extended cluster, as
this could allow two sub-clusters to continue working after a failure and potentially
lead to a split-brain or diverging database situation.
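The quorum-member (not mirror) semantics of voting disks can be sketched in Python; this toy model (an assumption for illustration, not Oracle's implementation) contrasts the two semantics to show why treating remote copies as mirrors is dangerous:

```python
# Simplified model (for illustration, not Oracle's actual implementation):
# under quorum semantics a node may stay in the cluster only if it sees a
# strict majority of all configured voting disks.
def can_continue(visible_disks, total_disks):
    return 2 * len(visible_disks) > total_disks

def mirror_semantics_ok(visible_disks):
    # If the copies were instead treated as remote mirrors, any single visible
    # copy would suffice -- after a site split BOTH sides could continue.
    return len(visible_disks) >= 1

# 3 voting disks kept (and locally mirrored) at one site: after an inter-site
# split only that site retains a majority, so exactly one sub-cluster survives.
assert can_continue({"vd1", "vd2", "vd3"}, total_disks=3)   # hosting site
assert not can_continue(set(), total_disks=3)               # remote site

# Spreading voting disks across both sites (2 + 2): after a split neither side
# sees a strict majority, so the whole cluster goes down -- and under mirror
# semantics both sides would keep running, the split-brain the text warns about.
assert not can_continue({"vd1", "vd2"}, total_disks=4)
assert mirror_semantics_ok({"vd1"}) and mirror_semantics_ok({"vd3"})
```

This is why the text recommends a local mirror plus, for higher availability, a third-site voting disk: the third copy lets either data site form a 2-of-3 majority after losing the other, while still guaranteeing a single survivor.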
APPENDIX B: CUSTOMERS USING RAC ON EXTENDED DISTANCE CLUSTERS
The Rover Group completed the first known implementation with a similar architecture in the mid 1990s using Oracle
Parallel Server. Since then other clients have implemented it with Real Application Clusters, including the following
examples.
The list below includes many of the known production customers running RAC on an extended cluster. Because Oracle9i
has been around for a longer period, it has a larger set of the production customers. Today the majority of new customers
implementing are doing so using Oracle 10g, Oracle Clusterware, and ASM to mirror the data between the sites.
Names marked with * have been modified to only show country or region and industry.

Name                                  Release  Nodes  Platform  OS       Clusterware      Stretch Distance (km)
Italian Financial Services firm*      10g      20     IBM       AIX      HACMP            0.2
Groupe Diffusion Plus                 10g      2      IBM       AIX      Oracle           0.5
Austrian IT Services Provider*        10g      2      IBM       AIX      HACMP            1
European Electronics firm*            9i       2      IBM       AIX      HACMP            8
US Police Department*                 9i       2      IBM       AIX      HACMP            3
European Government*                  9i       2      IBM       AIX      HACMP            8
US Broadcaster*                       9i       2      IBM       AIX      HACMP            0.2
Austrian Hospital*                    9i       2      IBM       AIX      HACMP            0.6
Brazilian Credit Union Network*       9i       3      IBM       AIX      HACMP            10
UzPromStroyBank                       9i       2      IBM       AIX      HACMP            1
Daiso Sangyo                          10g      2      HP        HP-UX    Oracle           10
US Fortune 100 firm*                  9i       2      HP        HP-UX    HP Serviceguard  2
Brazilian Hospital*                   9i       2      HP        HP-UX    HP Serviceguard  0.5
Italian Manufacturer*                 10g      4      HP        Linux    Oracle           0.8
Swedish Automotive Parts*             10g      2      IBM       Linux    Oracle           2
Austrian Health Provider*             10g      2      IBM       Linux    Oracle           0.3
Thomson Legal                         10g      8      Sun       Linux    Oracle           1
North American Lottery*               9i       4      HP        OpenVMS                   10
German Telecom*                       10g      4      Sun       Solaris  Sun Cluster      5
European Bank*                        10g      2      Sun       Solaris  Oracle           5
European Mobile Operator*             9i       3      Sun       Solaris  Veritas Cluster  48
Comic Relief                          9i       3      Sun       Solaris  Sun Cluster      8
German Bank*                          9i       2      Sun       Solaris                   12
European Mail*                        9i       2      Sun       Solaris  Veritas Cluster  12
European Government*                  9i       2      Sun       Solaris  Sun Cluster      0.4
UK University*                        9i       2      Sun       Solaris  Sun Cluster      0.8
Italian Telco*                        9i       2      Sun       Solaris  Sun Cluster      2
Austrian Railways                     9i       2      HP        Tru64    TruCluster       1.5
Nordac, Draeger                       9i       4      HP        Tru64    TruCluster       0.3
University of Melbourne               9i       3      HP        Tru64    TruCluster       0.8
European Electronics Components firm* 10g      2      IBM       Windows  Oracle           0.5
Danish Health firm*                   9i       6      Dell      Windows  Oracle           25
REFERENCES
Roland Knapp, Daniel Dibbets, Amit Das, Using standard NFS to support a
third voting disk on a stretch cluster configuration on Linux, September 2006
EMEA Joint Solutions Center Oracle/IBM, 10g RAC Release 2 High
Availability Test Over 2 distant sites on xSeries, July 2005
Paul Bramy (Oracle), Christine O'Sullivan (IBM), Thierry Plumeau (IBM)
at the EMEA Joint Solutions Center Oracle/IBM, Oracle9i RAC Metropolitan
Area Network implementation in an IBM Series environment, July 2003
Veritas, VRTS Volume Manager for Solaris: Performance Brief - Remote Mirroring
Using VxVM, December 2003
HP Oracle CTC, Extended Serviceguard cluster configurations: Detailed
configuration information for extended RAC on HP-UX clusters, November 2003
Mai Cutler (HP), Sandy Gruver (HP), Stefan Pommerenk (Oracle), Eliminate
the Current Physical Restrictions of a Single Oracle Cluster, OracleWorld San Francisco
2003
Joseph Algieri & Xavier Dahan (HP), Extended MC/ServiceGuard cluster
configurations (Metro clusters), Version 1.4, January 2002
Michael Hallas and Robert Smyth, Comic Relief Red Nose Day 2003 (RND03):
Installing a Three-Node RAC Cluster in a Dual-Site Configuration using an 8 km DWDM
Link, Issue 1, April 2003
Lawrence To, Oracle Database 10g Release 2: Roadmap to Maximum Availability
Architecture (MAA), April 2006
Michael T. Smith, Oracle Database 10g Release 2 Best Practices: Data Guard Redo
Transport & Network Configuration, August 2006
Oracle Technology Network, Oracle Data Guard and Remote Mirroring Solutions
Joseph Meeks, Michael T. Smith, Ashish Ray, Sadhana Kyathappala, Fast-
Start Failover Best Practices: Oracle Data Guard 10g Release 2, November 2005
Tim Read, Architecting Availability & Disaster Recovery Solutions, Sun
BluePrints OnLine, April 2006
Oracle Real Application Clusters on Extended Distance Clusters
October 2006
Author: Erik Peterson
Reviewers: Daniel Dibbets, Bill Bridge, Joseph Meeks
Oracle Corporation
World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065
U.S.A.
Worldwide Inquiries:
Phone: +1.650.506.7000
Fax: +1.650.506.7200
oracle.com
Copyright 2006, Oracle. All rights reserved.
This document is provided for information purposes only and the
contents hereof are subject to change without notice.
This document is not warranted to be error-free, nor subject to any
other warranties or conditions, whether expressed orally or implied
in law, including implied warranties and conditions of merchantability
or fitness for a particular purpose. We specifically disclaim any
liability with respect to this document and no contractual obligations
are formed either directly or indirectly by this document. This document
may not be reproduced or transmitted in any form or by any means,
electronic or mechanical, for any purpose, without our prior written permission.
Oracle, JD Edwards, PeopleSoft, and Siebel are registered trademarks of Oracle
Corporation and/or its affiliates. Other names may be trademarks
of their respective owners.