
Oracle RAC (Real Application Cluster)

When one computer is not powerful enough for an Oracle database, the solution is to use
several computers for the same database. In this case we still have one database, but
several computers use it. As you probably know, a database is nothing more than a set of
files used for keeping and managing data. To access that data we need processes, plus the
memory those processes use to accomplish their tasks; together these form an instance. So
one instance, or several instances, access the database, while the database itself always
lives on disk (parts of it are temporarily loaded into memory for managing the database
and the application data).
When several instances access the same data (database), those instances must be put in a
cluster. The clusterware ensures that data management is done correctly (for instance,
the same data is not modified at the same time by 2 users, even if the users reach the
database through 2 or more instances). The clusterware can be bought from a vendor other
than the database vendor, and Oracle offers a clusterware solution as well; in this case
we speak about Oracle Clusterware. When the Oracle database is installed on clusterware
(Oracle's or a third party's) we speak about Oracle RAC, or Oracle Real Application
Clusters. Oracle RAC and Oracle Clusterware are not necessarily the same thing, although
the Oracle RAC installation includes the Oracle Clusterware installation.
Here is an image which shows how Oracle Real Application Clusters (Oracle RAC) works:

Oracle RAC Architecture


1. Where are the clusterware files stored in a RAC environment?

The clusterware is installed on each node (in an Oracle home) and on the
shared disks (the voting disks and the OCR file)

2. Where are the database software files stored in a RAC environment?

The base software is installed on each node of the cluster, and the database
files reside on the shared disks.

3. What kind of storage can we use for the shared clusterware files?

- OCFS (Release 1 or 2)
- raw devices
- third party cluster filesystems such as GPFS or Veritas

4. What kind of storage can we use for the RAC database storage?

- OCFS (Release 1 or 2)
- ASM
- raw devices
- third party cluster filesystems such as GPFS or Veritas

5. What is a CFS?

A Cluster File System (CFS) is a file system that may be accessed (read and
write) by all members of a cluster at the same time. This implies that all
members of a cluster have the same view of the file system.

6. What is an OCFS2?

OCFS2 is version 2 of the Oracle Cluster File System, which can be used
for Oracle Real Application Clusters.

7. Which files can be placed on an Oracle Cluster File System?

- Oracle software installation (Windows only)
- Oracle files (control files, datafiles, redo logs, files described by the BFILE
datatype)
- Shared configuration files (SPFILE)
- OCR and voting disk
- Files created by Oracle during runtime

Note: There are some platform specific limitations.

8. Do you know other cluster vendors?

HP Tru64 Unix, Veritas, Microsoft

9. How is it possible to install RAC if we don't have a CFS?

This is possible by using raw devices.

10. What is a raw device?

A raw device is a disk drive that does not yet have a file system set up. Raw
devices are used for Real Application Clusters since they enable the sharing
of disks.

11. What is a raw partition?

A raw partition is a portion of a physical disk that is accessed at the lowest
possible level. A raw partition is created when an extended partition is
created and logical partitions are assigned to it without any formatting. Once
formatted, it is called a cooked partition.

12. When to use CFS over raw?

A CFS offers:
- Simpler management
- Use of Oracle Managed Files with RAC
- Single Oracle software installation
- Autoextend enabled on Oracle datafiles
- Uniform accessibility to archive logs in case of physical node failure
- With the Oracle_Home on CFS, when you apply Oracle patches, CFS guarantees
that the updated Oracle_Home is visible to all nodes in the cluster.

Note: This option is very dependent on the availability of a CFS on your
platform.

13. When to use raw over CFS?

- Always when a CFS is not available or not supported by Oracle.
- When performance is critical: raw devices offer the best performance,
with no intermediate layer between Oracle and the disk.

Note: Autoextend fails on raw devices if the space is exhausted; however,
space can be added online if needed.

14. What is CRS?

Oracle RAC 10g Release 1 introduced Oracle Cluster Ready Services (CRS), a
platform-independent set of system services for cluster environments. In
Release 2, Oracle has renamed this product to Oracle Clusterware.

1. What is the VIP used for?

The VIP returns a dead-connection error IMMEDIATELY when its node fails.
Without the VIP, clients have to wait around 10 minutes (a TCP timeout) to
receive ORA-3113: end-of-file on communication channel. Using
Transparent Application Failover (TAF) can also avoid ORA-3113.
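As a hedged illustration of TAF, here is the general shape of a tnsnames.ora entry with a FAILOVER_MODE clause. The service name MYDB and the hosts node1-vip/node2-vip are made up; the snippet only writes the entry to a scratch file so the shape can be inspected anywhere.

```shell
# Hypothetical TAF entry -- the service name and VIP host names are invented.
cat > /tmp/tnsnames_taf_example.ora <<'EOF'
MYDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVICE_NAME = MYDB)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 20)(DELAY = 5))
    )
  )
EOF
# The FAILOVER_MODE clause is what enables TAF on the client side.
grep -c 'FAILOVER_MODE' /tmp/tnsnames_taf_example.ora
```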

2. Why do we need SSH or RSH configured on the RAC nodes?

SSH (Secure Shell, 10g+) or RSH (Remote Shell, 9i+) allows the oracle UNIX
account to connect to another RAC node and copy files / run commands as the
local oracle UNIX account.

3. Are SSH or RSH needed for normal RAC operations?

No. SSH or RSH are needed only for RAC installation, patch set installation
and clustered database creation.

4. Do we have to have the Oracle RDBMS on all nodes?

Each node of a cluster that is being used for a clustered database will
typically have the RDBMS and RAC software loaded on it, but not the actual
data files (these need to be available via shared disk).

5. What are the restrictions on the SID with a RAC database? Is it limited to
5 characters?

The SID prefix in 10g Release 1 and prior versions was restricted to five
characters by the install/config tools, so that an ORACLE_SID of up to a
maximum of 5+3=8 characters can be supported in a RAC environment. The SID
prefix limit is relaxed to 8 characters in 10g Release 2; see bug 4024251 for
more information.
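The 5-character-prefix rule above can be sketched as a small shell check. The splitting of the SID into prefix and instance number is a simplification for illustration only.

```shell
# Sketch: check an ORACLE_SID against the 10g Release 1 limits described
# above (prefix max 5 characters, whole SID max 8).
check_sid() {
  sid=$1
  prefix=${sid%%[0-9]*}          # strip the trailing instance number
  if [ ${#prefix} -le 5 ] && [ ${#sid} -le 8 ]; then
    echo "$sid: OK for 10g Release 1"
  else
    echo "$sid: too long for 10g Release 1"
  fi
}

check_sid MYDB1        # prefix MYDB (4 chars)     -> OK
check_sid LONGNAME1    # prefix LONGNAME (8 chars) -> too long
```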

6. Does Real Application Clusters support heterogeneous platforms?

No. Real Application Clusters does not support heterogeneous platforms in the
same cluster.

7. Are there any issues for the interconnect when sharing the same switch as
the public network, using a VLAN to separate the networks?

RAC and Clusterware deployment best practices suggest that the
interconnect (private connection) be deployed on a stand-alone, physically
separate, dedicated switch. On a large shared network the interconnect can
become unstable.

8. What is the Load Balancing Advisory?

To assist in balancing application workload across designated resources,
Oracle Database 10g Release 2 provides the Load Balancing Advisory. The
Advisory monitors the current workload activity across the cluster, and for
each instance where a service is active it provides a percentage value of how
much of the total workload should be sent to that instance, as well as a
service-quality flag.

9. How many nodes are supported in a RAC database?

With 10g Release 2, 100 nodes are supported in a cluster using Oracle
Clusterware, and 100 instances in a RAC database. Currently DBCA has a bug
where it will not go beyond 63 instances, and there is also a documentation
bug for the max-instances parameter. With 10g Release 1 the maximum is 63.

10. What is the Cluster Verification Utility (cluvfy)?

The Cluster Verification Utility (CVU) is a validation tool that you can use to
check all the important components that need to be verified at different
stages of deployment in a RAC environment.

11. What versions of the database can I use the Cluster Verification Utility
(cluvfy) with?

The Cluster Verification Utility is released with Oracle Database 10g Release 2,
but it can also be used with Oracle Database 10g Release 1.

12. If I am using vendor clusterware such as Veritas, IBM, Sun or HP, do I
still need Oracle Clusterware to run Oracle RAC 10g?

Yes. Where certified, you can use vendor clusterware; however, you must still
install and use Oracle Clusterware for RAC. The best practice is to leave Oracle
Clusterware to manage RAC. For details see Metalink Note 332257.1, and for
Veritas SFRAC see Note 397460.1.

13. Is RAC on VMWare supported?

No.

14. What is the hangcheck timer used for?

The hangcheck timer regularly checks the health of the system. If the system
hangs or stops, the node is restarted automatically.
There are 2 key parameters for this module:
-> hangcheck-tick: defines the period of time between checks of system
health. The default value is 60 seconds; Oracle recommends setting it to
30 seconds.
-> hangcheck-margin: defines the maximum hang delay that should be
tolerated before hangcheck-timer resets the RAC node.
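The interaction of the two parameters can be sketched as simple arithmetic: a hung node is reset after roughly tick + margin seconds. The 180 s margin below is a commonly cited module default, but that value is an assumption here; check the actual hangcheck-timer parameters on your system.

```shell
# Sketch of the two module parameters described above.
hangcheck_tick=30      # seconds between health checks (Oracle-recommended value)
hangcheck_margin=180   # maximum tolerated hang, in seconds (assumed default)
worst_case=$((hangcheck_tick + hangcheck_margin))
echo "a hung node is reset after at most ${worst_case}s"
```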

15. Is the hangcheck timer still needed with Oracle RAC 10g?

Yes.

16. What files can I put on Linux OCFS2?

For optimal performance, you should only put the following files on Linux
OCFS2:
- Datafiles
- Control Files
- Redo Logs
- Archive Logs
- Shared Configuration File (OCR)
- Voting File
- SPFILE

17. Is it possible to use ASM for the OCR and voting disk?

No, in this release the OCR and voting disk must be on raw devices or a CFS
(cluster file system). (From 11g Release 2 onward they can be stored in ASM.)

18. Can I change the name of my cluster after I have created it when I am
using Oracle Clusterware?

No, you must properly uninstall Oracle Clusterware and then re-install.

19. What is O2CB?

O2CB is the OCFS2 cluster stack. OCFS2 includes some services, and these
services must be started before using OCFS2 (mounting/formatting the file
systems).

20. What is the OCR file used for?

The OCR (Oracle Cluster Registry) is a file that manages the cluster and RAC
configuration.

21. What is the Voting Disk file used for?

The voting disk is nothing but a file that contains and manages information
about all the node memberships.

Oracle RAC clusterware startup sequence (11gR2)

ohasd
 -> orarootagent
      -> ora.cssdmonitor : monitors CSSD and node health (along with the
                           cssdagent); tries to restart the node if the node
                           is unhealthy
      -> ora.ctssd : Cluster Time Synchronization Services daemon
      -> ora.crsd
           -> oraagent
                -> ora.LISTENER.lsnr
                -> ora.LISTENER_SCANn.lsnr
                -> ora.ons
                -> ora.eons
                -> ora.asm
                -> ora.DB.db
           -> orarootagent
                -> ora.nodename.vip
                -> ora.net1.network
                -> ora.gns.vip
                -> ora.gnsd
                -> ora.SCANn.vip
 -> cssdagent
      -> ora.cssd : Cluster Synchronization Services
 -> oraagent
      -> ora.mdnsd : used for DNS lookup
      -> ora.evmd
      -> ora.asmd
      -> ora.gpnpd : Grid Plug and Play = adding a node to the cluster is
                     easier (less configuration is needed for the new node)

In the original diagram, resources written in blue & bold are owned by root;
the other resources are owned by oracle (all this on a UNIX environment).
When a resource is managed by root, the crsctl command must be run as root;
for the others it can be run as oracle.
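The daemons in the tree above can be spotted by filtering a process listing. The sketch below runs the filter against a canned sample so it works anywhere; on a real node you would pipe `ps -eo comm` into the same grep.

```shell
# Sketch: filter a process listing for the main clusterware daemons.
# The sample below is canned data standing in for real ps output.
sample_ps='ohasd.bin
crsd.bin
ocssd.bin
evmd.bin
sshd
bash'
printf '%s\n' "$sample_ps" | grep -E 'ohasd|crsd|ocssd|evmd'
```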
Clusterware Resource Status Check

$ crsctl status resource -t

--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER       STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       tzdev1rac
               ONLINE  ONLINE       tzdev2rac
ora.asm
               ONLINE  ONLINE       tzdev1rac
               ONLINE  ONLINE       tzdev2rac
ora.eons
               OFFLINE OFFLINE      tzdev1rac
               OFFLINE OFFLINE      tzdev2rac
ora.gsd
               OFFLINE OFFLINE      tzdev1rac
               OFFLINE OFFLINE      tzdev2rac
ora.net1.network
               ONLINE  ONLINE       tzdev1rac
               ONLINE  ONLINE       tzdev2rac
ora.ons
               ONLINE  ONLINE       tzdev1rac
               ONLINE  ONLINE       tzdev2rac
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       tzdev1rac
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       tzdev2rac
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       tzdev2rac
ora.oc4j
      1        OFFLINE OFFLINE
ora.scan1.vip
      1        ONLINE  ONLINE       tzdev1rac
ora.scan2.vip
      1        ONLINE  ONLINE       tzdev2rac
ora.scan3.vip
      1        ONLINE  ONLINE       tzdev2rac
ora.trezor.db
      1        ONLINE  ONLINE       tzdev1rac    Open
      2        ONLINE  ONLINE       tzdev2rac    Open
ora.tzdev1rac.vip
      1        ONLINE  ONLINE       tzdev1rac
ora.tzdev2rac.vip
      1        ONLINE  ONLINE       tzdev2rac
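A listing like the one above can be post-processed to show only problem resources. The sketch below filters OFFLINE lines with awk; it is demonstrated on a canned, flattened sample (resource name and state on one line) so it runs anywhere, whereas the real `crsctl status resource -t` output splits them across lines.

```shell
# Sketch: list the names of OFFLINE resources from a (flattened) status sample.
sample_status='ora.gsd        OFFLINE OFFLINE tzdev1rac
ora.ons        ONLINE  ONLINE  tzdev1rac
ora.oc4j       OFFLINE OFFLINE'
printf '%s\n' "$sample_status" | awk '/OFFLINE/ {print $1}'
```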

crsctl start has -> starts all the clusterware services/resources (including
the database instances and the listeners);
crsctl stop has -> stops all the clusterware services/resources (including
the database instances and the listeners);

crsctl check has -> checks whether ohasd is running or stopped

crsctl check has
CRS-4638: Oracle High Availability Services is online
>> the ohasd daemon is running => the clusterware is (must be) up and
running (if no error occurs).

crsctl check has
CRS-4639: Could not contact Oracle High Availability Services
>> the ohasd daemon is NOT running => the clusterware is DOWN (stopped).

crsctl enable has -> enable Oracle High Availability Services autostart
crsctl disable has -> disable Oracle High Availability Services autostart

crsctl config has -> check whether Oracle High Availability Services autostart
is enabled or disabled.

crsctl config has


CRS-4622: Oracle High Availability Services autostart is enabled.
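The CRS-46xx messages above can be turned into a simple up/down answer in a wrapper script. This is a sketch: the message texts matched are exactly the two examples shown above.

```shell
# Sketch: classify the message printed by `crsctl check has`.
has_state() {
  case "$1" in
    *CRS-4638*) echo up ;;       # Oracle High Availability Services is online
    *CRS-4639*) echo down ;;     # could not contact the services
    *)          echo unknown ;;
  esac
}

has_state 'CRS-4638: Oracle High Availability Services is online'
has_state 'CRS-4639: Could not contact Oracle High Availability Services'
```

On a real node you would call it as `has_state "$(crsctl check has)"`.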

Useful information can be found in Metalink Note 1053147.1: 11gR2
Clusterware and Grid Home - What You Need to Know

Oracle RAC Clusterware Components


1) Cluster Ready Services (CRS)

Functionality: RAC resource monitoring/management ==> all changes are
written to the OCR
- start and stop of the resources
- failover of the application resources
- node recovery
- automatic restart of RAC resources when a failure occurs

RAC resources: a database, an instance, a service, a listener, a virtual
IP (VIP) address, an application process

Daemon process (on AIX, UNIX, Linux): crsd

Run as (on AIX, UNIX, Linux): root

ps -ef | grep crsd


root 221524 1 0 May 26 - 3:33 /oracle/crs/10.2/bin/crsd.bin reboot

Failure of the process: the crsd restarts automatically, without
restarting the node.

CRSd can run in 2 modes:

reboot mode -> when crsd starts, all the resources are restarted.

restart mode -> when crsd starts, the resources are started in the state they
were in before the shutdown.

When CRS is installed on a cluster where a 3rd-party clusterware is
integrated (there are 2 clusterwares on the cluster), CRSd manages:
- Oracle RAC services and resources

When CRS is the ONLY clusterware on the cluster, CRSd manages:
- Oracle RAC services and resources
- the node membership functionality (done by CSSd, but CSSd is managed
by CRSd)

COMMENT:
In order to start crsd:
- the public interface, the private interface and the virtual IP (VIP)
must be up and running;
- these IPs must be able to ping each other.

2) Cluster Synchronization Services (CSS)

Functionality: enables basic cluster services ==> new/lost node information
is written to the OCR
- the node membership functionality
- basic locking

Daemon process (on AIX, UNIX, Linux): ocssd

Run as (on AIX, UNIX, Linux): oracle

ps -ef | grep ocssd
oracle 229642 241940 1 May 26 - 3:36 /oracle/crs/10.2/bin/ocssd.bin

Failure of the process: node restart.

3) Event Management (EVM)

Functionality: a background process that publishes the events that CRS
creates.

Daemon process (on AIX, UNIX, Linux): evmd

Run as (on AIX, UNIX, Linux): oracle

ps -ef | grep evmd
oracle 229633 241356 1 May 26 - 3:36 /oracle/crs/10.2/bin/evmd.bin

Failure of the process: the evmd restarts automatically, without
restarting the node.

4) Oracle Notification Service (ONS)

Functionality: a publish-and-subscribe service for communicating Fast
Application Notification (FAN) events to clients.

Daemon process (on AIX, UNIX, Linux): ons

Run as (on AIX, UNIX, Linux): oracle

ps -ef | grep ons
oracle 233968 1 0 May 28 - 0:00 /oracle/crs/10.2/opmn/bin/ons -d

Failure of the process: -

5) RACG

Functionality: runs server callout scripts when FAN events occur.

Daemon process: racgimon (AIX), racgmain (Linux)

Run as (on AIX, UNIX, Linux): oracle

ps -ef | grep racgimon
oracle 385292 1 0 11:40:08 - 0:03 /oracle/db/10.2/bin/racgimon startd MYDB

Failure of the process: -

6) Process Monitor Daemon (OPROCD)
(replaced by the cssdagent from 11gR2)

Functionality: the I/O fencing solution, which monitors the cluster node;
when node Nx fails, no other node can modify the Nx node's resources
(I/O fencing).

Daemon process (on AIX, Linux): oprocd

Run as (on AIX, UNIX, Linux): root

ps -ef | grep oprocd
root 184346 201058 0 May 27 - 0:06 /oracle/crs/10.2/bin/oprocd run
-t 1000 -m 400

COMMENT: In this case oprocd wakes up every 1000 ms (-t) to get the
current time. If the time is within the 400 ms margin (-m) of the expected
value, it goes back to sleep; otherwise it reboots the node.

Failure of the process: node restart (= node reboot).
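The oprocd check described in the comment above can be sketched as a small decision function: given the observed gap between wakeups, reboot only when the clock drifted past the margin. The millisecond values mirror the `-t 1000 -m 400` flags from the ps output.

```shell
# Sketch of the oprocd decision: wake every t_ms and reboot if the
# observed gap exceeds the expected gap by more than m_ms.
t_ms=1000; m_ms=400
check_drift() {            # $1 = observed gap between wakeups, in ms
  drift=$(( $1 - t_ms ))
  if [ "$drift" -gt "$m_ms" ]; then echo reboot; else echo sleep; fi
}

check_drift 1200   # 200 ms late -> sleep
check_drift 1500   # 500 ms late -> reboot
```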
