
Oracle Clusterware has two key components: the Oracle Cluster Registry (OCR) and the voting disk.


The cluster registry holds all information about nodes, instances, services and ASM storage (if used). It also contains state information, i.e. whether they are available and up.
The voting disk is used to determine whether a node has failed, i.e. become separated from the majority. If a node is deemed to no longer belong to the majority, it is forcibly rebooted and, after the reboot, adds itself back to the surviving cluster nodes.

What is a virtual IP address or VIP?


A virtual IP address (VIP) is an alternate IP address that client connections use instead of the standard public IP address. To configure VIP addresses, we need to reserve a spare IP address for each node, and these IP addresses must be on the same subnet as the public network.
What is the use of VIP?
If a node fails, the node's VIP address fails over to another node. There the VIP address can accept TCP connections, but it cannot accept Oracle connections.
Give situations under which VIP address failover happens:
VIP address failover happens when the node on which the VIP address runs fails, when all interfaces for the VIP address fail, or when all interfaces for the VIP address are disconnected from the network.
Using a virtual IP avoids the TCP/IP timeout problem, because Oracle Notification Service maintains communication between the nodes and listeners.
What is the significance of VIP address failover?
When a VIP address failover happens, clients that attempt to connect to the VIP address receive a rapid "connection refused" error. They don't have to wait for TCP connection timeout messages.
What is voting disk?

The voting disk is a file that sits in the shared storage area and must be accessible by all nodes in the cluster. All nodes in the cluster register their heartbeat information in the voting disk to confirm that they are operational. If the heartbeat information of any node is not available in the voting disk, that node is evicted from the cluster. The CSS (Cluster Synchronization Services) daemon in the clusterware maintains the heartbeat of all nodes to the voting disk. When a node is not able to send its heartbeat to the voting disk, it reboots itself, which helps avoid the split-brain syndrome.
For high availability, Oracle recommends an odd number of voting disks, with a minimum of three.
The voting disk resides on shared storage and manages cluster membership. It reassigns cluster ownership between the nodes in case of failure.
The voting disk files are used by Oracle Clusterware to determine which nodes are currently members of the cluster. They are also used in concert with other cluster components, such as CRS, to maintain the cluster's integrity.
Oracle Database 11g Release 2 provides the ability to store the
voting disks in ASM along with the OCR. Oracle Clusterware can
access the OCR and the voting disks present in ASM even if the ASM
instance is down. As a result CSS can continue to maintain the
Oracle cluster even if the ASM instance has failed.
How many voting disks are you maintaining?
By default Oracle will create 3 voting disk files in ASM.
Oracle expects that you will configure at least 3 voting disks for redundancy purposes. You should always configure an odd number of voting disks, 3 or more. This is because each node must be able to access more than half of the voting disks; losing half or more of them causes the cluster to fail.
You should plan on allocating 280MB for each voting disk file. For
example, if you are using ASM and external redundancy then you
will need to allocate 280MB of disk for the voting disk. If you are
using ASM and normal redundancy you will need 560MB.
Why do we need to keep an odd number of voting disks?
Oracle expects that you will configure at least 3 voting disks for redundancy purposes, and always an odd number. Each node must be able to access more than half of the voting disks, so losing half or more of them causes the cluster to fail. An even number adds no protection: 4 voting disks tolerate the loss of only 1 disk, the same as 3.
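The arithmetic behind the odd-number recommendation can be sketched in a few lines of Python. This is only an illustration, assuming the rule that a node must reach a strict majority (more than half) of the voting disks; the function names are made up for this example and are not part of any Oracle tool.

```python
def majority(n_disks: int) -> int:
    """Smallest number of voting disks a node must reach (strict majority)."""
    return n_disks // 2 + 1


def tolerated_failures(n_disks: int) -> int:
    """How many voting disks can be lost while a strict majority remains."""
    return n_disks - majority(n_disks)


# Print a small table: disks configured, majority needed, failures tolerated.
for n in range(1, 7):
    print(n, majority(n), tolerated_failures(n))
```

Going from 3 to 4 disks does not raise the number of tolerated failures, which is why an odd count is the usual choice.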
What are Oracle RAC software components?
Oracle RAC is composed of two or more database instances. Each is composed of memory structures and background processes, the same as a single-instance database. Oracle RAC instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service), that enable Cache Fusion. Oracle RAC instances also run the following background processes:
ACMS - Atomic Controlfile to Memory Service
GTX0-j - Global Transaction Process
LMON - Global Enqueue Service Monitor
LMD - Global Enqueue Service Daemon
LMS - Global Cache Service Process
LCK0 - Instance Enqueue Process
RMSn - Oracle RAC Management Processes
RSMN - Remote Slave Monitor
What are Oracle Clusterware processes for 10g?
Cluster Synchronization Services (ocssd) - Manages cluster node membership and runs as the oracle user; failure of this process results in cluster restart.
Cluster Ready Services (crsd) - The crs process manages cluster resources (which could be a database, an instance, a service, a listener, a virtual IP (VIP) address, an application process, and so on) based on the resource's configuration information that is stored in the OCR. This includes start, stop, monitor and failover operations. This process runs as the root user.
Event manager daemon (evmd) - A background process that publishes events that crs creates.
Process Monitor Daemon (OPROCD) - This process monitors the cluster and provides I/O fencing. OPROCD performs its check, stops running, and if the wake-up is beyond the expected time, OPROCD resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.
RACG (racgmain, racgimon) - Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.
What are Oracle database background processes specific to RAC?
LMS - Global Cache Service Process
LMD - Global Enqueue Service Daemon
LMON - Global Enqueue Service Monitor
LCK0 - Instance Enqueue Process

Oracle RAC instances use two processes, the Global Cache Service
(GCS) and the Global Enqueue Service (GES). The GCS and GES
maintain records of the statuses of each data file and each cached
block using a Global Resource Directory (GRD). The GRD contents
are distributed across all of the active instances.
What is Cache Fusion?
The transfer of data across instances through the private interconnect is called Cache Fusion. Oracle RAC is composed of two or more instances. When a block of data has been read from a datafile by one instance in the cluster and another instance needs the same block, it is faster to get the block image from the instance that has the block in its SGA than to read it from disk. To enable inter-instance communication, Oracle RAC makes use of interconnects. The Global Enqueue Service (GES) monitors, and the Instance Enqueue Process manages, Cache Fusion.
What is SCAN? (11gR2 feature)
Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g Release 2 feature that provides a single name for clients to access an Oracle Database running in a cluster. The benefit is that clients using SCAN do not need to change if you add or remove nodes in the cluster.
SCAN provides a single domain name (via DNS), allowing end users to address a RAC cluster as if it were a single IP address. SCAN works by replacing a hostname or IP list with virtual IP addresses (VIPs).
SCAN is meant to provide a single name for all Oracle clients to connect to the cluster database, irrespective of the number of nodes and their location. Before SCAN, we had to keep changing the address records in every client's tnsnames.ora whenever a node was added to or deleted from the cluster.
SCAN eliminates the need to change the TNSNAMES entry when nodes are added to or removed from the cluster. RAC instances register to SCAN listeners as remote listeners. Oracle recommends assigning 3 addresses to SCAN, which will create 3 SCAN listeners, even if the cluster has dozens of nodes. SCAN is a domain name registered to at least one and up to three IP addresses, either in DNS (Domain Name Service) or GNS (Grid Naming Service). The SCAN must resolve to at least one address on the public network. For high availability and scalability, Oracle recommends configuring the SCAN to resolve to three addresses.

What are SCAN components in a cluster?


1.SCAN Name
2.SCAN IPs (3)
3.SCAN Listeners (3)
What is FAN?
Fast Application Notification (FAN) relates to events concerning instances, services and nodes. It is a notification mechanism that Oracle RAC uses to notify other processes about configuration and service-level information, including service status changes such as UP or DOWN events. Applications can respond to FAN events and take immediate action.
What is TAF?
TAF (Transparent Application Failover) is a configuration that allows session failover between different nodes of a RAC database cluster.
If a communication link failure occurs after a connection is established, the connection fails over to another active node. Any disrupted transactions are rolled back, and session properties and server-side program variables are lost. In some cases, if the statement executing at the time of the failover is a SELECT statement, that statement may be automatically re-executed on the new connection, with the cursor positioned on the row on which it was positioned prior to the failover.
After an Oracle RAC node crashes, usually from a hardware failure, all new application transactions are automatically rerouted to a specified backup node. The challenge in rerouting is to not lose transactions that were in flight at the exact moment of the crash. One of the requirements of continuous availability is the ability to restart in-flight application transactions, allowing a failed node to resume processing on another server without interruption. Oracle's answer to application failover is an Oracle Net mechanism called Transparent Application Failover. TAF allows the DBA to configure the type and method of failover for each Oracle Net client.
The TAF architecture offers the ability to restart transactions at either the statement (SELECT) or session level.
What are the requirements for Oracle Clusterware?
1. External shared disk to store the Oracle Clusterware files (voting disk and Oracle Cluster Registry, OCR)
2. Two network cards on each clusterware node (and three sets of IP addresses):
Network Card 1 (with IP address set 1) for the public network
Network Card 2 (with IP address set 2) for the private network (for inter-node communication between RAC nodes, used by the clusterware and the RAC database)
IP address set 3 for the virtual IP (VIP) (used as the virtual IP address for client connections and for connection failover)
3. Storage option for OCR and voting disk: RAW, OCFS2 (Oracle Cluster File System), NFS, ..
What enables the load balancing of applications in RAC?
Oracle Net Services enable the load balancing of application
connections across all of the instances in an Oracle RAC database.
How to find the location of the OCR file when CRS is down?
When CRS is down:
Look in the ocr.loc file. The location of this file depends on the OS:
On Linux: /etc/oracle/ocr.loc
On Solaris: /var/opt/oracle/ocr.loc
When CRS is up:
Set the ASM environment or CRS environment, then run the command below:
ocrcheck
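When CRS is down and ocr.loc has to be read by hand or by a script, the file is a short list of key=value lines. The sketch below parses such text in Python. It assumes the usual ocrconfig_loc, ocrmirrorconfig_loc and local_only keys; the sample paths are hypothetical and only stand in for whatever the file contains on a real node.

```python
def parse_ocr_loc(text: str) -> dict:
    """Parse the key=value lines of an ocr.loc file into a dictionary."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical file contents, for illustration only.
SAMPLE = """\
ocrconfig_loc=/u02/ocfs2/ocr/OCRFile
ocrmirrorconfig_loc=/u02/ocfs2/ocr/OCRFile_mirror
local_only=FALSE
"""

info = parse_ocr_loc(SAMPLE)
print("OCR location:", info["ocrconfig_loc"])
print("OCR mirror:", info.get("ocrmirrorconfig_loc", "not configured"))
```

On a real node the text would come from reading /etc/oracle/ocr.loc (Linux) or /var/opt/oracle/ocr.loc (Solaris) instead of the SAMPLE string.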
In a 2-node RAC, how many NICs are you using?
2 network cards on each clusterware node:
Network Card 1 (with IP address set 1) for the public network
Network Card 2 (with IP address set 2) for the private network (for inter-node communication between RAC nodes, used by the clusterware and the RAC database)
In a 2-node RAC, how many IPs are you using?
6, in 3 sets of IP addresses:
## eth1-Public: 2
## eth0-Private: 2
## VIP: 2
How to find IP information in RAC?
Check the /etc/hosts file, which looks like the example below:
# Do not remove the following line, or various programs
# that requires network functionality will fail.
127.0.0.1 localhost.localdomain localhost
## Public Node names
192.168.10.11 node1-pub.hingu.net node1-pub
192.168.10.22 node2-pub.hingu.net node2-pub
## Private Network (Interconnect)
192.168.0.11 node1-prv node1-prv
192.168.0.22 node2-prv node2-prv
## Private Network (Network Area storage)
192.168.1.11 node1-nas node1-nas
192.168.1.22 node2-nas node2-nas
192.168.1.33 nas-server nas-server
## Virtual IPs
192.168.10.111 node1-vip.hingu.net node1-vip
192.168.10.222 node2-vip.hingu.net node2-vip
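A hosts file laid out like the one above can be summarised with a short script. The Python sketch below groups entries into public, private and VIP addresses. It assumes the -pub, -prv and -vip naming suffixes used in this example file, which are a site convention and not something Oracle requires.

```python
from collections import defaultdict

# Naming suffixes assumed from the example hosts file (a site convention).
SUFFIX_TO_ROLE = {"-pub": "public", "-prv": "private", "-vip": "vip"}


def classify_hosts(text: str) -> dict:
    """Group hosts-file entries by RAC network role, based on the short alias."""
    roles = defaultdict(list)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) < 2:
            continue
        ip, short_name = fields[0], fields[-1]
        for suffix, role in SUFFIX_TO_ROLE.items():
            if short_name.endswith(suffix):
                roles[role].append((short_name, ip))
                break
    return dict(roles)


SAMPLE = """\
127.0.0.1 localhost.localdomain localhost
## Public Node names
192.168.10.11 node1-pub.hingu.net node1-pub
192.168.10.22 node2-pub.hingu.net node2-pub
## Private Network (Interconnect)
192.168.0.11 node1-prv node1-prv
192.168.0.22 node2-prv node2-prv
## Virtual IPs
192.168.10.111 node1-vip.hingu.net node1-vip
192.168.10.222 node2-vip.hingu.net node2-vip
"""

for role, entries in classify_hosts(SAMPLE).items():
    print(role, len(entries), entries)
```

For the two-node example this yields two addresses per role, matching the count of six IPs given earlier.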
What is the difference between the RAC IP addresses?
The public IP address is the normal IP address typically used by the DBA and SA to manage storage, system and database. It sits on the public network.
The private IP address is used only for internal cluster processing (Cache Fusion), also known as the interconnect. Private IP addresses are reserved for the private network.
The VIP is used by database applications to enable failover when one cluster node fails. The purpose of having a VIP is so that client connections can fail over to surviving nodes in case of a failure.
Can an application developer access the private IP?
No. The private IP address is used only for internal cluster processing (Cache Fusion), also known as the interconnect.
What is GRD?
GRD stands for Global Resource Directory. It plays a key role in Cache Fusion. The GES and GCS maintain the information in the GRD. It consists of:
Data block address
Most recent data block version information
Role of the data block (Local/Global)
Mode of the data block (Shared/Exclusive)
What is Cache Fusion?
Cache Fusion is the mechanism that transfers data blocks from the memory of one node to the memory of another.
If two nodes require the same block for query or update, the block must be transferred from the cache of one node to the other.
A RAC system must be equipped with a low-latency, high-speed interconnect to make this happen.
What are the different network components in 10g RAC?
Public IP, private IP and VIP components in 10g.
SCAN and SCAN listeners in 11g, along with the 10g components.
Mention the Oracle RAC background processes:
ACMS - Atomic Controlfile to Memory Service
GTX0-j - Global Transaction Process
LMON - Global Enqueue Service Monitor
LMD - Global Enqueue Service Daemon
LMS - Global Cache Service Process
LCK0 - Instance Enqueue Process
RMSn - Oracle RAC Management Processes
RSMN - Remote Slave Monitor
DIAG - Diagnosability Daemon
ACMS:
Atomic Controlfile to Memory Service is an agent that ensures that a distributed SGA memory update is either globally committed on success or globally aborted on failure.
GTX0-j:
This process provides transparent support for XA global transactions in a RAC environment. The database auto-tunes the number of these processes based on the workload of XA global transactions.
LMON:
Global Enqueue Service Monitor. This process monitors global enqueues and resources across the cluster and performs global enqueue recovery operations.
LMD:
Global Enqueue Service Daemon. This process manages incoming remote resource requests within each instance and detects deadlocks.
LMS:
Global Cache Service Process. This process records cached block information in the Global Resource Directory (GRD). It also controls the flow of messages to remote instances and transmits block images between the buffer caches of different instances. This processing is part of the Cache Fusion feature.
LCK0:
Instance Enqueue Process. This process manages non-Cache Fusion resource requests, such as library and row cache requests.
DIAG:
Captures diagnostic information about instances.
RMSn:
Oracle RAC Management Processes. These processes perform manageability tasks for Oracle RAC, such as the creation of resources when new instances are added.
RSMN:
Remote Slave Monitor. This process manages background slave process creation and communication on remote instances.
Give Details on Cache Fusion:
Oracle RAC is composed of two or more instances. When a block of data has been read from a datafile by one instance in the cluster and another instance needs the same block, it is faster to get the block image from the instance that has the block in its SGA than to read it from disk. To enable inter-instance communication, Oracle RAC makes use of interconnects. The Global Enqueue Service (GES) monitors, and the Instance Enqueue Process manages, Cache Fusion.

What are the major RAC wait events?
In a RAC environment the buffer cache is global across all instances in the cluster, and hence the processing differs. The most common wait events related to this are gc cr request and gc buffer busy.
gc cr request: the time it takes to retrieve the data from the remote cache.
gc buffer busy: the time the remote instance locally spends accessing the requested data block.
What components in RAC must reside in shared storage?
All datafiles, control files, SPFILEs and redo logs
Give few examples for solutions that support cluster storage:
ASM , raw disk devices, network file system(NFS), OCFS2 and OCFS
What is an interconnect?
An interconnect is a private network that connects all of the servers in a
cluster. The interconnect network uses a switch/multiple switches that only
the nodes in the cluster can access.
Cluster interconnect is used by the Cache fusion for inter instance
communication.
RAC uses interconnect for cache fusion (UDP) and inter-process
communication (TCP).
How can we configure the cluster interconnect?
Configure User Datagram Protocol (UDP) on Gigabit Ethernet for the cluster interconnect.
On Unix and Linux, UDP and RDS (Reliable Datagram Sockets) are used; on Windows, the TCP protocol.

What is the use of a service in an Oracle RAC environment?
Services enable us to define rules to control how users and applications connect to database instances.
What enables the load balancing of applications in RAC?
Oracle Net Services enable the load balancing of application connections across all of the instances in an Oracle RAC database.
How do we verify that RAC instances are running?
SQL> select * from V$ACTIVE_INSTANCES;
What is FAN?
Fast Application Notification: this is a notification mechanism that Oracle RAC uses to notify other processes about configuration and service-level information, including service status changes such as UP or DOWN events. Applications can respond to FAN events and take immediate action.
Where can we apply FAN UP and DOWN events?
FAN UP and FAN DOWN events can be applied to instances, services and nodes.
State the use of FAN events in case of a cluster configuration change?
During times of cluster configuration changes, the Oracle RAC high availability framework publishes a FAN event immediately when a state change occurs in the cluster, so applications can receive FAN events and react immediately. This saves applications from polling the database and detecting a problem only after such a state change.
Why should we have separate homes for ASM/DB instance?
It is good practice to keep the ASM home separate from the database home (ORACLE_HOME). This helps in upgrading and patching ASM and the Oracle database software independently of each other. Also, we can deinstall the Oracle database software independently of the ASM instance.
What is rolling upgrade?
It is a new ASM feature in Oracle Database 11g. ASM instances in Oracle Database 11g (from release 11.1) can be upgraded or patched using the rolling upgrade feature. This enables us to patch or upgrade ASM nodes in a clustered environment without affecting database availability. During a rolling upgrade we can maintain a functional cluster while one or more of the nodes in the cluster are running different software versions.
Can rolling upgrade be used to upgrade from 10g to 11g database?
No, it can be used only for Oracle Database 11g releases (from 11.1).
State the initialization parameters that must have same value for
every instance in an Oracle RAC database:
ACTIVE_INSTANCE_COUNT
ARCHIVE_LAG_TARGET
COMPATIBLE
CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
CONTROL_FILES
DB_BLOCK_SIZE
DB_DOMAIN
DB_FILES
DB_NAME
DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST_SIZE
DB_UNIQUE_NAME
INSTANCE_TYPE (RDBMS or ASM)
PARALLEL_MAX_SERVERS
REMOTE_LOGIN_PASSWORD_FILE
UNDO_MANAGEMENT
What causes ORA-00603: ORACLE server session terminated by fatal error or ORA-29702: error occurred in Cluster Group Service operation?
RAC node name was listed in the loopback address...
Can the DML_LOCKS and RESULT_CACHE_MAX_SIZE be identical on all instances?
These parameters must be identical on all instances only if their values are set to zero.
What two parameters must be set at the time of starting up an ASM instance in a RAC environment?
The parameters CLUSTER_DATABASE and INSTANCE_TYPE must be set.
Name some Oracle clusterware tools and their uses?
OIFCFG - Allocating and deallocating network interfaces
OCRCONFIG - Command-line tool for managing the Oracle Cluster Registry
OCRDUMP - Dumps the OCR contents to a file, e.g. to identify the interconnect being used
CVU - Cluster Verification Utility, to verify the cluster setup and the status of CRS resources
What are the modes of deleting instances from Oracle Real Application Clusters databases?
We can delete instances using silent mode or interactive mode using DBCA (Database Configuration Assistant).
How do we remove ASM from an Oracle RAC environment?
We need to stop and delete the instance on the node first, in interactive or silent mode. After that, ASM can be removed using the srvctl tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify if ASM has been removed by issuing the following command:
srvctl config asm -n node_name
How do we verify that an instance has been removed from OCR after
deleting an instance?
Issue the following srvctl command:
srvctl config database -d database_name
cd CRS_HOME/bin
./crs_stat
How do we verify an existing current backup of OCR?
ocrconfig -showbackup
What are the performance views in an Oracle RAC environment?
We have V$ views that are instance-specific. In addition we have GV$ views, called global views, which have an INST_ID column of numeric data type. GV$ views obtain information from the individual V$ views.
What are the types of connection load-balancing?
There are two types of connection load balancing: server-side load balancing and client-side load balancing.
What is the difference between server-side and client-side connection load balancing?
Client-side load balancing happens at the client side, where connection requests are spread across the listeners. In server-side load balancing, the listener uses a load-balancing advisory to redirect connections to the instance providing the best service.

1. What is the SCAN for RAC?
Single Client Access Name: a single name for clients to access an Oracle Database running in a cluster.
SCAN is an automatic load-balancing tool that uses a least-recently-loaded algorithm. It is intelligent RAC load balancing.
The benefit is that clients using SCAN do not need to change if you add or remove nodes in the cluster.
Oracle recommends assigning 3 addresses to SCAN, which will create 3 SCAN listeners, even if the cluster has dozens of nodes.
Single Client Access Name (SCAN) eliminates the need to change the TNSNAMES entry when nodes are added to or removed from the cluster. With SCAN, only 1 entry per cluster is used, regardless of the number of nodes.
Without SCAN:
RACDB =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))
(CONNECT_DATA =
. . . ))
With SCAN:
RACDB =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = SCANname)(PORT = 1521))
(CONNECT_DATA =
. . . ))
RAC instances register to SCAN listeners as remote listeners. SCAN is a domain name registered to at least one and up to three IP addresses, either in DNS (Domain Name Service) or GNS (Grid Naming Service).
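To see how the client entry shrinks with SCAN, the Python sketch below generates a tnsnames.ora-style entry from a list of hosts. It is only an illustration: the function name, the service name racdb.example.com and the host name rac-scan.example.com are hypothetical, and the CONNECT_DATA clause is reduced to a single SERVICE_NAME.

```python
def tns_entry(alias: str, hosts: list, service_name: str, port: int = 1521) -> str:
    """Build a tnsnames.ora-style entry with one ADDRESS line per host."""
    lines = [f"{alias} =", "  (DESCRIPTION ="]
    for host in hosts:
        lines.append(
            f"    (ADDRESS = (PROTOCOL = TCP)(HOST = {host})(PORT = {port}))"
        )
    lines.append("    (CONNECT_DATA =")
    lines.append(f"      (SERVICE_NAME = {service_name})))")
    return "\n".join(lines)


# Pre-SCAN style: one address per node VIP, edited whenever nodes change.
vip_entry = tns_entry("RACDB", ["node1-vip", "node2-vip"], "racdb.example.com")

# SCAN style: a single address that stays the same as nodes come and go.
scan_entry = tns_entry("RACDB", ["rac-scan.example.com"], "racdb.example.com")

print(vip_entry)
print()
print(scan_entry)
```

Adding a third node changes the first entry (a new ADDRESS line on every client) but leaves the SCAN entry untouched.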
2. What is a SCAN listener?
A SCAN listener is an additional listener, alongside the node listeners, that listens for incoming connection requests on the SCAN IP and routes the database connection requests to a particular node listener.

2. What are the prerequisites for RAC setup?


3. What are the Oracle Clusterware daemon processes and what do they do?
ocssd, crsd, evmd, oprocd, racgmain, racgimon
Cluster Ready Services (crsd) - The crs process manages cluster resources (a database, an instance, a service, a listener, VIP, ..) based on the OCR information. This includes start, stop, monitor and failover operations. This process runs as the root user.
Cluster Synchronization Services (ocssd) - Manages cluster configuration and node membership information and runs as the oracle user; failure of this process results in cluster restart.
Event manager daemon (evmd) - A background process that publishes events that crs creates.
Process Monitor Daemon (OPROCD) - This process monitors the cluster and provides I/O fencing. OPROCD performs its check, stops running, and if the wake-up is beyond the expected time, OPROCD resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.
RACG (racgmain, racgimon) - Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.
9. What are the Clusterware components?
Voting Disk - The voting disk is a file in the shared storage area that must be accessible by all nodes in the cluster. All nodes in the cluster register their heartbeat information in the voting disk. It holds node membership information.
If the heartbeat information of any node is not available in the voting disk, that node is evicted from the cluster. The OCSSD (Cluster Synchronization Services) daemon maintains the heartbeat of all nodes to the voting disk.
Oracle Database 11g Release 2 provides the ability to store the voting disks in ASM along with the OCR.
Oracle Cluster Registry (OCR) - It contains vital cluster and configuration information:
Databases, instances (ASM, DB), status
Services
Node apps
Listeners

The OCR must reside on shared disk that is accessible by all of the nodes in
your cluster. The daemon OCSSd manages.
Virtual IP (VIP) - A virtual IP is an alternate IP address that a node uses for client connections instead of the standard public IP address. To configure VIP addresses, we need to reserve a spare IP address for each node.
If a node fails, the node's VIP fails over to a surviving node, and sessions are then established on a surviving node automatically.

Past Image:
Earlier version of the block while updating.
12. How to take backup of OCR file?
#ocrconfig -manualbackup
#ocrconfig -export file_name.dmp
#ocrdump -backupfile my_file
$cp -p -R /u01/app/crs/cdata /u02/crs_backup/ocrbackup/RAC1
13. How to recover OCR file?
Ans:
#ocrconfig -restore backup_file.ocr
#ocrconfig -import file_name.dmp
14. What is local OCR?

Ans:
/etc/oracle/local.ocr
/var/opt/oracle/local.ocr
15. How to check backup of OCR files?
Ans:
#ocrconfig -showbackup
16. How to take backup of voting file?
Ans:
dd if=/u02/ocfs2/vote/VDFile_0 of=$ORACLE_BASE/bkp/vd/VDFile_0
crsctl backup css votedisk
-- from 11g R2
17. How do I identify the voting disk location?
Ans:
# crsctl query css votedisk
18. How do I identify the OCR file location?
Check /var/opt/oracle/ocr.loc or /etc/oracle/ocr.loc, or run:
# ocrcheck
19. If the voting disk/OCR file got corrupted and we don't have backups, how do we get them back?
We have to reinstall Clusterware.
21. What is Oracle RAC Node Eviction?
Oracle Clusterware is designed to perform a node eviction by removing one
or more nodes from the cluster if some critical problem is detected.
A critical problem could be a node not responding via a network heartbeat, a
node not responding via a disk heartbeat, a hung or severely degraded
machine, or a hung ocssd.bin process.
During failures, to avoid data corruption, the failing instance evicts itself from
the cluster group
22. What is the major difference between 10g and 11g RAC?

What is TAF?
TAF (Transparent Application Failover) is a configuration that allows session failover between nodes.
TAF offers the ability to restart transactions at the SELECT or session level.
FAILOVER_MODE = (TYPE = SELECT | SESSION)(METHOD = BASIC | PRECONNECT)
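The FAILOVER_MODE clause goes inside CONNECT_DATA in the client's connect descriptor. The Python sketch below builds such a clause and rejects unknown values. It is a hedged illustration: the function name is made up, while TYPE, METHOD, RETRIES and DELAY are the usual Oracle Net sub-parameters.

```python
VALID_TYPES = {"SESSION", "SELECT", "NONE"}
VALID_METHODS = {"BASIC", "PRECONNECT"}


def failover_mode(taf_type="SELECT", method="BASIC", retries=None, delay=None):
    """Return a FAILOVER_MODE clause for use inside CONNECT_DATA."""
    taf_type, method = taf_type.upper(), method.upper()
    if taf_type not in VALID_TYPES:
        raise ValueError(f"unknown TAF type: {taf_type}")
    if method not in VALID_METHODS:
        raise ValueError(f"unknown TAF method: {method}")
    parts = [f"(TYPE = {taf_type})", f"(METHOD = {method})"]
    # RETRIES and DELAY are optional: number of reconnect attempts and
    # seconds to wait between them.
    if retries is not None:
        parts.append(f"(RETRIES = {retries})")
    if delay is not None:
        parts.append(f"(DELAY = {delay})")
    return "(FAILOVER_MODE = " + "".join(parts) + ")"


print(failover_mode())
print(failover_mode("session", "preconnect", retries=20, delay=5))
```

TYPE = SELECT lets an open query resume after failover, while TYPE = SESSION only re-creates the session. METHOD = PRECONNECT opens the backup connection in advance at the cost of an idle connection on the backup instance.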

What is cluvfy ?
It checks the cluster configuration.
Components installed, node connectivity, shared storage access, user
equivalence

What is cache fusion?

In a RAC environment, it is the combining of data blocks, which are shipped across the interconnect from remote database caches (SGA) to the local node, in order to fulfill the requirements for a transaction (DML, query of the data dictionary).

What is split brain?

When database nodes in a cluster are unable to communicate with each other, they may continue to process and modify the data blocks independently. If the same block is modified by more than one instance, synchronization/locking of the data blocks does not take place and blocks may be overwritten by others in the cluster. This state is called split brain.
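To resolve a split brain, the clusterware has to pick one group of nodes to survive and evict the rest. The Python sketch below illustrates one commonly described rule, stated here as an assumption and not as Oracle's exact algorithm: the largest sub-cluster survives, and on a tie the sub-cluster containing the lowest-numbered node wins. The function name is hypothetical.

```python
def surviving_subcluster(subclusters):
    """Pick the sub-cluster that survives a split.

    subclusters: iterable of sets of node numbers that can still reach
    each other. Assumed rule: the largest group wins; on a tie, the
    group holding the lowest node number wins.
    """
    candidates = [set(group) for group in subclusters if group]
    return max(candidates, key=lambda nodes: (len(nodes), -min(nodes)))


# Three-node cluster where node 3 loses the interconnect.
print(surviving_subcluster([{1, 2}, {3}]))

# Two-node cluster split down the middle: sizes tie, lowest node number wins.
print(surviving_subcluster([{1}, {2}]))
```

Nodes outside the surviving group are the ones that would be evicted and rebooted.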

What is the difference between crash recovery and instance recovery?

When an instance crashes in a single-node database, crash recovery takes place on startup. In a RAC environment the same recovery for an instance is performed by the surviving nodes, and is called instance recovery.

What is the interconnect used for?

It is a private network which is used to ship data blocks from one instance to
another for cache fusion. The physical data blocks as well as data dictionary
blocks are shared across this interconnect.

How do you determine what protocol is being used for Interconnect traffic?

One of the ways is to look at the database alert log for the time period when
the database was started up.

What methods are available to keep the time synchronized on all nodes in the cluster?

Either the Network Time Protocol (NTP) can be configured or, in 11gR2, the Cluster Time Synchronization Service (CTSS) can be used.

What file components in RAC must reside on shared storage?

SPFILEs, control files, datafiles and redo log files should be created on shared storage.

Where does the Clusterware write when there is a network or Storage missed
heartbeat?

The network ping failure is written in $CRS_HOME/log

How do you find out what OCR backups are available?

The ocrconfig -showbackup can be run to find out the automatic and
manually run backups.

If your OCR is corrupted, what options do you have to resolve this?

You can use either the logical or the physical OCR backup copy to restore the repository.

How do you find out which object has its blocks shipped across the instances the most?

You can use the DBA_HIST_SEG_STAT view.

What is a VIP in RAC used for?

The VIP is an alternate virtual IP address assigned to each node in a cluster. During a node failure the VIP of the failed node moves to a surviving node and relays to the application that the node has gone down. Without the VIP, the application would wait for the TCP timeout and only then find out that the session is no longer live due to the failure.

How do we know which database instances are part of a RAC cluster?

You can query the V$ACTIVE_INSTANCES view to determine the member instances of the RAC cluster.

What is OCLUMON used for in a cluster environment?

OCLUMON is the command-line tool used to query the Cluster Health Monitor (CHM) repository. CHM stores operating system metrics in the CHM repository for all nodes in a RAC cluster. It stores information on CPU, memory, process, network and other OS data. This information can later be retrieved and used to troubleshoot and identify any cluster-related issues. It is a default component of the 11gR2 grid install. The data is stored in the master repository and replicated to a standby repository on a different node.

What would be the possible performance impact in a cluster if a less powerful node (e.g. slower CPUs) is added to the cluster?

All processing will slow down to the CPU speed of the slowest server.

What is the purpose of OLR?

The Oracle Local Registry (OLR) contains information that allows the cluster
processes to be started up with the OCR being in the ASM storage system. Since
the ASM file system is unavailable until the Grid processes are started up, a
local copy of the contents of the OCR is required, which is stored in the OLR.

What is the default memory allocation for ASM?

In 10g the default SGA size is 1G, in 11g it is set to 256M, and in 12c ASM it
is set back to 1G.

How do you backup ASM Metadata?

You can use the ASMCMD md_backup command to back up the ASM diskgroup
metadata, so that the diskgroup configuration can be restored (with
md_restore) in case of ASM diskgroup storage loss.

What files can be stored in the ASM diskgroup?

In 11g the following files can be stored in ASM diskgroups.

Datafiles
Redo logfiles
Spfiles

In 12c the files below can also now be stored in the ASM Diskgroup

Password file

What is the ASM POWER_LIMIT?

This is the parameter which controls the number of allocation units the ASM
instance will try to rebalance at any given time. The default value is 1. In
ASM versions earlier than 11.2.0.2 the maximum value is 11; in later versions
the maximum has been raised to 1024.

What is a rolling upgrade?

A patch is considered rolling if it can be applied to the cluster binaries
without having to shut down the database in a RAC environment. All
nodes in the cluster are patched in a rolling manner, one by one, with only
the node which is being patched unavailable while all other instances stay
open.

What are some of the RAC specific parameters?

Some of the RAC parameters are:

CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
INSTANCE_TYPE (RDBMS or ASM)
ACTIVE_INSTANCE_COUNT
UNDO_MANAGEMENT
What is the future of the Oracle Grid?
The Grid software is becoming more and more capable of not just supporting
HA for Oracle Databases but also other applications including Oracle's
applications. With 12c there are more features and functionality built-in and
it is easier to deploy these pre-built solutions, available for common Oracle
applications.

What components of the Grid should I back up?

The backups should include OLR, OCR and ASM Metadata.

Is there an easy way to verify the inventory for all remote nodes?

You can run the opatch lsinventory -all_nodes command from a single node
to look at the inventory details for all nodes in the cluster.

RAC/ASM/VOTING DISK Interview Questions & Answer

Q What is SCAN?

Single Client Access Name (SCAN) is a new Oracle Real Application Clusters
(RAC) 11g Release 2 feature that provides a single name for clients to access
an Oracle Database running in a cluster. The benefit is clients using SCAN do
not need to change if you add or remove nodes in the cluster.

Q What is dynamic remastering? When does dynamic remastering happen?

Dynamic remastering is the ability to move the ownership of a resource from
one instance to another instance in RAC. Dynamic resource remastering is used
to implement resource affinity for increased performance. Resource
affinity optimizes the system in situations where update transactions are
being executed in one instance. When activity shifts to another instance, the
resource affinity correspondingly moves to that instance. If activity is not
localized, then resource ownership is distributed across the instances by
hashing.

In 10g dynamic remastering happens at the file and object level. The
conditions for remastering are very stringent: one instance should touch the
resource more than 50 times as often as the other instances in a particular
period (say 10 minutes). This touch ratio and time can be tuned with the
_gc_affinity_limit and _gc_affinity_time parameters.

Q Why are we required to maintain an odd number of voting disks?

An odd number of disks avoids split brain. When nodes in a cluster can't talk
to each other, they race to lock the voting disks, and whichever side locks
more disks survives. If the number of disks is even, there is a chance that
each side locks 50% of the disks (2 out of 4), and then there is no way to
decide which node to evict. When the number is odd, one side always holds
more disks than the other, so the cluster can evict the node holding fewer.
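
The arithmetic behind the odd-number rule can be sketched in shell. This is only an illustration, assuming the usual rule that a node must be able to access a strict majority (more than half) of the voting disks to survive; the function names are hypothetical helpers, not Oracle tools.

```shell
# Hypothetical helpers, not part of Oracle Clusterware.

# Number of voting disks a node must be able to access: a strict majority.
votedisk_majority() {
    echo $(( $1 / 2 + 1 ))
}

# Number of voting disks that can be lost while a majority stays reachable.
votedisk_tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
    echo "disks=$n majority=$(votedisk_majority "$n") tolerated_failures=$(votedisk_tolerated_failures "$n")"
done
```

Under this assumption, four disks tolerate only one failure, the same as three, so the extra even disk adds no protection.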

Q How do you check the health of your RAC database?

The 'crsctl' command can be run as the root or oracle user to check the
clusterware health, but starting or stopping the clusterware requires root or
another privileged user.

[oracle@TEST_NODE1 ~]$ crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
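
For scripted monitoring, the check can be wrapped in a small filter. The sketch below assumes the 10g-style "appears healthy" lines shown above; crs_is_healthy is a hypothetical helper name, and releases that print different messages would need a different pattern.

```shell
# Hypothetical helper: reads "crsctl check crs" output on stdin and succeeds
# only if there is at least one line and every non-empty line reports
# "appears healthy".
crs_is_healthy() {
    awk 'NF { seen = 1 }
         NF && !/appears healthy/ { bad = 1 }
         END { exit (seen && !bad) ? 0 : 1 }'
}

# Sample output copied from the answer above.
sample_output='CSS appears healthy
CRS appears healthy
EVM appears healthy'

if printf '%s\n' "$sample_output" | crs_is_healthy; then
    echo "clusterware healthy"
else
    echo "clusterware problem detected"
fi
```

On a real node the filter would be fed directly, for example crsctl check crs | crs_is_healthy.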

Q How do you check the services on a RAC node?

We can check or start services with the 'srvctl' command. For example, to
bring a load balanced/TAF service named RAC online:

[oracle@TEST_NODE1 ~]$ srvctl start service -d orcl -s RAC
[oracle@TEST_NODE1 ~]$ crsstat

Q If there is some issue with virtual IP how will you troubleshoot it?
How will you change virtual ip?
To change the VIP (virtual IP) on a RAC node, use the command

[oracle@testnode oracle]$ srvctl modify nodeapps -A new_address

Q How will you back up your RAC database?

Backup strategy for a RAC database:
A RAC database consists of
1) OCR
2) Voting disk
3) Database files, controlfiles, redolog files and archive log files

Q Do you have any idea of load balancing in application? How is load
balancing done?
http://practicalappsdba.wordpress.com/category/for-master-apps-dbas/

Q What is RAC?
RAC stands for Real Application cluster. It is a clustering solution from Oracle
Corporation that ensures high availability of databases by providing instance
failover, media failover features.

Q What is RAC and how is it different from non RAC databases?

RAC stands for Real Application Clusters. You have n instances running on
their own separate nodes, all based on the shared storage. The cluster is the
key component and is a collection of servers operating as one unit. RAC is
the best solution for high performance and high availability. Non RAC
databases have a single point of failure in case of hardware failure or server
crash.

Q Give the usage of srvctl ?

srvctl start instance -d db_name -i "inst_name_list" [-o start_options]


srvctl stop instance -d name -i "inst_name_list" [-o stop_options]
srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate
srvctl start database -d name [-o start_options]
srvctl stop database -d name [-o stop_options]
srvctl start database -d orcl -o mount

Q Mention the Oracle RAC software components ?

Oracle RAC is composed of two or more database instances. They are
composed of memory structures and background processes, the same as a
single instance database. Oracle RAC instances use two processes, GES (Global
Enqueue Service) and GCS (Global Cache Service), that enable cache
fusion. Oracle RAC instances are composed of the following background
processes:
ACMS - Atomic Controlfile to Memory Service
GTX0-j - Global Transaction Process
LMON - Global Enqueue Service Monitor
LMD - Global Enqueue Service Daemon
LMS - Global Cache Service Process
LCK0 - Instance Enqueue Process
RMSn - Oracle RAC Management Processes
RSMN - Remote Slave Monitor

Q What is GRD?
GRD stands for Global Resource Directory. The GES and GCS maintain
records of the statuses of each datafile and each cached block using the
global resource directory. This process is referred to as cache fusion and
helps in data integrity.

Q What are the different network components in 10g RAC?

Public, private, and VIP components.
Private interfaces are for inter-node communication. VIP is all about
availability of the application. When a node fails, the VIP component fails
over to some other node. This is the reason that all applications should be
based on VIP components, which means tns entries should have the VIP entry in
the host list.
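
As an illustration of a tns entry that lists the VIPs, the sketch below prints a tnsnames.ora entry with one ADDRESS per VIP host. The helper name, the host names node1-vip and node2-vip, the service name and port 1521 are all assumptions made for the example.

```shell
# Hypothetical helper: prints a tnsnames.ora entry whose address list
# contains one ADDRESS per VIP host name.
# Usage: make_vip_tns_entry ALIAS SERVICE_NAME VIP_HOST...
make_vip_tns_entry() {
    alias_name=$1
    service_name=$2
    shift 2
    echo "$alias_name ="
    echo "  (DESCRIPTION ="
    echo "    (ADDRESS_LIST ="
    echo "      (LOAD_BALANCE = yes)"
    echo "      (FAILOVER = on)"
    for vip_host in "$@"; do
        # Port 1521 is assumed here; use the port your listeners run on.
        echo "      (ADDRESS = (PROTOCOL = TCP)(HOST = $vip_host)(PORT = 1521))"
    done
    echo "    )"
    echo "    (CONNECT_DATA ="
    echo "      (SERVICE_NAME = $service_name)"
    echo "    )"
    echo "  )"
}

make_vip_tns_entry ORCL orcl node1-vip node2-vip
```

Listing every VIP lets the client try the next address quickly when one node is down, instead of waiting on a TCP timeout.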

Q Give Details on ACMS:

ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC
environment ACMS is an agent that ensures a distributed SGA memory
update, i.e. SGA updates are globally committed on success or globally
aborted in the event of a failure.

Q What are the major RAC wait events?

In a RAC environment the buffer cache is global across all instances in the
cluster and hence the processing differs. The most common wait events
related to this are gc cr request and gc buffer busy.

GC CR request: the time it takes to retrieve the data from the remote cache.

Reason: RAC traffic using a slow connection, or inefficient queries (poorly
tuned queries will increase the amount of data blocks requested by an Oracle
session. The more blocks requested typically means the more often a block
will need to be read from a remote instance via the interconnect.)

GC BUFFER BUSY: the time the remote instance locally spends accessing the
requested data block.

Q Give details on GTX0-j

The process provides transparent support for XA global transactions in a RAC
environment. The database autotunes the number of these processes based
on the workload of XA global transactions.

Q Give details on LMON

This process monitors global enqueues and resources across the cluster and
performs global enqueue recovery operations. It is called the Global
Enqueue Service Monitor.

Q Give details on LMD

This process is called the Global Enqueue Service Daemon. It manages
incoming remote resource requests within each instance.

Q Give details on LMS

This process is called the Global Cache Service process. It maintains the
statuses of datafiles and of each cached block by recording information in the
Global Resource Directory (GRD). It also controls the flow of messages to
remote instances, manages global data block access, and transmits block
images between the buffer caches of different instances. This processing is
part of the cache fusion feature.

Q Give details on LCK0

This process is called the Instance Enqueue process. It manages non-cache
fusion resource requests such as library and row cache requests.

Q Give details on RMSn

This process is called the Oracle RAC Management process. These processes
perform manageability tasks for Oracle RAC. Tasks include creation of
resources related to Oracle RAC when new instances are added to the cluster.

Q How to export and import crs resources while migrating Oracle RAC to a new
server?
The script below generates srvctl add commands for the database, instance,
service and 11G listeners from the OCR of the current RAC.
Save the output of the script and run it on the new RAC.

for DBNAME in $(srvctl config database)
do

# Generate DB resource

srvctl config database -d $DBNAME -a | awk -v dbname="$DBNAME" \
'BEGIN { FS=":" }
$1~/Oracle home/ || $1~/ORACLE_HOME/ {dbhome = "-o" $2}
$1~/Spfile/ || $1~/SPFILE/ {spfile = "-p" $2}
$1~/Disk Groups/ {dg = "-a" $2}
END { if (avail == "-a ") {avail = ""}
printf "%s %s %s %s %s\n", "srvctl add database -d ", dbname, dbhome,
spfile, dg }'

# Generate Instance resource

srvctl status database -d $DBNAME | awk -v dbname="$DBNAME" \
'$4~/running/ { printf "%s %s %s %s %s %s\n", "srvctl add instance -d ",
dbname, " -i ", $2 ," -n ", $7 }
$5~/running/ { printf "%s %s %s %s %s %s \n", "srvctl add instance -d ",
dbname, " -i ", $2 ," -n ", $8 }'

# Modify instance for 10G - ASM dependency

if [ $(echo $ORACLE_HOME | grep "1020" | wc -l ) -eq 1 ]
then
srvctl status database -d $DBNAME | awk -v dbname="$DBNAME" \
'$2~/1$/ { printf "%s %s %s %s %s \n", "srvctl modify instance -d ",
dbname, " -i ", $2 ," -s +ASM1" }
$2~/2$/ { printf "%s %s %s %s %s \n", "srvctl modify instance -d ",
dbname, " -i ", $2 ," -s +ASM2" }
$2~/3$/ { printf "%s %s %s %s %s \n", "srvctl modify instance -d ",
dbname, " -i ", $2 ," -s +ASM3" }
$2~/4$/ { printf "%s %s %s %s %s \n", "srvctl modify instance -d ",
dbname, " -i ", $2 ," -s +ASM4" }'
fi

echo "srvctl start database -d $DBNAME"

# Generate Service resource

snamelist=$(srvctl status service -d $DBNAME | awk '{print $2}')

for sname in $snamelist
do
srvctl config service -d $DBNAME -s $sname | awk -v dbname="$DBNAME" \
-v sname=$sname \
'BEGIN { FS=":"}
$1~/Preferred instances/ {pref = "-r" $2}
$1~/PREF/ {pref = "-r" $2; sub(/AVAIL/, "", pref) }
$1~/Available instances/ {avail = "-a" $2}
$2~/AVAIL/ {avail = "-a" $3}
$1~/Failover type/ {ft = "-e" $2}
$1~/Failover method/ {fm = "-m" $2}
$1~/Runtime Load Balancing Goal/ {g = "-B" $2}
END { if (avail == "-a ") {avail = ""}
printf "%s %s %s %s %s %s %s %s %s %s\n", "srvctl add service -d ",
dbname, "-s ", sname, pref, avail, ft, fm, g, "-P BASIC"}'
echo "srvctl start service -d $DBNAME -s $sname"
done
done

# Listener at 11G Home. 10G listener can't be added with srvctl.

srvctl config listener | awk \
'BEGIN { FS=":"; state = 0; }
$1~/Name/ {lname = "-l" $2; state=1};
$1~/Home/ && state == 1 {ohome = "-o" $2; state=2;}
$1~/End points/ && state == 2 {lport = "-p " $3; state=3;}
state == 3 {if (ohome != "-o ") {printf "%s %s %s %s\n",
"srvctl add listener ", lname, ohome, lport;} state=0;}'

Q Give details on RSMN

This process is called the Remote Slave Monitor. It manages background
slave process creation and communication on remote instances. These
background slave processes perform tasks on behalf of a coordinating
process running in another instance.

Q What components in RAC must reside in shared storage?

All datafiles, controlfiles, SPFILEs and redo log files must reside on
cluster-aware shared storage.

Q What is the significance of using cluster-aware shared storage in an Oracle
RAC environment?
All instances of an Oracle RAC database can access all the datafiles, control
files, SPFILEs and redo log files when these files are hosted on cluster-aware
shared storage, which is a group of shared disks.

Q Give a few examples of solutions that support cluster storage

ASM (Automatic Storage Management), raw disk devices, network file
system (NFS), OCFS2 and OCFS (Oracle Cluster File Systems).

Q What is an interconnect network?


An interconnect network is a private network that connects all of the servers
in a cluster. The interconnect network uses a switch/multiple switches that
only the nodes in the cluster can access.

Q How can we configure the cluster interconnect?

Configure User Datagram Protocol (UDP) on Gigabit Ethernet for the cluster
interconnect. On Unix and Linux systems, Oracle Clusterware uses the UDP and
RDS (Reliable Datagram Sockets) protocols. Windows clusters use the
TCP protocol.

Q Can we use crossover cables with Oracle Clusterware interconnects?

No, crossover cables are not supported with Oracle Clusterware interconnects.

Q What is the use of cluster interconnect?


Cluster interconnect is used by the Cache fusion for inter instance
communication.

Q How do users connect to database in an Oracle RAC environment?

Users can access a RAC database using a client/server configuration or
through one or more middle tiers, with or without connection pooling. Users
can use the Oracle services feature to connect to the database.

Q What is the use of a service in Oracle RAC environment?

Applications should use the services feature to connect to the Oracle
database. Services enable us to define rules and characteristics to control
how users and applications connect to database instances.

Q What are the characteristics controlled by Oracle services feature?

The characteristics include a unique name, workload balancing and failover
options, and high availability characteristics.

Q What enables the load balancing of applications in RAC?


Oracle Net Services enable the load balancing of application connections
across all of the instances in an Oracle RAC database.

Q What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that the client
connections use instead of the standard public IP address. To configure a VIP
address, we need to reserve a spare IP address for each node, and the IP
addresses must use the same subnet as the public network.

Q What is the use of VIP?

If a node fails, then the node's VIP address fails over to another node on
which the VIP address can accept TCP connections but it cannot accept
Oracle connections.

Q Give situations under which VIP address failover happens

VIP address failover happens when the node on which the VIP address runs
fails, when all interfaces for the VIP address fail, or when all interfaces for
the VIP address are disconnected from the network.

Q What is the significance of VIP address failover?

When a VIP address failover happens, clients that attempt to connect to the
VIP address receive a rapid connection refused error. They don't have to wait
for TCP connection timeout messages.

Q What are the administrative tools used for Oracle RAC environments?
An Oracle RAC cluster can be administered as a single image using OEM
(Enterprise Manager), SQL*Plus, Server Control (SRVCTL), Cluster Verification
Utility (CVU), DBCA and NETCA.

Q How do we verify that RAC instances are running?

Issue the following query from any one node, connecting through SQL*Plus.
SQL>connect sys/sys as sysdba
SQL>select * from V$ACTIVE_INSTANCES;

The query gives the instance number under the INST_NUMBER column and
host_name:instance_name under the INST_NAME column.

Q What is FAN?
FAN stands for Fast Application Notification and relates to events
concerning instances, services and nodes. It is a notification mechanism that
Oracle RAC uses to notify other processes about configuration and
service level information, including service status changes such as UP or
DOWN events. Applications can respond to FAN events and take immediate
action.

Q Where can we apply FAN UP and DOWN events?

FAN UP and FAN DOWN events can be applied to instances, services and
nodes.

Q State the use of FAN events in case of a cluster configuration change?
During times of cluster configuration changes, the Oracle RAC high
availability framework publishes a FAN event immediately when a state change
occurs in the cluster, so applications can receive FAN events and react
immediately. This saves applications from polling the database and detecting
a problem only after such a state change.

Q Why should we have separate homes for the ASM instance?

It is good practice to keep the ASM home separate from the database
home (ORACLE_HOME). This helps in upgrading and patching ASM and the
Oracle database software independently of each other. Also, we can deinstall
the Oracle database software independently of the ASM instance.

Q What is the advantage of using ASM?

ASM is the Oracle recommended storage option for RAC databases, as
ASM maximizes performance by managing the storage configuration
across the disks. ASM does this by distributing the database files across all
of the available storage within our cluster database environment.

Q What is rolling upgrade?

It is a new ASM feature from Database 11g. ASM instances in Oracle database
11g releases (from 11.1) can be upgraded or patched using the rolling upgrade
feature. This enables us to patch or upgrade ASM nodes in a clustered
environment without affecting database availability. During a rolling upgrade
we can maintain a functional cluster while one or more of the nodes in the
cluster are running different software versions.

Q Can rolling upgrade be used to upgrade from 10g to 11g database?

No, it can be used only for Oracle database 11g releases (from 11.1).

Q State the initialization parameters that must have same value for every
instance in an Oracle RAC database
Some initialization parameters are critical at database creation time and
must have the same values. Their value must be specified in the SPFILE or
PFILE for every instance. The list of parameters that must be identical on
every instance is given below:
ACTIVE_INSTANCE_COUNT
ARCHIVE_LAG_TARGET
COMPATIBLE
CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
CONTROL_FILES
DB_BLOCK_SIZE
DB_DOMAIN
DB_FILES
DB_NAME
DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST_SIZE
DB_UNIQUE_NAME
INSTANCE_TYPE (RDBMS or ASM)
PARALLEL_MAX_SERVERS
REMOTE_LOGIN_PASSWORDFILE
UNDO_MANAGEMENT
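
One way to check this on a running cluster is to compare the values reported per instance. The sketch below is only a text filter: it assumes input lines of the form "INST_ID NAME VALUE", such as could be spooled from a query on GV$PARAMETER, and the helper name find_mismatched_params is hypothetical. Values containing spaces are not handled.

```shell
# Hypothetical helper: reads lines of "INST_ID NAME VALUE" on stdin and
# prints each parameter name that has more than one distinct value.
find_mismatched_params() {
    awk '{
        key = $2 SUBSEP $3
        if (!(key in seen)) { seen[key] = 1; distinct[$2]++ }
    }
    END {
        for (name in distinct) if (distinct[name] > 1) print name
    }' | sort
}

# Illustrative data in the shape of:
#   SELECT inst_id, name, value FROM gv$parameter WHERE name IN (...);
find_mismatched_params <<'EOF'
1 db_block_size 8192
2 db_block_size 8192
1 undo_management AUTO
2 undo_management MANUAL
EOF
```

With the sample data only undo_management is reported, because its value differs between instance 1 and instance 2.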

Q What is ORA-00603: ORACLE server session terminated by fatal error or
ORA-29702: error occurred in Cluster Group Service operation?
RAC node name was listed in the loopback address...

Q Must DML_LOCKS and RESULT_CACHE_MAX_SIZE be identical on all
instances?
These parameters must be identical on all instances only if their values are
set to zero.

Q What two parameters must be set at the time of starting up an ASM instance
in a RAC environment?
The parameters CLUSTER_DATABASE and INSTANCE_TYPE must be set.

Q Mention the components of Oracle clusterware


Oracle clusterware is made up of components like voting disk and Oracle
Cluster Registry(OCR).

Q What is a CRS resource?

Oracle clusterware is used to manage high-availability operations in a
cluster. Anything that Oracle Clusterware manages is known as a CRS
resource. Some examples of CRS resources are a database, an instance, a
service, a listener, a VIP address, an application process etc.

Q What is the use of OCR?


Oracle clusterware manages CRS resources based on the configuration
information of CRS resources stored in OCR(Oracle Cluster Registry).

Q How does a Oracle Clusterware manage CRS resources?


Oracle clusterware manages CRS resources based on the configuration
information of CRS resources stored in OCR(Oracle Cluster Registry).

Q Name some Oracle clusterware tools and their uses?


OIFCFG - Allocating and deallocating network interfaces

OCRCONFIG - Command-line tool for managing Oracle Cluster Registry

OCRDUMP - Dump the contents of the Oracle Cluster Registry to a file

CVU - Cluster Verification Utility to check the status of cluster components

Q What are the modes of deleting instances from Oracle Real Application
Clusters databases?
We can delete instances using silent mode or interactive mode using
DBCA(Database Configuration Assistant).

Q How do we remove ASM from a Oracle RAC environment?


We need to stop and delete the instance in the node first in interactive or
silent mode.After that asm can be removed using srvctl tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify if ASM has been removed by issuing the following command:
srvctl config asm -n node_name

Q How do we verify that an instance has been removed from OCR after
deleting an instance?
Issue the following srvctl command:
srvctl config database -d database_name
cd CRS_HOME/bin
./crs_stat

Q How do we verify an existing current backup of OCR?

We can verify the current backup of OCR using the following command:
ocrconfig -showbackup

Q What are the performance views in an Oracle RAC environment?
We have V$ views that are instance specific. In addition we have GV$ views,
called global views, that have an INST_ID column of numeric data type. GV$
views obtain information from the individual V$ views.

Q What are the types of connection load-balancing?
There are two types of connection load-balancing: server-side load balancing
and client-side load balancing.

Q What is the difference between server-side and client-side connection load
balancing?
Client-side balancing happens at the client side, where the client spreads
connection requests across the listener addresses. In case of server-side
load balancing the listener uses a load-balancing advisory to redirect
connections to the instance providing the best service.

Q What are the three greatest benefits that RAC provides?

The three main benefits are availability, scalability, and the ability to use low
cost commodity hardware. RAC allows an application to scale vertically, by
adding CPU, disk and memory resources to an individual server. But RAC also
provides horizontal scalability, which is achieved by adding new nodes into
the cluster. RAC also allows an organization to bring these resources online
as they are needed. This can save a small or midsize organization a lot of
money in the early stages of a project.

In a RAC environment, if a node in the cluster fails, the application continues
to run on the surviving nodes contained in the cluster. If your application is
configured correctly, most users won't even know that the node they were
running on became unavailable.

Q What are the different network components in Oracle 10g RAC?

We have public, private, and VIP components. Private interfaces are for
inter-node communication. VIP is all about availability of the application.
When a node fails, the VIP component will fail over to some other node; this
is the reason that all applications should be based on VIP components. This
means that tns entries should have the VIP entry in the host list.

Q Tune the following RAC DATABASE (DBNAME=PROD) which is 3 node RAC.

        PROD1    PROD2    PROD3
CPU     8        15       8
RAM     32 GB    12 GB    16 GB

What are you looking for here? What tuning information do you expect?
It is a 3 node cluster with different hardware configurations running RAC.
I would put 20% of the memory for Oracle in each node. That would mean
that the SGA is different on each of the nodes.
Also, since the CPUs are different, PROD2 can have a higher maximum
number of processes than the rest of them.

But as I said this is just configuration, this is not tuning. The question is
not clear.

Q Write a sample script for RMAN for the recovery if all the instances are
down. (First explain the procedure how you will restore)
Bring all nodes down.
Start one node.
Restore all datafiles and archive logs.
Recover from that one node.
Open the database.
Bring the other nodes up.
Confirm that all nodes are operational.
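
The procedure above could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested recovery plan: it assumes the database is named PROD with an instance PROD1, that the spfile and control files are intact, that complete recovery is possible (so no RESETLOGS is needed), and the function names and the /tmp path are hypothetical.

```shell
# Hypothetical helper: prints the RMAN commands for a full restore.
print_rman_restore_cmds() {
    cat <<'EOF'
STARTUP MOUNT;
RUN {
  RESTORE DATABASE;
  RECOVER DATABASE;
  ALTER DATABASE OPEN;
}
EOF
}

# Hypothetical wrapper around the whole procedure. It is only defined here,
# not executed, because it needs a real cluster.
restore_rac_database() {
    srvctl stop database -d PROD          # bring all instances down
    print_rman_restore_cmds > /tmp/restore_prod.rman
    # Restore, recover and open the database from one node only.
    ORACLE_SID=PROD1 rman target / cmdfile=/tmp/restore_prod.rman
    srvctl start database -d PROD         # bring the other instances up
    srvctl status database -d PROD        # confirm all instances are running
}

print_rman_restore_cmds
```

RECOVER DATABASE restores the archived logs it needs from backup, so no separate archive log restore step appears in the command file.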

Q. Clients are performing some operation and suddenly one of the datafiles is
experiencing a problem. What do you do? The cluster is a two node one.
A. Take the datafile offline and recover the datafile.

Q. How can you connect to a specific node in a RAC environment?

A. In tnsnames.ora, ensure that the entry you connect with has INSTANCE_NAME
specified in it.
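
A minimal sketch of such an entry is printed by the helper below. The helper name, the alias, the host node1-vip, port 1521 and the PROD/PROD1 names are assumptions for illustration only.

```shell
# Hypothetical helper: prints a tnsnames.ora entry pinned to one instance.
# Usage: make_instance_tns_entry ALIAS HOST SERVICE_NAME INSTANCE_NAME
make_instance_tns_entry() {
    echo "$1 ="
    echo "  (DESCRIPTION ="
    # Port 1521 is assumed here; use the port your listener runs on.
    echo "    (ADDRESS = (PROTOCOL = TCP)(HOST = $2)(PORT = 1521))"
    echo "    (CONNECT_DATA ="
    echo "      (SERVICE_NAME = $3)"
    echo "      (INSTANCE_NAME = $4)"
    echo "    )"
    echo "  )"
}

make_instance_tns_entry PROD1 node1-vip PROD PROD1
```

Connecting with the PROD1 alias then reaches only the PROD1 instance rather than whichever instance the listener would otherwise choose for the service.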

Q How to move OCR and Voting disk to new storage device?

Moving OCR
==========
You must be logged in as the root user, because root owns the OCR files. Also
an ocrmirror must be in place before trying to replace the OCR device.

Make sure there is a recent backup of the OCR file before making any
changes:

ocrconfig -showbackup

If there is not a recent backup copy of the OCR file, an export can be taken
for the current OCR file. Use the following command to generate an export of
the online OCR file:

In 10.2

# ocrconfig -export export_file_name -s online

In 11g

# ocrconfig -manualbackup

The new OCR disk must be owned by root, must be in the oinstall group, and
must have permissions set to 640. Provide at least 100 MB disk space for the
OCR.

On one node as root run:

# ocrconfig -replace ocr new_ocr_file_name
# ocrconfig -replace ocrmirror new_ocrmirror_file_name

Now run ocrcheck to verify that the OCR is pointing to the new file.
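
In the same spirit as the migration script earlier, the OCR move can be turned into a generated command list that is reviewed before being run as root. The helper name and the example device paths are hypothetical, and the sketch uses the 11g manual backup command.

```shell
# Hypothetical helper: prints (does not run) the root commands for moving
# the OCR and its mirror to new locations.
# Usage: print_ocr_move_plan NEW_OCR_PATH NEW_OCRMIRROR_PATH
print_ocr_move_plan() {
    echo "ocrconfig -showbackup"
    echo "ocrconfig -manualbackup"
    echo "ocrconfig -replace ocr $1"
    echo "ocrconfig -replace ocrmirror $2"
    echo "ocrcheck"
}

print_ocr_move_plan /dev/raw/raw3 /dev/raw/raw4
```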

Moving Voting Disk
==================

Note: crsctl votedisk commands must be run as root

Shut down the Oracle Clusterware (crsctl stop crs as root) on all nodes before
making any modification to the voting disk. Determine the current voting
disk location using:

crsctl query css votedisk

Take a backup of all voting disks:

dd if=voting_disk_name of=backup_file_name

To move a voting disk, provide the full path including file name:

crsctl delete css votedisk old_voting_disk_path -force
crsctl add css votedisk new_voting_disk_path -force

After modifying the voting disk, start the Oracle Clusterware stack on all
nodes

# crsctl start crs

Verify the voting disk location using

crsctl query css votedisk
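
When there are several voting disks, the dd backup step has to be repeated for each one. The sketch below only prints the dd commands so they can be checked first; the helper name, the backup directory and the raw device paths are hypothetical.

```shell
# Hypothetical helper: prints one dd backup command per voting disk.
# Usage: print_votedisk_backup_cmds BACKUP_DIR VOTING_DISK...
print_votedisk_backup_cmds() {
    backup_dir=$1
    shift
    i=1
    for disk in "$@"; do
        echo "dd if=$disk of=$backup_dir/votedisk_$i.bak"
        i=$((i + 1))
    done
}

print_votedisk_backup_cmds /backup /dev/raw/raw1 /dev/raw/raw2 /dev/raw/raw3
```

The disk paths passed in would come from the crsctl query css votedisk output.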

Q What is runfixup.sh script in Oracle Clusterware 11g release 2 installation


With Oracle Clusterware 11g release 2, Oracle Universal Installer (OUI)
detects when the minimum requirements for an installation are not met, and
creates shell scripts, called fixup scripts, to finish incomplete system
configuration steps. If OUI detects an incomplete task, then it generates
fixup scripts (runfixup.sh). You can run the fixup script after you click the Fix
and Check Again Button.
The Fixup script does the following:
- If necessary, sets kernel parameters to values required for successful
  installation, including:
  - Shared memory parameters.
  - Open file descriptor and UDP send/receive parameters.
- Sets permissions on the Oracle Inventory (central inventory) directory.
- Reconfigures primary and secondary group memberships for the installation
  owner, if necessary, for the Oracle Inventory directory and the operating
  system privileges groups.
- Sets shell limits if necessary to required values.

Q When exactly during the installation process are clusterware components
created?

After fulfilling the pre-installation requirements, the basic installation steps
to follow are:

1. Invoke the Oracle Universal Installer (OUI)

2. Enter the different information for some components like:


- name of the cluster
- public and private node names
- location for OCR and Voting Disks
- network interfaces used for RAC instances
-etc.

3. After the Summary screen, OUI will start copying the libraries and
executables under the $CRS_HOME (this is the $ORACLE_HOME for Oracle
Clusterware) on the local node.
- here we will have the daemons and scripts init.* created and configured
properly.

Oracle Clusterware is formed of several daemons, each of which has a
special function inside the stack. Daemons are executed via the init.* scripts
(init.cssd, init.crsd and init.evmd).

- note that for CRS only some client libraries are recreated, but not all the
executables (as for the RDBMS).

4. Later the software is propagated to the rest of the nodes in the cluster and
the oraInventory is updated.

5. The installer will ask to execute root.sh on each node. Until this step the
software for Oracle Clusterware is inside the $CRS_HOME.

Running root.sh will create several components outside the $CRS_HOME:

- OCR and VD will be formatted.

- control files (or SCLS_SRC files ) will be created with the correct contents to
start Oracle Clusterware.

These files are used to control some aspects of Oracle Clusterware like:
- enable/disable processes from the CSSD family (Eg. oprocd, oslsvmon)
- stop the daemons (ocssd.bin, crsd.bin, etc).
- prevent Oracle Clusterware from being started when the machine boots.
- etc.

- /etc/inittab will be updated and the init process is notified.

In order to start the Oracle Clusterware daemons, the init.* scripts first need
to be run. These scripts are executed by the daemon init. To accomplish this
some entries must be created in the file /etc/inittab.

- the different processes init.* (init.cssd, init.crsd, etc) will start the daemons
(ocssd.bin, crsd.bin, etc). When all the daemons are running then we can say
that the installation was successful

- On 10.2 and later, running root.sh on the last node in the cluster also will
create the nodeapps (VIP, GSD and ONS). On 10.1, VIPCA is executed as part
of the RAC installation.

6. After running root.sh on each node, we need to continue with the OUI
session. After pressing the 'OK' button OUI will include the information for the
public and cluster_interconnect interfaces. Also CVU (Cluster Verification
Utility) will be executed.

Q What are Oracle Clusterware processes for 10g on Unix and Linux

Cluster Synchronization Services (ocssd) - Manages cluster node
membership and runs as the oracle user; failure of this process results in
cluster restart.

Cluster Ready Services (crsd) - The crs process manages cluster resources
(which could be a database, an instance, a service, a Listener, a virtual IP
(VIP) address, an application process, and so on) based on the resource's
configuration information that is stored in the OCR. This includes start, stop,
monitor and failover operations. This process runs as the root user.

Event manager daemon (evmd) - A background process that publishes
events that crs creates.

Process Monitor Daemon (OPROCD) - This process monitors the cluster and
provides I/O fencing. OPROCD performs its check, stops running, and if the
wake up is beyond the expected time, then OPROCD resets the processor
and reboots the node. An OPROCD failure results in Oracle Clusterware
restarting the node. OPROCD uses the hangcheck timer on Linux platforms.

RACG (racgmain, racgimon) - Extends clusterware to support Oracle-specific
requirements and complex resources. Runs server callout scripts when FAN
events occur.

Q What are Oracle database background processes specific to RAC?

LMS: Global Cache Service Process
LMD: Global Enqueue Service Daemon
LMON: Global Enqueue Service Monitor
LCK0: Instance Enqueue Process

To ensure that each Oracle RAC database instance obtains the block that it
needs to satisfy a query or transaction, Oracle RAC instances use two
processes, the Global Cache Service (GCS) and the Global Enqueue Service
(GES). The GCS and GES maintain records of the statuses of each data file
and each cached block using a Global Resource Directory (GRD). The GRD
contents are distributed across all of the active instances.

Q What are Oracle Clusterware Components?

Voting Disk: Oracle RAC uses the voting disk to manage cluster
membership by way of a health check and arbitrates cluster ownership
among the instances in case of network failures. The voting disk must reside
on shared disk.

Oracle Cluster Registry (OCR): Maintains cluster configuration information
as well as configuration information about any cluster database within the
cluster. The OCR must reside on shared disk that is accessible by all of the
nodes in your cluster.

Q How do you troubleshoot a node reboot?

Please check the following MetaLink notes:

Note 265769.1 Troubleshooting CRS Reboots
Note 559365.1 Using Diagwait as a diagnostic to get more information for
diagnosing Oracle Clusterware Node evictions.

Q How do you backup the OCR?

There is an automatic backup mechanism for OCR. The default location is:

$ORA_CRS_HOME/cdata/<clustername>/

To display backups:
# ocrconfig -showbackup
To restore a backup:
# ocrconfig -restore backup_file_name

With Oracle RAC 10g Release 2 or later, you can also use the export
command:
# ocrconfig -export export_file_name -s online
and use the -import option to restore the contents.
With Oracle RAC 11g Release 1, you can do a manual backup of the OCR
with the command:
# ocrconfig -manualbackup

Q How do you backup the voting disk?

# dd if=voting_disk_name of=backup_file_name

Q How do I identify the voting disk location?

# crsctl query css votedisk

Q How do I identify the OCR file location?

Check /var/opt/oracle/ocr.loc or /etc/oracle/ocr.loc (depending on the
platform), or run:
# ocrcheck

Q Is ssh required for normal Oracle RAC operation?

No, ssh is not required for normal Oracle RAC operation. However, ssh
should be enabled for Oracle RAC and patchset installation.

Q What is SCAN?

Single Client Access Name (SCAN) is a new Oracle Real Application Clusters
(RAC) 11g Release 2 feature that provides a single name for clients to access
an Oracle Database running in a cluster. The benefit is clients using SCAN do
not need to change if you add or remove nodes in the cluster.

Q What is the purpose of the Private Interconnect?

Clusterware uses the private interconnect for cluster synchronization
(network heartbeat) and daemon communication between the clustered
nodes. This communication is based on the TCP protocol.
RAC uses the interconnect for cache fusion (UDP) and inter-process
communication (TCP). Cache Fusion is the remote memory mapping of
Oracle buffers, shared between the caches of participating nodes in the
cluster.

Q Why do we have a Virtual IP (VIP) in Oracle RAC?

Without using VIPs or FAN, clients connected to a node that died will often
wait for a TCP timeout period (which can be up to 10 min) before getting an
error. As a result, you don't really have a good HA solution without using
VIPs.
When a node fails, the VIP associated with it is automatically failed over to
some other node, and the new node re-ARPs the network, advertising a new MAC
address for the IP. Subsequent packets sent to the VIP go to the new node,
which will send error RST packets back to the clients. This results in the
clients getting errors immediately.

Q What do you do if you see GC CR BLOCK LOST in the top 5 Timed Events in the
AWR report?

This is most likely due to a fault in the interconnect network.
Check the output of netstat -s. If you see "fragments dropped" or "packet
reassemblies failed", work with your system administrator to find the fault in
the network.
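
The netstat check above can be scripted. The following is a minimal sketch, assuming the counters appear one per line with the count at the start of the line; the function name and the sample text are hypothetical and only illustrate the idea.

```python
import re

# Counter names quoted in the answer above; matching is case-insensitive.
SUSPECT_COUNTERS = ("fragments dropped", "packet reassemblies failed")


def find_reassembly_problems(netstat_output):
    """Return {counter_name: count} for suspect counters with a non-zero count."""
    problems = {}
    for line in netstat_output.splitlines():
        text = line.strip().lower()
        for name in SUSPECT_COUNTERS:
            if name in text:
                # Assumption: the numeric count leads the line.
                match = re.match(r"(\d+)", text)
                if match and int(match.group(1)) > 0:
                    problems[name] = int(match.group(1))
    return problems


# Hypothetical sample text, not captured from a real system.
sample = """Ip:
    1200 total packets received
    15 fragments dropped after timeout
    42 packet reassemblies failed
"""

print(find_reassembly_problems(sample))
```

An empty result means none of the suspect counters were non-zero in the text supplied; the exact counter wording varies by platform, so the strings may need adjusting.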

Q How many nodes are supported in a RAC Database?

Oracle 10g Release 2 supports up to 100 nodes in a cluster using Oracle
Clusterware, and up to 100 instances in a RAC database.

Q Srvctl cannot start instance, I get the following error PRKP-1001 CRS-0215,
however sqlplus can start it on both nodes? How do you identify the
problem?

Set the environment variable SRVM_TRACE to true, then start the instance
with srvctl. You will now get a detailed error stack.

Q What is the purpose of the ONS daemon?

The Oracle Notification Service (ONS) daemon is a daemon started by the
CRS clusterware as part of the nodeapps. There is one ONS daemon started
per clustered node.
The Oracle Notification Service daemon receives a subset of published
clusterware events via the local evmd and racgimon clusterware daemons
and forwards those events to application subscribers and to the local
listeners.

This is in order to facilitate:

a. the FAN or Fast Application Notification feature, which allows applications
to respond to database state changes.
b. the 10gR2 Load Balancing Advisory, the feature that permits load balancing
across different RAC nodes depending on the load on the different nodes. The
RDBMS MMON process creates an advisory for distribution of work every 30
seconds and forwards it via racgimon and ONS to listeners and applications.

Q How do users connect to the database in an Oracle RAC environment?

Users can access a RAC database using a client/server configuration or
through one or more middle tiers, with or without connection pooling. Users
can use the Oracle services feature to connect to the database.

Q What is the use of a service in Oracle RAC environment?


Applications should use the services feature to connect to the Oracle
database. Services enable us to define rules and characteristics to control
how users and applications connect to database instances.

Q What are the characteristics controlled by Oracle services feature?


The characteristics include a unique name, workload balancing and failover
options, and high availability characteristics.

Q What is a voting disk?


A voting disk is a file that manages information about node membership.

Q What are the administrative tasks involved with voting disk?


Following administrative tasks are performed with the voting disk :
1) Backing up voting disks
2) Recovering Voting disks
3) Adding voting disks
4) Deleting voting disks
5) Moving voting disks

Q How do we backup voting disks?

1) Oracle recommends that you back up your voting disk after the initial
cluster creation and after you complete any node addition or deletion
procedures.
2) First, as root user, stop Oracle Clusterware (with the crsctl stop crs
command) on all nodes. Then, determine the current voting disk by issuing
the following command:
crsctl query css votedisk
3) Then, issue the dd or ocopy command to back up a voting disk, as
appropriate.

Q What is the syntax for backing up voting disks?

On Linux or UNIX systems:
dd if=voting_disk_name of=backup_file_name
where,
voting_disk_name is the name of the active voting disk
backup_file_name is the name of the file to which we want to back up the
voting disk contents
On Windows systems, use the ocopy command:
ocopy voting_disk_name backup_file_name

Q What is the Oracle recommendation for backing up the voting disk?

Oracle recommends using the dd command to back up the voting disk
with a minimum block size of 4KB.
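
As an illustration of the block-size recommendation, the sketch below only assembles the dd command line and prints it; it does not run dd. The helper name and both paths are hypothetical, and bs=4k is the standard dd operand for a 4 KB block size.

```python
import shlex


def build_voting_disk_backup_cmd(voting_disk, backup_file, block_size="4k"):
    """Build (but do not run) a dd command that copies the voting disk."""
    return ["dd", f"if={voting_disk}", f"of={backup_file}", f"bs={block_size}"]


# Hypothetical device and backup paths, for illustration only.
cmd = build_voting_disk_backup_cmd("/dev/raw/raw2", "/backup/votedisk.bak")
print(shlex.join(cmd))
```

Keeping the command as a list makes it straightforward to pass to subprocess.run later, once the paths have been checked against the output of crsctl query css votedisk.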

Q How do you restore a voting disk?

To restore the backup of your voting disk, issue the dd command on Linux
and UNIX systems or the ocopy command on Windows systems.
On Linux or UNIX systems:
dd if=backup_file_name of=voting_disk_name
On Windows systems, use the ocopy command:
ocopy backup_file_name voting_disk_name
where,
backup_file_name is the name of the voting disk backup file
voting_disk_name is the name of the active voting disk

Q How can we add and remove multiple voting disks?


If we have multiple voting disks, then we can remove the voting disks and
add them back into our environment using the following commands, where
path is the complete path of the location where the voting disk resides:
crsctl delete css votedisk path
crsctl add css votedisk path

Q How do we stop Oracle Clusterware? When do we stop it?


Before making any modification to the voting disk, as root user, stop Oracle
Clusterware using the crsctl stop crs command on all nodes.

Q How do we add voting disk?


To add a voting disk, issue the following command as the root user, replacing
the path variable with the fully qualified path name for the voting disk we
want to add:
crsctl add css votedisk path -force

Q How do we move voting disks?

To move a voting disk, issue the following commands as the root user,
replacing the path variable with the fully qualified path name for the voting
disk we want to move:
crsctl delete css votedisk path -force
crsctl add css votedisk path -force

Q How do we remove voting disks?


To remove a voting disk, issue the following command as the root user,
replacing the path variable with the fully qualified path name for the voting
disk we want to remove:
crsctl delete css votedisk path -force

Q What should we do after modifying voting disks?


After modifying the voting disk, restart Oracle Clusterware using the crsctl
start crs command on all nodes, and verify the voting disk location using the
following command:
crsctl query css votedisk

Q When can we use -force option?


If our cluster is down, then we can include the -force option to modify the
voting disk configuration, without interacting with active Oracle Clusterware
daemons. However, using the -force option while any cluster node is active
may corrupt our configuration.

What is RAC? What is the benefit of RAC over a single instance database?

In Real Application Clusters environments, all nodes concurrently execute
transactions against the same database. Real Application Clusters
coordinates each node's access to the shared data to provide consistency
and integrity.

Benefits:
Improved response time
Improved throughput
High availability
Transparency

What is Oracle RAC One Node?

Oracle RAC One Node is a single instance running on one node of the cluster
while the second node is in cold standby mode. If the instance fails for some
reason, RAC One Node detects it and restarts the instance on the same
node, or relocates the instance to the second node if there is a failure or
fault on the first node.

The benefit of this feature is that it provides a cold failover solution and
automates instance relocation without any downtime or manual
intervention. Oracle introduced this feature with the release of
11gR2 (available with Enterprise Edition).

Advantages of RAC (Real Application Clusters)

Reliability: if one node fails, the database won't fail
Availability: nodes can be added or replaced without having to shut down the
database
Scalability: more nodes can be added to the cluster as the workload
increases

What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that the client
connections use instead of the standard public IP address. To configure a VIP
address, we need to reserve a spare IP address for each node, and the IP
addresses must use the same subnet as the public network.

Where are the Clusterware files stored in a RAC environment?

The Clusterware is installed on each node (in an Oracle Home) and on the
shared disks (the voting disks and the OCR file).

Where are the database software files stored in a RAC environment?

The base software is installed on each node of the cluster and the database
storage is on the shared disks.

What kind of storage can we use for the shared Clusterware files?

OCFS (Release 1 or 2)
Raw devices
Third party cluster file system such as GPFS or Veritas

What is the significance of VIP address failover?

When a VIP address failover happens, clients that attempt to connect to the
VIP address receive a rapid connection refused error. They don't have to wait
for TCP connection timeout messages.

What is a voting disk?

The voting disk is a file that sits in the shared storage area and must be
accessible by all nodes in the cluster. All nodes in the cluster register their
heartbeat information in the voting disk to confirm that they are all
operational. If the heartbeat information of any node is not available in the
voting disk, that node will be evicted from the cluster.

The CSS (Cluster Synchronization Service) daemon in the clusterware
maintains the heartbeat of all nodes to the voting disk. When a node is
not able to send its heartbeat to the voting disk, it reboots itself, which
helps avoid the split-brain syndrome.

For high availability, Oracle recommends that you have an odd number of
voting disks, with a minimum of three.

The voting disk is a file that resides on shared storage and manages cluster
membership. The voting disk reassigns cluster ownership between the nodes in
case of failure.
The voting disk files are used by Oracle Clusterware to determine which
nodes are currently members of the cluster. The voting disk files are also
used in concert with other cluster components such as CRS to maintain the
cluster's integrity.
Oracle Database 11g Release 2 provides the ability to store the voting disks
in ASM along with the OCR. Oracle Clusterware can access the OCR and the
voting disks present in ASM even if the ASM instance is down. As a result CSS
can continue to maintain the Oracle cluster even if the ASM instance has
failed.

What kind of storage can we use for the RAC database storage?

OCFS (Release 1 or 2)
ASM
raw devices
third party cluster file system such as GPFS or Veritas

What is a CFS?

A cluster File System (CFS) is a file system that may be accessed (read and
write) by all members in a cluster at the same time. This implies that all
members of a cluster have the same view.

What is an OCFS2?

The OCFS2 is the Oracle (version 2) Cluster File System which can be used
for the Oracle Real Application Cluster.

Which files can be placed on an Oracle Cluster File System?

Oracle Software installation (Windows only)
Oracle files (controlfiles, datafiles, redologs, files described by the bfile
datatype)
Shared configuration files (spfile)
OCR and voting disk
Files created by Oracle during runtime

Do you know another Cluster Vendor?

HP Tru64 Unix, Veritas, Microsoft

How is it possible to install RAC if we don't have a CFS?

This is possible by using a raw device.

What is a raw device?

A raw device is a disk drive that does not yet have a file system set up. Raw
devices are used for Real Application Clusters since they enable the sharing
of disks.

Why do we need to keep an odd number of voting disks?

Oracle expects that you will configure at least 3 voting disks for redundancy
purposes. You should always configure an odd number of voting disks >= 3.
This is because each node must be able to access more than half of the voting
disks; losing half or more of them causes the cluster to fail, so an even
number of disks tolerates no more failures than the odd number below it.
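
The arithmetic behind the odd-number rule can be sketched as follows, assuming the strict-majority rule (a node must see more than half of the configured voting disks). The function name is hypothetical.

```python
def tolerated_voting_disk_failures(total_disks):
    """Number of voting disks that can be lost while a strict majority survives."""
    if total_disks < 1:
        raise ValueError("at least one voting disk is required")
    # Strict majority: more than half of the configured disks.
    majority = total_disks // 2 + 1
    return total_disks - majority


for disks in range(1, 6):
    print(disks, "voting disk(s) tolerate", tolerated_voting_disk_failures(disks), "failure(s)")
```

Under this assumption, three disks tolerate one failure and four disks still tolerate only one, which is why the fourth disk adds cost without adding resilience.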

What is a raw partition?

A raw partition is a portion of a physical disk that is accessed at the lowest
possible level. A raw partition is created when an extended partition is
created and logical partitions are assigned to it without any formatting. Once
formatting is complete, it is called a cooked partition.

When to use CFS over raw?

A CFS offers:
Simpler management
Use of Oracle Managed Files with RAC
Single Oracle Software installation
Autoextend enabled on Oracle datafiles
Uniform accessibility to archive logs in case of physical node failure
With Oracle_Home on CFS, when you apply Oracle patches CFS guarantees
that the updated Oracle_Home is visible to all nodes in the cluster.

What is CRS?

Oracle RAC 10g Release 1 introduced Oracle Cluster Ready Services (CRS), a
platform-independent set of system services for cluster environments. In
Release 2, Oracle has renamed this product to Oracle Clusterware.

What is the VIP used for?

It returns a dead connection immediately when its primary node fails.
Without a VIP, clients have to wait around 10 minutes to receive
ORA-3113: end of file on communications channel. However, using
Transparent Application Failover (TAF) could avoid ORA-3113.

Why do we need to configure SSH or RSH on the RAC nodes?

SSH (Secure Shell, 10g+) or RSH (Remote Shell, 9i+) allows the oracle UNIX
account to connect to another RAC node and copy files or run commands as the
local oracle UNIX account.

Is SSH or RSH needed for normal RAC operations?

No. SSH or RSH is needed only for RAC installation, patch set installation and
clustered database creation.

Do we have to have Oracle RDBMS on all nodes?

Each node of a cluster that is being used for a clustered database will
typically have the RDBMS and RAC software loaded on it, but not actual data
files (these need to be available via shared disk).

What are the restrictions on the SID with a RAC database? Is it limited to 5
characters?

The SID prefix in 10g Release 1 and prior versions was restricted to five
characters by install/ config tools so that an ORACLE_SID of up to max of
5+3=8 characters can be supported in a RAC environment. The SID prefix is
relaxed up to 8 characters in 10g Release 2, see bug 4024251 for more
information.

Does Real Application Clusters support heterogeneous platforms?

Real Application Clusters does not support heterogeneous platforms in the
same cluster.

Are there any issues for the interconnect when sharing the same switch as
the public network by using VLAN to separate the network?

RAC and Clusterware deployment best practices suggest that the
interconnect (private connection) be deployed on a stand-alone, physically
separate, dedicated switch. On a large shared network, the connections could
be unstable.

What is the Load Balancing Advisory?

To assist in the balancing of application workload across designated
resources, Oracle Database 10g Release 2 provides the Load Balancing
Advisory. This Advisory monitors the current workload activity across the
cluster, and for each instance where a service is active it provides a
percentage value of how much of the total workload should be sent to this
instance, as well as a service quality flag.

What is the Cluster Verification Utility (cluvfy)?

The Cluster Verification Utility (CVU) is a validation tool that you can use to
check all the important components that need to be verified at different
stages of deployment in a RAC environment.

Is it possible to use ASM for the OCR and voting disk?

Before 11g Release 2, no: the OCR and voting disk must be on raw devices or a
CFS (cluster file system). From 11g Release 2, both can be stored in ASM.

What is the OCR file used for?

The OCR is a file that holds the cluster and RAC configuration.

What is the voting disk file used for?

The voting disk is a file that contains and manages information about all
node memberships.

What is the recommended method to make backups of a RAC environment?

Use RMAN to back up the database, dd to back up the voting disk, and keep
backup copies of the OCR file.

What command would you use to check the availability of the RAC system?

crs_stat -t -v (-t -v are optional)

What is cache fusion?

In a RAC environment, it is the combining of data blocks, which are shipped
across the interconnect from remote database caches (SGA) to the local
node, in order to fulfill the requirements of a transaction (DML, query of
the data dictionary).

What is split brain?

When database nodes in a cluster are unable to communicate with each
other, they may continue to process and modify the data blocks
independently. If the same block is modified by more than one instance,
synchronization/locking of the data blocks does not take place and blocks
may be overwritten by others in the cluster. This state is called split brain.

What is the difference between crash recovery and instance recovery?

When an instance crashes in a single node database, crash recovery takes
place on start-up. In a RAC environment, the same recovery for a failed
instance is performed by the surviving nodes and is called instance recovery.

What is the interconnect used for?

It is a private network which is used to ship data blocks from one instance to
another for cache fusion. The physical data blocks as well as data dictionary
blocks are shared across this interconnect.

How do you determine what protocol is being used for Interconnect traffic?

One of the ways is to look at the database alert log for the time period when
the database was started up.

What methods are available to keep the time synchronized on all nodes in
the cluster?

Either the Network Time Protocol (NTP) can be configured or, in 11gR2, the
Cluster Time Synchronization Service (CTSS) can be used.

What file components in RAC must reside on shared storage?

Spfiles, control files, datafiles and redo log files should be created on
shared storage.

Where does the clusterware write when there is a network or Storage missed
heartbeat?

The network ping failure is written in $CRS_HOME/log

How do you find out what OCR backups are available?

The ocrconfig -showbackup can be run to find out the automatic and
manually run backups.

If your OCR is corrupted, what options do you have to resolve this?

You can use either the logical or the physical OCR backup copy to restore the
Repository.

How do you find out which object has its blocks shipped across instances
the most?

You can query the DBA_HIST_SEG_STAT view.

What is a VIP in RAC used for?

The VIP is an alternate virtual IP address assigned to each node in a cluster.
During a node failure the VIP of the failed node moves to a surviving node
and relays to the application that the node has gone down. Without a VIP, the
application will wait for the TCP timeout and only then find out that the
session is no longer live due to the failure.

How do we know which database instances are part of a RAC cluster?

You can query the V$ACTIVE_INSTANCES view to determine the member
instances of the RAC cluster.

What is OCLUMON used for in a cluster environment?

OCLUMON is the command-line tool for querying the Cluster Health Monitor
(CHM). CHM stores operating system metrics in the CHM repository for all
nodes in a RAC cluster. It stores information on CPU, memory, process,
network and other OS data. This information can later be retrieved and used
to troubleshoot and identify any cluster related issues.

It is a default component of the 11gR2 grid install. The data is stored in the
master repository and replicated to a standby repository on a different node.

What would be the possible performance impact in a cluster if a less
powerful node (e.g. slower CPUs) is added to the cluster?

All processing will slow down to the CPU speed of the slowest server.

What are some of the RAC specific parameters?

Some of the RAC parameters are:

CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
INSTANCE_TYPE (RDBMS or ASM)
ACTIVE_INSTANCE_COUNT
UNDO_MANAGEMENT

What is the future of the Oracle Grid?

The Grid software is becoming more and more capable of not just supporting
HA for Oracle Databases but also other applications including Oracle's
applications. With 12c there are more features and functionality built-in and
it is easier to deploy these pre-built solutions, available for common Oracle
applications.

What components of the Grid should I back up?

The backups should include OLR, OCR and ASM Metadata.

Is there an easy way to verify the inventory for all remote nodes?

You can run the opatch lsinventory -all_nodes command from a single node
to look at the inventory details for all nodes in the cluster.

What are Oracle RAC software components?

Oracle RAC is composed of two or more database instances. They are
composed of memory structures and background processes, the same as a
single instance database. Oracle RAC instances use two processes, GES (Global
Enqueue Service) and GCS (Global Cache Service), that enable cache fusion.

Oracle RAC instances are composed of the following background processes:

ACMS: Atomic Controlfile to Memory Service
GTX0-j: Global Transaction Process
LMON: Global Enqueue Service Monitor
LMD: Global Enqueue Service Daemon
LMS: Global Cache Service Process
LCK0: Instance Enqueue Process
RMSn: Oracle RAC Management Processes
RSMN: Remote Slave Monitor

What is Cache Fusion?

The transfer of data across instances through the private interconnect is
called Cache Fusion. Oracle RAC is composed of two or more instances. When a
block of data has been read from a datafile by one instance in the cluster and
another instance needs the same block, it is faster to get the block image from
the instance that has the block in its SGA than to read it from disk.
To enable inter-instance communication, Oracle RAC makes use of
interconnects. The Global Enqueue Service (GES) monitor and the instance
enqueue process manage Cache Fusion.

What are SCAN components in a cluster?

SCAN Name
SCAN IPs (3)
SCAN Listeners (3)

What is FAN?

Fast Application Notification (FAN) relates to events concerning instances,
services and nodes. It is a notification mechanism that Oracle RAC uses to
notify other processes about configuration and service level information,
including service status changes such as UP or DOWN events. Applications can
respond to FAN events and take immediate action.

How to find the location of the OCR file when CRS is down?

When CRS is down:

Look in the ocr.loc file. The location of this file depends on the OS:
On Linux: /etc/oracle/ocr.loc
On Solaris: /var/opt/oracle/ocr.loc

When CRS is up:

Set the ASM environment or CRS environment, then run the following command:

ocrcheck
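
When CRS is down, reading ocr.loc can be scripted. This is a minimal sketch, assuming the file holds simple key=value lines and that the OCR path is stored under a key named ocrconfig_loc; both the key names and the sample contents are assumptions made for illustration, so check them against the file on your own platform.

```python
def parse_ocr_loc(text):
    """Parse key=value lines from the contents of an ocr.loc file."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical file contents; key names and path are assumptions.
sample = "ocrconfig_loc=/dev/raw/raw1\nlocal_only=FALSE\n"

settings = parse_ocr_loc(sample)
print(settings.get("ocrconfig_loc"))
```

In practice the text would come from reading /etc/oracle/ocr.loc or /var/opt/oracle/ocr.loc, depending on the OS as listed above.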

In a 2-node RAC, how many NICs are used?

2 network cards on each clusterware node:
Network Card 1 (with IP address set 1) for the public network
Network Card 2 (with IP address set 2) for the private network (for inter-node
communication between RAC nodes, used by clusterware and the RAC database)

What is the difference between the RAC IP addresses?

The public IP address is the normal IP address typically used by DBAs and SAs
to manage storage, the system and the database. Public IP addresses are
reserved for the Internet.
The private IP address is used only for internal cluster processing (Cache
Fusion), also known as the interconnect. Private IP addresses are reserved for
private networks.
The VIP is used by database applications to enable failover when one cluster
node fails. The purpose of the VIP is to let client connections fail over
to surviving nodes in case of a failure.

Can an application developer access the private IP?

No. The private IP address is used only for internal cluster processing (Cache
Fusion).

What is dynamic remastering? When does dynamic remastering happen?

Dynamic remastering is the ability to move the ownership of a resource from
one instance to another instance in RAC.
Dynamic resource remastering is used to implement resource affinity for
increased performance.
Resource affinity optimizes the system in situations where update transactions
are being executed on one instance.
When activity shifts to another instance, the resource affinity moves
correspondingly to that instance.
If activity is not localized, resource ownership is distributed across the
instances by hashing.

What is RAC and how is it different from non RAC databases?

RAC stands for Real Application Clusters. You have n instances running on
their own separate nodes, all based on shared storage.
The cluster is the key component: a collection of servers operating as one
unit.
RAC is the best solution for high performance and high availability.
A non RAC database has a single point of failure in case of hardware failure
or server crash.

What is GRD?

GRD stands for Global Resource Directory.
The GES and GCS maintain records of the statuses of each datafile and each
cached block using the Global Resource Directory. This record keeping
supports cache fusion and helps preserve data integrity.

What are the major RAC wait events?

In a RAC environment the buffer cache is global across all instances in the
cluster, and hence the processing differs. The most common wait events
related to this are gc cr request and gc buffer busy.

GC CR REQUEST: the time it takes to retrieve the data from the remote cache.

Reason: RAC traffic using a slow connection, or inefficient queries (poorly
tuned queries increase the number of data blocks requested by an Oracle
session. The more blocks requested, the more often a block will typically
need to be read from a remote instance via the interconnect).

GC BUFFER BUSY: the time the remote instance spends locally accessing
the requested data block.

What is the use of the cluster interconnect?

The cluster interconnect is used by Cache Fusion for inter-instance
communication.

How do we verify that RAC instances are running?

Issue the following query from any one node, connecting through SQL*Plus:
SQL> connect sys/sys as sysdba
SQL> select * from V$ACTIVE_INSTANCES;
The query returns the instance number in the INST_NUMBER column and
host_name:instance_name in the INST_NAME column.

How does Oracle Clusterware manage CRS resources?

Oracle Clusterware manages CRS resources based on the configuration
information of CRS resources stored in the OCR (Oracle Cluster Registry).

How do we remove ASM from an Oracle RAC environment?

We need to stop and delete the database instance on the node first, in
interactive or silent mode. After that, ASM can be removed using the srvctl
tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify if ASM has been removed by issuing the following command:
srvctl config asm -n node_name

What are the types of connection load-balancing?

There are two types of connection load balancing: server-side load balancing
and client-side load balancing.

What is the difference between server-side and client-side connection load
balancing?

Client-side load balancing happens on the client, which spreads its
connection requests across the available listeners. In server-side load
balancing, the listener uses the load balancing advisory to redirect
connections to the instance providing the best service.

What is the Oracle Recommendation for backing up voting disk?

Oracle recommends using the dd command to back up the voting disk
with a minimum block size of 4 KB.
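
A minimal sketch of such a backup on Linux or UNIX follows. The function name backup_voting_disk and the checks around the copy are illustrative additions; only the dd call with a 4 KB block size comes from the recommendation above, and the paths passed in are placeholders.

```shell
# Sketch: back up a voting disk with dd using a 4 KB block size.
# backup_voting_disk is a hypothetical helper name, not an Oracle tool.
# Usage: backup_voting_disk <voting_disk_name> <backup_file_name>
backup_voting_disk() {
    src=$1
    dest=$2
    # Refuse to run if the source cannot be read.
    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        return 1
    fi
    # bs=4k sets both the input and output block size to 4 KB.
    dd if="$src" of="$dest" bs=4k 2>/dev/null || return 1
    # Succeed only if the backup file exists and is not empty.
    [ -s "$dest" ]
}
```

The function returns a non-zero status if the source is unreadable, if dd fails, or if the resulting backup file is empty, so it can be used directly in a script's error handling.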

How do you restore a voting disk?

To restore the backup of your voting disk, issue the dd command on Linux
and UNIX systems or the ocopy command on Windows systems.
On Linux or UNIX systems:
dd if=backup_file_name of=voting_disk_name
On Windows systems, use the ocopy command:
ocopy backup_file_name voting_disk_name

where,
backup_file_name is the name of the voting disk backup file
voting_disk_name is the name of the active voting disk
