Version A.11.20
Legal Notices
Copyright 2011 Hewlett-Packard Development Company, L.P.
Confidential computer software. Valid license from HP required for possession, use, or copying. Consistent with FAR 12.211 and 12.212, Commercial
Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under
vendor's standard commercial license.
The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express
warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP
shall not be liable for technical or editorial errors or omissions contained herein.
Oracle is a registered trademark of Oracle Corporation.
UNIX is a registered trademark in the United States and other countries, licensed exclusively through The Open Group.
VERITAS is a registered trademark of VERITAS Software Corporation.
VERITAS File System is a trademark of VERITAS Software Corporation.
Contents
Advantages of using SGeRAC.........................................................................8
User Guide Overview....................................................................................9
Where to find Documentation on the Web......................................................11
1 Introduction to Serviceguard Extension for RAC............................................12
What is a Serviceguard Extension for RAC Cluster? ...................................................................12
Group Membership............................................................................................................13
Using Packages in a Cluster ...............................................................................................13
Serviceguard Extension for RAC Architecture..............................................................................14
Group Membership Daemon...............................................................................................14
Overview of SGeRAC and Cluster File System (CFS)/Cluster Volume Manager (CVM).....................14
Package Dependencies.......................................................................................................15
Storage Configuration Options............................................................................................15
About Veritas CFS and CVM from Symantec..........................................................................15
Overview of SGeRAC and Oracle 10g, 11gR1, and 11gR2 RAC...................................................16
Overview of SGeRAC Cluster Interconnect Subnet Monitoring ......................................................17
How Cluster Interconnect Subnet Works................................................................................17
Configuring Packages for Oracle RAC Instances.........................................................................18
Configuring Packages for Oracle Listeners..................................................................................18
Node Failure.........................................................................................................................19
Larger Clusters ......................................................................................................................20
Up to Four Nodes with SCSI Storage....................................................................................20
Point-to-Point Connections to Storage Devices ........................................................................21
Extended Distance Cluster Using Serviceguard Extension for RAC.................................................22
GMS Authorization.................................................................................................................22
Overview of Serviceguard Manager.........................................................................................23
Starting Serviceguard Manager...........................................................................................23
Monitoring Clusters with Serviceguard Manager....................................................................23
Administering Clusters with Serviceguard Manager................................................................23
Configuring Clusters with Serviceguard Manager...................................................................24
Network Monitoring...........................................................................................................28
SGeRAC Heartbeat Network..........................................................................................28
CSS Heartbeat Network.................................................................................................28
RAC Cluster Interconnect................................................................................................28
Public Client Access.......................................................................................................28
RAC Instances........................................................................................................................28
Automated Startup and Shutdown........................................................................................28
Manual Startup and Shutdown............................................................................................29
Shared Storage.................................................................................................................29
Network Planning for Cluster Communication.............................................................................29
Planning Storage for Oracle Cluster Software.............................................................................30
Planning Storage for Oracle 10g/11gR1/11gR2 RAC..................................................................30
Volume Planning with SLVM................................................................................................31
Storage Planning with CFS..................................................................................................31
Volume Planning with CVM.................................................................................................31
Installing Serviceguard Extension for RAC .................................................................................33
Veritas Cluster Volume Manager (CVM) and Cluster File System (CFS)...........................................33
Veritas Storage Management Products.......................................................................................33
About Multipathing............................................................................................................33
About Device Special Files.......................................................................................................33
About Cluster-wide Device Special Files (cDSFs).....................................................................34
Configuration File Parameters...................................................................................................35
Cluster Communication Network Monitoring..............................................................................35
Single Network for Cluster Communications..........................................................................36
Alternate Configuration: Fast Reconfiguration with Low Node Member Timeout.........37
Alternate Configuration: Multiple RAC Databases.................................................38
Guidelines for Changing Cluster Parameters..........................................................................39
When Cluster Interconnect Subnet Monitoring is used........................................................39
When Cluster Interconnect Subnet Monitoring is not Used..................................................39
Limitations of Cluster Communication Network Monitor...........................................................40
Cluster Interconnect Monitoring Restrictions.......................................................................40
Creating a Storage Infrastructure with LVM.................................................................................40
Building Volume Groups for RAC on Mirrored Disks...............................................................41
Creating Volume Groups and Logical Volumes .................................................................41
Selecting Disks for the Volume Group..........................................................................41
Creating Physical Volumes.........................................................................................41
Creating a Volume Group with PVG-Strict Mirroring......................................................42
Building Mirrored Logical Volumes for RAC with LVM Commands.............................................42
Creating Mirrored Logical Volumes for RAC Redo Logs and Control Files..............................42
Creating Mirrored Logical Volumes for RAC Data Files.......................................................43
Creating RAC Volume Groups on Disk Arrays .......................................................................44
Creating Logical Volumes for RAC on Disk Arrays..................................................................45
Oracle Demo Database Files ..............................................................................................45
Displaying the Logical Volume Infrastructure ..............................................................................46
Exporting the Logical Volume Infrastructure ...........................................................................46
Exporting with LVM Commands ......................................................................................46
Installing Oracle Real Application Clusters.................................................................................47
Creating a Storage Infrastructure with CFS.................................................................................47
Creating an SGeRAC Cluster with CFS for Oracle 11gR1 or 11gR2...........................................48
Initializing the Veritas Volume Manager................................................................................48
Deleting CFS from the Cluster..............................................................................................51
Creating a Storage Infrastructure with CVM................................................................................52
Initializing the Veritas Volume Manager................................................................................52
Using CVM 5.x or later......................................................................................................53
Preparing the Cluster and the System Multi-node Package for use with CVM 5.x or later.........53
5 Maintenance.........................................................................................113
Reviewing Cluster and Package States with the cmviewcl Command............................................113
Types of Cluster and Package States...................................................................................113
Examples of Cluster and Package States.........................................................................113
Types of Cluster and Package States..............................................................................115
Cluster Status .............................................................................................................116
Node Status and State ................................................................................................116
Package Status and State ............................................................................................116
Package Switching Attributes........................................................................................117
Status of Group Membership........................................................................................117
Service Status ............................................................................................................117
Network Status...........................................................................................................118
Failover and Failback Policies.......................................................................................118
Examples of Cluster and Package States .............................................................................118
Normal Running Status................................................................................................118
Quorum Server Status..................................................................................................119
CVM Package Status...................................................................................................119
Status After Moving the Package to Another Node...........................................................120
Status After Package Switching is Enabled......................................................................121
Status After Halting a Node.........................................................................................121
Viewing Data on Unowned Packages.............................................................................121
Checking the Cluster Configuration and Components................................................................122
Checking Cluster Components...........................................................................................123
Setting up Periodic Cluster Verification................................................................................125
Example....................................................................................................................125
Limitations.......................................................................................................................126
Online Reconfiguration..........................................................................................................126
Online Node Addition and Deletion...................................................................................126
Managing the Shared Storage...............................................................................................127
Making LVM Volume Groups Shareable..............................................................................127
Making a Volume Group Unshareable ..........................................................................128
Activating an LVM Volume Group in Shared Mode...............................................................128
Deactivating a Shared Volume Group ...........................................................................128
Making Offline Changes to Shared Volume Groups.............................................................128
Adding Additional Shared LVM Volume Groups ..................................................................130
Changing the CVM Storage Configuration .........................................................................130
6 Troubleshooting......................................................................................138
A Software Upgrades ...............................................................................139
Rolling Software Upgrades....................................................................................................139
Upgrading Serviceguard to SGeRAC cluster........................................................................140
Upgrading from an existing SGeRAC A.11.19 cluster to HP-UX 11i v3 1109 HA-OE/DC-OE...........................................140
Upgrading from an existing Serviceguard A.11.19 cluster to HP-UX 11i v3 1109 HA-OE/DC-OE along with SGeRAC....................................140
Upgrading from an existing Serviceguard A.11.19 cluster to HP-UX 11i v3 1109 HA-OE/DC-OE along with SGeRAC (Alternative approach).........................................141
Upgrading from Serviceguard A.11.18 cluster to HP-UX 11i v3 1109 HA-OE/DC-OE along with SGeRAC.............................................141
Upgrading from Serviceguard A.11.20 cluster to HP-UX 11i v3 1109 HA-OE/DC-OE along with SGeRAC.............................................142
Steps for Rolling Upgrades ...............................................................................................142
Keeping Kernels Consistent...........................................................................................143
Example of Rolling Upgrade .............................................................................................143
Step 1. ......................................................................................................................144
Step 2. .....................................................................................................................144
Step 3. .....................................................................................................................145
Step 4. .....................................................................................................................146
Step 5. .....................................................................................................................146
Limitations of Rolling Upgrades .........................................................................................147
Non-Rolling Software Upgrades.............................................................................................148
Limitations of Non-Rolling Upgrades ..................................................................................148
Migrating an SGeRAC Cluster with Cold Install....................................................................148
Upgrade Using DRD.............................................................................................................149
Rolling Upgrade Using DRD..............................................................................................149
Non-Rolling Upgrade Using DRD.......................................................................................149
Restrictions for DRD Upgrades...........................................................................................149
Index.......................................................................................................153
Chapter 5 Maintenance
Describes tools and techniques necessary for ongoing cluster operation. This chapter should
be used as a supplement to Chapters 7 and 8 of the Managing Serviceguard user's guide.
Chapter 6 Troubleshooting
Lists where to find troubleshooting information.
VERITAS Storage Foundation for Oracle RAC. HP Serviceguard Storage Management Suite
Configuration Guide Extracts.
VERITAS Storage Foundation for Oracle RAC. HP Serviceguard Storage Management Suite
Administration Guide Extracts.
If you will be using Veritas Cluster Volume Manager (CVM) and Veritas Cluster File System (CFS)
from Symantec with Serviceguard refer to the HP Serviceguard Storage Management Suite Version
A.03.01 for HP-UX 11i v3 Release Notes.
These release notes describe suite bundles for the integration of HP Serviceguard A.11.20 on
HP-UX 11i v3 with Symantec's Veritas Storage Foundation.
Problem Reporting
If you have any problems with the software or documentation, please contact your local
Hewlett-Packard Sales Office or Customer Service Center.
Typographical Conventions
audit(5)
An HP-UX manpage. audit is the name and 5 is the section in the HP-UX
Reference. On the web and on the Instant Information CD, it may be a hot link
to the manpage itself. From the HP-UX command line, you can enter man
audit or man 5 audit to view the manpage. See man(1).
ComputerOut
UserInput
Command
Variable
[ ]
The contents are optional in formats and command descriptions. If the contents
are a list separated by |, you must choose one of the items.
{ }
The contents are required in formats and command descriptions. If the contents
are a list separated by |, you must choose one of the items.
...
SGeRAC Documentation
Go to www.hp.com/go/hpux-serviceguard-docs, and then click HP Serviceguard
Extension for RAC.
Related Documentation
Go to www.hp.com/go/hpux-serviceguard-docs, www.hp.com/go/
hpux-core-docs, and www.hp.com/go/hpux-ha-monitoring-docs.
The following documents contain additional useful information:
HP Serviceguard Storage Management Suite Version A.02.01 for HP-UX 11i v3 Release
Notes
HP-UX System Administrator's Guide: Logical Volume Management HP-UX 11i Version
3
Overview of SGeRAC and Cluster File System (CFS)/Cluster Volume Manager (CVM) (page
14)
Overview of SGeRAC and Oracle 10g, 11gR1, and 11gR2 RAC (page 16)
Extended Distance Cluster Using Serviceguard Extension for RAC (page 22)
In the figure, two loosely coupled systems (each one known as a node) are running separate
instances of Oracle software that read data from and write data to a shared set of disks. Clients
connect to one node or the other via LAN.
With RAC on HP-UX, you can maintain a single database image that is accessed by the HP servers
in parallel and gain added processing power without the need to administer separate databases.
When properly configured, Serviceguard Extension for RAC provides a highly available database
that continues to operate even if one hardware component fails.
Group Membership
Group membership allows multiple instances of RAC to run on each node. Related processes are
configured into groups. Groups allow processes in different instances to choose which other
processes to interact with. This allows the support of multiple databases within one RAC cluster.
A Group Membership Service (GMS) component provides a process monitoring facility to monitor
group membership status. GMS is provided by the cmgmsd daemon, which is an HP component
installed with Serviceguard Extension for RAC.
Figure 2 shows how group membership works. Nodes 1 through 4 of the cluster share the Sales
database, but only Nodes 3 and 4 share the HR database. There is one instance of RAC each on
Node 1 and Node 2, and two instances of RAC each on Node 3 and Node 4. The RAC processes
accessing the Sales database constitute one group, and the RAC processes accessing the HR
database constitute another group.
Figure 2 Group Membership Services
are supported are those specified by Hewlett-Packard, and you can create your own multi-node
packages; for example, the packages HP supplies for use with the Veritas Cluster Volume
Manager (CVM) and the Veritas Cluster File System (CFS) (on HP-UX releases that support
Veritas CFS and CVM; also see About Veritas CFS and CVM from Symantec (page 15)).
Multi-node package. A multi-node package can be configured to run on one or more cluster
nodes. It is considered UP as long as it is running on any of its configured nodes. (A system
multi-node package, by contrast, must run on all nodes that are active in the cluster; if it fails
on one active node, that node halts.)
NOTE: In RAC clusters, you create packages to start and stop Oracle Clusterware and RAC itself
as well as to run applications that access the database instances. For details on the use of packages
with Oracle Clusterware and RAC, refer to RAC Instances (page 28).
(Figure: Serviceguard Extension for RAC architecture, showing the Oracle components and the Serviceguard components (Package Manager, Cluster Manager, and Network Manager) layered above the HP-UX kernel and operating system.)
This HP daemon provides group membership services for Oracle Real Application Clusters 10g or
later. Group membership allows multiple Oracle instances to run on the same cluster node. GMS
is illustrated in Figure 2 (page 13).
Package Dependencies
When CFS is used as shared storage, the applications and software using the CFS storage should
be configured to start and stop using Serviceguard packages. These application packages should
be configured with a package dependency on the underlying multi-node packages, which manage
the CFS and CVM storage resources.
Configuring the application to start and stop through a Serviceguard package ensures that storage
activation and deactivation stay synchronized with application startup and shutdown.
With CVM configurations using multi-node packages, CVM shared storage should be configured
in Serviceguard packages with package dependencies.
Refer to the latest edition of the Managing Serviceguard user's guide for detailed information on
multi-node packages.
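For illustration only, the dependency lines of such an application package might look like the following excerpt from a legacy ASCII package configuration file; the dependency name and the CFS mount-point package name (SG-CFS-MP-1) are hypothetical placeholders, not values taken from this guide:
DEPENDENCY_NAME         cfs_mp_dep
DEPENDENCY_CONDITION    SG-CFS-MP-1=UP
DEPENDENCY_LOCATION     SAME_NODE
With this dependency in place, the application package is started only where the CFS mount-point multi-node package is up on the same node.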
NOTE: Beginning with HP-UX 11i v3 1109 HA-OE/DC-OE, SGeRAC is included as a licensed
bundle at no additional cost. To install SGeRAC A.11.20 on your system during 1109
HA-OE/DC-OE installation, you must select T1907BA (SGeRAC) in the Software tab.
If you are using any one of the bundles (namely, T2771DB, T2773DB, T2774DB, T2775DB,
T8684DB, T8694DB, T8685DB, T8695DB, T2774EB, T8684EB, or T8694EB) and you select
SGeRAC during an upgrade from an HP Serviceguard Storage Management Suite (SMS) cluster,
this configuration is supported only if Oracle RAC is deployed over SLVM. This is because
these Serviceguard SMS bundles either contain only VxVM, which cannot be used for Oracle RAC,
or they are not Oracle-specific bundles.
If SGeRAC is not installed as a part of HP-UX 11i v3 1109 HA-OE/DC-OE, it is automatically
installed during the installation of Serviceguard SMS bundles (T2777DB, T8687DB, T8697DB,
T2777EB, T8687EB, T8697EB).
If SGeRAC is installed with HP-UX 11i v3 1109 HA-OE/DC-OE, then depending upon the available
version of Serviceguard and SGeRAC in the SMS bundle, the installation of Serviceguard SMS
may succeed or fail.
The following installation scenarios apply to Serviceguard SMS bundles:

Scenario: Install an SMS bundle that has the same version of Serviceguard and SGeRAC as the version available on the 1109 HA-OE/DC-OE.
Result: The Serviceguard SMS installation continues, skipping the installation of Serviceguard and SGeRAC.

Scenario: Install a Serviceguard SMS bundle that has a higher version of Serviceguard and SGeRAC than the version available on the 1109 HA-OE/DC-OE.
Result: The Serviceguard SMS installation succeeds.

Scenario: Install an SMS bundle that has a lower version of Serviceguard and SGeRAC than the version available on the 1109 HA-OE/DC-OE.
Result: The Serviceguard SMS installation fails, indicating that a higher Serviceguard version is already available on the system.
Oracle 10g/11gR1/11gR2 RAC uses the following two subnets for cluster communication purposes:
CSS Heartbeat Network (CSS-HB): the Oracle Clusterware instances running on the various nodes
of the cluster communicate among themselves using this network.
NOTE: In this document, the generic terms CRS and Oracle Clusterware will subsequently be
referred to as Oracle Cluster Software. The term CRS will still be used when referring
to a sub-component of Oracle Cluster Software.
For more detailed information on Oracle 10g/11gR1/11gR2 RAC, refer to Chapter 2:
Serviceguard Configuration for Oracle 10g, 11gR1, or 11gR2 RAC.
A subnet used only for the communications among instances of an application configured as
a multi-node package.
A subnet whose health does not matter if there is only one instance of an application (package)
running in the cluster. The instance is able to provide services to its clients regardless of whether
the subnet is up or down on the node where the only instance of the package is running.
A failure of the cluster interconnect subnet on all nodes of the cluster, where the multi-node package
is running, is handled by bringing down all but one instance of the multi-node package.
In certain 10g/11gR1/11gR2 RAC configurations, the CLUSTER_INTERCONNECT_SUBNET parameter can be used to monitor the Oracle
Cluster Synchronization Services heartbeat (CSS-HB) and/or RAC cluster interconnect subnet when Oracle
Clusterware and RAC Database instances are configured as Serviceguard multi-node packages.
Cluster Interconnect Subnet Monitoring provides the following benefits:
Better availability, by detecting and resolving RAC-IC subnet failures quickly in certain
configurations (for example, configurations where the Oracle CSS-HB subnet and the RAC-IC
subnet are not the same; separating them helps prevent the RAC-IC traffic of one RAC database
from interfering with another).
Assists in providing services (on one node) when Oracle CSS-HB/RAC-IC subnet fails on all
nodes.
Facilitates fast and reliable reconfiguration with use of multiple SG-HB networks.
Allows separation of SG-HB and Oracle RAC-IC traffic (recommended when RAC-IC traffic
may interfere with SG-HB traffic).
For example, when a multi-node package (pkgA) is configured to run on all nodes of the cluster,
and configured to monitor a subnet (SubnetA) using the CLUSTER_INTERCONNECT_SUBNET
parameter:
If more than one instance of pkgA is running in the cluster and SubnetA fails on one of the
nodes where the instance of pkgA is running, the failure is handled by halting the instance of
pkgA on the node where the subnet has failed.
If pkgA is running on only one node of the cluster and SubnetA fails on that node, pkgA will
continue to run on that node after the failure.
If pkgA runs on all nodes of the cluster and SubnetA fails on all nodes of the cluster, the
failure is handled by halting all but one instance of pkgA. The node on which the remaining
instance of pkgA is left running is chosen arbitrarily.
The scenarios above illustrate the behavior of the cluster interconnect subnet monitoring feature.
For more information on the Cluster Interconnect Subnet Monitoring feature, refer to chapter 2,
section Cluster Communication Network Monitoring (page 35). This section describes various
network configurations for cluster communications in SGeRAC/10g or 11gR1/11gR2 RAC cluster,
and how the package configuration parameter CLUSTER_INTERCONNECT_SUBNET can be used
to recover from Oracle Cluster Communications network failures.
configured to automatically fail over from the original node to an adoptive node. When the original
node is restored, the listener package automatically fails back to the original node.
In the listener package ASCII configuration file, the FAILBACK_POLICY is set to AUTOMATIC.
The SUBNET is a set of monitored subnets. The package can be set to start automatically with
the AUTO_RUN setting.
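For illustration, the relevant lines of a listener failover package ASCII configuration file might look like the following sketch; the package name, node names, and subnet are hypothetical placeholders rather than values from this guide:
PACKAGE_NAME         lsnr_pkg1
PACKAGE_TYPE         FAILOVER
NODE_NAME            node1
NODE_NAME            node2
AUTO_RUN             YES
FAILOVER_POLICY      CONFIGURED_NODE
FAILBACK_POLICY      AUTOMATIC
SUBNET               192.168.10.0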
Each RAC instance can be configured to be registered with listeners that are assigned to handle
client connections. The listener package script is configured to add the package IP address and
start the listener on the node.
For example, on a two-node cluster with one database, each node can have one RAC instance
and one listener package. Oracle clients can be configured to connect to either package IP address
(or corresponding hostname) using Oracle Net Services. When a node failure occurs, existing
client connections to the package IP address will be reset after the listener package fails over and
adds the package IP address. For subsequent connections for clients configured with basic failover,
clients would connect to the next available listener package's IP address and listener.
Node Failure
RAC cluster configuration is designed so that in the event of a node failure, another node with a
separate instance of Oracle can continue processing transactions. Figure 3 shows a typical cluster
with instances running on both nodes.
Figure 3 Before Node Failure
Figure 4 shows the condition where node 1 has failed and Package 1 has been transferred to
node 2. Oracle instance 1 is no longer operating, but it does not fail over to node 2. The IP address
for package 1 was transferred to node 2 along with the package. Package 1 continues to be
available and is now running on node 2. Also, node 2 can now access both the Package 1 disk
and Package 2 disk. Oracle instance 2 now handles all database access, since instance 1 has
gone down.
In the above figure, pkg1 and pkg2 are not instance packages. They are shown to illustrate the
movement of the packages.
Larger Clusters
Serviceguard Extension for RAC supports clusters of up to 16 nodes. The actual cluster size is
limited by the type of storage and the type of volume manager used.
In this type of configuration, each node runs a separate instance of RAC and may run one or more
high availability packages as well.
The figure shows a dual Ethernet configuration with all four nodes connected to a disk array (the
details of the connections depend on the type of disk array). In addition, each node has a mirrored
root disk (R and R'). Nodes may have multiple connections to the same array using alternate links
(PV links) to take advantage of the array's use of RAID levels for data protection.
FibreChannel switched configurations also are supported using either an arbitrated loop or fabric
login topology. For additional information about supported cluster configurations, refer to the HP
9000 Servers Configuration Guide, available through your HP representative.
GMS Authorization
SGeRAC includes the Group Membership Service (GMS) authorization feature, which allows only
the listed users to access the GMS. By default, this feature is disabled. To enable this feature,
uncomment the variable GMS_USER[0] and add as many users as you need.
Use the following steps to enable GMS authorization (if Oracle RAC is already installed):
1. If Oracle RAC database instance and Oracle Clusterware are running, shut them down on
all nodes.
2. Halt the Serviceguard cluster.
3. Edit /etc/opt/nmapi/nmutils.conf to add all Oracle users on all nodes.
GMS_USER[0]=<oracle1>
GMS_USER[1]=<oracle2>
...
GMS_USER[n-1]=<oraclen>
4. Restart the Serviceguard cluster.
5. Restart Oracle Clusterware (for Oracle 10g, 11gR1, and 11gR2) and the Oracle RAC database instance on all nodes.
Use the following steps to disable GMS authorization:
1. If the Oracle RAC database instance and Oracle Clusterware (for Oracle 10g, 11gR1, and 11gR2) are running, shut them down on all nodes.
2. Halt the Serviceguard cluster.
3. Edit /etc/opt/nmapi/nmutils.conf and comment the GMS_USER[] settings on all nodes.
4. Restart the Serviceguard cluster.
5. Restart Oracle Clusterware (for Oracle 10g, 11gR1, and 11gR2) and the Oracle RAC database instance on all nodes.
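A rough command-level sketch of the disable procedure above, assuming a database named orcl and that srvctl and crsctl are used to stop and start the Oracle software (the database name and tool choice are assumptions, not values taken from this guide):
# srvctl stop database -d orcl
# crsctl stop crs
# cmhaltcl -f
(Edit /etc/opt/nmapi/nmutils.conf on every node and comment the GMS_USER[] lines.)
# cmruncl
# crsctl start crs
# srvctl start database -d orcl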
You can see properties, status, and alerts of clusters, nodes, and packages.
You can do administrative tasks such as run or halt clusters, cluster nodes, and packages.
Package: halt, run, move from one node to another, reset node- and package-switching flags
Interface Areas
This section documents interface areas where there is expected interaction between SGeRAC,
Oracle 10g/11gR1/11gR2 Cluster Software, and RAC.
SGeRAC Detection
When Oracle 10g/11gR1/11gR2 Cluster Software is installed on an SGeRAC cluster, Oracle
Cluster Software detects the existence of SGeRAC, and CSS uses SGeRAC group membership.
Cluster Timeouts
SGeRAC periodically checks heartbeat member timeouts to determine when any SGeRAC cluster
member has failed or when any cluster member is unable to communicate with the other cluster
members. CSS uses a similar mechanism for CSS memberships. Each RAC instance group
membership also has a member timeout mechanism, which triggers Instance Membership Recovery
(IMR).
CSS Timeout
When SGeRAC is on the same cluster as Oracle Cluster Software, the CSS timeout is set to a
default value of 600 seconds (10 minutes) at Oracle software installation.
This timeout is configurable with Oracle tools and should not be changed without ensuring that
the CSS timeout allows enough time for Serviceguard Extension for RAC (SGeRAC) reconfiguration
and for multipath reconfiguration (if configured) to complete.
On a single point of failure, for example a node failure, Serviceguard reconfigures first and SGeRAC
delivers the new group membership to CSS via NMAPI2. If there is a change in group membership,
SGeRAC updates the members of the new membership. After receiving the new group membership,
CSS initiates its own recovery action as needed, and propagates the new group membership to
the RAC instances.
IMR timeout as a configurable parameter has been deprecated in Oracle 11gR1 and later versions.
Monitoring
Oracle Cluster Software daemon monitoring is performed through programs initiated by the HP-UX
init process. SGeRAC monitors Oracle Cluster Software to the extent that CSS is a NMAPI2 group
membership client and group member. SGeRAC provides group membership notification to the
remaining group members when CSS enters and leaves the group membership.
Shared Storage
SGeRAC supports shared storage using HP Shared Logical Volume Manager (SLVM), Cluster File
System (CFS), Cluster Volume Manager (CVM), and ASM (ASM/SLVM in 11i v2/v3 and ASM
over raw disks in 11i v3). CFS and CVM are not supported on all versions of HP-UX; see About
Veritas CFS and CVM from Symantec (page 15).
The file /var/opt/oracle/oravg.conf must not be present, so that Oracle Cluster Software will
not activate or deactivate any shared storage.
Multipathing
Multipathing is automatically configured in HP-UX 11i v3 (this is often called native multipathing).
Multipathing is supported through either SLVM pvlinks or CVM Dynamic Multipath (DMP). In some
configurations, SLVM or CVM does not need to be configured for multipath as the multipath is
provided by the storage array. Since Oracle Cluster Software checks availability of the shared
device for the vote disk through periodic monitoring, the multipath detection and failover time must
be less than CRS's timeout specified by the Cluster Synchronization Service (CSS) MISSCOUNT.
On SGeRAC configurations, the CSS MISSCOUNT value is set to 600 seconds. Multipath failover
time is typically between 30 to 120 seconds. For information on Multipathing and HP-UX 11i v3,
see About Multipathing (page 33).
Listener
Automated Startup and Shutdown
CRS can be configured to automatically start, monitor, restart, and halt listeners.
If CRS is not configured to start the listener automatically at Oracle Cluster Software startup, the
listener startup can be automated with supported commands, such as srvctl and lsnrctl,
through scripts or SGeRAC packages. If the SGeRAC package is configured to start the listener,
the SGeRAC package would contain the virtual IP address required by the listener.
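For example (the listener and node names below are illustrative assumptions), a package control script or administrator might start a listener with either of the supported commands mentioned above:
# lsnrctl start LISTENER_NODE1
# srvctl start listener -n node1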
Network Monitoring
The SGeRAC cluster provides network monitoring. For networks that are redundant and monitored by
the Serviceguard cluster, Serviceguard provides local failover capability between local network
interfaces (LANs) that is transparent to applications using User Datagram Protocol (UDP) and
Transmission Control Protocol (TCP).
Virtual IP addresses (floating or package IP address) in Serviceguard provide remote failover
capability of network connection endpoints between cluster nodes, and transparent local failover
capability of network connection endpoints between redundant local network interfaces.
NOTE: Serviceguard cannot be responsible for networks or connection endpoints that it is not
configured to monitor.
RAC Instances
Automated Startup and Shutdown
CRS can be configured to automatically start, monitor, restart, and halt RAC instances. If CRS is
not configured to automatically start the RAC instance at Oracle Cluster Software startup, the RAC
instance startup can be automated through scripts using supported commands, such as srvctl
or sqlplus, in an SGeRAC package to start and halt RAC instances.
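For illustration, a package start or halt function might invoke commands such as the following; the database name (orcl) and instance name (orcl1) are hypothetical placeholders:
# srvctl start instance -d orcl -i orcl1
# srvctl stop instance -d orcl -i orcl1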
Shared Storage
The shared storage is expected to be available when the RAC instance is started, so ensure that
the shared storage is activated before startup. For SLVM, the shared volume groups must be
activated, and for CVM, the disk group must be activated. For CFS, the cluster file system must
be mounted.
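As a sketch (the volume group and mount point names are illustrative assumptions), the storage might be made available as follows:
# vgchange -a s /dev/vg_rac
# cfsmount /cfs/oradata
The first command activates an SLVM volume group in shared mode on a node; the second mounts a cluster file system on its configured nodes.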
Oracle Cluster Software requires shared storage for the Oracle Cluster Registry (OCR) and a vote
device. Automatic Storage Management (ASM) cannot be used for the OCR and vote device in
versions prior to Oracle 11gR2, since these files must be accessible before Oracle Cluster Software
starts.
For Oracle 10g, the minimum required size for each copy of the OCR is 100 MB, and for each
vote disk it is 20 MB. For Oracle 11gR2, the minimum required size for each copy of the OCR is
300 MB, and for each vote disk it is 300 MB.
The Oracle OCR and vote device can be created on supported shared storage, including SLVM
logical volumes, CVM raw volumes, and CFS file systems. Oracle 11gR2 supports OCR and vote
device on ASM over SLVM, ASM over raw device files, and Cluster File System (CFS).
RAC Interconnect (RAC-IC): RAC instance peer-to-peer traffic and communications for Global
Cache Service (GCS) and Global Enqueue Service (GES), formerly Cache Fusion (CF) and
Distributed Lock Manager (DLM). Network HA is provided by the HP-UX platform (Serviceguard
or Auto Port Aggregation (APA)).
Automatic Storage Management Interconnect (ASM-IC) (only when using ASM, Automatic
Storage Management): ASM instance peer-to-peer traffic. When it exists, ASM-IC should be
on the same network as CSS-HB. Network HA is required either through Serviceguard failover
or HP APA .
Group Membership Services/Atomic Broadcast and Low Latency Transport (GAB/LLT) (only when
using CFS/CVM): Symantec cluster heartbeat and communications traffic. GAB/LLT communicates
over a link-level protocol (DLPI) and is supported over Serviceguard heartbeat subnet networks,
including primary and standby links.
Highly Available Virtual IP (HAIP) (only when using Oracle Grid Infrastructure 11.2.0.2): IP
addresses that Oracle Database and Oracle ASM instances use to ensure highly available,
load-balanced communication across the provided set of cluster interconnect interfaces.
The most common network configuration is to have all interconnect traffic for cluster
communications go over a single redundant heartbeat network, so that Serviceguard
monitors the network and resolves interconnect failures by cluster reconfiguration.
The following are situations when it is not possible to place all interconnect traffic on a single
network:
RAC GCS (cache fusion) traffic may be very high, so an additional dedicated heartbeat
network for Serviceguard needs to be configured.
Some networks, such as InfiniBand, are not supported by CFS/CVM, so the CSS-HB/RAC-IC
traffic may need to be on a separate network that is different from the SG-HB network.
Certain configurations for fast reconfiguration require a dual Serviceguard heartbeat network,
and CSS-HB/RAC-IC does not support multiple networks for HA purposes.
In a multiple database configuration, RAC-IC traffic of one database may interfere with RAC-IC
traffic of another database; therefore, the RAC-IC traffic of databases may need to be
separated.
In the above cases, recovery from some network failures (beyond those protected by primary and
standby interfaces) will take longer unless Serviceguard is configured to monitor the network.
A failure of the CSS-HB/RAC-IC network in such configurations does not force Serviceguard to re-form
the cluster. If Serviceguard is not configured to monitor the network, Oracle will take at least the CSS
misscount time interval to resolve the network failure. The default value of CSS misscount in SGeRAC
configurations is 600 seconds.
To avoid longer recovery times, manage Oracle Clusterware and RAC-DB instances using
Serviceguard multi-node packages. In addition, configure the CLUSTER_INTERCONNECT_SUBNET
package configuration parameter (as done with a standard SUBNET package configuration
parameter) in the respective multi-node packages to monitor the CSS-HB/RAC-IC networks.
CFS and CVM are not supported on all versions of HP-UX; see About Veritas CFS and CVM from
Symantec (page 15).
CAUTION: Once you create the disk group and mount point packages, you must administer the
cluster with CFS commands, including cfsdgadm, cfsmntadm, cfsmount, and cfsumount.
You must not use the HP-UX mount or umount command to provide or remove access to a shared
file system in a CFS environment. Using these HP-UX commands under these circumstances is not
supported. Use cfsmount and cfsumount instead.
If you use the HP-UX mount and umount commands, serious problems could occur, such as writing
to the local file system instead of the cluster file system. Non-CFS commands could cause conflicts
with subsequent CFS command operations on the file system or the Serviceguard packages, and
will not create an appropriate multi-node package, which means cluster packages will not be
aware of file system changes.
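For example (the mount point is a hypothetical placeholder, not one defined in this guide), provide and remove access to a cluster file system with the CFS commands:
# cfsmount /cfs/oradata
# cfsumount /cfs/oradata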
NOTE: For specific CFS Serviceguard Storage Management Suite product information refer to
your version of the HP Serviceguard Storage Management Suite Release Notes.
Fill out the Veritas Volume worksheet to provide volume names for volumes that you will create
using the Veritas utilities. The Oracle DBA and the HP-UX system administrator should prepare this
worksheet together. Create entries for shared volumes only. For each volume, enter the full pathname
of the raw volume device file. Be sure to include the desired size in MB. Following are sample
worksheets filled out. Refer to Appendix B: Blank Planning Worksheets, for samples of blank
worksheets. Make as many copies as you need. Fill out the worksheet and keep it for future
reference.
ORACLE LOGICAL VOLUME WORKSHEET FOR LVM Page ___ of ____
===============================================================================
RAW LOGICAL VOLUME NAME SIZE (MB)
Oracle Cluster Registry: _____/dev/vg_rac/rora_ocr_____100___ (once per cluster)
Oracle Cluster Vote Disk: ____/dev/vg_rac/rora_vote_____20___ (once per cluster)
Oracle Control File: _____/dev/vg_rac/ropsctl1.ctl______110______
Oracle Control File 2: ___/dev/vg_rac/ropsctl2.ctl______110______
Oracle Control File 3: ___/dev/vg_rac/ropsctl3.ctl______110______
Instance 1 Redo Log 1: ___/dev/vg_rac/rops1log1.log_____120______
Instance 1 Redo Log 2: ___/dev/vg_rac/rops1log2.log_____120_______
Instance 1 Redo Log 3: ___/dev/vg_rac/rops1log3.log_____120_______
Instance 1 Redo Log: __________________________________________________
Instance 1 Redo Log: __________________________________________________
Instance 2 Redo Log 1: ___/dev/vg_rac/rops2log1.log____120________
Instance 2 Redo Log 2: ___/dev/vg_rac/rops2log2.log____120________
Instance 2 Redo Log 3: ___/dev/vg_rac/rops2log3.log____120________
Instance 2 Redo Log: _________________________________________________
Instance 2 Redo Log: __________________________________________________
Data: System ___/dev/vg_rac/ropssystem.dbf___500__________
Data: Sysaux ___/dev/vg_rac/ropssysaux.dbf___800__________
Data: Temp ___/dev/vg_rac/ropstemp.dbf______250_______
Data: Users ___/dev/vg_rac/ropsusers.dbf_____120_________
Data: User data ___/dev/vg_rac/ropsdata1.dbf_200__________
Data: User data ___/dev/vg_rac/ropsdata2.dbf__200__________
Data: User data ___/dev/vg_rac/ropsdata3.dbf__200__________
Parameter: spfile1 ___/dev/vg_rac/ropsspfile1.ora __5_____
Password: ______/dev/vg_rac/rpwdfile.ora__5_______
Instance 1 undotbs1: /dev/vg_rac/ropsundotbs1.dbf___500___
Instance 2 undotbs2: /dev/vg_rac/ropsundotbs2.dbf___500___
Data: example1__/dev/vg_rac/ropsexample1.dbf__________160____
To install Serviceguard Extension for RAC, use the following steps for each node:
NOTE: All nodes in the cluster must be either SGeRAC nodes or Serviceguard nodes. For
up-to-date version compatibility information for Serviceguard and HP-UX, see the SGeRAC release
notes for your version.
1. Mount the distribution media in the tape drive, CD, or DVD reader.
2. Run Software Distributor, using the swinstall command.
3. Specify the correct input device.
4. Choose the following bundle from the displayed list:
Serviceguard Extension for RAC
5.
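The interactive steps above can also be condensed into a single command sketch; the depot path /dvdrom is a hypothetical placeholder, and T1907BA is the SGeRAC bundle tag mentioned earlier in this document:
# swinstall -s /dvdrom T1907BA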
Veritas Cluster Volume Manager (CVM) and Cluster File System (CFS)
CVM and CFS (Cluster File System) are supported on some, but not all, current releases of HP-UX.
See the latest release notes for your version of Serviceguard at
www.hp.com/go/hpux-serviceguard-docs.
NOTE: The HP-UX 11i v3 I/O subsystem provides multipathing and load balancing by default.
This is often referred to as native multipathing. When CVM is installed on HP-UX 11i v3, DMP is
the only supported multipathing solution, and is enabled by default.
nomenclature. You are not required to migrate to agile addressing when you upgrade to 11i v3,
though you should seriously consider its advantages. It is possible, though not a best practice, to
have legacy DSFs on some nodes and agile addressing on others; this allows you to migrate the
names on different nodes at different times, if necessary.
NOTE:
cDSFs can be created for any group of nodes that you specify, provided that Serviceguard
A.11.20 is installed on each node.
Normally, the group should comprise the entire cluster.
cDSFs apply only to shared storage; they will not be generated for local storage, such as root,
boot, and swap devices.
Once you have created cDSFs for the cluster, HP-UX automatically creates new cDSFs when
you add shared storage.
HP recommends that you do not mix cDSFs with persistent (or legacy) DSFs in a volume group,
and you cannot use cmpreparestg(1m) on a volume group in which they are mixed.
For more information about cmpreparestg, see About Easy Deployment in the Managing
Serviceguard, Nineteenth Edition manual at www.hp.com/go/hpux-serviceguard-docs
> HP Serviceguard .
Limitations of cDSFs
cDSFs are supported only within a single cluster; you cannot define a cDSF group that crosses
cluster boundaries.
cDSFs are not supported by CVM, CFS, or any other application that assumes DSFs reside
only in /dev/disk and /dev/rdisk.
For more information about Cluster-wide Device Special Files (cDSFs), see the Managing
Serviceguard, Eighteenth Edition manual at www.hp.com/go/hpux-serviceguard-docs >
HP Serviceguard .
This parameter is used for CVM disk groups. Enter the names of all the
CVM disk groups the package will use.
In the ASCII package configuration file, this parameter is called
STORAGE_GROUP.
Unlike LVM volume groups, CVM disk groups are not entered in the cluster configuration file; they
are entered in the package configuration file.
NOTE: CVM 5.x or later with CFS does not use the STORAGE_GROUP parameter because the
disk group activation is performed by the multi-node package. CVM 5.x and later without CFS
uses the STORAGE_GROUP parameter in the ASCII package configuration file in order to activate
the disk group (on HP-UX releases that support Veritas CFS and CVM). See About Veritas CFS
and CVM from Symantec (page 15).
Do not enter the names of LVM volume groups in the package ASCII configuration file.
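As an illustration (the disk group name is a hypothetical placeholder), the corresponding entry in an ASCII package configuration file might look like this:
STORAGE_GROUP    dg_rac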
any network configured for Oracle cluster interconnect must also be configured as an SGeRAC
A.11.20 heartbeat network.
NOTE:
1. Do not configure Serviceguard heartbeat and Oracle cluster interconnect in mutually
exclusive networks.
2. Serviceguard standby interfaces must not be configured for the networks used for Serviceguard
heartbeat and Oracle cluster interconnect.
For Oracle Grid Infrastructure 11.2.0.2 HAIP feature to work properly in an SGeRAC cluster and
to have a resilient network configured for Serviceguard heartbeat and Oracle cluster interconnect,
use one of the following network configurations:
1. Configure one or more networks, each using one HP Auto Port Aggregation (APA) interface
for Oracle cluster interconnect and Serviceguard heartbeat. The APA interfaces must contain
two or more physical network interfaces in hot standby mode or in LAN monitor mode.
NOTE: APA interfaces on all the cluster nodes in the same network must use the same APA
mode (LAN monitor or hot standby). Because these two modes monitor and fail over differently
and do not communicate with each other, mixing them can cause unexpected failovers.
2. Configure two or more networks on different interfaces, without any standby interfaces, for both
Oracle cluster interconnect and Serviceguard heartbeat.
Each primary and standby pair protects against a single failure. With the SG-HB on more than
one subnet, a single subnet failure will not trigger a Serviceguard reconfiguration. If the subnet
with CSS-HB fails, unless subnet monitoring is used, CSS will resolve the interconnect subnet failure
with a CSS cluster reconfiguration. It will wait for the CSS misscount time interval before handling
the CSS-HB subnet failure (by bringing down the node on which the CSS-HB subnet has failed).
The default value of CSS misscount in SGeRAC configurations is 600 seconds.
As shown in Figure 8, CLUSTER_INTERCONNECT_SUBNET can be used in conjunction with the
NODE_FAIL_FAST_ENABLED package configuration parameter to monitor the CSS-HB network.
A failure of CSS-HB subnet on a node should be handled by bringing down that node. Therefore,
set NODE_FAIL_FAST_ENABLED to YES for the package monitoring the CSS-HB subnet. When
Oracle Clusterware is configured as a multi-node package and CLUSTER_INTERCONNECT_SUBNET
is used to monitor the CSS-HB subnet, a failure of the CSS-HB subnet on a node brings down both
the instance of the multi-node package and the node where the subnet has failed.
A failure of the CSS-HB subnet on all nodes will result in the multi-node package instances failing
one by one (bringing down each of those nodes), until one instance of the multi-node package and
one node remain to provide services to the clients.
Use a separate package to monitor only the CSS-HB subnet and have the Oracle Clusterware
multi-node package depend on the package monitoring the CSS-HB subnet. The
NODE_FAIL_FAST_ENABLED parameter is set to NO for the Oracle Clusterware package, and is
set to YES for the package monitoring the CSS-HB subnet (the Oracle Cluster Interconnect Subnet
Package, as shown in the package configuration parameter examples below).
NOTE: Do not configure CLUSTER_INTERCONNECT_SUBNET in the RAC instance package,
because the RAC-IC network is the same as the CSS-HB network.
The following is an example of the relevant package configuration parameters:
Oracle Clusterware Package:
PACKAGE_NAME                   CRS_PACKAGE
PACKAGE_TYPE                   MULTI_NODE
LOCAL_LAN_FAILOVER_ALLOWED     YES
NODE_FAIL_FAST_ENABLED         NO
DEPENDENCY_NAME                CI-PACKAGE
DEPENDENCY_CONDITION           CI-PACKAGE=UP
DEPENDENCY_LOCATION            SAME_NODE

Oracle Cluster Interconnect Subnet Package (the package that monitors the CSS-HB subnet):
PACKAGE_NAME                   CI-PACKAGE
PACKAGE_TYPE                   MULTI_NODE
LOCAL_LAN_FAILOVER_ALLOWED     YES
NODE_FAIL_FAST_ENABLED         YES
CLUSTER_INTERCONNECT_SUBNET    192.168.1.0
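A sketch of how the interconnect subnet package above might be checked and applied, assuming its parameters are saved in a file named ci_package.conf (the file name is an illustrative assumption):
# cmcheckconf -P ci_package.conf
# cmapplyconf -P ci_package.conf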
NOTE: For information on guidelines to change certain Oracle Clusterware and Serviceguard
cluster configuration parameters, see Guidelines for Changing Cluster Parameters (page 39).
As shown in Figure 9, each primary and standby pair protects against a single failure. If the subnet
with SG-HB (lan1/lan2) fails, Serviceguard will resolve the subnet failure with a Serviceguard
cluster reconfiguration. If the 192.168.2.0 subnet (lan3 and lan4) fails, Oracle instance membership
recovery (IMR) will resolve the interconnect subnet failure, unless Serviceguard subnet monitoring
is used. Oracle will wait for the IMR time interval before resolving the subnet failure. In SGeRAC
configurations, the default value of the IMR time interval may be as high as seventeen minutes.
CLUSTER_INTERCONNECT_SUBNET can be configured for the RAC instance MNP to monitor a
RAC-IC subnet that is different from the CSS-HB subnet. If the RAC instances use a RAC-IC network
different from the CSS-HB network, the parameter file (SPFILE or PFILE) for the RAC instances
must have the cluster_interconnects parameter defined to hold an IP address in the
appropriate subnet. No special subnet monitoring is needed for the CSS-HB subnet because
Serviceguard monitors the subnet (heartbeat) and will handle failures of the subnet.
The database instances that use 192.168.2.0 must have cluster_interconnects defined in
their SPFILE or PFILE as follows:
orcl1.cluster_interconnects=192.168.2.1
orcl2.cluster_interconnects=192.168.2.2
RAC Instance Package:
PACKAGE_NAME                   RAC_PACKAGE
PACKAGE_TYPE                   MULTI_NODE
LOCAL_LAN_FAILOVER_ALLOWED     YES
NODE_FAIL_FAST_ENABLED         NO
CLUSTER_INTERCONNECT_SUBNET    192.168.2.0
NOTE: For information on guidelines to change certain Oracle Clusterware and Serviceguard
cluster configuration parameters, see Guidelines for Changing Cluster Parameters (page 39)
When both SLVM and CVM/CFS are used, then take the max of the above two calculations.
NOTE:
1. F represents the Serviceguard failover time, as given by the
max_reformation_duration field of cmviewcl -v -f line output.
2. The SLVM timeout is documented in the whitepaper LVM Link and Node Failure Recovery Time.
A double switch failure resulting in the simultaneous failure of the CSS-HB subnet and the SG-HB
subnet on all nodes of a two-node cluster (assuming the CSS-HB subnet is different from the SG-HB
subnet): Serviceguard may choose to retain one node, while the failure handling of interconnect
subnets might choose to retain the other node to handle the CSS-HB network failure. As a result,
both nodes will go down.
NOTE: To reduce the risk of failure of multiple subnets simultaneously, each subnet must
have its own networking infrastructure (including networking switches).
A double switch failure resulting in the simultaneous failure of the CSS-HB subnet and the RAC-IC
network on all nodes may result in loss of services (assuming the CSS-HB subnet is different
from the RAC-IC network). The failure handling of interconnect subnets might choose to retain one
node for CSS-HB subnet failures and to retain the RAC instance on some other node for RAC-IC
subnet failures. Eventually, the database instance will not run on any node, because the database
instance depends on the clusterware running on that node.
The Event Monitoring Service HA Disk Monitor provides the capability to monitor the health of
LVM disks. If you intend to use this monitor for your mirrored disks, you should configure them in
physical volume groups. For more information, refer to the manual Using HA Monitors.
NOTE: LVM version 2.x volume groups are supported with Serviceguard. The steps shown in the
following section are for configuring LVM version 1.0 volume groups in Serviceguard clusters.
For more information on using and configuring LVM version 2.x, see the HP-UX 11i Version 3:
HP-UX System Administrator's Guide: Logical Volume Management located at www.hp.com/go/
hpux-core-docs > HP-UX 11i v3.
For LVM version 2.x compatibility requirements see the Serviceguard/SGeRAC/SMS/Serviceguard
Mgr Plug-in Compatibility and Feature Matrix at www.hp.com/go/hpux-serviceguard-docs
> HP Serviceguard Extension for RAC.
NOTE: For more information, see the Serviceguard Version A.11.20 Release Notes at
www.hp.com/go/hpux-serviceguard-docs > HP Serviceguard Extension for
RAC.
NOTE: The Oracle 11gR2 OUI allows only ASM over SLVM, ASM over raw device files, or Cluster
File System for Clusterware files and database files.
where hh must be unique to the volume group you are creating. Use the next hexadecimal
number that is available on your system, after the volume groups that are already configured.
Use the following command to display a list of existing volume groups:
# ls -l /dev/*/group
3.
Create the volume group and add physical volumes to it with the following commands:
# vgcreate -g bus0 /dev/vg_rac /dev/dsk/c1t2d0
# vgextend -g bus1 /dev/vg_rac /dev/dsk/c0t2d0
The first command creates the volume group and adds a physical volume to it in a physical
volume group called bus0. The second command adds the second drive to the volume group,
locating it in a different physical volume group named bus1. The use of physical volume
groups allows the use of PVG-strict mirroring of disks and PV links.
4.
Creating Mirrored Logical Volumes for RAC Redo Logs and Control Files
Create logical volumes for use as redo log and control files by selecting mirror consistency recovery.
Use the same options as in the following example:
# lvcreate -m 1 -M n -c y -s g -n redo1.log -L 408 /dev/vg_rac
-L 408 allocates 408 megabytes.
NOTE: Use the -M n and -c y options for both redo logs and control files. These options allow the redo
log files to be resynchronized by SLVM following a system crash before Oracle recovery proceeds.
If these options are not set correctly, you may not be able to continue with database recovery.
If the command is successful, the system will display messages like the following:
NOTE: With LVM 2.1 and above, mirror write cache (MWC) recovery can be set to ON for
RAC Redo Logs and Control Files volumes. Example:
# lvcreate -m 1 -M y -s g -n redo1.log -L 408 /dev/vg_rac
NOTE: The character device file name (also called the raw logical volume name) is used by the
Oracle DBA in building the RAC database.
If the command is successful, the system will display messages like the following:
NOTE: The character device file name (also called the raw logical volume name) is used by the
Oracle DBA in building the OPS database.
/dev/dsk/c0t15d0    /* I/O Channel 0 (8/0)   SCSI address 15  LUN 0 */
/dev/dsk/c0t15d1    /* I/O Channel 0 (8/0)   SCSI address 15  LUN 1 */
/dev/dsk/c0t15d2    /* I/O Channel 0 (8/0)   SCSI address 15  LUN 2 */
/dev/dsk/c0t15d3    /* I/O Channel 0 (8/0)   SCSI address 15  LUN 3 */
/dev/dsk/c0t15d4    /* I/O Channel 0 (8/0)   SCSI address 15  LUN 4 */
/dev/dsk/c0t15d5    /* I/O Channel 0 (8/0)   SCSI address 15  LUN 5 */

10/0.3.0    /dev/dsk/c1t3d0    /* I/O Channel 1 (10/0)  SCSI address 3  LUN 0 */
10/0.3.1    /dev/dsk/c1t3d1    /* I/O Channel 1 (10/0)  SCSI address 3  LUN 1 */
10/0.3.2    /dev/dsk/c1t3d2    /* I/O Channel 1 (10/0)  SCSI address 3  LUN 2 */
10/0.3.3    /dev/dsk/c1t3d3    /* I/O Channel 1 (10/0)  SCSI address 3  LUN 3 */
10/0.3.4    /dev/dsk/c1t3d4    /* I/O Channel 1 (10/0)  SCSI address 3  LUN 4 */
10/0.3.5    /dev/dsk/c1t3d5    /* I/O Channel 1 (10/0)  SCSI address 3  LUN 5 */
Assume that the disk array has been configured, and that both the following device files appear
for the same LUN (logical disk) when you run the ioscan command:
/dev/dsk/c0t15d0
/dev/dsk/c1t3d0
Use the following procedure to configure a volume group for this logical disk:
1. Set up the group directory for vg_rac:
# mkdir /dev/vg_rac
2.
Create a control file named group in the /dev/vg_rac directory:
# mknod /dev/vg_rac/group c 64 0xhh0000
The major number is always 64, and the hexadecimal minor number has the form
0xhh0000
where hh must be unique to the volume group you are creating. Use the next hexadecimal
number that is available on your system, after the volume groups that are already configured.
Use the following command to display a list of existing volume groups:
# ls -l /dev/*/group
3.
Use the pvcreate command on one of the device files associated with the LUN to define the
LUN to LVM as a physical volume.
# pvcreate -f /dev/rdsk/c0t15d0
It is only necessary to do this with one of the device file names for the LUN. The -f option is
only necessary if the physical volume was previously used in some other volume group.
4.
Use the following to create the volume group with the two links:
# vgcreate /dev/vg_rac /dev/dsk/c0t15d0 /dev/dsk/c1t3d0
LVM will now recognize the I/O channel represented by /dev/dsk/c0t15d0 as the primary link
to the disk. If the primary link fails, LVM will automatically switch to the alternate I/O channel
represented by /dev/dsk/c1t3d0. Use the vgextend command to add additional disks to the
volume group, specifying the appropriate physical volume name for each PV link.
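For example, to add a second LUN that is presented through both channels as /dev/dsk/c0t15d1
and /dev/dsk/c1t3d1 (illustrative device file names), both links can be added in a single command:
# vgextend /dev/vg_rac /dev/dsk/c0t15d1 /dev/dsk/c1t3d1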
Repeat the entire procedure for each distinct volume group you wish to create. For ease of system
administration, you may wish to use different volume groups to separate logs from data and control
files.
NOTE: The default maximum number of volume groups in HP-UX is 10. If you intend
to create enough new volume groups that the total exceeds 10, you must increase the maxvgs
system parameter and then reboot the system. The maxvgs parameter can be changed with the
kctune utility, or through the Kernel Configuration area of the system management tools (choose
Configurable Parameters; maxvgs appears on the list).
lvcreate -n ops1log1.log -L 4 /dev/vg_rac
lvcreate -n opsctl1.ctl -L 4 /dev/vg_rac
lvcreate -n system.dbf -L 28 /dev/vg_rac
lvcreate -n opsdata1.dbf -L 1000 /dev/vg_rac
Logical Volume Name    LV Size (MB)    Raw Device File Name              Oracle File Size (MB)*
opsctl1.ctl            118             /dev/vg_rac/ropsctl1.ctl          110
opsctl2.ctl            118             /dev/vg_rac/ropsctl2.ctl          110
opsctl3.ctl            118             /dev/vg_rac/ropsctl3.ctl          110
ops1log1.log           128             /dev/vg_rac/rops1log1.log         120
ops1log2.log           128             /dev/vg_rac/rops1log2.log         120
ops1log3.log           128             /dev/vg_rac/rops1log3.log         120
ops2log1.log           128             /dev/vg_rac/rops2log1.log         120
ops2log2.log           128             /dev/vg_rac/rops2log2.log         120
ops2log3.log           128             /dev/vg_rac/rops2log3.log         120
opssystem.dbf          408             /dev/vg_rac/ropssystem.dbf        400
opssysaux.dbf          808             /dev/vg_rac/ropssysaux.dbf        800
opstemp.dbf            258             /dev/vg_rac/ropstemp.dbf          250
opsusers.dbf           128             /dev/vg_rac/ropsusers.dbf         120
opsdata1.dbf           208             /dev/vg_rac/ropsdata1.dbf         200
opsdata2.dbf           208             /dev/vg_rac/ropsdata2.dbf         200
opsdata3.dbf           208             /dev/vg_rac/ropsdata3.dbf         200
opsspfile1.ora                         /dev/vg_rac/ropsspfile1.ora
pwdfile.ora                            /dev/vg_rac/rpwdfile.ora
opsundotbs1.dbf        508             /dev/vg_rac/ropsundotbs1.log      500
opsundotbs2.dbf        508             /dev/vg_rac/ropsundotbs2.log      500
example1.dbf           168             /dev/vg_rac/ropsexample1.dbf      160
The size of the logical volume is larger than the Oracle file size because Oracle needs extra space
to allocate a header in addition to the file's actual data capacity.
Create these files if you wish to build the demo database. The three logical volumes at the bottom
of the table are included as additional data files that you can create as needed, supplying the
appropriate sizes. If your naming conventions require, you can include the Oracle SID and/or the
database name to distinguish files for different instances and different databases. If you are using
the ORACLE_BASE directory structure, create symbolic links to the ORACLE_BASE files from the
appropriate directory. Example:
# ln -s /dev/vg_rac/ropsctl1.ctl /u01/ORACLE/db001/ctrl01_1.ctl
After creating these files, set the owner to oracle and the group to dba with a file mode of 660.
The logical volumes are now available on the primary node, and the raw logical volume names
can now be used by the Oracle DBA.
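For example, using the volume group from the examples above:
# chown oracle:dba /dev/vg_rac/r*
# chmod 660 /dev/vg_rac/r*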
1.
On ftsys9, use the vgexport command in preview mode, with the -s and -m options, to create a
map file for the volume group:
# vgexport -p -v -s -m /tmp/vg_rac.map /dev/vg_rac
2.
Still on ftsys9, copy the map file to ftsys10 (and to additional nodes as necessary):
# rcp /tmp/vg_rac.map ftsys10:/tmp/vg_rac.map
3.
On ftsys10 (and other nodes, as necessary), create the volume group directory and the
control file named group.
# mkdir /dev/vg_rac
# mknod /dev/vg_rac/group c 64 0xhh0000
For the group file, the major number is always 64, and the hexadecimal minor number has
the form
0xhh0000
where hh must be unique to the volume group you are creating. If possible, use the same
number as on ftsys9. Use the following command to display a list of existing volume groups:
# ls -l /dev/*/group
4.
Import the volume group data using the map file from node ftsys9. On node ftsys10 (and
other nodes, as necessary), enter:
# vgimport -s -m /tmp/vg_rac.map /dev/vg_rac
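For example, assuming the cluster is already running, the imported volume group is typically marked
as shareable once, from one node, and then activated in shared mode on each node where it will be
used (the exact sequence depends on your configuration):
# vgchange -c y /dev/vg_rac
# vgchange -a s /dev/vg_rac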
For more information, refer to your version of the Serviceguard Extension for RAC Release Notes
and HP Serviceguard Storage Management Suite Release Notes located at
www.hp.com/go/hpux-serviceguard-docs.
CAUTION: Once you create the disk group and mount point packages, you must administer the
cluster with CFS commands, including cfsdgadm, cfsmntadm, cfsmount, and cfsumount.
You must not use the HP-UX mount or umount command to provide or remove access to a shared
file system in a CFS environment. Using these HP-UX commands under these circumstances is not
supported. Use cfsmount and cfsumount instead.
If you use the HP-UX mount and umount commands, serious problems could occur, such as writing
to the local file system instead of the cluster file system. Non-CFS commands could cause conflicts
with subsequent CFS command operations on the file system or the Serviceguard packages, and
will not create an appropriate multi-node package; as a result, cluster packages will not be aware
of file system changes.
3.
4.
CLUSTER           STATUS
ever3_cluster     up

NODE              STATUS       STATE
ever3a            up           running
ever3b            up           running

5.
6.
7.
8.
9.
Create the disk group multi-node package. Use the following command to add the disk group
to the cluster:
# cfsdgadm add cfsdg1 all=sw
The following output will be displayed:
Package name SG-CFS-DG-1 was generated to control the resource.
Shared disk group cfsdg1 is associated with the cluster.
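The mount point packages can then be added and mounted with the CFS commands; for example,
assuming an illustrative volume named vol1 in cfsdg1 that is to be mounted at /cfs/mnt1 on all nodes:
# cfsmntadm add cfsdg1 vol1 /cfs/mnt1 all=rw
# cfsmount /cfs/mnt1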
CLUSTER           STATUS
ever3_cluster     up

NODE              STATUS       STATE
ever3a            up           running
ever3b            up           running

MULTI_NODE_PACKAGES

PACKAGE           STATUS       STATE       AUTO_RUN     SYSTEM
SG-CFS-pkg        up           running     enabled      yes
SG-CFS-DG-1       up           running     enabled      no
SG-CFS-MP-1       up           running     enabled      no
SG-CFS-MP-2       up           running     enabled      no
SG-CFS-MP-3       up           running     enabled      no
CAUTION: Once you create the disk group and mount point packages, you must administer the
cluster with CFS commands, including cfsdgadm, cfsmntadm, cfsmount, and cfsumount.
You must not use the HP-UX mount or umount command to provide or remove access to a shared
file system in a CFS environment. Using these HP-UX commands under these circumstances is not
supported. Use cfsmount and cfsumount instead.
If you use the HP-UX mount and umount commands, serious problems could occur, such as writing
to the local file system instead of the cluster file system. Non-CFS commands could cause conflicts
with subsequent CFS command operations on the file system or the Serviceguard packages, and
will not create an appropriate multi-node package; as a result, cluster packages will not be aware
of file system changes.
4.
De-configure CVM.
# cfscluster stop
The following output will be generated:
Stopping CVM...CVM is stopped
# cfscluster unconfig
The following output will be displayed:
CVM is now unconfigured
Preparing the Cluster and the System Multi-node Package for use with CVM 5.x or later
The following steps describe how to prepare the cluster and the system multi-node package with
CVM 5.x or later only.
1. Create the cluster file.
# cd /etc/cmcluster
# cmquerycl -C clm.asc -n ever3a -n ever3b
Edit the cluster file.
2.
CLUSTER           STATUS
ever3_cluster     up

NODE              STATUS       STATE
ever3a            up           running
ever3b            up           running

3.
that uses the volume group must be halted. This procedure is described in the Managing
Serviceguard Eighteenth Edition user guide Appendix G.
4.
5.
CLUSTER           STATUS
ever3_cluster     up

NODE              STATUS       STATE
ever3a            up           running
ever3b            up           running

MULTI_NODE_PACKAGES

PACKAGE           STATUS       STATE       AUTO_RUN     SYSTEM
SG-CFS-pkg        up           running     enabled      yes
IMPORTANT: After creating these files, use the vxedit command to change the ownership of
the raw volume files to oracle and the group membership to dba, and to change the permissions
to 660. Example:
# cd /dev/vx/rdsk/ops_dg
# vxedit -g ops_dg set user=oracle *
# vxedit -g ops_dg set group=dba *
# vxedit -g ops_dg set mode=660 *
The logical volumes are now available on the primary node, and the raw logical volume names
can now be used by the Oracle DBA.
CAUTION: Once you create the disk group and mount point packages, you must administer the
cluster with CFS commands, including cfsdgadm, cfsmntadm, cfsmount, and cfsumount.
You must not use the HP-UX mount or umount command to provide or remove access to a shared
file system in a CFS environment. Using these HP-UX commands under these circumstances is not
supported. Use cfsmount and cfsumount instead.
If you use the HP-UX mount and umount commands, serious problems could occur, such as writing
to the local file system instead of the cluster file system. Non-CFS commands could cause conflicts
with subsequent CFS command operations on the file system or the Serviceguard packages, and
will not create an appropriate multi-node package; as a result, cluster packages will not be aware
of file system changes.
Mirror Detachment Policies with CVM
The required CVM disk mirror detachment policy is "global", which means that as soon as one node
cannot see a specific mirror copy (plex), all nodes cannot see it as well. The alternate policy is
"local", which means that if one node cannot see a specific mirror copy, then CVM will deactivate
access to the volume for that node only.
This policy can be reset on a disk group basis by using the vxedit command, as follows:
# vxedit set diskdetpolicy=global <DiskGroupName>
NOTE: The specific commands for creating mirrored and multipath storage using CVM are
described in the HP-UX documentation for the Veritas Volume Manager.
To prepare the cluster for CVM disk group configuration, you need to ensure that only one heartbeat
subnet is configured. Then, use the following command, which creates the special package that
communicates cluster information to CVM:
# cmapplyconf -P /etc/cmcluster/cvm/VxVM-CVM-pkg.conf
WARNING!
After the above command completes, start the cluster and create disk groups for shared use as
described in the following sections.
Starting the Cluster and Identifying the Master Node
Run the cluster to activate the special CVM package:
# cmruncl
After the cluster is started, it will run with a special system multi-node package named
VxVM-CVM-pkg that is on all nodes. This package is shown in the following output of the cmviewcl
-v command:
CLUSTER        STATUS
bowls          up

NODE           STATUS       STATE
spare          up           running
split          up           running
strike         up           running

SYSTEM_MULTI_NODE_PACKAGES:

PACKAGE            STATUS      STATE
VxVM-CVM-pkg       up          running
When CVM starts up, it selects a master node. From this node, you must issue the disk group
configuration commands. To determine the master node, issue the following command from each
node in the cluster:
# vxdctl -c mode
One node will identify itself as the master. Create disk groups from this node.
Converting Disks from LVM to CVM
You can use the vxvmconvert utility to convert LVM volume groups into CVM disk groups. Before
you can do this, the volume group must be deactivated; any package that uses the volume group
must be halted. This procedure is described in the latest edition of the Managing Serviceguard
user guide, Appendix G.
Initializing Disks for CVM
You need to initialize the physical disks that will be employed in CVM disk groups. If a physical
disk has been previously used with LVM, you should use the pvremove command to delete the
LVM header data from all the disks in the volume group (this is not necessary if you have not
previously used the disk with LVM).
To initialize a disk for CVM, log on to the master node, then use the vxdiskadm program to
initialize multiple disks, or use the vxdisksetup command to initialize one disk at a time, as in
the following example:
# /usr/lib/vxvm/bin/vxdisksetup -i /dev/dsk/c0t3d2
Creating Disk Groups for RAC
Use the vxdg command to create disk groups. Use the -s option to specify shared mode, as in the
following example:
# vxdg -s init ops_dg c0t3d2
Verify the configuration with the following command:
# vxdg list
NAME        STATE                ID
rootdg      enabled              971995699.1025.node1
ops_dg      enabled,shared       972078742.1084.node2
Creating Volumes
Use the vxassist command to create logical volumes. The following is an example:
# vxassist -g ops_dg make log_files 1024m
This command creates a 1024MB volume named log_files in a disk group named ops_dg.
The volume can be referenced with the block device file /dev/vx/dsk/ops_dg/log_files
or the raw (character) device file /dev/vx/rdsk/ops_dg/log_files.
Verify the configuration with the following command:
# vxdg list
IMPORTANT: After creating these files, use the vxedit command to change the ownership of
the raw volume files to oracle and the group membership to dba, and to change the permissions
to 660. Example:
# cd /dev/vx/rdsk/ops_dg
# vxedit -g ops_dg set user=oracle *
# vxedit -g ops_dg set group=dba *
# vxedit -g ops_dg set mode=660 *
The logical volumes are now available on the primary node, and the raw logical volume names
can now be used by the Oracle DBA.
Logical Volume Name    LV Size (MB)    Raw Device File Name                 Oracle File Size (MB)*
opsctl1.ctl            118             /dev/vx/rdsk/ops_dg/opsctl1.ctl      110
opsctl2.ctl            118             /dev/vx/rdsk/ops_dg/opsctl2.ctl      110
opsctl3.ctl            118             /dev/vx/rdsk/ops_dg/opsctl3.ctl      110
ops1log1.log           128             /dev/vx/rdsk/ops_dg/ops1log1.log     120
ops1log2.log           128             /dev/vx/rdsk/ops_dg/ops1log2.log     120
ops1log3.log           128             /dev/vx/rdsk/ops_dg/ops1log3.log     120
ops2log1.log           128             /dev/vx/rdsk/ops_dg/ops2log1.log     120
ops2log2.log           128             /dev/vx/rdsk/ops_dg/ops2log2.log     120
ops2log3.log           128             /dev/vx/rdsk/ops_dg/ops2log3.log     120
opssystem.dbf          508             /dev/vx/rdsk/ops_dg/opssystem.dbf    500
opssysaux.dbf          808             /dev/vx/rdsk/ops_dg/opssysaux.dbf    800
opstemp.dbf            258             /dev/vx/rdsk/ops_dg/opstemp.dbf      250
opsusers.dbf           128             /dev/vx/rdsk/ops_dg/opsusers.dbf     120
opsdata1.dbf           208             /dev/vx/rdsk/ops_dg/opsdata1.dbf     200
opsdata2.dbf           208             /dev/vx/rdsk/ops_dg/opsdata2.dbf     200
opsdata3.dbf           208             /dev/vx/rdsk/ops_dg/opsdata3.dbf     200
opsspfile1.ora         508             /dev/vx/rdsk/ops_dg/opsspfile1.ora   500
opspwdfile.ora         508             /dev/vx/rdsk/ops_dg/opspwdfile.ora   500
opsundotbs1.dbf        508             /dev/vx/rdsk/ops_dg/opsundotbs1.dbf  500
opsundotbs2.dbf        508             /dev/vx/rdsk/ops_dg/opsundotbs2.dbf  500
opsexample1.dbf        168             /dev/vx/rdsk/ops_dg/opsexample1.dbf  160
Create these files if you wish to build the demo database. The three logical volumes at the bottom
of the table are included as additional data files that you can create as needed, supplying the
appropriate sizes. If your naming conventions require, you can include the Oracle SID and/or the
database name to distinguish files for different instances and different databases. If you are using
the ORACLE_BASE directory structure, create symbolic links to the ORACLE_BASE files from the
appropriate directory.
Example:
# ln -s /dev/vx/rdsk/ops_dg/opsctl1.ctl \
/u01/ORACLE/db001/ctrl01_1.ctl
Example:
1. Create an ASCII file, and define the path for each database object.
control1=/u01/ORACLE/db001/ctrl01_1.ctl
2.
Set the following environment variable where filename is the name of the ASCII file created.
# export DBCA_RAW_CONFIG=<full path>/filename
3.
4.
5.
6.
7.
Create Oracle base directory (for RAC binaries on local file system).
If installing RAC binaries on local file system, create the oracle base directory on each node.
# mkdir -p /mnt/app/oracle
# chown -R oracle:oinstall /mnt/app/oracle
# chmod -R 775 /mnt/app/oracle
# usermod -d /mnt/app/oracle oracle
9.
Create Oracle base directory (for RAC binaries on cluster file system).
If installing RAC binaries on Cluster File System, create the oracle base directory once, because
this is a CFS directory visible by all nodes. The CFS file system used is /cfs/mnt1.
# mkdir -p /cfs/mnt1/oracle
# chown -R oracle:oinstall /cfs/mnt1/oracle
# chmod -R 775 /cfs/mnt1/oracle
# chmod 775 /cfs
# chmod 775 /cfs/mnt1
Modify oracle user to use new home directory on each node.
# usermod -d /cfs/mnt1/oracle oracle
b.
Change permission and ownership of Oracle cluster software vote device and database
files.
# chown oracle:oinstall /dev/vg_rac/r*
# chmod 660 /dev/vg_rac/r*
c.
d.
Create raw device mapping file for Oracle Database Configuration Assistant.
In this example, the database name is ver10.
# ORACLE_BASE=/mnt/app/oracle; export ORACLE_BASE
# mkdir -p $ORACLE_BASE/oradata/ver10
# chown -R oracle:oinstall $ORACLE_BASE/oradata
# chmod -R 755 $ORACLE_BASE/oradata
In this example, create the DBCA mapping file and place it at
/mnt/app/oracle/oradata/ver10/ver10_raw.conf.
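For illustration, such a mapping file might contain entries of the following form. The object keywords
and raw logical volume paths shown here are illustrative only; they must match your database objects
and the raw logical volumes created earlier, and the keywords expected vary by Oracle release (see
the Oracle DBCA documentation):
system=/dev/vg_rac/ropssystem.dbf
sysaux=/dev/vg_rac/ropssysaux.dbf
undotbs1=/dev/vg_rac/ropsundotbs1.dbf
control1=/dev/vg_rac/ropsctl1.ctl
control2=/dev/vg_rac/ropsctl2.ctl
redo1_1=/dev/vg_rac/rops1log1.log
redo2_1=/dev/vg_rac/rops2log1.log
spfile=/dev/vg_rac/ropsspfile1.ora
pwdfile=/dev/vg_rac/rpwdfile.ora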
11. Prepare shared storage on CFS.
This section assumes the OCR, Vote device, and database files are created on CFS directories.
The OCR and vote device reside on /cfs/mnt3 and the demo database files reside on
/cfs/mnt2.
a. Create OCR and vote device directories on CFS.
Create OCR and vote device directories on Cluster File System. Run commands only on
one node.
# chmod 775 /cfs
# chmod 755 /cfs/mnt3
# cd /cfs/mnt3
# mkdir OCR
# chmod 755 OCR
# mkdir VOTE
# chmod 755 VOTE
# chown -R oracle:oinstall /cfs/mnt3
b.
NOTE: LVM version 2.x volume groups are supported with Serviceguard. The steps shown in the
following section are for configuring LVM version 1.0 volume groups in Serviceguard
clusters.
For more information on using and configuring LVM version 2.x, see the HP-UX System
Administrator's Guide: Logical Volume Management located at www.hp.com/go/
hpux-core-docs > HP-UX 11i v3.
If you are using SLVM for the vote device, specify the Vote Disk Location as /dev/vg_rac/
rora_vote.
If you are using CFS for the vote device, specify the Vote Disk Location as /cfs/mnt3/VOTE/
vote_file.
NOTE: During Oracle 10g/11gR1/11gR2 cluster configuration, Oracle gives the default
cluster name crs. This default name can be changed, using a combination of the following
characters: a-z, A-Z, 0-9, _, $, and #.
4.
5.
NOTE: The following procedure is only applicable up to Oracle 11gR1. If you prefer to use
SLVM storage for Oracle 11gR2, see Oracle Database 11g Release 2 Real Application Clusters
with SLVM/RAW on HP-UX Installation Cookbook http://h18006.www1.hp.com/storage/pdfs/
4AA2-7668ENW.pdf.
1.
Set up listeners with Oracle network configuration assistant (If the listeners are not configured.).
Use the Oracle network configuration assistant to configure the listeners with the following
command:
$ netca
2.
1.
Set up listeners with Oracle network configuration assistant (if the listeners are not configured).
Use the Oracle network configuration assistant to configure the listeners with the following
command:
$ netca
2.
2.
3.
4.
Start the cluster and Oracle database (if not already started).
Check that the Oracle instance is using the Oracle Disk Manager function:
# cat /dev/odm/stats
abort:           0
cancel:          0
commit:          18
create:          18
delete:          0
identify:        349
io:              12350590
reidentify:      78
resize:          0
unidentify:      203
mname:           0
vxctl:           0
vxvers:          10
io req:          9102431
io calls:        6911030
comp req:        73480659
comp calls:      5439560
io mor cmp:      461063
io zro cmp:      2330
cl receive:      66145
cl ident:        18
cl reserve:      8
cl delete:       1
cl resize:       0
cl same op:      0
cl opt idn:      0
cl opt rsv:      332
**********:      17
3.
4.
In the alert log, verify the Oracle instance is running. The log should contain output similar to
the following:
For CFS 4.1:
Oracle instance running with ODM: VERITAS 4.1 ODM Library, Version
1.1
For CFS 5.0.1:
Oracle instance running with ODM: VERITAS 5.1 ODM Library, Version
1.0
For CFS 5.1 SP1:
Oracle instance running with ODM: Veritas 5.1 ODM Library, Version
2.0
4.
$ rm libodm11.so
$ ln -s ${ORACLE_HOME}/lib/libodmd11.so ${ORACLE_HOME}/lib/libodm11.so
5.
On each node of the cluster, disable the automatic startup of the Oracle Clusterware at boot
time.
Login as root and enter:
# $ORA_CRS_HOME/bin/crsctl disable crs
(Check CRS logs or check for Oracle processes, ps -ef | grep ocssd.bin)
On one node of the cluster, disable the Oracle RAC database and instances from being started
automatically by the Oracle Clusterware.
Login as the Oracle administrator and run the following command to set the database
management policy to manual.
For Oracle 10g:
$ $ORACLE_HOME/bin/srvctl modify database -d <dbname> -y manual
For Oracle 11g:
$ $ORACLE_HOME/bin/srvctl modify database -d <dbname> -y MANUAL
Modify the package control script to set the CVM disk group to activate for shared write
and to specify the disk group.
CVM_DG[0]=ops_dg
DEPENDENCY_NAME          mp1
DEPENDENCY_CONDITION     SG-CFS-MP-1=UP
DEPENDENCY_LOCATION      SAME_NODE

DEPENDENCY_NAME          mp2
DEPENDENCY_CONDITION     SG-CFS-MP-2=UP
DEPENDENCY_LOCATION      SAME_NODE

DEPENDENCY_NAME          mp3
DEPENDENCY_CONDITION     SG-CFS-MP-3=UP
DEPENDENCY_LOCATION      SAME_NODE
Oracle single instance and RAC databases running in a pure Oracle Clusterware environment: Not Supported
for specific types of disk arrays. Other advantages of the "ASM-over-SLVM" configuration are as
follows:
ASM-over-SLVM ensures that the HP-UX devices used for disk group members will have the
same names (the names of logical volumes in SLVM volume groups) on all nodes, easing ASM
configuration.
ASM-over-SLVM protects ASM data against inadvertent overwrites from nodes inside/outside
the cluster. If the ASM disk group members are raw disks, there is no protection currently
preventing these disks from being incorporated into LVM or VxVM volume/disk groups.
Additional configuration and management tasks are imposed by the extra layer of volume
management (administration of volume groups, logical volumes, physical volumes).
There is a small performance impact from the extra layer of volume management.
SLVM has some restrictions in the area of online reconfiguration, the impact of which will be
examined later in this chapter.
Contiguous
The idea is that ASM provides the mirroring, striping, slicing, and dicing functionality as needed
and SLVM supplies the multipathing functionality not provided by ASM. Figure 13 indicates this
1-1 mapping between SLVM PVs and LVs used as ASM disk group members.
Further, the default retry behavior of SLVM could result in an I/O operation on an SLVM LV taking
an indefinitely long period of time. This behavior could impede ASM retry and rebalance
capabilities; hence, a finite timeout must be configured for each SLVM LV. For example, the timeout
could be configured to the value (total number of physical paths to the PV * PV timeout), providing
enough time for SLVM to try all available paths, if needed.
The PVs used in an ASM disk group can be organized into SLVM volume groups as desired by
the customer. In the example shown in Figure 13, for each ASM disk group, the PVs corresponding
to its members are organized into a separate SLVM volume group.
Figure 10 1-1 mapping between SLVM logical and physical volumes for ASM configuration
NOTE: If the LVM patch PHKL_36745 (or equivalent) is installed in the cluster, a timeout equal to
(2 * PV timeout) will suffice to try all paths.
The SLVM volume groups are marked as shared volume groups and exported across the SGeRAC
cluster using standard SGeRAC procedures. As noted above, multiple physical paths to each
physical volume should be configured using the LVM PV Links feature or a separate multipathing
product such as HP StorageWorks Secure Path.
Please note that, for the case in which the SLVM PVs being used by ASM are disk array LUs, the
requirements in this section do not place any constraints on the configuration of the LUs. The LUs
may be configured with striping, mirroring and other characteristics at the array level, following
guidelines provided by Oracle and the array provider for use by ASM.
pvcreate -f /dev/dsk/c9t0d1
pvcreate -f /dev/dsk/c9t0d2
mkdir /dev/vgora_asm
mknod /dev/vgora_asm/group c 64 0xhh0000
vgcreate /dev/vgora_asm /dev/dsk/c9t0d1
vgextend /dev/vgora_asm /dev/dsk/c9t0d2
2.
Extend each LV to the maximum size possible on that PV (the number of extents available
in a PV can be determined via vgdisplay -v <vgname>).
Null out the initial part of each LV to ensure ASM accepts the LV as an ASM disk group
member. Note that we are zeroing out the LV data area, not its metadata. It is the ASM
metadata that is being cleared. (See Oracle Metalink Doc ID: Note:268481.1, RE-CREATING ASM
INSTANCES AND DISKGROUPS, at https://metalink.oracle.com/; an Oracle MetaLink account is
required.)
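For example, under the assumption that the volume group is /dev/vgora_asm as created above, and
that the logical volume name asmlv1, the extent count, the timeout, and the dd transfer size are
illustrative values to be replaced with values appropriate to your configuration (take the free extent
count from vgdisplay -v), the step might look like this:
# lvcreate -n asmlv1 -C y /dev/vgora_asm
# lvextend -l 2900 /dev/vgora_asm/asmlv1 /dev/dsk/c9t0d1
# lvchange -t 60 /dev/vgora_asm/asmlv1
# dd if=/dev/zero of=/dev/vgora_asm/rasmlv1 bs=1024k count=100
The lvchange -t value corresponds to the finite LV timeout discussed earlier, and the dd command
clears the beginning of the LV data area so that ASM accepts the logical volume as a disk group
member.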
3.
Export the volume group across the SGeRAC cluster and mark it as shared, as specified by
SGeRAC documentation. Assign the right set of ownerships and access rights to the raw logical
volumes on each node as required by Oracle (oracle:dba and 0660, respectively).
We can now use the raw logical volume device names as disk group members when configuring
ASM disk groups using the Oracle database management utilities. There are a number of ways
of doing this described in Oracle ASM documentation, including the dbca database creation
wizard as well as sqlplus.
The same command sequence, with some modifications, can be used for adding new disks to an
already existing volume group that is being used by ASM to store one or more RAC databases.
If the database(s) should be up and running during the operation, we use the Single Node Online
volume Reconfiguration (SNOR) feature of SLVM.
Step 1 of the above sequence is modified as follows:
First, deactivate the volume group vgora_asm on all nodes but one, say node A. This requires
prior shutdown of the database(s) using ASM-managed storage and ASM itself, on all nodes
but node A. See the section ASM Halt is needed to ensure disconnect of ASM from SLVM
Volume Groups to understand why it is not adequate to shut down only the database(s) using
the volume group to be reconfigured, and why we must shut down ASM itself and therefore
all database(s) using ASM-managed storage, on all nodes but node A.
Next, on node A, switch the volume group to exclusive mode, using SNOR.
Initialize the disks to be added with pvcreate, and then extend the volume group with
vgextend.
Step 2 remains the same. Logical volumes are prepared for the new disks in the same way.
In step 3, switch the volume group back to shared mode, using SNOR, and export the VG across
the cluster, ensuring that the right ownership and access rights are assigned to the raw logical
volumes. Activate the volume group, and restart ASM and the database(s) using ASM-managed
storage on all nodes (they are already active on node A).
The following requirements/restrictions apply to SG/SGeRAC (A.11.17.01 or later) support
of ASM (summarized in Table 4):
Oracle single instance and RAC databases running in a pure Oracle Clusterware environment: Not Supported
ASM-over-SLVM ensures that the HP-UX devices used for disk group members will have the
same names (the names of logical volumes in SLVM volume groups) on all nodes, easing ASM
configuration.
ASM-over-SLVM protects ASM data against inadvertent overwrites from nodes inside/outside
the cluster. If the ASM disk group members are raw disks, there is no protection currently
preventing these disks from being incorporated into VxVM volume/disk groups.
Additional configuration and management tasks are imposed by the extra layer of volume
management (administration of volume groups, logical volumes, physical volumes).
There is a small performance impact from the extra layer of volume management.
SLVM has some restrictions in the area of online reconfiguration, the impact of which will be
examined later in this chapter.
Contiguous
The idea is that ASM provides the mirroring, striping, slicing, and dicing functionality as needed
and SLVM supplies the multipathing functionality not provided by ASM. Figure 14 indicates this
1-1 mapping between SLVM PVs and LVs used as ASM disk group members.
Further, the default retry behavior of SLVM could result in an I/O operation on an SLVM LV taking
an indefinitely long period of time. This behavior could impede ASM retry and rebalance
capabilities; hence a finite timeout must be configured for each SLVM LV. For example, the timeout
could be configured to the value (total number of physical paths to the PV * PV timeout), providing
enough time for SLVM to try all available paths, if needed.
The PVs used in an ASM disk group can be organized into SLVM volume groups as desired by
the customer. In the example shown in Figure 14, for each ASM disk group, the PVs corresponding
to its members are organized into a separate SLVM volume group.
Figure 11 1-1 mapping between SLVM logical and physical volumes for ASM configuration
The SLVM volume groups are marked as shared volume groups and exported across the SGeRAC
cluster using standard SGeRAC procedures.
Please note that, for the case in which the SLVM PVs being used by ASM are disk array LUs, the
requirements in this section do not place any constraints on the configuration of the LUs. The LUs
may be configured with striping, mirroring, and other characteristics at the array level, following
guidelines provided by Oracle and the array provider for use by ASM.
pvcreate -f /dev/rdisk/disk1
pvcreate -f /dev/rdisk/disk2
mkdir /dev/vgora_asm
mknod /dev/vgora_asm/group c 64 0xhh0000
vgcreate /dev/vgora_asm /dev/disk/disk1
vgextend /dev/vgora_asm /dev/disk/disk2
2.
Extend each LV to the maximum size possible on that PV (the number of extents available
in a PV can be determined via vgdisplay -v <vgname>)
Null out the initial part of each LV to ensure ASM accepts the LV as an ASM disk group
member. Note that we are zeroing out the LV data area, not its metadata. It is the ASM
metadata that is being cleared.
3.
Export the volume group across the SGeRAC cluster and mark it as shared, as specified by
SGeRAC documentation. Assign the right set of ownerships and access rights to the raw logical
volumes on each node as required by Oracle (oracle:dba and 0660, respectively).
We can now use the raw logical volume device names as disk group members when configuring
ASM disk groups using the Oracle database management utilities. There are a number of ways
of doing this described in Oracle ASM documentation, including the dbca database creation
wizard as well as sqlplus.
The same command sequence, with some modifications, can be used for adding new disks to an
already existing volume group that is being used by ASM to store one or more RAC databases.
If the database(s) should be up and running during the operation, we use the Single Node Online
volume Reconfiguration (SNOR) feature of SLVM.
Step 1 of the above sequence is modified as follows:
First, deactivate the volume group vgora_asm on all nodes but one, say node A. This requires
prior shutdown of the database(s) using ASM-managed storage and ASM itself, on all nodes
but node A. See the section ASM Halt is needed to ensure disconnect of ASM from SLVM Volume
Groups to understand why it is not adequate to shut down only the database(s) using the volume
group to be reconfigured, and why we must shut down ASM itself and therefore all database(s)
using ASM-managed storage, on all nodes but node A.
Next, on node A, switch the volume group to exclusive mode, using SNOR.
Initialize the disks to be added with pvcreate, and then extend the volume group with
vgextend.
Step 2 remains the same. Logical volumes are prepared for the new disks in the same way.
In step 3, switch the volume group back to shared mode, using SNOR, and export the VG across
the cluster, ensuring that the right ownership and access rights are assigned to the raw logical
volumes. Activate the volume group, and restart ASM and the database(s) using ASM-managed
storage on all nodes (they are already active on node A).
or later) to support ASM on raw disks/disk array LUs. In HP-UX 11i v3, a new DSF (device special
file) format is introduced. SGeRAC supports the DSF format that ASM supports, with the restriction
that the native multipathing feature is enabled.
The advantages for ASM-over-raw are as follows:
There is a small performance improvement from one less layer of volume management.
Online disk management (adding disks, deleting disks) is supported with ASM-over-raw.
The HP-UX devices (raw disks/disk array LUs) used for disk group members might not have the
same names on all nodes.
There is no protection to prevent the raw disks from being incorporated into VxVM volume/disk
groups.
Configure Raw Disks/Disk Array Logical Units for ASM Disk Group
Oracle provides instructions on how to configure disks for ASM where the member disks are raw
logical volumes. The instructions to configure raw disks/disk LUs are the following:
For Oracle 10g R2, refer to Oracle Database Installation Guide 10g Release
2 for HP-UX Itanium, Chapter 2, Preinstallation Tasks, section Preparing Disk Group
for an Automatic Storage Management Installation.
For 11g R1, refer to Oracle Clusterware Installation Guide 11g Release 1
(11.1) for HP-UX, Chapter 5, Configuring Oracle Real Application Clusters Storage,
section Configuring Disks for Automatic Storage Management.
Then, these raw devices can be used as disk group members to configure ASM disk group members
using Oracle database management utilities.
ASM Halt is needed to ensure disconnect of ASM from SLVM Volume Groups
This section is specific to ASM-over-SLVM only.
When an ASM disk group is dismounted on a node in the SGeRAC cluster, there is no guarantee
that processes in the ASM instance on that node and client processes of the ASM instance will
close their open file descriptors for the raw volumes underlying the members of that ASM disk
group.
Consider a configuration in which there are multiple RAC databases using ASM to manage their
storage in an SGeRAC cluster. Assume each database stores its data in its own exclusive set of
ASM disk groups.
If we shut down the database instance for a specific RAC database on a node, and then dismount
its ASM disk groups on that node, some Oracle processes may still hold open file descriptors to
the underlying raw logical volumes. Hence, an attempt at this point to deactivate the corresponding
SLVM volume group(s) on the node may fail. The only way to ensure success of the deactivation
of the volume groups is to first shut down the ASM instance and its clients (including all databases
that use ASM based storage) on that node.
The major implications of this behavior include the following:
Many SGeRAC customers use SGeRAC packages to start and shut down Oracle RAC instances.
In the startup and shutdown sequences, the package scripts activate and deactivate the SLVM
volume groups used by the instance.
For the ASM environment, it is not appropriate to include SLVM volume group activation and
deactivation in the database instance package control script, since the deactivation may fail.
In the SGeRAC Toolkit that is MNP/Simple Dependency-based, the SLVM volume groups
underlying ASM disk groups are managed instead from the package control script for Oracle
Clusterware.
When there are multiple RAC databases using ASM to manage their storage in the cluster,
online reconfiguration of SLVM volume groups for one database will impact the others, even
if each database is configured with its own set of ASM disk groups and underlying SLVM
volume groups.
An SLVM volume group is reconfigured online, using the SLVM SNOR feature. This procedure
requires the volume group to be deactivated on all nodes but one. This in turn requires shutting
down the ASM instance and all client database instances on all nodes but one.
However, note that many storage reconfiguration operations can be confined to the ASM
layer. For example, if a new file has to be created in a disk group, there will be no SLVM
operation required, if the disk group has adequate free space. Adding disks to, and deleting
disks from, ASM disk groups may require SLVM reconfiguration (online disk addition is
discussed in the section).
It is therefore recommended, when a physical volume is added for later use by
ASM, that the corresponding logical volume be created at the same time, to avoid the potential
impact of a future online SLVM reconfiguration operation to create the logical volume. Note that,
when adding one or more physical volumes in an SLVM configuration, one can avoid an online
SLVM reconfiguration operation by creating a new volume group for the physical volume(s).
Oracle Clusterware Installation Guide 11g Release 1 (11.1) for HP-UX at www.oracle.com/
pls/db111/portal.portal_db?selected=11&frame= HP-UX Installation Guides Clusterware
Installation Guide for HP-UX
www.oracle.com/technology/products/database/asm/index.html
The coordination issues, pertaining to the combined stack, that the toolkit addresses.
The discussion of the SGeRAC Toolkit begins with the reasons why the Toolkit uses the SG/SGeRAC
multi-node package and simple package dependency features. Next, there is an outline of the flow
of control during startup and shutdown of the combined stack using the Toolkit. This is followed by
a description of how the Toolkit interacts both with Oracle RAC and with the storage management
subsystems. Then, the Toolkit internal file structure is discussed. Lastly, the Toolkit benefits are listed.
HP recommends using the SGeRAC Toolkit for the following reasons:
The SGeRAC Toolkit ensures the following:
The RAC database will not run unless Oracle CRS is running.
The CRS package will not try to come up before its required dependency packages, such as the
SG CFS mount point and disk group packages, come up. That is how the SGeRAC Toolkit enforces
the proper dependencies.
The storage needed for the operation of Oracle Clusterware is activated before Oracle
Clusterware processes are started.
The storage needed for the operation of a RAC database instance is activated before the RAC
database instance is started.
Oracle Clusterware and the RAC database instance are halted before deactivating the storage
needed by these two entities, while halting these packages or SGeRAC nodes.
Background
Coordinating the Oracle RAC/Serviceguard Extension for RAC stack
The Oracle 10g and later database server offers a built-in feature called Oracle Clusterware which
builds highly available RAC and single instance databases in clustered configurations. Since the
release of Oracle 10g, HP has recommended a combined SGeRAC/Oracle Clusterware
configuration for RAC deployments on HP-UX 11i. In this combined environment, the responsibilities
of SGeRAC include the following:
Provide cluster membership information to the Oracle Clusterware CSS (Cluster Synchronization
Service) daemon.
Provide clustered storage to meet the needs of Oracle Clusterware and RAC database instances.
The Oracle Clusterware quorum voting and registry devices can be configured as shared raw
logical volumes managed by SGeRAC using Shared Logical Volume Manager (SLVM) or
Cluster Volume Manager (CVM) or, beginning with SGeRAC A.11.17, as shared files managed
by SGeRAC using the Cluster File System (CFS). Beginning with Oracle 11gR2 release, Oracle
Clusterware voting and registry devices can also be configured using oracle ASM (Automatic
Storage Management) disk groups. The members of disk groups are configured as raw devices
(on HP-UX 11i v3). Oracle 11gR2 is supported only on HP-UX 11i v3 (11.31) with SGeRAC
A.11.19 or later.
The RAC database files can be configured as shared raw logical volumes managed by SGeRAC
using SLVM or CVM. Beginning with SGeRAC A.11.17, the RAC database files may be
configured as shared files managed by SGeRAC using CFS. Also, beginning with Oracle 10g
R2 and SGeRAC A.11.17, the RAC database files may also be configured as files in Oracle
ASM (Automatic Storage Management) Disk Groups. The members of the ASM Disk Groups
are configured as raw devices.
The responsibilities of Oracle Clusterware in this combined environment include the following:
Management of the database and associated resources (database instances, services, virtual
IP addresses (VIPs), listeners, etc.).
All pieces of the combined stack must start up and shut down in the proper sequence, and we need
to be able to automate the startup and shutdown sequences, if desired. In particular, the storage
needed for the operation of Oracle Clusterware must be activated before the Oracle Clusterware
processes are started, and the storage needed for the operation of a RAC database instance must
be activated before the instance is started. On shutdown, the sequence is reversed: Oracle
Clusterware and the RAC database instance must be halted before deactivating the storage needed
by these two entities.
Traditionally, in the SG and SGeRAC environment, these ordering requirements have been met
using a package to encapsulate the startup and shutdown of an application as well as the startup
and shutdown of storage needed by that application. In SG and SGeRAC, a different model is
introduced for the case where the storage needs of an application are met by using a CFS. Here
the CFS is started up and shut down in a separate package from the one that starts up and shuts
down the application. Beginning with patches PHSS_40885/PHSS_40886, SGeRAC introduced a
new MNP into the existing SGeRAC Toolkit, namely the ASMDG MNP. In SGeRAC, a different model is
recommended for the case where the storage needs of an application are met by using Oracle
ASM. Here ASM disk groups are mounted, dismounted, and monitored using a separate MNP
package. It also activates and deactivates the shared volume groups needed by the ASM disk
groups if ASM is configured over SLVM. The ordering requirement is met by using the SGeRAC
feature of simple package dependencies, discussed later in this chapter.
Can we manage the storage needs of Oracle Clusterware and RAC database instances in Oracle
RAC, using SGeRAC packages in the ways just discussed? Starting in Oracle 10.1.0.4, Oracle
made the following improvements in coordination between Oracle Clusterware and platform
clusterware, enabling such use of SGeRAC packages.
Support for on-demand startup and shutdown of Oracle Clusterware and RAC database instances
In addition to starting up and shutting down Oracle Clusterware automatically as HP-UX 11i
is taken up to init level 3 and taken down to a lower level respectively, Oracle Clusterware
can be started up and shut down on demand.
To disable the automatic startup of Oracle Clusterware on entering init level 3, we use the
crsctl disable crs command. Oracle Clusterware may thereafter be started up and
shut down on demand using the commands crsctl start crs and crsctl stop crs
respectively.
In addition to starting up and shutting down the RAC database instance automatically as
Oracle Clusterware itself is started up and shut down, we can start up and shut down the RAC
database instance on demand.
To disable the automatic startup of the RAC database instance with the startup of Oracle
Clusterware, we follow the procedures described by Oracle to remove auto-start for the
instance. The RAC database instance may thereafter be started up and shut down on demand
by, for example, using the command srvctl start instance... and srvctl stop
instance... respectively.
NOTE: The above-mentioned steps are mandatory prerequisite steps to be performed before
you configure the SGeRAC Toolkit CRS, ASMDG (if the storage is ASM over SLVM), and RAC MNPs.
Support for invocation of Oracle Clusterware commands from customer-developed scripts
This includes invocation of such commands from SGeRAC package control scripts or module scripts;
therefore, SGeRAC packages can invoke commands to start up and shutdown Oracle Clusterware
and/or RAC database instances.
With these improvements, it became possible, using SGeRAC packages, to meet the sequencing
requirements mentioned above for the startup and shutdown of Oracle Clusterware and RAC
database instances with respect to the SGeRAC-managed storage used by these entities.
Each instance of an MNP behaves as a normal package but does not failover.
Failures of package components such as services, EMS resources or subnets, will cause the
MNP to be halted only on the node on which the failure occurred. Unlike failover packages,
when an instance of an MNP fails, only node switching is set to disabled, rather than the
global AUTO_RUN.
cmviewcl has been enhanced to show the overall package status as a summary of the
status of package instances. If not all the configured instances of an MNP are running,
a qualifier is added to the STATUS field of the form (<running>/<configured>) where
<running> indicates the number of instances running and <configured> indicates
the total number of configured instances. The status of each instance can also be displayed.
Simple package dependencies are used to describe the dependency relationship between packages.
To configure package A with a simple package dependency on package B, we set the following
three attributes of package A:
DEPENDENCY_NAME: Each dependency must have a unique name within the package, which
is used to identify it.
DEPENDENCY_CONDITION: The condition that must be satisfied; for a simple dependency this
takes the form <package B name>=UP.
DEPENDENCY_LOCATION: Describes where the condition must be satisfied. For SGeRAC, the
only possible value for this attribute is SAME_NODE.
cmrunpkg will fail if the user attempts to start a package that has a dependency on another
package that is not running. The package manager will not attempt to start a package if its
dependencies are not met. If multiple packages are specified to cmrunpkg, they will be
started in dependency order. If the AUTO_RUN attribute is set to YES, the package manager
will start the packages automatically in dependency order.
cmhaltpkg will fail if the user attempts to halt a package that has another package depending
on it that is still running. If multiple packages are specified to cmhaltpkg, they will be halted
in dependency order. During cmhaltcl or cmhaltnode, the package manager will halt
packages in dependency order.
The output of cmviewcl shows the current state of each dependency on each node where
the package is configured.
Why use multi-node packages/simple package dependencies for Oracle RAC integration
An example of a bottleneck created if we only have a package for Oracle Clusterware is this: if
we concentrate all storage management in the Oracle Clusterware package, then any time there
is a change in the storage configuration for one database (for example, an SLVM volume group
is added), we would have to modify the Oracle Clusterware package.
These are the main arguments in favor of having separate packages for Oracle Clusterware and
each RAC database. But then the question arises: how do we ensure that these packages start and
halt in the proper order? Prior to version A.11.17, SG/SGeRAC did not provide a mechanism for
package ordering.
Two new features were introduced in SG/SGeRAC that help us solve this problem: MNPs and simple
package dependencies. The combination of MNPs and simple package dependencies is a very
good fit for our problem of coordinating Oracle RAC and SGeRAC. We configure Oracle
Clusterware as one MNP and each RAC database as another MNP, and we set up the database
MNPs to depend on the Oracle Clusterware MNP. This is the core concept of the SGeRAC Toolkit.
Both Oracle Clusterware and the RAC database are multi-instance applications well suited to being
configured as MNPs. Further, the use of MNPs reduces the total package count and simplifies
SGeRAC package configuration and administration. Simple package dependencies enable us to
enforce the correct start/stop order between the Oracle Clusterware MNP and RAC database
MNPs.
Figure 12 Resources managed by SGeRAC and Oracle Clusterware and their dependencies
Next, SGeRAC package manager shuts down Oracle Clusterware via the Oracle Clusterware
MNP, followed by the storage needed by Oracle Clusterware (this requires subsequent shutdown
of mount point and disk group MNPs in the case of the storage needed by Oracle Clusterware
being managed by CFS). It can do this since the dependent RAC database instance MNP is already
down. Before shutting itself down, Oracle Clusterware shuts down the ASM instance if configured,
and then the node applications. Lastly, SGeRAC itself shuts down.
Note that the stack can be brought up or down manually, package by package, by using
cmrunpkg/cmhaltpkg in the proper dependency order. To disable (partially or wholly) automatic
startup of the stack when a node joins the cluster, the AUTO_RUN attribute should be set to NO on
the packages that should not automatically be started.
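For example, with illustrative package names rac1-mnp (a RAC database MNP) and crs-mnp (the
Oracle Clusterware MNP), the stack can be halted manually in dependency order:
# cmhaltpkg rac1-mnp
# cmhaltpkg crs-mnp
and brought back up with:
# cmrunpkg crs-mnp
# cmrunpkg rac1-mnp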
How Serviceguard Extension for RAC starts, stops and checks Oracle Clusterware
Having discussed how the toolkit manages the overall control flow of the combined stack during
startup and shutdown, we will now discuss how the toolkit interacts with Oracle Clusterware and
RAC database instances. We begin with the toolkit interaction with Oracle Clusterware.
The MNP for Oracle Clusterware provides start and stop functions for Oracle Clusterware and has
a service for checking the status of Oracle Clusterware.
The start function starts Oracle Clusterware using crsctl start crs. To ensure successful
startup of Oracle Clusterware, the function, every 10 seconds, runs crsctl check until the
command output indicates that the CSS, CRS, and EVM daemons are healthy. If Oracle Clusterware
does not start up successfully, the start function will execute the loop until the package start timer
expires, causing SGeRAC to fail the instance of the Oracle Clusterware MNP on that node.
The stop function stops Oracle Clusterware using crsctl stop crs. Then, every 10 seconds,
it runs ps until the command output indicates that the processes called evmd.bin, crsd.bin,
and ocssd.bin no longer exist.
The check function runs ps to determine process id of the process called ocssd.bin. Then, in a
continuous loop driven by a configurable timer, it uses kill -s 0 to check if this process exists.
The other daemons are restarted by Oracle Clusterware, so they are not checked.
When the Oracle Clusterware MNP is in maintenance mode, the check function pauses the Oracle
Clusterware health checking. Otherwise, if the check function finds that the process has died, it
means that Oracle Clusterware has either failed or been inappropriately shut down without using
cmhaltpkg. The service that invokes the function fails at this point, and the SGeRAC package
manager fails the corresponding Oracle Clusterware MNP instance.
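For illustration only, a simplified sketch of such a check loop is shown below; the interval value and
the script details are assumptions and differ from the actual toolkit scripts:
#!/usr/bin/sh
# Simplified health-check sketch: watch the ocssd.bin process.
CHECK_INTERVAL=30                                   # assumed configurable timer (seconds)
PID=$(ps -ef | awk '/[o]cssd\.bin/ {print $2; exit}')
[ -z "$PID" ] && exit 1                             # ocssd.bin not running: fail the service
while kill -s 0 "$PID" 2>/dev/null
do
    sleep $CHECK_INTERVAL                           # pause between checks
done
exit 1                                              # process died: service fails, MNP instance fails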
How Serviceguard Extension for RAC Mounts, dismounts and checks ASM disk groups
We discuss the toolkit interaction with the ASM disk groups.
The MNP for the ASM disk groups that are needed by the RAC database provides mount and dismount
functions for the ASM disk groups and has a service for checking whether those ASM disk groups
are mounted.
The start function executes su to the Oracle software owner user id. It then determines the ASM
instance id on the current node for the specified disk group using crsctl status resource
ora.asm. The id is stored in a variable and used for future reference. Then it mounts the ASM disk
groups mentioned in that ASMDG MNP by connecting to the ASM instance using sqlplus.
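For illustration only (not the actual toolkit code), mounting a disk group named DATA from the node's
ASM instance, here assumed to be +ASM1, might look like the following; on Oracle 10g the
connection role would be SYSDBA rather than SYSASM:
$ export ORACLE_SID=+ASM1
$ sqlplus -s / as sysasm <<EOF
ALTER DISKGROUP DATA MOUNT;
EOF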
The stop function executes su to the Oracle software owner user ID. It dismounts the ASM disk
groups specified in that ASMDG MNP by connecting to the ASM instance via sqlplus.
The check function determines the status of the ASM disk groups that are specified in the ASMDG
MNP. When the ASMDG MNP is in maintenance mode, the ASM disk group status checking is paused.
Otherwise, in a continuous loop driven by a configurable timer, the check function monitors the
status of the ASM disk groups specified in that ASMDG MNP. If one or more ASM disk groups are in
a dismounted state, the check function reports a failure, meaning that the ASM disk group was
dismounted without using cmhaltpkg. The service that invokes the function fails at this point,
and the SGeRAC package manager fails the corresponding ASMDG MNP and the RAC MNP that is
dependent on the ASMDG MNP.
How Serviceguard Extension for RAC Toolkit starts, stops, and checks the RAC
database instance
Next, the toolkit interaction with the RAC database is discussed.
The MNP for the RAC database instance provides start and stop functions for the RAC database
instance and has a service for checking the status of the RAC database instance.
The start function executes su to the Oracle software owner user ID. It then determines the Oracle
instance ID on the current node for the specified database using srvctl status database.
Then it starts the corresponding RAC database instance using srvctl start instance. If an Oracle
Clusterware placement error occurs, indicating that CRS is not ready to start the instance, the
function sleeps for 2 minutes and then retries. At most 3 attempts are made to start the instance.
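A sketch of that retry behavior is shown below; the database name ORCL is a placeholder, the
instance name is derived from the srvctl status output for the local node, and the loop bounds
mirror the description above rather than the toolkit's actual script.

# Run as the Oracle software owner.
DB=ORCL
# e.g. "Instance ORCL1 is running on node node1" -> the instance name is the second field.
INST=$($ORACLE_HOME/bin/srvctl status database -d $DB | grep "node $(hostname)" | awk '{print $2}')

attempt=1
while [ $attempt -le 3 ]
do
    if $ORACLE_HOME/bin/srvctl start instance -d $DB -i $INST
    then
        break                                  # instance started successfully
    fi
    # Likely a CRS placement error; wait and retry.
    sleep 120
    attempt=$((attempt + 1))
done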
The stop function executes su to the Oracle software owner user ID. It then determines the Oracle
instance ID on the current node for the specified database using srvctl status database.
Then it stops the corresponding Oracle RAC instance using srvctl stop instance. If the
user-configurable parameter STOP_MODE is abort and the Oracle RAC instance is not halted by the
srvctl command within ORA_SHUTDOWN_TIMEOUT seconds, the Oracle RAC instance is terminated by
killing its background processes.
The check function executes the ps and crs_stat commands to determine the health of the RAC instance.
When the Oracle database instance MNP is in maintenance mode, the RAC instance health
checking is paused. Otherwise, in a continuous loop driven by a configurable timer, the check
function runs ps to check the number of the monitored RAC instance background processes. If one
or more RAC background processes are gone and crs_stat command shows Oracle Clusterware
has not restarted the Oracle RAC instance, the function will report the RAC instance as down. This
means that the RAC instance failed or has been inappropriately shut down without using
cmhaltpkg. The service that invokes the function fails at this point and the SGeRAC package
manager fails the corresponding RAC database MNP instance.
How Serviceguard Extension for RAC Toolkit interacts with storage management
subsystems
The core concept of the Toolkit, namely configuring an MNP for Oracle Clusterware and for each
RAC database and configuring a dependency of each RAC database MNP on the Oracle
Clusterware MNP, holds true across the following storage management options supported by
SGeRAC: SLVM, CVM, ASM over raw device (on HP-UX 11i v3), and CFS. That simple dependency
is not sufficient on its own when ASM over SLVM is used as a storage option for RAC databases.
Beginning with the SGeRAC A.11.19 patches PHSS_40885 (11i v2) and PHSS_40886 (11i v3), the
SGeRAC toolkit introduces a new ASMDG MNP package to decouple ASM disk group management
from the OC MNP. In previous toolkit versions, RAC database shared volume groups used for ASM
disk groups were defined in the OC MNP. The storage management option deployed will, however,
have some impact on the configuration of the toolkit.
In the case of SLVM or CVM, the Oracle Clusterware quorum and voting disk and the RAC database
files are stored in raw logical volumes managed by SLVM or CVM. The management of SLVM or CVM
storage for Oracle Clusterware and the database is specified in the package configuration of the
respective MNPs.
In the case of CFS, Oracle Clusterware quorum and registry device data is stored in files in a
CFS. Oracle database files are also stored in a CFS.
For each CFS used by Oracle Clusterware for its quorum and registry device data, there will be
a dependency configured from the Oracle Clusterware MNP to the mount point MNP corresponding
to that CFS. The mount point MNP has a dependency on the CFS system MNP (SMNP).
Similarly, for each CFS used by the RAC database for database files, there will be a dependency
configured from the RAC database MNP to the mount point MNP corresponding to that CFS. Only
the Oracle Clusterware and RAC DB Instance MNPs, and their dependencies, in this use case are
to be configured and managed as described in this chapter. The rest of the multi-node packages
shown for this use case are created via the CFS subsystem. Configuration and administration
procedures for these MNPs are specified in the SGeRAC user guide.
The above diagram can be considered as one use case. Here we have one Oracle Clusterware
MNP, three ASMDG MNPs, and four RAC database MNPs. All the ASMDG MNPs should be made
dependent on the Oracle Clusterware MNP. Disk groups that are exclusively used by a RAC database
should be managed in a separate ASMDG MNP. If different RAC databases use different ASM disk
groups, those ASM disk groups should not be configured in a single ASMDG MNP. As RAC DB Instance
MNP 3 and RAC DB Instance MNP 4 use completely different ASM disk groups, they are made
dependent on their respective ASMDG MNPs (ASMDG MNP 2 and ASMDG MNP 3). However, if two
RAC databases use the same set of ASM disk groups, those disk groups can be configured in a single
ASMDG MNP, and both RAC MNPs are made dependent on that ASMDG MNP. RAC DB Instance MNP 1 and
RAC DB Instance MNP 2 use the same set of ASM disk groups, so both MNPs are made dependent on
ASMDG MNP 1.
3. The user can maintain the Oracle ASM disk groups on that node while the Oracle ASMDG MNP
package is still running.
4. After the maintenance work is completed, the user can remove the asm_dg.debug file created
in step 2 to bring the Oracle ASMDG MNP package out of maintenance mode and resume normal
monitoring by Serviceguard. The maintenance mode message will appear in the Oracle ASMDG MNP
package log files, e.g. Starting ASM DG MNP checking again after maintenance.
1. Make sure the MAINTENANCE_FLAG parameter for the Oracle database instance MNP is set to
yes when the package is created. If not, shut the package down first, set MAINTENANCE_FLAG
to yes, and then restart the MNP.
2. On the maintenance node, create a debug file called rac.debug in the Oracle database
instance MNP working directory. The Oracle database instance MNP on this node will go
into maintenance mode. The maintenance mode message will appear in the Oracle database
instance package log files, e.g. RAC MNP pausing RAC instance checking and entering
maintenance mode.
3. The user can maintain the Oracle database instance on that node while the Oracle database
instance package is still running.
4. After the maintenance work is completed, the user can remove the rac.debug file created in
step 2 to bring the Oracle database instance package out of maintenance mode and resume
normal monitoring by Serviceguard. The maintenance mode message will appear in the Oracle
database instance package log files, e.g. Starting RAC MNP checking again after
maintenance.
PREFACE:
This README file describes the SGeRAC Toolkit which enables integration of
Oracle Real Application Clusters (RAC) with HP Serviceguard Extension for
Real Application Clusters (SGeRAC). This document covers the Toolkit file
structure, files, configuration parameters and procedures, administration,
supported configurations, known problems and workarounds, and the support
restrictions of this version of the Toolkit. A link to a whitepaper on the same
topic is also included.
This document assumes that its readers are already familiar with Serviceguard
(SG), SGeRAC, and Oracle RAC, including installation and configuration
procedures.
This version of the SGeRAC Toolkit supports Oracle 10g Release 2, 11g Release 1,
and 11g Release 2 versions of RAC only.
A. Overview
Oracle 10g and later database server software offers a built-in feature called Oracle
Clusterware for building highly available RAC and single instance databases
in clustered configurations. Since the release of Oracle 10g, HP has
recommended a combined SGeRAC-Oracle Clusterware configuration for RAC
deployments on HP-UX. In the combined stack, SGeRAC provides cluster
2. Dependency structure in the case of CFS.
(The ASCII diagram for this case is not fully recoverable. It showed the package stack
depending on the CFS disk group MNPs, CFS-DG1-MNP and CFS-DG2-MNP, which in turn depend on
the SG-CFS-pkg system multi-node package.)
3. Dependency structure in the case of ASM over SLVM and ASM over HP-UX raw disks.
--------------|
|             |
|   RAC-MNP   |
|             |
--------------|
       |
       V
--------------|
|             |
|  ASMDG-MNP  |
|             |
--------------|
       |
       V
--------------|
|             |
|   OC-MNP    |
|             |
--------------|

In case of ASM over SLVM
The SLVM Volume groups used for Oracle Clusterware storage are configured in the
OC-MNP package.
The SLVM Volume groups used for RAC database storage are configured in the ASMDG MNP
package.
In case of ASM over HP-UX raw disks
Do not specify any HP-UX raw disks information either in OC-MNP package or in ASMDG
MNP package.
NOTE: Oracle patches p7225720 and p7330611 are not available on Oracle RAC 10.2.0.4
for IA. Upgrade to Oracle RAC 10.2.0.5 to use the ASMDG MNP feature on 10.2.0.4 on IA.
C. SGeRAC Toolkit File Structure
From SG/SGeRAC version A.11.19 and later, the SGeRAC Toolkit uses Modular
Packages to implement OC MNP, ASMDG MNP and RAC MNP.
After installation of SGeRAC, the SGeRAC Toolkit module Attribute Definition Files (ADF)
reside under the /etc/cmcluster/modules/sgerac directory and the module
scripts reside under the /etc/cmcluster/scripts/sgerac directory.
The SGeRAC Toolkit files reside under /opt/cmcluster/SGeRAC/toolkit. This
directory contains three subdirectories crsp, asmp and racp. Subdirectory
crsp contains the Toolkit scripts for OC MNP. Subdirectory racp contains
the Toolkit scripts for RAC MNP. Subdirectory asmp contains the Toolkit
scripts for ASMDG MNP.
The following files are installed during the SGeRAC Toolkit installation:
/opt/cmcluster/SGeRAC/toolkit/README
/opt/cmcluster/SGeRAC/toolkit/crsp/toolkit_oc.sh
/opt/cmcluster/SGeRAC/toolkit/crsp/oc.conf
/opt/cmcluster/SGeRAC/toolkit/crsp/oc.sh
/opt/cmcluster/SGeRAC/toolkit/crsp/oc.check
/opt/cmcluster/SGeRAC/toolkit/racp/toolkit_dbi.sh
/opt/cmcluster/SGeRAC/toolkit/racp/rac_dbi.conf
/opt/cmcluster/SGeRAC/toolkit/racp/rac_dbi.sh
/opt/cmcluster/SGeRAC/toolkit/racp/rac_dbi.check
/opt/cmcluster/SGeRAC/toolkit/asmp/toolkit_asmdg.sh
/opt/cmcluster/SGeRAC/toolkit/asmp/asm_dg.conf
/opt/cmcluster/SGeRAC/toolkit/asmp/asm_dg.sh
/opt/cmcluster/SGeRAC/toolkit/asmp/asm_dg.check
/etc/cmcluster/modules/sgerac/erac_tk_oc
/etc/cmcluster/modules/sgerac/erac_tk_oc.1
/etc/cmcluster/modules/sgerac/erac_tk_rac
/etc/cmcluster/modules/sgerac/erac_tk_rac.1
/etc/cmcluster/modules/sgerac/erac_tk_asmdg
/etc/cmcluster/modules/sgerac/erac_tk_asmdg.1
/etc/cmcluster/scripts/sgerac/erac_tk_oc.sh
/etc/cmcluster/scripts/sgerac/erac_tk_rac.sh
/etc/cmcluster/scripts/sgerac/erac_tk_asmdg.sh
/etc/cmcluster/scripts/sgerac/oc_gen.sh
/etc/cmcluster/scripts/sgerac/rac_gen.sh
/etc/cmcluster/scripts/sgerac/asmdg_gen.sh
run_script_timeout, halt_script_timeout
Default value is 600 seconds for a 4 node cluster. This value is suggested
as an initial value. It may need to be tuned for your environment.
script_log_file
Set by default to "$SGRUN/log/$SG_PACKAGE.log"
TKIT_DIR
Set to the OC MNP working directory. After the cmapplyconf command, the OC
MNP configuration file oc.conf will be created in this directory. If the
oc.conf file already exists in the directory then all the configuration
parameters will be overwritten and the original oc.conf file will be backed
up in oc.conf.old.
TKIT_SCRIPT_DIR
Set to the OC MNP script files directory. The default value is the OC
MNP script files installation directory:
/opt/cmcluster/SGeRAC/toolkit/crsp.
ORA_CRS_HOME, CHECK_INTERVAL, MAINTENANCE_FLAG
These parameters can be set in the cmmakepkg command if the Toolkit
configuration file is given with the -t option. Otherwise set them based on
the description in E-2 to fit your Oracle environment.
vgchange_cmd, vg
When SLVM or ASM over SLVM is used for Oracle Clusterware storage, specify
the corresponding SLVM Volume Group names and set activation to shared mode.
- set vgchange_cmd to "vgchange -a s"
- specify the name(s) of the Shared Volume Groups in vg[0], vg[1]....
Note:
cvm_activation_cmd, cvm_dg
If CVM is used for Oracle Clusterware storage management and the CVM disk
group activation and deactivation are to be handled in the package control
file:
- set cvm_activation_cmd to
"vxdg -g \$Disk group set activation=sharedwrite"
- specify the name(s) of the CVM Disk Groups in cvm_dg[0], cvm_dg[1]...
cluster_interconnect_subnet
Refer to the SGeRAC manual for the steps to configure monitoring of the
Oracle Clusterware heartbeat subnet.
service_name
Set by default to crsp_monitor
service_cmd
Set by default to "$SGCONF/scripts/sgerac/erac_tk_oc.sh oc_check"
service_restart
Set by default to none
service_fail_fast_enabled
Set by default to no
service_halt_timeout
Set by default to 300
dependency_name, dependency_condition, dependency_location
If CVM or CFS is used for managing the storage of the Oracle Clusterware,
and Serviceguard Disk Group (DG) MNP and Mount Point (MP) MNP are used to
handle the disk group and file system mount point, configure a dependency
for the corresponding DG MNP (for CVM) or MP MNP (for CFS).
For example, for the package using CVM:
DEPENDENCY_NAME
DG-MNP-name
DEPENDENCY_CONDITION
DG-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
For the package using CFS:
DEPENDENCY_NAME
MP-MNP-name
DEPENDENCY_CONDITION
MP-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
Note: For a modular style CFS DG-MP package, the modular style CFS DG-MP MNP must be
specified as a dependency in the OC MNP configuration file.
For the ASMDG MNP:
-----------
Note: If ASM over SLVM is being used for the RAC database, it is recommended
to use the new ASMDG package to manage the ASM disk group.
package_name
Set to any name desired for the ASMDG MNP.
package_type
Set by default to multi_node.
package_description
Set by default to "SGeRAC Toolkit Oracle ASMDG package"
node_name
Specify the names for the nodes that the ASMDG MNP will run on.
auto_run
Set to yes or no depending on whether the ASMDG MNP is to be started on
cluster join or on demand.
local_lan_failover_allowed
Set by default to yes to allow the cluster to switch LANs locally in the
event of a failure.
node_fail_fast_enabled
Set by default to no.
script_log_file
Set by default to "$SGRUN/log/$SG_PACKAGE.log"
TKIT_DIR
Set to the ASMDG MNP working directory. After
ASMDG MNP configuration file asm_dg.conf will
If The asm_dg.conf file already exists in the
configuration parameters will be overwritten,
file will be backed up in asm_dg.conf.old.
TKIT_SCRIPT_DIR
Set to the ASMDG MNP script files directory. The default value is the ASMDG
MNP script files installation directory:
/opt/cmcluster/SGeRAC/toolkit/asmp
ORACLE_HOME, CHECK_INTERVAL...MAINTENANCE_FLAG
These parameters can be set in the cmmakepkg command if the Toolkit
configuration file is given with -t option. Otherwise set them based on the
description in E-2 to fit your Oracle environment.
ORA_CRS_HOME, OC_TKIT_DIR
It is not required to set these values, the cmapplyconf command will
automatically set them at package configuration time based on the setting
in the OC MNP. Set by default to "<set by cmapplyconf>". Refer to E-2 for
the descriptions.
vgchange_cmd, vg
- set vgchange_cmd to "vgchange -a s"
- specify the name(s) of the Shared Volume Groups in vg[0], vg[1]...
When using ASM over HP-UX raw disks, ignore this step.
run_script_timeout, halt_script_timeout
Default value is 600 seconds for a 4 node cluster. This value is suggested
as an initial value. It may need to be tuned for your environment.
service_name
Set by default to asmdg_monitor, if multiple ASMDG MNPs are configured, you
need to set a different service_name for each ASMDG MNP.
service_cmd
Set by default to "$SGCONF/scripts/sgerac/erac_tk_asmdg.sh asmdg_check"
service_restart
Set by default to none
service_fail_fast_enabled
Set by default to no
service_halt_timeout
Default value is 300
dependency_name, dependency_condition, dependency_location
Configure a dependency on the OC MNP.
For example,
DEPENDENCY_NAME
OC-MNP-name
DEPENDENCY_CONDITION
OC-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
For the RAC MNP:
-----------
dependency_name, dependency_condition, dependency_location
For example, for the package using CVM:
DEPENDENCY_NAME
DG-MNP-name
DEPENDENCY_CONDITION
DG-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
DEPENDENCY_NAME
OC-MNP-name
DEPENDENCY_CONDITION
OC-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
For the package using CFS:
DEPENDENCY_NAME
MP-MNP-name
DEPENDENCY_CONDITION
MP-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
Note: For a modular style CFS DG-MP package, the OC MNP and the modular style CFS
DG-MP MNP must be specified as dependencies in the RAC MNP configuration file.
When the ASMDG package is configured:
DEPENDENCY_NAME
ASMDG-MNP-name
DEPENDENCY_CONDITION
ASMDG-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
Note: When ASMDG MNP is configured, make sure you configure the dependency
on the ASMDG MNP which is managing the disk group of the current RAC MNP.
Since ASMDG MNP is already configured with a dependency on OC MNP, there
is no need to configure a dependency on OC MNP for this RAC MNP.
E-1-2. Legacy Package Configuration File Parameters:
For legacy packages, the package configuration file template can be created
by running the Serviceguard command "cmmakepkg -p".
For the OC MNP:
----------
PACKAGE_NAME
Set to any name desired for the OC MNP.
PACKAGE_TYPE
Set to MULTI_NODE.
FAILOVER_POLICY, FAILBACK_POLICY
Comment out.
NODE_NAME
Specify the names for the nodes that the OC MNP will run on.
AUTO_RUN
Set to YES or NO depending on whether the OC MNP is to be started on
cluster join or on demand.
LOCAL_LAN_FAILOVER_ALLOWED
Set by default to YES to allow cluster to switch LANs locally in the event
of a failure.
NODE_FAIL_FAST_ENABLED
Set by default to NO.
RUN_SCRIPT, HALT_SCRIPT
Set to the package control script.
RUN_SCRIPT_TIMEOUT, HALT_SCRIPT_TIMEOUT
Default value is 600 seconds for a 4 node cluster. This value is suggested
as an initial value. It may need to be tuned for your environment.
STORAGE_GROUP
If the Oracle Clusterware registry and vote devices are stored in a CVM
disk group, specify it using this parameter.
DEPENDENCY_NAME, DEPENDENCY_CONDITION, DEPENDENCY_LOCATION
If CVM or CFS is used for managing the storage of the Oracle Clusterware,
and Serviceguard Disk Group (DG) MNP and Mount Point (MP) MNP are used to
handle the disk group and file system mount point, configure a dependency
for the corresponding DG MNP (for CVM) or MP MNP (for CFS).
For example, for the package using CVM:
DEPENDENCY_NAME
DG-MNP-name
DEPENDENCY_CONDITION
DG-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
For the package using CFS:
DEPENDENCY_NAME
MP-MNP-name
DEPENDENCY_CONDITION
MP-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
SERVICE_NAME
Specify a single SERVICE_NAME, corresponding to the service definition in
the control script. This service invokes Toolkit script "toolkit_oc.sh
check".
SERVICE_HALT_TIMEOUT
Default value is 300 seconds for a 4 node cluster. This value is suggested
as an initial value. It may need to be tuned for your environment.
For the ASMDG MNP:
----------
PACKAGE_TYPE
Set to MULTI_NODE.
FAILOVER_POLICY, FAILBACK_POLICY
Comment out.
NODE_NAME
Specify the names for the nodes that the ASMDG MNP will run on.
AUTO_RUN
Set to YES or NO depending on whether the ASMDG MNP is to be started on
cluster join or on demand.
LOCAL_LAN_FAILOVER_ALLOWED
Set by default to YES to allow cluster to switch LANs locally in the event
of a failure.
NODE_FAIL_FAST_ENABLED
Set by default to NO.
RUN_SCRIPT, HALT_SCRIPT
Set to the package control script.
RUN_SCRIPT_TIMEOUT, HALT_SCRIPT_TIMEOUT
Default value is 600 seconds for a 4 node cluster. This value is suggested
as an initial value. It may need to be tuned for your environment.
DEPENDENCY_NAME, DEPENDENCY_CONDITION, DEPENDENCY_LOCATION
Configure a dependency on the OC MNP.
For example, for the package using CVM:
DEPENDENCY_NAME
OC-MNP-name
DEPENDENCY_CONDITION
OC-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
SERVICE_NAME
Specify a single SERVICE_NAME, corresponding to the service definition in
the control script. This service invokes Toolkit script "toolkit_asmdg.sh
check".
SERVICE_HALT_TIMEOUT
Default value is 300 seconds for a 4 node cluster. This value is suggested
as an initial value. It may need to be tuned for your environment.
For the RAC MNP:
----------
DEPENDENCY_NAME, DEPENDENCY_CONDITION, DEPENDENCY_LOCATION
For example, for the package using CVM:
DEPENDENCY_NAME
DG-MNP-name
DEPENDENCY_CONDITION
DG-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
DEPENDENCY_NAME
OC-MNP-name
DEPENDENCY_CONDITION
OC-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
For the package using CFS:
DEPENDENCY_NAME
MP-MNP-name
DEPENDENCY_CONDITION
MP-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
When the ASMDG package is configured:
DEPENDENCY_NAME
ASMDG-MNP-name
DEPENDENCY_CONDITION
ASMDG-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
Note: When ASMDG MNP is configured, make sure you configure the dependency
on the ASMDG MNP which is managing the disk group of the current RAC MNP.
Since ASMDG MNP is already configured with a dependency on OC MNP, there
is no need to configure a dependency on OC MNP for this RAC MNP.
SERVICE_NAME
Specify a single SERVICE_NAME, corresponding to the service definition in
the control script. This service invokes Toolkit script "toolkit_dbi.sh
check".
SERVICE_HALT_TIMEOUT
Default value is 300 seconds for a 4 node cluster. This value is suggested
as an initial value. It may need to be tuned for your environment.
If CVM is used for Oracle Clusterware storage management and the CVM disk
group activation and deactivation are to be handled in the package control
file:
- set CVM_ACTIVATION_CMD to
"vxdg -g \$Disk group set activation=sharedwrite"
- specify the name(s) of the CVM Disk Groups in CVM_DG[0], CVM_DG[1]...
Configure one package service:
- set SERVICE_NAME[0] to the name of service specified in the ASCII
configuration file
- set SERVICE_CMD[0] to "<OC MNP working directory>/toolkit_oc.sh
check"
- set SERVICE_RESTART[0] to ""
In the function customer_defined_run_cmds:
- start Oracle Clusterware using the command:
<OC MNP working directory>/toolkit_oc.sh start
In the function customer_defined_halt_cmds:
- stop Oracle Clusterware using the command:
<OC MNP working directory>/toolkit_oc.sh stop
- set SERVICE_CMD[0]
to "<RAC MNP working directory>/toolkit_dbi.sh check"
- set SERVICE_RESTART[0] to ""
In the function customer_defined_run_cmds:
- start the RAC instance using the command:
<RAC MNP working directory>/toolkit_dbi.sh start
In the function customer_defined_halt_cmds:
- stop the RAC instance using the command:
<RAC MNP working directory>/toolkit_dbi.sh stop
OC_TKIT_DIR
Set to the OC MNP working directory. When MAINTENANCE_FLAG is yes, the RAC
MNP uses this parameter to check the OC MNP maintenance status: If the OC
MNP MAINTENANCE_FLAG is set to yes and oc.debug is in the OC_TKIT_DIR
directory, the RAC MNP knows the OC MNP on the same node is in maintenance
mode. In this case, because of the dependency on the OC MNP, the RAC MNP
will go into maintenance mode as well regardless of the presence of its
debug file.
If the MAINTENANCE_FLAG is set to no, OC_TKIT_DIR is not required, and the
RAC MNP will not check the OC MNP's maintenance status.
ASM_TKIT_DIR
Note: this parameter should be set only if the new ASM DG MNP is being used.
Set this to the ASM MNP working directory. When the MAINTENANCE_FLAG is
yes, it is used to check the ASMDG MNP maintenance status.
F-4. OC MNP startup procedures [For both Modular and Legacy Packages]:
1. On each node of the cluster, halt the Oracle Clusterware if it is running.
: $ORA_CRS_HOME/bin/crsctl stop crs
2. On one node of the cluster, start the OC MNP via cmrunpkg.
: cmrunpkg <OCMNP-package-name>
Use cmviewcl to check the package status of the OC MNP configured in the cluster.
3. After the package is up and running, verify that the Oracle Clusterware is
running on each node of the cluster.
On each node, enter:
: $ORA_CRS_HOME/bin/crsctl check crs
For Oracle 10g R2, messages like the following should be seen:
CSS appears healthy
CRS appears healthy
EVM appears healthy
For Oracle 11g R1, messages like the following should be seen:
Cluster Synchronization Services appears healthy
Cluster Ready Services appears healthy
Event Manager appears healthy
For Oracle 11g R2, messages like the following should be seen:
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
The RAC instances should not be running.
Note: Steps F-5 to F-7 are required only if you are using ASM over SLVM for RAC
database and if you are planning to use the ASMDG package to manage your ASM
disk group.
3. Generate the package configuration file for the ASMDG MNP and edit the
file based on the description in E-1. Then configure ASMDG MNP.
If asm_dg.conf is configured and tested in step 2, use the following
command to create the package configuration file:
: cmmakepkg -m sg/multi_node_all -m sgerac/erac_tk_asmdg -t asm_dg.conf
pkgConfigFile
Otherwise, create the package configuration file and set the Oracle
Clusterware parameters in this file directly:
: cmmakepkg -m sg/multi_node_all -m sgerac/erac_tk_asmdg pkgConfigFile
Edit the package template files based on the description in E-1.
4. Now apply the package configuration file:
: cmapplyconf -P pkgConfigFile
F-7. Oracle ASMDG MNP startup procedures: [For both Modular and Legacy packages]
1. On one node of the cluster, start the ASMDG MNP via cmrunpkg.
: cmrunpkg Your-ASMDGMNP-Name
Use cmviewcl to check the package status.
2. After the package is up and running, verify that the ASM disk group is
mounted.
On one node of the cluster, login as the Oracle administrator and enter:
:$ORACLE_HOME/bin/asmcmd lsdg
Messages like the following should be seen (the State column should show MOUNTED):
State        Type      ...   Name
MOUNTED      NORMAL    ...   <DG_NAME>
NOTE: To configure another ASMDG MNP package to manage the ASM disk group
used by a different RAC Database, repeat the steps in F-6 and F-7.
F-10 Oracle RAC MNP startup procedures: [For both Modular and Legacy packages]
1. On one node of the cluster, start the RAC MNP via cmrunpkg.
: cmrunpkg Your-RACMNP-Name
Use cmviewcl to check the package status.
2. After the package is up and running, verify that the RAC instance is
running.
On one node of the cluster, login as the Oracle administrator and enter:
:$ORACLE_HOME/bin/srvctl status instance -d $databaseName -i $instanceName
Messages like the following should be seen:
Instance <InstanceName> is running on node <NodeName>
3. If more than one RAC database is configured in the cluster and the RAC
instances are to be managed in the RAC MNP, repeat the steps in F-9 and
steps 1 and 2 in F-10 for each RAC database.
L. Migration of Legacy CFS Disk group and Mount point Packages to Modular
CFS Disk group and Mount point Packages(CFS DG-MP).
Beginning with the SG A.11.20 patch PHSS_41628 and SG CFS A.11.20 patch PHSS_41674, a
new modular CFS Disk group and Mount point feature has been introduced. It allows you to
consolidate all disk group and mount point packages for an application into a single modular
package. Migration from legacy CFS Disk group and Mount point MNPs to modular CFS Disk group
and Mount point MNPs (CFS DG-MP) can be done with the following steps:
1. Create modular CFS DG-MP MNP for Oracle Clusterware storage
: cmmakepkg -m sg/cfs_all /etc/cmcluster/OC-DGMP/OC-DGMP.ascii
2. Edit the OC-DGMP.ascii file with the package name, disk group, mount point,
and other required package information
For example
cvm_disk_group
< DiskGroup used for Oracle Clusterware storage >
cvm_activation_mode
"< node1 > =sw < node2 > =sw"
cfs_mount_point
< Mount Point location for CRS>
cfs_volume
< CFS volume >
cfs_mount_options
"< node1 > =cluster < node2 > =cluster"
cfs_primary_policy
""
3. Create modular CFS DG-MP MNP for RAC database storage
: cmmakepkg -m sg/cfs_all /etc/cmcluster/RAC-DGMP/RAC-DGMP.ascii
4. Edit the RAC-DGMP.ascii file with the package name, disk group, mount point,
and other required package information
For example
cvm_disk_group
< DiskGroup used for RAC database storage >
cvm_activation_mode
"< node1 > =sw < node2 > =sw"
cfs_mount_point
< Mount Point location RAC>
cfs_volume
< CFS volume >
cfs_mount_options
"< node1 > =cluster < node2 > =cluster"
cfs_primary_policy
""
5. Take a backup of OC MNP configuration file.
: cmgetconf -p < OC MNP > < backup OC MNP package configuration file >
6. Edit the backup of OC MNP ascii file, remove the existing dependency on
legacy MP and DG package and add modular CFS DG-MP package for Oracle
Clusterware storage as a dependency.
For example, remove legacy dependency entries such as:
DEPENDENCY_NAME
DG-MNP-name
DEPENDENCY_CONDITION
DG-MNP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
and add a dependency on the modular CFS DG-MP package created for Oracle Clusterware
storage in step 1.
The RAC MNP ascii file with the modular style CFS DG-MP package as a dependency:
DEPENDENCY_NAME
RAC-DGMP-name
DEPENDENCY_CONDITION
RAC-DGMP-PKG=UP
DEPENDENCY_LOCATION
SAME_NODE
9. Shut down all the RAC MNPs, if any are running.
: cmhaltpkg < RAC MNP 1 > < RAC MNP 2 > ...
10. Shut down the OC MNP, if it is running.
: cmhaltpkg < OC MNP >
11. Shut down the legacy Disk group (SG-CFS-DG-id#) and Mount point MNPs (SG-MP-id#)
: cmhaltpkg < SG-MP-id# > < SG-CFS-DG-id# >
12. Delete the OC MNP and all the RAC MNPs from cluster
: cmdeleteconf -p < RAC MNP >
: cmdeleteconf -p < OC MNP >
13. Delete all legacy style Disk group MNPs and Mount Point MNPs from cluster
: cmdeleteconf -p < legacy MP MNP >
: cmdeleteconf -p < legacy DG MNP >
14. Apply and run both modular CFS DG-MP packages for Oracle Clusterware and RAC
database storage created in step number [1] and [3]
: cmapplyconf -P < OC-DGMP-MNP configuration file >
: cmapplyconf -P < RAC-DGMP-MNP configuration file >
: cmrunpkg < OC-DGMP-MNP > < RAC-DGMP-MNP >
15. Apply the updated OC MNP configuration file which was modified in step number [6]
: cmapplyconf -P < backup OC MNP configuration file >
16. Apply the updated RAC MNP configuration file which was modified in step number [8]
: cmapplyconf -P < backup RAC MNP configuration file>
17. You may now start the OC MNP and the RAC MNPs in the cluster using the cmrunpkg command:
: cmrunpkg < OC MNP > < RAC MNP >
Conclusion
Using SGeRAC Toolkit with multi-node packages and simple package dependencies provides a
uniform, intuitive, and easy-to-manage method to perform the following:
Manage all the storage options supported by SGeRAC: CFS, SLVM, CVM, ASM over SLVM,
and ASM over raw device (on HP-UX 11i v3)
Although the concepts of resource dependency and resource aggregation delivered by SGeRAC
with the multi-node package and simple package dependency features are present in some form
or other in other clusterware products, including Oracle Clusterware, the framework provided
by SGeRAC is unique due to the high level of multi-vendor (Oracle, Symantec, HP) and multi-storage
platform (CFS, SLVM, CVM, ASM over SLVM, ASM over raw device) integration it offers.
5 Maintenance
This chapter includes information about carrying out routine maintenance on a Real Application
Cluster configuration. Starting with version SGeRAC A.11.17, all log messages from cmgmsd log
to /var/adm/syslog/syslog.log by default. As presented here, these tasks differ in some
details from the similar tasks described in the Managing Serviceguard documentation.
Tasks include:
Reviewing Cluster and Package States with the cmviewcl Command (page 113)
Reviewing Cluster and Package States with the cmviewcl Command
The following is an example of cmviewcl output for a two-node SGeRAC cluster that uses CFS:

CLUSTER        STATUS
cluster_mo     up

  NODE         STATUS       STATE
  minie        up           running

  Quorum_Server_Status:
  NAME         STATUS       STATE
  white        up           running

  Network_Parameters:
  INTERFACE    STATUS       PATH              NAME
  PRIMARY      up           0/0/0/0           lan0
  PRIMARY      up           0/8/0/0/4/0       lan1
  STANDBY      up           0/8/0/0/6/0       lan3

  NODE         STATUS       STATE
  mo           up           running

  Quorum_Server_Status:
  NAME         STATUS       STATE
  white        up           running

  Network_Parameters:
  INTERFACE    STATUS       PATH              NAME
  PRIMARY      up           0/0/0/0           lan0
  PRIMARY      up           0/8/0/0/4/0       lan1
  STANDBY      up           0/8/0/0/6/0       lan3

MULTI_NODE_PACKAGES

  PACKAGE        STATUS     STATE      AUTO_RUN    SYSTEM
  SG-CFS-pkg     up         running    enabled     yes

  NODE_NAME      STATUS     SWITCHING
  minie          up         enabled

  Script_Parameters:
  ITEM      STATUS   MAX_RESTARTS   RESTARTS   NAME
  Service   up       0              0          SG-CFS-vxconfigd
  Service   up       5              0          SG-CFS-sgcvmd
  Service   up       5              0          SG-CFS-vxfsckd
  Service   up       0              0          SG-CFS-cmvxd
  Service   up       0              0          SG-CFS-cmvxpingd

  NODE_NAME      STATUS     SWITCHING
  mo             up         enabled

  Script_Parameters:
  ITEM      STATUS   MAX_RESTARTS   RESTARTS   NAME
  Service   up       0              0          SG-CFS-vxconfigd
  Service   up       5              0          SG-CFS-sgcvmd
  Service   up       5              0          SG-CFS-vxfsckd
  Service   up       0              0          SG-CFS-cmvxd
  Service   up       0              0          SG-CFS-cmvxpingd

  PACKAGE        STATUS     STATE      AUTO_RUN    SYSTEM
  SG-CFS-DG-1    up         running    enabled     no

  NODE_NAME      STATUS     SWITCHING
  minie          up         enabled

  Dependency_Parameters:
  DEPENDENCY_NAME      SATISFIED
  SG-CFS-pkg           yes

  NODE_NAME      STATUS     SWITCHING
  mo             up         enabled

  Dependency_Parameters:
  DEPENDENCY_NAME      SATISFIED
  SG-CFS-pkg           yes

  PACKAGE        STATUS     STATE      AUTO_RUN    SYSTEM
  SG-CFS-MP-1    up         running    enabled     no

  NODE_NAME      STATUS     SWITCHING
  minie          up         enabled

  Dependency_Parameters:
  DEPENDENCY_NAME      SATISFIED
  SG-CFS-DG-1          yes

  NODE_NAME      STATUS     SWITCHING
  mo             up         enabled

  Dependency_Parameters:
  DEPENDENCY_NAME      SATISFIED
  SG-CFS-DG-1          yes

  PACKAGE        STATUS     STATE      AUTO_RUN    SYSTEM
  SG-CFS-MP-2    up         running    enabled     no

  PACKAGE        STATUS     STATE      AUTO_RUN    SYSTEM
  SG-CFS-MP-3    up         running    enabled     no

  (the per-node status and dependency details for SG-CFS-MP-2 and SG-CFS-MP-3 are the same
  as for SG-CFS-MP-1: both nodes up with switching enabled, and the SG-CFS-DG-1 dependency
  satisfied)
Cluster Status
The status of a cluster may be one of the following:
Up. At least one node has a running cluster daemon, and reconfiguration is not taking place.
Starting. The cluster is in the process of determining its active membership. At least one
cluster daemon is running.
Unknown. The node on which the cmviewcl command is issued cannot communicate with
other nodes in the cluster.
Node Status and State
Failed. A node never sees itself in this state. Other active members of the cluster will see a
node in this state if that node was in an active cluster, but is no longer, and is not halted.
Reforming. A node is in this state when the cluster is re-forming. The node is currently running
the protocols which ensure that all nodes agree to the new membership of an active cluster.
If agreement is reached, the status database is updated to reflect the new cluster membership.
Running. A node in this state has completed all required activity for the last re-formation and
is operating normally.
Halted. A node never sees itself in this state. Other nodes will see it in this state after the
node has gracefully left the active cluster, for instance with a cmhaltnode command.
Unknown. A node never sees itself in this state. Other nodes assign a node this state if it has
never been an active cluster member.
Package Status and State
A system multi-node package is up when it is running on all the active cluster nodes. A multi-node
package is up if it is running on any of its configured nodes.
The state of the package can be one of the following:
Starting. The start instructions in the control script are being run.
Halting. The halt instructions in the control script are being run.
Package Switching. Enabled means the package can switch to another node in the event of
failure.
Switching Enabled for a Node. Enabled means the package can switch to the referenced
node. Disabled means the package cannot switch to the specified node until the node is enabled
for the package using the cmmodpkg command.
Every package is marked Enabled or Disabled for each node that is either a primary or
adoptive node for the package.
For multi-node packages, node switching Disabled means the package cannot start on that
node.
Up. Services are active and being monitored. The membership appears in the output of
cmviewcl -l group.
Down. The cluster is halted and GMS services have been stopped. The membership does not
appear in the output of the cmviewcl -l group.
The following is an example of the group membership output shown in the cmviewcl command:
# cmviewcl -l group
GROUP        MEMBER    PID      MEMBER_NODE
DGop         1         10394    comanche
             0         10499    chinook
DBOP         1         10501    comanche
             0         10396    chinook
DAALL_DB     0         10396    comanche
             1         10501    chinook
IGOPALL      2         10423    comanche
             1         10528    chinook
Service Status
Services have only status, as follows:
Uninitialized. The service is included in the package configuration, but it was not started
with a run command in the control script.
Unknown.
Network Status
The network interfaces have only status, as follows:
Up.
Down.
NOTE:
Packages can be configured with one of two values for the FAILOVER_POLICY parameter:
CONFIGURED_NODE. The package fails over to the next node in the node list in the package
configuration file.
MIN_PACKAGE_NODE. The package fails over to the node in the cluster with the fewest running
packages on it.
Packages can also be configured with one of two values for the FAILBACK_POLICY parameter:
AUTOMATIC. A package following a failover returns to its primary node when the primary
node becomes available again.
MANUAL. A package following a failover must be moved back to its original node by a system
administrator.
Failover and failback policies are displayed in the output of the cmviewcl -v command.
  NODE         STATUS       STATE
  ftsys9       up           running

  Network_Parameters:
  INTERFACE    STATUS       PATH         NAME
  PRIMARY      up           56/36.1      lan0
  STANDBY      up           60/6         lan1

  PACKAGE      STATUS       STATE        AUTO_RUN     NODE
  ops_pkg1     up           running      disabled     ftsys9

  Policy_Parameters:
  POLICY_NAME       CONFIGURED_VALUE
  Start             configured_node
  Failback          manual

  Node_Switching_Parameters:
  NODE_TYPE    STATUS       SWITCHING    NAME
  Primary      up           enabled      ftsys9     (current)
  Alternate    up           enabled      ftsys10

  NODE         STATUS       STATE
  ftsys10      up           running

  Network_Parameters:
  INTERFACE    STATUS       PATH         NAME
  PRIMARY      up           28.1         lan0
  STANDBY      up           32.1         lan1

  PACKAGE      STATUS       STATE        AUTO_RUN     NODE
  ops_pkg2     up           running      disabled     ftsys10

  Policy_Parameters:
  POLICY_NAME       CONFIGURED_VALUE
  Start             configured_node
  Failback          manual

  Node_Switching_Parameters:
  NODE_TYPE    STATUS       SWITCHING    NAME
  Primary      up           enabled      ftsys10    (current)
  Alternate    up           enabled      ftsys9
  NODE         STATUS       STATE
  ftsys8       down         halted
  ftsys9       up           running

SYSTEM_MULTI_NODE_PACKAGES:

  PACKAGE            STATUS       STATE
  VxVM-CVM-pkg       up           running
When you use the -v option, the display shows the system multi-node package associated with
each active node in the cluster, as in the following:
SYSTEM_MULTI_NODE_PACKAGES:
  PACKAGE            STATUS       STATE
  VxVM-CVM-pkg       up           running

  NODE         STATUS       STATE
  ftsys8       down         halted

  NODE         STATUS       STATE
  ftsys9       up           running

  Script_Parameters:
  ITEM      STATUS   MAX_RESTARTS   RESTARTS   NAME
  Service   up       0              0          VxVM-CVM-pkg.srv
  NODE         STATUS       STATE
  ftsys9       up           running

  Network_Parameters:
  INTERFACE    STATUS       PATH         NAME
  PRIMARY      up           56/36.1      lan0
  STANDBY      up           60/6         lan1

  PACKAGE      STATUS       STATE        AUTO_RUN     NODE
  pkg1         up           running      enabled      ftsys9

  Policy_Parameters:
  POLICY_NAME       CONFIGURED_VALUE
  Failover          min_package_node
  Failback          manual

  Script_Parameters:
  ITEM       STATUS    MAX_RESTARTS   RESTARTS   NAME
  Service    up        0              0          service1
  Subnet     up        0              0          15.13.168.0
  Resource   up                                  /example/float

  Node_Switching_Parameters:
  NODE_TYPE    STATUS       SWITCHING    NAME
  Primary      up           enabled      ftsys9     (current)
  Alternate    up           enabled      ftsys10

  PACKAGE      STATUS       STATE        AUTO_RUN     NODE
  pkg2         up           running      disabled     ftsys9

  Policy_Parameters:
  POLICY_NAME       CONFIGURED_VALUE
  Failover          min_package_node
  Failback          manual

  Script_Parameters:
  ITEM       STATUS    MAX_RESTARTS   RESTARTS   NAME
  Service    up        0              0          service2.1
  Subnet     up        0              0          15.13.168.0

  Node_Switching_Parameters:
  NODE_TYPE    STATUS       SWITCHING    NAME
  Primary      up           enabled      ftsys10
  Alternate    up           enabled      ftsys9     (current)

  NODE         STATUS       STATE
  ftsys10      up           running

  Network_Parameters:
  INTERFACE    STATUS       PATH         NAME
  PRIMARY      up           28.1         lan0
  STANDBY      up           32.1         lan1
Now pkg2 is running on node ftsys9. Note that it is still disabled from switching.
  NODE         STATUS       STATE
  ftsys9       up           running

  PACKAGE      STATUS       STATE        AUTO_RUN     NODE
  pkg1         up           running      enabled      ftsys9
  pkg2         up           running      enabled      ftsys9

  NODE         STATUS       STATE
  ftsys10      up           running
Both packages are now running on ftsys9 and pkg2 is enabled for switching. Ftsys10 is
running the daemon and no packages are running on ftsys10.
A further cmviewcl example, not fully reproduced here, showed nodes ftsys9 and ftsys10 with
packages pkg1 and pkg2, and a package that is down and halted, with AUTO_RUN enabled and owned
by no node (unowned). The policy, resource, and switching parameters for that unowned package
are as follows:
  Policy_Parameters:
  POLICY_NAME       CONFIGURED_VALUE
  Failover          min_package_node
  Failback          automatic

  Script_Parameters:
  ITEM       STATUS    NODE_NAME    NAME
  Resource   up        manx         /resource/random
  Subnet     up        manx         192.8.15.0
  Resource   up        burmese      /resource/random
  Subnet     up        burmese      192.8.15.0
  Resource   up        tabby        /resource/random
  Subnet     up        tabby        192.8.15.0
  Resource   up        persian      /resource/random
  Subnet     up        persian      192.8.15.0

  Node_Switching_Parameters:
  NODE_TYPE    STATUS       SWITCHING    NAME
  Primary      up           enabled      manx
  Alternate    up           enabled      burmese
  Alternate    up           enabled      tabby
  Alternate    up           enabled      persian
NOTE:
All of the checks below are performed when you run cmcheckconf without any arguments
(or with only -v, with or without -k or -K). cmcheckconf validates the current cluster and
package configuration, including external scripts and pre-scripts for modular packages, and
runs cmcompare to check file consistency across nodes. (This new version of the command
also performs all of the checks that were done in previous releases.) See Checking Cluster
Components (page 123) for details.
You may want to set up a cron (1m) job to run cmcheckconf regularly. See Setting up
Periodic Cluster Verification (page 125).
These new checks are not done for legacy packages. For information about legacy and
modular packages, see the Managing Serviceguard manual.
Check that each volume group contains the same physical volumes on each node.
Check that each node has a working physical connection to the physical volumes.
Check that file systems have been built on the logical volumes identified by the fs_name
parameter in the cluster's packages.
File consistency:
Check that files including the following are consistent across all nodes.
/etc/nsswitch.conf
/etc/services
package control scripts for legacy packages (if you specify them)
/etc/cmcluster/cmclfiles2check
/etc/cmcluster/cmignoretypes.conf
/etc/cmcluster/cmknowncmds
/etc/cmcluster/cmnotdisk.conf
The table includes all the checks available as of A.11.20, not just the new ones.
(The table of cmcheckconf checks and comments is not fully reproduced here. Recoverable
entries include: same physical volumes on each node; physical volumes connected on each
node; LANs (cluster); service commands exist and are executable (service commands whose
paths are nested within an unmounted shared filesystem are not checked); and IP addresses
(cluster).)
Example
The short script that follows runs cluster verification and sends an email to admin@hp.com when
verification fails.
#!/bin/sh
cmcheckconf -v >/tmp/cmcheckconf.output
if (( $? != 0 ))
then
mailx -s "Cluster verification failed" admin@hp.com 2>&1 </tmp/cmcheckconf.output
fi
To run this script from cron, you would create the following entry in /var/spool/cron/
crontabs/root:
0 8,20 * * * verification.sh
Limitations
Serviceguard does not check the following conditions:
File systems configured to mount automatically on boot (that is, Serviceguard does not check
/etc/fstab)
Mount point overlaps (such that one file system is obscured when another is mounted)
Online Reconfiguration
The online reconfiguration feature provides a method to make configuration changes online to a
Serviceguard Extension for RAC (SGeRAC) cluster. Specifically, this provides the ability to add
and/or delete nodes from a running SGeRAC Cluster, and to reconfigure SLVM Volume Group
(VG) while it is being accessed by only one node.
a. Copy oc.conf from the Oracle Clusterware multi-node package directory on the other
nodes to the Oracle Clusterware multi-node package directory of the new node.
b. 1) Create a directory for the RAC toolkit configuration file. This directory should be
the same as the one created on the existing cluster nodes.
For example: mkdir /etc/cmcluster/<YourOwn-RACmulti-node-package-Dir>/
2) Copy rac_dbi.conf from the RAC multi-node package directory on the other nodes to
the RAC multi-node package directory of the new node.
Steps b1 and b2 must be repeated for all the RAC MNP packages in the cluster.
c. Copy asm_dg.conf from the ASMDG multi-node package directory on the other nodes to
the ASMDG multi-node package directory of the new node.
Steps c1 and c2 must be repeated for all the ASMDG MNP packages in the cluster.
5. Make the new node join the cluster (cmrunnode) and run the services.
Use the following steps for deleting a node using online node reconfiguration:
1. Halt the node in the cluster by running cmhaltnode.
2. Edit the cluster configuration file to delete the node(s).
3. Run cmapplyconf.
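For example, a sketch of that sequence; the node name node3, the cluster name my_cluster, and
the file name cluster.ascii are placeholders.

cmhaltnode node3                          # 1. halt the node being removed
cmgetconf -c my_cluster cluster.ascii     #    fetch the current cluster configuration
vi cluster.ascii                          # 2. remove the NODE_NAME entries for node3
cmapplyconf -C cluster.ascii              # 3. apply the modified configuration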
When using LVM, the volume groups are supported with Serviceguard.
For more information on using and configuring LVM version 2.x, see the HP-UX 11i Version 3:
HP-UX System Administrator's Guide: Logical Volume Management located at www.hp.com/go/
hpux-core-docs > HP-UX 11iv3.
For LVM version 2 compatibility requirements, see the Serviceguard/SGeRAC/SMS/Serviceguard
Mgr Plug-in Compatibility and Feature Matrix at www.hp.com/go/hpux-serviceguard-docs >
HP Serviceguard.
NOTE: For more information, see the Serviceguard Version A.11.20 Release Notes at
www.hp.com/go/hpux-serviceguard-docs > HP Serviceguard.
2. On the configuration node, use the vgchange command to make the volume group shareable
by members of the cluster:
# vgchange -S y -c y /dev/vg_rac
This command is issued from the configuration node only, and the cluster must be running on
all nodes for the command to succeed. Note that both the -S and the -c options are specified.
The -S y option makes the volume group shareable, and the -c y option causes the cluster
ID to be written out to all the disks in the volume group. This command specifies the cluster
that a node must be a part of to obtain shared access to the volume group.
Conversely, the command vgchange -S n -c n /dev/vg_rac marks the volume group as non-shared,
and not associated with a cluster.
When the volume group is subsequently activated in shared mode (vgchange -a s) on the second
node, the following message is displayed:
Activated volume group in shared mode.
This node is a Client.
NOTE: Do not share volume groups that are not part of the RAC configuration unless shared
access is controlled.
Remember that volume groups remain shareable even when nodes enter and leave the cluster.
NOTE: If you wish to change the capacity of a volume group at a later time, you must deactivate
and unshare the volume group first. If you add disks, you must specify the appropriate physical
volume group name and make sure the /etc/lvmpvg file is correctly updated on both nodes.
1. Ensure that the Oracle RAC database is not active on either node.
2. From node 2, use the vgchange command to deactivate the volume group:
# vgchange -a n /dev/vg_rac
3. From node 2, use the vgexport command to export the volume group:
# vgexport -m /tmp/vg_rac.map.old /dev/vg_rac
4. From node 1, use the vgchange command to deactivate the volume group:
# vgchange -a n /dev/vg_rac
6. Prior to making configuration changes, activate the volume group in normal (non-shared)
mode:
# vgchange -a y /dev/vg_rac
7. Use normal LVM commands to make the needed changes. Be sure to set the raw logical volume
device file's owner to oracle and group to dba with a mode of 660.
8. Next, from node 1, deactivate the volume group:
# vgchange -a n /dev/vg_rac
9. Use the vgexport command with the options shown in the example to create a new map
file:
# vgexport -p -m /tmp/vg_rac.map /dev/vg_rac
12. Create a control file named group in the directory /dev/vg_rac, as in the following:
# mknod /dev/vg_rac/group c 64 0xhh0000
The major number is always 64, and the hexadecimal minor number has the format:
0xhh0000
where hh must be unique to the volume group you are creating. Use the next hexadecimal
number that is available on your system after the volume groups that are already configured.
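For example, assuming (for illustration only) that vg00 and vg01 already use minor numbers
0x000000 and 0x010000, the next available value would be 0x020000:

# List the group files of the existing volume groups to see which minor numbers are in use.
ls -l /dev/*/group
# crw-r--r--   1 root   sys    64 0x000000  /dev/vg00/group
# crw-r--r--   1 root   sys    64 0x010000  /dev/vg01/group

# Create the control file for vg_rac with the next free minor number.
mknod /dev/vg_rac/group c 64 0x020000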
13. Use the vgimport command, specifying the map file you copied from the configuration node.
In the following example, the vgimport command is issued on the second node for the same
volume group that was modified on the first node:
# vgimport -v -m /tmp/vg_rac.map /dev/vg_rac /dev/dsk/c0t2d0 /dev/dsk/c1t2d0
14. Activate the volume group in shared mode by issuing the following command on both nodes:
# vgchange -a s -p /dev/vg_rac
Skip this step if you use a package control script to activate and deactivate the shared volume
group as a part of RAC startup and shutdown.
Volume groups should include different PV links to each logical unit on the disk array.
Volume group names must be the same on all nodes in the cluster.
Logical volume names must be the same on all nodes in the cluster.
If you are adding or removing shared LVM volume groups, make sure that you modify the cluster
configuration file and any package control script that activates and deactivates the shared LVM
volume groups.
The cluster service should not be running on the node from which you will be deleting
Serviceguard Extension for RAC.
The node from which you are deleting Serviceguard Extension for RAC should not be in the
cluster configuration.
If you are removing Serviceguard Extension for RAC from more than one node, swremove
should be issued on one node at a time.
NOTE: After removing Serviceguard Extension for RAC, your cluster will still have Serviceguard
installed. For information about removing Serviceguard, refer to the Managing Serviceguard user
guide for your version of the product.
Monitoring Hardware
Good standard practice in handling a high-availability system includes careful fault monitoring so
as to prevent failures if possible, or at least to react to them swiftly when they occur.
The following should be monitored for errors or warnings of all kinds.
Disks
CPUs
Memory
LAN cards
Power sources
All cables
Some monitoring can be done through simple physical inspection, but for the most comprehensive
monitoring, you should examine the system log file (/var/adm/syslog/syslog.log) periodically
for reports on all configured HA devices. The presence of errors relating to a device will show the
need for maintenance.
Replacing Disks
The procedure for replacing a faulty disk mechanism depends on the type of disk configuration
you are using and on the type of Volume Manager software. For a description of replacement
procedures using CVM, refer to the chapter on Administering Hot-Relocation in the VERITAS
Volume Manager Administrator's Guide. Additional information is found in the VERITAS Volume
Manager Troubleshooting Guide.
The following sections describe how to replace disks that are configured with LVM. Separate
descriptions are provided for replacing a disk in an array and replacing a disk in a high-availability
enclosure.
Remove the failed disk and insert a new one. The new disk will have the same HP-UX device
name as the old one.
On the node from where you issued the lvreduce command, issue the following command
to restore the volume group configuration data to the newly inserted disk:
# vgcfgrestore /dev/vg_sg01 /dev/dsk/c2t3d0
6. Issue the following command to extend the logical volume to the newly inserted disk:
# lvextend -m 1 /dev/vg_sg01 /dev/dsk/c2t3d0
7. Finally, use the lvsync command for each logical volume that has extents on the failed
physical volume. This synchronizes the extents of the new disk with the extents of the other
mirror.
# lvsync /dev/vg_sg01/lvolname
Detach the target PV by using one of the following commands on each node of the cluster:
# pvchange -a n [pv path]
Use the pvchange command -a n [pv path] to detach only one path or replace a disk
if the primary disk path is not performing well and you want to disable the path. The pvchange
-a n command detaches the single specified PV Link (device path). (If the path was the path
in use, LVM will switch to any alternate PV Link that is still available.)
OR
# pvchange -a N [pv path]
Alternatively, use the pvchange -a N [pv path] command to detach a disk (all paths to
the disk) and close it. Use this to allow diagnostics or replace a multi-ported disk.
NOTE: If the volume group is mirrored, applications can continue accessing data on mirror
copies after the commands above. If the volume is not mirrored, then any access attempts to
the device may hang indefinitely or time out. This depends upon the LV timeout value configured
for the logical volume.
3. Restore the LVM header to the new disk using the following command:
# vgcfgrestore -n [vg name] [pv raw path]
It is only necessary to perform the vgcfgrestore operation once from any node on the
cluster.
4. Attach the PV or activate the VG from each node of the cluster using the following commands:
# pvchange -a y [pv path]
OR
# vgchange -a [y|e|s] [vg name]
The PV must be detached from all nodes and must be attached from each of the nodes to
make it usable. Alternatively, you can reactivate the VG from each of the nodes. (This command
cannot attach all the paths to the PV, therefore each PV link has to be attached as well.)
NOTE: After executing one of the commands above, any I/O queued for the device will
restart. If the device replaced in step #2 was a mirror copy, then it will begin the
resynchronization process that may take a significant amount of time to complete. The progress
of the resynchronization process can be observed using the vgdisplay(1M),
lvdisplay(1M) or pvdisplay(1M) commands.
1. Make a note of the physical volume name of the failed mechanism (for example,
/dev/dsk/c2t3d0).
2. Deactivate the volume group on all nodes of the cluster:
# vgchange -a n vg_rac
5. Activate the volume group on one node in exclusive mode, then deactivate the volume group:
# vgchange -a e vg_rac
This will synchronize the stale logical volume mirrors. This step can be time-consuming,
depending on hardware characteristics and the amount of data.
7. Activate the volume group on all the nodes in shared mode using vgchange -a s:
# vgchange -a s vg_rac
the bus without harm.) When using inline terminators and Y cables, ensure that all orange-socketed
termination packs are removed from the controller cards.
NOTE: You cannot use inline terminators with internal FW/SCSI buses on D and K series systems,
and you cannot use the inline terminator with single-ended SCSI buses. You must not use an inline
terminator to connect a node to a Y cable.
Figure 19 shows a three-node cluster with two F/W SCSI buses. The solid line and the dotted line
represent different buses, both of which have inline terminators attached to nodes 1 and 3. Y
cables are also shown attached to node 2.
Figure 19 F/W SCSI Buses with Inline Terminators
The use of inline SCSI terminators allows you to do hardware maintenance on a given node by
temporarily moving its packages to another node and then halting the original node while its
hardware is serviced. Following the replacement, the packages can be moved back to the original
node.
Use the following procedure to disconnect a node that is attached to the bus with an inline SCSI
terminator or with a Y cable:
1. Move any packages on the node that requires maintenance to a different node.
2. Halt the node that requires maintenance. The cluster will reform, and activity will continue on
other nodes. Packages on the halted node will switch to other available nodes if they are
configured to switch.
3. Disconnect the power to the node.
4. Disconnect the node from the inline terminator cable or Y cable if necessary. The other nodes
accessing the bus will encounter no problems as long as the inline terminator or Y cable
remains connected to the bus.
5. Replace or upgrade hardware on the node, as needed.
6. Reconnect the node to the inline terminator cable or Y cable if necessary.
7. Reconnect power and reboot the node. If AUTOSTART_CMCLD is set to 1 in the /etc/
rc.config.d/cmcluster file, the node will rejoin the cluster.
8. If necessary, move packages back to the node from their alternate locations and restart them.
Offline Replacement
The following steps show how to replace a LAN card offline. These steps apply to HP-UX 11i:
Online Replacement
If your system hardware supports hot-swappable I/O cards, and if the system is running HP-UX 11i
(B.11.11 or later), you have the option of replacing the defective LAN card online. This significantly
improves the overall availability of the system. To do this, follow the steps provided in the section
How to Online Replace (OLR) a PCI Card Using SAM in the document Configuring HP-UX for
Peripherals. The OLR procedure also requires that the new card be exactly the same card type as
the card you removed, to avoid improper operation of the network driver. Serviceguard will
automatically recover the LAN card once it has been replaced and reconnected to the network.
After the LAN card has been replaced, update the cluster binary configuration file as follows:
1. Use the cmgetconf command to obtain a fresh ASCII configuration file, as follows:
# cmgetconf config.ascii
2. Use the cmapplyconf command to apply the configuration and copy the new binary file to
all cluster nodes:
# cmapplyconf -C config.ascii
This procedure updates the binary file with the new MAC address and thus avoids data inconsistency
between the outputs of the cmviewcl and lanscan commands.
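To confirm that the cluster configuration and the hardware now agree, you can compare the two outputs mentioned above; a minimal sketch using the standard commands:
# lanscan
# cmviewcl -v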
6 Troubleshooting
Go to www.hp.com/go/hpux-serviceguard-docs, and then click HP Serviceguard. In the
User Guide section, click the latest Managing Serviceguard manual and see the Troubleshooting
Your Cluster chapter.
A Software Upgrades
Serviceguard Extension for RAC (SGeRAC) software upgrades can be done in the two following
ways:
rolling upgrade
non-rolling upgrade
Rolling upgrade is a feature of SGeRAC that allows you to perform a software upgrade on a given
node without bringing down the entire cluster. SGeRAC supports rolling upgrades on version
A.11.15 and later, and requires all nodes to be running on the same operating system revision
and architecture.
During a rolling upgrade, the nodes can temporarily run mixed versions of HP-UX. However, rolling
upgrades are not intended as a means of running mixed releases of HP-UX within the same cluster;
HP recommends upgrading all the cluster nodes to the new release level as soon as possible.
Non-rolling upgrade allows you to perform a software upgrade from any previous revision to any
higher revision or between operating system versions but requires halting the entire cluster.
The rolling and non-rolling upgrade processes can also be used any time one system needs to be
taken offline for hardware maintenance or patch installations. Until the upgrade process is complete
on all nodes, you cannot change the cluster configuration files, and you will not be able to use
any of the new features of the Serviceguard/SGeRAC release.
There may be circumstances when, instead of doing an upgrade, you prefer to do a migration
with cold install. The cold install process erases the preexisting operating system and data, and
then installs the new operating system and software. After a cold install, you must restore the data.
The advantage of migrating with a cold install is that the software can be installed without regard
for the software currently on the system or concern for cleaning up old software.
A significant factor when deciding between an upgrade and a cold install is overall system downtime.
A rolling upgrade causes the least downtime, because only one node in the cluster is down at any
one time. A non-rolling upgrade may require more downtime, because the entire cluster has to be
brought down during the upgrade process.
One advantage of both rolling and non-rolling upgrades over a cold install is that upgrades retain
the preexisting operating system, software, and data. Conversely, the cold install process erases
the preexisting system; you must reinstall the operating system and software and restore the data.
For these reasons, a cold install may require more downtime.
Rolling Software Upgrades
A rolling upgrade is subject to the following requirements:
The upgrade must be done on systems of the same architecture (HP 9000 or Integrity Servers).
All nodes in the cluster must be running the same version of HP-UX.
Each node must be running a version of HP-UX that supports the new SGeRAC version.
Each node must be running a version of Serviceguard that supports the new SGeRAC version.
For more information on support, compatibility, and features for SGeRAC, refer to the Serviceguard
Compatibility and Feature Matrix, located at www.hp.com/go/hpux-serviceguard-docs
> HP Serviceguard Extension for RAC.
Upgrading from an existing SGeRAC A.11.19 cluster to HP-UX 11i v3 1109 HA-OE/DC-OE
To upgrade from an existing Serviceguard and SGeRAC A.11.19 deployment to HP-UX 11i v3
1109 HA-OE/DC-OE, you must select the licensed SGeRAC A.11.20 bundle available in the
HP-UX 11i v3 1109 HA-OE/DC-OE media.
NOTE: You must select the product SGeRAC T1907BA when upgrading to HP-UX 11i v3 1109
HA-OE/DC-OE, otherwise Serviceguard A.11.20 installation will fail.
To perform a rolling upgrade for an SGeRAC cluster with Oracle RAC configured, the HP-UX OS
version must be the same, that is, either 11i v2 or 11i v3. If you upgrade from 11i v2 to 11i v3,
you can perform an offline upgrade or a rolling upgrade. For more information, see the Non-Rolling
Software Upgrades (page 148) and Steps for Rolling Upgrades (page 142) sections.
NOTE: For all the scenarios discussed in the following sections, HP recommends performing an
offline upgrade when moving from Serviceguard A.11.19 to Serviceguard A.11.20 and SGeRAC
A.11.20. Using Dynamic Root Disk (DRD) utilities can significantly reduce the planned maintenance
time required for the upgrade.
If you want to perform a rolling upgrade that includes SGeRAC, then you must follow the procedures
described in the scenario Upgrading from an existing Serviceguard A.11.19 cluster to HP-UX
11i v3 1109 HA-OE/DC-OE along with SGeRAC (page 140) onwards.
To perform an offline upgrade
1. Halt the cluster. Use the following command:
cmhaltcl -f
2. Upgrade all the nodes in the cluster to the new HP-UX release 1109 and select SGeRAC
A.11.20 during the upgrade.
3. Restart the cluster using the following command:
cmruncl
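After the cluster is restarted, you can verify the installed versions on each node; a minimal sketch using standard SD-UX and Serviceguard commands (the exact bundle names on your media may differ):
# swlist T1907BA
# cmversion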
Upgrading from an existing Serviceguard A.11.19 cluster to HP-UX 11i v3 1109 HA-OE/DC-OE
along with SGeRAC
To upgrade from an existing Serviceguard A.11.19 cluster to HP-UX 11i v3 1109 HA-OE/DC-OE
along with SGeRAC:
1. Install Serviceguard A.11.19 patch PHSS_42216 on all the cluster nodes before upgrading
from Serviceguard A.11.19 to HP-UX 11i v3 1109 HA-OE/DC-OE, Serviceguard, and SGeRAC
A.11.20. Otherwise, the upgraded Serviceguard and SGeRAC A.11.20 node will not be able
to join the existing A.11.19 cluster. (A sketch of the patch installation commands follows this
procedure.)
2. Select the licensed SGeRAC A.11.20 bundle to upgrade to HP-UX 11i v3 1109 HA-OE/DC-OE.
On the first node:
a. Halt the node using the following command:
cmhaltnode <node_name>
b. Select the SGeRAC bundle T1907BA while upgrading the node to HP-UX 11i v3 1109
HA-OE/DC-OE.
3. After upgrading to HP-UX 11i v3 1109 HA-OE/DC-OE, Serviceguard, and SGeRAC A.11.20,
you must install Serviceguard A.11.20 patch PHSS_42137 on the upgraded node. This patch
allows an upgraded Serviceguard and SGeRAC A.11.20 node to join the existing Serviceguard
A.11.19 cluster.
NOTE: If Serviceguard A.11.20 patch PHSS_42137 is not installed on the upgraded
Serviceguard and SGeRAC A.11.20 node, it cannot join the existing A.11.19 cluster.
4. Run the following command to join the upgraded Serviceguard and SGeRAC A.11.20 node
to the existing Serviceguard A.11.19 cluster:
cmrunnode <node_name>
NOTE: Make sure that Serviceguard Extension for RAC is installed on all the nodes
in the cluster and that all nodes are up and running before attempting to deploy Oracle RAC in
this cluster.
5. Repeat steps 2 to 4 for all the nodes in the cluster to complete the upgrade.
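As an illustration, the patches in steps 1 and 3 can be installed and verified with the standard SD-UX commands. This is a minimal sketch, assuming the patches have been downloaded to a depot at /tmp/patches (a hypothetical path); run the PHSS_42216 installation on each A.11.19 node before the upgrade and the PHSS_42137 installation on each node after it has been upgraded:
# swinstall -s /tmp/patches PHSS_42216
# swinstall -s /tmp/patches PHSS_42137
# swlist -l patch | grep PHSS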
Upgrading from an existing Serviceguard A.11.19 cluster to HP-UX 11i v3 1109 HA-OE/DC-OE
along with SGeRAC (Alternative approach)
This is an alternative approach for upgrading from Serviceguard A.11.19 to Serviceguard and
SGeRAC A.11.20:
1. Perform a rolling upgrade of Serviceguard A.11.19 to HP-UX 11i v3 1109 HA-OE/DC-OE
with Serviceguard A.11.20 without selecting the SGeRAC licensed bundle.
2. Install the Serviceguard A.11.20 patch PHSS_42137 on all cluster nodes.
NOTE: If Serviceguard A.11.20 patch PHSS_42137 is not installed on the upgraded
Serviceguard A.11.20 node, it cannot join the existing Serviceguard A.11.20 cluster when
you install SGeRAC on that node.
3. Use the following steps to perform a rolling upgrade of Serviceguard A.11.20 to include
SGeRAC A.11.20 on each node:
On the first node:
a. Halt the node using the following command:
cmhaltnode <node_name>
4. The upgraded Serviceguard and SGeRAC A.11.20 node will be able to join the existing
Serviceguard A.11.20 cluster.
NOTE: Make sure that Serviceguard Extension for RAC is installed on all the nodes
in the cluster and that all nodes are up and running before attempting to deploy Oracle RAC in
this cluster.
5. Repeat steps 3 and 4 for all the nodes in the cluster to complete the upgrade.
Upgrading from Serviceguard A.11.18 cluster to HP-UX 11i v3 1109 HA-OE/DC-OE along with
SGeRAC
To upgrade from Serviceguard A.11.18 to HP-UX 11i v3 1109 HA-OE/DC-OE:
1.
2.
Upgrading from Serviceguard A.11.20 cluster to HP-UX 11i v3 1109 HA-OE/DC-OE along with
SGeRAC
To upgrade from Serviceguard A.11.20 to HP-UX 11i v3 1109 HA-OE/DC-OE:
1. Install Serviceguard A.11.20 patch PHSS_42137 on all the cluster nodes before you start
upgrading to HP-UX 11i v3 1109 HA-OE/DC-OE.
NOTE: If Serviceguard A.11.20 patch PHSS_42137 is not installed on the upgraded
Serviceguard A.11.20 node, the node will not be able to join the existing Serviceguard A.11.20
cluster after you install SGeRAC on it.
2. Select the licensed SGeRAC A.11.20 bundle to upgrade each node to HP-UX 11i v3 1109
HA-OE/DC-OE.
On the first node:
a. Halt the node using the following command:
cmhaltnode <node_name>
b. Select the SGeRAC bundle T1907BA while upgrading the node to HP-UX 11i v3 1109
HA-OE/DC-OE.
NOTE: In this state, if you run commands such as cmcheckconf and cmapplyconf on the
cluster configuration file, no error or warning messages related to the missing SGeRAC software
are displayed.
3. Run the following command to join the upgraded node to the cluster:
cmrunnode <node_name>
NOTE: Make sure that Serviceguard Extension for RAC is installed on all nodes in
the cluster and that all nodes are up and running before attempting to deploy Oracle RAC in this
cluster.
4. Repeat steps 2 and 3 for all the nodes in the cluster to complete the upgrade.
4. Upgrade the HP-UX OS (if required), Serviceguard, and SGeRAC to the new release (SGeRAC
requires compatible versions of Serviceguard and the OS). For more information on how to
upgrade HP-UX, see the HP-UX Installation and Update Guide for the target version of HP-UX.
5. Edit the /etc/rc.config.d/cmcluster file, on the local node, to include the following
line:
AUTOSTART_CMCLD = 1
NOTE: Setting this parameter to 1 is optional. If you want the node to join the cluster at
boot time, set this parameter to 1; otherwise, set it to 0.
6. Restart the cluster on the upgraded node (if desired). You can do this in Serviceguard Manager,
or from the command line, issue the Serviceguard cmrunnode command.
7. Start Oracle (Clusterware, RAC) software on the local node.
8. Repeat steps 1-7 on the other nodes, one node at a time, until all nodes have been upgraded.
NOTE: Be sure to plan sufficient system capacity to allow moving the packages from node
to node during the upgrade process, to maintain optimum performance.
If a cluster fails before the rolling upgrade is complete (perhaps because of a catastrophic power
failure), the cluster could be restarted by entering the cmruncl command from a node that has
been upgraded to the latest revision of the software.
NOTE: If you are upgrading the Oracle RAC software along with SG/SGeRAC, HP recommends
upgrading Oracle RAC either before or after the SG/SGeRAC rolling upgrade.
Halt the SGeRAC toolkit packages before upgrading Oracle RAC. For more information on
upgrading the Oracle RAC software, see the Oracle documentation.
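For example, you might identify and halt the toolkit packages as follows (a minimal sketch; the package name is site-specific, so substitute the name reported by cmviewcl):
# cmviewcl
# cmhaltpkg <toolkit_package_name>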
Step 1.
1. Halt Oracle (Clusterware, RAC) software on node 1.
2. Halt node 1. You can do this in Serviceguard Manager or by issuing the cmhaltnode command.
This will cause the failover package to be halted cleanly and moved to node 2. The Serviceguard
daemon on node 1 is halted, and the result is shown in Figure 21.
Figure 21 Running Cluster with Packages Moved to Node 2
Step 2.
Upgrade node 1 and install the new version of Serviceguard and SGeRAC (A.11.16), as shown
in Figure 22.
NOTE: If you install Serviceguard and SGeRAC separately, Serviceguard must be installed before
installing SGeRAC.
Figure 22 Node 1 Upgraded to SG/SGeRAC 11.16
Step 3.
1. If you prefer, restart the cluster on the upgraded node (node 1). You can do this in Serviceguard
Manager, or from the command line issue the following:
# cmrunnode node1
2. At this point, different versions of the Serviceguard daemon (cmcld) are running on the two
nodes, as shown in Figure 23.
3. Start Oracle (Clusterware, RAC) software on node 1.
Step 4.
1.
2.
5.
Step 5.
Move PKG2 back to its original node. Use the following commands:
# cmhaltpkg pkg2
# cmrunpkg -n node2 pkg2
# cmmodpkg -e pkg2
The cmmodpkg command re-enables switching of the package that is disabled by the cmhaltpkg
command. The final running cluster is shown in Figure 25.
During a rolling upgrade, you should issue Serviceguard/SGeRAC commands (other than
cmrunnode and cmhaltnode) only on a node containing the latest revision of the software.
Performing tasks on a node containing an earlier revision of the software will not work or will
cause inconsistent results.
You cannot modify the cluster or package configuration until the upgrade is complete. Also,
you cannot modify the hardware configuration, including the cluster's network configuration,
during a rolling upgrade. This means that you must upgrade all nodes to the
new release before you can modify the configuration file and copy it to all nodes.
The new features of the Serviceguard/SGeRAC release may not work until all nodes have
been upgraded.
You can perform a rolling upgrade only on a configuration that has not been modified since
the last time the cluster was started.
Rolling upgrades are not intended as a means of using mixed releases of HP-UX, Serviceguard,
or SGeRAC within the same cluster. HP recommends upgrading all the cluster nodes to the
new release level as soon as possible.
You can upgrade the OS during a rolling upgrade only if your cluster uses SLVM over raw disks,
ASM over raw disks, or ASM over SLVM for the shared storage.
For more information, see http://www.hp.com/go/hpux-serviceguard-docs > HP
Serviceguard.
For more information on support, compatibility, and features for SGeRAC, refer to the
Serviceguard and Serviceguard Extension for RAC Compatibility and Feature Matrix, located
at www.hp.com/go/hpux-serviceguard-docs > HP Serviceguard Extension for RAC .
You cannot delete Serviceguard/SGeRAC software (via swremove) from a node while the
cluster is in the process of a rolling upgrade.
3. If necessary, upgrade all the nodes in the cluster to the new HP-UX release.
4. Upgrade all the nodes in the cluster to the new Serviceguard/SGeRAC release.
5. Restart the cluster. Use the following command:
# cmruncl
6. If necessary, upgrade all the nodes in the cluster to the new Oracle (RAC, CRS, Clusterware)
software release.
7. Restart Oracle (RAC, Clusterware) software on all nodes in the cluster and configure the
Serviceguard/SGeRAC packages and Oracle as needed.
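Once the cluster and Oracle have been restarted, you can confirm that all nodes, packages, and the cluster daemon are up; a minimal sketch using standard commands:
# cmviewcl -v
# ps -ef | grep cmcld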
Before you proceed, read the sections Upgrading from an Earlier Serviceguard Release and
Rolling Upgrade in the latest version of the release notes for A.11.20.
Serviceguard A.11.20 is supported on HP-UX 11i v3 only. For more information, see the HP-UX
11i v3 Installation and Update Guide at www.hp.com/go/hpux-core-docs > HP-UX 11i
v3.
You can perform a rolling upgrade from A.11.19 to a later release, or from an earlier release
to A.11.19, but you cannot do a rolling upgrade from a pre-A.11.19 release to a post-A.11.19
release.
This is because A.11.19 is the only version of Serviceguard that will allow both the older
version of the cluster manager and the new version (introduced in A.11.19) to coexist during
a rolling upgrade.
If you are upgrading from a pre-A.11.19 release: Start by reading Upgrading from an Earlier
Serviceguard Release and Rolling Upgrade in the release notes. Then, if you decide to upgrade
to A.11.19 in preparation for a rolling upgrade to A.11.20, continue with the following
subsection that provides information on upgrading to A.11.19.
==========================================================================
LVM Volume Group and Physical Volume Worksheet
Volume Group Name: ______________________________________________________
PV Link 1 Physical Volume Name: __________________________________________
PV Link 2 Physical Volume Name: __________________________________________
Physical Volume Name: ____________________________________________________
Physical Volume Name: ____________________________________________________
Physical Volume Name: ____________________________________________________
Physical Volume Name: ____________________________________________________
Physical Volume Name: ____________________________________________________
Volume Group Name: _______________________________________________________
PV Link 1 Physical Volume Name: __________________________________________
PV Link 2 Physical Volume Name: __________________________________________
                                                SIZE
Data: System     _____________________________________________________
Data: Rollback   _____________________________________________________
Data: Temp       _____________________________________________________
Data: Users      _____________________________________________________
Data: Tools      _____________________________________________________
Index
A
activation of volume groups
in shared mode, 128
administration
cluster and package states, 113
array
replacing a faulty mechanism, 132, 133, 134
B
building a cluster
CVM infrastructure, 52
building an RAC cluster
displaying the logical volume infrastructure, 46
logical volume infrastructure, 40
building logical volumes
for RAC, 45
C
CFS, 47, 51
cluster
state, 118
status options, 116
Cluster Communication Network Monitoring, 35
cluster volume group
creating physical volumes, 41
creating a storage infrastructure, 47
CVM
creating a storage infrastructure, 52
use of the CVM-pkg, 55
D
deactivation of volume groups, 128
deciding when and where to run packages, 18
deleting from the cluster, 51
deleting nodes while the cluster is running, 130
demo database
files, 45, 57
disk
choosing for volume groups, 41
disk arrays
creating logical volumes, 44
disk storage
creating the infrastructure with CVM, 52
disks
replacing, 132
Dynamic Root Disk (DRD), 140
E
eight-node cluster with disk array
figure, 22
EMS
for preventive monitoring, 131
enclosure for disks
replacing a faulty mechanism, 132
Event Monitoring Service
in troubleshooting, 131
exporting
shared volume group data, 46
exporting files
LVM commands, 46
F
figures
eight-node cluster with EMC disk array, 22
node 1 rejoining the cluster, 145
node 1 upgraded to HP-UX 111.00, 145
running cluster after upgrades, 147
running cluster before rolling upgrade, 144
running cluster with packages moved to node 1, 146
running cluster with packages moved to node 2, 144
H
HAIP, 35
hardware
adding disks, 131
monitoring, 131
heartbeat subnet address
parameter in cluster manager configuration, 35
high availability cluster
defined, 12
I
in-line terminator
permitting online hardware maintenance, 134
installing
Oracle RAC, 47
installing software
Serviceguard Extension for RAC, 33
IP address
switching, 19
L
lock disk
replacing a faulty mechanism, 134
logical volumes
blank planning worksheet, 151
creating, 45
creating for a cluster, 42, 56, 57
creating the infrastructure, 40
disk arrays, 44
filled in planning worksheet, 31
lssf
using to obtain a list of disks, 41
LVM
creating on disk arrays, 44
LVM commands
exporting files, 46
M
maintaining a RAC cluster, 113
maintenance
N
network
status, 118
node
halting status, 121
in an RAC cluster, 12
status and state, 116
non-rolling upgrade
DRD, 149
O
online hardware maintenance
by means of in-line SCSI terminators, 134
Online node addition and deletion, 126
Online reconfiguration, 126
opsctl.ctl
Oracle demo database files, 45, 57
opslog.log
Oracle demo database files, 45, 57
Oracle
demo database files, 45, 57
Oracle 10 RAC
installing binaries, 62
Oracle 10g/11gR2 RAC
introducing, 25
Oracle Disk Manager
configuring, 64
Oracle RAC
installing, 47
Oracle10g
installing, 62
P
package
basic concepts, 13
moving status, 120
state, 118
status and state, 116
switching status, 121
package configuration
service name parameter, 35
packages
deciding where and when to run, 18
physical volumes
creating for clusters, 41
filled in planning worksheet, 151
planning
worksheets for logical volume planning, 31
worksheets for physical volume planning, 151
planning worksheets
blanks, 151
point to point connections to storage devices, 21
PVG-strict mirroring
creating volume groups with, 42
R
RAC
overview of configuration, 12
status, 117
RAC cluster
defined, 12
removing Serviceguard Extension for RAC from a system, 130
replacing disks, 132
rollback.dbf
Oracle demo database files, 45, 46, 58
rolling software upgrades
example, 143
steps, 142
rolling upgrade
DRD, 149
limitations, 147, 148
S
service
status, 117
service name
parameter in package configuration, 35
SERVICE_NAME
parameter in package configuration, 35
Serviceguard Extension for RAC
installing, 33
introducing, 12
shared mode
activation of volume groups, 128
deactivation of volume groups, 128
shared volume groups
making volume groups shareable, 127
sharing volume groups, 46
SLVM
making volume groups shareable, 127
state
cluster, 118
node, 116
of cluster and package, 113
package, 116, 118
status
cluster, 116
halting node, 121
moving package, 120
network, 118
node, 116
normal running RAC, 119
of cluster and package, 113
package, 116
RAC, 117
service, 117
switching package, 121
Storage Management Suite (SMS), 16
switching IP addresses, 19
system multi-node package
used with CVM, 55
system.dbf
Oracle demo database files, 45, 58
T
temp.dbf
Oracle demo database files, 45, 58
troubleshooting
monitoring hardware, 131
replacing disks, 132
U
upgrade
DRD, 149
upgrade restrictions
DRD, 149
V
volume group
creating for a cluster, 42
creating physical volumes for clusters, 41
volume groups
adding shared volume groups, 130
displaying for RAC, 46
exporting to other nodes, 46
making changes to shared volume groups, 128
making shareable, 127
making unshareable, 128
W
worksheet
logical volume planning, 31
worksheets
physical volume planning, 151
worksheets for planning
blanks, 151