
Best Practices for 10g RAC

December 2005

Roy Rossebo
RAC Pack
Oracle Corporation

“This presentation is for informational purposes only and may not be incorporated into a contract or agreement.”
Session Objectives

• Best practices based on the latest field experiences
• Avoid some common issues
• Get ready for the Grid: beyond 2-node clusters
• Assumes basic RAC architecture is known


1. Define Service Level Objectives

• Cannot be Successful without Objectives
• Realistic and Measurable Objectives
  – Driven by and linked to business objectives
  – Existing service levels typically provide the baseline
  – Maximum planned/unplanned downtime
  – Performance requirements: number of users, response times, jobs/transactions per time period
  – Consolidation, cost savings
Realistic Expectations

• If your application will scale transparently on SMP, then it is realistic to expect it to scale well on RAC, without having to make any changes to the application code.

• RAC eliminates the database instance, and the node itself, as a single point of failure, and ensures database integrity in the case of such failures.
2. Keep It Simple

• Use Proven Configurations
• Certified Components
  – Metalink Certification matrix
  – Vendor certifications
• Minimize the Number of Components
  – Is another cluster manager still required?
  – Are volume managers still needed?
• Partner with Vendors
• Don't be too 'Creative'
3. Get ready for the Grid

• 2-node clusters give few or no Grid benefits
  – 30+ customers with 6+ nodes in a cluster
  – 4+ node clusters are the norm for new systems
• Large Clusters = Large HW Savings
• Provisioning, Capacity on Demand
• RAC performance scales better than SMP for a large number of CPUs
• Grid Control: RAC System Management is very similar to Single Instance
4. Use Oracle10g Release 2

• OCR and Voting Disk mirroring (see the sketch below)
• Simpler Install Process for Large Clusters
• Significant Performance Improvements
• Framework for Workload Management in Large Clusters: Services, FAN and LBA
• Clusterware High Availability API
• Etc.
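As a rough sketch of configuring the 10g Release 2 redundancy features (run as root; the raw device paths are hypothetical, and on 10.2 the votedisk commands are normally run with CRS stopped, hence -force):

$ ocrconfig -replace ocrmirror /dev/raw/raw_ocr2      # add a second OCR copy
$ crsctl add css votedisk /dev/raw/raw_vote2 -force   # add voting disks (keep an odd total)
$ crsctl add css votedisk /dev/raw/raw_vote3 -force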
Before Oracle10g Release 2
[Diagram: two buffer caches read data blocks of an object (e.g. a table) from disk; a GCS message is sent to the remote master node whenever a block is read into the cache.]

Object Affinity (Oracle10g Release 2)
[Diagram: the object is mastered on the instance that uses it, so NO messages are sent to the remote node when its blocks are read into the cache.]


5. Implement Workload Mgt.
• Workload Management for Clusters – more than just load balancing
  – Effective Utilization of Cluster Resources
  – Minimize Synchronization Cost in the Cluster
  – Monitor and Optimize Response Times
  – Reduce Network Delays During Failover
• Take advantage of Services, FAN and LBA (see the srvctl sketch after the next diagram)
Connect to Services

  Service: GL    Preferred: 1,2,3,4
  Service: ERP   Preferred: 1,2   Available: 3,4
  Service: CRM   Preferred: 3,4   Available: 1,2

  Instances: FINPROD1, FINPROD2, FINPROD3, FINPROD4
  Database Service: FINPROD
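As a sketch, using the database, service and instance names from the diagram above, the ERP service could be created and started with srvctl:

$ srvctl add service -d FINPROD -s ERP -r "FINPROD1,FINPROD2" -a "FINPROD3,FINPROD4"
$ srvctl start service -d FINPROD -s ERP
$ srvctl status service -d FINPROD -s ERP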


Service across RAC – Connection Pool

Initial State
[Diagram: connections C1–C6 from the pool are spread across Instance 1 and Instance 2.]

Instance Join
[Diagram: Instance 3 joins the cluster and starts offering the service.]

Instance Join – Undesired Case 1: Nothing Happens
[Diagram: the pool keeps C1–C6 on Instances 1 and 2; the new instance is not used.]

Instance Join – Undesired Case 2: Add New Connections Randomly
[Diagram: new connections C7–C9 are created at random, so the load stays unevenly distributed.]

Instance Join – Desired Result
[Diagram: the pool rebalances so that connections C1–C9 are spread evenly across Instances 1, 2 and 3.]

Node Leaves
[Diagram: when an instance leaves, its connections are redistributed across the surviving instances.]
Runtime Connection Load Balancing

[Diagram: CRM client connection requests arrive at the connection cache; load balancing advisory events report one instance as "bored", another as "busy" and another as "very busy", so the cache routes most of the work (e.g. 60%) to the least loaded instance and only a small share (e.g. 10%) to the busiest one.]
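A minimal sketch of enabling the load balancing advisory for the CRM service shown above, using DBMS_SERVICE in 10g Release 2 (the goal values are examples only):

SQL> BEGIN
  2    DBMS_SERVICE.MODIFY_SERVICE(
  3      service_name        => 'CRM',
  4      goal                => DBMS_SERVICE.GOAL_SERVICE_TIME,  -- drives the runtime advisory
  5      clb_goal            => DBMS_SERVICE.CLB_GOAL_SHORT,     -- connection load balancing goal
  6      aq_ha_notifications => TRUE);                           -- FAN events for the clients
  7  END;
  8  /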


6. Configure Interconnect Correctly
• Use UDP over Gigabit Ethernet for the Interconnect!
• OS Bonding for Failover & Load Balancing
  – NIC bonding, multiple switches
• Set UDP buffers to the maximum (see the sketch at the end of this slide)
  Check:
  $ /sbin/sysctl net.core.rmem_max net.core.wmem_max net.core.rmem_default net.core.wmem_default
  net.core.rmem_max = 262144
  net.core.wmem_max = 262144
  net.core.rmem_default = 262144
  net.core.wmem_default = 262144
• Use switch(es) – a crossover cable is not supported
• Eliminate any Transmission Problems
• Ensure the LMS processes get enough CPU resources
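To raise the limits, a minimal sketch (run as root; 262144 matches the check output above, but consult the platform-specific installation guide for current recommendations, and add the same entries to /etc/sysctl.conf so they persist across reboots):

$ /sbin/sysctl -w net.core.rmem_max=262144 net.core.wmem_max=262144
$ /sbin/sysctl -w net.core.rmem_default=262144 net.core.wmem_default=262144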
Correct network used?
SQL> select * from gv$cluster_interconnects;

INST_ID NAME IP_ADDRESS    IS_PUBLIC SOURCE
------- ---- ------------- --------- -------------------------
      1 eth2 138.2.238.74  NO        Oracle Cluster Repository
      2 eth2 138.2.238.75  NO        Oracle Cluster Repository

SQL> oradebug setmypid
Statement processed.
SQL> oradebug ipc
Information written to trace file.
SQL> oradebug tracefile_name

Trace file excerpt:
SSKGXPT 0xcc1c78c flags SSKGXPT_READPENDING info for network 0
socket no 7 IP 138.2.238.74 UDP 33594
sflags SSKGXPT_UP
Correct network used?
$ oifcfg getif
eth0 138.2.236.114 global public
eth2 138.2.238.74 global cluster_interconnect

$ ocrdump
$ view OCRDUMPFILE
...
[SYSTEM.css.node_numbers.node2.privatename]
ORATEXT : stnsp014-rac
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS, GROUP_PERMISSION :
PROCR_READ, OTHER_PERMISSION : PROCR_READ, USER_NAME : root,
GROUP_NAME : root}
Interconnect
• Jumbo Frames (NIC and switch) – test properly first! (MTU sketch at the end of this slide)
  $ /sbin/ifconfig eth1
  eth1 Link encap:Ethernet HWaddr 00:0E:0C:09:6C:81
       inet6 addr: fe80::20e:cff:fe09:6c81/64 Scope:Link
       UP BROADCAST RUNNING NOARP MULTICAST MTU:1500 Metric:1

  $ ping -s 8972 -M do stnsp013-rac
  From stnsp014-rac (138.2.238.74) icmp_seq=0 Frag needed and DF set (mtu = 1500)

• Full Duplex
  $ /sbin/mii-tool -v eth0
  eth0: 100 Mbit, full duplex, link ok
    product info: vendor 00:50:43, model 2 rev 3
    basic mode: 100 Mbit, full duplex
    basic status: link ok
    capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
    advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
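A minimal sketch of raising the MTU on the private NIC (eth1 here, matching the output above); the switch ports must be configured for the same frame size, and the change should also be made persistent in the distribution's network scripts:

$ /sbin/ifconfig eth1 mtu 9000
$ ping -s 8972 -M do stnsp013-rac    # re-run the fragmentation test shown above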
7. Use Cluster Verification Utility

Stage checks follow the install flow (example below):
  – User sets up the hardware, network & storage: -post hwos
  – Installs CRS: -pre crsinst / -post crsinst
  – Sets up OCFS (optional): -pre cfs / -post cfs
  – Installs RAC: -pre dbinst
  – Configures the RAC DB: -pre dbcfg
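For example, the pre-CRS-install stage check run against two nodes (the node names are placeholders):

$ ./cluvfy stage -pre crsinst -n node1,node2 -verbose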
List of Components
$ ./cluvfy comp -list

Valid components are:
  nodereach : checks reachability between nodes
  nodecon   : checks node connectivity
  cfs       : checks CFS integrity
  ssa       : checks shared storage accessibility
  space     : checks space availability
  sys       : checks minimum system requirements
  clu       : checks cluster integrity
  clumgr    : checks cluster manager integrity
  ocr       : checks OCR integrity
  crs       : checks CRS integrity
  nodeapp   : checks node applications existence
  admprv    : checks administrative privileges
  peer      : compares properties with peers
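For instance, a single component check for node connectivity (node names are placeholders):

$ ./cluvfy comp nodecon -n node1,node2 -verbose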
CVU locations
• Oracle DVD
  – clusterware/cluvfy/runcluvfy.sh
  – clusterware/rpm/cvuqdisk-1.0.1-1.rpm

• CRS Home
  – $ORA_CRS_HOME/bin/cluvfy
  – $ORA_CRS_HOME/cv/rpm/cvuqdisk-1.0.1-1.rpm

• Oracle Home
  – $ORACLE_HOME/bin/cluvfy
Deployment of cluvfy

• Install CVU only on the local node; the tool deploys itself on remote nodes during execution, as required:
  – User installs on the local node
  – User issues a verification command for multiple nodes
  – The tool copies the required bits to the remote nodes
  – The tool executes the verification tasks on all nodes and generates a report
8. Test, Test, Test…
• Why?
  – Verify that the System Infrastructure meets the SLOs
  – Verify Correct Install / Configuration
  – Build Skills
• How?
  – Test Plan
  – Separate Test Cluster
  – Realistic Configuration – matching production
  – Realistic Workload
• Best Insurance, but can be difficult / expensive
  – Functional, Performance and Destructive Testing
  – Include Normal and Exception Operational Procedures
9. Monitor Performance

• Establish a Performance Baseline
• AWR / Statspack (sketch below)
• ADDM
• Active Session History
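A minimal sketch of taking a manual AWR snapshot and producing a report, using the standard 10g scripts from SQL*Plus as a DBA user:

SQL> exec DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;
SQL> @?/rdbms/admin/awrrpt.sql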
ADDM (excerpt)
FINDING 5: 7.1% impact (5377 seconds)
-------------------------------------
Read and write contention on database blocks was consuming significant
database time in the cluster.

RECOMMENDATION 1: Schema, 2.7% benefit (2074 seconds)


ACTION: Consider partitioning the INDEX "FHUS_NEW.PK_OBJECT_STORE" with
object id 101859 in a manner that will evenly distribute concurrent
DML across multiple partitions.
RELEVANT OBJECT: database object with id 101859
RATIONALE: The INSERT statement with SQL_ID "gf99f4tkwcvt9" was
significantly affected by "global cache buffer busy".
RELEVANT OBJECT: SQL statement with SQL_ID gf99f4tkwcvt9
insert into OBJECT_STORE (VERSION, TYPE_ID, CREATE_DATE, UPDATE_DATE,
STATE_ID, ENCODING_TYPE, COMPRESSION_TYPE, STORED_DATA_SIZE,
ORIGINAL_DATA_SIZE, FORWARD_ID, OBJECT_ID) values (:1, :2, :3, :4,
:5, :6, :7, :8, :9, :10, :11)
RATIONALE: The INSERT statement with SQL_ID "cfpgqrsxcfhg1" was
significantly affected by "global cache buffer busy".
RELEVANT OBJECT: SQL statement with SQL_ID cfpgqrsxcfhg1
insert into OBJECT_REFERENCE (OBJECT_ID, REFERENCE_ID) values (:1,
:2)

ADDM identifies SQL and objects with serializing contention:
SQL> @?/rdbms/admin/addmrpt.sql

Active Session History Report:
SQL> @?/rdbms/admin/ashrpt.sql
10. Prepare Diagnostics Procedures, just in case…
• Capture the maximum possible diagnostics before restarting the system, etc.
  – Tradeoff with recovery time
• Oracle side (sketch below)
  – Hanganalyze
  – Systemstate dumps
  – gv$ views
  – Statspack / AWR
• OS setup, e.g. Linux
  – Netdump Utility (Metalink note 226057.1)
  – Alt SysRq Keys Utility (Metalink note 228203.1)
  – Serial Console (Metalink note 228204.1)
  – Performance / load data
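A sketch of the cluster-wide hang and system state dumps mentioned above (run from SQL*Plus as SYSDBA; the levels shown are common choices, so check the relevant Metalink notes for your release before using them):

SQL> oradebug setmypid
SQL> oradebug setinst all
SQL> oradebug -g def hanganalyze 3
SQL> oradebug -g def dump systemstate 266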
And Don’t Forget …

• Synchronize Time across the cluster – NTP
• Use ASM for Database Storage
• Use Grid Control
• Cache Sequences (sketch below)
• Use ASSM for all Tablespaces
• Linux hugetlb for large memory
• Etc.
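For example, caching sequence values reduces cross-instance contention on sequence maintenance (a sketch; app.order_seq is a hypothetical sequence and the cache size is only an example):

SQL> ALTER SEQUENCE app.order_seq CACHE 1000 NOORDER;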
Summary
1. Define Service Level Objectives
2. Keep It Simple
3. Get ready for the Grid
4. Use Oracle10g Release 2
5. Implement Workload Management
6. Configure the Interconnect Correctly
7. Use Cluster Verification Utility
8. Test, Test, Test…
9. Monitor Performance
10. Prepare Diagnostics Procedures, just in
case…
Q & A
QUESTIONS & ANSWERS
