Legal information
The information in this presentation is provided by IBM on an "AS IS" basis without any warranty, guarantee or assurance of any kind. IBM also does not provide any warranty, guarantee or assurance that the information in this paper is free from errors or omissions. Information is believed to be accurate as of the date of publication. You should check with the appropriate vendor to obtain current product information. Any proposed use of claims in this presentation outside of the United States must be reviewed by local IBM country counsel prior to such use. IBM, RS6000, System p, AIX, AIX 5L, GPFS, and Enterprise Storage Server (ESS) are trademarks or registered trademarks of the International Business Machines Corporation. Oracle, Oracle9i and Oracle10g are trademarks or registered trademarks of Oracle Corporation. All other products or company names are used for identification purposes only, and may be trademarks of their respective owners.
Agenda
AIX Configuration Best Practices for Oracle: Memory, CPU, I/O, Network, Miscellaneous
The 32-bit or 64-bit address translates into a 52-bit or 80-bit virtual address (VA):
32-bit system: 4-bit segment register that contains a 24-bit segment ID, plus a 28-bit offset. 24-bit segment ID + 28-bit offset = 52-bit VA
64-bit system: 32-bit segment register that contains a 52-bit segment ID, plus a 28-bit offset. 52-bit segment ID + 28-bit offset = 80-bit VA
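The address arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the only inputs are the bit widths quoted on the slide.

```python
# Sketch: virtual address width = segment ID bits + offset bits.
# The bit widths come from the slide; the function names are illustrative.
OFFSET_BITS = 28  # offset within a segment


def virtual_address_bits(segment_id_bits: int, offset_bits: int = OFFSET_BITS) -> int:
    """Return the width of the virtual address in bits."""
    return segment_id_bits + offset_bits


def segment_size_bytes(offset_bits: int = OFFSET_BITS) -> int:
    """A segment spans 2**offset_bits bytes."""
    return 2 ** offset_bits


if __name__ == "__main__":
    print("32-bit system VA bits:", virtual_address_bits(24))
    print("64-bit system VA bits:", virtual_address_bits(52))
    print("Segment size in MB:", segment_size_bytes() // 2**20)
```

A 28-bit offset also implies that each segment covers 256 MB.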
The VMM maintains a list of free frames that can be used to retrieve pages that need to be brought into memory.
The VMM replenishes the free list by removing some of the current pages from real memory (i.e., steal memory). The process of moving data between memory and disk is called paging.
The VMM uses a Page Replacement Algorithm (implemented in the lrud kernel threads) to select pages that will be removed from memory.
[Figure: file cache tunables for JFS: maxperm and strict_maxperm; maxclient and strict_maxclient]
NAME               CUR   DEF   BOOT  MIN  MAX    UNIT          TYPE
--------------------------------------------------------------------
lru_file_repage    1     1     1     0    1      boolean       D
lru_poll_interval  0     0     0     0    60000  milliseconds  D
maxclient%         80    80    80    1    100    % memory      D
maxfree            1088  1088  1088  8    200K   4KB pages     D
maxperm%           80    80    80    1    100    % memory      D
minfree            960   960   960   8    200K   4KB pages     D
strict_maxclient   1     1     1     0    1      boolean       D
strict_maxperm     0     0     0     0    1      boolean       D
minperm%           20    20    20    1    100    % memory      D
2012 IBM Corporation August 29, 2012
On AIX 5.3, a number of the default vmo settings are not optimized for database workloads and should be modified for Oracle environments
Many tunables are classified as Restricted:
Only change them if AIX Support says so
They are not displayed unless the -F option is used with commands like vmo, no, ioo, etc.
When migrating from AIX 5.3 to 6.1/7.1, parameter override settings in AIX 5.3 are transferred to the AIX 6.1/7.1 environment
[Figure: memory usage breakdown: unused, file cache, process]
The current memory use can be verified via:
cat /proc/sys/fs/jfs2/memory_usage
metadata cache: 31186944
inode cache: 34209792
total: 65396736
*1 Note: Default values in AIX 7.1 are 200 (5%), 200 (2%)
Start stealing pages when free memory is below minfree
Stop stealing pages when free memory is above maxfree
When numperm% > maxperm%, steal only file system pages
When minperm% < numperm% < maxperm%, steal file system or computational pages, depending on repage rate
When numperm% < minperm%, steal both file system and computational pages
[Figure: comp% and free% shown against the minfree, minperm% and maxperm% thresholds]
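The page-stealing rules above can be sketched as a small decision function. This is a simplified model, not the actual lrud implementation: the function names are hypothetical, and the tie-break in the middle band (steal computational pages only when file pages repage more often) is an assumption standing in for "depending on repage rate".

```python
# Simplified model of the page-stealing rules; not the real lrud code.

def lrud_active(free_pages, minfree, maxfree, currently_stealing):
    """Hysteresis: start stealing below minfree, stop above maxfree."""
    if free_pages < minfree:
        return True
    if free_pages > maxfree:
        return False
    return currently_stealing  # between the thresholds, keep the current state


def steal_candidates(numperm_pct, minperm_pct, maxperm_pct,
                     file_repage_rate=0.0, comp_repage_rate=0.0):
    """Return the set of page types eligible for stealing."""
    if numperm_pct > maxperm_pct:
        return {"file"}
    if numperm_pct < minperm_pct:
        return {"file", "computational"}
    # Middle band. ASSUMPTION: file pages are protected only when they
    # are being repaged more often than computational pages.
    if file_repage_rate > comp_repage_rate:
        return {"computational"}
    return {"file"}


if __name__ == "__main__":
    # Thresholds taken from the default vmo values listed in this deck.
    print(lrud_active(900, 960, 1088, currently_stealing=False))
    print(steal_candidates(85, 20, 80))
    print(steal_candidates(10, 20, 80))
```

With the defaults shown earlier (minfree=960, maxfree=1088, minperm%=20, maxperm%=80), a system at 85% numperm would steal file pages only.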
AIX 6.1 introduced the ability to maintain separate LRU lists for computational vs. filesystem pages.
Also backported to AIX 5.3
By default, memory for a process is allocated from memory associated with the processor that caused the page fault. Memory pool configuration is influenced by the vmo parameter memory_affinity:
memory_affinity=1 means configure memory pools based on physical hardware configuration (DEFAULT)
memory_affinity=0 means configure roughly uniform memory pools from any physical location
p590 / p595 MCM Architecture
The number of memory pools can be seen with vmstat -v | grep pools
Their size can only be seen using KDB
LRUD operates per memory pool
64K pages: available with POWER5+ and later & AIX 5.3 TL4+
Can be paged to paging space
Can be converted to 4K pages if not enough 4K pages are available
Kernel page size used in AIX 5.3 TL4+ and above (can be configured)
Can be utilized for application code, data and stack as well, but requires specific configuration
16M pages (also referred to as Large Pages): available with POWER4 hardware (or later)
Requires pinned memory and explicit configuration
Cannot be paged to paging space
[Figure: memory used (MB) over time, 14:02 to 14:18, with series for 4 KB used, 4 KB free, 64 KB used and 64 KB free]
64K pages are very promising, since they do not need to be configured/reserved in advance or pinned
export LDR_CNTRL=DATAPSIZE=64K@TEXTPSIZE=64K@STACKPSIZE=64K@SHMPSIZE=64K to use the 64K page size for stack, data & text
Will require Oracle to explicitly request the page size (10.2.0.4 & up plus Oracle patch# 7226548)
If the preferred size is not available, the largest available smaller size will be used
Current Oracle versions will end up using 64KB pages even if SGA is not pinned
vmo -p -o v_pinshm=1
Leave maxpin% at the default of 80% unless the SGA exceeds 77% of real memory
Oracle Parameters
LOCK_SGA = TRUE
Note: It is recommended not to pin the SGA, as long as you have configured the VMM, SGA & PGA properly.
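The maxpin% guidance above (leave the 80% default unless the SGA exceeds 77% of real memory) amounts to a simple ratio check. The sketch below is illustrative only: the function name is hypothetical and the memory sizes in the example are made-up inputs, not measurements from any system.

```python
# Sketch of the "SGA exceeds 77% of real memory" check behind the
# maxpin% guidance. Names and example sizes are hypothetical.
MAXPIN_DEFAULT_PCT = 80
SGA_THRESHOLD_PCT = 77


def sga_exceeds_threshold(sga_bytes, real_memory_bytes,
                          threshold_pct=SGA_THRESHOLD_PCT):
    """True when the SGA is larger than threshold_pct of real memory."""
    if real_memory_bytes <= 0:
        raise ValueError("real_memory_bytes must be positive")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return sga_bytes * 100 > threshold_pct * real_memory_bytes


if __name__ == "__main__":
    sga = 18_172_864_960  # example SGA size in bytes
    for real_gib in (20, 32):  # hypothetical LPAR memory sizes
        review = sga_exceeds_threshold(sga, real_gib * 2**30)
        print(f"{real_gib} GiB real memory: review maxpin% = {review}")
```

When the check returns False, the default maxpin% of 80 can be left alone.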
SGA regions
------------------------------
Database Buffers
Fixed Size
Redo Buffers
Variable Size
sum    18,172,864,960
Oracle dynamically allocates memory for the SGA only as needed, up to the size specified by SGA_TARGET
SGA_TARGET may be dynamically increased, up to SGA_MAX_SIZE
64K pages are automatically used for the SGA if supported in the environment. If needed, 4K (or 16M) pages are converted to 64K pages.
LOCK_SGA=true Discouraged
Oracle Automatic Memory Management (AMM) cannot be used (MEMORY_TARGET)
Oracle pre-allocates all memory as specified by SGA_MAX_SIZE and pins it in memory, even if it is not all used (i.e. SGA_TARGET < SGA_MAX_SIZE)
If sufficient 16M pages are available, those will be used. Otherwise, all the SGA memory will be allocated from 64K pages (if supported) or 4K pages (if 64K pages are not supported). If needed, 4K (or 16M) pages will be converted to 64K pages, but 16M pages are never automatically created.
If a value for SGA_MAX_SIZE is specified larger than the amount of available memory for computational pages, the system can become unresponsive due to system paging.
If the specified SGA_MAX_SIZE is much larger than the currently available pages on the combined 64K and 16M page free lists, the database startup can fail with error: IBM AIX RISC System/6000 Error: 12: Not enough space. In this case, retry starting the database.
Source: Oracle Database Concepts 11g Release 1 (11.1) Part Number B28318-05
# vmstat -v
  1048576 memory pages
  1002006 lruable pages
   812111 free pages
        1 memory pools
   141103 pinned pages
     80.0 maxpin percentage
      3.0 minperm percentage
     90.0 maxperm percentage
      3.2 numperm percentage
    32779 file pages
      0.0 compressed percentage
        0 compressed pages
      0.0 numclient percentage
     90.0 maxclient percentage
        0 client pages
        0 remote pageouts scheduled
        0 pending disk I/Os blocked with no pbuf
        0 paging space I/Os blocked with no psbuf
     2484 filesystem I/Os blocked with no fsbuf
        0 client filesystem I/Os blocked with no fsbuf
        0 external pager filesystem I/Os blocked with no fsbuf
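Output in the vmstat -v format shown above is easy to post-process. The following is a minimal sketch, assuming each line has the form "value label"; the function names are hypothetical, and the sample embeds a subset of the values from the slide.

```python
# Sketch: parse "value label" lines from vmstat -v and report the
# "blocked" counters that are non-zero. Helper names are illustrative.
SAMPLE = """\
  1048576 memory pages
  1002006 lruable pages
   812111 free pages
     80.0 maxpin percentage
      3.2 numperm percentage
        0 pending disk I/Os blocked with no pbuf
        0 paging space I/Os blocked with no psbuf
     2484 filesystem I/Os blocked with no fsbuf
        0 client filesystem I/Os blocked with no fsbuf
        0 external pager filesystem I/Os blocked with no fsbuf
"""


def parse_vmstat_v(text):
    """Map each label to its numeric value; skip lines that do not fit."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        value, label = parts
        try:
            stats[label.strip()] = float(value)
        except ValueError:
            continue  # e.g. the command line itself
    return stats


def blocked_io(stats):
    """Return only the 'blocked' counters that are greater than zero."""
    return {k: v for k, v in stats.items() if "blocked" in k and v > 0}


if __name__ == "__main__":
    for label, count in blocked_io(parse_vmstat_v(SAMPLE)).items():
        print(f"{int(count)} {label}")
```

In the sample, only the JFS fsbuf counter is non-zero, which is the kind of value worth tracking between two runs rather than reading once.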
[svmon -G output, partially recovered]
svmon -G
               size      inuse       free        pin    virtual
memory      1179648     926225     290287     493246     262007
pg space    1572864       5215

               pers       clnt      other
pin               0          0      74176
in use         4316     335656

PageSize   PoolSize       pgsp
s    4 KB                 5215
m   64 KB                    0
L   16 MB        80          0
Monitor paging activity:
vmstat -s
sar -r
nmon
Resolve paging issues:
Reduce file system cache size (MAXPERM, MAXCLIENT)
Reduce Oracle SGA or PGA (9i or later) size
Add physical memory
CPU Considerations
Oracle Parameters based on the # of CPUs
DB_WRITER_PROCESSES
Degree of Parallelism: user level, table level, query level
PARALLEL_MAX_SERVERS or PARALLEL_AUTOMATIC_TUNING (CPU_COUNT * PARALLEL_THREADS_PER_CPU)
CPU_COUNT
FAST_START_PARALLEL_ROLLBACK (should be using UNDO instead)
CBO: execution plan may be affected; check explain plan
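The CPU_COUNT * PARALLEL_THREADS_PER_CPU product mentioned above shows why a change in the number of CPUs visible to Oracle ripples into parallel settings. A minimal sketch, with a hypothetical function name and made-up input values:

```python
# Sketch: the CPU-derived product cited on the slide. The inputs below are
# hypothetical; read the real values from the instance parameters.

def parallel_thread_product(cpu_count: int, parallel_threads_per_cpu: int) -> int:
    """Return CPU_COUNT * PARALLEL_THREADS_PER_CPU."""
    if cpu_count < 1 or parallel_threads_per_cpu < 1:
        raise ValueError("both inputs must be at least 1")
    return cpu_count * parallel_threads_per_cpu


if __name__ == "__main__":
    # Doubling the CPUs seen by the instance doubles the product.
    for cpus in (4, 8):
        print(cpus, "CPUs ->", parallel_thread_product(cpus, 2))
```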
lparstat command
# lparstat -i
Node Name                          : erpcc8
Partition Name                     :
Partition Number                   :
Type                               : Dedicated
Mode                               : Capped
Entitled Capacity                  : 4.00
Partition Group-ID                 :
Shared Pool ID                     :
Online Virtual CPUs                : 4
Maximum Virtual CPUs               : 4
Minimum Virtual CPUs               : 1
Online Memory                      : 8192 MB
Maximum Memory                     : 9216 MB
Minimum Memory                     : 128 MB
Variable Capacity Weight           :
Minimum Capacity                   : 1.00
Maximum Capacity                   : 4.00
Capacity Increment                 : 1.00
Maximum Physical CPUs in system    : 4
Active Physical CPUs in system     : 4
Active CPUs in Pool                :
Unallocated Capacity               :
Physical CPU Percentage            : 100.00%
Unallocated Weight                 :
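The "label : value" layout of lparstat -i lends itself to a simple parser. The sketch below assumes one field per line with a single separating colon; the function name is hypothetical and the sample is a subset of the listing from the slide.

```python
# Sketch: turn "Label : value" lines from lparstat -i into a dict.
SAMPLE = """\
Node Name                          : erpcc8
Partition Name                     :
Type                               : Dedicated
Mode                               : Capped
Entitled Capacity                  : 4.00
Shared Pool ID                     :
Online Virtual CPUs                : 4
Online Memory                      : 8192 MB
"""


def parse_lparstat_i(text):
    """Split each line at the first colon; blank values become ''."""
    info = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        if sep:  # ignore lines without a colon
            info[label.strip()] = value.strip()
    return info


if __name__ == "__main__":
    info = parse_lparstat_i(SAMPLE)
    print(info["Type"], info["Mode"], info["Entitled Capacity"])
```

Values are kept as strings on purpose, since fields such as Online Memory carry a unit.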
[Figure: shared processor pool example: virtual processors (V) mapped to physical CPUs, showing a 2.1 processor unit partition, Pool 1, Shared Pool 0, and dedicated partitions with 2 CPUs and 1 CPU]
* All activated, non-dedicated CPUs are automatically placed into the shared processor pool. Only 2.1 + 0.8 + 1.2 = 4.1 processor units of desired capacity have been allocated from the pool of 13 CPUs.
Oracle DB core license factors:
Power5 and earlier: 0.75
Power6: 1.0
Power7: 1.0
CPU folding is on by default in AIX 5.3 ML3 and later
vpm_xvcpus tunable
vpm_fold_policy tunable
# schedo -o vpm_xvcpus=[-1 | N]
Where N specifies the number of VPs to enable in addition to the number of VPs needed to consume physical CPU utilization
A value of -1 disables CPU folding
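The effect of vpm_xvcpus described above can be modeled roughly as follows. This is a sketch under stated assumptions, not the scheduler's actual algorithm: "VPs needed to consume physical CPU utilization" is approximated by rounding utilization up to a whole number, at least one VP is assumed to stay enabled, and the function name is hypothetical.

```python
import math

# Rough model of virtual processor folding as governed by vpm_xvcpus.
# ASSUMPTIONS: needed VPs = ceil(physical CPU used); never fewer than 1 VP.

def enabled_virtual_processors(physical_cpu_used, vpm_xvcpus, online_vps):
    """Estimate how many VPs stay unfolded for a given utilization."""
    if vpm_xvcpus == -1:
        return online_vps  # folding disabled: all online VPs stay enabled
    needed = math.ceil(physical_cpu_used)
    return max(1, min(online_vps, needed + vpm_xvcpus))


if __name__ == "__main__":
    for n in (-1, 0, 2):
        vps = enabled_virtual_processors(2.3, n, online_vps=8)
        print(f"vpm_xvcpus={n}: {vps} VPs enabled")
```

With 2.3 physical CPUs in use on an 8-VP partition, this model keeps 3 VPs enabled for N=0 and 5 for N=2.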
DLPAR
MicroPartition
I/Os can be coalesced (good) or split up (bad) as they go through the I/O stack
[Figure: the I/O stack, top to bottom]
File systems and raw access: JFS, JFS2, NFS, other; raw LVs; raw disks
Multi-path I/O driver (optional)
Disk device drivers
Adapter device drivers
Disk subsystem (optional)
Disk: write cache, read cache or memory area used for I/O
Queues exist for both adapters and disks
Adapter device drivers use DMA for I/O
Disk subsystems have read and write cache
Disks have memory to store commands/data
Enhanced JFS (JFS2)
Better for large files/filesystems
Buffer caching (default) provides sequential read-ahead, cached writes, etc.
Direct I/O (DIO) mount/open option: no caching on reads
Concurrent I/O (CIO) mount/open option: DIO, with write serialization disabled
Use for Oracle .dbf, control files and online redo logs only!!!
Non-cached, non-blocking I/Os (similar to JFS2 CIO) for all Oracle files
GPFS and JFS2 with CIO offer performance similar to raw devices
Bench throughput over run duration; higher tps indicates better performance.
Set filesystemio_options=SETALL, or use the dio or cio mount option
Oracle Datafiles: cached by Oracle
Oracle Redolog: mount -o cio (jfs2 + agblksize=512); cached by Oracle
Oracle Archivelog: mount -o rbrw; uses JFS2 write-behind, but pages are not kept in the AIX cache
mount -o rw: cached by AIX
Use RAID-5 or RAID-10 to create striped LUNs (hdisks)
Create AIX Volume Group(s) (VG) w/ LUNs from multiple arrays, striping on the front end as well for maximum distribution:
Physical Partition Spreading (mklv -e x)
-or-
Large Grained LVM striping (>= 1MB stripe size)
http://www-1.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP100319
queue_depth = the maximum # of outstanding I/Os for an hdisk. Recommended/supported maximum is storage subsystem dependent.
max_xfer_size = the maximum allowable I/O transfer size (default is 0x40000 or 256k). Maximum supported value is storage subsystem dependent. Increasing value (to at least 0x200000) will also increase DMA size from 16 MB to 256 MB.
dyntrk = When set to yes (recommended), allows for immediate re-routing of I/O requests to an alternative path when a device ID (N_PORT_ID) change has been detected.
fc_err_recov = When set to fast_fail (recommended), if the driver receives an RSCN notification from the switch, the driver will check to see if the device is still on the fabric and will flush back outstanding I/Os if the device is no longer found.
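The hexadecimal max_xfer_size values quoted above are easier to reason about once converted to bytes. A small sketch with hypothetical helper names:

```python
# Sketch: convert hex attribute values such as max_xfer_size to readable sizes.

def attr_bytes(hex_value: str) -> int:
    """Interpret a value like '0x40000' as a byte count."""
    return int(hex_value, 16)


def format_size(n: int) -> str:
    """Render a byte count as whole MB or KB where it divides evenly."""
    for unit, factor in (("MB", 1024 ** 2), ("KB", 1024)):
        if n >= factor and n % factor == 0:
            return f"{n // factor} {unit}"
    return f"{n} bytes"


if __name__ == "__main__":
    for value in ("0x40000", "0x200000"):
        print(value, "=", format_size(attr_bytes(value)))
```

This confirms that the default 0x40000 is 256 KB and that 0x200000 corresponds to 2 MB.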
IO : Asynchronous IO (AIO)
Allows multiple requests to be sent without having to wait until the disk subsystem has completed the physical I/O. Use of asynchronous I/O is strongly advised whatever the type of file system and mount option implemented (JFS, JFS2, CIO, DIO).
[Figure: AIO request flow: application to aio queue to aioservers to disk]
POSIX vs Legacy: Since AIX 5L V5.3, two types of AIO are available: Legacy and POSIX. For the moment, the Oracle code uses the Legacy AIO servers.
[Figure: AIO for raw devices / ASM: application to AIX kernel to disk]
Check the AIO configuration with: lsattr -El aio0
Enable the asynchronous I/O fast path:
AIX 5L: chdev -a fastpath=enable -l aio0 (default since AIX 5.3)
AIX 6.1/7.1: ioo -p -o aio_fastpath=1 (default setting)
If warning messages are found, increase maxreqs and/or maxservers
Monitor from AIX:
pstat -a | grep aios
Use the -A option for NMON
iostat -Aq (new in AIX 5.3)
Async I/O:
Oracle parameter filesystemio_options is ignored
Set Oracle parameter disk_asynch_io=TRUE
prefetchthreads: exactly what the name says
Usually set prefetchthreads=64 (the default)
Other settings:
GPFS block size is configurable; most will use 512KB-1MB
pagepool: GPFS fs buffer cache, not used for RAC but may be for binaries. Default=64M
mmchconfig pagepool=100M
I/O Pacing
I/O Pacing parameters can be used to prevent large I/O streams from monopolizing CPUs
System backups (mksysb)
DB backups (RMAN, Netbackup)
Software patch updates
ASM configurations
AIX parameters
Async I/O needs to be enabled, but default values may be used
DB instance parameters
disk_asynch_io=TRUE
filesystemio_options=ASYNCH
Increase Processes by 16
Increase Large_Pool by 600k
Increase Shared_Pool by [(1M per 100GB of usable space) + 2M]
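The sizing increments listed above can be worked through for a given amount of usable space. The sketch below simply encodes the three rules from the slide; the function name and the 500 GB example are hypothetical.

```python
# Sketch: DB instance parameter increases for ASM, per the rules above.
# The function name and example capacity are illustrative only.

def asm_parameter_increases(usable_gb: float) -> dict:
    """Return the suggested increases for a given usable capacity in GB."""
    if usable_gb < 0:
        raise ValueError("usable_gb must not be negative")
    return {
        "processes": 16,                         # increase Processes by 16
        "large_pool_kb": 600,                    # increase Large_Pool by 600k
        "shared_pool_mb": usable_gb / 100 + 2,   # 1M per 100GB, plus 2M
    }


if __name__ == "__main__":
    print(asm_parameter_increases(500))
```

For 500 GB of usable space, the Shared_Pool increase works out to 5M + 2M = 7M.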
If use_isno=1, check to see if settings have been overridden at the network interface level:
$ no -a | grep use_isno
use_isno = 1
$ lsattr -E -l en0 -H
attribute      value  description
rfc1323               N/A
tcp_nodelay           N/A
tcp_sendspace         N/A
tcp_recvspace         N/A
tcp_mssdflt           N/A
Miscellaneous parameters
User Limits (smit chuser)
Soft FILE size = -1 (Unlimited)
Soft CPU time = -1 (Unlimited)
Soft DATA segment = -1 (Unlimited)
Soft STACK size = -1 (Unlimited)
/etc/security/limits
Environment variables:
AIXTHREAD_SCOPE=S
LDR_CNTRL=DATAPSIZE=64K@TEXTPSIZE=64K@STACKPSIZE=64K
On Sunday 6 February 2011, Oracle Linux 6 was released
Oracle Linux is Oracle's Development Platform for all Oracle Database, Middleware, and Application Products
As of January 12, 2012 (11 months later), zero Oracle products are certified on Oracle Linux 6
AIX is the leader
AIX 7 GA date was in Sept. 2010
Oracle certified DB 11gR2 on AIX 7 in Oct. 2010 (30 days later)
1 year later, E-Business Suite, PeopleSoft, Oracle Enterprise Manager, Fusion Middleware, Tuxedo, Hyperion EPM, OBIEE, Policy Automation, and others are certified on AIX 7
Oracle sellers will tell customers that it takes a long time for products to become available on AIX, when in fact the exact opposite is the case. Products were delivered on the latest version of AIX (7.1) more than 11 months sooner than Oracle Linux 6 (6.x). They're still waiting and waiting and waiting for their first product on OL 6.
[Figure: pie chart of cost shares across Oracle, HW, People, Facilities and Other; slice labels 5%, 22%, 5%, 13%, 55%]
Based on averages of customer supplied estimates
Oracle Exadata
Customer has little choice in solution design. Forced to buy components not relevant to workloads.
One size fits all requires total trust in Oracle as a single source provider.
Exadata works best if data is read intensive, pre-sorted, bulk loaded or can fit entirely into storage server cache.
Smart Scan functionality not relevant for indexed tables or OLTP workloads
Exadata has significant limitations in: 1) Consolidation of DB instances 2) Resource virtualization across instances 3) Running N-1 Oracle sw levels 4) Upgrading hw resources granularly
Basic cheap disk, no data management. No internal disk RAID parity, ASM mirroring only. No concurrent maintenance. Multiple SPOF
Requires redesign of operational architecture strategy and deployment and new, complex DBA skills in many customer environments
Significantly more systems, software images and RAC nodes to manage and update
Initial acquisition costs similar
Contractual terms can favour Oracle
Upgrades cost more on Exadata
Cost of integration more (existing Power)
Price/performance costs increasing
2. Real Performance
7. Solution Cost
1. Graded on a normal college-level type of curve
Source: Does Your OS Matter? Selecting a Strategic Operating System; Solitaire Interglobal Ltd (All rights reserved); October 2011.
Thank You
Trademarks
The following are trademarks of the International Business Machines Corporation in the United States and/or other countries. For a complete list of IBM Trademarks, see www.ibm.com/legal/copytrade.shtml: AS/400, DBE, e-business logo, ESCO, eServer, FICON, IBM, IBM Logo, iSeries, MVS, OS/390, pSeries, RS/6000, S/30, VM/ESA, VSE/ESA, Websphere, xSeries, z/OS, zSeries, z/VM
The following are trademarks or registered trademarks of other companies:
Lotus, Notes, and Domino are trademarks or registered trademarks of Lotus Development Corporation.
Java and all Java-related trademarks and logos are trademarks of Sun Microsystems, Inc., in the United States and other countries.
LINUX is a registered trademark of Linus Torvalds.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Microsoft, Windows and Windows NT are registered trademarks of Microsoft Corporation.
SET and Secure Electronic Transaction are trademarks owned by SET Secure Electronic Transaction LLC.
Intel is a registered trademark of Intel Corporation.
* All other products may be trademarks or registered trademarks of their respective companies.
NOTES: Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions. This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area. All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography. References in this document to IBM products or services do not imply that IBM intends to make them available in every country. Any proposed use of claims in this presentation outside of the United States must be reviewed by local IBM country counsel prior to such use. The information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. 
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.