
Oracle Database on IBM Power with AIX

Best Practices
Ralf Schmidt-Dannert
dannert@us.ibm.com
Washington Systems Center – Oracle
IBM

AIX VUG call on 09/27/2018


Please note

IBM's statements regarding its plans, directions, and intent are subject to change or withdrawal without notice and at IBM's sole discretion.

Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract.

The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.

Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.


Learning Objectives

Develop a better understanding of how the Oracle database and AIX interact, especially in the area of memory.

Be aware of typical "pitfalls" in the context of storage layout and configuration for Oracle databases on AIX.

Know the AIX tuning parameters and their "best practice" values for an Oracle database server.


Agenda

AIX Configuration/Tuning for Oracle


– Memory
– CPU
– I/O
– Network
– Oracle Patches
– Miscellaneous

The suggestions presented here are considered to be basic configuration "starting points" for general Oracle workloads.

Customer workloads will vary. Ongoing performance monitoring and tuning is recommended to ensure that the configuration is optimal for the particular workload characteristics.


POWER9, POWER8 Portfolio Supports Oracle Databases on AIX

[Portfolio chart, not fully reproducible as text: POWER9 scale-out servers (S914: 4-8 cores / 1TB memory; S922: 4-20 cores / 4TB memory; S924: 4-24 cores / 4TB memory; plus the Linux-focused L922, LC921 and LC922), POWER9 enterprise servers (E950: 16-48 cores / 16TB memory; E980: 32-192 cores / 64TB memory), and the corresponding POWER8 scale-out (S812, S814, S822, S824 and the L/LC Linux models) and enterprise (E850C, E870C, E880C) systems, each annotated with core counts, maximum memory, and hypervisor options (PowerVM, KVM, bare metal).]

Client value propositions highlighted on the chart: enterprise class features and robustness; performance and price/performance; open innovation and TCA.
Agenda

– Memory
– CPU
– I/O
– Network
– Oracle Patches
– Miscellaneous



AIX Physical Memory – pools, pages, page lists

[Simplified diagram: physical memory is divided into memory pools (e.g. Memory Pool 0, Memory Pool 1), each maintaining per-page-size free lists and used lists for 4KB, 64KB, 16MB and 16GB pages. The psmd process converts pages between the 4KB and 64KB lists automatically (*1); 16MB and 16GB pages are configured manually, while DSO (*2) can generate a separate class of 16MB pages from used 64KB pages. Pages may be written out to paging space on disk.]

(*1) Only when large amounts of memory are requested at once and there are not enough free pages on the 4KB / 64KB free lists.
(*2) IBM AIX Dynamic System Optimizer (DSO) "MPSS" is a chargeable feature pre AIX 7.2. 16MB pages generated by DSO are handled differently from pre-allocated / non-pageable 16MB pages!
4K - 64K - 16MB Page Dynamics

[Chart, not reproducible as text: MB used and MB free for 4KB, 64KB and 16MB pages plotted over time (roughly 14:02 to 14:18), showing the free and used page lists for the different page sizes shifting dynamically as memory demand changes.]

Note: DSO 16MB page conversion is currently only reported in svmon.
AIX Memory Management Concepts

Two primary categories of memory pages: Computational and File System.

AIX tries to utilize all of the physical memory available:
– What is not required to support computational page demand will tend to be used for file system cache.

Requests for new memory pages are satisfied from the free page list(s):
– A small reserve of free pages is maintained by "stealing" Computational or File pages.
– AIX uses a "demand paging" algorithm – pages are generally not written to paging space until "stolen".

[Chart, not reproducible as text: % of physical memory used over a working day, stacked into System%, Process% and FScache%. Free memory shrinks as the file system cache grows to absorb whatever computational pages do not need. File cache always consists of 4KB memory pages!]
VMM Page Stealing Process (lrud)

Definitions:
• lrud = VMM page stealing process = LRU Daemon (1 per memory pool)
• numperm, numclient = # pages currently used for filesystem buffer cache
• maxperm, maxclient = target maximum # pages to use for filesystem buffer cache
• free pages = # pages immediately available to satisfy new memory requests
vmo Parameters:
• minperm% = target min % real memory for filesystem buffer cache
• maxperm%, maxclient% = target max % real memory for filesystem buffer cache
• minfree = target minimum number of free memory pages
• maxfree = number of free memory pages at which lrud stops stealing pages
When does lrud (for a given memory pool and page size) start?
• When free pages < minfree (4K and 64K pages)
• When (maxclient - numclient) < minfree (4K pages only)
When does lrud stop?
• When free pages > maxfree (4K and 64K pages)
• When (maxclient – numclient) > maxfree (4K pages only)

VMM Page Stealing Thresholds (AIX 7.2, 7.1, 6.1)
To determine the # of memory pools: vmstat -v | grep pool
• minfree / maxfree values are per memory pool
• Total system minfree = minfree * # of memory pools
• Total system maxfree = maxfree * # of memory pools

• AIX 7.2, 7.1 and 6.1 defaults are acceptable for most workloads
• Consider increasing if the vmstat 'fre' column frequently approaches zero, or if "vmstat -s" shows significantly increasing "free frame waits" over time

• Suggested starting points if tuning is required:
• minfree >= max(960, (120 x # logical CPUs)) / # mem pools
• maxfree = minfree + ((MAX(maxpgahead, j2_maxPageReadAhead) x # logical CPUs) / # mem pools)

Example:
10-way LPAR with SMT-4 enabled (40 logical CPUs), with maxpgahead=8, j2_maxPageReadAhead=128 and 2 memory pools:
minfree = 2400 = max(960, (120 x 10 x 4)) / 2
maxfree = 4960 = 2400 + ((max(128,8) x 10 x 4) / 2)

vmo -p -o minfree=2400 -o maxfree=4960
AIX System Paging Concepts & Requirements

By default, AIX uses a "demand paging" policy:
– For Oracle DB, the goal is ZERO system paging space activity
– Filesystem pages are written back to filesystem disk (if dirty); never to system paging space
– Unless otherwise specified, computational pages are not written to paging space unless/until they are stolen by lrud (*1)

Once written to paging space, pages are not removed from paging space until the process associated with those pages terminates:
– For long running processes (e.g. Oracle DB), even low levels of system paging can result in significant growth in paging space usage over time
– Paging space should be considered a fail-safe mechanism for providing sufficient time to identify and correct paging issues, not a license to allow ongoing system paging activity

Paging space allocation rule of thumb: ½ the physical memory + 4 GB, capped at the following maximums:

    Physical Memory   Paging Space Max
    128GB             60GB
    256GB             100GB
    512GB             150GB
    1TB               200GB

Resolve paging issues quickly (see the check below):
– Reduce the effective minimum file system cache size (minperm%)
– Reduce Oracle SGA or PGA size
– Add physical memory
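A quick sketch for checking current paging space usage and cumulative paging activity, using standard AIX commands:

# lsps -a                             # paging space devices, size and %used
# vmstat -s | grep "paging space"     # cumulative paging space page ins/outs since boot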
JFS2 inode / metadata caches

JFS2 utilizes two caches – one for inodes and one for metadata.

Caches grow in size until the maximum size is reached, before cache slots are reused.

Default values are tuned for a file server! (See the sketch below.)

Each file entry in the cache requires about 1KB of physical memory; 1MB of memory can cache about 1000 file inodes.

Configured via ioo parameters:
– j2_inodeCacheSize (Default: 200 = 5% of physical memory) *1
– j2_metadataCacheSize (Default: 200 = 2% of physical memory) *1

Both caches are AIX "pinned" kernel memory and can not be paged!

The current memory use can be verified via:
# cat /proc/sys/fs/jfs2/memory_usage
metadata cache: 31186944
inode cache: 34209792
total: 65396736

Note: *1 Default values pre AIX 7.1 are 400 (10%) and 400 (4%).
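On a dedicated database server with little file-serving activity, the cache maximums can be reduced to free pinned memory; a hedged example (values illustrative, validate for your workload):

# ioo -p -o j2_inodeCacheSize=100 -o j2_metadataCacheSize=100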


Large Segment Aliasing (AKA Terabyte Segment)

Workloads with large memory footprints and low spatial locality may perform poorly due to Segment
Lookaside Buffer (SLB) faults
• May consume up to 20% of total execution time for some workloads

Architectural trend toward smaller SLB sizes can exacerbate SLB related performance issues:
• POWER6 has 64 SLB entries – 20 for kernel, 44 for user processes – allowing 11GB of accessible memory before
incurring SLB faults
• POWER7/POWER8/POWER9 have 32 SLB entries – 20 for kernel, 12 for user processes – allowing 3GB of accessible
memory before incurring SLB faults with default segment sizes

Oracle SGA sizes are typically in the 10s to 100s of Gigabytes


With Large Segment Aliasing, each SLB entry can address 1TB of memory
• Supports shared memory addressability for up to 12TB on POWER7/8/9 and up to 44TB on POWER6 without SLB faults
• Enabled by default on AIX 7.1 – 32 bit processes may need fix for IV11261 – and on AIX 7.2
• Disabled by default on AIX 6.1 – may be enabled by setting vmo esid_allocator=1 (Recommended)
• Oracle 11.2.0.3 needs Oracle fix for bug #147645450
• Unshared memory issue with 11.2.0.3 documented in APAR IV23859 & ML 1467807.1
– shm_1tb_unsh_enable=0 recommended for Oracle DB environments
– shm_1tb_unsh_enable=0 is the default with AIX 6.1 TL7 SP5+ / AIX 7.1 TL1 SP5+ / AIX 7.2
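A minimal sketch for enabling this on AIX 6.1 (it is already the default on AIX 7.1 and 7.2):

# vmo -p -o esid_allocator=1     # enable Large Segment Aliasing (1TB segment aliasing)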
Dynamic System Optimizer (DSO)

Supported in AIX 6.1 TL8 and AIX 7.1 TL2 on IBM Power 7/7+; AIX 7.1 TL04 SP2 and AIX 7.2 TL01 SP2
on POWER8 and AIX 7.2 TL2 SP2 on POWER9

Configured via “asoo” and disabled by default.

Supports four optimization strategies


– Cache Affinity Optimization (multi-threaded applications only)
– Memory Affinity Optimization
– 16MB Mixed Page Size Segment (MPSS) (chargeable feature pre AIX 7.2– aso.dso)
– Data Stream Pre-fetch Optimization (chargeable feature pre AIX 7.2 – aso.dso)

MPSS transparently supports Oracle SGA page size conversion from 4K/64K pages to 16MB pages;
only shared memory supported

To be able to utilize MPSS you have to:
– Be on a supported AIX level running on an IBM POWER7/7+, POWER8 or POWER9 server
– Not be using Active Memory Expansion (AME)
– Have 16GB or more of physical memory in the LPAR
– Pre AIX 7.2, purchase the feature and install the aso.dso fileset; AIX 7.2 includes the feature in the bos.aso fileset
– Verify the aso subsystem is active ("lssrc -s aso") and enable DSO via "asoo -p -o aso_active=1". Also look at the "ASO_OPTIONS" environment variable for further control, e.g. "export ASO_OPTIONS=LARGE_PAGE=ON".
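Consolidated, the enablement steps look roughly like this sketch (the aso.dso fileset is only needed pre AIX 7.2):

# lslpp -l aso.dso                    # pre-7.2: verify the chargeable fileset is installed
# lssrc -s aso                        # verify the aso subsystem is active
# asoo -p -o aso_active=1             # enable DSO
# export ASO_OPTIONS=LARGE_PAGE=ON    # opt a session's processes into 16MB MPSS pages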
Dynamic System Optimizer (DSO) - continued

svmon -P <pmon PID> -O mpss=on

Unit: page
-------------------------------------------------------------------------------
Pid Command Inuse Pin Pgsp Virtual
16711740 oracle 10199406 10144 325 10180604

Vsid Esid Type Description PSize Inuse Pin Pgsp Virtual


881308 7000003a work default shmat/mmap m 4096 0 0 4096
f20572 7000007a work default shmat/mmap m 0 0 0 4096
L 16 0 0 0
f50675 70000066 work default shmat/mmap m 0 0 0 4096
L 16 0 0 0



Local, Near and Far Memory
• Power Systems use a "shared memory" model – all processors have access to all of the memory
• Each processor chip has its own memory controller and directly attached memory
• Each socket (chip module) typically has 2 (DCM) or 1 (SCM) processor chips; POWER9 is always SCM
• Enterprise Power Systems (e.g. E870, E880, E980) use multiple building blocks (system nodes) to scale capacity
  – Each system node has its own set of processor and memory chips
  – System nodes are interconnected via a switched communications fabric
• The closer the memory is to the processor accessing it, the faster the memory access
  – Local Memory: directly attached to the chip's memory controller
  – Near Memory: on an adjacent chip, accessed via intra-node communication paths
  – Far Memory: on a different CEC drawer, accessed via inter-node communication paths

  Model            Local       Near                      Far
  Power S814       Same Chip   Other Chip, Same Socket   N/A
  Power S824       Same Chip   Other Chip, Same Socket   Other Socket
  Power S924       Same Chip   Other Chip                N/A
  Power E850       Same Chip   Other Chip, Same Socket   Other Socket
  Power E950       Same Chip   Other Chip, Same Node     N/A
  Power E880/E980  Same Chip   Other Chip, Same Node     Different Node

Note that POWER9 is always a single chip per socket.
Displaying the LPAR CPU & Memory Configuration
The 'lssrad -va' command displays a summary of the way physical processors and memory are allocated for a given LPAR:

# lssrad -va
REF1  SRAD      MEM        CPU
0
      0     110785.25     0-31
      1     125665.00     32-63
1
      2      17430.00     64-95
      3          0.00     96-127
2
      4          0.00     128-159
      5          0.00     160-191

Note the extremely poor distribution of memory in this example: 3 of 6 SRADs have no local memory at all. Every process running on a CPU in these SRADs will encounter memory-latency-induced slower performance.

– REF1: hardware provided reference point identifying sets of resources that are near each other, e.g. a socket in scale-out servers or a node in scale-up servers.
– SRAD: a Scheduler Resource Affinity Domain, i.e. an individual group of processors that all reside on the same chip
– MEM: the amount of local memory (in Megabytes) allocated to the SRAD
– CPU: the logical CPUs within the SRAD, e.g. with SMT4 enabled, 0-3 would be for the first physical CPU, 4-7 for the second physical CPU, etc.


Help the Hypervisor do its job
Don't over-allocate CPUs
– If a given workload (LPAR) requires fewer processors than a single CEC provides, don't allocate more processors than are on a single CEC.
– If all the LPARs in a given shared pool require (in aggregate) fewer processors than 2 CECs provide, don't allocate more processors than are available in 2 CECs to the shared pool.
– For Shared Processor LPARs, don't over-allocate vCPUs relative to Entitled Capacity (rule of thumb: no more than 2-3 times entitled capacity; best practice is < 1.5x, since fewer processors to cycle through means less time slicing by the hypervisor).

Don't over-allocate memory (both desired and maximum)
– May cause processors/memory to be allocated on additional CECs, not local to the processors assigned to the LPAR, because there wasn't sufficient free memory available on the optimal CEC.

Help the Hypervisor do its job
– Stay current on firmware to avoid any known CPU/memory allocation or virtual processor dispatching issues.
– Where appropriate, consider LPAR boot order to ensure high priority LPARs get optimal CPU to memory allocation with improved affinity (validate via lssrad -va).


Parameter Tuning (AIX 7.2, 7.1, 6.1)

Most AIX 7.2, AIX 7.1 and AIX 6.1 parameters are configured by default to be 'correct' for most workloads.

As of AIX 6.1, many tunables are classified as 'Restricted':
– Only change if AIX Support requests it
– Restricted parameters will not be displayed unless the '-F' option is used for "vmo" or the other tuning commands

When migrating from AIX 5.3 to AIX 6.1, AIX 7.1 or AIX 7.2, existing parameter override settings in AIX 5.3 will be transferred to the AIX 6.1 or later environment.
– After migration, review/verify that parameter values are properly set (see the sketch below)
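One way to spot carried-over overrides is to compare current values against defaults; a sketch using the tuning commands' spreadsheet output (-x), where the second field is the current value and the third the default:

# vmo -F -x | awk -F, '$2 != $3'     # vmo tunables changed from default (incl. restricted)
# ioo -F -x | awk -F, '$2 != $3'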


Oracle Server Architecture – Memory Structures

[Diagram, not reproducible as text: the System Global Area (SGA) – containing the DB Buffer Cache, Shared Pool, Redo Log Buffer, In-Memory Area and Flashback Log area – is shared among processes. Each Oracle background process (e.g. PMON, SMON, RVWR, DBWn, LGWR, CKPT, ARC0, D000) and each user/server process has its own private PGA. The processes interact with the DB files, redo logs, archive logs, control files and flashback logs on disk.]

• SGA is Shared among processes
• PGA is Private to an individual server or background process

Only a subset of Oracle process types is shown.
Memory Usage in an Oracle Database Environment

Computational
– Some used for AIX kernel processing
– Some used by Oracle/client executable programs
– Includes Oracle SGA and PGA memory

File System Cache
– May be used for caching or pre-fetching of Oracle .dbf files
  • Only for local file system based (non-RAC) environments where Direct I/O (or Concurrent I/O) is not used
– May be used for other Oracle related files
  • Archive logs, export/import files, backups, binaries, etc.
– May be used for non-Oracle related files
  • Application files, system files, etc.

Virtual Memory Management Priorities
– Always want to keep computational pages in memory – system paging/swapping may degrade Oracle/application performance
– Allocate enough physical memory to support the computational footprint requirement + a small file cache
– When necessary, steal file system pages, not computational; note that computational pages may be stolen (paged to paging space) if numperm% < minperm% and the system is low on free pages
Oracle Memory Structures Allocation
10g : Automatic Shared Memory Management (ASMM)
– sga_target (dynamic) – if set, the db_cache_size, shared_pool_size, large_pool_size and streams_pool_size are
dynamically sized; can grow to sga_max_size.
• Minimum values for these pools may optionally be specified
– If LOCK_SGA=true then physical memory according to sga_max_size is allocated at DB startup!
– To use ASMM, sga_target must be >0

11g : Automatic Memory Management (AMM)


– memory_target (dynamic parameter) – specifies the total memory size to be used by the instance SGA and PGA.
Exchanges between SGA and PGA are done according to workload requirements
– If sga_target and pga_aggregate_target are not set, the policy is to give 60% of memory_target to the SGA and
40% to the PGA
– memory_max_target (static parameter) – specifies the maximum memory size for the instance
– To use Automatic Memory Management, memory_target must be >0 and LOCK_SGA=false
– See Metalink notes 443746.1 and 452512.1 explaining AMM and these parameters

AMM dynamic resizing of the shared pool can cause a fair amount of "cursor: pin S" wait time. One strategy to minimize this is to set minimum sizes for the memory areas you particularly care about. In addition, you can change the frequency with which AMM analyzes and adjusts the memory distribution. See Metalink note 742599.1 (_memory_broker_stat_interval).


Oracle Memory Structures Allocation
12c: pga_aggregate_limit
– Sets a hard limit on aggregate PGA memory usage; sessions with the highest amount of allocated PGA will be terminated until compliance is reached

12c: inmemory_size
– When the Oracle "In Memory" Option is used, specifies the size within the SGA to reserve for "In-Memory" objects pinned in memory
– This parameter is not dynamic in 12.1, but can be dynamically increased in 12.2 or later
– It should not be confused with the keep cache

Recommended:
1. Use SGA_TARGET and SGA_MAX_SIZE rather than MEMORY_TARGET and MEMORY_MAX_TARGET
2. Most environments should use 64K pages rather than pinned 16M pages
3. If you do pin the SGA, make sure you also pin the kernel with vmm_klock_mode=2

Note:
MEMORY_TARGET/MEMORY_MAX_TARGET are not hard limits and Oracle can utilize significantly more memory for PGA if needed. With Oracle 12c, Oracle also seems to take configured paging space into account as "memory" when calculating the active limits – in one observed example, calculated limits of 263GB and 210GB exceeded the LPAR's physical memory.
SGA_MAX_SIZE and LOCK_SGA implications (12c, 11g, 10.2.0.4+)

LOCK_SGA=false (Preferred)
• Oracle dynamically allocates memory for the SGA only as needed, up to the size specified by SGA_TARGET
• SGA_TARGET may be dynamically increased, up to SGA_MAX_SIZE
• 64K pages are automatically used for the SGA if supported in the environment.
  – If needed, 4K (or 16M) pages are converted to 64K pages.
  – Down-conversion of 16M pages to 64K pages is only triggered at DB startup, if needed.
  – After startup, additional unused 16M pages are not converted, even if not enough 4K or 64K pages are available -> potential for paging to paging space.

Note: If you utilize the environment variable ORACLE_SGA_PGSZ to set the SGA memory page size manually, then Oracle will allocate all memory specified via sga_max_size at startup! Memory is not pinned.


LOCK_SGA=TRUE implications (12c, 11g, 10.2.0.4+)

LOCK_SGA=true (Discouraged)
• Oracle pre-allocates all memory as specified by SGA_MAX_SIZE and pins it in memory, even if it's not all usable (i.e. SGA_TARGET < SGA_MAX_SIZE)
• If sufficient 16M pages are available, those will be used. Otherwise, all the SGA memory will be allocated from 64K (if supported) or 4K pages (if 64K pages are not supported). If needed, 4K or 16M pages will be converted to 64K pages, but 16M pages are never automatically created.
• If a value for SGA_MAX_SIZE is specified larger than the amount of memory available for computational pages, the system can become unresponsive due to system paging.
• If the specified SGA_MAX_SIZE is much larger than the currently available pages on the combined 64K and 16M page free lists, the database startup may fail with the error "IBM AIX RISC System/6000 Error: 12: Not enough space". In this case, retry starting the database.


AIX Multiple Page Size Support and Oracle Database - Summary
4K (Default)
– Always used for filesystem cache
– Can be paged to paging space
– Can be coalesced to create 64K pages if required
– Used system wide if Active Memory Sharing (AMS) or AME is used (AIX 7.2 TL1+ on POWER8 supports 64KB AME pages!)
– Typically used on older hardware which does not support 64K pages, or with older Oracle versions (< 10.2.0.4)

64K – available with POWER5+ and later & AIX 5.3 TL4+ (Preferred for the Oracle DB!)
– Can be paged to paging space
– Can be converted to 4K pages if not enough 64K pages are available
– Can be utilized for application code, data and stack as well, if specified for Oracle
– Kernel page size used in AIX 6.1, AIX 7.1 and AIX 7.2 (can be configured)
– In 11g and later, Oracle will automatically use 64K pages for the SGA if supported by the system
– May also be used for program data, text and stack areas by setting (in the oracle user's environment):
  export LDR_CNTRL=DATAPSIZE=64K@TEXTPSIZE=64K@STACKPSIZE=64K

16M (Large Pages) – available with POWER4 hardware (or later)
– Must be explicitly preconfigured and reserved, even if not being used (Tip: read up on the new vmo parameters pgz_* in AIX 7.2 TL2.)
– Are pinned in memory
– Unused 16M pages can be converted to 4K or 64K pages if required, but AIX will never automatically create 16MB pages
– Cannot be paged to paging space
– With AIX 7.2, AIX can dynamically aggregate 64KB pages in the SGA to "special" 16MB pages without having them configured or pre-allocated
– If improperly configured, can contribute to severe system paging and kernel panics

16G (Huge Pages) – available with POWER5+ and later & AIX 5.3 TL4+ and later AIX releases
– Must be explicitly preconfigured and reserved, even if not being used – configured via the HMC and requires the physical server to be powered off
– No automatic conversion, and any change in assignment to an LPAR requires at minimum the involved LPAR to be powered off
– Requires at minimum 3 additional 16GB pages above what is specified via sga_max_size
Agenda

– Memory
– CPU
– I/O
– Network
– Oracle Patches
– Miscellaneous



POWER9 SMT scalability proof point
Based on IBM internal OLTP workload testing of actual SMT scalability:

SMT1: largest unit of execution work
SMT2: smaller unit of work, but provides a greater amount of execution work per cycle
SMT4: smaller unit of work, but provides a greater amount of execution work per cycle
SMT8: smallest unit of work, but provides the maximum amount of execution work per cycle

Can dynamically shift between modes as required: SMT1 / SMT2 / SMT4 / SMT8

POWER6 supports SMT 1/2
POWER7/7+ supports SMT 1/2/4 (SMT4 default)
POWER8 supports SMT 1/2/4/8 (SMT4 default)
POWER9 supports SMT 1/2/4/8 (SMT4/8 default)

Based on IBM internal OLTP workload performance testing. Your results may vary!
Virtual Shared Processor Pools – Licensing Benefits

[Diagram, not reproducible as text: a 24-core POWER server with a 12-core physical shared pool hosting uncapped AIX and Linux LPARs (Oracle DB, EBS application and QA partitions, MongoDB, Redis), grouped into three virtual shared processor pools with Maximum Capacities of 5, 6 and 4 processors, alongside VIOS, IBM i and Linux partitions and CUoD (Capacity Upgrade on Demand) cores.]

Multiple shared pools (POWER6/7/8/9) can reduce the number of software licenses by putting a limit on the amount of processors an uncapped partition can use. Up to 64 shared pools are supported.

In the example:
– Oracle DB cores to license: 5, from shared processor pool #1 (Max Cap: 5 processors)
– EBS cores to license: 6, from shared processor pool #2 (Max Cap: 6 processors)

Oracle DB core license factors: POWER6: 1.0, POWER7/7+: 1.0, POWER8: 1.0, POWER9: 1.0

With CUoD, activate and license only 1 incremental core at a time!


Raw Throughput versus Scaled Throughput
Only supported on POWER7 and later with AIX 6.1 TL8, AIX 7.1 TL2, AIX 7.2 (vpm_throughput_mode)

Raw throughput (vpm_throughput_mode = 0 or 1) – default scheduling mode:
– Workload spread across primary SMT threads only
– Quickest unfolding; best application response time and throughput; highest physical consumption

Scaled throughput / SMT2 (vpm_throughput_mode = 2) – tuned mode:
– Workload spread across primary and secondary SMT threads
– Slower unfolding; higher application response time; lower physical consumption

Scaled throughput / SMT4 (vpm_throughput_mode = 4) – tuned mode:
– Workload spread across primary, secondary and tertiary SMT threads
– Slower unfolding; higher application response time; lower physical consumption

Scaled throughput / SMT8 (vpm_throughput_mode = 8) – tuned mode:
– Workload spread across all SMT threads
– Slowest unfolding; worst application response time; lowest physical consumption

[Diagram, not reproducible as text: for each mode, 12 cores with their primary (P), secondary (S) and tertiary (T) SMT threads mapped to logical CPUs, illustrating how many threads per core are kept busy before additional cores unfold.]

vpm_throughput_core_threshold: specifies the number of cores that must be unfolded before the vpm_throughput_mode parameter is honored (Default: 1). If fewer processors are unfolded, the system behaves as if vpm_throughput_mode were set to 1.
CPU Related Oracle Parameters

Oracle Parameters based on the # of logical CPUs


– Parameters
• CPU_COUNT = # of logical CPUs
• DB_WRITER_PROCESSES = 1 per every 8 logical CPUs
• GCS_SERVER_PROCESSES

– Degree of Parallelism
• Can be set at the user level, table level, or query level
• Restricted by PARALLEL_MAX_SERVERS
• Default setting = 1
• Default degree = (CPU_COUNT * PARALLEL_THREADS_PER_CPU)

– Cost Based Optimizer (CBO)
  • Execution plans may be affected; check the explain plan


CPU Recommendations
Test and evaluate SMT4 and SMT8 respectively prior to selecting the SMT mode (see the sketch below)
– For SMT8, consider reducing DB_WRITER_PROCESSES
– When using parallel query, consider reducing or restricting the default parallel degree
– RAC environments may also benefit from tuning GCS_SERVER_PROCESSES
– SMT8 mode should be the starting point for POWER9 based systems

Set PARALLEL_THREADS_PER_CPU=1 (at least with SMT4 or SMT8, potentially SMT2 as well)
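The SMT mode can be changed dynamically for evaluation; a quick sketch (add -w boot to persist the mode across reboots):

# smtctl                  # display the current SMT mode and thread states
# smtctl -t 8 -w now      # switch the LPAR to SMT8 immediately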

Dedicated vs. Shared


– Dedicated CPUs may provide better performance, lower latency, for single or lightly threaded workloads
– Shared CPUs may provide better price/performance and/or greater ability to support peak workload
demand



CPU Recommendations

Micropartitioning Guidelines
– Virtual CPUs (vCPUs) should always be <= physical processors in shared CPU pool
– Use default processor folding behavior unless IBM AIX support recommends otherwise
CAPPED
– vCPUs should be the nearest integer >= capping limit
UNCAPPED
– vCPUs should be set to the max peak demand requirement
– Preferably, number of vCPUs should not be more than 1.5x to 2x entitlement
DLPAR considerations
Oracle 10g/11g/12c
– Oracle CPU_COUNT dynamically recognizes change in # cpus (physical and logical)
– Max CPU_COUNT is limited to 3x CPU_COUNT at instance startup; this restriction does not exist in 12c

Note: logical CPU count related bug in Oracle 12.1 –
Patch 18775971: ORA-04031: UNABLE TO ALLOCATE ("SHARED POOL","UNKNOWN OBJECT","PDB DYN ...), see note 2225248.1


Virtual Processors - Folding
Dynamically adjusts active Virtual Processors
– System consolidates loads onto a minimal number of VPs
• Scheduler computes utilization of VPs every second and calculates needed VP as: ceiling(phys_util + vpm_xvcpus)
– If VPs needed to host physical utilization is less than the current active, a VP is put to sleep
– If VPs needed are greater than the current active VPs, more are enabled

– Folding active by default in AIX 5.3 ML3 and later


• vpm_xvcpus tunable
• vpm_fold_policy tunable

Increases processor utilization and affinity


– Inactive VPs don’t get dispatched and waste physical CPU cycles
– Fewer VPs can be more accurately dispatched to physical resources by the Hypervisor

Monitoring
– mpstat -s
– nmon -> ‘c’ – this is an estimated value

RAC and Oracle Clusterware Best Practices and Starter Kit (AIX) 811293.1
– Older versions of this document recommended – incorrectly – disabling VP folding
– Document has been modified to correctly reflect that current TL levels should be used for support of processor folding

When to Adjust – Check with IBM support before changing!
– Burst/batch workloads with short response time requirements may need sub-second dispatch latency – disable processor folding or, preferred, manually tune the # of VPs:
  • # schedo -o vpm_xvcpus=[-1 | N]   (For Oracle RAC this needs to be set to at minimum 2.)
  • Where N = # of VPs to enable in addition to the VPs needed for physical CPU utilization; a value of 2 or 3 seems to work well.
  • -1 disables folding

POWER9 + shared CPU LPAR may need: schedo -p -o vpm_fold_threshold=29
Maximum Performance Mode (MPM) – POWER9

MPM enables increased performance over Dynamic Performance Mode (DPM) for nominal environmental conditions:
– Takes advantage of nominal environmental conditions by allowing increased CPU frequency and power draw
– Takes advantage of lighter workloads and lower active core counts, just like DPM
– Socket idle state remains at high frequency

Determinism:
– Under nominal environmental conditions, the same workload running on the same system configuration will result in the same performance

[Chart, not reproducible as text: "P9 EnergyScale Modes" – frequency versus load level for Power Saver mode, Static Nominal mode, Dynamic Performance mode and Maximum Performance mode.]

Note: The graph is for example only. Actual results will vary based on system model, system configuration, supported processor core count, and active processor core count.
Agenda

– Memory
– CPU
– I/O
– Network
– Oracle Patches
– Miscellaneous



AIX IO - Queues

[Diagram, not reproducible as text: the AIX I/O stack queues – LVM pbufs feed per-hdisk queues limited by queue_depth, which in turn feed the FC adapter queue limited by num_cmd_elems.]

Note: RAW LVs are not supported with Oracle 12c databases, except as "devices" used in ASM.
I/O Stack Tuning Options (Device Level)
Disk
– queue_depth: maximum # of concurrent active I/Os for an hdisk / hdiskpower; additional I/O beyond that limit will be queued. The recommended/supported maximum is storage subsystem dependent.
– max_transfer: the maximum allowable I/O transfer size (default is 0x40000 or 256k). The maximum supported value is storage subsystem dependent. All current technology supports a 1MB I/O size -> set to 0x100000.

Fibre Channel Disk Adapter (fcsN)
– num_cmd_elems: maximum number of outstanding I/Os for an adapter -> set to 1024 or 2048 (within the storage subsystem vendor's guidelines). Check APARs when using NPIV: IV90915, IV91042, IV76258.
– max_xfer_size: increasing the value (to at least 0x200000) will also increase the DMA size from 16 MB to 256 MB, but this should only be done after IBM support has directed you to do so, as in specific configurations it can lead to system stability issues or AIX not being able to boot.
– dyntrk: when set to yes (recommended), allows immediate re-routing of I/O requests to an alternative path when a device ID (N_PORT_ID) change has been detected; only applies to multi-path configurations.
– fc_err_recov: when set to "fast_fail" (recommended), if the driver receives an RSCN notification from the switch, the driver will check whether the device is still on the fabric and will flush back outstanding I/Os if the device is no longer found.

To validate / change the current parameter settings use: "lsattr", "chdev"
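A hedged example of checking and setting these attributes (device names illustrative; dyntrk and fc_err_recov live on the fscsiN protocol device; -P defers the change until the device is reconfigured or the system is rebooted):

# lsattr -El hdisk4 -a queue_depth -a max_transfer
# chdev -l hdisk4 -a max_transfer=0x100000 -P
# chdev -l fcs0 -a num_cmd_elems=2048 -a max_xfer_size=0x200000 -P
# chdev -l fscsi0 -a dyntrk=yes -a fc_err_recov=fast_fail -P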


Optimize IOPS and throughput with VSCSI specific settings

• Use fast (rather than delayed) failover for multipath environments:
  # chdev -l vscsi0 -a vscsi_err_recov=fast_fail
• Allow the client adapter to check the health of the VIO server vscsi path:
  # chdev -l vscsi0 -a vscsi_path_to=30

• queue_depth on the client LPAR vscsi disks and the VIO server hdisks should match
• Calculate the max # of LUNs for a VSCSI adapter and configure this # or fewer. If more LUNs are needed, create additional VSCSI adapters.
  Max LUNs = (# command elements - # cmd elems reserved for the adapter) / (queue_depth + 3 cmd elems per LUN)
           = (512 - 2) / (queue_depth + 3)
  Example: queue_depth = 32 -> (512 - 2) / (32 + 3) = maximum 14 LUNs per VSCSI adapter


Looking for Buffer Structure Shortages

# vmstat -v | tail -5     (we only need to look at the last 5 lines)

0 pending disk I/Os blocked with no pbuf        -> if blocked on pbuf, increase pv_min_pbuf (lvmo) and varyoff/varyon the VG
0 paging space I/Os blocked with no psbuf       -> if blocked on psbuf, stop paging or add more paging spaces
2484 filesystem I/Os blocked with no fsbuf      -> if blocked on fsbuf (JFS), increase numfsbufs (ioo, restricted) to 1568
0 client filesystem I/Os blocked with no fsbuf  -> if blocked on client fsbuf (NFS/Veritas), increase the nfso nfs_vX_pdts and nfs_vX_vm_bufs values ("X" = 2, 3, or 4)
0 external pager filesystem I/Os blocked with no fsbuf -> if blocked on JFS2 fsbuf:
  1) increase j2_dynamicBufferPreallocation (ioo) to 128 (or higher)
  2) if that is not sufficient, increase j2_nBufferPerPagerDevice (ioo, restricted) to 2048 and unmount/remount the JFS2 filesystems

TIP: For more details on LV / FS buffer structures and tuning:
http://www-01.ibm.com/support/docview.wss?uid=isg3T1025198
Data Striping to Avoid I/O Hotspots

Old Wisdom
Isolate files based on function and/or usage
– Manually intensive effort
– Leads to I/O hotspots over time that impact throughput capacity and
performance

New Wisdom
Stripe objects across as many physical disks as possible
– Minimal manual intervention
– Evenly balanced I/O across all available physical components
– Good average I/O response time and object throughput capacity with no
hotspots

Implementation Options:
– ASM and GPFS do this automatically within a given disk group or file system
– Can be implemented with conventional Volume Managers and file systems



Data Layout for Optimal I/O Performance

Example: stripe or spread individual objects across multiple LUNs (hdisks) for maximum distribution
– Each object is spread across 4 LUNs, each from a different array (16 drives)

[Diagram, not reproducible as text: an AIX volume (disk) group or ASM disk group containing hdisk1-hdisk4, each mapped to a LUN on the storage subsystem. Software striping is done by AIX LVM striping with JFS2, by ASM, or by IBM GPFS; hardware striping is done within each storage array.]

Note: ASM, AIX LVM with a filesystem, and GPFS can not share the same hdisks.


AIX Storage Logical Volume Management

[Diagram, not reproducible as text: a volume group "my_vg" built from several hdisks, each divided into physical partitions (PPs); logical volumes (lv00, mirrored/strict; orabin, maximum spread; test, minimum spread) are built from logical partitions (LPs) mapped onto those PPs.]

– One hdisk belongs to 0 or 1 VG
– The size of an hdisk (a SAN LUN, for example) can be increased as needed and the VG increases in size accordingly
– An AIX system has 1 or more VGs
– A VG contains the LPs of 0 or more LVs
– An LV only contains PPs from a single VG; it can not spread over multiple VGs
– An LV can be increased in size up to the number of PPs in the VG; the size of an LV can also be decreased, but only unused LPs can be freed up
– LVs can be created without software RAID, with RAID-0, and with RAID 0+1
– An LV can be used "raw", or a JFS or JFS2 file system is created on top of it
– "Minimum spread" is the AIX smit(ty) default!
Software Striping with AIX – LV striping & PP spreading - HOWTO

Stripe using a Logical Volume (LV)
– Create the Logical Volume with the striping option: mklv -S <strip-size> ...
– Oracle recommends a stripe size that is a multiple of db_block_size * db_file_multiblock_read_count (usually 1 MB)
– Valid LV strip sizes (AIX 7.2, 7.1, 6.1, 5.3): 4K to 128M in powers of 2
– Use Scalable Volume Groups (VGs)
– Be aware of the implications for future growth of the LV: striped LVs need free space on all underlying LUNs to be able to grow! SAN based dynamic growing of a LUN works well.

PP striping (AKA "PP spreading")
– Create a Volume Group with an 8M, 16M or 32M PP size, but keep the number of PPs in the VG < 30,000. (The PP size will be the "strip size".)
– Choose a Scalable Volume Group: # mkvg -S -s <PPsize> ...
– Create the LV with the "Maximum range of physical volumes" option to spread the PPs over the different hdisks in a round-robin fashion: # mklv -e x ...

Note:
If you create a JFS2 filesystem on a striped (or PP-spread) LV, you can utilize the INLINE logging option. This avoids a "hot spot" by creating the JFS2 redo log inside the filesystem itself, which is striped, instead of using a single PP stored on one hdisk. (# crfs -a logname=INLINE ...)
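A minimal PP-spreading sketch, assuming hdisk2-hdisk5 are dedicated LUNs (all names and sizes illustrative):

# mkvg -S -s 16 -y oradatavg hdisk2 hdisk3 hdisk4 hdisk5    # scalable VG, 16MB PP size
# mklv -e x -t jfs2 -y oradatalv oradatavg 1024             # 1024 PPs spread round-robin (~16GB)
# crfs -v jfs2 -d oradatalv -m /oradata -a logname=INLINE -a agblksize=4096
# mount /oradata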


Oracle Server Architecture – Files

[Diagram, not reproducible as text: the same SGA/PGA process architecture as before, now emphasizing the on-disk files the processes work with – DB files, redo logs, archive logs, control files, flashback logs and the Oracle binaries.]

• SGA is Shared among processes
• PGA is Private to an individual server or background process
Oracle Options for Data Storage on AIX

[Support matrix, check marks not reproducible as text: storage options (JFS) / JFS2, RAW LV, GPFS, ASM, ACFS (11.2.0.2) and ACFS (12.2)* against the file types – database files, redo log files, control files, archive log files and Oracle binaries.]

RAW LV: unsupported for upgrades after 11gR2, or for new installs with 11gR2 or later.

* Note that there are restrictions in combination with Oracle Restart
Improving I/O performance

Reduce the Amount of Physical I/O
– Improve query execution plans
– Improve the database cache hit ratio (increase db_cache_size) and/or increase the PGA
– Use the Oracle In-Memory Option (chargeable feature)

Reduce the Cost of Physical I/O (Improve Service Time)
– Use Flash or SSD
– Improve the storage subsystem layout
– Spread I/O over more physical resources
– Increase the disk RPM


Reduce the Cost of IO - Improving redo writes

[Chart, not reproducible as text: redo log write latency with redo logs on HDD versus on IBM FlashSystems – flash for redo dramatically reduces log write times.]

4K redo log option in 11.2.0.3+ (can benefit Flash, with JFS2 or ASM):
– Set the database instance parameter "_disk_sector_size_override"=TRUE
– Then add 4K logfiles and delete the old 512-byte log files:
  SQL> alter database add logfile '+RECO' size 5G blocksize 4096;
– Watch out for redo wastage reported in AWR.

For Oracle 12.2 or later also see:
https://docs.oracle.com/en/database/oracle/oracle-database/12.2/ostmg/create-diskgroups.html#GUID-CACF13FD-1CEF-4A2B-BF17-DB4CF0E1800C
Improve Database IO service time with Host Level Striping

ASM
– Stripes by default when multiple LUNs are configured per ASM disk group
– 10gR2: strip size is 128k (fine-grained) or the Allocation Unit (AU) size (coarse-grained)
– 11g+: strip size = Allocation Unit (AU) size, default = 1 MB
– The AU size can be changed at the disk group level; an example would be a 4MB or 8MB size for data warehouse type workloads

Oracle single-instance filesystems (JFS2)
– Use AIX Physical Partition (PP) spreading or coarse LV striping
  » LV strip sizes: 4K-128M; 1MB is most common for Oracle databases
  » PP strip sizes: PP striping works with PP sizes of up to 32MB (large DBs up to 64MB)
– Use Scalable Volume Groups (VGs)

GPFS (Elastic Storage)
– Stripes by default when multiple LUNs are configured per filesystem
– Strip size (based on block size) is configurable


Tuning Oracle DB Buffer Cache

Buffer Cache is the primary database I/O avoidance option!

Old Wisdom:
If the buffer hit% is > 90% it's good enough.

New Wisdom:
Depending on the workload, a higher hit% may provide significant improvements.
– For a given workload with a buffer hit% of 98%, a 1% increase (to 99%) will reduce physical I/O requests by 50%
– Reducing IOPS typically also improves response time for the remaining I/Os
– In many cases, adding server memory may be cheaper than adding I/O subsystem cache memory or short-stroking disks

Evaluate the impact of increasing db_cache_size on physical I/O (see the advisory query below). Monitor for and address potential impact:
– Increased logical read rates and higher peak CPU demand due to reduced I/O wait time (increase CPU capacity as appropriate to benefit from the reduced I/O wait time)
– System paging due to memory shortage (add physical memory as necessary)
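The buffer pool advisory can quantify the expected effect before memory is added; a sketch querying v$db_cache_advice (assumes the DEFAULT pool with an 8K block size):

SQL> SELECT size_for_estimate, size_factor, estd_physical_reads
     FROM v$db_cache_advice
     WHERE name = 'DEFAULT' AND block_size = 8192
     ORDER BY size_for_estimate;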


Reduce the amount of Physical IO

9 TB customer DB with a batch transactional query workload
• The same workload was benchmarked with 6.5 GB and 101.5 GB DB buffer cache configurations

Buffer cache tuning resulted in:
• 91% reduction in User I/O wait
• 89% reduction in Other (e.g. latch) wait
• 79% reduction in total "In DB" time

[Chart, not reproducible as text: total "In DB" time (hours, split into User I/O, DB CPU and Other) and IOPS for the 6.5 GB versus the 101.5 GB buffer cache configuration.]


JFS2 Filesystem Mount Options (Non-RAC)

Common mount options for Oracle file systems:

– Buffer Caching (default): stage data in the file system buffer cache
  • May work well for heavy sequential read activity with few updates

– Concurrent I/O (CIO): non-cached reads + no write serialization (JFS2 only)
  • Best for random reads and/or heavy update workloads

– Release Behind Write (RBW): memory pages are released (available for stealing) after the pages are written to disk

– Release Behind Read (RBR): memory pages are released after the pages are read

– Release Behind Read / Write (RBRW): combination of RBR and RBW

– No Access Time (NOATIME): do not update the last accessed time when a file is accessed


Filesystems Mount Options (DIO, CIO)

[Diagram, not reproducible as text: read/write paths to a single data file with and without DIO/CIO.]

Note that access to a single data file is illustrated. There is one independent "inode lock" per file to control concurrent access.
JFS2 settings - How to configure CIO
Oracle Version < 11.2.0.2
– If Oracle files do not need to be concurrently accessed by external utilities, set filesystemio_options=SETALL
– Otherwise set filesystemio_options=ASYNCH and use the dio (JFS) or cio (JFS2) mount option

Oracle Version >= 11.2.0.2 (and >= AIX 6.1)
– Oracle utilizes the O_CIOR open flag instead of O_CIO. This enables shared read access for other utility programs.
– EITHER:
  • Use filesystemio_options=SETALL and do NOT use the cio mount option
  • CIO mount options may still be used if filesystemio_options=ASYNCH
– Create a separate filesystem for redo logs and control files with agblksize=512, unless the disk sector size is 4k. If the disk sector size for redo is 4k and the database version is 11.2.0.3+, use the default JFS2 agblksize and create redo logs with a block size of 4k (MOS # 1681266.1). Note that "redo wastage" can increase significantly with a 4k redo block size.

When using DIO/CIO, the FS buffer cache isn't used. Consider the following Oracle database changes:
– Increase db_cache_size
– Increase db_file_multiblock_read_count (with 11gR2 and 12c, use the database default!)

AIX APAR IV76026: CIO/DIO ENABLED FILESYSTEMS CAN CRASH THE SYSTEM WITH ISI_PROC (affects AIX 6.1, 7.1 and 7.2 releases)
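For reference, the instance parameter can be set like this sketch (it is not dynamic; a restart is required):

SQL> alter system set filesystemio_options='SETALL' scope=spfile;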


JFS/JFS2 settings
Data Base Files (DBF)
• I/O size ranges from db_block_size to db_block_size * db_file_multiblock_read_count
• Use CIO (or DIO for JFS) or filesystem cache, depending on I/O characteristics
• If the database block size is >= 4096, use a filesystem block size of 4096, else use 2048

Redo Log / Control Files
• I/O size is always a multiple of 512 bytes
• Use CIO (or DIO for JFS), dedicated filesystem(s), and set the filesystem block size (agblksize) to 512, unless 4k sector size disks are used (see the sketch below)

Archive Log and Backup Files
• The 'rbrw' mount option can be advantageous
• CIO is used for archive logs when filesystemio_options=SETALL

Flashback Log Files
• Writes are sequential, sized as a multiple of db_block_size
• By default, dbca will configure a single location – the flash recovery area – for flashback logs, archive logs, and backups
• Flashback log files should use a CIO, DIO, or 'rbrw' mount

Oracle Binaries
• Don't use CIO or DIO
• Use NOATIME to reduce 'getcwd' overhead
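A hedged example of creating a dedicated redo log filesystem per these guidelines (LV and mount point names illustrative; only mount with cio if filesystemio_options=ASYNCH, per the previous slide):

# crfs -v jfs2 -d redolv -m /oraredo -A yes -a agblksize=512
# mount -o cio,noatime /oraredo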


Asynchronous I/O for filesystem environments

AIX parameters (not relevant if aio_fsfastpath=1 is active):
– aio_minservers = 3 ; minimum # of AIO server processes per logical CPU
– aio_maxservers = 30 ; maximum # of AIO server processes per logical CPU
– aio_maxreqs = 65536 ; maximum # of concurrent AIO requests
– aio_server_inactivity = 300 ; time before idle AIO processes are terminated (AIX 6.1 and later only)
– "Enable" at system restart (not required with AIX 6.1, AIX 7.1 or AIX 7.2)

AIX APAR IV70032: CREATING AIOSERVER THREADS DELAYS WHILE HOLDING LOCK (for releases up to AIX 7.1 TL3) – slow startup of AIO processes (workaround: use kernelized AIO, or increase aio_minservers and set aio_server_inactivity to 86400)

Oracle parameters
– disk_asynch_io = TRUE
– filesystemio_options = {ASYNCH | SETALL}
– db_writer_processes (typically leave at the default)
– db_writer_io_slaves (do not set when using AIO)
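The AIX side can be checked and set with ioo on AIX 6.1 and later; a sketch (-F also displays restricted tunables such as aio_fsfastpath):

# ioo -F -a | grep aio     # review the current AIO settings
# ioo -p -o aio_minservers=3 -o aio_maxservers=30 -o aio_maxreqs=65536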


AIX rendev command with ASM
The rendev command is used for renaming devices which are listed in the ODM.

Syntax / Description
– rendev -l <original name> -n <new name>
– The device entry under /dev will be renamed corresponding to <new name>
– Certain devices such as /dev/console, /dev/mem, /dev/null, and others that are identified only with /dev special files cannot be renamed
– The command will fail for any device that does not have both a Configure and an Unconfigure method
– Any name that is 15 characters or less and not already used in the system can be used

If used to rename hdisk devices for ASM use, it is recommended that you keep the "hdisk" prefix, as this will allow the default ASM discovery string to match the renamed hdisks. The corresponding rhdisk is renamed as well.
Example:
# rendev -l hdisk10 -n hdiskASM10
# ls /dev/*ASM*
/dev/hdiskASM10
/dev/rhdiskASM10
AIX lkdev command with ASM
The lkdev command locks the specified device. Any attempt to modify device attributes by using the chdev or chpath command is denied. In addition, an attempt to delete the specified device or one of its paths from the ODM by using either the rmdev or rmpath command is denied.

Syntax:
lkdev [ -l <Name> -a | -d [ -c <Text> ] ]
  <Name>     Name of the device to be changed (required)
  -a         Locks the specified device.
  -d         Unlocks the specified device.
  -c <Text>  Specifies a text label of up to 64 printable characters with no embedded spaces.

Examples:
– To enable the lock for the hdiskASM10 disk device and create a text label, enter the following command:
  # lkdev -l hdiskASM10 -a -c ASMdisk
– To remove the lock for the hdiskASM10 disk device and remove the text label, enter the following command:
  # lkdev -l hdiskASM10 -d

Note:
The text label of a locked device can not be changed! Instead, the device needs to be first unlocked and then locked again with the new text label specified.
AIX lkdev command (continued)
Additional information:
– lkdev with no parameters will return a list of all locked devices, including any defined label information:
  # lkdev
  hdiskASM10 asmdisk

– The lspv command has been extended to display the device status of a locked device:
  $ lspv
  hdisk0       00f623c450d9a96f   rootvg     active
  hdisk1       00f623c469960b72   orabinvg   active
  hdiskASM10   none                          asmdisk locked

Note regarding Oracle database with ASM connected to IBM Storwize V7000:
– Oracle ASM disk groups may dismount with the following error: "Waited 15 secs for write IO to PST"
– Recommendation: increase asm_hbeatiowait to 120 seconds to prevent this issue from occurring. Applies to Oracle Database - Enterprise Edition - Version 11.2.0.3 to 12.1.0.1 [Release 11.2 to 12.1] on any platform.
Oracle In-Memory – Impact on CPU / IO for data warehouse queries

Row format:
• 726GB fact table
• CPU bound
• Peak 5.5GB/s read, sustained > 2.5GB/s
• 12TB of data read from disk!

IM format – with In-Memory:
• 171GB compressed In-Memory fact table
• CPU bound
• Peak 0.11MB/s read
• 8MB of data read from disk!
Agenda

– Memory
– CPU
– I/O
– Network
– Oracle Patches
– Miscellaneous



Network parameters (no)
use_isno = 1 means any parameters set at the interface level override parameters set with 'no'
– This is the DEFAULT (and a restricted tunable) in AIX 6.1

If use_isno = 0, any parameters set with 'no' override the interface-specific parameters.
If use_isno = 1, set parameters for each interface using 'ifconfig' or 'chdev' (see the sketch below).

Refer to the following URL for a chart of appropriate interface-specific parameters:
https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.networkcomm/interfaces_options.htm

Generally appropriate parameters for 1 or 10 Gigabit Ethernet Oracle public network interfaces:
– tcp_sendspace = 262144
– tcp_recvspace = 262144
– rfc1323 = 1

New in AIX 7.1: the tcp_fastlo parameter
– no -p -o tcp_fastlo=1
– For improving the performance of loopback connections
– (AIX 6.1 TL9 (SP3/SP4) - APAR IV67463, AIX 7.1 TL3 (SP3/SP4) - APAR IV66228 for a memory leak issue)

tcp_nodelay on the network adapter (do not set tcp_nodelayack via "no")
– Useful for RAC interconnects and/or LAN connected application server and database
– chdev -l <enX> -a tcp_nodelay=1
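A sketch applying the interface-specific (ISNO) values to a public interface (interface name illustrative):

# chdev -l en0 -a tcp_sendspace=262144 -a tcp_recvspace=262144 -a rfc1323=1
# ifconfig en0      # the ISNO values in effect are shown in the interface output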


Agenda

– Memory
– CPU
– I/O
– Network
– Oracle Patches
– Miscellaneous



Database Release Schedule

[Chart not reproducible as text. Source: Oracle]
Agenda
AIX Configuration/Tuning for Oracle

– Memory & CPU
– I/O
– Network
– Oracle Patches
– Miscellaneous


AIX CRITICAL APAR

IBM HIPER APAR - ORA 600 ERRORS AND ORACLE CORE DUMPS AFTER AIX SP UPGRADE

PROBLEM SUMMARY:
The thread_cputime or thread_cputime_fast interfaces can cause invalid data in the FP/VMX/VSX registers if the thread page faults in this function.

AFFECTED LEVELS and FIXES (iFixes: ftp://aix.software.ibm.com/aix/ifixes/):

  Affected AIX Levels   Fixed In      iFix / APAR
  6100-09-08            6100-09-09    IV93840
  7100-03-08            7100-03-09    IV93884
  7100-04-03            7100-04-04    IV93845
  7200-00-03            7200-00-04    IV93883
  7200-01-01            7200-01-02    IV93885


Miscellaneous parameters

/etc/security/limits
– Set to "-1" for everything except core for the oracle, grid and root users (see the sketch below)

sys0 maxuproc attribute
– Should be >= 16384
– For workloads with a large number of concurrent connections and/or parallel servers, should be at least 128 plus the sum of PROCESSES and PARALLEL_MAX_SERVERS for all instances in the LPAR

Environment variables:
– AIXTHREAD_SCOPE=S
– LDR_CNTRL settings – see the "Oracle 12.1.x and 11.2.0.4 Database Performance Considerations with AIX on POWER8" technical notes white paper (WP102608) for more details on how to set it

Time synchronization – for RAC environments, use the xntpd "-x" flag

Increase priority of the Oracle/Grid users:
– chuser capabilities=CAP_NUMA_ATTACH,CAP_BYPASS_RAC_VMM,CAP_PROPAGATE oracle
  • CAP_NUMA_ATTACH gives authority for non-root processes to increase priority
  • CAP_PROPAGATE allows parent->child capability propagation
  • CAP_BYPASS_RAC_VMM is required for oprocd/ocssd.bin to be pinned in memory
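A sketch of the corresponding commands (user name and values illustrative; note that core is left at its default):

# chdev -l sys0 -a maxuproc=16384
# chuser fsize=-1 cpu=-1 data=-1 stack=-1 rss=-1 nofiles=-1 oracle
# lsuser -a capabilities oracle     # verify the capabilities set above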
Key Resources
Metalink Notes
Minimum Software Versions and Patches Required to Support Oracle Products on Power Systems – 282036.1

Things to Consider Before Upgrading to 11.2.0.3 to Avoid Poor Performance or Wrong Results – 1392633.1

Things to Consider Before Upgrading to 11.2.0.4 to Avoid Poor Performance or Wrong Results – 1645862.1

RAC and Oracle Clusterware Best Practices and Starter Kit (AIX) 811293.1

AIX: Top Things to DO NOW to Stabilize 11gR2 GI/RAC Cluster 1427855.1

Oracle Database on UNIX AIX, HP-UX, etc Unix Operating Systems Installation and Configuration Requirements Quick
Reference 169706.1

GPFS and Oracle RAC 1376369.1

Best Practices: Proactively Avoiding Database and Query Performance Issues – 1482811.1

Recommended Bundle Patches for AIX – 2067154.1

Recommended Bundle patch for AIX and 11.2.0.3 with critical fixes – 1528081.1

Recommended Bundle patch for AIX and 11.2.0.4 with critical fixes – 2022567.1

Recommended Bundle patch for AIX and 12.1.0.2 with critical fixes – 2022559.1
IBM Key Resources

IBM Techdocs - the Technical Sales Library http://www-03.ibm.com/support/techdocs/atsmastr.nsf/Web/TechDocs

Oracle 12.1.x and 11.2.0.4 Database Performance Considerations with AIX on POWER8
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102608

IBM Power System, AIX and Oracle Database Performance Considerations (for Oracle versions up to 11.2.0.3)
https://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102171

Oracle Database 11g and 12c on IBM Power Systems S924, S922 and S914 with POWER9 processors
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102750

Oracle on IBM Power Technology: Adoption Roadmap


http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS4711



IBM Performance Detective for Oracle database
– Top 10 Findings/Recommendations
– Overall Foreground Wait Activity
– Foreground Wait Activity Detail
– Buffer Pool Tuning Advisory
– Key initialization parameters

IBM Performance Detective for Oracle database is a no charge service offering provided by the North America Power Technical Team – Oracle. Contact your IBM seller or IBM Business Partner for additional details!


Q&A



Appendix



Displaying Memory Usage Statistics
The ‘vmstat’ command provides information on current memory usage:
# vmstat -v
1048576 memory pages
1002006 lruable pages
812111 free pages
1 memory pools
141103 pinned pages
80.0 maxpin percentage
3.0 minperm percentage
90.0 maxperm percentage
3.2 numperm percentage
32779 file pages
0.0 compressed percentage
0 compressed pages
0.0 numclient percentage
90.0 maxclient percentage
0 client pages
0 remote pageouts scheduled
0 pending disk I/Os blocked with no pbuf
0 paging space I/Os blocked with no psbuf
2484 filesystem I/Os blocked with no fsbuf
0 client filesystem I/Os blocked with no fsbuf
0 external pager filesystem I/Os blocked with no fsbuf



Displaying Memory Usage Statistics
The ‘svmon -G’ command provides information on current memory usage per page size: (general
numbers are reported in 4K pages)

# svmon -G
size inuse free pin virtual
memory 1179648 926225 290287 493246 262007
pg space 1572864 5215

work pers clnt other


pin 91390 0 0 74176
in use 258573 4316 335656

PageSize PoolSize inuse pgsp pin virtual


s 4 KB - 477713 5215 94606 141175
m 64 KB - 7552 0 4435 7552
L 16 MB 80 0 0 80 0

Unused, pre-allocated 16M pages = PoolSize - inuse

Memory is reported in 4K pages.


Notices and disclaimers

© 2018 International Business Machines Corporation. No part of this document may be reproduced or transmitted in any form without written permission from IBM.

U.S. Government Users Restricted Rights — use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.

Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. This document is distributed "as is" without any warranty, either express or implied. In no event shall IBM be liable for any damage arising from the use of this information, including but not limited to, loss of data, business interruption, loss of profit or loss of opportunity. IBM products and services are warranted per the terms and conditions of the agreements under which they are provided.

IBM products are manufactured from new parts or new and used parts. In some cases, a product may not be new and may have been previously installed. Regardless, our warranty terms apply.

Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.

Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary.

References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business.

Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute, legal or other guidance or advice to any individual participant or their specific situation.

It is the customer's responsibility to ensure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer's business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer follows any law.


Notices and disclaimers continued

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM's products. IBM expressly disclaims all warranties, expressed or implied, including but not limited to, the implied warranties of merchantability and fitness for a purpose.

The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right.

IBM, the IBM logo, ibm.com and [names of other referenced IBM products and services used in the presentation] are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.


