Tactics for
Monitoring and
Performance-Tuning
AIX 5L/6.1
All original content by Earl Jew
FTSS Storage and System p
earlj@us.ibm.com (310)251-2907
AIX Virtual Users Group
April 30th, 2009
Network Information
Host Name: sapdb1
IP Address: 10.253.999.99
Subnet Mask: 255.255.255.128
Gateway: 10.252.999.9
Name Server: 10.244.999.99
Domain Name: comp.xxx.com
cpu min maj mpc int cs ics rq mig lpa sysc us sy wa id pc %ec lcs
0 16 24 85 454 577 278 0 52 100.0 1609 88.7 2.8 1.1 7.4 0.27 2.3 334
1 0 0 85 215 1 1 0 0 100.0 0 99.8 0.2 0.0 0.0 0.73 6.1 0
2 0 0 85 217 0 0 0 0 100.0 681 99.4 0.6 0.0 0.0 0.81 6.8 0
3 24 22 85 242 758 365 0 77 100.0 1936 83.8 2.4 2.5 11.3 0.19 1.5 459
4 0 0 85 222 18 9 0 0 100.0 261 99.7 0.3 0.0 0.0 0.74 6.2 0
5 32 10 85 227 363 142 0 117 100.0 974 91.0 1.0 0.4 7.6 0.26 2.1 247
6 0 0 85 226 102 32 0 49 100.0 484 96.7 1.6 0.0 1.7 0.40 3.3 44
7 0 0 85 227 5 1 0 0 100.0 44 99.9 0.1 0.0 0.0 0.60 5.0 0
8 0 0 85 267 3 2 0 3 100.0 295 99.6 0.4 0.0 0.0 0.90 7.5 16
9 18 4 1955 132 214 90 0 52 100.0 1219 61.1 6.0 0.1 32.9 0.10 0.9 223
10 0 0 85 213 1 1 0 0 100.0 200 99.7 0.3 0.0 0.0 0.73 6.1 0
11 3 3 85 226 344 137 0 103 100.0 624 92.8 0.8 0.6 5.9 0.27 2.2 236
12 1 2 85 233 346 159 0 55 100.0 704 93.9 0.7 0.2 5.2 0.32 2.7 278
13 0 0 85 233 38 16 0 14 100.0 128 99.4 0.2 0.0 0.4 0.67 5.6 26
14 5 7 85 230 413 174 0 66 100.0 527 93.1 0.9 0.1 6.0 0.31 2.6 235
15 0 0 85 234 8 4 0 0 100.0 3 99.8 0.2 0.0 0.0 0.69 5.7 0
16 0 0 85 274 0 0 0 0 100.0 325 99.6 0.4 0.0 0.0 0.52 4.4 0
17 0 0 85 215 1 1 0 0 100.0 0 99.9 0.1 0.0 0.0 0.48 4.0 0
18 0 0 85 214 0 0 0 0 100.0 0 99.9 0.1 0.0 0.0 0.66 5.5 0
19 8 10 85 226 130 69 0 5 100.0 437 95.4 0.4 0.2 4.0 0.34 2.8 103
20 2 3 85 238 266 96 0 101 100.0 271 95.5 0.8 0.2 3.5 0.45 3.7 161
21 8 6 85 236 198 87 0 58 100.0 400 96.1 0.8 0.2 2.9 0.46 3.9 148
22 0 0 85 229 0 0 0 0 100.0 0 99.8 0.2 0.0 0.0 0.67 5.6 0
23 7 16 85 230 357 180 0 30 100.0 1447 93.9 1.1 1.5 3.5 0.33 2.7 273
U - - - - - - - - - - - - 0.1 0.8 0.10 0.9 -
ALL 124 107 3910 5660 4143 1844 0 782 100.0 12569 96.7 0.6 0.2 2.5 11.90 99.1 2783
$
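The per-logical-CPU listing above matches the default AIX mpstat report; a hedged example of capturing it (interval and count are illustrative):
mpstat 2 5   # five 2-second samples: per-CPU faults, interrupts, context switches, run queue, and SPLPAR pc/%ec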
cpu cs ics bound rq push S3pull S3grd S0rd S1rd S2rd S3rd S4rd S5rd ilcs vlcs
0 1387 947 0 0 0 0 0 86.9 0.2 0.0 12.9 0.0 0.0 0 1317
1 4 2 0 0 0 0 0 100.0 0.0 0.0 0.0 0.0 0.0 0 1359
2 1165 752 0 0 0 0 0 85.2 0.2 0.0 14.6 0.0 0.0 0 883
3 0 0 0 0 0 0 0 100.0 0.0 0.0 0.0 0.0 0.0 0 576
4 868 576 0 0 0 0 0 83.5 0.3 0.0 16.3 0.0 0.0 0 801
5 3 2 0 0 0 0 0 100.0 0.0 0.0 0.0 0.0 0.0 0 882
6 1259 813 0 0 0 0 0 84.9 0.1 0.0 14.9 0.0 0.0 0 1091
7 2 2 0 0 0 0 0 100.0 0.0 0.0 0.0 0.0 0.0 0 1113
8 1207 809 0 0 0 0 0 85.5 0.0 0.0 14.5 0.0 0.0 0 1082
9 0 0 0 0 0 0 0 - - - - - - 0 1099
10 990 654 0 0 0 0 0 87.0 0.0 0.0 13.0 0.0 0.0 0 804
11 0 0 0 0 0 0 0 - - - - - - 0 802
12 1388 907 0 0 0 0 0 89.4 0.0 0.0 10.6 0.0 0.0 0 986
13 0 0 0 0 0 0 0 - - - - - - 0 990
14 1104 730 0 0 0 0 0 86.7 0.0 0.0 13.3 0.0 0.0 0 949
15 0 0 0 0 0 0 0 - - - - - - 0 960
16 659 411 0 0 0 0 0 88.8 0.0 0.0 11.2 0.0 0.0 0 524
17 0 0 0 0 0 0 0 - - - - - - 0 589
18 863 452 0 0 0 0 0 96.1 0.0 0.0 3.9 0.0 0.0 0 215
19 0 0 0 0 0 0 0 - - - - - - 0 316
20 1175 766 0 0 0 0 0 87.1 0.0 0.0 12.9 0.0 0.0 0 943
21 0 0 0 0 0 0 0 - - - - - - 0 951
22 1224 705 0 0 0 0 0 91.8 0.0 0.0 8.2 0.0 0.0 0 635
23 0 0 0 0 0 0 0 - - - - - - 0 639
ALL 13298 8528 0 0 0 0 0 87.7 0.1 0.0 12.2 0.0 0.0 0 20506
------------------------------------------------------------------------------------------------------
cpu cs ics bound rq push S3pull S3grd S0rd S1rd S2rd S3rd S4rd S5rd ilcs vlcs
0 1969 1277 0 0 0 0 0 83.2 0.0 0.0 16.8 0.0 0.0 0 1940
1 0 0 0 0 0 0 0 100.0 0.0 0.0 0.0 0.0 0.0 0 1968
2 1679 1068 0 0 0 0 0 85.7 0.0 0.0 14.3 0.0 0.0 0 1216
3 0 0 0 0 0 0 0 - - - - - - 0 1236
…
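The dispatcher tables above match the column layout of mpstat with the -d flag; a hedged example (interval and count illustrative):
mpstat -d 2 5   # per-CPU dispatch detail: S0rd-S5rd affinity-domain redispatch percentages, ilcs/vlcs context switches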
tty: tin tout avg-cpu: % user % sys % idle % iowait physc % entc time
0.0 62.1 92.4 4.8 0.3 2.6 11.8 98.6 15:42:10
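A header like the above resembles AIX iostat restricted to tty/CPU statistics with a timestamp column; a hedged example (interval and count illustrative):
iostat -t -T 2 5   # -t reports tty/CPU statistics only; -T appends the time of each sample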
General Statistics:
-------------------
No mbuf Errors: 0
Adapter Reset Count: 0
Adapter Data Rate: 2000
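Adapter counters like these are typically read per interface; a hedged example (the device name ent0 is an assumption for illustration):
entstat -d ent0   # detailed adapter statistics, including mbuf errors, reset count, and data rate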
…
…
Server nfs:
calls badcalls public_v2 public_v3
715102602 27 0 0
Version 2: (256 calls)
null getattr setattr root lookup readlink read
256 100% 0 0% 0 0% 0 0% 0 0% 0 0% 0 0%
wrcache write create remove rename link symlink
0 0% 0 0% 0 0% 0 0% 0 0% 0 0% 0 0%
mkdir rmdir readdir statfs
0 0% 0 0% 0 0% 0 0%
Version 3: (715102346 calls)
null getattr setattr lookup access readlink read
208 0% 56755889 7% 1397024 0% 430178070 60% 4733070 0% 0 0% 55037597 7%
write create mkdir symlink mknod remove rmdir
43093505 6% 2409219 0% 7 0% 0 0% 0 0% 2329235 0% 14 0%
rename link readdir readdir+ fsstat fsinfo pathconf
110306 0% 63 0% 17771684 2% 9038704 1% 65544670 9% 206 0% 95 0%
commit
26702780 3%
Client rpc:
Connection oriented
calls badcalls badxids timeouts newcreds badverfs timers
272059 0 0 0 0 0 0
nomem cantconn interrupts
0 0 0
Connectionless
calls badcalls retrans badxids timeouts newcreds badverfs
79 0 0 0 0 0 0
timers nomem cantsend
0 0 0
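Server and client figures like the above can be pulled with nfsstat; hedged examples:
nfsstat -s   # server-side RPC and per-operation NFS call counts
nfsstat -c   # client-side RPC statistics, connection-oriented and connectionless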
…
…
vgname = rootvg
pv_pbuf_count = 512
total_vg_pbufs = 1024
max_vg_pbuf_count = 16384
pervg_blocked_io_count = 0
pv_min_pbuf = 512
global_blocked_io_count = 176559
vgname = oravg
pv_pbuf_count = 512
total_vg_pbufs = 8192
max_vg_pbuf_count = 16384
pervg_blocked_io_count = 176559
pv_min_pbuf = 512
global_blocked_io_count = 176559
…
…
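Note the lvmo listing above: oravg has accumulated all 176559 pervg_blocked_io_count events, i.e. I/Os blocked waiting on pbufs for that volume group. A hedged sketch of raising its per-PV pbuf allocation (the value 2048 is illustrative):
lvmo -v oravg -o pv_pbuf_count=2048   # add more pbufs per physical volume in oravg
lvmo -a -v oravg                      # re-display the counters to confirm the new total_vg_pbufs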
•Adjust to the ever-greater scale, range, and power inherent in each generation of technology
•Devise tactics that exploit the features of new technology for competitive advantage
•Or rather, don’t tolerate traditions that confound the value and benefits of new technology
•Trends toward greater virtualization and consolidation often create/insert more layers
•Tuning should thus entail the alignment/optimization of an infrastructure’s interfacing layers
•Most AIX 5L performance-tuning issues stem from accepting the AIX 5L default values
•Resize AIX 5L values to the specific scale, character and intensities of monitored workloads
AIX vmo:lru_file_repage=0 # default=1; all Enterprise workload LPARs must have this disabled (=0)
AIX ioo:pv_min_pbuf=2048 # given today’s larger SAN LUNs, this ensures a sufficient allocation of pbufs per volume-group.
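A hedged sketch of applying both settings dynamically and persistently (-p also writes the change to /etc/tunables/nextboot so it survives reboot):
vmo -p -o lru_file_repage=0   # stop weighing file repage rates; favor computational memory
ioo -p -o pv_min_pbuf=2048    # raise the minimum pbufs allocated per physical volume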
VMM:vmo nitty-gritty…
•Devise tactics that exploit the features of new technology for competitive advantage
•Distinguish Enterprise workloads from Infrastructure workloads, and “architect” accordingly
AIX vmo:memory_affinity=0
Default=1 generally assumes an Infrastructure workload of short-lived processes. An Enterprise workload consists of long-lived,
enduring user processes, e.g. the Oracle RDBMS; as such, its constituent threads face never-ending migrations. Disabling this (=0)
ensures all memory pools will be close to equal in size and will serve all migrated/migrating threads equally. Disabling it (=0) also
activates the ability to use vmo:cpu_scale_memp below.
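memory_affinity is a reboot-required tunable (and restricted on AIX 6.1), so a hedged sketch stages it for the next boot rather than changing it live:
vmo -r -o memory_affinity=0   # recorded in /etc/tunables/nextboot; vmo prompts for bosboot, takes effect after reboot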
AIX vmo:lru_file_repage=0
Default=1 ensures comp and noncomp memory are treated equally. Setting =0 assumes an Enterprise workload that greatly favors
computational memory over noncomp JFS/JFS2 filesystem buffer-cache. Setting =0 causes lrud to steal only non-computational
mempages until numclient<=minperm% or numperm<=minperm%, at which point both comp and noncomp mempages can be stolen.
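A hedged way to watch numperm/numclient against these watermarks while tuning:
vmstat -v | grep -i -E 'perm|client'   # shows minperm/maxperm/maxclient percentages beside current numperm/numclient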
AIX vmo:strict_maxperm=0
AIX vmo:strict_maxclient=0
Disabling both (=0) causes lrud to be active only when needed to maintain minfree mempages per mempool.
AIX vmo:maxperm%=40
AIX vmo:maxclient%=40
AIX vmo:minperm%=20
When both vmo:strict_maxperm=0 and vmo:strict_maxclient=0, maxperm% and maxclient% become merely
symbolic watermarks. That is, lrud scanning and stealing is no longer triggered when numperm>=maxperm% or
numclient>=maxclient%. This limits lrud scanning and stealing to maintaining minfree mempages per mempool.
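A hedged one-liner applying this group of watermarks together (vmo accepts multiple -o settings per invocation):
vmo -p -o strict_maxperm=0 -o strict_maxclient=0 -o maxperm%=40 -o maxclient%=40 -o minperm%=20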
AIX ioo:pv_min_pbuf=2048
Given today’s larger SAN LUNs, this ensures a sufficient allocation of pbufs per volume-group.
Example: All JFS/JFS2 filesystem fi/fo's, as well as pagingspace pi/po's, are conveyed to/from main memory via pbufs per VG,
fsbufs per filesystem, and psbufs per pagingspace. If the intensity of coincident and nearly-coincident I/Os exhausts the
entitlements of pbufs/fsbufs/psbufs, then I/Os slow to rates of real-time de-staging, i.e. I/Os blocked with no fsbuf. For a
worthwhile trade of lruable memory, greater allocations of "bufs", matched to subsystem capacities, can greatly temper blocked I/Os.
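A hedged way to check whether these entitlements are being exhausted on a running system:
vmstat -v | grep -i blocked   # cumulative pbuf, psbuf, and fsbuf "blocked with no ..." counts since boot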
This example of JFS/JFS2 tuning values is scaled to 3yr-old enterprise SAN storage technology.
•AIX ioo:numfsbufs=2048 # default 196 vs 2048 JFS fsbuf’s
•AIX ioo:j2_nBufferPerPagerDevice=2048 # default 512 vs 2048 JFS2 fsbuf’s
•AIX ioo:j2_dynamicBufferPreallocation=256 # default 16 vs 256 dynamic fsbuf’s
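A hedged sketch of applying the three resized values together (note numfsbufs and j2_nBufferPerPagerDevice generally take effect only for filesystems mounted after the change):
ioo -p -o numfsbufs=2048 -o j2_nBufferPerPagerDevice=2048 -o j2_dynamicBufferPreallocation=256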
In general, each successive sequential JFS/JFS2 filesystem read doubles its page read-ahead request until it reaches its
JFS:maxpgahead or JFS2:j2_maxPageReadAhead value. These pages are the familiar 4096byte memory pages.
Note: The column-units of vmstat:memory (avm fre) and vmstat:page (fi fo pi po fr sr) are in 4096byte memory pages.
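For the worked example that follows, the read-ahead ceiling itself would be raised with a hedged command such as:
ioo -p -o j2_maxPageReadAhead=2048   # cap JFS2 sequential read-ahead at 2048 4096-byte pages (8MB) per stream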
•Calculating sysminfree/sysmaxfree using the resized example values -- plus arbitrarily formulating the use of
AIX ioo:j2_maxPageReadAhead=2048
as a multiplying factor:
AIX vmo:minfree=[5*2048]=10240 # sysminfree= four mempools*10240 = 40960 * (4k mempages) = 167.7mb
AIX vmo:maxfree=[6*2048]=12288 # sysmaxfree= four mempools*12288 = 49152 * (4k mempages) = 201.3mb
Essentially, this range of sysminfree-to-sysmaxfree means 20-24 concurrent streams of JFS2-sequential reads can be
accommodated at any given moment, i.e. AIX pbuf/fsbuf/psbufs -> available vmstat:memory:fre.
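A hedged sketch of applying the calculated per-mempool values (following the arithmetic above; vmo requires maxfree to exceed minfree):
vmo -p -o minfree=10240 -o maxfree=12288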
•The trends of technology tend toward every manner of consolidation, virtualization, mobility, autonomy
•IT over time grows bigger baskets holding more eggs -- managed by fewer smarter hens