• Boot from a directly attached disk drive, CD/DVD-ROM, tape, ...
• Boot from a SAN disk through an HBA (Host Bus Adapter).
Multiple paths to the boot disk are also supported.
• Boot from a disk provided by a Virtual I/O Server through a virtual SCSI
adapter (dual VIOSes can provide the same boot disk)
• Boot from a SAN disk through a virtual Fibre Channel adapter provided
by a Virtual I/O Server using the N_Port ID Virtualization (NPIV) feature.
• The boot list can be examined with the bootlist command. The device tree can also be examined
in snap data (the devtree.out file in the general directory).
In devtree.out:
boot-device
2f706369 40383030 30303030 32303030 [/pci@80000002000]
30303135 2f706369 40322c32 2f666962 [0015/pci@2,2/fib]
72652d63 68616e6e 656c4031 2f646973 [re-channel@1/dis]
6b403530 30353037 36333030 63316130 [k@5005076300c1a0]
39362c35 37303830 30303030 30303030 [96,5708000000000]
3030303a 32202f70 63694038 30303030 [000:2 /pci@80000]
30303230 30303030 31352f70 63694032 [0020000015/pci@2]
2c322f66 69627265 2d636861 6e6e656c [,2/fibre-channel]
40312f64 69736b40 35303035 30373633 [@1/disk@50050763]
30306364 61303936 2c353730 38303030 [00cda096,5708000]
30303030 30303030 303a3220 2f706369 [000000000:2 /pci]
40383030 30303030 32303030 30303135 [@800000020000015]
2f706369 40322c32 2f666962 72652d63 [/pci@2,2/fibre-c]
68616e6e 656c4031 2f646973 6b403530 [hannel@1/disk@50]
30353037 36333030 63396130 39362c35 [05076300c9a096,5]
37303830 30303030 30303030 3030303a [708000000000000:]
32202f70 63694038 30303030 30303230 [2 /pci@800000020]
30303030 31372f70 63694032 2c322f66 [000017/pci@2,2/f]
69627265 2d636861 6e6e656c 40312f64 [ibre-channel@1/d]
....
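The hex words in the boot-device property are simply the ASCII bytes of the space-separated Open Firmware device paths, so the dump can be turned back into a readable path list. A minimal Python sketch, assuming the lines have the layout shown above (hex words followed by a bracketed ASCII column); the function name decode_boot_device is hypothetical and the sample is the first six dump lines from this slide:

```python
import re

def decode_boot_device(dump_lines):
    """Decode devtree.out-style hex dump lines into a list of device paths."""
    raw = bytearray()
    for line in dump_lines:
        # Drop the bracketed ASCII column and keep only the hex words.
        hex_part = line.split("[", 1)[0]
        for word in re.findall(r"\b[0-9a-fA-F]+\b", hex_part):
            if len(word) % 2 == 0:
                raw.extend(bytes.fromhex(word))
    # The property is NUL-terminated; entries are separated by spaces.
    text = raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")
    return text.split()

sample = [
    "2f706369 40383030 30303030 32303030 [/pci@80000002000]",
    "30303135 2f706369 40322c32 2f666962 [0015/pci@2,2/fib]",
    "72652d63 68616e6e 656c4031 2f646973 [re-channel@1/dis]",
    "6b403530 30353037 36333030 63316130 [k@5005076300c1a0]",
    "39362c35 37303830 30303030 30303030 [96,5708000000000]",
    "3030303a 32202f70 63694038 30303030 [000:2 /pci@80000]",
]

paths = decode_boot_device(sample)
for path in paths:
    print(path)
```

Because the sample stops in the middle of the second entry, the last printed path is incomplete; with the full property every entry is a complete device path.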
• With ioinfo at the Open Firmware prompt, we can check the boot devices and run several I/O tests on a specific device.
0 > ioinfo
1. SCSIINFO
2. IDEINFO
3. SATAINFO
4. SASINFO
5. USBINFO
6. FCINFO <====
7. VSCSIINFO
q - quit/exit
==> 6
q - Quit/Exit
==> 2
q - Quit/Exit
==> 2
Select a FC Device : 1
FC Device Menu
FC Target Address ==> 50060e801530f310 FC Lun Address ==> 0
FC Device String: /vdevice/vfc-client@30000006/disk@50060e801530f310,0:0
FC Device: 10240 MB Disk drive (bootable)
----------------------------------------------------------------------
q - Quit/Exit
==> 1
000002f4cd00: 00 00 03 32 cf 00 00 02 48 49 54 41 43 48 49 20 :...2....HITACHI :
000002f4cd10: 4f 50 45 4e 2d 56 20 20 20 20 20 20 20 20 20 20 :OPEN-V :
000002f4cd20: 36 30 30 34 35 30 20 31 33 30 46 33 33 30 33 33 :600450 130F33033:
000002f4cd30: 20 32 41 20 01 01 01 01 00 00 00 00 00 00 00 00 : 2A ............:
000002f4cd40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 :................:
000002f4cd50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 :................:
000002f4cd60: 05 01 05 70 30 30 ff 00 c0 50 76 00 1a b6 00 3a :...p00...Pv....::
000002f4cd70: c0 50 76 00 1a b6 00 3a 00 00 00 0f 00 00 00 00 :.Pv....:........:
000002f4cd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 03 00 00 :................:
000002f4cd90: 01 01 01 01 00 00 00 00 01 01 01 01 01 01 01 01 :................:
000002f4cda0: 01 01 01 01 01 01 01 01 55 55 55 55 55 55 55 55 :........UUUUUUUU:
000002f4cdb0: 55 55 55 55 00 00 00 00 ff ff ff ff 00 00 00 00 :UUUU............:
000002f4cdc0: 00 00 00 03 00 00 00 01 00 00 00 01 00 01 99 40 :...............@:
000002f4cdd0: 00 00 71 a3 00 00 00 00 00 00 00 00 00 00 00 00 :..q.............:
000002f4cde0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 :................:
000002f4cdf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 :...............:
Hit a key to continue...
FC Device Menu
FC Target Address ==> 50060e801530f310 FC Lun Address ==> 0
FC Device String: /vdevice/vfc-client@30000006/disk@50060e801530f310,0:0
FC Device: 10240 MB Disk drive (bootable)
----------------------------------------------------------------------
q - Quit/Exit
==> 98
• It is recommended to change the fc_err_recov attribute of the fscsi device from delayed_fail
to fast_fail; this is the general guideline for a multipath environment. If you have only a
single path to the boot disk, then delayed_fail is the recommended value.
With fast_fail, path failover is faster (around 15 seconds).
[Figure: Disk behind SAN Switch #1 and SAN Switch #2; VIOS1 and VIOS2, each with hdiskx and a vSCSI server adapter; client hdiskx]
fscsi devices in client :
  dyntrk = yes
  fc_err_recov = fast_fail
hdisk devices in VIOS :
  algorithm = round_robin
  reserve_policy = no_reserve
  hcheck_mode = nonactive
  hcheck_interval = 60+
• If AIX is installed while the hdisk on the VIOS has reserve_policy set to single_path, a SCSI-2
reservation is set. In this case, you have to break the reservation with relbootrsv or an
equivalent. Keep in mind that the reserve policy should always be changed before the AIX install,
and that relbootrsv can only be run against the rootvg name. If only a specific disk needs to be
cleared, you can do a forced open against that disk with a small application using openx().
• Dynamic tracking is beneficial when the fiber adapter of a host is connected to a SAN
switch. Without this feature, each LUN has to be reconfigured once the scsi_id
(N_Port ID) of the disk HBA changes.
• Use fast_fail for the fc_err_recov attribute; this minimizes the time needed to detect a state
change of the target device.
• It is recommended to change the algorithm attribute of the disk to “round_robin” on each
VIOS to spread out the I/O traffic. (There is a defect in the round_robin feature.
Please apply IZ47220 (IZ52365) or the appropriate fix for your AIX level before using this
attribute.)
• If hcheck_mode is nonactive (the default value), the health check command is sent down the
paths that are not handling I/O at that time. By default, the health check feature is disabled,
but once hcheck_interval is changed to a non-zero value, it is enabled. This value should
• The default value of reserve_policy for the client hdisk is “no_reserve”. If the reserve_policy
has been successfully changed to “no_reserve” on the VIOSes, there will be no SCSI-2 reservation
issue.
• For the VIO client, fail_over is the default algorithm and is recommended. To distribute
I/O requests across the 2 Virtual I/O Servers, path priority should be managed appropriately.
When there are several VIO clients, you can divide the I/O requests between the 2 VIO servers
by adjusting the path priority of each VIO client. You can change the path priority with the chpath command.
• If the health check is enabled by changing the hcheck_interval value from 0 to another value
(20 seconds is a good start), a health check command is sent to the paths that are not
handling I/O at that time. If the health check is not turned on, a failed path
will not be available until it is manually enabled. With the health check feature, a failed path
is dynamically re-enabled when it recovers. An inactive path (inactive because of its low priority
value) is also checked, so a failed takeover (all active paths go down and the inactive path
turns out to be unusable as well) can be avoided.
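As an illustration of the path priority idea above, the following minimal Python sketch only builds chpath command strings that alternate the preferred path from one VIO client to the next. It assumes that each client has one hdisk0 with two paths, that vscsi0 is served by the first VIOS and vscsi1 by the second, and that priority 1 is the preferred path; the function name and the device names are hypothetical.

```python
def path_priority_commands(client_index, disk="hdisk0",
                           parents=("vscsi0", "vscsi1")):
    """Build chpath commands for one VIO client.

    Even-numbered clients prefer the first parent adapter and odd-numbered
    clients prefer the second, so the clients are split across both VIOSes.
    """
    preferred = client_index % len(parents)
    commands = []
    for i, parent in enumerate(parents):
        # Priority 1 marks the preferred path; the other path gets 2.
        priority = 1 if i == preferred else 2
        commands.append(
            f"chpath -l {disk} -p {parent} -a priority={priority}"
        )
    return commands

# Print the commands for four hypothetical clients.
for index in range(4):
    print(f"client {index}:")
    for command in path_priority_commands(index):
        print("  " + command)
```

The resulting priorities can be reviewed on each client with lspath before relying on them.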
[Figure: client boot disk hdisk0 (MPIO) reached through client adapters fcs0 and fcs1; VIOS1 and VIOS2, each with vfchost1 and physical fcs0; Disk]
hdisk devices in client :
  algorithm = fail_over
  reserve_policy = no_reserve
  hcheck_mode = nonactive
  hcheck_interval = 60+
fscsi devices in client :
  dyntrk = yes (default)
  fc_err_recov = fast_fail (default)
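The client attribute values listed above could be applied with chdev. The following minimal Python sketch only builds the command strings and does not run anything. The device names hdisk0 and fscsi0 are hypothetical, "60" stands in for the "60+" shown above, and the -P flag (record the change in the ODM so that it takes effect at the next boot) is assumed because a boot disk is normally in use.

```python
# Values taken from the client-side attribute lists; device names and the
# exact hcheck_interval value (60 for "60+") are assumptions.
CLIENT_ATTRS = {
    "hdisk0": {
        "algorithm": "fail_over",
        "reserve_policy": "no_reserve",
        "hcheck_mode": "nonactive",
        "hcheck_interval": "60",
    },
    "fscsi0": {
        "dyntrk": "yes",
        "fc_err_recov": "fast_fail",
    },
}

def chdev_command(device, attrs, deferred=True):
    """Build one chdev command that sets every attribute of a device."""
    parts = ["chdev", "-l", device]
    for name, value in attrs.items():
        parts += ["-a", f"{name}={value}"]
    if deferred:
        # -P updates the ODM only; the device picks it up at the next boot.
        parts.append("-P")
    return " ".join(parts)

for device, attrs in CLIENT_ATTRS.items():
    print(chdev_command(device, attrs))
```

The current values can be compared against this list with lsattr -El on each device before any change is made.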
• With NPIV, the client has its own fscsi layer. As a result, we don't have to consider the
attributes of the fscsi device driver on the VIOS side.
• For more information regarding NPIV itself, please refer to the following document:
http://ausgsa.ibm.com/projects/o/oneteam/public/Itrans/ItransProjectsCompleted.html
(NPIV_Introduction and problem determination hints.ppt, written by Bertram Begau from IBM Germany)
• With NPIV, the boot process itself is very similar to that of an AIX system with 2 physical fiber
adapters. SCSI-2 reservation must be considered during AIX installation.
• However, because client LPARs still use the physical fiber adapters residing on the VIOSes,
error logs, traces, and dumps from the VIOSes will be needed to debug problems.
• With NPIV, the client AIX has its own scsi_id through the physical fiber adapters residing on
the VIOSes. From the switch's perspective, this virtual fiber adapter is regarded as a separate fiber port.
• The considerations for SAN boot using NPIV are almost the same as when using more than 2 physical
fiber adapters. During AIX installation, only one path to the boot disk should be used, to deal
with SCSI-2 reservation.
• NPIV is quite a new technology, so care should be taken regarding the S/W and H/W prerequisites.
• Is this a fresh install of AIX or migration from other systems using mksysb or alt_disk_install?
• Is VIOS involved?
• Are there any error log entries in the VIOS when the VIO client boot fails? (including NPIV)
• Does the bootlist contain all paths appropriately? (using the bootlist -m normal -ov command)
• If, for any reason, the rootvg of AIX can't be accessed during normal operation, the system may
hang for a very long time (over 10 minutes) or forever.
• Unfortunately, the dump procedure fails in many cases. (Still, a dump should always be
initiated after a significant amount of time has passed.)
• If a VIOS is used, kernel traces from the VIOS will be needed. In many cases, a system dump of
the VIOS is also required to verify the problem.
• If the AIX image is migrated to another LPAR, the Open Firmware of the new H/W doesn't yet know
the bootlist of that AIX image. In this case, an SMS mode boot is required to put valid
bootlist information into NVRAM. (When LPM (Live Partition Mobility) is used, you don't
have to do this.)
• 2 VIOSes are used. 20+ client partitions use NPIV for their storage access, and the rootvgs of
the clients are served through these NPIV virtual fiber adapters. Hitachi disks are used for the rootvgs.
• Each client has 2 virtual fiber adapters, and 2 paths are configured to the rootvg.
- Symptoms
• If the first VIOS goes down, the client partitions still work properly using the second VIOS.
Access to the rootvg has no problem at this point.
• A client partition can boot successfully with only the first VIOS.
• But a client partition can't boot with only the second VIOS. The customer reported that they
could see the virtual fiber adapter provided by the second VIOS in the SMS menu, but they
couldn't see any disks behind that virtual fiber adapter.
boot-device
2f766465 76696365 2f766663 2d636c69 [/vdevice/vfc-cli]
656e7440 33303030 30303035 2f646973 [ent@30000005/dis]
6b403530 30363065 38303135 33306633 [k@50060e801530f3]
30343a32 202f7664 65766963 652f7666 [04:2 /vdevice/vf]
632d636c 69656e74 40333030 30303030 [c-client@3000000]
362f6469 736b4035 30303630 65383031 [6/disk@50060e801]
35333066 3331303a 3200 [530f310:2.......]
* When this LUN is assigned to another AIX partition, we can get the
output of lquerypv -h /dev/hdiskx, so this is not a SCSI-2 reservation
issue.
INQUIRY DATA FOR : TARGET ==> 50060e801530f310 LUN ==> 0 - 10240 MB Disk drive (bootable)
000002f4cd00: 00 00 03 32 cf 00 00 02 48 49 54 41 43 48 49 20 :...2....HITACHI :
000002f4cd10: 4f 50 45 4e 2d 56 20 20 20 20 20 20 20 20 20 20 :OPEN-V :
000002f4cd20: 36 30 30 34 35 30 20 31 33 30 46 33 33 30 33 33 :600450 130F33033:
000002f4cd30: 20 32 41 20 01 01 01 01 00 00 00 00 00 00 00 00 : 2A ............:
000002f4cd40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 :................:
000002f4cd50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 :................:
Detail Data
ADDITIONAL INFORMATION
module: npiv_port_sciolst rc: 000000000000004F location: 00002523
data: 1 9 29 0 CC080
• This system has 2 NPIV paths and 1 physical fcs path to the boot disk (rootvg) on NetApp
storage.
- Symptoms
• The path through fcs0 set a SCSI-2 reservation on this disk, so the boot disk can't be accessed
through the other 2 paths. The client boots successfully with only one path (fcs0).
• To use only one path, change the zoning or the disk assignment configuration, or pull out the
cables of fcs1 and fcs2. Remember that using only one path during boot is the best way to avoid
the SCSI-2 reservation issue.
• The customer took an mksysb image and tried to boot from that image in an LPAR on a different CEC.
- Symptoms
• If the fiber adapter (not used for the boot disk) is removed, it boots successfully.
• With a fresh install of AIX, it boots successfully with the fiber adapters present.
* verbose output
---------------------------------------------------------------------------
LABEL:
LGPG_FREED
IDENTIFIER:
C4C3339D
Date/Time: Mon Oct 12 17:36:48 2009
Sequence Number: 4168
Machine Id: 00C46DC24C00
Node Id: phls6840
Class: S
Type: INFO
Resource Name: SYSVMM
Description
ONE OR MORE LARGE PAGES HAS BEEN CONVERTED INTO PAGEABLE PAGES
Probable Causes
System at or near pinned memory limit.
Recommended Actions
Tune maxpin percentage or lgpg_regions.
Detail Data
Number of large pages attempted to free:
1
Number of large pages actually freed:
1
---------------------------------------------------------------------------
* 1 large page (16MB page) was freed due to a pinned memory shortage.
* We now suspect that the memory size differs between the two
LPARs and that they are using 16MB pages (always pinned). Due to a pinned memory
shortage at boot time, some devices including fcs0 are not configured and
the boot hangs. (More analysis is required.)
• If a system cannot boot even though the Open Firmware can detect the boot device, the debug boot
procedure is needed to find the step at which the boot fails.
http://www-01.ibm.com/support/docview.wss?uid=isg3T1000251
1. Log in to the HMC using ssh (after allowing ssh login).
2. Prepare screen logging with the “script” command, for example:
script -f debugboot.log <-- This log will be stored on the HMC
3. Open a vterm to the specific LPAR that has the problem.
mkvterm -m <managed_system> --id <lpar_id>
4. Boot to the Open Firmware prompt (from the HMC menu). When the ok prompt is given:
ok> boot -s trap
KDB(0)> mw enter_dbg
enter_dbg+000000: 00000000 = 42
enter_dbg+000004: 00000000 = . (a dot)
KDB(0)> g
4-1. In newer versions (AIX 5.3 ML03 or later, AIX 5.2 ML07 or later), you can do this
more easily:
ok> boot -s verbose
5. After capturing the screen log containing the error symptom, quit the virtual terminal
session by typing the following sequence in the vterm:
#~. (tilde and period)
• Web materials
• Technical Documents
- Multipathing on AIX Version 2.1 by James Lee
- Understanding AIX boot process by Uma Sankar Atluri
- NPIV introduction and problem determination hints by Bertram Begau