
In the latest maintenance levels of AIX 4.3.3, cfgmgr configures fibre devices in parallel. This
speeds up configuration, but it does mean that if the same device can be seen down multiple paths,
it sometimes only gets picked up on one path, hence the need to run the configuration manager
multiple times.

In fact there is a slightly neater way to do this:

Say you have two fibre adapters, fcs0 and fcs1, and you can see disks via both.
Run
cfgmgr -l fcs0 (or -vl for verbose output)
cfgmgr -l fcs1
This will configure all devices down each fibre adapter in turn and won't waste time trying to
configure any other devices on the system. If your devices are disks and you are running SDD,
remember that you still have to configure vpaths. You can do this by running a full cfgmgr or,
normally faster, through smit devices --> datapath devices.

Finally, if they are disks, make sure you check with lsvpcfg that the vpaths have picked up all the
hdisks so you have the right number of paths to your disks. Most of the time SDD gets this right
but sometimes it doesn't and deleting everything and starting again can be the best fix.

An easy way to configure the vpath(s) after running the individual cfgmgr commands for the relevant
fcs adapters is: "cfgmgr -l dpo". I always try to make sure that, before I do this, the hdisks all
have a pvid on them.
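
The per-adapter sequence described above can be sketched as a small script. This is only an illustration: the names run_cmd, configure_adapters and DRY_RUN are made up for this sketch and are not part of AIX or SDD. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: configure each fibre adapter in turn, then the SDD vpaths.
# run_cmd, configure_adapters and DRY_RUN are hypothetical names.

# Print the command when DRY_RUN is 1 (the default); otherwise execute it.
run_cmd() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments: the adapters to walk, in the order you want them configured.
configure_adapters() {
    for adapter in "$@"; do
        run_cmd cfgmgr -l "$adapter" || return 1
    done
    # Configure the vpaths only after every adapter has been walked.
    run_cmd cfgmgr -l dpo
}
```

Calling "configure_adapters fcs0 fcs1" prints the three cfgmgr commands in order; setting DRY_RUN=0 runs them instead. Checking the result with lsvpcfg afterwards is still advisable.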

There is an undocumented option on cfgmgr in AIX 4.3.3 (at least, it is not in the man pages): the
-S option runs cfgmgr serially. Although you would not want to twin-tail and use DPO, since that is
not supported at the time of writing, you should know that cfgmgr in AIX 4.3.3 runs in parallel
down each SCSI adapter. Because the second SCSI adapter could finish before the first, your disk
drives may end up numbered in an unexpected order. This is usually not a big deal, unless you are
rolling out lots of machines that should look exactly alike, or you are twin-tailing and want the
hdisk numbers to be in the same order on each machine. To force the disk drives to be configured in
the order of the SCSI adapters, you can remove all of the disk drives and run "cfgmgr -S".

The -S flag makes cfgmgr run serially over the children, which helps keep devices in SCSI ID order
as they are added. It is normally used when you want to keep the same device IDs assigned to the
same ODM devices.
For example, to configure all the fibre drives on one adapter first, run cfgmgr -S fcs0.
This will keep the same order of ODM devices as the last time cfgmgr -S fcs0 was run, provided the
physical connections are still the same.

The ESS is an E20, LIC 1.5.2.114, so it's fairly current. SDD/DPO is at 1.3.2.11, ibm2105.rte at
32.6.100.13.
I ran cfgmgr the first time and vpath24 appeared with only one hdisk, but vpath25 appeared with
4 hdisks. Re-ran cfgmgr: no change. Re-ran cfgmgr a couple more times: still no change.

This looks like a problem I have seen with the order in which cfgmgr configures things. It configures
the vpaths in parallel with the hdisks so if you do it all in one go, the vpath is configured before
the hdisks. If this is the case, lspv should show 4 hdisks with the same pvid but lscfg just picks up
one of them.
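
One way to check the symptom described above is to count how many hdisks share each PVID. The sketch below assumes lspv-style input (device name in the first column, PVID in the second); the function name count_hdisks_per_pvid is made up for illustration.

```shell
#!/bin/sh
# Sketch: count how many hdisks share each PVID in lspv-style output.
# Assumed input columns: device name, PVID, volume group.
# count_hdisks_per_pvid is a hypothetical helper name.

count_hdisks_per_pvid() {
    # Skip non-hdisk devices (such as vpaths) and disks without a PVID.
    awk '$1 ~ /^hdisk/ && $2 != "none" { n[$2]++ }
         END { for (p in n) print p, n[p] }' | sort
}
```

On a live system this would be fed with "lspv | count_hdisks_per_pvid". A PVID with a count of 4 whose vpath shows only one hdisk would match the situation described above.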

The fix is simple. Run cfgmgr -vl scsiX once for each scsi adapter. This configures the hdisks but
not the vpaths. When all hdisks are configured, configure the vpaths. You can either run cfgmgr
again or, if you want to speed things up, run /usr/lib/methods/cfallvpath.

Several days later, when the AIX administrator attempted to configure new storage on the AIX
system, the first sign of trouble appeared. He had asked his storage administrator to assign a
couple of new disks to his LPAR (via NPIV/VFC). As soon as the storage admin had completed
the assignment, the AIX admin ran cfgmgr to detect and configure the new hdisks. Immediately,
cfgmgr reported the following error:
Method error (/usr/lib/methods/cfgscsidisk):
0514-023 The specified device does not exist in the customized device configuration database.

Initially, the AIX team suspected there was some fault with either the storage device or the zoning
of the disk. Both of these items were checked and double-checked and were found to be OK. Our
next step was to run cfgmgr again, but this time we wanted a greater level of detail captured. To do
this we used the following environment variable to force cfgmgr to be 'more verbose'.
# export CFGLOG="cmd,meth,lib,verbosity:9"

We ran cfgmgr and went to the /var/adm/ras/cfglog file to view the results with the alog command.
However, we noticed that the cfglog file had a size of zero (0) and contained no data.
# cd /var/adm/ras
# ls -l cfglog
-rw-r----- 1 root system 0 May 16 13:22 cfglog

We decided to recreate the cfglog alog file and run mkdev again to reproduce the disk configuration
error.
# rm cfglog
# echo "Create cfglog `date`"|alog -t cfg
# mkdev -l hdisk0
Method error (/usr/lib/methods/cfgscsidisk):
0514-023 The specified device does not exist in the customized device configuration database.
This time we found some useful data in the cfglog file.
# alog -t cfg -o
MS 31981804 28835876 /usr/lib/methods/cfgscsidisk -l hdisk39
M4 31981804 Parallel mode = 0
M4 31981804 Get CuDv for hdisk39
M4 31981804 Get device PdDv, uniquetype=disk/fcp/htcvspmpio
M4 31981804 Get parent CuDv, name=fscsi0
M4 31981804 ..is_mpio_capable()
M4 31981804 Device is MPIO
M4 31981804 ..get_paths()
M4 31981804 Getting CuPaths for name='hdisk39'
M4 31981804 Found 1 paths
M0 31981804 cfgcommon.c 225 mpio_init error, rc=23
MS 28835892 31981568 /usr/lib/methods/cfgscsidisk -l hdisk0
M4 28835892 Parallel mode = 0
M4 28835892 Get CuDv for hdisk0
M4 28835892 Get device PdDv, uniquetype=disk/fcp/htcvspmpio
M4 28835892 Get parent CuDv, name=fscsi0
M4 28835892 ..is_mpio_capable()
M4 28835892 Device is MPIO
M4 28835892 ..get_paths()
M4 28835892 Getting CuPaths for name='hdisk0'
M4 28835892 Found 2 paths
M0 28835892 cfgcommon.c 225 mpio_init error, rc=23
MS 25690326 27328608 /usr/lib/methods/cfgscsidisk -l hdisk0
M4 25690326 Parallel mode = 0
M4 25690326 Get CuDv for hdisk0
M4 25690326 Get device PdDv, uniquetype=disk/fcp/htcvspmpio
M4 25690326 Get parent CuDv, name=fscsi0
M4 25690326 ..is_mpio_capable()
M4 25690326 Device is MPIO
M4 25690326 ..get_paths()
M4 25690326 Getting CuPaths for name='hdisk0'
M4 25690326 Found 2 paths
M0 25690326 cfgcommon.c 225 mpio_init error, rc=23
The configuration method was attempting to configure a disk device type of htcvspmpio (which
was correct) but it was unable to configure the device paths (mpio_init error rc=23). We suspected
that the system was missing some sort of device driver support for the type of storage in use.
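
When the log is long, it can help to pull out just the failures. The sketch below is based only on the layout of the listing above: it assumes that an "MS" line starts a method call (second field an identifier, last field the device) and that "M0" lines carry the error for the call with the same identifier. That reading is inferred from the sample, not taken from documentation, and the function name summarize_cfglog_errors is made up.

```shell
#!/bin/sh
# Sketch: pair each "M0" error line with the device named on the
# matching "MS" line (matched on the second field).
# The field meanings are inferred from the sample log, not documented.
# summarize_cfglog_errors is a hypothetical helper name.

summarize_cfglog_errors() {
    awk '
        $1 == "MS" { dev[$2] = $NF }          # remember device per call
        $1 == "M0" {
            msg = $3
            for (i = 4; i <= NF; i++) msg = msg " " $i
            print dev[$2] ": " msg
        }'
}
```

It would be used as "alog -t cfg -o | summarize_cfglog_errors". For the listing above, each failing call collapses to one line naming the disk and the mpio_init error.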

Cutting a very long story short, we determined, with the help of the IBM AIX support team, that
the issue stemmed from “old” AIX installation media used to create the AIX 6.1 TL6 SP5 SPOT
and lppsource on the NIM master. Old AIX 6.1 media was originally used (several years ago) to
create the NIM resources and was gradually updated over time, all the way up to TL6 SP5.

IBM support identified that the older install media contained a liblpp.a file that was missing the
necessary PdPathAt ODM files. Newer install media contained a fix that adds the appropriate entries
to bos.rte.cfgfiles, for example:
SPOT created using OLD install media.
======================================
# ar xv /export/spot/spotaix610605_OLD/usr/lpp/bos/liblpp.a bos.rte.cfgfiles
x - bos.rte.cfgfiles

# grep PdPathAt bos.rte.cfgfiles
#

SPOT created using NEW install media.
======================================
# ar xv /export/spot/spotaix610605_NEW/usr/lpp/bos/liblpp.a bos.rte.cfgfiles
x - bos.rte.cfgfiles

# grep PdPathAt bos.rte.cfgfiles
/usr/lib/objrepos/PdPathAt v4preserve
/usr/lib/objrepos/PdPathAt.vc v4preserve

We recreated the SPOT and lppsource resources using newer media on the NIM master. We were
then able to migrate the AIX 5.3 LPAR to 6.1 without encountering the issues faced previously.
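
The same check could be wrapped in a small helper and run against each SPOT before it is used for a migration. This is a sketch only: the function name check_cfgfiles is made up, and it expects the path to a bos.rte.cfgfiles that has already been extracted from liblpp.a with ar, as shown above.

```shell
#!/bin/sh
# Sketch: report whether an extracted bos.rte.cfgfiles lists PdPathAt.
# check_cfgfiles is a hypothetical helper name.
# Argument: path to a bos.rte.cfgfiles extracted from liblpp.a.

check_cfgfiles() {
    if grep -q 'PdPathAt' "$1"; then
        echo "$1: PdPathAt present"
    else
        echo "$1: PdPathAt missing"
        return 1
    fi
}
```

A non-zero return status signals the "old media" condition, so the helper can gate a script that builds or updates NIM resources.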

"Cannot find a child device".

a) The error means that the fibre adapter has detected an open loop, which corresponds to the
offline port on the switch. Enable the port on the switch side and run cfgmgr; that should fix
things, provided there are no hardware errors such as a broken cable.
b) This also happens when cfgmgr doesn't find any LUNs attached to that HBA, which is the root
cause of your missing paths. You might want to re-check your zoning and LUN masking.

Replaced the GBIC and fibre and ran cfgmgr on the LPAR, which got rid of the missing paths and
offline port but has left a few hdisks in a Defined state. cfgmgr also produced another error, which
points towards drivers.
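
To see which hdisks were left behind, the device listing can be filtered by state. The sketch below assumes lsdev-style input with the device name in the first column and the state in the second; the function name list_defined_disks is made up for illustration.

```shell
#!/bin/sh
# Sketch: list hdisks whose state column reads "Defined".
# Assumed input columns: device name, state, then location/description.
# list_defined_disks is a hypothetical helper name.

list_defined_disks() {
    awk '$1 ~ /^hdisk/ && $2 == "Defined" { print $1 }'
}
```

It would be used as "lsdev -Cc disk | list_defined_disks". Each disk it prints can then be retried individually with mkdev -l, which also gives a per-device error message to work from.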
