OPERATIONAL INSTRUCTION

Prepared (also subject responsible if other)

1 (16)

No

XSUNACH

4/154 31-CNZ 222 162

Approved

Checked

Date

Rev

TEI/XSB (G.Raele)

(PC-APM)

2012-08-06

Reference

Ericsson AB 2012. All rights reserved. No part of this document may be reproduced in any form
without the written permission of the copyright owner.

APG43, GED Datadisk, Repair


Contents                                            Page

1     Introduction                                  2
1.1   Scope                                         2

2     Procedure                                     3
2.1   Prerequisites                                 3
2.2   Actions                                       4

3     Additional Information                        14

4     Glossary                                      15

5     References                                    16
5.1   Operational Instructions                      16
5.2   Manual Pages                                  16
5.3   Printout Descriptions                         16

1 Introduction

1.1 Scope

This Operational Instruction describes the procedure to repair or replace a
GED Datadisk board of an AP based on APG43 HW.

2 Procedure

2.1 Prerequisites

2.1.1 Conditions

The following conditions must apply before this procedure can be completed:

The alarm AP FAULT is present or a work order has been received.
A technician is available on site.
A spare GED Datadisk replacement board is available.

2.1.2 Data

The following data must be known to complete this procedure:

The user name and password valid for a user with administrative rights.
The slot number reported by the alarm AP FAULT.
In case of a work order, the <diskSlot_No> must be known.

2.1.3 Special Aids

A grounding strap, to be used when exchanging the equipment.
Torx #8 screwdriver for removing the board.
Torx #5 screwdriver for disconnecting the SAS cables.

2.2 Actions

Connect to AP Local Mode

1. Connect to the active node. Go to Operational Instruction AP, User
Session, Initiate. Perform the steps in that Operational Instruction
and return to Step 2 on page 4 in this Operational Instruction.

Work order received?

2. Was a Work Order for GED board replacement received?
Yes
Go to Step 8 on page 5.
No
Go to Step 3 on page 4.

Determine whether the GED to be replaced is DiskA or DiskB

3. Note the slot number of the GED board to be replaced and that of
the other GED board.
Use Command hwmls
Note: Note down the GED Disk slot number of the faulty board.
Note: The faulty GED is DiskA if it is the GED with the lower slot
number.
Note: The faulty GED is DiskB if it is the GED with the higher slot
number.
Note: Visually, DiskA is the left GED board in the magazine and
DiskB is the right GED board in the magazine.
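
The slot rule in Step 3 can be written down as a small helper. This is only an illustrative sketch: the function name and the slot numbers in the usage line are hypothetical, and the two slot numbers are assumed to have been read manually from the hwmls printout.

```python
def classify_faulty_ged(faulty_slot: int, other_slot: int) -> str:
    """Return "DiskA" or "DiskB" for the faulty GED board.

    Rule from Step 3: the GED with the lower slot number is DiskA,
    the GED with the higher slot number is DiskB.
    """
    if faulty_slot == other_slot:
        raise ValueError("the two GED boards must occupy different slots")
    return "DiskA" if faulty_slot < other_slot else "DiskB"


# Hypothetical slot numbers, for illustration only.
print(classify_faulty_ged(faulty_slot=4, other_slot=20))
```

The result is needed again later in the procedure, when the mirror is recreated on either DiskA or DiskB.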

AP FAULT Alarm present?

4. Print the alarm list.
Use Command alist

5. Which alarm was received?
MIRRORED DISK NOT REDUNDANT
Go to Step 6 on page 4.
BOARD FAULTY
Go to Step 8 on page 5.
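
The branching in Steps 2 to 5 can be summarised as a lookup. The sketch below is an assumption-laden illustration, not part of the procedure: the function name is hypothetical, and the alarm text is assumed to have been read manually from the alist printout.

```python
from typing import Optional

# Alarm text (as listed in Step 5) mapped to the step where work continues.
NEXT_STEP_BY_ALARM = {
    "MIRRORED DISK NOT REDUNDANT": 6,  # rebuild the RAID
    "BOARD FAULTY": 8,                 # start the board replacement
}


def repair_entry_step(work_order_received: bool, alarm: Optional[str] = None) -> int:
    """Return the step number at which the repair work continues."""
    if work_order_received:
        return 8  # Step 2, answer Yes
    if alarm is None:
        raise ValueError("without a work order the alarm text is required")
    key = alarm.strip().upper()
    if key not in NEXT_STEP_BY_ALARM:
        raise ValueError(f"alarm not covered by this procedure: {alarm!r}")
    return NEXT_STEP_BY_ALARM[key]


print(repair_entry_step(False, "MIRRORED DISK NOT REDUNDANT"))
```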

Restore RAID

6. Rebuild the RAID.
Use Command raidmgr -m

7. Go to Step 34 on page 8.

Check LSI Driver Version

8. Change the current working directory.
Use Command cd /d C:\program files\ap\apos\tools

9. Check the driver file of the GEP node.
Use Command devcon driverfiles pci\ven_1000
Note: Note the .inf file name from the printout of the above command.

10. Check the LSI driver version of the GEP node.
Use Command devcon drivernodes pci\ven_1000
Note: Note the LSI driver version from the printout of the above
command that corresponds to the .inf file noted in Step 9 on page 5.

11. Is the LSI driver version of the node lower than 1.30.2.0?
Yes
Go to Step 76 on page 13.
No
Go to Step 12 on page 5.
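
The version check in Step 11 is a field-by-field numeric comparison, not a text comparison (as text, "1.4.0.0" would sort after "1.30.2.0"). A minimal sketch, assuming the version string has been copied manually from the devcon printout; the function names are hypothetical.

```python
MINIMUM_LSI_DRIVER = (1, 30, 2, 0)  # threshold stated in Step 11


def parse_version(text: str) -> tuple:
    """Turn a dotted version such as "1.30.2.0" into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_too_old(version: str) -> bool:
    """True means: answer Yes in Step 11 (contact the next level of support)."""
    return parse_version(version) < MINIMUM_LSI_DRIVER


for candidate in ("1.4.0.0", "1.30.2.0", "1.30.10.0"):
    print(candidate, "too old" if driver_too_old(candidate) else "ok")
```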

Check the DataDisk "targetID"

12. Change the current working directory.
Use Command cd c:\program files\ap\apos\clone

13. Get the targetID of both DataDisks.
Use Command cscript ListDeviceID.vbs
Note: The targetID is used to enable the write disk cache later.

Block GED Datadisk board

14. Block the GED Datadisk board to be replaced.
Use Command hwmblk -s <Disk_SlotNo>
Note: Where <Disk_SlotNo> is the slot number of the GED DataDisk to
be replaced.
Note: Answer Yes when required.

15. Wait until the MIA LED on the GED board to be replaced is switched
on.
Use Command hwmls -s <Disk_SlotNo> -l | findstr -I MIA
Note: <Disk_SlotNo> is the slot number of the faulty board.
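
Waiting for the MIA LED in Step 15 amounts to repeating the same check until it succeeds or a time limit is reached. The sketch below shows only that polling pattern. The `check` callable is a placeholder supplied by the caller (for example, something that runs the hwmls command above and inspects the MIA line); the printout format is not described in this document, so it is not parsed here. The timeout and interval values are assumptions.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool],
               timeout_s: float = 600.0,
               interval_s: float = 5.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call check() repeatedly; return True once it succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Stand-in check for illustration: reports success on the third poll.
polls = {"count": 0}

def mia_led_is_on() -> bool:
    polls["count"] += 1
    return polls["count"] >= 3

print(wait_until(mia_led_is_on, timeout_s=10, interval_s=0))
```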

Check which DataDisk "targetID" is missing

16. Change the current working directory.
Use Command cd c:\program files\ap\apos\clone

17. Note which DataDisk targetID is missing. The DataDisk targetID
which is not printed is the missing one.
Use Command cscript ListDeviceID.vbs
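
Finding the missing targetID is a comparison of the two ListDeviceID.vbs printouts, taken before and after the board was blocked. A minimal sketch, assuming the IDs have been noted down by hand; the function name and the ID values in the usage line are hypothetical.

```python
def find_missing_target_id(ids_before, ids_after):
    """Return the single targetID listed before blocking but not after it."""
    missing = sorted(set(ids_before) - set(ids_after))
    if len(missing) != 1:
        raise ValueError(f"expected exactly one missing targetID, found {missing}")
    return missing[0]


# Hypothetical targetIDs noted in Step 13 and Step 17.
print(find_missing_target_id(ids_before=[0, 1], ids_after=[1]))
```

Raising an error when zero or several IDs are missing is a design choice of this sketch; it flags a situation the procedure does not describe instead of guessing.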

Disconnect Cables on Board

Warning!
Be careful when removing the cables. They can be damaged if not
handled carefully.

18. Disconnect the SCSI cables in the following order.

Table 1.
1   Disconnect SAS-1 on the Board
2   Disconnect SAS-0 on the Board

Note: Use Torx #5 screwdriver to unscrew the cable connectors.
Note: It is important that the cables are reconnected to the same
channels (SAS-0 and SAS-1 respectively) that they were removed from.

19. Run the vxvol commands to note the ids of the remaining Data Disk
for each partition.
Use Command vxvol volinfo I:
vxvol volinfo K:
Note: The output for the I: partition will contain a string like
Volume1-id. The output for the K: partition will contain a string like
Volume2-id.
Replace GED Datadisk Board

20. Replace the board with the GED spare board.
Note: When you have loosened the board from the cabinet a few
millimeters, check that all screws are loose. They have a tendency to
still be connected by a half turn.
Note: Use Torx #8 screwdriver to unscrew the board.

Reconnect SCSI cables for GED Datadisk board

21. Connect the SCSI cables in the following order.

Table 2.
1   Connect SAS-1 on the replaced Board
2   Connect SAS-0 on the replaced Board

Note: It is important that the cables are connected to the same
channels (SAS-0 and SAS-1 respectively) that they were removed from.

22. Run the vxvol commands to note the ids of the new Data Disk.
Use Command vxvol volinfo I:
vxvol volinfo K:
Note: The output of the commands above will contain strings like
Volume1-id and Volume2-id that were not present in the output of
Step 19 on page 6. Take note of the new id values for each partition.
Integrate new GED Datadisk board

23. Set the disk that was not replaced as the preferred disk for
mirroring. For each partition, use the id identified in Step 19 on
page 6.
Use Command vxvol -gDataDisk rdpol prefer I: Volume1-id
vxvol -gDataDisk rdpol prefer K: Volume2-id
Note: Each vxvol command must be entered on a single line.
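
Because each vxvol command has to be entered on one line with the right id for the right partition, it can help to assemble the command lines from the noted values before typing them. The following is a hedged sketch: the helper name is hypothetical, Volume1-id and Volume2-id are the placeholders used in this instruction and must be replaced with the ids noted in Step 19, and the helper only builds text, it does not run anything.

```python
def rdpol_commands(policy: str, volume_ids: dict, disk_group: str = "DataDisk") -> list:
    """Build one vxvol read-policy command line per partition.

    policy is "prefer" (used in Step 23) or "round" (used later in the
    procedure to reset the preferred disk settings).
    """
    if policy not in ("prefer", "round"):
        raise ValueError(f"unsupported read policy: {policy!r}")
    return [
        f"vxvol -g{disk_group} rdpol {policy} {drive} {volume_id}"
        for drive, volume_id in volume_ids.items()
    ]


# Placeholder ids, exactly as written in this instruction.
for line in rdpol_commands("prefer", {"I:": "Volume1-id", "K:": "Volume2-id"}):
    print(line)
```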


24. Integrate the GED Datadisk board indicated by AP FAULT or the
Work Order. Deblock the GED DataDisk board.
Use Command hwmdeblk -s <Disk_SlotNo>
Note: Where <Disk_SlotNo> is the slot number of the replaced GED
DataDisk.
25. Check that the GED Data Disk board is deblocked.
Use Command hwmls
Note: The Status "WO" means "Working Board".

26. Is the status WO?
Yes
Go to Step 27 on page 7.
No
Go to Step 76 on page 13.

Enable the Write Disk Cache

27. Change the current working directory.
Use Command cd c:\program files\ap\apos\clone

28. Enable the Write Disk Cache on the replaced disk using the
targetIDs.
Use Command cscript ManageDiskCache.vbs

29. Which GED Datadisk board has been replaced?
DiskA
Go to Step 30 on page 8.
DiskB
Go to Step 32 on page 8.

Restore GED Datadisk A configuration

30. Recreate the disk mirror on GED Datadisk A.
Use Command raidmgr -r diskA

31. Go to Step 36 on page 8.

Restore GED Datadisk B configuration

32. Recreate the disk mirror on GED Datadisk B.
Use Command raidmgr -r diskB

33. Go to Step 36 on page 8.
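
Steps 29 to 33 select one of two raidmgr commands depending on which board was replaced. As a small illustration, with a hypothetical helper name, the selection can be expressed as a lookup that rejects anything other than DiskA or DiskB. The helper only returns the command text shown in Steps 30 and 32.

```python
RECREATE_MIRROR_COMMAND = {
    "diska": "raidmgr -r diskA",  # Step 30
    "diskb": "raidmgr -r diskB",  # Step 32
}


def recreate_mirror_command(replaced_disk: str) -> str:
    """Return the raidmgr command line for the replaced GED Datadisk board."""
    key = replaced_disk.strip().lower()
    if key not in RECREATE_MIRROR_COMMAND:
        raise ValueError(f"expected DiskA or DiskB, got {replaced_disk!r}")
    return RECREATE_MIRROR_COMMAND[key]


print(recreate_mirror_command("DiskB"))
```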

Check mirroring

34. Check the status of the RAID.
Use Command hwmls -l
Note: The mirroring operation will take 2 to 3 hours to complete.
During this time the status of the RAID will be NOT ACTIVE.

35. Is disk mirroring active?
Yes
Go to Step 36 on page 8.
No
Go to Step 65 on page 12.

Check resynchronization

36. List the resynchronization status of volume I:.
Use Command vxvol volinfo I:
Note: When the synchronization is finished and the volume is OK,
the three status rows should display "Started, Attaching and
Attached".
Note: Wait for the resynchronization of drive I: to complete.
Note: Resynchronization of volume I: may take several hours before
it is completed.

37. List the resynchronization status of volume K:.
Use Command vxvol volinfo K:
Note: When the synchronization is finished and the volume is OK,
the three status rows should display "Started, Attaching and
Attached".
Note: Resynchronization of volume K: may take several hours
before it is completed.
Note: The current progress of the resynchronization of the drives can
be checked with the command: vxtask list
Note: Wait for the resynchronization of drive K: to complete.

38. Was the step to enable the Write Disk Cache executed?
Yes
Go to Step 39 on page 9.
No
Go to Step 42 on page 9.

Disable the Write Disk Cache

39. Change the current working directory.
Use Command cd c:\program files\ap\apos\clone

40. After the rebuilding has been performed, disable the Write Disk
Cache.
Use Command cscript ManageDiskCache.vbs
Note: If a failover has happened during the procedure, this command
must also be given on the other node after performing a failover.

41. Reset the preferred disk settings using the same ids used in
Step 23 on page 7 for each partition.
Use Command vxvol -gdatadisk rdpol round I: Volume1-id
vxvol -gdatadisk rdpol round K: Volume2-id
Note: Execute the above two lines as separate commands.


Verify that no data is corrupt on the datadisks

42. Verify the volume I: with chkdsk.exe.
Use Command chkdsk I:

43. Was any data corrupt?
Yes
Go to Step 46 on page 10.
No
Go to Step 44 on page 9.

44. Verify the volume K: with chkdsk.exe.
Use Command chkdsk K:

45. Was any data corrupt?
Yes
Go to Step 46 on page 10.
No
Go to Step 75 on page 13.
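
Steps 42 to 45 check I: first and only go on to K: when I: is clean. Any corruption leads to Step 46, and two clean volumes lead to Step 75. The sketch below captures that order. It is illustrative only: `volume_is_corrupt` is a caller-supplied placeholder standing for the operator's reading of the chkdsk printout, and the function name is hypothetical.

```python
from typing import Callable

VOLUMES_IN_ORDER = ("I:", "K:")


def next_step_after_verification(volume_is_corrupt: Callable[[str], bool]) -> int:
    """Return 46 as soon as a volume is found corrupt, otherwise 75."""
    for volume in VOLUMES_IN_ORDER:
        if volume_is_corrupt(volume):
            return 46  # correct disk corruption or inconsistencies
    return 75          # disconnect from AP Local Mode


# Stand-in answers for illustration: I: is clean, K: is corrupt.
answers = {"I:": False, "K:": True}
print(next_step_after_verification(lambda volume: answers[volume]))
```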

Correct disk corruption or inconsistencies

46. Warning!
During this procedure, both nodes will be down and the AP will be
unavailable, offering no service!

Connect to Node B

47. Connect to Node B. Go to Operational Instruction AP, User Session,
Initiate. Perform the steps in that Operational Instruction and return
to the next step in this Operational Instruction.

Reboot Node B

48. Reboot Node B.
Use Command prcboot -o -f

Connect to Node A

49. Connect to Node A. Go to Operational Instruction AP, User Session,
Initiate. Perform the steps in that Operational Instruction and return
to the next step in this Operational Instruction.

Reboot Node A

50. Reboot Node A.
Use Command prcboot -o -f
Note: Option -o must be used in order to reboot the node without
starting the cluster resources.

51. Wait 5 minutes for the node to reboot.

Connect to Node A

52. When the system comes up again, connect to Node A. Go to
Operational Instruction AP, User Session, Initiate. Perform the
steps in that Operational Instruction and return to the next step in
this Operational Instruction.

53. Start the Cluster Service with quorum logging disabled.
Use Command net start clussvc /NQ

54. Did the Cluster Service start?
Yes
Go to Step 55 on page 11.
No
Go to Step 76 on page 13.

Correct disk corruption

55. Correct any disk corruption on volume I:.
Use Command chkdsk I:/F

56. Correct any disk corruption on volume K:.
Use Command chkdsk K:/F

Reboot Node A

57. Reboot Node A.
Use Command prcboot -f

Connect to Node B

58. Connect to Node B. Go to Operational Instruction AP, User Session,
Initiate. Perform the steps in that Operational Instruction and return
to the next step in this Operational Instruction.

Reboot Node B

59. Reboot Node B.
Use Command prcboot -f
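
The correction sequence in Steps 48 to 59 depends on running each command on the right node and in the right order. As a cross-check, the commands can be laid out as an ordered table. This is only a summary sketch of the steps above: the data structure and helper name are hypothetical, nothing is executed, and the connect and wait steps (47, 49, 51, 52, 58) and the decision in Step 54 are left out.

```python
# (step number, node, command) in the order given by the procedure.
CORRECTION_SEQUENCE = [
    (48, "B", "prcboot -o -f"),
    (50, "A", "prcboot -o -f"),
    (53, "A", "net start clussvc /NQ"),
    (55, "A", "chkdsk I:/F"),
    (56, "A", "chkdsk K:/F"),
    (57, "A", "prcboot -f"),
    (59, "B", "prcboot -f"),
]


def commands_for_node(node: str) -> list:
    """List the commands to be entered on one node, in procedure order."""
    return [command for _, target, command in CORRECTION_SEQUENCE if target == node]


for step, node, command in CORRECTION_SEQUENCE:
    print(f"Step {step}: node {node}: {command}")
```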

Connect to active node

60. When the system comes up again, connect to the active node. Go
to Operational Instruction AP, User Session, Initiate. Perform the
steps in that Operational Instruction and return to the next step in
this Operational Instruction.

Verify success or failure

61. Verify the volume I: with chkdsk.exe.
Use Command chkdsk I:

62. Was any data corrupt?
Yes
Go to Step 76 on page 13.
No
Go to Step 63 on page 12.

63. Verify the volume K: with chkdsk.exe.
Use Command chkdsk K:

64. Was any data corrupt?
Yes
Go to Step 76 on page 13.
No
Go to Step 75 on page 13.

Check the faulty board

65. Check the faulty board.
Use Command hwmls -s <Disk_SlotNo> -l
Note: <Disk_SlotNo> is the respective slot number of the datadisk.
Note: Note down the <Disk_SlotNo> of the failed disk.

66. Is the RAID mirroring state active?
Yes
Go to Step 67 on page 12.
No
Go to Step 68 on page 12.

67. To check the status of the other disks, go to Step 65 on page 12
with the respective slot number.

68. Block the faulty board.
Use Command hwmblk -s <Disk_SlotNo>

69. Deblock the board.
Use Command hwmdeblk -s <Disk_SlotNo>

70. Check the status of the RAID.
Use Command hwmls -l
Note: The mirroring operation will take 2 to 3 hours to complete.
During this time the status of the RAID will be NOT ACTIVE.

71. Is the RAID mirroring state active?
Yes
Go to Step 36 on page 8.
No
Go to Step 72 on page 13.

72. Is the Disk State Failed? Check the <Disk_SlotNo> of the failed
disk as noted in Step 65 on page 12.
Yes
Go to Step 73 on page 13.
No
Go to Step 75 on page 13.

73. Go to Operational Instruction AP, System Data Disk Restore.
Perform the steps in that Operational Instruction and return to
Step 74 on page 13.

74. Is the RAID mirroring state active?
Yes
Go to Step 36 on page 8.
No
Go to Step 76 on page 13.
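
The Yes/No questions in the mirroring checks (Steps 35, 66, 71, 72 and 74) form a small decision table. Writing it out makes it easier to see which answers lead back to the resynchronization check (Step 36) and which lead to the next level of support (Step 76). The table below is transcribed from the steps above; the names are hypothetical and the sketch does not query the system.

```python
# Decision step -> {answer (True = Yes, False = No): next step}
MIRRORING_DECISIONS = {
    35: {True: 36, False: 65},  # Is disk mirroring active?
    66: {True: 67, False: 68},  # Is the RAID mirroring state active?
    71: {True: 36, False: 72},  # Is the RAID mirroring state active?
    72: {True: 73, False: 75},  # Is the Disk State Failed?
    74: {True: 36, False: 76},  # Is the RAID mirroring state active?
}


def next_step(decision_step: int, answer_yes: bool) -> int:
    """Look up the step to continue with after answering a decision step."""
    if decision_step not in MIRRORING_DECISIONS:
        raise KeyError(f"Step {decision_step} is not a mirroring decision step")
    return MIRRORING_DECISIONS[decision_step][answer_yes]


print(next_step(71, answer_yes=False))
```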

Disconnect from AP Local Mode

75. Disconnect from AP Local Mode. Go to Operational Instruction AP,
User Session, End. Perform the steps in that Operational Instruction
and return to Step 77 on page 13 in this Operational Instruction.

Contact Next Level of Support

76. Consult the next level of maintenance support. Further action is
outside the scope of this Operational Instruction.

Job Completed

77. Make a report. Report of Finished Work.
Note: See Operational Instruction Report of Finished Work.

78. The job is completed.

3 Additional Information

No additional information is applicable to this document.

4 Glossary

LED   Light Emitting Diode
MIA   Manual Intervention Allowed

5 References

5.1 Operational Instructions

AP, User Session, End
AP, User Session, Initiate
Report of Finished Work

5.2 Manual Pages

hwmblk (1M)
hwmdeblk (1M)
prcboot (1M)
raidmgr (1M)

5.3 Printout Descriptions

AP FAULT
