IBM

Simplified Remote Restart

Document Author :
Hari G M

© Copyright IBM Corporation 2016. All rights reserved.


Table of Contents
1 Introduction
1.1 What is Partition Remote Restart
1.2 Remote Restart Configuration Setup
1.3 Simplified Remote Restart
1.4 Code Level Requirements
1.5 Simplified Remote Restart Configuration Setup
2 Creating a Simplified Remote Restart Partition
2.1 System Level Capability for Simplified Remote Restart
2.2 Creating a Simplified Remote Restart Partition using HMC Classic GUI
2.3 Creating a Simplified Remote Restart Partition using HMC CLI
2.4 Creating a Simplified Remote Restart Partition using Rest API
2.5 Toggling Simplified Remote Restart Capability
2.6 Viewing/Listing the Simplified Remote Restart Capability
3 Remote Restart Operations
3.1 Remote Restart Validation
3.2 Remote Restart
3.2.1 Remote Restart of a Suspended Partition
3.2.2 Remote Restart with latest available configuration
3.3 Remote Restart Abort
3.4 Remote Restart Recover
3.5 Source Server Cleanup
3.5.1 Cleanup of a partition in suspended state
3.6 Remote Restart Rest API
3.7 Error Codes of Remote Restart Commands
4 Re-synchronizing the persisted configuration information
4.1 refdev command
4.2 refdev using rest api
5 Important Remote Restart Notes
5.1 Partition Activation
5.2 Server Configuration Notes
5.3 Partition Configuration Notes
6 Remote Restart Enhancements
6.1 Cross MC Remote Restart
6.2 Remote Restart without FSP Connection
6.3 Live Partition Mobility (LPM) Override
6.4 Manage Partition GUI & Partition Templates
6.5 Auto cleanup of Remote Restarted Partitions
6.6 User overrides/specifications
6.7 Concurrent Remote Restart Improvements
7 References
8 Appendix
8.1 Remote Restart States
8.2 Remote Restart Feature/Support Matrix
8.3 Sample script for triggering remote restart


1 Introduction

This document describes usage of the PowerVM feature called Simplified Remote
Restart. It is intended as a user's guide describing configuration and usage of the
Simplified Remote Restart function.

It is expected that the reader is comfortable with PowerVM and partition
management in general and specifically comfortable with using the IBM Hardware
Management Console (HMC) for PowerVM management as well as the HMC SSH
interface (also known as the remote command interface).

The sections that follow describe the setup required for performing simplified
remote restart, how to create and deploy a remote restart partition, and how to
use the remote restart functionality. There is also an overview of the existing
remote restart function and user model, to highlight the advantages of the
simplified remote restart function.

1.1 What is Partition Remote Restart

Remote restart is a high availability option for partitions. In the event of an
error that causes a server outage, a partition configured for remote restart can
be restarted on a different physical server. At times it might take a long time
to bring the failed server back up, in which case the remote restart function
can be used to re-provision the partition more quickly. Typically this is faster
than restarting the server that crashed and then restarting the partition(s).

The Remote Restart function relies on technology similar to Partition Mobility
whereby a partition is configured with storage on a SAN that is shared
(accessible) by the server which will host the partition.

Some important notes on the Remote Restart function:

• Remote Restart is not a suspend/resume or migration of the partition that
preserves the active running state of the partition. It is more like a
shutdown and restart of the partition (restarted on a different system).
• Remote Restart does preserve the partition's resource configuration. That
is, if processors, memory or I/O are added or removed while the partition is
running, the Remote Restart will activate the partition with the most recent
configuration.
• Many times a Remote Restart is used when a server crashes. In this event
it is evident that the partition's run-time state is lost on the original server.
It is also possible to perform a remote restart in the event of a hang
condition of a server. In these cases, the run-time state of the partition on
the original server is still lost or invalidated to prevent the partition from
running on more than one server.
• In fact, the reserved storage device records the owning server so the
partition cannot be re-activated on the original server (after it reboots) if the
partition has been restarted on another server.


Here is our typical user model for Remote Restart:

• HMC (V8R8.1.0)
• User can create partition with remote restart capability on a server.
• User can assign a reserved storage device to remote restart partition
which is accessible from another server.
• User activates the partition and the current configuration is
automatically written to the reserved storage device by the server
firmware.
• The data in the reserved storage device is also updated automatically by
the firmware for any configuration change.
• HMC collects the partition's resource configuration and writes to the
reserved storage device which will be used to restore the partition's
configuration during remote restart.
• User performs remote restart and clean up operations

One of the major disadvantages of the above user model is that a reserved storage
device has to be attached to each partition. The user has to manage a reserved
storage device pool on the source and target systems and keep track of the device
assigned to each partition, which makes setting up the environment complex.
Simplified remote restart addresses this pain point.


1.2 Remote Restart Configuration Setup

For a remote restart partition, the partition configuration data is stored in a
reserved storage device associated with the partition. The reserved storage device
must be accessible from both the source server and any/all destination servers
for the partition to be remote restarted in the event that the source server fails.

1.3 Simplified Remote Restart

Simplified Remote Restart, as the name suggests, reduces the complexity and
improves the usability of the remote restart function. Simplified Remote Restart
removes the requirement that a reserved storage device be assigned to each
partition, thereby making the user model and the environment setup simpler.
Here is the typical user model for simplified remote restart:
• User creates a partition with simplified remote restart capability on a
capable server. Let's call this the source server.
• User can enable/disable the capability any time after creating the partition.
Toggle of the capability is supported only when the partition is inactive.
• User assigns resources to the partition. The resource restrictions are
similar to LPM, i.e., no dedicated I/O, no HEA/HCA, no OptiConnect, no
server adapter in an IBM i partition, etc.
• Storage attached to the partition through Virtual IO should be accessible
from another server (similar to LPM). Let's call this the destination server.
• When the user activates the partition, the current configuration of the
partition along with partition state data is collected and persisted on the
HMC automatically.
• The data persisted on the HMC is updated automatically for any
configuration change.
• When the source server crashes, the user initiates the remote restart
operation to restart the partition on the destination server.
• Once the source server is back in operating state, the user executes a
command to clean up the partition which is now restarted on a new
(destination) server.

1.4 Code Level Requirements

The following code levels are required to use simplified remote restart:
• HMC V8 R8.2.0 or later
• System Firmware 820 or later
• VIOS level 2.2.3.4 or later
• Simplified Remote Restart for partitions using Shared Storage Pool (SSP)
storage is supported with HMC level V8 R8.4.0 and VIOS level 2.2.4.0 or later.


1.5 Simplified Remote Restart Configuration Setup

For a simplified remote restart capable partition, configuration data is stored on
the HMC. Each HMC managing the server persists its own copy of the
configuration information. Any storage used by the partition must be through
Virtual SCSI or Virtual Fibre Channel Storage accessible from both the source
and destination servers. In general the configuration requirements/restrictions
are similar to that of LPM.


2 Creating a Simplified Remote Restart Partition

A partition can be made simplified remote restart capable during its creation
or at any time thereafter. Toggle of the simplified remote restart capability is
supported only when the partition is inactive.

Partitions can be created using the Graphical User Interface (GUI), the
command line or the Rest API.

There are several restrictions for a simplified remote restart partition, similar
to those for partition mobility.

The partition:
a. cannot be a full system partition
b. cannot be a VIOS partition
c. cannot be a Service Partition
d. cannot be an alternate error logging partition
e. cannot have BSR
f. cannot have Huge Pages (applicable only if AMS enabled)
g. cannot be part of eWLM group
h. cannot have physical I/O assigned
i. cannot have HEA/HCA/SMA/SRIOV adapters assigned
j. cannot have Server SCSI adapter
k. cannot have a client SCSI adapter hosted by a non-VIOS partition
In most cases, the HMC prevents the user from improperly configuring a
Simplified Remote Restart capable partition. These checks are repeated at
partition activation to ensure we do not activate a Simplified Remote Restart
partition with an incompatible configuration.

2.1 System Level Capability for Simplified Remote Restart

System level capability for simplified remote restart can be viewed via GUI, CLI or
Rest API.

From CLI, system level capability is displayed using the lssyscfg command

lssyscfg -r sys -m system115a

…. powervm_lpar_simplified_remote_restart_capable=1

The capability can also be listed by:

lssyscfg -r sys -m system115a -Fcapabilities

,powervm_lpar_simplified_remote_restart_capable,

The Capabilities tab in the system properties panel displays the capability in the
GUI.


Using the Rest API, GET of ManagedSystem can be used to view the simplified
remote restart capability. It will be listed under SystemCapabilites as
PowerVMLogicalPartitionSimplifiedRemoteRestartCapable
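
The CLI check above can be scripted. The following is a minimal Python sketch,
assuming the output of "lssyscfg -r sys -m <system> -Fcapabilities" has already
been captured as a string. The helper name has_capability and the placeholder
capability names in the sample are illustrative assumptions, not part of the
HMC.

```python
SRR_CAPABILITY = "powervm_lpar_simplified_remote_restart_capable"


def has_capability(capabilities_output: str, name: str) -> bool:
    """Return True if `name` is one entry of a comma-separated capability list."""
    # The list may be wrapped in double quotes; strip them and any whitespace.
    cleaned = capabilities_output.strip().strip('"')
    # Compare whole entries so that a longer name containing `name` does not match.
    entries = {item.strip() for item in cleaned.split(",") if item.strip()}
    return name in entries


# Hypothetical captured output; capability_a and capability_b are placeholders.
sample = '"capability_a,powervm_lpar_simplified_remote_restart_capable,capability_b"'
print(has_capability(sample, SRR_CAPABILITY))
```

Matching whole entries rather than substrings avoids a false positive if another
capability name happens to contain the same text.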

2.2 Creating a Simplified Remote Restart Partition using HMC Classic GUI

When creating a simplified remote restart partition from the classic GUI, check
the appropriate box on the very first screen of the partition creation wizard. Refer
to the panel shot below.


Continue with the rest of the partition creation wizard. As noted above, some
functions are not available for a simplified remote restart capable partition.

2.3 Creating a Simplified Remote Restart Partition using HMC CLI

When creating a partition from the command line interface, add the
"simplified_remote_restart_capable" attribute and set it to the value 1 to enable
simplified remote restart for the partition. For example:

mksyscfg -r lpar -m system115a -i "name=example_rr, profile_name=profile1,
lpar_env=aixlinux, min_mem=256, desired_mem=1024, max_mem=1024, proc_mode=ded,
min_procs=1, desired_procs=1, max_procs=2, sharing_mode=share_idle_procs, ...,
simplified_remote_restart_capable=1"
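
When partitions are created from a script, the attribute string can be assembled
programmatically. The following is a minimal Python sketch, assuming the
attributes are held in a dictionary. The function name build_srr_mksyscfg is
hypothetical, and the sketch does not handle attribute values that themselves
contain commas or quotes.

```python
import shlex


def build_srr_mksyscfg(managed_system: str, attrs: dict) -> str:
    """Assemble an mksyscfg command line for a simplified remote restart partition."""
    # Force the capability attribute to 1, whatever the caller passed in.
    merged = {**attrs, "simplified_remote_restart_capable": 1}
    # The -i option takes comma-separated name=value pairs.
    config = ",".join(f"{key}={value}" for key, value in merged.items())
    return " ".join(
        ["mksyscfg", "-r", "lpar", "-m", shlex.quote(managed_system), "-i", f'"{config}"']
    )


print(build_srr_mksyscfg("system115a", {
    "name": "example_rr",
    "profile_name": "profile1",
    "lpar_env": "aixlinux",
}))
```

Only attribute names that appear in the example above are used here; any other
attributes are passed through unchanged.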

2.4 Creating a Simplified Remote Restart Partition using Rest API

When creating a partition using the Rest API, as part of the PUT of
LogicalPartition, specify SimplifiedRemoteRestartCapable as true.

2.5 Toggling Simplified Remote Restart Capability

Simplified remote restart capability can be toggled using the Command Line
Interface (CLI) or the Rest API. The capability can be changed only when the
partition is inactive.

For toggling using the CLI, chsyscfg can be used and the attribute
"simplified_remote_restart_capable" has to be specified with value 1 or 0, to
enable or disable the capability respectively.

chsyscfg -r lpar -m system115a -i "name=example_rr,simplified_remote_restart_capable=1"

For toggling using Rest API, POST operation can be done on the LogicalPartition
with the desired/appropriate value for SimplifiedRemoteRestartCapable.
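
Because the toggle is only supported for an inactive partition, a script can
refuse to issue the command otherwise. The following is a minimal Python sketch
under the assumption that the caller already knows the partition state and that
an inactive partition is reported as "Not Activated". The function name
build_toggle_command is hypothetical.

```python
INACTIVE_STATE = "Not Activated"  # assumed state string for an inactive partition


def build_toggle_command(managed_system: str, lpar_name: str,
                         lpar_state: str, enable: bool) -> str:
    """Return the chsyscfg command that enables or disables the capability."""
    # Toggling is only supported when the partition is inactive.
    if lpar_state != INACTIVE_STATE:
        raise ValueError(
            f"partition {lpar_name} is in state {lpar_state!r}; "
            "the capability can only be changed when it is inactive"
        )
    value = 1 if enable else 0
    return (
        f"chsyscfg -r lpar -m {managed_system} -i "
        f'"name={lpar_name},simplified_remote_restart_capable={value}"'
    )


print(build_toggle_command("system115a", "example_rr", "Not Activated", True))
```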

2.6 Viewing/Listing the Simplified Remote Restart Capability

In the command line interface the remote restart capability is displayed using
lssyscfg command:

lssyscfg -r lpar -m system115a

name=example_rr,...simplified_remote_restart_capable=1

In the Rest API, SimplifiedRemoteRestartCapable will be displayed when a
GET of LogicalPartition is performed.

The simplified remote restart capability is displayed on the General tab of the
partition's properties when viewed from HMC classic GUI. Enhanced UI support
will be provided later. Toggle of the simplified remote restart capability is not
supported through the GUI currently.
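
To list the capable partitions from a script, the lssyscfg output can be parsed
line by line. The following is a minimal Python sketch, assuming each output
line is a comma-separated list of name=value pairs and that a pair whose value
contains commas is enclosed in double quotes. The function names and the
lpar_id values in the sample are illustrative assumptions.

```python
import csv


def parse_lssyscfg_record(line: str) -> dict:
    """Split one lssyscfg output line into a dict of attribute names to values."""
    # csv.reader keeps a double-quoted pair such as "x=1,2" together as one field.
    fields = next(csv.reader([line]))
    record = {}
    for field in fields:
        key, sep, value = field.partition("=")
        if sep:  # skip anything that is not a name=value pair
            record[key] = value
    return record


def srr_capable_partitions(output: str) -> list:
    """Return the names of partitions with simplified_remote_restart_capable=1."""
    names = []
    for line in output.splitlines():
        if not line.strip():
            continue
        record = parse_lssyscfg_record(line)
        if record.get("simplified_remote_restart_capable") == "1":
            names.append(record.get("name"))
    return names


# Hypothetical captured output of "lssyscfg -r lpar -m system115a".
sample_output = (
    "name=example_rr,lpar_id=3,simplified_remote_restart_capable=1\n"
    "name=other_lpar,lpar_id=4,simplified_remote_restart_capable=0\n"
)
print(srr_capable_partitions(sample_output))
```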


3 Remote Restart Operations

Remote Restart operation is supported via the CLI or the Rest API.

3.1 Remote Restart Validation

The rrstartlpar command is used to validate that the destination server is
properly configured and has the available resources to support a remote restart of
the specified partition. This test only reflects the current state of the destination
server. Subsequent changes to the server configuration could change its ability
to support the Remote Restart for that partition.

rrstartlpar -o validate -m <source server> -t <destination server> -p <lpar name> | --id <lpar id>

3.2 Remote Restart

The rrstartlpar command is also used to remote restart a partition when the
source server has failed. A validation is also performed as part of the restart
operation.

rrstartlpar -o restart -m <source server> -t <destination server> -p <lpar name> | --id <lpar id>

3.2.1 Remote Restart of a Suspended Partition


If the partition is in suspended state on the source server, that suspended
state is lost when the partition is powered on during a remote restart. The
force option therefore has to be used to remote restart a partition which is in
suspended state on the source server.

rrstartlpar -o restart -m <source server> -t <destination server> -p <lpar name> | --id <lpar id> --force

3.2.2 Remote Restart with latest available configuration


If the persisted configuration data is not in sync with the current configuration of the
partition (due to some failure in collecting/storing information), the remote restart status
of the partition will be set to Local Storage Update Failed (refer to section 8.1 for details
on the different remote restart states). The remote restart operation cannot be performed in
this scenario. However, the user can force the remote restart to be performed with the
available information using the usecurrdata option. If the usecurrdata option is used, the HMC
performs the remote restart with the currently persisted information, and hence the current
configuration on the source system might not be reflected as is on the destination system.

rrstartlpar -o restart -m <source server> -t <destination server> -p <lpar name> | --id <lpar id> --usecurrdata
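
The validate and restart invocations above differ only in the operation and the
optional flags, so they can be assembled by one helper. The following is a
minimal Python sketch. The function name build_rrstartlpar and the server names
in the usage line are hypothetical; only options shown in this chapter are used.

```python
from typing import List, Optional


def build_rrstartlpar(operation: str, source: str, target: str,
                      lpar_name: Optional[str] = None,
                      lpar_id: Optional[int] = None,
                      force: bool = False,
                      usecurrdata: bool = False) -> List[str]:
    """Build the argument list for an rrstartlpar validate or restart call."""
    if operation not in ("validate", "restart"):
        raise ValueError("operation must be 'validate' or 'restart'")
    # The partition is identified by name (-p) or by ID (--id), not both.
    if (lpar_name is None) == (lpar_id is None):
        raise ValueError("specify exactly one of lpar_name or lpar_id")
    args = ["rrstartlpar", "-o", operation, "-m", source, "-t", target]
    if lpar_name is not None:
        args += ["-p", lpar_name]
    else:
        args += ["--id", str(lpar_id)]
    if force:
        args.append("--force")  # needed for a partition suspended on the source
    if usecurrdata:
        args.append("--usecurrdata")  # restart with the persisted data as it is
    return args


# Hypothetical server names, for illustration only.
print(" ".join(build_rrstartlpar("restart", "source_a", "dest_b",
                                 lpar_name="example_rr", force=True)))
```

Returning an argument list rather than a single string keeps each value
separate, which avoids quoting problems if the list is later passed to an SSH
client or a process launcher.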


3.3 Remote Restart Abort

A remote restart operation can be cancelled/aborted by using the rrstartlpar
command. The remote restart operation of a simplified remote restart partition
can be aborted at any time before the no return point. Once the no return point
is reached, the remote restart status of the partition is set to Destination
Remote Restarted (refer to section 8.1 for details on the different remote
restart states).

rrstartlpar -o cancel -m <source server> -t <destination server> -p <lpar name> | --id <lpar id>

3.4 Remote Restart Recover

In case a remote restart operation fails, HMC will try to auto recover as much as
possible. If auto recover also fails, an appropriate message will be shown to the
user. The failed remote restart can be recovered by using the rrstartlpar
command. Refer to section 3.7 for different error codes of remote restart
command.

rrstartlpar -o recover -m <source server> -t <destination server> -p <lpar name> | --id <lpar id> [--force]

3.5 Source Server Cleanup

After a remote restart has completed successfully, the user must manually
remove the original partition from the failed source server (with HMC V8 R8.5.0,
automatic cleanup of remote restarted partitions is performed by the HMC; see
section 6.5 for more details on auto cleanup). The following command is used to
clean up that original partition. The source server must be restored to a
running state for the cleanup operation to complete successfully, including:
• the source server must be in the “Powered On” state,
• all Virtual IO Server (VIOS) partitions which were hosting the virtual
adapters must be in the “Running” state, and
• the RMC connections to those VIOS partitions must be established.

rrstartlpar -o cleanup -m <source server> -p <lpar name> | --id <lpar id>

Source server Forced Clean-up

In some cases, a server cannot easily be restored to the original running
configuration. In these cases it may be necessary to clean up the source server in
smaller steps. The rrstartlpar command can be used with the force option to
force the cleanup of the partition on the source server. Forced cleanup performs
as much cleanup as possible.

rrstartlpar -o cleanup -m <source server> -p <lpar name> | --id <lpar id> --force

In general, the goal of the clean-up operation is removal of the original partition
and its associated virtual resources. The clean-up operation attempts to
suppress error messages if the partition is deleted successfully.

Any error output that is generated by a force cleanup operation can be ignored as
long as the partition is deleted. If the partition is not deleted at the end of a
cleanup operation, then the errors should be checked and corrective actions
taken as appropriate.

If the cleanup operation was performed with the force option and the partition is
deleted successfully, there might be adapter mappings that were not cleaned up on
the VIOS partitions, and the user might have to clean up those mappings manually.

3.5.1 Cleanup of a partition in suspended state


If the partition to be cleaned up is in suspended state, the clean-up operation
internally forces the suspended partition to shutdown (“Not Activated” state).

If the forced shutdown fails, an error is returned to stdout and corrective actions
might be needed before the cleanup can be tried again.

3.6 Remote Restart Rest API

Remote Restart through the Rest API is performed using a Rest Job. The Logical
Partition remote restart Job is used to perform the remote restart operations on
the logical partition. The user can perform validate, recover, restart, cleanup
and cancel operations using this job.

The following URI is used to perform remote restart:

https://<hmc ip>:12443/rest/api/uom/LogicalPartition/{LogicalPartition_UUID}/do/RemoteRestart

Possible Job Parameters :

Operation
targetManagedSystemUUID
targetManagedSystem
Redundancy
Verbose
vlanbridge
force
usecurrdata
retaindev

Sample Job Request:

<JobRequest:JobRequest
xmlns:JobRequest="http://www.ibm.com/xmlns/systems/power/firmware/web/mc/2012_10/"
xmlns="http://www.ibm.com/xmlns/systems/power/firmware/web/mc/2012_10/"
xmlns:ns2="http://www.w3.org/XML/1998/namespace/k2" schemaVersion="V1_1_0">
<Metadata>
<Atom/>
</Metadata>
<RequestedOperation kb="CUR" kxe="false" schemaVersion="V1_1_0">
<Metadata>
<Atom/>
</Metadata>
<OperationName kb="ROR" kxe="false">RemoteRestart</OperationName>
<GroupName kb="ROR" kxe="false">LogicalPartition</GroupName>
</RequestedOperation>
<JobParameters kb="CUR" kxe="false" schemaVersion="V1_1_0">
<Metadata>
<Atom/>
</Metadata>
<JobParameter schemaVersion="V1_0">
<Metadata>
<Atom/>
</Metadata>
<ParameterName kb="ROR" kxe="false">targetManagedSystem</ParameterName>
<ParameterValue kxe="false" kb="CUR">HV4-221</ParameterValue>
</JobParameter>
<JobParameter schemaVersion="V1_0">
<Metadata>
<Atom/>
</Metadata>
<ParameterName kb="ROR" kxe="false">targetManagedSystemUUID</ParameterName>
<ParameterValue kxe="false" kb="CUR">b73f1565-0ae4-3070-8eac-58f35a81e898</ParameterValue>
</JobParameter>
<JobParameter schemaVersion="V1_0">
<Metadata>
<Atom/>
</Metadata>
<ParameterName kb="ROR" kxe="false">operation</ParameterName>
<ParameterValue kxe="false" kb="CUR">validate</ParameterValue>
</JobParameter>
</JobParameters>
</JobRequest:JobRequest>
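
A request body of the shape shown above can be generated from a dictionary of
job parameters. The following is a minimal Python sketch that mirrors the
element and attribute layout of the sample request. The function names are
hypothetical, and sending the request (authentication, headers, HTTP method) is
outside the scope of this sketch.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

NS = "http://www.ibm.com/xmlns/systems/power/firmware/web/mc/2012_10/"
METADATA = "<Metadata><Atom/></Metadata>"


def job_parameter(name: str, value: str) -> str:
    """Render one JobParameter element with escaped name and value."""
    return (
        f'<JobParameter schemaVersion="V1_0">{METADATA}'
        f'<ParameterName kb="ROR" kxe="false">{escape(name)}</ParameterName>'
        f'<ParameterValue kxe="false" kb="CUR">{escape(value)}</ParameterValue>'
        "</JobParameter>"
    )


def build_remote_restart_job(params: dict) -> str:
    """Render a RemoteRestart JobRequest laid out like the sample request."""
    body = "".join(job_parameter(name, value) for name, value in params.items())
    return (
        f'<JobRequest:JobRequest xmlns:JobRequest="{NS}" xmlns="{NS}" '
        'xmlns:ns2="http://www.w3.org/XML/1998/namespace/k2" schemaVersion="V1_1_0">'
        f"{METADATA}"
        f'<RequestedOperation kb="CUR" kxe="false" schemaVersion="V1_1_0">{METADATA}'
        '<OperationName kb="ROR" kxe="false">RemoteRestart</OperationName>'
        '<GroupName kb="ROR" kxe="false">LogicalPartition</GroupName>'
        "</RequestedOperation>"
        f'<JobParameters kb="CUR" kxe="false" schemaVersion="V1_1_0">{METADATA}'
        f"{body}</JobParameters></JobRequest:JobRequest>"
    )


def parameter_names(xml_text: str) -> list:
    """Parse a request back and list its parameter names (well-formedness check)."""
    root = ET.fromstring(xml_text)
    return [elem.text for elem in root.iter(f"{{{NS}}}ParameterName")]


request = build_remote_restart_job({
    "targetManagedSystem": "HV4-221",
    "operation": "validate",
})
print(parameter_names(request))
```

Parsing the generated text back with a standard XML parser is a cheap way to
confirm the request is well-formed before it is sent.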

For more details on the remote restart Rest Job, refer to the HMC Rest API
reference doc at

https://www.ibm.com/developerworks/community/groups/service/html/communityview?communityUuid=0196fd8d-7287-4dff-8526-102b5bcf0df5#fullpageWidgetId=W395818bd593b_487f_a7ec_79c3c27093f8&file=17e2dc0d-4609-48fe-95ca-9c45c7fdfc2a


3.7 Error Codes of Remote Restart Commands

The following table depicts different error codes for the rrstartlpar command or
the doRemoteRestart Rest job. In most of the cases, HMC auto recovers as much
as possible and recovers/completes actions on source server as well.

Return code value   Description
0                   Remote Restart is successful
1                   Remote Restart Failed
81                  Partition name is already used on target.
82                  Remote Restart Validation Failed.
83                  Remote Restart failed before no return point
84                  Remote Restart failed after no return point
85                  Remote Restart Recover is not valid
86                  Remote Restart Recover failed
87                  Remote Restart failed before no return point & recovery was successful.
89                  Remote Restart force recover failed.
91                  Remote Restart recover failed during roll back.
92                  Remote restart recover failed during cleanup.
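
In an automation script, the return codes in the table above can be turned into
readable messages with a simple lookup. The following is a minimal Python
sketch. The descriptions are copied from the table, while the function name
describe_return_code and the fallback text for unlisted codes are assumptions.

```python
# Return codes and descriptions as listed in the table above.
RETURN_CODES = {
    0: "Remote Restart is successful",
    1: "Remote Restart Failed",
    81: "Partition name is already used on target.",
    82: "Remote Restart Validation Failed.",
    83: "Remote Restart failed before no return point",
    84: "Remote Restart failed after no return point",
    85: "Remote Restart Recover is not valid",
    86: "Remote Restart Recover failed",
    87: "Remote Restart failed before no return point & recovery was successful.",
    89: "Remote Restart force recover failed.",
    91: "Remote Restart recover failed during roll back.",
    92: "Remote restart recover failed during cleanup.",
}


def describe_return_code(code: int) -> str:
    """Return the documented description, or a generic note for unlisted codes."""
    return RETURN_CODES.get(code, f"Unlisted return code {code}")


print(describe_return_code(84))
```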


4 Re-synchronizing the persisted configuration information

In general, any partition configuration or profile change causes the HMC to
automatically update the partition configuration data persisted on the HMC.
Examples include:

• Any profile change (Create/Edit/Delete)
• DLPAR changes (DLPAR add/remove/move)
• Virtual IO configuration changes
• Activating a partition
• Restoring the profile data
• Any partition property change (e.g. changing the Partition Name)

If an automatic update fails, an error message is displayed providing further
details on the nature of the failure. The actual configuration change or profile
change is still considered successful, but the configuration data persisted on the
HMC is now out of sync with the current configuration of the partition. Some
corrective action may be necessary in such cases, and after the corrective action
is complete, the refdev command should be run to manually update
(re-synchronize) the persisted configuration information. Similarly, if any
configuration changes related to the client partition are performed directly on the
Virtual IO Server partition, the persisted configuration information needs to be
updated manually using the refdev command.

4.1 refdev command

The refdev command refreshes the partition and profile configuration data
persisted on the HMC.

This command is intended for:
• recovery scenarios,
• situations when the Virtual I/O Server configuration is modified without using
the HMC, or
• when directed by a message from the HMC command line, the GUI or the Rest
API.

Usage: refdev
[ -m <managed system> ]
[ -p <partition name> | --id <partition ID> ]
[ -w <wait time> ]
[ -d <detail level> ]
[ -v ]
[ --help ]

Where
-m <managed system>  - the source managed system's name
-p <partition name>  - the name of the partition on which to perform the operation
--id <partition ID>  - the ID of the partition on which to perform the operation
-w <wait time>       - the maximum time, in minutes, to wait for commands issued
                       to the VIOS to complete
-d <detail level>    - the level of detail requested
-v                   - enable verbose mode
--help               - prints this help

The refdev command returns 0 if the command completes successfully. Warnings
are possible on stdout.
A return code of 1 indicates that an error occurred. Errors and warnings are
presented on stdout.

As an example:

refdev -m system115a -p examplerr

HSCLA9DB The manual resync operation failed, because the partition is
not in valid remote restart state.
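
When refdev is driven from a script, the options in the usage text can be
assembled by a small helper. The following is a minimal Python sketch. The
function name build_refdev is hypothetical, and only the options listed in the
usage text above are used.

```python
from typing import List, Optional


def build_refdev(managed_system: str,
                 lpar_name: Optional[str] = None,
                 lpar_id: Optional[int] = None,
                 wait_minutes: Optional[int] = None,
                 detail_level: Optional[int] = None,
                 verbose: bool = False) -> List[str]:
    """Build the argument list for a refdev call from the documented options."""
    # -p and --id are alternatives in the usage text, so reject both together.
    if lpar_name is not None and lpar_id is not None:
        raise ValueError("specify lpar_name or lpar_id, not both")
    args = ["refdev", "-m", managed_system]
    if lpar_name is not None:
        args += ["-p", lpar_name]
    if lpar_id is not None:
        args += ["--id", str(lpar_id)]
    if wait_minutes is not None:
        args += ["-w", str(wait_minutes)]  # maximum wait for VIOS commands
    if detail_level is not None:
        args += ["-d", str(detail_level)]
    if verbose:
        args.append("-v")
    return args


print(" ".join(build_refdev("system115a", lpar_name="examplerr")))
```

A caller would treat a return code of 0 as success and 1 as an error, as
described above, and inspect stdout for warnings in either case.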

4.2 refdev using rest api

refdev can also be performed using the rest api. The following is the resource:

/rest/api/uom/ManagementConsole/{ManagementConsole_UUID}/do/CLIRunner

Sample Job Request:

<JobRequest:JobRequest
xmlns:JobRequest="http://www.ibm.com/xmlns/systems/power/firmware/web/mc/2012_10/"
xmlns="http://www.ibm.com/xmlns/systems/power/firmware/web/mc/2012_10/"
xmlns:ns2="http://www.w3.org/XML/1998/namespace/k2" schemaVersion="V1_2">
<Metadata>
<Atom/>
</Metadata>
<RequestedOperation kxe="false" kb="CUR" schemaVersion="V1_2">
<Metadata>
<Atom/>
</Metadata>
<OperationName kxe="false" kb="ROR">CLIRunner</OperationName>
<GroupName kxe="false" kb="ROR">ManagementConsole</GroupName>
</RequestedOperation>
<JobParameters kxe="false" kb="CUR" schemaVersion="V1_2">
<Metadata>
<Atom/>
</Metadata>
<JobParameter schemaVersion="V1_0">
<Metadata>
<Atom/>
</Metadata>
<ParameterName kxe="false" kb="ROR">cmd</ParameterName>
<ParameterValue kxe="false" kb="CUR">refdev -m HV4 --id 3</ParameterValue>
</JobParameter>
<JobParameter schemaVersion="V1_0">
<Metadata>
<Atom/>
</Metadata>
<ParameterName kxe="false"
kb="ROR">acknowledgeThisAPIMayGoAwayInTheFuture</ParameterName>
<ParameterValue kxe="false" kb="CUR">true</ParameterValue>
</JobParameter>
</JobParameters>
</JobRequest:JobRequest>


5 Important Remote Restart Notes

5.1 Partition Activation

• Remote Restartable partitions have several configuration restrictions which
are delineated in Chapter 2. These restrictions are enforced at activation and
will cause the activation to fail if not met.
• The configuration data persisted on the HMC is checked during partition
activation. If there is no valid copy of the configuration data available and the
data synchronization fails, the partition activation also fails. It is possible,
though unlikely, for this data to be verified as valid and for a subsequent update
to the persisted information during activation to fail. In this event, the
activation will succeed and an error message about the failure is provided to the
user. In this case, the persisted data has gone out of sync with the current
configuration and the user needs to run the refdev command to sync up the data.

5.2 Server Configuration Notes

• The source and destination servers cannot be the same system.

5.3 Partition Configuration Notes

• The partition name must not already be in use on the destination server.
There is no similar restriction on the partition ID. The HMC attempts to retain
the same partition ID. In the event of a conflict on the destination server
during the remote restart operation, a new partition ID is assigned.
• When adding virtual storage adapters to a running Remote Restartable
partition, the user should complete the server adapter configuration before
adding the client adapter. This allows the data synchronization to complete
successfully when the client virtual adapter is added to the partition.
• If the user performs any configuration changes without going through the HMC
interfaces, like attaching storage to a server adapter through the VIOS CLI, the
refdev command should be run for the client partition after the change.
• Enabling/disabling the Simplified Remote Restart capability is not supported
when using the enhanced UI.
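
The name and ID rules in the first note above can be illustrated as follows.
This is a minimal Python sketch, not the HMC's actual logic: the document does
not say how a replacement partition ID is chosen, so picking the lowest unused
ID is purely an assumption, and the function name is hypothetical.

```python
from typing import Set


def plan_destination_identity(name: str, lpar_id: int,
                              dest_names: Set[str], dest_ids: Set[int]) -> int:
    """Return the partition ID to use on the destination server."""
    # A name conflict on the destination is an error for remote restart.
    if name in dest_names:
        raise ValueError(f"partition name {name!r} is already in use on the destination")
    # Keep the same ID when it is free on the destination.
    if lpar_id not in dest_ids:
        return lpar_id
    # ASSUMPTION: on an ID conflict, pick the lowest unused ID. The real
    # selection rule is not described in this document.
    candidate = 1
    while candidate in dest_ids:
        candidate += 1
    return candidate


print(plan_destination_identity("example_rr", 3, {"other_lpar"}, {1, 2, 3}))
```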


6 Remote Restart Enhancements


The following section describes the Simplified Remote Restart enhancements
being introduced with HMC V8 R8.5.0.

6.1 Cross MC Remote Restart

With this enhancement, the source and target systems can be managed by two
different HMCs. In such a case, SSH authentication must be set up between the
managing HMCs before performing a remote restart. Cross MC Remote Restart is
supported only for Simplified Remote Restart capable partitions.

The mkauthkeys command can be used to set up the authentication, similar to the
authentication setup for Cross MC Live Partition Mobility (LPM).
CLI:
mkauthkeys --ip <target hmc ip/host name> -u <target hmc user name> --passwd <password>
rrstartlpar -o validate -m <source system> -t <target system> -p <partition name> | --id <partition id> --ip <target hmc ip/host name> [-u <target hmc user name>]
rrstartlpar -o restart -m <source system> -t <target system> -p <partition name> | --id <partition id> --ip <target hmc ip/host name> [-u <target hmc user name>]
rrstartlpar -o recover -m <source system> -t <target system> -p <partition name> | --id <partition id> --ip <target hmc ip/host name> [-u <target hmc user name>]

Rest API :
https://<<HMCIP>>:12443/rest/api/uom/ManagedSystem/<ManagedSystem_UUID>/LogicalPartition/<<PARTITION_UUID>>/do/RemoteRestart

Job Parameter Name         Mandatory?                    Description
TargetRemoteHMCIPAddress   Yes, only for Cross MC SRR    Valid only for Validate,
                           related operations            Restart & Recovery
TargetRemoteHMCUserId      Yes, only for Cross MC SRR    Valid only for Validate,
                           related operations            Restart & Recovery
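
As an illustration of the Cross MC command syntax above, the following Python sketch assembles the argument list for one rrstartlpar call. It only builds the list and does not contact an HMC. The function name build_rrstartlpar and its parameter names are hypothetical, and the check that -u is only given together with --ip is an assumption made for this sketch, not a documented rule.

```python
from typing import List, Optional


def build_rrstartlpar(
    operation: str,
    source_system: str,
    target_system: str,
    partition_name: Optional[str] = None,
    partition_id: Optional[int] = None,
    target_hmc: Optional[str] = None,
    target_hmc_user: Optional[str] = None,
) -> List[str]:
    """Return the argument list for one rrstartlpar call (nothing is executed)."""
    if operation not in ("validate", "restart", "recover"):
        raise ValueError(f"unsupported operation: {operation}")
    # -p and --id are alternatives in the syntax: exactly one must be given.
    if (partition_name is None) == (partition_id is None):
        raise ValueError("give exactly one of partition_name or partition_id")
    argv = ["rrstartlpar", "-o", operation, "-m", source_system, "-t", target_system]
    if partition_name is not None:
        argv += ["-p", partition_name]
    else:
        argv += ["--id", str(partition_id)]
    if target_hmc is not None:
        # Cross MC case: name the HMC that manages the target system.
        argv += ["--ip", target_hmc]
        if target_hmc_user is not None:
            argv += ["-u", target_hmc_user]
    elif target_hmc_user is not None:
        # Assumption of this sketch: a user name without a target HMC is an error.
        raise ValueError("target_hmc_user requires target_hmc")
    return argv


if __name__ == "__main__":
    # Placeholder system, partition, and host names.
    print(" ".join(build_rrstartlpar(
        "validate", "srcSys", "tgtSys",
        partition_name="lp01", target_hmc="hmc2.example.com", target_hmc_user="hscroot",
    )))
```

Building a list rather than a single string keeps names that contain spaces intact when the command is later passed to an SSH client or subprocess call.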


6.2 Remote Restart without FSP Connection

This function allows a remote restart to be performed when the entire system,
including the FSP, crashes or goes down, that is, when the system state shown in
the HMC is “No Connection”. However, the system must have been connected to the
HMC in Operating state before the server outage or loss of connection, and the
remote restart data must have been captured in the HMC. (The Remote Restart
Status of the partition can be checked to verify that the configuration data is
valid. See section 8.1 for more information on Remote Restart States.)
A new override is introduced to perform a remote restart when the system state is
“No Connection”. The user should use the override only after making sure that
there is actually a server outage and that the connection was not lost because
of a network failure.
This option is supported only for Simplified Remote Restart capable partitions.

CLI :
rrstartlpar -o validate -m <source system> -t <target system> -p <partition name> | --id <partition id>
    --noconnection [--ip <target hmc ip/host name>] [-u <target hmc user name>]
rrstartlpar -o restart -m <source system> -t <target system> -p <partition name> | --id <partition id>
    --noconnection [--ip <target hmc ip/host name>] [-u <target hmc user name>]
Rest API :
https://<<HMCIP>>:12443/rest/api/uom/ManagedSystem/<ManagedSystem_UUID>/LogicalPartition/<<PARTITION_UUID>>/do/RemoteRestart

Job Parameter Name   Mandatory?                        Description
noconnection         Yes, only for Remote Restart      Valid only for Validate,
                     with no connection to the         Restart
                     system

6.3 Live Partition Mobility (LPM) Override

This override is used to migrate a Simplified RR capable partition between P7 &
P8 systems. The capability is lost, enabled, or retained depending on the
override and on the partition and source/target system capabilities. For Cross
MC LPM operations, if the override is specified, both the source and target HMCs
must be at HMC V8 R8.5.0 or later.
CLI :
migrlpar -o v -m <source system> -t <target system> -p <partition name> | --id <partition id> --requirerr 1|2
migrlpar -o m -m <source system> -t <target system> -p <partition name> | --id <partition id> --requirerr 1|2
1 = Yes, 2 = If possible
Rest API :
https://<<HMCIP>>:12443/rest/api/uom/LogicalPartition/<UUID>/do/MigrateValidate
https://<<HMCIP>>:12443/rest/api/uom/LogicalPartition/<UUID>/do/Migrate

Job Parameter Name     Mandatory?                 Description
RequireRemoteRestart   No. Valid values are       Valid only for Validation &
                       Yes/If Possible            Migration

SRR Partition Migration Override Options

Source system    Target system   LPM Override                         LPM       Partition SRR
partition SRR    supports SRR                                         Success   capable on target
capable
Yes              Yes             Yes/If Possible                      Yes       Yes
Yes              No              Yes/Override not specified           No        NA
Yes              No              If Possible                          Yes       No
No               Yes             Yes/If Possible                      Yes       Yes
No               No              Yes                                  No        NA
No               No              If Possible/Override not specified   Yes       No
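
The override table can also be read as a lookup. The Python sketch below encodes only the rows listed in the table; the function name lpm_outcome and the strings "yes" and "if_possible" are illustrative choices, not HMC identifiers. Combinations that the table does not list (for example an SRR capable partition moving to an SRR capable target with no override) raise an error rather than guessing an outcome.

```python
from typing import Dict, Optional, Tuple

# Key: (partition_srr_capable, target_supports_srr, override)
#   override is "yes", "if_possible", or None when no override is specified.
# Value: (lpm_succeeds, srr_capable_on_target)
#   srr_capable_on_target is None where the table says NA (migration fails).
_OUTCOMES: Dict[Tuple[bool, bool, Optional[str]], Tuple[bool, Optional[bool]]] = {
    (True, True, "yes"): (True, True),
    (True, True, "if_possible"): (True, True),
    (True, False, "yes"): (False, None),
    (True, False, None): (False, None),
    (True, False, "if_possible"): (True, False),
    (False, True, "yes"): (True, True),
    (False, True, "if_possible"): (True, True),
    (False, False, "yes"): (False, None),
    (False, False, "if_possible"): (True, False),
    (False, False, None): (True, False),
}


def lpm_outcome(
    partition_srr_capable: bool,
    target_supports_srr: bool,
    override: Optional[str],
) -> Tuple[bool, Optional[bool]]:
    """Look up the migration result for one row of the override table."""
    key = (partition_srr_capable, target_supports_srr, override)
    if key not in _OUTCOMES:
        raise LookupError(f"combination not listed in the table: {key}")
    return _OUTCOMES[key]


if __name__ == "__main__":
    # SRR capable partition, target without SRR support, override "If Possible":
    # the migration succeeds but the capability is lost on the target.
    print(lpm_outcome(True, False, "if_possible"))
```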

6.4 Manage Partition GUI & Partition Templates

Manage Partition GUI :

A new option is introduced in the Manage Partition UI (Enhanced+ login) to manage
the Simplified Remote Restart capability & Remote Restart Status.

• SRR capability can be enabled/disabled when the partition is not active
  o If the system supports Simplified RR, only the Simplified RR option is shown
    in the UI, even if the partition is enabled with Remote Restart.
• Remote Restart Status is displayed
• An option to refresh the configuration data stored for SRR is also provided

Partition Templates :
This function adds support for creating Simplified RR capable partitions
using partition templates. Enhancements include:

• Starter/pre-defined partition template with Simplified Remote Restart enabled
• Capture a partition enabled with Simplified RR as a template
• Deploy a partition with Simplified RR capability from templates
  o Enabled
    - Partition is deployed with SRR capability if the system supports
      Simplified RR
    - Template deploy fails if the system doesn't support Simplified RR
  o Disable
    - Partition is deployed without SRR capability
  o Enable If Possible
    - Partition is deployed with SRR capability if the system supports SRR
    - Partition is deployed without SRR capability if the system doesn't
      support SRR
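
The three template settings described above can be summarized in a few lines of Python. This is a sketch of the documented behavior only; the function name deploy_with_srr is hypothetical, and the setting strings simply reuse the labels from the list.

```python
def deploy_with_srr(template_setting: str, system_supports_srr: bool) -> bool:
    """Return True if the partition is deployed with SRR capability.

    Raises RuntimeError where the text says the template deploy fails.
    """
    if template_setting == "Enabled":
        if not system_supports_srr:
            raise RuntimeError("template deploy fails: system does not support Simplified RR")
        return True
    if template_setting == "Disable":
        return False
    if template_setting == "Enable If Possible":
        # Falls back to a non-SRR partition instead of failing the deploy.
        return system_supports_srr
    raise ValueError(f"unknown template setting: {template_setting}")


if __name__ == "__main__":
    for setting in ("Disable", "Enable If Possible"):
        print(setting, "->", deploy_with_srr(setting, system_supports_srr=False))
```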

6.5 Auto cleanup of Remote Restarted Partitions

Prior to HMC V8 R8.5.0, the user had to manually clean up a partition after it
was remote restarted. With HMC V8 R8.5.0:

• Auto cleanup is performed when
  – The source system comes back to Operating state
  – The partition remote restart status is “Remote Restarted”
  – RMC for the VIOS partitions serving the clients is active
• Auto cleanup is done without force
• The user can also trigger a manual cleanup using the rrstartlpar command
• When PowerVC is used to orchestrate Remote Restart
  – Auto cleanup can be disabled
  – By default, auto cleanup is enabled
  – The setting is maintained across upgrades, but not on a fresh install
  – CLI :
    o rrstartlpar -o set -r mc -i "auto_cleanup_enabled=0|1"
    o lsrrstartlpar -r mc
      auto_cleanup_enabled=0|1
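
The auto cleanup conditions can be expressed as a single predicate. The Python sketch below assumes that all three listed conditions must hold at the same time and that the auto_cleanup_enabled setting acts as an additional gate; both are readings of the list above, not confirmed HMC behavior. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CleanupInputs:
    source_system_state: str        # e.g. "Operating" or "No Connection"
    remote_restart_status: str      # partition status on the source system
    vios_rmc_active: bool           # RMC active for the VIOS partitions serving the client
    auto_cleanup_enabled: bool = True  # mirrors auto_cleanup_enabled=1 (the default)


def auto_cleanup_eligible(inputs: CleanupInputs) -> bool:
    """Assumed rule: every listed condition must be true for auto cleanup to run."""
    return (
        inputs.auto_cleanup_enabled
        and inputs.source_system_state == "Operating"
        and inputs.remote_restart_status == "Remote Restarted"
        and inputs.vios_rmc_active
    )


if __name__ == "__main__":
    print(auto_cleanup_eligible(CleanupInputs("Operating", "Remote Restarted", True)))
    print(auto_cleanup_eligible(CleanupInputs("Operating", "Remote Restarted", False)))
```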

6.6 User overrides/specifications

The user can specify the Shared Processor Pool name or ID to be used on the
target system, and also Virtual Fibre Channel mappings such as the target VIOS
partition, target server adapter slot number, and target FC physical port.
Appropriate error/warning messages are provided if the user specifications
cannot be honored on a remote restart.
CLI :
rrstartlpar -o validate -m <source system> -t <target system> -p <partition name> | --id <partition id>
    -i "shared_proc_pool_id=<target spp id>|shared_proc_pool_name=<target spp name>"
    [--ip <IP Address>] [-u <user id>]
rrstartlpar -o restart -m <source system> -t <target system> -p <partition name> | --id <partition id>
    -i "shared_proc_pool_id=<target spp id>|shared_proc_pool_name=<target spp name>"
    [--ip <IP Address>] [-u <user id>]
rrstartlpar -o validate -m <source system> -t <target system> -p <partition name> | --id <partition id>
    [--ip <IP Address>] [-u <user id>]
    -i|-f "virtual_fc_mappings=slot_num/vios_lpar_name/vios_lpar_id/[vios_slot_num]/[vios_fc_port_name]"
rrstartlpar -o restart -m <source system> -t <target system> -p <partition name> | --id <partition id>
    [--ip <IP Address>] [-u <user id>]
    -i|-f "virtual_fc_mappings=slot_num/vios_lpar_name/vios_lpar_id/[vios_slot_num]/[vios_fc_port_name]"
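
To make the virtual_fc_mappings format concrete, the following Python sketch formats one mapping string from its five fields. The function name is hypothetical. The sketch assumes that an omitted optional field is left empty while its slash is kept, which is how the bracketed fields in the syntax above are read here; verify this against the HMC command reference before relying on it.

```python
from typing import Optional


def format_virtual_fc_mapping(
    slot_num: int,
    vios_lpar_name: str,
    vios_lpar_id: int,
    vios_slot_num: Optional[int] = None,
    vios_fc_port_name: Optional[str] = None,
) -> str:
    """Build one virtual_fc_mappings attribute value for use with -i."""
    fields = [
        str(slot_num),
        vios_lpar_name,
        str(vios_lpar_id),
        # Assumption: optional fields stay empty but keep their position.
        "" if vios_slot_num is None else str(vios_slot_num),
        "" if vios_fc_port_name is None else vios_fc_port_name,
    ]
    return "virtual_fc_mappings=" + "/".join(fields)


if __name__ == "__main__":
    # Placeholder values: client slot 3 mapped to VIOS "vios1" (ID 1), port fcs0.
    print(format_virtual_fc_mapping(3, "vios1", 1, vios_fc_port_name="fcs0"))
```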

Rest API :
https://<<HMCIP>>:12443/rest/api/uom/ManagedSystem/<ManagedSystem_UUID>/LogicalPartition/<<PARTITION_UUID>>/do/RemoteRestart

Job Parameter Name   Mandatory                                   Description
SharedProcPoolName   Can be specified if the user wants to       Valid only for Validation,
                     choose a specific shared processor pool     Restart.
                     on the target system. Mutually exclusive
                     with SharedProcPoolID.
SharedProcPoolID     Can be specified if the user wants to       Valid only for Validation,
                     choose a specific shared processor pool     Restart.
                     on the target system using the shared
                     processor pool ID. Mutually exclusive
                     with SharedProcPoolName.
VirtualFCMappings    Can be specified, in the same format as     Valid only for Validation,
                     in the CLI, if the user wants to specify    Restart.
                     mappings for Virtual FC adapters on the
                     target system, such as choosing the
                     target VIOS or target FC port.

6.7 Concurrent Remote Restart Improvements

• The number of concurrent remote restart operations supported per system is
  increased to 32
• New command to list the system-level & LPAR-level remote restart details
• CLI :
  o lsrrstartlpar -r sys | lpar
  o lsrrstartlpar -r sys -m <system name>
  o lsrrstartlpar -r lpar -m <system name>


7 References

IBM's Hardware Management Console V8

• http://www.redbooks.ibm.com/abstracts/sg248232.html
• http://www.redbooks.ibm.com/redbooks/pdfs/sg248232.pdf

Live Partition Mobility (migration) with IBM's Hardware Management Console

• http://www.redbooks.ibm.com/abstracts/sg247460.html
• http://www.redbooks.ibm.com/redbooks/pdfs/sg247460.pdf
• http://www-01.ibm.com/support/knowledgecenter/POWER8/p8hc3/p8hc3_hmcprepsimpremres.htm?cp=POWER8%2F1-7-1-3-7-1-1-0-8

Blog References

• PowerVM Remote Restart Enhancements - https://ibm.biz/Bd4ueT
• Remote Restart with PowerVC & HMC - https://ibm.biz/BdrX6y
• Remote Restart with PowerVC & NovaLink - https://ibm.biz/BdrX6M


8 Appendix

8.1 Remote Restart States

A remote restartable partition goes through several state changes with respect to
the Remote Restart operation, both on the source and destination servers. Most
remote restart operations are only supported with the partition in the appropriate
remote restart state. A remote restart state is not strongly related to the partition
state itself. It is an indicator specifically associated with the Remote Restart
operation itself.

Note: In this section, the states discussed refer only to a partition's remote restart
state and not any other states that a partition might be in or going through
during its normal or error operations.

A partition configured for remote restart is said to be in the Invalid state until it is
activated. Remote restart is not applicable until a partition has been started for
the first time.

Once the partition is up and running, it transitions into the Remote Restartable
state. A partition in this state can be remote restarted.

During the actual remote restart operation, the source partition is put into the
Source Remote Restarting state and the destination side partition is in the
Destination Remote Restarting state. These states are transitional until the remote
restart operation has completed or been canceled. When the remote restart
operation reaches the point of no return on the target system, the remote restart
status is set to Destination Remote Restarted.

After the remote restart operation the source side partition is left in the Remote
Restarted state and the destination side partition is now in the Remote Restartable
state. The source side partition could then be cleaned up and the destination
partition is now again ready to be restarted as needed.

If an update to the persisted information on the HMC fails, the partition is
placed in the Local Storage Update Failed state. This state indicates that the
persisted information on the HMC is out of sync with the current partition
configuration. Remote restart is not allowed in this state unless the user
specifies the usecurrdata option. In that case the partition is restarted with
the configuration data on the device, and the remote restart state on the source
system is updated to Forced Source Side Restart.

If a simplified remote restart partition is suspended, the remote restart state
is set to Remote Restartable Suspended. This state indicates that the partition
is suspended on the source system; remote restart is not allowed in this state
unless the user specifies the force option. In that case the partition is
restarted on the target system (and the suspended state is lost), and the remote
restart state on the source system is set to Source Remote Restarting for
suspended Partition.

When a system is connected to the HMC and has partitions with the simplified
remote restart capability enabled, the HMC automatically collects and persists
their configuration information. Some configuration information (such as virtual
adapter information) requires an RMC connection to the VIOS partitions, so the
HMC waits until the RMC connection is established to collect it. While the
virtual adapter information has not yet been collected, the remote restart state
is set to Partial Update. If configuration information for the partition already
exists in the HMC, the remote restart state is first set to Stale Data before it
is changed to Partial Update.

If an update to the persisted information fails because there is not enough space
on HMC's disk to store the configuration information, then the remote restart
state will be updated to Out Of Space. User can free up space on HMC's disk and
run the refdev command to recover from this state.

If a profile restore operation is performed on a system, the remote restart
state is set to Profile Restored while the simplified remote restart capable
partition is created.

If a cleanup operation performed after a successful remote restart fails, the
remote restart state is set to Source Side Cleanup Failed.

User can use the lssyscfg command to display the partition's remote restart
status. The possible values are:

• Invalid
• Remote Restartable
• Source Remote Restarting
• Destination Remote Restarting
• Remote Restarted
• Profile Restored
• Forced Source Side Restart
• Source Side Cleanup Failed
• Remote Restartable Suspended
• Local Storage Update Failed
• Stale Data
• Partial Update
• Out Of Space
• Source Remote Restarting for suspended Partition
• Destination Remote Restarted

lssyscfg -m HV4 -r lpar -F name,state,simplified_remote_restart_capable,remote_restart_status --header

name,state,simplified_remote_restart_capable,remote_restart_status
lp01,Not Activated,1,Invalid
lp03,Running,1,Remote Restartable
aix1,Not Activated,1,Invalid
aixes_withdev,Not Activated,1,Local Data InValid
ibmi1,Not Activated,1,Local Storage Update Failed
vios2,Running,0,null
vios1,Running,0,null
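
Output in this form is easy to post-process. The Python sketch below parses the --header output shown above and classifies each simplified remote restart capable partition using the state descriptions in this section: Remote Restartable is treated as ready, while Local Storage Update Failed and Remote Restartable Suspended are reported as needing the usecurrdata or force option respectively. The function names and the wording of the result strings are illustrative, and any status not covered by those three cases is reported as not restartable, which is a simplification.

```python
import csv
import io
from typing import Dict, List

SAMPLE_OUTPUT = """\
name,state,simplified_remote_restart_capable,remote_restart_status
lp01,Not Activated,1,Invalid
lp03,Running,1,Remote Restartable
ibmi1,Not Activated,1,Local Storage Update Failed
vios1,Running,0,null
"""

# Statuses where a remote restart is only possible with an explicit option,
# according to the state descriptions in this section.
OPTION_NEEDED = {
    "Local Storage Update Failed": "usecurrdata",
    "Remote Restartable Suspended": "force",
}


def parse_lssyscfg(output: str) -> List[Dict[str, str]]:
    """Turn lssyscfg --header output into one dict per partition."""
    return list(csv.DictReader(io.StringIO(output.strip())))


def restart_readiness(rows: List[Dict[str, str]]) -> Dict[str, str]:
    """Map each SRR capable partition name to a short readiness note."""
    readiness = {}
    for row in rows:
        if row["simplified_remote_restart_capable"] != "1":
            continue  # not an SRR capable partition (e.g. a VIOS)
        status = row["remote_restart_status"]
        if status == "Remote Restartable":
            readiness[row["name"]] = "ready"
        elif status in OPTION_NEEDED:
            readiness[row["name"]] = "needs " + OPTION_NEEDED[status]
        else:
            readiness[row["name"]] = "not restartable (" + status + ")"
    return readiness


if __name__ == "__main__":
    for name, note in restart_readiness(parse_lssyscfg(SAMPLE_OUTPUT)).items():
        print(name, "->", note)
```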

8.2 Remote Restart Feature/Support Matrix

Function                      HMC Level          Firmware Level     VIOS Level           PowerVC Level
HMC CLI for Remote Restart    V8R8.1.0 or later  FW760.00 or later  2.2.2.0 or later     NA
Toggle Remote Restart         V8R8.1.0 or later  FW810.00 or later  NA                   NA
Simplified Remote Restart     V8R8.2.0 or later  FW820.00 or later  2.2.3.4 or later     1.2.3 or later
Simplified Remote Restart     V8R8.4.0 or later  FW820.00 or later  2.2.4.0 or later     1.3.1 or later
with Shared Storage Pool
Storage
Remote Restart Enhancements   V8R8.5.0 or later  FW820.00 or later  2.2.3.4 or 2.2.4.0   NA or not yet
(Please refer to Section 6)                                         or later             supported

Notes:

• HMC V8 R8.2.0 or later supports both the old and the simplified remote restart.
• Between a system with FW820.00 or later and a Power 7 system (FW760.00 or later),
  partitions with the old remote restart capability can be remote restarted.
• VIOS level 2.2.3.4 must be used to use NPIV with Simplified Remote Restart.

8.3 Sample script for triggering remote restart

rrMonitor is an example script that shows how a system can be monitored for
failure and a remote restart triggered automatically. The script is not
officially supported. It is available at the following location:
https://www.ibm.com/developerworks/community/groups/service/html/communityview?communityUuid=0196fd8d-7287-4dff-8526-102b5bcf0df5#fullpageWidgetId=W395818bd593b_487f_a7ec_79c3c27093f8&file=1bb4f57c-f244-4208-87dd-3a780847c613
Notes on the example script:

1. Monitors the CEC state at a time interval specified by the user and triggers a
remote restart if the system goes to the "Error" or "Error - Dump in progress" state
2. Cleanup is not performed by the script
3. While monitoring, checks the remote restart status and refreshes the
configuration data if needed
4. Performs a validate for the remote restart operation at regular intervals
while monitoring
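
The polling behavior in note 1 can be sketched in Python as follows. This is not the rrMonitor script and makes no claim about how that script is written; it only illustrates the idea of polling a state and triggering once on failure. The callables get_state and trigger_restart are hypothetical hooks that the caller would implement, for example by running lssyscfg and rrstartlpar over SSH.

```python
import time
from typing import Callable

# System states that note 1 names as triggers for a remote restart.
FAILURE_STATES = {"Error", "Error - Dump in progress"}


def monitor(
    get_state: Callable[[], str],
    trigger_restart: Callable[[], None],
    interval_seconds: float,
    max_polls: int,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the system state; trigger one remote restart on failure.

    Returns True if a restart was triggered, False if max_polls passed
    without the system entering a failure state.
    """
    for poll in range(max_polls):
        if get_state() in FAILURE_STATES:
            trigger_restart()
            return True
        if poll < max_polls - 1:
            sleep(interval_seconds)
    return False


if __name__ == "__main__":
    # Simulated state sequence in place of a real HMC query.
    states = iter(["Operating", "Operating", "Error"])
    fired = monitor(lambda: next(states), lambda: print("trigger remote restart"), 0.0, 5)
    print("restart triggered:", fired)
```

Passing the state query and the restart action in as callables keeps the loop testable without an HMC and leaves the choice of transport to the caller.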

End of Document