Document Author :
Hari G M
Page 1 of 33
IBM
Table of Contents
1 Introduction
1.1 What is Partition Remote Restart
1.2 Remote Restart Configuration Setup
1.3 Simplified Remote Restart
1.4 Code Level Requirements
1.5 Simplified Remote Restart Configuration Setup
2 Creating a Simplified Remote Restart Partition
2.1 System Level Capability for Simplified Remote Restart
2.2 Creating a Simplified Remote Restart Partition using HMC Classic GUI
2.3 Creating a Simplified Remote Restart Partition using HMC CLI
2.4 Creating a Simplified Remote Restart Partition using Rest API
2.5 Toggling Simplified Remote Restart Capability
2.6 Viewing/Listing the Simplified Remote Restart Capability
3 Remote Restart Operations
3.1 Remote Restart Validation
3.2 Remote Restart
3.2.1 Remote Restart of a Suspended Partition
3.2.2 Remote Restart with latest available configuration
3.3 Remote Restart Abort
3.4 Remote Restart Recover
3.5 Source Server Cleanup
3.5.1 Cleanup of a partition in suspended state
3.6 Remote Restart Rest API
3.7 Error Codes of Remote Restart Commands
4 Re-synchronizing the persisted configuration information
4.1 refdev command
4.2 refdev using rest api
5 Important Remote Restart Notes
5.1 Partition Activation
5.2 Server Configuration Notes
5.3 Partition Configuration Notes
6 Remote Restart Enhancements
6.1 Cross MC Remote Restart
6.2 Remote Restart without FSP Connection
6.3 Live Partition Mobility (LPM) Override
6.4 Manage Partition GUI & Partition Templates
6.5 Auto cleanup of Remote Restarted Partitions
6.6 User overrides/specifications
6.7 Concurrent Remote Restart Improvements
7 References
8 Appendix
8.1 Remote Restart States
8.2 Remote Restart Feature/Support Matrix
8.3 Sample script for triggering remote restart
1 Introduction
This document describes usage of the PowerVM feature called Simplified Remote
Restart. It is intended as a user's guide describing configuration and usage of the
Simplified Remote Restart function.
The sections that follow provide a description of the setup required for performing
simplified remote restart, how to create and deploy a remote restart partition and
how to use the remote restart functionality. There is also an overview of the
existing remote restart function & user model to highlight the advantages of
simplified remote restart function.
Remote restart is a high availability option for partitions. In the event of an
error that causes a server outage, a partition configured for remote restart can
be restarted on a different physical server. Bringing the failed server back up
can take a long time, in which case the remote restart function can be used to
re-provision the partition faster. Typically this is faster than restarting the
server that crashed and then restarting the partition(s).
• HMC (V8R8.1.0)
• User can create partition with remote restart capability on a server.
• User can assign a reserved storage device to remote restart partition
which is accessible from another server.
• User activates the partition and the current configuration is
automatically written to the reserved storage device by the server
firmware.
• The data in the reserved storage device is also updated automatically by
the firmware for any configuration change.
• HMC collects the partition's resource configuration and writes to the
reserved storage device which will be used to restore the partition's
configuration during remote restart.
• User performs remote restart and clean up operations
One of the major disadvantages of the above user model is that a reserved storage
device has to be attached to each partition. The user has to manage a reserved
storage device pool on the source and target systems and keep track of the device
assigned to each partition, which makes setting up the environment complex.
Simplified remote restart addresses this pain point.
Simplified Remote Restart, as the name suggests, reduces the complexity and
improves the usability of the remote restart function. It removes the requirement
that a reserved storage device be assigned to each partition, thereby making the
user model and the environment setup simpler.
Here is the typical user model for simplified remote restart:
• The user creates a partition with simplified remote restart capability on a
capable server. Let's call this the source server.
• The user can enable or disable the capability at any time after creating the
partition. Toggling the capability is supported only when the partition is
inactive.
The following code levels are required to use simplified remote restart:
• HMC V8 R8.2.0 or later
• System Firmware 820 or later
• VIOS level 2.2.3.4 or later
Simplified Remote Restart for partitions using Shared Storage Pool (SSP) storage
is supported with HMC level V8 R8.4.0 and VIOS level 2.2.4.0 or later.
A partition can be made simplified remote restart capable during its creation or
thereafter. Toggling the simplified remote restart capability is supported only
when the partition is inactive.
Partitions can be created using the Graphical User Interface (GUI), the command
line, or the Rest API.
There are several restrictions for a simplified remote restart partition, similar
to those for partition mobility.
The partition:
a. can not be a full system partition
b. can not be a VIOS partition
c. can not be a Service Partition
d. can not be an alternate error logging partition
e. can not have BSR
f. can not have Huge Pages (applicable only if AMS enabled)
g. can not be part of eWLM group
h. can not have physical I/O assigned
i. can not have HEA/HCA/SMA/SRIOV adapters assigned
j. can not have Server SCSI adapter
k. can not have a client SCSI adapter hosted by a non-vios partition
In most cases, the HMC prevents the user from improperly configuring a
Simplified Remote Restart capable partition. These checks are repeated at
partition activation to ensure we do not activate a Simplified Remote Restart
partition with an incompatible configuration.
System level capability for simplified remote restart can be viewed via GUI, CLI or
Rest API.
From CLI, system level capability is displayed using the lssyscfg command
…. powervm_lpar_simplified_remote_restart_capable=1
,powervm_lpar_simplified_remote_restart_capable,
The Capabilities tab in the system properties panel displays the capability in the
GUI.
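As a rough illustration of the CLI check described above, the following Python sketch scans captured lssyscfg output for the capability name. It assumes the output has already been collected as text (for example over SSH). The function name and the sample strings are hypothetical and are not taken from a real HMC.

```python
import re

CAPABILITY = "powervm_lpar_simplified_remote_restart_capable"

def is_srr_capable(lssyscfg_output: str) -> bool:
    """Return True if the capability appears as a whole token in the output."""
    # Tokens are runs of letters, digits and underscores, optionally followed
    # by "=<number>", so a longer capability name is never matched by prefix.
    tokens = re.findall(r"[A-Za-z0-9_]+(?:=[0-9]+)?", lssyscfg_output)
    return any(t in (CAPABILITY, CAPABILITY + "=1") for t in tokens)

# Illustrative input only; a real capabilities list is much longer.
sample = 'capabilities="active_lpar_mobility_capable,' + CAPABILITY + '"'
print(is_srr_capable(sample))
```

Matching whole tokens rather than substrings is a deliberate choice here, so that a differently named capability sharing the same prefix does not produce a false positive.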
Using the Rest API, a GET of ManagedSystem can be used to view the simplified
remote restart capability. It is listed under SystemCapabilities as
PowerVMLogicalPartitionSimplifiedRemoteRestartCapable.
When creating a simplified remote restart partition from the classic GUI, check
the appropriate box on the very first screen of the partition creation wizard. Refer
to the panel shot below.
Continue with the rest of the partition creation wizard. As noted above, some
functions are not available for a simplified remote restart capable partition.
When creating a partition from the command line interface, add the
“simplified_remote_restart_capable” attribute and set it to the value 1 to enable
simplified remote restart for the partition. For example:
When creating a partition using the Rest API, as part of the PUT of
LogicalPartition, specify SimplifiedRemoteRestartCapable as true.
Simplified remote restart capability can be toggled using the Command Line
Interface (CLI) or the Rest API. The capability can be changed only when the
partition is inactive.
To toggle using the Rest API, perform a POST operation on the LogicalPartition
with the desired value for SimplifiedRemoteRestartCapable.
In the command line interface the remote restart capability is displayed using
lssyscfg command:
name=example_rr,...simplified_remote_restart_capable=1
The simplified remote restart capability is displayed on the General tab of the
partition's properties when viewed from HMC classic GUI. Enhanced UI support
will be provided later. Toggle of the simplified remote restart capability is not
supported through the GUI currently.
Remote Restart operation is supported via the CLI or the Rest API.
rrstartlpar -o validate -m <source server> -t <destination server> -p <lpar name> | --id <lpar id>
The rrstartlpar command is also used to remote restart a partition when the
source server has failed. A validation is also performed as part of the restart
operation.
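The command syntax shown above can be assembled programmatically. The sketch below is a minimal Python helper, not part of the HMC tooling. The function name is hypothetical, and only the options that appear in the syntax line (-o, -m, -t, -p, --id) are used.

```python
from typing import List, Optional

def rrstartlpar_args(operation: str, source: str, target: str,
                     lpar_name: Optional[str] = None,
                     lpar_id: Optional[int] = None) -> List[str]:
    """Build an rrstartlpar argument list for one partition."""
    # The syntax offers -p <lpar name> OR --id <lpar id>, never both.
    if (lpar_name is None) == (lpar_id is None):
        raise ValueError("give exactly one of lpar_name or lpar_id")
    args = ["rrstartlpar", "-o", operation, "-m", source, "-t", target]
    if lpar_name is not None:
        args += ["-p", lpar_name]
    else:
        args += ["--id", str(lpar_id)]
    return args

# Hypothetical server and partition names, for illustration only.
print(" ".join(rrstartlpar_args("validate", "srcServer", "tgtServer", lpar_name="lp03")))
```

Returning a list rather than a single string keeps each argument separate, which avoids quoting problems if the list is later passed to an SSH or subprocess call.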
In case a remote restart operation fails, HMC will try to auto recover as much as
possible. If auto recover also fails, an appropriate message will be shown to the
user. The failed remote restart can be recovered by using the rrstartlpar
command. Refer to section 3.7 for different error codes of remote restart
command.
After a remote restart has completed successfully, the user must manually
remove the original partition from the failed source server (with HMC V8 R8.5.0,
automatic cleanup of remote restarted partitions is performed by the HMC; see
section 6.5 for more details on auto cleanup). The following command is used to
clean up that original partition. The source server must be restored to a
running state for the cleanup operation to complete successfully, including:
• the source server must be in the “Powered On” state,
• all Virtual I/O Server (VIOS) partitions that were hosting the virtual
adapters must be in the “Running” state, and
• the RMC connections to those VIOS partitions must be established.
In general, the goal of the clean-up operation is removal of the original partition
and its associated virtual resources. The clean-up operation attempts to
Any error output that is generated by a force cleanup operation can be ignored as
long as the partition is deleted. If the partition is not deleted at the end of a
cleanup operation, then the errors should be checked and corrective actions
taken as appropriate.
If the cleanup operation was performed with the force option and the partition is
deleted successfully, there might be adapter mappings that were not cleaned up on
the VIOS partitions, and the user might have to clean up those mappings manually.
If the forced shutdown fails, an error is returned to stdout and corrective actions
might be needed before the cleanup can be tried again.
Remote Restart through Rest API is performed using a Rest Job. Logical Partition
remote restart Job is used to perform the remote restart operations on the logical partition. The user
can perform validate, recover, restart, cleanup and cancel operations using this job.
https://<hmc ip>:12443/rest/api/uom/LogicalPartition/{LogicalPartition_UUID}/do/RemoteRestart
Operation
targetManagedSystemUUID
targetManagedSystem
Redundancy
Verbose
vlanbridge
force
usecurrdata
retaindev
<JobRequest:JobRequest
    xmlns:JobRequest="http://www.ibm.com/xmlns/systems/power/firmware/web/mc/2012_10/"
    xmlns="http://www.ibm.com/xmlns/systems/power/firmware/web/mc/2012_10/"
    xmlns:ns2="http://www.w3.org/XML/1998/namespace/k2" schemaVersion="V1_1_0">
  <Metadata>
    <Atom/>
  </Metadata>
  <RequestedOperation kb="CUR" kxe="false" schemaVersion="V1_1_0">
    <Metadata>
      <Atom/>
    </Metadata>
    <OperationName kb="ROR" kxe="false">RemoteRestart</OperationName>
    <GroupName kb="ROR" kxe="false">LogicalPartition</GroupName>
  </RequestedOperation>
  <JobParameters kb="CUR" kxe="false" schemaVersion="V1_1_0">
    <Metadata>
      <Atom/>
    </Metadata>
    <JobParameter schemaVersion="V1_0">
      <Metadata>
        <Atom/>
      </Metadata>
      <ParameterName kb="ROR" kxe="false">targetManagedSystem</ParameterName>
      <ParameterValue kxe="false" kb="CUR">HV4-221</ParameterValue>
    </JobParameter>
    <JobParameter schemaVersion="V1_0">
      <Metadata>
        <Atom/>
      </Metadata>
      <ParameterName kb="ROR" kxe="false">targetManagedSystemUUID</ParameterName>
      <ParameterValue kxe="false" kb="CUR">b73f1565-0ae4-3070-8eac-58f35a81e898</ParameterValue>
    </JobParameter>
    <JobParameter schemaVersion="V1_0">
      <Metadata>
        <Atom/>
      </Metadata>
      <ParameterName kb="ROR" kxe="false">operation</ParameterName>
      <ParameterValue kxe="false" kb="CUR">validate</ParameterValue>
    </JobParameter>
  </JobParameters>
</JobRequest:JobRequest>
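A job request of this shape can also be generated rather than written by hand. The following Python sketch fills a string template that mirrors the sample request above and then parses the result to confirm it is well-formed XML. The helper names are hypothetical, and the sketch does not send anything to an HMC.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

NS = "http://www.ibm.com/xmlns/systems/power/firmware/web/mc/2012_10/"
META = "<Metadata><Atom/></Metadata>"

def job_parameter(name: str, value: str) -> str:
    """Render one JobParameter element, escaping XML special characters."""
    return (
        '<JobParameter schemaVersion="V1_0">' + META
        + f'<ParameterName kb="ROR" kxe="false">{escape(name)}</ParameterName>'
        + f'<ParameterValue kxe="false" kb="CUR">{escape(value)}</ParameterValue>'
        + "</JobParameter>"
    )

def remote_restart_job(params: dict) -> str:
    """Render a RemoteRestart JobRequest with the given parameters."""
    body = "".join(job_parameter(n, v) for n, v in params.items())
    return (
        f'<JobRequest:JobRequest xmlns:JobRequest="{NS}" xmlns="{NS}" '
        'xmlns:ns2="http://www.w3.org/XML/1998/namespace/k2" '
        'schemaVersion="V1_1_0">' + META
        + '<RequestedOperation kb="CUR" kxe="false" schemaVersion="V1_1_0">'
        + META
        + '<OperationName kb="ROR" kxe="false">RemoteRestart</OperationName>'
        + '<GroupName kb="ROR" kxe="false">LogicalPartition</GroupName>'
        + "</RequestedOperation>"
        + '<JobParameters kb="CUR" kxe="false" schemaVersion="V1_1_0">' + META
        + body + "</JobParameters></JobRequest:JobRequest>"
    )

# Values copied from the sample request above.
xml_text = remote_restart_job({
    "targetManagedSystem": "HV4-221",
    "targetManagedSystemUUID": "b73f1565-0ae4-3070-8eac-58f35a81e898",
    "operation": "validate",
})
root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
names = [e.text for e in root.iter("{" + NS + "}ParameterName")]
print(names)
```

Parsing the generated text before use is a cheap sanity check; it catches an unescaped character or a missing closing tag before the request is submitted.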
For more details on the remote restart Rest Job, refer to the HMC Rest API
reference doc at
https://www.ibm.com/developerworks/community/groups/service/html/communityview?communityUuid=0196fd8d-7287-4dff-8526-102b5bcf0df5#fullpageWidgetId=W395818bd593b_487f_a7ec_79c3c27093f8&file=17e2dc0d-4609-48fe-95ca-9c45c7fdfc2a
The following table depicts different error codes for the rrstartlpar command or
the doRemoteRestart Rest job. In most of the cases, HMC auto recovers as much
as possible and recovers/completes actions on source server as well.
The refdev command refreshes the partition and profile configuration data
persisted on the HMC.
Usage: refdev
[ -m <managed system> ]
[ -p <partition name> | --id <partition ID> ]
[ -w <wait time> ]
[ -d <detail level> ]
[ -v ]
[ --help ]
Where
-m <managed system> - the source managed system's name
-p <partition name> - the name of the partition on which to
As an example:
refdev can also be performed using the rest api. The following is the resource
/rest/api/uom/ManagementConsole/{ManagementConsole_UUID}/do/CLIRunner
Sample Job Request :
<JobRequest:JobRequest xmlns:JobRequest="http://www.ibm.com/xmlns/systems/power/firmware/web/mc/2012_10/"
xmlns="http://www.ibm.com/xmlns/systems/power/firmware/web/mc/2012_10/"
xmlns:ns2="http://www.w3.org/XML/1998/namespace/k2" schemaVersion="V1_2">
<Metadata>
<Atom/>
</Metadata>
<RequestedOperation kxe="false" kb="CUR" schemaVersion="V1_2">
<Metadata>
<Atom/>
</Metadata>
<OperationName kxe="false" kb="ROR">CLIRunner</OperationName>
<GroupName kxe="false" kb="ROR">ManagementConsole</GroupName>
</RequestedOperation>
<JobParameters kxe="false" kb="CUR" schemaVersion="V1_2">
<Metadata>
<Atom/>
</Metadata>
<JobParameter schemaVersion="V1_0">
<Metadata>
<Atom/>
</Metadata>
<ParameterName kxe="false" kb="ROR">cmd</ParameterName>
With this enhancement, the source and target systems can be managed by two
different HMCs. In such a case, SSH authentication must be set up between the
managing HMCs before performing a remote restart. Cross MC Remote Restart is
supported only for Simplified Remote Restart capable partitions.
Rest API :
https://<<HMCIP>>:12443/rest/api/uom/ManagedSystem/<ManagedSystem_UUID>/LogicalPartition/<<PARTITION_UUID>>/do/RemoteRestart
This function enables the remote restart to be performed when the entire system
crashes or goes down, including the FSP; that is, the system state as seen in the
HMC in this case would be “No Connection”. However, the system must have been
connected to the HMC in the Operating state before the server outage or the loss
of the system connection, and the remote restart data must have been captured in
the HMC (the Remote Restart Status of the partition can be checked to verify that
the configuration data is valid; see section 8.1 for more information on Remote
Restart States).
A new override is introduced to perform remote restart when the system state is
“No Connection”. The user should use the override to perform a remote restart
only after making sure that there is actually a server outage and that the
connection was not lost because of a network failure.
This option is supported only for Simplified Remote Restart capable partitions.
CLI :
rrstartlpar -o validate -m <source system> -t <target system> -p <partition name> | --id <partition id> --noconnection [--ip <target hmc ip/host name>] [-u <target hmc user name>]
rrstartlpar -o restart -m <source system> -t <target system> -p <partition name> | --id <partition id> --noconnection [--ip <target hmc ip/host name>] [-u <target hmc user name>]
Rest API:
https://<<HMCIP>>:12443/rest/api/uom/ManagedSystem/<ManagedSystem_UUID>/LogicalPartition/<<PARTITION_UUID>>/do/RemoteRestart
Source system partition   Target system   LPM Override   LPM Success   Partition SRR capable
SRR Capable               supports SRR                                 on target
No                        No              Yes            No            NA
Partition Templates :
This function enables support for creating Simplified RR capable partitions
using partition templates. Enhancements include
support SRR
User can specify the Shared Processor Pool/Id to be used on the target system
and also the Virtual Fiber Channel Mappings like target vios lpar, target server
adapter slot number, target FC physical Port. Appropriate error/warning
messages would be provided if the user specifications cannot be honored on a
remote restart.
CLI :
rrstartlpar -o validate -m <source system> -t <target system> -p <partition name> | --id <partition id> -i "shared_proc_pool_id=<target spp id>|shared_proc_pool_name=<target spp name>" [--ip <IP Address>] [-u <user id>]
rrstartlpar -o restart -m <source system> -t <target system> -p <partition name> | --id <partition id> -i "shared_proc_pool_id=<target spp id>|shared_proc_pool_name=<target spp name>" [--ip <IP Address>] [-u <user id>]
rrstartlpar -o validate -m <source system> -t <target system> -p <partition name> | --id <partition id> [--ip <IP Address>] [-u <user id>] -i|-f "virtual_fc_mappings=slot_num/vios_lpar_name/vios_lpar_id/[vios_slot_num]/[vios_fc_port_name]"
rrstartlpar -o restart -m <source system> -t <target system> -p <partition name> | --id <partition id> [--ip <IP Address>] [-u <user id>] -i|-f "virtual_fc_mappings=slot_num/vios_lpar_name/vios_lpar_id/[vios_slot_num]/[vios_fc_port_name]"
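The override strings passed with -i follow a fixed layout, so they can be composed by small helpers. The Python sketch below is one possible way to do that. The function names are hypothetical, and the handling of omitted optional fields (left empty, with the slashes kept) is an assumption drawn from the bracket notation in the syntax above, not a confirmed HMC rule.

```python
from typing import Optional

def shared_proc_pool_override(pool_id: Optional[int] = None,
                              pool_name: Optional[str] = None) -> str:
    """Return the shared processor pool override; id and name are exclusive."""
    if (pool_id is None) == (pool_name is None):
        raise ValueError("give exactly one of pool_id or pool_name")
    if pool_id is not None:
        return f"shared_proc_pool_id={pool_id}"
    return f"shared_proc_pool_name={pool_name}"

def virtual_fc_mapping(slot_num: int, vios_lpar_name: str, vios_lpar_id: int,
                       vios_slot_num: Optional[int] = None,
                       vios_fc_port_name: Optional[str] = None) -> str:
    """Return one virtual_fc_mappings entry in slash-separated form."""
    fields = [slot_num, vios_lpar_name, vios_lpar_id,
              vios_slot_num, vios_fc_port_name]
    # Assumption: an omitted optional field becomes an empty position.
    joined = "/".join("" if f is None else str(f) for f in fields)
    return "virtual_fc_mappings=" + joined

# Hypothetical VIOS name, slot numbers and port name.
print(shared_proc_pool_override(pool_id=2))
print(virtual_fc_mapping(3, "vios1", 1, 12, "fcs0"))
```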
Rest API:
https://<<HMCIP>>:12443/rest/api/uom/ManagedSystem/<ManagedSystem_UUID>/LogicalPartition/<<PARTITION_UUID>>/do/RemoteRestart
Job Parameter Name    Mandatory    Description
exclusive with
SharedProcPoolID.
7 References
• http://www.redbooks.ibm.com/abstracts/sg248232.html
• http://www.redbooks.ibm.com/redbooks/pdfs/sg248232.pdf
• http://www.redbooks.ibm.com/abstracts/sg247460.html
• http://www.redbooks.ibm.com/redbooks/pdfs/sg247460.pdf
• http://www-01.ibm.com/support/knowledgecenter/POWER8/p8hc3/p8hc3_hmcprepsimpremres.htm?cp=POWER8%2F1-7-1-3-7-1-1-0-8
Blog References
8 Appendix
A remote restartable partition goes through several state changes with respect to
the Remote Restart operation, both on the source and destination servers. Most
remote restart operations are only supported with the partition in the appropriate
remote restart state. A remote restart state is not strongly related to the partition
state itself. It is an indicator specifically associated with the Remote Restart
operation itself.
Note: In this section, the states discussed refer only to a partition's remote restart
state and not any other states that a partition might be in or going through
during its normal or error operations.
A partition configured for remote restart is said to be in the Invalid state until it is
activated. Remote restart is not applicable until a partition has been started for
the first time.
Once the partition is up and running, it transitions into the Remote Restartable
state. A partition in this state can be remote restarted.
During the actual remote restart operation, the source partition is put into the
Source Remote Restarting state and the destination side partition is in the
Destination Remote Restarting state. These states are transitional until the remote
restart operation has completed or been canceled. When the remote restart
operation reaches the no return point on target system, remote restart status
would be set to Destination Remote Restarted.
After the remote restart operation the source side partition is left in the Remote
Restarted state and the destination side partition is now in the Remote Restartable
state. The source side partition could then be cleaned up and the destination
partition is now again ready to be restarted as needed.
If for some reason an update to the persisted information on the HMC fails, the
partition would be in the Local Storage Update Failed state. This state indicates
that the persisted information on the HMC is out of sync with the current
partition configuration. Remote restart is not allowed in this remote restart
state, but the user can use the usecurrdata option to execute a remote restart.
The partition will be restarted with the configuration data on the device, and
the remote restart state on the source system will be updated to Forced Source
Side Restart.
If the partition is suspended on the source system, the remote restart state
would be set to Remote Restartable Suspended. Remote restart is not allowed in
this remote restart state, but the user can use the force option to execute a
remote restart. The partition will be restarted on the target system (and the
suspended state will be lost), and the remote restart state on the source system
will be set to Source Remote Restarting for suspended Partition.
When a system is connected to the HMC and has partitions with simplified remote
restart capability enabled, the HMC automatically collects the configuration
information and persists it. Some configuration information (such as virtual
adapter information) requires an RMC connection to the VIOS partitions, so the
HMC waits until the RMC connection is established to collect it. While the
virtual adapter information has not yet been collected, the remote restart state
is set to Partial Update. If configuration information for the partition already
exists on the HMC in this case, the remote restart state is first set to Stale
Data, before it is changed to Partial Update.
If an update to the persisted information fails because there is not enough space
on HMC's disk to store the configuration information, then the remote restart
state will be updated to Out Of Space. User can free up space on HMC's disk and
run the refdev command to recover from this state.
User can use the lssyscfg command to display the partition's remote restart
status. The possible values are:
• Invalid
• Remote Restartable
• Source Remote Restarting
• Destination Remote Restarting
• Remote Restarted
• Profile Restored
• Forced Source Side Restart
• Source Side Cleanup Failed
• Remote Restartable Suspended
• Local Storage Update Failed
• Stale Data
• Partial Update
• Out Of Space
• Source Remote Restarting for suspended Partition
• Destination Remote Restarted
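The state descriptions in this section imply a small decision table for the states whose handling is spelled out. The Python sketch below encodes only those states; any other value from the list is reported as not covered rather than guessed. The function name and the returned strings are hypothetical.

```python
# Overrides that this section names for states which block a plain restart.
OVERRIDE_FOR_STATE = {
    "Local Storage Update Failed": "usecurrdata",
    "Remote Restartable Suspended": "force",
}

def restart_plan(state: str) -> str:
    """Suggest the next step for a partition's remote restart state."""
    if state == "Remote Restartable":
        return "restart"
    if state in OVERRIDE_FOR_STATE:
        return "restart with " + OVERRIDE_FOR_STATE[state]
    if state == "Invalid":
        # Remote restart does not apply until the partition has been started.
        return "activate the partition first"
    if state == "Out Of Space":
        return "free HMC disk space, then run refdev"
    return "not covered by this sketch"

for s in ("Remote Restartable", "Remote Restartable Suspended", "Stale Data"):
    print(s, "->", restart_plan(s))
```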
name,state,simplified_remote_restart_capable,remote_restart_status
lp01,Not Activated,1,Invalid
lp03,Running,1,Remote Restartable
aix1,Not Activated,1,Invalid
aixes_withdev,Not Activated,1,Local Data InValid
ibmi1,Not Activated,1,Local Storage Update Failed
vios2,Running,0,null
vios1,Running,0,null
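Output in this comma-separated form is easy to post-process. The Python sketch below parses the header and a few of the sample rows above into dictionaries and picks out the partitions that are currently in the Remote Restartable state. The function name is hypothetical, and the sketch assumes every row has the same fields as the header.

```python
import csv
from typing import Dict, List

def parse_lssyscfg(header: str, lines: List[str]) -> List[Dict[str, str]]:
    """Turn comma-separated lssyscfg rows into one dict per partition."""
    fields = header.strip().split(",")
    # csv.reader also copes with values that are quoted because they
    # contain a comma.
    rows = csv.reader(line for line in lines if line.strip())
    return [dict(zip(fields, row)) for row in rows]

header = "name,state,simplified_remote_restart_capable,remote_restart_status"
sample_rows = [
    "lp01,Not Activated,1,Invalid",
    "lp03,Running,1,Remote Restartable",
    "ibmi1,Not Activated,1,Local Storage Update Failed",
    "vios1,Running,0,null",
]
partitions = parse_lssyscfg(header, sample_rows)
restartable = [p["name"] for p in partitions
               if p["remote_restart_status"] == "Remote Restartable"]
print(restartable)
```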
Notes:
HMC V8 R8.2.0 or later supports both the old and the simplified remote restart.
Partitions with the old remote restart capability can be remote restarted between
a system with FW820.00 or later and a Power 7 system (FW760.00 or later).
VIOS level 2.2.3.4 must be used to use NPIV with Simplified Remote Restart.
rrMonitor is an example script that shows how a system can be monitored for
failure and how remote restart can be triggered automatically. The script is not
officially supported. It is available at the following location:
https://www.ibm.com/developerworks/community/groups/service/html/communityview?communityUuid=0196fd8d-7287-4dff-8526-102b5bcf0df5#fullpageWidgetId=W395818bd593b_487f_a7ec_79c3c27093f8&file=1bb4f57c-f244-4208-87dd-3a780847c613
Below are some notes on the example script:
1. Monitors cec state in time interval specified by user and triggers remote restart
if the system goes to "Error" or "Error - Dump in progress" state
2. Cleanup is not performed by the script
3. While monitoring, performs check for remote restart status and refreshes the
configuration data, if needed.
4. Performs validate for remote restart operation while monitoring in regular
intervals
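The monitoring behaviour described in note 1 can be sketched as a simple polling loop. The Python code below is not the rrMonitor script and does not reproduce its internals; it only illustrates the idea. The callables get_state and trigger_restart are hypothetical stand-ins for whatever queries the system state and issues the remote restart.

```python
import time
from typing import Callable

# The two system states that note 1 names as triggers.
FAILURE_STATES = {"Error", "Error - Dump in progress"}

def monitor(get_state: Callable[[], str],
            trigger_restart: Callable[[], None],
            interval_s: float,
            max_polls: int,
            sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll the system state; trigger a restart on the first failure state.

    Returns True if a restart was triggered, False if max_polls elapsed.
    """
    for _ in range(max_polls):
        if get_state() in FAILURE_STATES:
            trigger_restart()
            return True
        sleep(interval_s)
    return False

# Simulated run: the system reports Operating twice, then Error.
states = iter(["Operating", "Operating", "Error"])
calls = []
triggered = monitor(lambda: next(states), lambda: calls.append("restart"),
                    interval_s=0, max_polls=5, sleep=lambda s: None)
print(triggered, calls)
```

Passing the sleep function in as a parameter keeps the loop testable without real delays; in actual use the default time.sleep would apply.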
End of Document