Data Migration Considerations: A Customer Engineering Residency
Best Practices Planning
Abstract
This white paper outlines the results of a recent JPMorgan Chase EMC Engineering Residency focused on
research and development of JPMorgan Chase data migration options and strategies. This paper describes the
overall EMC Engineering Residency project and the experience JPMorgan Chase had using EMC data
migration technologies in EMC Engineering labs. Data migration methodologies and applicability for
migrating off legacy storage arrays onto multi-tiered Symmetrix DMX systems were evaluated and decision
criteria outlined based on JPMorgan Chase's business objectives and availability requirements.
May 2007
Copyright 2007 EMC Corporation. All rights reserved.
EMC believes the information in this publication is accurate as of its publication date. The information is
subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED AS IS. EMC CORPORATION
MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE
INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable
software license.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com
All other trademarks used herein are the property of their respective owners.
Part Number H2799
Data Migration Considerations: A Customer Engineering Residency
Best Practices Planning
Table of Contents
Executive summary
Introduction
  Audience
Data migration process
  Phase 0: Assessment
  Phase 1: Planning and design
  Phase 2: Change control
  Phase 3: Migration execution
  Phase 4: Post-migration review
  Data migration approaches
    Host-based data migrations
    Array-based data migrations
EMC data migration technologies
  EMC Open Migrator/LM
  EMC Open Replicator
  EMC PowerPath Migration Enabler
Data migration technology comparison
  EMC Open Migrator/LM considerations
  EMC Open Replicator considerations
    Push operations
    Pull operations
  EMC PowerPath Migration Enabler considerations
Decision criteria
JPMorgan Chase technology decision tree
Conclusion
References
Executive summary
Enterprise data migration projects are often complex, time-consuming engagements that require detailed
planning to mitigate the risk of incurring business disruptions. Migration projects are often considered
one-offs because the technologies and methodologies must be reviewed on a migration-by-migration
basis. However, the overall procedures and best practices are similar for the majority of data migration
projects.
Within the multiple JPMorgan Chase data centers worldwide, host-based and disk-based data migrations
are regularly required to consolidate storage, eliminate capacity limitations, address resource performance
constraints, and periodically refresh technology to reflect current standards. A team of JPMorgan Chase
system architects, implementers, and storage administrators, working jointly with EMC Engineering,
engaged in a project to streamline the host-based data migration techniques currently deployed. This project
was referred to as the Commando Residency and was held at EMC Engineering headquarters in
Hopkinton, Mass., in June 2006.
The Symmetrix
EMC Open Migrator/LM
Open Migrator/LM provides online data migration in Microsoft Windows and UNIX environments.
Open Migrator/LM allows volumes to remain online during a migration, increasing application availability
during a process that traditionally requires extensive downtime. An application outage is still required when
switching from the production devices to the target devices.
Open Migrator/LM for UNIX provides an online data migration technology that utilizes host system
resources to migrate data from the source to the target storage arrays.
Open Migrator/LM for Windows operates at the filter-driver level to manage and move Windows data from
a source to a target volume with minimal disruption to the server or applications.
EMC Open Replicator
EMC Open Replicator provides a method for copying data to or from a Symmetrix DMX storage system
to qualified storage arrays. It requires no host resources as it leverages the storage area network (SAN)
infrastructure to provide deployment flexibility and massive scalability. Open Replicator can also be used
to create point-in-time copies to be used for high-speed data mobility, remote vaulting, migration, and
distribution. Copying data from a Symmetrix DMX array to devices on remote storage arrays can be done
fully or incrementally.
EMC PowerPath Migration Enabler
PowerPath Migration Enabler (PPME) is a two-part host- and array-based migration tool that allows data
migration between storage systems while providing a nondisruptive (to the application) cutover to the new
system. PPME does not require preconfigured multipathing with PowerPath or any independent
multipathing in general. PowerPath Migration Enabler works in conjunction with underlying EMC
replication technology such as Open Replicator.
When the data is relocated with PowerPath Migration Enabler, the data on the source device continues to
be accessible to host applications while the migration takes place. This minimizes (or potentially
eliminates) application disruption. The amount of disruption depends on whether data is migrating from
pseudo or native devices and also whether PowerPath is already installed on the system.
Data migration technology comparison
Data migration projects conducted at an enterprise data center often move data between different types of
storage systems. There are many reasons this may be necessary: equipment lease periods expire, equipment
may be decommissioned to install newer technology, or new tiered storage requirements may be
implemented. Data classification has enabled JPMorgan Chase to implement a tiered storage approach,
allowing them to place more critical application data on their faster, more reliable storage hardware.
The following considerations should be reviewed when determining which data migration approach and
tools are appropriate.
EMC Open Migrator/LM considerations
Open Migrator/LM uses host system resources to perform the migration. The source can be any type of
qualified storage array. The migration can be performed online, though an application outage will be
necessary at the end of the data copy.
Because host resources are required and the actual data transfer can be performed with the application
running, taking action to throttle I/O during the migration will lessen any performance impact, especially
during peak periods. The UNIX version is optimized to minimize impact on system I/O performance. It has
a user-tunable migration rate and I/O copy size.
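The effect of a tunable migration rate and I/O copy size can be illustrated with a minimal Python sketch. This is not Open Migrator/LM code: the function name throttled_copy, its parameters, and the pacing strategy are assumptions made purely for illustration of how a copy loop might limit its impact on production I/O.

```python
import io
import time


def throttled_copy(src, dst, copy_size=64 * 1024, max_bytes_per_sec=1024 * 1024,
                   sleep=time.sleep, clock=time.monotonic):
    """Copy src to dst in copy_size chunks while staying under a byte rate.

    Hypothetical illustration only: the names and the pacing logic are
    assumptions, not the behavior of any EMC product.
    """
    if copy_size <= 0 or max_bytes_per_sec <= 0:
        raise ValueError("copy_size and max_bytes_per_sec must be positive")
    start = clock()
    copied = 0
    while True:
        chunk = src.read(copy_size)
        if not chunk:
            break
        dst.write(chunk)
        copied += len(chunk)
        # Earliest time at which this many bytes may have been moved.
        earliest = start + copied / max_bytes_per_sec
        delay = earliest - clock()
        if delay > 0:
            sleep(delay)  # back off so the average rate stays under the cap
    return copied


if __name__ == "__main__":
    source = io.BytesIO(b"x" * (256 * 1024))
    target = io.BytesIO()
    moved = throttled_copy(source, target, copy_size=32 * 1024,
                           max_bytes_per_sec=1024 * 1024)
    print("bytes copied:", moved)
```

A smaller copy size or a lower rate lengthens the migration but leaves more host and I/O capacity for the application, which is the trade-off to weigh during peak periods.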
Because the devices being migrated often contain mission-critical application data, they are usually
configured in a clustered environment. Therefore, data migration technologies and methodologies need to
support clusters. Check the release of Open Migrator/LM, as it may not support VERITAS clusters and
dynamic disks. Automatic failover in a cluster environment should be disabled, because a failover between
servers would move disk resources that could be part of the migration process.
Windows environments using Open Migrator/LM may require a reboot to attach the filter-driver. This is
dependent on the specific version of the Windows operating system and the specific version of Open
Migrator/LM. For all supported UNIX environments, Open Migrator/LM can be installed, operated, and
uninstalled without performing a system reboot.
In general, data migrations transfer data in one direction, from the source to the target storage. There is no
business continuity requirement to move data back to the original source after the successful relocation of
data. If a problem occurs during the data transfer that requires a system reboot, Open Migrator/LM allows
for migrations to persist across system reboots. Because there are generally no requirements to move data
back to the original source, except to back out of the migration, migrating between volumes of different
sizes is permitted. An Open Migrator/LM migration target must be equal to or greater than the source
volume capacity.
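The capacity rule lends itself to a simple pre-migration check. The following Python sketch is an assumption-laden illustration: the function name undersized_targets, the tuple layout, and the device names are hypothetical, and the capacities would in practice come from whatever inventory the migration team maintains.

```python
def undersized_targets(pairs):
    """Return (source, target) names whose target is smaller than its source.

    pairs: iterable of (source_name, source_bytes, target_name, target_bytes).
    Hypothetical helper for planning; not part of any EMC tool.
    """
    return [(src, tgt)
            for src, src_bytes, tgt, tgt_bytes in pairs
            if tgt_bytes < src_bytes]


if __name__ == "__main__":
    planned = [
        ("src01", 100 * 2**30, "tgt01", 100 * 2**30),  # equal size: allowed
        ("src02", 200 * 2**30, "tgt02", 150 * 2**30),  # target too small
        ("src03", 50 * 2**30, "tgt03", 80 * 2**30),    # larger target: allowed
    ]
    for src, tgt in undersized_targets(planned):
        print("target", tgt, "is smaller than source", src)
```

Running such a check during planning surfaces undersized targets before the change window opens.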
Verifying data integrity is critical in determining the success of the data relocation. Open Migrator/LM
supports a compare action for verification of source and target volume synchronization. This increases the
data migration duration but validates the integrity of the data once copied.
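To show why a verification pass adds time but gives confidence, here is a minimal Python sketch of a block-by-block comparison. It is a generic illustration, not the Open Migrator/LM compare action: the function name first_mismatch and its parameters are assumptions, and the sketch assumes buffered binary streams that return short reads only at end of data. Only the source length is compared, because the target may be larger than the source.

```python
import io


def first_mismatch(source, target, length, block_size=1024 * 1024):
    """Compare the first `length` bytes of two binary streams.

    Returns None when they match, otherwise the byte offset of the first
    difference. Hypothetical sketch; not the product's compare action.
    """
    offset = 0
    while offset < length:
        want = min(block_size, length - offset)
        a = source.read(want)
        b = target.read(want)
        if a != b:
            for i, (x, y) in enumerate(zip(a, b)):
                if x != y:
                    return offset + i
            # One stream ended early: they diverge where the shorter block ends.
            return offset + min(len(a), len(b))
        if len(a) < want:
            raise ValueError("both streams ended before `length` bytes were compared")
        offset += want
    return None


if __name__ == "__main__":
    src_data = b"abc" * 1000
    tgt_data = src_data + b"\0" * 500  # a larger target is acceptable
    result = first_mismatch(io.BytesIO(src_data), io.BytesIO(tgt_data),
                            len(src_data), block_size=256)
    print("first mismatch:", result)
```

Every byte of the source is read a second time, which is why the comparison extends the overall migration duration.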
EMC Open Replicator considerations
Open Replicator can be used to move data onto or off of a Symmetrix DMX or CLARiiON storage system.
However, a copy session, which is required for copying data, cannot be created with control and
remote devices on the same Symmetrix system. Open Replicator can also be used to pull data off qualified
third-party arrays. The control DMX system initiates the pull or push copy operation between the control
devices and the remote devices. The remote devices can have different protection types and even different
metaconfigurations. For the purposes of the following discussion, the target device is receiving the copied
data and the source device is supplying the data to be copied. The capacity of the target device must be
equal to or greater than the capacity of the source for the copy operation.
Because Open Replicator supports the remote copy process to non-Symmetrix storage, note that the
Symmetrix API (SYMAPI) does not recognize these subsystems and must use a World Wide Name
as the device identifier. Because Open Replicator uses front-end director resources to perform the data
propagation, it is important to assess the bandwidth required and to confirm that the appropriate throttling
parameter (for example, the pace option) is properly set.
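Assessing the bandwidth required is largely arithmetic. The Python sketch below shows the back-of-envelope calculation; the function names, the example capacity, and the throughput figures are illustrative assumptions (with 1 GB taken as 1024 MB), not measured values from the residency, and they say nothing about how any specific pace setting maps to throughput.

```python
def estimated_copy_hours(capacity_gb, throughput_mb_per_sec):
    """Hours needed to copy capacity_gb at a sustained throughput (MB/s)."""
    if throughput_mb_per_sec <= 0:
        raise ValueError("throughput must be positive")
    seconds = capacity_gb * 1024 / throughput_mb_per_sec
    return seconds / 3600


def required_throughput_mb_per_sec(capacity_gb, window_hours):
    """Sustained throughput (MB/s) needed to copy capacity_gb within a window."""
    if window_hours <= 0:
        raise ValueError("window must be positive")
    return capacity_gb * 1024 / (window_hours * 3600)


if __name__ == "__main__":
    # Hypothetical example: 2 TB to move, 100 MB/s sustained, 8-hour window.
    print("hours at 100 MB/s:", round(estimated_copy_hours(2048, 100), 2))
    print("MB/s for 8 h window:", round(required_throughput_mb_per_sec(2048, 8), 1))
```

Comparing the required throughput with what the front-end directors and SAN can spare for copy traffic indicates how aggressively the copy can run and how long the migration window must be.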
Push operations
During a push copy operation, the data will be copied from the control DMX system and devices to the
remote devices on another system. During both push and pull operations the flow of data is controlled by
the DMX control system. On push copy operations, remote hosts should not access the remote devices until
copying is complete.
Data on the remote device may be corrupted during a copy operation if another host on the SAN has write
access to the remote device. To guarantee that the device cannot change while copying is in progress,
unmount the remote device or mark it as not ready to any other hosts on the SAN.
Accumulated I/O errors between the control device and remote device will cause a copy session to fail if
the copy operation is an online push. A copy session can stall and restart when errors are encountered
during offline push, online pull, and offline pull copy operations.
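The error behavior of the four copy modes can be summarized as a small lookup. The Python sketch below encodes only what the paragraph above states; the enum and function names are hypothetical and do not correspond to any Solutions Enabler interface.

```python
from enum import Enum


class ErrorBehavior(Enum):
    FAIL = "copy session fails"
    STALL_AND_RESTART = "copy session can stall and restart"


def behavior_on_accumulated_errors(direction, online):
    """Map a copy mode to its behavior when I/O errors accumulate.

    direction: "push" or "pull"; online: True for an online operation.
    Hypothetical summary of the text above, not a product API.
    """
    if direction not in ("push", "pull"):
        raise ValueError("direction must be 'push' or 'pull'")
    if direction == "push" and online:
        return ErrorBehavior.FAIL
    return ErrorBehavior.STALL_AND_RESTART


if __name__ == "__main__":
    for direction in ("push", "pull"):
        for online in (True, False):
            mode = ("online" if online else "offline") + " " + direction
            print(mode, "->", behavior_on_accumulated_errors(direction, online).value)
```

The asymmetry matters for planning: an online push needs a rerun procedure in the runbook, whereas the other three modes may recover on their own.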
Pull operations
During a pull copy operation, the data will be copied from the remote devices onto the DMX control
system. On pull operations, the remote devices should be inaccessible to the remote hosts for the duration
of the copy process. To prevent the remote hosts from changing the data, the device should be write-disabled.
Online pull operations can potentially result in the loss of application host updates made during the copy
operation. This can happen because the target devices, which are receiving the copied data, continue
to be updated by the application hosts (this is the implication of an online operation). However,
these updates by the host will not be tracked by the copy operation. There are certain error scenarios that
the pull operation cannot recover from, at which point it will have to be restarted. Any application host
updates made during the failed pull operation will be lost.
EMC PowerPath Migration Enabler considerations
PowerPath Migration Enabler uses Open Replicator as the data propagation mechanism. Therefore, review
the considerations for Open Replicator when using PPME for data migration. At the time of testing, PPME
was supported only in Solaris environments.
Decision criteria
Based on information gathered, the appropriate methodology and technology can be determined. Table 1
lists the decision criteria for the various migration methods.
Table 1. Decision criteria for choosing a migration method
Migration method: Traditional backup and restore
  When to use: Any supported host type; minimal amount of data to be migrated; extended migration window (hours) allowed; application can be offline during migration
  Decision criteria and considerations: Independent of storage system; application must be offline during migration; target may be of different size than source

Migration method: Host Logical Volume Manager tools (VXVM, AIX LVM, and others)
  When to use: Smaller projects (single application/host); source and target volumes can be of different sizes
  Decision criteria and considerations: Potential performance impact; complex methodology; for Windows hosts, consider Open Migrator

Migration method: Operating system copy tools (such as dd)
  When to use: Application can be offline during the entire migration
  Decision criteria and considerations: No recovery in the event of a failure during copy

Migration method: Symmetrix Remote Data Facility (SRDF)
  When to use: Source and target are both Symmetrix; full volume replication; data center move; need to move entire frame
  Decision criteria and considerations: SRDF typically requires that the source and target arrays be no more than one generation apart; bin file change required

Migration method: Open Replicator
  When to use: Symmetrix to/from non-Symmetrix; need to replicate full volumes; independent of host operating system; need target LUNs to be larger than source; environments where the impact of the copy workload on the overall SAN must be minimized
  Decision criteria and considerations: Only Symmetrix control volume can be active during migration; can set throttle level for replication; no bin file change required; methodology can be scripted

Migration method: Open Migrator
  When to use: Need to move to logical volumes of different structures (that is, raw devices to VERITAS volumes); to move data to or from non-EMC storage
  Decision criteria and considerations: Host-based replication technology; logical volume level or drive level; no CLI for Windows

Migration method: PowerPath Migration Enabler (PPME)
  When to use: When application must be continuously available and/or switchover time is small (that is, the time it takes to perform a reboot)
  Decision criteria and considerations: Limited host support; currently uses Open Replicator as underlying replication technology; no dependency on PowerPath multipathing
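A few of the criteria in Table 1 can be expressed as a simple filter. The Python sketch below is a deliberately simplified reading of the table and the preceding sections, not the JPMorgan Chase decision logic: the function name, its parameters, and the reduction to four inputs are assumptions. It does not model host support for Open Migrator/LM, SAN impact, distance, or the number of hosts.

```python
def candidate_methods(source_is_symmetrix, target_is_symmetrix, host_os,
                      cutover_downtime_ok):
    """Return candidate migration technologies for one host (simplified).

    Hypothetical sketch of selected Table 1 criteria; an assumption-based
    illustration, not an official selection procedure.
    """
    # Open Replicator copies to or from a Symmetrix control array.
    open_replicator_possible = source_is_symmetrix or target_is_symmetrix

    if not cutover_downtime_ok:
        # PPME relied on Open Replicator and, at the time of testing,
        # was supported only in Solaris environments.
        if open_replicator_possible and host_os.lower() == "solaris":
            return ["PowerPath Migration Enabler"]
        return []

    candidates = []
    if source_is_symmetrix and target_is_symmetrix:
        candidates.append("SRDF")  # SRDF needs Symmetrix at both ends
    if open_replicator_possible:
        candidates.append("Open Replicator")
    candidates.append("Open Migrator/LM")  # host-based, any qualified array
    return candidates


if __name__ == "__main__":
    print(candidate_methods(False, True, "Windows", True))
    print(candidate_methods(False, True, "Solaris", False))
```

An empty result signals that none of the modeled technologies fits and that the requirements, for example the tolerance for cutover downtime, need to be revisited.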
JPMorgan Chase technology decision tree
Figure 1 details the JPMorgan Chase decision tree used in choosing a migration technology.
[Figure 1: decision tree diagram. Decision nodes: Host type (UNIX (not HPUX)/W2K or Novell/HPUX); Long-distance migration; SAN impact a factor?; Can application tolerate downtime during cutover?; Host impact a concern?; Volume changes?; Large number of hosts? Outcomes: SRDF; Open Replicator (with throttling); Open Migrator; PPME.]
Figure 1. Technology selection decision tree
Conclusion
The processes described in this white paper are generic, allowing them to be applied to many different data
migration projects. These processes benefit enterprise data centers by increasing reliability, reducing
administration costs, and maximizing application availability by minimizing the disruption normally
inherent in data migration. The overall procedures and best practices described are applicable to the
majority of data migrations within JPMorgan Chase.
References
The following list includes reference materials used during the JPMorgan Chase Commando Residency:
EMC Open Migrator/LM for UNIX and Linux CLI Product Guide
EMC Open Migrator/LM for Windows Product Guide
EMC Solutions Enabler Symmetrix Open Replicator CLI Product Guide