Abstract
This white paper is a comprehensive guide for MirrorView
functionality, operations, and best practices. It discusses the
specifics of the synchronous (MirrorView/S) and asynchronous
(MirrorView/A) products, and compares them to help users
determine how each is best deployed.
March 2015
Table of Contents
Executive Summary
Introduction
  Audience
  Terminology
MirrorView/Synchronous
  Data protection mechanisms
    Fracture log
    Write intent log
    Fracture log persistence
  Synchronization
    Thin LUN secondary images
  Link requirements
MirrorView/Asynchronous
Executive Summary
copy, such as a backup, and applying changes to bring the application to its state
before the disaster.
RecoverPoint
RecoverPoint (including RecoverPoint/SE) is an appliance-based DR solution that
supports local and remote replication. A RecoverPoint/SE license is also included in
the Remote Protection Suite, Total Protection Pack, and Total Efficiency Pack.
RecoverPoint performs write journaling. As a result, several granular points of
recovery are available. RecoverPoint supports local replication with full binary copies.
It also supports remote replication with zero data loss synchronous and
asynchronous options. Consider RecoverPoint when you need a solution that:
Provides very granular recovery points. Updates are made every few seconds
as opposed to minutes.
Supports heterogeneous and/or non-EMC storage systems. (RecoverPoint/SE
supports the platform with which it is sold. For example, if it is purchased with
a VNX, it supports local and remote replication with VNX platforms.)
Can switch between synchronous and asynchronous replication on the fly.
Supports a recovery point objective (RPO) of seconds.
Replication modes
Data protection
Consistency across volumes
Space reclamation (thick to thin replication)
MirrorView
RecoverPoint
SAN Copy
Replication between VNX and/or CLARiiON primary and secondary systems
Storage system-based
Replication between VNX, CLARiiON, Symmetrix, and non-EMC systems
Appliance-based (Splitter software supported on VNX or CLARiiON SP)
Replication mode at the LUN level can be dynamically switched between sync and async.
1:4
Replication between VNX, CLARiiON, Symmetrix, and non-EMC systems
Storage system-based
1 primary to 1 secondary async; 2 secondaries sync (1:1 with consistency groups)
Access to secondary volumes controlled by MirrorView. SnapView can be used to access a replica of a secondary image.
Native consistency group support
1 primary to 4 secondary async
1 source copied to up to 100 targets
Access to secondary volumes controlled by RecoverPoint
Remote copy is available for server access. SnapView is recommended for data access.
Native consistency group support
Supported
Consistency managed by the user (for example: hot backup mode) or another application (for example: Replication Manager)
Supported in remote pull configuration; ideal for space reclamation during migration
Full copy or incremental copy
1:100
Introduction
This white paper provides an overview of MirrorView software and discusses key
software features, functions, and best practices. Refer to the References section of
this white paper for related information, including help files, administrator guides,
and white papers.
Audience
Terminology
Terminology, operations, and object statuses are the same for both MirrorView
products. The following is a list of frequently used terms and conditions. A more
comprehensive glossary is available in the product documentation.
Primary image: The LUN that contains production data and the contents of
which are replicated to the secondary image. Also referred to as the primary.
Secondary image: A LUN that contains a mirror of the primary image LUN.
Also referred to as the secondary. This LUN must reside on a different storage
system than the primary image.
State: Remote mirror states and image states. The remote mirror states are
Active and Attention. The image states are Synchronized, Consistent,
Synchronizing, and Out-of-Sync.
Consistency group: A set of mirrors that are managed as a single entity and
whose secondary images always remain in a consistent and recoverable state
with respect to their primary image and each other.
A complete list of mirror and image states and conditions is provided in Appendix A: Remote Mirror Conditions and Image
States.
Recovery objectives
Recovery objectives are service levels that must be met to minimize the loss of
information and revenue in the event of a disaster. The criticality of business
applications and information defines the recovery objectives. The terms commonly
used to define the recovery objectives are Recovery Point Objective (RPO) and
Recovery Time Objective (RTO).
Recovery Point Objective
Recovery Point Objective (RPO) defines the amount of acceptable data loss in the
event of a disaster. Typically, the business requirement for RPO is expressed as
a duration of time. For instance, Application A may have zero tolerance for loss of data
in the event of a disaster. This is typical for financial applications where all completed
transactions must be recovered. Application B may be able to sustain the loss of
minutes' or hours' worth of data. RPO determines the required update frequency of the
remote site. The rate of change of the information determines how much data needs
to be transferred. This, combined with the RPO, has a significant impact on the
distance, protocol, and bandwidth of the link between remote sites.
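The interaction between update frequency, change rate, and link bandwidth can be illustrated with a rough calculation. The following Python sketch is a simplification and not a sizing method from this paper. It assumes a periodic-update scheme in which the data at the remote site can be as old as one update interval plus the time needed to transfer the changes. The function names and sample figures are hypothetical, and protocol overhead and compression are ignored.

```python
def transfer_seconds(changed_mb: float, link_mbit_per_s: float) -> float:
    """Time to move the changed data across the link (1 MB = 8 megabits)."""
    return changed_mb * 8 / link_mbit_per_s


def worst_case_exposure_seconds(update_interval_s: float,
                                changed_mb: float,
                                link_mbit_per_s: float) -> float:
    """Assumed worst-case age of remote data: one interval plus one transfer."""
    return update_interval_s + transfer_seconds(changed_mb, link_mbit_per_s)


def meets_rpo(rpo_s: float, update_interval_s: float,
              changed_mb: float, link_mbit_per_s: float) -> bool:
    """True if the assumed worst-case exposure stays within the RPO."""
    exposure = worst_case_exposure_seconds(update_interval_s, changed_mb,
                                           link_mbit_per_s)
    return exposure <= rpo_s


# Example: 4,500 MB changes every 15 minutes over a 100 Mb/s link.
print(transfer_seconds(4500, 100))              # 360.0 seconds per update
print(meets_rpo(1800, 900, 4500, 100))          # True for a 30-minute RPO
```

Under these assumptions, a tighter RPO can be met only by shortening the update interval, reducing the amount of changed data per update, or increasing link bandwidth.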
Recovery Time Objective
Recovery Time Objective (RTO) is defined as the amount of time required to bring the
business application back online after a disaster occurs. Critical applications may be
required to be back online in seconds, without any noticeable impact to the end
users. For other applications, a delay of a few minutes or hours may be tolerable.
Figure 1 shows an RPO and RTO timeline. Stringent RPO and RTO requirements can
add cost to a DR solution. Therefore, it is important to distinguish between absolutely
critical business applications and all other applications. Every business application
may have different values for RPO and RTO, based on the criticality of the application.
Replication models
There are a number of solutions available to replicate data from the primary
(production) to the secondary (remote) site. These replication solutions can be
broadly categorized as synchronous and asynchronous.
Synchronous replication model
In a synchronous replication model, each server write on the primary is written
concurrently to the secondary site. The primary benefit of this model is that its RPO is
zero, because the transfer of each I/O to the secondary occurs before the
acknowledgement is sent to the server. Figure 2 depicts the data flow of MirrorView
synchronous replication.
1. Server attached to the primary VNX system initiates a write to the system.
2. The primary VNX system replicates the data to the secondary VNX system.
3. The primary VNX system waits for the acknowledgement from the secondary VNX
system.
4. Once the primary VNX system receives the acknowledgement, it sends an
acknowledgement back to the server.
In the event of a disaster at the primary site, data at the secondary site is exactly the
same as data at the primary site at the time of disaster.
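The four steps above imply that the server's write response time includes the link round trip. The following Python sketch models this with a simple serial sum. The class name, field names, and millisecond values are hypothetical, and the model ignores any overlap between the local write and the remote transfer.

```python
from dataclasses import dataclass


@dataclass
class SyncWritePath:
    primary_service_ms: float    # time for the primary to accept the write
    link_one_way_ms: float       # one-way latency of the replication link
    secondary_service_ms: float  # time for the secondary to accept the write

    def server_response_ms(self) -> float:
        # Step 2 sends the data across the link, step 3 waits for the
        # acknowledgement to return, and only then (step 4) does the
        # primary acknowledge the server.
        round_trip_ms = 2 * self.link_one_way_ms
        return (self.primary_service_ms
                + round_trip_ms
                + self.secondary_service_ms)


short_link = SyncWritePath(0.5, 1.0, 0.5)
long_link = SyncWritePath(0.5, 10.0, 0.5)
print(short_link.server_response_ms())  # 3.0
print(long_link.server_response_ms())   # 21.0
```

In this model, every increase in link latency is added twice to each server write, which is why distance matters for synchronous replication.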
tolerance for latency. Secondly, the link between the primary and remote systems
must handle peak workload bandwidths, which can add cost to the link.
Asynchronous replication model
There are a few general approaches to asynchronous replication, and each
implementation may vary. At the highest level, all asynchronous replication models
decouple the remote replication of the I/O from the acknowledgement to the server.
The primary benefit is that it allows longer distance replication, because application
write response time is not dependent on the latency of the link. The trade-off is that
the RPO will be greater than zero.
In classic asynchronous models, writes are sent to the remote system as they are
received from the server. Acknowledgement to the server is not held for a response
from the secondary, so if writes are coming into the primary faster than they can be
sent to the remote system, multiple I/Os may be queued on the source system to
await transfer. Figure 3 depicts the data flow of MirrorView asynchronous replication.
1. Server attached to the primary VNX system initiates a write to the system.
2. The primary VNX system sends an acknowledgement to the server.
3. The primary VNX system tracks the changes and replicates the data to the
secondary VNX system at a user-defined frequency.
4. Once the secondary VNX system receives the data it sends an acknowledgement
back to the primary VNX system.
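The steps above can be sketched as a small in-memory model. This is an illustrative Python sketch, not MirrorView's implementation: the class and method names are hypothetical, blocks are represented as dictionary keys, and the user-defined update frequency is represented by explicit calls to `update()`.

```python
class AsyncMirrorSketch:
    """Toy model: acknowledge writes at once, replicate tracked changes later."""

    def __init__(self):
        self.primary = {}     # block number -> data on the primary
        self.secondary = {}   # block number -> data on the secondary
        self.changed = set()  # blocks modified since the last update

    def write(self, block: int, data: str) -> str:
        # Steps 1 and 2: the write lands on the primary and the server is
        # acknowledged without waiting for the secondary.
        self.primary[block] = data
        self.changed.add(block)
        return "ack"

    def update(self) -> int:
        # Steps 3 and 4: transfer the tracked changes, then clear the tracking.
        sent = len(self.changed)
        for block in self.changed:
            self.secondary[block] = self.primary[block]
        self.changed.clear()
        return sent


mirror = AsyncMirrorSketch()
mirror.write(0, "a")
mirror.write(0, "b")
mirror.write(1, "c")
print(mirror.secondary)  # {} until an update runs, so the RPO is greater than zero
print(mirror.update())   # 2 blocks transferred
```

Until `update()` runs, the secondary lags the primary, which corresponds to the non-zero RPO of the asynchronous model.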
MirrorView Family
Both products run on all available storage system platforms, including the VNX2
series, VNX1 series, CX4, CX3, and CX series, and the AX4-5 FC. MirrorView is not
available on the VNXe series, CX300, or the AX4-5 iSCSI models. For maximum
protection of information, both products have advanced features, including:
Bidirectional mirroring. Any storage system can host primary and secondary
images as long as the primary and secondary images within any mirror reside
on different storage systems.
The ability to have synchronous and asynchronous mirrors on the same
storage system.
Consistency groups for maintaining data consistency across write-order
dependent volumes.
Mirroring is supported between the VNX2, VNX1, and CLARiiON storage system
models. Refer to Table 6 for software interoperability.
The ability to mirror over FC and iSCSI Front-End (FE) ports. VNX, CX4 and CX3
series storage systems with FC and iSCSI FE ports can mirror over FC ports to a
group of storage systems, while mirroring over iSCSI FE ports to other storage
systems.
MirrorView configurations
considerations should be well thought out in case either site must run primary and
secondary applications during an outage.
If connectivity exists over the preferred link, the wizard performs all setup and
configuration steps. The MirrorView Wizard is discussed in detail in the MirrorView
management section of this paper. MirrorView ports are discussed in the MirrorView
ports and connections section. Additional information for advanced users is available
in the iSCSI connections section.
Although VNXe storage systems support iSCSI, replication cannot be achieved using
MirrorView between a VNX and VNXe system. With the use of Replication Manager,
the native remote replication capability of VNXe systems can be used to replicate
iSCSI LUNs and file shares between other VNXe systems or between VNXe and VNX
series systems. For more information refer to the Technical Note, Replicating VNXe
iSCSI to VNX using Replication Manager, on the EMC Support page.
MirrorView operates over discrete paths between SPs of the primary and secondary
systems. All MirrorView traffic goes through one port of each connection type (FC
and/or iSCSI) per SP. For systems that have only FC ports, MirrorView traffic goes
through one FC port of each SP. For FC/iSCSI systems, one FC port and one iSCSI port
are available for MirrorView traffic. A path must exist between the MirrorView ports of
SP A of the primary and SP A of the secondary system. The same relationship must be
established for SP B.
The MirrorView ports are automatically assigned when the system is initialized.
Systems shipped from the factory with FC and/or iSCSI I/O modules will arrive with
their MirrorView ports persisted. MirrorView port designations do not change, even if
additional I/O modules are added at a later time. For systems that add the first FC or
iSCSI I/O module after the initial configuration, the MirrorView port for that protocol is
assigned as the new I/O module is persisted. For instance, when adding the first
iSCSI I/O module to a system that has only FC connectivity, the MirrorView port is
assigned automatically as the iSCSI I/O module is persisted.
VNX series systems assign the lowest logical port number of each supported type (FC
or iSCSI) for use by MirrorView.
For the VNX2 series (Unified and Block-only VNX5200, VNX5400, VNX5600, VNX5800,
VNX7600, and VNX8000), the MirrorView port is the lowest logical port of each
supported type, FC or iSCSI (10G and 1G). That is, port 0 of the first FC I/O
module and port 0 of the first iSCSI I/O module are automatically assigned as the
MirrorView ports.
For the VNX5100, VNX5300, and VNX5500, logical ports SP A 0 and SP B 0 are the
FC MirrorView ports, which correspond to physical port 2 of the embedded mezzanine
ports. The iSCSI MirrorView port is assigned once an iSCSI I/O module is persisted
in I/O module slot 0 or 1.
Connectivity is highly customizable for VNX5700 and VNX7500 systems. Out of a total
of five I/O module slots, up to four can be available for protocol assignment by the
user. There is a minimum of one SAS I/O module in slot 0 that is used for back-end
SAS connectivity to disk enclosures. As the first FC I/O module or iSCSI I/O module is
persisted, the MirrorView port is assigned as the lowest logical port number of each
type. The physical location depends on the slot where the I/O module is installed.
The CX4 series also offers flexibility in the number and type of FE ports on each
system. The MirrorView port is assigned to the highest-numbered port of each type.
As with VNX systems, MirrorView ports are designated automatically based on the initial
port configuration. The MirrorView ports will not change, even if additional ports are
added.
For example, the base configuration for a CX4-960 is four FC and two iSCSI Front-End
(FE) ports per SP. This represents ports 0-3 for FC and ports 4-5 for iSCSI. If the
customer orders the base configuration, the FC MirrorView port is port 3, and the
iSCSI MirrorView port is port 5. If an additional four FC ports and/or an additional two
iSCSI ports are added to the system, the MirrorView ports remain the same: port 3
and port 5. Table 2 lists the base FE port configurations for the CX4 series and the
MirrorView ports.
Table 2 - CX4 base configuration front-end port numbering and MirrorView ports
Model   | FC Ports | iSCSI Ports | FC MV Port | iSCSI MV Port
CX4-960 | 0-3      | 4-5         | 3          | 5
CX4-480 | 0-3      | 4-5         | 3          | 5
CX4-240 | 0-1      | 2-3         | 1          | 3
CX4-120 | 0-1      | 2-3         | 1          | 3
If a CX4 system is ordered from the factory with a configuration other than the base
configuration, then the MirrorView port assignment is based on that specific
configuration. For example, if a CX4-960 is ordered with eight FC FE ports (ports 0-7)
and two iSCSI FE ports (ports 8-9), then the MirrorView ports will be port 7 for FC and
port 9 for iSCSI. These MirrorView port assignments will not change, even if additional
ports are added to the system at a later time.
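The assignment rules described above (lowest logical port of each type on VNX, highest-numbered port of each type on CX4, both fixed at the initial configuration) can be expressed as a short Python sketch. The function name and the `family` labels are hypothetical, and the sketch covers only the general rules stated in the text, not model-specific exceptions.

```python
from typing import Iterable, Optional


def mirrorview_port(initial_ports: Iterable[int], family: str) -> Optional[int]:
    """Pick the MirrorView port for one port type (FC or iSCSI).

    initial_ports: logical port numbers of that type present when the
    port configuration was first established. Ports added later are
    deliberately not considered, because the assignment does not change.
    """
    ports = list(initial_ports)
    if not ports:
        return None  # no port of this type yet, so nothing is assigned
    if family == "VNX":
        return min(ports)
    if family == "CX4":
        return max(ports)
    raise ValueError(f"unknown family: {family!r}")


# CX4-960 base configuration: FC ports 0-3, iSCSI ports 4-5.
print(mirrorview_port(range(0, 4), "CX4"), mirrorview_port(range(4, 6), "CX4"))
# CX4-960 ordered with eight FC ports (0-7) and two iSCSI ports (8-9).
print(mirrorview_port(range(0, 8), "CX4"), mirrorview_port(range(8, 10), "CX4"))
```

The two calls reproduce the examples in the text: ports 3 and 5 for the base configuration, and ports 7 and 9 for the larger factory configuration.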
The VNX and CX4 series offer several FE connectivity options. A 4-port I/O module is
available for 8 Gb/s FC connectivity. 4-port 1 Gb/s and 2-port 10 Gb/s I/O modules
are available for iSCSI connectivity. A 2-port FCoE module is also available for host
connectivity, but is not supported for use by MirrorView.
I/O modules in CX4 series systems can be upgraded to newer and faster I/O modules.
4 Gb/s FC I/O modules (not offered for VNX series) can be upgraded to 8 Gb/s
modules. Also, 1 Gb/s iSCSI I/O modules can be upgraded to 10 Gb/s I/O modules.
This allows users to upgrade connectivity options for MirrorView.
For the CX3 series, CX series, and AX4-5 FC, MirrorView operates over one
predetermined FC port and/or one predetermined iSCSI port per Storage Processor
(SP). The MirrorView port assignments are always the same for these systems (Table
3).
Table 3 - Front-end port numbering and MirrorView port by system type
Model
FC MV Port
iSCSI MV Port
CX3-80, CX700, CX600 FC
N/A
1
5
3
1
N/A
3
1
N/A
*The MirrorView port (port 1) is the same across all CX3-20 and CX3-40 FC-only models.
You can use Unisphere to view MirrorView ports. For Release 33 (Unisphere 1.3), you
can see the MirrorView ports in the <Storage System Name> Settings > Network >
Settings for Block view. For Release 31/32 (Unisphere 1.1 and 1.2), you can also see
MirrorView ports in the <Storage System Name> Settings > Network Settings for Block
view. In Release 30 (Unisphere 1.0), you can see MirrorView ports in the System >
Hardware view.
The MirrorView port is displayed with a unique icon, as shown in Figure 8. In this
case, the system is a VNX7600. Here, the FC MirrorView port is Port 0 of Slot 1 of each
SP, which is the first FC I/O module, and the iSCSI MirrorView port is Port 0 of Slot 3,
which is the first iSCSI I/O module.
shown in Figure 8. (The display of the output has been manipulated to better fit the
figure, but the values are unchanged.)
iSCSI connectivity between them. These systems are used in several figures and
examples throughout this paper.
second two entries show that there are also iSCSI connections between the iSCSI
MirrorView ports. Ports A4/B4 of Site 2 are logged into ports A4/B4 of Site 1.
Site_2 is zoned in an FC SAN and has iSCSI connections established. The information
in the Status column shows that the MirrorView connection is enabled through iSCSI.
The following sections outline aspects of common operations for MirrorView/S and
MirrorView/A. More detailed information for either product is included in the product-specific sections.
Synchronization
Synchronization is an operation MirrorView performs to copy the contents of the
primary image to the secondary image. This is necessary to establish newly created
mirrors or to re-establish existing mirrors after an interruption.
Initial synchronization is used for new mirrors to create a baseline copy of the primary
image onto the secondary image. In almost all cases, when a secondary image is
added to a mirror, initial synchronization is a requirement. In the special case where
the primary and secondary mirror LUNs are newly created and have not been assigned
to a server, the user can deselect the initial sync required option, which is selected by
default. In this case, the MirrorView software considers the primary and secondary
mirror to be synchronized. Use this option with caution and under restricted
circumstances. If there is any doubt about whether the primary LUN has been modified,
use the default setting of initial sync required.
Primary images remain online during the initial synchronization process. During the
synchronization, secondary images are in a synchronizing state. Until the initial
synchronization is complete, the secondary image is not in a usable state. If the
synchronization is interrupted, the secondary image enters the out-of-sync
state, indicating that the secondary data is not in a consistent state.
After the initial synchronization, MirrorView/S and MirrorView/A have different
synchronization behaviors due to the nature of their replication models. MirrorView/S
synchronizes only after an interruption, because under normal conditions all writes
are sent to the secondary as they occur. After the initial synchronization,
MirrorView/A images do not return to the synchronizing state under normal operating
conditions. Secondary images are in the updating condition while incremental
updates are in progress.
Synchronization rate
Synchronization rate is a mirror property that sets the relative value (Low, Medium, or
High) for the priority of completing updates. The setting affects initial
synchronizations, MirrorView/S resynchronizations, and MirrorView/A updates.
Medium is the default setting, but it can be changed at any time.
Low and Medium settings have low impact on storage system resources such as CPU
and disk drives, but may take longer to complete a transfer. Synchronizations at the
high setting will complete faster, but may affect performance of other operations such
as server I/O or other storage system software.
Promoting a secondary image
A secondary image is promoted to the role of primary when it is necessary to run
production applications at the DR site. This may be in response to an actual disaster
at the primary site, part of a migration strategy, or simply for testing purposes. A
secondary image can be promoted if it is in the consistent or synchronized state. The
definitions for consistent and synchronized image states are:
Normal promote: The storage system executes a normal promote when the
promote command is issued to a secondary image in the synchronized state
and an active link exists between the primary and secondary images. Images
in the synchronized state are exact, block-for-block copies of the primary
image. When promoted, the secondary image becomes the primary image for
the new mirror and the original primary becomes the secondary image. The old
mirror is destroyed. The user is left with a mirror of the same name, whose
images have the opposite roles from those in the original mirror. I/O can then
be directed to the new primary image, and no synchronization is required.
Force promote: Force promote is used when a normal promote fails. A normal
promote may fail for a variety of reasons, including that the secondary image is
in the consistent state, the mirror is fractured, and/or there is no active link
between images. Active primary images often have corresponding secondary
images in the consistent state. Images in the consistent state are in a
recoverable/usable state but are not exact copies of the primary. Therefore, if
the original primary is placed in the new mirror as a secondary, a full
synchronization is necessary.
If there is an active link, force promote places the original primary in the new
mirror as a secondary image. The original mirror is destroyed, so there is only one
mirror with the original mirror name. The user would then issue a synchronize
command to start the full synchronization.
If the secondary image state is synchronized or consistent when the promote is
issued, but the link between them is not active, force promote does not place the
original primary as a secondary of the new mirror. Instead, the original secondary
is placed in a new mirror as a primary, but with no secondary image. The original
primary remains in the original mirror, but there is no secondary image. The
mirrors have the same name, but different mirror IDs. Both primaries remain
available for host access, and when the link is back up, secondary images can be
added to one or both images.
Local only promote: Like force promote, this option promotes the
secondary without adding the original primary as a secondary image. The
original mirror and the new mirror exist with primary images. However,
local only promote also allows this type of promote with an active link. Use
this option to recover changes made since the last update to the original
primary. This option is available for individual MirrorView/A mirrors,
MirrorView/A consistency groups, and MirrorView/S consistency groups. It
is not available for individual MirrorView/S mirrors.
Promoting with MirrorView/S
When promoting with MirrorView/S, if the image state and link conditions are met for
a normal promote, a normal promote is executed. If the conditions for normal
promote are not met, MirrorView/S will execute a force promote.
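The choice between normal and force promote, and the resulting mirror layout, can be summarized in a short Python sketch. It is a simplified reading of the rules above, not product code: the names `PromoteOutcome` and `choose_promote` are hypothetical, fractured mirrors are not modeled separately, and local only promote is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromoteOutcome:
    kind: str                       # "normal" or "force"
    old_primary_is_secondary: bool  # original primary joins the new mirror
    mirrors_after: int              # mirrors that carry the original name
    full_sync_needed: bool          # full synchronization of the old primary


def choose_promote(image_state: str, link_active: bool) -> PromoteOutcome:
    state = image_state.strip().lower()
    if state not in ("synchronized", "consistent"):
        raise ValueError(f"image in state {image_state!r} cannot be promoted")
    if state == "synchronized" and link_active:
        # Roles swap and the images are already identical.
        return PromoteOutcome("normal", True, 1, False)
    if link_active:
        # Force promote with a link: the old primary becomes a secondary
        # and must be fully synchronized.
        return PromoteOutcome("force", True, 1, True)
    # Force promote without a link: two mirrors, each with only a primary,
    # so there is no secondary to synchronize until one is added.
    return PromoteOutcome("force", False, 2, False)


print(choose_promote("Synchronized", True).kind)  # normal
print(choose_promote("Consistent", True).kind)    # force
```

Returning a small immutable record keeps the three outcomes described in the text side by side, which makes the difference between the two force promote cases easier to see.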
MirrorView/Synchronous uses the Quiesce Threshold setting to determine when an
image moves from the consistent state to the synchronized state. The Quiesce
Threshold setting defines the amount of time, in seconds, MirrorView waits before
transitioning the secondary image from the consistent state to the synchronized state
when there is no write activity. It should be set to the amount of time it takes a server
to flush its buffers to disk when write activity is halted. The Quiesce Threshold,
which defaults to 60 seconds, is a property of the mirror and can be changed at any time.
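The effect of the Quiesce Threshold can be sketched as a simple rule. The Python function below is illustrative only: its name and parameters are hypothetical, it assumes the mirror is not fractured and the image is already at least consistent, and it treats reaching the threshold exactly as sufficient for the transition.

```python
def secondary_image_state(seconds_since_last_write: float,
                          quiesce_threshold_s: float = 60) -> str:
    """Return the expected MirrorView/S secondary image state.

    Assumption: with no write activity for at least the Quiesce Threshold,
    the image moves from consistent to synchronized.
    """
    if seconds_since_last_write >= quiesce_threshold_s:
        return "synchronized"
    return "consistent"


print(secondary_image_state(10))      # consistent
print(secondary_image_state(60))      # synchronized
print(secondary_image_state(45, 30))  # synchronized with a 30-second threshold
```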
To perform a controlled failover to the secondary site and fail back without
conducting a full synchronization, you must:
1. Quiesce I/O to the primary image.
2. Wait for the secondary image to become Synchronized.
3. Promote the secondary image.
4. Resume I/O to the new primary image (the previous secondary).
Promoting with MirrorView/A
A secondary image will be in the synchronized state if I/O is stopped and a
subsequent update is started and completed before I/O resumes to the primary
image. This occurs in a controlled failover case, such as a data migration or a DR site
test.
MirrorView/A secondary images are usually in the consistent state when the primary
image is active. A delta between primary and secondary images is expected in an
asynchronous model. Therefore, users should expect to promote a secondary in the
consistent state in the event of a disaster.
The following steps outline how to conduct a normal promote to fail over to the
secondary site and fail back to the primary without requiring a full synchronization:
1. Quiesce I/O to the primary image (an unmount is recommended).
2. Synchronize the group, so that the secondary becomes synchronized.
3. Promote the secondary image.
4. Resume I/O to the new primary (previous secondary) image.
In cases where a MirrorView/A update is in progress at the time of a site failure, the
secondary can still be recovered. This is possible because MirrorView/A maintains a
protective snapshot (gold copy) of the secondary image. When the promote command
is issued to the secondary while an update is in progress, MirrorView/A rolls the
contents of the gold copy to the secondary image, and makes the secondary image
available for access. The rollback only occurs if the image is promoted. In cases
where the primary is recovered and the secondary is not promoted, the update will be
allowed to continue. The gold copy process is discussed in detail in the Data
protection mechanisms section of this paper.
Fracture
A fracture stops MirrorView replication from the primary image to the secondary mirror
image. Administrative (Admin) fractures are initiated by users when they wish to
suspend replication. System fractures are initiated by the MirrorView software when
there is a communication failure between the primary and secondary systems.
For MirrorView/S, this means that writes continue to the primary image, but are not
replicated to the secondary. Replication can resume when users issue the
synchronize command. For MirrorView/A, the current update (if any) stops, and no
further updates start until a synchronize request is issued. The last consistent copy
remains in place on the secondary image if the mirror was updating.
Although not required, an Admin fracture may be used to suspend replication while
conducting preventive maintenance, such as software updates to connectivity
elements and/or storage arrays. It is common to have components restart and/or run
in degraded mode for some portion of the maintenance. During this time, users may
choose to minimize the number of concurrent operations in the environment. After
the maintenance is complete, a synchronization can be conducted to resume steady
state operations.
Recovery policy
MirrorView's recovery policy determines the behavior in recovering from link failures.
The policy is designated when the mirror is created, but can be changed at any time.
There are two settings.
Auto Recovery: Option to have synchronization start as soon as a system-fractured
secondary image is reachable. It does not require human intervention
to resume replication and is the default setting.
MirrorView automatically sets the recovery policy to Manual after a promote is
executed. The policy is set to Manual to give users control over full synchronization
when it is required.
Minimum required images
Minimum required images specifies the number of secondary images that must be
active for the mirror not to enter the Attention state. Possible values are 0, 1, or 2 for
MirrorView/S and 0 or 1 for MirrorView/A. The default value is 0. Mirroring continues
while in the Attention state if there are any active secondary images. Users can set up
alerts through Unisphere's Event Monitor feature (Event Code 0x71050129) to alert
them if a mirror enters the Attention state.
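The rule for entering the Attention state can be written as a small check. The Python sketch below is illustrative: the function and constant names are hypothetical, and it models only the relationship between the minimum required images setting and the number of active secondary images.

```python
# Highest allowed "minimum required images" value per product, per the text.
MAX_MINIMUM_REQUIRED = {"MirrorView/S": 2, "MirrorView/A": 1}


def mirror_state(product: str, minimum_required: int,
                 active_secondaries: int) -> str:
    """Return "Attention" when too few secondary images are active."""
    limit = MAX_MINIMUM_REQUIRED[product]
    if not 0 <= minimum_required <= limit:
        raise ValueError(
            f"minimum required images for {product} must be 0..{limit}")
    if active_secondaries < minimum_required:
        return "Attention"
    return "Active"


print(mirror_state("MirrorView/S", 0, 0))  # Active: the default of 0 never alerts
print(mirror_state("MirrorView/S", 2, 1))  # Attention: one of two images is down
```

With the default value of 0, the mirror never enters the Attention state because of inactive secondary images, so a non-zero value is needed for the alert described above to fire.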
Interoperability
MirrorView can be used with many storage system based replication and migration
products. The following sections address common interoperability questions at a high
level. For each case, there are documents that provide more specific and detailed
implementation guidelines. Those documents should always be consulted when
designing a solution.
Storage-system based replication software
MirrorView can be used with other storage-system-based replication software. For
example, SnapView snapshots and clones are often used for the local replication of
MirrorView images. The tables below summarize storage-system-based replication
software interoperability with MirrorView.
Table 4 - Storage system replication software interoperability
Potential usages by LUN type:
LUN Type | Make a snapshot of it? | Make a clone of it? | Use as source for MirrorView? | Use as source for full SAN Copy? | Use as source for incremental SAN Copy?
LUN, metaLUN, thick LUN, or thin LUN not yet replicated(1,2) | Yes | Yes | Yes (MV/A or MV/S, not both) | Yes | Yes
Clone | Yes | No | No | Yes | Yes
Snapshot | No | No | No | Yes | No
MV Primary | Yes | Yes(3) | No | Yes | Yes
MV Secondary | Yes | Yes(3) | No | No | Yes
1. Includes any LUN used as a destination for either a full or incremental SAN Copy session.
2. Thin LUNs may only be remotely replicated to storage systems running release 29 or later.
3. For AX4-5, you cannot make a clone of a mirror.
Virtual LUNs
Pool LUNs
Virtual LUNs can be deployed as RAID group (classic) LUNs or pool based (thick and
thin) LUNs. RAID-group-based LUNs offer a high degree of control of data placement
and are used for environments that demand highly predictable performance. Pool-based LUNs offer intelligent and automated data placement, auto tiering, and
capacity on demand. Any or all LUN types can coexist in a storage system. Virtual LUN
technology allows each LUN type to be migrated to different physical disks and/or
LUN types, while remaining online for applications. For detailed information on pool-based LUNs, refer to the Virtual Provisioning for the VNX2 Series and EMC CLARiiON
Virtual Provisioning white papers on EMC Online Support.
Pool-based LUNs consist of thick LUNs and thin LUNs. Pools offer automatic
implementation of best practices and automation of basic capacity-allocation tasks.
The consumed capacity of a pool LUN defines the amount of physical capacity that
has been allocated to the LUN. User capacity represents the server-visible capacity of
a pool LUN. Thick LUNs consume their total user capacity plus a minimal amount for
metadata from the pool when they are created. Therefore, thick LUNs are the pool-based equivalent of traditional RAID-group LUNs.
Thick LUNs are intended for applications with higher performance requirements,
while offering the ease-of-management features that pool-based LUNs offer. Thick
LUNs are available in release 30, and are supported by all VNX block-based and
CLARiiON replication products.
Thin LUNs use a capacity-on-demand model for allocating physical disk capacity.
Subscribed capacity is the total user capacity for all LUNs in a pool. Subscribed
capacity can exceed the total physical capacity of the pool when thin LUNs are
deployed. When creating a mirror of a thin LUN, the user capacity is used to
determine which LUNs are eligible for source/destination pairs. When replicating from
a thin-LUN primary to a thin-LUN secondary, only the consumed capacity of the thin
LUN is replicated.
It is possible to replicate a thin LUN to a thick LUN, traditional LUN, or a metaLUN. The
user capacity for a thin LUN must be equal to the user capacity of the target LUN.
When a thin LUN is replicated with MirrorView to other LUN types, the thin LUN
initially retains its thin properties. Zeros are written to the destination for areas of the
thin LUN that have yet to be allocated. However, the thin LUN becomes fully
consumed if a failback performs a full synchronization from the secondary image to
the (thin) primary image.
It is also possible to mirror a classic LUN or metaLUN to a thin LUN. In this case, the
thin LUN becomes fully consumed when a full synchronization is performed from the
primary image to the (thin) secondary image. You can prevent this at the initial
synchronization, as described in the Synchronization section. However, a full sync
may not be avoidable in some failover and failback cases.
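The capacity rules above can be summarized in a short sketch. It assumes capacities expressed in GB and uses hypothetical function names. It models only what the text states: pairing is decided on user capacity, and only a thin-to-thin pair limits a full synchronization to consumed capacity.

```python
def can_pair(primary_user_gb: float, secondary_user_gb: float) -> bool:
    """MirrorView matches images on user capacity, which must be equal."""
    return primary_user_gb == secondary_user_gb


def secondary_written_gb(primary_is_thin: bool, secondary_is_thin: bool,
                         user_gb: float, consumed_gb: float) -> float:
    """Capacity written at the secondary by a full synchronization.

    Thin to thin: only the consumed capacity of the primary is replicated.
    Any other combination: the whole user capacity is written (zeros fill
    unallocated areas of a thin primary, and a thin secondary becomes
    fully consumed).
    """
    if primary_is_thin and secondary_is_thin:
        return consumed_gb
    return user_gb
```

With a 100 GB thin primary that has consumed 25 GB, a thin secondary receives 25 GB, while a thick, classic, or metaLUN secondary is written across all 100 GB.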
MetaLUNs
MetaLUNs have two capacity attributes: total capacity and user capacity. Total
capacity is the maximum capacity of the metaLUN in its current configuration.
Additional capacity can be added to a metaLUN to increase its total capacity. User
capacity is the amount of the total capacity that is presented to the server. Users have
control over how much of the total capacity is presented to the server as user
capacity. When the total capacity is increased, user capacity also can be increased.
MirrorView looks at user capacity to determine which LUNs can be in a mirrored
relationship.
LUN shrink and LUN expansion
LUN shrink and LUN expansion operations are supported on all LUN types. LUN
expansion is performed via metaLUNs for RAID-group-based LUNs. In release 30 and
higher, pool-based LUNs can be expanded on the fly. Any increase in pool LUN
capacity is available almost instantaneously. For thick LUNs, expansion only involves
assigning additional 1 GB slices to the LUN. For thin LUN expansion, the system
simply increases the LUN's user capacity, while consumed capacity is unaffected by
the operation.
When expanding a LUN that is using MirrorView, SnapView, or SAN Copy, user
capacity cannot be increased until the replication software is removed from the LUN.
Since metaLUNs allow for separate control of total capacity and user capacity, the
total capacity can be increased while the replica is in place.
For example, assume that a user wants to stripe-expand a LUN using metaLUNs.
When a LUN is stripe expanded, data is restriped over added disks before the
additional user capacity is available. You can minimize the time the LUN is not being
mirrored by allowing the striping operation to complete before removing the mirror
from the LUN to increase user capacity. Increasing metaLUN user capacity is a nearly
instantaneous process.
Once user capacity is increased, the mirror can be re-created on the LUN. To use the
same secondary image, user capacity has to be increased to the same user capacity
as the primary image.
If I/O continues while the mirror is removed, a full synchronization is required when
the mirror is re-created. A full synchronization can be avoided if I/O can be quiesced
while the mirror is removed and user capacity is increased. The following steps are
recommended to avoid a full synchronization.
1. Quiesce I/O and remove either the host or LUNs from the storage group. You
must make sure that there is no I/O or discovery of the new capacity while
there is no mirror in place.
2. Ensure the secondary image is in the synchronized state as reported by the
primary storage system. For MV/S, wait for the image to become
synchronized once I/O is quiesced. For MirrorView/A, perform an update
after I/O is quiesced.
3. Remove the secondary image and destroy the mirror session.
4. Increase user capacity of the primary and secondary images to the exact
same value.
5. Re-create the mirror and clear the Initial Sync Required option when adding
the secondary image.
6. Add hosts or LUNs back into the storage group for discovery of new capacity
and resume I/O.
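The procedure above can be captured as a checklist for a runbook. This is a sketch under stated assumptions: the function names are hypothetical, the step strings paraphrase the list above, and nothing here issues commands to a storage system.

```python
from typing import List


def expansion_steps(product: str) -> List[str]:
    """Ordered steps for expanding a mirrored LUN while avoiding a full sync."""
    if product == "MV/S":
        sync_step = "Wait for the secondary image to reach the synchronized state"
    elif product == "MV/A":
        sync_step = "Perform an update so the secondary image is synchronized"
    else:
        raise ValueError("product must be 'MV/S' or 'MV/A'")
    return [
        "Quiesce I/O and remove the host or LUNs from the storage group",
        sync_step,
        "Remove the secondary image and destroy the mirror",
        "Increase primary and secondary user capacity to the same value",
        "Re-create the mirror with Initial Sync Required cleared",
        "Add hosts or LUNs back to the storage group and resume I/O",
    ]


def full_sync_avoidable(io_quiesced: bool, secondary_synchronized: bool,
                        primary_user_gb: float, secondary_user_gb: float) -> bool:
    """True only when every precondition named in the steps holds."""
    return (io_quiesced and secondary_synchronized
            and primary_user_gb == secondary_user_gb)
```

The only product-specific difference is step 2: MirrorView/S reaches the synchronized state on its own once I/O stops, whereas MirrorView/A needs an explicit update.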
For more information on LUN expansion with metaLUNs, see the white paper EMC
CLARiiON MetaLUNs: Concepts, Operations, and Management on EMC Support.
LUN migration
LUN migration can be used with MirrorView images under certain conditions. LUN
migration is supported to and from any LUN type. Therefore, if requirements are met, it
is possible to leave existing mirrors in place during the migration.
Table 5 outlines the conditions under which MirrorView LUNs can and cannot be
migrated. For more information on LUN migration, see Unisphere Help.
Table 5 - LUN migration rules for mirror images
LUN Type | Migrate to same size LUN
Primary Image | Can be migrated
Secondary Image | Can be migrated
[Table: primary and secondary storage system release combinations. Releases listed:
R33 (VNX2), R32 (VNX1), R31/R31.5 (VNX1), R30 (CX4), R29 (CX4), R28 (CX4),
R26 (CX3, CX700, CX500), and R23 (AX4-5 FC). Cell values: Preferred, Supported,
Supported (CX3 Only).]
Under normal conditions, when software versions are within one release of one
another, it is not necessary to fracture the mirrors to perform a code upgrade. While
upgrading storage systems containing secondary images, mirrors will system fracture
while the owning SP of the secondary image reboots. When the owning SP of a
primary image reboots, an SP trespass is performed. MirrorView/S will continue
replicating. MirrorView/A updates are automatically held by the system until the
upgrade is complete.
When mirroring a thin LUN to another thin LUN, only consumed capacity is replicated
between storage systems. This is most beneficial for initial synchronizations. For
classic and thick LUNs, all data, from the first logical block address to the last, is
replicated in an initial sync. Since thin LUNs track the location of written data, only
the data itself is sent across the link. No whitespace is replicated. Steady state
replication remains similar to mirroring traditional LUNs, since only new writes are
written from the primary storage system to the secondary system.
MirrorView/Synchronous
MirrorView/S has mechanisms to protect against data loss on the primary and/or secondary
image. The fracture log protects primarily against loss of communication with the
secondary image. The write intent log protects primarily against interruptions to the
primary image. Use of the write intent log is optional. Both of these structures exist to
enable partial synchronizations in the event of interruptions to the primary and
secondary images.
Fracture log
The fracture log is a bitmap held in the memory of the storage processor that owns
the primary image. It indicates which physical areas of the primary have been
updated since communication was interrupted with the secondary.
The fracture log is automatically invoked when the secondary image of a mirror is lost
for any reason and becomes fractured. The mirror is fractured (system fractured) by
MirrorView software if the secondary is not available, or it can be administratively
fractured through Unisphere Manager or CLI. MirrorView/S system fractures an image
if an outstanding I/O to the secondary is not acknowledged within 10 seconds. While
fractured, the primary pings the secondary every 10 seconds to determine if
communication has been restored.
The fracture log tracks changes on the primary image for as long as the secondary
image is unreachable. It is a bitmap that represents areas of the primary image with
regions called extents. The amount of data represented by an extent depends on the
size of the mirror images. Since the fracture log is a bitmap and tracks changed areas
of the primary image, it is not possible to run out of fracture log capacity. It may be
necessary, depending on the length of the outage and the amount of write activity, to
resynchronize the entire volume.
When the secondary LUN returns to service, the secondary image must be
synchronized with the primary. This is accomplished by reading those areas of the
primary addressed by the fracture log and writing them to the secondary image. This
activity occurs in parallel with any writes coming into the primary and mirrored to the
secondary. Bits in the fracture log are cleared once the area of the primary marked by
an extent is copied to the secondary. This ability to perform a partial synchronization
can result in significant time savings.
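The fracture log behavior described above can be illustrated with a small bitmap model. This is a sketch only: the class name, the block-based addressing, and the extent size are assumptions made for the example (the paper states only that extent size depends on the size of the mirror images).

```python
class FractureLog:
    """In-memory bitmap with one bit per extent of the primary image."""

    def __init__(self, lun_blocks: int, extent_blocks: int):
        self.extent_blocks = extent_blocks
        n_extents = -(-lun_blocks // extent_blocks)  # ceiling division
        self.bits = [False] * n_extents

    def record_write(self, lba: int, length: int) -> None:
        """Mark every extent touched by a write made while fractured."""
        first = lba // self.extent_blocks
        last = (lba + length - 1) // self.extent_blocks
        for i in range(first, last + 1):
            self.bits[i] = True

    def dirty_extents(self):
        return [i for i, bit in enumerate(self.bits) if bit]

    def resync(self, copy_extent) -> int:
        """Copy each marked extent to the secondary, clearing bits as it goes."""
        copied = 0
        for i in self.dirty_extents():
            copy_extent(i)        # read from primary, write to secondary
            self.bits[i] = False  # bit is cleared once the extent is copied
            copied += 1
        return copied
```

Because the bitmap has a fixed size, repeated writes to the same extent cost nothing extra, which is why the log cannot run out of capacity. In the worst case every bit is set and the partial synchronization degenerates into a copy of the entire volume.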
By default, the fracture log is stored in memory. Therefore, it would be possible for a
full resynchronization to be required if a secondary image is fractured and an
interruption in service occurs on the primary SP. The write intent log protects against
such scenarios.
Write intent log
The write intent log is a record stored in persistent memory (disk) on the storage
system on which the primary LUN resides. The write intent log consists of two 128MB
LUNs, one assigned to each SP in the storage system. Each LUN services all the
mirrors owned by that SP that have the write intent log enabled. When the write intent
log LUNs are assigned to the SP, they become private LUNs and are under the control
of MirrorView software. Write intent log LUNs must reside on RAID-group-based LUNs.
If all eligible disks in a storage system are deployed in pools, then the write intent log
LUNs may be placed on the system drives.
During normal operation, the write intent log tracks in-flight writes to both the primary
and secondary images in a mirror relationship. Much like the fracture log, the write
intent log is a bitmap composed of extents indicating where data is written. The write
intent log is always active, but the fracture log is only enabled when the mirror is
fractured.
When in use, MirrorView makes an entry in the write intent log of its intent to update
the primary and secondary images at a particular location, then proceeds with the
attempted update. After both images respond that data has been written (governed
by normal LUN access mechanisms, for example, written to write cache), MirrorView
clears previous write intent log entries. For performance reasons, the write intent log
is not cleared immediately following the acknowledgement from the primary and
secondary images. It will be cleared while subsequent write intent log operations are
performed.
In a recovery situation, the write intent log can be used to determine which extents
must be synchronized from the primary storage system to the secondary system. For
instance, if a single SP becomes unavailable (for example during a reboot or failure),
there may be in-flight writes that were sent to the secondary, but not acknowledged
before the outage. These writes will remain marked in the write intent log.
Then server software, such as PowerPath, trespasses the LUN to the peer SP. The
remaining SP directly accesses the unavailable SP's write intent log and recovers the
recent modification history. The SP then resends the data marked by the extents in
the write intent log. This allows for only a partial resynchronization, rather than a full
resynchronization because it ensures that any writes in process at the time of the
failure are acknowledged by the secondary image. If the entire array becomes
unavailable, then the write intent log is used to facilitate a partial resynchronization
from primary to secondary, once the primary array is recovered.
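The write intent log sequence (record intent, write to both images, clear lazily, resend marked extents on recovery) can be sketched as follows. The class and method names are hypothetical, and the in-memory sets stand in for the persistent log LUNs. The in-flight counter is an implementation choice of this sketch so that an extent with overlapping writes is not cleared early.

```python
from collections import Counter


class WriteIntentLog:
    """Toy model of a persistent record of extents with writes in progress."""

    def __init__(self):
        self.marked = set()         # extents currently recorded in the log
        self.in_flight = Counter()  # outstanding writes per extent
        self.pending_clear = set()  # acknowledged, not yet cleared

    def begin_write(self, extent: int) -> None:
        """Record intent before the primary and secondary are updated."""
        self._lazy_clear()
        self.marked.add(extent)
        self.pending_clear.discard(extent)
        self.in_flight[extent] += 1

    def complete_write(self, extent: int) -> None:
        """Both images acknowledged; the entry is cleared later, not now."""
        self.in_flight[extent] -= 1
        if self.in_flight[extent] == 0:
            self.pending_clear.add(extent)

    def _lazy_clear(self) -> None:
        # Clearing piggybacks on a subsequent log operation.
        self.marked -= self.pending_clear
        self.pending_clear.clear()

    def extents_to_resend(self):
        """Extents the surviving SP resends after an SP or array outage."""
        return sorted(self.marked)
```

An acknowledged extent may still be marked at the time of a failure. Resending it is harmless, so the recovery errs on the side of copying slightly more data than strictly necessary.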
Use of the write intent log is optional on CX3 series and lower models. On these
systems, the maximum number of mirrors that can use the write intent log is less than
the maximum number of configurable mirrors. For information on limits, refer to the
MirrorView Release Notes for your storage system's operating environment.
There were also performance considerations when using the write intent log on these
systems. However, releases 26 and later have optimizations that improve
performance when using the write intent log. The optimizations are implemented in
the SP write cache. Prior to release 26, performance would degrade if the storage
system write cache neared full. In release 26, there is no impact at lower workloads
and only a slight impact (10%) on response time at high workloads. Therefore, use
of the write intent log should be maximized. If FAST Cache is enabled at the system
level, it should be disabled for write intent log LUNs. Optimal performance is achieved
when write intent log LUNs are fronted only by SP write cache.
The write intent log is automatically used on all mirrors on VNX series and CX4 series
systems. Both series of systems allow all mirrors to have the write intent log allocated
Synchronization
the most impact on host I/O, the storage system will not run as many of these
concurrently as it would images with medium or low synchronization rates.
Images waiting to be synchronized remain in a queued-to-be-synchronized image
condition until some of the current synchronizations are completed. As each
secondary image finishes its synchronization, the storage system determines if the
next image queued to be synchronized can be started. If all images have the same
synchronization rate, the next image in the queue starts its synchronization each time
a synchronization completes. If there is a mix of sync rates, then the number of
images started is based on the resources made available. For example, if a
synchronization with a high sync rate finishes, the storage system can start several
synchronizations at medium or low.
The synchronization progress is tracked in the fracture log. Bits of the fracture log are
cleared as the areas they represent are copied to the secondary. You can view
progress in Unisphere in the Mirror Properties dialog box by clicking the Secondary
Image tab.
Thin LUN secondary images
Before starting the synchronization of the mirror or consistency group, MirrorView/S
determines if the secondary pool has enough capacity to accept all of the
synchronization data. For example, in the case of an individual mirror, if there is 25
GB of synchronization data to be sent, MirrorView/S verifies that there is at least 25
GB of free capacity in the secondary image LUN storage pool.
For example, assume a consistency group has three mirrors. The first two mirrors have
secondary image LUNs in Pool A and have 20 GB and 40 GB of sync data,
respectively. The third has its secondary image LUN in Pool B on the same array and
has 30 GB of sync data. MirrorView/S will verify that Pool A has at least 60 GB of free
capacity and that Pool B has at least 30 GB of free capacity before starting the
synchronization.
If enough capacity does not exist in a secondary image LUN storage pool, the mirror
and/or consistency group becomes administratively fractured. An error will be
returned to the user and an event will be logged in the SP event log. The events are:
Checking available pool capacity does not guarantee that all of the synchronizations
will be accommodated by the secondary LUN's pool. It does ensure that the capacity
is available at the beginning of the synchronization; however, it is possible for other
thin LUN activity in the same pool to consume pool capacity while the
synchronization is in progress.
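The pre-synchronization capacity check can be expressed as a sum of pending synchronization data per secondary pool. The sketch below uses a hypothetical function name and plain dictionaries in place of real array queries. It reflects only the check made at the start of the synchronization.

```python
from collections import defaultdict


def pools_short_of_capacity(mirrors, pool_free_gb):
    """Return {pool: required_gb} for each pool that cannot hold the sync data.

    mirrors:      iterable of (secondary_pool_name, sync_data_gb) pairs
    pool_free_gb: mapping of pool name to free capacity in GB
    """
    required = defaultdict(float)
    for pool, sync_gb in mirrors:
        required[pool] += sync_gb
    return {pool: need for pool, need in required.items()
            if pool_free_gb.get(pool, 0) < need}


# The three-mirror consistency group from the example above.
group = [("Pool A", 20), ("Pool A", 40), ("Pool B", 30)]
```

With `{"Pool A": 60, "Pool B": 30}` free, the function returns an empty mapping and the synchronization may start. With only 50 GB free in Pool A, it reports that Pool A needs 60 GB, which corresponds to the case where the group is administratively fractured.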
Link requirements
MirrorView/S is qualified for a large variety of SAN topologies that include high-speed
FC SANs, distance extension solutions including long wave optics, FC/IP converters,
and native iSCSI. For comprehensive guidance on supported topologies, see the
Topology Resource Center available within the E-Lab Interoperability Navigator. Within
the Resource Center there are technical books for FC, iSCSI, Extended Distance
Technologies, and others that can be used as references.
It is expected that bandwidth and latency requirements will be governed by the
application, since each write I/O traverses the link. Under these conditions, links are
typically high bandwidth lines with low latency. There are no bandwidth requirements
enforced by MirrorView software. However, there are functional requirements for
latency, since MirrorView/S will fracture a secondary image if it has an I/O
outstanding to the secondary for 10 seconds.
The graph in Figure 15 illustrates the effect of latency on transaction response time
plotted against the write component of the transactions. An OLTP simulator was run,
where each OLTP transaction consisted of 21 reads and nine writes. The load was
generated using 8-KB I/O over 12 data LUNs and 1 log LUN on a CX3-80. The link is
1000 Mb with compression and fast write enabled through an FC/IP device.
Individual results vary for transaction response times in an end-user environment
depending on a wide variety of factors, including application concurrency, server
hardware and software, connectivity components, storage system model, number of
disks, and so on. The graph does not represent any MirrorView or system limits, but
simply illustrates the effects of latency with all other things equal across the
configuration.
[Figure 15: 8 KB I/O over 1000 Mb LAN, McData 1620 with Fast Write and compression
enabled. Axis label: Network Delay.]
features such as compression and fast write will have performance advantages over
native iSCSI connections.
MirrorView/Asynchronous
MirrorView/A uses a periodic update model for transferring write data to the
secondary image. Between updates, areas written to are tracked by MirrorView/A.
During an update, the data represented by these areas is transferred to the secondary
image. The set of data that is transferred is referred to as a delta set.
A delta set may be smaller than the aggregate number of writes that occurred to the
primary image, since areas in a bitmap are marked to denote that a change occurred
to a particular location of the image. If more than one write occurs to the same area,
only the current data is sent when the update is started. SnapView technology is used
to preserve a point-in-time copy of the delta set during the update.
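The reason a delta set can be smaller than the total data written is that tracking is per area, not per write. A minimal sketch follows, assuming a hypothetical tracking granularity (`extent_kb`) and invented names. The SnapView point-in-time protection of the delta set is not modeled.

```python
class DeltaTracker:
    """Tracks which areas of the primary changed since the last update."""

    def __init__(self, extent_kb: int):
        self.extent_kb = extent_kb
        self.dirty = set()    # indexes of areas written since the last update
        self.written_kb = 0   # aggregate size of all writes, for comparison

    def write(self, offset_kb: int, length_kb: int) -> None:
        self.written_kb += length_kb
        first = offset_kb // self.extent_kb
        last = (offset_kb + length_kb - 1) // self.extent_kb
        self.dirty.update(range(first, last + 1))

    def start_update(self):
        """Hand over the delta set and begin tracking for the next cycle."""
        delta = sorted(self.dirty)
        self.dirty = set()
        self.written_kb = 0
        return delta
```

Ten 8 KB writes to the same location plus one 8 KB write elsewhere total 88 KB of writes, but with an 8 KB tracking granularity the delta set contains only two areas (16 KB), because only the current contents of each marked area are sent.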
MirrorView/A is designed to enable distance solutions over lower bandwidth and
higher latency lines than would be possible with a synchronous solution. Solutions
are often implemented over T3 type lines or the equivalent bandwidth within a larger
link. Replication distances can range from tens to thousands of miles.
RPO is determined by the frequency and duration of updates. There are three
methods for defining the frequency of updates.
User-defined intervals are specified in minutes, with an input range of 0 to 40,320 (28
days). However, typical update intervals range from 15 minutes to hours.
The following sections describe the MirrorView/A software. For information on
MirrorView/A limits, refer to the MirrorView/A Release Notes for your storage system's
operating environment.
MirrorView/A uses SnapView technology for data protection on the primary and
secondary storage systems. On the primary storage system, SnapView tracks changes
on the primary image and creates a point-in-time image that is the source for the
delta set transfer.
On the secondary storage system, SnapView creates a protective copy of the
secondary image during the update. The protective copy, referred to as the gold copy,
ensures that there is a usable copy on the secondary storage system at all times.
All of MirrorView's use of SnapView is autonomous and does not need to be managed
by the user. Therefore, a SnapView license is not required to use MirrorView/A.
Delta set
A delta set is a virtual object that represents the regions on the primary mirror LUN
that need to be transferred to the secondary mirror LUN. A SnapView snapshot,
managed by MirrorView/A, is used as the source of the transfer to maintain a
point-in-time image of the delta set for the duration of the transfer.
Ultimately, the size of the delta set will depend on the number of writes between
updates and the locality of those writes on the LUN. Some applications that are
random in nature may have to replicate most of the writes that occurred between
updates. Other applications, which have high locality, may write to the same location
several times between updates and result in small delta sets. Results can vary greatly
by application and implementation.
For example, Figure 16 shows expected delta set behavior for a consistent workload.
The delta set size peaks before the first update. While the update progresses, I/O
continues to the primary volume, so some data lag exists throughout the update
process. Delta set peaks remain consistent in this test because of the steady
workload.
Figure 16 - Data lag trend for Oracle 9i RAC test over T3 (50 ms latency)
[Graph markers: Update Start, Update End]
The data for this test was generated using an Oracle database. Replication was
conducted over a T3 line with a 50 ms delay to simulate long-distance replication. The
bottleneck for update bandwidth in this test was the T3 link.
Gold copy (protective snapshot)
A gold copy is a virtual point-in-time copy (snapshot) of the secondary image that is
initiated before the update begins. It ensures that there is a known good copy on the
secondary during an update. The gold copy is managed entirely by MirrorView/A
software and is not directly available for server access. Users can take additional
snapshots of the secondary image for a usable point-in-time copy of the secondary.
If the update from primary to secondary is interrupted due to a catastrophic failure at
the primary site, MirrorView/A uses the gold copy to roll back the partial update on
the secondary LUN if the secondary is promoted for use. If the secondary is not
promoted, the gold copy is maintained so that the update may complete when the
interruption is rectified.
Reserved LUNs
• Place the reserved LUNs on different physical drives than the primary images.
Since it is possible to have tracking and/or copy on first write activity
associated with a server write, it is best not to have all of the activity queued
up for the same set of disks. Often, several reserved LUNs are configured
on their own RAID group or pool.
• Use high performance SAS or FC drives for the reserved LUNs. It is common to
have several reserved LUNs on the same drives. The tracking and copy on first
write activity is a critical component for write response time, so it is important
to avoid I/O contention on these drives. Flash drives do not provide a
significant performance advantage over SAS or FC drives when used for
reserved LUNs.
• Place write-cache-enabled reserved LUNs on SAS or FC drives. For more
information, see the Disk drive technologies section of this white paper.
• Use the same configuration on the primary and secondary array. Some end
users consider cutting costs on the DR side by using fewer drives or lower-cost
drive technology. However, the reserved LUNs play an important role in
copy-on-first-write protecting the secondary images during an update. It is also
expected that in a recovery situation, the applications will run with similar
performance when failed over to the DR site.
For a more comprehensive look at reserved LUN guidelines for MirrorView and other
replication applications, see the white paper CLARiiON Reserved LUN Pool
Configuration Considerations on EMC Support.
Initial synchronization
The transfer map is used to track initial synchronization. When an initial sync is
required, all of the bits in the transfer map are set to indicate transfer. The bits of the
transfer map are cleared as the areas they represent are copied to the secondary.
All mirrors can synchronize simultaneously. Resources are shared among active
mirrors to evenly distribute bandwidth during the synchronization period.
There is no gold copy (so no protective snapshot) during the initial sync, since there is
no usable copy of data until the initial sync is complete. If the initial sync is
interrupted, the secondary will be in an out-of-sync state and cannot be promoted.
MirrorView/A is intended to operate over lower bandwidth lines. In implementations
where the delta set would consume most or all of the link capacity, it can be difficult
to perform the initial synchronization. This is prevalent in implementations with large
volumes (multiple TBs) and relatively small change rates (GBs) over the update
period.
Often in these scenarios, the secondary system is brought to the same site as the
primary for the initial synchronization and then shipped to the remote site, where an
incremental update is performed. The general process to do this is:
1. With primary and secondary on the local SAN, create mirrors, add secondary
images, and perform initial synchronization.
2. Admin fracture the mirrors. An incremental update is optional before the
fracture to get the very last possible changes.
3. Remove the secondary system from the Unisphere/Navisphere domain.
4. Transport secondary and install at final site, including any zoning and IP
Link requirements
MB/s | GB/h
0.18 | 0.63
5.4 | 18.9
11.9 | 41.9
18.5 | 65
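The MB/s and GB/h figures listed above are related by a unit conversion. The sketch below assumes 1 GB = 1024 MB, a convention inferred because it reproduces the listed values to within rounding. The function names are hypothetical.

```python
def mb_per_s_to_gb_per_h(mb_per_s: float) -> float:
    """Convert a sustained transfer rate from MB/s to GB/h (1 GB = 1024 MB)."""
    return mb_per_s * 3600 / 1024


def hours_to_transfer(data_gb: float, mb_per_s: float) -> float:
    """Estimate how long a delta set or initial sync of data_gb takes."""
    return data_gb / mb_per_s_to_gb_per_h(mb_per_s)
```

A helper like `hours_to_transfer` can be used to judge whether a delta set of a given size fits within the intended update interval on a particular link.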
Managing all write-order dependent volumes as one entity ensures that there is a
restartable copy of the application on the secondary system. The copies are
restartable because all the volumes are as current as the solution's RPO allows. Other
replication strategies may yield a recoverable copy of data at the DR site. Recoverable
copies typically require more operations to bring them back online. Consider the
example of a Database Management System (DBMS) application. When using log
shipping in the event of a disaster, data files are restored to a previous point in time
(for example, a backup) and then made current by applying logs replicated more
frequently.
Restartable copies offer shorter RTOs than recoverable copies in terms of access to
current data. A trade-off is that more data may be replicated to constitute restartable
copies as both logs and data files are being replicated on a regular basis, as opposed
to only replicating the logs.
Consistency groups protect against data corruption in the event of partial failures, for
example on one SP, LUN, or disk. With partial failures, it is possible for the data set at
the secondary site to become out of order or corrupt.
For example, assume an interruption prevents only one SP from communicating with
its secondary volume. If the log volumes reside on the interrupted SP and the
database volumes reside on the other SP, it is possible for updates to be written to
the database's secondary site, but not the log's secondary site.
The following sections discuss specific benefits of consistency groups as they pertain
to MirrorView/S and MirrorView/A.
MirrorView/S
For MirrorView/S, consistency groups are crucial to performing consistent fracture
operations. Figure 17 shows a database where SP A owns the log components and SP
B owns the data files. Without consistency groups, an interruption in communication
on SP A invokes the fracture log on the log LUNs, and then I/O continues to the log
LUNs and the data LUNs. Since there is no outage on SP B, I/O to the data LUNs would
still be mirrored to the secondary. If an outage occurs on the primary system during
this time, there would not be a usable copy because data would be ahead of the logs
at the secondary site.
You can manage consistency groups using Unisphere or the Navisphere CLI. Figure 20
illustrates how to perform operations such as synchronization, fracture, and so forth
from Unisphere.
Like individual mirrors, consistency groups have conditions (normal, admin fractured)
and states (synchronized and consistent). Consistency group conditions and states
depend on the condition and states of the member mirrors. For a consistency group to
be in the synchronized state, all of its members must be in the synchronized state.
The group will transition to synchronized after all of the mirrors in the group have
transitioned to this state. A complete list of consistency group conditions and states
is available in Appendix B for reference.
If using the Navisphere Secure CLI, verify the state of the consistency group with the
storage system that owns the primary images. If the storage system that owns the
primary images is not available, the state of the consistency group can be obtained
from the secondary storage system.
Consistency groups can be promoted if they are in the synchronized or consistent
states. The various promote options are as follows:
• Force promote: Force promote is used if the conditions for normal promote are
not met. It is a brute force promote that ignores most errors preventing a
normal promote, such as the secondary image state being consistent, the
mirror being in a fractured state (admin or system fracture), a failed link, and
so forth. In cases where connectivity still exists between the primary and
secondary storage systems, the secondary images are placed as primary
images of the same consistency group. The original primaries are placed in the
group as secondary images. A full synchronization is then required from the
new primary (old secondary) images to the new secondary (old primary)
images. If connectivity does not exist, force promote acts like local only
promote.
• Local only promote: This promotes the consistency group on the secondary
site without adding the primary images as secondary images of the mirrors. A
new group is created on the secondary storage system, and the original
secondary images are placed in the new consistency group as primary images.
The primary images on the primary storage system remain in the original
consistency group. Both consistency groups will be in the Local Only state.
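The outcomes of the two promote options described above can be summarized in a small decision function. This is an illustrative sketch: the option strings, dictionary keys, and function name are invented for the example, and only outcomes stated in the text are modeled.

```python
def promote_outcome(option: str, link_up: bool) -> dict:
    """Describe the result of promoting a consistency group.

    option:  "force" or "local_only"
    link_up: whether the primary and secondary systems can still communicate
    """
    if option not in ("force", "local_only"):
        raise ValueError("option must be 'force' or 'local_only'")
    if option == "force" and link_up:
        # Roles are reversed within the same consistency group.
        return {
            "behaves_as": "force",
            "roles_swapped": True,
            "full_sync_required": True,
            "new_group_on_secondary": False,
        }
    # Local only promote, or force promote without connectivity.
    return {
        "behaves_as": "local_only",
        "roles_swapped": False,
        "new_group_on_secondary": True,
        "group_state": "Local Only",
    }
```

The key point is that a force promote without connectivity yields the same result as a local only promote: two independent groups, each in the Local Only state.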
MirrorView management
Required storage system software
MirrorView runs within the VNX Operating Environment for Block on VNX systems. The
MirrorView/A and MirrorView/S enablers are included in the Remote Protection Suite,
the Total Protection Pack, and the Total Efficiency Pack. MirrorView runs within the
FLARE operating environment on CLARiiON and Unified NS series systems.
MirrorView/A and MirrorView/S are separately licensed products for these systems.
Figure 21 shows the presence of various enablers in the Storage System Properties
dialog box.
Management topology
Management network connectivity
Network connectivity requirements for configuring MirrorView are determined by
Unisphere and/or CLI. Unisphere and Navisphere CLI communicate with the storage
systems using ports 80/443 or, alternatively, 2162/2163. Furthermore, network
connectivity between systems on these ports is required to set up and configure
mirrors. Once configured, MirrorView communicates in-band, through the data path,
between the storage systems. Any communication needed, such as to facilitate a
promote operation, occurs in-band.
For example, a user stops I/O to a primary volume that is replicated with MirrorView/S
until the secondary image is in the synchronized state. A command can then be
issued to the secondary storage system to promote the secondary image. This
command is sent through the secondary storage system's management LAN port.
MirrorView on the secondary system then communicates in-band with MirrorView on
the primary system to reverse the image roles. The process could then be reversed by
issuing the promote command to the current secondary (former primary) storage
system.
Although MirrorView communicates in-band after initial configuration, EMC
recommends that communication between storage system network ports always be
available. This allows recovery from all failure scenarios (like force promote) without
having to first reconfigure the management LAN to allow connectivity.
Unisphere domains
MirrorView can be used between systems in the same Unisphere domain and/or with
systems that are not in the same Unisphere domain. Unisphere has a multi-domain
management feature that allows management of discrete Unisphere domains in one
user interface. Figure 22 shows Unisphere configured to manage multiple domains.
(MirrorView), and perform restore and recovery operations. The VNX Data Protection
roles are listed below with the analogous CLARiiON/NS series roles in parentheses:
Local Data Protection (Local Replication): Allows the user to perform basic
operations for snapshots and clones, such as session start/activate and clone
sync/fracture.
Data Protection (Replication): Allows the user to perform Local Replication
operations, plus basic operations for SAN Copy sessions and MirrorView
mirrors.
Data Recovery (Replication/Recovery): Allows the user to perform Local
Replication and Replication operations, plus operations such as restoring
replicas back to the production LUN (Snap rollback, clone reverse-sync, MirrorView
promote).
None of these roles allow users to create new replication objects such as snapshots,
clones, SAN Copy sessions, or mirrors, nor do they allow users to delete objects.
Users can only control existing replica objects with these roles. All three roles have
view-only privileges in the domain for object types outside of their control, so that
they can understand the context of the replica object. A detailed list of management
rights for each role is provided in Appendix B: Consistency Group States and Conditions.
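The cumulative nature of the three roles can be sketched as nested sets of permitted operations. This is a minimal model only: the operation names below are illustrative labels paraphrased from the role descriptions, not identifiers used by Unisphere or the CLI.

```python
# Operation groups paraphrased from the role descriptions (illustrative names).
LOCAL_OPS = {"snapshot_session_start", "snapshot_activate",
             "clone_sync", "clone_fracture"}
REMOTE_OPS = {"san_copy_basic", "mirror_basic"}
RECOVERY_OPS = {"snap_rollback", "clone_reverse_sync", "mirror_promote"}

# Each role includes everything the previous role allows.
ROLE_OPS = {
    "Local Data Protection": LOCAL_OPS,
    "Data Protection": LOCAL_OPS | REMOTE_OPS,
    "Data Recovery": LOCAL_OPS | REMOTE_OPS | RECOVERY_OPS,
}

# No data protection role may create or delete replication objects.
FORBIDDEN_FOR_ALL = {"create_object", "delete_object"}


def is_allowed(role, operation):
    """Return True if `role` may perform `operation` on an existing replica object."""
    if operation in FORBIDDEN_FOR_ALL:
        return False
    return operation in ROLE_OPS[role]
```

For example, `is_allowed("Data Protection", "mirror_promote")` is false in this model, because promote belongs to the recovery operations that only the Data Recovery role adds.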
Data protection user roles are created and managed in the same manner as all other
user roles. However, when data protection roles are employed, all of the storage
systems in the domain must be running a minimum of release 29 software for legacy
CLARiiON/NS systems. Users with data protection roles can be assigned a local, global, or
LDAP scope in the same manner as traditional user roles.
Management software
Figure 23. MirrorView Wizard launch from the task pane (Unisphere 1.3)
The MirrorView Wizard uses a task-oriented approach to guide users through the
process of creating mirrors. Along with creating the mirrors, the wizard automatically
performs any needed setup operations such as creating iSCSI connections and
MirrorView connections, and creating/assigning write intent log and reserved LUNs.
Explicit dialog boxes for creating mirror connections, creating mirrors, adding
secondary images, and so forth are still available for advanced users. Advanced users
who have very specific requirements, or have a set of conditions that are not covered
by the wizard, may use this approach.
Navisphere CLI offers a command line approach. All configuration options are
available through the CLI and all security features for authentication, authorization,
privacy, and audit are shared with Navisphere Manager.
For a comprehensive description of management operations, the following
documents and white papers are available on EMC Online Support:
The following sections discuss the features, benefits, and assumptions of the
MirrorView Wizard, object-based management, and CLI.
MirrorView Wizard
The MirrorView Wizard uses a server-centric approach. This ease-of-use feature
changes the traditional approach to storage management of managing the
environment from the storage system out to the server. The basic steps for the
wizard are: select a server > select a storage system > select one or more LUNs > perform an
action. This assumes that a server will be attached and a LUN assigned prior to using
the MirrorView Wizard.
Before using the wizard, physical connections must be established between the two
storage systems. This includes proper LAN preparation, SAN zoning, and Unisphere
Domain configuration. These tasks are not included in the MirrorView Wizard.
In the wizard, the user is first guided through the LUN selection process by selecting
the desired server and storage system. One or many LUNs may be selected. Then, the
wizard performs the following discrete operations based on user input and existing
conditions:
1.
2.
3.
4.
5.
6.
The default name may be changed in the Secondary LUN Name field in the wizard,
as shown in Figure 24.
Default Value                                   Mirror Type
Medium                                          Both
Mirror of <Primary_Server> <Primary_LUN>        Both
4 Hours                                         Asynchronous only
Group of <Primary_Server> <Primary_LUN1>        Both (only when multiple
<Primary_LUN2> <Primary_LUNx>                   LUNs are selected)
Additional parameters are automatically configured by the wizard with no user input,
as listed in Table 9. In addition, the contents of the primary LUN are automatically
copied to the secondary LUN upon completion of the wizard.
Table 9 - Default remote mirror properties
Mirror Property            Default Value    Mirror Type
Minimum Required Images                     Both
Quiesce Threshold          60               Synchronous Only
Use Write Intent Log       Selected         Synchronous Only
The server is already configured in a storage group: The wizard places the
newly bound LUNs in the server's storage group.
The server is not configured in a storage group: The wizard creates a storage
group for the server, places the server in the storage group, and places the
newly bound LUNs in the storage group. In this case, the wizard automatically
creates a storage group named SG_<server name>. If you need a specific
storage group name, you may change it using the traditional Storage Group
Properties dialog box.
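The two cases above can be sketched as a single placement routine. This is a simplified model under stated assumptions: storage groups are represented as plain dictionaries, the function name `place_luns` is hypothetical, and no existing group is assumed to already use the default name.

```python
def place_luns(storage_groups, server, new_luns):
    """Add `new_luns` to the server's storage group, creating one if needed.

    `storage_groups` maps a group name to {"servers": set, "luns": list}.
    Returns the name of the group that received the LUNs.
    """
    # Case 1: the server already belongs to a storage group.
    for name, group in storage_groups.items():
        if server in group["servers"]:
            group["luns"].extend(new_luns)
            return name
    # Case 2: create a group using the default name described in the text.
    name = f"SG_{server}"
    storage_groups[name] = {"servers": {server}, "luns": list(new_luns)}
    return name
```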
Two Reserved LUNs are bound for each primary and secondary LUN selected in
the wizard.
The total overhead storage space is divided evenly among the number of
Reserved LUNs.
By default, the system configures reserved LUNs as thick LUNs using the
following method:
2. If the conditions in step 1 cannot be met, the system tries to create a new
storage pool (three- or five-disk RAID 5 or two-disk RAID 1/0) for the thick
LUNs. It then configures the reserved LUNs on this pool.
3. If no disks are available to create a new storage pool, it places the reserved
LUNs in an existing pool that has enough capacity.
4. If there is not an existing pool with enough capacity, it evaluates RAID
groups using steps 1 to 4.
5. If it cannot find an acceptable RAID group, the process fails.
Reserved LUNs created in pools typically alternate SP ownership. By default, two
reserved LUNs are created per source, one owned by SP A and the other by SP B.
When created on a RAID group, the Default Owner property is assigned based on the
RAID group from which the reserved LUN is bound:
For even-numbered RAID groups, the Default Owner is SP A.
For odd-numbered RAID groups, the Default Owner is SP B.
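The RAID group ownership rule reduces to a parity check. A minimal sketch, where the function name is illustrative and the RAID group number is assumed to be a non-negative integer:

```python
def default_owner(raid_group_id):
    """Return the Default Owner for a reserved LUN bound on the given RAID group."""
    # Even-numbered RAID groups default to SP A, odd-numbered ones to SP B.
    return "SP A" if raid_group_id % 2 == 0 else "SP B"
```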
Note: The naming convention used is the same as for write intent log LUNs: Virtual
Disk sequence_number
For example, a user wants to create a mirror of LUN 12, which has a capacity of 200
GB. Existing LUNs on both systems range in LUN number and name from LUN 0 to LUN
50. The wizard will configure the reserved LUN pool on the primary storage system
with two 20 GB LUNs and on the secondary storage system with two 20 GB LUNs. The
reserved LUN names on both arrays will be Virtual Disk 51 through Virtual Disk 54.
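The arithmetic in this example can be sketched as follows. The 20 percent overhead figure is an assumption inferred from the example itself (two 20 GB reserved LUNs for a 200 GB source); the text only states that the total overhead is divided evenly among the reserved LUNs.

```python
def reserved_lun_sizes(source_gb, overhead_percent=20, luns_per_source=2):
    """Split the total overhead evenly across the reserved LUNs for one source LUN.

    `overhead_percent` is an assumed value inferred from the worked example.
    """
    total_overhead_gb = source_gb * overhead_percent / 100
    per_lun_gb = total_overhead_gb / luns_per_source
    return [per_lun_gb] * luns_per_source


# A 200 GB source LUN yields two reserved LUNs of 20 GB each on a given system.
print(reserved_lun_sizes(200))  # [20.0, 20.0]
```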
The wizard always creates two new Reserved LUNs for each source LUN and never
uses unallocated LUNs in the pool. Two Reserved LUNs for each source LUN allows
the reserved snapshots to expand over time.
Advanced dialogs
Dialogs to perform specific tasks such as creating a mirror, adding a secondary, and
so on are available in Unisphere. These dialogs can be launched from the task menu
pane or from the right-click object menu of a specific object. Additional configuration
options are available in these dialogs. They can be used to create mirrors as well as
change the mirror properties of any mirror regardless of which method was used to
create it.
Figure 27 - Create Remote Mirror and Add Secondary Image dialog boxes
Consistency groups have similar dialogs as shown in Figure 28. Note that for
asynchronous mirrors, you can designate the Recovery Policy and Synchronization
rate at the consistency group level. Values set at the consistency group level override
the corresponding settings for individual mirrors in the consistency group. For
synchronous mirrors, you set the recovery policy at the consistency group level, but
you set the synchronization rate at the individual mirror level.
Figure 28 - Create Group dialog boxes for synchronous and asynchronous mirrors
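The precedence between group-level and mirror-level settings described above can be sketched as a small resolution function. The dictionary keys and the "sync"/"async" labels are illustrative assumptions, not product identifiers.

```python
def effective_settings(mirror_type, mirror, group=None):
    """Resolve the recovery policy and synchronization rate that apply to a mirror.

    `mirror_type` is "sync" (MirrorView/S) or "async" (MirrorView/A).
    `mirror` and `group` are dicts keyed by "recovery_policy" and "sync_rate".
    """
    if mirror_type not in ("sync", "async"):
        raise ValueError(f"unknown mirror type: {mirror_type!r}")
    settings = dict(mirror)
    if group is None:
        return settings  # not in a consistency group: mirror settings apply
    # The recovery policy is set at the consistency group level for both types.
    settings["recovery_policy"] = group["recovery_policy"]
    # Only asynchronous groups also set the synchronization rate.
    if mirror_type == "async":
        settings["sync_rate"] = group["sync_rate"]
    return settings
```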
Unisphere Analyzer
Figure 29 - Unisphere Analyzer charts for total bandwidth and time lag
MirrorView/A and MirrorView/S replicated data is included in the FE port statistics for
the secondary storage system; it is not in FE port statistics for the primary storage
system. If the MirrorView port for the secondary storage system is dedicated to
MirrorView, then all traffic shown by Analyzer will be MirrorView activity. However, it is
important to note that if the port is also used for servers or other host traffic, the
traffic shown by Analyzer will include all of that traffic, not just MirrorView traffic.
When a failure occurs during normal operations, MirrorView and the base operating
environment implement several actions to facilitate recovery. MirrorView tries to
achieve three goals in the recovery of failures: minimize the duration of data
Failures in the production server to storage system path, primary image(s), or primary
storage system affect application performance and availability. The nature of the
failure may dictate that little corrective action is needed. It may also dictate that the
secondary image be promoted and used as the production image until the primary
site (or image) can be repaired. The following sections describe various failure
scenarios within the primary storage system and its components.
Path failure between server and primary image
If a server loses its I/O path(s) to a storage processor, LUNs are trespassed to the
alternate SP. This is typically enacted by server-based path management software
such as PowerPath. With SP ownership changed, the primary storage system will send
a request to the secondary storage system to trespass the secondary image. This way,
SP ownership is consistent between the mirrored pairs on the primary and secondary
storage systems. Between updates, MirrorView/A may not trespass the secondary
volume until the start of the next update. Once trespassed, the update proceeds.
Primary image LUN failure
VNX series and legacy CLARiiON/NS systems (Release 26 and later) have internal I/O
redirection features that make them more resilient to certain types of LUN failures
than previous releases. When a failure prevents an SP from accessing its LUNs on the
back end, I/O is internally redirected to the other SP, provided that the other SP can
access the LUNs. The benefit is that back-end failures affecting just one SP (for
example an LCC failure) do not result in a LUN trespass by the host failover software.
Once the failure is corrected, the original SP resumes control of back-end I/O to the
LUN.
Masking the back-end failure internally provides a faster and less complex failover
and recovery process. Server I/O and mirroring traffic remain balanced across the
connectivity environment. Also, there is no interruption to the mirroring process,
which reduces the number of fractures and resynchronizations.
If the LUN failure occurs in such a way that neither SP can access the LUNs, then the
mirror will become fractured and server I/O to the LUN is rejected until the LUN
becomes available again. During this process, the user may choose to promote the
secondary image, and redirect server I/O to the secondary image.
Footnote 2: Many media failures are themselves automatically detected and corrected by
lower-level CLARiiON and VNX software, so this scenario is probably the least likely.
If the secondary image state is consistent and/or fractured at the point in time of the
failure on the primary image, a full resynchronization of the
originally-primary-now-secondary image would be required, since there would be no other
way to guarantee byte-for-byte synchronicity.
For MirrorView/A, if the primary image fails during an update and the user promotes
the secondary image, any changes to the secondary are rolled back from the
protective snapshot. If the user fixes the error on the primary without promoting the
secondary, the update continues from the point it was interrupted.
Storage Processor (SP) controlling primary image failure
If the SP owning the primary image fails, the MirrorView LUN trespasses to the
surviving SP on the primary storage system per direction of the server failover
software. At the same time, MirrorView on the primary system will issue a trespass
command to the secondary to trespass the secondary image.
For MirrorView/S, the write intent log is used to avoid a full resynchronization. Mirrors
not using the write intent log will have to do a full synchronization. For MirrorView/A,
a full synchronization is not required since tracking/transfer data is stored on
persistent media.
Footnote 3: Before adding the original primary image to the new mirror, the old mirror
must be force-destroyed. In this case, the force-destroy option is used because changes
were made to the mirror while the primary system was unavailable.
Figure 32 - Before the primary storage system failure: The primary image (on the
primary system) has mirror ID X
Figure 33 - Following the primary storage system failure: The newly promoted primary
image (on the secondary system) has mirror ID Y
to write to the secondary. However, the secondary image remains a member of the
mirror.
Secondary image LUN failure
The I/O redirection capability (as described in the "Primary image LUN failure"
section) allows mirroring to continue if the LUN failure only affects the current owning
SP. The primary image would continue mirroring to the secondary image on the
original SP. If the secondary image LUN fails in such a way that affects both SPs, then
the image is admin fractured. At that point, the administrator can choose whether to
repair the LUN or remove it from the MirrorView relationship altogether. If the LUN is
repaired, it can be resynchronized once it is available for use. If another LUN is
selected to serve as the secondary image, it must first be fully synchronized.
Storage processor (SP) controlling secondary image failure
If the SP owning the secondary image fails, the image is system fractured until the SP
comes back online. Once the SP comes back online, the mirror is automatically
resynchronized, provided that the auto recovery policy is selected. If auto restore is
not enabled, then the administrator must intervene by explicitly starting the
synchronization (see footnote 4). Additionally, if the same SP on the primary storage
system also fails (resulting in the trespassing of the primary LUN to the peer SP and,
likewise, the trespassing of the secondary LUN to its peer SP), the secondary image is
automatically resynchronized (again, provided auto-restore is enabled).
Secondary image storage system (or link) failure
If the storage system containing the secondary image fails, the secondary image is
system fractured. Similarly, if the link between the primary and secondary storage
system fails, the secondary image is also system fractured. In either event, once the
issue has been resolved and the secondary storage system is back in communication
with the primary storage system, the secondary image is automatically
resynchronized if auto recovery policy is enabled.
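The effect of the recovery policy once connectivity returns can be sketched as a simple transition function. The state and policy strings are illustrative labels chosen for this sketch, not values reported by the product.

```python
def state_after_reconnect(image_state, recovery_policy):
    """Return the secondary image state once the primary can reach it again.

    `image_state` and `recovery_policy` use illustrative string labels.
    """
    if image_state != "system fractured":
        return image_state  # nothing to recover
    if recovery_policy == "automatic":
        return "synchronizing"  # resynchronization starts without intervention
    # Manual policy: the image stays fractured until an administrator
    # explicitly starts the synchronization.
    return "system fractured"
```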
Footnote 4: Users are given the option to have the resynchronization occur automatically
by enabling auto restore. If they prefer, they can manually initiate this themselves.
This is a property of the mirror, and can be changed at any time.
Figure 34 - A failure of the link between storage systems or failure of the secondary
storage system temporarily suspending mirroring
intervention is required to allocate more storage to the reserved LUNs and restart the
update operation. The update will continue from the point it was interrupted.
All SnapView replicas of secondary images should represent points in time where the
secondary image is in the consistent or synchronized state. During MirrorView
synchronizations, the secondary image LUN is not in a usable state. Secondary
images are in the synchronizing state during MirrorView/S synchronizations and the
updating state during MirrorView/A updates. You should not start SnapView
snapshots or fracture clones while the secondary images are in either state.
You can use the Manual MirrorView recovery policy to coordinate SnapView activities
with fractured mirrors. The Manual option requires user intervention to start
synchronizations after a system fracture. Snapshots can be started or clones
fractured before initiating the mirror synchronization.
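A script that coordinates SnapView activity with mirroring could guard replica creation with a state check along these lines. This is a sketch only: how the secondary image states are obtained is outside its scope, and the function name is hypothetical.

```python
# States in which a secondary image represents a usable point in time.
USABLE_STATES = {"consistent", "synchronized"}


def can_start_replica(secondary_image_states):
    """Return True only if every secondary image is consistent or synchronized.

    Images that are synchronizing (MirrorView/S) or updating (MirrorView/A)
    are not usable, so snapshots and clone fractures should wait.
    """
    return all(state.lower() in USABLE_STATES for state in secondary_image_states)
```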
SnapView and MirrorView consistency features are also compatible. Use the
SnapView consistent start and consistent fracture operations when creating
SnapView replicas of a MirrorView/S consistency group. SnapView consistent
operations cannot be performed on the consistency group, but can be conducted
across all members of the group. Consistent operations can also be used with MV/A,
but are not required if you are starting sessions or fracturing clones between updates.
Replication Manager 5.1 and later versions have support for creating SnapView
replicas of MirrorView/S and MirrorView/A primary images and MirrorView/S
secondary images.
Snapshots of MirrorView image LUNs are available on all storage systems that
support both SnapView snapshots and MirrorView. Clone of a mirror is available on
all VNX series systems and legacy systems running release 24 or later. The AX4-5
currently runs Release 23 and does not offer this option.
LUN eligibility
Any mirror image can be a clone source; however, a clone image of another source
cannot serve as a mirror image. The LUN can become a clone source or mirror image
in any order. For example, an existing clone source could be added to a mirror as a
secondary image. Alternately, a mirror secondary image could also be made a clone
source.
A snapshot can be taken of any MirrorView image. Snapshot images cannot serve as
mirror images. In other words, you cannot mirror a snap.
Limit dependencies
Clone and MirrorView/S limits are independent of one another. The number of mirror
images and clones for each LUN and each storage system are counted separately
against the LUN and storage system limits. While system limits for clones vary with
storage system type, the number of clones per source LUN remains eight for all
storage system models.
MirrorView/A images count toward the LUN-level snap-session limit of eight. The
LUN-level snap-session limit is the same for all storage system models. MirrorView/A uses
a reserved snap session on both the primary and secondary images to track changes
and maintain point-in-time copies. Therefore, users can create up to seven SnapView
snap sessions on a MirrorView/A primary or secondary image. There are no such limit
dependencies between MirrorView/S images and SnapView snapshots.
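The session arithmetic above can be written out as a short calculation. The function name and the "sync"/"async" labels are illustrative.

```python
LUN_SNAP_SESSION_LIMIT = 8  # LUN-level limit, the same for all storage system models


def user_snap_sessions_available(mirror_type, existing_user_sessions=0):
    """Return how many more SnapView sessions a user can start on a mirror image.

    MirrorView/A ("async") holds one reserved session per image;
    MirrorView/S ("sync") consumes none.
    """
    reserved = 1 if mirror_type == "async" else 0
    remaining = LUN_SNAP_SESSION_LIMIT - reserved - existing_user_sessions
    return max(remaining, 0)


print(user_snap_sessions_available("async"))  # 7
print(user_snap_sessions_available("sync"))   # 8
```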
Instant restore
Both snapshots and clones offer instant restore capability. For clones, the
reverse-synchronization operation is used for instant restore. For snap sessions, the rollback
feature is used. In both cases, the source LUN is instantly available and will appear to
the host as the point in time that the SnapView replica represents. In the background,
SnapView moves the data from the SnapView replica back to the source LUN.
When used with MirrorView, instant restores can only be performed on primary image
LUNs. During an instant restore to a primary image, the secondary image must be
fractured. If the mirror is part of a consistency group, then the group must be
fractured. The software enforces this to prevent MirrorView from presenting a
consistent secondary image state while the data itself is not usable. Once the restore
process completes, a partial resynchronization can be performed to move the
changes to the secondary image. Secondary image LUNs can be restored from a
SnapView replica only after they are promoted to the role of primary image.
Promoting a secondary image while the primary image is being restored is possible
with the Local Only promote option. This option is available for individual
MirrorView/A mirrors and consistency groups for both MirrorView/A and
MirrorView/S. It is not available for individual MirrorView/S mirrors. The local only
promote option places the secondary image LUN in a new mirror as the primary
image. The original primary image LUN cannot be demoted to secondary while the
restore process is active.
Finally, in the case of MirrorView/A only, an instant restore cannot be performed on a
primary image if the image is rolling back. An image is rolling back if it was recently
promoted and MirrorView/A is recovering it from the gold copy. It is possible to
restore the primary image once the rollback process is complete.
Clone state: Clone Remote Mirror Synching
The Remote Mirror Synching clone state was added to help users coordinate
mirror-sync and clone-sync-and-fracture operations. During mirror sync operations the
secondary image LUN is not in a usable state.
This is true for initial syncs, MirrorView/S partial syncs, and MirrorView/A updates
when the secondary image is in the synchronizing or updating states.
Clones that are in-sync during a mirror sync will be in the Clone Remote Mirror
Synching state as shown in Figure 36. This clone state indicates that the source LUN,
also serving as a secondary image, is not in a consistent state and should not be
fractured.
Several EMC and third-party software products are qualified to work with MirrorView
to automate high-availability and DR solutions. The following products are currently
qualified and available for use with MirrorView. Always check the latest EMC Support
Matrix for the latest interoperability information.
VMware vSphere and Site Recovery Manager (SRM)
Both Virtual Machine File System (VMFS) volumes and Raw Device Mapping (RDM)
volumes can be replicated with MirrorView. Some of the major implementation
guidelines for MirrorView and other replication applications are listed below.
RDM volumes must be in Physical compatibility mode when used with
storage-system-based replication.
When replicating an entire VMFS volume that contains a number of virtual
disks on a single LUN, the granularity of replication is the entire LUN with all of
its virtual disks.
When making copies of VMFS volumes that span multiple LUNs, you must use
the consistency technology of SnapView/MirrorView.
For VMFS-3 and RDM volumes, the replica can be presented to the same ESX
server but not to the same virtual machine.
VMware Site Recovery Manager (SRM) provides a framework for remote site recovery
automation. SRM works with storage vendor-provided Storage Replication Adapters
(SRAs) to automate site recovery using the storage vendor's native replication
software. The VNX series/CLARiiON SRA supports SRM automation with MirrorView/A
and MirrorView/S.
Please consult the E-Lab Navigator and/or the following references on EMC Online
Support for detailed implementation information:
MirrorView Insight for VMware (MVIV) is a tool bundled with the MirrorView Site
Recovery Adapter (SRA) that complements the Site Recovery Manager (SRM) v4.x
framework by providing failback capability (experimental support only). MVIV also
provides detailed mapping of VMware file systems and their replication relationships.
With Site Recovery Manager v5.x, failback capability is built-in, so MVIV is no longer
needed and SRM failback is not experimental. For more details, refer to the
MirrorView Insight for VMware (MVIV) Technical Note available on EMC Online
Support.
EMC Replication Enabler for Microsoft Exchange Server 2010
The EMC Replication Enabler for Microsoft Exchange Server 2010 is a free software
utility that integrates EMC RecoverPoint and RecoverPoint/SE synchronous remote
replication and MirrorView/S replication with the Exchange Server 2010 data
availability group (DAG, database availability group) architecture. This allows users to use the same, consistent,
array-based replication solutions for Exchange 2010 and other applications in the
environment. For more information, see the Replication Enabler for Microsoft
Exchange Server 2010 Release Notes on EMC Online Support.
EMC Data Protection Advisor (DPA)
EMC Data Protection Advisor helps users ensure their replication environment
remains accurate and effective in the face of constant change. It provides a scalable,
enterprise-level solution for replication analysis to ensure mission-critical
applications are protected and recoverable. It discovers selected client applications,
databases, and file systems, and maps them to physical storage devices. It maps all
copies of the primary data including snapshots, clones, and remote synchronous and
asynchronous replicas. After discovery, it correlates client, application, storage
mapping, and other metadata to identify incomplete and inconsistent replicas that
would result in a failed recovery. It then provides an intuitive graphical map of the
relationship between host and storage. It represents any recoverability gaps and
exposures in easy-to-understand reports and views that can be used by
administrators to resolve issues.
For VNX series and legacy CLARiiON/NS series systems, DPA supports MirrorView/A,
MirrorView/S, SnapView, SAN Copy, and RecoverPoint (CDP, CRR, CLR). Application
integration includes Oracle, SQL Server, PostgreSQL, Exchange, and file systems.
DPA eliminates the need for manual data collection, and collects data more
frequently than would be practical manually. With this information, it can actively
alert administrators to any issues as well as reduce the time it takes to conduct an
audit by as much as 95 percent. This can reduce costs by eliminating the need for
third-party consulting services to perform these tasks. For more information on Data
Protection Advisor, the following white papers are available on EMC Online Support:
seamlessly manages all storage system processes necessary to facilitate cluster node
failover. MirrorView/A and MirrorView/S replication is supported.
MV/CE offers reduced RTO by allowing the Windows operating system and
applications like Microsoft Exchange and Microsoft SQL Server to automate resource
failover between sites. MV/CE can also be used in Hypervisor environments in host
clustering configurations for Hyper-V and guest clustering configurations for Hyper-V
and VMware. For more information, please see the following resources on EMC Online
Support:
MV/CE 4.1 supports VNX Series systems, CLARiiON/NS systems running FLARE
releases 26-30, and AX4 systems running Release 23.
Replication Manager
Replication Manager (RM) is replica automation software that creates point-in-time
replicas of databases and file systems residing on VNX, CLARiiON, Symmetrix, or
Celerra storage systems. Replicas can be used for repurposing, backup, and
recovery, or as part of a DR plan. RM provides a single interface for managing local
and remote replicas across supported storage systems. Replication Manager supports
SnapView snapshot and clone replicas of MirrorView/S and MirrorView/A primary and
secondary images. For more information see the Replication Manager Product Guide
on EMC Online Support.
AutoStart
AutoStart is high-availability software that enables clustering of enterprise
applications. It safeguards critical applications by automating the failover of
applications such as ERP, database, e-mail, and web applications. It also monitors
application health and performance, provides alerting, and reprovisions/restarts
services as necessary in response to failures.
AutoStart is heterogeneous software that provides a consistent solution across a
variety of operating systems and platforms. AutoStart 5.3 adds support for
applications being replicated by MirrorView/S and MirrorView/A. This enables
automated application failover between nodes across MirrorView links.
For more detailed product information, consult the EMC AutoStart Administrator's
Guide. For supported environment information, consult the EMC Information
Protection Software Compatibility Guide.
VNX for File and Celerra
LUNs allocated to VNX for File or legacy Celerra line of network-attached storage
products can be replicated with MirrorView/S. The following are high-level guidelines
for using MirrorView/S with Celerra. More detailed information is included in the
technical modules Using MirrorView/Synchronous with VNX for File for Disaster
Recovery and Using MirrorView/Synchronous with Celerra for Disaster Recovery, found
on EMC Online Support.
All LUNs dedicated to VNX for File or Celerra must be managed within a
consistency group. Consistency groups are discussed in the
"Storage-system-based consistency groups" section.
In Celerra systems, as part of the baseline configuration, control LUNs 0, 1,
and 4 must be included in the consistency group. The remainder of the LUNs in
the consistency group can be allocated as user LUNs.
The write intent log must be allocated. See more in the "Write intent log"
section.
Existing configurations can implement MirrorView/S provided they can meet
the first guideline.
EMC midrange storage systems offer the widest range of disk drive technologies in
the industry. Multiple price points and performance profiles are needed to meet a
diverse set of end-user application needs. Disk drive offerings are categorized into
three tiers based on price-performance attributes:
Tier 0: Flash drives consist of solid-state memory technology. Flash
drives are applicable for very high I/O per second (IOPs) environments with
low response time requirements. Flash drives can offer many times the IOPs
performance of other drive technologies, along with high reliability. They also
offer low power consumption per I/O and per GB as there are no moving parts.
Tier 1: Serial Attached SCSI (SAS) and FC drives. SAS and FC drives consist of
traditional rotating drives, and are offered in 10k and 15k rpm varieties. They
offer high I/O and bandwidth performance with high reliability and high
capacity.
Tier 2: Near Line SAS (NL-SAS) and Serial Advanced Technology Attachment II
(SATA II) drives. NL-SAS and SATA II drives are best for applications with very
high capacity requirements and with duty cycles less than 100 percent. They
are traditional rotating drives that are available in 7200 rpm and low-power
5400 rpm offerings. They have a lower IOPs profile than the other drive
offerings, but can offer comparable sequential stream bandwidth
performance.
EMC midrange systems also offer the widest array of storage system based data
services in the midrange market. Fully Automated Storage Tiering for Virtual Pools
(FAST VP) and FAST Cache work at the sub-LUN level to optimize data placement and
performance based on current activity. Compression allows users to maximize
capacity utilization by storing data in a smaller footprint than would normally be
possible.
When LUNs are deployed in RAID groups, they have a homogeneous drive type over
the entire address space. Pools can be either homogeneous or heterogeneous.
For heterogeneous pools, FAST VP optimizes data placement across the different
drive types at a sub-LUN level.
FAST Cache is available for both RAID group and pool LUNs. FAST Cache can boost
performance of a storage system's most active data by moving the most active
components to Flash drives. It allows a relatively small allotment of Flash drives to
boost performance of the most active data in the entire storage system.
Data movement using FAST VP and FAST Cache is largely decided by the activity of the
production application relative to other application activity on the storage system.
Compression is an option for data that is relatively inactive but still requires five 9s of
availability. It is enabled at the LUN level and runs as a background process to reduce
the data footprint of a given LUN. New data is written uncompressed. The
compression feature monitors the amount of uncompressed data and processes it
once enough has been written to surpass a system-defined threshold.
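The threshold behavior described above can be pictured with a minimal sketch. It is a toy model only: the class name, the field names, and the 10 GB threshold are assumptions made for illustration, and the real threshold is system-defined and not stated here.

```python
from dataclasses import dataclass


@dataclass
class CompressedLun:
    """Toy model of a compression-enabled LUN (all names are hypothetical)."""

    threshold_gb: float            # stand-in for the system-defined threshold
    uncompressed_gb: float = 0.0   # data written since the last background pass
    passes: int = 0                # number of background passes triggered

    def write(self, size_gb: float) -> bool:
        """Record a write; return True if it triggered a background pass."""
        # New data always lands uncompressed.
        self.uncompressed_gb += size_gb
        # A pass starts only once the threshold has been surpassed.
        if self.uncompressed_gb > self.threshold_gb:
            self.passes += 1
            self.uncompressed_gb = 0.0
            return True
        return False


if __name__ == "__main__":
    lun = CompressedLun(threshold_gb=10)   # assumed value, for illustration only
    for size in (4, 4, 4):
        print(size, lun.write(size))
```

In this model, host writes and replication writes would both go through write(), which matches the idea that any new data counts toward the threshold.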
Storage system based replication is supported with any LUN type and in conjunction
with any feature such as FAST VP, FAST Cache, or compression. FAST VP and FAST
Cache perform their analysis on a per-array basis. When used with remote replication,
any FAST VP data relocation or FAST Cache promotions occur independently at each
location. I/O performed by a replication product such as MirrorView or SAN Copy is
factored into the I/O analysis of the source/primary and target/secondary LUNs.
For example, consider the case where a synchronous mirror is set up between two
thick LUNs. Both the primary and secondary images reside in heterogeneous pools
with FAST VP. FAST Cache is enabled on the primary image LUN. The application
workload is 80 percent reads and 20 percent writes. On the primary volume, FAST VP
and FAST Cache manage any data movement considering both the read and write
I/Os. FAST VP on the secondary storage system factors in only the write I/Os. Whether
or not data is moved between storage tiers via FAST VP depends on the other I/O
activity on each storage system.
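The split in this example can be worked through with a short sketch. The function name and the 10,000 IOPS figure are assumptions for illustration; the sketch assumes every host write is replicated and that no host reads reach the secondary image.

```python
def observed_iops(total_iops: int, read_percent: int) -> dict:
    """Split a host workload into the I/O each storage system would analyze.

    Assumes a synchronous mirror in which every host write is replicated
    and no host reads are sent to the secondary image.
    """
    reads = total_iops * read_percent // 100
    writes = total_iops - reads
    return {
        "primary": {"reads": reads, "writes": writes},
        "secondary": {"reads": 0, "writes": writes},
    }


if __name__ == "__main__":
    # Hypothetical 10,000 IOPS workload with the 80/20 mix from the example.
    print(observed_iops(10_000, 80))
```

Under these assumptions the secondary system sees only one fifth of the I/O that the primary sees, which is why tiering decisions can differ between the two sites.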
EMC recommends that you disable FAST Cache on MirrorView secondary image
LUNs for best performance. FAST Cache typically doesn't improve the performance of
secondary image LUNs, so fewer system resources are used if it's disabled. If using
storage pools, secondary image LUNs should be placed in a storage pool with FAST
Cache disabled. If the secondary image LUNs are promoted, FAST Cache can be
enabled for the pool containing the newly promoted LUNs.
Compression analyzes data at the LUN level. Replication I/O can contribute to the
uncompressed data threshold used to trigger the compression process. For instance,
if a mirror secondary or SAN Copy destination is compressed, updates to those
volumes will be written uncompressed. When enough data is written, the
compression process will start automatically. If both the source and target are
compressed, their respective compression processes are run independently.
Release 33 for the VNX2 series supports deduplication of pool LUNs, and MirrorView
supports replication of LUNs that have deduplication enabled. If both the
source and target LUNs have deduplication enabled, their space savings may differ
because deduplication is unique to the pools on each system.
When choosing drive technologies for a DR site, users must consider if production
applications will run at the DR site for significant periods of time. If so, similar drive
types and configurations should be used at the DR location. Keep in mind that after
promotion, FAST VP and FAST Cache will begin to react to the production load running
on that system.
For instance, after promotion, FAST Cache will have a warming period as the system
analyzes the production workload. For more information on FAST VP and FAST Cache,
please refer to the following white papers on EMC Online Support:
EMC VNX2 Multicore FAST Cache A Detailed Review
EMC VNX2 FAST VP A Detailed Review
Virtual Provisioning for the VNX2 Series Applied Technology
EMC VNX2 Deduplication and Compression Maximizing effective capacity utilization
The remainder of this section describes how drive types factor into a
MirrorView implementation from the perspective of server I/O and replication I/O.
Replication I/O is I/O generated by MirrorView, such as synchronizations. The principles
are the same regardless of whether the data was explicitly placed on a drive type or
was automatically moved by FAST VP and/or FAST Cache.
Flash drives excel in transactional I/O profiles when compared to other drives,
particularly when there is a significant random read component. Random reads
typically cannot be serviced by read-ahead algorithms of the drive itself or by the
storage system read cache. Therefore, the latency of a random read is directly related
to the seek time of the disk drive. For rotating drives, this is the mechanical
movement of the disk head to the desired area. Flash drives can perform this
operation at solid-state memory speeds, which can be an order of magnitude faster
when compared to rotating drives.
Generally speaking, there is no major advantage of using Flash drives purely to
improve replication I/O performance, because replication I/O involves a high
percentage of writes. However, applications that benefit from Flash drives will
continue to do so when using MirrorView for DR.
When using MirrorView/S, Flash drives still provide the same advantage to the
production application for random reads. Minimum write response times will
probably be dictated by the replication over the link to the secondary image.
Write intent log LUNs should always be write-cache enabled, because their I/O
profile is almost exclusively writes and write cache is required to take advantage
of SP cache optimizations. Write intent log LUNs use an optimization that keeps active data in SP
cache. If placed on Flash drives, users must enable write cache on those LUNs. There
is no advantage to placing the write intent log LUNs on Flash drives over Tier 1 drives
when write cache is enabled. For optimal performance, it is recommended that FAST
Cache be disabled on write intent log LUNs.
While using MirrorView/A, Flash drives continue to provide the same read-performance
benefits to the production application. In addition, when primary image
LUNs are placed on Flash drives, the impact of copy-on-first-write operations on production
volumes can be reduced. Part of the copy-on-first-write operation is a read from the
source LUN and a write to the reserved LUN.
The read performance advantage of Flash drives can provide benefits to this portion
of the copy-on-first-write operation, thereby reducing response time impacts.
Therefore, when source LUNs reside on Flash drives, copy-on-first-write activity has
minimal impact on the application.
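The read-then-write sequence of a copy-on-first-write operation can be sketched as follows. This is a simplified illustration and not the MirrorView/A implementation; the class, its fields, and the list-of-blocks representation are all assumptions.

```python
class CofwSession:
    """Toy copy-on-first-write tracker (hypothetical, for illustration only)."""

    def __init__(self, source: list):
        self.source = source      # blocks of the source (primary image) LUN
        self.reserved = {}        # block index -> original data kept on the reserved LUN
        self.source_reads = 0     # extra source reads caused by copy-on-first-write

    def write(self, index: int, data) -> None:
        """Apply a host write, preserving the original block on first touch."""
        if index not in self.reserved:
            original = self.source[index]     # read from the source LUN
            self.source_reads += 1
            self.reserved[index] = original   # write to the reserved LUN
        self.source[index] = data             # then apply the host write

    def point_in_time_view(self) -> list:
        """Rebuild the data as it was when the session started."""
        return [self.reserved.get(i, block) for i, block in enumerate(self.source)]


if __name__ == "__main__":
    session = CofwSession(["a", "b", "c"])
    session.write(1, "x")
    session.write(1, "y")   # second write to the same block: no extra copy
    print(session.source, session.reserved, session.source_reads)
```

Only the first write to a block pays for the extra source read, and that read is the step whose latency depends on the drive type holding the source LUN.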
The I/O profile for reserved LUNs consists largely of writes. Therefore, there is
little to no advantage to explicitly placing reserved LUNs on Flash drives. We
recommend that you use write-cache enabled LUNs on Tier 1 drives for reserved LUNs.
Reserved LUNs can be RAID-group-based LUNs and/or thick LUNs.
Writes to reserved LUNs can be random in nature, particularly when there are several
reserved LUNs on a particular set of drives. Therefore, we do not recommend Tier 2
drives for reserved LUNs. In the case of heterogeneous pools, it is possible that some
percentage of a reserved LUN resides on Tier 2 drives. If the reserved LUN is allocated
for a long period of time, as it is for MirrorView/A primary images, FAST VP can process
the reserved LUN for tiering. For reserved LUN allocations of shorter duration, such as
short-term snapshots, the LUN may not be allocated long enough for FAST VP or FAST
Cache to assist. In these cases, the user may choose to explicitly place reserved LUNs
on the preferred drive type by using RAID-group-based LUNs or specifying tier
placement in the pool.
iSCSI connections
iSCSI connections define initiator/target pairs for iSCSI remote replication between
storage systems. Similar to zoning in an FC environment, they dictate which targets
the initiators log in to. Because all necessary iSCSI connections are created
automatically by the MirrorView Wizard or the MirrorView Connections dialog box, it is
rarely necessary to manage iSCSI connections for MirrorView explicitly.
Release 29 adds support for VLAN tagging, which allows you to configure several
virtual ports on each physical iSCSI port. Enabling VLAN tagging is optional on a per-port
basis. VNX series systems support up to 8 virtual ports per 10 Gb/s and 1 Gb/s
iSCSI port. CX4 and NS series systems support up to 8 virtual ports per 10 Gb/s port
and up to 2 virtual ports per 1 Gb/s port. The next figure shows the iSCSI port
properties for a port with VLANs enabled and two virtual ports configured.
iSCSI connections are defined per physical FE port for ports without VLAN tagging
enabled, or per virtual port for ports with VLAN tagging enabled. For each MirrorView
connection to another storage system, an iSCSI connection will be defined. iSCSI
connections must be made for the primary storage system to the secondary system
and for the secondary system back to the primary system. This is required because
mirroring can be bidirectional between systems and primary/secondary images can
reverse roles, resulting in a reversal of replication direction.
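The requirement for a connection in each direction can be expressed as a simple check. The sketch below is an illustration only: it works at the level of whole storage systems rather than ports, and the function name and the tuple representation are assumptions.

```python
def missing_return_paths(connections):
    """Return the reverse-direction connections that are not yet defined.

    `connections` is an iterable of (initiator_system, target_system) pairs.
    Names are hypothetical; real connections are defined per physical or
    virtual port, which this sketch does not model.
    """
    defined = set(connections)
    return sorted((t, i) for (i, t) in defined if (t, i) not in defined)


if __name__ == "__main__":
    # Only the primary-to-secondary direction has been defined here.
    print(missing_return_paths([("Site 1", "Site 2")]))
```

An empty result means every connection has a matching return path, so replication can reverse direction after a promotion.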
Figure 38 shows a connection defined for the storage system Site 1. This
connection allows the MirrorView ports (A-8, B-8) to log in to target ports of the Site
2 system. The IP addresses listed in the Target Portal column represent the
MirrorView ports of the Site 2 system. The connections shown were automatically
generated by MirrorView when the MirrorView connection was enabled.
When creating iSCSI connections, the user has three options for CHAP authentication:
Shared: A general set of CHAP credentials that can be used by any or all of
the connections of a storage system
Connection Specific: CHAP credentials that are defined specifically for the
connection being created
None: No authentication
Figure 40 shows the iSCSI Port Properties page, which is launched from the storage
system right-click menu by going to <Storage System Name> Settings > Network >
Settings for Block, selecting the desired port, and clicking Properties. The option for
requiring initiator authentication is at the bottom of the dialog box.
When iSCSI connections are created by the system, CHAP credentials are always
entered regardless of whether they are required by the target system. This way, if the
user did not initially require initiator authentication, the credentials are already in
place if the user enables it later.
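The benefit of always storing credentials can be illustrated with a small sketch. It is a simplified model, not the storage system's logic: the class names, the field names, and the placeholder credential values are assumptions, and the sketch does not check whether credentials actually match those configured on the target.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ChapMode(Enum):
    """The three CHAP options listed above."""
    SHARED = "shared"
    CONNECTION_SPECIFIC = "connection-specific"
    NONE = "none"


@dataclass
class IscsiConnection:
    """Hypothetical record of one iSCSI connection's authentication settings."""
    mode: ChapMode
    username: Optional[str] = None
    secret: Optional[str] = None


def can_log_in(conn: IscsiConnection, target_requires_auth: bool) -> bool:
    """Return True if the connection could satisfy the target's requirement."""
    if not target_requires_auth:
        return True
    has_credentials = bool(conn.username) and bool(conn.secret)
    return conn.mode is not ChapMode.NONE and has_credentials


if __name__ == "__main__":
    # Placeholder values; a system-created connection always carries credentials.
    auto = IscsiConnection(ChapMode.SHARED, "user", "secret")
    for required in (False, True):
        print(required, can_log_in(auto, required))
```

A connection created with credentials keeps working when initiator authentication is switched on later, while a connection created with no authentication would need to be updated first.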
Conclusion
MirrorView offers two products that can satisfy a wide range of recovery point
requirements. MirrorView/S offers a zero data loss solution typically implemented
over high bandwidth, low latency links. MirrorView/A offers a periodic update
solution with recovery points in minutes to hours. MirrorView/A is typically
implemented over long distances on low bandwidth, higher latency lines.
Both products offer a consistency group feature for managing write ordering over a set
of volumes. Consistency groups can lower RTO by maintaining restartable copies of a
data set on the remote storage system. Ease of management is achieved with the
ability to manage all remote mirrors in the consistency group as a single unit.
Several solution automation products, such as VMware SRM, MirrorView/CE, and
Replication Manager, are available to simplify disaster recovery solutions for specific
applications and platforms. They rely on MirrorView to perform the replication operations
because of its flexibility, reliability, and performance.
References
module
Using MirrorView/Synchronous with VNX for File for Disaster Recovery
Virtual Provisioning for the VNX2 series
Meaning
The remote mirror is running normally.
The mirror's secondary image is fractured, and the mirror is configured to
generate an alert in this case. The mirror continues to accept server I/O in
this state. The event code ID for a mirror moving into the Attention state is
0x71050129.
Meaning
The secondary image is identical to the primary image. This state persists
only until the next write to the primary image, at which time the image
state becomes Consistent.
The secondary image is identical to either the current primary image or to
some previous instance of the primary image. This means that the
secondary image is available for recovery when you promote it.
The software is applying changes to the secondary image to mirror the
primary image, but the current contents of the secondary are not known
and are not usable for recovery.
The secondary image requires synchronization with the primary image.
The image is unusable for recovery.
A successful promotion occurred where there was an unfinished update
to the secondary image. This state persists until the Rollback operation
completes.
Image condition
Along with an image state, an image will have an image condition that provides more
information.
Condition: Meaning
Normal: The normal processing state.
Admin Fractured: The administrator has fractured the mirror, or a media failure
(such as a failed sector or a bad block) has occurred. An administrator must
initiate a synchronization.
System Fractured: The mirror is system fractured.
Synchronizing: MirrorView/A or MirrorView/S is performing an initial
synchronization. MirrorView/S is synchronizing after a fracture.
Updating (MirrorView/A only): MirrorView is performing a periodic update.
Waiting on admin: The mirror is no longer system fractured. It is a temporary
condition for automatic recovery mirrors. For manual recovery mirrors, the
administrator can now initiate the synchronization command.
Queued to be Synchronized (MirrorView/S only): Synchronize command has been
received, but the maximum number of synchronizations are in progress. The mirror
will synchronize when one of the in-process synchronizations completes. See the
release notes for maximum synchronizations per SP limits.
Scrambled
Empty
Incomplete
Local Only
Rolling Back
(MirrorView/A)
Meaning
All of the secondary images are in the Synchronized state.
All of the secondary images are either in the Synchronized or Consistent
state, and at least one is in the Consistent state.
A new member that is not consistent with existing members is added to
the consistency group, which automatically starts an update. After the
update completes, the consistency group is again consistent.
At least one mirror in the group is in the Synchronizing state, and no
member is in the Out-of-Sync state.
The group may be fractured, waiting for synchronization (either automatic
or by the administrator), or in the synchronization queue. Administrative
action may be required to return the consistency group to having a
recoverable secondary group.
There is a mixture of primary and secondary images in the consistency
group. During a promote, it is common for the group to be in the
scrambled state.
The consistency group has no members.
Some, but not all of the secondary images are missing, or mirrors are
missing. This is usually due to a failure during group promotion. The
group may also be scrambled.
The consistency group contains only primary images. No mirrors in the
group have a secondary image.
A successful promotion occurred where there was an unfinished update
to the group. This state persists until the Rollback operation completes.
Invalid
Meaning
The normal processing state.
The group is not accepting I/O; this is normal during group promotion.
The administrator has fractured the group, or a media failure (such as a
failed sector or a bad block) has occurred. An administrator must initiate
a group synchronization.
Some or all of the members are system fractured.
The group is waiting on a Synchronize command. Consistency groups with
the manual recovery option require the user to initiate the
synchronization. If it is after a system fracture, it is only a temporary
condition for automatic recovery groups before the system initiates a
synchronization.
A temporary condition while a group fracture is in progress; the group is
partially fractured. If the state is incomplete, the condition is irrelevant
and is thus set to invalid. If this state persists, try fracturing the group and
synchronizing it.
MirrorView/A
Condition: Meaning
Normal: The normal processing state.
Initializing: The group is performing an initial synchronization.
Updating: The group is performing a periodic update.
Admin Fractured
System Fractured
Waiting on admin
State/Condition
Any
Primary storage system's secondary SP fails
Primary storage system's controlling SP fails
Any
Primary storage system's controlling SP fails and reboots. No server I/O
attempted while the SP is unavailable
Any
Action
Option 1: Administrator promotes a secondary LUN on
a secondary storage system. After application data
recovery, applications on a standby server can start
up and access required data.
Note: Any writes that are in progress when the primary
storage system fails may not propagate to the
secondary storage system. Also, if the remote image
was fractured at the time of the failure, any writes
since the fracture will not have propagated.
Option 2: Repair and reboot the primary storage
system, then synchronize the secondary LUN(s) using
the write intent log. If the write intent log is disabled,
then a full synchronization is necessary.
Repair the failed SP to restore high-availability
protection for the data. Access to the mirrored LUN
data is uninterrupted.
The I/O request fails, and software on the server
retries the request to the secondary SP. This results in
a trespass to the secondary SP, which then
synchronizes the remote LUN (if necessary) based on
the write intent log. The I/O request is coordinated
with the synchronization.
All primary LUNs are checked for a current relationship
with their secondary LUNs.
No action
If a full synchronization is interrupted, then it
continues from the last checkpoint.
If a fracture log-based synchronization is interrupted,
then a full synchronization is necessary unless the
mirror is using the write intent log, in which case the
partial synchronization can be resumed.
In either case, if auto sync is enabled, then
synchronization starts automatically. Otherwise, an
administrator must start the synchronization.
A full synchronization is necessary.
If auto sync is enabled, then synchronization starts
automatically. Otherwise, an administrator must start
the synchronization.
For a primary LUN using the write intent log, regions in
the log marked potentially out-of-sync are read from
the primary LUN and written to the secondary LUN.
For a primary LUN not using the write intent log, the
secondary LUN is marked out-of-sync and a full
synchronization is necessary. If auto sync is enabled,
LUN is consistent or
in-sync.
Primary LUN is
synchronizing with
secondary LUN.
Secondary LUN is
fractured.
Any
MV/A Update
State/Condition
Any
Secondary storage system's SP fails
Secondary storage system's controlling SP fails or reboots
Any
Action
Repair and reboot the secondary storage system.
The secondary LUN is fractured. To rejoin the mirror,
synchronization (either full or based on a fracture log)
of the secondary LUN is necessary for all states except
in-sync.
No action. Repair the failed SP to restore high-availability protection for the data.
Primary storage system detects the failure and
fractures the secondary LUN.
If the failed SP is unavailable for an extended period,
an administrator should manually trespass the primary
LUN to its secondary SP so mirroring can continue.
When the failed SP returns to service, synchronization
is necessary (unless the LUN is in-sync). If the LUN is
consistent or synchronizing, the system uses the
fracture log to perform synchronization. Otherwise, a
full synchronization is necessary.
No action. This is the expected result when a LUN
trespasses for any reason.
Data Protection
Replication
Data Recovery
Replication/Recovery
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Fracture a clone
Roll back a snap session
Reverse synchronize a clone
MirrorView
Synchronize a mirror / consistency
group
Fracture a mirror / consistency
group
Control the update parameters of
an async mirror
Modify the update frequency of an
async mirror
Throttle a mirror / consistency
group
Promote a sync or async
secondary mirror / consistency
group
SAN Copy
Yes
No
No
Yes
No
No
Yes
Yes
Yes
No
Yes
Yes
No
No
Yes
No
Yes
Yes
No
Yes
Yes
No
Yes
Yes
No
No
Yes
Start a session
Stop a session
Pause a session
Resume a session
Mark a session
Unmark a session
No
No
No
No
No
No
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Verify a session
Throttle a session
No
No
Yes
Yes
Yes
Yes