USING VPLEX METRO WITH VMWARE HIGH AVAILABILITY AND FAULT TOLERANCE FOR ULTIMATE AVAILABILITY
Abstract

This white paper discusses using best-of-breed technologies from VMware and EMC to create federated continuous availability solutions. The following topics are reviewed:

- Choosing between federated Fault Tolerance and federated High Availability
- Design considerations and constraints
- Operational best practice
Copyright 2012 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. The information in this publication is provided as is. EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.
Table of Contents

Executive summary
    Audience
    Document scope and limitations
Path loss handling semantics (PDL and APD)
Cross-connect topologies and failure scenarios
Cross-connect and multipathing
VPLEX site preference rules
DRS and site affinity rules
Conclusion
References
Appendix A - vMotioning over longer distances (10ms)
Executive summary
The EMC VPLEX family removes physical barriers within, across, and between datacenters. VPLEX Local provides simplified management and non-disruptive data mobility for heterogeneous arrays. VPLEX Metro and Geo provide data access and mobility between two VPLEX clusters within synchronous and asynchronous distances respectively. With a unique scale-out architecture, VPLEX's advanced data caching and distributed cache coherency provide workload resiliency, automatic sharing, balancing and failover of storage domains, and enable both local and remote data access with predictable service levels.

VMware vSphere makes it simpler and less expensive to provide higher levels of availability for important applications. With vSphere, organizations can easily increase the baseline level of availability provided for all applications, as well as provide higher levels of availability more easily and cost-effectively. vSphere makes it possible to reduce both planned and unplanned downtime. The revolutionary VMware vMotion (vMotion) capabilities in vSphere make it possible to perform planned maintenance with zero application downtime. VMware High Availability (HA), a feature of vSphere, reduces unplanned downtime by leveraging multiple VMware ESX and VMware ESXi hosts configured as a cluster to provide automatic recovery from outages, as well as cost-effective high availability for applications running in virtual machines. VMware Fault Tolerance (FT) leverages the well-known encapsulation properties of virtualization by building fault tolerance directly into the ESXi hypervisor, delivering hardware-style fault tolerance to virtual machines. Guest operating systems and applications require no modifications or reconfiguration; in fact, they remain unaware of the protection transparently delivered by ESXi and the underlying architecture.

By leveraging distance, VPLEX Metro builds on the strengths of VMware FT and HA to provide solutions that go beyond traditional disaster recovery: a new type of deployment that achieves the highest levels of continuous availability over distance for today's enterprise storage and cloud environments. With these technologies it is now possible to provide a solution that has both a zero Recovery Point Objective (RPO) and a zero "storage" Recovery Time Objective (RTO) (and a zero "application" RTO when using VMware FT).

This white paper is designed to give technology decision-makers a deeper understanding of VPLEX Metro in conjunction with VMware Fault Tolerance
and/or High Availability, discussing design, features, functionality, and benefits. This paper also highlights the key technical considerations for implementing VMware Fault Tolerance and/or High Availability with VPLEX Metro technology to achieve "Federated Availability" over distance.
Audience
This white paper is intended for technology architects, storage administrators and EMC professional services partners who are responsible for architecting, creating, managing and using IT environments that utilize EMC VPLEX and VMware Fault Tolerance and/or High Availability technologies (FT and HA respectively). The white paper assumes that the reader is familiar with EMC VPLEX and VMware technologies and concepts.
The configuration is in full compliance with VPLEX best practice found here: http://powerlink.emc.com/km/live1/en_US/Offering_Technical/Tech nical_Documentation/h7139-implementation-planning-vplex-tn.pdf
Please consult with your local EMC Support representative if you are uncertain as to the applicability of these requirements.

Note: While out of scope for this document, it should be noted that, in addition to all best practices within this paper, all federated FT and HA solutions carry the same best practices and limitations imposed by the VMware HA and FT technologies themselves. For instance, at the time of writing, VMware FT is only capable of supporting a single vCPU per VM (VMware HA does not carry the same vCPU limitation), and this limitation will prevail when federating a VMware FT cluster. Please ensure you review the VMware best practice documentation, as well as the limitations and considerations documentation (see the References section), for further information.
Introduction
Increasingly, customers wish to protect their business services from any event imaginable that would lead to downtime. Previously (i.e. prior to VPLEX), solutions to prevent downtime fell into two camps:

1. Highly available and fault tolerant systems within a datacenter
2. Disaster recovery solutions outside of a datacenter

The benefit of FT and HA solutions is that they provide automatic recovery in the event of a failure. However, the geographical protection range is limited to a single datacenter, so business services are not protected from a datacenter failure. Disaster recovery solutions, on the other hand, typically protect business services using geographic dispersion, so that if a datacenter fails, recovery is achieved using another datacenter in a separate fault domain from the primary. Some of the drawbacks with disaster recovery solutions, however, are that they are human-decision based (i.e. not automatic) and typically require a second, disruptive failback once the primary site is repaired. In other words, should a primary datacenter fail, the business would need to make a non-trivial decision to invoke disaster recovery. Since disaster recovery is decision-based (i.e. manually invoked), it can lead to extended outages: the decision itself takes time, and it is generally made at the business level involving key stakeholders.

As most site outages are caused by recoverable events (e.g. an elongated power outage), faced with the "invoke DR" decision some businesses choose not to invoke DR and to ride through the outage instead. This means that critical business IT services remain offline for the duration of the event. These types of scenarios are not uncommon in "disaster" situations, and non-invocation can occur for various reasons. The two biggest ones are:

1. The primary site that failed can be recovered within 24-48 hours, therefore not warranting the complexity and risk of invoking DR.
2. Invoking DR will require a failback at some point in the future, which in turn will bring more disruption.

Other potential concerns with invoking disaster recovery include complexity, lack of testing, lack of resources, lack of skill sets, and lengthy recovery time. To avoid such pitfalls, VPLEX and VMware offer a more comprehensive answer to safeguarding your environments. By combining the benefits of HA and FT, a new category of availability is created. This new category provides the automatic (non-decision based) benefits of FT and
HA, but allows them to be leveraged over distance by using VPLEX Metro. This brings the geographical distance benefits normally associated with disaster recovery to the table, enhancing the HA and FT propositions significantly. The new category is known as Federated Availability, and it enables bulletproof availability, significantly lessening the chance of downtime for both planned and unplanned events.
Federation: …elements at a peer level over distance, enabling mobility, availability and collaboration.

Automatic: No human intervention whatsoever (e.g. HA and FT).

Automated: No human intervention required once a decision has been made (e.g. disaster recovery with VMware's SRM technology).
Figure 3 Single host access to a single disk

Clearly, the host in the diagram is the only initiator accessing the single volume. The next figure shows a local two-node cluster.
Figure 4 Multiple host access to a single disk

As shown in the diagram, there are now two hosts contending for the single volume. The dashed orange rectangle shows that each of the nodes is required to be in a cluster, or utilize a cluster file system, so they can effectively coordinate locking to ensure the volume remains consistent. The next figure shows the same two-node cluster, now connected to a VPLEX distributed volume using VPLEX cache coherency technology.
Figure 5 Multiple host access to a VPLEX distributed volume

In this example there is no difference to the fundamental dynamics of the two-node cluster's access pattern to the single volume. As far as the hosts are concerned, they cannot see any difference between this and the previous example, since VPLEX is distributing the device between datacenters via AccessAnywhere (which is a type of federation). This means that the hosts are still required to coordinate locking to ensure the volume remains consistent. For ESXi this mechanism is controlled by the cluster file system, Virtual Machine File System (VMFS), within each datastore. In this case each distributed volume is presented to the ESXi hosts and formatted with the VMFS file system. The figure below shows a high-level physical topology of a VPLEX Metro distributed device.
Figure 6 Physical topology of a VPLEX Metro distributed volume

This figure is a physical representation of the logical configuration shown in Figure 5. Effectively, with this topology deployed, the distributed volume can be treated just like any other volume; the only difference is that it is now distributed and available in two locations at the same time. Another benefit of this type of architecture is extreme simplicity, since it is no more difficult to configure a cluster across distance than it is within a single data center.

Note: VPLEX Metro can use either 8Gb FC or native 10Gb Ethernet WAN connectivity (where the word LINK is written in the figure). When using FC connectivity, this can be configured with either a dedicated channel (i.e. separate, non-merged fabrics) or ISL based (i.e. where fabrics have been merged across sites). It is assumed that any WAN link will have a second, physically redundant circuit.

Note: It is vital that VPLEX Metro has enough bandwidth between clusters to meet requirements. EMC can assist in the qualification of this through the Business Continuity Solution Designer (BCSD) tool. Please engage your EMC account team to perform a sizing exercise.

For further details on VPLEX Metro architecture, please see the VPLEX HA Techbook found here: http://www.emc.com/collateral/hardware/technical-documentation/h7113-vplex-architecture-deployment.pdf
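From an ESXi host's perspective, a distributed volume is simply another SCSI device. A quick way to confirm that hosts at both sites see the volume and its paths is from the ESXi shell; the sketch below is illustrative only, and the naa identifier is a hypothetical example:

# List all SCSI devices and pick out the VPLEX volume by its vendor string.
~ # esxcli storage core device list | grep -i -A 8 "EMC"
# Show every path available to the distributed volume.
~ # esxcli storage core path list -d naa.60001440000000103000000000000001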
Preference rule      | VPLEX cluster partition                | Site A fails                      | Site B fails
Cluster A preferred  | Site A ONLINE, Site B SUSPENDED (GOOD) | Site B SUSPENDED (BAD, by design) | Site A ONLINE (GOOD)
Cluster B preferred  | Site A SUSPENDED, Site B ONLINE (GOOD) | Site B ONLINE (GOOD)              | Site A SUSPENDED (BAD, by design)
No automatic winner  | Both sites SUSPENDED (by design)       | Site B SUSPENDED (by design)      | Site A SUSPENDED (by design)

Table 1 Failure scenarios without VPLEX Witness

As we can see in Table 1 (above), if we only used the preference rules without VPLEX Witness, then under some scenarios manual intervention would be required to bring the volume online at a given VPLEX cluster (e.g. if site A is the preferred site and site A fails, site B would also suspend). This is where VPLEX Witness assists, since it can better diagnose failures through network triangulation, ensuring that at any time at least one of the VPLEX clusters has an active path to the data, as shown in the table below:
Preference rule      | VPLEX cluster partition                | Site A fails                 | Site B fails
Cluster A preferred  | Site A ONLINE, Site B SUSPENDED (GOOD) | Site B ONLINE (GOOD)         | Site A ONLINE (GOOD)
Cluster B preferred  | Site A SUSPENDED, Site B ONLINE (GOOD) | Site B ONLINE (GOOD)         | Site A ONLINE (GOOD)
No automatic winner  | Both sites SUSPENDED (by design)       | Site B SUSPENDED (by design) | Site A SUSPENDED (by design)

Table 2 Failure scenarios with VPLEX Witness

As one can see from Table 2, VPLEX Witness converts a VPLEX Metro from an active/active mobility and collaboration solution into an active/active continuously available storage cluster. Furthermore, once VPLEX Witness is deployed, failure scenarios become self-managing (i.e. fully automatic), which makes the solution extremely simple to operate since there is nothing to do regardless of the failure condition.
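The site preference itself is configured per consistency group from the VPLEX CLI. The sketch below is illustrative only (exact command syntax and context paths can vary between GeoSynchrony releases, and the group name is hypothetical):

# From the VPlexcli prompt, make cluster-1 the preferred (winning) site
# for a consistency group of distributed volumes, with a 5 second detach delay.
VPlexcli:/> consistency-group set-detach-rule winner --cluster cluster-1 --delay 5s --consistency-groups metro_ha_cg1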
Figure 7 VPLEX configured for VPLEX Witness

As depicted in Figure 7, the Witness VM is deployed in a separate fault domain (as defined by the customer) and connected to both VPLEX management stations via an IP network.

Note: The fault domain is decided by the customer and can range from different racks in the same datacenter all the way up to VPLEX clusters 5ms away from each other (5ms measured as round-trip-time latency, i.e. typical synchronous distance). The VPLEX Witness itself can be placed even further from the two VPLEX clusters; the current supported maximum round-trip latency for this is 1 second.
IMPORTANT / REQUIREMENT!
Figure 8 Detailed VPLEX Witness network layout

The witness network is physically separate from the VPLEX inter-cluster network, and the Witness also uses storage that is physically separate from either VPLEX cluster. As stated previously, it is critical to deploy VPLEX Witness into a third failure domain. The definition of this domain changes depending on where the VPLEX clusters are deployed. For instance, if the VPLEX Metro clusters are deployed into the same physical building, perhaps in different areas of the datacenter, then the failure domain would be deemed the VPLEX rack itself; VPLEX Witness could then also be deployed in the same physical building but in a separate rack. If, however, each VPLEX cluster were deployed 50 miles apart in totally different buildings, then the failure domain would be the physical building and/or town. In that scenario it would make sense to deploy VPLEX Witness in another town altogether; and since the maximum round-trip latency can be as much as one second, you could effectively pick any city in the world, especially given the bandwidth requirement is as low as 3Kb/sec.
For more in-depth VPLEX Witness architecture details, please refer to the VPLEX HA Techbook that can be found here: http://www.emc.com/collateral/hardware/technical-documentation/h7113-vplex-architecture-deployment.pdf

Note: Always deploy VPLEX Witness in a third failure domain and ensure that all distributed volumes reside in a consistency group with the witness function enabled. Also ensure that the EMC Secure Remote Support (ESRS) Gateway is fully configured so that the witness can alert if it fails for any reason. It is important to note that there is no impact to I/O if the witness fails.
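As an illustrative health check from the VPLEX CLI (context paths may vary between GeoSynchrony releases), the Witness components and their operational state can be listed with:

# Display the cluster-witness components and their state from the VPlexcli prompt.
VPlexcli:/> ll /cluster-witness/components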
Note: If the VPLEX Witness VM is itself protected with VMware FT, that FT configuration must reside within one location only; it must not be a stretched/federated FT configuration. The storage that the VPLEX Witness uses should be physically contained within the boundaries of the third failure domain, on local (i.e. not VPLEX Metro distributed) volumes. Additionally, it should be noted that HA alone is currently not supported for the Witness VM; only FT or unprotected configurations are.
VPLEX Metro HA
As discussed in the two previous sections, VPLEX Metro is able to provide active/active distributed storage; however, we have seen that in some failure cases access to the storage volume could be lost, for example if the preferred site fails and causes the non-preferred site to suspend too. Using VPLEX Witness overcomes this scenario and ensures that access to a VPLEX cluster is always maintained regardless of which site fails. VPLEX Metro HA describes a VPLEX Metro solution that has also been deployed with VPLEX Witness. As the name suggests, VPLEX Metro HA delivers truly available distributed storage volumes over distance and forms a solid foundation for additional layers of VMware technology such as HA and FT.

Note: It is assumed that all topologies discussed within this white paper use VPLEX Metro HA (i.e. VPLEX Metro plus VPLEX Witness). This is mandatory to ensure fully automatic (i.e. decision-less) recovery under all the failure conditions outlined within this document.
Figure 9 VPLEX Metro deployment with cross-connect

As we can see in the diagram, the cross-connect offers an alternate path (or paths) from each ESXi server to the remote VPLEX. This ensures that if an entire VPLEX cluster were to fail (which is unlikely since there is no single point of failure), there would be no interruption to I/O, since the remaining VPLEX cluster will continue to service I/O across the remote cross link (alternate path).

It is recommended when deploying cross-connect that, rather than merging fabrics and using an Inter Switch Link (ISL), additional host bus adapters (HBAs) be used to connect directly to the remote data center's switch fabric. This ensures that fabrics do not merge and span failure domains. Another important point to remember is that cross-connect is only supported for campus environments up to 1ms round trip time.

Note: When setting up cross-connect, each ESXi server will see double the paths to the datastore (50% local and 50% remote). It is best practice to set the pathing policy to fixed and mark the remote paths across to the other cluster as standby. This ensures that the workload remains balanced, committing to only a single cluster at any one time.
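A sketch of this host-side configuration from the ESXi command line follows; the device identifier and path name are hypothetical examples. With the Fixed policy, the pinned preferred path keeps I/O on the local VPLEX cluster, leaving the cross-connect paths for failover only:

# Set the path selection policy for the VPLEX device to Fixed.
~ # esxcli storage nmp device set -d naa.60001440000000103000000000000001 -P VMW_PSP_FIXED
# Pin the preferred path to a local VPLEX front-end port so that the
# remote (cross-connect) paths are only used on failover.
~ # esxcli storage nmp psp fixed deviceconfig set -d naa.60001440000000103000000000000001 -p vmhba2:C0:T0:L0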
…each location and does not require an ISL (although an ISL can optionally be deployed). To understand this in greater detail, and to quantify the benefits of non-uniform access, we must first understand uniform access.
As noted in the previous section, this type of configuration is known as "uniform access", since all I/O for any given storage volume is serviced uniformly by the exact same controller, passing to and from the same location. The diagram in Figure 10 below shows a typical example of a uniform architecture.
Figure 10 A typical uniform access ("split controller") layout

As we can see in the above diagram, hosts at each site connect to both controllers by way of the stretched fabric; however, the active controller (for any given LUN) is only at one of the sites (in this case site A).
1. Like the split cluster topology, this is still an active/passive solution and requires additional FC networking between locations. All hosts at the passive location need to access the active storage through some kind of cross-site ISL connection. Also similar to the split cluster topology, this will introduce higher response times at the passive site for both read and write I/O, as well as increased bandwidth utilization, since data has to traverse the WAN twice.

2. This type of configuration requires deep integration into the host I/O stack and adds complexity, as the passive volume needs to have its identity "spoofed". This is due to the fact that (unlike VPLEX) the passive volume has a different WWN and UUID (identity) than the active volume.

3. The host path management software has to be configured and maintained on all of the connected hosts at initial deployment time, as well as each time a new volume is added to the configuration. Also, since this is a non-standard configuration, only the vendor's path management software can be used, making it host and operating system dependent.

4. Manual intervention is required under some failure scenarios due to APD.
3. Non-uniform access is typically more efficient when compared to uniform access, since under normal conditions all I/O is handled by the local active controller (all controllers are active).

4. Interestingly, due to the active/active nature of VPLEX, should a full site outage occur, VPLEX does not need to perform a failover, since the remaining copy of the data was already active. This is another key difference when compared to uniform access, where the loss of the primary active node requires a failover to the passive node.

The diagram below shows a high-level architecture of VPLEX when distributed over a Metro distance:
Figure 11 VPLEX non-uniform access layout
As we can see in Figure 11, each host is only connected to the local VPLEX cluster, ensuring that I/O from either location is always serviced by the local storage controllers. VPLEX can achieve this because all of the controllers at both sites are in an active state and able to service I/O. Some other key differences to observe from the diagram are:

1. Storage devices behind VPLEX are only connected to their respective local VPLEX cluster and are not connected across the WAN, dramatically simplifying fabric design.

2. VPLEX has dedicated, redundant WAN ports that can be connected natively to either 10Gb Ethernet or 8Gb FC.

3. VPLEX has multiple active controllers in each location, ensuring there are no local single points of failure. With up to eight controllers in each location, VPLEX provides N+1 redundancy.
4. VPLEX uses and maintains single disk semantics across clusters at two different locations.
Figure 12 High-level VPLEX cross-connect with non-uniform I/O access
In Figure 12, each ESXi host now has an alternate path to the remote VPLEX cluster. Compared to the typical uniform diagram in the previous section, however, we can still see that the underlying VPLEX architecture differs significantly since it remains identical to the non-uniform layout, servicing I/O locally at either location.
Figure 13 Forced uniform mode due to WAN partition

As illustrated in Figure 13, VPLEX will invoke the "site preference rule", suspending access to a given distributed virtual volume at one of the locations (in this case site B). This ultimately means that I/O at site B has to traverse the link to site A, since the VPLEX controller path in site B is now suspended due to the preference rule. Another scenario where this might occur is if one of the VPLEX clusters at either location becomes isolated or destroyed. The diagram below shows an example of a localized rack failure at site B which has taken the VPLEX cluster at site B offline.
Figure 14 VPLEX forced uniform mode due to cluster failure
In this scenario the VPLEX cluster remains online at site A (through VPLEX Witness) and any I/O at site B will automatically access the VPLEX cluster at site A over the cross-connect, thereby turning the standby path into an active path. In summary, VPLEX can use forced uniform mode as a failsafe to ensure that the highest possible level of availability is maintained at all times. Note: Cross-connected VPLEX clusters are only supported with distances up to 1 ms round trip time.
[Figure: Federated HA topology. ESXi hosts at Site A and Site B each access their local VPLEX cluster, backed by heterogeneous storage, with VPLEX Witness connected to both sites over IP.]
For detailed technical setup instructions, please see the VPLEX Procedure Generator ("Configuring a distributed volume"), as well as the "VMware vSphere Metro Storage Cluster Case Study" white paper found here: http://www.vmware.com/files/pdf/techpaper/vSPHR-CS-MTRO-STOR-CLSTR-USLET-102-HI-RES.pdf for additional information around:

- Setting up Persistent Device Loss (PDL) handling
- vCenter placement options and considerations
- DRS enablement and affinity rules
- Controlling restart priorities (High/Medium/Low)
Note: A design consideration to take into account if DRS is desired within a solution is to ensure that there are enough compute and network resources at each location to take the full load of the business services should either site fail.
[Figure: Federated HA deployment. ESXi hosts at Site A and Site B connect to their local VPLEX clusters, which are joined over the WAN, with VPLEX Witness reachable from both sites over IP.]
The table below shows the different failure scenarios and the outcome:

Failure: Storage failure at site A
- VMs at A: Remain online / uninterrupted. VMs at B: Remain online / uninterrupted.
- Notes: A cache read miss at site A now incurs additional link latency; cache read hits remain the same, as do write I/O response times.

Failure: Storage failure at site B
- VMs at A: Remain online / uninterrupted. VMs at B: Remain online / uninterrupted.
- Notes: A cache read miss at site B now incurs additional link latency; cache read hits remain the same, as do write I/O response times. In both storage failure scenarios, both VPLEX clusters dial home.

Failure: ESXi host failure at site A
- VMs at A: All VMs are restarted automatically on the ESXi hosts at site B. VMs at B: Remain online / uninterrupted.
- Notes: Once the ESXi hosts are recovered, DRS (if configured) will move the VMs back automatically.

Failure: ESXi host failure at site B
- VMs at A: Remain online / uninterrupted. VMs at B: All VMs are restarted automatically on the ESXi hosts at site A.
- Notes: Once the ESXi hosts are recovered, DRS (if configured) will move the VMs back automatically.

Failure: Total cross-connect failure (if using cross-connect)
- VMs at A: Remain online / uninterrupted. VMs at B: Remain online / uninterrupted.
- Notes: The cross-connect is not normally in use and access remains non-uniform.

Failure: Full WAN failure (no cross-connect in place) and VPLEX preference at site A
- VMs at A: Remain online / uninterrupted. VMs at B: The distributed volume is suspended at B and Persistent Device Loss (PDL) is sent to the ESXi servers at B, causing the VMs to die; this invokes an HA restart and the VMs come online at A.
- Notes: The site B behavior is only valid for ESXi 5.0 update 1 and above. ESXi versions prior to 5.0 update 1 will require manual intervention for VMs at site B; use DRS site affinity to avoid manual intervention on older versions.

Failure: Full WAN failure (no cross-connect in place) and VPLEX preference at site B
- VMs at A: The distributed volume is suspended at A and PDL is sent to the ESXi servers at A, causing the VMs to die; this invokes an HA restart and the VMs come online at B. VMs at B: Remain online / uninterrupted.
- Notes: The site A behavior is only valid for ESXi 5.0 update 1 and above; earlier versions require manual intervention, avoidable with DRS site affinity.

Failure: Full WAN failure with cross-connect intact
- VMs at A: Remain online / uninterrupted. VMs at B: Remain online / uninterrupted.
- Notes: The cross-connect is now in use for the hosts at the "non-preferred" site (this is called forced uniform mode).

Failure: Full WAN failure with cross-connect partitioned and VPLEX preference at site A
- VMs at A: Remain online / uninterrupted. VMs at B: The distributed volume is suspended at B and PDL is sent to the ESXi servers at B, causing the VMs to die; this invokes an HA restart and the VMs come online at A.
- Notes: The site B behavior is only valid for ESXi 5.1 and above. ESXi versions prior to 5.1 (including 5.0 update 1) will require manual intervention for VMs at site B; use DRS site affinity to avoid manual intervention on older versions. *See note below.

Failure: Full WAN failure with cross-connect partitioned and VPLEX preference at site B
- VMs at A: The distributed volume is suspended at A and PDL is sent to the ESXi servers at A, causing the VMs to die; this invokes an HA restart and the VMs come online at B. VMs at B: Remain online / uninterrupted.
- Notes: The site A behavior is only valid for ESXi 5.1 and above; earlier versions (including 5.0 update 1) require manual intervention, avoidable with DRS site affinity. *See note below.

Failure: VPLEX cluster outage at A (with cross-connect)
- VMs at A: Remain online / uninterrupted; host I/O switches to the cluster at B over the cross-connect. VMs at B: Remain online / uninterrupted.
- Notes: Highly unlikely, since VPLEX has no single points of failure; a full site failure is more likely.

Failure: VPLEX cluster outage at B (with cross-connect)
- VMs at A: Remain online / uninterrupted. VMs at B: Remain online / uninterrupted; host I/O switches to the cluster at A over the cross-connect.
- Notes: Highly unlikely, since VPLEX has no single points of failure; a full site failure is more likely.

Failure: VPLEX cluster outage at A (no cross-connect)
- VMs at A: ESXi detects an all-paths-down (APD) condition; the VMs cannot continue and are not restarted. VMs at B: Remain online / uninterrupted.
- Notes: Highly unlikely, since VPLEX has no single points of failure; a full site failure is more likely.

Failure: VPLEX cluster outage at B (no cross-connect)
- VMs at A: Remain online / uninterrupted. VMs at B: ESXi detects an all-paths-down (APD) condition; the VMs cannot continue and are not restarted.
- Notes: Highly unlikely, since VPLEX has no single points of failure; a full site failure is more likely.

Failure: Full site failure at A
- VMs at A: All VMs die but, since VPLEX Witness ensures that the datastore remains online at B, they are restarted automatically at B. VMs at B: Remain online / uninterrupted.
- Notes: A disaster recovery solution would need a manual decision at this point, whereas the VPLEX HA layer ensures fully automatic operation with minimal downtime.

Failure: Full site failure at B
- VMs at A: Remain online / uninterrupted. VMs at B: All VMs die but, since VPLEX Witness ensures that the datastore remains online at A, they are restarted automatically at A.
- Notes: A disaster recovery solution would need a manual decision at this point, whereas the VPLEX HA layer ensures fully automatic operation with minimal downtime.
Note: In a full WAN partition that includes the cross-connect, VPLEX can only send SCSI sense code (2/4/3+5) across 50% of the paths, since the cross-connected paths are effectively dead. When using ESXi 5.1 and above, ESXi servers at the non-preferred site will declare PDL and kill the affected VMs, causing them to restart elsewhere (assuming the advanced settings are in place); however, ESXi 5.0 update 1 and below will only declare APD (even though VPLEX is sending sense code 2/4/3+5). This will result in a VM zombie state. Please see the section Path loss handling semantics (PDL and APD) for more details.
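When verifying this behavior in a test environment, the VMkernel log on an affected host indicates whether a device entered PDL or APD. The sketch below is a rough illustration only; the exact message text varies between ESXi builds:

# Look for PDL indications on the affected device.
~ # grep -i "permanently inaccessible" /var/log/vmkernel.log
# Look for path-down (APD) indications.
~ # grep -i "path down" /var/log/vmkernel.log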
[Figure: Federated FT topology. A VMware FT primary VM at one site and its secondary VM at the other, each running on ESXi hosts that access their local VPLEX cluster, backed by heterogeneous storage, with VPLEX Witness connected to both sites over IP.]
Similar to federated HA, this type of configuration requires a stretched Layer 2 network to ensure seamless operation regardless of which location the VM runs in.

Note: A further design consideration is that any limitation that exists with VMware FT compared to HA will also apply to the federated FT solution. Currently, with vSphere 5.1 and earlier, VMware FT can only support a single vCPU per VM. See the following paper for more details: http://www.vmware.com/files/pdf/fault_tolerance_recommendations_considerations_on_vmw_vsphere4.pdf
[Figure: Federated FT deployment with cross-connect. ESXi hosts at Site A and Site B are cross-connected to both VPLEX clusters, which are joined over the WAN, with VPLEX Witness reachable from both sites over IP.]
The table below shows the different failure scenarios and the outcome (assuming the primary VM runs at site A, which is also the VPLEX preferred site):

Failure: Storage failure at site A
- VM state: Remains online / uninterrupted, using the primary.
- Notes: Cache read hits remain the same, as do write I/O response times. A cache read miss at A now incurs additional link latency (<1ms); you can manually switch to the secondary if required to avoid this.

Failure: Storage failure at site B
- VM state: Remains online / uninterrupted, using the primary.
- Notes: No impact to storage operations, as all I/O is at A. In both storage failure scenarios, both VPLEX clusters dial home.

Failure: ESXi host failure at site A
- VM state: Remains online / uninterrupted, now using the secondary.
- Notes: FT automatically starts using the secondary VM, and the primary VM is automatically re-protected elsewhere. If using more than 2 nodes in the cluster, best practice is to ensure it is re-protected at the remote site via vMotion.

Failure: ESXi host failure at site B
- VM state: Remains online / uninterrupted, using the primary.
- Notes: The secondary VM is automatically re-protected elsewhere; the same placement best practice applies.

Failure: Total cross-connect failure
- VM state: Remains online / uninterrupted, using the primary.
- Notes: The cross-connect is not normally in use and access remains non-uniform.

Failure: WAN failure with cross-connect intact and the primary running at the preferred site
- VM state: Remains online / uninterrupted, using the primary.
- Notes: VPLEX suspends volume access at the non-preferred site. The cross-connect is still not in use, since the primary VM is running at the preferred site.

Failure: WAN failure with cross-connect intact and the primary running at the non-preferred site
- VM state: Remains online / uninterrupted, using the primary.
- Notes: The cross-connect is now in use (forced uniform mode) and all I/O goes to the controllers at the preferred site. Host I/O access switches into forced uniform mode via the ESXi path policy.

Failure: VPLEX cluster outage at A (with cross-connect)
- VM state: Remains online / uninterrupted, using the primary.
- Notes: Host I/O at A switches to the VPLEX cluster at B over the cross-connect (forced uniform mode).

Failure: VPLEX cluster outage at B (with cross-connect)
- VM state: Remains online / uninterrupted, using the primary.
- Notes: No impact, since there is no host I/O at the secondary VM; even if there were, the cross-connect ensures an alternate path to the other VPLEX cluster.

Failure: Full site failure at A
- VM state: Remains online / uninterrupted, now using the secondary.
- Notes: A disaster recovery solution would need a manual decision at this point, whereas the federated FT layer ensures fully automatic operation with no downtime.

Failure: Full site failure at B
- VM state: Remains online / uninterrupted, using the primary.
- Notes: The primary has no need to switch, since it is active at the site that is still operational.
Attribute          | Federated FT | Federated HA | Disaster Recovery
Automation         | Automatic    | Automatic    | Automated*
Decision required  | No           | No           | Yes
RPO                | 0            | 0            | 0 to minutes
RTO                | 0            | Seconds      | Minutes*
Downtime avoidance | Continuous   | Hybrid       | N/A**
Distance           | Any***       | Any***       | Any

Notes:
* Does not include decision time
** Must be invoked before downtime occurs
*** VMware only qualified with VPLEX Metro today

Table 5 BC attributes comparison

As one can see from Table 5, DR has a different set of parameters when compared to federated availability technologies. The diagram below shows a simplified pictorial view of the bigger business continuity framework, laying out the various components in relation to distance and automation level.
Figure 19 Automation level vs. distance

Figure 19 compares automation level against distance. Due to the distances VPLEX Metro can span, VPLEX does lend itself to a type of disaster recovery; however, this ability is a byproduct of its ability to achieve federated availability across long distances. This is because VPLEX not only provides the federation layer but also, by default, handles the synchronous replication. We can also see that there is an overlap in the disaster recovery space with EMC RecoverPoint technology. EMC RecoverPoint Continuous Remote Replication (CRR) has been designed from the ground up to provide best-of-breed long-distance disaster recovery capability, as well as operational recovery. It does not, however, provide a federated availability solution like VPLEX. Similar to using VPLEX Metro HA with VMware HA and FT, RecoverPoint CRR can also be combined with VMware's vCenter Site Recovery Manager (SRM) software to enhance its DR capability significantly. VMware vCenter Site Recovery Manager is the preferred and recommended solution for VM disaster recovery and is compatible with
VPLEX (Local or Metro). When combined with EMC RecoverPoint CRR technology using the RecoverPoint SRA (Storage Replication Adapter), SRM dramatically enhances and simplifies disaster recovery. Since a VM can now be protected using different geographical protection options, a choice can be made as to how each VM is configured, ensuring that the protection scheme matches the business criticality. This can effectively be thought of as protection tiering. The figure below shows the various protection tiers and how they each relate to business criticality.
[Figure: Protection tiering. Federated FT (FT + VPLEX) sits at the top tier, followed by federated HA (HA + VPLEX), with disaster recovery (SRM + VPLEX + RecoverPoint) below, ordered from automatic to automated.]

*Note: Although not shown in the figure, and while out of scope for this paper, both federated FT and HA solutions can easily be used in conjunction with RecoverPoint Continuous Data Protection (CDP) for the most critical workloads, giving automatic and highly granular operational recovery benefits that protect the entire environment from corruption or data loss events, perhaps caused by a rogue employee or virus.
Thanks to the inherent I/O journaling capabilities of RecoverPoint, its best-of-breed operational recovery benefits are automatically added to the solution too. While RecoverPoint and vCenter Site Recovery Manager are out of scope for this document, the figure below shows some additional topology information that is important to understand if you are currently weighing the options of choosing between DR, federated availability, or both.
Figure 21 Augmenting HA with DR (VPLEX Metro within or across buildings provides federated HA and downtime avoidance; RecoverPoint CRR provides operational and disaster recovery)

A good example of where augmenting these technologies makes sense would be a company with a campus-type setup, or with different failure domains within the same building. In this campus environment it would make good sense to deploy VMware HA or FT in a VPLEX federated deployment, providing an enhanced level of availability. However, a solution like this would more than likely also require an out-of-region disaster recovery solution, due to the close proximity of the two campus sites.
2. The site locations where the VPLEX clusters reside are too far apart (i.e. beyond 5ms, where VPLEX Metro HA is not possible). VPLEX Metro HA is only compatible with synchronous disk topologies; automatic restart is not possible with asynchronous deployments, largely because the remaining copy after a failure may be out of date.

3. VPLEX Witness cannot be deployed. To ensure recovery is fully automatic in all instances, VPLEX Witness is mandatory.

4. The business requires controlled and isolated DR testing for conformity reasons. Unless using custom scripting and point-in-time technology, isolated DR testing is not possible when stretching a cluster, since an additional version of the system cannot be brought online elsewhere (only the main production instance will be online at any given time). The only form of testing possible with a stretched cluster is to perform a graceful failover, or to simulate a site failure (see the VPLEX fault injection document for more details).

5. VM restart granularity (beyond three priorities) is required. In some environments it is vital that some services start before others. HA cannot always guarantee this, since it will try to restart all VMs that failed together (prioritizing them only as high/medium/low). DR, on the other hand, can have much tighter control over restart granularity, always ensuring that services come back online in the correct order.

6. Stretching a Layer 2 network is not possible. The major premise of any federated availability solution is that the network must be stretched to accommodate the relocation of VMs without requiring any network configuration changes. Therefore, if it is not possible to stretch a Layer 2 network between the two locations where VPLEX resides, a DR solution is a better fit.

7. Automatic network switchover is not possible. This is an important factor to consider: if a primary site has failed, it is not much good if all of the VMs are running at a location where the network has been isolated and all of the routing still points to the original location.
Another key factor in network topology is latency. VPLEX can support up to 5ms of round-trip-time latency where VMware HA solutions are deployed; however, only 1ms between clusters is supported for both VPLEX cross-cluster connect topologies and VMware FT topologies. The VPLEX hardware can be ordered with either an 8Gb/s FC WAN connection option or a native 10Gb Ethernet connectivity option. When using VPLEX with the FC option over long distances, it is important that there are enough FC buffer-to-buffer credits (BB_credits) available. More information on BB_credits is available in the EMC (SRDF) Networked Storage Topology Guide (page 91 onwards), available through Powerlink at: http://powerlink.emc.com/km/live1/en_US/Offering_Technical/Technical_Documentation/300-003-885.pdf
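As a rough rule of thumb (an industry approximation, not a figure taken from the guide above), a full-size FC frame occupies roughly 2 km of fibre at 1Gb/s, so the credits needed to keep a long link fully utilized scale with both speed and distance:

BB_credits ≈ (link speed in Gb/s ÷ 2) × one-way distance in km, plus headroom

For example, a 100 km link at 8Gb/s would need on the order of (8 ÷ 2) × 100 = 400 credits. Always confirm the final figure against the sizing guidance referenced above.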
The settings that need to be applied to vSphere 5.0 update 1 deployments (and beyond, including vSphere 5.1) are:

1. Using the vSphere Client, select the cluster, right-click and select Edit Settings. From the pop-up menu, select vSphere HA, then click Advanced Options. Define and save the following option:

das.maskCleanShutdownEnabled=true

2. On every ESXi server, create and edit (with vi) the file /etc/vmware/settings with the content below, then reboot the ESXi server. The following output shows the correct setting applied in the file:

~ # cat /etc/vmware/settings
disk.terminateVMOnPDLDefault=TRUE

Refer to the ESXi documentation for further details, as well as the white paper found here: http://www.vmware.com/files/pdf/techpaper/vSPHR-CS-MTRO-STOR-CLSTR-USLET-102-HI-RES.pdf

Note: vSphere and ESXi 5.1 introduce a new feature called APD timeout. This feature is automatically enabled in ESXi 5.1 deployments and, while not to be confused with PDL states, carries an advantage: if both fabrics to the ESXi host fail, or an entire VPLEX cluster fails, a host that would previously hang (the so-called VM zombie state) can now respond to non-storage requests, since "hostd" will effectively disconnect the unreachable storage. This feature does not, however, cause the affected VM to die. Please see this article for further details: http://www.vmware.com/files/pdf/techpaper/Whats-New-VMware-vSphere-51-Storage-Technical-Whitepaper.pdf. Since VPLEX uses a non-uniform architecture, it is expected that this situation should never be encountered on a VPLEX Metro cluster.
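For example, the host-side change from step 2 can be applied directly from the ESXi shell (a sketch; the reboot is still required afterwards):

# Append the PDL handling option to the host configuration file.
~ # echo "disk.terminateVMOnPDLDefault=TRUE" >> /etc/vmware/settings
# Verify the change before rebooting the host.
~ # cat /etc/vmware/settings
disk.terminateVMOnPDLDefault=TRUE
~ # reboot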
Dedicated route: A cross-connect configuration is deemed "dedicated" when the VPLEX WAN uses a physically separate channel from the cross-connect network AND is diversely routed, thereby maintaining a separate failure domain. It is then much less likely that a full partition of all circuits would occur at once. In this configuration it is best practice to terminate the circuits into different cabinets within each DC.
2. Two or four host initiators

Two initiators (where each ESXi server has only 2 HBA ports): For cross-connect to be possible with only two host initiators, merged fabrics between locations are required, and each ESXi initiator is zoned to the local and remote VPLEX front-end ports (i.e. both VPLEX clusters see the same initiator WWNs). Typically this is adopted if the physical server has only 2 initiators, and each initiator is zoned (across the ISL) to the remote VPLEX as well as the local one.

Four initiators (where each ESXi server has 4 HBA ports): Four initiators are required if separate fabrics are used in each location. This means connectivity is only possible via a dedicated pair of initiators for the local site and another dedicated pair for the remote site. (This is also a requirement for the "dedicated route" topology, as noted above.)
The table below shows the failure scenarios that a cross-connect protects against, and notes the effect on I/O at the preferred and non-preferred locations.

Cross-connect configuration topology failure comparisons

Scenario: VPLEX WAN partition
- Option 1, dedicated + diversely routed (6), 4 dedicated HBAs: non-preferred site forced uniform; preferred site OK
- Option 2, dedicated + diversely routed (6), 2 HBAs / merged fabric: non-preferred site forced uniform; preferred site OK
- Option 3, shared route (7), 4 dedicated HBAs: non-preferred site PDL (1,2,7); preferred site OK
- Option 4, shared route (7), 2 HBAs / merged fabric: non-preferred site PDL (1,2,7); preferred site OK
- Option 5, no cross-connect: non-preferred site PDL (1,7); preferred site OK

Scenario: Preferred VPLEX failed (5)
- Options 1-4: preferred site forced uniform; non-preferred site OK
- Option 5, no cross-connect: preferred site APD (3,5); non-preferred site OK

Scenario: Non-preferred VPLEX failed (5)
- Options 1-4: non-preferred site forced uniform; preferred site OK
- Option 5, no cross-connect: non-preferred site APD (3,5); preferred site OK

Scenario: Both fabrics fail at the preferred site (5)
- Options 1 and 3 (4 dedicated HBAs): preferred site forced uniform; non-preferred site OK
- Options 2 and 4 (merged fabric): full SAN failure at both sites! APD (3,4,5)
- Option 5, no cross-connect: preferred site APD (3,5); non-preferred site OK

Scenario: Both fabrics fail at the non-preferred site (5)
- Options 1 and 3 (4 dedicated HBAs): non-preferred site forced uniform; preferred site OK
- Options 2 and 4 (merged fabric): full SAN failure at both sites! APD (3,4,5)
- Option 5, no cross-connect: non-preferred site APD (3,5); preferred site OK

Notes:
1. Cross-connect is also partitioned or not installed. PDL will cause a VM to restart elsewhere.
2. Only 50% of paths get sense code 2/4/3+5. ESXi 5.1 and above will declare PDL, but ESXi 5.0 U1 or below will declare APD and may require manual intervention.
3. VPLEX is unable to send sense code 2/4/3+5. APD may require manual intervention. (Pre-ESXi 5.1, the VM will also be in a zombie state.)
4. VPLEX and the back-end storage arrays are in total isolation at both sites. The fabric must be restored to continue.
5. This would be deemed an unlikely scenario.
6. If your cross-connect network is not diversely routed, please use the "shared route" entries, since it is more likely all channels fail together.
7. A WAN partition would be deemed less likely if both physical channels are diversely routed.

Table 6 Cross-connect topology options

As we can see from Table 6, where possible it is always best to deploy the cross-connect with additional HBAs, while also using a separate dedicated route that is not shared with the VPLEX WAN.

Note: Only the first scenario (VPLEX WAN partition) would be deemed a likely event; all other scenarios assume a double component failure, which is highly unlikely. Additionally, when not using cross-connect, it is possible to significantly reduce the risk of WAN partition by diversely routing the dual VPLEX WAN channels.
Note: When not using cross-connect (option 5 above), but where the two individual VPLEX WAN channels are diversely routed and terminated into different cabinets within the datacenter, the preference rule settings also become less vital, since the likelihood of WAN failure is significantly reduced.
Recommendations:

1. Try to keep the VMs in a given cluster either all enabled for FT or all disabled for FT (i.e. try not to mix within clusters). This gives you two types of cluster in your datacenter (FT clusters and simple HA clusters). DRS can then be enabled on the simple HA clusters, bringing its benefits to those hosts, whereas the FT cluster should be equally balanced between sites, providing total resiliency for a smaller subset of the most critical systems.

2. Although an FT cluster can have more than two nodes, for a maintenance-free topology consider using no more than two nodes in the FT cluster. This ensures that the secondary VM always resides at the remote location without any intervention. If more nodes are required, consider using additional clusters, each with two nodes.

3. If more than two nodes are to be used, ensure there is an even, symmetrical balance (i.e. if using a 4-node cluster, keep 2 nodes at each site). Odd-numbered clusters are not sensible and could lead to an imbalance, or to not having enough resources to fully enable FT on all of the VMs.

4. When creating and naming physical ESXi servers, always try to give a site designation in the name. vSphere treats all the hosts in the cluster as a single entity, so naming the hosts correctly makes it easy to see which site each VM is located on.

5. When enabling FT with more than two nodes in a cluster, it is important to ensure that the secondary VM is manually vMotioned to an ESXi host that resides in the remote VPLEX fault domain (FT will initially place the secondary VM randomly onto any node in the cluster, which could be in the same failure domain as the primary).

6. If any host fails or is placed into maintenance mode when using more than two nodes in an FT cluster, it is recommended to re-check the FT secondary placements, as they may end up in the same failure domain as the primaries.
It is considered best practice to use a VPLEX consistency group per FT cluster and to set all of the volumes within the group to be preferred at the site where all of the primary VMs are located. This ensures that, for any given cluster, all of the primary VMs reside in the same physical location as each other. Larger consistency groups that span multiple FT clusters can be used, but care should be taken to ensure that all of the primary VMs reside at the preferred location (this is extremely easy to enforce with two-node clusters).

Note: Cross-cluster connect is a mandatory requirement for VMware FT with VPLEX. Please submit an RPQ to EMC if considering using FT without cross-connect beyond distances of 1ms.
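An illustrative VPLEX CLI sketch of this practice follows (command syntax can differ between GeoSynchrony releases, and the group and volume names are hypothetical):

# Create a consistency group for the FT cluster's distributed volumes.
VPlexcli:/> consistency-group create --cluster cluster-1 ft_cluster1_cg
# Add the FT cluster's distributed volumes to the group.
VPlexcli:/> consistency-group add-virtual-volumes --virtual-volumes dd_ftvol_1,dd_ftvol_2 --consistency-group ft_cluster1_cg
# Make the site running the primary VMs (cluster-1 here) the preferred winner.
VPlexcli:/> consistency-group set-detach-rule winner --cluster cluster-1 --delay 5s --consistency-groups ft_cluster1_cg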
Conclusion
Using best-of-breed VMware availability technologies brings increased availability to any x86-based VM within a local datacenter. VPLEX Metro HA is unique: it dissolves physical barriers by federating heterogeneous block storage devices, and it leverages distance to enhance availability. Using VPLEX Metro HA in conjunction with VMware availability technologies (such as VMware HA or FT) provides new levels of availability, suitable for the most mission-critical environments, that go beyond any other solution on the market today.
References
Demo of VPLEX and VMware Federated HA and FT
http://www.youtube.com/watch?v=Pk-1wp91i2Y

EMC VPLEX page on EMC.com
http://www.emc.com/campaign/global/vplex/index.htm

EMC VPLEX Simple Support Matrix
https://elabnavigator.emc.com/vault/pdf/EMC_VPLEX.pdf

VMware storage HCL (hardware compatibility list)
http://www.vmware.com/resources/compatibility/search.php?action=base&deviceCategory=san

EMC VPLEX HA Techbook
http://www.emc.com/collateral/hardware/technical-documentation/h7113-vplex-architecture-deployment.pdf

VMware Metro Storage Cluster white paper
http://www.vmware.com/files/pdf/techpaper/vSPHR-CS-MTRO-STOR-CLSTR-USLET-102-HI-RES.pdf

EMC Networked Storage Topology Guide (page 91 onwards)
http://powerlink.emc.com/km/live1/en_US/Offering_Technical/Technical_Documentation/300-003-885.pdf

VPLEX implementation best practices
http://powerlink.emc.com/km/live1/en_US/Offering_Technical/Technical_Documentation/h7139-implementation-planning-vplex-tn.pdf

What's new in vSphere 5.1 storage
http://www.vmware.com/files/pdf/techpaper/Whats-New-VMware-vSphere-51-Storage-Technical-Whitepaper.pdf
VMware Fault Tolerance recommendations and considerations
http://www.vmware.com/files/pdf/fault_tolerance_recommendations_considerations_on_vmw_vsphere4.pdf

VMware HA best practices
http://www.vmware.com/files/pdf/techpaper/vmw-vsphere-high-availability.pdf

VPLEX Administrator Guide on Powerlink
http://powerlink.emc.com/km/appmanager/km/secureDesktop?_nfpb=true&_pageLabel=default&internalId=0b014066805c2149&_irrt=true

VPLEX Procedure Generator
http://powerlink.emc.com/km/appmanager/km/secureDesktop?_nfpb=true&_pageLabel=query2&internalId=0b014066804e9dbc&_irrt=true

EMC RecoverPoint page on EMC.com
http://www.emc.com/replication/recoverpoint/recoverpoint.htm

Cisco OTV white paper
http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/DCI/whitepaper/DCI_1.html

Brocade Virtual Private LAN Service (VPLS) white paper
http://www.brocade.com/downloads/documents/white_papers/Offering_Scalable_Layer2_Services_with_VPLS_and_VLL.pdf