EMC, VMware, Rainfinity, Invista, and RecoverPoint are trademarks of EMC Corporation.
All other trademarks used herein are the property of their respective owners.
Course Objectives
Upon completion of this course, you will be able to:
• Define a Virtual Infrastructure
• List VMware product differences
• Explain the concept and benefits of Storage Virtualization
• Identify benefits, features, and advantages of an Invista Solution
• Cite basic concepts of File Level Virtualization
• Describe Rainfinity features, functions, and benefits
• Identify key features of a RecoverPoint solution
The objectives for this course are shown here. Please take a moment to read them.
VMware Virtualization
Upon completion of this module, you will be able to:
• Define a Virtual Infrastructure
• List VMware product differences
The objectives for this module are shown here. Please take a moment to read them.
Virtualization Technologies
[Diagram: the three virtualization technologies. Server virtualization: multiple APP/OS stacks per physical server. Block virtualization: Invista runs in the storage network above the physical storage. File virtualization: a Global Namespace over an IP network presents a NAS storage pool built from file servers, NetApp, and EMC devices.]
Complementing virtualization services are the virtualization technologies – server, file, and block. We
all know these EMC virtualization technologies well – VMware, Rainfinity, and Invista. Take the time
to learn about these technologies if you’re not familiar with them. There are occasions when one of
these technologies will be an obvious recommendation to address a customer problem.
Flexible Infrastructure
Server Virtualization
[Diagram: six APP/OS stacks running across three physical servers]
Next, let's move on to virtual servers and virtual storage, and consider the benefits to applications from
seeing their own logical server and storage resources. These are the two areas in which EMC is
focused on delivering virtualization technology.
This slide shows the principle of server virtualization in one graphic. There are three physical servers
with six copies of the operating system and applications running on top of them. Each of these
operating systems believes it has its own dedicated server.
The result is the ability to run multiple applications on the same server, even if an application believes it must have its own server in order to function correctly, which increases server utilization. And the EMC technology we review can even move applications across servers non-disruptively!
In traditional data centers, there is a tight relationship among particular computers, disk drives, and
network ports and the applications they support. VMware’s Virtual Infrastructure allows us to break
those bonds. We can dynamically move resources where they are needed, and move processing where
it makes most sense. VMware detaches the operating system and its applications from the hardware
they run on.
VMware Infrastructure 3 is a suite of software to optimize and manage the virtual infrastructure.
VMware Products
Category               Product          Use case
Enterprise Desktop     VMware ACE       Desktop security
Free Virtualization    VMware Player    Run, share, and evaluate pre-built applications and beta software in VMs
Free Virtualization    VMware Server    Test/dev, evaluate software, server provisioning
This table categorizes various VMware products and identifies the typical use case for each product
listed.
VMware Infrastructure 3
• A software suite for optimizing and managing IT environments through virtualization
• Consists of the following software:
  – ESX Server
    › Virtual SMP
  – VirtualCenter
    › VMotion
    › VMware HA (High Availability)
    › VMware DRS (Distributed Resource Scheduler)
  – VMware Consolidated Backup (VCB)
VMware ESX Server is the virtual infrastructure platform for datacenter environments. ESX Server
uses a bare-metal hypervisor architecture to provide the optimal performance and scalability for server
applications running in virtual machines.
With ESX Server, the VMware virtualization layer runs directly on the x86 hardware with AMD or
Intel processors, to give the virtual machines the most direct access to the host’s resources. Because
the VMware virtualization layer is in complete control of the host’s hardware, it makes it possible to
provide fine-grained resource allocations to each virtual machine. Precise amounts of host processor,
memory, network I/O and disk I/O resources can be granted to each virtual machine and those
allocations can be dynamically adjusted as workloads and service levels change.
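On ESX Server, such allocations are expressed with reservations, limits, and shares. As a toy illustration of the proportional-share idea (a Python sketch only, not VMware's actual scheduler), dividing a host's CPU capacity among virtual machines in proportion to their assigned shares looks like this:

```python
def entitlements(host_mhz, shares_by_vm):
    """Divide host CPU capacity among VMs in proportion to their shares."""
    total = sum(shares_by_vm.values())
    return {vm: host_mhz * s / total for vm, s in shares_by_vm.items()}

# Three VMs competing for a 6000 MHz host: doubling a VM's shares
# doubles its entitlement relative to the others.
print(entitlements(6000, {"web": 1000, "db": 2000, "test": 1000}))
# {'web': 1500.0, 'db': 3000.0, 'test': 1500.0}
```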
The ESX Server virtualization layer is a very thin, special purpose kernel entirely dedicated to
execution of virtual machines. It lacks much of the “surface area” found in conventional operating
systems like user logins, network stacks and remote access.
ESX Server gives you the flexibility to run it on large x86 servers in a scale-up environment or
installed on many smaller servers in a scale-out strategy. ESX Server is designed to run on host
systems ranging from dual-processor blades to 16-way NUMA servers. Each ESX Server host can run up to 128 virtual processors concurrently, sharing up to 64 GB of memory. With Virtual SMP, ESX Server lets you configure 2- or 4-way virtual machines for larger workloads that require more than one processor.
Under ESX Server, applications running within virtual machines access CPU, memory, disk, and their
network interfaces without direct access to the underlying hardware. The ESX Server’s virtualization
layer intercepts these requests and presents them to the physical hardware.
The service console supports administrative functions for the ESX Server. The service console is based
on a modified version of Red Hat Enterprise Linux 3 (Update 6). For ESX Server users who work at the command line, experience with Red Hat Linux or with other Unix-family operating systems is very helpful.
The VMkernel always assumes that it is running on top of valid, properly functioning x86 hardware.
Hardware failures, such as the failure of a physical CPU, can cause ESX Server to fail. If you are
concerned about the reliability of your server hardware, the best approach is to cluster virtual machines
between ESX Servers.
ESX 3 is supported on Intel (Xeon and above) and AMD Opteron (32-bit mode) processors. ESX 3
offers experimental support for a number of 64-bit guest operating systems. Refer to the Systems
Compatibility Guide for a complete list.
VMware VirtualCenter
• Create and manage inventory of hosts and virtual machines
• Provision virtual machines from templates
• Migrate running VMs across hosts (VMotion)
• Balance virtual machine workloads across hosts (VMware DRS)
• Manage virtual machines for high availability and disaster recovery (VMware HA)
VMware VirtualCenter is VMware’s tool for managing your virtual infrastructure. VirtualCenter gives
you a “single pane of glass” view of your entire virtual infrastructure, spanning all ESX Servers and
virtual machines hosted on those servers.
Provisioning new server virtual machines with VirtualCenter is a quick operation and VirtualCenter
lets you create a library of standardized virtual machine templates so your newly provisioned systems
always conform to your datacenter requirements.
VirtualCenter delivers a feature called VMotion that lets you migrate running virtual machines
between servers so you can perform hardware maintenance and shift servers with minimal downtime.
Other VirtualCenter features include VMware DRS, which helps you balance virtual machine
workloads across hosts, and VMware HA, which helps you manage virtual machines for high
availability and disaster recovery.
VirtualCenter Components
The VI Client and Web Client are the user interfaces used to access the VirtualCenter Server or ESX
Server directly. The Web Client provides a browser-based interface for managing VMs. Web Client
requires that Web Access run on the VirtualCenter Server, or ESX Server, or both.
Two services on the ESX Server are responsible for coordinating and launching tasks received from VirtualCenter or the client interfaces. The VirtualCenter Server sends task requests to the VirtualCenter agent, vpxa, which then forwards them to hostd. hostd is a background process that launches the task to be performed.
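That task flow can be pictured schematically. A minimal Python sketch (class names are illustrative; the real agents are native daemons, not Python objects):

```python
class Hostd:
    """Background process on the ESX host that launches the requested task."""
    def launch(self, task):
        print(f"hostd: executing {task}")

class Vpxa:
    """VirtualCenter agent: receives task requests and forwards them to hostd."""
    def __init__(self, hostd):
        self.hostd = hostd
    def receive(self, task):
        self.hostd.launch(task)  # vpxa forwards; hostd performs

# VirtualCenter Server sends a task request to the agent on the ESX host.
Vpxa(Hostd()).receive("PowerOnVM: vm01")
```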
Module Summary
Key points covered in this module:
• Virtual Infrastructure allows dynamic mapping of compute, storage, and network resources to business applications
• VMware ESX Server is the virtual infrastructure platform for datacenter environments
• VMware VirtualCenter is VMware’s tool for managing your virtual infrastructure
These are the key points covered in this module. Please take a moment to review them.
Rainfinity Overview
Upon completion of this module, you will be able to:
• Describe basic concepts of File Level Virtualization
• Identify and describe Rainfinity features and functions
• Explain the operation of Rainfinity Virtualization
• List the benefits of a Rainfinity solution
The objectives for this module are shown here. Please take a moment to read them.
EMC Rainfinity Global File Virtualization virtualizes NAS environments, making them simple to
manage. It does this by dynamically moving information without disruption to clients or applications.
Rainfinity Global File Virtualization achieves this through its unique network file-virtualization
capabilities – an out-of-band file system virtualization that enables non-disruptive data movement in
multi-vendor NAS environments.
• Virtualization
  – Increases data mobility by providing location independence of file systems from the users and applications that utilize them
  – This provides a layer of transparency to users
• Consolidation of Data
  – Allows customers to have more boxes
  – Looks like One Big Virtual Box
An old way to solve the file virtualization problem was to consolidate smaller file servers into a single
large file server. Another approach was to use an even larger file server - one that was easier to
manage and maintain. However, at some level of scale, it's impossible or doesn't make sense to keep
finding a bigger box. Of course, it is possible to perform lots of acrobatics on the client side using
auto-mounters, name servers and load balancers. But as system size mushroomed, those solutions
ultimately proved unwieldy and impossible to maintain.
That's when the current batch of file virtualization tools became attractive. They allow customers to
have more boxes, but make it look like it's just one big box. That simplifies life for storage users.
What virtualization does is increase data mobility by providing location independence of file systems
and files from the applications and people who use them.
Rainfinity
• Rainfinity is a dedicated hardware/software solution that manages file-oriented (NFS/CIFS) storage access
• Provides transparent data mobility
• Enables file storage virtualization using industry standard protocols and mechanisms in a heterogeneous environment
• GFV is the abstraction of file-based storage over an IP network
  – Physical storage location is transparent to users and applications
  – File-based storage systems are seen as a logical pool of resources
  – Provides constant access to data while moving NFS/CIFS data
Rainfinity supports the management of file-oriented data and their servers. File-oriented data is data
that is accessed by CIFS or NFS. Rainfinity is a dedicated hardware/software platform solution.
Rainfinity allows clients to access the data it manages, and does so transparently to the user.
Virtualization is an abstraction of the logical and physical paths to data. The client is unaware where
the data physically resides. The management of the namespace can be accomplished by industry
standard mechanisms such as a Distributed File System (DFS) in a Windows environment, and
NIS/Automount and LDAP in a UNIX environment. Rainfinity does not create its own namespace; it
integrates with these existing industry namespaces.
Rainfinity GFV provides for constant access to data. This means file-sharing data can be moved from
one file server to another while clients are reading and writing to that data.
GFV provides a layer of transparency to users and applications. With Rainfinity, data is mirrored and
appears as a single source to users and applications. Namespace transparency maps the logical name
to the physical location after it has been moved so users and applications are redirected to the new
location without reconfiguring the physical path names. GFV simplifies storage management by
bringing location transparency to users and applications accessing storage.
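The indirection can be pictured as a lookup table that only the namespace layer updates. A toy Python sketch (paths and server names are hypothetical; real deployments use DFS links or automount maps rather than Python):

```python
# Logical namespace: clients resolve logical paths, never physical ones.
namespace = {"/corp/engineering": r"\\filer-old\eng"}

def resolve(logical_path):
    return namespace[logical_path]

# After a non-disruptive move, only the namespace entry changes; clients
# keep using the same logical path and are redirected automatically.
namespace["/corp/engineering"] = r"\\filer-new\eng"
print(resolve("/corp/engineering"))  # \\filer-new\eng
```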
[Diagram: heterogeneous NAS devices/platforms virtualized by the Rainfinity appliance]
EMC Rainfinity Global File Virtualization delivers these benefits by virtualizing NAS environments,
making them simple to manage, and dynamically moving information without disruption to clients or
applications.
Rainfinity Global File Virtualization achieves this through its unique network file-virtualization
capabilities, which are delivered through a 2U rack mountable appliance.
Rainfinity Hardware
GFV-4:
• Customized Linux operating system (version 2.6 kernel)
• 64-bit hardware (processor); single or clustered unit
• Memory: 4 GB SDRAM DIMM for a single configuration; 12 GB SDRAM DIMM for enterprise configurations
• CPU: dual Intel Xeon processors, 3.6 GHz or higher
• L2 cache: 1 MB
• Keyboard/mouse: PS/2

GFV-5:
• Customized Linux operating system (version 2.6 kernel)
• 64-bit hardware (processor); single or clustered unit
• Memory: 4 GB single-rank registered 667-MHz SDRAM DIMM (CIFS license); 16 GB dual-rank registered 667-MHz SDRAM DIMM (CIFS, NFS, or Enterprise license)
• CPU: dual Intel Xeon processors, 3.0 GHz or higher
• L2 cache: 4 MB
• Keyboard/mouse: USB
GFV-4:
• All versions of the Rainfinity software require a 64-bit processor.
• It ships as a stand-alone unit or can be clustered for high availability.
• Rainfinity uses a customized operating system based on the Linux 2.6 kernel.
• The Rainfinity appliance is based on HP ProLiant DL380 G4 hardware.
• 2U rack-mount form factor with sliding rails.
• The SmartArray 6i storage controller buffers all writes to disk so that, in the event of a critical full-system failure, important state is saved even during an abrupt disk or power failure.
GFV 5 only:
The Global File Virtualization appliance chassis is based on Intel processor-based hardware.
The appliance includes dual Intel Xeon processors, 4MB of L2 cache and a 1333 MHz front-side bus.
The memory available is either 4GB or 16GB. The 4GB (8 x 512MB) single rank registered 667-MHz
SDRAM DIMM (Dual Inline Memory Module) is available with the standard configuration CIFS
license only. The 16GB (8 x 2GB) dual rank registered 667-MHz SDRAM DIMM is available with
CIFS, NFS and the Enterprise configuration license.
Hot swappable fans are included.
Front of Appliance
[Photos: front views of the GFV-4 and GFV-5 appliances]
Rainfinity Software
• Ships with three separate CDs:
  – Rainfinity code
  – Windows Proxy Service
    › Required to move CIFS data
  – SID Translator
    › Moves local groups from the source to the destination server
• Minimal setup
  – Network configuration
    › IP address, netmask, default gateway IP, DNS, hostname
  – Date/time
    › Date, time, NTP time services, time zone
Rainfinity ships with three CDs: the Rainfinity code, the Windows Proxy Service, and the SID Translator.
The Windows Proxy service is required to move CIFS data. Windows servers require that some
operations be performed by native Windows clients. Rainfinity connects to a computer running the
Windows Proxy and uses it to perform statistic collection and administration tasks.
Rainfinity is able to translate Security IDs (SIDs) in the security properties of the files and directories involved in a CIFS transaction. This capability may be used to assist data migration projects in which the data’s group or user association changes from the source to the destination, for example, when the Access Control List (ACL) is defined in terms of local groups on the source file server. When the data is migrated to the destination server, the ACL should be defined in terms of the corresponding local groups on the destination server. The rules governing such translation are defined in the SID translation tables.
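A minimal sketch of applying such a table to an ACL (Python; the SIDs and the table format are hypothetical, not Rainfinity's actual representation):

```python
# Hypothetical translation table: local-group SIDs on the source server
# mapped to the corresponding local-group SIDs on the destination server.
sid_table = {"S-1-5-21-111-222-333-1001": "S-1-5-21-444-555-666-2001"}

def translate_acl(acl, table):
    """Rewrite each ACL entry's SID; SIDs without a rule pass through."""
    return [(table.get(sid, sid), perms) for sid, perms in acl]

acl = [("S-1-5-21-111-222-333-1001", "FullControl"),
       ("S-1-5-18", "Read")]                  # well-known SID, unchanged
print(translate_acl(acl, sid_table))
```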
The first step is initial Rainfinity setup. This includes, at a minimum: logging in and setting the
date/time, port configuration, and basic network settings.
When a Rainfinity appliance ships to a customer location, the software has been installed and tested. In
some cases, it might be necessary to reconfigure or re-install the system to customer specifications.
Upon first-time login, the rssetup script receives the login request and acts as an interactive menu-based interface for the user.
Namespace
[Diagram: a Rainfinity appliance in a NAS environment with a global namespace: DFS and Active Directory on the Windows side; Automount with NIS or LDAP on the Unix side. The Rainfinity Manager and an event log serve the administrator. Step 1, Data Mobility: Rainfinity is triggered. Step 2, Redirection: the global namespace is updated.]
This is a representation of a NAS environment. The heterogeneous back-end file servers and NAS devices are represented
below – Rainfinity operates at the protocol level, CIFS and NFS, so a broad range of support is available.
The namespace is represented in the top left. Clients access data through a logical view; the namespace provides the mapping from that logical view to the underlying physical locations. Rainfinity supports industry standard namespaces such as DFS and Automount. Rainfinity can also work with custom login scripts.
The IP Client network is shown here connecting clients to the Network Attached Storage. Rainfinity installs by plugging
into the network switch. There are no changes required to client mount points. Rainfinity installs in the network but is not in
the data path. When you install Rainfinity you set up a separate VLAN in the network. Clients continue to access storage
with no disruption.
When there is some data relocation that needs to take place for cost or optimization reasons the ports associated with the
involved file servers are associated with the Rainfinity VLAN. Rainfinity is now in the data path for these file servers and
can ensure client access to the data even though it is being dynamically relocated.
Rainfinity treats a data relocation as a transaction, with rollback and pause/restart capabilities, whether a directory, volume, or file system is being relocated. Any updates during the transaction are synchronized across both the original source and the new destination. If Rainfinity were removed from the network in the middle of a transaction, there is no data integrity risk: all updates are reflected on the source, and the clients are still mounting the source. You can plug Rainfinity back in and the transaction resumes.
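That synchronization rule can be modeled in a few lines. This is a sketch of the idea only (Python; the real product operates on CIFS/NFS traffic, not dictionaries):

```python
class MoveTransaction:
    """During a relocation, client updates land on both copies; the source
    stays authoritative, so removing the appliance mid-move loses nothing."""
    def __init__(self, source, destination):
        self.source, self.destination = source, destination
        self.active = True

    def client_write(self, name, data):
        self.source[name] = data           # source is always updated
        if self.active:
            self.destination[name] = data  # destination kept in sync

    def interrupt(self):
        self.active = False                # clients still mount the source

src, dst = {}, {}
txn = MoveTransaction(src, dst)
txn.client_write("report.doc", b"v2")
assert src == dst                          # both copies hold the update
```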
Once the data relocation is complete, Rainfinity now updates the global namespace (DFS for Windows, Automount for
Unix, or login scripts or homegrown namespace solution). The namespace in turn updates the clients. Rainfinity leverages
the industry standard approaches such as Microsoft DFS so there is no need for additional agents to be deployed on all
clients.
The new authoritative copy of the data is at the new location. The original source reflects a point-in-time copy at the end of the transaction, including any updates made up to that point. Updating client mappings takes time, however, so Rainfinity remains in the data path and redirects client access to the new location. Over time, the number of sessions to redirect decreases as new sessions mount directly at the new location.
Namespace
[Diagram: the same environment at completion. Step 1, Data Mobility: Rainfinity is triggered. Step 2, Redirection: the global namespace is updated. Step 3, Transaction Complete: without downtime.]
When all of the client sessions have been remapped to the new location Rainfinity completes the
transaction and the NAS devices move out of the Rainfinity VLAN. Rainfinity is now out of the data
path.
Rainfinity has an autocomplete feature that provides a policy to control transaction completion. This can be based on the percentage of clients remapped. You can set up tiers of users and have different policies for each: key operational clients or applications might have to be 100% remapped, for instance, while other department users can have lower percentages. Rainfinity can also terminate client sessions to perform a remap/remount based on idle-time thresholds.
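A sketch of how such a completion policy might be evaluated (Python; the tier names and thresholds are hypothetical):

```python
# Fraction of client sessions that must be remapped, per tier, before the
# transaction may complete.
policy = {"key-operations": 1.00, "department": 0.80}

def can_complete(remapped, policy):
    return all(remapped.get(tier, 0.0) >= need for tier, need in policy.items())

print(can_complete({"key-operations": 1.00, "department": 0.85}, policy))  # True
print(can_complete({"key-operations": 0.97, "department": 0.90}, policy))  # False
```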
Rainfinity virtualizes an environment 100% of the time, with the namespace providing a logical abstraction layer. The Rainfinity appliance selectively virtualizes traffic on the wire based on particular optimization or relocation events that need to take place.
Questions that might come up:
Can Rainfinity handle simultaneous transactions? Yes, you can define multiple transactions. Rainfinity performs only one active move transaction at a time (the rest are queued), but it can perform redirection for multiple transactions simultaneously.
Namespace
[Diagram: File Management. Step 4, Policy-Based File Archiving: Rainfinity archives files. Step 5, End-User Retrieval: accessing the stub file retrieves the archived file.]
The last animation regarding the migration process is displayed here; please take a moment to review.
Rainfinity Applications
• Capacity Management
• Performance Management
• Tiered Storage Management
• Global Namespace Management
• Migration and Consolidation
• Synchronous Replication
• Rainfinity Platform
• File Management
Rainfinity application software provides the functionality of the system. The first three applications
listed here drive storage optimization by visualizing usage trends and exceptions. The Global
Namespace Management application manages existing namespaces in the environment. The other
applications move data between storage devices. The next few slides highlight the major features of
Rainfinity, although there is so much more.
Rainfinity is designed to be installed between file servers and clients on the network. To achieve this,
Rainfinity functions as an Ethernet switch.
This functionality as a Layer 2 switch enables Rainfinity to see and process traffic between clients and
file servers with minimal modification to your existing networking. Rainfinity is aware of file-sharing
protocols. It is this application-layer intelligence that allows Rainfinity to move data without
interrupting client access.
When Rainfinity is performing a move, the two file servers involved in the move must be on the private (or server-segment) side. In this case, Rainfinity is said to be in-band for those file servers, and those file servers are also referred to as in-band.
When Rainfinity is not doing a move or redirecting access to certain file servers, those file servers may
be moved to the public (or client-side) LAN segment. In this case, Rainfinity is said to be out-of-band
for these file servers, and those file servers are also referred to as out-of-band.
Tiered storage management, or file archival, allows for the efficient placement of data based on
capacity and service-level agreements. File placement can increase efficiency by intelligently placing
data in optimal tiers of performance and price, thereby lowering the overall average cost of storage.
Enterprises can use lower cost storage, such as ATA drives or tape, to store less critical data at a
fraction of the cost of high-performance storage. Intelligent software that automatically classifies and migrates data based on policy is required to make file placement practical and feasible.
Policy engines, or intelligent software running on a separate server, migrate data from one storage tier
to another based upon configured policies. These policies can be based upon the size of the data, the
length of time since the last access, or by particular file extension type. One of the primary features
announced with Rainfinity GFV version 7.0 is Rainfinity File Management, or RFM. RFM provides
the policy engine functionality and currently supports NetApp to Centera archival and retrieval.
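The flavor of such a policy rule can be sketched in a few lines of Python. This is an illustration of the concept only, with hypothetical thresholds and paths, not RFM's actual engine:

```python
import os, time

def archive_candidates(root, min_size=10 * 2**20, min_idle_days=180,
                       extensions=(".log", ".tmp", ".bak")):
    """Select files for archival by size, time since last access, or extension."""
    cutoff = time.time() - min_idle_days * 86400
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            if (st.st_size >= min_size or st.st_atime < cutoff
                    or os.path.splitext(name)[1] in extensions):
                yield path

for path in archive_candidates("/shares/projects"):  # hypothetical share
    print("archive:", path)
```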
Rainfinity Global File Virtualization leverages industry-standard Global Namespaces with a scalable,
transparent file-protocol switching capability. There are many advantages to this approach: it limits risk and performance concerns, and it leverages the continuing investments being made by large vendors and standards bodies, whether your namespace is Microsoft DFS or Automount. Rainfinity Global File Virtualization can also work with existing environments in which a standard namespace is not deployed, such as login scripts or “homegrown” namespaces.
Rainfinity Global File Virtualization is a solution that provides complete transparency: not just transparency of file access, but transparency to the environment. This solution does not require mount-point changes or the deployment of agents on clients or servers.
Virtualization should leverage the investments you’ve already made in your storage-infrastructure and
management tools. Check for Standards support and vendor certifications, and make sure your existing
management tools and data-protection policies are not adversely impacted.
• Industry standards-based
• Enterprise service and support
  – EMC’s world-class technical service organization and 24x7 global hardware and software support
If you’re looking to streamline the operations of your file-server and NAS environments, Rainfinity
Global File Virtualization delivers:
• Optimized utilization of storage resources
• Accelerated storage consolidation
• Simplified management
• Increased protection of critical files
It does this by simplifying capacity management through non-disruptive data movement and
namespace-management updates, maintaining a virtual file system of your physical file-serving and
NAS resources.
Module Summary
Key points covered in this module:
• Virtualization is the newest approach to consolidating many servers into one
• Rainfinity is a dedicated hardware/software solution that manages file-oriented (NFS/CIFS) storage access
• Both GFV-4 and GFV-5 include mirrored 146.8 GB drives; the two have slight differences in hardware and processing power
• Rainfinity combines applications that monitor and move data between storage devices
• GFV streamlines operations of your file-server and NAS environments through non-disruptive data movement, leveraging the customer’s existing environment
These are the key points covered in this module. Please take a moment to review them.
Invista Overview
Upon completion of this module, you will be able to:
• Understand the concept and benefits of storage virtualization
• Identify the benefits, features, and advantages of an Invista Solution
• List the hardware and software components of Invista and how they work together to achieve storage virtualization
The objectives for this module are shown here. Please take a moment to read them.
EMC Invista
Network-based Storage Virtualization
• Performance architecture
  – Leverages next-generation “intelligent” SAN switches for high performance
  – Designed to work in enterprise-class environments
• Provides advanced functionality
[Diagram: virtual volumes presented by Invista, which runs in the storage network as Data Path Controllers and a Control Path Cluster]
Invista is a SAN-based storage-virtualization solution. Its architecture leverages new intelligent SAN-
switch hardware from EMC’s Connectrix partners that enables new levels of scalability and
functionality.
Unlike other storage-virtualization products, Invista is not appliance-based. This enables it to deliver
consistent, scalable performance across a heterogeneous storage environment, even using highly
random-I/O applications. Because Invista uses the processing capabilities of intelligent switches, it
eliminates the latency and bandwidth issues associated with an “in-band” appliance approach. By
using purpose-built switches with port-level processing, this “split-path” architecture delivers wire-
speed performance with negligible latency.
EMC’s unique network-based approach to storage virtualization enables certain key functionalities,
such as the ability to move active applications to different tiers of storage non-disruptively, and the
ability to leverage clones across a heterogeneous storage environment. These functions work uniformly
across qualified hosts and heterogeneous storage arrays.
In addition to integrating discovery and monitoring functions for virtual volumes into EMC
ControlCenter, Invista can also be easily managed from a graphical user interface (GUI) or a
command-line interface (CLI).
• Network-based Volume Management
  – Pool storage and manage volumes at the network level
• Heterogeneous Point-in-Time Copies
  – Create local copies of data for testing and repurposing across multiple types of storage
The next-generation hardware, combined with powerful Invista software, enables some unique
capabilities such as Dynamic Volume Mobility, Network-based Volume Management, and the ability
to create heterogeneous point-in-time copies.
Dynamic Volume Mobility allows Administrators to move primary volumes between heterogeneous
storage arrays while the application remains online. This enabler of information lifecycle management
allows you to move applications non-disruptively to the appropriate storage tier, based on application
requirements and service levels.
Network-based volume management is the basis for what many people think of as “virtualization.”
Invista enables you to create and configure virtual volumes from a heterogeneous storage pool and
present them out to hosts. It makes sense for the network to be the control point for this - abstracting
and aggregating the back-end storage, configuring it, and making it available to all of the connected
hosts.
Invista also gives you the ability to create clones of virtual volumes. This allows you to extend the
use of clones to areas where their use may have previously been impossible, due to compatibility
issues. For example, you can now create a clone from a high-tier, primary storage array and extend it
to a lower-tier, lower-cost storage array. This provides another local-replication option in your tiered
storage environment.
• Reduces complexity
  – Single interface for managing all tiers of storage
Invista provides support for EMC and third-party arrays, which allows an enterprise to leverage its
existing investments in storage capacity and resources.
Invista also supports Information Lifecycle Management by enabling data movement across multiple
storage tiers, and reduces management complexity by establishing a single interface for managing all
tiers of storage.
Finally, it increases operational efficiency by simplifying the movement of data to optimize
performance and the provisioning of storage among multiple vendor arrays.
[Diagram: production hosts connected over an IP network and a Layer 2 SAN to heterogeneous storage arrays]
In this illustration, the components of an Invista instance are detailed. Note that the illustration is a simplified representation of the Invista components; it does not show the redundant components needed for fault tolerance that are mandatory in Invista production configurations.
Invista includes these major components: the Control Path Cluster (CPC), Data Path Controller (DPC),
and an Ethernet switch.
The Control Path Cluster does not contain user data. Instead, it stores the Invista configuration
parameters and performs management functions for an Invista instance. These functions include
configuring and managing virtual storage.
The Data Path Controller accepts data and control requests from hosts to perform on the virtual
storage. It is the component that maps data read and write operations between the hosts (front end) and
storage arrays (back-end). The DPC gets its configuration from the CPC.
The Ethernet switch connects the CPC and DPC, and configuration and control information is passed
between the CPC and DPC via the Ethernet connections.
The Control Path Cluster (CPC) stores the following configuration metadata about the Invista instance:
• Storage element information (back-end array volumes). These volumes have been assigned to the Invista instance, and must be allocated exclusively for Invista use by the administrator of the storage arrays.
• Imported storage element information. Imported storage elements are simply storage elements that have been “imported” into the Invista instance; this identifies storage array capacity that Invista intends to use for creating virtual volumes.
• Virtual volume information, including the virtual volume name, the storage volume identification (ID), and the imported storage element used to create the virtual volume.
• Virtual frame information. A virtual frame identifies one or more virtual volumes and the host allowed to access them.
• Clone group information, including a data (source) volume and clone (copy) volumes.
The CPC downloads the appropriate information to the DPC.
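To make the relationships concrete, here is a rough sketch of these objects as Python data structures (field names are illustrative, not Invista's actual schema):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class StorageElement:          # back-end array volume assigned to Invista
    array: str
    volume_id: str
    imported: bool = False     # True once imported for virtual-volume use

@dataclass
class VirtualVolume:
    name: str
    storage_volume_id: str
    built_from: StorageElement # imported element backing this volume

@dataclass
class VirtualFrame:            # which host may access which virtual volumes
    host: str
    volumes: List[VirtualVolume] = field(default_factory=list)

@dataclass
class CloneGroup:
    source: VirtualVolume
    clones: List[VirtualVolume] = field(default_factory=list)
```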
• Serves as a:
  – Virtual target for hosts
  – Virtual initiator for storage arrays
The Data Path Controller (DPC) resides in the intelligent switch component of Invista. It receives part
of its configuration from the CPC.
The DPC is the center of all traffic in Invista. Each FC frame generated by hosts and storage arrays is
examined and forwarded to the appropriate device.
To the host, the DPC is a virtual target. To a storage array, the DPC is a virtual initiator.
[Diagram: control operations flowing between the DPC and the SAL-Agent on the Control Path Cluster]
When a command arrives at the DPC, there are two places where processing occurs.
The data path processors are the port-level ASICs that handle the incoming I/O from the host and do the remapping to the back-end storage. In typical operations, 95%+ of I/O is handled by the DPCs. Whatever cannot be handled by the DPCs is termed an exception. An exception might be a SCSI inquiry about the device, or an I/O for which the DPC doesn’t have mapping information. These exceptions are handled by the Control Path Cluster.
The CPC is also where the storage application actually runs. When the system starts up, the CPC loads the mapping tables for the virtual volumes into the DPCs.
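The split-path decision can be pictured as a table lookup with an exception path. A toy Python model (real DPCs do this in port-level ASICs, not software):

```python
class MappingMiss(Exception):
    """An I/O the DPC cannot map; it is forwarded to the CPC as an exception."""

def handle_io(io, mapping_table):
    try:  # fast path: remap the virtual address to a back-end location
        array, lun, lba = mapping_table[(io["virtual_volume"], io["lba"])]
    except KeyError:
        raise MappingMiss(io)  # slow path: the CPC resolves the request
    return {"array": array, "lun": lun, "lba": lba}

table = {("vv1", 0): ("array-A", 7, 4096)}
print(handle_io({"virtual_volume": "vv1", "lba": 0}, table))
```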
High-Availability Configuration
• Mirrored SAN
  – Two separate SANs, dual-HBA hosts
  – Supports nondisruptive code upgrades to virtualization components
  – Provides HA for switch configurations through fault isolation
  – Invista LUNs can be exposed on both fabrics
• CPC uses two Control Path Nodes
  – Active/Active cluster
  – LUN ownership model follows CP nodes
• Multiple DPCs
  – Failover LUNs across DPCs
  – Support for switch upgrades (hardware and firmware)
[Diagram: hosts with dual HBAs connected through front-end Layer 2 SANs (A/B fabrics) to DPCs and a core CPC, then through back-end Layer 2 SANs (A/B fabrics) to the storage arrays]
In this illustration, each host has two HBAs. Each HBA is cabled into a unique front-end Layer 2 fabric, and each Layer 2 fabric is cabled into a separate DPC. The DPCs share the same CPC for redundancy and for the ability to share volume mappings in case a component in one of the paths fails. Each DPC is cabled into a Layer 2 back-end SAN, which is cabled into one port on the back-end arrays.
In this example, only one CPC is needed because high availability is implemented within the CPC.
[Diagram: hosts A through F. Hosts C, D, and E connect to the Invista instance (DPC A in Fabric A, DPC B in L2 Fabric B); hosts A and F attach to the L2 SAN; host B connects to both. Heterogeneous storage arrays sit behind the fabrics.]
The diagram shows how an Invista configuration may look when it is coexisting with a traditional
SAN. DPC A and the fabric A switches are cabled together and are managed as one fabric. DPC B and
fabric B are configured in the same manner.
In this scenario, hosts C, D, and E are directly connected to the Invista environment. Hosts A and F are directly attached to the L2 SAN environment.
Host B has one connection to the Invista instance and another to the L2 SAN. Hosts may be connected in this manner for a number of reasons: they may not be taking part in the virtualized environment, or they may be being prepared for volume migration to the virtualized environment.
Hosts A, C, D, E, and F can be separated from Invista by zoning the HBA to the array port. By not zoning the HBA to a virtual target, the host bypasses Invista. However, physical connectivity to the DPC is preserved in case the host is migrated in the future.
In this example, the Storage tree folder has been expanded. Next, the underlying Storage Elements
folder is expanded, and then the Imported folder. The tree view displays a list of all imported storage
elements and the properties screen displays the properties of all imported storage elements.
The initial version of Invista includes the advanced software functionality best suited for storage
virtualization.
They are network-based volume management, dynamic volume mobility, and heterogeneous point-in-
time copies (clones). Each of these applications is controlled using Element Manager or the INVCLI
command line.
Remote Replication is achieved by integrating Invista with EMC RecoverPoint.
Volume Management
• Simplify volume presentation and management
  – Create, delete, and change functionality
  – Provides front-end LUN masking and mapping of storage volumes to the host
• Centralized volume management and control
  – Element Manager provides a single interface for volume management
• Use cases
  – Heterogeneous backup and recovery
  – Testing, development, training
  – Parallel processing, reporting, queries
• Integrated management
  – Replication Manager
  – EMC ControlCenter
  – Microsoft VSS
[Diagram: a concatenated volume presented as virtual volumes]
This slide illustrates how a virtual volume (shown in blue), can be cloned to other virtual volumes of
the same size. In this example, there are three clones shown in yellow, red, and green.
Invista permits users to create one or more full copies of a virtual volume. This functionality is
performed by Invista, not hosts or arrays, and does not require host CPU cycles. Administrators can use the Element Manager console or the Invista CLI to control cloning operations.
Active clones are managed as a “Clone Group”, which consists of a source volume and one or more
clone volumes. Clones can be built on volumes that span heterogeneous arrays. Invista cloning has the
requirement that the source and clone volumes must be the same size.
Clones created with Invista can be used for backups, restoring data, testing, report creation, etc.
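A small sketch of the equal-size constraint when forming a clone group (Python; a hypothetical helper, not the Invista CLI):

```python
def make_clone_group(source_size_gb, clone_sizes_gb):
    """Source and every clone volume must be the same size."""
    for size in clone_sizes_gb:
        if size != source_size_gb:
            raise ValueError(f"clone size {size} GB != source {source_size_gb} GB")
    return {"source": source_size_gb, "clones": list(clone_sizes_gb)}

make_clone_group(500, [500, 500])   # valid: clones may live on other arrays
# make_clone_group(500, [400])      # raises ValueError
```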
With Invista, data mobility refers to the high speed non-disruptive movement of data from one virtual
volume to another virtual volume. The source and destination arrays must be available to Invista. The
move is transparent to the host. There is no requirement to reboot or take other action due to the
migration. The host “sees” the same virtual volume before and after the data has been moved,
regardless of the storage array containing the data. In this example, the data on the green volume is
being moved to the blue volume.
Data mobility is a valuable tool for any situation in which the customer needs to move data without
impacting the application. For example, it is useful when the lease expires on a storage array and the
data must be retained. In this case, the data can be moved to the new array while the application is
running.
Module Summary
Key points covered in this module:
• Storage virtualization concepts
• Invista design and benefits
• Major hardware components of Invista
• Invista:
  – Management interfaces
  – Services (functionality)
  – Theory of operations
  – Configuration strategies
These are the key points covered in this module. Please take a moment to review them.
RecoverPoint
Upon completion of this module, you will be able to:
• Identify key features of a RecoverPoint solution
• Discuss the logical and physical components of the RecoverPoint solution
• Describe the architecture of RecoverPoint for CRR and CDP implementations
• Discuss the methods of write splitting used by RecoverPoint
The objectives for this module are shown here. Please take a moment to read them.
EMC RecoverPoint
• A single, unified solution for all storage arrays
• Recover data at a local or remote site without impacting performance, while retaining write-order fidelity
[Diagram: application servers, database servers, messaging servers, and file and print servers]
The EMC RecoverPoint data protection solution is a comprehensive solution for your entire data
center, providing local continuous data protection (CDP) and continuous remote replication (CRR).
RecoverPoint is a single, unified solution to protect and/or replicate data from all your current and
future storage arrays. It allows you to recover data at a local or remote site to any point in time. It simplifies management, reduces cost across the data center, and ensures continuous replication to a remote site without impacting performance, while retaining write-order fidelity.
[Diagram: RecoverPoint appliances at the local and remote sites. I/O is bookmarked into journal volumes; the local journal supports local recovery (CDP), and the remote journal supports remote recovery (CRR), each in front of its storage arrays.]
The RecoverPoint solution provides heterogeneous, bi-directional, and asynchronous replication across
an IP WAN infrastructure. This allows it to be placed into existing or new environments constrained
by long distances, high latency, or low bandwidth requirements.
Additionally, it significantly reduces infrastructure operational costs because dedicated bandwidth,
expensive Fibre Channel extension gear, and multiple complex solutions for different heterogeneous
arrays are no longer absolutely required. Moreover, RecoverPoint provides recovery capabilities at the local or remote site, and between different array models and types.
Data-Replication Challenges
• Heterogeneous environments
• Application response time
• Application-consistent recovery
• Corruption protection
• Disaster-recovery testing
• Communications cost
• Existing infrastructure
• Heterogeneous storage
[Diagram: a local site and a remote site, each running Oracle, Exchange, and SQL behind SANs]
RecoverPoint tackles the challenges noted in this slide with the following features:
• Integration with existing (heterogeneous) storage arrays, switches, and server environments - no “rip and replace”
• Intelligent use of bandwidth and data compression
• A policy-driven engine that supports multiple applications with different data-protection requirements
• True bi-directional local and remote support, enabling flexible protection and recovery schemes that can be tailored to business processes
RecoverPoint offers three bandwidth reduction features that significantly reduce the WAN bandwidth used across the network. The first is delta differentials, which send only the changed data across the WAN. The second is hot-spot identification, which sends only the last write for each block across the WAN. The third is compression, which takes the reduced data and compresses it before sending it across the WAN. The RPA at the remote site then decompresses the data.
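The second and third techniques can be sketched together: keep only the last write per block, then compress what remains. A Python illustration, using standard-library compression as a stand-in for RecoverPoint's actual algorithms:

```python
import zlib

def reduce_and_compress(writes):
    """writes: ordered (block_id, data) pairs. Keep only the final write per
    block (hot-spot elimination), then compress the surviving payload."""
    last = {}
    for block_id, data in writes:   # later writes supersede earlier ones
        last[block_id] = data
    payload = b"".join(bid.to_bytes(8, "big") + data
                       for bid, data in sorted(last.items()))
    return zlib.compress(payload)

writes = [(1, b"A" * 512), (2, b"B" * 512), (1, b"C" * 512)]  # block 1 rewritten
packed = reduce_and_compress(writes)
print(len(packed), "bytes sent instead of", 3 * 512)
```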
• RecoverPoint/SE
  – Replication or data protection solution for Windows
  – Host-based splitter
  – Support for CLARiiON arrays only
  – Can be upgraded to RecoverPoint without data loss
RecoverPoint is an out-of-band, block level replication product for a heterogeneous server and storage
environment.
RecoverPoint CDP (Continuous Data Protection) provides local synchronous replication between
LUNs that reside in one or more arrays at the same site.
RecoverPoint CRR (Continuous Remote Replication) provides remote asynchronous replication
between two sites for LUNs that reside in one or more arrays. RecoverPoint CDP and RecoverPoint
CRR feature bi-directional replication and an any-point-in-time recovery capability, which allows the target LUNs to be rolled back to a previous point in time and used for read/write operations without affecting the ongoing replication or data protection.
RecoverPoint/SE is a version of RecoverPoint targeted for the Windows-based, CLARiiON-only
environments. RecoverPoint/SE CDP supports the local synchronous replication of up to 4TB of data
between LUNs that reside inside the same CLARiiON array. RecoverPoint/SE CRR supports the
remote asynchronous replication of up to 4TB of data between LUNs that reside in ONE CLARiiON
array at one site to ONE CLARiiON array that resides at the other site.
• Space-efficient protection
[Diagram: a RecoverPoint appliance with a local CDP journal]
RecoverPoint Continuous Data Protection is a licensable solution that provides true CDP with real-time data recovery at the local site. It provides the same level of recovery as Continuous Remote Replication, just at the local site.
[Diagram: production volumes /A, /B, /C replicated through a journal volume to replica volumes rA, rB, rC. Step 5: write-order-consistent data is distributed to the replica volumes.]
This slide describes the data flow from the application host to the production volumes, and how the
RecoverPoint appliance accesses the data as part of the CDP process.
An application server issues a write to a LUN that is being protected by RecoverPoint. This write is
“split,” then sent to the RecoverPoint appliance in one of two ways. One is through a host splitter,
which is installed as a driver on the host. The splitter looks at the destination for the write packet. If it
is to a LUN that RecoverPoint is protecting, the splitter will send a copy of the write packet to the
RecoverPoint appliance.
The other is through an intelligent fabric switch, such as the Connectrix MDS 9000 with the SSM
module running SANTap. The switch will intercept all writes to LUNs being protected by
RecoverPoint, and will send a copy of that write to the RecoverPoint appliance.
In either case, the original write travels through its normal path to the production LUN. When the copy of the write is received by the RecoverPoint appliance, it is acknowledged back. This acknowledgement is received by the splitter, where it is held until the acknowledgement is received back from the production LUN. Once both acknowledgements are received, the acknowledgement is sent back to the host, and I/O continues normally.
Once the appliance has acknowledged the write, it will move the data into the local journal volume,
along with a timestamp and any application-, event-, or user-generated bookmarks for the write. Once
the data is safely in the journal, it is then distributed to the target volumes, with care taken to ensure
that write order is preserved during this distribution.
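The dual-acknowledgement rule can be summarized in a short sketch (Python stand-ins; real splitters are kernel drivers or fabric services):

```python
class Stub:
    """Stand-in for a production LUN or a RecoverPoint appliance."""
    def receive_copy(self, write): return True   # appliance ack
    def write(self, write): return True          # production-path ack

def split_write(write, production_lun, appliance):
    """Acknowledge the host only after both the appliance and the
    production LUN have acknowledged the (split) write."""
    ack_appliance = appliance.receive_copy(write)   # copy sent to the RPA
    ack_production = production_lun.write(write)    # original, normal path
    if ack_appliance and ack_production:
        return "ack-to-host"
    raise IOError("write not acknowledged on both paths")

print(split_write(b"data", Stub(), Stub()))
```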
[Diagram: CRR data flow between the local and remote sites (FC and IP links keyed separately). Appliance functions: FC-IP conversion, replication, data reduction and compression, monitoring and management. Steps 4-5: data is sequenced, checksummed, compressed, and replicated from /A, /B, /C to the remote RPAs over IP, either asynchronously or synchronously and bi-directionally. Step 8: consistent data is distributed from the remote journal volume to the remote volumes rA, rB, rC.]
This slide describes the data flow from the application host to the production volumes, and how the
RecoverPoint appliance accesses the data as part of the CRR process.
An application server issues a write to a LUN that is being protected by RecoverPoint. This write is
“split,” then sent to the RecoverPoint appliance exactly the same as is done in a CDP deployment.
From this point, the original write travels through its normal path to the production LUN. When the copy of the write is received by the RecoverPoint appliance, it is immediately acknowledged back from the local RecoverPoint appliance, unless synchronous remote replication is in effect. If synchronous replication is in effect, the acknowledgement is delayed until the write has been received at the remote site. Once the acknowledgement is issued, it is processed by the splitter, where it is held until the acknowledgement is received back from the production LUN. Once both acknowledgements are received, it is sent back to the host, and I/O continues normally.
Once the appliance receives the write, it bundles the write up with others into a package. Redundant blocks are eliminated from the package, and the remaining writes are sequenced and stored with their corresponding timestamp and bookmark information. The package is then compressed, and a checksum is generated for the package.
The package is then scheduled for delivery across the IP network to the remote appliance. Once the
package is received there, the remote appliance verifies the checksum to ensure the package was not
corrupted in the transmission. The data is then uncompressed and written to the journal volume. Once
the data has been written to the journal volume, it is distributed to the remote volumes, ensuring that
write-order sequence is preserved.
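Those packaging steps (eliminate redundant writes, sequence, compress, checksum, verify remotely) can be sketched as follows; this is a Python illustration with stand-in formats, not RecoverPoint's wire protocol:

```python
import hashlib, json, zlib

def build_package(writes):
    """writes: dicts with block, data, timestamp, bookmark. Redundant writes
    to the same block are dropped (the last one wins) before compression."""
    latest = {w["block"]: w for w in writes}
    ordered = sorted(latest.values(), key=lambda w: w["timestamp"])
    body = zlib.compress(json.dumps(ordered).encode())
    return body, hashlib.md5(body).hexdigest()       # package + checksum

def receive_package(body, checksum):
    if hashlib.md5(body).hexdigest() != checksum:
        raise IOError("package corrupted in transmission")
    return json.loads(zlib.decompress(body))         # then journal, distribute

writes = [{"block": 1, "data": "00ff", "timestamp": 1, "bookmark": None},
          {"block": 1, "data": "aa55", "timestamp": 2, "bookmark": "nightly"}]
print(receive_package(*build_package(writes)))
```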
Splitters are components that split each application write, and send a copy to the Appliance.
Host-based splitters require the RecoverPoint Splitter Driver (RPSD) to be installed on the host. This
driver performs the actual I/O splitting from the host, and sends one copy to the RPA and one to the
host’s normal storage volume.
The RPSD can be managed with a utility program, which provides functionality such as mount, unmount, quiesce, and host log data gathering.
[Diagram: SANTap in the initiator’s VSAN intercepts initiator-target I/O and sends a copy of the primary I/O to the RecoverPoint appliance in the targets/RecoverPoint VSAN; the appliance is not in the primary data path.]
RecoverPoint supports the Connectrix MDS Storage Services Module (SSM), which is a blade that
exists in the MDS-9000 series of intelligent fabric switches. The SSM provides a SANTap service that
can be used to intercept and redirect a copy of a write between a given initiator and target.
RecoverPoint uses the SANTap services to eliminate the need for a host splitter on each application
server. SANTap monitors the writes to specific LUN targets, and sends a copy of the write to the
RecoverPoint appliance for further CDP or CRR processing.
Replication Concepts
• Replication Pairs
  – Contains a replication volume at each site
  – Associate which replication volumes will replicate between each other
• Consistency Group
  – Contains all replication pairs used by an application
  – Replication type (CDP/CRR)
  – Contains replication policy information
• Journal Volume
  – Tracks data changes, time ordering, and block location
  – Keeps bookmarks for recovery
  – Repository for live data updates
A replication pair contains two replication volumes, source and target, that will have data replicated
between them. Since a consistency group can maintain consistency of data across multiple volumes,
multiple replication pairs can be created to form a single consistency group.
A consistency group requires a minimum of one replication pair. Consistency and write-order fidelity are maintained across all replication pairs contained in the same consistency group. The consistency group also contains at least one journal volume at each site to hold consistent point-in-time images for the specific consistency group. All policy information, such as RPO and RTO, is associated with a specific consistency group, allowing multiple policies to be maintained in the replication environment. Each group can also compress its data beyond the compression already present in the solution, and can have specific bandwidth allocated to it.
[Diagram: a consistency group for host H containing replication pairs R1→R1’ and R3→R3’, with a journal volume (jvol) at each site.]
A Consistency Group is a logical grouping of replication pairs that must be consistent across each
other. The need for consistency across these volumes could be due to the volumes being used by the
same application. A Consistency Group has at least one replication pair, and each site has a Journal volume.
A Consistency Group is also used to determine replication direction and policies on a set of replication
volumes. Each consistency group is an independent entity and can have different replication direction
and policies than other consistency groups. This allows for synchronous and asynchronous replication,
as well as bi-directional replication, to exist in the same environment.
The Management Console is accessible via an Internet browser from a computer that is connected to
the appliance management network (the latest Java plug-in is required). To manage a RecoverPoint
cluster, open an Internet browser and connect to the RecoverPoint Appliance Management IP address
for one of the sites. All configuration and monitoring of the replication environment can be performed
through the Management Console.
The system status section provides a basic visual representation of the environment. It groups each
part of the replication into “types” instead of displaying each individual component. Each site contains
a Host, Switch, Storage and RecoverPoint Appliances. The appliances provide pro-active monitoring
of all major connectivity in the replication environment. If an error occurs or connectivity is lost to a
specific component, a visual alert indicator is displayed on the representation and the status of the
component type on the right of the system status portion displays “Error”.
The RecoverPoint journal stores all changes to all LUNs in a Consistency Group. It also stores
metadata that allows an Administrator to quickly identify the correct image to be used for recovery.
This screenshot is an example of Application Bookmarks: a journal window for a remotely replicated Oracle instance that shows tagged events and application-based information. The Time/Date and Size fields are attributes of the saved data. For a value to appear in the Bookmark field, a RecoverPoint CLI script must issue a command with a bookmark parameter specified.
The application column is used today only when RecoverPoint decodes the data stream to identify the application.
The Journal provides time-stamped recovery points with application-consistent bookmarks. It also
correlates system-wide events with potential corruption events. This is very useful when performing
root-cause analysis. These application and system bookmarks are automatic, but users can also enter
their own.
Module Summary
Key points covered in this module:
• The EMC RecoverPoint data protection solution provides local continuous data protection (CDP) and continuous remote replication (CRR)
• RecoverPoint provides heterogeneous server and storage support
• RecoverPoint CRR provides remote asynchronous replication between two sites for LUNs that reside in one or more arrays
These are the key points covered in this module. Please take a moment to review them.
Course Summary
Key points covered in this course:
• Virtual Infrastructures
• VMware product differences
• Concepts and benefits of Storage Virtualization
• Benefits, features, and advantages of an Invista Solution
• File Level Virtualization basic concepts
• Rainfinity features, functions, and benefits
• Key features of a RecoverPoint solution
These are the key points covered in this training. Please take a moment to review them.
This concludes the training.