
The Challenges of Protection and Recovery in a Virtual World

Sponsored By: Hewlett Packard

Speaker: Laura DuBois, Program Director, Storage Software, at IDC


Moderated By: Billy Naples

Billy Naples: Hello, everyone, and thank you for attending today's session on 'The Challenges of Protection and Recovery in a Virtual World'. My name is Billy Naples, and I am the Worldwide Product Marketing Manager for Data Protector Software at Hewlett Packard, and I'll be moderating today's session, which will be given by Laura DuBois, Program Director, Storage Software, at IDC.

Before I hand it over to Laura, I'd like to bring up a few points about today's session. One, please be sure to enter questions in the Questions Manager, because we'll be offering a $20 Amazon gift certificate, which will be e-mailed to you once we select the winner at the end of the session. I'd also like to say that today's session is being sponsored by HP Data Protector software. A couple of other points I'd like to mention: the slides for this presentation will be pushed to your screen automatically. If you have any questions throughout the presentation, you can type them into the "Ask A Question" area, which is located on the right-hand side of the viewing console, and they will be addressed at the end of the presentation. That's where you get to win the prize. If you have any difficulty reading or viewing the slides, there is an "Enlarge Slide" button that you can click on, located just below the slides. And, if you experience any technical difficulties with this presentation, there is a "Help" link that you can click on, in the lower right-hand corner of your screen. I think that wraps up the introduction for today and, with that, I'll hand it over to Laura.
How virtualization is changing backup
An interactive discussion with our Featured Speaker:

Laura DuBois
Program Director, Storage Software
IDC

Live questions & audience polls


Moderated by:
Billy Naples
Worldwide Product Marketing Manager, Data Protector
Hewlett Packard

Best question wins a $20 Amazon gift certificate!

Sponsored by HP Data Protector software

37,000 customers worldwide


Laura DuBois: Great! Thanks so much, Billy, for that introduction, and thank you to the audience out there for joining us on this webcast. One of the things that we see is that virtualization is really changing the storage and backup landscape today, and I find that firms are making some dramatic improvements in disaster recovery because of some of the capabilities that virtualization brings with it. Before we get started with the agenda, let me just give you an example.

Key Takeaways

Changing landscape

Approaches to consider

What others are doing


So, I had a recent discussion with the CIO at a hospital in the Midwest, and they were really putting in place a transformation of their data center. They were moving from a physical infrastructure to virtual. They were dealing with an outdated design in their data center and a lack of space. They actually couldn't get enough power and cooling into the space they had, and they were also on an older storage architecture. So, in building out the new facility, they of course put in virtualization to consolidate their infrastructure. They also moved from DAS to a shared storage configuration. In terms of consolidation, they went from approximately 12 racks to three racks, and from approximately 112 physical servers down to 30 physical servers. And, with that change, they implemented a new data protection and recovery strategy, and they were able to improve their recovery times by a factor of three for their critical systems. So, that's the kind of change that we see coming about because of server virtualization. I'll be talking more about some of the changes that are occurring in the data protection space; in particular, we frequently see the use of disk somewhere in the data protection path or process. And, we'll talk about approaches to virtual machine backup later in the presentation, but I really want to give you three key takeaways.

The first one is that there are changes occurring in the data protection landscape today. Those changes really started five or ten years ago, but in the last three years, because of virtualization, we've really seen a spike in change. We're also going to look at several different approaches to protecting virtual infrastructure and go over some of the considerations, pros and cons. And, I'm going to share with you some IDC research on what some of the challenges are and what approaches some of your peers are using for backup and recovery of virtual machines, so you can use that to benchmark where you are against your peers.

So, let's briefly look at some of the changes that are occurring. As we move to the next slide here, let's face it, backup has changed a lot, and it's really been an evolutionary process rather than a forklift-upgrade kind of process or a more extreme transformation. The changes that we've seen are, firstly, in the area of media. The media that's used for backup has moved from being exclusively tape-based to the use of disk, either with disk as an interim point on the way to tape, or with disk as the ultimate ending point for backup copies.
Changing Requirements

Media, Data, Scope, Priority, SLAs, Scale, Recovery, Approach, Connect, Endpoint, Retention, Policy: physical to virtual

In terms of the data that's getting backed up, the focus 10 years ago was very much on structured data, ERP database data. With the growth of unstructured content, we've seen much more of a focus, or a parity of focus, on file-based data and file systems, in addition to the structured or relational data. In terms of scope, we've seen what was once a heavy focus on systems in the data center expand beyond the data center to remote branch office locations, because our research shows that as the number of data centers consolidates, the number of remote branch locations with physical infrastructure in them tends to increase.

We've also seen priority change. Backup was once sort of a back office function, and it has certainly gained a lot of priority. I'll give you an example. One firm I recently spoke with has their recovery objectives published on their corporate website, and it can't get much more visible than that; recovery and availability are certainly C-level topics. SLAs have continued to compress. Recovery times and recovery points are going to vary by application, but for the mission critical systems, really no downtime can be tolerated. In terms of scale, a large database 10 years ago was a couple of hundred gigabytes, and now it's a couple of terabytes in size. Recovery, I think because of virtualization in particular, has become a focus not just on recovery of the data but on recovery of the system: recovery of an entire physical server from the bare metal up, recovery of an entire virtual machine, or recovery of an entire host.

The approaches for backup have really changed, I think. There are many more methods out there, not just traditional backup but also the leveraging of snapshots, for example array-based snapshots, as well as technologies like deduplication, continuous data protection, and virtual tape libraries. So, there are a whole lot of options available for protection in today's world. And connectivity: there are many different interconnects. It's not just exclusively fiber channel or the network; you've got FCoE, you've got iSCSI, you've got NAS, you've got fiber channel, so you've got more options there in terms of connectivity.

Another area where we've seen a lot of change is the endpoints. While in the past the protection of laptops or PCs has often been left to the purview of the users themselves, we see more firms taking on a centralized approach to backup of endpoints, and that could even, over time, include smart devices. Then there's retention, where we start to see the delineation between backup and archive, and the difference between recovery and retrieval. And lastly there is policy: more policy around how things get protected is moving to different layers in the infrastructure, so not just policy in the backup application but more and more policy within tools that are provided by the array providers, for example snapshot capabilities and snapshot policies, or replication policies. So, policy is moving to different layers in the stack.

But, of course, underlying all of this, we have not just the physical infrastructure but the virtual infrastructure as well. Before we go any further, I wanted to get a sense from everyone out there in the audience of what stage of virtualization adoption you are at. So, the question here is really how much your company or organization has adopted virtualization, and the options are: not yet implemented; implemented in test and dev only; implemented for less than 25% of our production workloads; implemented for 25% to 50% of our production workloads; and implemented for more than 50% of our production workloads. So, if you could go ahead and select the answer for your firm, that would be appreciated, and we'll give this a couple of minutes for everyone to submit their answers.

While we're doing that, I'd like to share with you some research on what we actually see in terms of adoption of server virtualization. Our research shows that in 2009 approximately 12.8% of all of the physical servers out there were virtualized. That's worldwide. So, it's still a relatively low number. We expect that percentage to increase to 23% by 2013. And, in terms of consolidation, we see an average of six virtual machines per host for 2009, and we expect that to grow to approximately eight to nine virtual machines per host by 2013.

So, I'll give this a couple more seconds and then we're going to close the poll here and see what the results are. So, let's take a look at the results for this question. OK, so in terms of everyone out there, 16.6% of you have implemented it in test and dev, 33% of you have implemented it for less than 25% of your workloads, 41% have implemented it for 25% to 50% of their workloads, and then 8.3% for 50% or more of your workloads. So, we've got a pretty good, mature base of virtualization users out there. That's great!

So, let's compare this result from our audience to what we see at the macro industry level. We'll move along to the next slide, which shares some recent IDC research. On the left-hand axis we have the spending: the yellow shaded columns represent the physical server spending, the green represents the management costs associated with that, and the red bars are the power and cooling costs. In the out year, 2012, the largest cost component is actually going to be the administration and management component, approximately 50%. And, this spending will actually translate to 35 million physical servers in 2010, but of course, on average, those servers are 10% to 15% utilized, which is a component of what's driving virtualization overall. But, if we look at the curves in this chart, this is where we really see the virtualization phenomenon: the installed base for virtual servers, the red curve, has actually exceeded the installed base for physical servers. And, the number of virtual machines per host is also increasing, as I mentioned. This of course raises the need for improved management and highlights a bit of a management gap. The way we can address this management gap is by leveraging automation, and that can be automation of patching and updating of OSs, it can be automated assurance around SLAs, and it can also be automating or streamlining backup and recovery. Because the reality is that the explosion in the number of virtual machines is really putting more pressure on IT and systems administrators to keep up with the proliferation of virtual images.

New Economic Model for the Datacenter

Shifts to Automation Tools are a Requirement

[Chart: Worldwide IT spending (US$B) on new servers, power and cooling, and management/administration, 1996 through 2012, plotted against the physical and logical (virtual) server installed base in millions; the widening gap between the two installed-base curves is labeled the virtualization management gap.]

So, let's look at what some firms are doing to (inaudible) virtualization today. As the research shows here, virtualization is well beyond test and dev. One airline I recently spoke to was 96% virtualized, which is quite high. Critical workloads that are being virtualized include business processing, so ERP or CRM, smaller databases, and collaboration. It's really considered a mainstream or mature technology. And, we do see more richly configured systems, in the way of processors, and memory, and disk, because of the proliferation of virtualization. We see an average consolidation ratio of 6:1, 10:1 for mature environments, and maybe as high as 25:1 or 30:1 in leading edge companies. And, of course, the organizations that we talk to tend to need to find a way to manage all of this virtual sprawl so they can keep better track of, assign, and reclaim resources. So, there needs to be integration between the physical and the virtual environment, in terms of operations, processes, and policies. One of the areas this impacts quite a bit is storage and network optimization, and we're actually going to get into that in the following slides. And, of course, as you scale up the number of VMs per host, that has implications for backup processes.
Today Virtualization is….

 Used for production applications
 Being deployed for business critical workloads
 Considered a mainstream and mature technology
 Driving increasing consolidation ratios
 Bringing about management challenges and new requirements

[Pie chart: Use Cases for x86 Virtualized Servers: 63% production, 37% test/dev. IDC End-User Storage for Virtualized Servers Survey, N=402]

So, let's look at some of the backup challenges that we're seeing firms deal with relative to virtual infrastructure. We did a study where we spoke to 400 end users about their virtualization infrastructure, the challenges they faced with backup, and the technologies they were looking at to address those challenges. Early adopters of virtualization really approached backup of virtual machines just like they did their physical servers: they put an agent in each VM and it was done. And, aside from an early licensing revolt around that scenario, it really brought several challenges to the forefront. Firstly, getting data from multiple VMs out of a single physical server created a lot of I/O congestion and network congestion. That often led to the need to look at a round robin or staged approach to the backup of virtual machines, and that meant possibly running over the backup window, or not completing the backup in the given time. And, then of course there is the desire to use disk as a backup target for overall faster recovery or improved reliability. But storage capacity, which we see doubling every year, really ran at odds with budgets, backup windows, bandwidth constraints on the network, and resources overall.
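
As an aside, the round robin or staged idea can be sketched in a few lines of code. This is a minimal, hypothetical illustration of throttling agent-based backups so that only a couple of guests per physical host run at once; the host names, VM names, and the run_backup() helper are assumptions for the example, not part of any particular backup product.

    import threading

    MAX_STREAMS_PER_HOST = 2   # assumption: each host can absorb two concurrent backup streams

    host_semaphores = {}       # one semaphore per physical host

    def run_backup(vm_name):
        # Placeholder for the real agent-driven backup of a single guest.
        print(f"backing up {vm_name}")

    def backup_vm(host, vm_name):
        sem = host_semaphores.setdefault(host, threading.Semaphore(MAX_STREAMS_PER_HOST))
        with sem:              # guests on the same host queue behind each other
            run_backup(vm_name)

    # Example inventory: several guests consolidated onto two hosts.
    inventory = {"esx01": ["vm-mail", "vm-web", "vm-crm"], "esx02": ["vm-erp", "vm-db"]}
    threads = [threading.Thread(target=backup_vm, args=(host, vm))
               for host, vms in inventory.items() for vm in vms]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

The tradeoff, as noted above, is that queuing the streams stretches the elapsed time and can push the job past the backup window.
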
Top Virtual Machine Backup Challenges and Evaluation Priorities

 Storage capacity to keep backup data
 Network and/or I/O bottlenecks impacting backups
 Inadequate processing power to complete backups

Technologies in Evaluation to Address Challenges

Data deduplication/space efficient snapshots: 53%
Centralized backup agent for all images on a single server: 52%
Continuous data protection (CDP): 40%
Host-based replication: 39%
Array-based data replication: 34%
Other: 3%

IDC End-User Storage for Virtualized Servers Survey, N=402

So, the challenges firms felt were really forcing them to look at new technologies, and here is some data from the study that we did, which highlights three key needs. If you look at the results at a macro level, the technologies that are being evaluated stem from the need to, one, reduce the amount of data that's protected; two, offload the VMs as much as possible from the backup process; and three, use disk for some form of operational gain. And, I'd also like to call out, from the perspective of replication and snapshots, that we really see these as mainstream technologies for data protection, with more and more use of snapshots, array-based snapshots in particular, as a reliable way to protect physical or virtual infrastructure.

There are a couple of other challenges that we've heard about as well. In addition to the ones we already spoke about, there might be issues with timeouts on, say, a snapshot-based approach through the virtual infrastructure, or maybe the need to do a multiple step recovery. Let's face it, people want file level recovery, they want image level recovery, and they want host level recovery, depending upon the failure scenario. We also found that most environments deploy both physical and virtual infrastructure, and so there needs to be a tool that can address both of those environments rather than a deployment of virtualization point products, which really adds to the management complexity and, in a way, exacerbates the virtual sprawl.
Other Challenges with Virtual Server Backup

– Impacts to server and application performance
  • Shared resources on the physical host impact all virtual machines
– Complex management
  • Different tools for managing virtual and physical servers
– Longer recovery times
  • Crash-consistent backups mean checking for errors on restore; not suitable for mission-critical environments

And, the last challenge is consistency and its impact on recovery times. Application consistent backup, as everyone out there probably knows, is like performing a clean shutdown of the application and the server and then performing the backup, so everything is in a consistent state. The application is made aware that it's going to get backed up and is able to flush all of its data, so a clean image or a clean copy of the database and the application can be taken.

Now, of course, virtualization does introduce some complexity into the environment, and for VMware environments, depending on the configuration, you may only actually get crash consistent backups. Crash consistent may be satisfactory for some environments out there, but it's not the same as application consistent. Crash consistent means the application may not get into a synchronized state. It's almost the equivalent of pulling a power cord, and one of the reasons is that, on the VMware side, VMware doesn't interface fluently with the applications. So, when VMware tells a virtual machine, which is represented as a VMDK file, to prepare for a backup, the application running inside the virtual machine doesn't necessarily get the message. So, when the backup begins, as far as the application is concerned, it looks like someone has pulled the power cord. The result is a crash consistent backup rather than an application consistent backup. Now, again, it depends on your environment. Crash consistent may be adequate; you may not need application consistent. But one of the considerations with crash consistent is that you may introduce longer recovery times, because you may need to run some separate tools to check for errors, and that can add time to the restore process.

So, with that, why don't we go ahead and talk about what kind of challenges everyone out there in the audience is facing with virtual machine backup. We are now going to go over polling question number two: what's your most significant challenge with virtual machine backup? Don't know; data growth prohibits meeting backup windows; failed, unreliable, or incomplete backups; managing timeouts with snapshots or replication processes; multiple backups for image and file level recovery; or inability to meet recovery objectives, like RTO or RPO. So, we're going to give everyone out there in the audience some time to look at this and respond to the question.

One thing I'd like to go over, while the polling is going on, is that we've seen the evolution of different virtualization-specific point products out there. What you really want to consider, when you're looking at addressing the management gap that I talked about earlier, is whether you are introducing a tool that's actually going to help eliminate or alleviate that management gap, or a tool that's perhaps going to make that management gap a little bit broader. You also want to think about the fact that most environments are going to have both a physical and a virtual infrastructure in place.

So, we're going to give this a couple more seconds and then see what kind of challenges everyone out there in the audience faces. So, why don't we go ahead and close the poll and see what the results are. OK, so 25% of you say you're not sure, or you haven't implemented virtualization yet, and then 12% of you are dealing with timeouts of snapshots, so you're probably using VCB, I would assume, if you're dealing with that issue. Thirty-seven percent of you, so almost 40%, are dealing with the need to have multiple backups for image and file level recovery, and then 25% of you are not able to meet your recovery objectives. We actually have some slides on that further on in the presentation. So, that's great! Thanks to everyone out there for responding to that.

Protection and Recovery in a Virtual World: Approaches

Traditional backup (agent per guest)
VMware backup (VCB, vStorage API)
Hyper-V backup (VSS)
Array-based Snapshots
Enabling technologies

So, let's go ahead and look at some of the options. Option number one: back up your virtual infrastructure like you would a physical server. You put an agent in each guest. Traditional backup applications all have application agents that you can put inside each virtual machine, and we'll talk about some of the challenges with that. Consolidated backup, particularly for VMware, can come in the form of either VCB, VMware Consolidated Backup, or the vSphere vStorage APIs. Now, there are still folks out there that are using VCB, so I'm going to discuss it for completeness, but I do want to call out that, directionally, VMware is really moving towards the vSphere vStorage APIs. VMware did announce the end of life for the VCB framework. The next version of vSphere, coming later this year, will support VCB, but VMware will really be moving towards the new vStorage APIs for data protection that were introduced with vSphere 4.0. So, the VCB binaries will still be available and supported with 3.x and vSphere 4.0, but according to the support policy VCB won't be in the new platform. So, the question, relative to the vStorage API, becomes which backup applications support the vStorage APIs. You really need to ask that of your backup application.

For Hyper-V environments, you need to have the backup application support VSS. That's the default way that Hyper-V gets backed up, and most backup applications support it, but there are different implementations, depending on the supplier. Then there are array-based snapshots, where you may not have to rely on a backup application at all. You can leverage array-based snapshots as your approach for protecting your virtual machines. More and more firms are doing that, although if you need to send the data out to tape, you also want to consider integration of array-based snapshots with the tape backup applications. Lastly, there are enabling technologies, such as deduplication, which is really having a profound impact on network traffic, backup windows, and storage capacity by reducing the amount of data that needs to be stored. It also has an impact on footprint and power and cooling savings. So, we'll talk about the deduplication approaches as well.

Traditional Backup (Agent per Guest)

Benefits: Well understood tools; No new infrastructure; Minimal costs; Application consistency; Single file recovery

Considerations: I/O implications; Limit # VMs per host; Backup windows; Network congestion; Agents to manage; VM image recovery required?

So, let's jump into this. Traditional backup, an agent in each VM. The benefits of this approach are that it uses well understood tools. You don't have any new infrastructure; you can just use your traditional backup application. It has traditionally had minimal cost implications, and most backup applications have changed their licensing schemes to provide some benefits in terms of licensing for virtual machines inside a host. You get application consistency with the backup, and you can get single file recovery. The considerations are, of course, the I/O implications as you increase the number of VMs per host, which may require you to limit the number of virtual machines you put on a host, sort of defeating the purpose of virtualization overall. You may find that you're not able to meet your backup window if you're having to move to a staged approach to backup of virtual machines. You may be dealing with a lot of network congestion, and then there's also the age old challenge of agents to manage. You also may need not just backup within the VM for file level recovery, but image level recovery as well: being able to bring back a single VM or a couple of VMs relative to the host, or maybe even do a full host level recovery. And, that's not really been a possibility using a traditional agent-based approach, which is what led to an approach for VMware called VMware Consolidated Backup.

VMware Consolidated Backup (VCB)

Benefits: Avoid many agents; Offload host from backup; Reduce network traffic; Puts VM in consistent state; Minimizes backup window; Get VM level recovery

Considerations: Incremental cost x2; Requires shared storage; Crash consistency only; File level recovery is a separate process; Longevity; Requires separate infrastructure (proxy server, disk, etc)

So, the benefit here is that you're able to consolidate the backup onto what's called a proxy server. You don't have to put a backup agent inside each VM. You offload the host from the backup process by putting in this proxy server, a physical server, and you're reducing the network traffic as well. So, you're putting your virtual machines into a consistent state while you're reducing the network traffic and minimizing the backup window. And, the benefit here is you get an image level recovery, but it does require a separate infrastructure: the proxy server, a physical server, and the disk associated with each VM, which tends to be 1:1. On cost, you tend to have to pay VMware for the VCB option and then probably a VCB connection license for your backup application, although there is no software required on the host; the software is on the proxy server. That's what I meant by an incremental cost. It requires shared storage: to implement VCB, you have to move to a shared storage model, which, if you're not already on one, may have costs associated with it. You're going to get crash consistent backups only, and you can get file level recovery, but it is a separate process; some of you in the poll mentioned the issue of multiple recovery processes. And, then there's the question of longevity. As I mentioned earlier, the VCB approach is not directionally where VMware is going in the future, because everything is moving towards the vSphere vStorage API. So, that's VCB, VMware Consolidated Backup.
Now, we're going to go to the VMware vStorage API, and the benefit here is that it has a change block tracking capability within it. The API is the control mechanism through which VMware interfaces with the backup applications. It doesn't require an agent inside each guest, and it uses change block tracking to track changes at the block level, so it can do incremental backups. Because the amount of data that you're backing up is reduced, you're able to increase the number of simultaneous backups on the host. To give you an example, one client I was talking to just the other day said he was able to increase the number of simultaneous backups on the host from six to 40, so quite a bit of improvement. And, there's no dedicated infrastructure required in the way that there is with VCB. There's no proxy server; all of the interfacing or work is done within VMware itself and within the backup application. And, you're going to get a single pass backup with both image and file level recovery. Some considerations: the vStorage API requires the vSphere infrastructure, the 4.0 infrastructure. You need to make sure your backup application supports the vStorage API. You also need to consider whether you want to get off-host backup with shared storage: the API eliminates the dependency on shared storage, but you're only going to get off-host backup with a shared storage configuration. And, it is crash consistent only, so if you want application consistency, you're going to need to put an application agent in each guest. I see a lot of interest in the vStorage API. You're able to gain some performance, reduce the amount of data that you're backing up, and do the single pass backup with image and file level recovery. The tradeoff there is really about crash consistency only. So, those are the considerations for the VMware vStorage API.
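
To make the change block tracking point concrete, here is a conceptual sketch of an incremental pass. The query_changed_areas() call is a hypothetical stand-in for the vStorage API's changed-block query, and the read and write callables are placeholders; this is not the actual VMware SDK.

    def query_changed_areas(vm, disk, since_change_id):
        # Pretend API: return (offset, length) extents modified since the last backup.
        return [(0, 65536), (10 * 1024 * 1024, 131072)]

    def incremental_backup(vm, disk, last_change_id, read_blocks, write_to_target):
        extents = query_changed_areas(vm, disk, last_change_id)
        for offset, length in extents:
            data = read_blocks(disk, offset, length)   # read only what changed
            write_to_target(offset, data)              # store alongside the last full image

    # A full backup would read the whole virtual disk; reading only the changed
    # extents is what lets more guests be backed up simultaneously per host.
    saved = []
    incremental_backup("vm-crm", "disk-0", "change-id-1",
                       read_blocks=lambda disk, off, ln: b"\0" * ln,
                       write_to_target=lambda off, data: saved.append((off, len(data))))
    print(saved)   # [(0, 65536), (10485760, 131072)]
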

VMware vStorage API

Benefits: Change block tracking for block level incrementals; Increase # of simultaneous backups on host; No dedicated infrastructure required; Single pass backup, image and file recovery

Considerations: vSphere infrastructure; Backup application support; Off host backup with shared storage; Crash consistency only

The next option is array-based snapshots. Most enterprise level arrays, and actually now mid-tier arrays, provide some form of snapshotting capability. In this configuration, you're leveraging the ability of the array to take point in time snapshots of production volumes at some level of frequency, and you can schedule these snapshots hourly, or a couple of times a day. The benefit of array-based snapshots is that you're really taking the application server out of the backup process, and you're avoiding the tape management overhead of traditional backup. The snapshots are done in the array rather than on the host. The way it works is that you have a production volume, and at some point in time you put the application on that production volume into a consistent state: you quiesce the database, put the application in an online backup mode, take a snapshot, and then unfreeze the production volume. So, you have a point in time copy from which you can restore. Some customers do that hourly; it really depends on your environment and how granular a recovery point you need. The benefit of all that is you're basically eliminating the host from having to be involved in the backup process. You get the performance gains of direct access to disk. You can leverage the interconnect that you have in place, whether that be fiber channel or iSCSI, or maybe you've implemented some form of network attached storage. It's known, existing technology; it's very popular and proven. And, a benefit of snapshots, especially array-based snapshots, is that you're able to recover from a logical error or corruption, which is somewhat different from replication: with replication, if you have an error on the production volume, you replicate that error or corruption over to the replicated volume. With point in time snapshots, you can go back to the last known good snapshot prior to the corruption. You really want to be looking for array-based snapshots that have incorporated a fair amount of policy, in terms of scheduling the snapshots, supporting pre- and post-scripting, and being able to integrate with backup applications. You want to look for an array-based snapshot approach that has VSS integration to put the application and virtual machine into a consistent state. You really want to look for deduplication on the back end to reduce the storage volume; we'll actually get into that. Some array-based snapshot approaches have vCenter integration, or integration with some virtual management console, so that you can browse the data stores or the virtual machines and do point in time snapshots and/or restores from the vCenter console. And, then, as I mentioned, you might want to look at integration with backup, and the benefit of that is you leverage the snapshot, the point in time copy, as the source for your backup, so you're going to almost a zero time backup. So, those are some of the benefits and considerations with array-based snapshots.
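
As a rough illustration of the cycle just described, here is a minimal sketch, assuming hypothetical db and array management objects; every method name is a placeholder rather than a real vendor API.

    import datetime

    def snapshot_backup_cycle(db, array, production_volume):
        db.begin_backup_mode()        # quiesce the database / put the app in online backup mode
        try:
            snap = array.create_snapshot(
                production_volume,
                name=f"snap-{datetime.datetime.now():%Y%m%d-%H%M}")
        finally:
            db.end_backup_mode()      # unfreeze the production volume right away
        # The snapshot, not the production volume, becomes the source for any copy
        # to tape or to a backup application, so the host sees almost no backup impact.
        return snap

    # Retention is a policy decision: keeping, say, 24 hourly snapshots lets you roll
    # back to the last known good point in time before a logical error or corruption.
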
Array-based Snapshots

Benefits: Avoid tape management; Snapshots done in array; Performance gains of direct access to disk; Support for different interconnects; Known existing technology; Recover from logical error or corruption

Considerations: Policy management; VSS integration to put VM into a consistent state; Deduplication support to reduce storage volume; vCenter integration to browse data stores and VMs; Integration with backup

So, let's go ahead and look at what folks are doing for server virtualization, in terms of what type of solution they're using. For those of you who have implemented server virtualization, are you using VMware ESX 3.0 or 3.5? Have you moved to vSphere? Are you using Hyper-V or Citrix Xen, or some other virtualization scheme, like Solaris Containers, for example? I will point out, as we're waiting for the polling to finish, that I was just at the Microsoft management summit and we see quite a bit of interest in Hyper-V. VMware, I would say, still tends to be the more widely deployed technology, but Hyper-V seems to be getting the interest of folks because of its economics and availability. So, why don't we go ahead and close this poll and see what the results are? OK, I see almost 57% of you, so almost 60%, are using vSphere 4.0, which is great! Almost 30% of you are using ESX 3.0 or 3.5, and then almost 15% are using Hyper-V. So, that's pretty consistent with what we see in terms of some of the research we do.

So, given that 15% of you out there are doing Hyper-V, let's look at the solution for Hyper-V: VSS-based backup. For those of you who have implemented Hyper-V, as you know, the method for backup here is VSS, the Windows Volume Shadow Copy Service, which is a standard API that the backup application uses to place the application into a consistent state. The backup application is the requester that works with the VSS writers within each application, whether it be Exchange, or SQL, or SharePoint; there are other VSS writers as well. The writer puts the application into a consistent state, also known as a quiescent state, as I mentioned, and this is important to get a clean backup and recovery. Microsoft creates the VSS writers for each application, which ensures that there's tight integration between the application and the Microsoft hypervisor. So, VSS is basically a method that presents a consistent image or snapshot of a file system or application to a backup application so that it can then be backed up. An important point here is that the application is aware of being backed up, so it clears its transaction logs and does other maintenance tasks, and that's really critical. In this approach, rather than using a backup agent inside each VM, the backup software can tell the application to quiesce itself, it triggers the VSS snapshot, it backs up that snapshot, and then it tells the application that the backup is done so the application can clear its logs. And, there are some applications that can give you not only image level backup but also file level recovery out of that image, for a single pass backup.
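
The sequence just described can be summarized in a short, illustrative sketch. The functions below are stand-ins for what the backup application (the requester) and the Microsoft-supplied writers do; this is not a real Windows VSS binding, and the writer names are only examples.

    def writers_prepare(writers):
        # Writers quiesce their applications (Exchange, SQL, SharePoint) and flush buffers.
        print("writers quiesced:", ", ".join(writers))

    def create_shadow_copy(volume):
        # Point-in-time, application-consistent snapshot of the volume.
        print(f"shadow copy of {volume} created")
        return f"{volume}@shadow"

    def writers_thaw(writers):
        # Applications resume normal I/O immediately after the snapshot is taken.
        print("writers thawed")

    def writers_backup_complete(writers):
        # Writers truncate transaction logs and run other maintenance tasks.
        print("writers notified: backup complete")

    def vss_backup(volume, writers):
        writers_prepare(writers)
        shadow = create_shadow_copy(volume)
        writers_thaw(writers)
        print(f"backing up {shadow} instead of the live volume")
        writers_backup_complete(writers)

    vss_backup("C:", ["Hyper-V writer", "SQL writer", "Exchange writer"])
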

Microsoft Hyper-V VSS-Based Backup

Windows Volume Shadow Copy Service (VSS)

 Standard API
 Allows 3rd parties to request application-consistent snapshots
 Different than crash consistent recovery
 Does not require an application agent inside the guest

[Diagram: a 3rd party backup solution acts as the VSS requester, working with the VSS writers for applications such as Exchange, SQL, and SharePoint.]

So, with that, let's talk about deduplication. Deduplication is a technology with a couple of different approaches for where you can do it, specifically with respect to backup: you can do deduplication at the client, in the backup software, or you can do it at the target. Some of the considerations for doing deduplication at the client: typically that requires an agent inside each virtual machine, a deduplication-enabled or -aware client, that will dedupe the data at the application server level or at the virtual machine level, so the hash calculation and lookup are done at the client. The benefit you get is that, because the dedupe is done at the client, you don't need to send all of that backup data over the LAN or the WAN, and that can help with the backup window. In addition, it helps with storage capacity, footprint, and power and cooling savings. The consideration with that approach is that it does take up overhead on the client, both the hash calculation and the hash lookup. Deduplication uses a hash-based algorithm, typically at the block level or the sub-file level: it calculates a hash value for each chunk of data and determines what needs to be saved and what's redundant based upon having seen that hash value before. All of that hash calculation and hash lookup takes up a fair amount of overhead on the client, and you're already trying to offload those virtual machines from processing backup, so you probably don't want to be introducing more overhead for the deduplication process.
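
For readers who want to see where that client-side overhead comes from, here is a toy sketch of hash-based deduplication with fixed-size chunks; real products typically use variable-size chunking and far larger indexes, so treat the details as assumptions for illustration only.

    import hashlib

    CHUNK_SIZE = 4096   # assumption: 4 KB fixed-size chunks

    def dedupe(stream, chunk_store):
        # chunk_store maps sha256 digest -> chunk bytes; returns the recipe to rebuild the stream.
        recipe = []
        for start in range(0, len(stream), CHUNK_SIZE):
            chunk = stream[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()   # hash calculation happens on the client
            if digest not in chunk_store:                # hash lookup decides redundant vs. new
                chunk_store[digest] = chunk
            recipe.append(digest)
        return recipe

    store = {}
    recipe = dedupe(b"A" * 8192 + b"B" * 4096 + b"A" * 4096, store)
    print(len(recipe), "chunks referenced,", len(store), "unique chunks stored")   # 4 referenced, 2 stored
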
Enabling Technologies

Source (client-side) deduplication considerations:
 LAN/WAN bandwidth
 Backup window
 Storage capacity, footprint and power and cooling
 Overhead on the client

Target-side deduplication considerations:
 Implementation times
 No disruption to client
 Storage capacity, footprint and power and cooling
 Appliance form factor

On the target side, deduplication tends to be more of an appliance delivery model: you're deduplicating the backup data as it comes into the storage system, typically in an implementation that's an appliance form factor, which can help with implementation times. There's really no disruption to the client, the host, or the VM, and you get the benefit of the storage capacity, footprint, and power and cooling savings, as you do with the client side as well. What we also see is that folks using a target side deduplication approach are deploying, in parallel, replication of the backup data to a remote site for disaster recovery, and in doing that they eliminate the need for offsiting tape copies; they're removing their physical tape copy and tape handling processes. So, those are some of the high level considerations around client versus target. We see both being popular, and it really depends on your environment, your workload, how long you're keeping your data, and where your data is. Is it at a remote branch location? Is it in the data center? What type of data do you have? But, I would say that overall it's having a huge impact in terms of storage savings.

So, that's a key enabling technology that we see. Now, let's look at how people out there are doing protection of virtual infrastructure. Thirty-six percent are using some form of traditional backup. Seventeen percent are using some form of consolidated backup, whether that be running an agent on the service console, which is actually an outdated approach now, or using VCB or the vSphere API. Forty-two percent are using some form of replication or snapshotting. And, as I said, I think that's pretty consistent with what we see going on: more and more people using snapshots or replication as a way to back up their data.
Approaches to Protection and Recovery in a Virtual World

Backup for Virtual Machines

 "36% are using a traditional backup"
 "17% are using a consolidated backup"
 "42% are using some form of replication"

[Pie chart categories: Backup agent on each virtual image; Single backup agent (e.g., VMware VCB); Array-based replication; Host-based replication; Replication appliance; Native virtual machine snapshots; Other (please specify)]

IDC End-User Storage for Virtualized Servers Survey, N=402

Now, let's look at some of the considerations and questions you should be asking as you're deciding what kind of approach to use. First of all, do you really want to make an investment in separate protection tools for virtual systems, or do you want to leverage the same systems, the same technology, for both virtual and physical protection? I would say that leveraging the same technology is a best practice, especially in this day and age. Right now, we're all trying to do more with less, so being able to leverage your existing backup, replication, or snapshotting technology to protect your virtual infrastructure makes sense.

Backup Modernization is Underway

Some Considerations and Questions to Ask….

 Investment in separate protection tools for virtual systems, or leverage systems for both virtual and physical protection
 Business expectations on recovery times, have the conversation and leverage it to modernize your backups, overwhelming use of disk
 Factor in the number of VMs/host and CPU impact with traditional approaches
 Recovery granularity: host, image and file level requirements
 Offsite strategy for DR, tape shipping versus remote replication, availability of a secondary site
 Protecting distributed data in remote locations, are endpoints a requirement
 Application consistent recovery versus crash consistent recovery and coordinated n-tier recovery
 Explosion in storage capacity, backup windows and LAN/WAN constraints are fueling adoption of deduplication
 Technology investments you may already have made, shifts to shared storage
Another thing I would say is a best practice: business expectations for recovery times are actually getting much tighter, so you really should be having the conversation with the business units about leveraging your existing technology, and you can also use that conversation to help set expectations and modernize your backups. We see, overwhelmingly, the use of disk. That's not to say that tape is dead; it definitely has its role, and I'm not here to say tape is dead. But, having the conversation about what the recovery expectations are and what the reality is can help facilitate a right sizing or modernization of your backup processes.

You want to factor in the number of virtual machines per host and the CPU impact with traditional approaches; I think we talked about that quite a bit. You want to think about recovery granularity at the host level, the image level, and the file level, depending on what kind of restores you need to do in your environment. Some environments, I know, don't do any file level restores. Others do it daily. Someone I recently talked to had a person dedicated solely to doing file restores. You also want to consider your offsite strategy for DR. I see more and more people using some form of electronic vaulting, remote snapshotting, or remote replication as an alternative to tape shipping, and of course that depends on the availability of a secondary site, so understand that. And, then there is more focus on protecting data in distributed locations, remote locations, and endpoints.

I think you want to be considering how important application consistent recovery is versus crash consistent recovery. Also, for applications that have an n-tier architecture, like SharePoint, with the IIS servers, the SharePoint servers, and the frontend web servers, you want to think about trying to get a coordinated recovery across all of those systems. You really want to be looking at deduplication, in my opinion, as a way to offset the cost of using disk for improved recovery, because it can have a dramatic impact, not just on storage capacity, backup windows, and LAN/WAN constraints, but also on power and cooling. And, you want to think about what technology investments you may have already made, or are making, in some form of shared storage. So, those are all considerations to take into account when you're looking at backup modernization.

So, with that, I'd like to ask the audience: for those of you that are making a change in backup for your virtual infrastructure, which approach are you leaning towards? Are you leaning towards a standard backup application, whether through an agent, VCB, the vSphere API, or VSS? A deduplication enabled backup application? A host-based replication or snapshot approach, or an array-based replication or snapshot approach? Or, lastly, some other form, such as CDP or online backup, a cloud-based mechanism? So, let's give that a few minutes to wrap up. I'm very curious to see what the answer is here. My sense is there's going to be a leaning towards the top two choices here, but we'll see. OK, so let's wrap up the results for this poll. OK, I am proved wrong. Well, I guess I'm 50% right here. Deduplication enabled backup application, 50%. Array-based replication and snapshots, 50%. OK, well, I think those are the two right answers, certainly, so that's great!

So, let's look at, on the next slide here, recovery objectives by application. I just wanted to point out that recovery objectives are really based upon how quickly you need to recover your data and with what granularity you need to recover it. Before you experience an outage, a crash, or a disaster of some type, there are a few questions you might want to answer. How often does your data change? How much data can you afford to lose if you have a failure or an emergency? How far back in time do you need to go in order to recover? And, how soon must the application be operational after a disaster?

Recovery Objectives by Application

[Chart: recovery point (weeks, days, hours, minutes, seconds) and recovery time (seconds, minutes, hours, days, weeks) plotted for example applications: print server, mail server, R&D server, web server, ERP, and online transactions.]

The other thing I should point out is that I see a difference between operational recovery objectives and disaster recovery objectives. Disaster recovery is a site level problem, where the whole data center is out. That happens infrequently, but you certainly want to plan for it. The other scenario is a more isolated problem, where some of your systems may be down. It may be a component level failure, a corruption, an application error, or maybe a user error, so you want to take both of those scenarios into consideration. But, there are two key metrics. Recovery point objective is the amount of data loss you can tolerate for a particular application as a result of downtime, and the second, recovery time objective, is the amount of time in which you need to be back up and running, basically.

So, let me give you an example. The data on your R&D server may change every few hours, and thus you have a fairly high recovery point objective, but user tolerance for that same server, in terms of the ability to access the data, may be more lax, and thus you have a less critical recovery time objective, perhaps a day or maybe half a day. On the other end of the spectrum, you have online customer transactions that may have a recovery time and a recovery point objective of only a few seconds. You can't lose any of that customer transaction data, and you also have to have it up and running within a few seconds, so your RTO and RPO for that customer transaction system are going to be very short. That's the kind of dialog you want to have for each system and each application, because your RTO and RPO objectives are going to vary based upon the application, and you're not going to know what those policies are unless you have the conversation with the business unit.
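
One way to make that dialog concrete is to capture the objectives as data and test a proposed schedule against them. The applications and hour values below are illustrative, echoing the examples above; they are not prescriptions.

    objectives = {   # hours of tolerable data loss (RPO) and downtime (RTO); example values only
        "online transactions": {"rpo": 0.01, "rto": 0.01},
        "mail server":         {"rpo": 2,    "rto": 4},
        "r&d server":          {"rpo": 8,    "rto": 24},
        "print server":        {"rpo": 24,   "rto": 48},
    }

    def violates_rpo(app, backup_interval_hours):
        # Worst-case data loss is one full interval between backups or snapshots.
        return backup_interval_hours > objectives[app]["rpo"]

    for app in objectives:
        if violates_rpo(app, backup_interval_hours=24):   # a nightly backup only
            print(f"{app}: a nightly backup cannot meet its recovery point objective")
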

And, you really want to consider the right technology depending upon the recovery objective. Here's where we look at a tiered recovery strategy, which is really a plan that maps your various recovery needs to the hardware and software configurations that suit your budget and your resources. So, tape is a great traditional approach. You store information for long periods of time; it's very useful for compliance and long-term record retention, and the economics of tape are very attractive. But, it requires the system to be offline, it can take anywhere from minutes to days to restore, and it may not be onsite. It is suited for long-term data storage, of weeks, months, or years, and for general applications with more relaxed recovery time and point objectives, that may suit you just fine. Your last known good backup might be from the night before, and maybe that's an incremental. Maybe you have to do a recovery of a full and several incrementals, and those are all on tape, but if your recovery time is days, that may be perfect, and your recovery point is your last known good backup.

Define a Flexible, Tiered Recovery Strategy

 Determine the RPO/RTO for each class of information
 Align objectives with budgets, have the conversation with the business
 Select technology based on recovery requirements of the application workload

[Chart: tape backup, backup to disk, and replication plotted against recovery point objective and recovery time objective.]

The next stage up might be some form of backup to disk. Here you might have more mission critical applications or systems with a more demanding recovery time objective, where you might want to add disk-based backup to your strategy. Here, you might store weeks or even months of information, and you can recover in minutes or even seconds, depending upon the amount of data. What we see here is more data on disk and less data on tape, to achieve shorter recovery times.

And, then, lastly, there is replication, which is really the highest level of disaster tolerance: being able to access redundant copies of data at another location, whether that's in the same building, across a campus configuration, or in a remote branch office. Historically, it's really been a solution that only the largest companies could afford, and we see that dramatically changing. A lot of midrange storage systems have snapshotting and replication capabilities included with them, so data replication is actually much more economical and much more viable for a broader set of customers. And, of course, it enables very fast RTO and RPO.

So, with that, let's look at some of the common application tiers that I see, and I'd like to point out, firstly, that your priorities are going to vary based upon the content, the application, and the industry. But, generalizing, very mature environments are going to have five tiers. Less mature environments are going to have three tiers. And, in the least mature environments, there's simply an acknowledgment that there's the stuff we need to get back right away, and everything else can wait. For the stuff they need to get back right away, they have a fairly good, sort of informal, understanding of what right away means. It might mean four hours. It might mean eight hours. But, there are generally just two buckets.

Application Tiers, Priority and SLAs

Tier-0
  Workloads: Transactional (trading, banking, patient systems); Oracle, Sybase, SQL, DB2-based applications
  Service levels: Always on, no service disruption, load picked up at another location
  Technologies: Fault tolerant systems, clusters, replication, scripts

Tier-1
  Workloads: Exchange, Notes, web servers
  Service levels: 2-4 hr RTO, RPO varies
  Technologies: Replication, snapshots, CDP

Tier-2
  Workloads: Departmental systems, ECM systems
  Service levels: 6-8 hour RTO, RPO can vary
  Technologies: Replication, snapshots, backup/restore from disk

 Content and application priority varies by industry
 Deduplication occurring at the source and target

So, I took the middle of the road here and set up three tiers. In tier zero, you've got your most mission critical systems; it might be patient systems or some kind of customer facing system, some kind of transaction system where money is coming in. This is the scenario where these services need to be always on; no disruption can be tolerated. In the ideal scenario, there's full redundancy and load balancing, and the ability to pick up the servicing on another system, so they may rely on fault tolerant systems or clusters, replication, and scripts. Seconds really are the name of the game here. Tier one is maybe Exchange, or Notes, or web servers. Here, you might have a two- to four-hour RTO, and your RPO might vary. Here, you might rely on replication, or snapshots, or CDP, continuous data protection. And, then lastly, tier two here might be departmental systems, maybe SharePoint or content management systems, as an example; this is not meant to be exhaustive. There you might have maybe a six- or eight-hour RTO, and the RPO could vary, and the technologies that might start to get included are backup to disk or restore from disk, as well as replication and snapshots. And, then there are tiers four and five as well, but I took the top three tiers. So, again, the SLAs will vary based upon application and priority, but we see deduplication occurring across these tiers, both at the source and at the target for different configurations and environments, and that can have a benefit, in terms of the benefits I already talked about.
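
As a closing sketch, the three tiers from the slide above can be captured as a simple policy table that drives technology selection; the structure is illustrative and the values simply restate the slide.

    tiers = {
        "tier-0": {"workloads": "transactional (trading, banking, patient systems); Oracle, Sybase, SQL, DB2",
                   "service_level": "always on, no service disruption, load picked up at another location",
                   "technologies": ["fault tolerant systems", "clusters", "replication", "scripts"]},
        "tier-1": {"workloads": "Exchange, Notes, web servers",
                   "service_level": "2-4 hr RTO, RPO varies",
                   "technologies": ["replication", "snapshots", "CDP"]},
        "tier-2": {"workloads": "departmental systems, ECM systems",
                   "service_level": "6-8 hr RTO, RPO can vary",
                   "technologies": ["replication", "snapshots", "backup/restore from disk"]},
    }

    def technologies_for(tier):
        return tiers[tier]["technologies"]

    print(technologies_for("tier-1"))   # ['replication', 'snapshots', 'CDP']
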
Tipping Points for Backup Modernization

• Server virtualization driving re-architecture of backup
• Data center consolidation or build out
• Migrations from DAS to shared storage
• Litigation and e-discovery, pain of discovery on tape
• Failed or at-risk security or compliance audits
• Reduction in data center footprint, power and cooling
• Movement of processing to lower cost locations
• New applications coming online, upgrades pending
• Non-renewal of DR service provider contracts

© 2008 IDC May-10 20

So, what are some of the tipping points for backup, the events that can trigger, or help you
justify, modernizing your backup? Server virtualization is a great opportunity to re-architect.
So are data center consolidation and build-out, migrations from DAS to shared storage, and
application migrations in general. A lot of you have probably gone through some kind of pain
around discovery of data on tape, or some kind of failed audit; we talked to someone recently
who had been told, "Hey, come in and prove to me you can do a restore from tape within such and
such a time." Events like those are a great way to help drive change. New applications coming
online, as well as the movement of processing to lower-cost locations, are also openings for
introducing a backup modernization strategy.

So, with that, I‟ve gone through a lot of information. I‟d like to thank everyone for attending
and we have about five minutes left for some Q&A, so I‟d like to open up, I guess, the floor
for questions.
Thank you!

Questions ……

Please feel free to contact me at ldubois@idc.com

© 2008 IDC May-10 21

Billy Naples: Hi Laura! That was great! Thank you! We do have some questions in here and
one of the first questions was if you could elaborate a bit more on the benefits of array-based
snapshots for virtualization.

Laura DuBois: Yeah, sure. That's a great question! One of the benefits of an array-based
snapshot is that it's really non-disruptive to the host. You're able to take the snapshot and
then back up off of that snapshot, and the snapshotting process is entirely offloaded from the
host; it's done completely within the storage array. The process goes like this: you quiesce
the database, you put the application into an online backup mode, you take the snapshot, and
you unfreeze the production volume, and that's all that's required to do the backup. The
snapshot serves as the backup, and all of it is done within the array. You can then take that
snapshot and back it up to tape, if you like. So, you're able to do it off of the host, within
the array, and you're able to achieve faster recovery because, as I mentioned, you can take a
snapshot every hour, or even more granularly, or maybe just a couple of times a day, and then
choose which snapshot you want to recover from. You're getting very fast recovery off of that
snapshot, versus having to go and do a restore from tape. And, in the event that you have some
kind of corruption or user error, you can choose the point in time just prior to the corruption
and resolve the issue. You're essentially getting a zero-time backup window. One thing to make
sure of when you look at array-based snapshots is that the array vendor supplies a provider
that integrates with VSS, so that you're getting application consistency, and I think we talked
quite a bit about that. VSS supports a few different types of providers, and the one that
matters here is the hardware provider supplied by the array vendor, so you want to check that
the array you're considering has one.
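
As a rough illustration of that sequence, here is a minimal Python sketch of the orchestration. The ArrayClient and AppClient classes and their methods are hypothetical stand-ins, not any vendor's real API; in practice this logic lives in the backup application, the array vendor's SDK, or a VSS hardware provider.

```python
# Minimal sketch of the array-based snapshot backup sequence described above.
# ArrayClient and AppClient are hypothetical placeholders, not a real vendor API.
import datetime

class AppClient:
    """Stands in for the application or database being protected."""
    def begin_online_backup(self):
        print("application quiesced and placed in online backup mode")
    def end_online_backup(self):
        print("application resumed")

class ArrayClient:
    """Stands in for the storage array's snapshot interface."""
    def create_snapshot(self, volume: str) -> str:
        snap_id = f"{volume}-snap-{datetime.datetime.now():%Y%m%d%H%M%S}"
        print(f"array created snapshot {snap_id}")
        return snap_id

def snapshot_backup(app: AppClient, array: ArrayClient, volume: str) -> str:
    """Quiesce, snapshot on the array, resume; the snapshot is the backup copy."""
    app.begin_online_backup()
    try:
        snap_id = array.create_snapshot(volume)   # work is offloaded to the array
    finally:
        app.end_online_backup()                   # production resumes almost immediately
    # The snapshot can later be mounted on a backup host and copied to tape.
    return snap_id

if __name__ == "__main__":
    snapshot_backup(AppClient(), ArrayClient(), volume="prod-db-vol01")
```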

Billy Naples: OK. That's interesting, Laura, because one of your polling questions asked about
the challenges, and I noticed 25% of the people said they had an inability to meet recovery
point and recovery time objectives. Would array-based snapshots, because they put no load on
the host, allow you to take more frequent backups of your virtual environment?

Laura DuBois: Well, that's a great point! So, yes, you can do more frequent backups than the
traditional backup every night. You're getting a more granular recovery point: instead of
recovering from your last backup from last night, you can recover to a point from 15 minutes
ago, your last known good snapshot. So, you're able to improve your recovery points, and you're
able to improve your recovery time, because you're recovering from disk and not from tape.
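
To put rough numbers on that, here is a small sketch comparing the worst-case data loss of a nightly backup with snapshots taken every 15 minutes; both intervals are illustrative assumptions rather than recommendations.

```python
# Back-of-the-envelope comparison of worst-case data loss (RPO) for two schedules.
def worst_case_rpo_minutes(interval_minutes: float) -> float:
    """With a fixed schedule, the worst-case data loss equals one full interval."""
    return interval_minutes

nightly = worst_case_rpo_minutes(24 * 60)   # one backup per night
frequent = worst_case_rpo_minutes(15)       # snapshot every 15 minutes

print(f"nightly backup: up to {nightly:.0f} minutes of data at risk")
print(f"15-minute snapshots: up to {frequent:.0f} minutes of data at risk")
```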

Billy Naples: OK, great! Another question we have here is, do you need a SAN vendor-supplied
agent to back up the snapshots? We're talking about array-based snapshots here.

Laura DuBois: If you want application consistency, then yes, you would typically need some form
of plug-in on the application server, depending on the vendor.

Billy Naples: One of the questions here is that Hyper-V has a virtual machine snapshot
capability. Can I use those snapshots for backups of Hyper-V hosts?

Laura DuBois: Yes, that's a good question! There is a distinction, actually, between the
virtual machine snapshot capability and the array-based snapshots I've been talking about.
Virtual machine snapshots are not the same as backups created by the Volume Shadow Copy
Service, or VSS, writers, and Microsoft doesn't recommend using virtual machine snapshots as a
permanent data or system recovery point. Even though virtual machine snapshots in Hyper-V
provide a convenient way to store different points in time, basically capturing system state
data and configuration, what Microsoft really developed them for was test and dev purposes. So,
I think you would find that Microsoft's support guidance recommends against it; it's really not
a tool to be used natively for backup. One reason is that Hyper-V virtual machine snapshots
don't protect against problems that may occur on the server running Hyper-V, such as a hardware
malfunction on the physical computer or a software-related issue in the operating system.
Another reason is that the applications running in the virtual machine are not aware of the
Hyper-V snapshot, so they can't handle it appropriately. If you use a virtual machine snapshot
to restore an Exchange server, the server would expect the same set of client connections that
were there when the snapshot was taken. So, using the Hyper-V virtual machine snapshot as a
backup tool is really not something I think Microsoft recommends.

Billy Naples: OK. One last question, and then, in the interest of time, we'll wrap it up, if
that's OK. This last question is, "Why can't we have a good recovery time/recovery point
objective with non-array-based snapshots?"

Laura DuBois: You can. Array-based snapshots aren't the only solution; it depends on what you
need. You can have an adequate recovery time or recovery point objective based upon host-based
replication. The advantage of array-based replication is that you're offloading the host.

Billy Naples: The last one and then wrap it up. How‟s that?

Laura DuBois: OK.


Billy Naples: One last one. "Does deduplication cause a restore penalty, like compression does?"

Laura DuBois: Typically, no. What deduplication does depends on how you do it, so the strict
answer is, it depends. But, if you're doing target-based dedupe, for example, the system is
really storing pointers: blocks that are redundant are kept once, and pointers reference those
blocks. A restore is really about playing back those pointers (inaudible), so "no" is the
answer.
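
As a toy illustration of that pointer idea, here is a minimal Python sketch of block-level deduplication with an in-memory block store. The 4 KB block size and the hashing scheme are assumptions for illustration only; real target-side dedupe systems work on disk and at much larger scale.

```python
# Minimal sketch of pointer-based deduplication: unique blocks are stored once,
# and each write returns a list of pointers (fingerprints) into the block store.
import hashlib

BLOCK_SIZE = 4096
store = {}          # fingerprint -> unique block data

def dedupe_write(data: bytes) -> list:
    """Split data into blocks, keep one copy per unique block, return pointers."""
    pointers = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        fp = hashlib.sha256(block).hexdigest()
        store.setdefault(fp, block)      # redundant blocks are stored only once
        pointers.append(fp)
    return pointers

def dedupe_restore(pointers: list) -> bytes:
    """Restore simply reads back the blocks the pointers reference."""
    return b"".join(store[fp] for fp in pointers)

if __name__ == "__main__":
    original = b"ABCD" * 4096            # highly redundant sample data
    ptrs = dedupe_write(original)
    assert dedupe_restore(ptrs) == original
    print(f"{len(ptrs)} pointers, {len(store)} unique block(s) stored")
```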

Billy Naples: OK. Alright, and with that, thank you Laura! That was brilliant! I think we‟ll
wrap it up. I don‟t know if I need to hand off to anybody, but thank you very much for your
time, your insights, and your knowledge. We appreciate it!

Laura DuBois: Thanks so much! It‟s been great being here!
