89 Fifth Avenue, 7th Floor

New York, NY 10003

www.TheEdison.com

212.367.7400

White Paper

HP 3PAR Thin Deduplication: A Competitive Comparison
Printed in the United States of America

Copyright 2014 Edison Group, Inc. New York.

Edison Group offers no warranty either expressed or implied on the information contained
herein and shall be held harmless for errors resulting from its use.

All products are trademarks of their respective owners.

First Publication: June 2014

Produced by: Chris M. Evans, Senior Analyst; Manny Frishberg, Editor; Barry Cohen, Editor-in-Chief
Table of Contents

Executive Summary

Introduction
  Objective
  Audience
  Contents of this Report

Space Optimization in Primary Storage
  Data Deduplication
    Technical Features
  Managing Resiliency
  Making the Cost of Flash Acceptable
  Anticipated Space Savings

HP 3PAR Thin Deduplication: Deep Dive
  Background
    Hardware Acceleration
  Thin Deduplication Implementation
  Express Indexing
  Thin Clones
  Virtual Volume Conversion
  Thin Deduplication Estimation
  Space Savings and Write Efficiency

Competitive Analysis
  SolidFire Storage System
  Pure Storage FlashArray
  EMC XtremIO

Conclusions and Recommendations
  Interpreting Savings

Executive Summary

As data growth continues at exponential rates, IT departments are being asked to deliver
storage at ever-increasing levels of efficiency – the classic “do more with less” dilemma.
At the same time, traditional storage arrays are failing to keep up with I/O density
requirements and customers are transitioning to all-flash systems, which have a much
higher raw $/GB price point. Space reduction technologies such as thin provisioning,
compression and data deduplication form a key strategy in all-flash systems by helping
businesses meet their storage needs while driving high levels of efficiency.

HP 3PAR StoreServ’s thin deduplication feature continues the story of delivering value
to customers by optimizing the way their shared storage systems store data. Thin
deduplication further leverages HP 3PAR’s custom application-specific integrated
circuit (ASIC) to minimize the impact of performing deduplication inline, as data is
written to the array. Strong data integrity is maintained through additional integrity
checks on every deduplicated write, a process achieved at line speed using the ASIC
technology.

HP 3PAR StoreServ thin deduplication is the latest feature in a line of thin technologies,
including thin provisioning, thin persistence and thin reclaim, that deliver value and cost
savings to the customer. Each of these technologies is fully built into the 3PAR StoreServ
architecture.

In this study, HP 3PAR StoreServ was compared to competing all-flash offerings from
SolidFire, Pure Storage and EMC. All of the solutions offer inline (real-time)
deduplication, although FlashArray from Pure Storage also performs some post-processing
of data. Both SolidFire and Pure Storage integrate compression into their space saving
technologies (and their savings figures). Only HP 3PAR and Pure Storage offer
additional data integrity checking through hash verification.

From thin deduplication alone (not including Zero Page Detect), HP 3PAR StoreServ
achieves up to a 10:1 savings, depending on the data type in use. This exceeds the
figures claimed by the three competing platforms, two of which also include
compression technology and pattern detection in their calculated figures.

In summary, thin deduplication, added to the existing set of thin technologies, extends
HP 3PAR StoreServ’s leadership in offering customers highly efficient, highly scalable
primary storage for every enterprise requirement.

Introduction

Objective

This report looks at the implementation of data deduplication on the HP 3PAR StoreServ
storage platform and compares the features and functionality offered to equivalent
products in the marketplace today. The constant drive to “do more with less” means all
space reduction technologies are valuable tools for increasing the level of efficiency in
primary storage arrays. The ubiquity of flash, as we will discuss, means primary
deduplication is ready for production implementation.

Audience

Decision makers in organizations looking to deliver highly efficient deployments of
centralized storage will find that this report provides an understanding of the technical
issues in deploying deduplication and the benefits it can deliver.

Contents of this Report

 Executive Summary – A summary of the background and conclusions derived from
Edison’s research and analysis.

 Space Optimization in Primary Storage – A primer on the evolution of shared
storage and the space saving techniques that help to manage exponential growth.

 HP 3PAR Thin Deduplication: Deep Dive – An in-depth discussion of the features
and functionality of the HP 3PAR StoreServ thin deduplication feature.

 Competitive Analysis – An examination of the implementation of deduplication in
competitive storage platforms, with comparison to HP 3PAR StoreServ.

 Conclusions and Recommendations – A summary of the findings from the research.

Space Optimization in Primary Storage

The exponential rate of data growth has been a significant challenge for many
organizations to manage since the introduction of shared storage over 20 years ago.
Demand for storage is insatiable, with estimates on growth varying from 50-100 percent
per annum. To help manage growth, storage vendors have implemented software
features that optimize the use of physical storage capacity. These include:

 Thin Provisioning – this is a space reduction technique that stores only host-written
data to disk. Space savings are made by storing only the actual data written to each
volume, rather than reserving the whole capacity of the volume as in “thick”
provisioned implementations. Thin provisioning solutions can save anywhere from
35-75 percent of physical disk capacity, depending on the data profile; however,
ongoing housekeeping is required to keep efficiency at optimum levels. HP 3PAR
StoreServ systems see an average of 65 percent based on field data.

 Zero Page Reclaim – this space reduction technique identifies pages of empty or
“zeroed” data and removes them from physical disk, retaining metadata to indicate
that the logical page in the volume is empty. Most solutions use post-processing
zero page reclaim (ZPR), as the overhead of identifying empty pages in real time
impacts I/O performance. The HP 3PAR StoreServ platform, however, is unique in
using a dedicated ASIC processor that identifies and eliminates zero pages in real
time (known as Inline Zero Detect), reducing disk I/O and saving disk capacity. A
software sketch of zero detection follows this list.

 Data Compression – this is a space reduction technique that identifies repeated
patterns or redundancy in data and removes it, leaving in place metadata that allows
the original information to be recreated. Although compression can deliver significant
savings, the processor overhead means many vendors have chosen not to implement
the technology.

 Space Efficient Snapshots and Clones – although not directly a space reduction
technique, snapshots and clones of primary data can be taken space-efficiently, using
metadata to track the differences between the primary volume and the snapshots.
On some architectures there are performance implications to using snapshots, and
some require space to be reserved for a snapshot pool; no such restrictions exist
within the HP 3PAR StoreServ platform.
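
To make the zero-detect concept concrete, here is a minimal software sketch in Python, with hypothetical names; on HP 3PAR StoreServ this check is performed by the ASIC at line speed rather than in software.

```python
# Minimal sketch of inline zero page detect (hypothetical names).
# On 3PAR StoreServ this runs in the ASIC at line speed; it is shown
# in software here purely to illustrate the concept.

PAGE_SIZE = 16 * 1024          # 3PAR maps volumes at 16KiB granularity
ZERO_PAGE = bytes(PAGE_SIZE)   # a page of all zeros

def write_page(volume_map, lba, page):
    """Store one page, eliding all-zero pages from physical media."""
    if page == ZERO_PAGE:
        volume_map[lba] = None   # metadata only: logical page is empty
        return "zero-detected"   # no backend I/O, no capacity consumed
    volume_map[lba] = page       # stand-in for a real backend write
    return "written"

def read_page(volume_map, lba):
    """Return stored data, synthesizing zeros for elided pages."""
    page = volume_map.get(lba)
    return ZERO_PAGE if page is None else page
```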

Data Deduplication

Deduplication is a space reduction technique that identifies redundant or duplicate data
in physical storage, removing the redundant copies to retain a single copy of data on
disk. Metadata (in the form of lookup tables in memory) is used to map logical volumes
to the single-instance copies of data. Significant savings in physical disk capacity can be
achieved where systems contain large amounts of similar or repeated data, such as
virtual server and virtual desktop environments. To date, deduplication has been widely
used in disk backup systems, where savings of 90-95 percent (a 10:1 to 20:1 reduction in
physical capacity) have been realized.

Technical Features

Some of the technical features of data deduplication include:

 Inline/Post Processing – data deduplication can be performed either as data is being
committed to disk, in which case it is known as inline, or after the data is on disk, so-
called post processing. Inline processing requires fast, efficient algorithms to
minimize any impact on performance, with the added benefit that space savings are
realized immediately. Post processing removes any direct performance impact;
however, physical disk space usage will fluctuate as data is written to disk and
deduplication is performed as a background task.

 Fixed/Variable Block Size – deduplication techniques identify potentially duplicate
data using either fixed or variable block techniques. Variable block algorithms
typically produce higher deduplication ratios than fixed-block solutions but require
more processing overhead. Smaller fixed block sizes tend to produce more efficient
results, but cost more in processor overhead and system memory through additional
metadata lookups.

 Data Hashing – hashing refers to the process of generating a compact checksum value
from a block of data. The hash value from each block is used as the fingerprint to
reference that data in metadata tables and when comparing new data for
deduplication. Hashing techniques vary in their reliability, with some algorithms
generating the same hash value for different data, known as a “hash collision”. There
is a balance to be struck between the complexity of the hash algorithm and its impact
on performance, so some implementations use lightweight hashing and validate all
data before confirming duplicates; a sketch of this verify-on-match approach follows
this list.

 Data Profile – deduplication results in a more randomized pattern of access for a
single volume, as the original physical locations for blocks of data are no longer
determined by the logical volume layout. Random data access is more difficult for
HDD-based storage arrays to manage, as random I/O incurs significant latency from
mechanical disk head movement. Flash storage, on the other hand, has no such issues,
making the technology highly suited to managing deduplicated data.
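
The sketch below (Python, hypothetical structure) illustrates the verify-on-match approach mentioned under Data Hashing: a deliberately lightweight hash nominates candidate duplicates, and a full byte comparison rules out collisions before any block is shared.

```python
# Verify-on-match deduplication sketch (illustrative, not any vendor's
# actual implementation). zlib.crc32 is deliberately weak and
# collision-prone, which is exactly why the full compare is needed.

import zlib

def dedupe_write(store, data):
    """store maps a hash value -> list of stored blocks with that hash."""
    h = zlib.crc32(data)                      # cheap fingerprint
    for block in store.get(h, []):
        if block == data:                     # full compare: no collision
            return block                      # dedupe hit: share the copy
    store.setdefault(h, []).append(data)      # unique data (or a collision)
    return data
```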

Managing Resiliency

In systems that are highly deduplicated, a single block of data may be a component of
tens or hundreds of logical volumes. As a result, the impact of losing data due to a
hardware failure is much higher than in non-deduplicated environments. Data loss
could occur through logical corruption (due to a software bug) or through hardware
failure (such as two disks failing in a RAID group using single parity). In addition, some
deduplication implementations are enabled by default and cannot be disabled by the
administrator, which may be undesirable for certain data types.

Making the Cost of Flash Acceptable

All-flash arrays are recent entrants into the shared storage marketplace. These
appliances use flash exclusively as the permanent storage medium. Flash is much more
expensive per GB than traditional hard drives, and as a result, vendors of these products
have looked for ways to make the cost of all-flash arrays more acceptable based on the
historical $/GB measurement.

One solution has been to quote array capacities after space reduction savings have been
applied. The result is a much more palatable cost that is more in line with traditional
disk-based arrays. However, basing purchasing decisions on anticipated space savings
can be risky unless the data profile is well known or validated first.

Anticipated Space Savings

The aim of deduplication is to save physical disk space. Savings vary with the type of
data being optimized, with highly redundant data such as virtual server and VDI
(Virtual Desktop Infrastructure) deployments seeing the best results. Structured data,
encrypted data and media content do not usually realize much in the way of savings, as
the data has usually already been optimized by the application. Data savings may also
change over time as information is created and destroyed through a normal lifecycle.
The savings made from deduplication should therefore be seen as an additional benefit
rather than as a core capacity measurement.

HP 3PAR Thin Deduplication: Deep Dive

Background

The HP 3PAR StoreServ architecture is based on a cache-coherent, active-mesh cluster
composed of multiple controller nodes and disk shelves. All controllers participate in
data access in an “active-active” configuration, ensuring that all resources on all nodes
are used to service I/O requests. The HP 3PAR OS uses a three-level mapping
methodology, similar to that used in enterprise operating systems, to store and track
physical and virtual resources. With the introduction of flash technology, the HP 3PAR
StoreServ architecture is ideally placed to exploit faster storage media through features
that include the existing range of thin technologies.

Physical space on backend storage is divided into 1GB units known as chunklets.
Chunklets are combined to create logical disks (LDs), applying data protection (RAID)
and data placement rules to each LD. Virtual volumes (VVs), or logical unit numbers
(LUNs), are then created out of logical disks as the entities assigned to hosts, using a
page size granularity of 16KiB. Data resilience is achieved by distributing data across
multiple nodes, disk shelves and disks.
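
A simple model of this provisioning hierarchy is sketched below in Python; the class names and fields are illustrative assumptions, not the actual HP 3PAR OS structures.

```python
# Illustrative model of the chunklet -> logical disk -> virtual volume
# hierarchy described above (names and fields are assumptions).

from dataclasses import dataclass, field

CHUNKLET_SIZE = 1 << 30   # 1GB physical units carved from backend drives
PAGE_SIZE = 16 * 1024     # virtual volumes are mapped at 16KiB granularity

@dataclass
class LogicalDisk:
    raid_level: str                                 # protection per LD
    chunklets: list = field(default_factory=list)   # (drive, offset) pairs

@dataclass
class VirtualVolume:
    name: str
    pages: dict = field(default_factory=dict)       # page# -> (LD, offset)

ld = LogicalDisk("RAID6", chunklets=[("pd0", 0), ("pd7", 0)])
vv = VirtualVolume("vv0", pages={0: (ld, 0), 1: (ld, PAGE_SIZE)})
```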

Hardware Acceleration

One of the key differentiators of the 3PAR StoreServ platform is the use of a custom
hardware controller, or ASIC. The ASIC, now in its fourth generation, provides line
speed zero page detect for each 16KiB block of data written to the array. It is a core
technology in delivering the existing 3PAR StoreServ thin technologies, including thin
provisioning, thin persistence, thin conversion and thin copy reclamation.

Thin Deduplication Implementation

Thin deduplication is a new feature initially implemented on HP 3PAR StoreServ
Storage Systems. The feature is provided as a no-cost option within the base HP 3PAR
OS suite, giving customers immediate cost savings at no additional charge. Thin
deduplication is available for both virtual volumes and snapshots.

Thin deduplication is an inline deduplication process that takes advantage of the
generation-four ASIC to perform hash calculations on each 16KiB block of data as it is
written to the system. When data is received by the system, the hash calculation is
offloaded to the ASIC and delivered at wire speed. The array then uses a feature called
Express Indexing to check whether the new data already exists in the system. If a hash
match is found, the ASIC performs a bit-by-bit comparison of the new data with the
copy on the backend flash to ensure no hash collision has occurred. As this function is
offloaded to the ASIC and performed at line speed, there is negligible CPU overhead.

Express Indexing

The HP 3PAR operating system uses a process called Express Indexing to detect
duplicate page data. The process takes advantage of the innovative and robust tri-level
indexing system used within the OS to store and manage traditional (non-deduplicated)
volumes.

When data is received by the array, Express Indexing calculates a hash value for each
16KiB block of data. The hash value is then used to check whether the new data block
already exists on the system by “walking” the metadata tables. If a matching block is
located, it is read from the backend and compared at the bit level (using XOR) in the
ASIC. The XOR of two identical pages results in a page of zeros, which is detected inline
by the ASIC’s built-in zero-detection engine.

A successful comparison results in a “dedupe hit”, in which case the virtual volume
LBA pointers are updated to reference the located data and the incoming data is
discarded. In the unlikely event a hash collision is detected, the data is stored to disk
directly associated with the virtual volume and is not treated as deduplicated. If no
match is found at lookup, a new data block is allocated and the data is written to
backend storage.

With this innovative technique, the HP 3PAR StoreServ solution makes efficient use of
existing memory structures to track unique and deduplicated data and map it to virtual
volumes. With the 3PAR memory structure design there is no need to keep reference
counts for shared data, as any unreferenced data is eventually cleaned up by an online
garbage collection process using a “mark and sweep” algorithm.
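
The overall write path can be summarized in the following Python sketch (all names hypothetical); on real hardware the hash and the XOR page comparison are offloaded to the ASIC.

```python
# Sketch of the Express Indexing write path described above. The hash
# and the XOR page compare run in the ASIC on real hardware; both are
# emulated in software here, and all names are hypothetical.

import hashlib

def pages_match(a, b):
    """ASIC-style check: the XOR of two identical pages is all zeros."""
    return len(a) == len(b) and not any(x ^ y for x, y in zip(a, b))

def write_16k(index, backend, vv_pointers, lba, page):
    """index: hash -> slot; backend: stored pages; vv_pointers: LBA map."""
    h = hashlib.sha1(page).digest()        # stand-in for the ASIC hash
    slot = index.get(h)                    # "walk" the metadata tables
    if slot is not None:
        if pages_match(backend[slot], page):
            vv_pointers[lba] = slot        # dedupe hit: update LBA pointer
            return "dedupe-hit"
        backend.append(page)               # rare collision: store privately
        vv_pointers[lba] = len(backend) - 1
        return "collision"
    backend.append(page)                   # unique data: allocate new block
    index[h] = vv_pointers[lba] = len(backend) - 1
    return "new-block"
```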

Thin Clones

The abstraction of logical and physical volume content through deduplication provides
the ability to implement features such as thin clones. A thin clone is a replica of a
volume that is created through copying only the metadata that associates a virtual
volume with the physical data on disk. At initial creation, thin clones point to the same
blocks of data as the cloned volume; however, as volumes are updated and the content
of the data changes, new writes map to different deduplicated blocks (or create new
blocks), so no direct overwrite process occurs. Thin clones continue to “stay thin” as
long as updated data maps to existing deduplicated data on the array.

Thin clones allow HP 3PAR StoreServ to implement highly efficient, instant volume
copies for hypervisor cloning functions such as VAAI on VMware vSphere and ODX on
Microsoft Hyper-V.
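
A minimal sketch of why thin clone creation is effectively instant: only the LBA-to-block pointer table is copied, never the data itself (Python, hypothetical structures).

```python
# Metadata-only cloning sketch (hypothetical structures). Copying the
# page table is all that is required; no user data moves.

def thin_clone(vv_pointers):
    """Clone a volume by duplicating its LBA -> block pointer table."""
    return dict(vv_pointers)        # both volumes share the same blocks

source = {0: 17, 1: 42}             # LBA -> deduplicated block reference
clone = thin_clone(source)

clone[1] = 99                       # a later write simply remaps the
assert source[1] == 42              # clone's LBA; the source is untouched
```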

Virtual Volume Conversion

When thin provisioning technology was first introduced into storage arrays, many end
users were wary of the technology due to the perceived risk of oversubscription causing
write I/O failures. As customers have become familiar and comfortable with the
technology, thin provisioning has gained widespread acceptance and is now one of the
standard space saving techniques in use by many organizations.

Some customers may similarly feel uncomfortable placing all of their data onto a
deduplicated platform, and in mixed hard disk and SSD deployments, Thin
Deduplication would be supported only on the SSD tier.

VV Conversion provides HP 3PAR StoreServ customers with the ability to convert
virtual volumes between any of the three formats (thick, thin and dedupe) on demand,
with the ability to revert the volume back as required.

The flexibility of VV Conversion means customers can choose to move data to
deduplication with confidence, and can take advantage of deduplication savings as they
migrate data from traditional HDD tiers to SSD tiers of storage.

Thin Deduplication Estimation

HP 3PAR StoreServ Storage users can preview the potential space-saving benefits of
deduplication on existing production volumes by running a Thin Deduplication
estimate. The assessment can be run against any online production volume as a
non-intrusive, non-disruptive background task. Additionally, for customers interested
in assessing the benefits of adding flash to a system, 3PAR Flash Advisor provides a
built-in, free and complete set of tools that includes a Thin Deduplication benefits
assessment.
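
Conceptually, such an estimate can be produced by hashing every block of a volume and comparing the unique count to the total, as in this illustrative Python sketch (this is not HP's tool, just the underlying idea).

```python
# Illustrative deduplication estimate: scan a volume, hash each block,
# and derive the ratio of total blocks to unique blocks.

import hashlib

def estimate_dedupe_ratio(read_block, num_blocks, block_size=16 * 1024):
    unique = set()
    for i in range(num_blocks):
        unique.add(hashlib.sha1(read_block(i, block_size)).digest())
    return num_blocks / max(len(unique), 1)

# Toy "volume" containing only four distinct 16KiB patterns -> 25:1
blocks = [bytes([i % 4]) * (16 * 1024) for i in range(100)]
ratio = estimate_dedupe_ratio(lambda i, size: blocks[i], len(blocks))
print(f"estimated ratio {ratio:.0f}:1")   # prints: estimated ratio 25:1
```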

Space Savings and Write Efficiency

HP 3PAR Thin Deduplication has been shown to deliver savings of up to 10:1,
depending on the source data. This exceeds the levels of savings claimed by other all-
flash storage vendors. HP has also researched the differences between the default
16KiB block size of the 3PAR StoreServ platform and the smaller 4KiB size used by
other platforms. Tests showed a modest improvement in savings of less than 15 percent.
As a result, HP chose to remain with the existing 16KiB block size, as this results in the
optimum use of processor and memory resources.

HP also looked at telemetry data from tens of thousands of existing customer systems.
These showed the sweet spot for deduplication was between 8KiB and 16KiB in block
size. Values lower than this saw some modest improvement in savings but introduced
higher system load.

HP 3PAR StoreServ’s write striping capability means that write I/Os are distributed
evenly across SSDs, reducing the risk of catastrophic device failure. HP provides a 5-year
unconditional warranty on cMLC drives in StoreServ systems.

Inline Zero Detect means zeroed data is removed from the I/O pipeline and never
written to backend storage, further reducing wear on SSD devices. Finally, features such
as Adaptive Write and Adaptive Sparing provide additional SSD management,
extending usable SSD capacity by a further 20 percent.

All of the features described are fully integrated with the new thin deduplication
technology.

Competitive Analysis

Data deduplication has not been widely adopted in the primary storage marketplace;
however, all-flash array vendors have used the technology as part of their new
architecture designs. The notable exception to this late adoption is NetApp, which
introduced deduplication technology into Data ONTAP as early as 2007. Unfortunately,
that implementation was based on post-processing and consequently limited aggregate
sizes due to the performance impact of the post-processing task.

In the all-flash startup market, deduplication has become a “table stakes” feature, with
vendors looking to emphasize the effective cost per GB of their products after space
saving techniques have been applied. This has caused problems for Violin Memory,
which has no native space reduction technologies in its products.

Three vendors offering deduplication have been chosen as a comparison to the HP 3PAR
StoreServ technology. These are SolidFire’s Storage System, Pure Storage FlashArray
and EMC XtremIO. All of these systems are new technology from startups and therefore
have deduplication built into their architecture.

SolidFire Storage System

SolidFire’s Storage System has been available since 2012, evolving through three
generations of hardware and six generations of the platform’s Element operating
system. The SolidFire architecture is a scale-out “shared nothing” loosely coupled node
design, which uses a back-end 10GbE network for inter-node communication. Systems
can expand and shrink by adding and removing nodes. Data protection is implemented
through simple mirroring of data between nodes.

SolidFire uses a content-based data placement algorithm to distribute data evenly across
a node complex. Space reduction is achieved through a combination of data
deduplication and compression. As data is received by the system, it is divided into
4KiB blocks and compressed before being hashed. The content is then routed to the node
responsible for managing that hash range. If the new data is found to be a duplicate,
a reference to the content is stored against the volume and the node discards it; if the
data is unique, it is written to SSD. New deduplicated data is not checked before writing
to disk.
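
The compress-then-hash-then-route flow can be sketched as follows (Python; the partitioning scheme shown is a simplistic assumption, as SolidFire's actual hash-range ownership is proprietary).

```python
# Content-based placement sketch: compress, hash, then route the block
# to the node that owns its hash range (ownership scheme is assumed).

import hashlib
import zlib

def route_block(block, nodes):
    compressed = zlib.compress(block)             # inline compression first
    digest = hashlib.sha256(compressed).digest()  # fingerprint of content
    owner = nodes[digest[0] % len(nodes)]         # naive hash-range split
    return owner, digest, compressed

nodes = ["node-a", "node-b", "node-c", "node-d"]
owner, digest, payload = route_block(b"x" * 4096, nodes)
# The owning node checks its table for `digest`: a hit stores only a
# reference against the volume; a miss writes `payload` to SSD.
```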

Compressing data as it is written to the system results in blocks of variable length,
which are then written in a tightly packed arrangement on backend storage. This means
that as data is expired from the system, housekeeping is required to reclaim usable
space and restack content on physical media.

SolidFire’s inline deduplication is based on a 4KiB block size and is always enabled.
The company claims between 4:1 and 10:1 efficiency savings based on both compression
and deduplication, although no breakdown by method is given.

Pure Storage FlashArray

Pure Storage released their first FlashArray product in May 2012. The system is built on
a scale-up architecture consisting of dual active-active redundant node controllers and
shelves of solid-state disk (SSD).

FlashArray uses five different techniques for data reduction1, known collectively as
“FlashReduce”. The components are:

 Pattern Removal – this looks for repeated patterns in data, including identifying
zeroed data.
 Inline Compression – this process uses a lightweight implementation of the LZO
(Lempel-Ziv-Oberhumer) algorithm and is a “first pass” at compression, applied
inline before data is committed to disk.
 Adaptive Inline Deduplication – deduplication is performed inline using a variable-
size block deduplication algorithm, based on blocks from 4KiB to 32KiB in 512-byte
increments (the minimum size is based on SSD page writes, which are 4KiB).
 Deep Reduction – this process uses a patent-pending form of the Huffman encoding
algorithm and is performed as a post-processing task to achieve more aggressive
space savings.
 Copy Reduction – all snapshots and clones in a FlashArray system are deduplication-
aware. This feature is also implemented in the HP 3PAR StoreServ platform.

Deduplication is always enabled within FlashArray systems; however, the architecture
allows the deduplication process to be curtailed during periods of heavy system load. In
this scenario, hash lookups may be abandoned and potentially duplicate data written to
disk. As a result, FlashArray uses the Deep Reduction feature to identify missed
deduplication opportunities and to apply compression more aggressively than could be
achieved inline.

1 http://www.purestorage.com/blog/pure-storage-flash-bits-adaptive-data-reduction/

Edison: HP 3PAR StoreServ Thin Deduplication: A Competitive Comparison Page 11


FlashArray deduplication cannot be disabled on a per-volume basis; all volumes have
deduplication applied to them. Pure Storage quotes its space savings using a “real-time”
ticker on its website, which shows savings based on information from customer arrays.
At the time of writing this showed an overall reduction rate of 5.72:1, with 2.13:1
achieved from deduplication and 2.68:1 from compression (the component ratios
multiply, and 2.13 × 2.68 ≈ 5.71 is consistent with the overall figure).

EMC XtremIO

EMC acquired the Israeli startup XtremIO in 2012, with the first GA products shipping
at the end of 2013. The all-flash XtremIO platform is based on a scale-out node
architecture of paired controllers called X-Bricks, which encapsulate a fixed amount of
flash (25 drives) per controller pair. Multiple X-Bricks are connected through an RDMA
mesh.

The XtremIO design uses a content-based data placement architecture where data is
stored in 4KiB blocks based on the hash value generated by each write I/O. This results
in an even distribution of data across all nodes in a system, with each node managing a
part of the hash value address space. The distribution mechanism means system
expansion is a non-trivial exercise and currently XtremIO systems cannot be expanded.

The XtremIO operating system (XIOS) runs a number of processes (called modules) that
manage data flow in the XtremIO system. As write I/Os are received, the Routing
module splits the data into 4KiB chunks and calculates the hash value of each chunk.
The Control module maintains a hash table and checks whether the hash value
represents data already stored by the system. If the data is unique, the hash value is
recorded and the data is passed to a Data module to store on SSD. If the data is a
duplicate, the Data module simply increments a reference count and discards the data.
The XtremIO system is therefore heavily dependent on maintaining accurate reference
counters for each 4KiB of stored data.
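
The reference-counting dependency can be illustrated with a short Python sketch (hypothetical structures, not the actual XIOS modules): losing track of a single counter would either leak space or, worse, free a block that live volumes still reference.

```python
# Reference-counted deduplication sketch (illustrative only).

import hashlib

def write_4k(table, chunk):
    """table: hash -> [refcount, data]. Note: no verify before commit."""
    key = hashlib.sha256(chunk).digest()
    if key in table:
        table[key][0] += 1          # duplicate: bump the reference count
    else:
        table[key] = [1, chunk]     # unique: store the chunk on SSD
    return key

def release(table, key):
    table[key][0] -= 1
    if table[key][0] == 0:
        del table[key]              # last reference gone: reclaim space
```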

XtremIO is based on fixed 4KiB blocks, with no verification of the hash value before
committing data to disk. Deduplication is global across the entire XtremIO cluster, due
to the use of content-based data placement. However, data is not replicated across
nodes; instead, XtremIO uses a RAID-6-style protection mechanism called XDP, which
writes data redundantly within each X-Brick with a capacity overhead of around 8
percent (consistent with two parity drives out of the 25 in each X-Brick). Loss of an
X-Brick therefore means data becomes inaccessible. The current design of XDP means
no flexibility in data protection mechanisms is available, and deduplication cannot be
turned off for more sensitive data. EMC claims a 5:1 deduplication ratio in its
documentation when quoting usable capacity.

Conclusions and Recommendations

Data deduplication is a technology that can offer significant space and cost savings in
primary storage. Due to the random nature of deduplicated data, the technology has not
seen traction and deployment in traditional arrays; instead it has become a key feature
for all-flash solutions, which capably cope with the random I/O profile.

The underlying design and architecture of the HP 3PAR StoreServ platform means it is
well suited to the requirements of deduplication on flash storage. HP 3PAR StoreServ
Thin Deduplication continues the evolution of space savings features of the platform,
adding to savings customers are already achieving through thin provisioning, thin
reclaim, thin conversion and thin persistence.

Thin Deduplication leverages the 3PAR StoreServ custom ASIC to perform hashing and
data integrity checking at line speed; the ASIC continues to be a key differentiator in the
primary array marketplace.

In comparison to other platforms, HP 3PAR StoreServ implements Thin Deduplication
with little or no performance overhead and provides the customer with the ability to
choose which data should be considered for deduplication on a volume-by-volume
basis. In true 3PAR StoreServ ethos, space saving settings can be changed dynamically,
without requiring work by the customer or restricting the array design or layout.

Interpreting Savings

Vendor explanations of space savings tend to be murky and lack transparency. Some
vendors exclude their RAID overhead; some include all space saving techniques
(including thin provisioning) without providing a breakdown of the savings and how
they are achieved. There is also typically no discussion of how much space metadata
occupies on backend storage.

In the product comparisons, EMC XtremIO quotes a saving ratio of 5:1 (without any
detail on how this is achieved), Pure Storage quotes 5.72:1 and SolidFire quotes values
from 4:1 to 10:1. Note that the figures from Pure and SolidFire also include compression
savings (which carry considerable processor overhead); compression is not currently an
HP 3PAR StoreServ feature.
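
When comparing these claims, it helps to translate reduction ratios into the percentage of physical capacity saved (savings = 1 - 1/ratio); the short Python helper below does this for the figures quoted above.

```python
# Convert vendor reduction ratios into percentage of capacity saved.

def ratio_to_savings_pct(ratio):
    return (1 - 1 / ratio) * 100

claims = [("EMC XtremIO", 5.0), ("Pure Storage", 5.72),
          ("SolidFire (low end)", 4.0), ("HP 3PAR (up to)", 10.0)]
for name, ratio in claims:
    print(f"{name}: {ratio}:1 -> {ratio_to_savings_pct(ratio):.0f}% saved")
# 5:1 -> 80%, 5.72:1 -> 83%, 4:1 -> 75%, 10:1 -> 90%
```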

HP 3PAR StoreServ systems achieve deduplication ratios of up to 10:1 without
including savings from the other thin technologies. Space savings from Inline Zero
Detect, for example, are not included but can be significant, making overall savings
much greater.

Data deduplication ratios alone are not a true indication of the benefit of deduplication
technology. HP 3PAR StoreServ integrates deduplication with existing thin technologies
and features such as Thin Clones to deliver a comprehensive integrated space saving
solution.

With the release of thin deduplication, HP 3PAR StoreServ continues to maintain
leadership in delivering highly efficient primary storage solutions to customers.

4AA5-3223ENW
