
Content Addressed Storage

Module 2.5

2009 EMC Corporation. All rights reserved.

Module Objectives
Upon completion of this module, you will be able to:
Describe CAS, fixed content and archives, and traditional storage solutions for archives
Describe the features and benefits of a CAS-based storage strategy
List the physical and logical elements of CAS
Describe the storage and retrieval process for CAS data objects
Describe the operational environments best suited to CAS solutions



Lesson: CAS Overview


Upon completion of this lesson, you will be able to:
Define Content Addressed Storage (CAS)
Describe traditional archival solutions and their shortcomings
List the benefits of CAS


What is Content Addressed Storage (CAS)


CAS is a solution for fixed content
Object-oriented, location-independent approach to data storage
Repository for the objects
Access mechanism to interface with the repository
Globally unique identifiers provide access to objects


What are Fixed Content and Archives


Digital assets retained for active reference and value
Generate new revenues, improve service levels, leverage historical value

Electronic documents: contracts, claims, etc.; e-mail and attachments; financial spreadsheets; CAD/CAM designs; presentations

Digital records
Documents: checks, securities trades, historical preservation
Photographs: personal / professional
Surveys: seismic, astronomic, geographic

Rich media
Medical: X-rays, MRIs, CT scans
Video: news / media, movies, security surveillance
Audio: voicemail, radio


Challenges of Storing Fixed Content


Fixed content is growing at more than 90% annually
A significant amount of newly created information falls into this category
New regulations require retention and data protection
Long-term preservation is often required (years to decades)
Simultaneous multi-user online access is preferable to offline storage
Faster access to fixed content is needed
Location-independent data is needed to enable technology refresh and migration
Traditional storage methods are inadequate

Traditional storage solutions for Archive


Archival solutions fall into three categories, based on the means of access: online, nearline, and offline
Traditional archival solutions were offline
The traditional archival process used optical disks and tapes as archival media
An archive is often stored on a Write Once Read Many (WORM) device, such as a CD-ROM


Shortcomings of Traditional Archiving Solutions


Tape is slow, and standards are always changing
Optical is expensive and requires vast amounts of media
Recovering files from tape and optical media is often time-consuming
Data on tape and optical media is subject to media degradation
Both solutions require sophisticated media management
CAS has emerged as an alternative to traditional archiving solutions

Benefits of CAS
Content authenticity
Content integrity
Location independence
Single-instance storage (SiS)
Retention enforcement
Record-level protection and disposition
Technology independence
Fast record retrieval
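
Single-instance storage follows from addressing objects by their content: identical content yields the same address, so it only needs to be kept once. The sketch below illustrates the idea with a toy in-memory store. The class name `SingleInstanceStore` and the use of SHA-256 are assumptions made for illustration, not details taken from any particular CAS product.

```python
import hashlib


class SingleInstanceStore:
    """Toy in-memory store keyed by a digest of the content (illustrative only)."""

    def __init__(self):
        self._objects = {}  # content address -> bytes

    def put(self, data: bytes) -> str:
        # SHA-256 is an assumption made for this sketch.
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so it is kept only once.
        self._objects.setdefault(address, data)
        return address

    def object_count(self) -> int:
        return len(self._objects)


store = SingleInstanceStore()
first = store.put(b"quarterly report")
second = store.put(b"quarterly report")  # same bytes submitted again
other = store.put(b"different report")

print(first == second)        # both submissions resolve to one address
print(store.object_count())   # number of distinct objects actually held
```

Because the address depends only on the bytes, a second submission of the same record costs no additional capacity in this sketch.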


Lesson Summary
Key points covered in this lesson:
CAS Definition
Challenges of Storing Fixed Content
Shortcomings of Traditional Archiving Solutions
Benefits of CAS


Lesson: CAS Architecture


Upon completion of this lesson, you will be able to:
Describe CAS architecture
Describe physical and logical elements of CAS
Describe the data storage and retrieval process in a CAS environment
CAS examples


Physical Elements of CAS

Storage devices (CAS based)
Servers (to which storage devices are connected)
Client

[Figure: client and server with the API, connected over IP to the CAS system of access nodes and storage nodes on a private LAN]

CAS Terminology
Application Programming Interface (API)
A set of function calls that enables communication between applications or between an application and an operating system

Access Profile
Used by access applications to authenticate to a CAS cluster, and by CAS clusters to authenticate themselves to each other

Virtual Pools
Enable a single logical cluster to be broken up into multiple logical groupings of data

BLOB
The Distinct Bit Sequence (DBS) of user data represents the actual content of a file and is independent of the filename and physical location
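
The independence of a BLOB from its filename and location can be shown with a few lines of code. This is a minimal sketch that assumes a SHA-256 digest stands in for the address calculation. The function name `content_address` and the file paths are hypothetical.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Derive an address from the bytes alone (SHA-256 is assumed here)."""
    return hashlib.sha256(data).hexdigest()


# Two hypothetical files with different names and paths but identical bytes.
files = {
    "/archive/2009/scan_001.dcm": b"\x00\x01image-bytes",
    "/tmp/copy_of_scan.dcm": b"\x00\x01image-bytes",
}
addresses = {path: content_address(data) for path, data in files.items()}

# The filename and location never enter the calculation,
# so both paths resolve to a single address.
same = len(set(addresses.values())) == 1

# Changing even one byte of the content produces a different address.
changed = content_address(b"\x00\x01image-bytes!")

print(same)
print(changed in addresses.values())
```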

CAS Terminology (Cont)


C-Clip
A package containing the user's data and associated metadata
The C-Clip ID (C-Clip handle or C-Clip reference) is the CA that the system returns to the client application

Content Address (CA)
An identifier that uniquely addresses the content of a file and not its location. Unlike location-based addresses, content addresses are inherently stable: once calculated, they never change and always refer to the same content

C-Clip Descriptor File (CDF)
The additional XML file that the system creates when making a C-Clip. This file includes the content addresses for all referenced BLOBs and associated metadata
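
The relationship between BLOBs, the CDF, and the C-Clip ID can be sketched as follows. The XML layout, the element and attribute names, and the choice of SHA-256 are all assumptions made for illustration. The real CDF schema is not described in these slides.

```python
import hashlib
import xml.etree.ElementTree as ET


def address_of(data: bytes) -> str:
    # SHA-256 is an assumption for this sketch.
    return hashlib.sha256(data).hexdigest()


def build_cdf(blobs: dict, metadata: dict) -> bytes:
    """Build a small XML descriptor (hypothetical layout, not the real CDF schema)."""
    root = ET.Element("cdf")
    meta = ET.SubElement(root, "metadata")
    for key, value in sorted(metadata.items()):
        ET.SubElement(meta, "attribute", name=key, value=value)
    # Each referenced BLOB is listed by its content address.
    for name, data in sorted(blobs.items()):
        ET.SubElement(root, "blob", name=name, address=address_of(data))
    return ET.tostring(root, encoding="utf-8")


blobs = {"xray.img": b"pixel data", "report.txt": b"radiology notes"}
cdf = build_cdf(blobs, {"patient": "anonymous", "retention": "7y"})

# In this sketch the C-Clip ID is the content address of the descriptor itself.
clip_id = address_of(cdf)
print(clip_id)
```

Treating the descriptor as just another addressed object means a single identifier is enough for the application to reach both the metadata and every referenced BLOB.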

How CAS Stores a Data Object


1. Client presents data to API to be archived
2. Unique Content Address is calculated
3. Object is sent to CAS via CAS API over IP
4. CAS authenticates the Content Address and stores the object
5. Acknowledgement returned to application
6. Object-ID is retained and stored for future use

[Figure: client, application server with API, and CAS; the Object ID is returned to the application server]
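
The storage steps on this slide can be sketched in code. This is a minimal in-memory illustration, not an actual CAS API: the classes `CasSystem` and `CasApi`, the `catalog` dictionary, and the use of SHA-256 are all assumptions, and the network transfer over IP is reduced to a method call.

```python
import hashlib


def content_address(data: bytes) -> str:
    # SHA-256 is an assumption for this sketch.
    return hashlib.sha256(data).hexdigest()


class CasSystem:
    """Hypothetical in-memory stand-in for the CAS repository."""

    def __init__(self):
        self._objects = {}

    def store(self, address: str, data: bytes) -> bool:
        # CAS authenticates the address by recomputing it, then stores the object.
        if content_address(data) != address:
            return False
        self._objects[address] = data
        return True  # acknowledgement returned to the caller


class CasApi:
    """Hypothetical client-side API wrapper."""

    def __init__(self, cas: CasSystem):
        self._cas = cas

    def archive(self, data: bytes) -> str:
        address = content_address(data)         # unique address is calculated
        if not self._cas.store(address, data):  # object is sent and acknowledged
            raise IOError("CAS did not acknowledge the object")
        return address


cas = CasSystem()
api = CasApi(cas)
catalog = {}  # application-side record of object IDs

# The client presents data, and the returned ID is retained for future use.
catalog["check-0001"] = api.archive(b"check image bytes")
print(catalog["check-0001"])
```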

How CAS Retrieves a Data Object

1. Object is needed by an application
2. Application finds Content Address of object to be retrieved
3. Retrieval request is sent to the CAS via CAS API over IP
4. CAS authenticates the request and delivers the object

[Figure: client, application server with API, and CAS; the Object ID is passed from the application server to the CAS]
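
The retrieval steps can be sketched in the same spirit. In this illustration, authentication is reduced to checking that the stored bytes still hash to the requested address, and access control is not modeled. The class `CasSystem`, the `catalog` dictionary, and the SHA-256 digest are assumptions rather than details of a real product.

```python
import hashlib


def content_address(data: bytes) -> str:
    # SHA-256 is an assumption for this sketch.
    return hashlib.sha256(data).hexdigest()


class CasSystem:
    """Hypothetical in-memory stand-in for the CAS repository."""

    def __init__(self):
        self._objects = {}

    def store(self, data: bytes) -> str:
        address = content_address(data)
        self._objects[address] = data
        return address

    def retrieve(self, address: str) -> bytes:
        data = self._objects[address]  # raises KeyError for an unknown address
        # Verify that the content still matches its address before delivery.
        if content_address(data) != address:
            raise ValueError("stored object no longer matches its address")
        return data


cas = CasSystem()
catalog = {"xray-17": cas.store(b"x-ray pixels")}  # IDs kept by the application

address = catalog["xray-17"]   # the application looks up the content address
image = cas.retrieve(address)  # the request is sent and the object delivered
print(image == b"x-ray pixels")
```

Because the address is derived from the content, the same lookup doubles as an integrity check on every read in this sketch.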


CAS Features
Features available with most CAS systems are:
Integrity checking
Data protection: local replication and remote replication
Load balancing
Scalability
Self-diagnosis and repair
Report generation and event notification
Fault tolerance
Audit trails


Example 1: CAS Healthcare Solution


[Figure: hospital application server with API; patient studies are stored locally for short-term use (60 days) and data is stored on the CAS system]

Each X-ray image ranges from about 15 MB to over 1 GB
Patient records are stored online for a period of 60-90 days
Beyond 90 days, patient records are archived
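
The age-based policy in this example can be expressed as a small selection rule. The sketch below assumes a 90-day online window and a hypothetical record layout with a `study_date` field. It only decides which studies are due for archiving and does not move any data.

```python
from datetime import date, timedelta

ONLINE_DAYS = 90  # upper bound of the online period described on the slide


def select_for_archive(studies, today):
    """Return the studies older than the online window (hypothetical record layout)."""
    cutoff = today - timedelta(days=ONLINE_DAYS)
    return [s for s in studies if s["study_date"] < cutoff]


studies = [
    {"id": "study-a", "study_date": date(2009, 1, 15)},
    {"id": "study-b", "study_date": date(2009, 4, 1)},   # exactly 90 days old
    {"id": "study-c", "study_date": date(2009, 6, 1)},
]

due = select_for_archive(studies, today=date(2009, 6, 30))
print([s["id"] for s in due])
```

A study that is exactly 90 days old stays online here, since the slide archives records only beyond 90 days.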

Example 2: CAS Financial Solution


[Figure: bank application server with API, connected to the CAS system]

Check image size is about 25 KB
A check imaging service provider may process 50-90 million check images per month
Checks are stored online for a period of 60 days
Beyond 60 days, data is archived
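
A rough capacity estimate follows from these figures. The calculation below reads the monthly volume as a range of 50 to 90 million images, which is an assumption about the figure on this slide. It also uses decimal units (1 KB = 1000 bytes) and treats the 60-day online period as two months of intake.

```python
CHECK_IMAGE_BYTES = 25 * 1000              # about 25 KB per image (decimal KB assumed)
MONTHLY_IMAGES = (50_000_000, 90_000_000)  # assumed low and high monthly volumes
ONLINE_MONTHS = 2                          # 60 days online, taken as two months


def terabytes(n_bytes: int) -> float:
    """Convert bytes to decimal terabytes."""
    return n_bytes / 1000**4


# New data arriving each month at the low and high volumes.
monthly_tb = [terabytes(n * CHECK_IMAGE_BYTES) for n in MONTHLY_IMAGES]

# Data held online at any time before it moves to the archive.
online_tb = [ONLINE_MONTHS * tb for tb in monthly_tb]

print(monthly_tb)
print(online_tb)
```

Under these assumptions the provider takes in roughly 1.25 to 2.25 TB of check images per month, so the archive grows by a comparable amount every month once the online window is full.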

Lesson Summary
Key points covered in this lesson:
CAS architecture
Physical and logical elements of CAS
CAS storage and retrieval process
CAS solution examples


Module Summary
Key points covered in this module:
Benefits of CAS based storage strategy
Overview of physical and logical elements of CAS
Storing and retrieving data from CAS
CAS application examples


Concept in Practice: EMC Centera


Centera Architecture
Based on RAIN (Redundant Array of Independent Nodes)
Access nodes and storage nodes

[Figure: numbered access/storage nodes and storage nodes holding content and mirrored content, connected through Ethernet switches on a private LAN, with a LAN connection to the server and power rails]
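
The idea of mirroring content across independent nodes can be illustrated with a toy cluster. The placement rule below (two neighboring nodes chosen from the address) is a hypothetical scheme invented for this sketch, as are the class names and the use of SHA-256. It is not a description of how Centera places or mirrors objects.

```python
import hashlib


class StorageNode:
    """Hypothetical storage node holding objects in memory."""

    def __init__(self, name: str):
        self.name = name
        self.objects = {}
        self.online = True


class MirroredCluster:
    """Toy cluster that keeps each object on two different nodes."""

    def __init__(self, nodes):
        if len(nodes) < 2:
            raise ValueError("mirroring needs at least two nodes")
        self.nodes = nodes

    def _placement(self, address: str):
        # Hypothetical rule: derive a starting node from the address
        # and mirror onto the next node in the list.
        start = int(address, 16) % len(self.nodes)
        return [self.nodes[start], self.nodes[(start + 1) % len(self.nodes)]]

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        for node in self._placement(address):
            node.objects[address] = data
        return address

    def get(self, address: str) -> bytes:
        # Serve the object from whichever copy is still reachable.
        for node in self._placement(address):
            if node.online and address in node.objects:
                return node.objects[address]
        raise LookupError("no online node holds this object")


cluster = MirroredCluster([StorageNode(f"node-{i}") for i in range(1, 5)])
addr = cluster.put(b"archived record")

primary, mirror = cluster._placement(addr)
primary.online = False                  # simulate the loss of one node
print(cluster.get(addr) == b"archived record")
```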

Check Your Knowledge


What are the key features of a CAS implementation?
What are the benefits of a CAS storage strategy?
What are two business applications that would benefit from CAS technology?
What are the logical elements of a CAS system?
How does data get stored in a CAS environment?
