
Using AWS for Backup and Restore

Backup in the cloud, Backup to the cloud, and Recovery options

Neel Mitra, Solutions Architect


Sep 7, 2017
Backup and recovery before the cloud

Diagram: application servers, media server, local disk, tape storage, data bunker.
Backup challenges in today’s age

• IDC estimates the volume of digital data will grow 40% to 50% per year. By 2020, IDC predicts the number will have reached 40,000 EB, or 40 zettabytes (ZB).
• The world’s information is doubling every two years.
Primary Storage

Amazon EFS

• Primary storage provides file, block, and object storage targets. Targets can either be extensions into the on-premises environment or a pure cloud implementation.
• Primary storage provides first level storage of data to customer workloads
• Storage for a variety of customer workloads
• File distribution services
• Gateway for IP storage protocols
• Replication of storage via native replication mechanisms
Backup and Recovery

Backup and recovery use cases protect data from logical errors such as system failure, application error, or accidental deletion. Backups can run from on-premises to the cloud, either directly to a cloud target or via a gateway appliance, or entirely within the cloud.
Backup is not archive
• Backup represents a point in time copy of the data.
• Archived data is the only authoritative copy of the data.
Archive

• The Archive use case allows the migration of important, but infrequently used data to storage
devices of the appropriate cost and resiliency. Frees existing “primary” storage for new or
frequently accessed data, achieving both a potential cost and performance advantage for the
customer.
• Archives move data between different classifications of storage
• Archive is not backup/recovery
– Backup represents a point in time copy of the data. There may be many copies of the data depending on the number of backups
that have been completed.
Backup vs. Archive
                                                     Backup        Archive
Number of copies for one piece of data               Many          1
Growth of the repository over time                   Exponential   Linear
Contains “the” copy of data?                         No            Yes
Point in time copy of data?                          Yes           Yes
Select individual pieces of data based on policy     Not really    Yes
Backups held for “long” periods of time
Remember:
• A Backup makes a copy of the data and keeps as many copies as needed.
• An Archive moves data between different classifications of storage but does not make any copies.
• Long term backups make life hard for business…
• Costly
• Hard to track
Backup to the Cloud & Recovery
What should I use and when?

Amazon S3: Durable object storage for all types of data
Amazon Glacier: Archival storage for infrequently accessed data
AWS Storage Gateway: Hybrid storage service
AWS Snowball: Petabyte-scale data transport solution

Economics
• Pay as you go
• No upfront investment
• No commitment
• No risky capacity planning
Easy to use
• Self service administration
• SDKs for simple integration
Reduce risk
• Durable and secure
• Avoid risks of physical media handling
Agility, scale
• Reduce time to market
• Focus on your business, not your infrastructure
Backup and recovery to the cloud
Diagram (cloud connector): application servers and a media server with cloud connector and local disk, writing over the Internet to Amazon S3, Amazon S3-IA, and Amazon Glacier.
Diagram (cloud gateway): application servers and a media server with local disk, writing through a cloud gateway over AWS Direct Connect to AWS.
Cloud Connector

1. Direct Amazon S3 / Glacier API/SDK
2. Amazon S3 lifecycle integration
3. Third-party tools and gateways

These are only a few examples of APN Technology partners with S3 connectors
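
Option 1 above (writing backups directly through the S3 API/SDK) could look roughly like the following minimal sketch. It assumes the boto3 SDK is installed and credentials are configured; the key layout, host name, and file name are hypothetical choices for illustration, not something the slides prescribe.

```python
from datetime import date


def backup_key(prefix: str, host: str, filename: str, day: date) -> str:
    """Build a dated object key, e.g. backups/db01/2017/09/07/dump.sql.gz."""
    return f"{prefix.strip('/')}/{host}/{day:%Y/%m/%d}/{filename}"


def upload_backup(path: str, bucket: str, key: str) -> None:
    """Upload one backup file to S3, stored in the Standard-IA class."""
    import boto3  # imported lazily so backup_key() works without the SDK

    s3 = boto3.client("s3")
    # StorageClass is passed through ExtraArgs; STANDARD_IA is an assumption
    # that suits backups which are rarely read back.
    s3.upload_file(path, bucket, key, ExtraArgs={"StorageClass": "STANDARD_IA"})


# Hypothetical host and file name, used only to show the key layout.
key = backup_key("backups", "db01", "dump.sql.gz", date(2017, 9, 7))
print(key)
```

A dated key layout like this keeps each day's backups under a common prefix, which makes prefix-based lifecycle rules straightforward to apply later.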
Cloud Gateways for Backup
Diagram: an application server on customer premises reaches AWS through gateway appliances over the Internet or AWS Direct Connect. Labels shown: AWS Storage Gateway, AWS Storage Gateway back-end, Altavault appliance in EC2, Amazon S3, Amazon Glacier, Amazon EBS snapshots.
AWS Storage Gateway
Hybrid Storage Service
Storage Gateway for Backup & Recovery

• Deploy appliance ‘locally’ as a VM in ESX, Hyper-V or EC2
• Connect to local applications, including backup servers
• Backup & archive in S3, Glacier
• Backup volumes as EBS Snapshots
• Restore on-premises or in the cloud
• Works with major backup vendors
Diagram: Storage Gateway in front of Amazon S3, Amazon Glacier, and Amazon EBS snapshots; virtual tapes stored in Amazon S3, archived tapes stored in Amazon Glacier.
Backing up to AWS via Storage Gateway
3 options to write on-premises backups to AWS
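Diagram: a backup server in the customer environment (customer premises or EC2) writes to one of three gateway types:
• File Gateway: files stored in the customer bucket as S3 Standard, S3-IA, or Glacier
• Volume Gateway: iSCSI volumes stored in S3, with EBS snapshots
• Tape Gateway: VTL with tapes stored in S3 and Glacier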
File Gateway

Diagram: on customer premises, an application server connects to the File Gateway over NFS v3 / v4.1; the gateway connects over HTTPS to S3 Standard, S3 Standard - Infrequent Access, and Glacier.
Data (including backups) stored and retrieved from your S3 buckets
1-1 mapping from files-to-objects
File metadata stored in object metadata
Bucket access managed by IAM role you own and manage
Use S3 Lifecycle Policies, versioning, or CRR to manage data
Volume Gateway
On-premises volume storage backed by Amazon S3 with EBS snapshots
Diagram: on customer premises, an application server connects to the Volume Gateway over iSCSI; the gateway connects over HTTPS to a Storage Gateway bucket in Amazon S3, with Amazon EBS snapshots.

Block storage in S3 accessed via the volume gateway


Compression of data in-transit and at-rest
Backup on-premises volumes to EBS snapshots
Create on-premises volumes from EBS snapshots
Up to 1PB of total volume storage per gateway
Can be used by backup apps, e.g. Veeam, to write to AWS and recover in EC2
Volume Gateway
GATEWAY-CACHED

Diagram: in the customer data center, users and application servers (iSCSI initiators) connect to the AWS Storage Gateway VM (iSCSI target), which holds cache storage and an upload buffer. The VM connects over HTTPS to the AWS Storage Gateway service, with volume storage backed by Amazon S3 and Amazon EBS snapshots.
Volume Gateway
GATEWAY-STORED

Diagram: in the customer data center, users and application servers (iSCSI initiators) connect to the AWS Storage Gateway VM (iSCSI target), which holds volume storage and an upload buffer. The VM connects to the AWS Storage Gateway service and Amazon EBS snapshots.
Tape Gateway
Virtual tape storage in Amazon S3 and Glacier with VTL management
Diagram: on customer premises, a backup server connects over iSCSI to the Tape Gateway, which presents a tape drive and media changer. The gateway connects over HTTPS to virtual tapes stored in Amazon S3 and archived tapes stored in Amazon Glacier.

Virtual tape storage in S3 and Glacier accessed via tape gateway


Compression of data in-transit and at-rest
Up to 1 PB total tape storage per gateway, unlimited archive capacity
Supports leading backup applications
3-5 hour retrieval of virtual tapes from Glacier
Backup, archive, and disaster recovery
Cost effective storage in AWS with local or cloud restore

“Tapes are a headache. AWS Storage Gateway provided the most cost-effective and simple alternative.
We switched from physical to virtual tape backup simply by dropping the
gateway’s virtual appliance into our existing Veeam workflow.
Setting it all up took three hours, at most.
We even got disaster recovery by using a bi-coastal data center.”
-Jesse Martinich, Network Services Manager, SOU
AWS Snowball
Petabyte-scale data transport solution
What is AWS Snowball?
Petabyte-scale data transport
• Ruggedized case, “8.5G impact”
• 80 TB, 10 GE network
• Rain- and dust-resistant
• Tamper-resistant case and electronics
• All data encrypted end-to-end
• E-ink shipping label
How it works
How fast is Snowball?
• Less than 1 day to transfer 250 TB through 5 x 10G connections with 5 Snowballs; less than 1 week, including shipping
• Number of days to transfer 250 TB through the Internet at typical utilizations:

              Internet connection speed
Utilization   1 Gbps   500 Mbps   300 Mbps   150 Mbps
25%           95       190        316        632
50%           47       95         158        316
75%           32       63         105        211
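
The figures in the table follow from simple arithmetic: data size in bits divided by the effective link rate. The sketch below is an illustration, not the slide author's calculation, and it assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second). Under that assumption the results land a few percent below the slide's numbers, which presumably used a slightly different unit convention.

```python
def transfer_days(data_tb: float, link_mbps: float, utilization: float) -> float:
    """Days needed to move data_tb terabytes over a link at a given utilization."""
    bits = data_tb * 1e12 * 8                      # decimal terabytes to bits
    effective_bps = link_mbps * 1e6 * utilization  # usable share of the link
    return bits / effective_bps / 86_400           # seconds to days


# Recompute the table rows for 250 TB.
for utilization in (0.25, 0.50, 0.75):
    row = [round(transfer_days(250, mbps, utilization)) for mbps in (1000, 500, 300, 150)]
    print(f"{utilization:.0%}", row)
```

Halving the link speed or halving the utilization doubles the transfer time, which is why the same values recur diagonally in the table.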
Customer Use Case: Backup and Archive with Snowball

Diagram: objects rawdata1, rawdata2, and rawdata3 in “My S3 bucket” are archived to Amazon Glacier after 30 days and deleted after 7 years.


PetroBank Archive Service Migrated from Tape to Cloud
Cost effective storage in AWS with local data access

Self service loading of data


Reduced time-to-data by days or weeks
Cut storage archive costs by 90%

Diagram: PetroBank application servers use a File Gateway (continuous file access and upload, with local cache) over AWS Direct Connect; AWS Snowball handles the initial bulk transfer; lifecycle policies migrate data across storage tiers (Amazon S3 Standard, Amazon S3 Infrequent Access, Amazon Glacier); AWS Lambda runs automated functions, including inventory.
Backup in the cloud
What should I use and when?

Amazon EBS: Block storage for use with Amazon EC2
Amazon S3: Durable object storage for all types of data
Amazon Glacier: Archival storage for infrequently accessed data
Amazon EFS: File storage for use with Amazon EC2

Reduce risk
• Durable and secure
• Avoid risks of physical media handling
Economics
• Pay as you go
• No upfront investment
• No commitment
• No risky capacity planning
Easy to use
• Self service administration
• SDKs for simple integration
Agility, scale
• Reduce time to market
• Focus on your business, not your infrastructure
Amazon EBS
Block storage for use with Amazon EC2
Amazon EBS Lifecycle

Diagram: in the AWS Cloud, EBS volumes attached to EC2 instances in an Availability Zone are captured (“Create Snapshot”) as EBS snapshots stored in Amazon S3; new volumes are cloned from a snapshot.
How Do Snapshots Work?
Diagram: over time, Snapshot 1, Snapshot 2, and Snapshot 3 copy blocks of the EBS volume (Blocks 1 to 4) into chunks stored in S3.
Benefits of using EBS snapshots

More durable than an EBS volume


• Stored in Amazon S3
Incremental (space-efficient)
• First snapshot is a clone
• Pay only for what you use
Availability Zone-independent
• Clone into any AZ
Can be copied efficiently across regions
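
The “incremental” property can be illustrated with a toy model: the first snapshot copies every block, and each later snapshot stores only the blocks that changed since the previous one. This is a conceptual sketch under that assumption, not a description of how EBS is implemented internally; the class and method names are hypothetical.

```python
from typing import Dict, List


class SnapshotStore:
    """Toy incremental snapshots: each entry holds only changed blocks."""

    def __init__(self) -> None:
        self.snapshots: List[Dict[int, bytes]] = []
        self._last_state: Dict[int, bytes] = {}

    def take_snapshot(self, volume: Dict[int, bytes]) -> int:
        """Store the blocks that differ from the previous snapshot; return its id."""
        changed = {i: data for i, data in volume.items()
                   if self._last_state.get(i) != data}
        self.snapshots.append(changed)
        self._last_state = dict(volume)
        return len(self.snapshots) - 1

    def restore(self, snapshot_id: int) -> Dict[int, bytes]:
        """Rebuild the full volume by replaying deltas up to snapshot_id."""
        volume: Dict[int, bytes] = {}
        for delta in self.snapshots[: snapshot_id + 1]:
            volume.update(delta)
        return volume


store = SnapshotStore()
volume = {1: b"aaaa", 2: b"bbbb", 3: b"cccc", 4: b"dddd"}
first = store.take_snapshot(volume)    # full copy: 4 blocks stored
volume[2] = b"BBBB"                    # only block 2 changes
second = store.take_snapshot(volume)   # incremental: 1 block stored
print(len(store.snapshots[first]), len(store.snapshots[second]))
```

Any snapshot can still be restored in full, because the restore replays every earlier delta; the model assumes a fixed set of block numbers and does not handle block removal.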
AWS Database Backups
RDS for MySQL, PostgreSQL, MariaDB, Oracle, SQL Server
• Scheduled daily backup of entire instance
• Archive database change logs
• 35 day retention for backups
• Multiple copies in each AZ where you have instances for a deployment

Aurora
• Automatic, continuous, incremental backups
• Point-in-time restore
• No impact on database performance
• 35 day retention

DIY on EC2
• Engine specific (RMAN, BAK)
• Third party (GoldenGate, Commvault)
Amazon S3
Durable object storage for all types of data
Amazon S3 Lifecycle

• Use Amazon S3 for reliable, durable primary storage
• Use Amazon S3 Infrequent Access Storage (S3-IA) for secondary backups at a lower cost
• Use Amazon Glacier for lowest-cost, durable cold storage of archival data
S3 lifecycle policies
Key prefix “logs/”
Transition objects to Amazon Glacier 30 days after
creation
Delete 365 days after creation date

<LifecycleConfiguration>
  <Rule>
    <ID>archive-in-30-days</ID>
    <Prefix>logs/</Prefix>
    <Status>Enabled</Status>
    <Transition>
      <Days>30</Days>
      <StorageClass>GLACIER</StorageClass>
    </Transition>
    <Expiration>
      <Days>365</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>
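
The same rule can be expressed through an SDK. The following is a minimal sketch, assuming boto3 is installed and credentials are configured; the helper function names are hypothetical, and the bucket name is supplied by the caller. It uses the `Filter` form of the prefix, which is the SDK's current way of writing the `<Prefix>` element shown above.

```python
import json


def logs_lifecycle_configuration() -> dict:
    """Lifecycle rule for logs/: Glacier after 30 days, delete after 365."""
    return {
        "Rules": [
            {
                "ID": "archive-in-30-days",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
                "Expiration": {"Days": 365},
            }
        ]
    }


def apply_lifecycle(bucket: str) -> None:
    """Attach the rule to a bucket (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the rule can be inspected without the SDK

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=logs_lifecycle_configuration(),
    )


config = logs_lifecycle_configuration()
print(json.dumps(config, indent=2))
```

Note that applying a lifecycle configuration replaces whatever configuration the bucket already has, so existing rules need to be included in the same call.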
Cross-region replication: Details

Replication status
• HEAD operation on a source object to determine replication status
• Replicated objects will not be re-replicated
• Use Amazon S3 COPY to replicate existing objects
Access control
• Object ACL updates are replicated
• Objects with Amazon managed encryption key replicated
• KMS encryption not replicated
Cost
• Usual charges for storage, requests, and inter-region data transfer for the replicated copy of data
• Replicate into Standard-IA or Amazon Glacier
Delete operation
• DELETE without object version ID: marker replicated
• DELETE specific object version ID: marker NOT replicated
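
Checking replication status with a HEAD request might be sketched as follows. The function takes any client object exposing boto3's `head_object(Bucket=..., Key=...)` call, so it can be tried with a stand-in. The `FakeS3` class and the `NOT_REPLICATED` label are assumptions made for this illustration; the real response simply omits `ReplicationStatus` when no replication rule applies to the object.

```python
def replication_status(s3_client, bucket: str, key: str) -> str:
    """Return the object's ReplicationStatus, or a placeholder if absent."""
    response = s3_client.head_object(Bucket=bucket, Key=key)
    # S3 reports values such as PENDING, FAILED, or REPLICA in this field.
    return response.get("ReplicationStatus", "NOT_REPLICATED")


class FakeS3:
    """Stand-in for boto3.client("s3"), used only to demonstrate the call."""

    def head_object(self, Bucket: str, Key: str) -> dict:
        if Key.startswith("logs/"):
            return {"ContentLength": 10, "ReplicationStatus": "PENDING"}
        return {"ContentLength": 10}


client = FakeS3()
print(replication_status(client, "source-bucket", "logs/app.log"))
print(replication_status(client, "source-bucket", "tmp/scratch.bin"))
```

With a real client, `boto3.client("s3")` would be passed in place of `FakeS3()`.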
Versioning with cross-region replication

Diagram: key A/vid1 holds versions Vid1-v1 through Vid1-v4; key B/vid1 holds versions Vid1-v1 and Vid1-v2.

Why Amazon Web Services

Druva runs inSync Cloud on AWS using Amazon Elastic Compute Cloud (Amazon EC2) for compute, Amazon Elastic Block Store (Amazon EBS) for storage volume, Amazon Relational Database Service (Amazon RDS) for configuration management, and Amazon Simple Storage Service (Amazon S3) for storage.
Amazon Glacier
Archival storage for infrequently accessed data
Amazon Glacier Lifecycle

1 Create vault
2 Configure access policies
User policy
Effect: Allow
Resource: arn:aws:glacier:<accountId>:vaults
Action: glacier:UploadArchive
3 Upload archives
UploadArchive(data) -> Archive ID
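
The three steps could be sketched with an SDK as below, assuming boto3 is installed and credentials are configured. The region, account ID, and vault name are hypothetical placeholders, and the full ARN layout used here (region, account, `vaults/` plus vault name) is the general Glacier vault ARN form rather than something taken from the slide.

```python
import json


def upload_only_policy(region: str, account_id: str, vault_name: str) -> dict:
    """User policy that allows only glacier:UploadArchive on one vault."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "glacier:UploadArchive",
                "Resource": f"arn:aws:glacier:{region}:{account_id}:vaults/{vault_name}",
            }
        ],
    }


def create_vault_and_upload(vault_name: str, data: bytes, description: str) -> str:
    """Create the vault, upload one archive, and return its archive ID."""
    import boto3  # imported lazily; requires AWS credentials at call time

    glacier = boto3.client("glacier")
    # accountId "-" means "the account that owns the credentials in use".
    glacier.create_vault(accountId="-", vaultName=vault_name)
    response = glacier.upload_archive(
        accountId="-",
        vaultName=vault_name,
        archiveDescription=description,
        body=data,
    )
    return response["archiveId"]


# Hypothetical identifiers, for illustration only.
policy = upload_only_policy("us-west-2", "111122223333", "backup-vault")
print(json.dumps(policy, indent=2))
```

The archive ID returned by the upload is the only handle for later retrieval, so it should be recorded in an inventory outside Glacier.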
Using vault lock policy with vault access policy
Vault lock policy (compliance/governance)
• Lockable/immutable policy
• Cannot be updated/deleted after lockdown
Use vault lock policy to:
• Deploy regulatory controls such as records retention
• Enforce data access through multi-factor authentication only
Vault access policy (flexibility)
• Can be updated/deleted
Use vault access policy to:
• Designate third-party access
• Grant temporary read permissions when necessary
Vault lock best practices
• Map one vault to a single retention range
– Group regulatory data by retention: 1-year vault, 6-year vault, etc.
• Create new vault and lock it before storing production data
– Enforce the full ArchiveAgeInDays on all new archives
– Leave no “gap” on existing archives
• Thoroughly test a vault lock policy before locking it down (Abort/Initiate)
• Implement only the most restrictive controls with vault lock
– Leave the flexible controls to vault access policy
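
A retention control of the kind described above might be sketched as a vault lock policy that denies archive deletion until an archive reaches a given age. This is an illustration under assumptions: the region, account ID, and vault name are hypothetical, and boto3 must be installed with credentials configured before the lock is actually initiated.

```python
import json


def retention_lock_policy(region: str, account_id: str, vault_name: str,
                          retention_days: int) -> dict:
    """Deny glacier:DeleteArchive on archives younger than retention_days."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-delete-before-retention",
                "Principal": "*",
                "Effect": "Deny",
                "Action": "glacier:DeleteArchive",
                "Resource": f"arn:aws:glacier:{region}:{account_id}:vaults/{vault_name}",
                "Condition": {
                    "NumericLessThan": {"glacier:ArchiveAgeInDays": str(retention_days)}
                },
            }
        ],
    }


def start_vault_lock(vault_name: str, policy: dict) -> str:
    """Initiate the lock (test phase) and return the lock ID."""
    import boto3  # imported lazily; requires AWS credentials at call time

    glacier = boto3.client("glacier")
    response = glacier.initiate_vault_lock(
        accountId="-",
        vaultName=vault_name,
        policy={"Policy": json.dumps(policy)},
    )
    # Test the policy, then call complete_vault_lock(...) with this lock ID,
    # or abort_vault_lock(...) to discard it and start over.
    return response["lockId"]


# Hypothetical identifiers: a vault dedicated to 1-year retention.
lock_policy = retention_lock_policy("us-west-2", "111122223333", "records-1-year", 365)
print(json.dumps(lock_policy, indent=2))
```

Keeping one retention period per vault, as the best practices suggest, means each vault needs only one such statement.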
Amazon Glacier received a third-party assessment
from Cohasset Associates on how Amazon Glacier
with Vault Lock can be used to meet the
requirements of SEC 17a-4(f) and CFTC 1.31(b)-(c).
SoundCloud: leveraging Glacier for audio transcoding

• World’s leading social sound platform
• Audio files must be transcoded and stored in multiple formats
Diagram labels: S3, Glacier
Amazon EFS
File storage for use with Amazon EC2
Amazon EFS Backup

• Automated EFS backups based on a schedule that you define (for example, hourly, daily, weekly, or monthly)
• Automated rotation of the backups, where the oldest backup is replaced with the newest backup based on the number of backups that you want to retain
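
The rotation rule can be sketched in a few lines: keep the newest N backups and mark the rest for removal. This is a generic illustration of the policy, not the code of the EFS backup solution itself; the function name and the sample timestamps are hypothetical.

```python
from datetime import datetime
from typing import List, Tuple


def rotate(backups: List[datetime], keep: int) -> Tuple[List[datetime], List[datetime]]:
    """Split backup timestamps into (retained, expired), newest first."""
    ordered = sorted(backups, reverse=True)
    return ordered[:keep], ordered[keep:]


# Four daily backups with a retention count of three.
daily = [datetime(2017, 9, day) for day in (4, 5, 6, 7)]
retained, expired = rotate(daily, keep=3)
print([d.day for d in retained], [d.day for d in expired])
```

Running the rotation after each new backup keeps the retained set at a constant size, so storage use stays bounded by the retention count.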
Amazon EFS Restore

• Restore a backup copy of an Amazon EFS file system
• Restores can be done in parallel to meet the recovery time objective
• Restore individual files from EFS backups
Why EFS for Database Backup

• Can be used with native backup commands
  - e.g. dump, RMAN, “hot-backup” mode
• Copy is stored to another storage target for availability
  - production copy runs on EBS
  - backup copy is on EFS
• Can be managed by the database administrators
  - to meet their specific recovery points
  - easy to restore online
• High performance network shares provide for fast recovery vs. tape
• Saves licensing costs and workload from traditional backup software
The Arcesium platform leverages Amazon EFS for shared data storage
between applications and for storing and analyzing operational data.
“Arcesium is a financial services SaaS platform that requires resilient,
secure, and scalable file storage. Amazon EFS offers us a powerful
way to operate and scale file storage for our Amazon EC2 instances,
which has allowed us to build out our platform quickly without
compromising quality.”

-- Gaurav Suri, CEO

“We are growing by leaps and bounds, and our core offering is all about better
support delivery. During the course of developing our next-generation internal
support system, we never wanted to worry about scale again, yet we had
existing architectural commitments that meant a distributed file solution was
required. Atlassian chose Amazon EFS because it was the only option
available that scaled both capacity and performance – without the up-front
payments or the management overhead of traditional models. This allows our
support teams to focus on what matters most - helping our customers.”

- Sri Viswanath, CTO


Customer References
Public Sector – King County

• Most populous county in Washington State
• Replace tape solution for backup from 17 agencies
• Meet compliance requirement
• Saved $1MM in first year, no more tape refresh or management churn

https://aws.amazon.com/solutions/case-studies/king-county/
China Expansion – iQIYI

• 2nd largest online video service, with 100MM+ monthly viewers
• Self-managed Swift cluster out of capacity
• 5 PB media assets/stats, secondary backup on Glacier

https://aws.amazon.com/cn/solutions/case-studies/iqiyi/
AWS External Resources
• AWS Storage Solution Pages
– Backup, Archive and Disaster Recovery

• AWS Storage Competency and Storage Test Drives


– AWS Storage Competency
– APN Partner-provided labs

• AWS Marketplace Storage for in-cloud use cases


– AWS Online Software Store

https://aws.amazon.com/training
• Select Partner Microsites – additional in plan
– www.netapp.com/aws
– www.commvault.com/aws
– www.averesystems.com/aws
Thank you!
