SS1123 - D2T - Apache Cassandra Overview PDF

Apache Cassandra
An Overview
Copyright © 2013 Tech Mahindra. All rights reserved.

What is Apache Cassandra?
“Apache Cassandra is an open source, distributed, decentralized, elastically
scalable, highly available, fault-tolerant, tuneably consistent, column-oriented
database that bases its distribution design on Amazon’s Dynamo and its data
model on Google’s Bigtable.”

Created at Facebook, it is now used at some of the most popular sites on the Web.

Why Cassandra?
• 2006: 161 EB (322 million 500 GB drives)
• 2010: 988 EB (1.98 billion 500 GB drives)
• 6-fold growth in 4 years
Source: http://www.emc.com/collateral/analyst-reports/expanding-digital-idc-white-paper.pdf

Scalability and Big Data?
• YouTube serves 200 million videos every day
• Chevron accumulates 2 TB of data every day
• Indian telecom collects 155 TB of call data per month, and growing
• Google provisions 900,000 Android phones every day
• By 2015 there will be 2.5 billion email accounts
• By 2015 there will be 1 billion subscribers in the telecom sector in India
• Will an RDBMS ever scale to these ever-growing volumes?

RDBMS
• RDBMS: structured and organized data
• Structured Query Language (SQL)
• Data and its relationships are stored in separate tables
• Data Manipulation Language (DML), Data Definition Language (DDL)
• Tight consistency

SQL
• Specialized data structures (think B-trees)
• Shines with complicated queries
• Focus on fast query and analysis
• Not necessarily on large datasets

NOSQL
• Stands for Not Only SQL
• No declarative query language (recently evolving)
• No predefined schema
• Key-value pair storage, column store, document store, graph databases
• Eventual consistency rather than ACID properties
• Unstructured and unpredictable data
• Driven by the CAP theorem
• Prioritizes high performance, high availability and scalability

NOSQL Advantages & Disadvantages
• Advantages
– High scalability
– Distributed computing
– Lower cost
– Schema flexibility, semi-structured data
– No complicated relationships
– Object-oriented programming that is easy to use and flexible
• Disadvantages
– No standardization
– Limited query capabilities (so far)
– Eventual consistency is not intuitive to program for

CAP Theorem
• Consistency:
– If we write data to one node and read it from another node in a
distributed system, the read returns what was written on the first node.
• Availability:
– Each node of the distributed system should respond to queries unless it
dies.
• Partition tolerance:
– The distributed system stays available and operates seamlessly despite a
partition (nodes added or removed across data centers) or message loss over
the network.

Selecting the DB type
• CA
– To primarily support Consistency and Availability means that you’re likely
using two-phase commit for distributed transactions. It means that the
system will block when a network partition occurs, so it may be that your
system is limited to a single data center cluster in an attempt to mitigate
this. If your application needs only this level of scale, this is easy to
manage and allows you to rely on familiar, simple structures.
• CP
– To primarily support Consistency and Partition Tolerance, you may try to
advance your architecture by setting up data shards in order to scale. Your
data will be consistent, but you still run the risk of some data becoming
unavailable if nodes fail.
• AP
– To primarily support Availability and Partition Tolerance, your system may
return inaccurate data, but the system will always be available, even in the
face of network partitioning. DNS is perhaps the most popular example of a
system that is massively scalable, highly available, and partition-tolerant.

BASE, an alternative to ACID
• ACID
– Atomicity
– Consistency
– Isolation
– Durability
– All of the above, but not scalable
• BASE
– Basically Available
– Soft state
– Eventual consistency
– All of the above, but not strongly consistent

Enter Cassandra
• Amazon Dynamo
– Consistent hashing
– Partitioning
– Replication
– One-hop routing
• Google Bigtable
– Column families
– Memtables
– SSTables

Distributed and Scalable
• Horizontal scaling on commodity hardware, not specialized boxes
• All nodes are identical
• No master or single point of failure (SPOF)
• Adding nodes is simple
• Automatic cluster maintenance

Replication
• Replication factor
– How many nodes the data is replicated on
• Consistency level
– Zero, One, Quorum, All
• Sync or async for writes
• Reliability of reads
– Read repair
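
Read repair, listed above, can be pictured with a short sketch. This is an illustration under stated assumptions, not Cassandra's code: each replica is a plain dictionary, every value carries a timestamp, the newest timestamp wins, and the name `read_with_repair` is hypothetical.

```python
def read_with_repair(replicas, key):
    """Read `key` from every replica, return the newest value, fix stale copies.

    `replicas` is a list of dicts mapping key -> (value, timestamp).
    """
    answers = [replica[key] for replica in replicas if key in replica]
    if not answers:
        return None
    # Assumption: the copy with the highest timestamp is the current one.
    newest = max(answers, key=lambda pair: pair[1])
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # push the newest copy to the stale replica
    return newest[0]


# Three replicas (RF=3): one stale, one current, one that missed the write.
replicas = [{"k": ("old", 1)}, {"k": ("new", 2)}, {}]
value = read_with_repair(replicas, "k")
```

After the call, all three dictionaries hold the same `("new", 2)` pair, which is the effect read repair aims for.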

Ring Topology
RF=3
• Conceptual ring
• One token per node
• Multiple ranges per node
[Figure: ring of nodes a, d, g, j]

Ring Topology
RF=2
• Conceptual ring
• One token per node
• Multiple ranges per node
[Figure: ring of nodes a, d, g, j]

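
The ring slides can be made concrete with a small sketch. It is illustrative only and not how Cassandra is implemented: the node names a, d, g, j come from the figure, while the token values, the 0..99 token space, the MD5-based hash and the function names are assumptions chosen for readability. A key's replicas are the first RF nodes found walking clockwise from the key's token.

```python
import bisect
import hashlib

# Hypothetical ring: one token per node, sorted by token.
RING = [(0, "a"), (25, "d"), (50, "g"), (75, "j")]
TOKEN_SPACE = 100  # assumed tiny token space, for illustration only


def token_for(key):
    """Map a row key to a position on the ring (assumed MD5-based hash)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % TOKEN_SPACE


def replicas_for_token(token, rf):
    """Return the rf nodes that store a token, walking clockwise."""
    tokens = [t for t, _ in RING]
    # The first node whose token is >= the key's token owns the range;
    # past the last token we wrap around to the first node.
    start = bisect.bisect_left(tokens, token) % len(RING)
    count = min(rf, len(RING))
    return [RING[(start + i) % len(RING)][1] for i in range(count)]


def replicas_for_key(key, rf):
    return replicas_for_token(token_for(key), rf)
```

With RF=3, token 30 lands on g, j and a; with RF=2, token 80 wraps around to a and d. Each node therefore stores several ranges, which matches the "multiple ranges per node" bullet.
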
New Node
RF=3
• Token assignment
• Range adjustment
• Bootstrap
• Arrival only affects immediate neighbors
[Figure: ring of nodes a, d, g, j with new node m]

Ring Partition
RF=3
• Node dies
• Available?
• Hinted handoff
• Plan for this
[Figure: ring of nodes a, d, g, j]

Schema-free Sparse-table
• Flexible column naming
• You define the sort order
• A row is not required to have a specific column just because another row does

Data Model Concepts
• The Apache Cassandra data model has 4 main concepts
– Cluster
– Keyspace
– Column family
  • A column family contains multiple columns referenced by a row key
– Super column family

Cluster
• Cassandra is meant to run on a cluster
• Although Cassandra can run stand-alone, that defeats the purpose it is
built for
• The cluster is arranged as a ring of nodes
• Clients send read/write requests to any node in the ring
• That node takes on the role of coordinator node and forwards the request to
the node responsible for servicing it
• A partitioner decides which nodes store which rows
• The cluster is a container for keyspaces

Keyspace
• A keyspace is a namespace that groups multiple column families, typically one
per application. The keyspace is the outermost container for data in Cassandra
• The basic attributes that you can set per keyspace are
– Replication factor
  • The number of nodes that will hold a copy of each row
– Replica placement strategy
  • How the replicas will be placed in the ring
  • There are different strategies
    – SimpleStrategy (single data center)
    – NetworkTopologyStrategy (across data centers)

Column Family (Table)
• A column family is roughly analogous to a table in the relational model
• It is a container for a collection of rows
• Each row can have a different set of columns
• A column family can be one of two types
– Static column family
  • Static set of columns
– Dynamic column family
  • Can use application-supplied column names to store data

Column
• The column is the smallest increment of data in Cassandra.
• It is a tuple containing a name, a value and a timestamp.
• A column must have a name, and the name can be a static label (such as
“name” or “email”) or it can be set dynamically when the column is created by
your application.

Super Column
• A Cassandra column family can contain either regular columns or super
columns, which add another level of nesting to the regular column family
structure.
• Super columns consist of a (super) column name and an ordered map
of sub-columns.
• A super column can specify a comparator on both the super column name and
the sub-column names.

Bird’s Eye View

Data Model
• Keyspace
  • ColumnFamily
    • Row (indexed)
      • Key
      • Columns
        – Name (sorted)
        – Value
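
The nesting above can be sketched as maps inside maps. This is only a mental model under stated assumptions, not Cassandra's storage format: the column family name `Users`, the sample values, the microsecond timestamp convention and the helper names `put` and `get_row` are all hypothetical. The row key `jbellis` is borrowed from the SQL example later in the deck.

```python
import time

# keyspace -> column family -> row key -> {column name: (value, timestamp)}
keyspace = {"Users": {}}  # "Users" is a made-up column family name


def put(column_family, row_key, name, value, timestamp=None):
    """Store one column as a (value, timestamp) pair under its row key."""
    if timestamp is None:
        timestamp = time.time_ns() // 1000  # microseconds, an assumed convention
    row = column_family.setdefault(row_key, {})
    row[name] = (value, timestamp)


def get_row(column_family, row_key):
    """Return the row's columns as (name, value, timestamp), sorted by name."""
    row = column_family.get(row_key, {})
    return [(name, *row[name]) for name in sorted(row)]


users = keyspace["Users"]
put(users, "jbellis", "name", "Jonathan", timestamp=2)
put(users, "jbellis", "email", "jb@example.com", timestamp=1)
put(users, "other", "name", "Someone", timestamp=3)  # sparse: no email column
```

Rows in the same column family hold different column sets, and columns come back sorted by name regardless of insertion order.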

Data Model
[Figure: a single column]

Data Model
[Figure: a single row]

Data Model

Why Key-value Store?
• (Business) Key -> Value
• (twitter.com) Tweet ID -> information about the tweet
• (kayak.com) Flight number -> information about the flight, e.g., availability
• (yourbank.com) Account number -> information about the account
• (amazon.com) Item number -> information about the item
• Search is usually built on top of a key-value store

Isn’t that just a database?
• Yes
• Relational databases (RDBMSs) have been around for ages
• Data stored in tables
• Schema-based, i.e., structured tables
• Queried using SQL
SQL queries: SELECT user_id FROM users WHERE username = 'jbellis'

Cassandra Data Model
• Column families:
– Like SQL tables
– but may be unstructured (client-specified)
– Can have index tables
• Hence “column-oriented databases”/“NoSQL”
– No schemas
– Some columns missing from some entries
– “Not Only SQL”
– Supports get(key) and put(key, value) operations
– Often write-heavy workloads

Eventually Consistent
• CAP theorem
– Consistency
– Availability
– Partition tolerance
• Choose two
– Cassandra chooses A and P

Tunable Consistency
• Give up a little A and P to get more C
• Ratchet up the consistency level
• R + W > N → strong consistency
(R = replicas read, W = replicas written, N = replication factor)
• More to come
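
The R + W > N rule is simple arithmetic, so a worked check may help. The sketch below assumes R is the number of replicas a read waits for, W the number a write waits for, and N the replication factor; the function names are hypothetical. When R + W > N, every read set must overlap every write set in at least one replica.

```python
def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1


def is_strongly_consistent(r, w, n):
    """True when any r replicas read must include one of the w written."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


N = 3
checks = {
    "ONE/ONE": is_strongly_consistent(1, 1, N),                    # 1 + 1 = 2, not > 3
    "QUORUM/QUORUM": is_strongly_consistent(quorum(N), quorum(N), N),  # 2 + 2 = 4 > 3
    "ONE/ALL": is_strongly_consistent(1, N, N),                    # 1 + 3 = 4 > 3
}
```

Reading and writing at quorum is the usual middle ground: it tolerates one node being down with N = 3 while still satisfying the rule.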

Inserting: Overview
• Simple: put(key, col, value)
• Complex: put(key, [col:value, …, col:value])
• Batch: multiple keys

Inserting: Writes
• Commit log for durability
– Configurable fsync
– Sequential writes only
• Memtable: no disk access (no reads or seeks)
• SSTables are final (become read-only)
– Indexes
– Bloom filter
– Raw data
• Bottom line: FAST!!!
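
The write path above can be sketched as a toy model. It is a simplification under stated assumptions, not Cassandra's implementation: the class name `WritePathSketch`, the row-count flush threshold and the use of in-memory lists in place of files are all made up for illustration.

```python
class WritePathSketch:
    """Toy model: append to a commit log, update a memtable, flush to SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # stand-in for the sequential, append-only log
        self.memtable = {}     # in memory: row key -> {column: value}
        self.sstables = []     # flushed tables, never modified afterwards
        self.memtable_limit = memtable_limit  # assumed flush threshold (rows)

    def put(self, key, col, value):
        # 1. Durability first: record the mutation in the commit log.
        self.commit_log.append((key, col, value))
        # 2. Apply it in memory; no disk read or seek is involved.
        self.memtable.setdefault(key, {})[col] = value
        # 3. Flush once the memtable holds enough rows.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write rows in key order and freeze them; tuples model "read-only".
        frozen = tuple(
            (key, tuple(sorted(cols.items())))
            for key, cols in sorted(self.memtable.items())
        )
        self.sstables.append(frozen)
        self.memtable = {}
        # Simplification: flushed mutations no longer need replay.
        self.commit_log.clear()


store = WritePathSketch(memtable_limit=2)
store.put("a", "bar", "1")   # stays in the memtable
store.put("b", "cat", "2")   # second row triggers a flush
```

Every step is an append or an in-memory update, which is the reason the slide calls writes fast.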

Querying: Overview
• You need a key or keys:
– Single: key='a'
– Range: key='a' through 'f'
• And columns to retrieve:
– Slice: cols={bar through kite}
– By name: key='b' cols={bar, cat, llama}
• Nothing like SQL “WHERE col='faz'”
– But secondary indices are being worked on
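
The query shapes above can be illustrated against an in-memory stand-in. This sketch assumes each row keeps its columns sorted by name; the sample data and the function names `slice_columns`, `columns_by_name` and `key_range` are hypothetical, not a client API.

```python
import bisect

# Hypothetical rows: row key -> columns kept sorted by column name.
ROWS = {
    "a": [("ant", 1), ("bar", 2)],
    "b": [("bar", 3), ("cat", 4), ("kite", 5), ("llama", 6)],
    "f": [("faz", 7)],
    "z": [("zed", 8)],
}


def slice_columns(key, start, end):
    """Columns of one row whose names fall in [start, end], inclusive."""
    columns = ROWS.get(key, [])
    names = [name for name, _ in columns]
    lo = bisect.bisect_left(names, start)
    hi = bisect.bisect_right(names, end)
    return columns[lo:hi]


def columns_by_name(key, wanted):
    """Only the named columns of one row; missing names are skipped."""
    wanted_set = set(wanted)
    return [(n, v) for n, v in ROWS.get(key, []) if n in wanted_set]


def key_range(first, last):
    """Row keys in [first, last]; assumes keys are stored in sorted order."""
    return [k for k in sorted(ROWS) if first <= k <= last]
```

Because column names are sorted, a slice is a contiguous run found by binary search. The key range only makes sense when keys are stored in order, which is why the partitioner choice matters for range queries.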

Querying: Reads
• Practically lock-free
• SSTable proliferation
• New in 0.6:
– Row cache (avoids SSTable lookup, not write-through)
– Key cache (avoids index scan)
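
With many SSTables on disk, a read would be slow if it had to open each one. The Bloom filter stored with each SSTable lets a read skip tables that cannot contain the key. The sketch below is a minimal illustration, assuming a 64-bit filter, three SHA-256-derived hash positions and dictionaries standing in for SSTables; none of these choices come from Cassandra itself.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=64, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # an int used as a bit array

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def make_sstable(data):
    """Pair an immutable key -> value mapping with its Bloom filter."""
    bloom = BloomFilter()
    for key in data:
        bloom.add(key)
    return bloom, dict(data)


def read(key, sstables):
    """Check SSTables newest first, skipping those the filter rules out."""
    for bloom, data in sstables:
        if not bloom.might_contain(key):
            continue  # definitely absent: no lookup in this table
        if key in data:  # the filter may be wrong, so confirm
            return data[key]
    return None
```

A filter that answers "no" is always right, so the only cost of a false positive is one wasted lookup.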

Practical Considerations
• Partitioner: Random or Order-Preserving
– Range queries
• Provisioning
– Virtual or bare metal
– Cluster size
• Data model
– Think in terms of access
– Giving up transactions, ad-hoc queries, arbitrary indexes and joins
  • (you may already do this with an RDBMS!)

• Wide rows
• Data life-span
• Cluster planning
– Bootstrapping

Future Direction
• Vector clocks (server-side conflict resolution)
• Alter keyspaces/column families on a live cluster
• Compression
• Multi-tenant features
• Fewer memory restrictions

Wrapping Up
• Use Cassandra if you want/need
– High write throughput
– Near-linear scalability
– Automated replication/fault tolerance
• Provided you can tolerate missing RDBMS features

Thank You!
