MongoDB

E-commerce
and Transactions
My name is
Steve Francia
@spf13
• 15+ years building the internet
• Father, husband, skateboarder
• Chief Solutions Architect @ 10gen
• Author of upcoming O’Reilly publication “MongoDB and PHP”
Before 10gen I worked for
http://opensky.com
OpenSky was the first
e-commerce site built
on MongoDB
... also the first e-commerce site built on
Symfony2
Introduction to
MongoDB
Why MongoDB?
MongoDB Goals
• Open source
• Designed for today
• Today’s hardware / environments
• Today’s challenges
• Great developer experience
• Reliable
• Scalable
• Company behind MongoDB
• AGPL license, own copyrights, engineering
team
• support, consulting, commercial license
revenue
• Management
• Google/DoubleClick, Oracle, Apple, NetApp
• Funding: Sequoia, Union Square, Flybridge
• Offices in NYC, Redwood Shores & London
• 60+ employees
A bit of history
1974: The relational database is created
[Timeline figure: 1979, 1982-1996, 1995]
Computers in 1995
• Pentium 100 MHz
• 10BASE-T
• 16 MB RAM
• 200 MB HD
Cloud in 1995
(Windows 95 cloud wallpaper)
Cell Phones in 2011
• Dual core 1.5 GHz
• WiFi 802.11n (300+ Mbps)
• 1 GB RAM
• 64 GB solid state
How about a DB
designed for today?
MongoDB philosophy
• Keep functionality when we can (key/value stores are great, but we need more)
• Non-relational (no joins) makes scaling horizontally practical
• Document data models are good
• Database technology should run anywhere: VMs, cloud, metal, etc.
MongoDB is:
• Document oriented
• High performance
• Fully consistent
• Horizontally scalable

Application
{ author: "steve",
  date: new Date(),
  text: "About MongoDB...",
  tags: ["tech", "database"] }
Under the hood

•Written in C++
•Runs on nearly anything
•Data serialized to BSON
•Extensive use of memory-mapped files
Database Landscape
This has led some to say:
“MongoDB has the best features of key/value stores, document databases and relational databases in one.”
- John Nunemaker
Use Cases
CMS / Blog
Needs:
• Business needed modern data store for rapid development and
scale

Solution:
• Use PHP & MongoDB

Results:
• Real time statistics
• All data, images, etc stored together, easy access, easy
deployment, easy high availability
• No need for complex migrations
• Enabled very rapid development and growth
Photo Meta-Data
Problem:
• Business needed more flexibility than Oracle could deliver

Solution:
• Use MongoDB instead of Oracle

Results:
• Developed application in one sprint cycle
• 500% cost reduction compared to Oracle
• 900% performance improvement compared to Oracle
Customer Analytics
Problem:
• Deal with massive data volume across all customer sites

Solution:
• Use MongoDB to replace Google Analytics / Omniture options

Results:
• Less than one week to build prototype and prove business case
• Rapid deployment of new features
Online Dictionary
Problem:
• MySQL could not scale to handle their 5B+ documents

Solution:
• Switched from MySQL to MongoDB

Results:
• Massive simplification of code base
• Eliminated need for external caching system
• 20x performance improvement over MySQL
E-commerce
Problem:
• Multi-vertical E-commerce impossible to model (efficiently) in
RDBMS

Solution:
• Switched from MySQL to MongoDB

Results:
• Massive simplification of code base
• Rapidly build, halving time to market (and cost)
• Eliminated need for external caching system
• 50x+ improvement over MySQL
Tons more
Pretty much if you can use an RDBMS or a key/value store,
MongoDB is a great fit
In Good Company
Why NoSQL for
e-commerce?

Using the right solution for each situation


Data dilemma of
e-commerce
Pick One
• Stick to one vertical (Sane schema)
• Flexibility (Insane schema)
Sane schema
• Works ... for a while
• Fine for a few types of products
• Not possible when more product types are introduced
Let’s Use an Example
How about we start with books
Book Product Schema
Product {
  // General product attributes
  id:
  sku:
  product dimensions:
  shipping weight:
  MSRP:
  price:
  description:
  ...
  // Book-specific attributes
  author: Orson Scott Card
  title: Ender's Game
  binding: Hardcover
  publication date: July 15, 1994
  publisher name: Tor Science Fiction
  number of pages: 352
  ISBN: 0812550706
  language: English
  ...
}
Seems simple enough
What happens when we add another vertical...
say music albums
Album Product Schema
Product {
  // General product attributes stay the same
  id:
  sku:
  product dimensions:
  shipping weight:
  MSRP:
  price:
  description:
  ...
  // Album-specific attributes are different
  artist: MxPx
  title: Panic
  release date: June 7, 2005
  label: Side One Dummy
  track listing: [ The Darkest ...
  language: English
  format: CD
  ...
}
Okay, it’s getting
hairy but is still
manageable, right?
Now the business wants to sell jeans
Jeans Product Schema
Product {
  // General product attributes stay the same
  id:
  sku:
  product dimensions:
  shipping weight:
  MSRP:
  price:
  description:
  ...
  // Jeans-specific attributes are totally different ...
  // and not consistent across brands & make
  brand: Lucky
  gender: Mens
  make: Vintage
  style: Straight Cut
  length: 34
  width: 34
  color: Hipster
  material: Cotton Blend
  ...
}
Now we’re screwed
We need a flexible
schema in RDBMS

We got this ... right?


Many approaches
dealing with unknown
unknowns in RDBMS

None work well


EAV
as popularized by Magento
“For purposes of flexibility, the Magento database heavily utilizes an Entity-Attribute-Value (EAV) data model. As is often the case, the cost of flexibility is complexity - Magento is no exception. The process of manipulating data in Magento is often more “involved” than that typically experienced using traditional relational tables.”
- Varien
EAV
• Crazy SQL queries
• Hundreds of joins in a query... or
• Hundreds of queries joined in the application
• No database enforced integrity
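
To make the "queries joined in the application" point concrete, here is a minimal sketch in plain JavaScript with no database involved. The row layout and the names `eavRows`, `assembleProduct` and `asDocument` are assumptions made for illustration; this is not Magento's actual schema.

```javascript
// Hypothetical EAV rows: one row per (entity, attribute, value) triple.
const eavRows = [
  { entityId: 1, attribute: 'sku', value: '00e8da9c' },
  { entityId: 1, attribute: 'title', value: 'Hoss' },
  { entityId: 1, attribute: 'artist', value: 'Lagwagon' },
  { entityId: 2, attribute: 'sku', value: '00e8da9d' },
  { entityId: 2, attribute: 'title', value: 'The Matrix' },
];

// Rebuild one product by scanning its rows. With EAV tables this work is
// typically done with a join (or a separate query) per attribute.
function assembleProduct(rows, entityId) {
  const product = {};
  for (const row of rows) {
    if (row.entityId === entityId) {
      product[row.attribute] = row.value;
    }
  }
  return product;
}

// The same product stored as a single document needs no reassembly.
const asDocument = { sku: '00e8da9c', title: 'Hoss', artist: 'Lagwagon' };

console.log(assembleProduct(eavRows, 1));
console.log(asDocument);
```

Every attribute added to a product adds another row to gather, which is where the hundreds of joins come from.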
Did I say crazy SQL
(this is a single query)

You may have trouble reading this in the back


Selecting a single
product
Single Table Inheritance
(insanely wide tables)
• No data integrity enforcement
• Can only use FKs for common elements
• Very wasteful (but disk is cheap!)
• Can’t effectively index

Generic Columns
• No data integrity enforcement
• No data type enforcement
• Can only use FKs for common elements
• Wasteful (but disk is cheap!)
• Can’t index
Serialized in Blob
• Not searchable
• No integrity
• All the disadvantages of a document store, but none of the advantages
• Should never be used
• One exception is Oracle XML, which operates similarly to a document store
Concrete Table Inheritance
(a table for each product attribute set)
• Allows for data integrity
• Querying across attribute sets is quite hard to do (lots of joins, OR statements and full table scanning)
• New table needs to be created for each new attribute set
Class table inheritance
(single product table,
each attribute set in own table)
• Likely best solution within the constraint of SQL
• Supports data type enforcement
• No data integrity enforcement
• Easy querying across categories (for browse pages) since common data is in a single table
• Every set needs a new table
• Requires a ton of foresight, as changes are very complicated
MongoDB to the
Rescue
•Flexible (and sane) Schema
•Easily searchable
•Easily accessible
•Fast
Flexible schema
{
  sku: "00e8da9c",
  type: "Audio Album",
  title: "Hoss",
  description: "by Lagwagon",
  asin: "B0000007QG",

  shipping: {
    weight: 6,
    dimensions: {
      width: 10,
      height: 10,
      depth: 1
    },
  },

  pricing: {
    list: 1000,
    retail: 800,
    savings: 200,
    pct_savings: 20
  },

  details: {
    title: "Hoss",
    artist: "Lagwagon",
    genre: [ "Punk", "Hardcore", "Indie Rock" ],
    label: "Fat Wreck Chords",
    number_of_discs: 1,
    issue_date: "November 21, 1995",
    format: "CD",
    alternate_formats: [ 'Vinyl', 'MP3' ],
    tracks: [
      "Kids Don't Like To Share",
      "Violins",
      "Name Dropping",
      "Bombs Away",
      "Move The Car",
      "Sleep",
      "Sick",
      "Rifle",
      "Weak",
      "Black Eye",
      "Bro Dependent",
      "Razor Burn",
      "Shaving Your Head",
      "Ride The Snake",
    ],
  }
}

{
  sku: "00e8da9d",
  type: "Film",
  title: "The Matrix",
  description: "Set in the 22nd century, Th...
  asin: "B000P0J0AQ",

  shipping: {
    weight: 6,
    dimensions: {
      width: 10,
      height: 10,
      depth: 1
    },
  },

  pricing: {
    list: 1200,
    retail: 1100,
    savings: 100,
    pct_savings: 8.5
  },

  details: {
    title: "The Matrix",
    director: [ "Andy Wachowski", "Larry Wa...
    writer: [ "Andy Wachowski", "Larry Wach...
    actor: [ "Keanu Reeves" , "Lawrence Fis...
    genre: [ "Science Fiction", "Action" ],
    number_of_discs: 1,
    issue_date: "May 15 2007",
    original_release_date: "1999",
    disc_format: "DVD",
    rating: "R",
    alternate_formats: [ 'VHS', 'Bluray' ],
    run_time: "136",
    studio: "Warner Bros",
    language: "English",
    format: [ "AC-3", "Closed-captioned", "...
    aspect_ratio: "1.66:1"
  },
}
Queries
db.products.find( { 'name': "The Matrix" } );

{
"_id": ObjectId("4d8ad78b46b731a22943d3d3"),
"sku": "00e8da9d",
"type": "Film",
"name": "The Matrix",
"description": "Set in the 22nd century, The Matrix...",
"asin": "B000P0J0AQ",
"shipping": {
"weight": 6,
"dimensions": {
"width": 10,
"height": 10,
"depth": 1
}
},
"pricing": {
db.products.find( { 'details.actor': "Groucho Marx" } );

},
"pricing": {
"list": 1000,
"retail": 800,
"savings": 200,
"pct_savings": 20
},
"details": {
"title": "A Night at the Opera",
"director": "Sam Wood",
"actor": ["Groucho Marx", "Chico Marx", "Harpo Marx"],
"genre": "Comedy",
"number_of_discs": 1,
"issue_date": "May 4 2004",
"original_release_date": "1935",
"disc_format": "DVD",
db.products.find( {
'details.genre': "Jazz", 'details.format': "CD"
} );

"list": 1200,
"retail": 1100,
"savings": 100,
"pct_savings": 8
},
"details": {
"title": "A Love Supreme [Original Recording Reissued]",
"artist": "John Coltrane",
"genre": ["Jazz", "General"],
"format": "CD",
"label": "Impulse Records",
"number_of_discs": 1,
"issue_date": "December 9, 1964",
"alternate_formats": ["Vinyl", "MP3"],
"tracks": [
"A Love Supreme Part I: Acknowledgement",
db.products.find( { 'details.actor':
{ $all: ['James Stewart', 'Donna Reed'] }
} );

},
"details": {
"title": "It's a Wonderful Life",
"director": "Frank Capra",
"actor": ["James Stewart", "Donna Reed", "Lionel Barrymore"],
"writer": [
"Frank Capra",
"Albert Hackett",
"Frances Goodrich",
"Jo Swerling",
"Michael Wilson"
],
"genre": "Drama",
"number_of_discs": 1,
"issue_date": "Oct 31 2006",
"original_release_date": "1947",
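
The queries above lean on two matching rules. Dot notation reaches into embedded documents. A plain value tested against an array field matches when any element equals it, while `$all` requires every listed value to be present. The plain JavaScript below is a simplified model of those rules, not MongoDB's implementation; `getPath`, `matchesValue` and `matchesAll` are hypothetical helpers.

```javascript
// Follow a dotted path such as 'details.actor' into a nested object.
function getPath(doc, path) {
  return path.split('.').reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// Equality match: an array field matches if any element equals the value.
function matchesValue(doc, path, expected) {
  const actual = getPath(doc, path);
  return Array.isArray(actual) ? actual.includes(expected) : actual === expected;
}

// $all-style match: every listed value must appear in the array field.
function matchesAll(doc, path, required) {
  const actual = getPath(doc, path);
  return Array.isArray(actual) && required.every((v) => actual.includes(v));
}

const film = {
  details: {
    title: "It's a Wonderful Life",
    actor: ['James Stewart', 'Donna Reed', 'Lionel Barrymore'],
    genre: 'Drama',
  },
};

console.log(matchesValue(film, 'details.actor', 'Donna Reed'));
console.log(matchesAll(film, 'details.actor', ['James Stewart', 'Donna Reed']));
console.log(matchesAll(film, 'details.actor', ['James Stewart', 'Groucho Marx']));
```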
Wanna Play?
• grab products.js from http://github.com/spf13/mongoProducts
• mongo --shell products.js
• > use mongoProducts
Embedded documents
are great for orders
•Ordered items need to be fixed at the
time of purchase
•Embed them right in the order
db.order.find( { 'items.sku': '00e8da9f' } );
db.order.find( {
'items.details.actor': 'James Stewart'
} ).count();
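
A minimal sketch of why embedding fixes the ordered items at the time of purchase: the order keeps its own copy of the product data, so later catalog changes do not touch it. This is plain JavaScript over in-memory objects; `catalog`, `createOrder` and the product fields shown are assumptions for illustration.

```javascript
// Hypothetical catalog entry as it looks at the time of purchase.
const catalog = {
  '00e8da9f': { sku: '00e8da9f', title: 'Example product', pricing: { retail: 1100 } },
};

// Build an order that embeds a deep copy of each purchased product.
function createOrder(skus) {
  const items = skus.map((sku) => JSON.parse(JSON.stringify(catalog[sku])));
  const total = items.reduce((sum, item) => sum + item.pricing.retail, 0);
  return { items, total };
}

const order = createOrder(['00e8da9f']);

// A later price change in the catalog leaves the order untouched.
catalog['00e8da9f'].pricing.retail = 900;

console.log(order.items[0].pricing.retail, order.total);
```

Because the items live inside the order, queries like the ones above can filter orders on any embedded product field.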
What about
transactions?

Using the right solution for each situation


Data (like people) are
really sensitive when
it comes to money
Stricter data
requirements for $$
• For financial systems any data inconsistency is unacceptable
• Perhaps you’ve heard of ACID?
What about ACID?

Q: Is MongoDB ACID?
A: Kinda
Atomicity
• MongoDB does atomic writes ... for single document changesets
• $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit
Consistency
• MongoDB can enforce unique keys
• MongoDB can't enforce referential integrity
Isolation
• // Pseudo-isolated updates
db.foo.update( { x : 1 } , { $inc : { y : 1 } } , false , true );
• // Isolated updates
db.foo.update( { x : 1 , $atomic : 1 } , { $inc : { y : 1 } } , false , true );
• But there are caveats...
• Despite the $atomic keyword, this is not an atomic update, since atomicity implies “all or nothing”
• $atomic here means the update is done without interference from any other operation (isolated)
• An isolated update can only act on a single collection. Multi-collection updates are not transactional, thus not isolatable.
Durability

•Mongo has this one covered


What does
MongoDB Support?
• Atomic single document writes
• If you need atomic writes across multi-document transactions, don't use Mongo
• Many if not most e-commerce transactions could be accomplished within a single document write
• Unique indexes
• This only works on keys used by the entire collection
• Isolated (not atomic) single collection updates
• Mongo does not support locking
• There are ways to work around this
• It’s durable
There are ways to guarantee ACID properties in MongoDB
Here are 2 good approaches useful for e-commerce transactions
Optimistic
Concurrency
•Read the current state of a product
•Make your changes with the assertion
that your product has the same state as
it did when you last read it
Optimistic concurrency
in MongoDB
We’ll use an update-if-current strategy.
This example is straight from the documentation:

> t = db.inventory
> p = t.findOne({sku:'abc'})
> t.update({_id:p._id, qty:p.qty}, {'$inc': {qty: -1}});
> db.$cmd.findOne({getlasterror:1});

{"err" : null , "updatedExisting" : true , "n" : 1 , "ok" : 1}

// it worked

... If that didn't work, try again until it does.
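
The "try again until it does" step can be sketched as a retry loop. The code below is plain JavaScript over an in-memory array standing in for the inventory collection, so it runs without a database. `updateIfCurrent`, `decrementWithRetry` and the `onRead` hook are hypothetical names, and the attempt limit is an assumption added so the loop cannot spin forever.

```javascript
// In-memory stand-in for the inventory collection (assumption).
const inventory = [{ _id: 1, sku: 'abc', qty: 5 }];

// Return a copy of the document, like a read from the database.
const findOne = (sku) => ({ ...inventory.find((d) => d.sku === sku) });

// Mimics update({_id, qty}, {$inc: {qty: delta}}): applies the change only if
// the document still has the qty we read. Returns how many documents changed.
function updateIfCurrent(id, expectedQty, delta) {
  const doc = inventory.find((d) => d._id === id && d.qty === expectedQty);
  if (!doc) return 0;
  doc.qty += delta;
  return 1;
}

// Read, attempt the conditional update, and retry on conflict.
// onRead is a hook used below to simulate a competing writer.
function decrementWithRetry(sku, maxAttempts, onRead = () => {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const p = findOne(sku);
    onRead(attempt);
    if (updateIfCurrent(p._id, p.qty, -1) === 1) return attempt;
  }
  throw new Error('too much contention, giving up');
}

// Another buyer changes qty between our first read and our first update.
const attempts = decrementWithRetry('abc', 3, (attempt) => {
  if (attempt === 1) inventory[0].qty -= 1;
});

console.log(attempts, inventory[0].qty);
```

The first attempt fails because the quantity changed after the read; the second attempt re-reads and succeeds.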


Optimistic
concurrency
•Read the current state of a product.
•Make your changes with the assertion
that your product has the same state as
it did when you last read it.
• It is also possible to use OCC to
bootstrap pessimistic concurrency and
fake row level locking
Optimistic concurrency
control assumes an
environment with low
data contention
OCC works great for
companies like Amazon

•Amazon has a long-tail catalog


•A long tail catalog lends itself well to
optimistic concurrency, because it has
low data contention
OCC fails miserably
for
•eBay
•Gilt
•Groupon
•OpenSky
•Living Social
•InsertFlashSaleSiteOfTheMinute
Flash sales and
auctions are defined by
high data contention
• The model doesn't work otherwise
• They can't afford to be optimistic
• Order really matters
What about high
contention
environments?
If we can avoid
concurrency we’ve
got it made
Commerce is ACID
In Real Life
1. I go to Barneys and see a pair of shoes I just have to buy.
2. I call “dibs” (by grabbing them off the shelf).
3. I take them up to the cash register and purchase them:
   • Store inventory has been manually decremented.
   • I pay for them with my trusty AmEx.
4. If all goes according to plan, I walk out of the store.
5. If my card was declined, the shoes are “rolled back” ... out onto the shelves and sold to the next customer who wants them.
All of this is
accomplished
without concurrency
Each item can only be
held by one consumer at a time
We follow the same
model for e-commerce
1. Select a product.
2. Update the document to hold inventory.
   • Store inventory has been decremented.
3. Purchase the product(s)
   • Process payment
4. Roll back if anything went wrong.
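
The four steps above can be sketched in plain JavaScript. The product is an in-memory object standing in for a MongoDB document, and `holdInventory`, `releaseInventory`, `purchase` and the `chargeCard` callback are hypothetical names, not part of any driver. Against a real database the hold would presumably be one conditional single-document update rather than a separate check and write.

```javascript
// In-memory stand-in for a product document (assumption).
const product = { sku: 'abc', qty: 1 };

// Step 2: hold inventory by decrementing, but only if stock is left.
function holdInventory(doc) {
  if (doc.qty <= 0) return false;
  doc.qty -= 1;
  return true;
}

// Step 4: put the held unit back on the shelf.
function releaseInventory(doc) {
  doc.qty += 1;
}

// Steps 2-4 together. chargeCard is a hypothetical payment callback
// that returns true on success and may throw on failure.
function purchase(doc, chargeCard) {
  if (!holdInventory(doc)) return 'sold out';
  let paid = false;
  try {
    paid = chargeCard();
  } catch (err) {
    paid = false;
  }
  if (!paid) {
    releaseInventory(doc);
    return 'rolled back';
  }
  return 'purchased';
}

const declined = purchase(product, () => false); // card declined
const accepted = purchase(product, () => true);  // payment succeeds
const tooLate = purchase(product, () => true);   // nothing left to hold

console.log(declined, accepted, tooLate, product.qty);
```

The declined payment returns the unit to stock, so the next customer can still buy it, just like the shoes going back on the shelf.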
MongoDB e-commerce
transactions
• Each Item (not SKU) has its own document
• Document consists of...
• a reference to the SKU (product)
• a state ( available / sold / ... )
• potentially other data (timestamp, order ref)
Transactions
in MongoDB
We’ll use a simple update statement here.

> t = db.inventory
> // look up the SKU (product) document; the products collection is assumed here
> sku = db.products.findOne({sku:'abc'})
> t.update({ref_id:sku._id, state: 'available'}, {'$set': {state: 'ordered'}});
> db.$cmd.findOne({getlasterror:1});

{"err" : null , "updatedExisting" : true , "n" : 1 , "ok" : 1}

// it worked

... If that didn't work, no inventory available


Cart in Cart Action
An added benefit: it can easily provide an inventory hold in the cart.

> t = db.inventory
> // look up the SKU (product) document; the products collection is assumed here
> sku = db.products.findOne({sku:'abc'})
> t.update({ref_id:sku._id, state: 'available'}, {'$set': {state: 'in cart'}});
> db.$cmd.findOne({getlasterror:1});

{"err" : null , "updatedExisting" : true , "n" : 1 , "ok" : 1}

// it worked

Just like reality, each item is either available, in a cart, or purchased
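
The three states can be sketched as transitions on per-item documents. This is plain JavaScript over an in-memory array; `transition`, `addToCart`, `checkout` and `abandonCart` are hypothetical helpers, and the `ref_id` values are made up. Returning the changed item is a convenience of this sketch; the shell `update` above reports a count instead.

```javascript
// One document per physical item, each referencing its SKU (in-memory stand-in).
const items = [
  { _id: 1, ref_id: 'sku-abc', state: 'available' },
  { _id: 2, ref_id: 'sku-abc', state: 'available' },
];

// Change the state of at most one item whose fields all equal those in
// `match`. Returns the changed item, or null when nothing matched.
function transition(match, to) {
  const item = items.find((i) =>
    Object.keys(match).every((key) => i[key] === match[key])
  );
  if (!item) return null;
  item.state = to;
  return item;
}

const addToCart = (refId) => transition({ ref_id: refId, state: 'available' }, 'in cart');
const checkout = (id) => transition({ _id: id, state: 'in cart' }, 'purchased');
const abandonCart = (id) => transition({ _id: id, state: 'in cart' }, 'available');

const first = addToCart('sku-abc');  // holds item 1
const second = addToCart('sku-abc'); // holds item 2
const third = addToCart('sku-abc');  // nothing left to hold
checkout(first._id);                 // item 1 is purchased
abandonCart(second._id);             // item 2 goes back on the shelf

console.log(items.map((i) => i.state), third);
```

Because each transition names the state it expects, an item that is already in someone's cart can never be handed to a second shopper.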
http://spf13.com
http://github.com/spf13
@spf13

Questions?
download at mongodb.org
PS: We’re hiring!! Contact us at jobs@10gen.com
