
SOCAL

Migrate Anything* to MongoDB Atlas

Sig Narváez
Principal Solution Architect
@SigNarvaez

#MDBlocal
Agenda

Why MongoDB? Why Atlas?


Prep Items
Which Migration Path? (Options)
Post steps
Migrating Other Data Stores
Q&A ⇒ db.SigNarvaez.find({}).explain()

Why MongoDB? Why Atlas?
Why MongoDB? A: Next Gen Multi-Model data platform
MongoDB is the most powerful data management platform in the market today

Flexible Multi-Structured Schema is designed to adapt to changes

● Document: Rich JSON Data Structures, Flexible Schema
● Graph: Graph & Hierarchical, Recursive Lookups
● Relational: Left-Outer Join, Views, Schema Validation
● Search: Text Search, Multiple Languages, Faceted Search
● GeoSpatial: GeoJSON, 2D & 2DSphere
● Key/Value: Horizontal Scale, In-Memory
● Binaries: Files & Metadata, Encrypted
● Mobile Apps

MongoDB Atlas Data Platform
Migration Prep
Self-Managed MongoDB to Fully Managed MongoDB Atlas
Prep Items

Prep Items: Atlas Cluster Sizing

What is the current cluster hardware like?
● RAM
● Disk (size & speed)
● CPUs

What is the workload like?
● Reads / Sec?
● Writes / Sec?
● Docs / Sec?
● Peak Connections?

Where to get the numbers
● APM: DataDog, NewRelic, ?
● cmd line: mongostat, mongotop, iostat, top, free, vmstat, etc.
● MongoDB Shell: db.serverStatus().connections

Prep Items: Atlas Cluster Sizing

On-Prem or Cloud Reserved Instances
● Most likely overprovisioned

Let ATLAS AUTO-SCALE figure it out!
● Match the current hardware
● Run performance tests for hours / days
● Upscale: CPU or RAM > 75% (1 hr)
● Downscale: CPU and RAM < 50% (72 hrs)
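
The two auto-scale thresholds above can be read as a simple decision rule. The sketch below is only an illustration of that rule, not the Atlas implementation: it assumes utilization is averaged over each window, and the `UtilizationSample` type and `scaling_decision` function are hypothetical names.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class UtilizationSample:
    """One monitoring sample (hypothetical shape), values in percent."""
    cpu_pct: float
    ram_pct: float


def scaling_decision(
    last_hour: Sequence[UtilizationSample],
    last_72_hours: Sequence[UtilizationSample],
) -> str:
    """Return 'upscale', 'downscale', or 'hold'.

    Assumption: each window is summarized by its average.
    """
    # Upscale: CPU OR RAM above 75% over the last hour.
    if last_hour:
        if (mean(s.cpu_pct for s in last_hour) > 75
                or mean(s.ram_pct for s in last_hour) > 75):
            return "upscale"
    # Downscale: CPU AND RAM below 50% over the last 72 hours.
    if last_72_hours:
        if (mean(s.cpu_pct for s in last_72_hours) < 50
                and mean(s.ram_pct for s in last_72_hours) < 50):
            return "downscale"
    return "hold"
```

Note the asymmetry: one hot resource is enough to scale up, while both resources must stay quiet for much longer before scaling down.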

Prep Items: Expert Atlas Cluster Sizing
Work with your local MongoDB Solution Architect
#Shards by Storage = Total Storage ÷ Max Storage Per Shard

#Shards by RAM = Total RAM ÷ Max RAM Per Shard

#Shards by Cores = Total Cores ÷ Max Cores Per Shard

#Shards by IOPS = Total IOPS ÷ Max IOPS Per Shard

#Shards by Network Bandwidth = Peak Gbps ÷ Gbps Capacity Per Shard

#Shards by Disk Bandwidth = Peak Mbps ÷ Mbps Capacity Per Shard
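
The per-dimension formulas above can be combined into a single estimate. This is a minimal sketch under two assumptions the slide does not state: each quotient is rounded up, and the final shard count is the largest of the per-dimension results. All numbers in the usage note are made up for illustration.

```python
import math
from typing import Dict, Tuple


def shards_needed(
    totals: Dict[str, float],
    per_shard_max: Dict[str, float],
) -> Tuple[int, Dict[str, int]]:
    """Apply 'total / max per shard' to every dimension.

    Assumptions: round each result up, then take the maximum,
    so that the most constrained resource drives the count.
    """
    per_dimension = {
        name: math.ceil(totals[name] / per_shard_max[name])
        for name in totals
    }
    return max(1, max(per_dimension.values())), per_dimension
```

For example, with hypothetical totals of 10,000 GB storage, 768 GB RAM and 60,000 IOPS against per-shard limits of 4,096 GB, 256 GB and 16,000 IOPS, the per-dimension results are 3, 3 and 4, so IOPS drives the estimate to 4 shards.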

Complete MongoDB Atlas Sizing Talk from MDBW19:


https://www.slideshare.net/mongodb/mongodb-world-2019-finding-the-right-mongodb-atlas-cluster-size-does-this-instance-make-my-app-look-fast

Prep Items: Version, Driver & Retries

Ensure your current driver is 3.6+ compatible


As of Feb 2020 Atlas is 3.6+
You can still migrate from 2.6+!!

3.6 Retryable Writes


4.2 Retryable Reads
Fault Resiliency

Prep Items: Connectivity

● IP Whitelist | VPC Peer | Private Endpoint


● Create Users & Permissions
● Use SRV connection strings (3.6+)

vs.
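
To make the SRV vs. standard comparison concrete, here is a small sketch that builds both forms with the standard library only. The host names, user, and replica set name are hypothetical; `retryWrites` and `w` are shown as typical URI options.

```python
from typing import Dict, Sequence
from urllib.parse import quote_plus, urlencode


def srv_uri(user: str, password: str, cluster_host: str,
            options: Dict[str, str]) -> str:
    """SRV form: one DNS name; the driver discovers the members."""
    return (f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}"
            f"@{cluster_host}/?{urlencode(options)}")


def standard_uri(user: str, password: str, hosts: Sequence[str],
                 replica_set: str, options: Dict[str, str]) -> str:
    """Standard form: every member and the TLS/replica set options
    are spelled out explicitly."""
    merged = {"ssl": "true", "replicaSet": replica_set,
              "authSource": "admin", **options}
    return (f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
            f"@{','.join(hosts)}/?{urlencode(merged)}")
```

The SRV form stays the same when cluster members change, which is one reason it is the convenient choice for 3.6+ compatible drivers.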

Prep Items: Test Basic Ops

Test, Test, Test
● Simulate Production Traffic
● Your own test suite
● POCDriver
> https://github.com/johnlpage/POCDriver
● mgeneratejs
> https://github.com/rueckstiess/mgeneratejs

mgeneratejs '{
"_id": "$objectid",
"dateTime": "$date",
"createdAt": "$date",
"Action": "$string",
"severityLevel": "$integer",
"source": "$string",
"display": "$string",
"deviceServerIp": "$ip",
"details": {
"ipAddress": "$ip",
"macAddress": "$string",
"userId": "SYSTEM",
"method": "method"
}}' --jsonArray -n 1000000 | mongoimport --jsonArray --port 27017 --upsert -d atlas -c iot
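
If mgeneratejs is not available, a similar stream of test documents can be produced with a few lines of Python. This is a rough stand-in, not a port of mgeneratejs: the field names mirror the template above, while the value generators and the `random_doc` helper are assumptions chosen for illustration.

```python
import json
import random
import string
from datetime import datetime, timedelta, timezone


def random_doc(rng: random.Random) -> dict:
    """Build one IoT-style test document in MongoDB Extended JSON."""
    def rand_string(length: int = 8) -> str:
        return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

    def rand_ip() -> str:
        return ".".join(str(rng.randint(0, 255)) for _ in range(4))

    def rand_date() -> dict:
        base = datetime(2020, 1, 1, tzinfo=timezone.utc)
        moment = base + timedelta(seconds=rng.randint(0, 365 * 24 * 3600))
        return {"$date": moment.strftime("%Y-%m-%dT%H:%M:%SZ")}

    return {
        "_id": {"$oid": "%024x" % rng.getrandbits(96)},  # 24 hex chars
        "dateTime": rand_date(),
        "createdAt": rand_date(),
        "Action": rand_string(),
        "severityLevel": rng.randint(0, 10),
        "source": rand_string(),
        "display": rand_string(),
        "deviceServerIp": rand_ip(),
        "details": {
            "ipAddress": rand_ip(),
            "macAddress": rand_string(12),
            "userId": "SYSTEM",
            "method": "method",
        },
    }


def dump_docs(count: int, seed: int = 0) -> str:
    """Return `count` documents, one JSON document per line."""
    rng = random.Random(seed)
    return "\n".join(json.dumps(random_doc(rng)) for _ in range(count))
```

Writing the result of `dump_docs(1000)` to a file gives newline-delimited JSON, which mongoimport reads without the `--jsonArray` flag.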

Prep Items: Increase OpLog on Source Cluster

Initial Sync
● Scans every document
● Replicates to target cluster

Source OpLog
● Must be large enough to contain the entire initial sync oplog window, in order to replicate data changes that occurred during initial sync
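
A back-of-the-envelope check for the oplog requirement above might look like the following. It is a sketch under stated assumptions: the oplog growth rate is taken as roughly constant, and the safety factor of 2 is an arbitrary illustrative margin, not a MongoDB recommendation.

```python
def oplog_window_hours(oplog_size_gb: float, oplog_gb_per_hour: float) -> float:
    """How many hours of changes the current oplog can hold."""
    if oplog_gb_per_hour <= 0:
        raise ValueError("oplog_gb_per_hour must be positive")
    return oplog_size_gb / oplog_gb_per_hour


def required_oplog_gb(oplog_gb_per_hour: float, initial_sync_hours: float,
                      safety_factor: float = 2.0) -> float:
    """Oplog size needed to cover the whole initial sync, with headroom.

    Assumption: safety_factor is a hypothetical margin for write bursts.
    """
    if oplog_gb_per_hour < 0 or initial_sync_hours < 0:
        raise ValueError("inputs must be non-negative")
    return oplog_gb_per_hour * initial_sync_hours * safety_factor
```

For instance, at 1.5 GB of oplog per hour and a 10 hour initial sync (a duration a dry run would provide), the sketch suggests 30 GB, and a 48 GB oplog would cover a 32 hour window.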

Prep Items: Upscale Target Cluster

● Recommend upscaling by 1+ tier
● Consider higher IOPS too
● Increasing disk size is a lower-cost alternative to provisioned IOPS
● Turn off Auto-Scale
● Force Failover before migration

Migration Options
Comparing Options

Live Migrate
● RS or Sharded
● Built-in cutover
● Great for most customers
● Built-in Atlas UI
● Must temporarily allow network access (hop)

mongomirror
● RS only
● Sharded: Professional Services
● Can avoid network hop
● Works with Network peering
● User-controlled cut-over

dump/restore or import
● All deployments
● Downtime proportional to data size
● Sharded -> RS
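
The comparison can be encoded as a small rule-of-thumb helper. This is only one possible reading of the options above; the function name, parameter names, and the order of the results are assumptions, and a real decision should involve more factors.

```python
from typing import List


def viable_options(topology: str, can_allow_network_access: bool,
                   downtime_ok: bool) -> List[str]:
    """List the migration paths that fit, per the comparison above.

    topology: 'replica_set', 'sharded', or 'standalone' (hypothetical labels).
    """
    options: List[str] = []
    # Live Migrate: RS or sharded, but needs temporary network access.
    if topology in ("replica_set", "sharded") and can_allow_network_access:
        options.append("Live Migrate")
    # mongomirror: RS only; sharded clusters go through Professional Services.
    if topology == "replica_set":
        options.append("mongomirror")
    elif topology == "sharded":
        options.append("mongomirror via Professional Services")
    # dump/restore works for all deployments, at the cost of downtime.
    if downtime_ok:
        options.append("dump/restore or import")
    return options
```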

Behind the scenes

1. initial sync - copying documents and building indexes that already exist on the source deployment.

2. oplog sync - tailing and applying entries from the oplog (delta).
○ “CDC” - Continues replicating as live data is changing
○ resumable from here
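
The two phases can be illustrated with a toy in-memory model. This is a conceptual sketch, not how Live Migrate or mongomirror are implemented: collections are plain dicts, the oplog is a list of `(timestamp, op, _id, document)` tuples, and every name here is hypothetical.

```python
from typing import Dict, List, Tuple

OplogEntry = Tuple[int, str, str, dict]  # (timestamp, op, _id, document)


def initial_sync(source: Dict[str, dict], oplog: List[OplogEntry],
                 target: Dict[str, dict]) -> int:
    """Phase 1: remember the oplog position, then copy every document."""
    start_ts = oplog[-1][0] if oplog else 0
    for doc_id, doc in source.items():
        target[doc_id] = dict(doc)
    return start_ts


def oplog_sync(oplog: List[OplogEntry], target: Dict[str, dict],
               resume_after_ts: int) -> int:
    """Phase 2: apply every entry newer than the saved position.

    Returns the last applied timestamp, which is the resume point.
    """
    last_applied = resume_after_ts
    for ts, op, doc_id, doc in oplog:
        if ts <= resume_after_ts:
            continue
        if op in ("insert", "update"):
            target[doc_id] = dict(doc)  # whole-document apply, idempotent
        elif op == "delete":
            target.pop(doc_id, None)
        last_applied = ts
    return last_applied
```

If entries newer than the saved position have already rolled off the source oplog, phase 2 cannot catch up, which is why the source oplog has to cover the whole initial sync.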

Migration Dry Run

Prod ⇒ Staging/QA Atlas Cluster

Dry-run: Run initial sync at least 2 times
1) Build Staging site with Initial Sync but w/o Cutover
a) Measure time
2) Repeat w/Cutover
a) Let Live Migrate (LM) / mongomirror (MM) reach 0s replication lag
b) Restart Apps pointing to the new Cluster
c) Test, Test, Test

What the dry run verifies
● Connectivity & Security
● Time to perform initial sync
● Restart App(s) with new Connection

Migration Execution

New Prod

DEMO

Live Migration

DEMO

mongomirror
Post Migration
Housekeeping

Monitor the deployment


Re-size oplog or instance size accordingly (72 hours recommended)
Update IP Whitelisting, if applicable
Set up backups, alerts, and other security settings

Extra Resources
https://www.mongodb.com/cloud/atlas/migrate

Extra Resources
https://www.mongodb.com/products/consulting

Other Data Stores
Cloud NoSQL & RDBMS

Safe Harbor Statement
This presentation contains “forward-looking statements” within the meaning of Section 27A of the Securities Act of 1933,
as amended, and Section 21E of the Securities Exchange Act of 1934, as amended. Such forward-looking statements are
subject to a number of risks, uncertainties, assumptions and other factors that could cause actual results and the timing of
certain events to differ materially from future results expressed or implied by the forward-looking statements. Factors that
could cause or contribute to such differences include, but are not limited to, those identified in our filings with the Securities
and Exchange Commission. You should not rely upon forward-looking statements as predictions of future events.
Furthermore, such forward-looking statements speak only as of the date of this presentation.

In particular, the development, release, and timing of any features or functionality described for MongoDB products
remains at MongoDB’s sole discretion. This information is merely intended to outline our general product direction and it
should not be relied on in making a purchasing decision nor is this a commitment, promise or legal obligation to deliver
any material, code, or functionality. Except as required by law, we undertake no obligation to update any forward-looking
statements to reflect events or circumstances after the date of such statements.

All Other Data Stores … 350+!!!

https://db-engines.com/en/ranking_categories

Let’s choose a few

● MongoDB “compatible”: AWS DocumentDB, Azure CosmosDB
● Key-value stores: AWS DynamoDB
● Relational DBMS

AWS DocumentDB

● Compatible with MongoDB 3.6

● Use the same MongoDB Drivers/SDKs, Tools and

Applications with Amazon DocumentDB

● Automatic Patching, Failover and Recovery

● Integrated with AWS services (CloudWatch, etc.)

● Functional Differences:

https://docs.aws.amazon.com/documentdb/latest/developerguide/functional-differences.html

AWS DocumentDB Feature Gap vs. MongoDB

Fails > 60%* of MongoDB correctness tests
• Extensive testing, debugging & refactoring required to migrate to DocumentDB

Lags mainline features by 5 years
• No retryable reads + writes
• No transactions
• No support for storage or index compression
• Missing many aggregation stages that allow expressive data handling
• No lossless decimal type
• No search and geospatial queries
• Indexes are not copied over via the utilities (mongodump and mongorestore)
• No materialized views

MongoDB’s most important value is developer productivity. These limitations can significantly reduce that value.

* https://www.mongodb.com/atlas-vs-amazon-documentdb/compatibility (60% for 3.6, 64% for 4.2)

AWS DocumentDB Feature Gap vs. MongoDB

Not based on the MongoDB server
● Emulates the MongoDB API
● Does not provide complete functionality

Yet, Developers are directed to use official MongoDB Drivers, Documentation and University to learn how to connect and develop?

What is this experience like? ...

Possible Migration Options

Method: Offline (mongodump / mongorestore)
Considerations:
● Does not dump admin database
● Recreate user(s) (DocumentDB does not provide RBAC*)

Method: Online (build-your-own)
Considerations:
● Does not support Kinesis Streams, Data Pipeline, etc.
● Change Streams (limited) could be used (likely very fragile)

* https://docs.aws.amazon.com/documentdb/latest/developerguide/functional-differences.html#functional-differences.mongodump-mongorestore

[ec2-user@ip-172-31-1-79 dump]$ mongodump --host sigsdocdb.caexbcw7y6up.us-west-2.docdb.amazonaws.com:27017 --username snarvaez --ssl --sslCAFile /home/ec2-user/rds-combined-ca-bundle.pem
2020-02-24T05:01:23.523+0000 writing SigsTest.coll to
2020-02-24T05:01:23.525+0000 done dumping SigsTest.coll (1 document)

[ec2-user@ip-172-31-1-79 bin]$ ./mongomirror --host rs0/sigsdocdb.caexbcw7y6up.us-west-2.docdb.amazonaws.com:27017 --username snarvaez --ssl --sslCAFile /home/ec2-user/rds-combined-ca-bundle.pem --destination Cluster0-shard-0/cluster0-shard-00-00-tlsla.mongodb.net:27017,cluster0-shard-00-01-tlsla.mongodb.net:27017,cluster0-shard-00-02-tlsla.mongodb.net:27017 --destinationUsername snarvaez
mongomirror version: 0.9.1
git version: 0bc45282784aa74bc25c336412efca7f84749aa4
Go version: go1.12.13
os: linux
arch: amd64
compiler: gc
2020-02-24T05:02:56.564+0000 Error initializing mongomirror: could not initialize source connection: could not connect to server: server selection error: server selection timeout
current topology: Type: Single
Servers:
Addr: sigsdocdb.caexbcw7y6up.us-west-2.docdb.amazonaws.com:27017, Type: Unknown, State: Connected, Average RTT: 0, Last error: connection(sigsdocdb.caexbcw7y6up.us-west-2.docdb.amazonaws.com:27017[-121]) connection is closed

Azure CosmosDB
Advertised Strengths
1. Globally Distributed
2. Linearly Scalable
3. Schema-Agnostic Indexing
4. Multi-Model
5. Multi-API and Multi-Language Support
6. Multi-Consistency Support
7. Indexes Data Automatically
8. High Availability
9. Guaranteed Low Latency
10. Multi-Master Support

Azure CosmosDB Feature Gap vs. MongoDB
Also not based on the MongoDB server - It emulates the MongoDB API

Large feature gaps vs. mainline
● No multi document ACID Transactions, Materialized Views, Retryable Writes, Lossless Decimals, Text Search, Schema Validation, etc.
● 3.2 and 3.6 modes. 3.2 clusters cannot be upgraded to 3.6 at this time (Feb 2020)
● Numerous Incompatibilities: many operations work differently and are not documented - left to developers to figure out

Scalability needs Handling + Rapid Cost Escalations
● RUs determine scalability - developers need error handling when max RUs exceeded

Azure Only - Lock-in
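
The kind of error handling meant by "when max RUs exceeded" is typically a retry with backoff around each operation. The sketch below shows the general pattern only; `RequestRateTooLarge` is a hypothetical stand-in for whatever error the driver surfaces when the request rate is throttled, and the attempt count and delays are arbitrary.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RequestRateTooLarge(Exception):
    """Hypothetical stand-in for a throttling error from the data store."""


def with_backoff(operation: Callable[[], T], max_attempts: int = 5,
                 base_delay_s: float = 0.1,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `operation`, retrying with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except RequestRateTooLarge:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay_s * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))  # jitter spreads retries
    raise RuntimeError("max_attempts must be at least 1")
```

Every call site that can be throttled needs a wrapper like this, which is the extra developer burden the slide points at.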

Possible migration options
Method: Offline (mongodump / mongorestore)
Considerations:
● Not an option - backups cannot be restored to another target

Method: Offline ETL (via Azure Data Factory* or Azure DocumentDB Data Migration Tool*)
Considerations:
● Export to JSON / mongoimport

Method: Online, build-your-own (via Change Feed*)
Considerations:
● Similar to using Change Streams + Azure Functions to write to Atlas

* https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-cosmos-db-mongodb-api
* https://www.microsoft.com/en-us/download/details.aspx?id=46436
* https://docs.microsoft.com/en-us/azure/cosmos-db/change-feed

AWS DynamoDB

DynamoDB is a wide-column key/value store. Each entry is called an Item and consists of Attributes.
Widely used in AWS Ecosystem ⇒ AWS Only

Migration may be required due to
● Increased / Unpredictable Cost
● Functionality insufficient for Business or Dev Productivity - App has outgrown the data store
● etc.

https://aws.amazon.com/blogs/database/choosing-the-right-dynamodb-partition-key/

Possible migration options
Method: Offline (mongoimport)
https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-importexport-ddb-part2

Method: Online, build-your-own (CUD operations via MongoDB Driver)
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.Lambda.html
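
For the build-your-own path, the core work is translating DynamoDB's typed attribute values into plain documents and mapping stream events to create/update/delete (CUD) operations. The following is a minimal sketch of that translation; the function names and the operation labels are hypothetical, binary types are left out, and the actual driver calls are intentionally omitted.

```python
from decimal import Decimal
from typing import Any, Dict, Optional, Tuple


def from_dynamodb_value(attribute: Dict[str, Any]) -> Any:
    """Convert one typed value such as {"S": "x"} or {"N": "3"}."""
    (type_tag, value), = attribute.items()
    if type_tag == "S":
        return value
    if type_tag == "N":
        number = Decimal(value)
        return int(number) if number == number.to_integral_value() else float(number)
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_dynamodb_value(v) for v in value]
    if type_tag == "M":
        return from_dynamodb_item(value)
    if type_tag == "SS":
        return list(value)
    if type_tag == "NS":
        return [from_dynamodb_value({"N": n}) for n in value]
    raise ValueError(f"unsupported DynamoDB type: {type_tag}")


def from_dynamodb_item(item: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Convert a whole Item into a plain document."""
    return {name: from_dynamodb_value(attr) for name, attr in item.items()}


def stream_record_to_operation(
    record: Dict[str, Any],
) -> Tuple[str, Dict[str, Any], Optional[Dict[str, Any]]]:
    """Map a stream record to (operation, key filter, document).

    Assumes the stream is configured to include new images.
    """
    key = from_dynamodb_item(record["dynamodb"]["Keys"])
    event = record["eventName"]
    if event in ("INSERT", "MODIFY"):
        # Intended as an upsert of the full new image on the target.
        return ("upsert", key, from_dynamodb_item(record["dynamodb"]["NewImage"]))
    if event == "REMOVE":
        return ("delete", key, None)
    raise ValueError(f"unknown event: {event}")
```

Treating both INSERT and MODIFY as an upsert of the full new image keeps the writer idempotent, so a replayed stream batch does not corrupt the target.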

RDBMS

Why?
• Modernization
• On-Prem to Cloud
• Monolith to MicroServices
• Oracle exit strategy
Who?
• Cisco migrated $4B eCommerce Platform

https://www.mongodb.com/blog/post/cisco-and-mongodb-e-commerce-transformation

Possible migration options
Method: ETL & CDC
Tools & Patterns: https://github.com/johnlpage/MongoSyphon

Method: Strangler Pattern
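
The central transformation in a relational-to-document ETL step is folding a one-to-many join into an embedded array. The sketch below illustrates that idea with plain Python rows; it is not MongoSyphon configuration, and the `orders` / `order_items` tables and their columns are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


def rows_to_documents(orders: List[Dict[str, Any]],
                      order_items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Embed each order's child rows as an `items` array."""
    items_by_order: Dict[Any, List[Dict[str, Any]]] = defaultdict(list)
    for item in order_items:
        # The foreign key disappears: nesting expresses the relationship.
        items_by_order[item["order_id"]].append(
            {"sku": item["sku"], "qty": item["qty"]}
        )
    return [
        {
            "_id": order["id"],  # primary key becomes the document _id
            "customer": order["customer"],
            "items": items_by_order.get(order["id"], []),
        }
        for order in orders
    ]
```

A CDC step would then apply later row changes to the same documents, for example by re-running the transformation for the affected order.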

db.SigNarvaez.find({}).explain()
Q&A
