
Architecting the Enterprise Data Cloud
Todd Sylvester | VP Strategy & Operations
THE ENTERPRISE DATA CLOUD COMPANY

We believe that data can make what is impossible today, possible tomorrow

We empower people to transform complex data into clear and actionable insights

We deliver an enterprise data cloud for any data, anywhere, from the Edge to AI

© 2019 Cloudera, Inc. All rights reserved. 2


THE “NEW” CLOUDERA

3,000+
Employees
85
Countries
2,000+
Customers



WHAT ENTERPRISES ARE TELLING US...

Any Cloud Multi-Function Secure & Governed Open



LET’S LOOK AT A REPRESENTATIVE ON-PREM CUSTOMER:

Data Users Workloads



THEIR ROADMAP:

Data Users Workloads



THEIR CURRENT ARCHITECTURE: A MONOCLUSTER

Innovators

Traditional

Businesses

Business Units Data Lake



OPTION 1: EXPAND THE MONOCLUSTER

Innovators

Traditional X

Businesses X
Business Units Data Lake



OPTION 2: CREATE DATA SILOS

[Diagram: separate clusters for Innovators, Businesses, and Traditional]

But this creates many problems:
• Complex ingestion pipeline
• Inconsistent security policies
• No single pane of glass
• Difficult to size clusters
• Difficult to map users to clusters
• …



OPERATE AT CLOUD SCALE
a data platform that manages data, users,
and workloads independently while
providing a shared data experience



ENTERPRISE DATA CLOUD ARCHITECTURE
CLOUDERA DATA PLATFORM

Control plane: Management Console (Identity | Orchestration | Management | Operations)

Analytic experiences: Data Flow & Streaming | Data Engineering | Data Warehouse | Operational Database | Machine Learning

Data Hub & Cloudera Runtime

Data anywhere: Shared Data Experience (Catalog | Schema | Migration | Security | Governance)

Any Cloud infrastructure: Edge | Private | Hybrid | Public | Multi-Cloud



ARCHITECTURE

[Diagram: the Management Console above an Environment that contains Data Hub (DH), DW, and ML clusters running on top of SDX]

Management Console - A single pane of glass to manage one or more environments and the services that run within each environment

Environment - A logical encapsulation of a customer network and the services that run within that network (like an Azure virtual network)

Cluster - A distributed computing service that runs on VMs (Data Hub) or K8s (the experiences) and has access to the shared data lake

SDX - (Shared Data Experience) The data access control layer that sits on top of the backend object store and provides coherent data security and governance for all the applications running within the environment
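The containment relationships described on this slide can be sketched as a small data model. This is only an illustration of the hierarchy (console, environments, clusters, shared SDX policies); the class names, fields, and the `prod-azure` example are hypothetical and do not correspond to any Cloudera API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    name: str
    service: str   # "DH" (Data Hub), "DW", or "ML"
    runs_on: str   # "VMs" for Data Hub, "K8s" for the experiences


@dataclass
class Environment:
    """Hypothetical model of one environment with a shared SDX layer."""
    name: str
    object_store: str
    # SDX is modeled here as one policy table shared by every cluster.
    sdx_policies: Dict[str, List[str]] = field(default_factory=dict)
    clusters: List[Cluster] = field(default_factory=list)

    def add_cluster(self, name: str, service: str) -> Cluster:
        if service not in ("DH", "DW", "ML"):
            raise ValueError(f"unknown service: {service}")
        # Per the slide: Data Hub runs on VMs, the experiences on K8s.
        runs_on = "VMs" if service == "DH" else "K8s"
        cluster = Cluster(name, service, runs_on)
        self.clusters.append(cluster)
        return cluster


@dataclass
class ManagementConsole:
    """Single entry point that tracks one or more environments."""
    environments: Dict[str, Environment] = field(default_factory=dict)

    def register(self, env: Environment) -> None:
        self.environments[env.name] = env

    def inventory(self) -> List[str]:
        return [
            f"{env.name}/{c.name} ({c.service} on {c.runs_on})"
            for env in self.environments.values()
            for c in env.clusters
        ]


console = ManagementConsole()
env = Environment(name="prod-azure", object_store="ADLS")
env.sdx_policies["sales_orders"] = ["analysts"]  # shared by all clusters
env.add_cluster("etl", "DH")
env.add_cluster("reporting", "DW")
console.register(env)
print(console.inventory())
```

Keeping the policy table on the environment rather than on each cluster mirrors the slide's point that security and governance are defined once and apply to every application in the environment.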



MANAGEMENT SERVICES

Management Console - A single pane of glass to manage one or more environments and the services that run within each environment

Data Catalog - A centralized management tool for searching, organizing, securing, and governing data across environments

Workload Manager - A centralized management tool for analyzing and optimizing workloads within and across environments

Replication Manager - A centralized management tool for replicating and migrating data and the associated metadata across environments



COMPUTE SERVICES

Data Hub - A service for creating general purpose clusters that enable developers to create custom business applications

Data Warehouse - A service for creating self-service data warehouses (and the underlying compute clusters) for teams of business analysts

Machine Learning - A service for creating self-service machine learning workspaces (and the underlying compute clusters) for teams of data scientists

[more to come over time…]



SETUP

1. Log in to the Management Console cloud service, or install the downloadable version of the Management Console

2. Connect the Management Console to each of your Environments (e.g. on premises, on AWS, on Azure, on GCP)

3. Acquire and organize the data within each Environment into the Data Lake, using the Management Console

4. Create an isolated cluster for each discrete team or application that requires access to the Data Lake, using the Management Console

5. Configure authorization policies and invite additional users to access the platform
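The five setup steps can be walked through with a toy, in-memory stand-in for the console. Every class, method, and example name here (`ToyConsole`, `connect`, `grant`, `aws-prod`, and so on) is an assumption made for illustration; the real product is driven through its own UI and tooling, which this sketch does not attempt to reproduce.

```python
from typing import Dict, Set


class ToyEnvironment:
    """Hypothetical environment: a data lake, clusters, and grants."""

    def __init__(self, name: str, location: str) -> None:
        self.name = name
        self.location = location
        self.data_lake: Set[str] = set()
        self.clusters: Dict[str, str] = {}     # cluster name -> owning team
        self.grants: Dict[str, Set[str]] = {}  # user -> readable datasets


class ToyConsole:
    """Hypothetical console that applies the setup steps in order."""

    def __init__(self) -> None:
        self.environments: Dict[str, ToyEnvironment] = {}

    def connect(self, name: str, location: str) -> ToyEnvironment:
        # Step 2: connect the console to an environment.
        env = ToyEnvironment(name, location)
        self.environments[name] = env
        return env

    def add_dataset(self, env_name: str, dataset: str) -> None:
        # Step 3: organize data into the environment's data lake.
        self.environments[env_name].data_lake.add(dataset)

    def create_cluster(self, env_name: str, cluster: str, team: str) -> None:
        # Step 4: one isolated cluster per team or application.
        env = self.environments[env_name]
        if cluster in env.clusters:
            raise ValueError(f"cluster already exists: {cluster}")
        env.clusters[cluster] = team

    def grant(self, env_name: str, user: str, dataset: str) -> None:
        # Step 5: authorization policy, checked against the data lake.
        env = self.environments[env_name]
        if dataset not in env.data_lake:
            raise KeyError(f"dataset not in data lake: {dataset}")
        env.grants.setdefault(user, set()).add(dataset)

    def can_read(self, env_name: str, user: str, dataset: str) -> bool:
        return dataset in self.environments[env_name].grants.get(user, set())


def run_setup() -> ToyConsole:
    console = ToyConsole()                      # Step 1: open the console
    console.connect("on-prem", "data center")   # Step 2
    console.connect("aws-prod", "AWS")
    console.add_dataset("aws-prod", "clickstream")              # Step 3
    console.create_cluster("aws-prod", "marketing-dw", "marketing")  # Step 4
    console.grant("aws-prod", "alice", "clickstream")           # Step 5
    return console


console = run_setup()
print(console.can_read("aws-prod", "alice", "clickstream"))
```

Because grants live on the environment rather than on a cluster, a user's access does not change when a team's cluster is added or removed, which is the separation of data, users, and workloads the deck argues for.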



HYBRID, MULTI-CLOUD ARCHITECTURE

Management Console: Data Catalog | Workload Manager | Replication Manager

Data Center (monocluster, bare metal, no containers)
  Compute: Spark, Hive, Impala, HBase, ...
  SDX: backed by HDFS

Private Cloud (separate storage / compute, containers)
  Compute: DataHub (on VMs), DWX (on K8s), MLX (on K8s)
  SDX: backed by HDFS / Ozone

Public Cloud (separate storage / compute, containers)
  Compute: DataHub (on VMs), DWX (on K8s), MLX (on K8s)
  SDX: backed by S3 / ADLS / GCS
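The three deployment columns on this slide can be written down as a lookup table. The sketch below only restates what the slide shows; the dictionary keys and helper function names are made up for illustration and are not configuration keys from any product.

```python
from typing import Dict, List

# Restates the slide: where each service runs and what backs SDX.
# Key names ("data_center", "private_cloud", ...) are illustrative only.
DEPLOYMENTS: Dict[str, dict] = {
    "data_center": {
        "layout": "monocluster, bare metal, no containers",
        "compute": {},  # Spark, Hive, Impala, HBase, ... share one cluster
        "sdx_storage": ["HDFS"],
    },
    "private_cloud": {
        "layout": "separate storage / compute, containers",
        "compute": {"DataHub": "VMs", "DWX": "K8s", "MLX": "K8s"},
        "sdx_storage": ["HDFS", "Ozone"],
    },
    "public_cloud": {
        "layout": "separate storage / compute, containers",
        "compute": {"DataHub": "VMs", "DWX": "K8s", "MLX": "K8s"},
        "sdx_storage": ["S3", "ADLS", "GCS"],
    },
}


def runtime_for(deployment: str, service: str) -> str:
    """Return the runtime ("VMs" or "K8s") a service uses in a deployment."""
    compute = DEPLOYMENTS[deployment]["compute"]
    if service not in compute:
        raise ValueError(f"{service} is not offered in {deployment}")
    return compute[service]


def storage_for(deployment: str) -> List[str]:
    """Return the storage systems that back SDX in a deployment."""
    return list(DEPLOYMENTS[deployment]["sdx_storage"])


print(runtime_for("public_cloud", "DWX"))
print(storage_for("private_cloud"))
```

The data center entry has no per-service runtimes because the slide describes it as a single bare-metal monocluster rather than separate Data Hub, DWX, and MLX services.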



WHY AN ENTERPRISE DATA CLOUD ARCHITECTURE?

Hybrid, Multi-Cloud
• Move data and applications without rewriting and retraining
• Separate data management strategy from infrastructure strategy
• Manage all environments from a single pane of glass

Multi-Function
• Deploy one platform to address current and future workload needs
• Connect disparate workload types to develop Edge2AI applications on one platform

Secure & Governed
• Manage data security and governance centrally
• Automate application security at all layers
• Reduce time to value with enterprise-grade productivity tools

Open
• Reduce vendor lock-in with 100% open source platform
• Extend enterprise data cloud architecture with 3rd party applications



THE ENTERPRISE DATA CLOUD COMPANY

