
Azure Databricks

Virtual Workshop
Unified Data Analytics
Agenda

2:00 - 2:10 PM Azure Databricks Opening Remarks

2:10 - 2:25 PM Unifying Data Science, Business Analytics and Data Engineering

2:25 - 2:45 PM Customer Use Case

2:45 - 3:05 PM Data Analytics Interactive Demo

3:05 - 3:15 PM Q&A


▪ Azure Databricks is a first-party service on Azure.
▪ Azure Databricks is integrated seamlessly with Azure services.
▪ Eliminates the need to create a separate account and contract with Databricks.
Unified Data Analytics Platform
Unified data analytics platform for accelerating innovation across
data science, data engineering, and business analytics

Global company with 5,000 customers and 450+ partners


Original creators of popular data and machine learning open source projects
Thousands of Customers: Data-Driven Innovations
Healthcare and Life Sciences | Financial Services | Media and Entertainment | Retail, CPG, and eCommerce
Energy & Utilities | Enterprise Technology | Manufacturing & Automotive | Telecom


These companies combined their massive data with ML and BI capabilities to unlock business value

[Diagram: MACHINE LEARNING and BUSINESS ANALYTICS built on DATA]

But most organizations fail to unlock business value due to data, technology and people silos

[Diagram of a fragmented stack. Labels: MACHINE LEARNING, BUSINESS ANALYTICS, PRODUCTION DEPLOYMENT, Azure Machine Learning, NOTEBOOKS & IDEs, MODEL MANAGEMENT, EXPERIMENT TRACKING, DATA-MARTS, DATA VALIDATION, TABLES, KEY/VALUE STORE, DATA WAREHOUSE, REPROCESSING, ETL JOBS, STREAM, BATCH, UPDATE AND MERGE]
STREAMS: Kafka | Azure Event Hub
DATA LAKES: Azure Blob | ADLS | HDFS
NoSQL: Mongo | Azure Cosmos
Four major challenges companies face
to unlock business value

1) Data is messy, siloed and slow
2) ML is hard, production is harder
3) BI is limited to a fraction of data
4) Lack of enterprise readiness: fragmented security, poor reliability, disjointed governance
Make all your data ready for BI and ML

Unified Data Service


Build open, reliable, fast data lakes with all your data
Unify data and ML across the full lifecycle

ML is hard,
Production is harder
Enable BI directly on all your source data

BI Integrations for data lake

All your data with high quality and great performance


Cloud native platform for enterprise grade solution

Azure Databricks Enterprise Cloud Service

1000's of users: Data Engineers, ML Engineers, Data Scientists, Data Analysts
All your data: Azure Data Lake Storage, Azure Blob Storage

▪ Data Science, ML, and BI on one cloud platform
▪ Access all business and big data in open data lake
▪ Securely integrates with your cloud ecosystem

DATA SCIENCE WORKSPACE: Collaboration across the lifecycle
BI & DATA WAREHOUSE INTEGRATIONS
UNIFIED DATA SERVICE: High quality data with great performance
ENTERPRISE CLOUD SERVICE: A simple, scalable, and secure managed service
BIG DATA & BUSINESS DATA

First-Party Service, Natively Integrated into Azure
Integrated Data Services

Azure Security | Azure Portal | Azure DevOps


Example Architecture
Stages: Ingest | Store | Prep and Train | Serve

▪ Data sources: Logs (unstructured), Media (unstructured), Files (unstructured), Business/custom apps (structured)
▪ Ingest: Azure Event Hub, Azure IoT Hub, Kafka, Azure Data Factory
▪ Store: Azure Data Lake Storage
▪ Serve: Operational Databases, Ad-hoc Analysis, Power BI, Azure Synapse Data Warehouse
Unifying Data, AI and People
▪ AI: One platform for data science, ML, and analytics
▪ Data: One reliable and scalable data lake for all analytics
▪ People: One collaborative workspace for data and ML teams
Customer Use Case
Strategic Partnering with Databricks
[Diagram segments: Services, Support, Training, Customer Success, Product, Partners, Advisor]

Professional Services
▪ Consulting, IP and Accelerators to deliver successful projects and reduce time to value

Strategic Support
▪ Dedicated Support Engineers with use case and context awareness
▪ Direct engineer access

Training & Certification
▪ Public and Private training & access to Databricks Academy

Customer Success Engineer
▪ Customer Backlog & case control
▪ Escalation route
▪ Mgt Cadence/QBRs

Product Alignment
▪ Roadmap acceleration and influence

DB Evangelisation
▪ Lunch & learns
▪ Hackathons (use case or feature)
▪ POC support

Resident Solution Architect
▪ Architectural direction, design
▪ Trusted authority
▪ ML & AI authority
Data-driven innovations across industries

Healthcare | Financial services | Retail/CPG | Media/entertainment


Retail and Consumer Goods Solution

Create a Consumer-Driven Supply Chain with a Unified Approach to Data Analytics

Rob Saker, Global Industry Lead for Retail & Consumer Goods
Consumer Behavior is Changing Supply Networks

Customers want what they want, when and where they want it

▪ DIRECT TO CONSUMER PERSONALIZATION: 29.6% of Nike's revenue is direct to consumer and personalized
▪ CUSTOMERS ENGAGE FROM ANYWHERE: Adobe says $3B of the $9.4B Cyber Monday 2019 sales was mobile
▪ CONVENIENCE IS KEY: 50% of restaurant food is consumed away from the restaurant



Your Supply Chain Needs to be
Consumer-Driven
Create a Consumer-Driven Supply Chain
Predictive analytics use cases that empower store managers

GRANULAR DEMAND FORECASTING | DYNAMIC INVENTORY
• Localized Demand Forecast
• Returns Prediction
• Optimal Safety Stock
• SKU Rationalization
• Inventory Allocation

RESPONSIVE FULFILMENT
• Real-time On Shelf Availability
• Freight and Logistics Optimization
• Last mile delivery
Aggregate Level Analyses are Problematic

Promo Group | Market Area | Week

• Traditional analysis tools produce aggregate-level analyses (by week, promo group or market area) and then allocate demand to stores, SKUs and days using basic weighting.
• This is fundamentally flawed. It assumes that the demand curve for each store, SKU and day resembles the aggregate curve, with only the quantity changing.
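
To make the flaw concrete, here is a minimal Python sketch with made-up numbers. The two stores, their daily sales and the helper names are all hypothetical and not from the workshop. It allocates an aggregate weekly curve to each store by its share of volume, which is one form of basic weighting, and measures how far the allocated curve is from what each store actually sold.

```python
# Hypothetical daily unit sales (Mon..Sun) for one SKU in two stores.
actual = {
    "store_A": [10, 10, 10, 10, 10, 25, 25],  # weekend-heavy demand
    "store_B": [20, 20, 20, 20, 20, 0, 0],    # weekday-only demand
}

# Aggregate (market-area) demand curve by day, and its weekly total.
aggregate = [sum(day) for day in zip(*actual.values())]
total = sum(aggregate)


def allocate(aggregate_curve, store_share):
    """Spread the aggregate daily curve to one store using a fixed weight."""
    return [units * store_share for units in aggregate_curve]


def abs_error(forecast, observed):
    """Sum of absolute day-level differences between two curves."""
    return sum(abs(f - o) for f, o in zip(forecast, observed))


allocated = {}
allocation_error = {}
for store, observed in actual.items():
    share = sum(observed) / total  # the store's share of weekly volume
    allocated[store] = allocate(aggregate, share)
    allocation_error[store] = abs_error(allocated[store], observed)

for store in actual:
    print(store, allocated[store], "abs error:", allocation_error[store])
```

In this toy data each store's allocated weekly total matches its real total of 100 units, yet the absolute day-level error sums to 50 units per store, because both stores inherit the shape of the aggregate curve rather than their own.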
Why can't traditional tools perform fine-grained forecasts?
Current tools vs. Azure Databricks
Fine-Grained Analysis Enables Higher Accuracy
Promo Group | Market Area | Week

• Stores, SKUs and days all behave and interact differently.
• Fine-grained forecasts enable us to localize models to incorporate local causals (weather, promotions, consumer behaviors).
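
As a rough illustration of localizing a model with a causal variable, the sketch below fits one small least-squares line per store, relating units sold to daily temperature. The store names, the history and the choice of a straight-line model are assumptions made for this example only; the workshop does not prescribe a model.

```python
from statistics import mean

# Hypothetical history for one SKU: (store, daily high in °C, units sold).
history = [
    ("beach_store", 20, 30), ("beach_store", 25, 40), ("beach_store", 30, 50),
    ("mall_store", 20, 22), ("mall_store", 25, 21), ("mall_store", 30, 20),
]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def fit_local_models(rows):
    """Group rows by store and fit one model per store."""
    by_store = {}
    for store, temp, units in rows:
        temps, sales = by_store.setdefault(store, ([], []))
        temps.append(temp)
        sales.append(units)
    return {store: fit_line(temps, sales)
            for store, (temps, sales) in by_store.items()}


models = fit_local_models(history)


def forecast(store, temp):
    """Predict units for a store at a given temperature."""
    slope, intercept = models[store]
    return slope * temp + intercept


for store in models:
    print(store, "forecast at 35 °C:", round(forecast(store, 35), 1))
```

Here the hypothetical beach store sells more as the temperature rises while the mall store barely reacts, a difference that a single pooled model would blur.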
Fine-grained forecasting captures the difference

• With fine-grained forecasting, we identify the demand and depletion curve by day,
store and SKU.
• The difference between allocation and fine-grained forecasting can lead to a 10%+
improvement in forecast accuracy.
Emerging Data is Driving Consumer-Driven Businesses
[Diagram: data sources]
▪ Streaming data
▪ Mobile data
▪ Geospatial data
▪ Shipment data
▪ Inventory data
▪ POS data
▪ Video data
▪ Weather data
▪ Competitor SKU data
▪ Batch POS data
Azure Databricks Advantages
Capability | Traditional Analysis Suites | Databricks
Fine-grained forecasting | Aggregate level | Day or hourly, store & SKU
Real-time data | No | Streaming data
Custom causal data | Limited | Integrate weather, online & mobile location, digital and more
Multi-modal data for training | No | Structured, unstructured, image, video, sensor data
Localize models for greater accuracy | No | Yes
Push predictions to the edge | No | Yes



Using Data and ML effectively is challenging

LIMITED REAL-TIME AND CAUSAL DATA
Omnichannel engagement with consumers is making real-time mobile, IoT, and other causal data more available and important.

LARGE VOLUMES OF RAPIDLY CHANGING DATA
Retail and manufacturing data is constantly shifting, being restated and changing, e.g. revised data to account for returns.

FORECASTING NOT ABLE TO SCALE TO FINE GRAIN
Companies are making tradeoffs with traditional EDW-based tools as they're unable to complete detailed analysis at an atomic level.

NOT EASY TO GET TO ACTIONABLE INSIGHTS
Store/Distribution managers receive summarized data from data warehouses the next day, making many time-sensitive insights unactionable.
Azure Databricks Unified Data Analytics Solves These Challenges

USE REAL-TIME AND CAUSAL DATA
Single streamlined pipeline for real-time and streaming data with Delta Lake and Apache Spark™

KEEP UP WITH CHANGING DATA
Azure Delta enables companies to seamlessly manage rapidly changing data with ACID compliance and full integration with Azure security, while greatly improving query performance

DO GRANULAR AND ACCURATE FORECASTS
Use and track 100s of ML models to forecast demand by day/store/SKU using MLflow

ACTIONABLE AND EASY INSIGHTS FOR MANAGERS
Power BI natively integrates to Azure Delta, enabling front-line users to access analytic data as it is generated. It turns Power BI into a next-generation real-time analytic powerhouse.
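
To picture what managing restated data involves, the plain-Python sketch below shows upsert (merge) semantics on a tiny sales table keyed by store, SKU and day. It is only an illustration of the idea: it does not use the Delta Lake API, offers no transactional guarantees, and every name and value in it is hypothetical.

```python
def merge_upsert(target, updates, key_fields):
    """Return a new table: update rows replace target rows with the same key,
    and rows with unseen keys are appended. Inputs are left unmodified."""
    merged = {tuple(row[f] for f in key_fields): row for row in target}
    for row in updates:
        merged[tuple(row[f] for f in key_fields)] = row
    return list(merged.values())


# Hypothetical sales already loaded into the table.
sales = [
    {"store": "S1", "sku": "muffin", "day": "2020-03-01", "units": 40},
    {"store": "S1", "sku": "muffin", "day": "2020-03-02", "units": 35},
]

# A later feed restates one day (to account for returns) and adds a new day.
restated = [
    {"store": "S1", "sku": "muffin", "day": "2020-03-01", "units": 37},
    {"store": "S1", "sku": "muffin", "day": "2020-03-03", "units": 42},
]

current = merge_upsert(sales, restated, key_fields=("store", "sku", "day"))
for row in current:
    print(row["day"], row["units"])
```

The restated day is corrected in place and the new day is appended, so downstream forecasts read one consistent version of each store/SKU/day record.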
What You Need for Consumer Driven Supply Chain

▪ Quickly identify, load and integrate multi-modal, batch and streaming data from internal and consumer-facing sources
▪ Rapidly build and localize analytic models
▪ Deliver fine-grained analysis at scale within service level windows
Unified Data Analytics for Supply Chain

Data inputs: inventory, consumer, geo-location, IoT, competitor, POS, pricing, mobile app, shipment, web traffic
Outcomes: Forecast Demand | Single View of Supply Chain | Competitive Fulfillment
Use Case Maturity Model: Consumer-Driven Supply Chain
Typical Data Sets
▪ Streaming Data: Social, Location/IP, Clickstream
▪ Batch Data: POS, Inventory, Marketing, Billing & Payment, Service/Call Center, Shipment, Entitlement/Identity, Competitor, Third Party Research, Product Catalog

Use cases, ordered from starter to advanced (vertical axis: Innovation and Business Value; horizontal axis: Maturity and Ability to Execute)
▪ Demand Forecast: Create SKU/Store/Day level demand forecasts
▪ Return Forecasts: Predict when and where merchandise will be returned
▪ Safety Stock: Plan how much inventory you want at each location by SKU by day
▪ Ordering Plan: Understand exactly how much of each SKU needs to be ordered and when
▪ Shipment Plan: Create a plan for the fastest, most optimal way to get product into customer hands
▪ Promotion Plan: Optimize inventory and customer satisfaction further with clear forecasts
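
For the Safety Stock use case, one common textbook starting point (an assumption of this sketch, not a formula from the workshop) is safety stock = z × σ_d × √L, where z is the normal quantile for the target service level, σ_d is the standard deviation of daily demand and L is the replenishment lead time in days. It assumes roughly normal, independent daily demand and a fixed lead time. The demand history, lead time and service level below are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, stdev


def safety_stock(daily_demand, lead_time_days, service_level=0.95):
    """Textbook safety stock: z * sigma_d * sqrt(L).

    Assumes independent, roughly normal daily demand and a fixed lead time.
    """
    z = NormalDist().inv_cdf(service_level)  # quantile for the service level
    return z * stdev(daily_demand) * sqrt(lead_time_days)


# Hypothetical daily unit sales per (store, SKU).
history = {
    ("store_1", "sku_A"): [12, 15, 9, 14, 10, 13, 11],
    ("store_2", "sku_A"): [40, 22, 35, 18, 44, 27, 31],
}

# One safety stock figure per store and SKU, with a 4-day lead time.
plan = {key: safety_stock(demand, lead_time_days=4)
        for key, demand in history.items()}

for (store, sku), units in plan.items():
    print(store, sku, round(units, 1))
```

Computing the figure per store and SKU, rather than once at an aggregate level, lets locations with volatile demand carry more buffer than stable ones.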


Top Big-Box Club Retailer

Use Case: Fresh Food Forecasting
This Big-box Club has more than 590 stores with unique demand patterns. They needed to more accurately predict demand for fresh food (e.g. muffins, rotisserie chicken) at a local level to reduce fresh food waste.

Why Databricks:
● Collaborative workspaces improved productivity across 10+ workspaces, 100+ users, 1000+ notebooks
● Delta Lake significantly reduced query times compared to legacy Teradata and Hadoop, enabling real-time forecasts

Impact:
● 10% improvement in fresh food forecast accuracy
● Fresh processing analysis went from 7 hours to 40 minutes
● Reduced infrastructure costs by $900K
Top Consumer Appliance Manufacturer

Use Case: Optimizing Sales Forecasts
Company sells 60M products a year. Their goal is to predict and manufacture the exact quantity customers demand and anticipate where to stock products across 400 global distribution centers.

Why Databricks:
● Ability to process large volumes of diverse data including product pricing, ratings and promotions.
● Iterate on and track 1000s of forecasting models using Facebook Prophet library.

Impact:
● 3x improvement in forecast accuracy for promotion sales through localized models.
● Forecasting customer demand is now delivered three months in advance
Key Takeaways
Supply Chain Optimization is a top priority for Retail and Consumer Goods right now. Azure Databricks enables them to:

1) Perform fine-grained analysis within service windows: Avoid sacrificing the depth or breadth of analysis, and realize big improvements in accuracy.
2) Solve current needs, prepare for the future: Quickly solve current challenges, while leveraging a platform that can grow to future requirements such as streaming and multi-modal data types.
3) Focus on speed to value: Generate positive business value in weeks and months with the Unified Data Analytics platform.
Demo
Resources
§ Databricks Blog
§ Microsoft Blog
§ Documentation
§ Knowledge Base
§ Frequently Asked Questions
§ Reference Architectures
▪ Modern Data Warehouse
▪ Real-Time Analytics
▪ Advanced Analytics
§ Guides
▪ Apache Spark
▪ Best Practices
▪ Delta Lake
▪ Machine Learning
▪ MLflow
▪ Structured Streaming
§ Tutorials
▪ ETL into Synapse
▪ Connect Cosmos DB
▪ Stream Data Using Event Hubs
§ Learning Paths
▪ Perform Data Engineering With ADB
▪ Extract Knowledge & Insights
§ Databricks Academy
§ Connector to IDEs
§ Example Notebooks
§ Free eBooks
