Sunteți pe pagina 1din 21

Module 2

Planning Data Warehouse


Infrastructure
Module Overview

Considerations for Data Warehouse Infrastructure


Planning Data Warehouse Hardware
Lesson 1: Considerations for Data Warehouse
Infrastructure

System Sizing Considerations


Data Warehouse Workloads
Typical Server Topologies for a BI Solution
Scaling-out a BI Solution
Planning for High Availability
System Sizing Considerations

Data Volume Analysis/Report Complexity

Number of Users Availability Requirements


Data Warehouse Workloads

Operations and
Maintenance

Data
Modelling

Data Warehouse

ETL
Processes
Reporting
Typical Server Topologies for a BI Solution

DW

Single Server Architecture Distributed Architecture

Few Servers Many

Hardware costs
Software license costs
Configuration complexity
Scalability & Performance
Flexibility
Scaling-out a BI Solution

Data Warehouse Analysis Services

Integration Services Reporting Services


Planning for High Availability

Data Warehouse Analysis Services


AlwaysOn Failover Cluster AlwaysOn Failover Cluster
RAID Storage

Integration Services Reporting Services


AlwaysOn Availability Group NLB Report Servers
AlwaysOn Availability Group/Failover Cluster
Lesson 2: Planning Data Warehouse Hardware

SQL Server Fast Track Reference Architectures


Core-Balanced System Architecture
Demonstration: Calculating Maximum
Consumption Rate
Determining Processor and Memory Requirements
Determining Storage Requirements
Considerations for Storage Hardware
SQL Server Data Warehouse Appliances
SQL Server Parallel Data Warehouse
SQL Server Fast Track Reference Architectures

Pre-tested and approved


hardware specifications and
guidance
Available from multiple
hardware vendors in
partnership with Microsoft
Support for a range of data
warehouse sizes
Tools provided to calculate
required specification
Core-Balanced System Architecture
Per-Core MCR = 200 MB/s 2 x FC Port per processor
Total MCR = 1600 MB/s Max I/O Rate = 2000 MB/s

Server Storage Enclosure

Storage
Processors

SQL Server 4-Spindle RAID 10 Disk Groups

Fiber Switch
Storage Enclosure

Windows Server Storage


Processors
4-Spindle RAID 10 Disk Groups
Quad Dual Port FC HBA
Core
CPU Storage Enclosure
Dual Port FC HBA
Quad
Core Storage
CPU Dual Port FC HBA Processors
4-Spindle RAID 10 Disk Groups

Max I/O Rate = 2000 MB/s Max I/O Rate = 1800 MB/s
Demonstration: Calculating Maximum
Consumption Rate

In this demonstration, you will see how to:


Create tables for benchmark queries
Execute a query to retrieve I/O statistics
Calculate MCR from the I/O statistics
Determining Processor and Memory
Requirements

Estimating CPU Requirements:


Determine core MCR
Apply formula to estimate required
number of cores:
((Average query size in MB/ MCR) x Concurrent users) / Target response time

Spread cores across CPUs based on the


number of storage arrays

Estimating RAM Requirements:


Use a minimum of 4 GB per core
(or 64 - 128 GB per socket)
Target 20% of data volume
Determining Storage Requirements
Data Warehouse
Determine initial data volume
Number of fact table rows x row size
Use 100 bytes per row as an estimate if unknown
Add 30-40% for dimensions and indexes
Project data growth
Number of new fact rows per month
Factor in compression
Typically 3:1
Other storage requirements
Configuration databases
Log files
TempDB
Staging tables
Backups
Analysis Services models
Considerations for Storage Hardware

Use more smaller disks instead of


fewer larger disks
Use the fastest disks you can afford
Consider solid state disksespecially for random I/O
Use RAID 10, or minimally RAID 5
Consider a dedicated storage area network for
manageability and extensibility
Balance I/O across enclosures, storage processors, and
disk groups
SQL Server Data Warehouse Appliances

Pre-built hardware and software solutions based


on tested configurations
Part of a range of SQL Server-based appliances
Available from multiple hardware vendors
SQL Server Parallel Data Warehouse

Special SQL Server Edition only available in


hardware appliances
Massively parallel processing
Shared-nothing architecture
Dedicated control nodes, compute nodes, and
storage nodes
Control Node
Cluster Database Servers

Dual Fiber Channel


(Compute Nodes)
Management
InfiniBand

Servers
Landing Zone
(ETL Interface)
Storage Arrays
Backup Nodes
Lab: Planning Data Warehouse Infrastructure

Exercise 1: Planning Data Warehouse Hardware

Logon Information
Virtual Machine: 20463C-MIA-SQL
Use Name: ADVENTUREWORKS\Student
Password: Pa$$w0rd

Estimated Time: 30 Minutes


Lab Scenario

You are planning a data warehouse solution for


Adventure Works Cycles, and have been asked to
specify the hardware required. You must design a
SQL Server-based solution that provides the right
balance of functionality, performance, and cost.
Lab Review

Review DW Hardware Spec.xlsx in the


D:\Labfiles\Lab02\Solution folder. How does the
hardware specification in this workbook compare
to the one you created in the lab?
Module Review and Takeaways

Review Question(s)

S-ar putea să vă placă și