Exadata Workshop Part1

Exadata and Database Machine
Administration Workshop
Student Guide
D67016GC20
Edition 2.0
January 2011
D71669
Copyright © 2010, Oracle and/or its affiliates. All rights reserved.

Authors
Peter Fusek
Jean-Francois Verrier
Mark Fuller
Dave Winter

Technical Contributors and Reviewers
Andrew Babb, Bharat Baddepudi, Maria Billings, Robert Carlin, Michael Cebulla,
Nilesh Choudhury, Christian Craft, Ravindra Dani, Aslam Edah-Tally, Boris Erlikhman,
Amit Ganesh, Ed Gilowski, Joel Goodman, Scott Gossett, Jim Hall, Roger Hansen, James He,
David Hitchcock, Bill Hodak, Vimala Jacob, Martin Jensen, Kevin Jernigan, Caroline Johnston,
Larry Justice, Vikram Kapoor, Bruce Kyro, Sumeet Lahorani, Sue Lee, Juan Loaiza,
Barb Lundhild, Varun Malhotra, Louis Nagode, Dan Norris, Michael Nowak, Sriram Palapudi,
Umesh Panchaksharaiah, Sugam Pandey, Robert Pastijn, Marshall Presser, Georg Schmidt,
Akshay Shah, Kam Shergill, Tim Shelter, Eric Siglin, Sundararaman Sridharan, Vijay Sridharan,
Mahesh Subramaniam, Lawrence To, Alex Tsukerman, Kodi Umamageswaran, Douglas Utzig,
Harald van Breederode, Mark Van de Wiel, Dave Winter

Publishers
Sujatha Nagendra
Giri Venugopal

Disclaimer
This document contains proprietary information and is protected by copyright and other
intellectual property laws. You may copy and print this document solely for your own use in
an Oracle training course. The document may not be modified or altered in any way. Except
where your use constitutes "fair use" under copyright law, you may not use, share, download,
upload, copy, print, display, perform, reproduce, publish, license, post, transmit, or
distribute this document in whole or in part without the express authorization of Oracle.
The information contained in this document is subject to change without notice. If you find
any problems in the document, please report them in writing to: Oracle University, 500 Oracle
Parkway, Redwood Shores, California 94065 USA. This document is not warranted to be
error-free.

Restricted Rights Notice
If this documentation is delivered to the United States Government or anyone using the
documentation on behalf of the United States Government, the following notice is applicable:
U.S. GOVERNMENT RIGHTS
The U.S. Government’s rights to use, modify, reproduce, release, perform, display, or
disclose these training materials are restricted by the terms of the applicable Oracle
license agreement and/or the applicable U.S. Government contract.

Trademark Notice
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may
be trademarks of their respective owners.
Contents
1 Introduction
Course Objectives 1-2
Audience and Prerequisites 1-3
Course Scope 1-4
Course Contents 1-5
Terminology 1-6
Additional Resources 1-7
Practice 1 Overview: Introducing the Laboratory Environment 1-8
2 Exadata Overview
Objectives 2-2
Traditional Enterprise Database Storage Deployment 2-3
Exadata Storage Deployment 2-4
Exadata Implementation Architecture Overview 2-6
Introducing Exadata 2-7
Exadata Hardware Details (Sun Fire X4270 M2) 2-8
Exadata Specifications 2-9
InfiniBand Network 2-10
Classic Database I/O and SQL Processing Model 2-11
Exadata Smart Scan Model 2-12
Exadata Smart Storage Capabilities 2-13
Exadata Smart Scan Scale-Out Example 2-16
Exadata Hybrid Columnar Compression 2-19
Exadata Hybrid Columnar Compression Architecture Overview 2-20
Exadata Smart Flash Cache 2-21
Exadata Storage Index 2-23
Storage Index with Partitions Example 2-25
Database File System 2-26
I/O Resource Management 2-27
Benefits Multiply 2-28
Exadata Key Benefits for Data Warehousing 2-29
Exadata Key Benefits for OLTP 2-31
Quiz 2-32
Summary 2-34
Practice 2 Overview: Introducing Exadata Features 2-36
3 Exadata Architecture
Objectives 3-2
Exadata Software Architecture Overview 3-3
Exadata Software Architecture Details 3-5
Exadata Smart Flash Cache Architecture 3-7
Exadata Monitoring Architecture 3-9
Disk Storage Entities and Relationships 3-10
Interleaved Grid Disks 3-12
Flash Storage Entities and Relationships 3-13
Disk Group Configuration 3-14
Quiz 3-15
Summary 3-17
Practice 3 Overview: Introducing Exadata Cell Architecture 3-19
4 Exadata Configuration
Objectives 4-2
Exadata Installation and Configuration Overview 4-3
Initial Network Preparation 4-4
Configuration of New Exadata Servers 4-6
Answering Questions During the Initial Boot Sequence 4-7
Exadata Administrative User Accounts 4-11
Configuring a New Exadata Cell 4-12
Important I/O Metrics for Oracle Databases 4-13
Testing Performance Using CALIBRATE 4-14
Configuring the Exadata Cell Server Software 4-15
Creating Cell Disks 4-16
Creating Grid Disks 4-17
Creating Flash-Based Grid Disks 4-18
Configuring Hosts to Access Exadata Cells 4-19
Configuring ASM and Database Instances for Exadata 4-20
Configuring ASM Disk Groups for Exadata 4-21
Optional Configuration Tasks 4-22
Exadata Storage Security Overview 4-23
Exadata Storage Security Implementation 4-24
Quiz 4-26
Summary 4-29
Practice 4 Overview: Configuring Exadata 4-31
5 Exadata Performance Monitoring and Maintenance

Objectives 5-2
Monitoring Overview 5-3
Exadata Metrics and Alerts Architecture 5-4
Monitoring Exadata with Metrics 5-6
Monitoring Exadata with Metrics: Example 5-8
Monitoring Exadata with Alerts 5-9
Displaying Alert Examples 5-11
Monitoring Exadata with Active Requests 5-13
Monitoring SQL Execution Plans 5-14
Smart Scan Execution Plan Example 5-15
Predicate Offloading Considerations 5-16
Monitoring Exadata from Your Database 5-17
Monitoring Exadata with Wait Events 5-18
Monitoring Exadata with Enterprise Manager 5-19
Additional Monitoring Tools and Utilities 5-20
Cell Maintenance Overview 5-21
Automated Cell Maintenance Operations 5-23
Replacing a Damaged Physical Disk 5-24
Replacing a Damaged Flash Card 5-26
Moving All Disks from One Cell to Another 5-27
Using the Exadata Software Rescue Procedure 5-28
Quiz 5-30
Summary 5-32
Practice 5 Overview: Monitoring Exadata 5-34
6 Exadata and I/O Resource Management

Objectives 6-2
I/O Resource Management Overview 6-3
I/O Resource Management Concepts 6-5
I/O Resource Management Plans 6-6
IORM Architecture 6-7
I/O Resource Management Plans Example 6-8
Enabling Intradatabase Resource Management 6-11
Intradatabase Plan Example 6-12
Enabling IORM for Multiple Databases 6-13
Interdatabase Plan Example 6-14
Category Plan Example 6-16
Complete Example 6-17
Using Database I/Os Metrics 6-20
Quiz 6-21
Summary 6-25
7 Optimizing Database Performance with Exadata

Objectives 7-2
Optimizing Performance 7-3
Flash Memory Usage 7-4
Compression Usage 7-6
Index Usage 7-8
ASM Allocation Unit Size 7-9
Minimum Extent Size 7-10
Quiz 7-11
Summary 7-13
Practice 7 Overview: Optimizing Database Performance with Exadata 7-15
8 Database Machine Overview and Architecture

Objectives 8-2
Introducing Database Machine 8-3
Database Machine X2-2 Full Rack 8-4
X2-2 Database Server Hardware Details (Sun Fire X4170 M2) 8-5
Start Small and Grow 8-6
Database Machine X2-8 Full Rack 8-7
X2-8 Database Server Hardware Details (Sun Fire X4800) 8-8
Database Machine Capacity 8-9
Database Machine Performance 8-10
Database Machine X2-2 Architecture 8-11
InfiniBand Network Architecture 8-13
X2-2 Leaf Switch Topology 8-14
Full Rack Spine and Leaf Topology 8-15
Scale Performance and Capacity 8-16
Scaling Out to Multiple Full Racks 8-17
Quiz 8-18
Summary 8-20
9 Database Machine Configuration
Objectives 9-2
Database Machine Implementation Overview 9-3
Configuration Worksheet Overview 9-5
Getting Started 9-6
Configuration Worksheet Example 9-7
Configuring ASM Disk Groups with Configuration Worksheet 9-11
Generating the Configuration Files 9-13
Other Pre-Installation Tasks 9-14
The Result After Installation and Configuration 9-15
Supported Additional Configuration Activities 9-17
Unsupported Configuration Activities 9-18
Quiz 9-20
Summary 9-22
10 Migrating Databases to Database Machine

Objectives 10-2
Migration Best Practices Overview 10-3
Performing Capacity Planning 10-4
Database Machine Migration Considerations 10-5
Choosing the Right Migration Path 10-6
Logical Migration Approaches 10-7
Physical Migration Approaches 10-9
Other Approaches 10-11
Post-Migration Best Practices 10-12
Quiz 10-13
Summary 10-15
Practice 10 Overview: Migrating to Database Machine using Transportable
Tablespaces 10-18
11 Bulk Data Loading with Database Machine

Objectives 11-2
Bulk Data Loading Overview 11-3
Preparing the Data Files 11-4
Staging the Data Files 11-5
Configuring the Staging Area 11-6
Configuring the Staging Area 11-7
Configuring the Target Database 11-10
Loading the Target Database 11-11
Quiz 11-13
Summary 11-15
Practice 11 Overview: Bulk Data Loading with Database Machine 11-17
12 Backup and Recovery with Database Machine

Objectives 12-2
Backup and Recovery Overview 12-3
Using RMAN with Database Machine 12-4
General Recommendations for RMAN 12-5
Disk Based Backup Strategy 12-7
Disk Based Backup Configuration 12-8
Tape Based Backup Strategy 12-10
Tape Based Backup Configuration 12-11
Hybrid Backup Strategy 12-15
Restore and Recovery Recommendations 12-16
Backup and Recovery of Database Machine Software 12-17
Quiz 12-18
Summary 12-20
Practice 12 Overview: Using RMAN Optimizations for Database Machine 12-22
13 Monitoring and Maintaining Database Machine

Objectives 13-2
Monitoring Tools Overview 13-3
ILOM Overview 13-4
ILOM Example 13-6
DCLI Overview 13-7
DCLI Examples 13-8
InfiniBand Diagnostic Utilities 13-9
Database Machine Support Overview 13-11
Patching and Updating Overview 13-12
Maintaining Exadata Software 13-13
Maintaining Database Server Software 13-14
Maintaining Other Software 13-15
Quiz 13-16
Summary 13-18
Practice 13 Overview: Using the distributed command line utility (dcli) 13-20
A New Features in Update Release 11.2.1.3.1
Objectives A-2
New Features Overview A-3
Auto Service Request (ASR) A-4
The ASR Process A-5
ASR Requirements A-6
Oracle Linux 5.5 A-7
Enhanced Operating System Security A-8
Pro-active Disk Quarantine A-9
Other New Features A-10
Summary A-11
Introduction

Course Objectives
After completing this seminar, you should be able to:

• Describe the key capabilities of Exadata and Database
Machine
• Identify the benefits of using Database Machine for
different application classes
• Describe the architecture of Database Machine and its
integration with Oracle Database, Clusterware and ASM
• Complete the initial configuration of Database Machine
• Describe various recommended approaches for migrating
to Database Machine
• Configure Exadata I/O Resource Management
• Monitor Database Machine health and optimize
performance

Audience and Prerequisites
• This course is primarily designed for administrators who
will configure and administer Oracle Exadata Database
Machine.
• Prior knowledge and understanding of the following is
assumed:
– Oracle Database 11g Release 2, including RAC and ASM.
– Linux and general network, storage and system
administration concepts.
• Recommended prior training:
– Oracle Database 11g: Administration Workshop I
– Oracle Database 11g: Administration Workshop II
– Oracle 11g: RAC and Grid Infrastructure Administration
– Oracle Linux: Linux Fundamentals
Audience and Prerequisites

This seminar is primarily designed for administrators who will configure and administer Oracle
Exadata Database Machine.
Please be mindful of the prerequisites because this course does not teach all aspects of the
technologies used inside Database Machine. Rather it focuses on topics that are specific to
Exadata and Database Machine.
Prior knowledge and understanding of Oracle Database 11g Release 2, including Automatic
Storage Management (ASM) and Real Application Clusters (RAC), is assumed. In addition, a
working knowledge of Linux is assumed along with an understanding of general networking,
storage and system administration concepts.
For students that do not meet these prerequisites, the recommended prior training includes
the following courses:
• Oracle Database 11g: Administration Workshop I
• Oracle Database 11g: Administration Workshop II
• Oracle 11g: RAC and Grid Infrastructure Administration
• Oracle Linux: Linux Fundamentals

Course Scope
• This course covers two main subject areas:

– Exadata Storage Server X2-2
— This section focuses on the architecture and key capabilities of
Exadata along with how to configure, monitor and optimize it.
– Oracle Exadata Database Machine
— This section introduces students to Database Machine.
— The installation and configuration process is covered so that
students can make appropriate configuration decisions.
— Students also learn how to maintain, monitor and optimize
Database Machine after initial configuration.
• Hardware is discussed during the course; however,
detailed hardware installation and maintenance are outside
the scope of this course.
Course Scope
This course covers two main subject areas:
• The first section introduces students to Exadata Storage Server X2-2 (formerly known
as Exadata Storage Server Version 2). Students learn about the architecture and key
capabilities of Exadata along with how to configure, monitor and optimize it.
• The second section introduces students to Oracle Exadata Database Machine. Students
learn about the various Database Machine configurations. The installation and
configuration process is covered so that students are equipped to make appropriate up-
front configuration decisions. They also learn how to maintain, monitor and optimize
Database Machine after initial configuration. Students are introduced to various options
for migrating to Database Machine and learn how to select the best approach.
Although the hardware components of Database Machine are introduced and described to
varying degrees throughout this course, you should consult the hardware documentation for
specific hardware installation and maintenance details.

Course Contents
1. Introduction
2. Exadata Overview
3. Exadata Architecture
4. Exadata Configuration
5. Exadata Monitoring and Maintenance
6. Exadata and I/O Resource Management
7. Optimizing Database Performance with Exadata
8. Database Machine Overview and Architecture
9. Database Machine Configuration
10. Migrating Databases to Database Machine
11. Bulk Data Loading with Database Machine
12. Backup and Recovery with Database Machine
13. Database Machine Monitoring and Maintenance
Course Contents
The slide shows the ordering of lessons in this course.

Terminology
• Unless otherwise indicated, ‘Exadata’ refers to ‘Exadata
Storage Server’.
– Typically a reference to Exadata refers to the combination of
software and hardware used in Exadata Storage Server.
However, at times there are specific references to Exadata
hardware or Exadata software.
– Unless otherwise indicated, Exadata X2-2 (formerly known
as Exadata Version 2) is implied throughout the course.
Exadata X2-2 is based on Sun hardware and is the only
version of Exadata supported in Oracle Exadata Database
Machine.
• Unless otherwise indicated, ‘Database Machine’ refers to
‘Oracle Exadata Database Machine’.
– Typically, Database Machine refers to the entire system
including both hardware and software.
Terminology
The slide indicates the conventions used throughout this course to abbreviate the formal
product names for Exadata Storage Server and Oracle Exadata Database Machine.

Additional Resources
• Demonstrations (Viewlets)
– http://www.oracle.com/technetwork/tutorials/index.html
– Enter the Oracle Learning Library and conduct a search for
content in the Database Machine functional category. Look
out for demonstrations with Exadata and Database Machine
Version 2 Series in the title.
• Oracle Technology Network (OTN) Exadata and Database
Machine Page
– http://www.oracle.com/technetwork/database/exadata/index.html
• OTN Exadata Discussion Forum
– http://forums.oracle.com/forums/forum.jspa?forumID=829

Practice 1 Overview:
Introducing the Laboratory Environment
In this practice you will be introduced to the laboratory
environment used to support all the practices during this
course.

Exadata Overview

Objectives
After completing this lesson, you should be able to:

• Contrast the Exadata storage architecture with traditional
shared storage offerings
• Describe the hardware components of Exadata
• Outline the capabilities of Exadata
• Describe the main advantages of using Exadata compared
to traditional storage servers

Traditional Enterprise Database Storage
Deployment
[Slide graphic: Database Servers and Storage Arrays]
Traditional Enterprise Database Storage Deployment

The graphic in the slide illustrates the traditional deployment approach for multiple databases.
Each database has an isolated allocation of storage resources and its bandwidth is limited by
the hardware allocated to it. The isolation and dedication of hardware resources to individual
databases can simultaneously lead to unused space and unused input/output (I/O) bandwidth
for some databases, and overcommitted bandwidth with insufficient free space in others. The
right balance is almost never achieved because real-world workloads are very dynamic.
Large storage arrays are used today for many enterprise database deployments. These large
storage arrays must be partitioned and have their bandwidth and space allocated across the
databases and applications sharing the storage array. Because these storage arrays house
vast quantities of mission-critical data, they must be highly engineered, and consequently
very expensive, to deliver high levels of reliability and availability. Enterprise-class storage
arrays are not only costly to procure, they also require highly specialized skills to manage and
maintain. The result is a very high total cost of ownership when traditional large storage
arrays are used in real-world enterprise database deployments.

Exadata Storage Deployment
[Slide graphic labels: Oracle Database 11g Servers; I/O Resource Management; Smart storage
operations; High performance storage network; Storage consolidation (transparent to
databases); Data compression]
Exadata Storage Deployment

The graphic in the slide illustrates the general deployment approach with Exadata.
• You can use Exadata to consolidate your storage environment. Using Exadata, multiple
databases can use storage from a single pool. Exadata uses Oracle Automatic Storage
Management (ASM) to evenly distribute the storage load for every database across
every available disk in the storage pool. Every database can use all the available disks
to maximize performance. Exadata requires the use of Oracle Database 11g Release 2.
Exadata works equally well with single-instance or Oracle Real Application Clusters
(RAC) databases. Users and database administrators use the same tools and
knowledge they are already familiar with. Being based on industry-standard components
and technologies, Exadata is inexpensive to deploy. In addition, tight integration with the
full suite of Oracle Database high-availability features ensures that the reliability and
integrity needs of mission-critical environments are met.
• A key advantage of Exadata is the ability to offload some database processing to
Exadata servers. With Exadata, the database can offload single table scan predicate
filters and projections, join processing based on bloom filters, along with CPU-intensive
decompression and decryption operations. This ability is known as SQL processing
offload or Smart Scan.

Exadata Storage Deployment (continued)
In addition to Smart Scan, Exadata has other smart storage capabilities including the
ability to offload incremental backup optimizations, file creation operations, and more. This
approach yields substantial CPU, memory, and I/O bandwidth savings in the database
server resulting in potentially massive performance improvements.
• Exadata includes Exadata Hybrid Columnar Compression. This feature provides very high
levels of data compression implemented inside Exadata. Exadata Hybrid Columnar
Compression allows the database to reduce the number of I/Os required to scan a table.
For example, for data with a compression ratio of 10 to 1, the number of I/Os required to
scan the data is reduced by a factor of 10 as well.
• Exadata ensures that I/O resources are made available whenever, and to whichever,
database needs them based on priorities and policies that you can define. The Database
Resource Manager (DBRM) and Exadata I/O Resource Management (IORM) work
together to manage intradatabase and interdatabase I/O resource usage to ensure that
your defined service-level agreements (SLAs) are met when multiple applications and
databases share Exadata storage.
• Finally, even for queries that do not use Smart Scan, Exadata has many advantages over
conventional storage. Exadata is highly optimized for fast processing of large queries. It
has been carefully architected to ensure no bottlenecks in the controller or in other
components inside the storage server. It makes intelligent use of high-performance flash
memory to boost performance and also uses a state-of-the-art InfiniBand network that has
much higher throughput than conventional storage networks.
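The compression arithmetic in the Exadata Hybrid Columnar Compression bullet above can be checked with a small sketch. The table size and I/O size below are hypothetical values chosen only for illustration, and `scan_ios` is not an Oracle function; the sketch simply assumes that scan I/O count scales with the stored (compressed) size.

```python
import math


def scan_ios(table_bytes: int, io_size_bytes: int, compression_ratio: float = 1.0) -> int:
    """Return the number of fixed-size I/Os needed to scan a table.

    Assumes the stored size is the logical size divided by the compression ratio.
    """
    stored_bytes = table_bytes / compression_ratio
    return math.ceil(stored_bytes / io_size_bytes)


# Hypothetical figures: a 10 GiB table scanned with 1 MiB I/Os.
TABLE_BYTES = 10 * 1024**3
IO_SIZE_BYTES = 1024**2

uncompressed_ios = scan_ios(TABLE_BYTES, IO_SIZE_BYTES)
compressed_ios = scan_ios(TABLE_BYTES, IO_SIZE_BYTES, compression_ratio=10)

print(f"uncompressed scan: {uncompressed_ios} I/Os")
print(f"10:1 compressed scan: {compressed_ios} I/Os")
```

Actual compression ratios depend on the data, so the factor of 10 here is an input to the sketch, not a prediction.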

Exadata Implementation Architecture Overview
[Slide graphic: Oracle Database 11g Servers connected to multiple Exadata cells; each cell
runs the Exadata software on a Linux OS and contains disks]
Exadata Implementation Architecture Overview

Exadata is a self-contained storage platform that houses disk storage and runs the Exadata
Storage Server Software provided by Oracle. A single Exadata server is also called a cell. A
cell is the building block for a storage grid. More cells provide greater capacity and I/O
bandwidth. Databases are typically deployed across multiple cells, and multiple databases
can share a single cell. The databases and cells communicate with each other via a high-
performance InfiniBand network.
Each cell is a purely dedicated storage platform for Oracle Database files although you can
use Database File System (DBFS), a feature of Oracle Database, to store your business files
inside the database.
Like other storage arrays, each cell is a computer with CPUs, memory, a bus, disks, network
adapters, and the other components normally found in a server. It also runs an operating
system (OS), which in the case of Exadata is Linux. The Oracle-provided software resident in
the Exadata cell runs under this operating system. The OS is accessible in a restricted mode
to administer and manage Exadata.

Introducing Exadata
• High performance storage for Oracle Database
– Up to 1.8 GB/sec raw data bandwidth
– Up to 75,000 I/Os per second using flash
• 64 bit Intel-based Sun Fire Server
• Preinstalled software
– Exadata Storage Server Software
– Oracle Linux x86_64
– Drivers and Utilities
• Only available in conjunction with
Database Machine
Introducing Exadata
Exadata is highly optimized for use with Oracle Database. Exadata delivers outstanding I/O
and SQL processing performance for data warehousing and online transaction processing
(OLTP) applications.
Exadata is based on a 64 bit Intel-based Sun Fire server. Oracle provides the storage server
software to impart database intelligence to the storage, and tight integration with Oracle
Database and its features. Each cell is shipped with all the hardware and software
components preinstalled including the Exadata Storage Server Software, Oracle Linux
x86_64 operating system and InfiniBand protocol drivers.
Since March 2010, Exadata has no longer been offered as a standalone storage product. Exadata
is now available only for use in conjunction with Database Machine. Individual Exadata
servers can still be purchased; however, they must be connected to Database Machine.
Custom configurations using Exadata are no longer supported for new installations.

Exadata Hardware Details
(Sun Fire X4270 M2)
Processors          2 Six-Core Intel® Xeon® L5640 Processors (2.26 GHz)
Memory              24 GB (6 x 4 GB)
Local Disks         12 x 600 GB 15K RPM High Performance SAS
                    or 12 x 2 TB 7.2K RPM High Capacity SAS
Flash               4 x 96 GB Sun Flash Accelerator F20 PCIe Cards
Disk Controller     Disk controller HBA with 512 MB battery backed cache
Network             Two InfiniBand 4X QDR (40Gb/s) ports
                    (1 dual-port PCIe 2.0 HCA)
                    Four embedded Gigabit Ethernet ports
Remote Management   1 Ethernet port (ILOM)
Power Supplies      2 redundant hot-swappable power supplies
Exadata Hardware Details (Sun Fire X4270 M2)

The slide shows a description of the Exadata Storage Server hardware.

Exadata Specifications
                                          HP Disks   HC Disks
Exadata Smart Flash Cache (1)             384 GB     384 GB
Raw Disk Capacity (1)                     7.2 TB     24 TB
Uncompressed Data Capacity (2)            2 TB       7 TB
Raw Disk Throughput (MBPS)                1,800      1,000
Effective Throughput with Flash (MBPS)    3,600      3,600
Disk I/Os per Second (IOPS)               3,600      1,440
Flash I/Os per Second (IOPS)              75,000     75,000
1 - Raw capacity calculated using 1 GB = 1000 x 1000 x 1000 bytes and 1 TB = 1000 x 1000 x 1000 x 1000 bytes.
2 - User Data: Actual space for uncompressed end-user data, computed after single mirroring (ASM normal redundancy)
and after allowing space for database structures such as temporary space, logs, undo space, and indexes. Actual user data
capacity varies by application. User Data capacity calculated using 1 TB = 1024 * 1024 * 1024 * 1024 bytes.
Exadata Specifications
Exadata is available in two configurations: with high performance (HP) disks or with high
capacity (HC) disks. The table in the slide lists the key capacity and performance
specifications for both configuration options.
Note: MBPS stands for megabytes per second, IOPS stands for I/Os per second.
Note: These metrics do not take into account compression. With compressed data, you can
achieve much higher effective throughput rates. In all cases, actual performance will vary by
application.
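As a cross-check, the raw capacity figures follow directly from the disk and flash counts given in the hardware details. The sketch below reproduces them using the decimal units from footnote 1. The per-disk throughput and IOPS values it prints are simple averages derived here for illustration; they are not figures stated in the course material.

```python
# Figures taken from the hardware details and specifications slides.
DISKS_PER_CELL = 12
FLASH_CARDS_PER_CELL = 4
FLASH_CARD_GB = 96

CONFIGS = {
    "HP": {"disk_gb": 600, "disk_mbps": 1800, "disk_iops": 3600},
    "HC": {"disk_gb": 2000, "disk_mbps": 1000, "disk_iops": 1440},
}

FLASH_CACHE_GB = FLASH_CARDS_PER_CELL * FLASH_CARD_GB


def raw_capacity_tb(config: str) -> float:
    """Raw disk capacity per cell, using 1 TB = 1000 GB as in footnote 1."""
    return DISKS_PER_CELL * CONFIGS[config]["disk_gb"] / 1000


def per_disk(config: str, metric: str) -> float:
    """Average of a cell-level metric per disk (derived, not from the source)."""
    return CONFIGS[config][metric] / DISKS_PER_CELL


print(f"Flash cache per cell: {FLASH_CACHE_GB} GB")
for name in CONFIGS:
    print(
        f"{name}: {raw_capacity_tb(name)} TB raw, "
        f"{per_disk(name, 'disk_mbps'):.1f} MB/s and "
        f"{per_disk(name, 'disk_iops'):.0f} IOPS per disk on average"
    )
```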

InfiniBand Network
InfiniBand:
• Is the Exadata storage network:
– Provides highest performance available – 40 Gb/sec each direction
– Has been widely used in high-performance computing since 2002
• Looks like normal Ethernet to host software:
– All IP-based tools work transparently – TCP/IP, UDP, HTTP, SSH,
and so on
• Has the efficiency of a SAN:
– Zero copy and buffer reservation capabilities
• Is used for both storage and RAC interconnect:
– Less configuration, lower cost, higher performance
• Uses high-performance ZDP InfiniBand protocol (RDS V3):
– Zero-copy, zero-loss Datagram protocol
– Open Source software developed by Oracle
– Very low CPU overhead
InfiniBand Network
InfiniBand is the only storage network supported by Exadata because of its performance and
proven track record in high-performance computing. InfiniBand works like normal Ethernet but
much faster. It has the efficiency of a SAN, using zero copy and buffer reservation. Zero copy
means that data is transferred across the network without intermediate buffer copies in the
various network layers. Buffer reservation is used so that the hardware knows exactly where
to place buffers ahead of time. These are two important characteristics that distinguish
InfiniBand from normal Ethernet.
InfiniBand is also supported as a unified network fabric for Exadata and the Oracle RAC
interconnect. This facilitates easier configuration and fewer cables and switches. You can
also use it for high-performance external connectivity, such as to connect backup servers or
ETL servers.
On top of InfiniBand, Exadata uses the Zero Data loss UDP (ZDP) protocol. ZDP is open
source software that is developed by Oracle. It is like UDP but more reliable. Its full technical
name is RDS (Reliable Datagram Sockets) V3. The ZDP protocol has a very low CPU
overhead with tests showing only a 2 percent CPU utilization while transferring 1 GB/sec of
data.
Each Exadata server is configured with one dual-port InfiniBand card designed to be
connected to two separate InfiniBand switches for high availability. Each InfiniBand link is
able to carry the full data bandwidth of the entire cell, which means you can lose an entire
network without losing any performance.
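The claim that a single link can carry the full bandwidth of a cell can be sanity-checked against the published figures. This is a rough sketch: the 8b/10b encoding efficiency applied to the 40 Gb/s signaling rate is an assumption about QDR InfiniBand that is not stated in this course, and the cell peak is taken from the "Effective Throughput with Flash" row of the specifications table.

```python
# From the hardware details slide: InfiniBand 4X QDR, 40 Gb/s per port.
LINK_SIGNAL_GBITS_PER_SEC = 40

# Assumption (not from the source): QDR links use 8b/10b encoding,
# so only 8 of every 10 signaled bits carry data.
link_data_gbits_per_sec = LINK_SIGNAL_GBITS_PER_SEC * 8 / 10
link_data_gbytes_per_sec = link_data_gbits_per_sec / 8

# From the specifications table: effective throughput with flash, in MB/s.
CELL_PEAK_MBYTES_PER_SEC = 3600
cell_peak_gbytes_per_sec = CELL_PEAK_MBYTES_PER_SEC / 1000

single_link_is_sufficient = link_data_gbytes_per_sec >= cell_peak_gbytes_per_sec

print(f"usable bandwidth of one link: {link_data_gbytes_per_sec} GB/s")
print(f"peak cell throughput: {cell_peak_gbytes_per_sec} GB/s")
print(f"one link can carry the full cell bandwidth: {single_link_is_sufficient}")
```

Protocol overhead is ignored here, so the comparison shows only that the headroom exists on paper.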

Classic Database I/O and SQL Processing Model
SELECT customer_id
FROM orders
WHERE order_amount>20000;
[Slide graphic: (1) query issued by the client; (2) extents identified; (3) I/O issued;
(4) I/O executed: 10 GB returned; (5) SQL processing: 2 MB returned; (6) row returned]
Classic Database I/O and SQL Processing Model

With traditional storage, all the database intelligence resides in the software on the database
server. To illustrate how SQL processing is performed in this architecture, an example of a
table scan is shown in the graphic in the slide.
1. The client issues a SELECT statement with a predicate to filter a table and return only
the rows of interest to the user.
2. The database kernel maps this request to the file and extents containing the table.
3. The database kernel issues the I/Os to read all the table blocks.
4. All the blocks for the table being queried are read into memory.
5. SQL processing is conducted against the data blocks searching for the rows that satisfy
the predicate.
6. The required rows are returned to the client.
As is often the case with large queries, the predicate filters out most of the rows in the
table. Yet all the blocks from the table need to be read, transferred across the storage
network, and copied into memory. Many more rows are read into memory than required to
complete the requested SQL operation. This generates a large amount of unproductive I/O,
which wastefully consumes resources and impacts application throughput and response time.

Exadata Smart Scan Model
SELECT customer_id
FROM orders
WHERE order_amount>20000;
[Slide graphic: (1) query issued by the client; (2) iDB command constructed and sent to
Exadata cells; (3) SQL processing in Exadata; (4) 2 MB returned to server; (5) consolidated
result set built from all Exadata cells; (6) row returned]
Exadata Smart Scan Model

Using Exadata, database operations are handled differently. Queries that perform table scans
can be processed within Exadata and return only the required subset of data to the database
server. Row filtering, column filtering, some join processing, and other functions can be
performed within Exadata. Exadata uses a special direct-read mechanism for Smart Scan
processing. The above graphic illustrates how a table scan operates with Exadata:
1. The client issues a SELECT statement to return some rows of interest.
2. The database kernel determines that Exadata is available and constructs an iDB
command representing the SQL command and sends it to the Exadata cells. iDB is a
unique Oracle data transfer protocol that is used for Exadata storage communications.
3. The Exadata server software scans the data blocks to extract the relevant rows and
columns which satisfy the SQL command.
4. Exadata returns to the database instance an iDB message containing the requested
rows and columns of data. These results are not block images, so they are not stored in
the buffer cache.
5. The database kernel consolidates the result sets from across all the Exadata cells. This
is similar to how the results from a parallel query operation are consolidated.
6. The rows are returned to the client.
Moving SQL processing off the database server frees server CPU cycles and eliminates a
massive amount of unproductive I/O transfers. These resources are free to better service
other requests. Queries run faster, and more of them can be processed.
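The difference between the two models can be illustrated with a toy simulation. This is only a sketch: the cells are plain Python lists, the data is randomly generated, and names such as `OrderRow`, `classic_scan`, and `smart_scan` are hypothetical. It does not model iDB, direct reads, or any real Exadata behavior; it only shows that filtering inside each cell yields the same answer while far fewer items cross the network.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderRow:
    customer_id: int
    order_amount: int


def make_cells(num_cells: int, rows_per_cell: int, seed: int = 0) -> list[list[OrderRow]]:
    """Build hypothetical cells, each holding a slice of the orders table."""
    rng = random.Random(seed)
    return [
        [OrderRow(rng.randrange(1000), rng.randrange(25000)) for _ in range(rows_per_cell)]
        for _ in range(num_cells)
    ]


def classic_scan(cells, threshold):
    """Classic model: ship every row to the database server, then filter there."""
    shipped = [row for cell in cells for row in cell]
    result = [row.customer_id for row in shipped if row.order_amount > threshold]
    return result, len(shipped)


def smart_scan(cells, threshold):
    """Smart Scan model: filter and project inside each cell, then consolidate."""
    partials = [
        [row.customer_id for row in cell if row.order_amount > threshold]
        for cell in cells
    ]
    result = [customer_id for partial in partials for customer_id in partial]
    return result, sum(len(partial) for partial in partials)


cells = make_cells(num_cells=3, rows_per_cell=10_000)
classic_result, classic_shipped = classic_scan(cells, 20_000)
smart_result, smart_shipped = smart_scan(cells, 20_000)

print(f"classic model shipped {classic_shipped} rows to the server")
print(f"smart scan shipped {smart_shipped} values to the server")
```

The consolidation step in `smart_scan` corresponds to step 5 above, where the database kernel merges the result sets from all cells.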

Exadata Smart Storage Capabilities
• Predicate filtering:
– Only the rows requested are returned to the database server
rather than all the rows in a table.
• Column filtering:
– Only the columns requested are returned to the database
server rather than all the columns in a table.

The following database functions are integrated within Exadata:
• Exadata enables predicate filtering for table scans. Rather than returning all the rows for
the database to evaluate, Exadata returns only the rows that match the filter condition.
The conditional operators that are supported include =, !=, <, >, <=, >=, IS [NOT] NULL,
LIKE, [NOT] BETWEEN, [NOT] IN, EXISTS, IS OF type, NOT, AND, OR. In addition, many
common SQL functions are evaluated by Exadata during predicate filtering. For a full list
of functions that can be offloaded to Exadata, use the following query:
SELECT * FROM v$sqlfn_metadata WHERE offloadable = 'YES';
• Exadata provides column filtering, also called column projection, for table scans. Only
the requested columns are returned to the database server rather than all columns in a
table. For tables with many columns, or columns containing LOBs, the I/O bandwidth
saved by column filtering can be very large.
When used together, the combination of predicate and column filtering dramatically improves
performance and reduces I/O bandwidth consumption. For example, when processing the
following query, Exadata returns only the employee names that are longer than five
characters:
SELECT name FROM employees WHERE LENGTH(name) > 5;
Without predicate and column filtering, the storage subsystem would need to send all the
rows and columns of the employees table to the database to evaluate.

• Join processing:
– Simple star join processing is performed within Exadata.
• Scans on encrypted data
• Scans on compressed data
• Scoring for Data Mining:
– All data mining scoring functions are offloaded.
– Up to 10x performance gains.
Exadata Smart Storage Capabilities (continued)

• Exadata performs join processing for star schemas (between large tables and small
lookup tables). This is implemented using Bloom filters, which are a very efficient
probabilistic method to determine whether an element is a member of a set.
• Exadata performs Smart Scans on encrypted tablespaces and encrypted columns. For
encrypted tablespaces, Exadata can decrypt blocks and return the decrypted blocks to
Oracle Database, or it can perform row and column filtering on encrypted data.
Significant CPU savings can be made within the database server by offloading the CPU-
intensive decryption task to Exadata cells.
• Smart Scan works in conjunction with Exadata Hybrid Columnar Compression so that
column projection and row filtering can be executed along with decompression at the
storage level to save CPU cycles on the database servers.
• Exadata can perform scoring functions for data mining models. All data mining scoring
functions, such as PREDICTION_PROBABILITY, are offloaded to Exadata cells for
processing. This accelerates warehouse analysis while it reduces database server CPU
consumption and the I/O load between the database server and Exadata.
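The Bloom filter idea mentioned in the join processing bullet can be sketched as follows. This is a generic textbook Bloom filter, assuming SHA-256 based hashing and a 1024-bit array; it is not the implementation used inside Exadata, and the table contents are hypothetical.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


# Join keys that survive the filter on the small lookup table (hypothetical data).
lookup_keys = {3, 7, 42}
bloom = BloomFilter()
for key in lookup_keys:
    bloom.add(key)

# Large table rows as (order_id, join_key); the storage side drops rows whose
# join key is definitely absent and ships only the candidates.
fact_rows = [(order_id, order_id % 50) for order_id in range(1000)]
candidates = [row for row in fact_rows if bloom.might_contain(row[1])]

# The database still performs the exact join on the much smaller candidate set.
joined = [row for row in candidates if row[1] in lookup_keys]
print(len(fact_rows), "rows scanned,", len(candidates), "shipped,", len(joined), "joined")
```

Because a Bloom filter never produces false negatives, no qualifying row is lost; any false positives are removed by the exact join on the database server.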

• Backups:
– I/O for incremental backups is much more efficient because
only changed blocks are returned to the database server.
• Create/extend tablespace:
– Exadata formats database blocks.
Exadata Smart Storage Capabilities (continued)

• The speed and efficiency of incremental database backups is enhanced with Exadata.
The granularity of change tracking in the database is much finer with Exadata. With
Exadata, changes are tracked at the individual Oracle block level rather than at the level
of a large group of blocks. This results in less I/O bandwidth being consumed for
backups and faster running backups.
• With Exadata, the create/extend tablespace operation is also executed much more
efficiently. Instead of formatting blocks in database server memory and writing them to
storage, a single iDB command is sent to Exadata instructing it to format the blocks.
Database server memory usage is reduced and I/O associated with the creation and
formatting of the database blocks is eliminated with Exadata.

Exadata Smart Scan Scale-Out Example
[Figure: Database server dbs1 connected through the InfiniBand storage network (40 Gb/s maximum) to 14 Exadata cells, edsc1 through edsc14, with 12 disks per cell. Each cell can deliver 1.8 GB/s, so the 14 cells can deliver a total of 14 x 1.8 = 25.2 GB/s.]

The example in the next three slides illustrates the power of Smart Scan in a quantifiable
manner using a typical case in which multiple Exadata cells scale out to share a workload.
The database server, depicted in the upper portion of the slide, is connected to the InfiniBand
storage network, which can deliver a maximum of 40 gigabits per second (Gb/s). To keep the
example clear and simple, assume that the InfiniBand storage network can deliver data at 40
Gb/s with no messaging overhead. We will also assume that a single database server has
access to the full I/O bandwidth of all the Exadata cells.
In this scenario, there are 14 Exadata cells. Assuming that each Exadata cell can deliver 1.8
gigabytes (GB) of I/O throughput per second, the potential scanning power of all the Exadata
cells is 25.2 GB per second.

select /*+ full(lineitem) */ count(*)
from lineitem
where l_orderkey < 0;

[Figure: Database server dbs1 reading from Exadata cells (12 disks per cell) at 0.357 GB/s per cell.]
• The database asks to retrieve all blocks by doing a full table scan, and then filters matching rows.
• If the table is evenly distributed across all disks, each cell cannot send more than
  40 / 14 = 2.85 Gb/s = 0.357 GB/s to the database instance.
• Disks are throttled by the network bandwidth!
• If the table is 4800 GB in size, the complete scan would take approximately 16 minutes.
Exadata Smart Scan Scale-Out Example (continued)

Now assume a 4800 gigabyte table is evenly spread across the 14 Exadata cells and a query
is executed which requires a full table scan. As is commonly the case, assume that the query
returns a small set of result records.
Without Smart Scan capabilities, each Exadata server behaves like a traditional storage
server by delivering database blocks to the client database.
Because the storage network is bandwidth-limited to 40 gigabits per second, it is not possible
for the Exadata cells to deliver all their power. In this case, each cell cannot deliver more than
0.357 gigabytes per second to the database and it would take approximately 16 minutes to
scan the whole table.

select /*+ full(lineitem) */ count(*)
from lineitem
where l_orderkey < 0;

[Figure: Database server dbs1 receiving matching rows from Exadata cells (12 disks per cell), each scanning at 1.8 GB/s.]
• The database asks the Exadata cells to send back all matching rows.
• If the table is evenly distributed across all disks, each cell cannot send more than
  40 / 14 = 2.85 Gb/s = 0.357 GB/s to the database instance.
• Each cell can scan at a speed of 1.8 GB/s, and send its matching rows to the database
  instance. This represents a total scan at a speed of 25.2 GB/s!
• If the table is 4800 GB in size, the complete table scan will complete in approximately
  three minutes and ten seconds!
Exadata Smart Scan Scale-Out Example (continued)

Now consider if Smart Scan is enabled for the same query. The same storage network
bandwidth limit applies. However, this time the entire 4800 GB is not transported across the
storage network; only the matching rows are transported back to the database server. So
each Exadata cell can process its part of the table at full speed; that is, 1.8 GB per second. In
this case, the entire table scan would be completed in approximately three minutes and ten
seconds.
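The arithmetic behind the two scan times can be checked with a short calculation. The figures (4800 GB table, 14 cells, 1.8 GB/s per cell, 40 Gb/s network with no overhead) come from the example; the variable and function names are chosen only for this sketch.

```python
TABLE_GB = 4800            # table size from the example
CELLS = 14                 # number of Exadata cells
CELL_SCAN_GB_PER_S = 1.8   # scan rate each cell can deliver
NETWORK_GBIT_PER_S = 40    # InfiniBand limit, assuming no messaging overhead

# Convert gigabits to gigabytes: 8 bits per byte.
network_gb_per_s = NETWORK_GBIT_PER_S / 8
per_cell_network_share = network_gb_per_s / CELLS
aggregate_scan_gb_per_s = CELLS * CELL_SCAN_GB_PER_S

# Without Smart Scan every block crosses the network, so the network is the limit.
seconds_without_smart_scan = TABLE_GB / network_gb_per_s
# With Smart Scan only matching rows cross the network, so the cells scan at full speed.
seconds_with_smart_scan = TABLE_GB / aggregate_scan_gb_per_s


def fmt(seconds):
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes} min {secs:02d} s"


print("Per-cell network share:", round(per_cell_network_share, 3), "GB/s")
print("Without Smart Scan:", fmt(seconds_without_smart_scan))
print("With Smart Scan:   ", fmt(seconds_with_smart_scan))
```

The second figure assumes, as the example does, that the result set is small enough for the network transfer of matching rows to be negligible.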

Exadata Hybrid Columnar Compression
Warehouse Compression (optimized for speed):
• 10x average storage savings
• 10x scan I/O reduction
• Optimized for query performance
• Reduced warehouse size, better performance
Archival Compression (optimized for space):
• 15x average storage savings
– Up to 50x on some data
• Some access overhead
• For cold or historical data
• Reclaim disks, keep data online
Can mix compression types by partition for ILM

In addition to the basic and OLTP compression capabilities of Oracle Database 11g, Exadata
includes Exadata Hybrid Columnar Compression.
Exadata Hybrid Columnar Compression offers higher compression ratios for direct path
loaded data. This compression capability is recommended for data that is not updated
frequently. You can specify Exadata Hybrid Columnar Compression at the table, partition, and
tablespace level. You can also choose between two types of Exadata Hybrid Columnar
Compression, to achieve the proper trade-off between disk usage and CPU consumption,
depending on your requirements:
• Warehouse compression: This type of compression is optimized for query performance,
and is intended for data warehouse applications.
• Online archival compression: This type of compression is optimized for maximum
compression ratios, and is intended for data that does not change frequently.
You can use Exadata Hybrid Columnar Compression on complete tables or in combination
with basic and OLTP compression by using partitioning.
Note: A compression advisor, provided by the DBMS_COMPRESSION package, helps you
determine the expected compression ratio for a particular table with a particular compression
method.

Architecture Overview
[Figure: A compression unit (CU) spanning four database blocks, each with its own block header. A CU header is followed by the data for columns C1 through C8, stored column by column across the blocks.]
• A compression unit is a logical structure spanning multiple database blocks.
• Each row is self-contained within a compression unit.
• Data organized by column during data load.
• Each column compressed separately.
• Smart Scan is supported.
Exadata Hybrid Columnar Compression Architecture Overview

Exadata Hybrid Columnar Compression is a new method for organizing data in database
blocks. Tables are organized into sets of rows called compression units (CU). Within a
compression unit, data is organized by column and then compressed. The column
organization of data brings similar values close together, enhancing compression ratios. Each
row is self-contained within a compression unit.
In addition to providing excellent compression, Exadata Hybrid Columnar Compression works
in conjunction with Smart Scan so that column projection and row filtering can be executed
along with decompression at the storage level to save CPU cycles on the database servers.
Note: Although the diagram in the slide shows a compression unit containing four data
blocks, it should not be assumed that a compression unit always contains four blocks. The
size of a compression unit is determined automatically by Oracle Database based on various
factors in order to deliver the most effective compression result while maintaining excellent
query performance.
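The organization described above (rows grouped into a unit, pivoted by column, each column compressed separately) can be illustrated with a toy model. This sketch uses zlib and JSON purely for convenience; the actual on-disk format and compression algorithms of Exadata Hybrid Columnar Compression are not described in this lesson, and all names and sample data below are hypothetical.

```python
import json
import zlib


def build_compression_unit(rows, column_names):
    """Pivot a set of rows into columns and compress each column separately."""
    unit = {}
    for name in column_names:
        column_values = [row[name] for row in rows]
        unit[name] = zlib.compress(json.dumps(column_values).encode())
    return unit


def read_columns(unit, wanted):
    """Decompress only the requested columns (column projection)."""
    return {name: json.loads(zlib.decompress(unit[name])) for name in wanted}


def read_row(unit, column_names, index):
    """Rebuild one row; every value it needs lives in the same unit."""
    columns = read_columns(unit, column_names)
    return {name: columns[name][index] for name in column_names}


COLUMNS = ["order_id", "status", "region"]
rows = [
    {"order_id": i, "status": "SHIPPED" if i % 4 else "PENDING", "region": "EMEA"}
    for i in range(1000)
]
cu = build_compression_unit(rows, COLUMNS)

for name in COLUMNS:
    print(name, len(cu[name]), "compressed bytes")
print(read_row(cu, COLUMNS, 5))
```

Columns with many repeated values, such as region in this sample, compress to very few bytes because similar values sit next to each other, and a query that needs only one column can leave the others compressed.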

Exadata Smart Flash Cache
• High performance cache for frequently accessed objects
• Excellent for absorbing repeated random reads
• Allows optimization by application table
[Figure: hundreds of I/Os per second compared with tens of thousands of I/Os per second.]

For many years, a constraining factor for storage performance has been the number of
random I/Os per second (IOPS) that a disk can deliver. To compensate for the fact that even
a high performance disk can deliver only a few hundred IOPS, large storage arrays with
hundreds of disks are required to deliver in excess of 60,000 IOPS.
Exadata provides Exadata Smart Flash Cache, a caching mechanism for frequently accessed
data. It is a write-through cache which is useful for absorbing repeated random reads, and
very beneficial to OLTP. Using Exadata Smart Flash Cache, a single Exadata cell can support
up to 75,000 IOPS, two cells can support up to 150,000 IOPS, and so on.
Exadata Smart Flash Cache focuses on caching frequently accessed data and index blocks,
along with performance critical information such as control files and file headers. In addition,
DBAs can influence caching priorities using the CELL_FLASH_CACHE storage attribute for
specific database objects.

High performance cache that understands different types of database I/O:
• Frequently accessed data and index blocks are cached.
• Control file reads and writes are cached.
• File header reads and writes are cached.
• DBA can influence caching priorities.
• I/Os to mirror copies are not cached.

• Backup-related I/O is not cached.
• Data Pump I/O is not cached.
• Data file formatting is not cached.
• Table scans do not monopolize the cache.
Exadata Smart Flash Cache (continued)

In more recent times, vast and expensive storage arrays have introduced equally expensive
nonvolatile memory caches to improve performance. However, these caches know nothing
about the applications using them, so their efficiency is limited when compared to their cost.
With Exadata, each database I/O is tagged with metadata indicating the I/O type. Exadata
Smart Flash Cache uses this information to make intelligent decisions about how to use the
cache. This cooperation ensures the efficient use of Exadata Smart Flash Cache.
For example, with ASM mirroring turned on, multiple copies of each data block must be
written to disk to deliver the desired level of data protection. However, there is usually no
need to cache the secondary copies of a block because ASM will read the primary copy if it is
available. A traditional storage array would not know about this characteristic, leading to
caching inefficiencies.
Similarly, with traditional storage arrays, backups and exports will typically cause all the data
to be loaded into the cache even though the operation will not read the data repeatedly.
Exadata knows that there is no need to fill the cache with backup and export data. The same
is true for data file formatting operations. Finally, Exadata does not flood the cache with data
from full table scans, as is the case with most storage arrays.
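The caching rules described above can be summarized as a decision function. This is a simplified sketch, not the CELLSRV algorithm: the IoRequest class and the io_type tag names are hypothetical, and treating table scans as cacheable only when the object is marked KEEP is an assumption made for the example. The values DEFAULT, KEEP, and NONE are the documented settings of the CELL_FLASH_CACHE storage attribute.

```python
from dataclasses import dataclass

# Hypothetical I/O type tags; the real metadata sent with each I/O is internal.
CACHEABLE_TYPES = {"data_block", "index_block", "control_file", "file_header"}
NEVER_CACHED_TYPES = {"backup", "data_pump", "datafile_format"}


@dataclass(frozen=True)
class IoRequest:
    io_type: str
    is_mirror_copy: bool = False
    cell_flash_cache: str = "DEFAULT"  # per-object setting: DEFAULT, KEEP, or NONE


def should_cache(request):
    """Decide whether an I/O is worth caching, based on its type tag."""
    if request.is_mirror_copy or request.io_type in NEVER_CACHED_TYPES:
        return False
    if request.cell_flash_cache == "NONE":
        return False
    if request.io_type == "table_scan":
        # Assumption: scans are cached only on request, so they cannot flood the cache.
        return request.cell_flash_cache == "KEEP"
    return request.io_type in CACHEABLE_TYPES


for req in [
    IoRequest("data_block"),
    IoRequest("data_block", is_mirror_copy=True),
    IoRequest("backup"),
    IoRequest("table_scan"),
    IoRequest("table_scan", cell_flash_cache="KEEP"),
]:
    print(req.io_type, req.is_mirror_copy, req.cell_flash_cache, "->", should_cache(req))
```

A generic storage array has no equivalent of the io_type tag, which is why it cannot make these distinctions.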

Exadata Storage Index
[Figure: Storage index in memory. One ASM disk is divided into 1 MB storage regions. The region index records minimum/maximum values per tracked column: B:1/5 for the first region of table T1 (B values 1, 3, 5; Min B = 1, Max B = 5), B:3/8 for the second region of T1 (B values 5, 8, 3; Min B = 3, Max B = 8), and E:a/j, G:4/9 for a region of table T2. For the query SELECT * FROM T1 WHERE B<2; only the first block can match.]
Exadata Storage Index

A storage index is a memory-based structure that reduces the amount of physical I/O required
by the cell. The storage index keeps track of minimum and maximum column values and this
information is used to avoid useless I/Os.
For example, the slide shows table T1 which contains column B. Column B is tracked in the
storage index so it is known that the first half of T1 contains values for column B that range
between 1 and 5. Likewise it is also known that the second half of T1 contains values for
column B that range between 3 and 8. Any query on T1 looking for values of B less than 2 can
quickly proceed without any I/O against the second part of the table.
Given a favorable combination of data distribution and query predicates, a storage index
could be used to drastically speed up a query by quickly skipping much of the I/O. For another
query, the storage index may provide little or no benefit. In any case, the ease of maintaining
and querying the memory-based storage index means that any I/O saved through its use
effectively increases the overall I/O bandwidth of the cell while consuming very few cell
resources.
The storage space inside each cell disk is logically divided into 1 MB chunks called storage
regions. The boundaries of ASM allocation units (AUs) are aligned with the boundaries of
storage regions. For each of these storage regions, data distribution statistics are held in a
memory structure called a region index. Each region index contains distribution information for
up to 8 columns. The storage index is a collection of the region indexes.
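The way minimum and maximum values allow whole regions to be skipped can be sketched with the T1 values from the slide. The functions and the in-memory layout below are hypothetical and greatly simplified; only the min/max skipping principle is taken from the text.

```python
def build_region_index(region_rows, column):
    """Record the minimum and maximum value of one column for one storage region."""
    values = [row[column] for row in region_rows]
    return (min(values), max(values))


def scan_less_than(regions, index, column, limit):
    """Evaluate 'column < limit', reading only regions that can contain a match."""
    matches, regions_read = [], 0
    for rows, (lowest, _highest) in zip(regions, index):
        if lowest >= limit:
            continue  # no value in this region can be below the limit: skip the I/O
        regions_read += 1
        matches.extend(row for row in rows if row[column] < limit)
    return matches, regions_read


# Column B of table T1 as shown in the slide: two regions.
regions = [
    [{"B": 1}, {"B": 3}, {"B": 5}],
    [{"B": 5}, {"B": 8}, {"B": 3}],
]
index = [build_region_index(rows, "B") for rows in regions]

# SELECT * FROM T1 WHERE B<2;
matches, regions_read = scan_less_than(regions, index, "B", 2)
print("region index:", index)
print("matches:", matches, "regions read:", regions_read, "of", len(regions))
```

With a predicate such as B < 4 both regions would have to be read, which illustrates why the benefit depends on how well the data is clustered.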

Exadata Storage Index (continued)
The storage statistics represent the data distribution (minimum and maximum values) of
columns that are considered well clustered by Exadata. Exadata has heuristics to transparently
determine what columns are clustered enough to be included in the storage index.
The storage index works best when the following conditions are true:
• The data is roughly ordered so that the same column values are clustered together.
• The query has a predicate on a storage index column checking for =, <, > or some
combination of these.
It is important to note that the storage index works transparently with no user input. There is no
need to create, drop, or tune the storage index. The only way to influence the storage index is to
load your tables using presorted data.
Also, because the storage index is kept in memory, it disappears when the cell is rebooted. The
first queries that run after a cell is rebooted automatically cause the storage index to be rebuilt.
The storage index works for data types whose binary encoding is such that byte-wise binary
lexical comparison of two values of that data type is sufficient to determine the ordering of those
two values. This includes data types like NUMBER, DATE, and VARCHAR2. However, NLS data
types are an example of data types that are not included for storage index filtering.

Storage Index with Partitions Example
ORDER#   ORDER_DATE        SHIP_DATE   ITEM
         (Partition Key)
1        2007              2007
2        2008              2008
3        2009              2009
• Queries on SHIP_DATE do not benefit from ORDER_DATE partitioning:
– However SHIP_DATE is highly correlated with ORDER_DATE.
• Storage index provides partition pruning like performance for
queries on SHIP_DATE:
– Takes advantage of ordering created by partitioning
Storage Index with Partitions Example

The example in the slide contains correlated columns. ORDER_DATE is highly correlated with
SHIP_DATE. The dates are generally correlated because usually a ship date is close to an
order date.
If your table is partitioned by ORDER_DATE, and you execute a query using ORDER_DATE as a
filter, then partition pruning is used to read only the relevant partitions. However, if you do a
query using only SHIP_DATE in the WHERE clause, partition pruning cannot be used to
optimize the query.
However, if SHIP_DATE is part of the storage index, the storage index is used to skip all the
blocks that do not correspond to your query. This filtering takes place at the storage level. The
storage index helps the SHIP_DATE query to take advantage of the natural ordering implied
by the ORDER_DATE partitioning and the natural correlation that exists between the
ORDER_DATE and SHIP_DATE columns.

Database File System
• Database File System (DBFS) enables the database to be used as a file system.
• Files are stored as SecureFiles LOBs inside database tables that
are stored in Exadata.
– Protected like any Oracle data – ASM mirroring, Data Guard,
Flashback, and so on
– Shared storage for ETL staging, scripts, reports and other
application files
– 5 to 7 GB/sec file system I/O throughput capable on a full rack
Database Machine
[Figure: Copy files to DBFS, then transform and load into database tables.]
Database File System

Oracle Database File System (DBFS) enables an Oracle database to be used as a
POSIX-compatible file system on Linux. DBFS is an Oracle Database capability that provides
Exadata users with a high performance mechanism to load data into an Oracle database.
DBFS can be used to stage your ETL files, for example.
Inside DBFS, files are stored as SecureFiles LOBs. A set of PL/SQL procedures implements
the file system access primitives, such as open, create, and so on. The dbfs_client utility
enables the mounting of a DBFS file system as a mount point on Linux. It provides the
mapping from file system operations to database operations. The dbfs_client utility runs
completely in user space and interacts with the kernel through the FUSE library infrastructure.
Note: ASM Cluster File System (ACFS) is not supported over Exadata.

[Figure: Traditional storage server: I/O requests from the RDBMS enter a FIFO disk queue in arrival order (H L H L L L), mixing high-priority and low-priority workload requests. You cannot influence the I/O scheduler. Exadata: the I/O scheduler is based on a prioritization scheme, so I/O requests from the RDBMS are queued by priority, high (H) and low (L), before being issued to disk.]

With traditional shared storage, balancing the work of multiple databases sharing the storage
subsystem is inherently difficult. This issue is illustrated by the graphic at the top of the slide,
which shows how traditional storage servers handle I/O requests. In essence, they queue I/O
requests in a first-in, first-out (FIFO) order, which makes no distinction between high-priority
and low-priority requests.
Exadata allows for allocation of I/O resources based on user-specified priorities and policies.
This is illustrated in the graphic at the bottom of the slide where the Exadata I/O scheduler
executes I/O requests based on a prioritization scheme. It does that by internally queuing I/O
requests to prevent a low-priority but intensive workload from flooding the disks.
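The difference between the two queues in the slide can be sketched with a priority queue. This is a deliberately simple strict-priority model with hypothetical request names; the actual Exadata I/O scheduler applies resource plans rather than strict ordering, as covered in the I/O Resource Management lesson.

```python
import heapq
from itertools import count

HIGH, LOW = 0, 1  # lower number means higher priority


def fifo_order(requests):
    """Traditional storage: requests are served strictly in arrival order."""
    return list(requests)


def priority_order(requests):
    """Prioritized scheduling: high-priority requests are issued first."""
    arrival = count()
    heap = []
    for name, priority in requests:
        # The arrival counter keeps the original order within one priority level.
        heapq.heappush(heap, (priority, next(arrival), name))
    ordered = []
    while heap:
        priority, _, name = heapq.heappop(heap)
        ordered.append((name, priority))
    return ordered


# Arrival order matching the slide: H L H L L L
requests = [("H1", HIGH), ("L1", LOW), ("H2", HIGH), ("L2", LOW), ("L3", LOW), ("L4", LOW)]

print("FIFO:    ", [name for name, _ in fifo_order(requests)])
print("Priority:", [name for name, _ in priority_order(requests)])
```

In the FIFO case the second high-priority request waits behind a low-priority one; with prioritized queuing it is issued ahead of all low-priority work.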
di k
I/O resource management is covered in more detail in the lesson titled Exadata and I/O
Resource Management.

Benefits Multiply
[Figure: Benefits multiply.]
• Multiple terabytes of user data normally requires multiple terabytes of I/O
• Less with Exadata Hybrid Columnar Compression
• Even less with partition pruning
• Storage index skips worthless I/O
• Smart Scan means that only the results are returned to the database
• Results in real-time on Database Machine
Benefits Multiply
This is an example that shows you how the main Exadata features that were introduced in this
lesson can work together to multiply the benefits of Exadata.
Assume you have a multi-terabyte table and somebody runs a query that is interested in a
small subset of the data, but causes a full table scan. Traditionally, the system would have to
scan the terabytes of data.
However, using Exadata Hybrid Columnar Compression could reduce the size of the table.
If the table is partitioned, the optimizer could use partition pruning to eliminate a substantial
proportion of the data.
Using storage indexes, Exadata might further reduce the amount of physical I/O that is
executed.
Finally, because of Smart Scan, the only data returned to the database is the data of interest
to the query, some of which may have been cached inside Exadata Smart Flash Cache.
This example shows how the various Exadata and Oracle Database features can work in
harmony to improve the performance of a single operation using Database Machine.

Exadata Key Benefits for Data Warehousing
• Exadata uses more connections:
– Modular storage cell building blocks organized into
massively parallel grid
• Exadata has bigger network pipes:
– InfiniBand network transfers data faster than Fibre Channel.
• Exadata transports less data between the storage and the
database:
– Query processing is moved into storage to dramatically
reduce data sent to servers while unloading server CPUs.
• Exadata Hybrid Columnar Compression reduces the
number of physical I/Os for large table scans.
• In-memory parallel query provides a powerful alternative
query strategy that complements Exadata.
Exadata Key Benefits for Data Warehousing

One of the key benefits of Exadata is extremely enhanced performance for data warehousing
applications. By replacing your existing storage with Exadata, it is possible to get up to 100
times speedup for your data warehousing queries. The larger the data warehouse, the greater
the speedup from using Exadata.
Exadata addresses three key dimensions of database I/O that can hamper data warehouse
performance.
• Exadata is based on a massively parallel architecture, which provides more connections
to deliver more data faster between the storage servers and the database servers.
• Exadata is built using wide network pipes that provide extremely high bandwidth
between the storage servers and the database servers. Exadata uses InfiniBand as the
storage network, which provides a throughput of 40 Gb/sec with very low latency. This is
many times the bandwidth provided by traditional SAN storage networks.
• Exadata is database-aware and can transport just the data required to satisfy SQL
requests, resulting in less data being sent between the storage servers and the database
servers.
Basically, Exadata reduces the volume of data transported and moves data faster compared
with other storage solutions.

Exadata Key Benefits for Data Warehousing (continued)
In addition, Exadata introduces additional capabilities that can further enhance data warehouse
performance.
Exadata includes Exadata Hybrid Columnar Compression. This feature provides very high
levels of data compression implemented inside Exadata. Exadata Hybrid Columnar
Compression benefits large scale scans, commonly used in data warehousing, by efficiently
scanning vast volumes of data using a fraction of I/Os. Compression ratios of 10 to 1 are
common which means that a 10 TB table can be scanned using 1 TB of disk I/O.
Exadata’s tight integration with Oracle Database results in an intelligent platform for data
warehousing. The complete solution uses a range of technologies to deliver the best result, not
just relying on one approach to the problem. An example of this is the new in-memory parallel
query feature of Oracle Database 11g Release 2.
Normally, a Smart Scan would be used to execute portions of a query inside Exadata and return
the minimum amount of data to the database server. In some cases, however, it may be more
efficient to read all the required data into the memory on the database servers and process the
query that way.
In-memory parallel query enhances query performance by minimizing or even completely
eliminating additional physical I/O for a particular query. Oracle automatically decides if an
object being accessed using parallel execution benefits from being cached in the database
buffer cache. The decision to cache an object is based on a well-defined set of heuristics
including size of the object and the frequency that it is accessed.
In-memory parallel query harnesses the aggregated memory across a database cluster for
parallel operations, enabling it to scale out as the number of nodes in a cluster increases. In an
Oracle RAC environment, Oracle maps fragments of the object into each of the buffer caches on
the active instances. By creating this mapping, Oracle knows which buffer cache to access to
find a specific part or partition of an object. Using this information, Oracle Database will prevent
multiple instances from reading the same information from disk over and over again, thus
maximizing the amount of memory that can be used to cache the objects.
In-memory parallel query nicely complements Exadata. Using this combination, some queries
can be efficiently executed with little or no additional I/O by pinning tables in the database buffer
cache, whereas others can harness the power of Smart Scan inside Exadata.

Exadata Key Benefits for OLTP
• Exadata uses more connections:

– Modular storage cell building blocks organized into
massively parallel grid
• Exadata has bigger network pipes:
– InfiniBand network transfers data faster than Fibre Channel.
• Exadata Smart Flash Cache:
– Provides high-performance cache for frequently accessed
objects
– Is excellent for absorbing repeated random reads
– Allows optimization by application table
[Figure: hundreds of I/Os per second compared with tens of thousands of I/Os per second.]
Exadata Key Benefits for OLTP

Some of the fundamental architectural characteristics of Exadata that are beneficial for data
warehousing are equally relevant and beneficial for online transaction processing (OLTP).
The high-performance, low-latency, InfiniBand network used in conjunction with the massively
parallel grid architecture of Exadata is ideal for supporting many thousands of simultaneous
users.
In addition, the introduction of Exadata Smart Flash Cache is of particular benefit to OLTP
performance. Exadata Smart Flash Cache allows each Exadata cell to deliver up to 75,000
IOPS. In addition, Oracle Database and Exadata Smart Flash Cache work closely with each
other. This cooperation optimizes the usage of Exadata Smart Flash Cache so that only the
most frequently accessed and performance-sensitive data is cached. Users have additional
control over which database objects should be cached more aggressively than others, and
which ones should not be cached at all.

Quiz
Exadata and Database Machine are two different names that designate the same thing.
1. TRUE
2. FALSE
Answer: 2

Quiz
What are the three unique benefits of Exadata compared to traditional storage servers?
1. Larger disk sizes
2. Smart storage capabilities
3. Higher storage network bandwidth
4. Higher RAM capacity
5. Integrated database I/O resource management
Answer: 2, 3, 5

Summary
In this lesson, you should have learned how to:

• Contrast the Exadata storage architecture with traditional
shared storage offerings
• Describe the hardware components of Exadata
• Outline the capabilities of Exadata
• Describe the main advantages of using Exadata compared
to traditional storage servers

• Lesson Demonstrations (Viewlets)

– Introduction to Smart Scan
— http://st-
curriculum.oracle.com/demos/db/11g/r2/dbmach/021ExadataSmartScanIntro
/021exadatasmartscanintro_viewlet_swf.html
– Introduction to Exadata Hybrid Columnar Compression
— http://st-
curriculum.oracle.com/demos/db/11g/r2/dbmach/022ExadataCompressionInt
ro/022exadatacompressionintro_viewlet_swf.html
– Introduction to Exadata Smart Flash Cache
— http://st-
curriculum.oracle.com/demos/db/11g/r2/dbmach/023ExadataFlashCacheIntr
o/023exadataflashcacheintro_viewlet_swf.html
– Smart Scan Scale Out Example
— http://st-
curriculum.oracle.com/demos/db/11g/r2/exadatav2/smartscanscaleoutexamp
le/smartscanscaleoutexample.swf
– Storage Index
— http://st-
curriculum.oracle.com/demos/db/11g/r2/exadatav2/storageindex/storageinde
x.swf

Introducing Exadata Features
In these practices, you are introduced to four major capabilities
of Exadata, namely:
• Smart Scan
• Exadata Hybrid Columnar Compression
• Exadata Smart Flash Cache
• Storage Index

Exadata Architecture

Objectives
After completing this lesson, you should be able to describe:

• The Exadata architecture
• The relationship between the various storage abstractions
used in Exadata

Exadata Software Architecture Overview
[Figure: Exadata software architecture. Database servers (a single-instance database and an Oracle RAC database) each run a DB instance with DBRM, an ASM instance (in a single ASM cluster), and LIBCELL. They communicate with the Exadata servers across the InfiniBand storage switch/network using the iDB protocol over InfiniBand with path failover. Each Exadata server runs Oracle Linux with CELLSRV, IORM, MS, and RS. Enterprise Manager and the cell control CLI (cellcli/dcli) connect to the cells, with SSH used for access.]

The architecture of Exadata includes components on the database server and on the Exadata
server. The overall architecture is shown in the slide. The following components reside on the
database server:
• Oracle Database communicates with Exadata using the Intelligent Database protocol
(iDB). iDB is implemented in the database kernel and LIBCELL. iDB is a unique Oracle
data transfer protocol, built on Reliable Datagram Sockets (RDS), that runs on industry
standard InfiniBand networking hardware. iDB provides data intelligence between the
database and Exadata and enables ASM and database instances to utilize Exadata-specific
features, such as Smart Scan and I/O Resource Management. iDB transparently
maps database operations to Exadata-enhanced operations. Single-instance or Oracle
RAC databases access Exadata storage cells using iDB.
• Automatic Storage Management (ASM) is required and provides a file system and volume
manager optimized for Oracle Database.
• Database Resource Manager (DBRM), in combination with Exadata I/O Resource
Management (IORM), ensures that I/O resources are allocated based on defined priorities.
Note: The slide illustrates the recommended configuration where a single ASM cluster is used
to consolidate storage for all of your databases. Alternatively, you can connect multiple separate
ASM environments with separate disk groups to Exadata.

Exadata Software Architecture Overview (continued)

The software components that reside in Exadata include:
• Oracle Linux provides the Exadata server operating system.
• Cell Server (CELLSRV) is the primary Exadata software component and provides the
majority of Exadata storage services. CELLSRV is a multithreaded server. CELLSRV serves
simple block requests, such as database buffer cache reads, and Smart Scan requests,
such as table scans with projections and filters. CELLSRV also implements I/O Resource
Management (IORM), which works in conjunction with Database Resource Manager
(DBRM), to meter out I/O bandwidth to the various databases and consumer groups
issuing I/Os. Finally, CELLSRV collects numerous statistics relating to its operations.
Oracle Database and ASM processes use LIBCELL to communicate with CELLSRV, and
LIBCELL converts I/O requests into messages that are sent to CELLSRV using the iDB
protocol.
• Management Server (MS) provides Exadata cell management and configuration. It works
in cooperation with the Exadata cell command-line interface (CellCLI). Each cell is
individually managed with CellCLI. CellCLI can only be used from within a cell to manage
that cell; however, you can run the same CellCLI command remotely on multiple cells with
the dcli utility. In addition, MS is responsible for sending alerts and collects some
statistics in addition to those collected by CELLSRV.
• Restart Server (RS) is used to start up/shut down the CELLSRV and MS services and
monitors these services to automatically restart them if required.

Exadata Software Architecture Details
[Slide diagram: Exadata cell and database server internals. On the database server, the RDBMS
instance and the ASM instance each have an SGA, I/O processes, a dskm process, and LIBCELL,
alongside diskmon and css. The directory /etc/oracle/cell/network-config holds cellip.ora
(lists the accessible Exadata cells) and cellinit.ora (lists the local interface IP). The
database server and the cell communicate using the iDB Protocol through the InfiniBand
switch. On the Exadata cell (interface bond0) are the data, Smart Flash Cache, CELLSRV, MS,
RS, CellCLI, adrci, and the ADR. The directory /opt/oracle/cell/cellsrv/deploy/config holds
cell_disk_config.xml (the MS internal dictionary) and cellinit.ora (CELLSRV internal
parameters and the local interface IP).]

Database-host side Exadata software:
• LIBCELL Library: Provides UNIX-like I/O primitives and is linked with ASM, RDBMS, and
ASM utilities. It uses the iDB Protocol to communicate with Exadata.
• DISKMON (Network/Cell Monitor): Checks the network interface state and cell liveness. It
uses a nodewide master process and one slave process (dskm) for each RDBMS or ASM
instance. The master performs monitoring and propagates state information to the slaves.
Slaves use the SGA to communicate with RDBMS or ASM processes. If there is a failure
in the cluster, DISKMON performs I/O fencing to protect data integrity. Cluster
Synchronization Services (CSS) still decides what to fence. Master DISKMON starts with
the clusterware processes. DISKMON also performs DBRM plan propagation.
Cell-side Exadata software:
• CELLSRV is a multithreaded server which provides the majority of Exadata storage
services. It provides smart storage capabilities, serves data blocks when offloading is not
possible, and implements I/O Resource Management to meter out I/O bandwidth.
• Management Server (MS) is an OC4J application that provides storage cell management
and configuration functions, such as cell administration, and metrics and alerts generation.
It also communicates with CELLSRV and the operating system.

Exadata Software Architecture Details (continued)
• Restart Server (RS): Monitors CELLSRV and MS and restarts them, if necessary.
• CellCLI: Executes user cell administration commands. The user must connect to the cell
to use CellCLI. CellCLI communicates with MS using Web Services.
• ADRCI: CELLSRV uses the Automatic Diagnostic Repository (ADR) to log software errors.
An Exadata administrator may use the ADR viewer (ADRCI) to view and package ADR
incidents.
InfiniBand provides a high-speed, high-bandwidth, and low-latency network fabric to support
Exadata. InfiniBand is the only network fabric supported for communication between Exadata
and database servers. The InfiniBand implementation in Exadata and Database Machine uses
the open source RDS/Open Fabrics Enterprise Distribution (OFED). These packages are
preinstalled in Exadata and Database Machine.
Note: Exadata requires Oracle Database 11g Release 2 or later.

Exadata Smart Flash Cache Architecture
[Slide diagram: three panels showing the numbered flow between the database (DB), cellsrv,
disk, and Exadata Smart Flash Cache for a write operation (with acknowledgement), a read
operation on previously cached data, and a read operation on uncached data.]

Exadata Smart Flash Cache provides a caching mechanism for frequently accessed data on
each Exadata cell. Exadata Smart Flash Cache works in conjunction with Oracle Database to
intelligently optimize the efficiency of the cache.
Each database I/O is tagged with the following metadata:
• The CELL_FLASH_CACHE setting for the object associated with the I/O:
- DEFAULT specifies that Exadata Smart Flash Cache is used normally.
- KEEP specifies that Exadata Smart Flash Cache is used more aggressively.
- NONE specifies that Exadata Smart Flash Cache is not used.
• A cache hint, which is assigned by the database based on the reason for the I/O:
- CACHE indicates that the I/O should be cached. For example, the I/O is for an
index lookup.
- NOCACHE indicates that the I/O should not be cached. For example, the I/O is for
a mirrored block of data or is a log write.
- EVICT indicates that data should be removed from the cache. For example, when
an ASM rebalance operation moves data between different disks, the cached
copies that correspond to the original location are removed from the cache.

Exadata Smart Flash Cache Architecture (continued)
In addition, Exadata Smart Flash Cache takes the following into consideration when processing
I/O:
• I/O size: Large I/Os on objects with CELL_FLASH_CACHE set to DEFAULT are not cached.
• Current cache load: Smart table scans are usually directed to disk. However, if the object
has a CELL_FLASH_CACHE setting of KEEP, some reads may be satisfied using Exadata
Smart Flash Cache in order to best utilize the combined throughput of the disks and the
cache.
Exadata Smart Flash Cache uses all of the aforementioned information to make intelligent
decisions about which data is suitable for caching and which is not.
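
The tagging rules described above can be summarized as a small decision function. The following Python sketch is illustrative only: the function name, the large_io flag, and the returned strings are assumptions made for this example and do not reflect the actual CELLSRV implementation. The current cache load consideration for Smart Scans is not modeled.

```python
def flash_cache_action(cell_flash_cache, cache_hint, large_io=False):
    """Return 'cache', 'skip', or 'evict' for one I/O (illustrative only)."""
    if cache_hint == "EVICT":
        # For example, an ASM rebalance moved the data to another disk.
        return "evict"
    if cell_flash_cache == "NONE":
        # The object is excluded from Exadata Smart Flash Cache.
        return "skip"
    if cache_hint == "NOCACHE":
        # For example, a write of a mirrored block or a log write.
        return "skip"
    if large_io and cell_flash_cache == "DEFAULT":
        # Large I/Os on DEFAULT objects are not cached.
        return "skip"
    return "cache"


print(flash_cache_action("DEFAULT", "CACHE"))                 # index lookup
print(flash_cache_action("DEFAULT", "CACHE", large_io=True))  # large scan I/O
print(flash_cache_action("KEEP", "CACHE", large_io=True))     # KEEP object
```

In this sketch a large I/O on a KEEP object is treated as cacheable, which is one reading of the rule that only large I/Os on DEFAULT objects are excluded.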
Exadata Smart Flash Cache is a write-through cache. This means that for write operations,
CELLSRV writes data to disk and sends an acknowledgement to the database so it can continue
without any interruption. Then, if the data is suitable for caching, it is written to Exadata Smart
Flash Cache. Write performance is not improved or diminished using this method. However, if a
subsequent read operation needs the same data, it is likely to benefit from the cache.
For read operations, CELLSRV must first determine if the requested data is already in Exadata
Smart Flash Cache. CELLSRV maintains an in-memory hash table, which it uses to quickly
determine which data blocks reside in Exadata Smart Flash Cache. If the requested data is
cached, a cache lookup is used to satisfy the I/O request.
For read operations that cannot be satisfied using Exadata Smart Flash Cache, a disk read is
performed and the requested information is sent to the database. Then if the data is suitable for
caching, it is written to Exadata Smart Flash Cache.
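
The write and read paths described above follow the usual write-through pattern. The sketch below is a toy model under stated assumptions: the class name and the use of Python dictionaries for the disk and the cache are hypothetical, and the cacheable flag stands in for the suitability decision that CELLSRV makes.

```python
class WriteThroughCell:
    """Toy model of one cell: a dict for disk and a dict for the flash cache."""

    def __init__(self):
        self.disk = {}
        self.flash = {}  # also plays the role of the in-memory hash table

    def write(self, block_id, data, cacheable=True):
        self.disk[block_id] = data        # 1. write the data to disk
        ack = "ack"                       # 2. acknowledge to the database
        if cacheable:
            self.flash[block_id] = data   # 3. then populate the cache
        return ack

    def read(self, block_id, cacheable=True):
        if block_id in self.flash:        # cache hit: served from flash
            return self.flash[block_id], "flash"
        data = self.disk[block_id]        # cache miss: read from disk
        if cacheable:
            self.flash[block_id] = data   # populate the cache afterwards
        return data, "disk"


cell = WriteThroughCell()
cell.write(1, b"row data")
print(cell.read(1))   # served from flash because the write populated the cache
```

Because the acknowledgement depends only on the disk write, the cache neither speeds up nor slows down writes in this model; only later reads benefit.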
When suitable data is inserted into a full cache, a prioritized least recently used (LRU) algorithm
determines which data to replace. Objects with a CELL_FLASH_CACHE setting of KEEP are
subject to a different cache retention policy than objects with a CELL_FLASH_CACHE setting of
DEFAULT. KEEP objects have priority over DEFAULT objects so that new data from a DEFAULT
object will not push out cached data from any KEEP objects. To prevent KEEP objects from
monopolizing the cache, they are allowed to occupy no more than 80% of the total cache size.
Also, to prevent unused KEEP objects from indefinitely occupying the cache, they are subject to
an additional aging policy, which periodically purges unused KEEP object data.
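
The replacement policy described above can be approximated with two LRU lists, one per CELL_FLASH_CACHE setting. This is a minimal sketch, not the actual algorithm: the class name, the block-count capacity, and the method names are assumptions, and the additional aging policy for unused KEEP data is not modeled.

```python
from collections import OrderedDict


class PrioritizedLRU:
    """Two LRU pools: KEEP data is never displaced by DEFAULT data."""

    def __init__(self, capacity):
        if capacity < 5:
            raise ValueError("capacity must be at least 5 blocks")
        self.capacity = capacity
        self.keep_limit = capacity * 4 // 5   # KEEP may use at most 80%
        self.keep = OrderedDict()             # oldest entry first
        self.default = OrderedDict()          # oldest entry first

    def touch(self, block_id):
        """Mark a cached block as most recently used; return True on a hit."""
        for pool in (self.keep, self.default):
            if block_id in pool:
                pool.move_to_end(block_id)
                return True
        return False

    def insert(self, block_id, setting):
        """Insert a block of a KEEP or DEFAULT object; return True if cached."""
        if self.touch(block_id):
            return True
        full = len(self.keep) + len(self.default) >= self.capacity
        if setting == "KEEP":
            if len(self.keep) >= self.keep_limit:
                self.keep.popitem(last=False)      # replace the oldest KEEP block
            elif full:
                self.default.popitem(last=False)   # KEEP displaces DEFAULT data
            self.keep[block_id] = True
            return True
        if full:
            if not self.default:
                return False                       # defensive: never evict KEEP
            self.default.popitem(last=False)       # replace the oldest DEFAULT block
        self.default[block_id] = True
        return True
```

With a capacity of 5 blocks, for example, at most 4 blocks can belong to KEEP objects, so at least one slot always remains available for DEFAULT data.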

Exadata Monitoring Architecture
[Slide diagram: Enterprise Manager (OMS and agent) and dcli connect through a network switch
to the eth0 interface of each Exadata cell using SSH and CellCLI. Each cell contains CellCLI,
cellsrv, MS, adrci, the ADR, data, and Smart Flash Cache.]

For monitoring, there is an Enterprise Manager plug-in that you use in conjunction with Grid
Control. Using this plug-in, you can monitor all the Exadata cells in your enterprise.
The Enterprise Manager plug-in for Exadata does not require an agent on each Exadata cell.
Instead, an existing Enterprise Manager agent uses SSH to connect to each cell and execute
CellCLI commands. Using this architecture, monitoring information from numerous Exadata
cells can be consolidated onto a single Enterprise Manager screen.
The dcli utility facilitates centralized management across a group of cells. It can be used to
execute CellCLI and other cell-level operating system commands across a group of cells and
provide a consolidated view of the output. The dcli utility runs commands on multiple cells in
parallel threads. The cells are referenced by their network name or IP address. Files can be
copied to cells and command scripts can be executed on cells by using this utility. Finally, you
can use the dcli utility to set up SSH user-equivalence to a cell or group of cells.
Note: dcli is a Python script that is available on Exadata. You can copy it to your designated
central management console and execute it from there. The dcli utility requires Python version
2.3 or later. dcli is discussed further in the lesson entitled Monitoring and Maintaining
Database Machine.
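
The idea of running one command on many cells in parallel threads and consolidating the output can be sketched as follows. This is not the dcli source code: the function names are hypothetical, the sketch uses modern Python 3 rather than the Python 2.3 level that dcli requires, and the run callback stands in for whatever remote execution mechanism (for example, SSH) is used. Prefixing each output line with the cell name is an assumption of this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_cells(cells, command, run):
    """Run `command` on every cell in parallel and consolidate the output.

    `run(cell, command)` is a caller-supplied function (hypothetical here)
    that executes the command on one cell and returns its output as text.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(cells))) as pool:
        # map() returns results in the same order as `cells`.
        outputs = list(pool.map(lambda cell: run(cell, command), cells))
    lines = []
    for cell, output in zip(cells, outputs):
        for line in output.splitlines():
            lines.append(f"{cell}: {line}")
    return lines


def fake_run(cell, command):
    # Stand-in for a real remote call; returns canned text.
    return f"ran {command!r}"


for line in run_on_cells(["cell01", "cell02"], "cellcli -e list cell", fake_run):
    print(line)
```

Passing the remote call in as a function keeps the sketch runnable without any cells; a real wrapper would replace fake_run with code that invokes SSH.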

Disk Storage Entities and Relationships
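[Slide diagram: Disk -> LUN -> CELLDISK -> GRIDDISK -> ASM disk. On the first two LUNs only,
the disk is divided into a system area and a data storage partition, and the cell disk maps
to the data storage partition. On the other ten LUNs, the cell disk maps to the whole LUN.
Grid disks are created on a cell disk with CellCLI> CREATE GRIDDISK ... (either one grid
disk, or, for example, a hot part and a cold part) and are visible to ASM.]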

Each Exadata cell contains 12 physical disks. On each of the first two disks, Exadata reserves
a system area that spans multiple partitions with a total size of approximately 29 GB. The
system area contains the OS image, swap space, Exadata software binaries, metric and alert
repository, and various other configuration and metadata files. The two system areas are
mirror copies of each other which are maintained via software mirroring.
Exadata automatically senses the physical disks in each cell. As a cell administrator you can
only view a predefined set of physical disk attributes. Each physical disk is mapped to a
logical abstraction called a Logical Unit (LUN). A LUN exposes additional predefined
metadata attributes to a cell administrator. You cannot create or remove a LUN; LUNs are
created automatically.
A cell disk is a higher level abstraction that represents the data storage area on each LUN.
For the two LUNs that contain the system areas, Exadata recognizes the way that the LUN is
partitioned and maps the cell disk to the disk partition reserved for data storage. For the other
10 disks, Exadata maps the cell disk directly to the LUN.
After a cell disk is created, it can be subdivided into one or more grid disks, which are directly
exposed to ASM.
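
The hierarchy described above (physical disk, LUN, cell disk, grid disk) can be modeled in a few lines. The following Python sketch is illustrative only: the class and function names, the cell disk naming pattern, and the 600 GB disk size used in the example are assumptions, not values taken from this guide.

```python
from dataclasses import dataclass, field

SYSTEM_AREA_GB = 29  # approximate size reserved on each of the first two disks


@dataclass
class CellDisk:
    name: str
    size_gb: float
    grid_disks: dict = field(default_factory=dict)  # grid disk name -> size in GB

    def free_gb(self):
        return self.size_gb - sum(self.grid_disks.values())

    def create_grid_disk(self, name, size_gb=None):
        """Carve a grid disk from the cell disk; default is all remaining space."""
        size = self.free_gb() if size_gb is None else size_gb
        if size <= 0 or size > self.free_gb():
            raise ValueError("not enough free space on cell disk")
        self.grid_disks[name] = size
        return size


def build_cell_disks(disk_size_gb, disk_count=12):
    """Map each physical disk (one LUN each) to a cell disk."""
    cell_disks = []
    for i in range(disk_count):
        # The first two LUNs lose the system area; the other ten map directly.
        usable = disk_size_gb - SYSTEM_AREA_GB if i < 2 else disk_size_gb
        cell_disks.append(CellDisk(f"CD_{i:02d}", usable))
    return cell_disks


disks = build_cell_disks(600)            # hypothetical 600 GB physical disks
print(disks[0].size_gb, disks[2].size_gb)
```

The first two cell disks come out smaller than the other ten, which is why grid disk sizes need to be chosen with the system area in mind.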

Disk Storage Entities and Relationships (continued)
Placing multiple grid disks on a cell disk allows the administrator to segregate the storage into
pools with different performance characteristics. For example, a cell disk could be partitioned so
that one grid disk resides on the highest performing portion of the disk (the outermost tracks on
the physical disk), whereas a second grid disk might be configured on the lower performing
portion of the disk (the inner tracks). The first grid disk might then be used in an ASM disk group
that houses highly active (hot) data, whereas the second grid disk might be used to store less
active (cold) data files.
Placing multiple grid disks on a cell disk also allows the administrator to segregate the storage
into separate pools that can be assigned to different databases.
In cases where the entire cell capacity is required for a single database or where it is difficult to
clearly define hot and cold data sets, an Exadata administrator will usually define a single grid
disk containing all the space on each cell disk.
Note: The diagram in the slide shows the cases where one or two grid disks are created from
the space on a cell disk. However, you can create more than two grid disks on a cell disk.

Interleaved Grid Disks
[Slide diagram: two physical disks. Left (default allocation): Grid Disk 1 occupies the
fastest and fast tracks, and Grid Disk 2 occupies the slower and slowest tracks, so the
performance of Grid Disk 1 benefits from the higher performance outer tracks of the disk.
Right (interleaved): Grid Disk 3 is split 50%/50% between the fastest tracks and the slower
tracks, and Grid Disk 4 is split 50%/50% between the fast tracks and the slowest tracks, so
the performance of Grid Disk 3 and Grid Disk 4 is more evenly balanced.]

By default, space for grid disks is allocated from the outer tracks to the inner tracks of a physical
disk. However, space for grid disks can be allocated in an interleaved manner. Grid disks that
use this type of space allocation are referred to as interleaved grid disks. This method attempts
to equalize the performance of the grid disks residing on the same physical disk.
The slide contrasts default grid disk allocation with interleaved grid disks. On the left, two grid
disks have been created on a physical disk using default space allocation. In this case, Grid
Disk 1 occupies all the fastest (outer) tracks, whereas Grid Disk 2 occupies all the slower (inner)
tracks.
On the right, you see an example of interleaved grid disks. With interleaving enabled, a disk is
divided into two equal parts: the outer half (upper portion) and the inner half (lower portion).
When a new grid disk is created, half of the grid disk space is allocated on the upper portion,
and the other half of the grid disk space is allocated on the lower portion.
Interleaved grid disks are best used in situations where you want to create separate ASM disk
groups that share cell disks without a performance bias.
Note that interleaving is enabled by setting the INTERLEAVING attribute for the cell disk. For
example:
CellCLI> CREATE CELLDISK cd_03_cell01_int LUN=03 -
INTERLEAVING='normal_redundancy'
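
The difference between default and interleaved allocation can be shown by computing where each grid disk lands on the physical disk. This is a simplified sketch: the function name and the use of a single linear offset (0 is the outermost track) are assumptions for illustration, and the sizes are arbitrary units.

```python
def allocate(disk_size, grid_disk_sizes, interleaved=False):
    """Return {name: [(start, end), ...]}; offset 0 is the outermost track."""
    if sum(grid_disk_sizes.values()) > disk_size:
        raise ValueError("grid disks do not fit on the disk")
    layout = {}
    if not interleaved:
        offset = 0
        for name, size in grid_disk_sizes.items():
            # Default: each grid disk takes the next tracks, outer to inner.
            layout[name] = [(offset, offset + size)]
            offset += size
        return layout
    outer, inner = 0, disk_size / 2   # next free offset in each half
    for name, size in grid_disk_sizes.items():
        half = size / 2
        # Interleaved: half of the space in the outer half, half in the inner half.
        layout[name] = [(outer, outer + half), (inner, inner + half)]
        outer += half
        inner += half
    return layout


sizes = {"gd1": 50, "gd2": 50}
print(allocate(100, sizes))
print(allocate(100, sizes, interleaved=True))
```

With interleaving, each grid disk receives one extent in the outer half and one in the inner half, so neither grid disk holds all of the fastest tracks.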

Flash Storage Entities and Relationships
[Slide diagram: Flash -> LUN -> CELLDISK -> FLASHCACHE or GRIDDISK -> ASM disk. A flash-based
cell disk is allocated either entirely to flash cache (CellCLI> CREATE FLASHCACHE ...) or to
flash cache plus a grid disk (CellCLI> CREATE GRIDDISK ... FLASHDISK ...) that is visible to
ASM.]

Each Exadata cell contains 384 GB of high performance flash memory distributed across 4 PCI
flash memory cards. Each card has 4 flash devices for a total of 16 flash devices on each cell.
Each flash device has a capacity of 24 GB.
Essentially, each flash device is much like a physical disk in the Exadata storage hierarchy.
Each flash device is visible to the Cell Server software as a LUN. You can create a cell disk
based on a flash-based LUN. You can then create numerous grid disks on each flash-based cell
disk. In addition, space on a flash-based cell disk can be allocated to a special area that
supports Exadata Smart Flash Cache.
By default, the initial cell configuration process creates flash-based cell disks on all the flash
devices, and then allocates all the available flash space to Exadata Smart Flash Cache. To
create space for flash-based grid disks, you need to drop the default flash cache. Then you can
create a flash cache and flash-based grid disks with your chosen sizes.
Unlike physical disk devices, the order in which you allocate your flash space is not important
from a performance perspective. Likewise, interleaving is not applicable for flash-based cell
disks.
Note: The diagram in the slide shows the case where a flash-based cell disk is allocated
entirely to flash cache, and the case where a flash-based cell disk is used for flash cache and
one grid disk. However, you can allocate up to one flash cache area along with zero or more
flash-based grid disks from a flash-based cell disk.
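
The flash capacity figures quoted in this section can be checked with simple arithmetic. The short Python sketch below also shows how the space might be divided between flash cache and flash-based grid disks; the split_flash helper and the 300 GB figure in the example are hypothetical, and any internal overhead is ignored.

```python
CARDS_PER_CELL = 4      # PCI flash cards in each cell
DEVICES_PER_CARD = 4    # flash devices on each card
DEVICE_GB = 24          # capacity of each flash device

flash_devices = CARDS_PER_CELL * DEVICES_PER_CARD
total_flash_gb = flash_devices * DEVICE_GB
print(flash_devices, total_flash_gb)   # 16 384


def split_flash(flash_cache_gb, total_gb=total_flash_gb):
    """Return the GB left for flash-based grid disks after sizing the cache."""
    if not 0 <= flash_cache_gb <= total_gb:
        raise ValueError("flash cache size must fit in the available flash")
    return total_gb - flash_cache_gb


print(split_flash(300))   # space left for grid disks with a 300 GB cache
```
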
Disk Group Configuration
[Slide diagram: two Exadata cells. The DATA disk group and the FRA disk group, created with
SQL> CREATE DISKGROUP, each span both cells, with one failure group per cell (CELL1 and
CELL2).]

After the grid disks are configured, ASM disk groups can be defined across your Exadata
configuration.
The slide illustrates an example where two ASM disk groups are defined. The DATA disk group
is defined across all the red grid disks, and the FRA disk group is defined across the blue grid
disks. When data is loaded into each disk group, ASM will evenly distribute the data and I/O
across the grid disks in each disk group.
To protect against the failure of an entire Exadata cell, ASM failure groups are automatically
defined on a per cell basis. This is to ensure that mirrored ASM extents are placed on different
Exadata cells. This is also illustrated in the slide. By default, when failure groups are
automatically created, their names correspond to the cell name. So, different disk groups can
have the same failure group names.
When using Exadata, it is strongly recommended to use at least NORMAL ASM redundancy for
all of your disk groups in conjunction with ASM failure groups spread across at least two
Exadata cells. Following this recommendation provides good protection against disk and cell
failure.
Using HIGH ASM redundancy in conjunction with ASM failure groups spread across at least
three Exadata cells provides the best available level of data protection. Such a configuration can
tolerate the simultaneous failure of two complete cells without compromising data availability.

Quiz
What are the three main Exadata services?

1. OMS
2. MS
3. GMON
4. CELLSRV
5. RS
Answer: 2, 4, 5

Quiz
If you use NORMAL ASM redundancy for all of your disk groups
in conjunction with ASM failure groups spread across two
Exadata cells, under which of the following scenarios will you
maintain data availability?
1. A single disk failure in a single cell
2. Simultaneous failure of multiple disks in a single cell
3. Simultaneous failure of a single disk in both cells
4. Complete failure of a single cell
Answer: 1, 2, 4
The prescribed configuration may provide protection against failure scenario 3 if, and only if,
there are no data extents mirrored to both of the failed disks. To guarantee data availability in
cases where simultaneous failures affect two cells, you must use HIGH ASM redundancy in
conjunction with failure groups spread across at least three Exadata cells.
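
The reasoning behind the quiz answer can be checked with a small model of mirrored extents. In the following Python sketch, each extent is the set of (cell, disk) locations that hold a copy; the extent layout and all names are invented for illustration and do not represent how ASM actually places extents.

```python
def data_available(extents, failed_disks):
    """True if every extent still has at least one copy on a surviving disk."""
    return all(copies - failed_disks for copies in extents)


# NORMAL redundancy: two copies of each extent, one in each failure group (cell).
extents = [
    {("cell1", 0), ("cell2", 0)},
    {("cell1", 1), ("cell2", 2)},
    {("cell1", 2), ("cell2", 1)},
]
all_of_cell1 = {("cell1", d) for d in range(3)}

print(data_available(extents, {("cell1", 0)}))                # scenario 1
print(data_available(extents, {("cell1", 0), ("cell1", 1)}))  # scenario 2
print(data_available(extents, {("cell1", 0), ("cell2", 0)}))  # scenario 3, mirrored pair
print(data_available(extents, {("cell1", 0), ("cell2", 1)}))  # scenario 3, unrelated pair
print(data_available(extents, all_of_cell1))                  # scenario 4
```

Only the case where both failed disks hold the two copies of the same extent loses data, which is why scenario 3 cannot be guaranteed with NORMAL redundancy.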

Summary
In this lesson, you should have learned to describe:

• The Exadata architecture
• The relationship between the various storage abstractions
used in Exadata


– Exadata Process Introduction
— http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/031ExadataProcessIntro/031exadataprocessintro_viewlet_swf.html
– Hierarchy of Exadata Storage Objects
— http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/032ExadataStorageObjects/032exadatastorageobjects_viewlet_swf.html
– Creating Interleaved Grid Disks
— http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/033ExadataInterleavedGridDisks/033exadatainterleavedgriddisks_viewlet_swf.html
– Examining Exadata Smart Flash Cache
— http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/034ExadataFlashCacheAdmin/034exadataflashcacheadmin_viewlet_swf.html
– Exadata Smart Flash Cache Architecture
— http://st-curriculum.oracle.com/demos/db/11g/r2/exadatav2/smartflashcachearchitecture/smartflashcachearchitecture.swf
• My Oracle Support Notes
– Oracle Reliable Datagram Sockets (RDS) and InfiniBand (IB) Support for RAC
Interconnect and Exadata Storage
— https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=745616.1

Introducing Exadata Cell Architecture
In these practices, you will be familiarized with the Exadata cell
architecture. You will:
• Examine the Exadata processes
• Examine the hierarchy of cell objects
• Create interleaved grid disks
• Examine Exadata Smart Flash Cache

Exadata Configuration

Objectives

• Perform the initial Exadata boot sequence
• Configure Exadata software
• Create and configure ASM disk groups using Exadata
• Use the CellCLI Exadata administration tool

Exadata Installation and Configuration Overview
1. Initial network preparation
2. Configuration of new Exadata servers
3. Configuring Exadata software
4. Configuring hosts to use Exadata
5. Configuring ASM and Database instances for Exadata
6. Configuring ASM disk group for Exadata

Exadata ships with all hardware and software preinstalled. However, it is necessary to configure
Exadata. This slide introduces a general overview of the configuration tasks.
Note: In most cases the installation and configuration activities described in this lesson occur as
part of the installation and configuration of Database Machine and there is no requirement to
perform cell-by-cell configuration. You may need to conduct some of the activities described in
this lesson during the normal lifecycle of maintaining your Database Machine environment;
however, the complete Exadata configuration process would only be required in rare
circumstances, such as when upgrading from a Quarter-Rack Database Machine to a Half-Rack
or Full-Rack configuration. The Database Machine configuration process is
described later in this course in the lesson entitled Database Machine Configuration.

Initial Network Preparation 1
For each storage cell, assign the following IP addresses:

• One IP address for the bonded InfiniBand port
• One IP address for administration network access
• One IP address for lights out management
Note these network configuration recommendations:
• Set up a fault-tolerant, private network subnet for the
InfiniBand network.
– Use the InfiniBand network for Oracle Clusterware.
• Assign a block of IP addresses for each network type.
– Do not allocate IP addresses ending in .0, .1, or .255.
• Define your storage cells to DNS.
Repeat for each cell.
Initial Network Preparation

Each storage cell contains the following network ports:
1. One dual-port InfiniBand card for high-speed, high-volume data transfer: Each Exadata
cell is designed to be connected to two separate InfiniBand switches for high availability.
The dual port card is only for availability reasons because each port is capable of
transferring the full data bandwidth generated by the storage cell. You will need to assign
one IP address to the bonded InfiniBand interface during the initial configuration of the
storage cell.
2. Gigabit Ethernet ports for general administration network access to the cell operating
system: Each Exadata server comes with four Gigabit Ethernet ports. However, only one
is required for administrative access. You will need to assign one IP address to the cell for
network access during the initial configuration process.
3. One gigabit Ethernet port for lights out management: Exadata uses Sun Integrated Lights
Out Manager (ILOM). You should assign one IP address to the cell for ILOM during the
initial configuration of the storage cell.

Initial Network Preparation (continued)
Note the following network configuration and IP address recommendations:
• It is recommended that the InfiniBand network should be a dedicated private network
subnet for Exadata cells and database server hosts. Multiple InfiniBand switches are
recommended to eliminate the switch as a single point of failure.
• The InfiniBand network should be used for Oracle Clusterware network and storage
communications. Use the following command on your clusterware hosts to verify that the
private network for Oracle Clusterware communication is using InfiniBand:
oifcfg getif -type cluster_interconnect
• The Reliable Datagram Sockets (RDS) protocol should be used over the InfiniBand
network for database server to cell communication and Oracle Real Application Clusters
(RAC) communication. Check the database alert log to verify that the private network for
Oracle RAC is running the RDS protocol over the InfiniBand network. The following
message should be in the log:
cluster interconnect IPC version: Oracle RDS/IP (generic)
• Dedicate a block of IP addresses for the InfiniBand network and ensure that you allow for
future expansion.
• Dedicate a block of IP addresses for the general administration interfaces and the lights
out management interfaces. The general administration interfaces and the lights out
management interfaces may be on the same subnet and may share that subnet with other
hosts. For example, on the 192.168.200.0/24 subnet, you might assign the block of IP
addresses between 192.168.200.31 and 192.168.200.50 for your Exadata general
administration interfaces and the lights out management interfaces. Other hosts sharing
the subnet would be allocated IP addresses outside the dedicated block. If you want, you
can place the general administration interfaces and the lights out management interfaces
on separate subnets; however, this is not required.
• Do not allocate addresses that end in .0, .1, or .255, or those that would be used as
broadcast addresses for the specific netmask that you have selected. For example, avoid
addresses such as 192.168.200.0, 192.168.200.1, and 192.168.200.255.
• Exadata cells do not require Domain Name System (DNS); however, DNS is recommended
for use in conjunction with Database Machine. If DNS is available in your network,
configure your DNS with the IP addresses and host names associated with the general
administrative network on each Exadata cell.
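
The IP address recommendations above lend themselves to a quick automated check. The following Python sketch uses the standard ipaddress module; the function names are hypothetical, and the subnet and address block are the example values given in the text.

```python
import ipaddress


def usable_for_cell(address, network):
    """Return True if `address` follows the recommendations in the text."""
    net = ipaddress.ip_network(network)
    ip = ipaddress.ip_address(address)
    if ip not in net:
        return False
    if ip in (net.network_address, net.broadcast_address):
        return False
    # Avoid addresses whose last octet is 0, 1, or 255.
    return (int(ip) & 0xFF) not in (0, 1, 255)


def address_block(first, last):
    """Return every address from `first` to `last`, inclusive."""
    start = int(ipaddress.ip_address(first))
    end = int(ipaddress.ip_address(last))
    return [str(ipaddress.ip_address(n)) for n in range(start, end + 1)]


block = address_block("192.168.200.31", "192.168.200.50")
print(len(block), all(usable_for_cell(a, "192.168.200.0/24") for a in block))
```

The example block contains 20 addresses, all of which pass the check; addresses such as 192.168.200.0, 192.168.200.1, and 192.168.200.255 are rejected.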

Configuration of New Exadata Servers 2
1. Check all physical connections.

2. Power on the Exadata server.
3. Answer questions during boot sequence:
– Domain Name Service (DNS) server IP addresses
– Time preference (time region and location)
– Network Time Protocol (NTP) servers
– Ethernet and InfiniBand IP addresses,
netmasks, gateway, and hostnames
– Remote management configuration
details
4. Change the initial passwords for the root,
celladmin, and cellmonitor users.
Repeat for each cell.
Configuration of New Exadata Servers

The slide lists the general steps to configure a new Exadata server:
1. Check all the physical connections to the Exadata server. It is important that all the
physical network connections are correct prior to configuring the cell. Check also that both
power supplies are connected and that you have a keyboard, video display, and mouse.
2. Power on the cell to boot its operating system.
3. Answer the configuration questions when you are prompted. The slide lists the information
that you need to provide.
4. After you successfully perform the previous step, the login screen is displayed. Change
the initial passwords for the root, celladmin, and cellmonitor users to more secure
passwords. The initial password for root is welcome1. The initial password for the
cellmonitor and celladmin users is welcome.

Answering Questions During the Initial Boot Sequence
...
Network interfaces
Name State IP address Netmask Gateway Hostname
eth0 Linked
eth1 Unlinked
eth2 Unlinked
eth3 Unlinked
ib0 Linked
ib1 Linked
Warning. Some network interface(s) are disconnected. Check cables and switches and retry
Do you want to retry (y/n) [y]: n
Nameserver: mynameserv.company.com
Add more nameservers (y/n) [n]: n
Setting up local time...
Select country by number, [n]ext, [l]ast: 230
Select zone by number, [n]ext: 17
Selected timezone: America/Denver
Is this correct (y/n) [y]: y
The current ntp server(s):
Do you want to change it (y/n) [n]: y
Fully qualified hostname or ip address for NTP server. Press enter if none: ntp1.company.com
Continue adding more ntp servers (y/n) [n]: n
...
Answering Questions During the Initial Boot Sequence

The next four slides show an example of the initial boot configuration process for Exadata. On
each slide, the text in blue indicates a user input.
The configuration commences during the server boot sequence. The output from the initial part
of the boot sequence is not shown. This slide commences at the beginning of the interview
phase where user input is required.
In this slide, settings are made for the DNS name server, time zone, and NTP server.
Notice that the interview phase commences with a warning indicating that a number of network
interfaces are disconnected. As shown in the slide, it is safe to ignore this warning because
each Exadata server comes equipped with four Ethernet ports; however, only one (eth0) is
required. So it is normal for eth1, eth2, and eth3 to be disconnected. Always make sure that
the required network interfaces (eth0, ib0 and ib1) are correctly linked.
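
The check described above, that eth0, ib0, and ib1 are linked while the other Ethernet ports may be unlinked, can be expressed as a short parser. This Python sketch works on text copied from the interface listing shown in the slide; the function name is hypothetical and the listing format is assumed to match the example.

```python
REQUIRED = ("eth0", "ib0", "ib1")


def unlinked_required(listing, required=REQUIRED):
    """Return the required interfaces that are not reported as Linked."""
    state = {}
    for line in listing.splitlines():
        parts = line.split()
        # Keep only rows of the form "<name> Linked|Unlinked ...".
        if len(parts) >= 2 and parts[1] in ("Linked", "Unlinked"):
            state[parts[0]] = parts[1]
    return [name for name in required if state.get(name) != "Linked"]


listing = """\
eth0 Linked
eth1 Unlinked
eth2 Unlinked
eth3 Unlinked
ib0 Linked
ib1 Linked
"""
print(unlinked_required(listing))   # []
```

An empty result means the warning can be ignored; any name in the result points to a cable or switch connection that needs attention.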

Answering Questions During the Initial Boot Sequence
...
Network interfaces
eth0 Linked
bond0 ib0,ib1
Select interface name to configure or press Enter to continue: eth0
Selected interface. eth0
IP address or none: 10.XXX.XXX.XXX
Netmask: 255.255.248.0
Gateway (IP address or none): 10.XXX.XXX.1
Fully qualified hostname or none: cell01.company.com
Continue configuring or re-configuring interfaces? (y/n) [y]: y
Network interfaces
eth0 Linked 10.XXX.XXX.XXX 255.255.248.0 10.XXX.XXX.1 cell01.company.com
bond0 ib0,ib1
Select interface name to configure or press Enter to continue: bond0
Selected interface. bond0
IP address: 192.168.50.76
Netmask: 255.255.255.0
Fully qualified hostname or none: cell01-priv.company.com
Continue configuring or re-configuring interfaces? (y/n) [y]: y
Network interfaces
bond0 ib0,ib1 192.168.50.76 255.255.255.0 cell01-priv.company.com
Select interface name to configure or press Enter to continue: <enter>
...
Answering Questions During the Initial Boot Sequence (continued)

In this slide, the configuration phase continues with settings specified for the Ethernet network
(eth0) that supports administrative access to the storage server, along with the InfiniBand
network (bond0) that supports the main storage network.
Notice that the InfiniBand interface is named bond0 and uses bonding between the physical
InfiniBand interfaces ib0 and ib1. Bonding provides the ability to transparently fail over from
ib0 to ib1 or from ib1 to ib0 if connectivity to either interface is lost.
If you choose not to configure each interface in the list, the unconfigured interfaces will not be
started during system startup and the cell will not be fully functional. You can later configure, or
change, cell network settings using the ipconf utility.

Sequence
...
Select canonical hostname from the list below
1: cell01.company.com
2: cell01-priv.company.com
Canonical fully qualified domain name: 1
Select default gateway interface from the list below
1: eth0
Default gateway interface: 1
Canonical hostname: cell01.company.com

Nameservers: mynameserv.company.com
Timezone: America/Denver
NTP servers: ntp1.company.com
Default gateway device: eth0
Network interfaces
bond0 ib0,ib1 192.168.50.76 255.255.255.0 cell01-priv.company.com
Is this correct (y/n) [y]: y
...

In this slide, the network configuration is finalized with the specification of the canonical host
name and default gateway. Both of these settings map to the Ethernet network providing
administrative access to the cell.

Sequence
...
Do you want to configure basic ILOM settings (y/n) [y]: y
Loading basic configuration settings from ILOM ...
ILOM Fully qualified hostname [cell01-ilom.company.com]: cell01-ilom.company.com
ILOM IP address [10.XXX.XXX.YYY]: 10.XXX.XXX.YYY
ILOM Netmask [255.255.248.0]: 255.255.248.0
ILOM Gateway [10.XXX.XXX.1]: 10.XXX.XXX.1
ILOM Nameserver or none [mynameserv.company.com]: mynameserv.company.com
ILOM Use NTP Servers (enabled/disabled) [enabled]: enabled
ILOM First NTP server. Fully qualified hostname or ip address or none [ntp1.company.com]: ntp1.company.com
ILOM Second NTP server. Fully qualified hostname or ip address or none [none]: none
Basic ILOM configuration settings:

Hostname : cell01-ilom.company.com
IP Address : 10.XXX.XXX.YYY
Netmask : 255.255.248.0
Gateway : 10.XXX.XXX.1
DNS servers : mynameserv.company.com
Use NTP servers : enabled
First NTP server : ntp1.company.com
Second NTP server : none
Timezone (read-only) : America/Denver
Is this correct (y/n) [y]: y

...

Configuration completes with settings for Integrated Lights Out Manager (ILOM).
If you choose not to configure ILOM at this time, you can use the ipconf utility to do so later.
After the user interview phase is completed, the Exadata server finalizes its system startup
process. The output from the remaining system startup activities is not shown in the slide.
Finally, a login prompt is displayed.

Exadata Administrative User Accounts
Three operating system users are configured for each Exadata server:
• The root user can:
– Edit configuration files such as cellinit.ora and cellip.ora
– Change network configuration settings
– Run support and diagnostic utilities located under the
/opt/oracle.SupportTools directory
– Run the CellCLI CALIBRATE command
– Perform all the tasks that the celladmin user can perform
• The celladmin user can:
– Perform administrative tasks (CREATE, DROP, ALTER, and so on)
using the CellCLI utility
– Package incidents for Oracle Support using the adrci utility
• The cellmonitor user can only view (LIST) Exadata cell
objects using the CellCLI utility.
Exadata Storage Server Administrative User Accounts

Three operating system users are configured for each Exadata server: root, celladmin,
and cellmonitor. The slide describes the function of each user account.
As mentioned before, after you successfully configure the cell, you should log in and change
the initial passwords for the root, celladmin, and cellmonitor users to more secure
passwords. The initial password for root is welcome1. The initial password for the
cellmonitor and celladmin users is welcome.

Configuring a New Exadata Cell 3
1. Run performance tests on the cell with CALIBRATE.

2. Configure the cell server software.
3. Create cell disks.
4. Create grid disks.
Repeat for each cell.
Configuring a New Exadata Cell

As part of the initial boot configuration, the cell server software is started with a basic
configuration. In addition, the flash modules are configured as cell disks and all the flash-based
cell disks are allocated to Exadata Smart Flash Cache.
At this point, you are ready to finalize the configuration of the Exadata cell. Following is a
summary of the recommended procedure. All the steps are executed using CellCLI:
1. As the root user, run performance tests on the cell with the CALIBRATE command.
2. As the celladmin or root user, configure the cell server software with the ALTER CELL
command.
3. As celladmin or root, create the disk-based cell disks by using the CREATE CELLDISK
command.
4. As celladmin or root, create the grid disks on each cell disk of the storage cell by using
the CREATE GRIDDISK command.
Repeat this process on each Exadata cell.

Important I/O Metrics for Oracle Databases
• OLTP (small random I/O): disk bandwidth; metric = IOPS; need high RPM and fast seek time
• DW/OLAP (large sequential I/O): channel bandwidth; metric = MBPS; need large I/O channel
• Both metrics are measured with CALIBRATE.
Important I/O Metrics for Oracle Databases

The CALIBRATE command runs raw performance tests on Exadata disks and flash modules.
This enables you to measure two important database metrics, IOPS and MBPS:
• IOPS (I/Os per second): This metric represents the number of small random I/Os that can
be serviced in a second. The IOPS rate mainly depends on how fast the disk media can
spin and how many disks are present in the storage system.
• MBPS (megabytes per second): The rate at which data can be transferred between the
computing server node and the storage array. This mainly depends on the capacity of the
I/O channel that is used to transfer data.
The database I/O workload typically consists of small random I/Os and large sequential I/Os.
Small random I/Os are more prevalent in an OLTP application environment in which each server
process reads a data block into the buffer cache for updates and the changed blocks are written
to storage in batches by the database writer (DBWn) process. Large sequential I/Os are
common in a data warehouse environment.
OLTP application performance mainly depends on how fast small I/Os are serviced, which
depends on how fast the disk can spin and find the data. Large I/O performance depends on the
capacity of the I/O channel that connects the server to the storage array; throughput is better
when the capacity of the channel is larger.
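To see why IOPS and MBPS are tracked as separate metrics, the following minimal sketch converts an I/O rate into the data rate it implies. It uses the aggregate hard disk figure (4838 IOPS) from the CALIBRATE example in this lesson. The 8 KB request size and the function name are assumptions chosen for illustration only.

```python
def mbps_from_iops(iops: float, io_size_kb: float) -> float:
    """Return the data rate (MB per second) implied by an I/O rate and request size."""
    return iops * io_size_kb / 1024.0


# 4838 IOPS is the aggregate hard disk figure in the CALIBRATE example.
# The 8 KB request size is an assumed, illustrative small-I/O size.
small_io_mbps = mbps_from_iops(4838, 8)
print(f"{small_io_mbps:.1f} MB/s moved by 4838 small random I/Os per second")
```

Under these assumptions, a small random I/O workload moves far less data per second than the sequential throughput reported for the same disks, which is why such a workload is limited by the disks rather than by the I/O channel.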

Testing Performance Using CALIBRATE
[root@cell01 ~]# cellcli

CellCLI: Release 11.2.1.2.0 - Production on Mon Nov 02 16:42:06 PST 2009
Copyright (c) 2007, 2009, Oracle. All rights reserved.

Cell Efficiency ratio: 1.0
CellCLI> CALIBRATE FORCE

Calibration will take a few minutes...
Aggregate random read throughput across all hard disk luns: 1601 MBPS
Aggregate random read throughput across all flash disk luns: 4194.49 MBPS
Aggregate random read IOs per second (IOPS) across all hard disk luns: 4838
Aggregate random read IOs per second (IOPS) across all flash disk luns: 137588
Controller read throughput: 1615.85 MBPS
Calibrating hard disks (read only) ...
Lun 0_0 on drive [20:0 ] random read throughput: 152.81 MBPS, and 417 IOPS
...
Lun 0_10 on drive [20:10 ] random read throughput: 156.84 MBPS, and 421 IOPS
Calibrating flash disks (read only, note that writes will be significantly slower).
Lun 1_0 on drive [[10:0:0:0]] random read throughput: 269.06 MBPS, and 19680 IOPS
...
CALIBRATE results are within an acceptable range.
Testing Performance Using CALIBRATE

The CALIBRATE command enables you to verify the disk and flash memory performance before
the cell is put online. You must execute this command while being logged in as the root user at
the operating system level.
The CALIBRATE FORCE command allows you to run the tests when Cell Server is running. If
you do not use the FORCE option, Cell Server must be shut down. Running CALIBRATE while
Cell Server is running impacts performance, which is why it is not recommended
during normal operations.
Because the Cell Server software is running immediately after the initial boot sequence, you
must either shut down the Cell Server software or execute the CALIBRATE FORCE command.
CALIBRATE FORCE is acceptable in this circumstance because the cell is not yet running a
user workload, so there is no work to disrupt. In the above example, which shows a typical
output for high performance disks, the results matched expectations. A message will alert you if
the performance measurements are substandard.
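If you capture CALIBRATE output in a file, the per-LUN lines can be extracted for comparison across cells. The following is a minimal sketch, assuming the line layout shown in the slide. The regular expression and function name are illustrative and are not part of any Exadata tool.

```python
import re

# Pattern for the per-LUN result lines shown in the slide. Flash LUNs are
# printed with doubled brackets, so one or more brackets are accepted.
LUN_LINE = re.compile(
    r"Lun\s+(?P<lun>\S+)\s+on drive\s+\[+(?P<drive>[^\]]+?)\s*\]+\s+"
    r"random read throughput:\s+(?P<mbps>[\d.]+)\s+MBPS,\s+and\s+(?P<iops>\d+)\s+IOPS"
)


def parse_lun_lines(text):
    """Return (lun, drive, mbps, iops) tuples for each per-LUN line in text."""
    results = []
    for line in text.splitlines():
        match = LUN_LINE.search(line)
        if match:
            results.append(
                (match["lun"], match["drive"], float(match["mbps"]), int(match["iops"]))
            )
    return results


# Sample lines copied from the slide.
sample = """\
Lun 0_0 on drive [20:0 ] random read throughput: 152.81 MBPS, and 417 IOPS
Lun 0_10 on drive [20:10 ] random read throughput: 156.84 MBPS, and 421 IOPS
Lun 1_0 on drive [[10:0:0:0]] random read throughput: 269.06 MBPS, and 19680 IOPS
"""
luns = parse_lun_lines(sample)
for lun, drive, mbps, iops in luns:
    print(lun, drive, mbps, iops)
```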

Configuring the Exadata Cell Server Software
[celladmin@cell01 ~]$ cellcli

CellCLI: Release 11.2.1.2.0 - Production on Mon Nov 02 17:46:13 PST 2009
Copyright (c) 2007, 2009, Oracle. All rights reserved.

Cell Efficiency ratio: 1.0
CellCLI> ALTER CELL smtpServer='my_mail.example.com', -
         smtpFromAddr='exadata.cell01@example.com', -
         smtpPwd=<email_address_password>, -
         smtpToAddr='jane.smith@example.com', -
         notificationPolicy='critical,warning,clear', -
         notificationMethod='mail'
Cell cell01 successfully altered
CellCLI>
Configuring the Exadata Cell Server Software

The settings provided during the initial boot sequence configure the hardware and cell operating
system. In addition, the Cell Server software is automatically configured using the CREATE
CELL command. By default, the cell name is set to the network host name of the Exadata server
and the INTERCONNECT1 attribute is set to bond0, which is the InfiniBand storage network
interface.
You can change the name of the cell or configure the optional Cell Server attributes by using the
ALTER CELL command.
The slide shows an example ALTER CELL command that configures email notification. This
facility sends email messages to the administrator of the storage cell whenever critical, warning,
and clear alerts are detected by the cell. In addition to email notification, it is possible to
configure notification using Simple Network Management Protocol (SNMP).
Note: After the initial boot configuration, Restart Server (RS) and Management Server (MS)
should be running. If not, an error message will display when using the CellCLI utility. In that
case, run the following CellCLI commands to start the RS and MS services:
ALTER CELL STARTUP SERVICES RS
ALTER CELL STARTUP SERVICES MS

Creating Cell Disks
CellCLI> CREATE CELLDISK ALL

CellDisk CD_00_cell01 successfully created
...
CellCLI> LIST CELLDISK

CD_00_cell01 normal
...
CD_10_cell01 normal
CD_11_cell01 normal
FD_00_cell01 normal
...
FD_14_cell01 normal
FD_15_cell01 normal
CellCLI>
Creating Cell Disks

After the Exadata cell is first configured, there are 16 flash-based cell disks, which are allocated
to Exadata Smart Flash Cache.
Before you can use the disk-based storage, you must create disk-based cell disks using the
CREATE CELLDISK command. The example in the slide shows the use of the CREATE
CELLDISK ALL command to automatically create 12 disk-based cell disks with default names.
In most cases, you can use the default cell disk names.
If desired, you can configure your cell disks to enable the creation of interleaved grid disks. Use
the following command to create cell disks with interleaving enabled:
CREATE CELLDISK ALL HARDDISK INTERLEAVING='normal_redundancy'
The above example also shows the use of the LIST CELLDISK command to display the disk-
based and flash-based cell disks. Check whether the command shows a status of normal for
all the cell disks.

Creating Grid Disks
CellCLI> CREATE GRIDDISK ALL PREFIX=data, SIZE=300G
GridDisk data_CD_00_cell01 successfully created
...
GridDisk data_CD_11_cell01 successfully created
CellCLI> CREATE GRIDDISK ALL PREFIX=fra
GridDisk fra_CD_00_cell01 successfully created
...
GridDisk fra_CD_11_cell01 successfully created
Slide callout: the first grid disks created use the fastest disk portion. The diagram shows a cell
disk before, and its grid disks after, the commands are run.
CellCLI> LIST GRIDDISK
data_CD_00_cell01 active
...
data_CD_11_cell01 active
fra_CD_00_cell01 active
...
fra_CD_11_cell01 active
CellCLI> exit
[celladmin@cell01 ~]$
Creating Grid Disks

After cell disks are created, you can create grid disks by using the CREATE GRIDDISK
command. In the example in the slide, the ALL PREFIX option is used to automatically create
one grid disk on each cell disk. When the ALL PREFIX option is used, the generated grid disk
names are composed of the grid disk prefix followed by an underscore (_) and then the cell disk
name.
It is best practice to use the ASM disk group name as the prefix name for the corresponding grid
disks. In the example, prefix values data and fra are the names of the ASM disk groups that
will be created. Grid disk names must be unique across all cells within a single deployment. By
following the recommended naming conventions for naming the grid and cell disks, you will
automatically get unique names.
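The naming rule described above can be sketched as follows. This is a minimal illustration of how the default names are composed (prefix, underscore, cell disk name). The helper function is hypothetical and is not part of CellCLI.

```python
def grid_disk_names(prefix, cell_disk_names):
    """Compose default grid disk names: <prefix>_<cell disk name>."""
    return [f"{prefix}_{cell_disk}" for cell_disk in cell_disk_names]


# Twelve disk-based cell disks with default names, as in the slide (cell01),
# plus a second cell (cell02) to show that names stay unique across cells.
cell01_disks = [f"CD_{n:02d}_cell01" for n in range(12)]
cell02_disks = [f"CD_{n:02d}_cell02" for n in range(12)]

data_disks = grid_disk_names("data", cell01_disks)
all_names = data_disks + grid_disk_names("data", cell02_disks)

print(data_disks[0], data_disks[-1])
print(len(all_names) == len(set(all_names)))
```

Because the default cell disk name already contains the cell name, combining it with a disk group prefix yields names that do not collide across cells.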
The optional SIZE attribute specifies the size of each grid disk. If omitted, the grid disk will
automatically consume all the space remaining on the corresponding cell disk.
The LIST GRIDDISK command shows all the grid disks that are created.
Note that for cell disks that are not enabled for interleaving, the first grid disk created on each
cell disk uses the outermost portion of the disk. In this area, each track contains more data than
the inner tracks, resulting in higher transfer rates and better performance. Because the best
available offset is chosen automatically in chronological order of grid disk creation, you should
first create the grid disks expected to contain the most frequently accessed data, and then
create the grid disks that will contain the relatively colder data.
Creating Flash-Based Grid Disks
CellCLI> DROP FLASHCACHE

Flash cache cell01_FLASHCACHE successfully dropped
CellCLI> CREATE FLASHCACHE ALL SIZE=100G

Flash cache cell01_FLASHCACHE successfully created
CellCLI> CREATE GRIDDISK ALL FLASHDISK PREFIX=flash

GridDisk flash_FD_00_cell01 successfully created
GridDisk flash_FD_01_cell01 successfully created
...
GridDisk flash_FD_15_cell01 successfully created
Slide diagram: before, the flash space is all flash cache; after, it is divided between flash cache
and grid disks.
CellCLI> LIST GRIDDISK

...
flash_FD_00_cell01 active
...
flash_FD_15_cell01 active
CellCLI> exit
[celladmin@cell01 ~]$
Creating Flash-Based Grid Disks

By default, the initial cell configuration process creates flash-based cell disks on all the flash
devices, and then allocates all the available flash space to Exadata Smart Flash Cache. In
certain circumstances, you can benefit from creating flash-based grid disks to act as a
permanent flash-based data store. To create space for flash-based grid disks, you first need to
drop the default flash cache. Then you can create a flash cache and flash-based grid disks with
your chosen sizes.
In the example in the slide, the default flash cache is dropped. Next, a new Exadata Smart
Flash Cache is created. The new cache is 100 GB in total size with 6.25 GB of space allocated
on each of the 16 flash-based cell disks.
The CREATE GRIDDISK command is used to create flash-based grid disks in the same way as
for disk-based grid disks. Note the use of the FLASHDISK option to specify the use of
flash-based cell disks as the basis for the grid disks. In the example in the slide, 16 flash-based
grid disks are created and each consumes the remaining 17.75 GB of space available on the
flash-based cell disks. The flash-based grid disks follow the same default naming convention as
disk-based grid disks.
Although this example does not show it, you can create multiple grid disks on a flash-based cell
disk. Unlike physical disk devices, the order in which you allocate your flash space is not
important from a performance perspective. Likewise, interleaving is not applicable for flash-
based disks.
Note: Circumstances that favor the use of flash-based grid disks are discussed in the lesson
titled Optimizing Database Performance with Exadata.
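The sizes quoted for this example follow from simple arithmetic, sketched below. The 100 GB cache size, the 16 flash-based cell disks, and the 17.75 GB remainder come from the example. The per-disk total is derived from those figures and is not stated in the text, and the function name is illustrative.

```python
FLASH_CELL_DISKS = 16  # number of flash-based cell disks in the example


def per_disk_cache_gb(total_cache_gb, disks=FLASH_CELL_DISKS):
    """Flash cache space is spread evenly across the flash-based cell disks."""
    return total_cache_gb / disks


cache_share_gb = per_disk_cache_gb(100)  # CREATE FLASHCACHE ALL SIZE=100G
grid_disk_gb = 17.75                     # remaining space quoted in the notes
implied_cell_disk_gb = cache_share_gb + grid_disk_gb  # derived, not stated

print(cache_share_gb)        # 6.25
print(implied_cell_disk_gb)  # 24.0
```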
Configuring Hosts to Access Exadata Cells 4
1. Create the following directory and files:

# mkdir -p /etc/oracle/cell/network-config
# chown oracle:dba /etc/oracle/cell/network-config
# chmod ug+wx /etc/oracle/cell/network-config
$ cd /etc/oracle/cell/network-config
$ cat - > /etc/oracle/cell/network-config/cellinit.ora
ipaddress1=192.168.50.23/24
$ cat - > /etc/oracle/cell/network-config/cellip.ora
cell="192.168.51.27"
cell="192.168.51.28"
cell="192.168.51.29"
Repeat for each host.
2. Restart database and ASM instances.
Configuring Hosts to Access Exadata Cells

After your Exadata cells are configured, the database server hosts must be configured to use
the cells:
• The cellinit.ora file contains the database server IP address that connects to the
storage network. This file is host specific, so the IP address will be specific to each
database server. The IP address is specified in Classless Inter-Domain Routing (CIDR)
format.
• The cellip.ora file contains the IP addresses that are used by storage cells to send
data to the database server host. These IP addresses correspond to the bonded
InfiniBand interface (bond0) on the cells.
Restart the database and the Oracle ASM instances on the database server host after you finish
creating the cellinit.ora and cellip.ora files. After the files have been configured, they
should not be edited while your database or ASM instances are running.
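When many database servers must be configured, the two files can be generated rather than typed by hand. The following minimal sketch renders the file contents in the layout shown in the slide and validates the addresses first. The function names are hypothetical, and writing the files and setting their ownership are left out.

```python
import ipaddress


def render_cellinit(db_host_cidr):
    """Return cellinit.ora content for one database server (address in CIDR format)."""
    ipaddress.ip_interface(db_host_cidr)  # raises ValueError if malformed
    return f"ipaddress1={db_host_cidr}\n"


def render_cellip(cell_ips):
    """Return cellip.ora content: one cell="<IP>" line per storage cell."""
    for ip in cell_ips:
        ipaddress.ip_address(ip)  # raises ValueError if malformed
    return "".join(f'cell="{ip}"\n' for ip in cell_ips)


# Addresses taken from the slide.
cellinit_text = render_cellinit("192.168.50.23/24")
cellip_text = render_cellip(["192.168.51.27", "192.168.51.28", "192.168.51.29"])
print(cellinit_text, end="")
print(cellip_text, end="")
```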

Configuring ASM and Database Instances for Exadata 5
• Oracle Database and ASM software must be at least
version 11.2.0.1
• Use ASM to store OCR and voting disks on Exadata
• Set the ASM_DISKSTRING ASM initialization parameter:
– ASM_DISKSTRING='o/*/*'
• Set the COMPATIBLE database initialization parameter:
– COMPATIBLE='11.2.0.0.0'
Repeat for each host.
Configuring ASM and Database Instances for Exadata

Oracle Database and Oracle Grid Infrastructure 11g Release 2 (11.2.0.1) or later must be
installed on the database server before you can access Exadata from ASM and database
instances.
If you are using Oracle Clusterware, it is recommended that you place the Oracle Cluster
Registry (OCR) and voting disks on ASM.
To ensure that ASM discovers Exadata grid disks, set the ASM_DISKSTRING initialization
parameter. A search string with the following form is used to discover Exadata grid disks:
o/<cell IP address>/<grid disk name>
Wildcards may be used to expand the search string. For example, to explicitly discover all the
available Exadata grid disks, set ASM_DISKSTRING='o/*/*'. To discover a subset of
available grid disks having names that begin with data, set ASM_DISKSTRING='o/*/data*'.
Note that if the ASM_DISKSTRING initialization parameter is not set, then the default is to
discover all the available Exadata grid disks.
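The effect of the wildcards can be illustrated with a small matching sketch. It compares each of the three components of a grid disk path (o, cell IP address, grid disk name) with the corresponding component of the search string. This is only a model of the behavior described above, using Python's fnmatch module; it does not reproduce ASM's actual discovery code, and the function name is hypothetical.

```python
from fnmatch import fnmatchcase


def matches(discovery_string, disk_path):
    """Return True if each path component matches the corresponding pattern component."""
    pattern_parts = discovery_string.split("/")
    path_parts = disk_path.split("/")
    if len(pattern_parts) != len(path_parts):
        return False
    return all(
        fnmatchcase(part, pattern)
        for part, pattern in zip(path_parts, pattern_parts)
    )


# Example paths built from the cell IP addresses and grid disk names used in this lesson.
disks = [
    "o/192.168.51.27/data_CD_00_cell01",
    "o/192.168.51.27/fra_CD_00_cell01",
    "o/192.168.51.28/data_CD_00_cell02",
]
all_disks = [d for d in disks if matches("o/*/*", d)]
data_only = [d for d in disks if matches("o/*/data*", d)]
print(all_disks)
print(data_only)
```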
To configure a database instance to access cell storage, ensure that the COMPATIBLE
parameter is set to 11.2.0.0.0 or later in the database initialization file.
Note that Database Configuration Assistant (DBCA) 11.2.0.1 does not set the COMPATIBLE
initialization parameter to 11.2.0.0.0 by default, and you must set this parameter on the
Initialization Parameters page.
Configuring ASM Disk Groups for Exadata 6
Disk group DATA

Failure group cell01 Failure group cell02
o/<cell01 IP address>/data_cd_00_cell01 o/<cell02 IP address>/data_cd_00_cell02
... ...
o/<cell01 IP address>/fra_cd_00_cell01 o/<cell02 IP address>/fra_cd_00_cell02

... ...
All candidate disks on cell01 and cell02
CREATE DISKGROUP data NORMAL REDUNDANCY

DISK 'o/*/data*'
ATTRIBUTE 'compatible.rdbms' = '11.2.0.0.0',
'compatible.asm' = '11.2.0.0.0',
'cell.smart_scan_capable' = 'TRUE',
'au_size' = '4M';
Configuring ASM Disk Groups for Exadata

You can now create ASM disk groups from your ASM instance. An ASM disk group can include
Exadata grid disks and conventional disks. However, to enable Smart Scan processing, all the
disks in an ASM disk group must be Exadata grid disks, and the following disk group attribute
settings must be used:
'compatible.rdbms' = '11.2.0.0.0'
'compatible.asm' = '11.2.0.0.0'
'cell.smart_scan_capable' = 'TRUE'
In addition, it is recommended that you set the AU_SIZE disk group attribute value to 4M to
optimize disk scanning.
The example in the slide shows candidate ASM disks from two Exadata cells: cell01 and
cell02. The CREATE DISKGROUP statement references all of the candidate ASM disks having
names that start with data. By default, ASM failure groups corresponding to each cell are
automatically defined. As a result, two failure groups are automatically created using
corresponding grid disks from each cell. By default, the failure group names correspond to the
cell names.
Once created, an Exadata-based disk group can be used to house Oracle data files in the same
way as an ASM disk group based on any other storage. To complement the recommended
AU_SIZE setting of 4 MB, you should set the initial extent size to 8 MB for large segments. This
can be done using segment-level or tablespace-level settings. The recommended approaches
are discussed in the lesson entitled Optimizing Database Performance with Exadata.
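The default failure group assignment described above can be pictured as grouping the discovered grid disks by the cell that serves them. The following minimal sketch models that grouping only; it is not ASM's implementation. The IP-to-cell-name mapping and the function name are assumptions made for the illustration.

```python
from collections import defaultdict


def failure_groups(disk_paths, cell_names):
    """Group grid disk names by cell, keyed by cell name, one group per cell."""
    groups = defaultdict(list)
    for path in disk_paths:
        _, cell_ip, disk_name = path.split("/")  # o/<cell IP address>/<grid disk name>
        groups[cell_names[cell_ip]].append(disk_name)
    return dict(groups)


# Hypothetical mapping of cell IP addresses to cell names.
cell_names = {"192.168.51.27": "cell01", "192.168.51.28": "cell02"}
discovered = [
    "o/192.168.51.27/data_CD_00_cell01",
    "o/192.168.51.27/data_CD_01_cell01",
    "o/192.168.51.28/data_CD_00_cell02",
]
groups = failure_groups(discovered, cell_names)
for name, members in groups.items():
    print(name, members)
```

With NORMAL REDUNDANCY, mirrored copies are placed in different failure groups, so grouping by cell means that the two copies of an extent reside on different cells.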
Optional Configuration Tasks
• Configure Exadata storage security.

• Configure I/O Resource Management (IORM).
Optional Configuration Tasks

After you complete the cell configuration, you can perform the following optional tasks on the
storage cell:
• Configure Exadata storage security.
• Configure I/O Resource Management (IORM). IORM is covered in detail in the lesson
titled Exadata and I/O Resource Management.
Note: Repeat each configuration task on each relevant storage cell.

Exadata Storage Security Overview
Slide diagram: ASM cluster A uses ASM-scoped security mode and ASM cluster B uses
database-scoped security mode. Each cluster has a non-RAC database instance and RAC
database instances that access grid disks on Exadata Cell 1 and Exadata Cell 2.
Exadata Storage Security Overview

Exadata storage security is implemented by controlling which ASM clusters and database
clients can access specific grid disks on storage cells.
• To set up security so that all database clients of an ASM cluster have access to specified
grid disks, configure ASM-scoped security.
• To set up security so that specific database clients of an ASM cluster have access to
specified grid disks, configure database-scoped security.
Both concepts are illustrated in the slide. ASM cluster A shares two grid disks per cell with all of
its database clients. ASM cluster B shares one grid disk per cell to store the single instance
database, and another two grid disks (one per cell) to store the RAC database.
Note: By default, none of these security modes are implemented. This situation is called open
security where all database clients can access all grid disks. Open security does not require any
configuration, and as long as the network and database hosts are well secured you can use this
mode for your production databases. Open security is also useful for non-production
environments such as those that house test or development databases.

Exadata Storage Security Implementation
ASM-scoped security:
• CREATE KEY
• cellkey.ora in /etc/oracle/cell/network-config on the ASM cluster hosts
• ASSIGN KEY FOR <ASM> on each cell
• CREATE/ALTER GRIDDISK availableTo <ASM> for each disk
Database-scoped security:
• CREATE KEY
• cellkey.ora in $ORACLE_HOME/admin/<db_unique_name>/pfile for each database
• ASSIGN KEY FOR <DB> on each cell
• CREATE/ALTER GRIDDISK availableTo <DB> for each disk
Exadata Storage Security Implementation

The slide briefly describes the steps to configure ASM-scoped and database-scoped security. It
is important to realize that you must set up ASM-scoped security first if you want to set up
database-scoped security.
To implement ASM-scoped security, perform the following steps:
1. Shut down your ASM and database instances.
2. Generate a security key using the CREATE KEY CellCLI command. Run this command
once only on any cell.
3. Construct a cellkey.ora file using the generated security key. Copy the cellkey.ora
file into the /etc/oracle/cell/network-config/ directory on every host in your
ASM cluster.
4. Use the ASSIGN KEY command to assign the security key to the Oracle ASM cluster on
all the cells that you want the Oracle ASM cluster to access. The ASM cluster name is
determined by the DB_UNIQUE_NAME initialization parameter setting.
5. Enter the Oracle ASM cluster name in the availableTo attribute with the CREATE
GRIDDISK or ALTER GRIDDISK command to configure security on the grid disks on all
the cells that you want the Oracle ASM cluster to access. At the conclusion of this step,
each grid disk has an association with the ASM cluster that is allowed to use the disk.
6. Restart your ASM and database instances.

Exadata Storage Security Implementation (continued)
After you have configured and tested ASM-scoped security, you can proceed to set up
database-scoped security. Perform the following steps for each database you want to configure
with database-scoped security:
1. Shut down your ASM and database instances.
2. Generate a security key using the CREATE KEY CellCLI command. Run this command
once only on any cell.
3. Construct a cellkey.ora file using the generated security key.
Copy the cellkey.ora file into the
$ORACLE_HOME/admin/<db_unique_name>/pfile/ directory on every host running
your database.
4. Use the ASSIGN KEY command to assign the security key to the database on all the cells
that you want the database to access. The database name is determined by the
DB_UNIQUE_NAME initialization parameter setting.
5. Enter the database name in the availableTo attribute with the CREATE GRIDDISK or
ALTER GRIDDISK command to configure security on the grid disks on all the cells that
you want the database to access. At the conclusion of this step, each grid disk has an
association with the ASM cluster and specific database that is allowed to use the disk.
6. Restart your ASM and database instances.
Note: For more information, including examples and further details, refer to the Oracle Exadata
Storage Server Software User's Guide 11g Release 2 (11.2).
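The three security modes (open, ASM-scoped, and database-scoped) can be summarized as an access decision for each grid disk. The following is a conceptual model only, not the cell's implementation. For clarity it keeps the permitted ASM clusters and databases in two separate sets, whereas the grid disk itself records them in its availableTo attribute. The function name, parameter names, and the cluster and database names are hypothetical.

```python
def can_access(asm_allowed, db_allowed, asm_cluster, database):
    """Decide whether a database client of an ASM cluster may use a grid disk.

    asm_allowed: set of ASM cluster names associated with the grid disk
    db_allowed:  set of database names associated with the grid disk
    """
    if not asm_allowed and not db_allowed:
        return True  # open security: nothing configured on the grid disk
    if asm_cluster not in asm_allowed:
        return False  # the client's ASM cluster is not associated with the disk
    if not db_allowed:
        return True  # ASM-scoped: every database client of the cluster qualifies
    return database in db_allowed  # database-scoped: only the listed databases


print(can_access(set(), set(), "asm_a", "db1"))        # open security
print(can_access({"asm_a"}, set(), "asm_a", "db1"))    # ASM-scoped
print(can_access({"asm_b"}, {"db2"}, "asm_b", "db1"))  # database-scoped
```

The model also reflects the ordering rule in the notes: a database name only takes effect on a grid disk that is already associated with the database's ASM cluster.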

Quiz
Grid disks are seen by ASM by using a discovery string that
starts with:
1. c/
2. o/
3. g/
4. e/
Answer: 2

Quiz
The first grid disk you create uses the slowest tracks of the
corresponding physical disk.
1. TRUE
2. FALSE
Answer: 2

Quiz
When you create a disk group for which you want Exadata
smart storage capabilities enabled, what three attributes must
you specify?
1. compatible.rdbms
2. compatible.asm
3. au_size
4. disk_repair_time
5. cell.smart_scan_capable
Answer: 1, 2, 5

Summary

• Perform the initial Exadata boot sequence
• Configure Exadata software
• Create and configure ASM disk groups using Exadata
• Use the CellCLI Exadata administration tool


– Exadata Cell Configuration
- http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/041ExadataCellConfig/041exadatacellconfig_viewlet_swf.html
– Exadata Storage Provisioning
- http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/042ExadataStorageProvisioning/042exadatastorageprovisioning_viewlet_swf.html
– Consuming Exadata Grid Disks Using ASM
- http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/043ExadataConsumingGridDisks/043exadataconsuminggriddisks_viewlet_swf.html
– Exadata Cell User Accounts
- http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/044ExadataUserAccounts/044exadatauseraccounts_viewlet_swf.html
– Exadata Cell First Boot
- http://st-curriculum.oracle.com/demos/db/11g/r2/exadatav2/cellfirstboot/cellfirstboot.swf
– Another Example of Exadata Cell Configuration
- http://st-curriculum.oracle.com/demos/db/11g/r2/exadatav2/cellcli/cellcli.swf

Configuring Exadata
In these practices, you will perform a variety of Exadata
configuration tasks, including cell configuration and storage
provisioning. You will also consume Exadata storage using
ASM and exercise the privileges associated with the different
cell user accounts.

Exadata Performance Monitoring and
Maintenance

Objectives

• Describe the various performance monitoring facilities
available for Exadata
• Monitor Exadata from directly within a cell, from a
database instance, and through Enterprise Manager
• Interpret SQL execution plans that use Smart Scan
• Outline probable maintenance scenarios

Monitoring Overview
1. Metrics
2. Alerts
3. Active requests
4. Execution plans
5. V$ views
6. Wait events
Monitoring Overview
After Exadata is configured and in use, the administrative focus shifts to ongoing monitoring
and maintenance. To monitor Exadata, you can use the following tools and information:
1. Exadata cell metrics
2. Exadata cell alerts
3. Exadata active requests
4. Database SQL statement execution plans
5. Database V$ views
6. Database wait events
7. Oracle Enterprise Manager Exadata monitoring plug-in

Exadata Metrics and Alerts Architecture
• CELLSRV collects metrics for: Cell, Cell Disks, Grid Disk, IORM, Interconnect.
• MS keeps a set of the metric values: one hour of in-memory metric values
(LIST METRICCURRENT).
• Every hour, MS flushes metric values to disk. The disk holds seven days of metric and alert
history (LIST METRICHISTORY, LIST ALERTHISTORY).
• Alert sources: cell hardware issues, cell software issues, CELLSRV internal errors (ADR),
and metric thresholds exceeded.
• MS notifies the administrator by email and/or SNMP, as configured with ALTER CELL.
Exadata Metrics and Alerts Architecture

You can monitor each cell with Exadata cell metrics. CELLSRV periodically records important
run-time properties, called metrics, for cell components such as CPUs, cell disks, grid disks,
flash cache, and IORM statistics. These metrics are recorded in memory.
Based on its own metric collection schedule, MS gets the set of metric data accumulated by
CELLSRV. MS keeps a subset of the metric values in memory, and writes a history to the disk
repository every hour.
The retention period for metric and alert history entries is specified by the
metricHistoryDays cell attribute. You can modify this setting with the CellCLI ALTER CELL
command. By default, it is seven days. This process is conceptually similar to database AWR
snapshots.
You can get the metric value history by using the CellCLI LIST METRICHISTORY command,
and the current metric values by using the LIST METRICCURRENT command.
At the Exadata cell level, you can define thresholds for metrics. Using the Enterprise Manager
plug-in for Exadata, you can set separate EM thresholds for all the Exadata metrics supported
by the plug-in.
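The retention behavior can be pictured with a small sketch: history entries older than the retention period are no longer kept. This is a conceptual model, assuming that entries are (timestamp, value) pairs. It does not describe how MS stores or purges its repository, and the function and parameter names are illustrative.

```python
from datetime import datetime, timedelta


def prune_history(entries, now, metric_history_days=7):
    """Keep only the history entries that fall within the retention period."""
    cutoff = now - timedelta(days=metric_history_days)
    return [(ts, value) for ts, value in entries if ts >= cutoff]


now = datetime(2011, 1, 15, 12, 0)
history = [
    (datetime(2011, 1, 7, 12, 0), 410),   # eight days old: outside the default period
    (datetime(2011, 1, 8, 12, 0), 415),   # exactly seven days old: retained
    (datetime(2011, 1, 15, 11, 0), 421),  # one hour old: retained
]
kept = prune_history(history, now)
print(len(kept))
```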

Exadata Metrics and Alerts Architecture (continued)
In addition to metrics, Exadata can trigger alerts. Alerts represent events of importance
occurring within the cell, typically indicating that an Exadata cell function is compromised. MS
triggers an alert when it discovers a:
• Cell hardware issue
• Cell software or configuration issue
• CELLSRV internal error
• Metric that has exceeded an alert threshold
You can view triggered alerts using the LIST ALERTHISTORY command. In addition, you can
configure the cell to instruct MS to automatically send an email and/or SNMP messages to a
designated set of storage administrators.

Monitoring Exadata with Metrics
Metric attributes: alertState, metricObjectName, unit, objectType, metricValue, metricType, name, …
• alertState: normal, warning, critical
• unit: number, % (percentage), F (fahrenheit), C (celsius)
• metricType: cumulative, instantaneous, rate, transition
• objectType: IORM_CONSUMER_GROUP, IORM_DATABASE, IORM_CATEGORY, CELL,
CELLDISK, CELL_FILESYSTEM, GRIDDISK, HOST_INTERCONNECT, FLASHCACHE
Threshold attributes (CREATE|ALTER THRESHOLD): name, comparison, critical, occurrences,
observation, warning
Monitoring Exadata with Metrics

Metrics are recorded observations of important run-time properties or internal instrumentation
values of the storage cell and its components, such as cell disks or grid disks. Metrics are a
series of measurements that are computed and retained in memory for an interval of time, and
stored on a disk for a more permanent history.
The graphic in the slide describes some of the important metric attributes. Each metric:
• Has a name and description
• Is associated with a metricObjectName that is the name of the object being measured,
such as a specific cell disk, grid disk, or consumer group
• Belongs to a group that is defined by its objectType attribute. The possible groups are
shown in the slide.
• Has a metricType, which is an indicator of how the statistic was created or defined.
Possible values and their meanings are:
- cumulative: Cumulative statistics since the metric was created
- instantaneous: Value at the time that the metric is collected
- rate: Rates computed by averaging statistics over observation periods
- transition: Collected at the time when the value of the metric changes; typically
captures important transitions in hardware status
• Has a measurement unit. Possible units are shown in the slide.
Monitoring Exadata with Metrics (continued)
Understanding the composition of the metric name provides a good insight into the meaning of
the metric. The value of the name attribute is a composite of abbreviations. The attribute value
starts with an abbreviation of the object type on which the metric is defined:
• CL_ (cell)
• CD_ (cell disk)
• GD_ (grid disk)
• FC_ (flash cache)
• DB_ (database)
• CG_ (consumer group)
• CT_ (category)
• N_ (interconnect network)
After the abbreviation of the object type, many metric names conclude with an abbreviation that
relates to the description of the metric. For example, CL_FANS is the instantaneous number of
working fans on the cell.
I/O-related metric name attributes continue with one of the following combinations to identify the
operation:
• IO_RQ (number of requests)
• IO_BY (number of MB)
• IO_TM (I/O latency)
• IO_WT (I/O wait time)
Next in the name could be _R for read or _W for write. Following that, there might be _SM or _LG
to identify small or large I/Os, respectively. At the end of the name, there could be _SEC to
signify per second or _RQ to signify per request. For example:
• CD_IO_RQ_R_SM is the number of requests to read small blocks on a cell disk.
• GD_IO_BY_W_LG_SEC is the number of MB written per second in large blocks on a grid disk.
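The naming convention described above can be made concrete with a small decoder. The following Python sketch is illustrative only: the function name decode_metric_name and its lookup tables are hypothetical, built solely from the abbreviations listed in this lesson, and it is not part of CellCLI or any Oracle tool.

```python
# Illustrative decoder for Exadata metric names. The tables cover only the
# abbreviations listed in this lesson; anything else is passed through as-is.
OBJECT_TYPES = {
    "CL": "cell", "CD": "cell disk", "GD": "grid disk", "FC": "flash cache",
    "DB": "database", "CG": "consumer group", "CT": "category",
    "N": "interconnect network",
}
OPERATIONS = {
    "RQ": "number of requests", "BY": "number of MB",
    "TM": "I/O latency", "WT": "I/O wait time",
}
QUALIFIERS = {
    "R": "read", "W": "write", "SM": "small I/O", "LG": "large I/O",
    "SEC": "per second", "RQ": "per request",
}


def decode_metric_name(name):
    """Split a metric name into object type, I/O operation, and qualifiers."""
    parts = name.split("_")
    result = {"object": OBJECT_TYPES.get(parts[0], "unknown")}
    rest = parts[1:]
    if len(rest) >= 2 and rest[0] == "IO" and rest[1] in OPERATIONS:
        # I/O-related metric: IO_<operation> followed by optional qualifiers.
        result["operation"] = OPERATIONS[rest[1]]
        result["qualifiers"] = [QUALIFIERS.get(p, p) for p in rest[2:]]
    else:
        # Non-I/O metric such as CL_FANS: keep the remaining text unchanged.
        result["description"] = "_".join(rest)
    return result
```

For example, decode_metric_name("CD_IO_RQ_R_SM") reports a cell disk metric counting requests, qualified as read and small I/O, which matches the first example above.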
If a metric value crosses a user-defined threshold, an alert will be generated. Metrics can be
associated with warning and critical thresholds. Thresholds relate to extreme values in the
metric, which might indicate a problem or other event of interest to an administrator.
Thresholds are supported on cell disk and grid disk I/O error count metrics (CD_IO_ERRS_MIN
and GD_IO_ERRS_MIN), along with the cell memory utilization (CL_MEMUT) and cell filesystem
utilization (CL_FSUT) metrics. In addition, you can set thresholds for I/O Resource Management
(IORM) related metrics. The CellCLI LIST ALERTDEFINITION command lists the metrics for
which thresholds can be set.
Users of Enterprise Manager Grid Control with the Exadata Plug-in can configure a separate set
of thresholds and alerts in the Grid Control environment. These can be used in conjunction with
metrics and alerts from across your systems to provide an enterprise-level view of system health
and state.
Note: For a complete reference of metric and threshold attributes, refer to the Oracle Exadata
Storage Server Software User's Guide. For more information about the Exadata Plug-in for
Enterprise Manager Grid Control, refer to the Oracle Exadata Storage Server Documentation
library.

Monitoring Exadata with Metrics: Example
CellCLI> LIST METRICDEFINITION WHERE objectType = 'CELL' DETAIL
         name:        CL_CPUT
         description: "Cell CPU Utilization is the percentage of time over
                      the previous minute that the system CPUs were not
                      idle (from /proc/stat)."
         metricType:  Instantaneous
         objectType:  CELL
         unit:        %
         ...
CellCLI> LIST METRICHISTORY WHERE name like 'CL_.*' -
         AND collectionTime > '2009-10-11T15:28:36-07:00'
         CL_RUNQ  cell03_2  6.0     2009-10-11T15:28:37-07:00
         CL_CPUT  cell03_2  47.6 %  2009-10-11T15:29:36-07:00
         CL_FANS  cell03_2  1       2009-10-11T15:29:36-07:00
         CL_TEMP  cell03_2  0.0 C   2009-10-11T15:29:36-07:00
         CL_RUNQ  cell03_2  5.2     2009-10-11T15:29:37-07:00
         ...
CellCLI> LIST METRICCURRENT WHERE objectType = 'CELLDISK'
         CD_IO_TM_W_SM_RQ  CD_1_cell03  205.5 us/request
         ...
Monitoring Exadata with Metrics: Example

The slide shows you some basic commands that you could use to display metric information:
• Use the LIST METRICDEFINITION command to display the metric definitions for the
cell. A metric definition describes the configuration of a metric. The example does not
specify any particular metric, so all metrics corresponding to the WHERE clause are printed.
In addition to the WHERE clause, you can also specify the metric definition attributes you
want to print. If the ATTRIBUTES clause is not used, a default set of attributes is displayed.
To list all the attributes, you can add the DETAIL keyword at the end of the command.
• Use the LIST METRICHISTORY command to display the metric history for the cell. A
metric history describes a collection of past metric observations. Similar to the LIST
METRICDEFINITION command, you can specify attribute filters, an attribute list, and the
DETAIL keyword for the LIST METRICHISTORY command. The above example lists
metrics having names that start with CL_ that were collected after the specified time.
• Use the LIST METRICCURRENT command to display the current metric values for the
cell. The above example lists all cell disk metrics. The metric values shown in the slide
correspond to the average latency per request of writing small blocks to a cell disk. For
this metric there is a metric observation for every cell disk.
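When metric history is captured for offline analysis, each output row has to be split back into its attributes. The following Python sketch assumes the whitespace-separated layout shown in the slide for LIST METRICHISTORY (name, object, value, optional unit, timestamp); the function name parse_metric_line is hypothetical, and real output may differ, for example if values contain thousands separators.

```python
from datetime import datetime


def parse_metric_line(line):
    """Parse one metric history row laid out as in the slide example."""
    fields = line.split()
    name, obj, value = fields[0], fields[1], float(fields[2])
    # The unit is optional: it is whatever sits between value and timestamp.
    unit = " ".join(fields[3:-1]) or None
    # The timestamp is ISO 8601 with a UTC offset, e.g. 2009-10-11T15:29:36-07:00.
    collected = datetime.fromisoformat(fields[-1])
    return {"name": name, "object": obj, "value": value,
            "unit": unit, "collectionTime": collected}
```

A row such as the CL_CPUT line in the slide yields a value of 47.6 with unit "%", while the CL_RUNQ rows yield no unit.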

Monitoring Exadata with Alerts
[Slide graphic: alert attributes]
Alert attributes: name, alertSource, severity, alertType, metricObjectName, metricName,
examinedBy, alertAction, alertMessage, failedMail, failedSNMP, beginTime, endTime,
notificationState, …
• alertSource: BMC, ADR, Metric
• severity: warning, critical, clear, info
• alertType: stateful, stateless
• notificationState: 0, 1, 2, 3
Example: ALTER ALERTHISTORY ALL examinedBy="<administrator>"
Monitoring Exadata with Alerts

Alerts represent events of importance occurring within the storage cell, typically indicating that
storage cell functionality is either compromised or in danger of failure. An administrator should
investigate alerts, because they might require urgent corrective or preventative action. Use the
ALTER CELL command to configure email or SNMP notification for alerts.
Alerts are either stateful or stateless. Stateful alerts represent observable cell states that can be
subsequently retested to detect whether the state has changed, indicating that a previously
observed alert condition is no longer a problem. Stateless alerts represent point-in-time events
that do not represent a persistent condition; they simply show that something has occurred.
Alerts can have one of the following severities: warning, critical, clear, or info.
Examples of possible events that trigger alerts are physical disk failure, disk read/write errors,
cell temperature exceeding recommended value, cell software failure, and excessive I/O
latency.
Metrics can be used to signal stateful alerts using warning or critical threshold values.
When the metric value crosses the threshold value, an alert is signaled. An alert with a clear
severity indicates that a previous critical or warning condition has returned to normal. For
threshold-based alerts, a clear alert is generated when the measured value crosses back over
the threshold value.

Monitoring Exadata with Alerts (continued)
Alerts with an info severity are stateless and log conditions that might be informative to an
administrator but for which no administrator action is required. Informational alerts are not
distributed by email or SNMP notifications.
The slide illustrates some of the important alert attributes. Each alert has the following attributes:
• name provides an identifier for the alert.
• alertSource provides the source of the alert. Some possible sources are listed in the
slide.
• severity determines the importance of the alert. Possible values are warning,
critical, clear, and info.
• alertType provides the type of the alert: stateful or stateless. Stateful alerts are
automatically cleared on transition to normal. Stateless alerts are never cleared unless you
change the alert by setting the examinedBy attribute. This attribute identifies the
administrator who reviewed the alert and is the only alert attribute that can be modified by
the administrator using the ALTER ALERTHISTORY command.
• metricObjectName is the object for which a metric threshold has caused an alert.
• metricName provides the metric name if the alert is based on a metric.
• alertAction is the recommended action to perform for this alert.
• alertMessage provides a brief explanation of the alert.
• failedMail is the intended email recipient when a notification failed.
• failedSNMP is the intended SNMP subscriber when a notification failed.
• beginTime provides the timestamp when an alert changes its state.
• endTime provides the timestamp for the end of the period when an alert changes its state.
• notificationState indicates progress in notifying subscribers to alert messages:
- 0: never tried
- 1: sent successfully
- 2: retrying (up to 5 times)
- 3: five failed retries
Note: Some I/O errors may result in an ASM disk going offline without generating an alert in
Exadata. You should continue to perform I/O monitoring from your databases and ASM
environments to identify and remedy these kinds of problems.

Displaying Alert Examples
CellCLI> LIST ALERTDEFINITION ATTRIBUTES name, metricName, description
         ADRAlert                                        "CELL Incident Error"
         HardwareAlert                                   "Hardware Alert"
         StatefulAlert_CG_IO_RQ_LG      CG_IO_RQ_LG      "Threshold Based Stateful Alert"
         StatefulAlert_CG_IO_RQ_LG_SEC  CG_IO_RQ_LG_SEC  "Threshold Based …Alert"
         StatefulAlert_CG_IO_RQ_SM      CG_IO_RQ_SM      "Threshold Based Stateful Alert"
         ...
CellCLI> LIST ALERTHISTORY WHERE severity = 'critical' -
         AND examinedBy = '' DETAIL
CellCLI> ALTER ALERTHISTORY 1671443814 examinedBy="JFV"
CellCLI> CREATE THRESHOLD ct_io_wt_lg_rq.interactive -
         warning=1000, critical=2000, comparison='>', -
         occurrences=2, observation=5
Displaying Alert Examples

The slide shows you some examples of commands that display alert information. The
commands for displaying alerts are very similar to the ones used for displaying metric
information:
• Use the LIST ALERTDEFINITION command to display the definition for every alert that
can be produced on the cell. The example in the slide displays the alert name, metric
name, and description. The metric name identifies the metric on which the alert is based.
ADRAlert and HardwareAlert are not based on any metric and, therefore, do not have
metric names.
• Use the LIST ALERTHISTORY command to display the history of alerts that have occurred on
a cell. The example in the slide lists in detail all critical alerts that have not been reviewed
by an administrator.
• Use the ALTER ALERTHISTORY command to update the alert history for the cell. The
above example shows how to set the examinedBy attribute to the user ID of the
administrator that examined the alert. The examinedBy attribute is the only
ALERTHISTORY attribute that can be modified. The example uses the alert sequence ID to
identify the alert. alertSequenceID provides a unique sequence ID number for the alert.
When an alert changes its state, another occurrence of the alert is created with the same
sequence number but with a different timestamp.

Displaying Alert Examples (continued)
• The CREATE THRESHOLD command creates a threshold that specifies the conditions for
generation of a metric alert. The example creates a threshold for the CT_IO_WT_LG_RQ
metric associated with the INTERACTIVE category. This metric specifies the average
number of milliseconds that large I/O requests issued by the category have waited to be
scheduled by IORM in the past minute. A large value indicates that the I/O workload from
this category is exceeding the allocation specified for it in the category plan. The alert is
triggered by two consecutive measurements (occurrences=2) over the threshold values:
one second for a warning alert (warning=1000) and two seconds for a critical alert
(critical=2000). The observation attribute is the number of measurements over
which measured values are averaged.
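The interaction between the occurrences and observation attributes is easier to follow in code. The following Python sketch is one interpretation of the description above, not Oracle's implementation: it assumes that each measurement is replaced by the mean of the last observation measurements, and that a severity is reported once that mean has exceeded the corresponding threshold for occurrences consecutive measurements. The function name evaluate_threshold is hypothetical, only the '>' comparison is modeled, and the state returns to normal as soon as the mean drops back.

```python
from collections import deque


def evaluate_threshold(samples, warning, critical, occurrences, observation):
    """Return 'normal', 'warning', or 'critical' after each sample.

    Assumed semantics (an interpretation, not Oracle's code): the value
    compared with the thresholds is the mean of the last `observation`
    samples, and a level applies once that mean has exceeded its threshold
    for `occurrences` consecutive samples.
    """
    window = deque(maxlen=observation)
    over_warning = over_critical = 0
    states = []
    for sample in samples:
        window.append(sample)
        average = sum(window) / len(window)
        # Count consecutive averaged values above each threshold.
        over_warning = over_warning + 1 if average > warning else 0
        over_critical = over_critical + 1 if average > critical else 0
        if over_critical >= occurrences:
            states.append("critical")
        elif over_warning >= occurrences:
            states.append("warning")
        else:
            states.append("normal")
    return states
```

With warning=1000, critical=2000, and occurrences=2, a single measurement above 1000 does not change the state; the second consecutive one does.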

Monitoring Exadata with Active Requests
[Slide graphic: active request attributes]
Active request attributes: name, id, parentID, ioType, ioReason, ioGridDisk, ioBytes, ioOffset,
requestState, objectNumber, sessionID, sessionSerNumber, sqlID, tablespaceNumber, dbName,
instanceNumber, consumerGroupName, fileType, asmDiskGroupNumber, asmFileIncarnation,
asmFileNumber
• ioType: file initialization, read, write, predicate pushing, filtered backup read,
  predicate push read
Example:
LIST ACTIVEREQUEST -
     WHERE ioType = 'predicate pushing' -
     DETAIL
Monitoring Exadata with Active Requests

An active request provides a client-centric or application-centric view of client I/O requests that
are currently being processed by a cell.
The slide shows the most important attributes of an active request. You can see that an active
request is characterized at all levels: instance, database, ASM, and cell. Most of the attributes
have self-explanatory names. Here is a brief explanation of some of the attributes:
• ioReason is the reason for the I/O activity, such as a control-file read.
• ioType identifies the type of active request. Possible values are listed in the slide.
• requestState identifies the state of the active request. Possible values include:
- Accessing Disk - Computing Result
- Network Receive - Network Send
- Queued Extent - Queued for Disk
- Queued for File Initialization - Queued for Filtered Backup Read
- Queued for Network Send - Queued for Predicate Pushing
- Queued for Read - Queued for Write
- Queued in Resource Manager
Use the LIST ACTIVEREQUEST command to display active request details for the cell. The
syntax is very similar to other LIST commands. You can specify which attributes to display or
you can display them all using the DETAIL clause. You can also filter the output using a WHERE
clause.
Monitoring SQL Execution Plans
Relevant Initialization Parameters:

• CELL_OFFLOAD_PROCESSING
– TRUE | FALSE
– Enables or disables Smart Scan and other smart storage
capabilities
– Dynamically modifiable at the session or system level using
ALTER SESSION or ALTER SYSTEM
– Specifiable at the statement level using the OPT_PARAM hint
• CELL_OFFLOAD_PLAN_DISPLAY
– NEVER | AUTO | ALWAYS
– Allows execution plan to show offloaded predicates
– Dynamically modifiable at the session or system level using
ALTER SESSION or ALTER SYSTEM
Monitoring SQL Execution Plans

The CELL_OFFLOAD_PROCESSING initialization parameter enables SQL processing offload to
Exadata. The default value of the parameter is TRUE, which means that predicate evaluation can
be offloaded to Exadata. If set to FALSE, the database performs all the predicate evaluation with
cells serving blocks like traditional storage. To enable offloading for a particular SQL statement,
use the OPT_PARAM hint as shown in the following example:
SELECT /*+ OPT_PARAM('cell_offload_processing' 'true') */ ...
The CELL_OFFLOAD_PLAN_DISPLAY initialization parameter determines whether the SQL
EXPLAIN PLAN statement displays the predicates that can be evaluated by Exadata as
STORAGE predicates for a given SQL statement. The possible values are:
• AUTO instructs the SQL EXPLAIN PLAN statement to display the predicates that can be
evaluated as STORAGE only if a cell is present and if a table is on the cell.
• ALWAYS produces changes to the SQL EXPLAIN PLAN statement whether or not Exadata
is present or the table is on the cell. You can use this setting to identify statements that are
candidates for offloading before migrating to Exadata.
• NEVER produces no changes to the SQL EXPLAIN PLAN statement due to Exadata. This
may be desirable, for example, if you wrote tools that process execution plan output and
these tools have not been updated to deal with new syntax or when comparing plans for
two systems: one with Exadata and one without.

Smart Scan Execution Plan Example
SQL> alter session set CELL_OFFLOAD_PROCESSING = TRUE;

Session altered.
SQL> alter session set CELL_OFFLOAD_PLAN_DISPLAY = ALWAYS;
Session altered.
SQL> explain plan for select * from customers where c_customer_sk < 10;
Explained.
SQL> select * from table(dbms_xplan.display);
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 196 | 326 (1)|
| 1 | PX COORDINATOR | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 196 | 326 (1)|
| 3 | PX BLOCK ITERATOR | | 1 | 196 | 326 (1)|
|* 4 | TABLE ACCESS STORAGE FULL| CUSTOMER | 1 | 196 | 326 (1)|
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - storage("C_CUSTOMER_SK"<10) filter("C_CUSTOMER_SK"<10)
Smart Scan Execution Plan Example

Smart Scan is enabled by default for direct read operations. So Exadata uses only direct reads,
not the buffer cache, to process queries that can be offloaded.
Exadata optimization is a run-time decision, and it is not integrated with the Oracle optimizer. So
offloading is possible only for full scans and is available only with segments stored on disk
groups that are completely stored on Exadata.
If Exadata is not sure that a block is current, it transfers the read of that block to the traditional
buffer cache/read consistency path. So if you run updates at the same time as queries, you will
benefit less from Smart Scan than if you were executing a read-only workload. This is also true
for indirect rows.
The slide shows an example of SQL processing offload manifested in a query plan. The first
command enables offloading for the session. The second command enables storage predicates
to be shown in the SQL execution plans of the session, even if Exadata is not present.
At the bottom of the plan output, you can see the STORAGE operation indicating the predicate
being offloaded to Exadata ("C_CUSTOMER_SK"<10).
Note: Smart Scan is also available for index fast full scans, not just table scans. Also, you
cannot see column projection in a query plan.
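If you save plan output from many statements, for example to find offload candidates with CELL_OFFLOAD_PLAN_DISPLAY set to ALWAYS, the storage predicates can be picked out mechanically. The following Python sketch is a hypothetical helper, not an Oracle utility: the name storage_predicates is invented here, it assumes the "Predicate Information" layout shown in the slide, and it does not handle predicates that contain nested parentheses.

```python
import re

# Matches the "<id> - " prefix that starts each entry in the
# "Predicate Information" section of DBMS_XPLAN output.
ID_PREFIX = re.compile(r"^\s*(\d+)\s+-\s+")
# Captures the text inside storage(...); nested parentheses are not handled.
STORAGE_PREDICATE = re.compile(r"storage\((.*?)\)")


def storage_predicates(plan_text):
    """Map plan operation ids to the list of storage predicates shown for them."""
    found = {}
    current_id = None
    for line in plan_text.splitlines():
        prefix = ID_PREFIX.match(line)
        if prefix:
            current_id = int(prefix.group(1))
        predicate = STORAGE_PREDICATE.search(line)
        if predicate and current_id is not None:
            found.setdefault(current_id, []).append(predicate.group(1))
    return found
```

Applied to the plan in the slide, the helper reports the predicate "C_CUSTOMER_SK"<10 for operation 4; a plan with no storage predicates produces an empty result.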

Predicate Offloading Considerations
Predicate evaluation is not offloaded when:
• CELL_OFFLOAD_PROCESSING is set to FALSE
• The table or partition being scanned is small
• The optimizer decides not to use direct path reads
• A scan is performed on a clustered table
• The table has row dependencies enabled or the rowscn is being
fetched
• The optimizer wants the scan to return rows in ROWID order
• The command is CREATE INDEX using NOSORT
• A LOB or LONG column is being selected or queried
• A scan is performed on a flashback table
• The data is encrypted and cell-based decryption is disabled
• The tablespace is not completely stored on Exadata
• More than 255 columns are used in a query
• The predicate evaluation is on a virtual column
Predicate Offloading Considerations

The slide lists the cases where predicate evaluation is not offloaded to Exadata. The following
provides additional information for some of these cases:
• The optimizer decides not to use direct path reads: Direct path reads are mostly used by
parallel operations. Serial operations can do direct reads too, depending on factors such
as the table size and the state of the buffer cache. Direct path reads can also be forced
for serial access by setting _serial_direct_read to TRUE.
• The data is encrypted and cell-based decryption is disabled: In order for Exadata to
perform decryption, Oracle Database needs to send the decryption key to Exadata. If
there are security concerns about keys being shipped across the storage network, you
can disable cell-based decryption by setting the CELL_OFFLOAD_DECRYPTION
parameter to FALSE.

Monitoring Exadata from Your Database
[Slide graphic: V$ views and statistics for monitoring Exadata from the database]
• V$CELL (populated from cellip.ora): CELL_PATH, CELL_HASHVAL
• V$SQL: SQL_TEXT, PHYSICAL_READ_BYTES, PHYSICAL_WRITE_BYTES, IO_INTERCONNECT_BYTES,
  IO_CELL_OFFLOAD_ELIGIBLE_BYTES, IO_CELL_UNCOMPRESSED_BYTES,
  IO_CELL_OFFLOAD_RETURNED_BYTES, OPTIMIZED_PHY_READ_REQUESTS, ...
• V$BACKUP_DATAFILE: DATAFILE_BLOCKS, BLOCKS_READ, BLOCKS_SKIPPED_IN_CELL, ...
• V$SYSSTAT (NAME, VALUE) statistics:
  - cell flash cache read hits
  - cell physical IO bytes saved during optimized RMAN file restore
  - cell physical IO bytes eligible for predicate offload
  - cell physical IO interconnect bytes
  - cell physical IO interconnect bytes returned by smart scan
  - cell physical IO bytes saved by storage index
  - cell physical IO bytes saved during optimized file creation
  - cell IO uncompressed bytes
  - physical read total bytes
  - physical write total bytes
  - ...
Monitoring Exadata from Your Database

You can use the following V$ views and corresponding statistics to monitor Exadata from a
database instance:
• V$CELL provides the cell IP address extracted from the cellip.ora file. It also
contains a numeric hash value for the cell which is used as an identifier for the cell in
other views, such as V$SESSION_WAIT and V$ACTIVE_SESSION_HISTORY.
• V$BACKUP_DATAFILE contains statistics relevant to Exadata during RMAN incremental
backups. The BLOCKS_SKIPPED_IN_CELL column indicates the number of blocks that
were filtered in Exadata to optimize the RMAN incremental backup.
• V$SYSSTAT and V$SESSTAT contain key statistics that can be used to compute
Exadata effectiveness at both the system and session level. Statistics in these views
can be used to monitor the effectiveness of Exadata Smart Flash Cache, Exadata
Hybrid Columnar Compression, SQL offloading, storage indexes, fast file creation, and
optimized incremental backups. In addition, other statistics provide the total volume of
I/O exchanged over the interconnect and the total volume of physical disk reads and
writes.
• V$SQL lists statistics on shared SQL areas. It contains statement-level statistics for the
volume of physical I/O (reads and writes), the volume of I/O exchanged over the
interconnect, along with information relating to the effectiveness of Exadata Smart Flash
Cache, Exadata Hybrid Columnar Compression, and SQL offloading.
Note: For more information, refer to the Oracle Exadata Storage Server Software User's
Guide.
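As one illustration of computing Exadata effectiveness from these statistics, the following Python sketch compares two of the V$SYSSTAT statistics named in the slide. It is a minimal sketch under stated assumptions: the function name smart_scan_savings and the ratio itself are illustrative rather than an Oracle-defined measure, and the statistic values are assumed to have been fetched already into a dictionary keyed by statistic name. Treat the result as a rough indicator only.

```python
ELIGIBLE = "cell physical IO bytes eligible for predicate offload"
RETURNED = "cell physical IO interconnect bytes returned by smart scan"


def smart_scan_savings(stats):
    """Return the fraction of offload-eligible bytes not shipped to the database.

    `stats` maps statistic names (V$SYSSTAT.NAME) to values (V$SYSSTAT.VALUE).
    Returns None when no bytes were eligible for offload.
    """
    eligible = stats.get(ELIGIBLE, 0)
    if eligible == 0:
        return None
    returned = stats.get(RETURNED, 0)
    return 1 - returned / eligible
```

For instance, if 1000 bytes were eligible for predicate offload and Smart Scan returned 250 bytes over the interconnect, the function returns 0.75.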
Monitoring Exadata with Wait Events
SELECT w.event, c.cell_path, d.name, w.p3
FROM V$SESSION_WAIT w, V$EVENT_NAME e, V$ASM_DISK d, V$CELL c
WHERE e.name LIKE 'cell%' AND e.wait_class_id = w.wait_class_id
AND w.p1 = c.cell_hashval AND w.p2 = d.hash_value;

Wait Event                        Description
cell interconnect retransmit      Database wait during retransmission for an I/O of a
during physical read              single-block or multiblock read
cell single block physical read   Cell equivalent of db file sequential read
cell multiblock physical read     Cell equivalent of db file scattered read
cell smart table scan             Database wait for table scan to complete
cell smart index scan             Database wait for index or IOT fast full scan
cell smart file creation          Database wait for file creation operation
cell smart incremental backup     Database wait for incremental backup operation
cell smart restore from backup    Database wait during file initialization for restore
cell statistics gather            Wait during query of V$CELL views
Monitoring Exadata with Wait Events

Oracle uses a specific set of wait events for disk I/O to Exadata that identifies the corresponding
cell and grid disk being accessed. This information is more useful for performance and
diagnostics purposes than the database file number and block number information that is
provided by wait events for conventional storage.
Information about wait events is displayed in V$ dynamic performance views, such as
V$SESSION_WAIT, V$SYSTEM_EVENT and V$SESSION_EVENT.
The slide shows an example of a query used to display the cell IP address and disk name
corresponding to cell wait events. A list of cell wait events with a brief description is also shown.
Most of the cell wait events are self-explanatory.
The cell statistics gather event is a little different. It appears when a select is done on
the V$CELL_STATE, V$CELL_THREAD_HISTORY, or V$CELL_REQUEST_TOTALS view. During
such a query, data from the cells and any wait events are shown in this wait event. Normally,
these V$CELL views are only used by Oracle Support Services.
Note: For more information about these wait events, refer to the Oracle Exadata Storage Server
Software User's Guide.

Monitoring Exadata with Enterprise Manager

The System Monitoring Plug-In for Exadata extends Grid Control to add support for managing
Exadata targets. By deploying the plug-in to your Grid Control environment, you gain the
following management features:
• Monitor Exadata, individually or in groups, using a GUI environment.
• Gather storage configuration and performance information of various Exadata related
storage components, such as grid disks and cell disks.
• Raise alerts and violations based on thresholds set for monitoring and configuration data.
• Provide rich out-of-the-box metrics and reports based on the gathered data.
This plug-in supports the following versions of products:
• Exadata Storage Server software release 11.2 and later.
• Enterprise Manager Grid Control 10g Release 3 (10.2.0.3) and later (OMS and Agent).
The plug-in requires SSH connectivity between the celladmin user on the Exadata cells and
the Management Agent user on the computer running the Management Agent. Typically, the
agent on the OMS server is used to monitor Exadata, but another agent can be used.
Before you set up alerts in Grid Control, you must configure the Exadata cells to send SNMP
alerts to the Management Agent that is monitoring them.
Note: Refer to the Enterprise Manager System Monitoring Plug-In Installation Guide for Sun
Oracle Exadata Storage Server Release 10 (1.1.4.0.0) for more information about the plug-in.
Additional Monitoring Tools and Utilities
Facility                     Application                  More Information
Integrated Lights Out        Primarily storage server     http://docs.sun.com/app/docs/coll/ilom3.0 for
Manager (ILOM)               hardware. Also network       more information about ILOM
                             and operating system
Standard Linux monitoring    Primarily storage server     http://linux.oracle.com for more information
tools and utilities          operating system. Also       about Oracle Linux
(vmstat, iostat, top,        network and hardware         See Linux man pages for specific utilities.
syslog, and so on)
Additional Monitoring Tools and Utilities

In addition to the facilities described so far in this lesson, Exadata administrators have
numerous tools and utilities that are provided to monitor Exadata hardware, operating system
and network components. While thresholds and alerts provide the primary method for
highlighting issues and failures, there are circumstances where suboptimal performance will
not lead to an alert, but will be clearly manifested using a specific tool or utility. Also, where
administrators are already familiar with some tools, they are able to apply existing knowledge,
skills, procedures, and even code to assist in monitoring and maintaining Exadata.
The table in the slide lists some of the additional tools and utilities that are provided. You can
find more information using the resources listed in the table. Integrated Lights Out Manager
(ILOM) is introduced in more detail in the lesson entitled Monitoring and Maintaining
Database Machine.

Cell Maintenance Overview
• Planned maintenance
– Examples
— Patch or upgrade Exadata software
– Procedure overview
1. Take the corresponding ASM failure groups offline.
2. Execute your planned maintenance operation.
3. Bring the ASM failure groups back online.
• Unplanned maintenance
– Examples
— Disk failure, Cell hardware failure, or CELLSRV process failure
– Procedure overview
1. Remedy the failure.
2. Bring online or re-create the affected ASM failure groups.
Cell Maintenance Overview

Cell maintenance operations can be broadly divided into two categories: planned and
unplanned.
Planned maintenance will most likely involve patching or upgrading the Exadata software but it
may also include replacing or upgrading an item of hardware prior to a failure. For example, you
might replace one of the storage server power supplies as a planned maintenance operation
because it is no longer functioning but its failure has not affected the operation of the cell.
If you are following the recommended pattern of maintaining ASM redundancy using failure
groups on multiple Exadata cells, then planned maintenance operations can be undertaken with
minimal disruption to normal processing. In essence, you need to:
1. Take offline the failure groups associated with the cell being maintained. You should
define a maintenance window, which provides ample time for the maintenance operation.
This can be achieved by specifying a timeout clause in your ALTER DISKGROUP ...
OFFLINE statement or by setting the DISK_REPAIR_TIME ASM disk group attribute.
2. Perform the required maintenance operation within the planned maintenance window.
3. Bring the affected ASM failure groups back online.

Cell Maintenance Overview (continued)
If the maintenance operation takes longer than planned, you can extend the maintenance
window. If the window has already expired, you must re-create the ASM disks and failure groups.
Some failure scenarios can be automatically remedied by Exadata. For example, if the
CELLSRV process is killed, the restart server process (RS) will restart CELLSRV and in most
cases processing will continue uninterrupted.
In situations where failure cannot be automatically remedied, an unplanned maintenance
operation is required. Unplanned maintenance may be required as a result of a disk failure,
some other form of cell hardware failure, or unrecoverable cell software failure.
If you are following the recommended pattern of maintaining ASM redundancy using failure
groups on multiple Exadata cells, then the failure event will automatically cause your affected
ASM disks to be taken offline. If possible, remedy the failure. After the failure is remedied, the
affected ASM disks can either be brought back online or will need to be re-created, depending
on whether the failure is remedied before the amount of time specified by the
DISK_REPAIR_TIME ASM disk group attribute elapses. If the failure cannot be repaired, the failure
groups associated with the failed cell will be dropped and should be re-created on another cell.
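The choice between bringing disks back online and re-creating them comes down to comparing the time the disks have been offline with the DISK_REPAIR_TIME setting. The following Python sketch only illustrates that comparison; the function name recovery_action is hypothetical, the durations are supplied by the caller, and treating the exact boundary as expired is an assumption.

```python
from datetime import timedelta


def recovery_action(offline_for, disk_repair_time):
    """Return 'online' if the repair window is still open, else 're-create'.

    Both arguments are timedelta values: how long the ASM disks have been
    offline, and the configured DISK_REPAIR_TIME.
    """
    return "online" if offline_for < disk_repair_time else "re-create"
```

For example, with a repair window of four hours, disks that have been offline for two hours can be brought back online, whereas disks offline for five hours must be re-created.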
Note: The general procedures described in this section rely on ASM redundancy (mirroring)
across multiple Exadata cells to maintain at least one copy of data online during a planned or
unplanned maintenance operation. If you do not implement ASM redundancy across multiple
Exadata cells or you suffer simultaneous failures that affect all of your data copies, then you will
need to rebuild your database (in whole or in part) or perform a recovery operation.

Automated Cell Maintenance Operations
• Automatic addition of a replacement disk to the original disk group:
– Cell disk and grid disks are automatically re-created.
– Each grid disk is added back to the original disk group.
• Automatic cell restart:
– Grid disks are brought online when a cell restarts.
• Automatic firmware upgrades:
– A golden firmware copy is kept on the cell and flashed to
replacement components:
— Immediately for a disk replacement
— During reboot for other components, such as the motherboard,
InfiniBand HCA, disk controller, flash card, and so on
Automated Cell Maintenance Operations

Exadata simplifies various cell maintenance operations by automating tasks that previously
required administrator intervention:
• Automatic addition of a replacement disk to the original disk group: When a replacement
disk is inserted after a physical disk failure, the cell disk and grid disks are automatically
re-created, and each grid disk is automatically added back to the original ASM disk group.
The same occurs after the replacement of a flash card containing flash-based grid disks.
• Automatic cell restart: Grid disks are automatically changed to online when a cell recovers
from a failure, or after a restart.
• Automatic firmware upgrades: A copy of the required firmware, called the golden version,
is kept on the cell as part of the cell software distribution.
- If a disk is replaced, the new disk is automatically flashed with the golden version of
the firmware before being rebuilt as previously mentioned.
- For other components, the cell must be shut down to replace the component. After
the Exadata cell is powered on, it will apply the golden version of the firmware on the
new component and restart. For a replacement motherboard, the storage server will
shut down and the administrator will need to power on once again.
- If a firmware change is made while the cell is running, a periodic check will raise an
alert if a component does not match the golden firmware level. After the cell is
rebooted, it will update itself by reapplying the golden firmware version.
Replacing a Damaged Physical Disk
1. Determine the damaged disk.
CellCLI> LIST ALERTHISTORY -
WHERE ALERTMESSAGE LIKE "Logical drive lost.*" DETAIL
Logical drive lost. Lun:0_5. Status: normal. Physdisk: 20:5.
Celldisk on it: CD_05_cell01. Griddisks on it: data_CD_05_cell01.
The suggested action is: Refer to section Maintaining Physical Disks in
the User Guide.
2. Replace the physical disk (20:5). Use LIST PHYSICALDISK to confirm that its status is normal.
3. Monitor ASM to confirm the readdition of the disk.
SQL> SELECT NAME, STATE FROM V$ASM_DISK
SQL> SELECT * FROM GV$ASM_OPERATION
Replacing a Damaged Physical Disk

Replacing a physical disk due to a problem or failure is probably the most likely hardware
maintenance operation that Exadata might ever require. Assuming you are using ASM
redundancy, the procedure to replace a problem disk is quite simple.
The first step requires that you identify the problem disk. This could occur in a number of ways:
• Hardware monitoring using ILOM may report a problem disk.
• If a disk fails, an Exadata alert is generated. The alert includes specific instructions for
replacing the disk. If you have configured the system for alert notifications, the alert will be
sent to the designated email address or SNMP target. You can also use the LIST
ALERTHISTORY command shown in the slide to identify the failed disk.
• The LIST PHYSICALDISK command may identify a disk reporting a status of warning
or critical. Even if the cell is still functioning, the problem may be a precursor to a disk
failure.
• The CALIBRATE command may identify a disk delivering abnormally low throughput or
IOPS. Even if the cell is still functioning, a single bad physical disk can degrade the
performance of other good disks so you may decide to replace the identified disk. Note
that running CALIBRATE at the same time as the cell is active will impact performance.
After you have identified the problem disk, you can replace it. When you remove the disk, you
will get an alert. When you replace a physical disk, the disk must be acknowledged by the RAID
controller before it can be used. This does not take a long time, and you can use the LIST
PHYSICALDISK command to monitor the status until it returns to normal.
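When alerts are forwarded by email or SNMP, it can help to pull the disk identifiers out of the alert text automatically. The following Python sketch parses the sample "Logical drive lost" message shown in the slide. It assumes that real alerts follow the same "Label: value." layout as that sample and that multiple grid disks, if present, are separated by commas or spaces. The function name parse_lost_drive_alert is hypothetical.

```python
import re

# Sample alert text taken from the slide for this topic.
ALERT = (
    "Logical drive lost. Lun:0_5. Status: normal. Physdisk: 20:5. "
    "Celldisk on it: CD_05_cell01. Griddisks on it: data_CD_05_cell01."
)

# Each field is "<label>: <value>." and the sample values contain no periods.
FIELD = re.compile(
    r"(Lun|Status|Physdisk|Celldisk on it|Griddisks on it):\s*([^.]+)\."
)


def parse_lost_drive_alert(message: str) -> dict:
    """Extract the LUN, physical disk, cell disk, and grid disks from an alert."""
    fields = {label: value.strip() for label, value in FIELD.findall(message)}
    grid_disks = re.split(r"[,\s]+", fields.get("Griddisks on it", ""))
    return {
        "lun": fields.get("Lun"),
        "status": fields.get("Status"),
        "physical_disk": fields.get("Physdisk"),
        "cell_disk": fields.get("Celldisk on it"),
        "grid_disks": [name for name in grid_disks if name],
    }


info = parse_lost_drive_alert(ALERT)
print(info["physical_disk"])  # 20:5
print(info["grid_disks"])     # ['data_CD_05_cell01']
```

The physical disk name (20:5 in this sample) identifies the disk to replace and then to watch with LIST PHYSICALDISK, and the grid disk names indicate which ASM disks to monitor afterward.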
Replacing a Damaged Physical Disk (continued)
The grid disks and cell disks that existed on the previous disk in the slot will be automatically
re-created on the new disk. If these grid disks were part of an Oracle ASM disk group with
NORMAL or HIGH redundancy, they will be added back to the disk group and the data will be
rebalanced based on disk group redundancy and the asm_power_limit parameter.
Re-creating the ASM disk and rebalancing the data may take some time to complete. You can
monitor the progress of these operations within ASM. You can monitor the status of the disk as
reported by V$ASM_DISK.STATE until it returns to NORMAL. You can also monitor the rebalance
progress using GV$ASM_OPERATION.
Review the following considerations when replacing a failed disk:
• If the repair timer (specified in the DISK_REPAIR_TIME ASM initialization parameter) has
not expired, the ASM disk could be offline (not dropped) and the disk group is yet to be
rebalanced. In this case, the prompt replacement of the failed disk can avoid a needless
rebalance operation.
• The disk could be dropped by Oracle Automatic Storage Management (Oracle ASM), and
the rebalance operation may have been successfully run. Check the Oracle ASM alert logs
to confirm this. After the failed disk is replaced, a second rebalance will be required.
• The disk could be dropped, and the rebalance operation is currently running. Check the
GV$ASM_OPERATION view to determine if the rebalance operation is still running. In this
case the rebalance operation following the disk replacement will be queued.
• The disk could be dropped by ASM, and the rebalance operation failed. Check
GV$ASM_OPERATION.ERROR to determine why the rebalance operation failed. Monitor
the rebalance operation following the disk replacement to ensure it runs.
• Rebalance operations from multiple disk groups can be done on different Oracle ASM
instances in the same cluster if the physical disk being replaced contains grid disks from
multiple disk groups. Multiple rebalance operations cannot be run in parallel on just one
Oracle ASM instance. The operations will be queued for the instance.

Replacing a Damaged Flash Card
1. Determine the damaged flash card.
CellCLI> LIST PHYSICALDISK DETAIL
name: [9:0:2:0]
diskType: FlashDisk
...
slotNumber: "PCI Slot: 1; FDOM: 2"
status: critical
2. Power down the cell.
3. Replace the flash card.
4. Power up the cell.
5. If the card contained a flash-based grid disk, monitor ASM to confirm the readdition of the disk.
SQL> SELECT NAME, STATE FROM V$ASM_DISK
SQL> SELECT * FROM GV$ASM_OPERATION
Replacing a Damaged Flash Card

Each Exadata server is equipped with 4 PCI flash memory cards. Each card has 4 flash
modules (FDOMs) for a total of 16 flash modules on each cell.
Identifying a damaged flash module is similar to identifying a damaged physical disk. Hardware
monitoring using ILOM or a drop in performance indicated by the CALIBRATE command may
indicate a problem. If a failed FDOM is detected, an alert is generated. The alert message
indicates whether any flash-based grid disks were on the flash module.
As shown in the slide, a damaged flash module can also be reported using the LIST
PHYSICALDISK DETAIL command. The slotNumber attribute shows the PCI slot and the
FDOM number. In this example, the status attribute indicates a critical fault.
If there were no grid disks on the flash module, the flash module was probably being used for
Exadata Smart Flash Cache. In this mode, the bad flash module results in a decreased amount
of flash memory on the cell. The performance of the cell is affected proportional to the size of
flash memory lost, but the database and applications are not at risk of failure.
Although technically the PCI slots in an Exadata server are hot-replaceable, it is recommended to
power down the cell while servicing a damaged flash card. After replacing the card and
powering up the cell, no additional steps are required to re-create any flash-based grid disks.
Optionally, you can monitor ASM to confirm the readdition of a flash-based grid disk.
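To locate the card to service, the slotNumber attribute can be split into its PCI slot and FDOM number, and the share of flash lost can be estimated from the 4 cards x 4 FDOMs layout described above. The following Python sketch is illustrative only: the function names are hypothetical, and the fraction calculation assumes that all 16 flash modules are the same size.

```python
import re

CARDS_PER_CELL = 4   # PCI flash cards per Exadata server (from the text)
FDOMS_PER_CARD = 4   # flash modules per card (from the text)


def parse_slot_number(slot_number: str) -> tuple:
    """Split a slotNumber value such as 'PCI Slot: 1; FDOM: 2' into (slot, fdom)."""
    match = re.fullmatch(r"PCI Slot:\s*(\d+);\s*FDOM:\s*(\d+)", slot_number.strip())
    if match is None:
        raise ValueError(f"unrecognized slotNumber: {slot_number!r}")
    return int(match.group(1)), int(match.group(2))


def flash_fraction_lost(failed_fdoms: int) -> float:
    """Fraction of the cell's flash lost, assuming equally sized modules."""
    return failed_fdoms / (CARDS_PER_CELL * FDOMS_PER_CARD)


pci_slot, fdom = parse_slot_number("PCI Slot: 1; FDOM: 2")
print(pci_slot, fdom)          # 1 2
print(flash_fraction_lost(1))  # 0.0625
```

With one failed module used for Exadata Smart Flash Cache, the cell would be running with 15 of its 16 modules, which is the proportional reduction the text refers to.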

Moving All Disks from One Cell to Another
1. Make the grid disks inactive:

CellCLI> ALTER GRIDDISK ALL INACTIVE
2. Back up the operating system configuration files that will
change when the new cell is booted.
3. Move the disks from the original cell to the new cell.
• Ensure the system disks occupy the first two slots.
4. Boot the new cell.
5. Restart Exadata cell services:
CellCLI> ALTER CELL RESTART SERVICES ALL
6. Import the cell disks:
CellCLI> IMPORT CELLDISK ALL
Moving All Disks from One Cell to Another

You may need to move all drives from one Exadata server to another Exadata server. This may
be necessary when there is a chassis-level component failure, or when troubleshooting a
hardware problem. To move the drives, perform the following steps:
1. If possible, use the ALTER GRIDDISK ALL INACTIVE command to make the grid disks
inactive.
2. If possible, back up /etc/hosts, /etc/modprobe.conf, and the files in
/etc/sysconfig/network. This is a precautionary step if you want to retain the
settings associated with your original Exadata server in case you plan to move the disks
back to the original Exadata server in the future.
3. Move the disks from the original Exadata cell to the new Exadata cell.
Caution: Ensure the first two disks, which are the system disks, are in the same first two
slots. Failure to do so will cause the Exadata cell to not function properly.
4. Start the cell. The cell operating system will be automatically reconfigured to suit the new
server hardware.
5. Restart the cell services using ALTER CELL RESTART SERVICES ALL.
6. Import the cell disks using IMPORT CELLDISK ALL.
If you are using ASM redundancy and the procedure is completed before the amount of time
specified in the DISK_REPAIR_TIME ASM initialization parameter, then the ASM disks will be
automatically brought back online and updated with any changes made during the cell outage.

Using the Exadata Software Rescue Procedure
• Every Exadata server is equipped with a CELLBOOT USB
flash drive to facilitate cell rescue
– Cell rescue is required in the unlikely event that both system
disks fail simultaneously
– Use with extreme caution
• To perform cell rescue:
1. Connect to Exadata using the console
2. Boot the cell, and as soon as you see the "Oracle Exadata"
splash screen, press any key on the keyboard
3. In the displayed list of boot options, select the last option,
CELL_USB_BOOT_CELLBOOT_usb_in_rescue_mode, and
press Enter
4. Select the rescue option, and proceed with the rescue
5. Reconfigure the cell
Using the Exadata Software Rescue Procedure

Exadata maintains mirrored system areas on separate physical disks. If one system area
becomes corrupt or unavailable, Exadata can use the mirrored copy to recover.
In the rare event that both system disks fail simultaneously, you must use the rescue
functionality provided on the CELLBOOT USB flash drive that is built into every Exadata
server.
It is important to note the following when using the rescue procedure:
• Use extreme caution when using this procedure, and pay attention to the prompts. The
rescue procedure can potentially rewrite some or all of the disks in the cell. If this
happens, then you can irrevocably lose the contents of those disks. Ideally, you should
use the rescue procedure only with assistance from Oracle Support Services.
• The rescue procedure does not destroy the contents of the data disks or the contents of
the data partitions on the system disks unless you explicitly choose to do so during the
rescue procedure.
• The rescue procedure restores the Exadata software to the same release. This includes
any patches that existed on the cell as of the last successful boot.

Using the Exadata Software Rescue Procedure (Continued)
• The following are not restored by the rescue procedure:
- The crash kernel support rpms, kernel-debuginfo-common and
kernel-debuginfo. You will need to reinstall them. These cannot be restored due to
space limitations on the CELLBOOT USB flash drive.
- Some cell configuration details, such as alert configurations, SMTP information,
and administrator e-mail address. Note that the cell network configuration is
restored, along with SSH identities for the cell, and the root, celladmin and
cellmonitor users.
- ILOM configurations. Typically, ILOM configurations remain undamaged even in
case of Exadata software failures.
• The rescue procedure does not examine or reconstruct data disks or data partitions
on the system disks. If you have data corruption on the grid disks, then do not use the
rescue procedure. Instead use the database backup and recovery procedures.
The following rescue options are available for the rescue procedure:
• Partial reconstruction recovery: During partial reconstruction recovery, the rescue
process re-creates partitions on the system disks and checks the disks for the
existence of a file system. If a file system is discovered, then the process attempts to
boot. If the cell boots successfully, then you use the CellCLI commands, such as
LIST CELL DETAIL, to verify the cell is usable. You must also recover any data
disks, as appropriate. If the boot fails, then you must use the full original build
recovery option.
• Full original build recovery: This option rewrites the system area of the system disks to
restore the Exadata software. It also allows you to erase any data on the data disks,
and data partitions on the system disks.
• Re-creation of the CELLBOOT USB flash drive: This option is used to make a copy of
the CELLBOOT USB flash drive.
To perform a rescue using the CELLBOOT USB flash drive:
1. Connect to Exadata using the console.
2. Boot the cell, and as soon as you see the "Oracle Exadata" splash screen, press any
key on the keyboard. The splash screen remains visible for only 5 seconds.
3. In the displayed list of boot options, scroll down to the last option,
CELL_USB_BOOT_CELLBOOT_usb_in_rescue_mode, and press Enter.
4. Select the rescue option, and proceed with the rescue.
5. After a successful rescue, you must reconfigure the cell to return it to the pre-failure
configuration, and reinstall the kernel-debuginfo and kernel-debuginfo-common
rpms to use crash kernel support. If you chose to preserve the data when
prompted by the rescue procedure, then import the cell disks. If you chose not to
preserve the data, then you should create new cell disks, and grid disks.

Quiz
You can define thresholds for all Exadata metrics.

1. TRUE
2. FALSE
Answer: 2
Thresholds are supported on cell disk and grid disk I/O error count metrics (CD_IO_ERRS_MIN
and GD_IO_ERRS_MIN), along with the cell memory utilization (CL_MEMUT) and cell file system
utilization (CL_FSUT) metrics. In addition, you can set thresholds for I/O Resource Management
(IORM) related metrics. The CellCLI LIST ALERTDEFINITION command lists the metrics for
which thresholds can be set.

Quiz
You enable SQL processing offload using the
CELL_OFFLOAD_PLAN_DISPLAY initialization parameter.
1. TRUE
2. FALSE
Answer: 2
The CELL_OFFLOAD_PROCESSING parameter is used to enable SQL processing offload.

Summary

• Describe the various performance monitoring facilities
available for Exadata
• Monitor Exadata from directly within a cell, from a
database instance and through Enterprise Manager
• Interpret SQL execution plans that use offloading
• Outline probable maintenance scenarios


– Monitoring Exadata Using Metrics, Alerts and Active Requests
  http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/051ExadataMetricsAlerts/051exadatametricsalerts_viewlet_swf.html
– Monitoring Exadata From Within Oracle Database
  http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/052ExadataDBMonitoring/052exadatadbmonitoring_viewlet_swf.html
– Exadata High Availability
  http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/053ExadataHighAvailability/053exadatahighavailability_viewlet_swf.html

Monitoring Exadata
In these practices, you will monitor Exadata using metrics,
alerts and active requests. You will also monitor Exadata
statistics using dynamic performance views (V$ views) in your
database. Finally, you will exercise Exadata high availability by
examining the effect of a cell crash.

Exadata and I/O Resource Management

Objectives
After completing this lesson, you should be able to use Exadata
I/O Resource Management to manage workloads within a
database and across multiple databases.

I/O Resource Management Overview
• Traditional benefits of shared storage:

– Lower administration costs
– More efficient use of storage
• Common challenge for shared storage:
– Workloads interfere with each other. For example:
— Large queries impact on each other
— Data loads impact on warehouse queries
— Batch workloads interfere with OLTP performance
• Exadata I/O Resource Management allows you to govern
I/O resource usage among different:
– User types
– Workload types
– Applications
– Databases
I/O Resource Management Overview

Storage is often shared by different workloads on multiple databases. Shared storage provides
some important benefits:
• When a storage system is dedicated to a single database, the administrator must size the
storage system based on the database’s peak anticipated load and size. The correct
balance of storage resources is seldom achieved because real-world workloads are very
dynamic. This leads to unused I/O bandwidth and space on some systems, whereas
others suffer with insufficient bandwidth and space. Sharing facilitates more efficient usage
of storage space and I/O bandwidth.
• Sharing lowers administration costs by reducing the number of storage systems.
Shared storage, however, is not a perfect solution. Running multiple types of workloads and
databases on shared storage often leads to performance problems. For example, large parallel
queries on one production data warehouse can impact the performance of critical queries on
another production data warehouse. Also, a data load on a data warehouse can impact the
performance of critical queries also running on it. You can mitigate these problems by
overprovisioning the storage system, but this diminishes the cost savings of shared storage. You
can also avoid running noncritical tasks at peak times, but manually achieving this is laborious.
When databases have different administrators who do not coordinate their activities, the task is
even more difficult.

I/O Resource Management Overview (continued)
I/O Resource Management (IORM) allows workloads and databases to share Exadata I/O
resources automatically according to user-defined policies. To manage workloads within a
database, you can define intradatabase resource plans using the Database Resource Manager
(DBRM), which has been enhanced to work in conjunction with Exadata. To manage workloads
across multiple databases, you can define IORM plans.
For example, if a production database and a test database are sharing an Exadata cell, you can
configure resource plans that give priority to the production database. In this case, whenever the
test database load would affect the production database performance, IORM will schedule the
I/O requests such that the production database I/O performance is not impacted. This means
that the test database I/O requests are queued until they can be issued without disturbing the
production database I/O performance.

I/O Resource Management Concepts
[Diagram: Database A contains the Finance, HR, and Reporting consumer groups. Database B
contains the OnlineQuery, BatchQuery, and ETL consumer groups. Consumer groups from both
databases are collected into two categories, Interactive and Batch.]
I/O Resource Management Concepts

A database often has many types of workloads. These workloads may differ in their
performance requirements and the amount of I/O that they issue. Resource consumer groups
provide a way to group sessions that comprise a particular workload. For example, if your
database is running four different applications, you can create four consumer groups, one for
each application. Alternatively, if your data warehouse has three types of workloads, such as
critical queries, normal queries, and ETL (extraction, transformation, and loading), then you can
create a consumer group for each type of workload. After you have created the consumer
groups, you must create rules that specify how sessions are mapped to consumer groups.
The database resource plan, or intradatabase resource plan, specifies how resources are
allocated among consumer groups in a database. A database may have multiple resource
plans; however, only one resource plan can be active at any point in time. This allows database
resource management to cater for different requirements associated with different time periods.
Exadata IORM extends the consumer group concept using categories. While consumer groups
represent collections of users within a database, categories represent collections of consumer
groups across all databases. The diagram in the slide shows an example of two categories
containing consumer groups across two databases. You can manage I/O resources based on
categories by creating a category plan. For example, you can give precedence to consumer
groups in the Interactive category over consumer groups in the Batch category for all the
databases sharing an Exadata cell.
I/O Resource Management Plans
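[Diagram: I/O Resource Management works inside one database, using an intradatabase
resource plan, and across multiple databases, using an interdatabase resource plan and a
category resource plan. The interdatabase plan and the category plan together form the IORM plan.]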
I/O Resource Management Plans

IORM provides different approaches for managing resource allocations. Each approach can be
used independently or in conjunction with other approaches.
Database resource management enables you to manage workloads within a database.
Database resource management is configured within each database, using Database Resource
Manager to create an intradatabase resource plan. You should use this feature if you have
multiple types of workloads within a database and you need to define a policy for specifying how
these workloads share the database resource allocation. If only one database is using Exadata,
this is the only IORM feature that you need.
Interdatabase resource management is managed with an interdatabase plan. An interdatabase
plan specifies how resources are allocated among multiple databases for each cell. The
directives in an interdatabase plan specify allocations to databases, rather than consumer
groups.
Category resource management is an advanced feature. It is useful when Exadata is hosting
multiple databases and you want to allocate resources primarily by the category of the work
being done. For example, suppose all databases have three categories of workloads: OLTP,
reports, and maintenance. To allocate the I/O resources based on these workload categories,
you would use category resource management.
Note: The combination of the interdatabase plan and the category plan is called the IORM plan.

IORM Architecture
[Diagram: Each database (Database A through Database Z) sends I/O requests to the cells.
Each I/O request is tagged with the database name, the request type, and the consumer group.
Inside CELLSRV on the Exadata cell, each database has a set of consumer group queues
(CG1 through CGn) and background (BG) queues. IORM uses the resource plans and
performance statistics to select requests from these queues and pass them to the disk queue
of the cell disk.]
IORM Architecture
IORM manages Exadata I/O resources on a per-cell basis. Whenever the I/O requests start to
saturate the cell, IORM schedules incoming I/O requests according to the configured resource
plans. IORM schedules I/O by selecting requests from different CELLSRV queues. The resource
plans are used to determine the order in which the queued I/O requests are issued to disk. The
goal of IORM is to fully utilize the available disk resources. Any allocation that is not fully utilized
is made available to other workloads in proportion to the configured resource plans.
IORM only intervenes when needed. For example, IORM does not intervene if there is only one
active consumer group on one database because there is no possibility of contention with
another consumer group or database.
Background I/Os are scheduled based on their priority relative to the user I/Os. For example,
redo writes and control file I/Os are critical to performance and are always prioritized above all
user I/Os. Writes by the database writer process (DBWn) are scheduled at the same priority level
as user I/Os.
The diagram in the slide illustrates the high-level implementation of IORM. For each cell disk,
each database accessing the cell has one I/O queue per consumer group and three background
I/O queues. The background I/O queues correspond to high, medium, and low priority requests
with different I/O types mapped to each queue. If you do not set an intradatabase resource plan,
all nonbackground I/O requests are grouped into a single consumer group called
OTHER_GROUPS.
Note: IORM is only used to manage I/O requests to physical disks. IORM does not manage
requests to flash-based grid disks or requests serviced by Exadata Smart Flash Cache.
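The queue layout described above (one queue per consumer group plus three background queues, per database, per cell disk) can be illustrated with a short count. This Python sketch only restates that rule. The function name queues_per_cell_disk and the consumer group counts in the example are hypothetical.

```python
BACKGROUND_QUEUES_PER_DATABASE = 3  # high, medium, and low priority


def queues_per_cell_disk(consumer_groups_per_database: dict) -> int:
    """Count the CELLSRV I/O queues that exist for a single cell disk.

    The argument maps each database name to its number of consumer groups.
    """
    return sum(
        groups + BACKGROUND_QUEUES_PER_DATABASE
        for groups in consumer_groups_per_database.values()
    )


# Hypothetical example: two databases, each with four consumer groups.
with_plans = queues_per_cell_disk({"Database A": 4, "Database B": 4})

# No intradatabase plan set: user I/O falls into the single OTHER_GROUPS group.
without_plans = queues_per_cell_disk({"Database A": 1, "Database B": 1})

print(with_plans)     # 14
print(without_plans)  # 8
```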
I/O Resource Management Plans Example
[Diagram: Database A (single instance) and Database B (RAC) each have an intradatabase plan
created with DBMS_RESOURCE_MANAGER, for example consumer group 1: 15% in Plan A and
consumer group 5: 22% in Plan B. Both plans are sent to the Exadata Storage Server, where the
IORM plan, defined with CellCLI, controls the I/O distribution to the disks.]
Interdatabase Plan (CellCLI): Database A: 70%, Database B: 30%
Category Plan (CellCLI): INTERACTIVE: 60%, BATCH: 40%

For each database, you can use DBRM to create an intradatabase resource plan. When you set
an intradatabase resource plan, a description of the plan is automatically sent to each cell. In
the example in the slide, Database A and Database B have separate intradatabase plans. Note
also that each consumer group in each intradatabase plan is associated with either the
INTERACTIVE or BATCH category.
At each cell, an interdatabase plan can be configured and enabled. In the example in the slide,
the interdatabase plan is configured with a larger resource allocation for Database A (70%) than
for Database B (30%).
Also within each cell, you can categorize consumer groups from different databases and
distribute I/O resources according to the various categories. In the example in the slide, the
INTERACTIVE category (60%) is allocated a greater resource share than the BATCH category
(40%).

All User I/Os (100%)
Category             Interdatabase        Intradatabase
BATCH (40%)          Database A (70%)     15%, 10%
BATCH (40%)          Database B (30%)     22%, 18%
INTERACTIVE (60%)    Database A (70%)     35%, 40%
INTERACTIVE (60%)    Database B (30%)     15%, 45%
I/O Resource Management Plans Example (continued)

The category, interdatabase, and intradatabase plans are used together by Exadata to allocate
I/O resources.
The category plan is first used to allocate resources among the categories. When a category is
selected, the interdatabase plan is used to select a database; only databases that have
consumer groups with the selected category can be selected. Finally, the selected database’s
intradatabase plan is used to select one of its consumer groups. The percentage of resource
allocation represents the probability of making a selection at each level.
Expressing this as a formula:
Pcgn = cgn / sum(catcgs) * db% * cat%
where:
• Pcgn is the probability of selecting consumer group n
• cgn is the resource allocation for consumer group n
• sum(catcgs) is the sum of the resource allocations for all consumer groups in the same
category as consumer group n and on the same database as consumer group n
• db% is the database allocation percentage in the interdatabase plan
• cat% is the category allocation percentage in the category plan
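The formula can be checked with a few lines of Python. This is a minimal sketch rather than anything Exadata runs: the function name iorm_share is hypothetical, and the inputs are the slide values for Database A (70%), whose BATCH (40%) consumer groups are allocated 15% and 10% and whose INTERACTIVE (60%) consumer groups are allocated 35% and 40%.

```python
def iorm_share(cg, category_cgs, db_pct, cat_pct):
    """Pcgn = cgn / sum(catcgs) * db% * cat%, returned as a percentage.

    cg           -- intradatabase allocation of consumer group n
    category_cgs -- allocations of all consumer groups in the same category
                    and database as consumer group n (including n itself)
    db_pct       -- database allocation in the interdatabase plan
    cat_pct      -- category allocation in the category plan
    """
    return cg / sum(category_cgs) * (db_pct / 100) * (cat_pct / 100) * 100


# Database A, BATCH category: consumer groups at 15% and 10%.
batch_share = iorm_share(15, [15, 10], db_pct=70, cat_pct=40)

# Database A, INTERACTIVE category: consumer groups at 35% and 40%.
interactive_share = iorm_share(35, [35, 40], db_pct=70, cat_pct=60)

print(round(batch_share, 1))        # 16.8
print(round(interactive_share, 1))  # 19.6
```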

I/O Resource Management Plans Example (continued)
The hierarchy used to distribute I/Os is illustrated in the slide. The example is continued from
the previous slide but the consumer group names are abbreviated to CG1, CG2, and so on.
Notice that although each consumer group allocation is expressed as a percentage within each
database, IORM is concerned with the ratio of consumer group allocations within each category
and database. For example, CG1 nominally receives 16.8% of I/O resources from IORM
(15/(15+10)*70%*40%); however, this does not change if the intradatabase plan allocations for
CG1 and CG2 are doubled to 30% and 20%, respectively. This is because the allocation to CG1
remains 50% greater than the allocation to CG2. This behavior also explains why CG1 (16.8%)
and CG3 (19.6%) have a similar allocation through IORM even though CG3 belongs to the
higher priority category (60% versus 40%) and has a much larger intradatabase plan allocation
(35% versus 15%).
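Both observations, that only the ratio within a category and database matters and that the nominal shares add up to 100% of user I/O, can be verified over the whole hierarchy. The Python sketch below uses the allocations read from the slide. The names CG4 and CG6 through CG8 are assumed for illustration (the text names only CG1, CG2, CG3, and consumer group 5), and the function iorm_shares is hypothetical.

```python
CATEGORY_PLAN = {"BATCH": 40, "INTERACTIVE": 60}
INTERDATABASE_PLAN = {"A": 70, "B": 30}

# (category, database) -> intradatabase allocations, as read from the slide.
# CG4 and CG6-CG8 are assumed names for the groups the text does not name.
INTRADATABASE = {
    ("BATCH", "A"): {"CG1": 15, "CG2": 10},
    ("BATCH", "B"): {"CG5": 22, "CG6": 18},
    ("INTERACTIVE", "A"): {"CG3": 35, "CG4": 40},
    ("INTERACTIVE", "B"): {"CG7": 15, "CG8": 45},
}


def iorm_shares(intradatabase):
    """Return the nominal IORM share (in percent) of every consumer group."""
    shares = {}
    for (category, database), groups in intradatabase.items():
        total = sum(groups.values())
        for name, pct in groups.items():
            shares[name] = (
                pct / total
                * INTERDATABASE_PLAN[database]
                * CATEGORY_PLAN[category]
                / 100
            )
    return shares


shares = iorm_shares(INTRADATABASE)

# Double the CG1 and CG2 allocations: their ratio, and so their share, is unchanged.
doubled = dict(INTRADATABASE)
doubled[("BATCH", "A")] = {"CG1": 30, "CG2": 20}
doubled_shares = iorm_shares(doubled)

print(round(shares["CG1"], 1), round(doubled_shares["CG1"], 1))  # 16.8 16.8
print(round(shares["CG3"], 1))                                   # 19.6
print(round(sum(shares.values()), 1))                            # 100.0
```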
Note: ASM I/Os (for rebalance and so on) and I/Os issued by Oracle background processes are
handled separately and automatically by Exadata. For clarity, background I/Os are not shown in
the example.

Enabling Intradatabase Resource Management
• You can enable intradatabase resource management:

– Manually:
— Set the database’s RESOURCE_MANAGER_PLAN parameter.
– Automatically:
— Create a job scheduler window.
— Associate a resource plan with the window.
• Exadata is notified when an intradatabase resource plan is
set or modified:
– Enabled or modified plan sent to each cell using iDB
• You must activate the IORMPLAN on all Exadata cells.
• Following are the commonly used intradatabase plans:
– mixed_workload_plan
– dss_plan
– default_maintenance_plan
Enabling Intradatabase Resource Management

An intradatabase resource plan can be manually enabled with the RESOURCE_MANAGER_PLAN
initialization parameter or automatically enabled using the job scheduler.
When you set an intradatabase resource plan on the database, a description of the plan is
automatically sent to each cell. When a new cell is added or an existing cell is restarted, the
current intradatabase plan is automatically sent to the cell. This resource plan is used to
manage resources on both the database server and cells.
Before IORM can be used, you must activate the IORMPLAN on all corresponding Exadata cells.
Oracle Database provides several predefined intradatabase plans. The most commonly used
are mixed_workload_plan, dss_plan and default_maintenance_plan.
Intradatabase plans do not contain a directive for background I/O activity. Background I/Os are
scheduled based on their priority relative to the user I/Os. For example, redo writes, and control
file reads and writes are critical to performance and are always prioritized above all user I/Os.
Note: When an Oracle RAC database uses Exadata, all instances in the Oracle RAC cluster
must be set to the same resource plan.

Intradatabase Plan Example
BEGIN
DBMS_RESOURCE_MANAGER.CREATE_SIMPLE_PLAN(SIMPLE_PLAN => 'my_plan',
CONSUMER_GROUP1 => 'high_priority', GROUP1_PERCENT => 80,
CONSUMER_GROUP2 => 'low_priority' , GROUP2_PERCENT => 20);
END;
/
ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'my_plan';
Consumer Group    Level 1    Level 2    Level 3
SYS_GROUP         100%
HIGH_PRIORITY                80%
LOW_PRIORITY                 20%
OTHER_GROUPS                            100%
The plan is sent directly to the Exadata cells via iDB.
Percentages are used for both CPU and I/O resources.
CellCLI> ALTER IORMPLAN ACTIVE
Intradatabase Plan Example

The intradatabase I/O resource plan specifies how I/O resources are allocated among
consumer groups in a specific database.
An intradatabase I/O resource plan is created with the procedures in the
DBMS_RESOURCE_MANAGER PL/SQL package. There are no specific I/O resource parameters
or procedures. You create an intradatabase I/O resource plan exactly the same way as you
would create a CPU resource plan. When you specify an allocation percentage, this percentage
applies to both database server CPU and Exadata I/O resources if you are using Exadata.
There are no specific I/O settings because typically you are constrained by CPU or I/O, but not
both at the same time. The intradatabase I/O resource plan is applicable only when the
database uses Exadata.
The example in the slide uses the CREATE_SIMPLE_PLAN procedure to create MY_PLAN. This
resource plan is used to manage CPU resources at the database level, and I/O resources at the
Exadata cell level.
Before I/O resources for an intradatabase plan can be managed by Exadata I/O Resource
Management, you need to make sure that the IORMPLAN is active. This can be done by
executing the ALTER IORMPLAN ACTIVE command.

Enabling IORM for Multiple Databases
• Enable IORM for multiple databases by configuring an
IORMPLAN:
– The category plan assigns I/O resources using categories.
– The interdatabase plan assigns I/O resources using
database names.
– All combinations are possible.
• Use CellCLI to define and activate the IORMPLAN on each
cell.
• Configure the same IORMPLAN on each cell.
• Only one IORMPLAN can be active at a time on a cell.
• IORMPLAN settings are persistent across cell reboots.
• All databases get equal allocations in the absence of an
IORMPLAN.
Enabling IORM for Multiple Databases

I/O resource management for multiple databases is configured with the IORMPLAN. The
IORMPLAN specifies how I/O resources are allocated for each cell. If you are using multiple
cells, you need to configure them all. In most cases, all of your cells should use the same
IORMPLAN.
The IORMPLAN contains both an interdatabase plan, also called a DB plan, and a category plan.
The directives in the DB plan specify I/O resource allocations to database names, rather than
consumer groups. The directives in the category plan specify I/O resource allocations to
categories, rather than databases or consumer groups. The IORMPLAN is configured and
enabled with CellCLI on each cell. Only one IORMPLAN can be active on a cell at any given
time.
At startup, the IORMPLAN is an empty string, which effectively turns off IORM. In that case all
databases receive an equal allocation.
The IORMPLAN must be activated for I/O resource management to occur. When the IORMPLAN
is deactivated, IORM will not manage I/O resources, even if an intradatabase resource plan is
set or an IORMPLAN is configured.

Interdatabase Plan Example
CellCLI> alter iormplan -
> dbplan=((name=sales_prod, level=1, allocation=80), -
> (name=finance_prod, level=1, allocation=20), -
> (name=sales_dev, level=2, allocation=100), -
> (name=sales_test, level=3, allocation=50), -
> (name=other, level=3, allocation=50)), -
> catplan=''
CellCLI> alter iormplan active

Database        Level 1    Level 2    Level 3
sales_prod      80%
finance_prod    20%
sales_dev                  100%
sales_test                            50%
other                                 50%
Interdatabase Plan Example

On each Exadata cell, an interdatabase plan specifies how resources are divided among
multiple databases. The directives in an interdatabase plan specify allocations to databases,
rather than consumer groups. The interdatabase plan is configured and activated with CellCLI,
on each cell.
The above example implements an interdatabase plan following the directives shown in the
table.
The interdatabase plan is created by specifying the DBPLAN part of the IORMPLAN. The
interdatabase plan is similar to an intradatabase plan in that each directive consists of a level
from 1 to 8 and an allocation amount in percentage terms. For a given plan, all the
allocations at any level must add up to 100 or less. An interdatabase plan differs from an
intradatabase plan in that it cannot contain subplans and it only contains I/O resource directives.
As a best practice, you should create a directive for each database using the same Exadata
cell. To make sure that any database without an explicit directive can be managed, you need to
create an allocation named OTHER.
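The rules stated above (levels from 1 to 8, allocations at each level adding up to 100 or less, and an OTHER directive as a best practice) can be checked before a plan is typed into CellCLI. The following Python sketch is only an illustration; the function name check_plan and the tuple layout are assumptions made for this example and are not part of CellCLI or any Oracle tool.

```python
from collections import defaultdict


def check_plan(directives):
    """Check (name, level, allocation) tuples against the rules in the text.

    Returns a list of human-readable problems; an empty list means no rule
    described in this lesson was violated.
    """
    problems = []
    totals = defaultdict(int)
    for name, level, allocation in directives:
        if not 1 <= level <= 8:
            problems.append(f"{name}: level {level} is outside 1..8")
        totals[level] += allocation
    for level in sorted(totals):
        if totals[level] > 100:
            problems.append(
                f"level {level}: allocations sum to {totals[level]}, above 100"
            )
    if not any(name.lower() == "other" for name, _, _ in directives):
        problems.append("no OTHER directive for databases without an explicit entry")
    return problems


# Directives taken from the interdatabase plan example in this lesson.
slide_plan = [
    ("sales_prod", 1, 80),
    ("finance_prod", 1, 20),
    ("sales_dev", 2, 100),
    ("sales_test", 3, 50),
    ("other", 3, 50),
]

print(check_plan(slide_plan))
```

For the plan from the slide the list of problems is empty. A plan with two level 1 directives of 80 and 30 and no OTHER entry would produce two messages.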

Interdatabase Plan Example (continued)
The role attribute indicates that the directive is applied only when the databases are in that
database role. This provides the flexibility to automatically adjust the IORM plan according to
the role of the database in an Oracle Data Guard environment. If the role attribute is not
specified, the directive is applied regardless of the database role. Following is an example of an
interdatabase plan using the role attribute:
ALTER IORMPLAN dbplan=( -
(name=sales1, level=1, allocation=30, role=primary), -
(name=sales2, level=1, allocation=35, role=primary), -
(name=sales1, level=2, allocation=20, role=standby), -
(name=sales2, level=2, allocation=25, role=standby), -
(name=other, level=3, allocation=50))
You can remove an interdatabase plan using:
ALTER IORMPLAN dbplan=''
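
To make the effect of the role attribute concrete, the following Python sketch looks up which directive from the role example would apply to a database in a given role. It is a simplified model for illustration only: the function directive_for and the tuple layout are hypothetical, and the fallback to the OTHER directive follows the description in this lesson rather than any documented lookup algorithm.

```python
# Directives copied from the role example: (name, level, allocation, role).
# A role of None means the directive applies regardless of database role.
ROLE_PLAN = [
    ("sales1", 1, 30, "primary"),
    ("sales2", 1, 35, "primary"),
    ("sales1", 2, 20, "standby"),
    ("sales2", 2, 25, "standby"),
    ("other", 3, 50, None),
]


def directive_for(plan, db_name, db_role):
    """Return (level, allocation) for a database in the given role, or None."""
    # First look for a directive naming the database whose role matches.
    for name, level, allocation, role in plan:
        if name == db_name and role in (None, db_role):
            return level, allocation
    # Otherwise fall back to the OTHER directive, if one applies.
    for name, level, allocation, role in plan:
        if name == "other" and role in (None, db_role):
            return level, allocation
    return None


print(directive_for(ROLE_PLAN, "sales1", "primary"))
print(directive_for(ROLE_PLAN, "sales1", "standby"))
print(directive_for(ROLE_PLAN, "hr", "primary"))
```

In this model sales1 moves from level 1 to level 2 when it becomes a standby, and a database that is not named in the plan is handled by the OTHER directive.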

Category Plan Example
DBA_RSRC_CONSUMER_GROUPS
CONSUMER_GROUP               CATEGORY
---------------------------- ---------------
SYS_GROUP                    ADMINISTRATIVE
BATCH_GROUP                  BATCH
INTERACTIVE_GROUP            INTERACTIVE
ORA$…                        MAINTENANCE
OTHER_GROUPS                 OTHER
DEFAULT_CONSUMER_GROUP       OTHER
LOW_GROUP                    OTHER
AUTO_TASK_CONSUMER_GROUP     OTHER

Categories are created with DBMS_RESOURCE_MANAGER.CREATE_CATEGORY.

Category       Level 1    Level 2    Level 3
Interactive    90%
Batch                     80%
Maintenance                          50%
Other                                50%

CellCLI> alter iormplan -
> dbplan='' -
> catplan=( -
> (name=interactive, level=1, allocation=90), -
> (name=batch, level=2, allocation=80), -
> (name=maintenance, level=3, allocation=50), -
> (name=other, level=3, allocation=50) -
> )
Category Plan Example

Database Resource Manager enables you to specify a category for every consumer group. The
predefined categories and their associated consumer groups are listed in the slide. This is the
default situation after database creation. If you decide to use these default categories, you
should map all administrative consumer groups in all databases to the ADMINISTRATIVE
category. All high-priority user activity, such as consumer groups for important online
transaction processing (OLTP) transactions and time-critical reports, should be mapped to the
INTERACTIVE category. All low-priority user activity, such as reports, maintenance, and low-
priority OLTP transactions, should be mapped to the BATCH, MAINTENANCE, and OTHER
categories.
You can create your own categories using the CREATE_CATEGORY procedure in the
DBMS_RESOURCE_MANAGER package, and then assign your category to a consumer group
using the CREATE_CONSUMER_GROUP or UPDATE_CONSUMER_GROUP procedures.
You can then manage I/O resources based on categories by creating a category plan. The
example shown in the slide implements a category plan based on the allocations described in
the table. With this plan, consumer groups associated with the INTERACTIVE category get up
to 90 percent of I/O resources. 80 percent of the remainder, including any unutilized allocation
from the INTERACTIVE category, is allocated to the BATCH category. The MAINTENANCE and
OTHER categories share the remainder.
Any consumer group without an explicitly specified category defaults to the OTHER category.
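The arithmetic behind the category plan can be worked through with a short Python sketch. It assumes the simplest case, in which every category always has I/O queued, so each level takes its percentage of whatever the previous levels left over. The function saturated_shares is a hypothetical helper written for this illustration; it models the description in this lesson and is not the scheduler used by the cell.

```python
def saturated_shares(plan):
    """Return {name: percent of total I/O} for (name, level, allocation) tuples.

    Assumes every category is permanently busy, so nothing is handed back
    and each level shares out only what the levels above did not claim.
    """
    remaining = 100.0
    shares = {}
    for level in sorted({lvl for _, lvl, _ in plan}):
        level_pool = remaining  # what is left when this level is reached
        for name, lvl, allocation in plan:
            if lvl == level:
                shares[name] = level_pool * allocation / 100.0
                remaining -= shares[name]
    return shares


# Category plan from the slide.
category_plan = [
    ("interactive", 1, 90),
    ("batch", 2, 80),
    ("maintenance", 3, 50),
    ("other", 3, 50),
]

for name, share in saturated_shares(category_plan).items():
    print(f"{name:12s} {share:5.1f}%")
```

Under this assumption INTERACTIVE receives 90 percent, BATCH receives 80 percent of the remaining 10 percent (8 percent of the total), and MAINTENANCE and OTHER receive 1 percent each. When a category does not use its full allocation, the lower levels receive correspondingly more.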
Complete Example
Database A
BEGIN
DBMS_RESOURCE_MANAGER.CREATE_SIMPLE_PLAN(SIMPLE_PLAN => 'DB_A_Plan',
CONSUMER_GROUP1 => 'CG1', GROUP1_PERCENT => 15,
CONSUMER_GROUP4 => 'CG4', GROUP4_PERCENT => 40);
DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(CONSUMER_GROUP => 'CG1',
NEW_CATEGORY => 'BATCH');
DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(CONSUMER_GROUP => 'CG4',
NEW_CATEGORY => 'INTERACTIVE');
DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/
ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'DB_A_Plan';
Complete Example
This slide is the first in a series of 3 slides which provide a more complete example showing
the use of the different IORM plan types at the same time. The example is based on the
scenario introduced on pages 8, 9 and 10 of this lesson.
On this slide, the commands required to configure DBRM on Database A are shown.
Note that the example does not show the creation of any categories using
DBMS_RESOURCE_MANAGER.CREATE_CATEGORY because the categories used in the
scenario (BATCH and INTERACTIVE) are categories that are predefined inside Oracle
Database by default.

Complete Example
Database B
BEGIN
DBMS_RESOURCE_MANAGER.CREATE_SIMPLE_PLAN(SIMPLE_PLAN => 'DB_B_Plan',
CONSUMER_GROUP4 => 'CG8', GROUP4_PERCENT => 45);
DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(CONSUMER_GROUP => 'CG8',
DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/
ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'DB_B_Plan';
Complete Example (continued)

On this slide, the commands required to configure DBRM on Database B are shown. These
commands are essentially the same as for Database A except for the different consumer
group names and resource allocation percentages.

Complete Example
Exadata Cells
CellCLI> alter iormplan -
> dbplan=((name=Database_A, level=1, allocation=70), -
> (name=Database_B, level=1, allocation=30)), -
> catplan=((name=INTERACTIVE, level=1, allocation=60), -
> (name=BATCH, level=1, allocation=40))
Complete Example (continued)

This slide shows the commands required to configure IORM on the Exadata cells. Exadata
uses the IORM plan in conjunction with the DBRM plans propagated by the databases to
allocate I/O resources.

Using Database I/Os Metrics
• You can monitor IORM to understand resource
consumption and make required adjustments.
• There are separate metrics for small (≤ 128 KB) and large
I/Os.
• Which database has the heaviest load?
– Look for highest DB_IO_RQ_SM + DB_IO_RQ_LG values.
• Which database was throttled the most?
– Look for highest DB_IO_WT_SM + DB_IO_WT_LG values.

Name                Description
DB_IO_RQ_SM         Total number of I/O requests issued by the database since
DB_IO_RQ_LG         any resource plan was set
DB_IO_RQ_SM_SEC     I/O requests per second issued by the database in the past
DB_IO_RQ_LG_SEC     minute
DB_IO_WT_SM         Total number of seconds that I/O requests issued by the
DB_IO_WT_LG         database waited to be scheduled
Using Database I/Os Metrics

Exadata provides three groups of I/O metrics that correspond to the three types of IORM plans:
category metrics, database metrics, and consumer group metrics. I/O metrics allow you to
understand your I/O consumption and make adjustments to optimize performance and resource
utilization.
For each I/O metric, a distinction is made between small I/Os, typically associated with OLTP
applications, and large I/Os, which are usually indicative of DSS workloads. I/O metric names
include _SM or _LG to identify small or large I/Os, respectively.
For database metrics, the objectType attribute is set to IORM_DATABASE. The table in the
slide gives you a quick description of some important database I/O metrics. A separate set of
metric observations is available for each database specified in the IORM plan. Metric
observations for different databases are differentiated by the name of the database, which is set
in the metricObjectName attribute. You can compare metrics between databases to
determine which one has the heaviest load or which one was throttled most as illustrated in the
slide. A special metricObjectName value of _OTHER_DATABASE_ is used for database I/O
metrics associated with ASM and for databases that are not explicitly mentioned in the
interdatabase IORM plan.
While this slide focuses on database metrics, the same principles apply for category metrics and
consumer group metrics. For example, the CG_IO_RQ_SM_SEC metric specifies the rate of
small I/O requests issued by a consumer group per second over the past minute. A large value
indicates a heavy I/O workload from this consumer group in the past minute.
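
The two comparisons suggested on the slide (highest DB_IO_RQ_SM + DB_IO_RQ_LG for load, highest DB_IO_WT_SM + DB_IO_WT_LG for throttling) amount to summing the small and large variants of a metric per database. The Python sketch below shows that calculation. The metric values are invented for illustration and were not collected from a cell; the helper names combined and highest are likewise hypothetical.

```python
# Hypothetical cumulative metric values per database (not taken from a real cell).
SAMPLES = {
    "sales_prod": {
        "DB_IO_RQ_SM": 120_000, "DB_IO_RQ_LG": 4_000,
        "DB_IO_WT_SM": 35, "DB_IO_WT_LG": 410,
    },
    "finance_prod": {
        "DB_IO_RQ_SM": 30_000, "DB_IO_RQ_LG": 9_000,
        "DB_IO_WT_SM": 80, "DB_IO_WT_LG": 900,
    },
    "sales_dev": {
        "DB_IO_RQ_SM": 5_000, "DB_IO_RQ_LG": 500,
        "DB_IO_WT_SM": 2, "DB_IO_WT_LG": 15,
    },
}


def combined(samples, base):
    """Add the _SM and _LG variants of a metric for every database."""
    return {db: m[base + "_SM"] + m[base + "_LG"] for db, m in samples.items()}


def highest(samples, base):
    """Return the database with the largest combined value for the metric."""
    totals = combined(samples, base)
    return max(totals, key=totals.get)


print("heaviest load: ", highest(SAMPLES, "DB_IO_RQ"))
print("most throttled:", highest(SAMPLES, "DB_IO_WT"))
```

With these sample values the database that issued the most requests is not the one that waited longest, which is the kind of difference that may suggest an adjustment to the allocations in the IORM plan.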

Quiz
If a consumer group does not require its full resource allocation,
what happens to the leftover allocation?
1. It remains unused.
2. It is divided equally among other consumer groups.
3. It is allocated to other active consumer groups, according
to the resource plan.
Answer: 3

Quiz
Which of the following conditions are required for IORM to
intervene and control the allocation of I/O resources?
1. The IORM plan must be active.
2. More than one consumer group must be active.
3. The disks must be heavily utilized.
Answer: 1, 2, 3
All of the conditions listed in this question must be present for IORM to intervene.

Quiz
In which order are the different I/O resource plans applied to
allocate I/O resources?
1. Category, intradatabase, interdatabase
2. Interdatabase, category, intradatabase
3. Category, interdatabase, intradatabase
4. Interdatabase, intradatabase, category
5. Intradatabase, interdatabase, category
Answer: 3

Quiz
You can create categories using the CellCLI utility.

1. TRUE
2. FALSE
Answer: 2
You can create your own categories using the CREATE_CATEGORY procedure in the
DBMS_RESOURCE_MANAGER package, and then assign your category to a consumer group
using the CREATE_CONSUMER_GROUP or UPDATE_CONSUMER_GROUP procedures.
You can then manage I/O resources based on categories by creating a category plan. The
category plan can be created using the CellCLI utility.

Summary
In this lesson, you should have learned how to use Exadata I/O
Resource Management to manage workloads within a
database and across multiple databases.


– Intradatabase I/O Resource Management:
http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/061ExadataIntraDBIORM/061exadataintradbiorm_viewlet_swf.html
– Interdatabase I/O Resource Management:
http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/062ExadataInterDBIORM/062exadatainterdbiorm_viewlet_swf.html
