
Teradata

An Overview

Access patterns are different, and hence

The access patterns of these two approaches are very different, and hence they make very different demands on the underlying database engine
The basic database architecture has to be different to be optimized for one type of processing
Teradata is a leader in the DSS and data warehouse space

What is Teradata
Teradata is a Relational Database Management System (RDBMS) composed of hardware and software
Designed for the world's largest commercial databases
Used by customers who are looking for answers to their business questions from data of over 1 Terabyte
Customers include:
  6 of the top 10 retailers
  6 of the top 9 communications companies
  Over 40% of the leading manufacturers in the world
  3 of the top 4 Blue Cross/Blue Shield insurance companies
  Many of the world's leading banks

Teradata a brief history

1979 - Teradata Corp founded in Los Angeles, California. Development begins on a massively parallel database computer
1984 - Teradata sells first DBC/1012
1986 - Product of the Year
1990 - First Terabyte system installed and in production
1992 - Teradata is merged into NCR
1995 - Teradata Version 2 for UNIX operating systems released

Why Teradata
Capacity:
  Scaling from Gigabytes to Terabytes of detailed data stored in billions of rows
  Scaling to thousands of millions of instructions per second (MIPS) to process data

Performance:
  Shared Nothing Architecture: able to achieve parallelism in each and every stage of query execution
  Makes Teradata Database faster than other relational systems

Single Data Store:
  Can be accessed by network-attached and channel-attached systems
  Supports the requirements of many diverse clients

Fault Tolerance & Availability:
  High fault tolerance, no single point of failure
  Automatically detects and recovers from hardware failures

Data Integrity:
  Ensures that transactions either complete or roll back to a stable state if a fault occurs

Scalability:
  Linearly expandable: as your database grows, additional nodes may be added
  Allows expansion without sacrificing performance

Teradata Architecture, the SMP


[Diagram: a node containing CPUs (processors), Vprocs (PEs and AMPs) and Vdisks]

Vprocs: A Virtual Processor is a set of software processes running on a node. Each Vproc is a separate, independent copy of the processor software, isolated from the other vprocs but sharing some of the physical resources of the node, such as memory and CPUs.

PEs (Parsing Engines):
  Checks the SQL syntax, resource availability and access rights
  Parses the SQL
  Generates AMP steps
  Creates the plan
  Dispatches to the AMPs over the BYNET
  EBCDIC-ASCII conversion
  Handles up to 120 user sessions

AMPs (Access Module Processors):
  Storing and retrieving rows to and from the disks
  Lock management
  Sorting rows and aggregating columns
  Join processing
  Output conversion and formatting
  Creating answer sets for clients
  Disk space management and accounting
  Special utility protocols
  Recovery processing

This is called SMP (Symmetric Multiprocessor): a multiprocessing node that contains a number of central processing units sharing a single memory pool
"Shared Nothing Architecture": each AMP has its own disk (data), shares it with no other AMP, and is solely responsible for any change or access to that data

And then comes the MPP


BYNET: Dual redundant, fault-tolerant, bi-directional interconnect network that enables:
  Automatic load balancing of message traffic
  Automatic reconfiguration after fault detection
  Scalable bandwidth as nodes are added

The BYNET is responsible for:
  Broadcast, multicast, and point-to-point communications between nodes and virtual processors
  Merging answer sets back to the PE
  Making Teradata parallelism possible

MPP (Massively Parallel Processing) consists of a number of nodes (SMPs) that work on a problem at the same time
Each node (SMP) has one or more CPUs and its own memory, I/O, network connections and disk arrays, and doesn't share its resources with other nodes

Important components
SMP - Symmetric Multiprocessing: a single node that contains multiple CPUs sharing a memory pool.
MPP - SMP nodes combined with a communication network (BYNET) form an MPP. An MPP comprises two or more loosely coupled SMP nodes connected by the BYNET, with shared SCSI access to multiple disk arrays.
BYNET - Hardware inter-processor network that links the nodes of an MPP system. It implements point-to-point, multicast and broadcast communications, depending on the situation. The BYNET is usually used for merging and sorting data from different nodes. The accumulated data is then sent back to the user.
Disk Array - Teradata employs RAID storage technology in which drives are configured logically into one or more logical units (LUNs), which are further sliced into Pdisks that are assigned to the AMPs. The group of Pdisks assigned to an AMP is called a Vdisk.

More Definitions
PDE - Parallel Database Extension is an interface layer on top of the operating system. It enhances processing by providing parallel processing and priority scheduling. It executes the Vprocs. It takes advantage of the BYNET and shared disk hardware to improve performance.
File System - Teradata File System service calls allow the Teradata RDBMS to store and retrieve data efficiently without being concerned about the underlying operating system interfaces. It divides the disk into logical blocks: MI, CI, CID, DB, DBD.
TPA - Teradata Parallel Application is responsible for the distribution, coordination and balancing of processes/threads across nodes.
TDP - Teradata Director Program is responsible for session balancing across multiple PEs, failure notification, logging, verification, recovery, restart and security.

Logical Processors
VPROCS - Virtual Processors. Vprocs are a set of software processors that run on a node under Teradata PDE within the multitasking environment of the operating system. A single node (SMP) can have as many as 128 Vprocs.
PE - Parsing Engine performs session control and dispatches tasks to fetch, return and merge data. It communicates with the client system on one side and with the AMPs on the other side (via the BYNET).
AMP - Access Module Processor retrieves and updates data on the virtual disks. It is responsible for locking, joining, sorting, aggregation, data conversion, disk space management, accounting, and journaling.

A single PE handles one request at a time. The request is parsed and optimized, steps are built, and the steps are then dispatched to the corresponding AMP(s).
An AMP has 80 worker tasks, which perform the different kinds of work required by the steps. If the request is a SELECT, the worker tasks send their data to the BYNET when they finish, where it is merged and sorted.
The PE dispatches the resulting data to the user.

Query Lifecycle
Single-AMP request: SELECT * FROM t1 WHERE id = 4;
1. Application sends the request to the PE; the PE sends an acknowledgement back to the application
2. The SQL is parsed by the PE
3. PE uses the Hashmap to locate the AMP
4. PE sends the request to the particular AMP; the AMP sends an acknowledgement back to the PE
5. AMP retrieves the data from its own Vdisk
6. AMP sends the data to the PE
7. Result is sent to the application from the PE; the application sends an acknowledgement back to the PE

Multi-AMP request: SELECT * FROM t1 WHERE id IN (2,8);
1. Application sends the request to the PE; the PE sends an acknowledgement back to the application
2. The SQL is parsed by the PE
3. PE uses the Hashmap to locate the AMPs
4. PE sends the request to the individual AMPs; each AMP sends an acknowledgement back to the PE
5. Each AMP retrieves the data from its own Vdisk
6. Each AMP sends its data to the BYNET
7. BYNET merges the data
8. BYNET sends the merged data to the PE
9. Result is sent to the application from the PE; the application sends an acknowledgement back to the PE

[Diagram: Client (application, CLI, TDP) and Server (PE (1), PE (2), Hashmap, BYNET merge, AMP (1) to AMP (4), each AMP with its own Vdisk)]

Vdisk contents, table t1 (ID is the PI):
  V Disk (1)   V Disk (2)   V Disk (3)   V Disk (4)
  ID  Desc     ID  Desc     ID  Desc     ID  Desc
  3   C        1   A        2   B        7   G
  5   E        4   D        6   F        8   H

Data is distributed across all AMPs based on row-hash of PI
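
The two request paths can be sketched as a toy model. This is only an illustration: it assumes IDs 3 and 5 sit on AMP 1, 1 and 4 on AMP 2, 2 and 6 on AMP 3, and 7 and 8 on AMP 4, and the dictionary used as a hash map and the function name run_query are hypothetical stand-ins, not Teradata interfaces.

```python
# Toy model of a single-AMP and a multi-AMP retrieval.
# VDISKS holds the assumed row placement: AMP number -> {id: desc}.
VDISKS = {
    1: {3: "C", 5: "E"},
    2: {1: "A", 4: "D"},
    3: {2: "B", 6: "F"},
    4: {7: "G", 8: "H"},
}
# Hypothetical hash map: primary index value -> owning AMP.
HASH_MAP = {pi: amp for amp, rows in VDISKS.items() for pi in rows}


def run_query(ids):
    """Return (AMPs contacted, merged result rows) for WHERE id IN ids."""
    amps = sorted({HASH_MAP[i] for i in ids if i in HASH_MAP})
    partials = [
        [(i, VDISKS[amp][i]) for i in ids if i in VDISKS[amp]] for amp in amps
    ]
    # With one AMP there is nothing to merge; with several AMPs the
    # partial answers are merged, which is the BYNET's job in the slide.
    merged = sorted(row for part in partials for row in part)
    return amps, merged


print(run_query([4]))      # single-AMP retrieval
print(run_query([2, 8]))   # multi-AMP retrieval, merged
```

Only the AMPs that own a requested primary index value are contacted, which is why a retrieval on a single PI value stays a one-AMP operation.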

Data Distribution and Access Methods


Hashing: Teradata uses hashing for data distribution and access
  A data row is hashed based on its primary index value
  Hash maps direct the data row to a particular AMP based on its hash value

[Diagram: PI -> Row Hash -> HashMap]

Hashing and Indexing


Indexing:

  A data value (or values, if the index is compound) from a row acts as an index key to that row
  The index associates the index key with a relative row address that reports the location of the row on disk
  Index entries are stored in order of their index key values and are said to be value-ordered

Hashing:

  The index key data value is transformed by a mathematical function to produce an abstract value not related to the original data value in an obvious way
  Hashed data is assigned to hash buckets, which correspond 1:1 to the relationship of a particular hash code with an AMP location
  There is no obvious correspondence between a hash code and the location of the row it refers to

Teradata does not use indexing. What we refer to as indexes are either row hash values or data tables (join index)

Tradeoffs Between Hashing and Indexing:
  Hashing is far better suited to the parallel database architecture
  Hashing provides consistently good performance when rows are distributed evenly across the AMPs
  Primary indexes are not stored in an index subtable, but directly as part of the row data
  Rows whose primary index columns are frequently used join constraints can be co-located on the same AMP
  Hashing is at a disadvantage for:
    Range queries
    Retrievals having selection criteria that involve only part of a multicolumn hash key

Hashing
Teradata Database hashing algorithms are proprietary mathematical functions that transform an input data value of any length into a 32-bit value
A 32-bit row hash value provides about 4.2 billion possible values

Row ID layout:
  Row Hash (32 bits) = 16-bit Destination Selection Word + 16-bit Remainder
  Uniqueness Value (32 bits)

First 16 bits: the Destination Selection Word, used to define the hash bucket for the hashed row
Remaining 16 bits: a remainder from the operation of the hash function on the original input value
Uniqueness Value: an additional 32-bit system-generated value that ensures the uniqueness of any RowID; generated at AMP level
There are 65,536 hash buckets, distributed as evenly as possible among the AMPs
The BYNET interface board on each AMP maintains a hash map, an index of which hash buckets are assigned to which AMPs
Row assignment is performed in a manner that ensures as equal a distribution of table rows as possible among all the AMPs
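
The bit layout can be made concrete with a short sketch. Since the real hash function is proprietary, SHA-256 is used here purely as a stand-in, and the round-robin bucket assignment, the four-AMP system and the sample key are assumptions for illustration.

```python
import hashlib

NUM_BUCKETS = 65_536  # 2**16 hash buckets


def row_hash(pi_value):
    """Stand-in 32-bit hash; Teradata's real function is proprietary."""
    digest = hashlib.sha256(pi_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def split_row_hash(h):
    """Split a 32-bit row hash into (16-bit bucket, 16-bit remainder)."""
    return h >> 16, h & 0xFFFF


def build_hash_map(num_amps):
    """Assumed layout: buckets dealt round-robin so AMPs get near-equal shares."""
    return [bucket % num_amps for bucket in range(NUM_BUCKETS)]


hash_map = build_hash_map(num_amps=4)
h = row_hash("empno=1001")          # hypothetical primary index value
bucket, remainder = split_row_hash(h)
amp = hash_map[bucket]
# Row ID = 32-bit row hash followed by a 32-bit uniqueness value (1 here).
row_id = (h << 32) | 1
print(bucket, remainder, amp, hex(row_id))
```

The upper 16 bits alone decide the AMP, so two rows with the same row hash always land on the same AMP and are told apart only by the uniqueness value.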

Hash-Related Functions

To predict the distribution on AMP for a chosen PI

  SELECT HASHAMP(HASHBUCKET(HASHROW(empno))) AS amp_no, COUNT(*)
  FROM employee
  GROUP BY 1
  ORDER BY 2 DESC;

  amp_no  count(*)
  25      3510
  29      3468
  17      3181

To see the selectivity of a PI

  SELECT HASHROW(empno) AS hash_value, COUNT(*)
  FROM employee
  GROUP BY 1
  ORDER BY 2 DESC;

  hash_value  count(*)
  63524       14
  8069        14
  4191        1

If there are no hash collisions, the result ratio is close to 1

  SELECT (COUNT(*) (FLOAT)) / (COUNT(DISTINCT HASHROW(empno)))
  FROM employee;

Data Distribution Issues


Hash Collisions
Situations in which the row hash value for different rows is identical, making it difficult for the system to discriminate among the hash synonyms when one unique row is requested for retrieval from a set of hash synonyms
The system defines about 4.2 billion hash values
The system-generated 32-bit Uniqueness Value is appended to the row hash to tell hash synonyms apart

Skewing of Hash Bucket Distribution


Caused by a poor choice of PI, one that has few unique values
It impairs parallel processing of the data
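
The effect of a low-cardinality PI can be shown with a small simulation. This is a sketch under stated assumptions: SHA-256 stands in for the proprietary hash, the system has 8 AMPs, and the helper names amp_for and rows_per_amp are made up for the example.

```python
import hashlib
from collections import Counter

NUM_AMPS = 8


def amp_for(value):
    """Stand-in for HASHAMP(HASHBUCKET(HASHROW(value))); not Teradata's hash."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:2], "big")  # 16-bit bucket number
    return bucket % NUM_AMPS


def rows_per_amp(pi_values):
    """Count how many rows each AMP would receive for the given PI values."""
    counts = Counter(amp_for(v) for v in pi_values)
    return [counts.get(amp, 0) for amp in range(NUM_AMPS)]


# 10,000 rows: a unique key versus a column with only three distinct values.
unique_pi = rows_per_amp(range(10_000))
three_value_pi = rows_per_amp(["M", "F", "U"][i % 3] for i in range(10_000))

print("unique PI :", unique_pi)
print("3-value PI:", three_value_pi)
```

With only three distinct values, at most three AMPs can hold rows no matter how many AMPs the system has, so the remaining AMPs sit idle during a scan.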

Data Partitioning
For join columns, a row hash value is recalculated based on the columns involved in the join. If tables are being joined on three columns (a,b,c), then a row hash value is computed as if (a,b,c) were the PI. If the rows matching on the join columns are not already on the same AMP, the rows are redistributed across all AMPs, which is an overhead

Teradata Indexes
Indexes are a method of storing and retrieving data from Teradata optimally
By default every table has one index, called the Primary Index (PI). In addition, if a query uses columns other than the PI, the user can declare a Secondary Index (SI) on those columns for faster access to the data

Types of indexes:

Primary Index: unique and non-unique; no subtable; affects data distribution
Secondary Index: unique and non-unique; avoids full-table scans (FTS); uses a subtable; does not affect data distribution; extra overhead of updating the subtable when an insert, delete or update is done on the table
Join Index: Single Table, Multi Table and Aggregate Join Index
  Single Table JI allows hashing of rows based on some other column. This column might be used in a condition of the SQL, qualifying the JI for data access
  Multi-Table JI on columns from more than one table avoids recalculating join values in a frequently used query
  Aggregate JI on columns helps queries which perform frequent aggregation on the same column(s)
Hash Index: a file structure that shares properties with single-table join indexes (STJI) and secondary indexes (SI)

Primary Key vs. Primary Index


Teradata uses Primary Index or Secondary Index to enforce a Primary Key
Primary Key                                  Primary Index
Important component of logical data model    Not used in logical model
Used to maintain referential integrity       Used to distribute and retrieve data
Values can never be changed                  Values can be changed
Cannot be null                               Can be null
Does not imply access path                   Defines the most common access paths
Not required for physical table definition   Mandatory for physical table definition

Primary Index (PI)


The Teradata Database distributes tables horizontally across all AMPs on a system
The system assigns rows to AMPs based on the value of their primary index
The determination of which hash bucket, and hence which AMP, the row is to be stored on is made solely on the row hash value of its primary index
Each Teradata Database table must have a primary index

Restrictions:
  Only one PI per table
  Not more than 64 columns
  Cannot include columns having BLOB or CLOB data types
  If there is no explicit definition, a NUPI is created on the first column of the table

No separate physical storage: stored in-line with the row in the base table
Rows are hash-ordered within the same AMP

Types of Primary Index: a PI can be defined over two orthogonal dimensions
  Unique (UPI) or non-unique (NUPI)
  Partitioned (PPI) or non-partitioned (NPPI)

Types of PI
Unique Primary Index
Non-unique Primary Index

Non-Partitioned Primary Index
  Standard Teradata Database primary index
  Rows are hashed to the appropriate AMPs and stored there in row hash order

Partitioned Primary Index
  Rows are hashed to the appropriate AMPs and then assigned to an appropriate partition based on the value of a partitioning expression
  Rows are stored in row hash order within the same partition
  Designed to optimize range queries

NPPI & PPI Data Storage within AMPs


Create Table

NPPI:
  CREATE MULTISET TABLE orders_1, NO FALLBACK, NO BEFORE JOURNAL, NO AFTER JOURNAL (
    order_nr VARCHAR(10) NOT NULL,
    order_cre_dt DATE FORMAT 'YYYY-MM-DD' NOT NULL
  ) UNIQUE PRIMARY INDEX upi_orders_1 (order_nr);

PPI:
  -- A PPI can be UNIQUE only if the partitioning column is part of the
  -- primary index, so this index is declared non-unique.
  CREATE MULTISET TABLE orders_2, NO FALLBACK, NO BEFORE JOURNAL, NO AFTER JOURNAL (
    order_nr VARCHAR(10) NOT NULL,
    order_cre_dt DATE FORMAT 'YYYY-MM-DD' NOT NULL
  ) PRIMARY INDEX nupi_orders_2 (order_nr)
  PARTITION BY RANGE_N(order_cre_dt BETWEEN DATE '0001-01-01' AND DATE '9999-12-31' EACH INTERVAL '1' MONTH);

Insert Data (the same rows go into both tables)

  Row Hash  order_nr  order_cre_dt
  A11111    10        2007-01-11
  A22222    20        2007-02-22
  A33333    30        2007-01-12
  A44444    40        2007-02-23
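
The difference in storage order inside one AMP can be sketched with the four rows above. This is a simplified model: it assumes all four rows land on the same AMP, and the helper partition_no, which only imitates a monthly RANGE_N numbering starting at 0001-01, is invented for the example.

```python
from datetime import date

# Rows from the slide: (row_hash, order_nr, order_cre_dt), assumed on one AMP.
rows = [
    ("A11111", 10, date(2007, 1, 11)),
    ("A22222", 20, date(2007, 2, 22)),
    ("A33333", 30, date(2007, 1, 12)),
    ("A44444", 40, date(2007, 2, 23)),
]


def partition_no(d):
    """Monthly partition number, counting 0001-01 as partition 1."""
    return (d.year - 1) * 12 + d.month


# NPPI: rows are kept in row hash order only.
nppi_order = sorted(rows, key=lambda r: r[0])
# PPI: rows are grouped by partition first, then by row hash inside it.
ppi_order = sorted(rows, key=lambda r: (partition_no(r[2]), r[0]))

print([r[1] for r in nppi_order])
print([r[1] for r in ppi_order])
```

In the PPI ordering the January rows (10 and 30) are adjacent, so a query for one month only has to read one contiguous partition on each AMP.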

Data Distribution within AMPs

Selecting a Primary Index


Uniform Data Distribution:
  The more distinct the primary index values, the better
  Rows having the same primary index value are distributed to the same AMP
  Parallel processing is more efficient when table rows are distributed evenly across the AMPs

Optimal Data Access:
  The primary index should be chosen on the most frequently used access path
  Primary index operations must provide the full primary index value
  Primary index retrievals on a single value are always one-AMP operations

Volatility:
  How often the values of the index columns change. The less often they change, the better the choice of index

The Trade-Off:
  Data Distribution vs. Access Path
  Normal Access vs. Range Access
  NPPI vs. PPI

Secondary Index
Enhances set selection by specifying access paths other than the primary index path
SI storage: the system maintains a subtable for each SI. A subtable row keeps the SI row hash, the SI column values, and the RowID of the base table row that contains the actual value
There is an overhead in maintaining the SI subtable if the table is subject to INSERT/UPDATE/DELETE operations

Restrictions on Secondary Indexes:
  A table can have up to 32 secondary, hash and join indexes
  No more than 64 columns can be included in a secondary index definition
  Cannot include columns having BLOB or CLOB data types

SI Types:
  Unique Secondary Index (USI)
  Non-Unique Secondary Index (NUSI)
  Value-Ordered Secondary Index
  NUSI and Query Covering
  NUSI Bit Mapping

USI Subtable Row Layout

USI access is usually a two-AMP operation

The process for locating a row using a USI is as follows:
1. After checking the syntax and lexicon of the query, the Parser looks up the Table ID for the USI subtable that contains the specified USI value
2. The hashing algorithm hashes the USI value
3. The Generator creates an AMP step message containing the USI Table ID, USI row hash value, and USI data value
4. The Dispatcher uses the USI row hash to send the message across the BYNET to AMP 3, which contains the appropriate USI subtable row
5. The file system on AMP 3 locates the appropriate USI subtable using the USI Table ID
6. The file system on AMP 3 uses the USI row ID to locate the appropriate index row in the subtable
7. This operation might require a search through a number of rows with the same row hash value before the row with the desired value is located
8. AMP 3 reads the base table row ID from the USI row and distributes a message containing the base table ID and the row ID for the requested row across the BYNET to AMP 10, which contains the requested base table row
9. The file system uses the row ID to locate the base table row
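
The two-AMP path can be sketched in a few lines. The model is deliberately simplified: the hash is a plain modulo over four AMPs rather than Teradata's proprietary function, and the table contents and the name usi_lookup are hypothetical.

```python
NUM_AMPS = 4


def amp_of(value):
    """Toy hash: value modulo AMP count (the real hash is proprietary)."""
    return value % NUM_AMPS


# Base table t(i UPI, j USI, a); a row is stored on the AMP chosen by hashing i.
base_rows = [(100, 7, "a"), (201, 5, "b"), (302, 6, "c"), (403, 8, "d")]
base = {amp: {} for amp in range(NUM_AMPS)}
usi_subtable = {amp: {} for amp in range(NUM_AMPS)}
for i, j, a in base_rows:
    base[amp_of(i)][i] = (i, j, a)
    # The USI subtable row is hashed on j and points at the base row (AMP, i).
    usi_subtable[amp_of(j)][j] = (amp_of(i), i)


def usi_lookup(j_value):
    """Return (AMPs touched, row) for WHERE j = j_value."""
    index_amp = amp_of(j_value)                  # hash the USI value
    pointer = usi_subtable[index_amp].get(j_value)
    if pointer is None:
        return [index_amp], None
    base_amp, row_key = pointer                  # follow the stored row ID
    return [index_amp, base_amp], base[base_amp][row_key]


print(usi_lookup(7))
```

At most two AMPs are touched: the one holding the subtable row and the one holding the base row, which may occasionally be the same AMP.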

NUSI - Subtable and access path different from that of USI


NUSI subtables are created and stored locally on the AMPs: the corresponding part of the subtable is stored on the same AMP as the base table rows it refers to
A NUSI subtable stores the RowIDs of base table rows that are located on the same AMP
NUSI access is always an all-AMPs operation
Because NUSI subtable access is not hashed, the subtables must be scanned in order to locate the relevant pointers to base table rows

NUSI Subtable Row Layout

NUSI access is an all-AMP operation

The process used by this example for locating a row using the NUSI value CA is as follows:
1. After checking the syntax and lexicon of the query, the Parser looks up the Table ID for the NUSI subtable that contains the NUSI value CA
2. The hashing algorithm hashes the NUSI value
3. The Generator creates an AMP steps message containing the NUSI Table ID (734596), NUSI row hash value (53), and NUSI data value (CA), and then the Dispatcher distributes it across the BYNET to all AMPs
4. The file system on a receiving AMP locates the appropriate NUSI subtable using the NUSI Table ID
5. The file system on a receiving AMP uses the NUSI row hash value to locate the appropriate index row in the subtable
6. If there is a NUSI row, its table row ID list is scanned for base table row IDs
7. The file system uses the row IDs to locate the base table rows containing the NUSI value CA
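
A minimal sketch of the all-AMP path follows. The three-AMP layout, the sample names and states, and the function nusi_lookup are all invented for illustration; the only point carried over from the steps is that every AMP is asked and each answers from its own local subtable.

```python
# Each AMP holds its own base rows plus a local NUSI subtable that maps
# an index value to the row IDs stored on that same AMP.
amps = {
    0: {"rows": {1: ("Ann", "CA"), 2: ("Bob", "NY")}},
    1: {"rows": {3: ("Cy", "CA"), 4: ("Di", "TX")}},
    2: {"rows": {5: ("Ed", "NY")}},
}
for amp in amps.values():
    nusi = {}
    for row_id, (_name, state) in amp["rows"].items():
        nusi.setdefault(state, []).append(row_id)
    amp["nusi"] = nusi


def nusi_lookup(state):
    """Send the request to every AMP; each one answers from local data only."""
    contacted, result = [], []
    for amp_no, amp in sorted(amps.items()):
        contacted.append(amp_no)
        for row_id in amp["nusi"].get(state, []):
            result.append(amp["rows"][row_id])
    return contacted, result


print(nusi_lookup("CA"))
```

AMP 2 holds no CA rows but is still contacted, which is the cost that makes a weakly selective NUSI less attractive than a USI.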

USI and NUSI Examples


USI:
  CREATE MULTISET TABLE t1, NO FALLBACK, NO BEFORE JOURNAL, NO AFTER JOURNAL
    (i INTEGER NOT NULL, j INTEGER NOT NULL, a CHAR(10))
  UNIQUE PRIMARY INDEX upi_t1 (i), UNIQUE INDEX usi_t1_01 (j);

  i    j    a
  100  100  a
  200  200  a
  300  300  a
  400  400  a

  EXPLAIN SELECT * FROM t1 WHERE j = 100;
  1) First, we do a two-AMP RETRIEVE step from t1 by way of unique index # 4 "t1.j = 100" with no residual conditions. The estimated time for this step is 0.02 seconds.

NUSI:
  CREATE MULTISET TABLE t2, NO FALLBACK, NO BEFORE JOURNAL, NO AFTER JOURNAL
    (i INTEGER NOT NULL, j INTEGER NOT NULL, a CHAR(10))
  UNIQUE PRIMARY INDEX upi_t2 (i), INDEX nusi_t2_01 (j);

  i    j    a
  100  100  a
  200  100  a
  300  300  a
  400  400  a

  EXPLAIN SELECT * FROM t2 WHERE j = 100;
  1) We do an all-AMPs RETRIEVE step from t2 by way of an all-rows scan with a condition of ("t2.j = 100") into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 2 rows. The estimated time for this step is 0.03 seconds.

Value-Ordered NUSI
Value-ordered NUSIs are very efficient for range conditions
Because the NUSI rows are sorted by data value, it is possible to search only a portion of the index subtable for a given range of key values

Examples:
  CREATE INDEX Idx_Date (o_orderdate) ORDER BY VALUES (o_orderdate) ON Orders;

  SELECT * FROM Orders
  WHERE o_orderdate BETWEEN DATE '1997-10-01' AND DATE '1997-10-07';

Value-ordered NUSIs have the following limitations:

  The sort key is limited to a single numeric or DATE column
  The sort key column cannot exceed four bytes in length
  They count as 2 consecutive indexes against the total of 32 non-primary indexes you can define on a base or join index table. One index represents the column list and the other index represents the ordering column

NUSI Bit-Mapping
Bit mapping is a technique used by the Optimizer to link several weakly selective indexes in a way that drastically reduces the number of base rows that must be accessed to retrieve the desired data
Teradata only performs NUSI bit mapping when weakly selective indexed conditions are ANDed and their composite selectivity is strong
The Optimizer instructs each AMP to construct bit maps to determine which row IDs its local NUSI rows have in common, and then to access just those rows, applying the conditions to them exclusively
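
The idea of ANDing per-index bit maps can be sketched as follows. The three index conditions, the row ID lists and the helper names are hypothetical, and Python integers stand in for whatever bit map structure the AMP actually builds.

```python
def bitmap(row_ids, size):
    """Build an integer bit map with one bit set per qualifying row ID."""
    bits = 0
    for rid in row_ids:
        assert 0 <= rid < size
        bits |= 1 << rid
    return bits


def rows_in(bits):
    """Decode a bit map back into a sorted list of row IDs."""
    return [rid for rid in range(bits.bit_length()) if (bits >> rid) & 1]


# Hypothetical local NUSI hits on one AMP holding row IDs 0..15.
color_red = [0, 2, 3, 5, 8, 9, 12, 14]      # weakly selective on its own
size_large = [1, 3, 4, 8, 10, 11, 14, 15]   # weakly selective on its own
region_west = [3, 6, 7, 8, 13]

# ANDing the maps leaves only the rows that satisfy all three conditions.
common = bitmap(color_red, 16) & bitmap(size_large, 16) & bitmap(region_west, 16)
print(rows_in(common))
```

Each condition alone would touch five to eight of the sixteen rows, while the combined map touches two, which is the case where the composite selectivity is strong.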

Covering Index
An index is said to be covering if all of the columns requested in a query are also available from an existing index subtable, making it unnecessary to access the base table rows to complete the query

Example:

Simple Query Considered for Index Covering:
  CREATE INDEX IdxOrd (o_orderkey, o_date, o_totalprice) ON ORDERS;

  SELECT o_date, AVG(o_totalprice)
  FROM ORDERS
  WHERE o_orderkey > 1000
  GROUP BY o_date;

Aggregate Query Considered for Index Covering:
  CREATE INDEX IdxEmployee (DeptNo) ON Employee;

  SELECT DeptNo, COUNT(*)
  FROM Employee
  GROUP BY DeptNo;

Secondary Index selection criteria


Consider creating secondary indexes on columns which are highly selective
A USI is a good choice when the table does not have a UPI. This helps in avoiding the duplicate data check when an INSERT/UPDATE operation is performed on the table
While USI retrievals are always very efficient, the efficiency of NUSI retrievals varies greatly depending on their selectivity
Consider creating covering indexes wherever possible
Consider creating secondary indexes on columns frequently operated on by built-in functions such as aggregates
Consider assigning a uniqueness constraint such as PRIMARY KEY or UNIQUE through a USI
Consider naming secondary indexes whenever possible, using a standard naming convention
Avoid assigning secondary indexes to frequently updated column sets
Avoid creating excessive secondary indexes on a table

Join Index
Join indexes allow denormalization at the physical level without affecting the normalization of the logical database model
They can serve the purpose of storing aggregated data, as used in fact tables in dimensional modeling
Unlike traditional indexes, join indexes do not store pointers to their associated base table rows. Instead, they are generally used as a fast-path final access point that eliminates the need to access and join the base tables they represent. They substitute for, rather than point to, base table rows
The only exception is the case where an index partially covers a query. If the index is defined using either the ROWID keyword or the UPI of its base table as one of its columns, then it can be joined with the base table to cover the query
Statistics should be collected on a join index to keep its information up to date
Join indexes add overhead when the tables in their definition are updated, because the join index is maintained at the same time
Users cannot select directly from a join index

Types of Join Index


Single Table Join Indexes: allow hashing of rows based on a column other than the PI. This column might be used in a condition of the SQL, qualifying the JI for data access. This helps in preventing redistribution of the underlying base table based on some other column.
Multitable Join Indexes: are useful for queries where the index structure contains all the columns referenced by one or more joins, thereby allowing the index to cover that part of the query, making it possible to retrieve the requested data from the index rather than accessing its underlying base tables.
Aggregate Join Indexes: allow a summary table to be defined without violating the normalization of the database schema. This allows a join index to pre-compute an aggregate value that would otherwise potentially require a full table scan and sort operation.

Examples of different Join Indexes


Single-table Join Index:
  CREATE TABLE t1 (x1 INTEGER, y1 INTEGER, z1 INTEGER) PRIMARY INDEX (x1);
  CREATE TABLE t2 (x2 INTEGER, y2 INTEGER, z2 INTEGER) PRIMARY INDEX (x2);
  CREATE JOIN INDEX j1 AS
    SELECT y1, ROWID
    FROM t1
  PRIMARY INDEX (y1);

Multi-table Join Index:
  CREATE JOIN INDEX order_join_line AS
    SELECT (l_orderkey, o_orderdate, o_custkey, o_totalprice),
           (l_partkey, l_quantity, l_extendedprice, l_shipdate)
    FROM lineitem LEFT JOIN orders ON l_orderkey = o_orderkey
    ORDER BY o_orderdate
  PRIMARY INDEX (l_orderkey);

Aggregated Join Index:
  CREATE JOIN INDEX ord_cust_idx AS
    SELECT c_nationkey, SUM(o_totalprice (FLOAT)) AS price, o_orderdate
    FROM orders, customer
    WHERE o_custkey = c_custkey
    GROUP BY c_nationkey, o_orderdate
    ORDER BY o_orderdate;

Hash Index
Hash indexes are file structures that share properties with both single-table join indexes and secondary indexes
Hash indexes can optionally be specified to be distributed in such a way that their rows are AMP-local with their associated base table rows
They can also provide a transparent direct access path to those base table rows to complete a query only partially covered by the index

Example:
  CREATE TABLE Orders (
    o_orderkey INTEGER NOT NULL,
    o_custkey INTEGER,
    o_orderstatus CHARACTER(1) CASESPECIFIC,
    o_totalprice DECIMAL(13,2) NOT NULL,
    o_orderdate DATE FORMAT 'yyyy-mm-dd' NOT NULL,
    o_orderpriority CHARACTER(21),
    o_clerk CHARACTER(16),
    o_shippriority INTEGER,
    o_comment VARCHAR(79))
  UNIQUE PRIMARY INDEX (o_orderkey);

  CREATE HASH INDEX OrdHIdx_1 (o_orderdate) ON orders
  BY (o_orderdate)
  ORDER BY (o_orderdate);

Teradata Joins
Joins available to the user:
  Left Outer Join
  Right Outer Join
  Full Outer Join
  Inner Join
  Cross Join
  Self Join

Teradata Internal Joins:
  Product Join
  Merge Join
  Nested Join
  Hash Join
  Self Join
  Correlated Join

Product Join and Merge Join


Product Join: Compares every qualifying row from one table to every qualifying row from the other table and saves the rows that match the WHERE condition. Time consuming and hence a costly join. Requires more spool space. Usually used when:
  The join condition is not based on equality
  The join conditions are ORed
  It is less costly than other join forms

Merge Join: Rows are compared based on the hash values of the joining columns. Sorting is performed before the comparison. The comparison involves fewer rows than a Product Join. Different methods to prepare rows for the comparison of hash values:
  Redistribution of rows based on hash values
  Duplication of rows based on hash values
  Matching Indexes

Example of Merge Join based on Hash Redistribution


Employee
  ENum (UPI, PK)  Name    Dept (FK)
  1               Brown   200
  2               Smith   310
  3               Jones   310
  4               Clay    400
  5               Peters  150
  6               Foster  400
  7               Gray    310
  8               Baker   310

Department
  Dept (UPI, PK)  Name
  400             Delivery
  150             Payroll
  200             Finance
  310             Mfg

SELECT Name, DeptName, Loc
FROM Employee, Department
WHERE Employee.DeptNo = Department.DeptNo;

DeptNo in the Employee table is not a UPI but a foreign key, so the Employee table is hash redistributed based on DeptNo
Hash redistribution takes place local to the AMP
Rows are sorted before the join condition is applied

Example of Merge Join based on Hash Redistribution


Employee rows hash distributed on Employee.ENum (UPI), one group per AMP:
  6 FOSTER 400, 8 BAKER 310
  4 CLAY 400, 3 JONES 310
  1 BROWN 200, 7 GRAY 310
  5 PETER 150, 2 SMITH 310

Employee rows hash redistributed on Employee.Dept row hash, one group per AMP:
  7 GRAY 310, 3 JONES 310, 8 BAKER 310, 2 SMITH 310
  5 PETER 150
  1 BROWN 200
  6 FOSTER 400, 4 CLAY 400

JOIN

Department rows hash distributed on Department.Dept (UPI), one row per AMP:
  150 PAYROLL
  310 MFG
  200 FINANCE
  400 DELIVERY
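
The redistribute-then-merge sequence can be sketched with the same employee and department rows. The AMP assignment uses an arbitrary toy function in place of Teradata's row hash, and distribute and merge_join are illustrative names only.

```python
NUM_AMPS = 4

employees = [(1, "BROWN", 200), (2, "SMITH", 310), (3, "JONES", 310),
             (4, "CLAY", 400), (5, "PETERS", 150), (6, "FOSTER", 400),
             (7, "GRAY", 310), (8, "BAKER", 310)]
departments = [(150, "PAYROLL"), (200, "FINANCE"), (310, "MFG"), (400, "DELIVERY")]


def amp_of(value):
    """Toy stand-in for row hash -> AMP; the real hash is proprietary."""
    return value % 7 % NUM_AMPS


def distribute(rows, key_index):
    """Place each row on the AMP chosen by hashing the given column."""
    amps = [[] for _ in range(NUM_AMPS)]
    for row in rows:
        amps[amp_of(row[key_index])].append(row)
    return amps


def merge_join(emp_rows, dept_rows):
    """Sort both inputs on the department number, then merge them."""
    emp_rows = sorted(emp_rows, key=lambda r: r[2])
    dept_rows = sorted(dept_rows, key=lambda r: r[0])
    out, d = [], 0
    for _enum, name, dept in emp_rows:
        while d < len(dept_rows) and dept_rows[d][0] < dept:
            d += 1
        if d < len(dept_rows) and dept_rows[d][0] == dept:
            out.append((name, dept_rows[d][1]))
    return out


dept_by_amp = distribute(departments, 0)   # already placed by its UPI
emp_by_amp = distribute(employees, 2)      # redistributed on Dept
joined = [pair for a in range(NUM_AMPS)
          for pair in merge_join(emp_by_amp[a], dept_by_amp[a])]
print(sorted(joined))
```

Because both tables are hashed on the same column, matching rows always meet on the same AMP, so each AMP can run its merge independently of the others.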

Example of Merge Join based on Duplication of Table


Department table rows hash distributed on Department.Dept (UPI), one row per AMP:
  150 PAYROLL
  310 MFG
  200 FINANCE
  400 DELIVERY

Employee table rows hash distributed on Employee.ENum (UPI), one group per AMP:
  6 FOSTER 400, 8 BAKER 310
  4 CLAY 400, 3 JONES 310
  1 BROWN 200, 7 GRAY 310
  5 PETER 150, 2 SMITH 310

Spool file after duplicating and sorting on Department.Dept row hash (every AMP holds all four rows):
  150 PAYROLL, 200 FINANCE, 310 MFG, 400 DELIVERY

JOIN

Spool file after locally copying and sorting on Employee.Dept row hash, one group per AMP:
  8 BAKER 310, 6 FOSTER 400
  3 JONES 310, 4 CLAY 400
  1 BROWN 200, 7 GRAY 310
  2 SMITH 310, 5 PETER 150

Example of Merge Join using Matching Indexes


If the primary indexes of the joining tables match, no redistribution is required.

Example:
  SELECT *
  FROM Employee, Employee_Phone
  WHERE Employee.Enum = Employee_Phone.Enum;

Nested Join
A nested join is a join for which the WHERE conditions specify a constant value for a unique index in one table, and those conditions also match some column of that single row to the primary or secondary index of the second table.

Example:
  SELECT DeptName, Name, YrsExp
  FROM Employee, Department
  WHERE Employee.EmpNo = Department.MgrNo
  AND Department.DeptNo = 100;

Correlated Queries
A correlated query is a subquery whose outer query results are processed a row at a time against the subquery result.

  SELECT last_name, department_number AS DEPTNO, salary_amount
  FROM employee ee
  WHERE salary_amount =
    (SELECT MAX(salary_amount)
     FROM employee em
     WHERE em.department_number = ee.department_number);

Steps of execution:
1. Read an employee row
2. Get the max salary for his/her department from the subquery
3. Compare his/her salary to the max salary
4. If equal, output this row
5. Go to 1
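
The row-at-a-time evaluation can be traced with a short sketch. The employee rows are made up for the example, and the loop shows only the logical steps listed, not how the Teradata Optimizer actually executes the query.

```python
# Hypothetical rows: (last_name, department_number, salary_amount).
employee = [
    ("Brown", 100, 50_000), ("Smith", 100, 62_000),
    ("Jones", 200, 48_000), ("Clay", 200, 48_000), ("Gray", 300, 55_000),
]


def max_salary_in(dept):
    """The inner query: MAX(salary_amount) for one department."""
    return max(sal for _name, d, sal in employee if d == dept)


result = []
for last_name, dept, salary in employee:          # 1. read an employee row
    dept_max = max_salary_in(dept)                # 2. run the subquery for it
    if salary == dept_max:                        # 3. compare
        result.append((last_name, dept, salary))  # 4. output on a match

print(result)
```

Department 200 returns two rows because Jones and Clay tie for the maximum, which the equality test in the outer query allows.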

Teradata Database Objects


Tables
  Base Tables
  Global Temporary Tables
  Volatile Tables
  Derived Tables
Views
Macros
Stored Procedures
Triggers
Join Index
Hash Index

Global Temporary Tables


Global Temporary Tables: hold information for intermediate results of queries
Can be accessed by any session once materialized, but data cannot be shared across sessions
Use temporary space to store data
A local instance is materialized when data is inserted, an index is defined, or COLLECT STATISTICS is issued
Optionally emptied at the end of each transaction
Materialized tables are valid for the session only. Data is lost once the logoff takes place
The table definition is stored in the database schema

  CREATE GLOBAL TEMPORARY TABLE gt_deptsal
    (deptno SMALLINT, avgsal DEC(9,2), maxsal DEC(9,2), minsal DEC(9,2), sumsal DEC(9,2), empcnt SMALLINT)
  ON COMMIT PRESERVE ROWS;

  INSERT INTO gt_deptsal
  SELECT dept, AVG(sal), MAX(sal), MIN(sal), SUM(sal), COUNT(emp)
  FROM emp
  GROUP BY 1;

Volatile Tables
Volatile Tables: hold information for intermediate results of queries
Valid for a session only
Not available after a database restart
No access logging can be done
No indexes or referential integrity can be implemented
Not stored in the database schema

  CREATE VOLATILE TABLE vt_deptsal, LOG
    (deptno SMALLINT, avgsal DEC(9,2), maxsal DEC(9,2), minsal DEC(9,2), sumsal DEC(9,2), empcnt SMALLINT)
  ON COMMIT PRESERVE ROWS;

  INSERT INTO vt_deptsal
  SELECT dept, AVG(sal), MAX(sal), MIN(sal), SUM(sal), COUNT(emp)
  FROM emp
  GROUP BY 1;

Derived Tables
Derived tables are temporary tables that are created in spool and dropped when the query completes.

Example: employees whose salary is greater than the company average

SELECT last_name, salary_amount, avgsal
FROM (SELECT AVG(salary_amount) FROM employee) my_temp(avgsal),
     employee
WHERE salary_amount > avgsal
ORDER BY 2 DESC;
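A minimal runnable sketch of the derived-table pattern follows. It assumes SQLite as a stand-in for Teradata and uses invented salaries. SQLite does not accept the `my_temp(avgsal)` column-list form, so the column is named with `AS avgsal` inside the subquery instead.

```python
import sqlite3

# SQLite stands in for Teradata; the three rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (last_name TEXT, salary_amount REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Adams", 40000.0), ("Brown", 60000.0), ("Clark", 80000.0)],
)

# The derived table my_temp exists only for the duration of this query.
above_average = conn.execute(
    """
    SELECT last_name, salary_amount, avgsal
    FROM (SELECT AVG(salary_amount) AS avgsal FROM employee) my_temp,
         employee
    WHERE salary_amount > avgsal
    ORDER BY 2 DESC
    """
).fetchall()

print(above_average)
```

With an average of 60000, only the 80000 salary qualifies, since the comparison is strictly greater than.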

Teradata Macro
A macro consists of one or more statements that are executed in a single transaction
A macro is similar to a multi-statement request, i.e. either all statements in the request complete successfully, or the entire request is aborted
All statements can be executed in parallel, making use of the parallel processing architecture of Teradata, thus reducing processing time
Macros simplify an operation that is complex or must be performed frequently
Can return a multi-row answer set
Typically called from a trigger

Creating a Macro:
CREATE MACRO NewEmpAdd (id INTEGER, name VARCHAR(50)) AS (
    INSERT INTO employee VALUES (:id, :name);
);

EXEC NewEmpAdd (25, 'ABC');

Macro vs. Stored Procedure

Locking in Teradata
Default locking mechanism in Teradata:

READers can simultaneously READ the same database object
A READer needs to wait while a WRITE operation is in effect on the same database object
A WRITEr needs to wait while a READ operation is in effect on the same database object
Everybody needs to wait while there is an EXCLUSIVE lock on the database object

This limits transaction concurrency. The solution is the ACCESS lock:

Downgrade the severity of the lock by specifying it explicitly: LOCKING t1 FOR ACCESS
This comes at the expense of possible uncommitted dependencies (dirty reads)

So at times there is a trade-off between transaction concurrency and data integrity.

The solution has to be built at the application level.

Locking Severity
The available lock severities, from most restrictive to least restrictive, are as follows:

Compatibility Among Locking Severities
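The compatibility rules can be written down as a small lookup table. The sketch below is an illustration, not Teradata's Lock Manager. It covers only the four severities named in these slides. The READ, WRITE and EXCLUSIVE entries follow the default rules listed earlier, while the ACCESS row and the WRITE-versus-WRITE entry are assumptions based on commonly documented Teradata behaviour. The names `COMPATIBLE_WITH` and `can_grant` are hypothetical.

```python
# For each requested severity: the held severities it can coexist with.
# Assumption: ACCESS is blocked only by EXCLUSIVE, and two WRITE locks conflict.
COMPATIBLE_WITH = {
    "ACCESS": {"ACCESS", "READ", "WRITE"},
    "READ": {"ACCESS", "READ"},
    "WRITE": {"ACCESS"},
    "EXCLUSIVE": set(),
}


def can_grant(requested, held):
    """Return True if `requested` is compatible with every lock in `held`."""
    return all(h in COMPATIBLE_WITH[requested] for h in held)


# Print the matrix: rows are requested locks, columns are locks already held.
order = ["ACCESS", "READ", "WRITE", "EXCLUSIVE"]
print("requested".ljust(10) + "".join(h.ljust(10) for h in order))
for req in order:
    cells = ("yes" if can_grant(req, [h]) else "wait" for h in order)
    print(req.ljust(10) + "".join(c.ljust(10) for c in cells))
```

For instance, `can_grant("ACCESS", ["WRITE"])` is true, which is why `LOCKING ... FOR ACCESS` improves concurrency at the risk of dirty reads.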

Locking Level
Locking level: the database object on which the lock is placed

Default Lock Assignments


The default lock assignments the Lock Manager applies:

Statistics
Statistics on a column or index of a table give the Optimizer details such as:
Total number of rows
Total values for the column
Unique values for the column
Null values of the column
Maximum number of rows per value
Minimum number of rows per value
Minimum value for an interval
Maximum value for an interval
Number of intervals

Using these values, the Optimizer chooses the best plan for executing the query.
Statistics should be refreshed regularly so that the Optimizer has current information about the table.
Random AMP Samples (RAS): if statistics are not available, the Optimizer uses Random AMP Samples, which are information collected from a single AMP about the table's columns and the data stored in them.
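To make the listed demographics concrete, here is a small Python sketch that computes several of them for one column. It is an illustration only: the function name `column_statistics`, the dictionary keys, and the sample values are hypothetical, and it does not model Teradata's interval histograms.

```python
from collections import Counter


def column_statistics(values):
    """Compute simple column demographics; None represents a NULL."""
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)  # number of rows per distinct value
    return {
        "total_rows": len(values),
        "null_values": len(values) - len(non_null),
        "unique_values": len(counts),
        "max_rows_per_value": max(counts.values()) if counts else 0,
        "min_rows_per_value": min(counts.values()) if counts else 0,
        "min_value": min(non_null) if non_null else None,
        "max_value": max(non_null) if non_null else None,
    }


# Hypothetical column with duplicates and two NULLs.
sample = [10, 20, 20, 30, 30, 30, None, None]
stats = column_statistics(sample)
print(stats)
```

Numbers like these let an optimizer estimate how many rows a predicate will return before it runs the query.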

Collect Statistics
Statistics can be collected on:
A single column
Primary Index
Secondary Indexes
Primary Index of a Join Index
Primary Index of a Hash Index
Columns that are part of a join condition in a query

Collect Statistics Example


CREATE TABLE t1 ( i int, j int, k int);
  i    j    k
100  100  100
150  150  200
200  200  100
250  250  200
300  400  100
350  450  200
400  500  200
450  550  300
600  700  500
650  750  600

EXPLAIN SEL * FROM t1 WHERE i > 200;


3) We do an all-AMPs RETRIEVE step from t1 by way of an all-rows scan with a condition of ("t1.i > 200") into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 1 row. The estimated time for this step is 0.03 seconds.

Collect Statistics Example


COLLECT STATISTICS ON t1 INDEX (i);

EXPLAIN SEL * FROM t1 WHERE i > 200;


3) We do an all-AMPs RETRIEVE step from NS.t1 by way of an all-rows scan with a condition of ("NS.t1.i > 200") into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with high confidence to be 7 rows. The estimated time for this step is 0.03 seconds.
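The 7-row estimate can be checked by hand. The sketch below assumes that the first ten values in the sample data listing are the values of column `i`; under that reading, it simply counts the rows that satisfy the condition.

```python
# Assumption: these are the ten values of column i from the sample data.
i_values = [100, 150, 200, 250, 300, 350, 400, 450, 600, 650]

# Rows that satisfy the WHERE clause "i > 200".
qualifying = [v for v in i_values if v > 200]

print(len(qualifying), "of", len(i_values), "rows qualify")
```

The count matches the Optimizer's high-confidence estimate after statistics were collected, whereas the earlier estimate of 1 row was made with no confidence.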

Explain
EXPLAIN <query>

EXPLAIN describes the execution plan that the Optimizer has prepared for a query. It shows:
The number of steps involved in executing the query
Tables/Views used in the query
Parallel steps
Internal joins to be used
Row estimate for each step
Time estimate for each step

Explain output can be viewed through BTEQ, SQL Assistant and Visual Explain. Visual Explain provides a graphical version of the explain steps, which is more readable. It can also be used to compare the explains of two queries.

Food for Optimizer


The Optimizer requires the following information to build a successful plan for query execution:
Environmental cost parameters: weights of CPU, disk, and network; disk delays; dbscontrol settings; PDE control settings
Performance constraints: data transfer rates for each type of storage medium and network interconnection
Statistics: information about the tables and columns used in the query, including total rows, number of unique values, number of rows per unique value, null values, minimum row value, and maximum row value

Based on these costs, the Optimizer decides how to perform joins, how to pull data from the AMPs, and how to redistribute it.

Tips for query optimization


Collect statistics on the join fields
Check that you have included all the necessary join conditions
Isolate the join that is your bottleneck
Avoid data transformation in join conditions
Avoid DISTINCT; use GROUP BY
Avoid IN and NOT IN; use EXISTS and NOT EXISTS
Replace Outer Join with UPDATE
Replace IN(..,..,..) by UNION for large queries if possible
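Two of these rewrites can be tried out on a small scale. The sketch below uses SQLite as a stand-in for Teradata, with hypothetical `orders` and `blocked` tables, so it says nothing about Teradata performance. It only shows that DISTINCT and GROUP BY return the same rows, and that NOT IN and NOT EXISTS are not interchangeable when the subquery can return NULL.

```python
import sqlite3

# SQLite stands in for Teradata; both tables are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (customer_id INTEGER);
    CREATE TABLE blocked (customer_id INTEGER);
    INSERT INTO orders VALUES (1), (1), (2), (3);
    INSERT INTO blocked VALUES (2), (NULL);
    """
)

# DISTINCT and GROUP BY produce the same set of values.
distinct_ids = conn.execute(
    "SELECT DISTINCT customer_id FROM orders ORDER BY 1"
).fetchall()
grouped_ids = conn.execute(
    "SELECT customer_id FROM orders GROUP BY customer_id ORDER BY 1"
).fetchall()

# NOT IN returns nothing here because the subquery yields a NULL.
not_in = conn.execute(
    """
    SELECT DISTINCT customer_id FROM orders
    WHERE customer_id NOT IN (SELECT customer_id FROM blocked)
    ORDER BY 1
    """
).fetchall()

# NOT EXISTS ignores the NULL row and returns the unmatched customers.
not_exists = conn.execute(
    """
    SELECT DISTINCT customer_id FROM orders o
    WHERE NOT EXISTS (SELECT 1 FROM blocked b
                      WHERE b.customer_id = o.customer_id)
    ORDER BY 1
    """
).fetchall()

print(distinct_ids, grouped_ids, not_in, not_exists)
```

When replacing NOT IN with NOT EXISTS, it is worth checking whether the compared columns are nullable, since the two forms then answer different questions.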

Data Load Utilities


MultiLoad utility (MLOAD): loads large quantities of data into unpopulated tables. MultiLoad also supports bulk inserts, updates, and deletions against populated tables.

FastLoad utility: loads unpopulated tables only. This program is similar to BulkLoad except that it runs much faster than BulkLoad and does not support update and delete operations.

TPump: provides for continuous update of tables; performs insert, update, and delete operations, or a combination of these operations, on tables using the same source feed.

FastExport utility: provides parallel export of data. It exports large quantities of data from the Teradata RDBMS to a client and is the functional complement of the FastLoad and MultiLoad utilities.

