Teradata Best Practices


31/05/2011

Retail ISU Thomson Dcruz thomson.dcruz@tcs.com

TCS Internal

About this document
This document contains best practices that developers should follow when coding BTEQ scripts or stored procedures. Following these practices helps improve query performance and reduce I/O.

Intended Audience

The intended audience for this document is developers working on the Teradata Database.

- In Teradata, the Primary Index (PI) is used to distribute the data among the AMPs (Access Module Processors). Hence, at least one column in a Primary Index should be defined as NOT NULL.
- Selection of Primary Indexes should be based on the following:
    o Joining columns: Using join columns as the PI ensures that the data is matched locally on the AMPs, thereby preventing redistribution.
    o Uniqueness: This ensures that data is distributed evenly among all the AMPs.
- If character fields are used in the Primary Index, they should always be defined with the CHAR data type and not the VARCHAR data type.
- Date fields should always be defined with a format of 'YYYY-MM-DD'.
- UNION/MINUS/EXCEPT should be avoided in joins.
- Columns should never be cast to other data types while joining (in the WHERE clause), as this degrades performance.
- Use GROUP BY on all columns to get distinct values instead of using DISTINCT.
- Statistics should always be collected on non-PI columns that are used to join with other tables.
- Character fields should always be NOT CASESPECIFIC.
- A Partitioned Primary Index (PPI) should be used on large fact tables, as it helps in faster query execution and reduced I/O.
- A query should refer to each table either by its alias or by its fully qualified name. Using both may result in a product join.
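Several of the table-design points above can be illustrated with a minimal sketch in Teradata SQL. The Sales table and Product_Key column come from the example in this document; every other column name, the date range, and the choice of Product_Key as the Primary Index are assumptions made purely for illustration.

```sql
-- Hypothetical fact table; only Sales and Product_Key appear in this document.
CREATE MULTISET TABLE Sales
(
    Sales_Id     INTEGER NOT NULL,
    Product_Key  INTEGER NOT NULL,                    -- PI column, NOT NULL
    Store_Code   CHAR(6) NOT CASESPECIFIC NOT NULL,   -- CHAR, not VARCHAR
    Sales_Date   DATE FORMAT 'YYYY-MM-DD' NOT NULL,
    Sales_Amt    DECIMAL(18,2)
)
PRIMARY INDEX (Product_Key)                           -- join column used as PI
PARTITION BY RANGE_N (                                -- Partitioned Primary Index
    Sales_Date BETWEEN DATE '2010-01-01' AND DATE '2011-12-31'
    EACH INTERVAL '1' MONTH
);

-- Statistics on a non-PI column that is used in joins (assumed here).
COLLECT STATISTICS ON Sales COLUMN (Store_Code);

-- Distinct values obtained with GROUP BY instead of DISTINCT.
SELECT Product_Key, Store_Code
FROM Sales
GROUP BY Product_Key, Store_Code;
```

Whether Product_Key is a good Primary Index in practice depends on how evenly its values are spread across rows, which should be checked against real data.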

Example of Product Join
SELECT *
FROM Product prod, Sales sal
WHERE Product.Product_Key = sal.Product_Key
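For contrast, a sketch of the same query with both tables referenced only through their aliases. It assumes the intent was a plain equi-join between Product and Sales on Product_Key.

```sql
-- Both sides of the join condition use the aliases declared in FROM,
-- so no extra, unconstrained reference to Product is introduced.
SELECT *
FROM Product prod, Sales sal
WHERE prod.Product_Key = sal.Product_Key;
```

In the original form, Product.Product_Key is treated as a reference to a second, unaliased instance of Product that has no join condition tying it to prod, which is what leads to the product join.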

The following should not be used in SQL, as they cause a full table scan:
- LIKE operator
- NOT IN
- Inequality operators (e.g. <, >)

MULTISET tables are preferred to SET tables, as they avoid the time spent checking for duplicate rows. Use of NOT IN should be avoided, as NOT IN results in a full table scan. Instead, use a correlated subquery with the NOT EXISTS clause.
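A minimal sketch of the NOT EXISTS rewrite, reusing the Product and Sales tables from the earlier example. The business question (products with no sales rows) is an assumption chosen for illustration.

```sql
-- NOT IN form, which this document recommends avoiding:
--   SELECT prod.Product_Key
--   FROM Product prod
--   WHERE prod.Product_Key NOT IN (SELECT sal.Product_Key FROM Sales sal);

-- Correlated subquery with NOT EXISTS:
SELECT prod.Product_Key
FROM Product prod
WHERE NOT EXISTS (
    SELECT 1
    FROM Sales sal
    WHERE sal.Product_Key = prod.Product_Key
);
```

The two forms also differ when Sales.Product_Key contains NULLs: NOT IN then returns no rows, while NOT EXISTS still returns the products that have no matching sales.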
