
Tools and Utilities

TPT: Overview

LEVEL PRACTITIONER

About the Author

Created By:

Naveen Kambhoji

Credential Information:

Teradata Certified Professional. Has worked with Teradata technologies for the last 8 years.

Version and Date:

V 1.0
Date 04/30/2014

Cognizant Certified Official Curriculum

Icons Used

Questions
Tools
Coding Standards
Test Your Understanding
Demonstration
Best Practices & Industry Standards
Hands on Exercise
Case Study
Workshop

Teradata Parallel Transporter (TPT): Overview

Teradata Parallel Transporter is a flexible, high-performance tool that facilitates real-time data extraction, transformation and loading.
It supports an infrastructure that enables parallel execution of the product's components, known as operators, which integrate with the infrastructure (i.e., Teradata) in a plug-in fashion to perform these functions.
TPT works around the concept of Operators and Data Streams.

Operators + Data Streams

Operators:
Access to external resources: files, DBMS tables, message queues (MSMQ, WebSphere MQ)
Filter and transformation functions

Data Streams:
Transmit the data between two operators
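The operator and data stream idea can be sketched in a few lines of Python. This is only an illustration of the concept, not TPT itself: the function names, the `END_OF_DATA` sentinel and the use of `queue.Queue` as the "data stream" are all assumptions made for this example.

```python
import queue
import threading

END_OF_DATA = object()  # illustrative sentinel marking the end of the stream


def producer_operator(rows, stream):
    """Read rows from a source and put them on the data stream."""
    for row in rows:
        stream.put(row)
    stream.put(END_OF_DATA)


def consumer_operator(stream, target):
    """Take rows off the data stream and write them to a target."""
    while True:
        row = stream.get()
        if row is END_OF_DATA:
            break
        target.append(row)


def run_job(rows):
    """Run one producer and one consumer joined by a single data stream."""
    stream = queue.Queue(maxsize=100)  # bounded, so the producer cannot run far ahead
    target = []
    producer = threading.Thread(target=producer_operator, args=(rows, stream))
    consumer = threading.Thread(target=consumer_operator, args=(stream, target))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return target


loaded = run_job([("1", "alpha"), ("2", "beta"), ("3", "gamma")])
print(loaded)
```

The two operators never call each other directly; they only share the stream, which is what lets each side be swapped or run in parallel.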

TPT: Operators
Load. Places data into an empty table using Teradata FastLoad.
Update. Loads data and applies updates to new and existing tables using Teradata MultiLoad; the updates can be applied either conditionally or unconditionally based on user-defined rules.
Export. Extracts data from Teradata tables using Teradata FastExport.
Stream. Loads data and applies updates to new and existing tables by using multiple SQL protocol sessions in a high-performance, workload-balancing manner; the updates can be applied either conditionally or unconditionally based on user-defined rules. This operator can be used for loading data from continuous data sources, such as queuing systems (e.g., Microsoft Message Queuing (MSMQ) and WebSphere MQ) and enterprise application integration products.
SQL Inserter. Loads data, including large objects (LOBs), into a new or an existing table using a single SQL protocol session.
SQL Selector. Extracts data, including LOBs, from an existing table using a single SQL protocol session.
Open Database Connectivity (ODBC). Extracts data from external third-party ODBC sources.
Data Connector. Supports simultaneous, parallel reading of multiple data sources, such as various types of files or queuing systems; also allows writing to external data sources.

TPT: Operators

TPT: Scalable access to message queues

Teradata Parallel Transporter offers several ways of moving data from third-party sources into the Teradata Database. Aside from using the producer and consumer operators for data extraction and loading, Teradata Parallel Transporter also allows external units, called access modules, to be used as plug-ins through the Data Connector operator.
Access modules are software components that encapsulate the details of access to various data stores, such as files, tapes, named pipes and message queues. The Data Connector operator, which acts as an adapter for access modules, insulates Teradata Parallel Transporter from the inner workings of these modules, thus allowing them to be user-defined and user-constructed and then executed under Teradata Parallel Transporter as if they were operators.
Teradata Parallel Transporter provides these access modules, all of which are checkpoint-restartable:
Named Pipes Access Module allows users to load data into Teradata Database from a named pipe.
WebSphere MQ Access Module allows users to load data into Teradata Database from a message queue using WebSphere MQ message queuing middleware.
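The adapter role of the Data Connector operator can be pictured with a small Python sketch. Everything here is hypothetical and chosen for illustration: the `AccessModule` interface, the `LineListAccessModule` class and the `produce` method are not part of any Teradata API.

```python
from typing import Iterable, Iterator, List, Protocol


class AccessModule(Protocol):
    """Hypothetical interface: anything that can yield records from a data store."""

    def read_records(self) -> Iterator[str]:
        ...


class LineListAccessModule:
    """Illustrative module that reads records from an in-memory list of lines."""

    def __init__(self, lines: Iterable[str]) -> None:
        self._lines = list(lines)

    def read_records(self) -> Iterator[str]:
        for line in self._lines:
            yield line.rstrip("\n")


class DataConnector:
    """Adapter: the rest of the job sees only this class, never the module's internals."""

    def __init__(self, module: AccessModule) -> None:
        self._module = module

    def produce(self) -> List[str]:
        return list(self._module.read_records())


connector = DataConnector(LineListAccessModule(["a|1\n", "b|2\n"]))
records = connector.produce()
print(records)
```

A module for a named pipe or a message queue would implement the same `read_records` method, and the connector would not need to change.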

TPT: Scalable, continuous loading using the Stream operator
Like Teradata TPump, the Stream operator is a general-purpose load
utility for the Teradata Database that supports continuous loading.
Unlike TPump, which runs as a single-process application, the Stream
operator can scale to run with multiple instances in a Teradata Parallel
Transporter job while supporting most TPump features. The Stream
operator uses standard SQL to insert, update or delete data. It also
supports UPSERT, an operation that allows rows to be inserted if they
are not found for an update.
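The UPSERT behaviour described above (update the row, or insert it when no row is found) can be shown with a minimal sketch. A plain Python dict stands in for the table here; that stand-in and the `upsert` helper are assumptions for illustration, not how the Stream operator is implemented.

```python
def upsert(table, key, values):
    """Update the row stored under `key`; insert it if no such row exists.

    `table` is a plain dict standing in for a database table (an assumption
    made for illustration only). Returns "updated" or "inserted".
    """
    if key in table:
        table[key].update(values)
        return "updated"
    table[key] = dict(values)
    return "inserted"


accounts = {101: {"balance": 50}}
first = upsert(accounts, 101, {"balance": 75})   # key exists, so the row is updated
second = upsert(accounts, 202, {"balance": 10})  # key missing, so the row is inserted
print(first, second, accounts)
```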

TPT: Scalable access to transactional files

Transaction files: active approach, mini-batch approach

Active Approach: Employ the active directory scan feature to continuously collect data from these directories, based on a user-defined time interval, while the Data Connector operator activates the start and stop time for the entire scan job.
Mini-batch Approach: Like the active directory scan, this approach allows transactional files to be collected periodically from a directory and their contents loaded into the Teradata Database. Unlike the active directory scan, the mini-batch approach uses the Load or Update operator instead of the Stream operator for loading the data.
There are trade-offs between active, continuous and mini-batch
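Both approaches rest on the same idea: look at a directory repeatedly and pick up only the files that arrived since the last look. The sketch below shows one such scan pass in Python. The `scan_once` helper, the `txn_*.dat` file names and the temporary landing directory are hypothetical; a real job would also wait for the user-defined interval between passes and stop at the configured end time.

```python
import os
import tempfile


def scan_once(directory, already_seen):
    """Return files that appeared in `directory` since the previous scan."""
    current = {
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    }
    new_files = sorted(current - already_seen)
    already_seen.update(new_files)  # remember them so they are not loaded twice
    return new_files


with tempfile.TemporaryDirectory() as landing_dir:
    seen = set()
    open(os.path.join(landing_dir, "txn_001.dat"), "w").close()
    first_pass = scan_once(landing_dir, seen)
    open(os.path.join(landing_dir, "txn_002.dat"), "w").close()
    second_pass = scan_once(landing_dir, seen)
    third_pass = scan_once(landing_dir, seen)  # nothing new arrived

print(first_pass, second_pass, third_pass)
```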

TPT: Active Approach

TPT: Operators Continued

TPT-Continued

TPT + Informatica Architecture

Informatica with Scripting

Informatica with API (TPT)

Configure Informatica TPT

Session Configuration

Best Practices

TPT: ELT approach

Although periodic load with the Load or Update operator offers the block-at-a-time performance of high-volume data loading, it has restrictions for applications. For example, it does not support target tables with unique secondary indexes (USI), join indexes (JI), referential integrity (RI) or triggers.
With the extract, load and transform (ELT) approach, however, these restrictions can be avoided. By first loading a small batch of files into a staging table, you can use SQL statements (such as INSERT-SELECT, UPDATE-FROM or MERGE-INTO) to apply the data from the staging table to the target table.
The ELT approach can also take advantage of the SQL bulk load operations that are available within the Teradata Database. These operations not only support MERGE-INTO but also enhance INSERT-SELECT and UPDATE-FROM. This enables primary, fallback and index data processing with block-at-a-time optimization.
The Teradata bulk load operations also allow users to define their own error tables to handle errors from operations on target tables. These are separate and different from the Update operator's error tables. Furthermore, the no primary index (NoPI) table feature also extends the bulk load capabilities. By allowing NoPI tables, Teradata can load a staging table faster and more efficiently.
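The staging-then-apply pattern can be tried out with a small, self-contained sketch. It uses Python's built-in `sqlite3` module purely as a stand-in database, so the SQL below is generic rather than Teradata syntax, and the `staging` and `target` tables and their columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE target  (id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE staging (id INTEGER, amount INTEGER);
    INSERT INTO target VALUES (1, 100), (2, 200);
    """
)

# Load step: the batch lands in the staging table first.
conn.executemany("INSERT INTO staging VALUES (?, ?)", [(2, 250), (3, 300)])

# Apply step, part 1: update rows that already exist in the target.
conn.execute(
    """
    UPDATE target
    SET amount = (SELECT s.amount FROM staging s WHERE s.id = target.id)
    WHERE id IN (SELECT id FROM staging)
    """
)
# Apply step, part 2: INSERT-SELECT the rows that are new.
conn.execute(
    """
    INSERT INTO target (id, amount)
    SELECT s.id, s.amount FROM staging s
    WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.id = s.id)
    """
)
conn.commit()

rows = conn.execute("SELECT id, amount FROM target ORDER BY id").fetchall()
print(rows)
conn.close()
```

On a database that supports it, the two apply statements could be replaced by a single MERGE-INTO, as the text above mentions.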


Source

http://www.teradatamagazine.com/
www.teradataforum.com
www.teradata.com
Tera-Tom on Teradata Utilities V12-V13
Informatica.com

Disclaimer: Parts of the content of this course are based on the materials available from the
Web sites and books listed above. The materials that can be accessed from linked sites are
not maintained by Cognizant Academy and we are not responsible for the contents thereof.
All trademarks, service marks, and trade names in this course are the marks of the
respective owner(s).

