Informatica PowerCenter®
(Version 7.1.1)
Informatica PowerCenter Getting Started
Version 7.1.1
August 2004
This software and documentation contain proprietary information of Informatica Corporation. They are provided under a license agreement
containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No
part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise)
without prior consent of Informatica Corporation.
Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software
license agreement as provided in DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR
12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.
The information in this document is subject to change without notice. If you find any problems in the documentation, please report them to
us in writing. Informatica Corporation does not warrant that this documentation is error free.
Informatica, PowerMart, PowerCenter, PowerChannel, PowerCenter Connect, MX, and SuperGlue are trademarks or registered trademarks
of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be
trade names or trademarks of their respective owners.
Informatica PowerCenter products contain ACE (TM) software copyrighted by Douglas C. Schmidt and his research group at Washington
University and University of California, Irvine, Copyright (c) 1993-2002, all rights reserved.
Portions of this software contain copyrighted material from The JBoss Group, LLC. Your right to use such materials is set forth in the GNU
Lesser General Public License Agreement, which may be found at http://www.opensource.org/licenses/lgpl-license.php. The JBoss materials
are provided free of charge by Informatica, “as-is”, without warranty of any kind, either express or implied, including but not limited to the
implied warranties of merchantability and fitness for a particular purpose.
Portions of this software contain copyrighted material from Meta Integration Technology, Inc. Meta Integration® is a registered trademark
of Meta Integration Technology, Inc.
This product includes software developed by the Apache Software Foundation (http://www.apache.org/).
The Apache Software is Copyright (c) 1999-2004 The Apache Software Foundation. All rights reserved.
DISCLAIMER: Informatica Corporation provides this documentation “as is” without warranty of any kind, either express or implied,
including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. The information
provided in this documentation may include technical inaccuracies or typographical errors. Informatica could make improvements and/or
changes in the products described in this documentation at any time without notice.
Table of Contents
List of Figures
List of Tables
Preface
New Features and Enhancements
PowerCenter 7.1.1
PowerCenter 7.1
PowerCenter 7.0
About Informatica Documentation
About this Book
Document Conventions
Other Informatica Resources
Visiting Informatica Customer Portal
Visiting the Informatica Webzine
Visiting the Informatica Web Site
Visiting the Informatica Developer Network
Obtaining Technical Support
Folders in this Tutorial
Creating Source Tables
What Comes Next
Using the Overview Window
Arranging Transformations
Creating a Session and Workflow
Creating the Session
Creating the Workflow
Running the Workflow
What Comes Next
Mappings
Mapplets
Sessions
Worklets
Workflows
List of Figures
Figure 4-1. Pass-Through Mapping
Figure 4-2. Sample Workflow
Figure 5-1. Transformation Toolbar
Figure 6-1. Mapping with Fact and Dimension Tables
Figure 7-1. Mapping with XML Sources and Targets
Figure 7-2. XML Editor
Figure 7-3. ENG_SALARY.XML Output
Figure 7-4. SLS_SALARY.XML Output
Preface
Welcome to PowerCenter, Informatica’s software product that delivers an open, scalable data
integration solution addressing the complete life cycle for all data integration projects
including data warehouses and data marts, data migration, data synchronization, and
information hubs. PowerCenter combines the latest technology enhancements for reliably
managing data repositories and delivering information resources in a timely, usable, and
efficient manner.
The PowerCenter metadata repository coordinates and drives a variety of core functions,
including extracting, transforming, loading, and managing data. The PowerCenter Server can
extract large volumes of data from multiple platforms, handle complex transformations on the
data, and support high-speed loads. PowerCenter can simplify and accelerate the process of
moving data warehouses from development to test to production.
New Features and Enhancements
This section describes new features and enhancements to PowerCenter 7.1.1, 7.1, and 7.0.
PowerCenter 7.1.1
This section describes new features and enhancements to PowerCenter 7.1.1.
Data Profiling
♦ Data sampling. You can create a data profile for a sample of source data instead of the
entire source. You can view a profile from a random sample of data, a specified percentage
of data, or for a specified number of rows starting with the first row.
♦ Verbose data enhancements. You can specify the type of verbose data you want the
PowerCenter Server to write to the Data Profiling warehouse. The PowerCenter Server can
write all rows, the rows that meet the business rule, or the rows that do not meet the
business rule.
♦ Session enhancement. You can save sessions that you create from the Profile Manager to
the repository.
♦ Domain Inference function tuning. You can configure the Data Profiling Wizard to filter
the Domain Inference function results. You can configure a maximum number of patterns
and a minimum pattern frequency. You may want to narrow the scope of patterns returned
to view only the primary domains, or you may want to widen the scope of patterns
returned to view exception data.
♦ Row Uniqueness function. You can determine unique rows for a source based on a
selection of columns for the specified source.
♦ Define mapping, session, and workflow prefixes. You can define default mapping,
session, and workflow prefixes for the mappings, sessions, and workflows generated when
you create a data profile.
♦ Profile mapping display in the Designer. The Designer displays profile mappings under a
profile mappings node in the Navigator.
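The three sampling modes listed under Data sampling (a random sample, a specified percentage, and the first N rows) can be illustrated with a short Python sketch. The function names and the in-memory list of rows are assumptions made for illustration only. They are not part of PowerCenter, and the PowerCenter Server may choose its samples differently.

```python
import random


def first_n_rows(rows, n):
    """Return the first n rows, starting with the first row."""
    return rows[:n]


def percent_sample(rows, percent, seed=None):
    """Return a random sample holding about `percent` percent of the rows."""
    rng = random.Random(seed)
    k = round(len(rows) * percent / 100)
    # Sample row positions, then sort them so the rows stay in source order.
    picked = sorted(rng.sample(range(len(rows)), k))
    return [rows[i] for i in picked]


def random_sample(rows, k, seed=None):
    """Return k rows chosen at random (fewer if the source is smaller)."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))


# Hypothetical source data: 100 rows with a single "id" column.
rows = [{"id": i} for i in range(1, 101)]
print(first_n_rows(rows, 3))
print(len(percent_sample(rows, 10, seed=1)))
print(len(random_sample(rows, 5, seed=1)))
```

Passing a seed makes a sample repeatable, which can help when you want to compare two profile runs over the same sample.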
PowerCenter Server
♦ Code page. PowerCenter supports additional Japanese language code pages, such as
JIPSE-kana, JEF-kana, and MELCOM-kana.
♦ Flat file partitioning. When you create multiple partitions for a flat file source session, you
can configure the session to create multiple threads to read the flat file source.
♦ pmcmd. You can use parameter files that reside on a local machine with the Startworkflow
command in the pmcmd program. When you use a local parameter file, pmcmd passes
variables and values in the file to the PowerCenter Server.
♦ SuSE Linux support. The PowerCenter Server runs on SuSE Linux. On SuSE Linux, you
can connect to IBM DB2, Oracle, and Sybase sources, targets, and repositories using
native drivers. Use ODBC drivers to access other sources and targets.
♦ Reserved word support. If any source, target, or lookup table name or column name
contains a database reserved word, you can create and maintain a file, reswords.txt,
containing reserved words. When the PowerCenter Server initializes a session, it searches
for reswords.txt in the PowerCenter Server installation directory. If the file exists, the
PowerCenter Server places quotes around matching reserved words when it executes SQL
against the database.
♦ Teradata external loader. When you load to Teradata using an external loader, you can
now override the control file. Depending on the loader you use, you can also override the
error, log, and work table names by specifying different tables on the same or different
Teradata database.
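The reserved word behavior described above can be sketched in Python as follows. This is only an illustration of the idea of quoting matching names before SQL is issued. The one-word-per-line file layout, the function names, and the sample table and column names are assumptions; the actual layout of reswords.txt and the way the PowerCenter Server applies it may differ.

```python
def load_reserved_words(text):
    """Parse reserved words, assuming one word per line; skip blank lines."""
    return {line.strip().upper() for line in text.splitlines() if line.strip()}


def quote_if_reserved(name, reserved):
    """Wrap a table or column name in double quotes if it is a reserved word."""
    return f'"{name}"' if name.upper() in reserved else name


def build_select(table, columns, reserved):
    """Build a SELECT statement, quoting only the names that are reserved."""
    cols = ", ".join(quote_if_reserved(c, reserved) for c in columns)
    return f"SELECT {cols} FROM {quote_if_reserved(table, reserved)}"


# Hypothetical file contents and object names.
reserved = load_reserved_words("MONTH\nYEAR\n\nORDER\n")
print(build_select("ORDER", ["ID", "MONTH", "AMOUNT"], reserved))
# prints: SELECT ID, "MONTH", AMOUNT FROM "ORDER"
```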
Repository
♦ Exchange metadata with other tools. You can exchange source and target metadata with
other BI or data modeling tools, such as Business Objects Designer. You can export or
import multiple objects at a time. When you export metadata, the PowerCenter Client
creates a file format recognized by the target tool.
Repository Server
♦ pmrep. You can use pmrep to perform the following functions:
− Remove repositories from the Repository Server cache entry list.
− Enable enhanced security when you create a relational source or target connection in the
repository.
− Update a connection attribute value when you update the connection.
♦ SuSE Linux support. The Repository Server runs on SuSE Linux. On SuSE Linux, you
can connect to IBM DB2, Oracle, and Sybase repositories.
Security
♦ Oracle OS Authentication. You can now use Oracle OS Authentication to authenticate
database users. Oracle OS Authentication allows you to log on to an Oracle database if you
have a logon to the operating system. You do not need to know a database user name and
password. PowerCenter uses Oracle OS Authentication when the user name for an Oracle
connection is PmNullUser.
♦ Pipeline partitioning. You can create multiple partitions in a session containing web
service source and target definitions. The PowerCenter Server creates a connection to the
Web Services Hub based on the number of sources, targets, and partitions in the session.
XML
♦ Multi-level pivoting. You can now pivot more than one multiple-occurring element in an
XML view. You can also pivot the view row.
PowerCenter 7.1
This section describes new features and enhancements to PowerCenter 7.1.
Data Profiling
♦ Data Profiling for VSAM sources. You can now create a data profile for VSAM sources.
♦ Support for verbose mode for source-level functions. You can now create data profiles
with source-level functions and write data to the Data Profiling warehouse in verbose
mode.
♦ Aggregator function in auto profiles. Auto profiles now include the Aggregator function.
♦ Creating auto profile enhancements. You can now select the columns or groups you want
to include in an auto profile and enable verbose mode for the Distinct Value Count
function.
♦ Purging data from the Data Profiling warehouse. You can now purge data from the Data
Profiling warehouse.
♦ Source View in the Profile Manager. You can now view data profiles by source definition
in the Profile Manager.
♦ PowerCenter Data Profiling report enhancements. You can now view PowerCenter Data
Profiling reports in a separate browser window, resize columns in a report, and view
verbose data for Distinct Value Count functions.
♦ Prepackaged domains. Informatica provides a set of prepackaged domains that you can
include in a Domain Validation function in a data profile.
Documentation
♦ Web Services Provider Guide. This is a new book that describes the functionality of Real-time
Web Services. It also includes information from the version 7.0 Web Services Hub Guide.
♦ XML User Guide. This book consolidates XML information previously documented in the
Designer Guide, Workflow Administration Guide, and Transformation Guide.
Licensing
Informatica provides licenses for each CPU and each repository rather than for each
installation. Informatica provides licenses for product, connectivity, and options. You store
the license keys in a license key file. You can manage the license files using the Repository
Server Administration Console, the PowerCenter Server Setup, and the command line
program, pmlic.
PowerCenter Server
♦ 64-bit support. You can now run 64-bit PowerCenter Servers on AIX and HP-UX
(Itanium).
♦ Partitioning enhancements. If you have the Partitioning option, you can define up to 64
partitions at any partition point in a pipeline that supports multiple partitions.
♦ PowerCenter Server processing enhancements. The PowerCenter Server now reads a
block of rows at a time. This improves processing performance for most sessions.
♦ CLOB/BLOB datatype support. You can now read and write CLOB/BLOB datatypes.
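The general idea behind reading a block of rows at a time, rather than one row at a time, can be shown with a small Python sketch. The `read_blocks` helper and the block size are hypothetical and say nothing about how the PowerCenter Server implements block processing internally.

```python
from itertools import islice


def read_blocks(rows, block_size):
    """Yield lists of up to block_size rows from any iterable of rows."""
    it = iter(rows)
    while True:
        block = list(islice(it, block_size))
        if not block:
            return
        yield block


# Seven hypothetical rows, processed three at a time.
for block in read_blocks(range(7), 3):
    print(block)
```

Each pass through the loop handles several rows with a single call, which is where the per-row overhead is saved.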
Repository Server
♦ Updating repository statistics. PowerCenter now identifies and updates statistics for all
repository tables and indexes when you copy, upgrade, and restore repositories. This
improves performance when PowerCenter accesses the repository.
♦ Increased repository performance. You can increase repository performance by skipping
information when you copy, back up, or restore a repository. You can choose to skip MX
data, workflow and session log history, and deploy group history.
♦ pmrep. You can use pmrep to back up, disable, or enable a repository, delete a relational
connection from a repository, delete repository details, truncate log files, and run multiple
pmrep commands sequentially. You can also use pmrep to create, modify, and delete a
folder.
Repository
♦ Exchange metadata with business intelligence tools. You can export metadata to and
import metadata from other business intelligence tools, such as Cognos ReportNet and
Business Objects.
♦ Object import and export enhancements. You can compare objects in an XML file to
objects in the target repository when you import objects.
♦ MX views. MX views have been added to help you analyze metadata stored in the
repository. REP_SERVER_NET and REP_SERVER_NET_REF views allow you to see
information about server grids. REP_VERSION_PROPS allows you to see the version
history of all objects in a PowerCenter repository.
Transformations
♦ Flat file lookup. You can now perform lookups on flat files. When you create a Lookup
transformation using a flat file as a lookup source, the Designer invokes the Flat File
Wizard. You can also use a lookup file parameter if you want to change the name or
location of a lookup between session runs.
♦ Dynamic lookup cache enhancements. When you use a dynamic lookup cache, the
PowerCenter Server can ignore some ports when it compares values in lookup and input
ports before it updates a row in the cache. Also, you can choose whether the PowerCenter
Server outputs old or new values from the lookup/output ports when it updates a row. You
might want to output old values from lookup/output ports when you use the Lookup
transformation in a mapping that updates slowly changing dimension tables.
♦ Union transformation. You can use the Union transformation to merge multiple sources
into a single pipeline. The Union transformation is similar to using the UNION ALL SQL
statement to combine the results from two or more SQL statements.
♦ Custom transformation API enhancements. The Custom transformation API includes
new array-based functions that allow you to create procedure code that receives and
outputs a block of rows at a time. Use these functions to take advantage of the
PowerCenter Server processing enhancements.
♦ Midstream XML transformations. You can now create an XML Parser transformation or
an XML Generator transformation to parse or generate XML inside a pipeline. The XML
transformations enable you to extract XML data stored in relational tables, such as data
stored in a CLOB column. You can also extract data from messaging systems, such as
TIBCO or IBM MQSeries.
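The comparison between the Union transformation and the UNION ALL SQL statement can be made concrete with a small sketch that uses Python's built-in sqlite3 module. The tables `east` and `west` and their contents are hypothetical. The point is only that UNION ALL keeps duplicate rows, whereas a plain UNION removes them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE east (customer TEXT);
    CREATE TABLE west (customer TEXT);
    INSERT INTO east VALUES ('Ann'), ('Bob');
    INSERT INTO west VALUES ('Bob'), ('Cy');
    """
)

# UNION ALL merges both result sets and keeps the duplicate 'Bob'.
union_all_rows = [
    row[0]
    for row in conn.execute(
        "SELECT customer FROM east UNION ALL SELECT customer FROM west"
    )
]

# A plain UNION removes duplicates, so 'Bob' appears only once.
union_rows = [
    row[0]
    for row in conn.execute(
        "SELECT customer FROM east UNION SELECT customer FROM west"
    )
]

print(sorted(union_all_rows))
print(sorted(union_rows))
conn.close()
```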
Usability
♦ Viewing active folders. The Designer and the Workflow Manager highlight the active
folder in the Navigator.
♦ Enhanced printing. The print quality of the workspace has improved.
Version Control
You can run object queries that return shortcut objects. You can also run object queries based
on the latest status of an object. The query can return local objects that are checked out, the
latest version of checked in objects, or a collection of all older versions of objects.
Note: PowerCenter Connect for Web Services allows you to create sources, targets, and
transformations to call web services hosted by other providers. For more information, see the
PowerCenter Connect for Web Services User and Administrator Guide.
Workflow Monitor
The Workflow Monitor includes the following performance and usability enhancements:
♦ When you connect to the PowerCenter Server, you no longer distinguish between online
and offline mode.
♦ You can open multiple instances of the Workflow Monitor on one machine.
♦ You can simultaneously monitor multiple PowerCenter Servers registered to the same
repository.
♦ The Workflow Monitor includes improved options for filtering tasks by start and end
time.
♦ The Workflow Monitor displays workflow runs in Task view chronologically with the most
recent run at the top. It displays folders alphabetically.
♦ You can remove the Navigator and Output window.
XML Support
PowerCenter XML support now includes the following features:
♦ Enhanced datatype support. You can use XML schemas that contain simple and complex
datatypes.
♦ Additional options for XML definitions. When you import XML definitions, you can
choose how you want the Designer to represent the metadata associated with the imported
files. You can choose to generate XML views using hierarchy or entity relationships. In a
view with hierarchy relationships, the Designer expands each element and reference under
its parent element. When you create views with entity relationships, the Designer creates
separate entities for references and multiple-occurring elements.
♦ Synchronizing XML definitions. You can synchronize one or more XML definitions when
the underlying schema changes. You can synchronize an XML definition with any
repository definition or file used to create the XML definition, including relational sources
or targets, XML files, DTD files, or schema files.
♦ XML workspace. You can edit XML views and relationships between views in the
workspace. You can create views, add or delete columns from views, and define
relationships between views.
♦ Midstream XML transformations. You can now create an XML Parser transformation or
an XML Generator transformation to parse or generate XML inside a pipeline. The XML
transformations enable you to extract XML data stored in relational tables, such as data
stored in a CLOB column. You can also extract data from messaging systems, such as
TIBCO or IBM MQSeries.
♦ Support for circular references. Circular references occur when an element is a direct or
indirect child of itself. PowerCenter now supports XML files, DTD files, and XML
schemas that use circular definitions.
♦ Increased performance for large XML targets. You can create XML files of several
gigabytes in a PowerCenter 7.1 XML session by using the following enhancements:
− Spill to disk. You can specify the size of the cache used to store the XML tree. If the size
of the tree exceeds the cache size, the XML data spills to disk in order to free up
memory.
− User-defined commits. You can define commits to trigger flushes for XML target files.
− Support for multiple XML output files. You can output XML data to multiple XML
targets. You can also define the file names for XML output files in the mapping.
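The definition of a circular reference given above (an element that is a direct or indirect child of itself) can be illustrated with a short Python sketch. The dictionary of element names and the `find_circular_elements` helper are hypothetical. The sketch shows how such references can be detected in a simple parent-to-children map, not how PowerCenter processes them.

```python
def find_circular_elements(children):
    """Return the elements that are direct or indirect children of themselves."""

    def descendants(start):
        # Walk the child links from `start` and collect every element reached.
        seen, stack = set(), list(children.get(start, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(children.get(node, ()))
        return seen

    return sorted(name for name in children if name in descendants(name))


# Hypothetical schema: each element maps to the elements it can contain.
schema = {
    "employee": ["name", "manager"],
    "manager": ["employee"],          # indirect: employee -> manager -> employee
    "section": ["title", "section"],  # direct: section contains section
    "name": [],
    "title": [],
}
print(find_circular_elements(schema))
# prints: ['employee', 'manager', 'section']
```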
PowerCenter 7.0
This section describes new features and enhancements to PowerCenter 7.0.
Data Profiling
If you have the Data Profiling option, you can profile source data to evaluate source data and
detect patterns and exceptions. For example, you can determine implicit data type, suggest
candidate keys, detect data patterns, and evaluate join criteria. After you create a profiling
warehouse, you can create profiling mappings and run sessions. Then you can view reports
based on the profile data in the profiling warehouse.
The PowerCenter Client provides a Profile Manager and a Profile Wizard to complete these
tasks.
Documentation
♦ Glossary. The Installation and Configuration Guide contains a glossary of new PowerCenter
terms.
♦ Installation and Configuration Guide. The connectivity information in the Installation
and Configuration Guide is consolidated into two chapters. This book now contains
chapters titled “Connecting to Databases from Windows” and “Connecting to Databases
from UNIX.”
♦ Upgrading metadata. The Installation and Configuration Guide now contains a chapter
titled “Upgrading Repository Metadata.” This chapter describes changes to repository
objects impacted by the upgrade process. The change in functionality for existing objects
depends on the version of the existing objects. Consult the upgrade information in this
chapter for each upgraded object to determine whether the upgrade applies to your current
version of PowerCenter.
Functions
♦ Soundex. The Soundex function encodes a string value into a four-character string.
SOUNDEX works for characters in the English alphabet (A-Z). It uses the first character
of the input string as the first character in the return value and encodes the remaining
three unique consonants as numbers.
♦ Metaphone. The Metaphone function encodes string values. You can specify the length of
the string that you want to encode. METAPHONE encodes characters of the English
language alphabet (A-Z). It encodes both uppercase and lowercase letters in uppercase.
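The Soundex encoding described above can be sketched in Python. This sketch follows the widely used American Soundex rules (keep the first letter, encode later consonants as digits, skip repeated codes, and pad with zeros to four characters). The names `CODES` and `soundex` are illustrative, and the PowerCenter SOUNDEX function may treat edge cases, such as input with no English letters, differently.

```python
# Map each encoded consonant to its Soundex digit.
CODES = {}
for digit, letters in {
    "1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R",
}.items():
    for letter in letters:
        CODES[letter] = digit


def soundex(value):
    """Return a four-character Soundex code, or '' if there are no A-Z letters."""
    letters = [c for c in value.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        code = CODES.get(ch, "")  # vowels and Y have no code
        if code and code != prev:
            result += code
            if len(result) == 4:
                break
        prev = code
    return result.ljust(4, "0")  # pad short codes with zeros


for name in ("Robert", "Rupert", "Tymczak", "Lee"):
    print(name, soundex(name))
```

Names that sound alike, such as Robert and Rupert, receive the same code, which is what makes the function useful for matching near-duplicate values.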
Installation
♦ Remote PowerCenter Client installation. You can create a control file containing
installation information, and distribute it to other users to install the PowerCenter Client.
You access the Informatica installation CD from the command line to create the control
file and install the product.
PowerCenter Server
♦ DB2 bulk loading. You can enable bulk loading when you load to IBM DB2 8.1.
♦ Distributed processing. If you purchase the Server Grid option, you can group
PowerCenter Servers registered to the same repository into a server grid. In a server grid,
PowerCenter Servers balance the workload among all the servers in the grid.
♦ Row error logging. The session configuration object has new properties that allow you to
define error logging. You can choose to log row errors in a central location to help
understand the cause and source of errors.
♦ External loading enhancements. When using external loaders on Windows, you can now
choose to load from a named pipe. When using external loaders on UNIX, you can now
choose to load from staged files.
♦ External loading using Teradata Warehouse Builder. You can use Teradata Warehouse
Builder to load to Teradata. You can choose to insert, update, upsert, or delete data.
Additionally, Teradata Warehouse Builder can simultaneously read from multiple sources
and load data into one or more tables.
♦ Mixed mode processing for Teradata external loaders. You can now use data driven load
mode with Teradata external loaders. When you select data driven loading, the
PowerCenter Server flags rows for insert, delete, or update. It writes a column in the target
file or named pipe to indicate the update strategy. The control file uses these values to
determine how to load data to the target.
♦ Concurrent processing. The PowerCenter Server now reads data concurrently from
sources within a target load order group. This enables more efficient joins with minimal
usage of memory and disk cache.
♦ Real time processing enhancements. You can now use real-time processing in sessions that
also process active transformations, such as the Aggregator transformation. You can apply
the transformation logic to rows defined by transaction boundaries.
Repository Server
♦ Object export and import enhancements. You can now export and import objects using
the Repository Manager and pmrep. You can export and import multiple objects and
object types. You can export and import objects with or without their dependent objects.
You can also export objects from a query result or object history.
♦ pmrep commands. You can use pmrep to perform change management tasks, such as
maintaining deployment groups and labels, checking in, deploying, importing, exporting,
and listing objects. You can also use pmrep to run queries. The deployment and object
import commands require you to use a control file to define options and resolve conflicts.
♦ Trusted connections. You can now use a Microsoft SQL Server trusted connection to
connect to the repository.
Security
♦ LDAP user authentication. You can now use default repository user authentication or
Lightweight Directory Access Protocol (LDAP) to authenticate users. If you use LDAP, the
repository maintains an association between your repository user name and your external
login name. When you log in to the repository, the security module passes your login name
to the external directory for authentication. The repository maintains a status for each
user. You can now enable or disable users from accessing the repository by changing the
status. You do not have to delete user names from the repository.
♦ Use Repository Manager privilege. The Use Repository Manager privilege allows you to
perform tasks in the Repository Manager, such as copying objects, maintaining labels, and
changing object status. You can perform the same tasks in the Designer and Workflow Manager if
you have the Use Designer and Use Workflow Manager privileges.
♦ Audit trail. You can track changes to repository users, groups, privileges, and permissions
through the Repository Server Administration Console. The Repository Agent logs
security changes to a log file stored in the Repository Server installation directory. The
audit trail log contains information, such as changes to folder properties, adding or
removing a user or group, and adding or removing privileges.
Transformations
♦ Custom transformation. Custom transformations operate in conjunction with procedures
you create outside of the Designer interface to extend PowerCenter functionality. The
Custom transformation replaces the Advanced External Procedure transformation. You can
create Custom transformations with multiple input and output groups, and you can
compile the procedure with any C compiler.
You can create templates that customize the appearance and available properties of a
Custom transformation you develop. You can specify the icons used for the transformation,
the colors, and the properties a mapping developer can modify. When you create a Custom
transformation template, distribute the template with the DLL or shared library you
develop.
♦ Joiner transformation. You can use the Joiner transformation to join two data streams that
originate from the same source.
Version Control
The PowerCenter Client and repository introduce features that allow you to create and
manage multiple versions of objects in the repository. Version control allows you to maintain
multiple versions of an object, control development on the object, track changes, and use
deployment groups to copy specific groups of objects from one repository to another. Version
control in PowerCenter includes the following features:
♦ Object versioning. Individual objects in the repository are now versioned. This allows you
to store multiple copies of a given object during the development cycle. Each version is a
separate object with unique properties.
♦ Check out and check in versioned objects. You can check out and reserve an object you
want to edit, and check in the object when you are ready to create a new version of the
object in the repository.
♦ Compare objects. The Repository Manager and Workflow Manager allow you to compare
two repository objects of the same type to identify differences between them. You can
compare Designer objects and Workflow Manager objects in the Repository Manager. You
can compare tasks, sessions, worklets, and workflows in the Workflow Manager. The
PowerCenter Client tools allow you to compare objects across open folders and
repositories. You can also compare different versions of the same object.
♦ Delete or purge a version. You can delete an object from view and continue to store it in
the repository. You can recover or undelete deleted objects. If you want to permanently
remove an object version, you can purge it from the repository.
♦ Deployment. Unlike copying a folder, copying a deployment group allows you to copy a
select number of objects from multiple folders in the source repository to multiple folders
in the target repository. This gives you greater control over the specific objects copied from
one repository to another.
♦ Deployment groups. You can create a deployment group that contains references to
objects from multiple folders across the repository. You can create a static deployment
group that you manually add objects to, or create a dynamic deployment group that uses a
query to populate the group.
♦ Labels. A label is an object that you can apply to versioned objects in the repository. This
allows you to associate multiple objects in groups defined by the label. You can use labels
to track versioned objects during development, improve query results, and organize groups
of objects for deployment or export and import.
♦ Queries. You can create a query that specifies conditions to search for objects in the
repository. You can save queries for later use. You can make a private query, or you can
share it with all users in the repository.
♦ Track changes to an object. You can view a history that includes all versions of an object
and compare any version of the object in the history to any other version. This allows you
to see the changes made to an object over time.
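The relationship between labels, queries, and deployment groups can be pictured with a small model. The sketch below is illustrative Python, not PowerCenter code: the names `VersionedObject`, `has_label`, `static_group`, and `dynamic_group` are hypothetical, the object `m_Orders` and the label `RELEASE_1` are made up, and it assumes a query is simply a condition evaluated against each object version.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class VersionedObject:
    """Hypothetical stand-in for one version of a repository object."""
    folder: str
    name: str
    version: int
    labels: frozenset = frozenset()


# Assumption: a query is modeled as a predicate over object versions.
Query = Callable[[VersionedObject], bool]


def has_label(label: str) -> Query:
    """Build a query that matches object versions carrying the given label."""
    return lambda obj: label in obj.labels


def static_group(members: List[VersionedObject]) -> List[VersionedObject]:
    """A static group holds exactly the objects that were added by hand."""
    return list(members)


def dynamic_group(repo: List[VersionedObject], query: Query) -> List[VersionedObject]:
    """A dynamic group is populated by running a query against the repository."""
    return [obj for obj in repo if query(obj)]


repo = [
    VersionedObject("Sales", "m_Orders", 1, frozenset({"RELEASE_1"})),
    VersionedObject("Sales", "m_Orders", 2),
    VersionedObject("HR", "m_PhoneList", 1, frozenset({"RELEASE_1"})),
]

# Objects from two folders end up in one group because they share a label.
release = dynamic_group(repo, has_label("RELEASE_1"))
```

In this model, applying the label to a further object version changes the result of the dynamic group the next time the query runs, while a static group changes only when someone edits its member list.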
XML Support
PowerCenter contains XML features that allow you to validate an XML file against an XML
schema, declare multiple namespaces, use XPath to locate XML nodes, increase performance
for large XML files, format your XML file output for increased readability, and parse or
generate XML data from various sources. XML support in PowerCenter includes the
following features:
♦ XML schema. You can use an XML schema to validate an XML file and to generate source
and target definitions. XML schemas allow you to declare multiple namespaces so you can
use prefixes for elements and attributes. XML schemas also allow you to define some
complex datatypes.
♦ XPath support. The XML wizard allows you to view the structure of an XML schema. You
can use XPath to locate XML nodes.
♦ Increased performance for large XML files. When you process an XML file or stream, you
can set commits and periodically flush XML data to the target instead of writing all the
output at the end of the session. You can choose to append the data to the same target file
or create a new target file after each flush.
♦ XML target enhancements. You can format the XML target file so that you can easily view
the XML file in a text editor. You can also configure the PowerCenter Server to not output
empty elements to the XML target.
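For readers new to XPath, the following sketch shows what locating XML nodes with XPath expressions looks like in general. It uses Python's standard `xml.etree.ElementTree` module, which supports only a subset of XPath, and it does not reproduce how the XML wizard works. The sample document and its element names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are made up.
SAMPLE = """
<employees>
  <employee id="1"><name>Ann</name><phone>555-0100</phone></employee>
  <employee id="2"><name>Raj</name></employee>
</employees>
"""

root = ET.fromstring(SAMPLE)

# Path expression: every <name> element at any depth below the root.
names = [node.text for node in root.findall(".//name")]

# Child predicate: only <employee> elements that have a <phone> child.
with_phone = [e.get("id") for e in root.findall("./employee[phone]")]

# Attribute predicate: the <name> of the employee whose id is "2".
second = root.find("./employee[@id='2']/name").text
```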
Usability
♦ Copying objects. You can now copy objects from all the PowerCenter Client tools using
the copy wizard to resolve conflicts. You can copy objects within folders, to other folders,
and to different repositories. Within the Designer, you can also copy segments of
mappings to a workspace in a new folder or repository.
♦ Comparing objects. You can compare workflows and tasks from the Workflow Manager.
You can also compare all objects from within the Repository Manager.
♦ Change propagation. When you edit a port in a mapping, you can choose to propagate
changed attributes throughout the mapping. The Designer propagates ports, expressions,
and conditions based on the direction that you propagate and the attributes you choose to
propagate.
♦ Enhanced partitioning interface. The Session Wizard is enhanced to provide a graphical
depiction of a mapping when you configure partitioning.
♦ Revert to saved. You can now revert to the last saved version of an object in the Workflow
Manager. When you do this, the Workflow Manager accesses the repository to retrieve the
last-saved version of the object.
♦ Enhanced validation messages. The PowerCenter Client writes messages in the Output
window that describe why it invalidates a mapping or workflow when you modify a
dependent object.
♦ Validate multiple objects. You can validate multiple objects in the repository without
fetching them into the workspace. You can save and optionally check in objects that
change from invalid to valid status as a result of the validation. You can validate sessions,
mappings, mapplets, workflows, and worklets.
♦ View dependencies. Before you edit or delete versioned objects, such as sources, targets,
mappings, or workflows, you can view dependencies to see the impact on other objects.
You can view parent and child dependencies and global shortcuts across repositories.
Viewing dependencies helps you modify objects and composite objects without breaking
dependencies.
♦ Refresh session mappings. In the Workflow Manager, you can refresh a session mapping.
About Informatica Documentation
The complete set of documentation for PowerCenter includes the following books:
♦ Data Profiling Guide. Provides information about how to profile PowerCenter sources to
evaluate source data and detect patterns and exceptions.
♦ Designer Guide. Provides information needed to use the Designer. Includes information to
help you create mappings, mapplets, and transformations. Also includes a description of
the transformation datatypes used to process and transform source data.
♦ Getting Started. Provides basic tutorials for getting started.
♦ Installation and Configuration Guide. Provides information needed to install and
configure the PowerCenter tools, including details on environment variables and database
connections.
♦ PowerCenter Connect® for JMS® User and Administrator Guide. Provides information
to install PowerCenter Connect for JMS, build mappings, extract data from JMS messages,
and load data into JMS messages.
♦ Repository Guide. Provides information needed to administer the repository using the
Repository Manager or the pmrep command line program. Includes details on
functionality available in the Repository Manager and Administration Console, such as
creating and maintaining repositories, folders, users, groups, and permissions and
privileges.
♦ Transformation Language Reference. Provides syntax descriptions and examples for each
transformation function provided with PowerCenter.
♦ Transformation Guide. Provides information on how to create and configure each type of
transformation in the Designer.
♦ Troubleshooting Guide. Lists error messages that you might encounter while using
PowerCenter. Each error message includes one or more possible causes and actions that
you can take to correct the condition.
♦ Web Services Provider Guide. Provides information you need to install and configure the Web
Services Hub. This guide also provides information about how to use the web services that the
Web Services Hub hosts. The Web Services Hub hosts Real-time Web Services, Batch Web
Services, and Metadata Web Services.
♦ Workflow Administration Guide. Provides information to help you create and run
workflows in the Workflow Manager, as well as monitor workflows in the Workflow
Monitor. Also contains information on administering the PowerCenter Server and
performance tuning.
♦ XML User Guide. Provides information you need to create XML definitions from XML,
XSD, or DTD files, and relational or other XML definitions. Includes information on
running sessions with XML data. Also includes details on using the midstream XML
transformations to parse or generate XML data within a pipeline.
About this Book
Getting Started is written for the IS developers and software engineers who are responsible for
implementing a data warehouse. It provides a tutorial to help first-time users learn how to use
PowerCenter. Getting Started assumes you have knowledge of your operating systems,
relational database concepts, and the database engines, flat files, or mainframe systems in your
environment. The guide also assumes you are familiar with the interface requirements for
your supporting applications.
The material in this book is available for online use.
Document Conventions
This guide uses the following formatting conventions:
italicized monospaced text   This is the variable name for a value you enter as part of an
                             operating system command. This is generic text that should be
                             replaced with user-supplied values.
Warning:                     The following paragraph notes situations where you can overwrite
                             or corrupt data, unless you follow the specified procedure.
bold monospaced text         This is an operating system command you enter from a prompt to
                             run a task.
Other Informatica Resources
In addition to the product manuals, Informatica provides these other resources:
♦ Informatica Customer Portal
♦ Informatica Webzine
♦ Informatica web site
♦ Informatica Developer Network
♦ Informatica Technical Support
The site contains information on how to create, market, and support customer-oriented
add-on solutions based on Informatica’s interoperability interfaces.
Belgium
Phone: +32 15 281 702
Hours: 9 a.m. - 5:30 p.m. (local time)
France
Phone: +33 1 41 38 92 26
Hours: 9 a.m. - 5:30 p.m. (local time)
Germany
Phone: +49 1805 702 702
Hours: 9 a.m. - 5:30 p.m. (local time)
Netherlands
Phone: +31 306 082 089
Hours: 9 a.m. - 5:30 p.m. (local time)
Singapore
Phone: +65 322 8589
Hours: 9 a.m. - 5 p.m. (local time)
Switzerland
Phone: +41 800 81 80 70
Hours: 8 a.m. - 5 p.m. (local time)
Chapter 1
Before You Begin
Overview
In this Getting Started guide, you will find multiple lessons that introduce you to
PowerCenter and show you how to use it to load transformed data into file and relational
targets. The lessons in this book are designed for beginners to PowerCenter.
This tutorial walks you through the process of creating a data warehouse. The tutorial teaches
you how to:
♦ Create users and groups.
♦ Add source definitions to the repository.
♦ Create targets and add their definitions to the repository.
♦ Map data between sources and targets.
♦ Instruct the PowerCenter Server to write data to targets.
♦ Monitor the PowerCenter Server as it writes data to targets.
In general, you can set your own pace for completing the tutorial. However, Informatica
recommends completing an entire lesson in one sitting, since each lesson builds on a sequence
of related tasks.
For additional information, case studies, and updates on using Informatica products, see the
Informatica online journal, the Informatica Webzine. You can access the webzine at
http://my.Informatica.com.
Getting Started
Before you begin the lessons, read “Product Overview” in the Installation and
Configuration Guide. The product overview explains the different components that work
together to extract, transform, and load data.
Also, your administrator must install and configure the PowerCenter Client applications and
the PowerCenter Server. Verify that your administrator has completed the following steps:
♦ Install the PowerCenter Client applications. You will use the PowerCenter Client applications
to manage users, define sources and targets, build mappings and mapplets with the
transformation logic, and create sessions and workflows to run the mapping logic.
♦ Install the Repository Server. The Informatica Repository Server manages connections to
the repository from client applications. It inserts, updates, and fetches objects from the
repository database tables.
♦ Create a repository. The Informatica repository is at the center of the Informatica suite.
When you create objects with the Informatica applications, you create a set of metadata
tables within the repository database that the Informatica applications access. The
PowerCenter Client and Server access the repository to save and retrieve metadata.
♦ Install the PowerCenter Server. The PowerCenter Server extracts the source data,
performs the data transformation, and loads the transformed data into the targets.
Connecting to Databases
To use the lessons in this book, you need to connect to your source, target, and repository
databases. You can use the tables in this section to record the connectivity information you
need to connect to the databases. Contact your administrator if you need any connection
information listed below.
Use Table 1-1 to enter the information you need to connect to the repository as the
Administrator:
Repository
Repository Name
Note: Use the Administrator profile for the lessons “Creating Repository Users and Groups”
on page 8 and “Creating a Folder” on page 14 only. For all other lessons, you will use the user
profile you create to log in to the repository.
Use Table 1-2 to enter the information you need to connect to the repository in each
PowerCenter Client tool. Use the user profile you create in “Creating a User” on page 12:
Repository
Repository Name
Repository Username
Repository Password
You need to create an ODBC connection for your source and target databases, if not already
created. For details, see “Connecting to Databases from Windows” in the Installation and
Configuration Guide.
Database Username
Database Password
For more information about ODBC drivers, see “Using ODBC” in the Installation and
Configuration Guide.
Use Table 1-4 to enter the information you need to create database connections in the
Workflow Manager:
Database Type
Username
Password
Connect String
Code Page
Database Name
Server Name
Domain Name
Note: You may not need all properties in this table.
Table 1-5 lists the native connect string syntax to use for different database platforms:
Chapter 2
Tutorial Lesson 1
Creating Repository Users and Groups
You can create a repository user profile for everyone working in the repository, each with a
separate user name and password. You can also create user groups and assign each user to one
or more groups. Then, grant repository privileges to each group, so users in the group can
perform tasks within the repository (such as use the Designer or create workflows).
The repository user profile is not the same as the database user profile. While a particular user
might not have access to a database as a database user, that same person can have privileges to
a repository in the database as a repository user.
Informatica tools include two types of security:
♦ Privileges. Repository-wide security that controls which task or set of tasks a single user or
group of users can access.
♦ Permissions. Security assigned to individual folders within the repository.
PowerCenter uses the following privileges:
♦ Use Designer
♦ Browse Repository
♦ Use Repository Manager
♦ Use Workflow Manager
♦ Workflow Operator
♦ Administer Repository
♦ Administer Server
♦ Super User
You can perform various tasks for each privilege. For a list of the tasks you can perform with
each privilege, see “Repository Security” in the Repository Guide.
Privileges depend on your group membership. Every repository user belongs to at least one
group. For example, the user who administers the repository belongs to the Administrators
group. By default, you receive the privileges assigned to your group. While it is most common
to assign privileges by group, the repository administrator, who has either the Super User or
Administer Repository privilege, can also grant privileges to individual users.
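The way group membership and individual grants combine can be sketched as a set union. This is a minimal illustration, not a description of how the repository stores security data: the user `jsmith`, the privileges assigned to each group, and the function `effective_privileges` are all assumptions; only the privilege names come from the list above.

```python
from typing import Dict, Set

# Hypothetical assignments; privilege names are taken from the list above.
GROUP_PRIVILEGES: Dict[str, Set[str]] = {
    "Administrators": {"Super User"},
    "TUTORIAL": {"Use Designer", "Browse Repository", "Use Workflow Manager"},
}
USER_GROUPS: Dict[str, Set[str]] = {"jsmith": {"TUTORIAL"}}
USER_PRIVILEGES: Dict[str, Set[str]] = {"jsmith": {"Workflow Operator"}}


def effective_privileges(user: str) -> Set[str]:
    """Combine privileges inherited from groups with those granted directly."""
    granted = set(USER_PRIVILEGES.get(user, set()))
    for group in USER_GROUPS.get(user, set()):
        granted |= GROUP_PRIVILEGES.get(group, set())
    return granted
```

In this sketch, `effective_privileges("jsmith")` contains the three privileges inherited from the hypothetical TUTORIAL group plus the individually granted Workflow Operator privilege.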
As an administrator, you can perform the following tasks:
♦ Create groups.
♦ Assign privileges to groups.
♦ Create users and assign them to groups.
In the following steps, you will perform the following tasks:
1. Connect to the repository as an Administrator. If necessary, ask your administrator for
the user name and password. Otherwise, ask your administrator to complete the lessons
in this chapter for you.
Creating a Group
In the following steps, you will create a new group.
1. Select the repository in the Navigator, and choose Security-Manage Users and Privileges.
The Manage Users and Privileges dialog box contains tabs for users, groups, and
privileges that list all existing users, groups, and privileges in the repository.
2. Select the Groups tab.
The Groups tab includes the default groups, Administrators and Public. You cannot edit
or remove these groups.
3. Click Add.
The New Group dialog box appears.
4. Type TUTORIAL for the name of the new group and Tutorial as the description. Click
OK.
The new TUTORIAL group appears on the Groups tab.
1. In the Manage Users and Privileges dialog box, select the Privileges tab. The privileges
currently assigned to users and groups are displayed.
1. In the Manage Users and Privileges dialog box, select the Users tab.
The dialog box lists all the users in the repository.
2. Click Add.
3. In the New User dialog box, enter your name as the user name.
4. Enter contact information, such as a phone number, if needed.
5. In both the Password and Confirm Password fields, enter your password.
6. Click the Group Memberships tab.
8. Click OK.
You now have all the privileges associated with the TUTORIAL group.
Folder Permissions
Permissions allow repository users to perform tasks within a folder. With folder permissions,
you can control user access to the folder, and the tasks you permit them to perform.
Folder permissions work closely with repository privileges. Privileges grant access to specific
tasks while permissions grant access to specific folders with read, write, and execute access.
However, any user with the Super User privilege can perform all tasks across all folders in the
repository. Folders have the following types of permissions:
♦ Read permission. Allows you to view the folder as well as objects in the folder.
♦ Write permission. Allows you to create or edit objects in the folder.
♦ Execute permission. Allows you to run or schedule workflows in the folder.
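The three folder permissions and the Super User override described above can be modeled with bit flags. This is a simplified sketch under stated assumptions: the `Permission` enum and the `can` function are hypothetical, and the sketch checks only folder permissions, not the repository privilege that a task may also require.

```python
from enum import Flag, auto


class Permission(Flag):
    """Folder permissions as combinable flags (illustrative model only)."""
    NONE = 0
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()


def can(user_privileges, folder_permissions, needed):
    """Return True if the user may perform an action that needs `needed`."""
    # A user with the Super User privilege can perform all tasks in all folders.
    if "Super User" in user_privileges:
        return True
    return (folder_permissions & needed) == needed


# A user with read and write permission can edit objects but not run workflows.
developer = Permission.READ | Permission.WRITE
may_edit = can(set(), developer, Permission.WRITE)
may_run = can(set(), developer, Permission.EXECUTE)
```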
Creating Source Tables
With most data warehouses, you already have existing source tables or flat files. Before you
continue with the other lessons in this book, you need to create the source tables in the
database. In this lesson, you run an SQL script in the Warehouse Designer to create sample
source tables. The SQL script creates sources with 7-bit ASCII table names and data.
When you run the SQL script, you create the following source tables:
♦ CUSTOMERS
♦ DEPARTMENT
♦ DISTRIBUTORS
♦ EMPLOYEES
♦ ITEMS
♦ ITEMS_IN_PROMOTIONS
♦ JOBS
♦ MANUFACTURERS
♦ ORDERS
♦ ORDER_ITEMS
♦ PROMOTIONS
♦ STORES
Generally, you use the Warehouse Designer to create target tables in the target database. The
Warehouse Designer generates SQL based on the definitions in the workspace. However, we
will use this feature to generate the source tutorial tables from the tutorial SQL scripts that
ship with the product.
1. Launch the Designer, double-click the icon for your repository, and log in to the
repository.
Use your user profile to open the connection.
2. Double-click the Tutorial_yourname folder.
3. Choose Tools-Warehouse Designer to switch to the Warehouse Designer.
The Database Object Generation dialog box gives you several options for creating tables.
5. Click the Connect button to connect to the source database.
6. Select the ODBC data source you created for connecting to the source database. Use the
information you entered in Table 1-3 on page 5.
7. Enter the database user name and password and click the Connect button.
You now have an open connection to the source database. You know that you are
connected when the Disconnect button displays and the ODBC name of the source
database appears in the dialog box.
8. Make sure the Output window is open at the bottom of the Designer.
If it is not open, choose View-Output.
9. Click the browse button to find the SQL file. The SQL file is installed in the Tutorial
folder in the PowerCenter Client installation directory.
10. Select the SQL file appropriate to the source database platform you are using. Click
Open.
Platform    File
Informix    SMPL_INF.SQL
Oracle      SMPL_ORA.SQL
DB2         SMPL_DB2.SQL
Teradata    SMPL_TERA.SQL
Alternatively, you can enter the file name and path of the SQL file.
Tutorial Lesson 2
Creating Source Definitions
Now that you have added the source tables containing sample data, you are ready to create the
source definitions in the repository. The repository contains a description of source tables, not
the actual data contained in them. After you add these source definitions to the repository,
you can use them in a mapping.
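The distinction between a table's description and its data can be illustrated outside PowerCenter. The sketch below uses Python's built-in `sqlite3` module rather than any of the tutorial databases, and the function `import_source_definition` is hypothetical. Apart from EMPLOYEE_ID, the column names are assumptions made for the example.

```python
import sqlite3

# In-memory database standing in for the source database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEES ("
    " EMPLOYEE_ID INTEGER PRIMARY KEY NOT NULL,"
    " LAST_NAME VARCHAR(30),"
    " PHONE VARCHAR(20))"
)
conn.execute("INSERT INTO EMPLOYEES VALUES (1, 'Smith', '555-0100')")


def import_source_definition(connection, table):
    """Return column metadata only: name, datatype, nullability, key flag."""
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    return [
        {"name": r[1], "datatype": r[2], "not_null": bool(r[3]), "key": bool(r[5])}
        for r in rows
    ]


# The definition describes the columns; the row inserted above is not part of it.
definition = import_source_definition(conn, "EMPLOYEES")
```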
6. Click Connect.
7. In the Select tables list, expand the database owner and the TABLES heading.
If you click the All button, you can see all tables in the source database.
A new database definition (DBD) node appears under the Sources node in the tutorial
folder. This new entry has the same name as the ODBC data source you used to access the
sources you just imported. If you double-click the DBD node, the list of all the imported
sources appears.
1. Double-click the title bar of the source definition for the EMPLOYEES table to open the
EMPLOYEES source definition.
The Edit Tables dialog box opens and displays all the properties of this source definition.
The Table tab shows the name of the table, business name, owner name, and the database
type. You can add a comment in the Description section.
2. Click the Columns tab.
The Columns tab displays the column descriptions for the source table.
Note: Use your name for the CreatorName value and today’s date for the
SourceCreationDate value.
6. Click Apply.
7. Click OK to close the dialog box.
8. Choose Repository-Save to save your changes to the repository.
Note that the EMPLOYEE_ID column is a primary key. The primary key cannot accept
null values. The Designer automatically selects Not Null and disables the Not Null
option. You now have a column ready to receive data from the EMPLOYEE_ID column
in the EMPLOYEES source table.
Note: If you want to add a business name for any column, scroll to the right and enter it.
If you installed the client software in a different location, enter the appropriate drive
letter and directory.
4. If you are connected to the source database from the previous lesson, click Disconnect,
and then click Connect.
5. Select the ODBC data source to connect to the target database.
6. Enter the necessary user name and password, and then click Connect.
7. Select the Create Table, Drop Table, and Primary Key options.
8. Click the Generate and Execute button.
The Designer runs the DDL code needed to create T_EMPLOYEES. If you want to
review the actual code, click Edit SQL file to open the MKT_EMP.SQL file.
9. Click Close to exit.
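To make the Create Table, Drop Table, and Primary Key options more concrete, here is a rough sketch of generating and executing DDL from a target definition. It is not the SQL the Designer writes to MKT_EMP.SQL, which this guide does not show. The function `generate_ddl`, the SQLite dialect, and every column other than EMPLOYEE_ID are assumptions.

```python
import sqlite3

# Hypothetical target definition; only EMPLOYEE_ID is named in this guide.
T_EMPLOYEES = {
    "name": "T_EMPLOYEES",
    "columns": [
        # (column name, datatype, is primary key)
        ("EMPLOYEE_ID", "INTEGER", True),
        ("LAST_NAME", "VARCHAR(30)", False),
        ("PHONE", "VARCHAR(20)", False),
    ],
}


def generate_ddl(target, drop_table=True, create_table=True, primary_key=True):
    """Build DDL statements for the selected generation options."""
    statements = []
    if drop_table:
        statements.append(f"DROP TABLE IF EXISTS {target['name']}")
    if create_table:
        cols = []
        for name, datatype, is_key in target["columns"]:
            # Key columns cannot accept null values.
            null_clause = " NOT NULL" if is_key else ""
            cols.append(f"{name} {datatype}{null_clause}")
        if primary_key:
            keys = [name for name, _, is_key in target["columns"] if is_key]
            if keys:
                cols.append(f"PRIMARY KEY ({', '.join(keys)})")
        statements.append(f"CREATE TABLE {target['name']} ({', '.join(cols)})")
    return statements


# "Generate and Execute": build the statements, then run them in order.
conn = sqlite3.connect(":memory:")
ddl = generate_ddl(T_EMPLOYEES)
for statement in ddl:
    conn.execute(statement)
```

Dropping the table before creating it makes the step repeatable, which is why the lesson selects both options together.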
Tutorial Lesson 3
Creating a Pass-Through Mapping
In the previous lesson, you added source and target definitions to your repository. You
generated and ran the SQL code to create target tables.
The next step is to create a mapping to depict the flow of data between sources and targets.
For this step you’ll create a Pass-Through mapping. A Pass-Through mapping inserts all the
source rows into the target.
To create and edit mappings, you use the Mapping Designer tool in the Designer. The
mapping interface in the Designer is component-based. You add transformations to a mapping
that depict how the PowerCenter Server extracts and transforms data before it loads a target.
Figure 4-1 illustrates a mapping between a source and a target with a Source Qualifier
transformation:
Figure 4-1 callouts: Output Port, Input Port, Input/Output Port
The Source Qualifier represents the rows that the PowerCenter Server reads from the source
when it runs a session in a workflow.
If you examine the mapping, you see that data flows from the source definition to the Source
Qualifier transformation to the target definition through a series of input and output ports.
The source provides information, so it contains only output ports, one for each column. Each
output port is connected to a corresponding input port in the Source Qualifier
transformation. The Source Qualifier transformation contains both input and output ports.
The target contains input ports only.
When you design mappings containing different types of transformations, you can configure
transformation ports as inputs, outputs, or both. You can rename ports and change their
datatypes.
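The flow of a pass-through mapping can be sketched in a few lines. This is a conceptual model, not how the PowerCenter Server is implemented: the functions `source_qualifier` and `load_target`, the sample rows, and all column names except EMPLOYEE_ID are assumptions.

```python
def source_qualifier(rows):
    """Pass each source row through unchanged, one output value per input port."""
    for row in rows:
        yield dict(row)


def load_target(rows, target_ports):
    """Keep only the values that are linked to the target's input ports."""
    return [{port: row[port] for port in target_ports} for row in rows]


# Hypothetical rows read from the EMPLOYEES source.
source_rows = [
    {"EMPLOYEE_ID": 1, "LAST_NAME": "Smith", "OFFICE": "B12"},
    {"EMPLOYEE_ID": 2, "LAST_NAME": "Jones", "OFFICE": "C03"},
]

# Every source row reaches the target; unlinked columns are simply not loaded.
target_rows = load_target(source_qualifier(source_rows), ["EMPLOYEE_ID", "LAST_NAME"])
```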
To create a mapping:
3. Click and drag the EMPLOYEES source definition into the Mapping Designer
workspace.
The Designer creates a new mapping and prompts you to provide a name.
4. In the Mapping Name dialog box, enter m_PhoneList as the name of the new mapping
and click OK.
The naming convention for mappings is m_MappingName.
5. Expand the Targets node in the Navigator to open the list of all target definitions.
6. Click and drag the T_EMPLOYEES target definition into the workspace.
The target definition appears. The final step is to connect the Source Qualifier
transformation to the target definition.
Connecting Transformations
The port names in the target definition are the same as some of the port names in the Source
Qualifier transformation. When you need to link ports that have the same name in two
transformations, the Designer can automatically link them based on name.
In the following steps, you will use the autolink option to automatically connect the Source
Qualifier transformation to the target definition.
1. Choose Layout-Autolink.
The Auto Link dialog box appears.
Note: When you need to link ports with different names, you can click and drag from the
port of one transformation to a port of another transformation or target. If you
accidentally connect the wrong columns, select the connector and press the Delete key.
4. Choose Layout-Arrange.
5. In the Select Targets dialog box, select the T_EMPLOYEES target and click OK.
The Designer rearranges the source, Source Qualifier transformation, and target from left
to right, making it easier to see how one column maps to another.
6. Drag the lower edge of the source and Source Qualifier transformation windows until all
columns display.
7. Choose Repository-Save to save the new mapping to the repository.
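The name-based linking that the autolink option performs can be approximated as follows. This sketch assumes an exact, case-sensitive name match, which may not reflect every Designer option, and the function `autolink_by_name` and the port lists are hypothetical.

```python
def autolink_by_name(from_ports, to_ports):
    """Return (from_port, to_port) links for ports whose names match exactly."""
    available = set(from_ports)
    return [(name, name) for name in to_ports if name in available]


# Hypothetical port lists for the Source Qualifier and the target definition.
sq_ports = ["EMPLOYEE_ID", "LAST_NAME", "FIRST_NAME", "OFFICE_PHONE"]
target_ports = ["EMPLOYEE_ID", "LAST_NAME", "PHONE"]

links = autolink_by_name(sq_ports, target_ports)

# Ports with no same-named partner stay unlinked and must be linked by hand.
linked_targets = {to_port for _, to_port in links}
unlinked = [name for name in target_ports if name not in linked_targets]
```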
Workflow tasks are instructions the PowerCenter Server executes when running a workflow.
These tasks perform functions that you can use with the workflow tasks that extract,
transform, and load data. Workflow tasks include Session, Command, Decision, Timer,
Worklet, and Email.
You create and maintain tasks and workflows in the Workflow Manager.
In this lesson, you will create a session and workflow that runs the session. Before you create a
session in the Workflow Manager, you need to configure database connections in the
Workflow Manager.
Note: Make sure your administrator has registered the PowerCenter Server in the Workflow
Manager before you complete the following steps. For details, see “Registering the
PowerCenter Server” in the Installation and Configuration Guide.
7. In the Name field, enter TUTORIAL_SOURCE as the name of the database connection.
The PowerCenter Server uses this name as a reference to this database connection.
8. Enter the database name. Enter the user name and password to connect to the database.
9. Select a code page for the database connection. The source code page must be a subset of
the PowerCenter Server code page and the target code page.
Note: Use the database connection information you entered for the source database in
Table 1-4 on page 5.
10. Enter any additional information necessary to connect to this database, such as native
connect string, and click OK.
TUTORIAL_SOURCE now appears in the list of registered database connections in the
Relational Connection Browser dialog box.
11. Repeat steps 5–10 to create another database connection called TUTORIAL_TARGET
for the target database.
The target code page must be a superset of the PowerCenter Server code page and the
source code page.
Note: Use the database connection information you entered for the target database in
Table 1-4 on page 5.
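The subset and superset requirement for code pages can be illustrated with Python codecs. This is only an analogy: Python codec names are not PowerCenter code page names, the check below works for single-byte source code pages only, and PowerCenter applies its own compatibility rules. The functions `is_subset` and `valid_code_pages` are hypothetical.

```python
def is_subset(code_page: str, other: str) -> bool:
    """True if every character of a single-byte code page can be encoded in `other`."""
    for value in range(256):
        try:
            char = bytes([value]).decode(code_page)
        except UnicodeDecodeError:
            continue  # this byte is not assigned in the code page
        try:
            char.encode(other)
        except UnicodeEncodeError:
            return False  # found a character that `other` cannot represent
    return True


def valid_code_pages(source: str, server: str, target: str) -> bool:
    """Source must fit in server and target; server must fit in target."""
    return (
        is_subset(source, server)
        and is_subset(source, target)
        and is_subset(server, target)
    )


# 7-bit ASCII data fits in Latin-1, and Latin-1 data fits in UTF-8.
ok = valid_code_pages("ascii", "latin-1", "utf-8")
# A Latin-1 source can contain characters an ASCII server cannot represent.
not_ok = valid_code_pages("latin-1", "ascii", "utf-8")
```

The point of the rule is that no character can be lost on its way from the source through the server to the target.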
1. In the Workflow Manager Navigator, double-click the tutorial folder to open it.
2. Choose Tools-Task Developer to open the Task Developer.
3. Choose Tasks-Create.
11. Under the Connections setting, click the Open button in the Value column for the
SQ_EMPLOYEES - DB Connection.
The Relational Object Browser appears.
12. Select TUTORIAL_SOURCE and click OK.
13. Select Targets from Task Types.
14. Under the Connections setting, click the Open button in the Value column for the
T_EMPLOYEES - DB Connection.
The Relational Object Browser appears.
15. Select TUTORIAL_TARGET and click OK.
16. Click the Properties tab.
17. Select a session sort order associated with the PowerCenter Server code page.
These are the only session properties you need to define for this session. For details on
the tabs and settings in the session properties, see “Session Properties Reference” in the
Workflow Administration Guide.
18. Click OK to close the session properties with the changes you made.
19. Choose Repository-Save to save the new session to the repository.
You have now created a reusable session. The next step is to create a workflow that runs the
session.
Creating a Workflow
You create workflows in the Workflow Designer. When you create a workflow, you can
include reusable tasks and sessions that you create in the Task Developer. You can also include
non-reusable tasks that you create in the Workflow Designer.
In the following steps, you create a workflow that runs the session s_PhoneList.
To create a workflow:
Note: By default, the workflow is scheduled to run on demand. That is, the PowerCenter
Server only runs the workflow when you manually start the workflow. You can schedule
workflows to run automatically. For example, you can schedule a workflow to run once a
day or run on the last day of the month. Click the edit scheduler button to configure
schedule options. For more information on scheduling workflows, see “Working with
Workflows” in the Workflow Administration Guide.
10. Accept the default schedule for this workflow.
11. Click OK to close the Create Workflow dialog box.
The Workflow Manager creates a new workflow in the workspace, including the reusable
session you added. All workflows begin with the Start task, but you need to instruct the
PowerCenter Server which task to run next. To do this, you link tasks in the Workflow
Manager.
Note: You can choose Workflows-Edit to edit the workflow properties at any time.
13. Click and drag from the Start task to the Session task.
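Why the link from the Start task matters can be seen in a small model of a workflow as a set of tasks and links. The sketch is conceptual and ignores link conditions, concurrency, and error handling. The function `run_workflow` and the unlinked session `s_Unlinked` are hypothetical; `s_PhoneList` is the session from this lesson.

```python
def run_workflow(links, tasks, start="Start"):
    """Run tasks by following links from the Start task; return the run order."""
    order = []
    pending = [start]
    seen = set()
    while pending:
        name = pending.pop(0)
        if name in seen:
            continue
        seen.add(name)
        tasks[name]()                      # execute the task
        order.append(name)
        pending.extend(links.get(name, []))  # queue the tasks it links to
    return order


log = []
tasks = {
    "Start": lambda: None,
    "s_PhoneList": lambda: log.append("s_PhoneList ran"),
    "s_Unlinked": lambda: log.append("s_Unlinked ran"),
}

# Only s_PhoneList is linked from Start, so s_Unlinked never runs.
links = {"Start": ["s_PhoneList"]}
order = run_workflow(links, tasks)
```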
To run a workflow:
3. Click the Gantt Chart tab at the bottom of the Time window to verify the Workflow
Monitor is in Gantt Chart view.
4. In the Navigator, expand the node for your workflow.
All tasks in the workflow appear in the Navigator. For more information on Gantt Chart
view, see “Monitoring Workflows” in the Workflow Administration Guide.
The session returns the following results:
Tutorial Lesson 4
Overview
In this lesson, you create a mapping that contains a source, multiple transformations, and a
target.
A transformation is a part of a mapping that generates or modifies data. Every mapping
includes a Source Qualifier transformation, representing all data read from a source and
temporarily stored by the PowerCenter Server. In addition, you can add transformations that
calculate a sum, look up a value, or generate a unique ID before the source data reach the
target.
Figure 5-1 shows the Transformation toolbar:
Table 5-1 lists the transformations displayed in the Transformations toolbar in the Designer:
Transformation                Description
Application Source Qualifier  Represents the rows that the PowerCenter Server reads from an
                              application, such as an ERP source, when it runs a workflow.
External Procedure            Calls a procedure in a shared library or in the COM layer of Windows.
Input                         Defines mapplet input rows. Available only in the Mapplet Designer.
Normalizer                    Source qualifier for COBOL sources. Can also be used in the pipeline
                              to normalize data from relational or flat file sources.
Output                        Defines mapplet output rows. Available only in the Mapplet Designer.
Source Qualifier              Represents the rows that the PowerCenter Server reads from a
                              relational or flat file source when it runs a workflow.
XML Generator                 Reads data from one or more input ports and outputs XML through a
                              single output port.
XML Parser                    Reads XML from one input port and outputs data to one or more
                              output ports.
XML Source Qualifier          Represents the rows that the PowerCenter Server reads from an XML
                              source when it runs a workflow.
For more information on using transformations, see “Transformations” in the Designer Guide.
For details on each transformation, see the corresponding chapter in the Transformation
Guide.
In this lesson, you will perform the following steps:
1. Create a new target definition to use in the mapping, and create a target table based on
the new target definition.
2. Create a mapping using the new target definition. You will add the following
transformations to the mapping:
♦ Lookup transformation. Finds the name of a manufacturer.
♦ Aggregator transformation. Calculates the maximum, minimum, and average price of
items from each manufacturer.
♦ Expression transformation. Calculates the average profit of items, based on the
average price.
3. Learn some tips for using the Designer.
4. Create a session and workflow to run the mapping, and monitor the workflow in the
Workflow Monitor.
Creating a New Target Definition and Target
Before creating the mapping in this lesson, you need to design a new target definition that
holds summary data about products from various manufacturers. This table includes the
maximum and minimum price for products from a given manufacturer, an average price, and
an average profit.
After you create the target definition, you create the table in the target database.
1. Open the Designer, connect to your repository, and open the tutorial folder.
2. Choose Tools-Warehouse Designer.
3. Click and drag the MANUFACTURERS source definition from the Navigator to the
Warehouse Designer workspace.
The Designer creates a new target definition, MANUFACTURERS, with the same
column definitions as the MANUFACTURERS source definition and the same database
type.
Note: If you need to change the database type for the target definition, you can select the
correct database type when you edit the target definition.
Next, you will add new target column definitions.
4. Double-click the MANUFACTURERS target definition to open it.
The Edit Tables dialog box appears.
5. Click Rename and name the target definition T_ITEM_SUMMARY.
6. Click the Columns tab.
The target column definitions are the same as the MANUFACTURERS source
definition.
8. Add the following columns with Money datatype, and select Not Null:
♦ MAX_PRICE
♦ MIN_PRICE
♦ AVG_PRICE
♦ AVG_PROFIT
Use the default precision and scale with the Money datatype. If the Money datatype does
not exist in your database, use Number (p,s) or Decimal. Change precision to 15 and
scale to 2.
10. Select the Indexes tab to add an index to the target table.
If your target database is Oracle, skip to the final step. You cannot add an index to a
column that already has the PRIMARY KEY constraint added to it.
When you select the Group By option for MANUFACTURER_ID, the PowerCenter
Server groups all incoming rows by manufacturer ID when it runs the session.
9. Click the Add button three times to add three new ports.
10. Configure the following ports:
1. Click the open button in the Expression column of the OUT_MAX_PRICE port to open
the Expression Editor.
8. Click Validate.
If you followed the steps in this portion of the lesson, the Designer displays a message
that the expression parsed successfully. The syntax you entered has no errors.
9. Click OK to close the message box from the parser, and then click OK again to close the
Expression Editor.
1. Enter and validate the following expressions for the other two output ports:
Port Expression
OUT_MIN_PRICE MIN(PRICE)
OUT_AVG_PRICE AVG(PRICE)
Both MIN and AVG appear in the list of Aggregate functions, along with MAX.
2. Click OK to close the Edit Transformations dialog box.
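The Aggregator behavior described in this part of the lesson, grouping rows by MANUFACTURER_ID and applying aggregate functions to PRICE, can be pictured with a short Python sketch. This is only an illustration of the idea, not how the PowerCenter Server is implemented. The sample rows are invented, and the sketch assumes that OUT_MAX_PRICE uses MAX(PRICE) in the same way the other two ports use MIN(PRICE) and AVG(PRICE).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows: (MANUFACTURER_ID, PRICE). Not taken from the tutorial data.
rows = [
    (100, 10.0),
    (100, 30.0),
    (101, 5.0),
    (101, 15.0),
    (101, 25.0),
]


def aggregate_prices(rows):
    """Group rows by manufacturer ID, then compute MAX, MIN, and AVG of PRICE."""
    prices_by_manufacturer = defaultdict(list)
    for manufacturer_id, price in rows:
        prices_by_manufacturer[manufacturer_id].append(price)

    summary = {}
    for manufacturer_id, prices in prices_by_manufacturer.items():
        summary[manufacturer_id] = {
            "OUT_MAX_PRICE": max(prices),   # assumed to correspond to MAX(PRICE)
            "OUT_MIN_PRICE": min(prices),   # MIN(PRICE)
            "OUT_AVG_PRICE": mean(prices),  # AVG(PRICE)
        }
    return summary


summary = aggregate_prices(rows)
```

Each key of `summary` stands for one group, so the result has one output row per manufacturer rather than one per input row.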
1. Choose Transformation-Create.
2. Choose Expression and name the transformation EXP_AvgProfit. Click Create, and then
click Done.
The naming convention for Expression transformations is EXP_TransformationName.
The Mapping Designer adds an Expression transformation to the mapping.
3. Select the MANUFACTURERS table from the list and click OK.
4. Click Done to close the Create Transformation dialog box.
The Designer now adds the transformation.
You can use source and target definitions in the repository to identify a lookup source for
the Lookup transformation. Alternatively, using the Import button, you can import a
lookup source.
5. Open the Lookup transformation.
6. Add a new input port, IN_MANUFACTURER_ID, using the same datatype as
MANUFACTURER_ID.
In a later step, you will connect the MANUFACTURER_ID port from the Aggregator
transformation to this input port. IN_MANUFACTURER_ID will receive
MANUFACTURER_ID values from the Aggregator transformation. When the Lookup
transformation receives a new value through this input port, it looks up the matching
value from MANUFACTURERS.
Note: By default, the Lookup transformation queries and stores the contents of the lookup
table before the rest of the transformation runs, so it performs the join through a local
copy of the table that it has cached. For more information on caching the lookup table,
see “Lookup Caches” in the Transformation Guide.
MANUFACTURER_ID = IN_MANUFACTURER_ID
Note: If the datatypes (including precision and scale) of these two columns do not match,
the Designer displays a message and marks the mapping invalid.
9. View the Properties tab.
Do not change any settings in this section of the dialog box. For details on the Lookup
properties, see “Lookup Transformation” in the Transformation Guide.
10. Click OK.
You now have a Lookup transformation that reads values from the MANUFACTURERS
table and performs lookups using values passed through the IN_MANUFACTURER_ID
input port. The final step is to connect this Lookup transformation to the rest of the
mapping.
11. Choose Layout-Link Columns.
12. Connect the MANUFACTURER_ID output port from the Aggregator transformation to
the IN_MANUFACTURER_ID input port in the Lookup transformation.
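As a rough mental model of the cached lookup described in these steps, the following Python sketch reads a lookup table once, keeps a local copy indexed by MANUFACTURER_ID, and then resolves each incoming IN_MANUFACTURER_ID against that copy. The table contents, the function names, and the choice to return None for a missing match are assumptions made for illustration only.

```python
# Hypothetical contents of the MANUFACTURERS lookup table:
# (MANUFACTURER_ID, MANUFACTURER_NAME). The values are invented.
MANUFACTURERS = [(100, "Acme"), (101, "Globex")]


def build_lookup_cache(table):
    """Read the whole lookup table once and index it by MANUFACTURER_ID."""
    return {manufacturer_id: name for manufacturer_id, name in table}


def lookup_manufacturer_name(cache, in_manufacturer_id):
    """Apply the condition MANUFACTURER_ID = IN_MANUFACTURER_ID against the cache.

    Returns None when no row matches (an assumption of this sketch).
    """
    return cache.get(in_manufacturer_id)


cache = build_lookup_cache(MANUFACTURERS)
names = [lookup_manufacturer_name(cache, mid) for mid in (100, 101, 999)]
```

Building the cache up front means each later lookup is a local dictionary access instead of a new query against the source table.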
1. Click and drag the following output ports to the corresponding input ports in the target:
2. Choose Repository-Save.
3. Verify mapping validation in the Output window.
2. Click and drag the dotted square (the viewing rectangle) within this window.
As you move the viewing rectangle, your perspective on the mapping changes.
To arrange a mapping:
1. Choose Layout-Arrange.
The Select Targets dialog box appears showing all target definitions in the mapping.
Creating a Session and Workflow
You have two mappings:
♦ m_PhoneList. A pass-through mapping that reads employee names and phone numbers.
♦ m_ItemSummary. A more complex mapping that performs simple and aggregate
calculations as well as lookups.
You have a reusable session based on m_PhoneList. Next, you will create a session for
m_ItemSummary in the Workflow Manager. You will create a workflow that runs both
sessions.
To create a workflow:
To run a workflow:
The Workflow Monitor opens and connects to your repository and opens the tutorial
folder.
2. Click the Gantt Chart tab at the bottom of the Time window to verify the Workflow
Monitor is in Gantt Chart view.
Note: You can also click the Task View tab at the bottom of the Time window to view the
Workflow Monitor in Task view. You can switch between views at any time.
3. In the Navigator, expand the node for your workflow.
All tasks in the workflow appear in the Navigator.
Tutorial Lesson 5
Creating a Mapping with Fact and Dimension Tables
In previous lessons, you used the Source Qualifier, Expression, Aggregator, and Lookup
transformations in mappings. In this lesson, you learn how to use the following
transformations:
♦ Stored Procedure. Call a stored procedure and capture its return values.
♦ Filter. Filter out data that you do not need, such as discontinued items in the ITEMS table.
♦ Sequence Generator. Generate unique IDs before inserting rows into the target.
You will create a mapping that outputs data to a fact table and its dimension tables.
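To make the roles of the Filter and Sequence Generator transformations more concrete, here is a minimal Python sketch of the two ideas: drop rows you do not need, then attach a unique ID to each remaining row. The sample rows, the DISCONTINUED_FLAG column name, and the GENERATED_ID field are hypothetical and are not taken from the tutorial database.

```python
from itertools import count

# Hypothetical ITEMS rows; the DISCONTINUED_FLAG column name is an assumption.
items = [
    {"ITEM_NAME": "Flashlight", "DISCONTINUED_FLAG": 0},
    {"ITEM_NAME": "Tent", "DISCONTINUED_FLAG": 1},
    {"ITEM_NAME": "Compass", "DISCONTINUED_FLAG": 0},
]


def filter_active(rows):
    """Filter step: keep only rows that are not marked as discontinued."""
    return [row for row in rows if row["DISCONTINUED_FLAG"] == 0]


def add_sequence_ids(rows, start=1):
    """Sequence Generator step: attach a unique, increasing ID to each row."""
    ids = count(start)
    return [{**row, "GENERATED_ID": next(ids)} for row in rows]


result = add_sequence_ids(filter_active(items))
```

Filtering before generating IDs keeps the sequence free of gaps caused by discarded rows.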
Figure 6-1 displays the mapping you create in this lesson:
Creating Targets
Before you create the mapping, create the following target tables:
♦ F_PROMO_ITEMS (a fact table of promotional items)
♦ D_ITEMS, D_PROMOTIONS, and D_MANUFACTURERS (the dimensional tables)
For more information about fact and dimension tables, see “Creating Cubes and Dimensions”
in the Designer Guide.
1. Open the Designer, connect to your repository, and open the tutorial folder.
2. Switch to the Warehouse Designer.
To clear your workspace, right-click the workspace and choose Clear All.
3. Choose Targets-Create.
4. In the Create Target Table dialog box, enter F_PROMO_ITEMS as the name of the new
target table, select the database type, and click Create.
5. Repeat step 4 to create the other tables needed for this schema: D_ITEMS,
D_PROMOTIONS, and D_MANUFACTURERS. When you have created all these
tables, click Done.
6. Open each new target, and add the following columns to the appropriate table:
D_ITEMS
ITEM_NAME Varchar 72
D_PROMOTIONS
PROMOTION_NAME Varchar 72
D_MANUFACTURERS
MANUFACTURER_NAME Varchar 72
NUMBER_ORDERED Integer NA
1. In the Designer, switch to the Mapping Designer and create a new mapping.
2. Name the mapping m_PromoItems.
When you create a single Source Qualifier transformation, the PowerCenter Server
increases performance with a single read on the source database instead of multiple reads.
7. Choose View-Navigator to close the Navigator window to allow extra space in the
workspace.
8. Choose Repository-Save.
1. Connect the ports ITEM_ID, ITEM_NAME, and PRICE to the corresponding columns
in D_ITEMS.
2. Choose Repository-Save.
7. Choose Repository-Save.
Database Syntax
-- Declare handler
DECLARE EXIT HANDLER FOR SQLEXCEPTION
SET SQLCODE_OUT = SQLCODE;
BEGIN
SELECT COUNT(*)
INTO :SP_RESULT
FROM ORDER_ITEMS
WHERE ITEM_ID = :ARG_ITEM_ID;
END;
3. Select the stored procedure named SP_GET_ITEM_COUNT from the list and click
OK.
4. In the Create Transformation dialog box, click Done.
The Stored Procedure transformation appears in the mapping.
5. Open the Stored Procedure transformation, and select the Properties tab.
6. Click the open button in the Connection information section.
The Select Database dialog box appears.
8. Click OK.
9. Connect the ITEM_ID column from the Source Qualifier transformation to the
ITEM_ID column in the Stored Procedure transformation.
10. Connect the RETURN_VALUE column from the Stored Procedure transformation to
the NUMBER_ORDERED column in the target table F_PROMO_ITEMS.
11. Choose Repository-Save.
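The effect of calling SP_GET_ITEM_COUNT once per ITEM_ID and writing the result to NUMBER_ORDERED can be approximated in Python with an in-memory SQLite table. This is a sketch under stated assumptions: SQLite stands in for your database, the ORDER_ID column and the sample rows are invented, and a plain parameterized query replaces the stored procedure.

```python
import sqlite3


def get_item_count(conn, arg_item_id):
    """Return the number of ORDER_ITEMS rows for one item, like SP_RESULT."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM ORDER_ITEMS WHERE ITEM_ID = ?", (arg_item_id,)
    )
    return cur.fetchone()[0]


# Hypothetical in-memory table standing in for the tutorial's ORDER_ITEMS table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ORDER_ITEMS (ORDER_ID INTEGER, ITEM_ID INTEGER)")
conn.executemany(
    "INSERT INTO ORDER_ITEMS VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 20)],
)

# One call per ITEM_ID, mirroring one procedure call per row in the mapping.
number_ordered = {
    item_id: get_item_count(conn, item_id) for item_id in (10, 20, 30)
}
```

An item with no matching rows yields a count of 0 rather than an error, because COUNT(*) always returns one row.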
1. Connect the following columns from the Source Qualifier transformation to the targets:
2. Choose Repository-Save.
The mapping is now complete. You can create and run a new workflow with this mapping.
To create a workflow:
1. Choose Tasks-Create.
The Create Tasks dialog box appears. The Workflow Designer provides more task types
than the Task Developer. These tasks include the Email and Decision tasks.
2. Create a Session task and name it s_PromoItems. Click Create.
In the Mappings dialog box, select the mapping m_PromoItems and click OK.
3. Click Done.
4. Open the session properties for s_PromoItems.
5. Click the Mappings tab.
6. Select your source database connection for the sources connected to the SQ_AllData
Source Qualifier transformation.
7. Select your target database for each target definition.
8. Click OK to save your changes.
9. Click the link tasks button on the toolbar.
10. Click and drag from the Start task to the s_PromoItems Session task.
11. Choose Repository-Save to save the workflow in the repository.
You can now create a link condition in the workflow.
1. Double-click the link from the Start task to the Session task.
The Expression Editor appears.
Tip: You can double-click the built-in workflow variable on the PreDefined tab and
double-click the TO_DATE function on the Functions tab to enter the expression. For
more information on using functions in the Expression Editor, see “The Transformation
Language” in the Transformation Language Reference.
4. Press Enter to create a new line in the Expression. Add a comment by typing the
following text:
// Only run the session if the workflow starts before the date specified above.
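The idea behind this link condition, running the session only when the workflow starts before a given date, can be sketched in Python as follows. The cutoff date, the function names, and the way tasks are recorded are all hypothetical; the sketch does not reproduce the expression you enter in the Expression Editor.

```python
from datetime import datetime

# Hypothetical cutoff; the date used in the tutorial is not reproduced here.
CUTOFF = datetime.strptime("12/31/2030", "%m/%d/%Y")


def link_condition(workflow_start_time, cutoff=CUTOFF):
    """Return True when the link should let the next task run."""
    return workflow_start_time < cutoff


def run_workflow(workflow_start_time):
    """Run the Start task, then the session only if the link condition holds."""
    executed = ["Start"]
    if link_condition(workflow_start_time):
        executed.append("s_PromoItems")
    return executed


before_cutoff = run_workflow(datetime(2030, 1, 1))
after_cutoff = run_workflow(datetime(2031, 1, 1))
```

When the condition evaluates to false, the Start task still runs but the session after the link does not.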
The Workflow Monitor opens and connects to the repository and opens the tutorial
folder.
2. Click the Gantt Chart tab at the bottom of the Time window to verify the Workflow
Monitor is in Gantt Chart view.
3. In the Navigator, expand the node for your workflow.
All tasks in the workflow appear in the Navigator.
92 Chapter 6: Tutorial Lesson 5
Chapter 7
Tutorial Lesson 6
Overview
XML is a common means of exchanging data on the web. You can use XML files as a source of
data and as a target for transformed data.
In this lesson, you have an XML schema file that contains data on the salary of employees in
different departments, and you have relational data that contains information about the
different departments. You want to find out the total salary for employees in two
departments, and you want to write the data to a separate XML target for each department.
In the XML schema file, employees can have three types of wages, which appear in the XML
schema file as three occurrences of salary. You pivot the occurrences of employee salaries into
three columns: BASESALARY, COMMISSION, and BONUS. Then you calculate the total
salary in an Expression transformation.
You use a Router transformation to test for the department ID. You use another Router
transformation to get the department name from the relational source. You send the salary
data for the employees in the Engineering department to one XML target and the salary data
for the employees in the Sales department to another XML target.
Figure 7-1 shows the mapping you create in this lesson:
1. Open the Designer if it is not already open, connect to your repository, and open the
tutorial folder.
2. Open the Source Analyzer.
3. Choose Sources-Import XML Definition.
4. Choose Advanced Options.
The Change XML Views Creation and Naming Options dialog box opens.
8. Verify that the name for the XML definition is Employees, and click Next.
9. Select Skip Create XML Views.
Because you only need to work with a few elements and attributes in the Employees.xsd
file, you skip creating a definition using the XML Wizard. Instead, you create a custom
view in the XML Editor. This allows you to exclude the elements and attributes that you
do not need in the mapping.
10. Click Finish to create the XML definition.
The XML Wizard creates an XML definition with no columns or groups.
When you skip creating XML views, the Designer imports metadata into the repository, but
does not create the XML view. In the next step, you use the XML Editor to add groups and
columns to the XML view.
Base Salary
Commission
Bonus
To work with these three instances separately, you pivot them to create three separate columns
in the XML definition.
You create a custom XML view with columns from several groups. You then pivot the
occurrence of SALARY to create the columns, BASESALARY, COMMISSION, and
BONUS.
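Pivoting can be easier to follow with a small example. The Python sketch below turns three repeated SALARY occurrences per employee into the columns BASESALARY, COMMISSION, and BONUS, and then adds them up the way the Expression transformation later does. The XML fragment is hypothetical and simplified: the element nesting, department IDs, and amounts are assumptions and do not come from Employees.xsd.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML shaped loosely like the lesson's description: each EMPLOYEE
# has DEPTID and EMPID attributes and three SALARY occurrences.
SAMPLE = """
<EMPLOYEES>
  <EMPLOYEE DEPTID="DP001" EMPID="E1">
    <SALARY>50000</SALARY><SALARY>4000</SALARY><SALARY>1000</SALARY>
  </EMPLOYEE>
  <EMPLOYEE DEPTID="DP002" EMPID="E2">
    <SALARY>60000</SALARY><SALARY>0</SALARY><SALARY>2500</SALARY>
  </EMPLOYEE>
</EMPLOYEES>
"""

PIVOT_COLUMNS = ("BASESALARY", "COMMISSION", "BONUS")


def pivot_salaries(xml_text):
    """Turn repeated SALARY occurrences into one named column per occurrence."""
    rows = []
    for employee in ET.fromstring(xml_text.strip()).iter("EMPLOYEE"):
        row = {"DEPTID": employee.get("DEPTID"), "EMPID": employee.get("EMPID")}
        salaries = [float(s.text) for s in employee.iter("SALARY")]
        # The first occurrence becomes BASESALARY, the second COMMISSION, and so on.
        for name, amount in zip(PIVOT_COLUMNS, salaries):
            row[name] = amount
        # Total salary, treating a missing occurrence as 0.0.
        row["TotalSalary"] = sum(row.get(name, 0.0) for name in PIVOT_COLUMNS)
        rows.append(row)
    return rows


rows = pivot_salaries(SAMPLE)
```

After pivoting, each employee is a single row with three salary columns instead of three separate SALARY rows.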
1. To open the XML Editor, double-click the XML definition or right-click the XML
definition and choose Edit XML Definition.
2. Select XML Views-Create XML View to create a new XML view.
3. From the EMPLOYEE group, select DEPTID and right-click it.
4. Choose Show XPath Navigator.
5. From the XPath Navigator, select the following elements and attributes and drag them
into the new view:
♦ DEPTID
♦ EMPID
Note: The XML Wizard may transpose the order of the DEPTID and EMPID attributes
when it imports them. If this occurs, add the columns in the order they appear in the
Schema Navigator or XPath Navigator. Transposing the order of attributes does not affect
data consistency.
6. Expand the EMPLOYMENT group so that the SALARY column shows.
7. Click the Mode icon on the XPath Navigator, and choose Pivot Mode.
9. Drag the SALARY column into the new XML view two more times to create three
pivoted columns.
Note: Although the new columns appear in the column window, the view shows only one
instance of SALARY.
SALARY0 COMMISSION 2
SALARY1 BONUS 3
Note: The pivoted SALARY columns do not display the names you entered in the
Columns window. However, when you drag the ports to another transformation, the
edited column names appear in the transformation.
13. Choose Repository-Save to save the changes to the XML definition.
Note: The XML Editor may transpose the order of the attributes DEPTNAME and
DEPTID. If this occurs, add the columns in the order they appear in the Schema
Navigator. Transposing the order of attributes does not affect data consistency.
8. Right-click the DEPARTMENT group in the Schema Navigator and select Show XPath
Navigator.
9. From the XPath Navigator, drag DEPTNAME and DEPTID into the empty XML view.
The XML Editor names the view X_DEPARTMENT.
10. In the X_DEPARTMENT view, right-click the DEPTID column, and choose Set as
Primary Key.
11. Select XML Views-Create XML View.
The XML Editor creates an empty view.
12. From the EMPLOYEE group in the Schema Navigator, open the XPath Navigator.
13. From the XPath Navigator, drag EMPID, FIRSTNAME, LASTNAME, and
TOTALSALARY into the empty XML view.
The XML Editor names the view X_EMPLOYEE.
14. Right-click the X_EMPLOYEE view, and choose Create Relationship. Drag the pointer
from the X_EMPLOYEE view to the X_DEPARTMENT view to create a link.
1. In the Designer, switch to the Mapping Designer and create a new mapping.
2. Name the mapping m_EmployeeSalary.
3. Drag the Employees XML source definition into the mapping.
4. Drag the DEPARTMENT relational source definition into the mapping.
By default, the Designer creates a source qualifier for each source.
5. Drag the SALES_SALARY target into the mapping two times.
6. Rename the second instance of SALES_SALARY as ENG_SALARY.
7. Choose Repository-Save.
Because you have not yet completed the mapping, the Designer displays a warning that
the mapping m_EmployeeSalary is invalid.
Next, you add an Expression transformation and two Router transformations. Then, you
connect the source definitions to the Expression transformation. You connect the pipeline to
the Router transformations and then to the two target instances.
The Designer adds a default group to the list of groups. All rows that do not meet the
condition you specify in the group filter condition are routed to the default group. If you
do not connect the default group, the PowerCenter Server drops the rows.
5. Click OK to close the transformation.
7. Choose Repository-Save.
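The routing behavior described in these steps, including what happens to rows in an unconnected default group, can be sketched in Python. The department IDs, group names, and sample rows are hypothetical. The sketch also assumes that a row is sent to every group whose condition it meets, which this lesson does not state.

```python
def route(rows, group_conditions):
    """Send each row to every group whose condition it meets.

    Rows that meet no condition go to the "DEFAULT" group.
    """
    groups = {name: [] for name in group_conditions}
    groups["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in group_conditions.items():
            if condition(row):
                groups[name].append(row)
                matched = True
        if not matched:
            groups["DEFAULT"].append(row)
    return groups


# Hypothetical rows and department IDs, invented for this sketch.
rows = [
    {"EMPID": "E1", "DEPTID": "DP001"},
    {"EMPID": "E2", "DEPTID": "DP002"},
    {"EMPID": "E3", "DEPTID": "DP003"},
]
conditions = {
    "ENGINEERING": lambda row: row["DEPTID"] == "DP001",
    "SALES": lambda row: row["DEPTID"] == "DP002",
}

groups = route(rows, conditions)

# Only connected groups reach a target; leaving DEFAULT unconnected drops its rows.
connected = {name: groups[name] for name in ("ENGINEERING", "SALES")}
```

In this sketch the row for E3 lands in the default group and never reaches a target, because only the two named groups are connected.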
Next, you create another Router transformation to filter the Sales and Engineering
department data from the DEPARTMENT relational source.
1. Connect the following ports from rtr_Salary groups to the ports in the XML target
definitions:
LASTNAME1 LASTNAME
FIRSTNAME1 FIRSTNAME
TotalSalary1 TOTALSALARY
LASTNAME3 LASTNAME
FIRSTNAME3 FIRSTNAME
TotalSalary3 TOTALSALARY
DeptName1 DEPTNAME
DeptName3 DEPTNAME
3. Choose Repository-Save.
The mapping is now complete. When you save the mapping, the Designer displays a
message that the mapping m_EmployeeSalary is valid.
1. Open the Workflow Manager if it is not open already. Connect to the repository and
open the tutorial folder.
2. Go to the Workflow Designer.
3. Choose Workflows-Wizard.
The Workflow Wizard opens.
4. Name the workflow wf_EmployeeSalary and select a server on which to run the
workflow. Then click Next.
6. Click Next.
7. Choose Run on demand and click Next.
The Workflow Wizard displays the settings you chose.
8. Click Finish to create the workflow.
The Workflow Wizard creates a Start task and session. You can add other tasks to the
workflow later.
9. Choose Repository-Save to save the new workflow.
10. Double-click the s_m_EmployeeSalary session to open it for editing.
11. Click the Mapping tab.
15. Click the ENG_SALARY target instance on the Mapping tab and verify that the output
file name is eng_salary.xml.
16. Click the SALES_SALARY target instance on the Mapping tab and verify that the output
file name is sales_salary.xml.
17. Click OK to close the session.
18. Choose Repository-Save.
19. Run and monitor the workflow.
The PowerCenter Server creates the eng_salary.xml and sales_salary.xml files.
Naming Conventions
This appendix provides suggested naming conventions for PowerCenter repository objects.
Suggested Naming Conventions
The following naming conventions appear throughout the Informatica documentation and
client tools. Informatica recommends using these naming conventions when you
design mappings and create sessions.
Transformations
Table A-1 lists the recommended naming convention for all transformations:
Aggregator AGG_TransformationName
Custom CT_TransformationName
Expression EXP_TransformationName
Filter FIL_TransformationName
Joiner JNR_TransformationName
Lookup LKP_TransformationName
Normalizer NRM_TransformationName
Rank RNK_TransformationName
Router RTR_TransformationName
Sorter SRT_TransformationName
Union UN_TransformationName
Mappings
The naming convention for mappings is: m_MappingName.
Mapplets
The naming convention for mapplets is: mplt_MappletName.
Sessions
The naming convention for sessions is: s_MappingName.
Worklets
The naming convention for worklets is: wl_WorkletName.
Workflows
The naming convention for workflows is: wf_WorkflowName.
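If you want to apply these conventions consistently, a small helper script can build or check object names. The Python sketch below is not part of PowerCenter: the dictionary simply restates the prefixes listed in this appendix, and the function names and the case-sensitive comparison are assumptions made for illustration.

```python
# Prefixes restated from this appendix; the dictionary keys are illustrative labels.
PREFIXES = {
    "Aggregator": "AGG_",
    "Custom": "CT_",
    "Expression": "EXP_",
    "Filter": "FIL_",
    "Joiner": "JNR_",
    "Lookup": "LKP_",
    "Normalizer": "NRM_",
    "Rank": "RNK_",
    "Router": "RTR_",
    "Sorter": "SRT_",
    "Union": "UN_",
    "Mapping": "m_",
    "Mapplet": "mplt_",
    "Session": "s_",
    "Worklet": "wl_",
    "Workflow": "wf_",
}


def suggested_name(object_type, base_name):
    """Build a name that follows the suggested convention for the object type."""
    return PREFIXES[object_type] + base_name


def follows_convention(object_type, name):
    """Check, case-sensitively, that a name starts with the suggested prefix."""
    prefix = PREFIXES[object_type]
    return name.startswith(prefix) and len(name) > len(prefix)


example = suggested_name("Expression", "AvgProfit")
```

For example, `follows_convention("Mapping", "m_PromoItems")` is true, while a workflow named `EmployeeSalary` without the `wf_` prefix would fail the check.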